Hugging Face ships PI0 VLA model

Hugging Face released Transformers v5.4.0, which adds PI0, a vision-language-action (VLA) model aimed at robotics and embodied AI research, along with VidEoMT, a video segmentation model suited to real-time embodied workflows. The release tightens the bridge between foundation models and robot control stacks. (x.com)

The π_0 paper, titled "π_0: A Vision-Language-Action Flow Model for General Robot Control," names Kevin Black as lead author and lists 24 coauthors, including Chelsea Finn and Sergey Levine. Its most recent arXiv revision is dated January 8, 2026. (arxiv.org) π_0 is described as a flow-based VLA model that conditions on image, language, observation, and action tokens and uses a flow-matching head to generate continuous actions. (github.com) The openpi repository also exposes two π_0 variants, π_0-FAST (an autoregressive VLA using the FAST action tokenizer) and π_0.5 (an upgrade aimed at better open-world generalization), along with checkpoints and inference code. (github.com)

Transformers 5.4.0 was published to PyPI on March 26, 2026. Its changelog lists new model additions, including PI0 and VidEoMT, among other updates such as Mistral 4 and PaddlePaddle support. (pypi.org)

VidEoMT (Video Encoder-only Mask Transformer) is an encoder-only ViT approach to online video segmentation. Its authors report that it runs 5× to 10× faster than prior methods and reaches up to 160 FPS with a ViT-L backbone in their benchmarks. (arxiv.org)

Live assets and demos accompany the models on the Hub: a lerobot π_0_base model card is hosted on Hugging Face, and a VidEoMT demo Space processes uploaded videos to produce instance, semantic, and panoptic masks. (huggingface.co)

Transformers v5 is positioned as a unified model-definition framework. Its initial v5 release was a major redesign comprising roughly 1,200 commits, and the library continues to add multimodal, video, and robotics model definitions to simplify downstream integration. (huggingface.co)
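For readers unfamiliar with flow matching, the sketch below illustrates the general idea behind a flow-matching action head: start from Gaussian noise and integrate a velocity field over time until the sample becomes an action vector. It is a toy illustration, not π_0's implementation. The function names, the 10-step Euler schedule, and the closed-form `toy_velocity` (a stand-in for a trained network that would condition on images, language, and robot state) are all assumptions made for this example.

```python
import random


def toy_velocity(x, t, target):
    """Hypothetical stand-in for a learned velocity network.

    For a straight-line path from noise to `target`, the velocity that
    carries x to the target by t = 1 is (target - x) / (1 - t).
    A real model would predict this from observations instead.
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample_action(target, num_steps=10, seed=0):
    """Integrate dx/dt = v(x, t) from t = 0 (noise) to t = 1 with Euler steps."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from Gaussian noise
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so no division by zero
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# The 3-element target is an arbitrary example "action" for the toy field.
action = sample_action(target=[0.25, -0.5, 1.0])
print([round(a, 6) for a in action])
```

Because the toy velocity field is exact for a straight-line path, the integration lands on the target regardless of the starting noise. With a learned network, the step count trades inference latency against how closely the sample follows the learned flow.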
