NVIDIA's Physical AI Stack Showcased
Social media discussion is highlighting the capabilities of NVIDIA's physical AI stack for humanoid robots. A recent post detailed the integration of NVIDIA's Parakeet for voice, Cosmos Reason 2 for scene understanding, and GR00T N1.6 for action generation on a Pollen Robotics humanoid.
- Project GR00T, which stands for Generalist Robot 00 Technology, is a foundation model intended to enable robots to learn skills by observing human actions. The goal is a general-purpose model that can understand natural language and perform a wide variety of tasks in the real world.
- The Cosmos Reason 2 model provides the "common sense" reasoning for the AI stack, allowing a robot to understand physics, plan multi-step tasks, and adapt to new situations. It is an open vision-language model (VLM) that comes in 2-billion and 8-billion parameter versions.
- NVIDIA Parakeet is a family of automatic speech recognition (ASR) models; the specific model mentioned, Parakeet-tdt-0.6b-v3, is a 600-million-parameter version designed for high-throughput, multilingual transcription. It automatically adds punctuation and capitalization and can provide word-level timestamps.
- To run this software stack, NVIDIA has developed a specialized computer called Jetson Thor, a system-on-a-chip (SoC) designed specifically for humanoid robots and other edge AI applications. It features a Blackwell GPU architecture and is designed to run large generative AI models locally on the robot.
- The entire stack is part of the broader NVIDIA Isaac robotics platform, which relies heavily on simulation for a "sim-to-real" workflow. Tools like Isaac Lab allow for reinforcement learning in thousands of parallel simulations before the learned skills are transferred to a physical robot.
- Pollen Robotics, the creator of the Reachy humanoid, was founded in 2016 and focuses on open-source robotics to foster community collaboration. Its Reachy 2 robot is designed as a bimanual mobile manipulator for AI and robotics research.
- Major robotics companies such as Boston Dynamics, Agility Robotics, Figure AI, and Sanctuary AI are also part of the ecosystem for which NVIDIA is building its AI platform.
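
To make the division of labor among the three models concrete, here is a minimal Python sketch of a voice-to-action loop. It is an illustration under stated assumptions, not the code from the post: the `PhysicalAIPipeline` class, the `Action` type, the three stub functions, and the 7-element joint vector are all hypothetical, and the way the stages are wired together (speech to text, then text plus image to subtasks, then subtask plus image to an action) is only one plausible reading of the description. None of the real model APIs are called.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """Hypothetical action: one target value per joint."""
    joint_targets: List[float]


@dataclass
class PhysicalAIPipeline:
    """Hypothetical wiring of the three stages described above."""
    transcribe: Callable[[bytes], str]        # stand-in for an ASR model (e.g. Parakeet)
    plan: Callable[[str, bytes], List[str]]   # stand-in for a reasoning VLM (e.g. Cosmos Reason 2)
    act: Callable[[str, bytes], Action]       # stand-in for an action model (e.g. GR00T)

    def step(self, audio: bytes, image: bytes) -> List[Action]:
        instruction = self.transcribe(audio)       # speech -> text
        subtasks = self.plan(instruction, image)   # text + scene -> ordered subtasks
        # One action per subtask; a real robot would re-observe the scene between them.
        return [self.act(subtask, image) for subtask in subtasks]


# Stub stages so the sketch runs without any model weights (all outputs are made up).
def fake_transcribe(audio: bytes) -> str:
    return "Pick up the cup."


def fake_plan(instruction: str, image: bytes) -> List[str]:
    return ["locate cup", "grasp cup", "lift cup"]


def fake_act(subtask: str, image: bytes) -> Action:
    return Action(joint_targets=[0.0] * 7)  # 7 joints is an arbitrary assumption


pipeline = PhysicalAIPipeline(fake_transcribe, fake_plan, fake_act)
actions = pipeline.step(audio=b"", image=b"")
print(len(actions), "actions planned")
```

Passing the stages in as plain callables keeps the sketch independent of any particular model runtime; replacing a stub with a real model call would not change the `step` logic.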