NVIDIA's Physical AI Stack Gains Traction

NVIDIA's physical AI stack is drawing attention for its application in humanoid robotics. Social media posts highlighted the integration of its voice, scene understanding, and action models on Pollen Robotics' Reachy 2 humanoid. NVIDIA's models are also being compared with competitors such as Alibaba's RynnBrain, which Alibaba says recently achieved top performance on 16 embodied AI benchmarks.

- NVIDIA's Project GR00T (Generalist Robot 00 Technology) is a foundation model aimed at creating a general-purpose AI for humanoid robots. This model enables robots to understand natural language and learn skills like coordination and dexterity by observing human actions. The initiative includes a new computer, Jetson Thor, which is a system-on-a-chip (SoC) based on the NVIDIA Blackwell architecture, designed specifically for the complex tasks and safe interaction required in humanoid robots.
- The NVIDIA Isaac platform provides a comprehensive toolkit for robotics development, including Isaac Lab for robot learning, which is built on Isaac Sim for high-fidelity physics simulation. This open-source framework supports both imitation and reinforcement learning and is used by companies like Boston Dynamics, Agility Robotics, and Figure AI for training humanoid robots. Recently, NVIDIA introduced Isaac Lab-Arena for scalable policy evaluation and OSMO, a cloud-native orchestration platform.
- Pollen Robotics' Reachy 2 is an open-source humanoid robot designed for research in real-world applications and embodied AI. It features two 7-degree-of-freedom arms, can be controlled via Python or a VR teleoperation app, and runs on ROS2. In a significant move for the open-source robotics community, Hugging Face acquired Pollen Robotics and now offers Reachy 2, which integrates with its LeRobot library of pretrained models and datasets.
- Alibaba's RynnBrain is an open-source embodied AI model that introduces spatio-temporal memory and physical spatial reasoning to robotics. This allows a robot to remember the state of a task if interrupted and resume it later. Alibaba claims RynnBrain has achieved top performance across 16 embodied AI benchmarks, outperforming models like Google's Gemini Robotics ER 1.5.
- Foundation models are a paradigm shift in robotics, moving from task-specific models to generalized frameworks that can be adapted to various applications with minimal fine-tuning. These large-scale models are pre-trained on diverse datasets, enabling them to generalize across different robotic scenarios, a key factor in developing general-purpose robots.
- The development of physical AI relies on a three-computer system architecture: NVIDIA DGX AI supercomputers for training large foundation models, RTX PRO Servers running Omniverse for simulation and synthetic data generation, and the Jetson AGX Thor for on-robot, real-time inference. This structure addresses the significant challenge of acquiring sufficient real-world training data for robust robot learning.
- Major tech companies are heavily investing in the humanoid robotics space, exemplified by Figure AI's $675 million Series B funding round with backing from NVIDIA, Microsoft, OpenAI, and Jeff Bezos. This funding is aimed at scaling up the manufacturing and deployment of humanoid robots, expanding GPU infrastructure for AI model training, and acquiring extensive real-world data to enhance embodied intelligence. Figure AI is leveraging NVIDIA's Isaac Sim for synthetic data generation and H100 GPUs for training its AI models.
