Nvidia's DreamDojo Open-Sourced for Robot Training
Nvidia has open-sourced DreamDojo, a world model designed for training robots. The platform provides a testbed for sim-to-real transfer, multi-agent environments, and scalable reinforcement learning. The repository is intended to be a starting point for engineering teams building agentic robotic systems.
- DreamDojo is pretrained on a massive dataset of 44,711 hours of first-person human videos, which allows it to learn general physical rules before being fine-tuned for a specific robot's hardware.
- This "Simulation 2.0," as Nvidia's Director of AI Jim Fan calls it, generates a simulated future in pixels without needing a traditional physics engine or 3D models.
- To bridge the gap between human video and robot actions, the system uses "latent actions," which are 32-dimensional vectors that represent the critical motion between video frames.
- A distilled version of DreamDojo can run in real-time at over 10 frames per second, enabling live VR teleoperation and model-based planning.
- The pretraining for the 2B and 14B model variants required 100,000 NVIDIA H100 GPU hours.
- DreamDojo is part of a broader Nvidia ecosystem for robotics, which includes Isaac Lab for physically-based simulation and Project GR00T, a general-purpose foundation model for humanoid robots.
- The simulated success rates in DreamDojo show a very high correlation (Pearson r=0.995) with real-world performance, allowing for reliable policy evaluation without physical deployment.
- DreamDojo builds on Nvidia's open-weight model, Cosmos, and all its components (weights, code, and datasets) are open-source to encourage community development.
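
To make the Pearson correlation figure above concrete, here is a minimal Python sketch of how such a coefficient is computed from paired success rates. It is not taken from the DreamDojo repository: the function name `pearson_r` and the success-rate values are hypothetical placeholders chosen for illustration, not reported results.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder data: one entry per policy checkpoint.
# These numbers are made up for illustration and are NOT DreamDojo results.
sim_success = [0.10, 0.35, 0.50, 0.72, 0.90]   # success rate inside the world model
real_success = [0.12, 0.30, 0.55, 0.70, 0.88]  # success rate on the physical robot

r = pearson_r(sim_success, real_success)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 means that policies ranking higher in simulation also tend to rank higher on the real robot, which is what would make simulated evaluation a usable stand-in for physical deployment.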