Foundation Models Show Skill Transfer in Robotics

New research in embodied AI demonstrates that foundation models can effectively transfer manipulation skills across different robotic hardware. A model pre-trained on a complex 22-degree-of-freedom (DoF) robotic hand was successfully applied to a simpler 7-DoF gripper, resulting in a 30% performance gain. This showcases the potential for generalization, where large models can adapt learned behaviors to less complex, real-world robots.

- The underlying technology for many of these robotics models, like Google's RT-2 or OpenVLA, is the transformer architecture, which also powers large language models like GPT. These models are adapted to be multimodal, processing inputs from vision, language, and sensors to output robot actions.
- A primary challenge holding back wider adoption is the scarcity of high-quality, diverse robotics data; unlike LLMs trained on internet-scale text, robotic data is expensive and time-consuming to collect.
- The 22-DoF hand is an example of a high-dexterity anthropomorphic hand, designed to mimic the complex motions of a human hand, which makes the successful skill transfer to the simpler 7-DoF gripper a significant achievement in generalization.
- This type of generalization is often achieved by pre-training on large, aggregated datasets, such as the Open-X-Embodiment dataset, which combines data from dozens of different robot morphologies and research institutions.
- A key goal of this research is to achieve "zero-shot" or "few-shot" learning, where a model can perform novel tasks or operate new hardware with no or minimal fine-tuning, drastically reducing deployment time.
- The field is attracting significant investment, with startups like Figure AI, Physical Intelligence, and Skild AI raising hundreds of millions of dollars to build general-purpose "brains" for robots.
- Foundation models are being applied across the full robotics stack, from high-level perception and task planning to low-level motion control and dynamics prediction.
- Researchers are actively exploring the use of generative AI to create synthetic data and augment real-world datasets, helping to overcome the data scarcity bottleneck and improve model robustness.
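
One practical question behind training a single model on robots with different degrees of freedom is how to fit a 22-DoF action and a 7-DoF action into the same output. The sketch below shows one common approach, padding every action to a shared width and carrying a mask of the valid entries. It is an illustration only: the research described above does not state how it handles this, and `MAX_DOF`, `pad_action`, `unpad_action`, and `masked_mse` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List

# Assumption: the shared action space is as wide as the largest robot (the 22-DoF hand).
MAX_DOF = 22


@dataclass
class PaddedAction:
    values: List[float]  # action values, zero-padded up to the shared width
    mask: List[bool]     # True where the entry is a real joint command


def pad_action(action: List[float], max_dof: int = MAX_DOF) -> PaddedAction:
    """Pad a robot-specific action to the shared width and record which entries are real."""
    if len(action) > max_dof:
        raise ValueError(f"action has {len(action)} dims, shared space has {max_dof}")
    pad = max_dof - len(action)
    return PaddedAction(
        values=list(action) + [0.0] * pad,
        mask=[True] * len(action) + [False] * pad,
    )


def unpad_action(padded: PaddedAction) -> List[float]:
    """Recover the robot-specific action by dropping the padded entries."""
    return [v for v, m in zip(padded.values, padded.mask) if m]


def masked_mse(prediction: List[float], target: PaddedAction) -> float:
    """Mean squared error over valid entries only, so padding never contributes to the loss."""
    errors = [
        (p - t) ** 2
        for p, t, m in zip(prediction, target.values, target.mask)
        if m
    ]
    return sum(errors) / len(errors) if errors else 0.0


if __name__ == "__main__":
    hand_action = pad_action([0.1] * 22)    # high-dexterity hand uses every slot
    gripper_action = pad_action([0.5] * 7)  # simpler robot uses only the first 7
    print(len(hand_action.values), sum(hand_action.mask))
    print(len(gripper_action.values), sum(gripper_action.mask))
```

With this layout, both robots' data can sit in the same training batch, and the mask keeps the model from being penalized for whatever it outputs in the slots a given robot does not have.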
