Stanford Hosts Physical AI Event

PL-Universe Robotics and Stanford University held a flagship event on February 26 focused on Physical AI and robotics. The event gathered experts to discuss topics such as vision-language models for autonomy and the future of robotics in manufacturing.

At the Stanford event, PL-Universe's Founder & COO, Ge Jin, introduced a new approach for intelligent manufacturing: a "universal ontology + rapidly replaceable dedicated end-effectors" solution. This strategy aims to provide the flexibility and reliability required for large-scale industrial deployment of robots.

Quan Kuichen, the head of PL-Universe's Large Model Team, explained the company's progress in applying Vision-Language-Action (VLA) models to industry. He highlighted breakthroughs in multi-modal data collection, collaboration between cloud and on-device AI, and few-shot learning, which allows robots to learn new tasks from a small number of examples. These advancements are designed to move embodied AI from laboratory settings to active production lines with high precision.

From a venture capital standpoint, TSVC General Partner Spencer Greene discussed the investment logic for embodied AI startups. He pointed to structural labor shortages as a key driver for the adoption of embodied AI systems and stressed the importance of focusing on real commercial value rather than the hype surrounding humanoid robots.

The broader field of embodied AI, which gives machines the ability to perceive, reason, and physically interact with the world, is experiencing significant growth. The global market reached $4.44 billion in 2025 and is projected to grow at about 39% annually, reaching roughly $23 billion by 2030. This growth is fueled by the increasing integration of AI into physical systems that can learn and adapt through interaction.

Automotive industry observer Xing Lei provided a global perspective on the physical AI landscape, noting that China's strengths lie in supply chains and application scenarios, while the U.S. leads in the development of algorithms and semiconductor chips. He suggested that this points to a need for complementary cooperation between the two countries in the field.

Vision-Language-Action (VLA) models are an important component of this technological shift, unifying perception, language understanding, and action generation. These models are moving beyond simple tasks, with ongoing research focused on improving their performance in complex industrial environments and on high-precision placement tasks. The development of open-source models, such as Stanford University's OpenVLA, is making this technology more accessible for broader research and application.
