Berkeley Robot Learns Table Tennis Serves by Watching

A Berkeley PhD student has shared a video of a robot learning to serve a ping pong ball simply through observation. The video highlights rapid advances in imitation learning and motion control, fields in which autonomous systems can acquire complex physical skills without explicit programming.

The Berkeley project, named HITTER (HumanoId Table TEnnis Robot), uses a hierarchical framework that combines a model-based planner, which predicts the ball's trajectory, with a reinforcement learning-based whole-body controller, which executes the movements. The system was validated on a Unitree G1 humanoid robot, a general-purpose platform, without any specialized hardware modifications.

For perception, the system relies on nine OptiTrack cameras operating at 360 Hz to track the ball's position with millimeter-level accuracy. The reinforcement learning policy, which controls all 29 joints of the robot, receives this data and outputs desired joint positions at 50 Hz. This setup enables the robot to react to and return a human's smash in just 0.42 seconds.

The whole-body controller was trained with Proximal Policy Optimization (PPO), a common reinforcement learning algorithm, entirely within the Isaac Lab simulation environment, then deployed on the physical robot in a zero-shot transfer. To encourage more natural and effective movements, human motion references were incorporated into training, helping the robot mimic human-like waist rotation during a swing.

The underlying hardware, the Unitree G1, is an electric humanoid standing 1.3 meters tall and weighing approximately 35 kg. It has 23 to 43 degrees of freedom, depending on the configuration, and its joints are powered by self-developed motors with a maximum torque of 120 N·m. For onboard processing, it includes an 8-core CPU, and the EDU version adds an NVIDIA Jetson Orin module.

This achievement in dynamic, real-world interaction is relevant to work underway in the Los Angeles aerospace and robotics ecosystem. Companies like Anduril Industries, which builds autonomous defense systems, and GrayMatter Robotics, which focuses on AI-powered factory automation, are actively developing solutions that require robust motion control and real-time adaptation.

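To give a feel for what the planner half of HITTER's hierarchy has to compute, here is a minimal sketch of ball trajectory prediction. It is not the project's planner: it assumes a drag-free, spin-free ballistic model with no table bounce, a z-up coordinate frame, and a hypothetical strike plane at a fixed x position. The function names are illustrative only.

```python
import numpy as np

# Assumption: z axis points up, units are meters and seconds.
GRAVITY = np.array([0.0, 0.0, -9.81])


def estimate_velocity(p_prev, p_curr, rate_hz=360.0):
    """Finite-difference velocity from two consecutive tracked positions.

    rate_hz defaults to the 360 Hz camera rate quoted in the article.
    """
    p_prev = np.asarray(p_prev, dtype=float)
    p_curr = np.asarray(p_curr, dtype=float)
    return (p_curr - p_prev) * rate_hz


def predict_position(p0, v0, t):
    """Ballistic position t seconds after the observed state (no drag, no spin)."""
    p0 = np.asarray(p0, dtype=float)
    v0 = np.asarray(v0, dtype=float)
    return p0 + v0 * t + 0.5 * GRAVITY * t**2


def time_to_plane(p0, v0, x_plane):
    """Time until the ball reaches the vertical plane x = x_plane."""
    vx = float(v0[0])
    if vx == 0.0:
        raise ValueError("ball has no velocity along x")
    t = (x_plane - float(p0[0])) / vx
    if t < 0.0:
        raise ValueError("ball is moving away from the plane")
    return t


if __name__ == "__main__":
    p0 = [0.0, 0.0, 1.0]   # observed ball position
    v0 = [2.0, 0.0, 0.0]   # observed ball velocity
    t_hit = time_to_plane(p0, v0, x_plane=0.4)
    print("time to strike plane:", t_hit)
    print("predicted strike point:", predict_position(p0, v0, t_hit))
```

A real planner would likely need to filter velocity over many samples and model drag, spin, and the bounce on the table; this sketch only shows the basic predict-then-intercept structure.
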
The core challenge in imitation learning is often "trajectory drift," where small errors in the learned policy accumulate over time and push the robot into states not seen in the expert demonstrations. Algorithms like Dataset Aggregation (DAgger) are designed to mitigate this by iteratively collecting new data from the learned policy and having an expert label it. DAgger is a key concept for technical interviews in this domain.

Future work on the HITTER project will focus on moving beyond rallies to actual gameplay. This includes developing autonomous serving capabilities and implementing multi-agent training frameworks to learn competitive strategies and adapt to the tactics of skilled human players.
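
The DAgger loop mentioned above can be sketched on a toy problem. This is a minimal illustration, not anything from the HITTER project: the one-dimensional dynamics, the tanh-shaped expert, and the linear learner are all assumptions chosen to keep the example small, and the sketch omits the expert/learner mixing schedule of the original algorithm by rolling out the learner alone after the first iteration.

```python
import numpy as np


def expert_action(state):
    """Hypothetical expert: push the scalar state back toward zero."""
    return -np.tanh(state)


def rollout(policy, rng, steps=30):
    """Run a policy in toy noisy dynamics and return the states it visited."""
    state = rng.normal(0.0, 1.0)
    states = []
    for _ in range(steps):
        states.append(state)
        state = state + policy(state) + rng.normal(0.0, 0.05)
    return np.array(states)


def fit_gain(states, actions):
    """Least-squares fit of a linear policy a = gain * s (no intercept)."""
    return float(states @ actions / (states @ states))


def dagger(iterations=5, seed=0):
    rng = np.random.default_rng(seed)
    # Iteration 0: plain behavior cloning on states the expert visits.
    states = rollout(expert_action, rng)
    actions = expert_action(states)
    gain = fit_gain(states, actions)
    for _ in range(iterations):
        # Roll out the current learner to see which states it actually reaches.
        new_states = rollout(lambda s: gain * s, rng)
        # Ask the expert what it would have done in those states.
        new_actions = expert_action(new_states)
        # Aggregate with all earlier data and retrain.
        states = np.concatenate([states, new_states])
        actions = np.concatenate([actions, new_actions])
        gain = fit_gain(states, actions)
    return gain


if __name__ == "__main__":
    print("learned gain:", dagger())
```

The key design point is that the training set grows with states drawn from the learner's own rollouts, so the policy is corrected exactly where its errors take it, rather than only on the expert's state distribution.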
