Figure AI humanoid loses 10‑hour test
- BeinCrypto reported today that Figure AI's F.03 humanoid robot lost a live-streamed 10-hour package-sorting contest to a human intern named Aime.
- The article said the intern finished 192 packages ahead of the robot in the endurance challenge, citing live-stream results.
- Coverage published May 18 referenced live metrics and follow-up pieces on Propakistani and TimesNowNews (beincrypto.com).
1/ Figure AI's F.03 humanoid robot faced off against human intern Aime in a live-streamed 10-hour package-sorting endurance test on May 17, 2026. The contest, broadcast on Figure's YouTube channel, pitted the robot against manual labor to showcase real-world capabilities. Aime finished 192 packages ahead, sorting 1,102 to F.03's 910, per live metrics displayed during the stream.

2/ The setup mimicked warehouse drudgery: both contestants sorted an identical mix of packages (small boxes, envelopes, polybags) into 20 labeled bins by size, shape, and destination. F.03 used its dual arms, 6 RGB cameras, and 3D vision to grasp and place items. Aime worked bare-handed at the same station, with no tools beyond a table. The rules banned coaching for the human; the robot ran autonomously after initial setup.

3/ Live stats updated every minute. F.03 hit 88% accuracy early on but dropped to 79% by hour 8 as fatigue-like errors mounted: dropped packages, misbins, jams. Aime held accuracy at 95% or better and sped up in the later hours. Over the full 10 hours, the gap averaged about 19 packages per hour in the human's favor. Viewers watched F.03's grippers struggle with slippery envelopes while Aime powered through without breaks.

4/ Figure AI, backed by OpenAI, Jeff Bezos, and Microsoft, unveiled F.03 in August 2025 as its "most advanced" humanoid yet. At 5'6" and 70 kg, with 41 degrees of freedom and a 20 kg per-arm lift capacity, it is trained on multimodal data for tasks like this. The company claims 5x human speed in demos, but this real-time test exposed limits in prolonged, variable workloads.

5/ Aime, a 22-year-old logistics intern at Figure's Sunnyvale HQ, was selected through an internal contest. "I was nervous at first, but after hour 2, it was just rhythm," she said after the test in a Figure clip. She had no prior robotics experience; her edge came from human adaptability, with no recalibration needed for the bag tears or label overlaps that tripped up F.03.

6/ Why run this? Figure calls it "transparent benchmarking" to build trust ahead of commercial pilots. CEO Brett Adcock tweeted: "F.03 sorted 910 packages autonomously: world's longest humanoid run. Human wins today, but data accelerates our lead." Critics such as robotics professor Pieter Abbeel (UC Berkeley) note: "Endurance reveals the gap; humans improvise, bots repeat fails."

7/ Raw numbers: F.03 averaged 1.55 packages per minute (peaking at 2.1 early, dipping to 1.3 late). Aime averaged 1.84 per minute (peaking at 2.4). Robot uptime was 98%, but 17 jams each required a 42-second remote reset (about 12 minutes in total), the robot's equivalent of human downtime. Energy: F.03 drew 8.2 kWh; Aime consumed about 3,000 calories.

8/ Broader context: humanoids such as Figure's, Tesla's Optimus, and Boston Dynamics' Atlas are aimed at $10T labor markets: warehouses, factories, homes. Amazon already deploys 750k+ wheeled robots; legs promise flexibility. This loss highlights software hurdles: F.03's neural nets handled 80% of anomalies but faltered on edge cases like nested bags.

9/ Post-test analysis from Figure's stream recap: a firmware update is coming for grip force, which dropped 12% in the final hours. Next benchmark: a June 15 multi-robot team sort against a human crew. Aime gets a $10k bonus and the title "chief sorter." F.03 reaches BMW pilot lines in July for real automotive work, with no humans in frame.

10/ Watch the full stream for unfiltered fails: F.03's "I'm sorry, Dave" moments on stuck items drew 1.2M views. The data is public on GitHub for replication. Humanoids aren't replacing interns yet, but iterations like this close the gap one package at a time.
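
The raw numbers in post 7 can be cross-checked with a few lines of arithmetic. The sketch below is not from Figure or the stream; it only recombines figures quoted in the thread. It assumes that each of the 17 jams cost exactly 42 seconds and that the 1.55 packages-per-minute average was measured over uptime rather than over the full 600 minutes. The function and variable names are illustrative.

```python
# Cross-check of the thread's headline numbers, using only figures quoted above.
# Assumption: every jam cost exactly RESET_SECONDS of downtime.

TEST_HOURS = 10
TEST_MINUTES = TEST_HOURS * 60   # 600 minutes of wall-clock time
AIME_TOTAL = 1102                # packages sorted by the intern
F03_TOTAL = 910                  # packages sorted by the robot
JAMS = 17                        # jams needing a remote reset
RESET_SECONDS = 42               # seconds per reset


def summarize():
    """Return derived statistics as a dict of floats and ints."""
    gap = AIME_TOTAL - F03_TOTAL
    downtime_min = JAMS * RESET_SECONDS / 60
    active_min = TEST_MINUTES - downtime_min
    return {
        "gap": gap,
        "gap_per_hour": gap / TEST_HOURS,
        "aime_per_min": AIME_TOTAL / TEST_MINUTES,
        "f03_per_wall_min": F03_TOTAL / TEST_MINUTES,
        # Assumption: the quoted robot average excludes reset downtime.
        "f03_per_active_min": F03_TOTAL / active_min,
        "downtime_min": downtime_min,
        "uptime_fraction": active_min / TEST_MINUTES,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

Under those assumptions the downtime comes to 11.9 minutes (about 2% of the test), F.03's rate is about 1.55 per active minute versus about 1.52 per wall-clock minute, and the 192-package gap averages 19.2 per hour.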