Gemini Robotics‑ER 1.6 drops
DeepMind released Gemini Robotics‑ER 1.6 with measurable gains in spatial reasoning and instrument reading for robotic tasks. (x.com) The company reports instrument/gauge reading accuracy around 93% and improved multi‑view task verification that supports more autonomous inspection and monitoring workflows. (x.com)
Google DeepMind has released Gemini Robotics‑ER 1.6, a new model for robots that can reason about physical spaces and read industrial instruments. (deepmind.google)
Robotics researchers call that “embodied reasoning”: using camera views, object positions, and task goals to decide what to do in the real world, not just on a screen. DeepMind said the April 14 release improves spatial reasoning, multi-view understanding, task planning, and success detection. (deepmind.google)
The model adds a new instrument-reading feature for gauges and sight glasses, which are the dials and fluid windows used in industrial equipment. DeepMind said the feature grew out of work with Boston Dynamics, whose Spot robot is used for inspection rounds. (deepmind.google)
DeepMind’s published benchmark says Gemini Robotics‑ER 1.6 reached 93 percent on instrument reading with “agentic vision” enabled, versus 86 percent for ER 1.6 without it, 67 percent for Gemini 3.0 Flash, and 23 percent for Gemini Robotics‑ER 1.5. The company said other evaluations also showed gains in pointing, counting, and success detection. (rockingrobots.com)
The practical use case is inspection work: a robot has to look at a pressure gauge, decide whether the reading is normal, and confirm from more than one camera angle that a task is done. DeepMind said ER 1.6 improves “multi-view” verification, which is the ability to judge the same task from several images instead of one. (deepmind.google)
That is the split inside Google’s robotics stack. Gemini Robotics‑ER is the reasoning layer that interprets scenes and plans steps, while a separate vision-language-action model handles the robot’s actual motions. (deepmind.google)
DeepMind introduced Gemini Robotics and Gemini Robotics‑ER on March 12, 2025, saying the goal was to move Gemini’s multimodal reasoning from digital tasks into machines that can act in the physical world. The company said ER was built for roboticists who want to plug Gemini’s spatial reasoning into their own control software. (deepmind.google)
The new version is now available in preview through the Gemini application programming interface and Google AI Studio. Google’s developer documentation describes it as a vision-language model that can interpret visual data, reason about object relationships, and break natural-language commands into subtasks. (ai.google.dev)
Google also said ER 1.6 is its safest robotics model so far on adversarial spatial-reasoning tests, while warning developers that generative models can still make mistakes and that physical robots can cause damage. For now, the release looks aimed less at household helpers than at factory, utility, and inspection jobs where reading a dial correctly matters more than chatting. (blog.google)
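For developers wondering what the inspection workflow described above might look like in their own control software, here is a minimal Python sketch. It is an illustration under stated assumptions, not DeepMind's or Google's code: the ViewReport type, the verify_inspection function, the pressure range, and the two-view agreement threshold are all hypothetical, and the per-camera results are assumed to come from whatever reasoning-model call the developer already makes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ViewReport:
    """Hypothetical per-camera result, assumed to come from a reasoning-model call."""
    camera: str
    gauge_reading: Optional[float]  # None if the gauge is not legible in this view
    task_done: bool                 # the model's success judgment for this view


def verify_inspection(
    reports: List[ViewReport],
    low: float,
    high: float,
    min_agreeing_views: int = 2,  # assumed threshold, not a documented default
) -> dict:
    """Combine several camera views into one inspection verdict."""
    # Keep only the views in which the gauge could actually be read.
    readings = [r.gauge_reading for r in reports if r.gauge_reading is not None]
    # The gauge is "ok" only if at least one view read it and every reading is in range.
    gauge_ok = bool(readings) and all(low <= x <= high for x in readings)
    # The task counts as confirmed only if enough independent views agree it is done.
    done_votes = sum(1 for r in reports if r.task_done)
    task_confirmed = done_votes >= min_agreeing_views
    return {"gauge_ok": gauge_ok, "task_confirmed": task_confirmed}


if __name__ == "__main__":
    reports = [
        ViewReport("front", 42.0, True),
        ViewReport("side", None, True),
        ViewReport("top", 41.5, False),
    ]
    print(verify_inspection(reports, low=30.0, high=60.0))
```

Requiring agreement from more than one view is a design choice in this sketch: it trades some false alarms for fewer missed failures, which is usually the safer side to err on when a physical robot is involved.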