Gemini Robotics‑ER 1.6

- Google DeepMind released Gemini Robotics‑ER 1.6, a Gemini variant tuned for robotics reasoning and multimodal perception.
- The model improves spatial reasoning, multi‑view success detection, and instrument reading, with Google reporting 93% accuracy on instrument reading when agentic vision is used.
- Boston Dynamics integrations let Spot accept plain‑English inspection commands, and the model is available in preview via the Gemini API and Google AI Studio.

Robots still need a “brain” that can read a scene, plan a task, and decide whether the job is done. Google DeepMind said on April 14 it released Gemini Robotics‑ER 1.6 to do that work for machines in the physical world. (deepmind.google)

Gemini Robotics‑ER 1.6 is a vision‑language model, which means it takes in words plus images, video, and audio, then returns text instructions or reasoning a robot system can use. Google’s model card says this version is based on Gemini 3.0 Flash and supports a context window of up to 128,000 tokens. (deepmind.google)

Google said the update improves three jobs that often trip up robots: spatial reasoning, checking whether a task succeeded from multiple camera views, and reading instruments such as gauges and sight glasses. In Google’s reported tests, “agentic vision” accuracy reached 93% on instrument reading and 90.5% on multi‑view success detection. (deepmind.google)

That work sits one layer above the robot’s motors. Google describes Robotics‑ER as a high‑level reasoning model that can break a job into steps, decide when to retry, and call other systems, including a vision‑language‑action model that actually drives the robot’s movements. (deepmind.google; developers.googleblog.com)

Google and Boston Dynamics used the new model on Spot, the four‑legged inspection robot, to let operators issue plain‑English requests instead of writing custom scripts. Google’s blog example asks Spot to inspect a valve, read the gauge, and report whether the reading is in range. (deepmind.google)

Google also framed the release as a developer product, not just a lab demo. The Gemini API changelog says `gemini-robotics-er-1.6-preview` launched on April 14, and Google’s robotics documentation says users on 1.5 can switch by changing the model name in the API call. (ai.google.dev; ai.google.dev)

The company split its robotics stack into two products last year. Gemini Robotics is the model that can output physical actions for direct control, while Gemini Robotics‑ER is the reasoning model for spatial logic, task planning, and progress estimation that developers can pair with their own robot software. (deepmind.google; deepmind.google)

Google said 1.6 is its safest robotics model so far on adversarial spatial‑reasoning tests, but the company’s own model card also lists limits. It says performance can vary with camera quality, scene complexity, and task setup, and that the model is intended for high‑level reasoning rather than direct low‑level control of hardware. (blog.google; deepmind.google)

For now, the release puts Google’s robotics push into the same distribution channel as its other Gemini models: preview access in the Gemini API and Google AI Studio. The pitch is straightforward: tell a robot what to inspect in plain English, and let the model handle more of the seeing, checking, and planning. (ai.google.dev; deepmind.google)
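
For developers, the model swap mentioned above (changing only the model name in the API call) might look roughly like the sketch below. It only builds a request and sends nothing. The `gemini-robotics-er-1.6-preview` identifier comes from the changelog cited in this article; the 1.5 identifier, the REST endpoint shape, the `build_request` helper, and the prompt wording are assumptions made for illustration, and a real inspection request would also attach camera images.

```python
import json
from typing import Tuple

# Assumption: the public Gemini API REST base URL and its generateContent method.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"

# The 1.6 name is quoted from the changelog; the 1.5 name is assumed by analogy.
OLD_MODEL = "gemini-robotics-er-1.5-preview"
NEW_MODEL = "gemini-robotics-er-1.6-preview"


def build_request(model: str, prompt: str) -> Tuple[str, str]:
    """Hypothetical helper: return (url, JSON body) for a text-only call.

    Nothing is sent over the network; an API key header would be needed
    before actually posting this request.
    """
    url = f"{API_BASE}/models/{model}:generateContent"
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]})
    return url, body


# Illustrative prompt paraphrasing the Spot inspection example.
prompt = "Inspect the valve, read the gauge, and report whether the reading is in range."

old_url, old_body = build_request(OLD_MODEL, prompt)
new_url, new_body = build_request(NEW_MODEL, prompt)

# Only the model segment of the URL differs; the request body is unchanged.
assert old_body == new_body
assert old_url != new_url
print(new_url)
```

Keeping the model name in a single constant means an upgrade touches one line, which matches the migration path the documentation describes.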
