Gemini Robotics‑ER 1.6 highlighted for perception
- Google DeepMind said on April 14, 2026, that Gemini Robotics-ER 1.6 had become available via the Gemini API and Google AI Studio.
- Google said the model adds instrument reading and improved multi-view spatial reasoning, and cited Boston Dynamics as a partner in that use case.
- Developers can access Gemini Robotics-ER 1.6 through Google AI Studio and the Gemini API, with a Google Colab linked by DeepMind.
Google DeepMind’s April 14 release of Gemini Robotics-ER 1.6 gave robotics developers a new reasoning model aimed at perception, planning and task verification in physical environments. The company described the system as a high-level model for visual and spatial understanding, task planning and success detection, and said it is available through the Gemini API and Google AI Studio. Google also said the model can call external tools, including search, vision-language-action models and other user-defined functions.

### Why did Gemini Robotics-ER 1.6 show up in perception discussions this week?

Posts on X on May 15 and May 16 described Gemini Robotics-ER 1.6 as a perception-and-planning layer rather than a standalone sensor stack, comparing it with radar, lidar and camera-based systems in specific robot and autonomy tasks. The social discussion tied the model to problems such as unstructured bin picking and operation in difficult visual conditions, but the official Google materials frame the product as an embodied reasoning model, not as a replacement for physical sensors. (deepmind.google)

Google DeepMind said Gemini Robotics-ER 1.6 “acts as the high-level reasoning model for a robot” and is designed to work with other systems, including vision-language-action models. That description is the clearest explanation for why users on X placed it alongside companies such as Arbe, Ouster and Mobileye: the comparison is about where reasoning software sits in a robotics stack, not about a like-for-like hardware matchup. That reading is an inference from Google’s product description and the companies’ own sensor positioning. (deepmind.google)

### What, exactly, did Google say the model does?
Google DeepMind said the model improves “spatial reasoning and multi-view understanding” and specializes in “visual and spatial understanding, task planning and success detection.” The company said benchmark gains over Gemini Robotics-ER 1.5 and Gemini 3.0 Flash were concentrated in pointing, counting and success detection, and that a new feature called instrument reading lets robots interpret gauges and sight glasses. (deepmind.google)

The April 2026 model card said Gemini Robotics-ER 1.6 is a vision-language model based on Gemini 3.0 Flash, with a context window of up to 128,000 tokens and text output of up to 64,000 tokens. The same card said Google requires users to avoid deploying the robotics models in safety-critical applications such as healthcare or transportation, or in other settings where failure could lead to injury, death or property damage. (storage.googleapis.com)

### Why were Mech-Mind examples part of the conversation?

Mech-Mind markets 3D vision systems for industrial robot tasks that include piece picking, machine tending and bin picking. The company says its systems can detect challenging parts of different sizes and shapes, pick from inventory bins, and handle randomly piled materials with dark or reflective surfaces and complex structures. (mech-mind.com)

Those examples overlap with the kinds of scene-understanding and task-selection problems that a reasoning model could sit on top of. Google’s own description of Gemini Robotics-ER 1.6 emphasizes understanding scenes, deciding intermediate steps and determining whether a task succeeded, which maps to planning and verification around industrial vision workflows rather than to raw depth capture itself. That comparison is an inference from the two companies’ product descriptions. (deepmind.google)

### Why were Arbe and Ouster mentioned in harsh-weather comparisons?
Arbe says its HD radar is built for “all weather perception” and says its 48x48 channel array is designed to maintain high-resolution separation accuracy in poor visibility. Ouster says its lidar systems are designed for reliability and robustness, including IP68/69K protection, and has published materials describing lidar performance in all-weather and harsh-environment use cases. (arberobotics.com)

Those claims describe sensor-layer performance under rain, dust, darkness or other environmental constraints. Google’s Gemini Robotics-ER 1.6 materials, by contrast, describe a reasoning model that interprets visual inputs, plans actions and checks outcomes. In practice, that means the products discussed in the X thread address different parts of a robotics system, even when users compare them in the same demo or deployment conversation. (deepmind.google)

### Where does Mobileye fit in this comparison?

Mobileye describes its ADAS and autonomous-driving products as vision-centric systems built on cameras, compute, mapping and software, with some configurations also using radar. The company says its platforms span driver assistance through more advanced automated-driving functions, and its Surround ADAS product combines cameras and radars on a single control unit. (mobileye.com)

That makes Mobileye relevant to the discussion as an example of a mature perception-and-decision stack in a different market. Google’s robotics release does not claim to replicate Mobileye’s automotive system; it claims to provide a higher-level embodied reasoning layer for robots that can use other tools and models. (deepmind.google)

### What can developers do next?

Google said on April 14 that Gemini Robotics-ER 1.6 is already available in the Gemini API and Google AI Studio, and its developer documentation says users can upgrade from the prior preview by changing the model name to “gemini-robotics-er-1.6-preview.” DeepMind also linked a Colab with configuration and prompting examples for embodied reasoning tasks. (deepmind.google)
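
The model-name change can be illustrated with a minimal Python sketch that builds, but does not send, a single image-plus-prompt request. It is not taken from Google's Colab. It assumes the public Gemini API REST convention (`v1beta`, `generateContent`, an `x-goog-api-key` header), and the helper names `build_request` and `send`, the prompt and the image bytes are hypothetical. Only the model identifier comes from the documentation quoted above.

```python
import base64
import json
import urllib.request

# Model identifier quoted from Google's developer documentation in the article.
MODEL = "gemini-robotics-er-1.6-preview"

# Assumption: the public Gemini API REST root and generateContent method.
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_request(api_key: str, prompt: str, image_bytes: bytes,
                  model: str = MODEL) -> urllib.request.Request:
    """Build (but do not send) a generateContent request with one JPEG image."""
    body = {
        "contents": [{
            "parts": [
                # Image first, then the instruction that refers to it.
                {"inline_data": {
                    "mime_type": "image/jpeg",
                    "data": base64.b64encode(image_bytes).decode("ascii"),
                }},
                {"text": prompt},
            ]
        }]
    }
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (needs network and a valid key)."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Under these assumptions, upgrading from an earlier preview amounts to passing a different `model` string; the rest of the request is unchanged. Keeping request construction separate from `send` lets the payload be inspected or logged before any call reaches the API.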