Google's Gemini Gets Major Upgrade on Pixel

Google's latest Pixel drop significantly expands Gemini's on-device capabilities. The AI can now handle complex, multi-step tasks like ordering groceries and can also break down fashion items from an image using Circle to Search, showcasing Google's push into practical, multimodal AI.

This update is powered by a strategic use of different Gemini models; Gemini Nano handles on-device tasks like notification summaries for speed and privacy, while the more powerful Gemini 3 model underpins the advanced reasoning for complex visual searches. This dual approach represents a key system design choice, balancing on-device efficiency with the raw power of cloud-based models for more demanding computations.

The "Circle to Search" enhancement for fashion leverages Gemini 3's agentic planning and reasoning capabilities. Instead of a simple image match, the AI performs a multi-step process: it identifies and isolates individual items in the image, runs several searches simultaneously, and then compiles the findings into a comprehensive breakdown. This feature is available on the Pixel 10 series and Samsung's Galaxy S26 devices.

The ability for Gemini to handle multi-step tasks, like ordering groceries, is rolling out as a beta feature on the Pixel 10 and 10 Pro. It operates by launching the relevant app in a secure window and automating the taps, scrolls, and text inputs required to complete the task, all while providing progress notifications to the user.

This move intensifies the philosophical battle between Google's cloud-centric AI and Apple's privacy-first "Apple Intelligence" architecture. While Apple emphasizes on-device processing for most tasks to keep user data localized, Google's strategy is a hybrid one, using on-device models for low-latency needs while leveraging its powerful cloud infrastructure for more complex, agentic behaviors that require vast computational resources.

For aspiring software engineers, this signals Google's heavy investment in on-device machine learning and agentic AI. Job postings for related roles at Google emphasize skills in C++, ML infrastructure, and experience designing complex ML inference systems for resource-constrained environments like Android, often mentioning frameworks like TensorFlow Lite and the optimization of large language models (LLMs).

The push into multimodal, agentic AI is also reflected in Google's recent acquisitions, such as the purchase of Common Sense Machines, a startup specializing in converting 2D images into 3D models. This move is aimed at building robust "world models," which are foundational for more advanced embodied AI and spatial understanding, indicating a long-term strategy that extends far beyond conversational assistants.
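
For readers curious how the multi-step Circle to Search flow described above (isolate items, search for each at the same time, compile a breakdown) might look in code, here is a minimal Python sketch. It is an illustration only, not Google's implementation: the names `detect_items`, `search_item`, `break_down`, and `run_breakdown` are hypothetical, and the detection and search steps are stubbed with fixed data instead of calling any real model or service.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Item:
    label: str  # e.g. "jacket"


async def detect_items(image_id: str) -> list[Item]:
    # Hypothetical stand-in for the "identify and isolate items" step.
    # A real system would run a vision model; this returns fixed data.
    return [Item("jacket"), Item("sneakers"), Item("sunglasses")]


async def search_item(item: Item) -> dict:
    # Hypothetical stand-in for one product search.
    await asyncio.sleep(0.01)  # simulate network latency
    return {"item": item.label, "matches": [f"{item.label}-result-1"]}


async def break_down(image_id: str) -> list[dict]:
    items = await detect_items(image_id)
    # Fan out: start one search per item and wait for all of them together.
    results = await asyncio.gather(*(search_item(item) for item in items))
    # Compile: gather returns results in the same order as the inputs.
    return list(results)


def run_breakdown(image_id: str) -> list[dict]:
    # Synchronous entry point for callers that are not already async.
    return asyncio.run(break_down(image_id))
```

Calling `run_breakdown("photo-123")` returns one entry per detected item. Because the searches run concurrently, total latency in this sketch is roughly that of the slowest single search, not the sum of all of them.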

