Google’s Gemma 4 on iPhone

Google’s Gemma 4 can now run locally on an iPhone, promising offline, free, and more private language‑model use without a cloud roundtrip (x.com). The change matters because users can get advanced AI features without sending data to servers, and it signals that large models are moving toward on-device deployment (x.com) (x.com).

A language model is usually like a call center: your phone sends a question to a remote data center, the data center writes the answer, and the answer comes back over the internet. Google is now showing that Gemma 4, its newest open model family, can do that work on the phone itself, including on iPhone through its Google AI Edge Gallery app for iOS. (developers.googleblog.com)

Running “on-device” means the model’s weights live on the handset, not on a company server. Google’s mobile deployment docs list iPhone support through Google AI Edge Gallery and the MediaPipe large language model inference stack, which is the software layer that lets apps execute the model locally. (ai.google.dev)

That only works if the model is small enough and efficient enough to fit into phone memory and use phone chips without cooking the battery. Gemma was built as Google’s open model line for lighter hardware, and the new Gemma 4 release was pitched by Google as designed for “edge” devices, meaning phones, laptops, and other local hardware instead of giant cloud clusters. (blog.google) (developers.googleblog.com)

Gemma 4 comes in several sizes because a phone and a server do not have the same room to work with. Google’s model card lists E2B, E4B, 26B A4B, and 31B variants, with the smaller versions aimed at constrained devices and the larger ones aimed at stronger hardware. (ai.google.dev)

Google also packed more into this generation than simple text chat. The official Gemma 4 docs say the models can take text and image input, support more than 140 languages, and offer context windows up to 256,000 tokens, which is the chunk of text the model can keep in working memory at once. (ai.google.dev)

On iPhone, Google is not slipping Gemma 4 into Apple’s built-in assistant. It is exposing the model through Google AI Edge Gallery, an app Google describes as a showcase for fully on-device AI experiences on iOS and Android, including chat, prompts, and function-calling demos. (developers.googleblog.com)

Google added a feature it calls Agent Skills, which is a way for the model to carry out a chain of steps instead of answering with one block of text. In Google’s April 2 post, the company said Agent Skills can run multi-step autonomous workflows entirely on-device, which is the clearest sign that this is moving beyond toy offline chat. (developers.googleblog.com)

The privacy angle is straightforward: if the request never leaves the phone, there is no cloud roundtrip carrying your prompt to a remote server. Google’s iOS announcement for AI Edge Gallery explicitly framed the app around high-performance on-device use cases inside the iPhone ecosystem rather than server-side inference. (developers.googleblog.com)

The cost angle is just as important. Gemma 4 is released under the Apache 2.0 license, so developers can download the weights and build with them without paying per-query cloud fees to Google for every response. (developers.googleblog.com)

This does not mean every iPhone suddenly runs the biggest model at full speed. Google’s own materials split Gemma 4 into multiple sizes and deployment paths because local artificial intelligence still trades off speed, memory, and battery life against model quality. (ai.google.dev) (ai.google.dev)

What changed this week is not just one app update. Google is treating the phone as a real place to run modern models, and it is doing it with an open model family that can live on Apple hardware, Android hardware, laptops, and other edge devices without needing a permanent connection to Google’s servers. (developers.googleblog.com)
