Gemma 4 Goes Local
NVIDIA and Google have enabled the Gemma 4 multimodal foundation model to run across the full stack — from Blackwell data‑center GPUs down to consumer RTX and edge devices — making agentic, multimodal AI deployable locally rather than only in the cloud. That shift cuts latency and privacy risk for real‑time robotics tasks like perception, manipulation, and on‑device reasoning, unlocking more reliable robot behavior when networks are unreliable. (developer.nvidia.com)
Google released Gemma 4 as a family of four open-source models that run from phone-sized hardware up to workstation-grade servers, and the release is available under the permissive Apache 2.0 license. (blog.google) NVIDIA says it has optimized those models to run across its full stack — from Blackwell data‑center GPUs to consumer RTX cards and Jetson edge modules — so the same Gemma 4 family can be deployed locally on everything from an embedded module to a GPU workstation. (developer.nvidia.com) Technically, Gemma 4 is multimodal, which means a single model can accept and reason over text, images, audio and video inputs (so one model can handle both camera perception and speech); NVIDIA and Google both highlight that capability for edge and on‑device scenarios. (developer.nvidia.com) The family includes a 31‑billion‑parameter dense model and a 26‑billion‑parameter “Mixture‑of‑Experts” model plus two smaller “effective” models for edge use (E4B and E2B); the smaller models are sized and optimized to run fully offline on phones, Raspberry Pi, and Jetson Orin Nano, while the larger variants target workstations and data centers. (developer.nvidia.com, blog.google) NVIDIA lists direct developer tooling and runtimes for local use — vLLM, Ollama, llama.cpp and Unsloth for inference, plus NeMo Automodel and NIM for fine‑tuning and deployment — and points to integrations like OpenClaw for building persistent local agents on RTX and DGX Spark systems; Google also documents on‑device agent skills and an AICore developer preview for Android. (developer.nvidia.com, blogs.nvidia.com, developers.googleblog.com)