NVIDIA and Google Cloud 'AI factories'
- NVIDIA and Google Cloud announced a collaboration to advance agentic and physical AI 'factories'.
- The partnership aims to make infrastructure for large-scale agentic workflows and simulations easier to acquire.
- Vendors are packaging pre-validated stacks, prompting choices about which AI capabilities belong inside regulated manufacturing versus enterprise marketing (blogs.nvidia.com).
NVIDIA and Google Cloud said on April 22 they are expanding their partnership to sell “AI factories” for software agents and industrial simulations through Google Cloud. (blogs.nvidia.com)

The hardware piece is Google Cloud’s new A5X bare-metal instance, built around NVIDIA Vera Rubin NVL72 systems. NVIDIA said single-site clusters can scale to 80,000 Rubin graphics processors, and multisite clusters to 960,000. (blogs.nvidia.com)

An AI factory is a data center tuned to generate tokens, predictions, or robot decisions at high volume, the way a power plant is tuned to generate electricity. NVIDIA said A5X is designed to deliver up to 10 times lower inference cost per token and 10 times higher token throughput per megawatt than the prior generation. (blogs.nvidia.com)

The software piece is a stack for “agentic” AI, meaning systems that can plan and execute multi-step work with tools and data instead of answering one prompt at a time. Google launched Gemini Enterprise Agent Platform on April 22 as a service to build, govern, and optimize those agents. (cloud.google.com)

Google said that platform folds together model selection, model building, agent building, orchestration, security, and DevOps features that had been spread across Vertex AI. NVIDIA said the same platform will support its Nemotron open models and NeMo framework on Google Cloud. (cloud.google.com) (blogs.nvidia.com)

The “physical AI” part is about training models for machines that move in the real world, such as robots, warehouse systems, and digital twins of factories. NVIDIA said the Google Cloud tie-up is meant to push those systems from lab demos into production in manufacturing, drug discovery, energy, and robotics. (blogs.nvidia.com) (roic.ai)

Google is also extending Gemini to on-premises and regulated settings through Google Distributed Cloud, where customers keep systems in their own facilities or in disconnected environments.
Google said on April 22 that Gemini Flash models are now available in preview for Google Distributed Cloud connected customers on NVIDIA Blackwell and Blackwell Ultra platforms. (cloud.google.com)

That matters for manufacturers, governments, and other buyers that cannot move sensitive data into a public cloud region. Google said its Distributed Cloud offering includes air-gapped deployments for maximum security and connected deployments with Google-managed software updates on customer hardware. (cloud.google.com)

The partnership also shows how cloud vendors are packaging pre-validated stacks instead of asking customers to assemble chips, networking, models, security controls, and orchestration tools on their own. Google’s April 22 event in Las Vegas centered on what it called the “Agentic Enterprise,” with new infrastructure, agent tools, and security products released together. (cloud.google.com)

For buyers, the question is becoming less about whether they can rent graphics processors and more about where these systems should run and who governs them. NVIDIA and Google are offering one answer: a bundled path for both office agents and factory-floor AI, sold through the same cloud relationship. (blogs.nvidia.com) (cloud.google.com)
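For readers who want to put the A5X scale and efficiency figures quoted earlier into rough numbers, the arithmetic can be sketched as follows. This is an illustration only: the baseline cost and throughput values are hypothetical placeholders, not published figures, and the 10x factors are NVIDIA's stated "up to" ceilings, so the outputs are best cases and not forecasts.

```python
# Figures quoted in the article (NVIDIA's stated claims for A5X).
SINGLE_SITE_GPUS = 80_000       # max Rubin GPUs in a single-site cluster
MULTISITE_GPUS = 960_000        # max Rubin GPUs in a multisite cluster
COST_REDUCTION_FACTOR = 10      # "up to 10 times lower inference cost per token"
THROUGHPUT_GAIN_FACTOR = 10     # "up to 10 times higher token throughput per megawatt"

# Hypothetical baselines for the prior generation (assumed, illustration only).
BASELINE_COST_PER_MILLION_TOKENS_USD = 2.00
BASELINE_TOKENS_PER_SEC_PER_MW = 1_000_000


def best_case_cost(baseline_cost: float, factor: int = COST_REDUCTION_FACTOR) -> float:
    """Cost per unit of tokens if the full 'up to' reduction is realized."""
    return baseline_cost / factor


def best_case_throughput(baseline: int, factor: int = THROUGHPUT_GAIN_FACTOR) -> int:
    """Tokens per second per megawatt if the full 'up to' gain is realized."""
    return baseline * factor


def multisite_to_single_site_ratio(multisite: int = MULTISITE_GPUS,
                                   single_site: int = SINGLE_SITE_GPUS) -> int:
    """How many maximum-size single sites the multisite ceiling is equivalent to."""
    return multisite // single_site


if __name__ == "__main__":
    print("Best-case cost per million tokens (USD):",
          best_case_cost(BASELINE_COST_PER_MILLION_TOKENS_USD))
    print("Best-case tokens/sec per MW:",
          best_case_throughput(BASELINE_TOKENS_PER_SEC_PER_MW))
    print("Multisite ceiling in single-site equivalents:",
          multisite_to_single_site_ratio())
```

The ratio is only a way to compare the two quoted ceilings; the article does not say how many sites a multisite cluster actually spans.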