Nvidia + Google Cloud push
- Nvidia and Google Cloud announced an expanded collaboration that lets firms build "AI factories" on Google infrastructure.
- The partnership spans A4, A4X, A4X Max and fractional G4 VMs; Nvidia says future A5X systems are planned to scale to nearly 1 million Rubin GPUs.
- The announcement deepens hyperscaler-curated compute classes and raises questions about workload portability and vendor concentration.
Nvidia and Google Cloud said on April 22 they are expanding their partnership so customers can build large artificial intelligence computing clusters on Google’s cloud. (blogs.nvidia.com)

The companies said the lineup includes Google Cloud A4, A4X and A4X Max virtual machines, plus fractional G4 instances that split a graphics processor into smaller rented slices. Nvidia also said future A5X systems based on its Vera Rubin platform are planned to scale to “nearly 1 million” Rubin graphics processors. (blogs.nvidia.com) (cloud.google.com 1) (cloud.google.com 2)

In cloud computing, those machine names are packaged server types: A4 uses Nvidia Blackwell B200 chips, A4X uses GB200 NVL72 systems, and A4X Max uses GB300 NVL72 systems with 72 Blackwell Ultra graphics processors and 36 Grace central processors in one rack-scale unit. Google said A4X Max is now in production and built to scale to tens of thousands of graphics processors on its Jupiter network. (cloud.google.com 1) (cloud.google.com 2)

Google and Nvidia are also tying the hardware more tightly to software customers already use, including Google Kubernetes Engine, Vertex AI Training, Vertex AI Model Garden, Nvidia Dynamo, NeMo and Nemotron. The April 22 announcement added Gemini on Google Distributed Cloud and confidential computing support for Nvidia Blackwell systems. (blogs.nvidia.com) (cloud.google.com)

The push comes as cloud providers are selling more pre-built artificial intelligence “factories,” their term for data center clusters tuned for training and serving models at scale. Instead of renting generic servers, customers increasingly buy a bundled stack of chips, networking, storage and software from one cloud. (blogs.nvidia.com) (cloud.google.com)

That model gives buyers faster setup and tested performance, but it also gives the cloud provider more control over which chip generations, machine shapes and management tools are available. Google’s recent rollout of A4X Max and fractional G4 instances shows how those choices are becoming product categories curated by hyperscalers rather than standard commodity servers. (cloud.google.com 1) (cloud.google.com 2) (cloud.google.com 3)

Google has been widening its Nvidia menu for more than a year. In 2025 it made A4 generally available, introduced A4X in preview, and launched G4 for lower-latency artificial intelligence, simulation and graphics workloads; in 2026 it moved A4X Max into production and previewed smaller fractional G4 rentals. (cloud.google.com) (cloud.google.com) (cloud.google.com) (cloud.google.com)

Nvidia framed the latest step around “agentic” and “physical” artificial intelligence, shorthand for software agents that take actions and models used in robots, factories and industrial systems. Google has been pitching the same infrastructure to startups and enterprise customers building reasoning models and multimodal systems that mix text, images, video and sensor data. (blogs.nvidia.com) (cloud.google.com) (siliconangle.com)

The immediate result is not a consumer product launch but a deeper supply agreement between the biggest maker of artificial intelligence chips and one of the largest cloud landlords. For companies buying compute in 2026, that means more Nvidia capacity on Google Cloud, and more of that capacity arriving as Google-defined machine classes rather than interchangeable infrastructure. (blogs.nvidia.com) (cloud.google.com)