Nvidia–Google AI Infra

- Nvidia and Google Cloud expanded collaboration around Blackwell GPUs and infrastructure for agentic and physical AI 'factories'.
- Announcements referenced A4/A4X VMs, A4X Max, fractional G4 VMs, and Vera Rubin-powered A5X instances with large scale ambitions.
- The updates illustrate a stratified cloud stack from giant training clusters to enterprise-friendly serving options (blogs.nvidia.com).

Nvidia and Google Cloud widened their AI infrastructure partnership on April 22, adding new Blackwell systems for everything from giant training clusters to smaller inference jobs. (blogs.nvidia.com)

Cloud providers rent out graphics processors, or GPUs, as virtual machines, and the newest systems are aimed at building and running large artificial intelligence models faster. Google said its lineup now spans A4 and A4X machines for large-scale training, A4X Max for bigger multimodal reasoning workloads, and G4 machines for lower-latency serving and visual computing. (cloud.google.com 1) (cloud.google.com 2) (cloud.google.com 3)

The newest top-end system in the announcements was A5X, which Nvidia said will use Vera Rubin chips and scale to nearly 1 million GPUs. Nvidia also said customers will be able to use confidential computing on Blackwell systems and run Gemini on Google Distributed Cloud alongside Nvidia software such as NeMo and Nemotron. (blogs.nvidia.com)

A4 VMs are based on Nvidia HGX B200 and are already generally available on Google Cloud. A4X VMs use Nvidia GB200 NVL72, entered preview in 2025, and were positioned for extra-large training and serving jobs that need more memory bandwidth and tighter links between chips. (cloud.google.com 1) (cloud.google.com 2) (cloud.google.com 3)

Google said A4X Max is now shipping in production with Nvidia GB300 NVL72, 72 Blackwell Ultra GPUs and 36 Grace central processors in one system. The company said A4X Max delivers twice the network bandwidth of A4X and is designed to scale to tens of thousands of GPUs on Google’s Jupiter network fabric. (cloud.google.com)

At the other end of the stack, G4 VMs use Nvidia RTX PRO 6000 Blackwell Server Edition GPUs and target latency-sensitive inference, simulation and graphics workloads. Google introduced G4 in preview in June 2025 and moved it to general availability in late 2025, saying it would bring GPU capacity to more regions and more regulated use cases. (cloud.google.com 1) (cloud.google.com 2)

That split reflects how cloud AI spending is being carved up in 2026: the biggest customers want tightly connected clusters for training frontier models, while more companies want smaller rented slices for fine-tuning and serving. Google’s Cloud Next announcements this week also emphasized agent-building tools, and Chief Executive Sundar Pichai said cloud customers are already processing more than 16 billion tokens per minute through Google models via direct application programming interface use. (blog.google) (blog.google)

Nvidia and Google have been building toward this layered lineup for more than a year. In March 2025, Google said it would pair Gemini and Google Cloud infrastructure more closely with Nvidia hardware, and by late 2025 Nvidia was describing the joint platform as an end-to-end Blackwell stack from A4X and A4 down to G4. (blog.google) (blogs.nvidia.com)

The immediate test is whether Google can turn these hardware tiers into steady cloud demand beyond a handful of giant model builders. For now, the April 22 announcements showed Google and Nvidia trying to cover the full market, from million-GPU ambitions at the top to enterprise-friendly serving options lower down. (blogs.nvidia.com) (cloud.google.com)
