Google balances Nvidia reliance

- Google is developing TPU 8t and TPU 8i chips to build a parallel AI infrastructure alongside Nvidia GPUs.
- Simultaneously, Google and Nvidia are collaborating on agentic and physical AI, including confidential Blackwell GPU instances.
- The dual approach mixes internal silicon cadence with vendor partnerships to balance capacity, bargaining power, and customer demand.

Google is building two new in-house artificial intelligence chips even as it deepens its Nvidia partnership, giving Google Cloud two ways to meet surging demand. (cloud.google.com; blogs.nvidia.com)

Google introduced its eighth-generation Tensor Processing Units on April 22 at Cloud Next ’26: TPU 8t for large training jobs and TPU 8i for inference and reinforcement learning. Google said TPU 8t scales to 9,600 chips in one superpod, while TPU 8i is tuned for fast response times in agent workloads. (blog.google; cloud.google.com)

A training chip teaches a model from huge datasets; an inference chip answers prompts after training is done. Google split those jobs into separate products after making Ironwood, its seventh-generation TPU, an inference-first chip at Cloud Next 2025. (blog.google; cloud.google.com)

Google is not replacing Nvidia. At Nvidia GTC 2026 in March, Google Cloud said it would support Nvidia’s upcoming Vera Rubin NVL72 platform and expand software support for Nvidia systems across Vertex AI and Google Kubernetes Engine inference tools. (cloud.google.com)

The two companies are also pairing Google models with Nvidia hardware in regulated environments. Nvidia said in April 2025 that Google Distributed Cloud would run Gemini models on Blackwell HGX and DGX systems with Nvidia Confidential Computing, which keeps prompts and fine-tuning data encrypted during processing. (blogs.nvidia.com; blogs.nvidia.com)

Google’s own pitch is that AI infrastructure now needs different machines for different jobs. In its technical write-up, Google said pre-training, post-training and real-time serving have “diverged,” and that TPU 8t and 8i are meant to remove different bottlenecks inside the same AI Hypercomputer stack. (cloud.google.com)

That stack now includes Google-designed chips beyond TPUs. Google said its Arm-based Axion central processing units are built into the eighth-generation TPU system to handle data preparation and orchestration so the accelerators do not sit idle waiting for work. (cloud.google.com; cloud.google.com)

Google has been moving in this direction for more than a year. In April 2025, it launched Ironwood as its seventh-generation TPU and called it the first TPU designed specifically for inference, after earlier generations were used more broadly across training and serving. (blog.google; blog.google)

The result is a cloud strategy that mixes proprietary silicon with outside supply rather than betting on one vendor or one architecture. For customers buying compute in 2026, Google is offering both its own TPU roadmap and the newest Nvidia systems inside the same cloud. (blog.google; cloud.google.com)
