Google debuts TPU 8t and 8i

- Google used Cloud Next 2026 to unveil two eighth-generation AI chips, TPU 8t for training and TPU 8i for inference, splitting jobs by workload.
- Google said TPU 8t scales to 9,600 chips per superpod, while TPU 8i delivers 80% better performance per dollar than Ironwood.
- The launch extends Google’s push to sell AI infrastructure, networking and software as one stack. (cloud.google.com)

Tensor Processing Units are Google’s in-house AI chips, built to do the math behind model training and model serving faster than general-purpose processors. At Cloud Next on April 22, Google split its eighth-generation TPU line into TPU 8t for training and TPU 8i for inference. (docs.cloud.google.com) (blog.google)

Training is the phase where a model learns from huge datasets; inference is the phase where a deployed model answers prompts and takes actions. Google said those jobs now need different hardware, so it designed TPU 8t and TPU 8i as separate systems instead of one general chip family. (cloud.google.com)

Google said TPU 8t is aimed at frontier-model training and embedding-heavy workloads, with single-superpod scale up to 9,600 chips. The company said the system is hosted on its Axion Arm-based processors and linked by its Virgo network. (cloud.google.com 1) (cloud.google.com 2)

TPU 8i is the inference-focused half of the launch, built for post-training, reinforcement learning and large-scale serving. Google said it offers an 80% performance-per-dollar gain over the prior Ironwood generation for low-latency inference on large mixture-of-experts models. (cloud.google.com) (blog.google)

Google tied both chips to its AI Hypercomputer, the company’s package of chips, networking, storage and software for building and running AI systems. In the same Cloud Next rollout, Google pitched that stack as the infrastructure layer for “agentic” software that can reason, call tools and work across enterprise data. (cloud.google.com 1) (cloud.google.com 2)

Chief Executive Sundar Pichai said Google’s models are now processing more than 16 billion tokens per minute through direct customer API use, up from 10 billion in the prior quarter. He also said just over half of Google’s machine-learning compute investment in 2026 is expected to go to the Cloud business. (blog.google)

Google is not dropping Nvidia from its cloud lineup. Reuters reported the company launched the new TPUs while continuing to offer Nvidia chips, underscoring that custom silicon and third-party accelerators are now both part of the cloud arms race. (msn.com)

The timing reflects a change in how cloud providers talk about AI hardware. Instead of one chip for every stage, Google is selling one system for building giant models and another for running fleets of AI agents cheaply and quickly. (blog.google) (cloud.google.com)

Google said both TPU 8t and TPU 8i are coming later this year, with customers able to request more information now. The bet is that the next enterprise AI sale is not just a model, but the full machine room behind it. (blog.google)
