Google debuts new TPU chips
- Google announced two eighth-generation TPU chips at Cloud Next: one for AI training and one for inference workloads.
- Google says the chips are faster and cheaper than the prior generation and are built for an "agentic era" of AI.
- The announcement widens the set of viable inference and training back ends, affecting cost and runtime routing for large-scale imagery ML. (blog.google) (cnbc.com)
Google used its Cloud Next conference on April 22 to split its newest artificial intelligence chip line in two: one chip for training models, another for running them. (blog.google)
The company calls the new processors TPU 8t and TPU 8i, its eighth-generation Tensor Processing Units. Google said TPU 8t is built for large training jobs, while TPU 8i is tuned for low-latency inference and reinforcement learning, with general availability planned later in 2026. (blog.google)
Training is the step where a model learns from huge datasets; inference is the step where a finished model answers a prompt, classifies an image, or takes the next action. Google said those workloads now diverge enough that one general-purpose chip is no longer the best fit for both. (cloud.google.com)
Google has rented TPUs to cloud customers since 2018, and earlier generations handled both jobs on the same chip. CNBC reported that the eighth generation is the first time Google has separated training and inference into distinct processors, a strategy Amazon Web Services also follows with Trainium for training and Inferentia for inference. (cnbc.com)
The split reflects how artificial intelligence systems are being used in 2026. Google framed the chips around “agentic” software, meaning systems that do multi-step work, and said TPU 8i is designed to keep those responses fast enough for real-time use. (blog.google)
Google also tied the chips to its wider AI Hypercomputer system, which bundles processors, networking and software into one cloud stack. In a technical post, the company said TPU 8t scales to 9,600 chips in a single superpod using a 3D torus network, while TPU 8i is aimed at high-throughput serving and reinforcement learning. (cloud.google.com)
The immediate target is Nvidia, whose graphics processing units still dominate artificial intelligence infrastructure. CNBC reported that Google remains a large Nvidia customer even as it pushes TPUs as a cloud alternative, and that Microsoft, Meta and Amazon are also building more custom silicon. (cnbc.com)
Google is also building on a chip roadmap it accelerated last year. At Cloud Next 2025, it introduced Ironwood, a seventh-generation TPU built specifically for inference, and later said Ironwood delivered more than four times the per-chip performance of Trillium, also known as TPU v6e, for training and inference workloads. (blog.google) (cloud.google.com)
For companies deciding where to run image models, language models or agent software, the announcement adds another routing choice inside Google Cloud. Instead of sending every job to the same kind of processor, customers can match a training run to TPU 8t and a live production workload to TPU 8i when those chips ship later this year. (blog.google)