Google launches TPU 8t and 8i

- Google introduced TPU 8t for training and TPU 8i for inference at its Cloud Next event.
- The new TPUs are described as exaflop-scale accelerators, with demos, block diagrams, and blog material shared publicly.
- Google is positioning these chips to compete directly with hyperscaler-class accelerators for large training and inference workloads (x.com/TechCrunch/status/2047024363497459958) (x.com/NewsFromGoogle/status/2046993861684183179).

Google used its Cloud Next event on April 22 to introduce two new artificial-intelligence chips: TPU 8t for training models and TPU 8i for running them in production. (cloud.google.com) A Tensor Processing Unit, or TPU, is Google’s in-house chip for neural networks, the math systems behind tools like Gemini. Google says the eighth-generation family splits into two designs because training and inference now have different bottlenecks. (cloud.google.com) Training is the expensive phase where a model learns from huge datasets; inference is the faster phase where the finished model answers prompts.

Google says TPU 8t is tuned for “frontier-model training,” while TPU 8i is tuned for large-scale inference and reinforcement learning. (cloud.google.com) Google said TPU 8t scales to 9,600 chips and 2 petabytes of shared high-bandwidth memory in one superpod, the company’s term for a giant connected cluster. Sundar Pichai said the system delivers three times the processing power of Ironwood and up to 2x more performance per watt. (blog.google) Google described both chips as exaflop-scale accelerators, meaning systems built to perform at least one quintillion operations per second. The company said TPU 8i is built for “cost-effective, near-zero latency inference,” a pitch aimed at customers running always-on chatbots and software agents. (cloud.google.com, blog.google)

The hardware push comes as cloud providers race to sell not just raw chips but whole artificial-intelligence stacks, including networking, storage, software, and data-center power. Google said TPU 8t and 8i are part of its AI Hypercomputer system, which bundles those pieces into one platform. (cloud.google.com) Google also said it moved the host side of the new TPU systems onto Axion, its Arm-based server processor, to reduce delays in data preparation and orchestration. In the same technical post, Google said TPU 8t keeps the company’s 3D torus network and adds SparseCore blocks for embedding-heavy workloads such as recommendation and language systems. (cloud.google.com)

This is the second straight year Google has used Cloud Next to spotlight a purpose-built TPU generation. At Cloud Next 2025, Google introduced Ironwood as its seventh-generation TPU and said that chip was designed specifically for inference. (blog.google) Google has not put the new chips into general availability yet. Its product page says customers can request more information now, with general availability planned for later in 2026. (blog.google, cloud.google.com)
