Google unveils TPU 8t and 8i

- Google introduced two new AI chips targeted at different workloads: training and inference.
- The family is split into TPU 8t for training and TPU 8i for inference to match diverging AI needs.
- Google says many customers already run AI in production, but it still relies on parts of Nvidia's ecosystem. (techcrunch.com)

Google has split its newest artificial-intelligence chip family in two, unveiling TPU 8t for training models and TPU 8i for running them in production. (blog.google)

Google announced the chips at Cloud Next on April 22 in Las Vegas and said customers can request more information ahead of general availability later in 2026. Google Cloud says the new family is part of its AI Hypercomputer system, which bundles chips, networking, and software. (cloud.google.com) (blog.google)

A training chip is used to teach a model from huge datasets, while an inference chip handles the live work after training, like answering prompts or powering software agents. Google said those jobs now have different bottlenecks, so its eighth-generation Tensor Processing Units are no longer one design for both. (cloud.google.com 1) (cloud.google.com 2)

TPU 8t is aimed at frontier-model training and embedding-heavy workloads, with Google saying a single superpod can link 9,600 chips through its 3D torus network. TPU 8i is built for large-scale inference and reinforcement learning, with a pod design that directly connects 1,152 chips and adds more on-chip memory to keep responses fast. (cloud.google.com 1) (cloud.google.com 2)

Google said TPU 8t delivers 2.8 times the performance of its seventh-generation Ironwood training chip at the same price. It said TPU 8i improves performance by 80% over the prior generation for inference workloads. (cnbc.com) (blog.google)

The change comes as Google argues that artificial-intelligence use has moved from experiments to deployed products. Thomas Kurian, chief executive of Google Cloud, said nearly 75% of Google Cloud customers are already using AI in production environments. (digitimes.com) (techcrunch.com)

Google is also trying to reduce delays outside the accelerator itself. The company said TPU 8 systems use its Arm-based Axion processors as hosts so data preparation and orchestration do not leave the chips waiting for work. (cloud.google.com 1) (cloud.google.com 2)

The rollout does not mean Google is abandoning Nvidia. At the same conference, Google promoted access to Nvidia graphics processing units in its cloud, and outside analysts noted that none of the large cloud providers has displaced Nvidia in artificial-intelligence infrastructure. (techcrunch.com) (cnbc.com)

Google has built Tensor Processing Units for years, but this is the first time it has publicly split a generation into separate training and inference products. The company is betting that customers building always-on agents will buy infrastructure the same way: one system to create models, another to serve them at low cost and low delay. (cloud.google.com) (thenextweb.com)
