Google's new AI chips

- Google unveiled two new AI chips split between training and inference to capture more AI compute economics.
- The announcement splits TPU 8 architecture into separate training and inference models, claimed faster and cheaper than prior versions.
- The launch is a hedge against Nvidia dependence while Google still offers Nvidia hardware in cloud listings (techcrunch.com).

Google used Cloud Next on April 22 to unveil two new eighth-generation artificial intelligence chips, splitting its Tensor Processing Unit line into one part for training and another for inference. (cloud.google.com) Training is the phase where a model learns from huge data sets; inference is the phase where it answers prompts after training is done.

Google named the new chips TPU 8t for training and TPU 8i for inference. (blog.google) Google said TPU 8t is built for frontier-model training and TPU 8i is built for large-scale inference and reinforcement learning, a method where models improve through trial and error. The company said the split reflects different infrastructure needs for pre-training, post-training and real-time serving. (cloud.google.com)

The company also said the chips will sit inside its AI Hypercomputer system, which combines processors, networking, software and data-center design into one package. Google said the eighth-generation family will be hosted for the first time on its own Arm-based Axion processors. (cloud.google.com)

Google framed the launch around a shift in the artificial intelligence business: fewer companies are only training giant models, and more are paying to run them continuously for chatbots, search, coding tools and software agents. In a Cloud Next post, Sundar Pichai said Google’s first-party models now process more than 16 billion tokens per minute through direct customer API use, up from 10 billion last quarter. (blog.google)

That demand helps explain why Google is now selling separate chips for separate jobs instead of one general design. In its announcement, Google said TPU 8i is aimed at “massive inference throughput,” while TPU 8t is tuned for large memory pools needed to train complex models. (blog.google)

The move also widens Google’s effort to reduce how much of the AI hardware stack depends on Nvidia, even as Google continues to sell Nvidia systems in its cloud. Last month, Google Cloud announced expanded support for Nvidia products including RTX PRO 6000 Blackwell Server Edition systems and planned support for Nvidia Vera Rubin NVL72. (cloud.google.com)

Google is not abandoning its earlier chips as it makes that shift. At Cloud Next in April 2025, the company introduced Ironwood, its seventh-generation TPU, as its first TPU designed specifically for inference; by April 2026, Google said Ironwood was generally available to cloud customers. (blog.google) (cloud.google.com)

The new split suggests Google now sees training and inference as distinct businesses inside cloud computing, with different cost, speed and memory demands. The next test is whether cloud customers buy enough TPU 8t and TPU 8i capacity to pull more AI spending onto Google’s own silicon instead of Nvidia’s. (techcrunch.com)
