Google Builds AI Chips

- Google is developing new AI chips aimed at speeding up inference and challenging Nvidia's dominance.
- The effort builds on recent deals with Meta and Anthropic and prioritizes inference economics over training bragging rights.
- That broadens the hardware landscape toward multiple specialised components rather than a single-vendor stack (bloomberg.com).

Google is preparing a new generation of in-house artificial intelligence chips built to answer prompts faster and cut the cost of serving results. (bloomberg.com)

The chips are Google’s Tensor Processing Units, or TPUs, and Bloomberg reported on April 20 that the company plans to unveil the next version this week. Google has spent years using TPUs inside its own products and renting them through Google Cloud. (bloomberg.com) (cloud.google.com)

In artificial intelligence, training is the expensive schooling phase and inference is the moment a model actually answers a user. Google said its seventh-generation Ironwood TPU, introduced on April 9, 2025, was its first chip designed specifically for inference. (blog.google)

Google said Ironwood can be linked into “superpods” of up to 9,216 chips, and it pitched the system for high-volume, low-latency model serving. The company also spent 2025 adding vLLM support on TPUs so developers running PyTorch-style model stacks could move inference workloads with fewer code changes. (blog.google) (cloud.google.com)

That focus tracks where the money is shifting. Once a model is trained, every chatbot reply, search summary, image generation, and coding suggestion becomes an inference job that must be delivered quickly and cheaply at large scale. (cloud.google.com) (blog.google)

Google’s pitch has also started landing with outside customers. Anthropic said in October 2025 that it would significantly expand its use of Google Cloud TPUs and services in a deal it said was worth tens of billions of dollars and would bring well over a gigawatt of capacity online in 2026. (anthropic.com)

Anthropic said again on April 8, 2026 that it trains and runs Claude on several kinds of hardware, including Amazon Web Services Trainium, Google TPUs, and Nvidia graphics processing units. The company said that mix lets it match workloads to the chips best suited for them, while Amazon remains its primary cloud provider and training partner. (anthropic.com)

Meta is pushing the same idea from the other side. Meta said on March 11, 2026 that it is developing and deploying four new generations of its Meta Training and Inference Accelerator chips within two years for ranking, recommendations, and generative artificial intelligence workloads. (about.fb.com)

Nvidia still dominates the market for the graphics processing units that train and run many leading models, but Google is trying to win a larger share of the serving layer where speed per answer and cost per answer matter most. The result is a market that looks less like one standard server and more like a rack of specialized parts tuned for different jobs. (bloomberg.com) (anthropic.com)

