Google unveils TPU‑8 and agentic cloud
- Google used Cloud Next ’26 to launch its eighth-generation TPU family and a broader “Agentic Data Cloud,” reframing AI infrastructure around autonomous software agents.
- The key split is TPU 8t for training and TPU 8i for inference, with 8t scaling to 9,600 chips and 121 exaflops.
- This matters because Google is now selling a full AI stack (chips, models, data, and orchestration), not just cloud compute.


AI infrastructure is getting more specialized, and fast. Google’s pitch at Cloud Next ’26 was that the old idea of one general-purpose stack for every AI job no longer fits. Training giant models, serving them in real time, and running swarms of software agents are different problems. So Google responded with two new TPU designs and a data platform built around agents, not just dashboards.

### What did Google actually launch?

The headline hardware is Google’s eighth-generation TPU family, split into TPU 8t and TPU 8i. TPU 8t is the training system, meant for frontier-model pretraining and other memory-hungry jobs. TPU 8i is the inference system, meant for fast serving and reinforcement-learning-style workloads where latency matters more than raw training scale. Google also wrapped the software side in what it calls the Agentic Data Cloud, a data stack meant to feed and govern AI agents across enterprise systems. (cloud.google.com)

### Why split one TPU line in two?

Because AI workloads have split in two. Training wants giant shared memory pools and huge clusters. Inference wants speed, responsiveness, and efficient movement between many small decisions. Google’s argument is basically that one chip can’t be best at both anymore. TPU 8t is optimized for the first problem. TPU 8i is optimized for the second. That is the real product decision here: specialization over one-size-fits-all silicon. (cloud.google.com)

### How big is TPU 8t?

Very big. Google says TPU 8t can scale to superpods of 9,600 chips delivering 121 exaflops, with 2 petabytes of shared high-bandwidth memory for large-model training. It also says the chip family is hosted for the first time on Google’s own Axion Arm-based processors, which matters because Google is trying to co-design more of the whole machine (chip, host CPU, network, and software) instead of selling a loose bundle of parts. (cloud.google.com)

### What is TPU 8i for?
TPU 8i is the half aimed at the “agentic” buzzword, but there is a real technical point underneath it. Agents don’t just answer one prompt. They plan, call tools, retrieve data, and often loop through multiple steps. That creates lots of inference traffic and punishes latency. Google says TPU 8i is built for that pattern, with architecture choices tuned for large-scale serving and reinforcement learning rather than giant pretraining runs. (cloud.google.com)

### So what is an Agentic Data Cloud?

Google is trying to move enterprise data from a passive store into what it calls a “system of action.” In plain English, that means data platforms that do more than hold tables for analysts. They also feed agents, enforce governance, connect across clouds, and let software take actions against business systems. Google says the stack includes things like Cross-Cloud Lakehouse on Apache Iceberg so customers can query data without moving all of it into Google first. (cloud.google.com)

### Why does the cross-cloud part matter?

Because most big companies do not live inside one vendor anymore. They have data in AWS, Azure, on-prem systems, and SaaS apps. If Google forced every customer to centralize everything in Google Cloud first, adoption would slow down. So the smarter pitch is interoperability: leave the data where it is, then layer models, agents, and governance on top. That makes the Agentic Data Cloud less like a warehouse replacement and more like an operating layer for enterprise AI. (cloud.google.com)

### What is Google really selling here?

A full-stack AI platform. Not just chips. Not just Gemini. Not just storage. Google’s own framing at Next ’26 tied together TPUs, Gemini models, data systems, and enterprise tooling under one idea: the “agentic enterprise.” The company also said nearly 75% of Google Cloud customers are using its AI products, and that direct API use of its first-party models is now above 16 billion tokens per minute. (cloud.google.com) Those numbers are there to show this is already a cloud revenue story, not a lab demo.

### Bottom line?

The interesting part is not that Google announced “TPU‑8.” It’s that Google split the chip line, rebuilt the data pitch around agents, and made the cloud stack itself the product. If AI really shifts from chatbots to software that acts, that architecture choice could matter more than any single benchmark. (cloud.google.com) (blog.google)
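
For a rough sense of scale, the headline figures quoted in this piece can be turned into per-chip and per-second averages. The sketch below is back-of-envelope arithmetic only. It assumes decimal SI prefixes, the constant and function names are illustrative, and the results are simple averages derived from the quoted totals, not per-chip specifications published by Google.

```python
# Back-of-envelope averages from the totals quoted in the article.
# Assumption: decimal SI prefixes (1 exa = 1,000 peta; 1 peta = 1,000,000 giga).
# These are derived averages, not vendor-published per-chip specs.

CHIPS_PER_SUPERPOD = 9_600          # TPU 8t superpod size, as quoted
SUPERPOD_EXAFLOPS = 121             # superpod compute, as quoted
SUPERPOD_HBM_PETABYTES = 2          # shared high-bandwidth memory, as quoted
TOKENS_PER_MINUTE = 16_000_000_000  # "above 16 billion", so treat as a floor


def per_chip_petaflops(exaflops: float, chips: int) -> float:
    """Average compute per chip, in petaflops."""
    return exaflops * 1_000 / chips


def per_chip_hbm_gigabytes(petabytes: float, chips: int) -> float:
    """Average share of the memory pool per chip, in gigabytes."""
    return petabytes * 1_000_000 / chips


def tokens_per_second(tokens_per_minute: float) -> float:
    """Convert a per-minute token rate to per-second."""
    return tokens_per_minute / 60


if __name__ == "__main__":
    pf = per_chip_petaflops(SUPERPOD_EXAFLOPS, CHIPS_PER_SUPERPOD)
    gb = per_chip_hbm_gigabytes(SUPERPOD_HBM_PETABYTES, CHIPS_PER_SUPERPOD)
    tps = tokens_per_second(TOKENS_PER_MINUTE)
    print(f"Average compute per chip: about {pf:.1f} petaflops")
    print(f"Average HBM share per chip: about {gb:.0f} GB")
    print(f"Token rate floor: about {tps / 1e6:.0f} million tokens per second")
```

The article does not say what numeric precision the exaflops figure refers to, so the per-chip compute number is only meaningful as a ratio of the two quoted totals.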