Google launches TPU split and AI agents push
- Google is shifting from demos to paid enterprise AI by centering “AI agents” as workflow tools.
- It unveiled separate TPUs for training and inference to cut costs and compete with Nvidia.
- The move signals a new commercial battleground between chips, cloud contracts, and enterprise workflow sales.
Google used its Cloud Next conference on April 22 to turn “AI agents” into a sales pitch for business software and new in-house chips. (wincountry.com) (blog.google)

At the Las Vegas event, Google folded a set of products under the “Gemini Enterprise” name and said it was expanding Vertex AI, its platform for companies to build and run AI systems. Google also added governance and security features for agents, which it describes as software that can plan, decide and act across tasks. (wincountry.com) (blog.google)

Google also changed its chip strategy. After years of using one Tensor Processing Unit design for both building models and running them, it introduced TPU 8t for training and TPU 8i for inference, with both due later in 2026. (cnbc.com) (blog.google)

Training is the compute-heavy step where a model learns from data; inference is the repeated work of answering prompts after the model is built. Google said TPU 8i is tuned for fast, low-latency agent responses, while TPU 8t is built for training large models from a single large memory pool. (blog.google) (cnbc.com)

The numbers show why Google is separating the jobs. CNBC reported Google said the training chip delivers 2.8 times the performance of the prior Ironwood generation at the same price, while the inference chip offers 80% better performance. (cnbc.com)

Google tied that hardware push directly to enterprise demand. In its conference roundup, the company said nearly 75% of Google Cloud customers now use its artificial intelligence products, 330 customers processed more than 1 trillion tokens each over the past 12 months, and direct customer API traffic rose to more than 16 billion tokens per minute from 10 billion last quarter. (blog.google)

Thomas Kurian, Google Cloud’s chief executive, told Reuters the main use of Vertex AI has shifted from older machine-learning work to companies building custom agents.
Reuters also reported Google is targeting business customers as the steadiest revenue source while OpenAI and Anthropic push deeper into enterprise software. (wincountry.com)

The chip move is aimed at Nvidia, but it is not a clean break. Google remains a large Nvidia customer, and TechCrunch reported Google still plans to offer Nvidia’s Vera Rubin chips in its cloud later this year while using its own TPUs as an alternative. (cnbc.com) (techcrunch.com)

Amazon is following a similar path with separate Trainium and Inferentia chips, and Microsoft announced a second-generation AI chip in January, according to CNBC. That leaves cloud providers competing on three layers at once: the chips in their data centers, the cloud contracts that sell access to them, and the software agents companies use every day. (cnbc.com)

Google’s message in Las Vegas was that the next AI sale is not just a model demo. It is a package deal of agent software, cloud infrastructure and custom silicon, sold to companies that want those systems in production now. (wincountry.com) (blog.google)