Google doubles down on AI agents

- Google repositioned enterprise AI around AI agents and unveiled new tools to build agent workflows at Cloud Next. - It also announced an eighth-generation TPU architecture that separates chips for training and inference to cut serving costs. - Google signed a multibillion-dollar infrastructure deal with Thinking Machines Lab, underlining an infrastructure-first competition for enterprise AI. (techcrunch.com)

Google used Cloud Next on April 22 to recast its enterprise pitch around AI agents — software that can plan and carry out multi-step work — and tied that pitch to new chips and a new cloud deal. (blog.google) Google’s main software launch was Gemini Enterprise Agent Platform, which combines Vertex AI model tools with agent integration, security, DevOps, and optimization features in one product for corporate developers. Google also said its first-party models are now processing more than 16 billion tokens per minute through direct customer API use, up from 10 billion last quarter. (blog.google; blog.google) For developers building those systems, Google is pushing an open-source Agent Development Kit and its Agent2Agent protocol, which is designed to let agents built on different platforms exchange tasks and results. Google’s documentation says the kit can orchestrate agent workflows on Vertex AI Agent Engine Runtime, while the protocol was updated last year to give developers a more stable interface. (docs.cloud.google.com; cloud.google.com) The hardware message changed too. Google introduced two eighth-generation Tensor Processing Units, or TPUs: TPU 8i for fast inference, the step where a trained model answers a prompt, and TPU 8t for training and fine-tuning, the step where the model learns from data. (blog.google) Google said that split is aimed at the economics of agent software, which has to respond quickly when it is executing chains of actions for users. In its announcement, Google said TPU 8i is built to help agents “reason, plan and execute multi-step workflows” with lower serving costs, while TPU 8t is tuned for the heavier work of training and adaptation. (blog.google) The company paired those product launches with a new infrastructure customer. TechCrunch reported on April 22 that Mira Murati’s Thinking Machines Lab signed a multibillion-dollar Google Cloud agreement valued in the single-digit billions, and Google said separately that the startup will expand its use of Google’s AI Hypercomputer. (techcrunch.com; googlecloudpresscorner.com) Google’s press materials said Thinking Machines will use A4X Max virtual machines with Nvidia GB300 systems and multiple Google Cloud services for model research, platform development, and frontier-model training. In the same announcement, Thinking Machines founding researcher Myle Ott said early testing showed training and serving speed doubled versus the prior GPU generation. (googlecloudpresscorner.com) That mix of agent software, custom TPUs, and Nvidia-based cloud rentals shows how Google is competing on both layers of the AI stack at once. Microsoft has tied Azure closely to OpenAI, and Amazon Web Services has done the same with Anthropic, while Google is pitching its own models, its own chips, and rented Nvidia capacity as a package. (techcrunch.com; blog.google) Google also used the event to give the strategy a name: the “Agentic Enterprise.” In its Cloud Next opening post, the company said organizations are choosing Google Cloud for a fully integrated AI stack, and in Sundar Pichai’s post it said just over half of Google’s machine-learning compute investment in 2026 is expected to go to the Cloud business. (cloud.google.com; blog.google) The immediate test is whether companies buy more than the language. Google opened Cloud Next by saying the “Agentic Enterprise is real,” and it spent the same week lining up the software, silicon, and compute contracts needed to make that claim stick. (cloud.google.com; blog.google)

Google doubles down on AI agents

Get your own daily briefing