Google's Agent Benchmark Push

- At Cloud Next, Google showcased Project Mariner, a Gemini 2.0 web‑browsing agent built for enterprise automation.
- Mariner scored 83.5% on the WebVoyager benchmark and handled ten concurrent tasks in demonstrations.
- Google paired the benchmark with enterprise tooling and a $750 million partner fund to manage agent sprawl and enterprise adoption.

Google used Cloud Next on April 22 to turn Project Mariner from a research demo into part of a broader enterprise sales pitch for AI agents. (blog.google)

Project Mariner is Google DeepMind’s web-browsing agent: it reads what is on a page, plans steps, clicks through sites, and can be interrupted by the user. Google says it now runs tasks in browsers on virtual machines and is available in the United States to Google AI Ultra subscribers. (deepmind.google)

Google has tied Mariner to Gemini 2.0 since December 2024, when it said the agent scored 83.5% on the WebVoyager benchmark for end-to-end web tasks. WebVoyager is a test set of 643 tasks across 15 websites, covering things like search, forms, shopping, and navigation. (blog.google, leaderboard.steel.dev)

The practical change is concurrency. Google said in 2025 that Mariner had been moved off the user’s local browser and into the cloud, where it could handle up to 10 tasks at once instead of tying up one active tab. (techcrunch.com)

Cloud buyers are now being pitched less on one chatbot and more on fleets of software workers. Google Cloud Chief Executive Thomas Kurian said companies want models that can “delegate tasks and sequences of tasks to agents,” and Google rebranded Vertex AI into the Gemini Enterprise Agent Platform around that shift. (theregister.com)

That platform bundles the plumbing around the agents: low-code building tools, a registry to catalog internal agents, a marketplace for partner-made agents, and runtime tools for deployment and monitoring. Google said nearly 75% of Google Cloud customers now use its artificial intelligence products, and 330 customers processed more than 1 trillion tokens each over the last 12 months. (theregister.com, blog.google)

Google also put money behind the rollout. On April 22, it announced a $750 million fund for its 120,000-member partner ecosystem to finance agent prototypes, deployments, training, and embedded engineering support. (prnewswire.com)

The partner list shows who Google thinks will carry this into big companies: Accenture, Capgemini, Cognizant, Deloitte, HCLTech, Tata Consultancy Services, Bain, Boston Consulting Group, and McKinsey are among the firms named for engineering help or early model access. (prnewswire.com)

Google’s pitch lands in a crowded field. TechCrunch wrote last year that Mariner was competing with OpenAI’s Operator, Amazon’s Nova Act, and Anthropic’s Computer Use, while The Register noted Workday is also trying to help companies manage large numbers of agents. (techcrunch.com, theregister.com)

Google is closing the loop by bringing Mariner’s computer-use features into the Gemini Application Programming Interface and other Google products. The message at Cloud Next was that the benchmark score gets attention, but the sale depends on who can package agents, controls, and consulting into one system. (deepmind.google, thenextweb.com)

