Google's Gemini adds agentic workflows
- Google’s Gemini API stack is getting more explicitly agent-shaped, with Interactions API changes, webhooks, and updated agent docs all landing in early May.
- The most concrete shift is structural: Google is renaming Interactions API outputs to steps on May 26, with legacy support ending June 8.
- This matters because Gemini is moving from “model that can call tools” toward “platform for long-running agents,” with state, callbacks, and built-ins.

Google is turning Gemini into more of an agent platform and less of a one-shot chatbot API. That’s the real story here. The recent changes aren’t a single flashy launch so much as a stack of developer updates that make multi-step workflows easier to build, monitor, and keep alive over time. The gap Google is trying to close is obvious: calling a model once is easy, but building an agent that plans, waits, calls tools, resumes, and finishes reliably is the hard part. Over the last few days, Google has pushed that stack forward inside AI Studio and the Gemini API. (ai.google.dev)

### What actually changed?

The clearest signal is in the Gemini API release notes. On May 6, Google posted an upcoming breaking change for the Interactions API: request and response schemas are shifting from `outputs` to `steps`, with the new schema becoming the default on May 26 and the legacy version going away on June 8. On May 4, Google also added event-driven webhooks so developers can stop polling long-running jobs and get notified when work finishes. (ai.google.dev)

### Why does “steps” matter?

Because “steps” is agent language, not chatbot language. A normal chat API mostly returns an answer. An agent API has to expose intermediate actions: tool calls, reasoning turns, background tasks, and state transitions. Google’s own Interactions API description says it was built for “interleaved messages, thoughts, tool calls and their state,” which is basically the plumbing you need for software that does work in stages. (blog.google)

### What is the Interactions API?

It’s Google’s unified endpoint for talking to either a Gemini model or a built-in agent through the same interface. Google introduced it in public beta in December 2025 inside AI Studio, with server-side state and background execution baked in. That matters because developers building agents usually end up re-creating those features themselves; Google is building that into the platform.
(blog.google)

### What kinds of agents is Google aiming at?

Not simple chatbots. Google’s current agent docs define agents as systems that plan, execute a series of actions, interact with external systems, and then synthesize results. The docs point developers toward built-in tools like Google Search, Maps, Code Execution, URL Context, and Computer Use, plus custom function calling, so agents can actually touch the world. (ai.google.dev)

### Didn’t Gemini already support this?

Yes, but in pieces. Back in late 2025, Google added “thought signatures,” thinking controls, and more explicit function-calling support for Gemini 3. Thought signatures are especially important here because they preserve reasoning continuity across multiple calls, which helps agents avoid losing the thread halfway through a task. The new changes look like the next …ation. (developers.googleblog.com)

### Why add webhooks now?

Because polling is clunky and expensive when agents run for a while. If a workflow has to search, browse files, call tools, and wait on a background job, constantly asking “are we done yet?” is bad infrastructure. Webhooks let Gemini push the completion event to the developer’s server instead. Google says they work for Batch jobs, Interactions, and video generation, which is exactly the kind of async behavior agent builders care about. (ai.google.dev)

### Is this about AI Studio or the API?

Both. AI Studio is the front door where many developers start, but the underlying shift is in the Gemini API itself. Google’s docs now foreground agents as a first-class concept, and the changelog shows the API evolving around long-running, stateful workflows rather than just prompt-in, text-out responses. That’s a product direction, not a UI tweak. (ai.google.dev)

### What’s the bottom line?

Google didn’t just add another “agentic” marketing label.
It is steadily reshaping Gemini’s developer stack around actual agent mechanics — state, steps, built-in tools, background execution, and callbacks. Basically, Gemini is trying to become the place where a developer can build software that does a job, not just answers a question. (ai.google.dev)
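
To make the "steps plus callbacks" idea concrete, here is a minimal Python sketch of what handling a pushed completion event could look like, as opposed to polling for status. It does not use Google's SDK or its documented schema: the payload shape and field names (`event`, `interaction_id`, `steps`, `type`, `text`) and the helper names `summarize_steps` and `handle_webhook` are all hypothetical, chosen only to illustrate walking a list of intermediate steps once a webhook fires.

```python
import json

# Hypothetical event payload. Every field name here is an illustrative
# assumption, not Google's documented webhook or Interactions schema.
SAMPLE_EVENT = json.dumps({
    "event": "interaction.completed",
    "interaction_id": "demo-123",
    "steps": [
        {"type": "thought", "text": "Need current data."},
        {"type": "tool_call", "name": "search", "args": {"q": "weather"}},
        {"type": "tool_result", "name": "search", "output": "Sunny"},
        {"type": "message", "text": "It is sunny."},
    ],
})


def summarize_steps(steps):
    """Count steps by type and return the text of the last message step, if any."""
    counts = {}
    final_text = None
    for step in steps:
        kind = step.get("type", "unknown")
        counts[kind] = counts.get(kind, 0) + 1
        if kind == "message":
            final_text = step.get("text")
    return counts, final_text


def handle_webhook(body):
    """Process one pushed completion event instead of polling for status."""
    event = json.loads(body)
    if event.get("event") != "interaction.completed":
        return None  # ignore event types this sketch does not model
    counts, final_text = summarize_steps(event.get("steps", []))
    return {
        "id": event.get("interaction_id"),
        "step_counts": counts,
        "answer": final_text,
    }


result = handle_webhook(SAMPLE_EVENT)
print(result)
```

In a real deployment, a function like `handle_webhook` would sit behind an HTTP endpoint that verifies the request before trusting it. The point of the sketch is only that the server reacts once to a pushed event and then reads the intermediate steps, rather than repeatedly asking whether the job is done.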