Google unveils Gemini 3.5 Flash
- Google released Gemini 3.5 Flash on May 19 at I/O 2026, making the model generally available for developers, enterprises and users in the Gemini app.
- Google said Gemini 3.5 Flash outperforms Gemini 3.1 Pro on most benchmarks and runs four times faster than other frontier models.
- Google said Gemini 3.5 Pro is due next month, while 3.5 Flash is live now in the Gemini API and Google AI Studio.

Google used its I/O 2026 conference on May 19 to launch Gemini 3.5 Flash, the first release in a new Gemini 3.5 model family aimed at coding and AI agent workflows. The company said the model is generally available in the Gemini API and available to users through the Gemini app and AI Mode in Google Search. Google framed the release around speed, cost and agent execution rather than a single headline reasoning claim. The launch arrived alongside new agent tools in Antigravity, Google’s developer platform for building and deploying AI agents.

### What exactly did Google launch?

Google said Gemini 3.5 Flash is the first model in its new 3.5 series and described it as its strongest model so far for sustained agentic and coding work. In its I/O materials, the company said the model is intended for “complex, agentic workflows” and long-horizon tasks that require planning, iteration and tool use. (blog.google)

May 19 release notes from the Gemini API said `gemini-3.5-flash` is now a generally available model. The same update also introduced Managed Agents in public preview, letting developers run autonomous, stateful agents in Google-hosted Linux sandbox environments.

### Why is Google emphasizing agents and coding?

Google DeepMind executives Koray Kavukcuoglu, Jeff Dean, Oriol Vinyals and Noam Shazeer said the 3.5 line is built to combine “frontier intelligence with action.” In practice, Google’s examples focused on software development, codebase maintenance, document preparation and multi-step tasks that can be delegated to collaborating subagents. (blog.google) (ai.google.dev)

Varun Mohan and Logan Kilpatrick, writing in a separate I/O developer post, tied the model directly to Google’s broader shift “from prompts to action.” That post paired Gemini 3.5 Flash with Antigravity 2.0, an Antigravity CLI and SDK updates, showing that Google was packaging the model with tools for orchestration and deployment rather than offering it as a standalone model release.
(blog.google)

### What performance claims did Google make?

Google said Gemini 3.5 Flash outperforms Gemini 3.1 Pro across almost all benchmarks and called it its strongest agentic and coding model yet. The company cited Terminal-Bench 2.1 at 76.2%, GDPval-AA at 1656 Elo, MCP Atlas at 83.6% and CharXiv Reasoning at 84.2% for multimodal understanding. (blog.google)

Google also said the model runs four times faster than other frontier models when measured by output tokens per second. That claim is central to the way Google is positioning 3.5 Flash: not as its highest-end model overall, but as a system that keeps higher-end performance while cutting latency for production use. (blog.google)

### How much does it cost?

Google’s Gemini Developer API pricing page lists Gemini 3.5 Flash at $1.50 per 1 million input tokens and $9.00 per 1 million output tokens on the paid tier. The page also lists context caching at $0.15 per 1 million tokens, plus $1.00 per 1 million tokens per hour for storage, with Batch API access offering a 50% cost reduction.

Google said in its model blog that 3.5 Flash can often complete work at less than half the cost of other frontier models, though that comparison was not broken out model by model in the blog post. (blog.google)

The pricing page does show that Google is presenting 3.5 Flash as a premium-but-production-focused model rather than its cheapest option; Gemini 3.1 Flash-Lite remains the lower-cost offering for high-volume tasks. (ai.google.dev)

### Where does this fit in Google’s wider AI push?

Google said 3.5 Flash is available in the Gemini app, AI Mode in Search, Antigravity, Google AI Studio, Android Studio and enterprise products. That distribution matters because the company is using the same launch to reach consumers, developers and corporate customers. (blog.google)

The I/O developer post also included a new Google AI Ultra subscription starting at $100 a month and bonus Antigravity credits for subscribers who hit usage limits.
Those details show Google tying model access, agent tooling and paid subscriptions more closely together as it tries to move developers onto its own stack. (blog.google)

### What comes next?

Google said Gemini 3.5 Pro is already being used internally and is scheduled to roll out next month. The company’s May 19 release notes also show more near-term platform changes, including a new schema default on May 26 and removal of the legacy Interactions API schema on June 8.

For developers, the immediate next step is not another announcement but migration and testing: Gemini 3.5 Flash is already live in the Gemini API, while Managed Agents and the Antigravity agent are in public preview. (blog.google)

Google’s current model catalog lists 3.5 Flash as stable. (ai.google.dev) (blog.google)
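
For teams budgeting that migration, the paid-tier list prices quoted in the pricing section can be turned into a rough per-workload estimate. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, not part of any Google SDK, and it assumes the 50% Batch API reduction applies to both input and output tokens. Context caching and storage charges are left out.

```python
# Paid-tier list prices quoted in this article (USD per 1 million tokens).
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 9.00
# Batch API discount as reported (50%); assumed here to apply to the whole bill.
BATCH_DISCOUNT = 0.50


def estimate_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """Hypothetical helper: estimated USD cost for one workload, excluding caching."""
    cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M
    cost += (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
    if batch:
        cost *= 1 - BATCH_DISCOUNT
    return round(cost, 4)


if __name__ == "__main__":
    # Example workload: 2 million input tokens and 500,000 output tokens.
    print(estimate_cost(2_000_000, 500_000))              # 7.5
    print(estimate_cost(2_000_000, 500_000, batch=True))  # 3.75
```

In this example the output tokens account for $4.50 of the $7.50 total despite being a quarter of the input volume, which is why output-heavy agent workloads tend to dominate the bill at these rates.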