Google launches Gemini 3.1 Flash Live
Google shipped Gemini 3.1 Flash Live, a real-time speech-to-speech model tuned for live, multi-step agent workflows. It scores highly on complex audio benchmarks and signals a push toward voice-first agent orchestration. Platform teams will need to re-test session observability and integration points because the model changes output structure and error behavior compared with prior Gemini releases. (manilashaker.com) (blog.laozhang.ai)
Migration requires swapping the model string from gemini-2.5-flash-native-audio-preview-12-2025 to gemini-3.1-flash-live-preview. It also requires replacing the old thinkingBudget knob with thinkingLevel (options: minimal, low, medium, high), which defaults to minimal to prioritize latency. (ai.google.dev)
A single BidiGenerateContentServerContent event can now contain multiple content parts at once (for example, interleaved audio chunks plus transcript parts), so integrations must parse every part in each event to avoid dropping media or transcripts. (ai.google.dev)
The Live API uses stateful WebSocket sessions, and the session remembers conversational context across the stream. Google's docs therefore recommend persisting session context server-side, reconnecting with exponential backoff, and setting sensible timeouts. (docs.cloud.google.com)
Tool/function calls and multimodal outputs travel over the same low-latency bidirectional stream, so platform dispatchers should handle asynchronous function-call requests and simultaneous audio and visual parts rather than assume sequential, turn-based tooling. (docs.cloud.google.com)
Google published the gemini-3.1-flash-live-preview entry and changelog on March 26, 2026, and announced that the model is available to developers via the Gemini Live API in Google AI Studio while also powering Gemini Live and Search Live worldwide. (ai.google.dev)
Google applies SynthID provenance watermarking to generated media, including audio, so that AI-created content can be detected. Enterprise audio pipelines and compliance filters must therefore treat generated audio as watermarked content and surface verification metadata where required. (support.google.com)
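The two integration points above that are easiest to get wrong, reading every part of a server event and spacing out reconnection attempts, can be sketched in a few lines of Python. This is a minimal illustration, not Google's client code: it assumes the event has already been decoded from JSON into a dict, the field names (serverContent, modelTurn, parts, inlineData, text) are assumptions that should be checked against the Live API reference, and split_parts and backoff_delays are hypothetical helper names.

```python
import random
from typing import Any, Iterator


def split_parts(event: dict[str, Any]) -> dict[str, list[Any]]:
    """Group every part of one decoded server event by kind.

    Assumed (unverified) event shape:
    {"serverContent": {"modelTurn": {"parts": [...]}}}
    """
    grouped: dict[str, list[Any]] = {"audio": [], "text": [], "other": []}
    content = event.get("serverContent") or {}
    turn = content.get("modelTurn") or {}
    # Walk the whole list: stopping at parts[0] is how media gets dropped.
    for part in turn.get("parts") or []:
        if "inlineData" in part:
            grouped["audio"].append(part["inlineData"].get("data"))
        elif "text" in part:
            grouped["text"].append(part["text"])
        else:
            # Keep unknown part kinds so they can be logged, not lost.
            grouped["other"].append(part)
    return grouped


def backoff_delays(
    base: float = 0.5, cap: float = 30.0, attempts: int = 6
) -> Iterator[float]:
    """Yield reconnect delays in seconds: exponential growth with full jitter."""
    for attempt in range(attempts):
        ceiling = min(cap, base * 2 ** attempt)
        # A random delay in [0, ceiling] avoids synchronized reconnect storms.
        yield random.uniform(0.0, ceiling)
```

A reconnect loop would sleep for each yielded delay before reopening the WebSocket, then restore the session context it persisted server-side. The default base, cap, and attempt count are placeholders, not values taken from Google's documentation.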