Google pushes Gemini Omni multimodal

- Google said on May 19 it began rolling out Gemini Omni Flash, a new multimodal model, across the Gemini app, Google Flow and YouTube Shorts.
- Sundar Pichai said Gemini Omni Flash is available “starting today” in those three products.
- Google said developers and enterprise customers will get Gemini Omni Flash through its APIs in the coming weeks.

Google on May 19 began rolling out Gemini Omni Flash, the first model in a new Gemini Omni family, across the Gemini app, Google Flow and YouTube Shorts. The company said the model accepts images, audio, video and text as input and can generate video outputs grounded in Gemini’s “real-world knowledge.” Google described the release as part of its I/O 2026 product announcements and said broader API access for developers and enterprise customers would follow in the coming weeks.

### What exactly did Google launch?

Google said the new release is “the first model in the Omni family: Gemini Omni Flash.” In its product post, the company said Omni is designed to “create anything from any input — starting with video,” combining text, images, audio and video as inputs for video generation and editing. Google also said additional output modalities, including image and audio, would be supported over time. (blog.google)

Sundar Pichai said in Google’s I/O keynote post that Gemini Omni Flash was “available starting today” in the Gemini app, Google Flow and YouTube Shorts. He also said Google would roll it out to developers and enterprise customers via APIs “in the coming weeks.”

### What can Gemini Omni do that earlier Gemini products did not?

Google said Gemini Omni lets users “edit your videos through conversation,” using natural-language prompts to change clips rather than conventional timeline tools. (blog.google) The company also said the model can generate “high-quality videos” from mixed media inputs and framed the product as a step forward in “world understanding, multimodality and editing.” (blog.google)

The I/O 2026 collection page said Gemini Omni was one of two new models Google introduced at the event, alongside Gemini 3.5. That page described Omni as a model that “can create anything from any input, starting with video,” while placing it within a broader set of launches that tied model capabilities more directly to consumer products. (blog.google)

### Where is Google putting the model first?

Google said the first rollout targets three consumer-facing surfaces: the Gemini app, Flow and YouTube Shorts. Flow is Google’s AI video creation tool, and YouTube Shorts gives Google a direct short-form video outlet for creator-facing features tied to the model.

Josh Woodward, Google’s vice president for Google Labs, Gemini app and AI Studio, said in a separate May 19 post that the Gemini app itself is becoming “more agentic” and is being positioned for proactive, cross-surface help. (blog.google) That post did not detail Omni, but it showed Google pairing new model releases with a broader push to make Gemini act across products rather than remain a standalone chat interface. (blog.google)

### How does this fit into Google’s wider I/O push?

Google’s I/O 2026 roundup said the company was “making AI more helpful for everyone” through new models and product integrations. In that post, Google paired Gemini Omni with Gemini 3.5 Flash, which it described as focused on “frontier intelligence with action,” while Omni was presented as the multimodal creation and editing layer. (blog.google)

Google’s earlier video-generation posts had already tied Gemini to Veo-powered creation tools, including short generated clips and photo-to-video features. The Omni launch extends that effort by moving from a video-generation feature inside Gemini to a model family that Google says is built around mixed-input creation and conversational editing. That is an inference from Google’s sequence of product posts. (blog.google)

### What happens next?

Google said API access for developers and enterprise customers is due in the coming weeks. The company also said image and audio outputs would be added over time, indicating that the May 19 launch is the first public step for the Omni family rather than the full feature set. (blog.google)
