Gemini Omni Flash demo generates photoreal 'anything-to-anything' avatars

- Google on May 19 rolled out Gemini Omni Flash, a new multimodal model that turns text, images, audio and video into edited video.
- Google said Omni Flash can “create anything from any input,” and The Verge reported the avatar demos made photoreal video impersonation look easy.
- Google said Gemini Omni Flash is rolling out in the Gemini app, Google Flow and YouTube Shorts, with developer API evaluations to follow.

Google on May 19 introduced Gemini Omni Flash, the first release in a new Gemini Omni family built to turn mixed inputs (text, images, audio and video) into edited video. The launch came at Google I/O 2026 and put a new consumer-facing demo behind one of the company’s most ambitious claims in generative media: that a single model can “create anything from any input.”

Google said Omni Flash is rolling out to the Gemini app, Google Flow and YouTube Shorts. The company’s model card says the system outputs high-resolution video with audio and is intended for video creation and conversational editing, while broader output modes such as image and audio are planned later. (blog.google)

### How is Gemini Omni Flash different from a normal video generator?

Koray Kavukcuoglu, Google DeepMind’s chief technology officer, wrote that Gemini Omni can combine images, audio, video and text as input and generate video grounded in Gemini’s “real-world knowledge.” Google said users can then keep editing through natural-language prompts, with each instruction building on the last. (blog.google)

Google’s model card describes Omni Flash as a transformer-based model with native multimodal support for text, vision, video and audio inputs. That matters because Google is presenting Omni not as a chain of separate tools, but as one model designed to reason across several media types at once.

### What did Google actually show with the avatar demos?

TechCrunch reported that Google said users will be able to create videos with their own digital avatars, a feature tied to the company’s broader push into personalized video generation. (blog.google)

In a media briefing, DeepMind product director Nicole Brichtova said users who want that capability must complete onboarding by recording themselves and reading a string of numbers. (deepmind.google)

The Verge’s hands-on, cited in the source briefing for this story, focused on how easily the tool could turn personal images and short clips into realistic avatar-style outputs. That framing put the demo less in the category of novelty filters and more in the category of synthetic identity tools that can mimic a real person’s appearance on video. (techcrunch.com)

### Why are reviewers focusing on deepfakes and identity risk?

Google said the avatar feature includes identity checks during onboarding, and DeepMind’s public model page says Omni Flash was developed with internal safety, security and responsibility teams and underwent red teaming and other evaluations. The company has not yet published the API-stage evaluations it says will cover video creation, video editing and image generation when the model reaches developers and enterprise customers. (theverge.com)

The concern raised by reviewers is straightforward: a tool that lowers the work needed to make convincing avatar video also lowers the barrier to impersonation. The Verge’s review, as summarized in the briefing, warned that the demos made high-quality avatar-video generation look trivial, raising moderation and provenance questions for consumer products that let users post or remix media. (techcrunch.com)

### Where does this leave startups building social or creator apps?

Google said Omni Flash is already being distributed through consumer products, including YouTube Shorts. That means startups are not looking at a distant research prototype; they are looking at a capability entering mainstream creation surfaces now. The immediate burden for smaller platforms is not model training but policy and controls: account verification, consent flows, labeling, abuse reporting and response speed. (theverge.com)

Google’s own rollout sequence offers one marker to watch next: the company said fuller capability evaluations will be shared when Gemini Omni Flash expands to developers and enterprise customers through APIs. (deepmind.google) (blog.google)
