Google Upgrades AI Image Generator
Google has launched Nano Banana 2, a major upgrade to its AI image tool. The new version boasts higher-resolution output, better handling of infographics and data visualizations, and improved instruction-following, and is now live in the Gemini app and Search’s AI Mode.
This latest model, technically named Gemini 3.1 Flash Image, is designed to merge the high-speed generation of Google's Flash models with the advanced capabilities previously found only in its Pro-tier offerings. The goal is to make features like rapid editing and iteration accessible to a much broader user base, for free.
A key upgrade in Nano Banana 2 is subject consistency, allowing the model to maintain the appearance of up to five characters and 14 objects throughout a series of images. This directly addresses a common frustration with previous AI image generators where characters would change appearance in subsequent renderings, making narrative storyboarding difficult.
The model also boasts improved "world knowledge," pulling from Gemini's knowledge base and real-time web search to more accurately render specific subjects and data. This enables the creation of detailed infographics, diagrams from notes, and data visualizations based on current information. Additionally, it features more precise text rendering for marketing mockups and can even translate text within an image.
This launch follows a period of controversy for Google's image generation tools. In early 2024, the previous Gemini model was widely criticized for generating historically inaccurate images, such as depicting Nazi-era German soldiers as people of color, which led Google to temporarily halt the ability to generate images of people. Google's CEO, Sundar Pichai, called the outputs "completely unacceptable."
Under the hood, Nano Banana 2 offers resolutions from 512px up to 2K for free users, with the ability to upscale to 4K. It also supports a wider range of aspect ratios, including 4:1 and 8:1, for different formats like social media stories or widescreen presentations. To help identify AI-generated content, Google is continuing its use of SynthID for invisible watermarking and now includes C2PA Content Credentials.
The competitive landscape for AI image generation includes major players like OpenAI's DALL-E 3 and the artist-favored Midjourney, which is often considered the gold standard for image quality and user control. While Google aims for speed and accessibility, competitors have focused on different strengths, with Adobe Firefly emphasizing commercial safety and Ideogram specializing in rendering legible text within images.