Google Gemini Generates 3D Models

Google has upgraded its Gemini 3 Deep Think model to allow users to generate 3D-printable models directly from simple sketches. The development represents a significant advance in generative computer vision, connecting user-friendly inputs with rapid prototyping capabilities.

- The sketch-to-3D model capability is an upgrade to Gemini 3 Deep Think, a specialized reasoning mode designed for complex problem-solving in science and engineering. This feature is now accessible to Google AI Ultra subscribers through the Gemini app and to select researchers and enterprises via the Gemini API.
- This development follows Google's earlier work in 3D generation, including the DreamFusion model, which utilized a pretrained text-to-image diffusion model (Imagen) and Neural Radiance Fields (NeRF) to create 3D objects from text. The current upgrade focuses on translating the geometry and shapes from a user's drawing into a printable file.
- The underlying technology for these types of generative models often involves techniques like Score Distillation Sampling (SDS), which leverages 2D image generation models to create 3D shapes. Researchers at institutions like MIT have been working on refining these methods to produce sharper, higher-quality 3D outputs without requiring expensive retraining of the AI models.
- Competitors in the generative 3D space include OpenAI's Shap-E and Point-E, which are open-source, and Nvidia's LATTE3D, which aims to generate textured 3D meshes from text prompts in under a second. The broader AI-generated 3D asset market was valued at $1.63 billion in 2024 and is projected to grow significantly.
- Beyond sketches, related research is exploring the reconstruction of entire 3D scenes from a single photograph and turning 2D storyboard sketches into 3D animations, indicating a broader trend toward simplifying 3D content creation.
- The upgraded Deep Think model has also demonstrated high performance on various academic benchmarks, achieving top scores on reasoning tasks like ARC-AGI-2 (84.6%) and competitive programming challenges on Codeforces (Elo of 3455). It also showed gold medal-level results on the written sections of the 2025 International Physics and Chemistry Olympiads.
- The model is capable of generating code for 3D visualizations, as demonstrated by its ability to create a 3D journey through the universe from a proton to the observable universe, showcasing a significant improvement in "vibe coding" over previous versions. User prompts for 3D model generation can be highly detailed, specifying geometry, structural decomposition, and even constraints for 3D printing like ensuring the mesh is "watertight".
- This advancement is part of a larger trend of integrating generative AI into 3D modeling workflows to accelerate creation time from hours to seconds. The goal is to move beyond theoretical applications to provide practical tools for researchers and engineers to interpret complex data and model physical systems.
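The "watertight" constraint mentioned above can be made concrete. A common necessary condition for a printable triangle mesh is that every edge is shared by exactly two faces, so the surface has no holes or dangling sheets. The following is a minimal sketch of that check in plain Python. The function name `is_watertight` and the face-list representation are illustrative assumptions, not part of Gemini or any slicer's API, and the check does not cover face orientation or self-intersections.

```python
from collections import Counter


def is_watertight(faces):
    """Return True if every undirected edge belongs to exactly two triangles.

    `faces` is a list of (a, b, c) vertex-index triples (an assumed,
    illustrative representation). This is a necessary condition for a
    closed surface; it does not check winding order or self-intersection.
    """
    edge_counts = Counter()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            # Store each edge with sorted endpoints so (u, v) == (v, u).
            edge_counts[(min(u, v), max(u, v))] += 1
    # An empty mesh is not a closed surface.
    return bool(edge_counts) and all(n == 2 for n in edge_counts.values())


# A tetrahedron is the smallest closed triangle mesh.
tetrahedron = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
# Dropping one face leaves three boundary edges, i.e. a hole.
open_shell = tetrahedron[:-1]

print(is_watertight(tetrahedron))
print(is_watertight(open_shell))
```

A check like this could be run on a generated mesh before sending it to a slicer; a failed check suggests the mesh needs repair or regeneration.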
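For readers unfamiliar with Score Distillation Sampling, the gradient introduced in the DreamFusion paper can be written as below. This is background on the published technique only; the source does not state that the Gemini upgrade uses this formulation. Here θ are the parameters of the 3D representation, x = g(θ) is an image rendered from it, z_t is that image noised to diffusion timestep t, ε is the injected noise, ε̂_φ is the frozen 2D diffusion model's noise prediction given the text prompt y, and w(t) is a timestep weighting.

```latex
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}\bigl(\phi,\, x = g(\theta)\bigr)
  \triangleq
  \mathbb{E}_{t,\epsilon}\!\left[
    w(t)\,\bigl(\hat{\epsilon}_\phi(z_t;\, y,\, t) - \epsilon\bigr)\,
    \frac{\partial x}{\partial \theta}
  \right]
```

In words, the 3D parameters are nudged so that renders from random viewpoints look more plausible to the 2D model, without backpropagating through the diffusion network itself.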

