Google Gemini Adds AI Music Generation Feature

Google has launched Lyria 3, a new music generation model integrated into the Gemini app. The tool allows users to create 30-second songs from text, photo, or video prompts. The update positions Gemini as a more comprehensive multimodal foundation model for both creative and enterprise applications.

- To mitigate copyright concerns, all audio generated by Lyria 3 is embedded with SynthID, an imperceptible, inaudible watermark designed to identify the content as AI-generated. This watermarking technology is robust against common modifications like MP3 compression or noise addition.
- The development and training of large-scale models like Lyria and Gemini are powered by Google's custom-designed Tensor Processing Units (TPUs). This vertical integration of hardware and models is a strategic advantage, allowing for co-design that optimizes performance and cost-effectiveness for training and inference at scale.
- For enterprise developers, Lyria is accessible through the Vertex AI API, allowing for the integration of music generation into third-party applications. This opens up go-to-market opportunities for use cases in advertising, gaming, and scalable content creation.
- The recurring operational expense of running a model at scale, known as inference cost, is a critical factor for the profitability of AI applications. This cost is continuous and usage-based, making the performance-per-dollar of the underlying hardware, like Google's TPUs versus competitor GPUs, a key consideration for MLOps teams.
- The competitive landscape for AI music generation includes startups like Suno, which has gained significant traction and is valued at an estimated $2 billion, and other major tech players like Adobe and potentially OpenAI. These companies are competing to offer more advanced features, such as generating full-length songs with vocals and providing DAW-like editing capabilities.
- Lyria 3 is part of a broader family of music generation models from Google DeepMind, which also includes Lyria RealTime for interactive, streaming music creation. This signals a strategy to address different segments of the market, from casual creators to developers building real-time interactive experiences.
- While the consumer-facing feature in the Gemini app generates 30-second clips, the underlying Lyria model available via the Vertex AI API for developers produces instrumental tracks that are 32.8 seconds long.
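
To make the performance-per-dollar point about inference cost concrete, here is a minimal Python sketch of the arithmetic an MLOps team might run. Everything in it is an assumption for illustration: the `Accelerator` class, the two option names, the hourly prices, and the throughput figures are hypothetical and do not reflect actual TPU or GPU pricing or Lyria performance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    """A hypothetical hardware option; all figures are illustrative."""
    name: str
    hourly_cost_usd: float  # assumed rental price per hour
    clips_per_hour: float   # assumed number of clips generated per hour

    def cost_per_clip(self) -> float:
        # Usage-based cost: hourly price spread over hourly throughput.
        return self.hourly_cost_usd / self.clips_per_hour


def monthly_inference_cost(acc: Accelerator, clips_per_month: int) -> float:
    """Recurring cost of serving a given monthly volume on one option."""
    return acc.cost_per_clip() * clips_per_month


# Two made-up options: A costs more per hour but generates clips faster.
option_a = Accelerator("accelerator_a", hourly_cost_usd=4.0, clips_per_hour=800.0)
option_b = Accelerator("accelerator_b", hourly_cost_usd=3.0, clips_per_hour=500.0)

if __name__ == "__main__":
    for acc in (option_a, option_b):
        total = monthly_inference_cost(acc, clips_per_month=1_000_000)
        print(f"{acc.name}: ${acc.cost_per_clip():.4f} per clip, "
              f"${total:,.2f} per month")
```

Under these assumed numbers, the option with the higher hourly price is the cheaper one per clip, which is why throughput per dollar, rather than the hourly rate alone, is the figure to compare.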
