Google Releases Gemini 3.1 Flash-Lite
Google has unveiled Gemini 3.1 Flash-Lite, its fastest and most cost-efficient AI model to date. The model is engineered for high-volume, low-latency workloads, making it ideal for developers building apps with real-time features like smart assistants and live chat.
Aimed at cost-sensitive workloads, Gemini 3.1 Flash-Lite is launching in preview via the Gemini API in Google AI Studio and Vertex AI. It is priced at $0.25 per 1 million input tokens and $1.50 per 1 million output tokens, making it Google's most affordable model in the Gemini 3 series.
The new model shows significant performance gains over its predecessors: compared to Gemini 2.5 Flash, its Time to First Answer Token is 2.5 times faster and its output speed is 45% higher. Despite being a "Lite" version, it matches or improves on the quality of the more expensive 2.5 Flash. On the Arena.ai Leaderboard, Gemini 3.1 Flash-Lite achieves an Elo score of 1432, placing it in the same league as many open-weight and last-generation commercial models. It also surpasses previous, larger Gemini models on key reasoning and multimodal understanding benchmarks.
Developers can tune the model's behavior by adjusting its "Thinking Levels." This setting trades near-instantaneous responses for simple queries against more deliberate reasoning for complex tasks, giving granular control over latency and cost.
The model is positioned for specific use cases such as high-frequency translations, large-scale content moderation, dynamic user interface generation, and running simulations. Its speed and cost-effectiveness are aimed at developers building applications that must handle a massive number of requests in real time.
Releasing a "Flash-Lite" version first is a strategic shift for Google, which has traditionally led with more powerful "Flash" or "Pro" models. The move prioritizes getting a low-cost, high-efficiency model into the hands of developers for large-scale applications.
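For readers sizing a high-volume workload, the per-token prices quoted above can be turned into a rough cost estimate. The sketch below is only arithmetic on the two preview prices from this article; the function name, the constants, and the example workload (1 million requests a day at 200 input and 100 output tokens each) are illustrative assumptions, not figures from Google, and actual billing may differ.

```python
from decimal import Decimal

# Preview prices quoted in the article, in USD per 1 million tokens.
INPUT_USD_PER_MILLION = Decimal("0.25")
OUTPUT_USD_PER_MILLION = Decimal("1.50")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the quoted preview rates.

    Hypothetical helper for illustration; it is not part of any Google SDK.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Assumed workload (illustrative numbers only): 1,000,000 short
    # translation requests per day, 200 input and 100 output tokens each.
    per_request = estimate_cost_usd(200, 100)
    daily = per_request * 1_000_000
    print(f"per request: ${per_request:.6f}")
    print(f"per day:     ${daily:.2f}")
```

Under those assumed numbers, each request costs $0.0002 and the daily total comes to $200.00. Output tokens cost six times as much as input tokens at these rates, so response length is likely to dominate the bill for workloads that generate long outputs.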