Google Releases Gemini 3.1 Pro with Doubled Reasoning

Google has launched its Gemini 3.1 Pro model, which it claims has more than double the reasoning capability of its predecessor. The model scored 77.1% on the ARC-AGI-2 benchmark for general reasoning. It is now available via the Gemini app, NotebookLM, Vertex AI, and directly within Android Studio.

- For developers, the model features a large 1 million token context window for input and has significantly increased its output capacity to 65,536 tokens. This allows it to process entire code repositories and generate extensive, complete code files in a single turn, avoiding a previous limitation of truncating code at around 21,000 tokens.
- On software engineering benchmarks, Gemini 3.1 Pro shows competitive performance, achieving an 80.6% pass rate on SWE-Bench Verified for resolving real-world GitHub issues. However, in terminal-heavy agentic tasks measured by Terminal-Bench 2.0, it scores 68.5%, while some competing models like GPT-5.3-Codex score higher in that specific domain.
- This release is heavily focused on "agentic" workflows, with Google introducing a specialized `gemini-3.1-pro-preview-customtools` endpoint to improve the reliability of tool use in software engineering and automated tasks. It is also the core engine for Google's new agent development platform, Google Antigravity.
- A novel feature for frontend engineers is the model's ability to generate animated, website-ready SVGs directly from text descriptions. Because these animations are code-based rather than pixel-based, they have very small file sizes and remain sharp at any resolution.
- The model introduces a three-tier "thinking system" (low, medium, high), allowing developers to balance latency with reasoning depth. The new "medium" setting on 3.1 Pro is roughly equivalent in quality to the "high" setting on the previous 3.0 Pro model, but with lower latency.
- While benchmarks for abstract reasoning and scientific knowledge are high, some developers on Hacker News report that for practical, day-to-day coding within an IDE, models like Claude Opus can feel more effective at tool use and maintaining context during complex tasks.
- Gemini 3.1 Pro is priced aggressively at $2.00 per million input tokens and $12.00 per million output tokens. This pricing undercuts some competitors, like Anthropic's Claude Sonnet 4.6, while outperforming it on several benchmarks.
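
To make the quoted limits and prices concrete, here is a rough Python sketch of what a single large request might cost. It uses only the figures reported above (1 million input tokens, 65,536 output tokens, $2.00 and $12.00 per million tokens) and assumes a flat per-token rate. The function names are hypothetical, and the sketch ignores any tiered, cached-token, or other pricing rules that may apply in practice.

```python
# Figures as quoted in this briefing; illustrative only, not a live price list.
INPUT_USD_PER_MILLION = 2.00
OUTPUT_USD_PER_MILLION = 12.00
MAX_INPUT_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 65_536


def fits_limits(input_tokens: int, output_tokens: int) -> bool:
    """Check a request against the quoted context and output limits."""
    return (0 <= input_tokens <= MAX_INPUT_TOKENS
            and 0 <= output_tokens <= MAX_OUTPUT_TOKENS)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate one request's cost, assuming flat per-token rates (an assumption)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total_micro_usd = (input_tokens * INPUT_USD_PER_MILLION
                       + output_tokens * OUTPUT_USD_PER_MILLION)
    return total_micro_usd / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: an 800,000-token repository in, a full-length file out.
    tokens_in, tokens_out = 800_000, 65_536
    print("fits limits:", fits_limits(tokens_in, tokens_out))
    print(f"estimated cost: ${estimate_cost_usd(tokens_in, tokens_out):.2f}")
```

Under these assumptions, a request that fills most of the context window and uses the full output budget comes to roughly $2.39, with the output tokens accounting for about a third of the total despite being a small fraction of the token count.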
