DeepMind’s Aletheia released
- DeepMind announced Aletheia, a system using Gemini 3 Deep Think to run autonomous mathematical research agents.
- The project focuses on long‑horizon reasoning, planning, and verification in formal maths work.
- Researchers view Aletheia as a testbed for autonomous research loops that could transfer to code and scientific domains. (infoq.com)
Google DeepMind has released Aletheia, an artificial intelligence system built to do autonomous mathematical research with Gemini 3 Deep Think. (deepmind.google)

Formal mathematics means writing proofs in a way a computer or expert can check step by step, instead of relying on intuition or skipped algebra. DeepMind said Aletheia works in a loop: it generates candidate proofs, verifies them, and revises them until it either finds a solution or gives up. (arxiv.org)

In a February 2026 paper, DeepMind researchers said Aletheia is powered by an advanced version of Gemini Deep Think, extra inference-time compute, and tool use for navigating long proofs and mathematical literature. The same paper describes the target as “autonomous mathematics research,” not just contest problem solving. (arxiv.org)

That distinction follows DeepMind’s July 21, 2025 announcement that an advanced Gemini with Deep Think reached gold-medal standard at the International Mathematical Olympiad, a six-problem competition for pre-university students. Research math is a different task: problems are open-ended, proofs can run for pages, and there is no answer key. (deepmind.google)

DeepMind said Aletheia scored up to 90% on IMO-ProofBench Advanced, a benchmark for proof-writing, as more compute was used at inference time. The company also said its internal “FutureMath Basic” benchmark suggests that scaling behavior continues beyond Olympiad-level work into PhD-level exercises. (deepmind.google)

A second paper reported that Aletheia entered the inaugural FirstProof challenge, a set of 10 unpublished research-level problems, under a zero-human-intervention setup. The authors said expert reviewers judged 6 of the 10 submissions as solved by majority assessment, with disagreement on one of those problems. (arxiv.org)

InfoQ reported the release on April 19, 2026, framing Aletheia as a move toward fully autonomous agentic math research rather than a one-shot theorem prover. That matters inside the field because the system is being presented as a reusable research loop: propose, check, revise, and only submit when confidence is high. (infoq.com)

Google has also tied the project to broader product and platform plans around Deep Think. In February, the company said Gemini 3 Deep Think was becoming available to Google AI Ultra subscribers in the Gemini app, while researchers, engineers, and enterprises could register interest for early Gemini API access. (blog.google)

DeepMind’s own write-up says the same agent pattern could transfer from formal math to coding and scientific research, where long chains of planning and verification also matter. For now, Aletheia is the clearest public test of whether a large model can run a research process, not just produce a polished final answer. (deepmind.google)
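
For readers who want a concrete picture of the loop described in this article (generate a candidate, verify it, revise it, and give up when the budget runs out), here is a minimal Python sketch. It is not DeepMind's code and does not reflect how Aletheia is actually built. Every name in it (`research_loop`, `Verdict`, `solve_square`, `max_rounds`) is hypothetical, and the "problem" is a toy stand-in: finding an exact integer square root, where the verifier's feedback steers the next revision.

```python
from dataclasses import dataclass
from typing import Callable, Optional, TypeVar

C = TypeVar("C")  # candidate type (a proof attempt in the article's setting)


@dataclass
class Verdict:
    ok: bool        # True if the verifier accepts the candidate
    feedback: str   # hint handed to the reviser when the candidate fails


def research_loop(
    generate: Callable[[], C],
    verify: Callable[[C], Verdict],
    revise: Callable[[C, Verdict], C],
    max_rounds: int = 20,
) -> Optional[C]:
    """Hypothetical loop: return an accepted candidate, or None on giving up."""
    candidate = generate()
    for _ in range(max_rounds):
        verdict = verify(candidate)
        if verdict.ok:
            return candidate          # only "submit" once verification passes
        candidate = revise(candidate, verdict)
    return None                       # budget exhausted: give up


def solve_square(target: int, max_rounds: int = 20) -> Optional[int]:
    """Toy stand-in problem: find x with x * x == target by bisection."""
    bounds = {"lo": 0, "hi": target}

    def generate() -> int:
        return (bounds["lo"] + bounds["hi"]) // 2

    def verify(x: int) -> Verdict:
        if x * x == target:
            return Verdict(True, "exact")
        return Verdict(False, "too low" if x * x < target else "too high")

    def revise(x: int, verdict: Verdict) -> int:
        # Use the verifier's feedback to narrow the search range.
        if verdict.feedback == "too low":
            bounds["lo"] = x + 1
        else:
            bounds["hi"] = x - 1
        return (bounds["lo"] + bounds["hi"]) // 2

    return research_loop(generate, verify, revise, max_rounds)


if __name__ == "__main__":
    print(solve_square(144))  # 12
    print(solve_square(145))  # None: no exact root, so the loop gives up
```

The point of the sketch is the control flow, not the toy problem: the verifier decides whether anything is returned at all, and a run that never passes verification ends with no answer rather than a best guess.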