AlphaGeometry 2 cracks IMO geometry
- Google DeepMind’s AlphaGeometry 2 has moved from silver-medal-level geometry to gold-medalist benchmark performance on IMO geometry problems from 2000 through 2024.
- The jump came from expanding its formal geometry language from 66% to 88% coverage, helping it solve 84% of the full set.
- That matters because proof-grade math AI is shifting from clever pattern matching toward systems that can actually reason formally.

Geometry proofs are a weirdly good stress test for AI. They look like high-school math, but the hard part is not calculation: it’s inventing the right construction, then proving every step cleanly. That combination has been a wall for years.

AlphaGeometry 2 looks like a real crack in it. Google DeepMind’s latest system now clears gold-medalist benchmark territory on Olympiad geometry sets, which is a meaningful jump from the original AlphaGeometry and from last year’s broader silver-medal IMO result.

### What actually improved?

The short version is that AlphaGeometry 2 got better at both halves of the job. It uses a neural model to suggest useful moves, like where to add an auxiliary point, and a symbolic engine to check whether those moves really lead to a proof. The new paper says the system’s formal language now handles tougher cases, including moving objects, linear relations among objects, and constructive problems that the earlier version simply could not express.

### Why does “coverage” matter so much?

Because an AI cannot solve a proof problem it cannot even write down in its own internal language. That was one of the big bottlenecks in the first AlphaGeometry. DeepMind says AlphaGeometry 2 lifted language coverage on IMO geometry problems from 2000 to 2024 from 66% to 88%. In plain English, the system can now start reasoning on many problems that used to fail at the translation step.

### How good is the result, exactly?

There are two headline numbers, and both matter. On the full IMO 2000–2024 geometry set, the paper reports an 84% solve rate, up from 54% for the original AlphaGeometry. On a benchmark scored against human Olympiad performance, the authors say AlphaGeometry 2 surpassed the average gold medalist. Those are not the same claim, but together they say the system is operating in elite territory on this narrow domain.

### Is this the same as winning the IMO?

Not really.
The IMO has six problems across algebra, number theory, combinatorics, and geometry, under strict contest conditions. AlphaGeometry 2 is a geometry specialist. In the 2024 competition-style result, DeepMind paired it with AlphaProof, and the combined system solved four of six problems for 28 points, a silver-medal standard. AlphaProof handled algebra and number theory.

### So why is geometry special?

Because geometry forces structure. A language model can bluff its way through prose. It cannot bluff a valid Euclidean proof. Every point, angle, and equality has to fit. That makes geometry a bit like software verification: one bad step breaks the whole chain. That is why AlphaGeometry’s neurosymbolic design matters.

### What changed beyond raw model size?

The paper points to search. AlphaGeometry 2 uses a stronger language model, a faster symbolic engine, and a search setup that lets multiple trees share useful discoveries.
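
To make the "shared discoveries" idea concrete, here is a toy sketch in Python. It is not DeepMind's implementation. It assumes a made-up problem where facts are plain strings, the "symbolic engine" is simple forward chaining over if-then rules, and each search tree tries one auxiliary construction. Every name in it (`deduction_closure`, `run_search`, `RULES`, `TREES`, the fact labels) is hypothetical and chosen only for illustration.

```python
def deduction_closure(facts, rules):
    """Toy stand-in for a symbolic engine: forward-chain until nothing new appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


def run_search(premises, goal, rules, trees, share=True):
    """Let each tree try its auxiliary constructions in turn.

    With share=True every tree reads from and writes to one common pool of
    facts. With share=False each tree only sees what it derived itself.
    Returns the name of the tree that reaches the goal, or None.
    """
    shared = set()
    private = {name: set() for name in trees}
    queues = {name: list(candidates) for name, candidates in trees.items()}

    while any(queues.values()):
        for name, queue in queues.items():
            if not queue:
                continue
            aux = queue.pop(0)  # the construction a neural model might propose
            pool = shared if share else private[name]
            known = deduction_closure(premises | pool | {aux}, rules)
            if goal in known:
                return name
            pool |= known - premises  # record new facts (in-place set update)
    return None


# Hypothetical problem: the goal needs two lemmas, and each lemma needs a
# different auxiliary construction.
RULES = [
    ({"given", "aux_midpoint"}, "lemma_parallel"),
    ({"given", "aux_circle"}, "lemma_cyclic"),
    ({"lemma_parallel", "lemma_cyclic"}, "goal"),
]
TREES = {"tree_a": ["aux_midpoint"], "tree_b": ["aux_circle"]}

print(run_search({"given"}, "goal", RULES, TREES, share=True))   # tree_b
print(run_search({"given"}, "goal", RULES, TREES, share=False))  # None
```

In this toy setup neither tree can finish alone, because each one only finds half of what the goal requires. With a shared pool, the second tree picks up the first tree's lemma and closes the proof. The real system's search is far richer than this, but the sketch shows why pooling intermediate facts can turn two dead ends into one solution.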