AlphaGeometry 2 cracks IMO geometry

- Google DeepMind’s AlphaGeometry 2 has moved from silver-medal-level geometry to gold-medalist benchmark performance on IMO geometry problems from 2000 through 2024.
- The jump came from expanding its formal geometry language from 66% to 88% coverage, helping it solve 84% of the full set.
- That matters because proof-grade math AI is shifting from clever pattern matching toward systems that can actually reason formally.

Geometry proofs are a weirdly good stress test for AI. They look like high-school math, but the hard part is not calculation. It is inventing the right construction, then proving every step cleanly. That combination has been a wall for years. AlphaGeometry 2 looks like a real crack in it.

Google DeepMind’s latest system now clears gold-medalist benchmark territory on Olympiad geometry sets, which is a meaningful jump from the original AlphaGeometry and from last year’s broader silver-medal IMO result.

### What actually improved?

The short version is that AlphaGeometry 2 got better at both halves of the job. It uses a neural model to suggest useful moves, like where to add an auxiliary point, and a symbolic engine to check whether those moves really lead to a proof. The new paper says the system’s formal language now handles tougher cases, including moving objects, linear relations among objects, and constructive problems that the earlier version simply could not express.

### Why does “coverage” matter so much?

Because an AI cannot solve a proof problem it cannot even write down in its own internal language. That was one of the big bottlenecks in the first AlphaGeometry. DeepMind says AlphaGeometry 2 lifted language coverage from 66% to 88% across IMO geometry problems from 2000 to 2024. In plain English: on many problems that used to be out of scope, the system can now actually start reasoning instead of failing at translation.

### How good is the result, exactly?

There are two headline numbers, and both matter. On the full IMO 2000–2024 geometry set, the paper reports an 84% solve rate, up from 54% for the original AlphaGeometry. On a benchmark scored against human Olympiad performance, the authors say AlphaGeometry 2 surpassed the average gold medalist. Those are not the same claim, but together they say the system is operating in elite territory on this narrow domain.

### Is this the same as winning the IMO?

Not really. The IMO has six problems across algebra, number theory, combinatorics, and geometry, under strict contest conditions. AlphaGeometry 2 is a geometry specialist. In the 2024 competition-style result, DeepMind paired it with AlphaProof, and the combined system solved four of six problems for 28 points, a silver-medal standard. AlphaProof handled algebra and number theory.

### So why is geometry special?

Because geometry forces structure. A language model can bluff its way through prose. It cannot bluff a valid Euclidean proof. Every point, angle, and equality has to fit. That makes geometry a bit like software verification: one bad step breaks the whole chain. That is why AlphaGeometry’s neurosymbolic design matters.

### What changed beyond raw model size?

The paper points to search. AlphaGeometry 2 uses a stronger language model, a faster symbolic engine, and a search setup that lets multiple trees share useful discoveries.
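
To make the propose-and-verify idea concrete, here is a toy Python sketch. It is not DeepMind's code or formal language: the fact strings, the two rules, and the `propose` stub (a hard-coded stand-in for the neural model) are all assumptions made up for illustration. It shows only the division of labor the article describes: something suggests an auxiliary construction, and a rule-based engine checks whether the goal then follows.

```python
from typing import FrozenSet, List, Optional, Set, Tuple

# A rule is (premises, conclusion). Facts are plain strings in this toy.
Rule = Tuple[FrozenSet[str], str]


def deduce(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Symbolic half: forward-chain until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def propose(known: Set[str]) -> List[Tuple[str, Set[str]]]:
    """Hypothetical stand-in for the neural model: a fixed list of
    candidate auxiliary constructions and the facts each one adds."""
    return [
        ("foot D of altitude from A", {"D is foot of altitude from A"}),
        ("midpoint M of BC", {"M is midpoint of BC"}),
    ]


def solve(premises: Set[str], goal: str, rules: List[Rule]) -> Optional[List[str]]:
    """Return the constructions needed to reach the goal, or None."""
    known = deduce(premises, rules)
    if goal in known:
        return []  # provable with no extra construction
    for name, new_facts in propose(known):
        # Accept a suggestion only if the symbolic engine confirms it.
        if goal in deduce(known | new_facts, rules):
            return [name]
    return None


# Toy problem: in triangle ABC with AB = AC, show the base angles are equal.
RULES: List[Rule] = [
    (frozenset({"AB = AC", "M is midpoint of BC"}), "tri ABM cong tri ACM"),
    (frozenset({"tri ABM cong tri ACM"}), "angle ABC = angle ACB"),
]

result = solve({"AB = AC"}, "angle ABC = angle ACB", RULES)
print(result)  # -> ['midpoint M of BC']
```

In this sketch the altitude suggestion is tried first and rejected because no rule turns it into a proof, while the midpoint suggestion goes through. The real system's proposer, rule set, and search are far richer, so this is a picture of the loop only.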
