Gary Marcus checks OpenAI math claim

- Gary Marcus wrote on May 21 that he was checking recent math claims from OpenAI and Anthropic and urged readers to verify the underlying details.
- Thomas Bloom, the mathematician Marcus highlighted, said the OpenAI counterexample may have been unusually well suited to LLM-assisted discovery and called for replication.
- Marcus’s May 21 Substack post linked to Bloom’s comments, OpenAI’s announcement and sample verification material for readers to inspect.

Gary Marcus used a May 21 Substack post to press for caution around fresh AI-math headlines from OpenAI and Anthropic, arguing that readers should inspect the underlying claims rather than rely on promotional framing alone. Marcus’s post, “Checking the math behind OpenAI and Anthropic’s latest headlines,” appeared on his Substack feed on May 21, according to the publication archive.

OpenAI had announced on May 20 that one of its models “has disproved a central conjecture in discrete geometry,” according to the company’s research page. TechCrunch reported the claim concerned an Erdős problem in geometry first posed in 1946, and said OpenAI published supporting remarks from mathematicians including Thomas Bloom, Noga Alon and Melanie Wood.

### What exactly was Marcus checking?

Marcus’s post was aimed at two separate strands of AI-math publicity: OpenAI’s claim of a new result on a long-standing Erdős-related problem and Anthropic-related headlines that also suggested notable mathematical progress, according to the Substack page surfaced in search results. (garymarcus.substack.com) His framing, as reflected in the post title and subtitle, was to “check” the claims and read “the fine print.” (openai.com)

The immediate backdrop was a burst of AI-assisted activity around Erdős problems. Thomas Bloom wrote on the Erdős Problems forum on May 15 that the site had seen “a lot of activity in the last few weeks” since the release of GPT 5.5, including “a few new solutions,” with some verified and others still awaiting verification. Bloom added that the site “is not an AI benchmark” and said the next step should be to “carefully try and understand these solutions.” (garymarcus.substack.com)

### Why did Bloom’s comments matter here?

Thomas Bloom is not just a commentator on the sidelines. The Erdős Problems site identifies him as its creator and maintainer and as a research fellow at the University of Manchester.
TechCrunch also identified Bloom as one of the mathematicians whose remarks accompanied OpenAI’s announcement. Bloom’s role matters because OpenAI’s claim was being discussed inside an existing community already tracking AI-assisted work problem by problem. (erdosproblems.com)

On the Erdős Problems site, one entry says an “internal OpenAI model” provided a negative answer to part of Problem #1091, and that page was last edited on April 9. That does not by itself settle the broader headline, but it shows the OpenAI result was being logged in a specialist venue Marcus pointed readers toward for closer scrutiny. (erdosproblems.com)

### What was the caution, specifically?

Marcus’s caution was not that the OpenAI claim had been disproved. It was that readers should separate a mathematically interesting result from the broader marketing leap that often follows such announcements, according to the Substack post description and the materials he linked.

Bloom’s own public comments pointed in the same direction. On May 15, he said mathematicians care not only whether a problem is true or false, but also how to understand the reason, communicate it and place it in context. (erdosproblems.com) That emphasis on verification, explanation and replication is the standard Marcus said readers should apply to the new claims. (garymarcus.substack.com)

### How much of OpenAI’s claim had outside support?

TechCrunch reported on May 20 that OpenAI, unlike in an earlier episode involving overstated Erdős claims, had this time published supporting remarks from outside mathematicians including Bloom. The article said Bloom had previously criticized an earlier OpenAI-related claim as “a dramatic misrepresentation,” and presented the new episode as more carefully backed. (erdosproblems.com)

Scientific American separately reported that mathematicians were treating the result as potentially publishable at the top level if a human had produced it unaided.
Marcus’s intervention did not erase that outside interest; it focused attention on what had actually been shown, by whom, and under what conditions. (techcrunch.com)

### What comes next?

The next step is likely to be more checking by mathematicians rather than a single corporate update. Bloom said on May 15 that some AI-assisted solutions were still awaiting verification, and Marcus’s May 21 post directed readers to examine linked comments and sample checks themselves. (erdosproblems.com) (scientificamerican.com)
