Cloudflare praises Claude Mythos reasoning

- Cloudflare said on May 18 that Anthropic's Claude Mythos Preview marked a step forward after tests on more than 50 repositories. - Grant Bourzikas wrote Mythos' exploit-chain reasoning “looks like the work of a senior researcher,” distinguishing it from automated scanners in Cloudflare's evaluation. - Anthropic launched Project Glasswing on April 7, and Reuters reported May 20 that debate over Mythos' real-world risk continues.

Cloudflare said this week that Anthropic’s Claude Mythos Preview showed a level of security reasoning beyond earlier models it had tested on its own systems. In a May 18 blog post, Cloudflare said it pointed the model at more than 50 internal and open-source repositories as part of Anthropic’s Project Glasswing. The company said Mythos could do more than flag isolated bugs: it could connect smaller weaknesses into working exploit paths and generate code to test whether a suspected flaw was real. PC Gamer reported the assessment on May 20, highlighting Cloudflare’s description of the model’s reasoning as resembling the work of a senior researcher. ### What exactly did Cloudflare say Mythos was doing differently? Cloudflare said on May 18 that Mythos Preview represented “a real step forward” after months of testing security-focused large language models on its infrastructure. The company said the jump was not just an incremental improvement over previous frontier models but “a different kind of tool doing a different kind of work.” (blog.cloudflare.com) Grant Bourzikas, who wrote the Cloudflare post, said the model stood out in two areas: exploit-chain construction and proof generation. Cloudflare said Mythos could take several attack primitives, reason about how they fit together into a working exploit, then write and run code in a scratch environment to test its own hypothesis. Bourzikas wrote that “the reasoning it shows along the way looks like the work of a senior researcher rather than the output of an automated scanner.” (blog.cloudflare.com) ### Why did Cloudflare focus on exploit chains instead of just bug counts? Cloudflare said real attacks rarely rely on a single flaw and instead combine multiple smaller weaknesses. In its description, the company said Mythos could move from a bug such as a use-after-free issue to arbitrary read-and-write access, then to control-flow hijacking and return-oriented programming chains. (blog.cloudflare.com) Anthropic made a similar case when it introduced Mythos Preview on April 7. In a technical post, Anthropic said the model was “strikingly capable” at computer security tasks and said it had found and exploited zero-day vulnerabilities in every major operating system and every major web browser during testing. Anthropic also said more than 99% of the vulnerabilities it found had not yet been patched, limiting what it could disclose publicly. (blog.cloudflare.com) ### Does Cloudflare’s praise settle the broader security debate? Reuters reported on May 20 that early fears Mythos would dramatically turbocharge hacking now looked overstated a month after release, based on the evidence available so far. Reuters said practitioners had taken a more measured view than policymakers, even after Anthropic’s April warnings and government discussions about possible controls on how advanced models are released. (red.anthropic.com) Isaac Evans, founder and chief executive of Semgrep, told Reuters there was “a really big communication gap between practitioners and policymakers.” Evans called Mythos “a real technical advance,” but said the broader response was “not substantiated by what we actually know about how those capabilities will translate in the field.” Reuters also reported that experts with early access said AI-assisted vulnerability discovery had already been improving for months or years, and that the harder problem remained validating, prioritizing and fixing flaws. (money.usnews.com) ### Where does cost and deployment fit into this story? Cloudflare’s post focused on capability and workflow rather than price. The company said architecture and process around these models would need to change if they were to be used at scale, suggesting that the operational harness around the model still matters alongside the model itself. PC Gamer’s May 20 article framed Cloudflare’s write-up as one of the clearest public descriptions so far of what Mythos may be good at in practice. (money.usnews.com) Anthropic, for its part, has kept Mythos in preview and tied access to Project Glasswing, its effort to use the model to secure critical software before wider release. (blog.cloudflare.com) ### What happens next? Anthropic said on April 7 that Project Glasswing was meant to help secure critical software and prepare the industry for new defensive practices around models like Mythos. Cloudflare said on May 18 that it is still working through how these systems should be used at scale. Any broader release decision, additional technical disclosures, or new guardrails from Anthropic and participating security teams will be the next concrete milestones to watch. (pcgamer.com) (blog.cloudflare.com)

Cloudflare praises Claude Mythos reasoning

Get your own daily briefing