Anthropic paused a model over vulnerabilities

Anthropic reportedly kept a new model private after the model found thousands of high-severity vulnerabilities across major operating systems and browsers, and launched a patching effort dubbed 'Project Glasswing' to close those holes before any release. The episode highlights how more capable agentic systems can expand the attack surface and force vendors to delay public rollouts for security work. (artificialintelligence-news.com)

Anthropic built a new model that was good enough at hacking that it decided not to put it on the public internet. On April 7, 2026, the company said its unreleased Claude Mythos Preview had already found thousands of high-severity software flaws, including bugs in every major operating system and web browser. (anthropic.com)

A software vulnerability is a mistake in code that acts like a hidden broken lock in a building. A zero-day vulnerability is the worst kind of broken lock, because the vendor does not know it exists yet and has had zero days to fix it. (red.anthropic.com)

For years, the hard part in hacking was not just spotting a bug but turning it into a working break-in. Anthropic says Mythos can do both steps on its own, moving from reading code to producing proof-of-concept exploits across major systems and browsers. (red.anthropic.com) Anthropic says this jump did not come from training a “hacker model” from scratch. It says the capability showed up as a side effect of a more general model getting better at coding, reasoning, and acting autonomously over long tasks. (red.anthropic.com)

The company’s own examples are unusually concrete. In one benchmark on Firefox 147 JavaScript engine bugs, Anthropic says its older Claude Opus 4.6 produced working shell exploits twice in several hundred attempts, while Mythos succeeded 181 times and gained register control 29 more times. (helpnetsecurity.com) Anthropic says Mythos also scanned about 7,000 entry points from open-source repositories in the OSS-Fuzz testing corpus. In that test, Claude Sonnet 4.6 and Claude Opus 4.6 each reached full control-flow hijack once, while Mythos did it ten times on fully patched targets. (helpnetsecurity.com)

One reason the company is being vague is that most of the bugs are still live. Anthropic says more than 99 percent of the vulnerabilities it found were still unpatched when it published its technical write-up, so it withheld details under a coordinated disclosure process. (red.anthropic.com) The oldest example it did describe was a 27-year-old bug in OpenBSD’s Transmission Control Protocol selective acknowledgment code. Anthropic says that flaw could let a remote attacker crash any OpenBSD host responding over Transmission Control Protocol traffic. (helpnetsecurity.com)

Instead of a public launch, Anthropic created Project Glasswing and handed access to a small defense-focused group. The launch partners include Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks, with more than 40 additional infrastructure groups also getting access. (anthropic.com) Anthropic says it is putting up to $100 million in usage credits and $4 million in direct donations to open-source security groups behind the effort. The idea is to use the model like a tireless code auditor before similar capabilities spread to criminals, spies, and other actors with fewer restraints. (anthropic.com)

This is also a story about AI agents, which are models allowed to keep working through tools instead of stopping after one answer. Anthropic’s system card says Mythos is “significantly more capable” and more agentic than any prior model it released, which is exactly what makes it more useful for both patching systems and working around restrictions. (anthropic.com) Anthropic’s own release decision is the clearest signal in the whole episode. Its system card says Mythos Preview is its most capable frontier model to date, and that increase in capability led the company to decide against general availability and limit it to a defensive cybersecurity program with selected partners. (anthropic.com)
