Anthropic paused a model over security findings
Anthropic reportedly kept a new AI model private after discovering thousands of external vulnerabilities across major operating systems and browsers during testing. The episode highlights how cutting‑edge model work is increasingly tangled with security posture and release governance rather than being a pure research milestone. That makes secure release processes and vulnerability discovery central to any company shipping advanced AI capabilities. (artificialintelligence-news.com)
Anthropic built a new model, called Claude Mythos Preview, and then decided not to release it to the public after testing showed it could find and exploit previously unknown software flaws across every major operating system and every major web browser. Anthropic announced the holdback on April 7, 2026, and limited access to selected partners instead. (anthropic.com) A software vulnerability is a mistake in code that can act like an unlocked window in a house. A zero-day vulnerability is the nastier version: the developer does not know the window is open yet, so there is no patch on the wall beside it. (red.anthropic.com) Anthropic says Mythos Preview did not just spot these hidden openings. During testing, the company says the model could also turn them into working break-ins when a user asked it to, which is why Anthropic treated the model more like a restricted security tool than a normal chatbot launch. (red.anthropic.com) The scale is what changed the conversation. Anthropic says the model found thousands of high-severity vulnerabilities, including flaws in every major operating system and browser, and said more than 99 percent of the bugs it found were still unpatched when it wrote up the results. (anthropic.com) (red.anthropic.com) Some of the bugs were old enough to vote. Anthropic says the oldest example it can discuss was a 27-year-old bug in OpenBSD, and outside reporting says the company also cited a 16-year-old flaw in FFmpeg, which is widely used software for handling audio and video files. (red.anthropic.com) (thehackernews.com) Instead of opening the model to everyone, Anthropic created Project Glasswing, a defensive program with Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Anthropic says more than 40 additional organizations that maintain critical software also got access to use the model for scanning and fixing systems. (anthropic.com) Anthropic also put money behind the quarantine. The company says it is committing up to $100 million in usage credits for Mythos Preview and another $4 million in direct donations to open-source security groups, which are the volunteer and nonprofit teams that maintain code much of the internet quietly runs on. (anthropic.com) This is a different kind of artificial intelligence release decision than “the model is inaccurate” or “the model says rude things.” Anthropic’s own technical write-up says the concern is that models have reached a level where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities. (anthropic.com) That changes the bottleneck in cybersecurity. For years, attackers often needed two separate hard steps — finding a hidden bug and then figuring out how to weaponize it — and Anthropic is arguing that Mythos Preview compresses both steps into one machine workflow. (red.anthropic.com) (helpnetsecurity.com) So the real story is not only that Anthropic paused a model. The real story is that an April 2026 model launch turned into a coordinated vulnerability disclosure campaign, a restricted-access program, and a security funding push before the public ever got a download button. (anthropic.com) (red.anthropic.com)