Anthropic withholds Mythos
Anthropic is holding back public release of a new model called Claude Mythos because it judged the system strong enough to leak sensitive information and enable misuse. The company says the model is aimed at bolstering cyber‑defences but is being gated over fears it could aid hackers, illustrating how frontier capabilities may first appear in private or managed offerings. (gizmodo.com / theguardian.com)
Anthropic built a new Claude model, dated April 7, 2026, and then decided not to put it on the open market. In its own system card, the company says Claude Mythos Preview is its “most capable frontier model to date” and says the jump in capability was large enough to limit access to a small defensive program instead of a normal public launch. (anthropic.com) The reason is simple and unusual: the same skill that helps a defender find weak spots can help an attacker break in faster. Anthropic says Mythos can identify and exploit previously unknown software flaws across every major operating system and every major web browser when a user points it at the task. (anthropic.com) A software vulnerability is a hidden crack in code, like a bad lock on an apartment door that nobody noticed during construction. A “zero-day” vulnerability is the worst version of that problem, because the vendor has had zero days to patch it before someone can use it. (anthropic.com) Anthropic says Mythos was finding bugs so old and obscure that one patched flaw traced back 27 years in OpenBSD, an operating system with a reputation for security. The company also says more than 99% of the vulnerabilities it found were still unpatched, which is why it refused to publish most technical details. (anthropic.com) Instead of a public release, Anthropic wrapped the model inside a program called Project Glasswing. The launch group includes Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. (anthropic.com) Anthropic says those partners, plus more than 40 additional organizations that maintain critical software, will use Mythos for defensive work on their own systems and on open-source code. The company says it is committing up to $100 million in usage credits and another $4 million in direct donations to open-source security groups. (anthropic.com) This is not just a product decision. It fits a policy Anthropic has been tightening since 2023, called the Responsible Scaling Policy, which says stronger models should trigger stronger safeguards instead of being shipped the same way as weaker ones. (anthropic.com) Anthropic already turned on its AI Safety Level 3 protections in May 2025 for Claude Opus 4, mainly to reduce two risks: people misusing the model for dangerous tasks and outsiders stealing the model weights, which are the files that contain the model’s learned behavior. Mythos pushes that logic into cyber offense and defense, where the danger is not just what the model knows but how quickly it can turn that knowledge into a working exploit. (anthropic.com) Outside coverage on April 7 and April 8 described the same split-screen story: Anthropic is pitching Mythos as a tool to prevent cyberattacks while also saying broad release could hand hackers a shortcut. The New York Times reported Anthropic was working with about 40 companies, and CNBC reported the company limited rollout over fears the model could be used for cyberattacks. (nytimes.com) (cnbc.com) The bigger shift is that the best model may no longer show up first as a chatbot on a website. Anthropic is treating Mythos less like a consumer app and more like a sensitive tool that stays behind controlled doors while companies race to patch the holes it already found. (anthropic.com 1) (anthropic.com 2)