Mythos held back
Researchers at Anthropic found their Mythos model could learn to breach computer networks quickly, and the company judged it too risky for broad release. (x.com) Anthropic plans a limited rollout of Mythos to vetted UK financial institutions next week rather than a public launch. (x.com)
Anthropic is holding back its Mythos artificial intelligence model from a broad public release after internal testing found it could quickly learn to break into computer networks. (anthropic.com)
Mythos is a language model, the same basic kind of system used for chatbots and coding tools, but Anthropic said this version is unusually strong at computer security work. In a technical report published April 7, the company said the model was “strikingly capable” at security tasks and could identify and exploit software flaws at a pace that raised deployment concerns. (anthropic.com)
Anthropic said Mythos had already helped identify thousands of previously unknown software flaws, including critical vulnerabilities in every major operating system and every major web browser. The company said it would limit access to selected defenders rather than launch the model widely. (anthropic.com)
That decision puts Anthropic’s release strategy at the center of a widening fight over how artificial intelligence should be handled when it can be used for both defense and attack. The same model that helps a security team patch a weakness can also help an intruder find one faster. (anthropic.com)
Anthropic tied the restriction to its Responsible Scaling Policy, the internal rulebook it uses to decide when a model’s capabilities require tighter safeguards, red-teaming, and narrower deployment. The company updated that policy on April 2 and said stronger models can trigger stricter security and release controls. (anthropic.com)
Instead of a public launch, Anthropic created Project Glasswing on April 7 to give early Mythos access to a limited set of organizations working on defensive security. Anthropic said launch partners include Amazon Web Services, Apple, Cisco, CrowdStrike, Google, JPMorganChase, Microsoft, Nvidia, Palo Alto Networks, and the Linux Foundation, along with more than 40 additional organizations. (anthropic.com)
Anthropic is now widening that program to finance. Bloomberg reported on April 16 that the company plans to make Mythos available to vetted UK financial institutions in the coming week, extending the restricted rollout beyond the initial group of technology and infrastructure partners. (bloomberg.com)
The company has also started testing some of its cyber safeguards on a less capable model before considering a broader Mythos release. In a product announcement on April 16, Anthropic said Claude Opus 4.7 was released with automated systems that detect and block requests tied to prohibited or high-risk cybersecurity uses. (anthropic.com)
Anthropic’s own materials frame the issue as a race between defenders and attackers. The company said Project Glasswing is meant to give trusted organizations “a head start” securing critical software before models with similar capabilities become more common. (anthropic.com)
What happens next is narrower than a normal model launch. Anthropic is keeping Mythos inside a controlled access program, expanding it to selected UK financial firms, and using that rollout to test whether its safeguards can hold before it decides on anything broader. (bloomberg.com)