Anthropic built a model too strong to release

Anthropic has developed an AI described as ‘too powerful’ for public release and is instead deploying it through Project Glasswing, a coalition of about 40 companies that will use the model proactively to help find and patch cybersecurity vulnerabilities. That move underscores a new pattern: leading AI firms are coordinating access to advanced models for defenders rather than pushing open public releases. (x.com) (x.com)

Anthropic says its newest unreleased model, Claude Mythos Preview, is strong enough at finding and exploiting software bugs that it will not put the system on the public internet at all. Instead, it is giving access to a closed group of companies through a program called Project Glasswing. (anthropic.com) The basic problem is simple: modern software is full of hidden mistakes, and some of those mistakes can be turned into break-ins. Anthropic says Mythos Preview can spot those weak points better than almost everyone except the very best human specialists. (anthropic.com) Anthropic says the model has already found thousands of high-severity vulnerabilities, including flaws in every major operating system and web browser. That is why the company is treating the model less like a chatbot launch and more like a controlled security tool. (anthropic.com) Project Glasswing starts with named partners including Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Anthropic says it has also extended access to more than 40 additional organizations that build or maintain critical software infrastructure. (anthropic.com) The work is defensive, not consumer-facing. Partners are supposed to use the model to scan their own systems and important open-source code, then patch the holes before criminals or state hackers can use them. (anthropic.com) Anthropic says one bug Mythos Preview found had been sitting in OpenBSD for 27 years. CyberScoop reported another was a 16-year-old flaw in FFmpeg, a video software project, that automated testing had missed even after hitting the same line of code 5 million times. (cyberscoop.com) Anthropic says the vulnerabilities it found have been patched and that it contacted the maintainers of the affected software. The company is putting up to $100 million in usage credits behind the project and another $4 million in direct donations to open-source security groups. (anthropic.com) (cyberscoop.com) This is also a policy choice, not just a product choice. In its February 24, 2026 Responsible Scaling Policy update, Anthropic said advanced models may need stricter safeguards once they cross dangerous capability thresholds, especially when a model can take multi-step actions and create new misuse risks. (anthropic.com) So the company is trying a new release pattern: keep the strongest cyber-capable model inside a small trust network, let defenders use it first, and share lessons with the rest of the industry later. NBC News reported that Anthropic is framing this as a way to keep AI-driven hacking from outrunning AI-driven defense. (nbcnews.com) (anthropic.com) That may be the bigger shift than this one model. For years, the default move in artificial intelligence was to ship a stronger system and let the world figure out the consequences; Project Glasswing flips that by treating frontier cyber capability like something closer to a controlled security clearance. (anthropic.com 1) (anthropic.com 2)

Anthropic built a model too strong to release

Get your own daily briefing