Private Model, Public Risk
Anthropic kept a new model private after it discovered thousands of external vulnerabilities and launched an internal effort to patch them before disclosure. The initiative—reported as “Project Glasswing”—highlights how advanced models can surface real-world security flaws, and it has spurred wider industry moves such as OpenAI planning security-focused products and public tension between labs. (artificialintelligence-news.com) (axios.com) (cnbc.com)
Anthropic built a new artificial intelligence model that was so good at finding software break-ins that it kept the model off the public market and handed it only to a small circle of defenders. The model is called Claude Mythos Preview, and the restricted program is called Project Glasswing. (anthropic.com) Anthropic says Claude Mythos Preview can help find serious weaknesses across major operating systems, web browsers, and other critical software before criminals do. The company launched Glasswing on April 7, 2026 with partners including Amazon Web Services, Apple, Cisco, CrowdStrike, Google, JPMorganChase, Microsoft, Nvidia, and Palo Alto Networks. (anthropic.com) This is not a normal chatbot story. This is a computer security story, where a “vulnerability” is a flaw in code that can act like an unlocked window in a house. (anthropic.com) The surprise is that the same model that can point defenders to those unlocked windows can also show attackers where to climb in. Anthropic’s answer was to keep Mythos private instead of releasing it the way companies usually release a new flagship model. (techcrunch.com) Reports this week said internal testing showed the model could autonomously generate exploits for thousands of high-severity vulnerabilities across major platforms. That is why Anthropic framed the release as a containment problem, not a product launch. (businessinsider.com) Anthropic’s pitch is that the safest use for a model like this is a closed club of companies that already run large parts of the internet. Glasswing’s partner list includes the Linux Foundation, which helps steward core open-source software used inside servers, phones, and cloud systems. (anthropic.com) That setup changes the usual rhythm of cyber disclosure. Instead of publishing a tool widely and waiting for bugs to surface in public, Anthropic is trying to find flaws first, route them to the companies that maintain the software, and patch them before details spread. (anthropic.com) OpenAI is now moving in the same direction. Axios reported on April 9, 2026 that OpenAI is finishing a cybersecurity product with advanced capabilities and plans to release it only to a small set of partners. (axios.com) The rivalry is no longer just about who has the smartest assistant for writing and coding. CNBC reported that OpenAI sent shareholders a memo this week criticizing Anthropic as “compute constrained” and claiming OpenAI plans 30 gigawatts of compute by 2030, versus Anthropic at roughly 7 to 8 gigawatts by the end of 2027. (cnbc.com) So the new race looks stranger than the last one. The most valuable model may be the one you are not allowed to use, because a system that can spot weak points in “almost every computer on earth,” as one report put it, is useful to defenders and dangerous to everyone else. (msn.com)