Anthropic's Mythos, a guarded step

Anthropic built a very powerful model called Claude Mythos but is not releasing it publicly; instead the company is forming a 40‑firm coalition called Project Glasswing to help cybersecurity teams prepare software defences. At the same time, a flaw in Anthropic’s Claude Code that lets developer rules be bypassed via command‑padding has been reported, underscoring the operational risks even as companies try to harden defences. (x.com) (x.com)

Anthropic has built a model it says is better at finding software flaws than anything it has released before, and it is doing something unusual with it: keeping it off the public market. On April 7, the company announced Claude Mythos Preview, then immediately fenced it inside a new program called Project Glasswing, where a small circle of companies will use it for defensive security work instead of general access. Anthropic says the model is powerful enough that the normal product launch playbook no longer feels safe. (anthropic.com, cnbc.com) The coalition is large enough to look less like a beta test than a controlled mobilization. Anthropic says Glasswing starts with launch partners including Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks, while more than 40 additional organizations that build or maintain critical software infrastructure will also get access. The company is putting up to $100 million in usage credits and $4 million in donations behind the effort. (anthropic.com, cyberscoop.com) Anthropic’s pitch is simple enough to picture. Give a very strong coding model a large codebase, let it inspect the software the way a patient and tireless security researcher would, and ask it to hunt for the kind of bug that can sit unnoticed for years. The company says Mythos has already found thousands of previously unknown vulnerabilities, including a 27-year-old bug in OpenBSD and a 16-year-old flaw in FFmpeg that automated testing had missed even after millions of runs through the affected line of code. Anthropic says those bugs have now been patched. (cyberscoop.com, techcrunch.com) That is the promise. The problem is that a model that can spot weak points for defenders can also spot them for attackers. Anthropic says Mythos was not trained specifically for cyber work, but that its coding and reasoning ability made it unusually good at it anyway, which is a polite way of saying the capability emerged once the model became strong enough. The company told CNBC there was “a lot of internal deliberation” before it chose this limited release, and it has said outright that Mythos Preview is not headed for general availability. (cnbc.com, anthropic.com, techcrunch.com) The timing would already have made the announcement delicate. It became sharper because Anthropic is making its case for careful deployment while one of its own developer tools is under scrutiny for a security lapse. A report published on April 2 described a flaw in Claude Code, Anthropic’s terminal-based coding agent, that let attackers bypass developer-set deny rules by padding a shell command with more than 50 harmless subcommands before the real payload. In the reported behavior, a dangerous command that would normally be blocked could slip through once it was buried deep enough in a long chain. (adversa.ai, code.claude.com) That bug matters because Claude Code is not a toy that lives in a browser tab. It can read files, edit code, run shell commands, and manage git workflows from inside a developer’s machine or environment. Anthropic has spent months talking about sandboxing and other ways to put hard walls around those actions, but the reported bypass showed how quickly a safety system can become brittle when performance shortcuts meet real-world command parsing. (code.claude.com, anthropic.com, adversa.ai) So the story around Mythos is not just that Anthropic built a stronger model. It is that the company is trying to invent a release pattern for a tool that may be too useful to publish and too consequential to shelve. On one side is Glasswing, with big firms scanning critical software before someone else does. On the other is a very current reminder that even the guardrails around defensive AI can fail in ordinary, concrete ways, like a command line that stops checking after the fiftieth clause. (anthropic.com, adversa.ai)

Anthropic's Mythos, a guarded step

Get your own daily briefing