AI finds thousands of bugs
Anthropic says its new Claude Mythos Preview model has automatically identified thousands of high-severity vulnerabilities across major operating systems and browsers. The model scored about 83% on the CyberGym benchmark, and the company says it outperforms all but the most skilled human specialists. Access is tightly gated: under Project Glasswing, the model is available only to a defensive coalition that includes AWS, Google and Microsoft, so it is meant for remediation, not exploitation. The takeaway is simple: automated models are already finding real, high-risk bugs faster than most people can, which could force security teams to rethink vulnerability triage and patch priorities now. (x.com) (x.com)
A software vulnerability is a mistake in code that leaves a door unlocked. Security researchers usually find those mistakes by reading code, testing programs, and trying to make them break before criminals do. (anthropic.com)
Anthropic says its new Claude Mythos Preview model can now do that work at a level that beats almost everyone except the very top human specialists. The company says the model has already found thousands of high-severity vulnerabilities, including flaws in every major operating system and every major web browser. (anthropic.com)
A zero-day vulnerability is a bug the defender does not know about yet, which means there are zero days to prepare once an attacker finds it. Anthropic’s security team says Mythos Preview identified and exploited zero-day vulnerabilities across major open-source codebases during testing. (red.anthropic.com)
Some of those bugs were old enough to vote. Anthropic says the oldest bug Mythos found was a 27-year-old vulnerability in OpenBSD, which is an operating system known mainly for aggressive security auditing. (red.anthropic.com)
The benchmark behind the headline is called CyberGym. It tests whether a model can reproduce real software exploits against target code, and Anthropic reports Mythos Preview scored 83.1 percent, compared with 66.6 percent for Claude Opus 4.6. (anthropic.com) (red.anthropic.com)
Anthropic is not putting this model on the open market. Its system card says the company decided not to make Claude Mythos Preview generally available and is limiting it to a defensive cybersecurity program with a small set of partners. (anthropic.com)
That partner list explains what Anthropic is trying to protect first. Project Glasswing launched with Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks. (anthropic.com)
Anthropic says those partners will use the model on “foundational systems,” which means the shared plumbing other companies build on top of. The company also says more than 40 additional organizations that maintain critical software infrastructure are getting access to scan both their own code and open-source projects. (anthropic.com)
The money behind this is large because the backlog is large. Anthropic says it is committing up to $100 million in usage credits for Mythos Preview and another $4 million in direct donations to open-source security organizations. (anthropic.com)
The uncomfortable part is that the same skill that helps defenders patch bugs also helps attackers weaponize them. Anthropic’s technical write-up says more than 99 percent of the vulnerabilities it found are still unpatched, which is why it is withholding most details for now. (red.anthropic.com)
This changes the tempo of software security. If one model can search huge codebases, find subtle flaws, and turn them into working exploits faster than most human teams, patch queues and triage rules built for human-speed review stop making sense. (anthropic.com) (red.anthropic.com)