Anthropic model leak
- Anthropic's Claude Mythos cybersecurity AI was reportedly reverse‑engineered by Discord users via a contractor on launch day. - The tool had been restricted to vetted organisations, including the NSA, before the leak. - The incident highlights supply‑chain and contractor risks for security AI now drawing scrutiny from banks and regulators (x.com/MarioNawfal/status/2046798544032423941).
A private Discord group got into Anthropic’s restricted Claude Mythos model on the day it was announced, using a contractor-linked path Anthropic is now investigating. (bloomberg.com) Anthropic told Bloomberg and TechCrunch that it is investigating “unauthorized access” through a third-party vendor environment and said it has found no evidence that its own systems were affected. The group reportedly kept using the model after launch and showed screenshots and a live demo to Bloomberg. (techcrunch.com) The access route was not a direct break-in to Anthropic’s core network, according to Bloomberg’s account as relayed by Reuters and GovInfoSecurity. A person tied to a contractor reportedly helped the group, which also guessed the model’s online location from Anthropic naming patterns and information exposed in the recent Mercor breach. (reuters.com, govinfosecurity.com) Mythos is not a general chatbot. Anthropic says it is a frontier model built for computer security work, including finding software flaws and, in testing, turning some of them into working exploits. (red.anthropic.com) Anthropic announced Mythos on April 7 and limited it to Project Glasswing, a program that includes Amazon Web Services, Apple, Cisco, CrowdStrike, Google, JPMorganChase, Microsoft, NVIDIA and Palo Alto Networks, plus more than 40 other infrastructure organizations. Anthropic said it committed up to $100 million in usage credits and $4 million in donations tied to the effort. (anthropic.com) The company’s case for tight control was explicit. Anthropic said Mythos had already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser, and warned that similar capabilities could spread quickly. (anthropic.com) Anthropic had already been steering related tools toward limited release. In February, it introduced Claude Code Security as a research preview for Enterprise and Team customers, saying the same capability that helps defenders fix bugs could also help attackers exploit them. (anthropic.com) The leak landed as governments and financial regulators were already asking who should get access to Mythos and how it should be handled. Bloomberg reported on April 17 that the Financial Stability Board was gathering information from members about risks the model could pose to banks and the broader financial system. (bloomberg.com) Bloomberg also reported that US officials were preparing to make a version of Mythos available to major federal agencies, while other reporting said the National Security Agency was already using the model on classified networks. Anthropic has not publicly released a full list of government users. (bloomberg.com, yahoo.com) The Discord group told Bloomberg it was interested in trying unreleased models, not using Mythos to hunt for fresh exploits. Anthropic’s immediate problem is narrower and more concrete: a model it said required careful gatekeeping was reportedly reachable through a contractor environment on day one. (govinfosecurity.com, techcrunch.com)