Anthropic model labelled too risky

Coverage highlighted that Anthropic chose not to release a new frontier model publicly, framing that decision as an escalation in how labs treat model risk and suggesting release governance is becoming a public‑risk issue. Media discussion this week linked those release choices to broader calls for treating leading AI labs with heightened operational and security scrutiny. (youtube.com)

Anthropic said on April 7 it would not release Claude Mythos Preview to the public, limiting the model to a closed cybersecurity program instead. (anthropic.com) The company said Mythos Preview will be used through Project Glasswing, a defensive security effort with 12 launch partners including Amazon Web Services, Apple, Google, JPMorganChase, Microsoft, NVIDIA, Palo Alto Networks, and the Linux Foundation. (anthropic.com) Anthropic said it withheld the model because it is unusually good at finding software flaws that attackers could turn into break-ins, and CNBC reported the system is being rolled out only to selected companies rather than through a public product release. (cnbc.com) Anthropic has spent two years building a release rulebook for moments like this. Its Responsible Scaling Policy, first published in September 2023 and updated to version 3.1 on April 2, 2026, says stronger models trigger stronger safeguards before training or launch. (anthropic.com) In that policy, Anthropic says each “Artificial Intelligence Safety Level” works like a higher security setting on a machine: if a model crosses a capability threshold, the company is supposed to add tighter controls on misuse and model theft. Anthropic’s February 24 policy update said later safety levels were left less defined at first and are being filled in as models improve. (anthropic.com) That approach is already visible in Anthropic’s current public models. In its May 2025 system card, the company said Claude Opus 4 was deployed under Artificial Intelligence Safety Level 3, while Claude Sonnet 4 stayed at Level 2 after pre-deployment testing. (anthropic.com) The Mythos decision pushed that release logic into a new phase: not just extra monitoring on a public model, but no broad public release at all. Anthropic’s April 2 policy update also says the company remains free to pause development or take stronger measures even when the written policy does not require it. (anthropic.com) The release landed amid direct attention from Washington. Reuters, citing CNBC, reported on April 10 that Vice President JD Vance and Treasury Secretary Scott Bessent had questioned major technology chief executives about artificial intelligence model security and cyberattack response before Mythos was announced. (usnews.com) Not everyone accepts Anthropic’s framing at face value. TechCrunch reported April 9 that some cybersecurity and startup executives said limiting access could also help Anthropic protect enterprise deals and make it harder for rivals to copy frontier systems through distillation, while Anthropic’s public explanation focused on misuse risk. (techcrunch.com) What changed this month is concrete: a frontier lab built a model, judged broad release too risky, and kept it behind a partner gate. Anthropic’s own policy language now treats launch decisions as security decisions, not just product decisions. (anthropic.com)

Anthropic model labelled too risky

Get your own daily briefing