Mythos and Cyber Risk

Reporting says OpenAI and Anthropic are withholding powerful cyber‑capable tools—Anthropic’s Mythos reportedly found thousands of vulnerabilities and was judged too risky to release—highlighting a new release calculus for tool‑competent systems. Once agents can operate software or probe environments reliably, sandboxing, permissions and abuse monitoring become non‑negotiable even for consumer products (axios.com) (ia.acs.org.au).

A modern artificial intelligence model does not just answer questions anymore. Anthropic said in February 2026 that models now browse the web, write and run code, use computers, and take autonomous multi-step actions, which turns a chatbot into something closer to a junior operator with a keyboard. (anthropic.com) That shift is why cybersecurity has become its own safety category. OpenAI said in its April 15, 2025 Preparedness Framework update that it tracks cybersecurity capabilities separately from biology, chemistry, and artificial intelligence self-improvement because those abilities can create severe harm if they are strong enough. (openai.com) The new story is that both Anthropic and OpenAI are now holding back some of their strongest cyber-capable systems instead of shipping them broadly on day one. Axios reported on April 9, 2026 that OpenAI is planning a staggered rollout of a new cybersecurity model, while Anthropic is already limiting access to Mythos. (axios.com) Anthropic made that decision public on April 7, 2026 when it introduced Claude Mythos Preview as part of Project Glasswing rather than as a normal public product. The company said Mythos Preview is its most capable frontier model to date and is being given first to organizations doing defensive security work on critical software. (anthropic.com) The reason is not that Mythos was built as a single-purpose hacking bot. Anthropic said it is a general-purpose model with stronger coding and reasoning, and those general skills turned out to be good enough at security work that the company chose not to make it generally available. (anthropic.com) Anthropic says the model found “thousands” of zero-day vulnerabilities over a period of weeks, including many critical ones in software that had been around for one to two decades. In plain terms, a zero-day vulnerability is a hidden software flaw that defenders have not patched yet, which makes it valuable to both a security team and an attacker. (techcrunch.com) Instead of opening that capability to everyone, Anthropic built a walled preview around it. Project Glasswing launched with Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks, and Anthropic said more than 40 additional organizations that maintain critical software will also get access. (anthropic.com) OpenAI has been moving in the same direction with different branding. In February 2026 it introduced Trusted Access for Cyber, an identity-based gated program that gives enhanced cyber capabilities to vetted users while trying to keep them away from malicious ones. (openai.com) OpenAI had already signaled the technical threshold before this week’s reporting. Its March 2026 GPT-5.4 Thinking system card says the model was the first general-purpose model to implement mitigations for “High” cybersecurity capability, and its Codex cyber-safety page says high-risk requests can be routed through extra monitoring and review. (deploymentsafety.openai.com) (developers.openai.com) That is the real change in release logic. The old question was whether a model could answer dangerous questions in text; the new question is whether a model can reliably operate tools, inspect codebases, and probe real systems well enough that a bad actor could turn speed into scale. (anthropic.com) (openai.com) Once a model crosses that line, safety stops being mainly about the words it outputs. Anthropic’s April 2026 alignment risk report for Mythos discusses sandboxing, model-weight security, blocking interventions, monitoring, and limits on autonomous operation, which is the language of containing an active system rather than moderating a chat reply. (anthropic.com) So the headline is not just that one company has a scary new model. It is that two frontier labs now appear to agree that when artificial intelligence can use software like a capable human defender, consumer-style open release is no longer the default and gated access becomes part of the product itself. (axios.com) (anthropic.com) (openai.com)

Mythos and Cyber Risk

Get your own daily briefing