Anthropic ships self-hosted Claude sandbox plus security-guidance plugin for customer deployments
What happened
- Anthropic on May 27 released a self-hosted Claude sandbox in public beta and a Security Guidance plugin aimed at enterprise-controlled deployments.
- Anthropic said Project Glasswing and about 50 partners found more than 10,000 high- or critical-severity vulnerabilities with Claude Mythos Preview in one month.
- Anthropic said wider Mythos access will wait for stronger safeguards, while Glasswing is expanding to more partners and government users.
Why it matters
Anthropic on May 27 released two new security products for Claude: a self-hosted sandbox in public beta and a Security Guidance plugin that warns developers about risky code patterns before changes are applied. The launch gives customers a way to run Claude with tighter enterprise controls while Anthropic continues to keep its more powerful Mythos-class cyber model out of general release.

Anthropic has tied both moves to a broader push to make AI coding and security tools usable inside production environments rather than only in managed cloud workflows. The company’s public materials say the new controls are designed to keep humans in the loop and limit what an agent can access or change.

### What did Anthropic ship for customer deployments?

SecurityWeek reported on May 27 that Anthropic released a self-hosted sandbox and a new Security Guidance plugin for Claude, describing the sandbox as a way for customers to run Claude behind their own controls. Anthropic’s plugin page says the Security Guidance tool is a verified Claude Code plugin that scans edits before they are applied and warns about unsafe patterns.

Anthropic’s plugin documentation says the hook intercepts Write, Edit and MultiEdit operations and looks for eight broad categories of issues, including command injection, unsafe `child_process.exec` use, `eval` and `new Function` calls, XSS patterns such as `innerHTML` and `dangerouslySetInnerHTML`, Python pickle deserialization risks, and `os.system` command injection. The warnings are session-scoped and include remediation advice. (securityweek.com)

### How does the sandbox fit with Anthropic’s earlier security work?

Anthropic’s October 2025 engineering post on Claude Code sandboxing said its approach relied on filesystem isolation and network isolation to reduce prompt-injection risk and cut permission prompts. That post said Claude could be restricted to approved directories and approved servers, a design Anthropic said was necessary to prevent file exfiltration, malware downloads or sandbox escape. (claude.com)

Anthropic’s more recent product materials for Claude Security say proposed fixes open in Claude Code for review and “nothing ships without your approval.” The company says scheduled scans and webhooks can push findings into existing tools, keeping review on a team’s own cadence.

### Where does Mythos fit into this story?

Anthropic launched Project Glasswing on April 7 as a restricted-access program built around Claude Mythos Preview, its frontier model for cybersecurity work. (anthropic.com)

Anthropic said the launch partners included Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA and Palo Alto Networks, and that it had also extended access to more than 40 additional organizations that build or maintain critical software infrastructure. (anthropic.com)

Anthropic said in a May 22 update that it and roughly 50 partners had found more than 10,000 high- or critical-severity vulnerabilities in the first month of Project Glasswing. The company said progress was now constrained less by finding flaws than by verifying, disclosing and patching them.

### Why hasn’t Anthropic released Mythos publicly?

Anthropic said in its May 22 Glasswing update that it was still working through how to release Mythos-class models safely. (anthropic.com) The company’s April technical write-up said it did not plan to make Claude Mythos Preview generally available, while keeping an eventual goal of enabling users to deploy Mythos-class models safely at scale.

gHacks, citing Anthropic’s update, reported on May 26 that the company intends to widen access first through Project Glasswing, including to U.S. and allied governments, before broader availability. (anthropic.com) The same report said some participants found the volume of vulnerabilities exceeded their capacity to patch them quickly. (anthropic.com)

### What do the early numbers from Glasswing show?

Anthropic said Glasswing had scanned more than 1,000 open-source projects and identified 23,019 flaws, including 6,202 estimated high- or critical-severity vulnerabilities, according to the figures cited by gHacks from Anthropic’s disclosures. Of 1,752 high- or critical-severity vulnerabilities verified by Anthropic, 1,587 were confirmed valid, and 1,094 were confirmed high or critical, the report said. (ghacks.net)

Anthropic’s Claude Security page separately says the research behind the product has already surfaced more than 500 previously unknown vulnerabilities in widely used open-source software. Those findings, the company says, came from work by its Frontier Red Team in code auditing, critical-infrastructure defense and vulnerability hunting.

### What happens next?

Anthropic said it plans to expand Project Glasswing to more partners while it develops stronger safeguards for Mythos-class systems. (ghacks.net) The company also said it would publish a technical analysis in the coming weeks of a patched wolfSSL flaw tracked as CVE-2026-5194, one of the vulnerabilities cited in reporting on Mythos’s early findings. (anthropic.com)
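
For readers who want a concrete picture of the pre-apply check described in the plugin section, the following is a minimal Python sketch of a pattern scanner in that spirit. It is not Anthropic’s implementation. The tool names and pattern names come from the article, but the function `check_edit`, the `Finding` type, the regular expressions, the advice strings, and the reading of “session-scoped” as “warn once per rule per session” are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Tool names taken from the article; everything else here is hypothetical.
INTERCEPTED_TOOLS = {"Write", "Edit", "MultiEdit"}

# Illustrative rule table: (rule name, regex, remediation advice).
# The regexes and advice text are assumptions, not the plugin's real rules.
RULES = [
    ("child_process.exec", re.compile(r"\bchild_process\.exec\s*\("),
     "Prefer execFile with an argument array over a shell string."),
    ("eval", re.compile(r"\beval\s*\("),
     "Avoid eval on dynamic input; parse the data explicitly."),
    ("new Function", re.compile(r"\bnew\s+Function\s*\("),
     "Avoid building functions from strings."),
    ("innerHTML", re.compile(r"\.innerHTML\s*="),
     "Use textContent or sanitize the HTML first."),
    ("dangerouslySetInnerHTML", re.compile(r"\bdangerouslySetInnerHTML\b"),
     "Sanitize the markup before passing it to React."),
    ("pickle", re.compile(r"\bpickle\.loads?\s*\("),
     "Do not unpickle untrusted data; use a safe format such as JSON."),
    ("os.system", re.compile(r"\bos\.system\s*\("),
     "Use subprocess.run with a list of arguments instead."),
]


@dataclass(frozen=True)
class Finding:
    rule: str
    advice: str


def check_edit(tool_name: str, new_text: str, already_warned: set[str]) -> list[Finding]:
    """Return warnings for a proposed edit before it is applied.

    `already_warned` models session scope (an assumption): each rule
    produces at most one warning per session.
    """
    if tool_name not in INTERCEPTED_TOOLS:
        return []
    findings = []
    for rule, pattern, advice in RULES:
        if rule in already_warned:
            continue
        if pattern.search(new_text):
            findings.append(Finding(rule, advice))
            already_warned.add(rule)
    return findings


if __name__ == "__main__":
    session: set[str] = set()
    proposed = "import os\nos.system('ls ' + name)\n"
    for finding in check_edit("Edit", proposed, session):
        print(f"warning [{finding.rule}]: {finding.advice}")
```

A regex scan like this is only a coarse filter: it cannot tell whether the flagged call ever receives untrusted input, which fits the article’s description of the plugin as issuing warnings with remediation advice.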