AI governance frictions rise

Reporting this week highlights interpretability and governance problems with advanced AI models, noting that researchers still struggle to explain how models make decisions. At the same time, outlets report that federal agencies are quietly testing Anthropic’s Mythos, which Anthropic restricted after the model displayed self-teaching hacking behavior, and that product defaults have shifted in live services such as Claude Code and OpenAI's Codex integration. (nytimes.com) (politico.com) (pymnts.com) (axios.com) (help.openai.com)

Artificial intelligence companies are shipping more powerful systems while still struggling to explain, govern, and consistently control what those systems do. (nytimes.com) Interpretability is the effort to trace how a model reaches an answer, instead of treating it like a sealed box that takes in text and spits out text. Researchers still cannot fully map those internal steps in today’s largest models, even as mechanistic interpretability has become a fast-growing field in 2026. (nytimes.com) (technologyreview.com)

That gap is colliding with deployment decisions in real time. On April 7, Anthropic announced Claude Mythos Preview, said it was unusually strong at computer security work, and limited access to a defensive program called Project Glasswing instead of a broad public release. (red.anthropic.com) (cnbc.com) Anthropic said launch partners in Project Glasswing include Amazon Web Services, Apple, Google, Microsoft, Nvidia, CrowdStrike, and Palo Alto Networks, with roughly 40 other companies participating. Politico reported on April 9 that Anthropic claimed Mythos could exploit vulnerabilities across every major operating system and internet browser, while some researchers disputed the company’s benchmarks and analysis. (cnbc.com) (politico.com)

Washington is testing the edges of that risk at the same time. Semafor reported on April 14 that the Treasury Department is seeking access to Mythos to hunt for vulnerabilities, after Treasury and the Federal Reserve urgently summoned Wall Street leaders over fears the model could destabilize the financial system if misused. (semafor.com) (politico.com) The policy fight is sharper because Anthropic is already in a separate dispute with the Pentagon over government use of its models. Politico reported in March that the Defense Department labeled Anthropic a supply-chain risk after the company tried to restrict some military uses of Claude, and a federal judge later paused that designation. (politico.com 1) (politico.com 2)

The product layer is shifting too. Axios reported on April 16 that Claude users have been complaining across online forums that the service suddenly feels worse, as Anthropic tests Mythos and keeps its most advanced capabilities tightly gated. (axios.com)

OpenAI has been making its own live changes around coding access and pricing. OpenAI’s Help Center says Codex is now included in ChatGPT Plus, Pro, Business, and Enterprise and Edu plans, is temporarily available in Free and Go, and as of April 2 moved many customers to token-based pricing instead of per-message pricing. (help.openai.com 1) (help.openai.com 2) OpenAI also added a new Codex-only seat for ChatGPT Business on April 2, with usage-based billing and no monthly fixed cost, while cutting standard Business seat prices by $5 per month. Those changes make the coding agent more available inside paid plans even as billing becomes more granular and usage-sensitive. (help.openai.com 1) (help.openai.com 2)

The common problem is not just what these models can do, but how little of their internal decision-making is legible before companies, customers, and governments have to set rules for them. The black box is still mostly closed, even as access policies, safety limits, and pricing models are already being rewritten around it. (nytimes.com) (red.anthropic.com)
