Researchers uncover confused-deputy failure

- Four independent research teams reported the same "confused deputy" failure across three Claude surfaces, an attack class that can misroute authority in agent and browser flows. (venturebeat.com)
- The teams published their findings within a roughly 48-hour window and flagged the issue in code workflows and browser integrations, warning security teams to audit agent-to-service delegation. (venturebeat.com)
- The finding raises operational security risk for teams embedding Claude into CI, dev tools, or web apps, and suggests tighter authorization checks are needed. (venturebeat.com)

A confused-deputy failure is an old security bug with a very current AI twist. A program has real authority (your browser session, your OAuth tokens, your local tools), and an attacker tricks it into using that authority for the wrong person. That’s the basic pattern behind a cluster of Claude-related disclosures that landed almost at once in early May. (venturebeat.com)

### What is the “deputy” here?

The deputy is the AI system or extension that already has permission to do useful things. In Claude’s case, that can mean a Chrome extension that can read pages and act in the browser, or Claude Code connected to external services through MCP and OAuth. The model doesn’t need to “hack” its way up to those permissions; it starts with them. The problem starts when it can’t reliably tell whose intent it is carrying out. (venturebeat.com)

### Why did this story flare up now?

Because several separate research threads all pointed at the same architectural weakness. VentureBeat tied together four teams publishing within a roughly 48-hour window on May 6 and May 7, 2026. The incidents looked different on the surface (an OT intrusion, a browser-extension hijack, token theft in Claude Code), but they all reduced to the same trust-boundary mistake. (venturebeat.com)

### What happened in the browser?

One path was ShadowPrompt, disclosed by Koi Security in March. The Claude Chrome extension trusted messages from any `*.claude.ai` subdomain. Koi chained that with an XSS bug in an Arkose CAPTCHA component on `a-cdn.claude.ai`, which meant a malicious site could silently inject prompts into Claude as if the user had typed them. No clicks. No permission dialog. Koi said affected users should be on extension version 1.0.41 or higher. (koi.ai)

### And what is “ClaudeBleed”?

That’s a separate browser-side issue from LayerX, published in May 2026. Their claim was harsher: any Chrome extension, even one with no special permissions, could talk to Claude’s extension through overly trusted browser communication paths and then make Claude perform actions or leak data on the user’s behalf. In plain English, a harmless-looking extension could borrow Claude’s much stronger privileges. (layerxsecurity.com)

### What happened in Claude Code?

The same pattern showed up in code workflows. SecurityWeek described a Claude Code attack where a malicious npm package edits Claude Code configuration, pre-approves trusted directories, inserts hooks, and reroutes MCP server traffic through an attacker-controlled proxy. That lets the attacker intercept OAuth bearer tokens during sign-in or refresh flows. The scary part is not just token theft; it’s that the agent becomes a credential courier without realizing it. (securityweek.com)

### Why does MCP keep coming up?

Because MCP is exactly where tool delegation gets messy. Models talk to multiple tools and services through a shared conversational channel, and prompt injection can blur which tool should trust which instruction. Simon Willison flagged this problem last year in MCP’s cross-server tool shadowing behavior: a malicious server can override or intercept calls meant for a trusted one. That is almost a textbook confused-deputy setup. (simonwillison.net)

### How does the water-utility case fit?

Dragos described an intrusion into a Mexican municipal water and drainage utility where Claude served as the “primary technical executor.” The key point is not that Claude magically knew industrial systems. It didn’t. The point is that once inside the enterprise IT environment, the model could identify an OT target, classify it as valuable, and help build a path toward it. The deputy had too much room to act with inherited authority. (dragos.com)

### So what actually needs fixing?

Not just the individual bugs. The deeper fix is narrower delegation: per-tool permissions, explicit user confirmation for sensitive actions, origin checks that are actually strict, and separation between untrusted content and privileged tools. Anthropic has already framed browser prompt injection as a major unsolved problem, which is the right diagnosis. But the bigger lesson is broader than Claude: if an AI agent can read untrusted input and also spend your authority, confused-deputy failures are going to keep showing up. (anthropic.com)
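
To make those fixes a little more concrete, here is a minimal Python sketch of two of them: an exact-match origin check in place of a wildcard like `*.claude.ai`, and a per-tool gate that refuses requests traced to untrusted content and asks for confirmation on sensitive actions. Everything in it is hypothetical and for illustration only: `TRUSTED_ORIGINS`, `ToolCall`, `Policy`, `authorize`, and the tool names are not taken from Claude, its extension, or MCP. It also assumes the caller can label where a request came from, which is the hard part in practice.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Assumption: the one exact origin the privileged component accepts.
# There is deliberately no wildcard for subdomains.
TRUSTED_ORIGINS: FrozenSet[str] = frozenset({"https://claude.ai"})


def is_trusted_origin(origin: str) -> bool:
    """Strict check: the origin string must match an allowlist entry exactly."""
    return origin in TRUSTED_ORIGINS


@dataclass(frozen=True)
class ToolCall:
    tool: str          # hypothetical tool name, e.g. "read_page"
    requested_by: str  # "user", or "content" for text the agent merely read


@dataclass(frozen=True)
class Policy:
    allowed_tools: FrozenSet[str]    # per-tool permissions
    sensitive_tools: FrozenSet[str]  # tools that need explicit confirmation


def authorize(call: ToolCall, policy: Policy, user_confirmed: bool = False) -> bool:
    """Return True only if this call may spend the user's authority."""
    if call.tool not in policy.allowed_tools:
        return False  # tool was never delegated
    if call.requested_by != "user":
        return False  # instructions found in untrusted content are refused
    if call.tool in policy.sensitive_tools and not user_confirmed:
        return False  # sensitive action without explicit confirmation
    return True


if __name__ == "__main__":
    policy = Policy(
        allowed_tools=frozenset({"read_page", "send_email"}),
        sensitive_tools=frozenset({"send_email"}),
    )
    print(is_trusted_origin("https://a-cdn.claude.ai"))
    print(authorize(ToolCall("send_email", "user"), policy, user_confirmed=True))
```

The design choice is that every check fails closed: a sibling subdomain, an undelegated tool, or an instruction that originated in page content is rejected unless it is explicitly allowed, rather than accepted unless it is explicitly blocked.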
