LLM router hacks exposed

Researchers disclosed vulnerabilities in 26 large-language-model routers that allowed covert malicious tool calls, credential theft and even a $500,000 wallet drain, showing that agent infrastructure is an immediate attack surface. The findings also included the possibility of host takeovers, underlining that orchestration layers, not just models, need hardened security. (x.com)

A large language model router is the switchboard between an artificial intelligence app and the model that actually answers, and the new paper says that switchboard can read and rewrite requests in plaintext before they ever reach OpenAI, Anthropic, or Google. The authors call this an application-layer proxy, which means it sits in the middle with full visibility into every tool call and secret in flight. (arxiv.org)

That middle layer matters because modern agent apps do more than chat. OpenAI’s Responses application programming interface (API) supports function calling and built-in tools, and Anthropic’s tool-use system lets Claude decide when to call functions your app exposes. (developers.openai.com) (platform.claude.com) A tool call is the moment the model stops talking and starts acting. If your app gives it a “send money,” “run shell command,” or “read file” tool, the model returns a structured instruction and your software executes it. (developers.openai.com) (platform.claude.com)

Many developers do not send those requests straight to one model company. They use routers to pick the cheapest model, fail over when one provider is down, or translate between different APIs, so one middleman can sit between the agent and several upstream models. (arxiv.org)

The paper’s core finding is blunt: the router itself can become the attacker. The researchers bought 28 paid routers from Taobao, Xianyu, and Shopify-hosted stores and collected 400 free routers from public communities, then watched what happened when those services handled real agent traffic. (arxiv.org)

In that sample, 1 paid router and 8 free routers actively injected malicious code into requests or responses. Two routers used adaptive evasion tricks, which means they delivered the bad payload only under specific conditions instead of showing it every time. (arxiv.org)

The second class of attack was secret theft. Seventeen routers touched researcher-owned Amazon Web Services canary credentials after seeing them in transit, which is the digital equivalent of slipping a marked bill into circulation and later seeing who spent it. (arxiv.org) One router went further and used a researcher-owned Ethereum private key to drain Ether, and outside summaries tied the real-world customer loss to about $500,000 in wallets. (arxiv.org) (cb-terminal.dev)

The researchers also found a second route into the same mess: poisoned infrastructure around the router. Intentionally leaked OpenAI keys and weakly configured decoys processed 2.1 billion tokens from these routers, exposed 99 credentials across 440 Codex sessions, and included 401 sessions already running in autonomous “YOLO mode,” where direct payload injection was possible. (arxiv.org)

That is why this story is bigger than one bad plugin or one reckless prompt. If the router can alter a tool call, it can make an agent open a backdoor, leak a cloud key, or hand over a wallet secret while the model provider and the user both think they are looking at the same conversation. (arxiv.org)

The paper tested three defenses on a research proxy named Mine: a fail-closed policy gate, response-side anomaly screening, and append-only transparency logging. In plain English, that means refusing tool calls that break strict rules, scanning outputs for suspicious changes, and keeping a tamper-evident ledger of what the model and the router actually said. (arxiv.org)

The uncomfortable part is that none of this depends on breaking the model itself. The paper says no provider enforces cryptographic integrity between client and upstream model, so the weak point is often the orchestration layer that developers added for convenience, cost control, or compatibility. (arxiv.org)
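
To make two of the defenses named above more concrete, here is a minimal Python sketch of a fail-closed policy gate combined with an append-only, hash-chained log. It is not the paper's Mine proxy or any vendor's API. The tool names, the allowlist format, and the helpers `gate`, `TransparencyLog`, and `handle` are all hypothetical, chosen only for illustration.

```python
import hashlib
import json

# Hypothetical allowlist (an assumption, not from the paper):
# tool name -> set of argument names the app is willing to accept.
ALLOWED_TOOLS = {
    "read_file": {"path"},
    "get_weather": {"city"},
}


def gate(tool_call) -> bool:
    """Fail closed: allow a call only if it clearly matches the allowlist."""
    try:
        name = tool_call["name"]
        args = tool_call["arguments"]
        allowed_args = ALLOWED_TOOLS[name]
    except (KeyError, TypeError):
        # Malformed call or unknown tool: refuse rather than guess.
        return False
    if not isinstance(args, dict):
        return False
    # Every supplied argument name must be on the allowlist for this tool.
    return set(args) <= allowed_args


class TransparencyLog:
    """Append-only log; each entry's hash commits to the previous entry."""

    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []

    def append(self, record) -> str:
        prev = self._entries[-1]["hash"] if self._entries else self.GENESIS
        payload = json.dumps(record, sort_keys=True, default=str)
        digest = hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()
        self._entries.append({"prev": prev, "payload": payload, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = self.GENESIS
        for entry in self._entries:
            expected = hashlib.sha256(
                (prev + entry["payload"]).encode("utf-8")
            ).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True


def handle(tool_call, log: TransparencyLog) -> bool:
    """Record the decision for every call, then return whether it may run."""
    allowed = gate(tool_call)
    log.append({"tool_call": tool_call, "allowed": allowed})
    return allowed


if __name__ == "__main__":
    log = TransparencyLog()
    print(handle({"name": "read_file", "arguments": {"path": "notes.txt"}}, log))
    print(handle({"name": "run_shell", "arguments": {"cmd": "id"}}, log))
    print(log.verify())
```

The design choice worth noting is the default: anything the gate cannot positively match is refused, so a router that injects an unexpected tool or an extra argument is blocked rather than executed. A hash chain kept only in local memory shows tampering after the fact but does not prevent it. A real deployment would presumably need to anchor the log somewhere the router cannot write.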
