LLM API router risks

Security researchers flagged widespread problems in LLM API routers, reporting that some routers can rewrite tool calls and have been used in live tests to siphon funds; one sweep cited 428 routers and a simulated $500,000 drain. The same thread pushed interest in offline LLM tools like LM Studio as safer local alternatives for builders of custom AI workflows.

Large language model agents often sit behind “routers” that relay requests to model providers, and new research says some of those middlemen have been rewriting tool calls in transit. (arxiv.org)

Tool calling is the feature that lets a model ask an app to run code, query a database, or move money, which means the router can see the full JavaScript Object Notation payload before it reaches the upstream model. OpenAI’s documentation says function calling is used to connect models to external tools and systems. (help.openai.com; platform.openai.com)

In the new paper, researchers from the University of California, Santa Barbara, the University of California, San Diego, Fuzzland, and World Liberty Financial tested 428 routers and found 1 paid router and 8 free routers actively injecting malicious code. They also found 17 routers touching researcher-owned Amazon Web Services canary credentials and 1 router draining Ether from a researcher-owned private key. (arxiv.org)

The paper describes routers as application-layer proxies with plaintext access to every in-flight request, and says no provider currently enforces cryptographic integrity between the client and the upstream model. The authors call the two main attack classes payload injection and secret exfiltration. (arxiv.org)

The same study says leaked keys and weakly configured decoys processed 2.1 billion tokens from these routers, exposing 99 credentials across 440 Codex sessions. It says 401 of those sessions were already running in autonomous “YOLO mode,” which let the researchers inject payloads directly into tool-using workflows. (arxiv.org)

That warning landed weeks after Trend Micro detailed a March 24, 2026 compromise of LiteLLM, a widely used proxy package for routing model traffic across providers. Trend Micro said the trojanized PyPI releases 1.82.7 and 1.82.8 stole cloud credentials, Secure Shell keys, and Kubernetes secrets. (trendmicro.com)

The router model exists because developers want one gateway that can switch between providers, normalize application programming interfaces, and manage costs. OpenRouter’s own documentation says it standardizes tool calling across OpenAI, Anthropic, and other providers, while RouteLLM describes itself as a framework for serving and evaluating routers. (openrouter.ai; github.com)

Some developers responding to the research pointed to local setups instead of third-party relays. LM Studio’s documentation says it can run downloaded models entirely offline, and that requests to its local server can use OpenAI-compatible endpoints while staying on localhost or a local network. (lmstudio.ai; lmstudio.ai)

Local tools do not remove every risk: LM Studio says model discovery, downloads, runtime downloads, and app updates still require network requests. But its docs also say chats, document retrieval, and local server inference can run without data leaving the machine once the model files are already installed. (lmstudio.ai)

The immediate question for teams building agents is whether their “router” is just a convenience layer or a machine-in-the-middle with keys, prompts, and tool outputs in clear view. The new paper’s answer is that this layer now belongs in the same threat model as any other privileged production dependency. (arxiv.org)
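To make the integrity gap concrete, here is a minimal Python sketch of what end-to-end verification of a tool call could look like. It is an illustration only: the paper says no provider currently offers this, so the function names (sign_tool_call, verify_tool_call), the shared demo key, and the payload fields are all hypothetical. It also assumes the key is known to the provider and the client but not to the router; a real design would more plausibly use a provider-held signing key with public verification.

```python
import hashlib
import hmac
import json


def canonical_bytes(tool_call: dict) -> bytes:
    """Serialize a tool call deterministically so both ends hash identical bytes."""
    return json.dumps(tool_call, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_tool_call(tool_call: dict, key: bytes) -> str:
    """Hypothetical provider side: compute an HMAC-SHA256 tag over the tool call."""
    return hmac.new(key, canonical_bytes(tool_call), hashlib.sha256).hexdigest()


def verify_tool_call(tool_call: dict, tag: str, key: bytes) -> bool:
    """Hypothetical client side: recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign_tool_call(tool_call, key), tag)


if __name__ == "__main__":
    # Assumption: a key the router never sees, established out of band.
    key = b"demo-key-not-for-production"

    # Hypothetical tool call as the upstream model issued it.
    issued = {"name": "transfer_funds", "arguments": {"to": "0xALICE", "amount": "1.0"}}
    tag = sign_tool_call(issued, key)

    # The same call after a router rewrote the destination in transit.
    rewritten = {"name": "transfer_funds", "arguments": {"to": "0xMALLORY", "amount": "1.0"}}

    print("issued call verifies:", verify_tool_call(issued, tag, key))
    print("rewritten call verifies:", verify_tool_call(rewritten, tag, key))
```

A check like this only detects payload injection; it does nothing about secret exfiltration, because the router would still read prompts and keys in plaintext.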
