Claude gains multi-agent orchestration tools
- Anthropic has been rolling out the plumbing for multi-agent Claude systems, first in Claude Code and then in Managed Agents, so teams can run longer jobs.
- The concrete shift is architectural: Anthropic split an agent into session, harness, and sandbox, while Claude Code now lets users assemble agent teams.
- That matters because long-running AI work usually fails in the scaffolding (memory, tool access, and crash recovery), not the model.

Anthropic is turning Claude from a single chatbot into something closer to a coordinated software worker. The news is not just “Claude got smarter.” It’s that Anthropic has been shipping the missing infrastructure around the model (agent teams in Claude Code, a hosted Managed Agents service, and sandboxed execution) so Claude can keep working across longer, messier jobs. That sounds abstract, but the stakes are simple: if an AI agent can’t recover state, call tools safely, or survive long runs, it breaks before the model’s intelligence really matters.

### What actually changed?

Anthropic’s recent releases fit together as one stack. In February, Claude Opus 4.6 added support in Claude Code for assembling “agent teams” and added API compaction so Claude could summarize its own context and keep running longer tasks. In April, Anthropic introduced Managed Agents, a hosted service for long-horizon work. Underneath that, the company has been standardizing the pieces an agent needs to run reliably. (anthropic.com)

### Why do multi-agent systems need extra plumbing?

Because the model is only one part of the job. A useful agent needs memory of what happened, a loop that decides what to do next, and a place to run code or edit files. Anthropic’s point is that developers often over-focus on the “brain” and under-focus on the machinery around it. But long-running failures usually come from that machinery: stale assumptions in the harness, lost sessions, or brittle execution environments. (anthropic.com)

### What is Anthropic virtualizing?

Anthropic says it split an agent into three abstractions: the session, the harness, and the sandbox. The session is the running log of everything that happened. The harness is the control loop that calls Claude and routes tool use. The sandbox is the execution environment where Claude can run code and edit files.
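
To make the three-way split concrete, here is a minimal sketch in Python. It is an illustration only and not Anthropic's API: the names `Session`, `Sandbox`, `EchoSandbox`, `harness`, and `scripted_model` are all hypothetical, and the "model" is a toy function standing in for a call to Claude.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Protocol


@dataclass
class Session:
    """Append-only log of everything that happened (hypothetical shape)."""
    events: list = field(default_factory=list)

    def append(self, kind: str, payload: str) -> None:
        self.events.append({"kind": kind, "payload": payload})


class Sandbox(Protocol):
    """Execution environment; any object with a matching run() can be swapped in."""
    def run(self, command: str) -> str: ...


class EchoSandbox:
    """Stand-in sandbox that only pretends to execute commands."""
    def run(self, command: str) -> str:
        return f"ran: {command}"


# A "model" here is just a function from the log to the next action.
# Returning None means the task is finished.
Model = Callable[[list], Optional[str]]


def harness(session: Session, sandbox: Sandbox, model: Model,
            max_steps: int = 10) -> Session:
    """Control loop: ask the model for an action, route it to the sandbox, log both."""
    for _ in range(max_steps):
        action = model(session.events)
        if action is None:
            break
        session.append("tool_call", action)
        session.append("tool_result", sandbox.run(action))
    return session


def scripted_model(events: list) -> Optional[str]:
    """Toy policy: issue two commands, then stop. It reads only the log."""
    plan = ["ls", "pytest"]
    done = sum(1 for e in events if e["kind"] == "tool_call")
    return plan[done] if done < len(plan) else None


# Simulate a harness that stops after one step, then a fresh harness
# that resumes from the same session log.
session = Session()
harness(session, EchoSandbox(), scripted_model, max_steps=1)
harness(session, EchoSandbox(), scripted_model)
```

Because the toy model reads its state only from the session log, the second `harness` call picks up where the first one stopped, and `EchoSandbox` could be replaced by any other object with a `run` method without touching the loop.
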
Anthropic wants those layers to be swappable, so the infrastructure can change without forcing developers to rebuild everything above it. (anthropic.com)

### Why does that “decoupling” matter?

Because tightly coupled agents are fragile. Anthropic describes an earlier setup where the session, harness, and sandbox all lived in one container. That made the system easier to start, but it also created a “pet” server problem: if that container failed, the session was lost. Decoupling the parts means the work can survive infrastructure changes and failures more gracefully. That is the boring enterprise stuff that ends up deciding whether an agent is a demo or a product. (anthropic.com)

### Where does sandboxing fit in?

Sandboxing is the safety and autonomy layer. In Claude Code, Anthropic built filesystem and network isolation so Claude can work more freely inside predefined boundaries. That reduces constant permission prompts while also limiting what a compromised or prompt-injected agent can touch. Anthropic said internal usage showed sandboxing cut permission prompts by 84%, which is a big clue about the real goal here: less babysitting, not just more security. (anthropic.com)

### How do agents hand work off?

Anthropic has been converging on structured handoffs. In its long-running application work, it described context resets paired with structured artifacts that carry state and next steps from one agent or session to the next. It also described planner, generator, and evaluator roles, a simple multi-agent pattern in which one agent breaks down work, one does it, and one checks it. That is not flashy, but it is how you stop long jobs from drifting off course. (anthropic.com)

### Is this really new, or just packaging?

A bit of both. Anthropic has talked about multi-agent research systems since 2025, and the company has been publishing pieces on harnesses, orchestration, and long-running Claude for months.
What feels new now is that the ideas are being turned into product interfaces developers can actually build on, instead of staying as internal engineering patterns. (anthropic.com)

### Bottom line

The important shift is not one magical feature. It’s that Anthropic is productizing the unglamorous layers around Claude: coordination, memory, execution, and recovery. If that works, Claude stops being just a strong model and starts looking more like infrastructure for enterprise automation. (anthropic.com 1) (anthropic.com 2)
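
For readers who want to see the handoff idea in code, the planner, generator, and evaluator pattern with a structured artifact can be sketched roughly as follows. This is a toy under stated assumptions, not Anthropic's implementation: the `Handoff` fields and the three role functions are hypothetical, and each role is hard-coded where a real system would call a model.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Handoff:
    """Structured artifact carrying state and next steps (hypothetical fields)."""
    goal: str
    done: list = field(default_factory=list)
    next_steps: list = field(default_factory=list)

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "Handoff":
        return cls(**json.loads(raw))


def planner(goal: str) -> Handoff:
    """Break the work into steps (hard-coded here for illustration)."""
    return Handoff(goal=goal, next_steps=["write function", "write tests"])


def generator(raw: str) -> str:
    """Start from the artifact alone, do one step, hand the updated artifact on."""
    handoff = Handoff.from_json(raw)
    if handoff.next_steps:
        handoff.done.append(handoff.next_steps.pop(0))
    return handoff.to_json()


def evaluator(raw: str) -> bool:
    """Accept only when no planned step remains."""
    return not Handoff.from_json(raw).next_steps


def run(goal: str, max_rounds: int = 5) -> Handoff:
    artifact = planner(goal).to_json()
    for _ in range(max_rounds):
        if evaluator(artifact):
            break
        # Each generator call sees only the serialized artifact,
        # mimicking a context reset between agents or sessions.
        artifact = generator(artifact)
    return Handoff.from_json(artifact)


result = run("add a slugify helper")
```

The design choice worth noticing is that nothing passes between roles except the serialized artifact, so any role could run in a fresh session and still know what has been done and what comes next.
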