Anthropic’s ‘Advisor’ pattern explained

A recent technical article lays out Anthropic’s “Advisor Strategy,” which pairs a stronger model with smaller ones in a single API call to balance cost and accuracy. The writeup includes benchmark and cost analysis plus code showing how expensive inference is routed only where needed. (decodethefuture.org).

Anthropic’s “Advisor” pattern describes a simple idea: let a cheaper model handle most requests, and call a stronger one only when the task looks hard. (decodethefuture.org) The setup works like triage. A first model reads the prompt, decides whether the job is easy or difficult, and then routes the request to a lower-cost or higher-capability path inside one application flow. (github.com) Anthropic has published adjacent patterns for that kind of routing in its public cookbook, where one language model first classifies an input and then sends it to a specialized prompt. The company also describes multi-agent systems that pair a stronger lead model with smaller subagents on research tasks. (github.com) (anthropic.com) The cost logic is straightforward because Anthropic sells models in tiers. As of April 2026, Claude Sonnet 4.6 starts at $3 per million input tokens and $15 per million output tokens, while Claude Opus 4.6 starts at $5 and $25, and Claude Haiku 4.5 starts at $1 and $5. (anthropic.com 1) (anthropic.com 2) (anthropic.com 3) Anthropic has also pushed features that make selective spending easier. Claude 3.7 Sonnet, released on February 24, 2025, introduced “extended thinking,” which lets developers cap how many tokens the model uses to think before answering. (anthropic.com) That means developers now have two separate levers: pick a cheaper or pricier model, and decide how much reasoning time to buy on each call. The “Advisor” writeup packages those levers into one pattern aimed at production systems that need both speed and accuracy. (anthropic.com) (decodethefuture.org) The article’s examples center on coding and document-heavy work, where many requests are routine but a smaller share need deeper analysis. Anthropic’s current Sonnet and Opus lines both advertise one-million-token context windows in beta on the Claude Platform, which makes those expensive calls more useful on long prompts and large codebases. (decodethefuture.org) (anthropic.com 1) (anthropic.com 2) Anthropic is not presenting routing as a one-off trick. Its developer materials now group prompt caching, batch processing, evaluations, tool use, and agent workflows into a broader playbook for controlling cost and reliability in deployed systems. (anthropic.com) (platform.claude.com) The practical takeaway is narrow but useful: do not pay Opus prices for every prompt if a lighter pass can screen the queue first. The “Advisor” pattern turns that rule into code developers can copy, measure, and tune. (decodethefuture.org)

Anthropic’s ‘Advisor’ pattern explained

Get your own daily briefing