Pair models for parts of work
- Microsoft said Anthropic models are rolling out in Copilot Studio alongside OpenAI models, giving customers separate choices for orchestration, chat and deep-reasoning work inside one agent platform. - OpenAI’s own API guides now tell developers to mix model roles: use reasoning models for planning and decisions, then hand execution to faster GPT models for bounded tasks. - Anthropic and OpenAI are both documenting the same pattern: stable outer workflows, specialist inner steps, and routing layers that swap models without rewriting the whole system. (microsoft.com)
Microsoft is turning multi-model workflows into a product feature, adding Anthropic models alongside OpenAI models in Copilot Studio. (microsoft.com) The company said customers can now choose Anthropic models for orchestration, chat and deep reasoning scenarios, while OpenAI remains the default for new agents. Microsoft named Claude Sonnet 4 and Claude Opus 4.1 in the rollout. (microsoft.com) That product move lines up with how model companies now describe real deployments. OpenAI’s documentation says reasoning models work best for complex problem-solving, coding, scientific reasoning and multi-step agent workflows, while faster GPT models are better for straightforward execution. (developers.openai.com 1) (developers.openai.com 2) OpenAI goes further than that. Its guide says “most AI workflows” will use a combination: o-series models for planning and decision-making, GPT-series models for execution when speed and cost matter more. (developers.openai.com) The underlying idea is simple: one model maps the route, another carries the boxes. In OpenAI’s agents documentation, that shows up as “handoffs” when a specialist should take over, or “agents as tools” when a manager keeps control and calls a narrower helper such as a summarizer. (developers.openai.com) Anthropic is describing a similar architecture from the other side of the stack. Its Managed Agents post says the durable part of an agent system is not one giant prompt, but a set of interfaces for a session, a harness that routes tool calls, and a sandbox where work runs. (anthropic.com) That matters for long documents and repeated background material, where teams often want one model to hold a lot of context and another to make or execute decisions. Anthropic says Claude prompt caching can cut latency by more than 2x and costs by up to 90%, and says smaller knowledge bases can sometimes be passed directly into the prompt instead of using retrieval systems. (anthropic.com) The practical shift is away from asking one model to do everything in one pass. Vendors are now publishing patterns for routers, specialists and bounded tool calls, and platform companies like Microsoft are packaging those choices into enterprise software. (developers.openai.com) (microsoft.com)