Enterprise AI goes multi‑model
Companies are moving away from betting on a single AI model and toward systems that orchestrate several models to improve reliability and auditability. Microsoft now lets GPT and Anthropic’s Claude check each other inside Microsoft 365 Copilot’s Researcher agent, a practical signal that enterprises will stitch models together rather than standardise on one vendor. That shift raises new procurement and governance challenges — from cost attribution to audit trails — as outputs become composites rather than the work of a single system. (geekwire.com)
Microsoft just did something big inside Microsoft 365 Copilot: it let OpenAI’s GPT write a research draft and Anthropic’s Claude critique it inside the same workflow, instead of forcing customers to pick one model and live with its mistakes. (geekwire.com) The feature sits inside Researcher, Microsoft’s deep-research agent for work, which already pulls from web pages plus company files, emails, meetings, and chats a worker is allowed to access. (learn.microsoft.com) Microsoft calls the new setup “Critique,” and the idea is simple: one model produces an answer, then a second model tries to find weak spots, missing evidence, or bad reasoning before the report goes back to the user. (techcommunity.microsoft.com) Microsoft also added “Council,” which runs several models side by side so a user can compare different answers instead of seeing one polished paragraph and guessing what got left out. (techcommunity.microsoft.com) That sounds like a product tweak, but it breaks a basic assumption from the first enterprise artificial intelligence wave, when companies were told to choose one “foundation model” the way they once chose one cloud vendor or one database. (geekwire.com) Microsoft had already been moving in this direction. In September 2025, it said Researcher could be powered by either OpenAI reasoning models or Anthropic’s Claude Opus 4.1, giving customers model choice inside the same product. (microsoft.com) It made the same shift in Copilot Studio, Microsoft’s tool for building custom workplace agents, where administrators got controls for Anthropic access and even automatic fallback to OpenAI GPT-4o if Anthropic models were turned off. (microsoft.com) Once one answer is built from two or three models, the hard part moves from “which model is smartest” to “which model did what.” A legal team, an auditor, or a procurement office now has to track which system drafted, which system checked, and which bill each step belongs to. (geekwire.com) Microsoft’s own support pages now describe Researcher as supporting multiple artificial intelligence models from OpenAI and Anthropic, which means the composite answer is no longer an exception or a lab demo. It is product behavior. (support.microsoft.com) That creates a new kind of audit trail problem. If a board memo contains a wrong number copied from an internal spreadsheet and then “validated” by a second model, the company has to reconstruct not just the source document but the chain of model decisions layered on top of it. (learn.microsoft.com) (techcommunity.microsoft.com) The practical result is that enterprise buyers are starting to purchase an orchestration system, not a single artificial intelligence brain. Microsoft is selling the traffic control tower that routes work between rival models, logs the steps, and keeps the whole process inside Microsoft 365. (microsoft.com) (geekwire.com)