OpenAI unbundles model stack

- OpenAI split its 2025 developer lineup into separate model families and attachable tools, launching GPT‑4.1 on April 14 and adding image generation, Code Interpreter, and remote MCP tools to Responses on May 21.
- The clearest sign is pricing: GPT‑4.1 mini was introduced at 83% lower cost than GPT‑4o, while web search is billed at $10 per 1,000 calls and containers are billed separately from model tokens.
- The shift moved OpenAI from selling one flagship chatbot toward selling models, tools, and runtimes as parts developers mix themselves. (openai.com)

OpenAI spent 2025 breaking apart the all-in-one model pitch it popularized with ChatGPT. Developers now buy a model, then add tools like web search, image generation, code execution, or voice on top. (openai.com 1) (openai.com 2) (openai.com 3)

The split became visible on April 14, 2025, when OpenAI launched GPT‑4.1, GPT‑4.1 mini, and GPT‑4.1 nano as API-only models instead of folding everything into ChatGPT’s main picker. OpenAI said GPT‑4.1 supports up to 1 million tokens of context and that the family was tuned for coding, instruction following, and long documents. (openai.com)

Five weeks later, on May 21, 2025, OpenAI added built-in tools to its Responses API: image generation, Code Interpreter, improved file search, and support for remote Model Context Protocol servers. OpenAI described Responses as its “core API primitive” for agentic applications and said developers had already processed trillions of tokens through it. (openai.com)

That matters because a model is now only one layer of the product. The model writes and reasons; separate tools fetch live information, run code, search files, call outside services, or generate images. (openai.com 1) (openai.com 2)

OpenAI’s pricing page shows the pieces explicitly. Text models, realtime voice models, image models, web search, and containers are listed as separate billable components, with web search priced per call and containers priced per session alongside token charges. (openai.com)

The company also pushed customers toward smaller, more specialized model choices. In its GPT‑4.1 launch post, OpenAI said GPT‑4.1 mini matched or beat GPT‑4o on many intelligence evaluations while cutting latency by nearly half and reducing cost by 83%, and it positioned GPT‑4.1 nano as its fastest and cheapest option. (openai.com)

Audio followed the same pattern. On March 20, 2025, OpenAI introduced a separate suite of audio models for speech-to-text and text-to-speech, rather than treating voice as a built-in property of every flagship release. (openai.com)

By late 2025 and 2026, OpenAI was building more infrastructure around that modular stack. It added MCP support to realtime voice, introduced shell and patch tools for coding agents, and expanded the Responses API with WebSocket connections and hosted execution environments. (openai.com 1) (openai.com 2) (openai.com 3) (openai.com 4)

ChatGPT itself did not disappear, but it increasingly looks like a packaged front end over a changing back end. OpenAI’s model release notes show frequent swaps, fallbacks, and retirements inside ChatGPT, while the API side exposes the underlying parts more directly to developers. (help.openai.com)

For software companies that build on OpenAI, that means more freedom and more assembly work. Instead of waiting for one bigger flagship model, they can mix a cheaper text model with web search, a container, an image model, or a voice model, and pay for each layer separately. (openai.com) (openai.com)
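
For teams budgeting against this layered pricing, the arithmetic can be sketched in a few lines of Python. This is a rough illustration, not OpenAI's billing logic: the only figure taken from the reporting above is the web search rate of $10 per 1,000 calls. The `Usage` and `Rates` types, the `estimate_cost` function, and every rate in the usage example are hypothetical placeholders that would need to be replaced with numbers from the current pricing page.

```python
from dataclasses import dataclass

# From the article: web search is billed at $10 per 1,000 calls.
WEB_SEARCH_USD_PER_CALL = 10.0 / 1000


@dataclass
class Usage:
    """Hypothetical monthly usage for one application."""
    input_tokens: int
    output_tokens: int
    web_search_calls: int = 0
    container_sessions: int = 0


@dataclass
class Rates:
    """Placeholder rates (assumptions, not published prices)."""
    input_usd_per_million: float
    output_usd_per_million: float
    container_usd_per_session: float


def estimate_cost(usage: Usage, rates: Rates) -> dict[str, float]:
    """Return a per-layer cost breakdown in USD, plus a total."""
    token_cost = (
        usage.input_tokens * rates.input_usd_per_million
        + usage.output_tokens * rates.output_usd_per_million
    ) / 1_000_000
    breakdown = {
        "model_tokens": token_cost,
        "web_search": usage.web_search_calls * WEB_SEARCH_USD_PER_CALL,
        "containers": usage.container_sessions * rates.container_usd_per_session,
    }
    # The total is summed before the "total" key is added to the dict.
    breakdown["total"] = sum(breakdown.values())
    return breakdown


if __name__ == "__main__":
    # All three rates below are made up for illustration.
    rates = Rates(
        input_usd_per_million=1.0,
        output_usd_per_million=4.0,
        container_usd_per_session=0.05,
    )
    usage = Usage(
        input_tokens=2_000_000,
        output_tokens=500_000,
        web_search_calls=300,
        container_sessions=10,
    )
    for layer, usd in estimate_cost(usage, rates).items():
        print(f"{layer}: ${usd:.2f}")
```

Keeping each layer as its own line in the breakdown mirrors the way the components are billed: a cheaper text model lowers only the token line, while tool-heavy workloads can still be dominated by per-call and per-session charges.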
