OpenAI ships GPT‑5.4 Mini/Nano

OpenAI rolled out GPT‑5.4 Mini and Nano inside ChatGPT to give developers lighter, faster model options and a new “thinking‑time” toggle that trades latency for deliberation — the system may also switch models automatically based on query complexity. This expands choices for on‑device and low‑latency coding or productivity workflows and raises fresh questions about transparency when the service abstracts model selection. (help.openai.com)

OpenAI published GPT‑5.4 mini and GPT‑5.4 nano on March 17, 2026, positioning both as smaller, faster members of the GPT‑5.4 family and making them available in ChatGPT and the API. (openai.com) OpenAI’s benchmark table shows SWE‑Bench Pro scores of 57.7% (GPT‑5.4), 54.4% (GPT‑5.4 mini), and 52.4% (GPT‑5.4 nano), while Terminal‑Bench 2.0 scores read 75.1%, 60.0%, and 46.3% respectively, and OSWorld‑Verified lists 75.0%, 72.1%, and 39.0% for the same trio. (openai.com) The company reports GPT‑5.4 mini runs more than 2× faster than the prior GPT‑5 mini and “approaches” GPT‑5.4‑level pass rates on several evaluations, with latency estimates that account for tool‑call duration, sampled tokens, and input tokens. (openai.com) OpenAI recommends GPT‑5.4 nano for ultra‑low‑latency, high‑throughput tasks such as classification, extraction, ranking and lightweight coding subagents, while describing mini as the preferred fast option for coding loops, code navigation, and multimodal tasks. (openai.com) ChatGPT’s “thinking‑time” control — surfaced in the message composer when selecting a Thinking model — lets users trade standard versus extended inference duration (multiple tiers available for Pro/Business plans), with the selection persisting across chats until changed. (techradar.com) ChatGPT also exposes an “Auto” mode and a Configure→“Auto‑switch to Thinking” setting that will route queries between Instant/Fast and Thinking models based on prompt complexity, and users can turn automatic switching off via the model picker. (help.openai.com) Observers and press coverage note that Auto’s abstraction of model selection centralizes latency/quality tradeoffs but reduces transparency about which model answered a given query, a dynamic TechCrunch coverage tied to OpenAI’s reworked “Auto/Fast/Thinking” settings and ongoing debate over the model picker. (techcrunch.com) OpenAI published customer feedback from Hebbia’s CTO claiming GPT‑5.4 mini matched or exceeded competitive models on several tasks while delivering stronger end‑to‑end pass rates and source attribution at lower cost, a use‑case signal aimed squarely at coding assistants and agentic workflows. (openai.com)

Get your own daily briefing

Scout delivers personalized news, insights, and conversations tailored to your role and industry.

Download on the App Store

Shared from Scout - Be the smartest in the room.