GPT-5.5 benchmark results published as OpenAI posts performance suite and enterprise details
- OpenAI said on April 23 it is rolling out GPT-5.5 to ChatGPT and Codex, then added GPT-5.5 and GPT-5.5 Pro to its API on April 24.
- OpenAI’s posted scores showed GPT-5.5 Pro at 90.1% on BrowseComp and 39.6% on FrontierMath Tier 4, while GPT-5.5 hit 82.7% on Terminal-Bench 2.0.
- The launch ties OpenAI’s newest model to Codex browser use and enterprise app access in ChatGPT. (openai.com)
OpenAI rolled out GPT-5.5 on April 23 for ChatGPT and Codex, then expanded it to the application programming interface on April 24 with a new system card update. (openai.com 1) (openai.com 2)

The company described GPT-5.5 as its newest frontier model for coding, computer use, knowledge work, online research, spreadsheets, and documents. It said Plus, Pro, Business, and Enterprise users in ChatGPT and Codex were getting the rollout first. (openai.com)

OpenAI published a benchmark table alongside the release. In that table, GPT-5.5 scored 82.7% on Terminal-Bench 2.0, 84.9% on GDPval wins or ties, 78.7% on OSWorld-Verified, 55.6% on Toolathlon, and 51.7% on FrontierMath Tier 1 through 3. (openai.com)

The same table showed GPT-5.5 Pro at 90.1% on BrowseComp and 39.6% on FrontierMath Tier 4. OpenAI said GPT-5.5 matches GPT-5.4 per-token latency in real-world serving while using fewer tokens on the same Codex tasks. (openai.com)

Those benchmarks are tests for agent-style work rather than chatbot trivia. Terminal-Bench measures work in a command line, BrowseComp measures finding hard-to-locate information on the web, and OSWorld-Verified measures carrying out computer tasks across software tools. (openai.com 1) (openai.com 2)

OpenAI tied the model directly to its enterprise and workplace push. Its ChatGPT help documentation says connected apps can search company data, run deep research with citations, sync content into a workspace knowledge base, and in some cases take actions in outside services. (help.openai.com)

In Codex, OpenAI’s developer docs now recommend GPT-5.5 for most tasks when it appears in the model picker. The docs say it is strongest for complex coding, computer use, knowledge work, and research workflows, though it is not available through API-key authentication inside Codex. (developers.openai.com)

OpenAI’s Codex changelog paired the GPT-5.5 launch with browser features inside the Codex app.
The April 23 entry says Codex can operate an in-app browser for local development servers and file-backed pages, with settings to manage allowed or blocked websites. (developers.openai.com)

The company also used the release to emphasize safeguards. OpenAI said it ran its full predeployment safety evaluations, targeted red-teaming for advanced cybersecurity and biology capabilities, and gathered feedback from nearly 200 early-access partners before release. (openai.com)

The release leaves OpenAI with one message for customers buying AI for office and software work: GPT-5.5 is not just a chat model, but a model OpenAI is packaging around tools, browsers, files, and enterprise systems. (openai.com) (developers.openai.com)