OpenAI enhances coding model, diversifies hardware

OpenAI has released GPT-5.3 Codex, a new model purpose-built for code generation that achieved a 77.3% score on the 'terminal bench' benchmark and is 25% faster than its predecessor. Separately, the company is diversifying its hardware stack through a partnership with Cerebras and a collaboration with NVIDIA that nearly doubled the output of one of its models.

- Beyond the 'terminal bench' score, GPT-5.3 Codex showed significant gains on agentic benchmarks, jumping 26.5 percentage points over its predecessor on OSWorld-Verified, a test that measures the ability to complete tasks using a mouse and GUI apps.
- The partnership with Cerebras is a multi-year, $10 billion+ cloud services deal for 750MW of compute, not a hardware purchase, with deployment beginning in 2026. The goal is to provide a dedicated low-latency inference solution by leveraging Cerebras's Wafer-Scale Engine (WSE) architecture, which uses massive on-chip SRAM to mitigate memory bottlenecks common in GPU and HBM setups.
- The collaboration with NVIDIA focused on accelerating the open-weight `gpt-oss-120b` model, which uses a Mixture-of-Experts (MoE) architecture. By using TensorRT-LLM and a technique called "disaggregated serving," they achieved up to 1.5 million tokens per second on a single NVIDIA GB200 NVL72 system.
- A key feature of GPT-5.3 Codex is "real-time steering," which allows developers to provide feedback and guidance while the model is in the middle of executing a task, without losing context.
- OpenAI engineers used early versions of GPT-5.3 Codex to debug its own training runs and analyze evaluation results, a sign of the model's maturity for internal MLOps workflows.
- The NVIDIA optimization effort also demonstrated rapid performance gains, with one collaboration with Artificial Analysis showing a 35% acceleration in the output of the `gpt-oss-120b` model in just one week on a DGX B200 system.
- GPT-5.3 Codex is the first model OpenAI has classified as "high capability" under its Preparedness Framework, indicating it was specifically trained to identify and help fix software
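To put the NVIDIA throughput figures in perspective, the quoted numbers can be turned into rough per-GPU and per-token terms. The sketch below is a back-of-envelope calculation, not a benchmark. It assumes the GB200 NVL72 rack contains 72 Blackwell GPUs (as the product name suggests) and that the "35% acceleration" means a 35% increase in output tokens per second; the function names are illustrative only.

```python
# Figures quoted in the briefing.
SYSTEM_TOKENS_PER_SEC = 1_500_000  # "up to" peak for gpt-oss-120b on one GB200 NVL72
WEEKLY_SPEEDUP = 0.35              # reported one-week gain on a DGX B200

# Assumption: one NVL72 rack holds 72 Blackwell GPUs.
GPUS_PER_NVL72 = 72


def per_gpu_throughput(system_tps: float, gpus: int) -> float:
    """Average tokens per second per GPU, assuming an even split across GPUs."""
    return system_tps / gpus


def time_per_token_reduction(speedup: float) -> float:
    """Fractional drop in time per token for a given throughput increase.

    Assumes "acceleration" means throughput rises by `speedup`,
    so time per token scales by 1 / (1 + speedup).
    """
    return 1.0 - 1.0 / (1.0 + speedup)


if __name__ == "__main__":
    tps = per_gpu_throughput(SYSTEM_TOKENS_PER_SEC, GPUS_PER_NVL72)
    cut = time_per_token_reduction(WEEKLY_SPEEDUP)
    print(f"Per-GPU throughput: about {tps:,.0f} tokens/s")
    print(f"Time per token falls by about {cut:.1%}")
```

Under these assumptions the rack-level peak works out to roughly 20,800 tokens per second per GPU, and a 35% throughput gain shortens the time per output token by about 26%, which is smaller than the headline percentage because speedup and time saved are not the same quantity.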
