OpenAI runs Codex‑Spark on Cerebras

- OpenAI released GPT-5.3-Codex-Spark in research preview on February 12, 2026, introducing a real-time coding model served on Cerebras hardware. - OpenAI and Cerebras said Codex-Spark delivers more than 1,000 tokens per second, with rollout to ChatGPT Pro users in Codex tools. - OpenAI’s Codex models page lists gpt-5.3-codex-spark as a text-only research preview available to ChatGPT Pro users.

OpenAI released GPT-5.3-Codex-Spark in research preview on February 12, 2026, describing it as a smaller version of GPT-5.3-Codex and the company’s first model built for real-time coding. The company said the model runs on Cerebras hardware and is tuned for low-latency use inside its Codex product. Cerebras said the launch is the first product release in the companies’ fast-inference collaboration. OpenAI said the model is rolling out to ChatGPT Pro users in the Codex app, command-line interface and IDE extension. ### What exactly did OpenAI launch? OpenAI said GPT-5.3-Codex-Spark is a “research preview” rather than a general release, and positioned it as a speed-focused companion to the broader GPT-5.3-Codex model. The company said Spark is “our first model designed for real-time coding,” while GPT-5.3-Codex remains the higher-capability model for longer, more complex software work. (openai.com) The Codex models page lists gpt-5.3-codex-spark as a text-only model optimized for “near-instant, real-time coding iteration.” That page says the model is available to ChatGPT Pro users through Codex interfaces rather than as a separate general API product. ### Why is Cerebras part of this launch? Cerebras said February 12 marked the first release in its collaboration with OpenAI, and tied the product directly to its Wafer-Scale Engine systems. (openai.com) OpenAI had announced that partnership in January, saying it would add 750 megawatts of high-speed AI compute to reduce inference latency and make ChatGPT faster for real-time workloads, with capacity scheduled to come online in multiple tranches through 2028. (developers.openai.com) Andrew Feldman, Cerebras’ co-founder and chief executive, said in the January partnership announcement that “real-time inference will transform AI,” framing low latency as a product feature rather than only an infrastructure metric. OpenAI’s February launch note described Codex-Spark as the first milestone under that arrangement. ### How fast do the companies say it is? (cerebras.ai) OpenAI said Codex-Spark delivers “more than 1000 tokens per second” when served on ultra-low-latency hardware. Cerebras described the model as running at over 1,000 tokens per second, and one later Cerebras post referred to more than 1,200 tokens per second for the same model family. OpenAI’s own launch post uses the lower “more than 1000” figure. (openai.com) Cerebras said the product is intended for software development tasks where responsiveness matters as much as model quality. OpenAI said the goal is to make the model feel “near-instant,” a description that points to interactive coding sessions rather than long unattended runs. ### Where does this sit inside OpenAI’s Codex lineup? OpenAI introduced GPT-5.3-Codex separately on the same product cycle as its higher-end coding and agentic model, saying that model improved on GPT-5.2-Codex and was 25% faster. (openai.com) In that framing, Spark is the lighter, faster option, while GPT-5.3-Codex is aimed at longer-horizon engineering work that uses tools, research and execution over extended sessions. The Codex changelog says Spark was added in research preview for ChatGPT Pro users in the latest Codex app, CLI and IDE extension. The same changelog later points users toward newer Codex models as they appear, indicating Spark sits inside a rapidly updated product stack rather than as a standalone flagship. ### Who can use it now, and what comes next? (openai.com) ChatGPT Pro users are the named audience in OpenAI’s release materials, and the company said the rollout began on February 12 across the Codex app, CLI and IDE extension. OpenAI also linked the launch page to a Codex app waitlist, indicating broader access was still being staged at the time of the announcement. (developers.openai.com) Through 2028, OpenAI and Cerebras plan to bring additional high-speed inference capacity online under the 750-megawatt partnership announced in January. OpenAI’s developers site continues to list gpt-5.3-codex-spark as a research-preview model for ChatGPT Pro users, which is the clearest current marker of where the product stands. (openai.com 1) (openai.com 2)

Get your own daily briefing

Scout delivers personalized news, insights, and conversations tailored to your role and industry.

Download on the App Store

Shared from Scout - Be the smartest in the room.