OpenAI Deploys First Production Model on Non-Nvidia Chips

OpenAI launched its GPT-5.3-Codex-Spark model on Cerebras hardware, its first major production AI workload deployed away from Nvidia's GPGPU architecture. The move signals a potential shift in the hardware economics and performance characteristics of large-scale AI inference.

- The deployment leverages Cerebras's third-generation Wafer-Scale Engine (WSE-3), a single dinner-plate-sized chip with 900,000 AI-optimized cores and 44 GB of on-chip SRAM. This architecture contrasts with GPU clusters by keeping the entire model on one processor, drastically reducing the latency caused by data movement between multiple chips.
- GPT-5.3-Codex-Spark is a smaller, specialized version of OpenAI's Codex model, optimized for high-speed, interactive tasks like real-time code editing. On the Cerebras hardware, it can generate over 1,000 tokens per second with a 128k context window.
- This move is part of a multi-year, $10 billion agreement for OpenAI to deploy 750 megawatts of Cerebras compute capacity for low-latency inference, which will be brought online in phases through 2028.
- The Cerebras architecture is designed to excel at inference workloads where responsiveness is critical. For a 120-billion-parameter model, the Cerebras CS-3 has been benchmarked at over 2,700 tokens per second, compared to 900 tokens per second on Nvidia's Blackwell B200 GPU.
- From a hardware economics perspective, Cerebras claims its systems can offer significant price-performance and power efficiency advantages. One analysis indicated a Cerebras CS-3 system was 32% lower cost and used one-third the power of a comparable Nvidia DGX system while delivering results 21 times faster.
- OpenAI has stated that GPUs remain fundamental to its operations for large-scale training and cost-effective, general-purpose inference. The addition of Cerebras creates a complementary, specialized tier for workloads that demand extremely low latency, as part of a broader strategy to build a more resilient and diverse hardware portfolio.
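
To put the quoted throughput figures in concrete terms, here is a minimal Python sketch that turns tokens per second into wall-clock generation time. The two rates are the benchmark numbers cited above for a 120-billion-parameter model. The CS-3 figure is reported as "over 2,700", so the computed ratio is a lower bound. The 500-token response size and the `generation_seconds` helper are illustrative assumptions, and the calculation ignores time to first token, batching, and network overhead.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream num_tokens at a steady decode rate.

    Simplifying assumption: constant throughput, no time-to-first-token.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Throughput figures quoted in the briefing (120-billion-parameter model).
CS3_TPS = 2700.0   # Cerebras CS-3, reported as "over 2,700"
B200_TPS = 900.0   # Nvidia Blackwell B200

# Ratio of the two quoted rates.
speedup = CS3_TPS / B200_TPS

# Hypothetical response size for an interactive code edit (assumption).
edit_tokens = 500
cs3_time = generation_seconds(edit_tokens, CS3_TPS)
b200_time = generation_seconds(edit_tokens, B200_TPS)

print(f"speedup: {speedup:.1f}x")
print(f"CS-3:  {cs3_time:.3f} s for {edit_tokens} tokens")
print(f"B200:  {b200_time:.3f} s for {edit_tokens} tokens")
```

By the same arithmetic, a model streaming 1,000 tokens per second would finish a 1,000-token response in about one second, which is the scale at which real-time code editing starts to feel interactive.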

Shared from Scout - Be the smartest in the room.