OpenAI Launches GPT-5.3-Codex-Spark on Cerebras Chip

OpenAI has released GPT-5.3-Codex-Spark, an ultra-fast, real-time coding model with a 128k-token context window. In a strategic shift, the model is optimized for Cerebras’ WSE-3 chip, moving away from Nvidia's hardware for this deployment. The deployment is designed to enable high-throughput, low-cost inference for enterprise use cases, and the model is available to ChatGPT Pro users.

- The Cerebras WSE-3 chip boasts 4 trillion transistors and 900,000 AI-optimized cores, delivering 125 petaflops of peak AI performance. This wafer-scale engine is designed to train models up to 10 times larger than GPT-4 and Gemini, with the ability to store a 24 trillion parameter model in a single logical memory space. For inference, the architecture's massive on-chip memory and high bandwidth are engineered to minimize latency, a critical factor for the real-time responsiveness targeted by GPT-5.3-Codex-Spark.
- The 128k token context window is a significant feature for coding tasks, theoretically allowing the model to process and "remember" approximately 80-100 files of code at once. However, research indicates that many models exhibit a "U-shaped" attention pattern, focusing heavily on the beginning and end of the context while ignoring the middle. This creates a gap between the theoretical capacity of a large context window and its effective utilization in complex, multi-file software engineering tasks.
- This model's focus on ultra-low latency is tailored for agentic AI workflows, where an AI can autonomously plan and execute a series of tasks. By delivering over 1,000 tokens per second, GPT-5.3-Codex-Spark is designed for interactive development and can be easily interrupted and redirected, which is crucial for developers working in real-time collaboration scenarios. This complements slower, more powerful models that handle long-running, complex tasks.
- OpenAI's partnership with Cerebras, along with recent deals with AMD and Broadcom, signals a strategic move to diversify its hardware supply chain beyond Nvidia. While Nvidia remains the core of OpenAI's training and inference stack, these partnerships provide specialized hardware for different tiers of performance, such as the low-latency inference targeted by the Cerebras deployment.
- For enterprise adoption, the primary challenges remain data quality, security, and integration with legacy systems. A significant percentage of enterprises report that a lack of technical expertise and concerns over compliance and data privacy are major barriers to deploying AI. AI governance frameworks are becoming critical to manage these risks, ensuring that models operate within ethical and legal boundaries.
- The developer experience for this new model is facilitated through OpenAI's SDKs, which abstract away much of the complexity of direct API calls. However, building production-ready agentic systems requires significant engineering effort beyond basic SDK implementation, involving challenges in managing conversation history, tool integration, and robust error handling. The SDKs provide patterns for function calling, which allows the model to interact with external tools and APIs, a key component for building autonomous agents.
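
The final item above names conversation-history management, tool integration, and error handling as the work that sits beyond a basic SDK call. The following is a minimal, library-free Python sketch of those three pieces, not OpenAI's SDK or its API. Every name in it (`TOOLS`, `count_lines`, `dispatch_tool_call`, `estimate_tokens`, `trim_history`) is hypothetical, the message shape is an assumed list of dicts with a `content` key, and the four-characters-per-token estimate is a rough assumption rather than a real tokenizer.

```python
import json
from typing import Any, Callable


def count_lines(text: str) -> int:
    """Toy tool (hypothetical): count the lines in a string."""
    return len(text.splitlines())


# Hypothetical tool registry mapping tool names to Python callables.
TOOLS: dict[str, Callable[..., Any]] = {"count_lines": count_lines}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run one model-requested tool call and always return a JSON string.

    Failures are returned as data instead of raised, so the calling loop
    can pass the error back to the model rather than crash.
    """
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        kwargs = json.loads(arguments_json)  # arguments arrive as JSON text
        result = TOOLS[name](**kwargs)       # TypeError if arguments do not fit
    except (TypeError, ValueError) as exc:   # JSONDecodeError is a ValueError
        return json.dumps({"error": f"{type(exc).__name__}: {exc}"})
    return json.dumps({"result": result})


def estimate_tokens(text: str) -> int:
    """Rough size estimate: assumes about 4 characters per token."""
    return max(1, len(text) // 4)


def trim_history(messages: list[dict[str, str]], budget: int) -> list[dict[str, str]]:
    """Keep the first message plus the newest messages that fit the budget.

    The first message is assumed to be the system prompt and is always kept.
    """
    if not messages:
        return []
    system, rest = messages[0], messages[1:]
    used = estimate_tokens(system["content"])
    kept: list[dict[str, str]] = []
    for message in reversed(rest):           # walk from newest to oldest
        cost = estimate_tokens(message["content"])
        if used + cost > budget:
            break                            # older messages are dropped
        kept.append(message)
        used += cost
    return [system] + kept[::-1]             # restore chronological order
```

In a real agent loop, `trim_history` would run before each request and `dispatch_tool_call` after each response that asks for a tool. Dropping the oldest turns is only one possible policy; given the "U-shaped" attention pattern mentioned above, summarizing older turns instead of discarding them may be worth evaluating.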
