War for inference chips

- AI infrastructure competition is shifting from training GPUs to cheaper, faster inference hardware offered by challengers like Cerebras.
- Cerebras refiled for an IPO and reportedly carries an OpenAI contract valued at more than $10 billion.
- That filing frames a test of Nvidia’s pricing power and highlights inference economics as the next battleground for AI deployment (spacedaily.com).

Cerebras has reopened its bid for the public markets, turning an obscure chip fight into a test of who profits from running artificial intelligence, not just training it. (sec.gov)

The Sunnyvale, California, company filed to go public on April 17, 2026, after withdrawing its earlier paperwork in October 2025. Its new filing follows a January deal with OpenAI and a March cloud partnership with Amazon Web Services. (cnbc.com) (aboutamazon.com)

Inference is the step where a model answers a prompt, writes code, or generates an image after training is done. OpenAI said its Cerebras partnership adds 750 megawatts of high-speed compute through 2028 to cut response delays in real-time products. (openai.com)

Cerebras builds wafer-scale processors, chips so large they use an entire silicon wafer instead of a small cutout. The company says that design keeps more compute, memory, and data movement on one piece of silicon, which is meant to speed up both training and inference. (sec.gov)

The commercial stakes are large. Reuters reported on April 16 that OpenAI had agreed to spend more than $20 billion over three years on Cerebras-powered servers and could receive an equity stake, while OpenAI’s January announcement publicly described the deal as 750 megawatts through 2028. (reuters.com) (openai.com)

That puts pressure on Nvidia in the part of the market customers feel every time they use a chatbot. Nvidia says its Blackwell systems lower total cost of ownership for inference and can cut cost per token by 35 times versus Hopper in one flagship configuration. (nvidia.com)

Cerebras is arguing that buyers do not need to accept the usual tradeoff between speed and price. In March, Amazon Web Services said Cerebras hardware would be deployed in AWS data centers and exposed through Amazon Bedrock in the coming months. (aboutamazon.com)

The filing also gives investors a first hard look at the business behind the pitch. CNBC reported Cerebras posted $510 million in 2025 revenue, up nearly 76% from 2024, and $87.9 million in net income, after a $484.8 million net loss a year earlier. (cnbc.com)

That profit picture comes with caveats. PitchBook said the filing disclosed a $24.6 billion order backlog tied mostly to OpenAI, a $1 billion OpenAI loan, and warrants for 33 million shares, leaving Cerebras heavily exposed to one customer relationship. (pitchbook.com)

Cerebras’ first listing attempt stalled after a federal review of its ties to Abu Dhabi-backed G42, and the company pulled the offering in October 2025. Its second attempt arrives with bigger contracts, more named partners, and a simpler question for public investors: whether cheaper, faster answers can loosen Nvidia’s grip on artificial intelligence infrastructure. (techcrunch.com) (bloomberg.com)

