Cerebras CEO claim

Cerebras’ CEO publicly claimed that the company’s wafer‑scale SRAM approach can generate tokens ‘15x faster’ than NVIDIA Blackwell GPUs, citing memory bandwidth advantages. Cerebras is positioning the claim as a latency and token‑generation differentiator for inference. (x.com)

OpenAI signed a multi‑year agreement on Jan. 14, 2026 to add 750 megawatts of Cerebras wafer‑scale systems to its infrastructure, a deployment the companies say will roll out through 2028. (openai.com) Major outlets reported the OpenAI–Cerebras arrangement as a deal valued at roughly $10 billion, a commercial vote of confidence that underpins the aggressive latency and token‑generation positioning in Cerebras’ public statements. (techcrunch.com)

Company materials describe Cerebras’ WSE‑3 wafer‑scale chip as a 4‑trillion‑transistor device with roughly 900,000 AI cores, ~125 petaflops of peak AI compute and 44 GB of on‑chip SRAM. (cerebras.ai) Cerebras says that on‑chip SRAM delivers about 21 petabytes/sec of aggregate memory bandwidth, a figure the company uses to explain why some inference workloads can avoid external memory bottlenecks. (cerebras.ai)

Company and partner benchmarks published by Cerebras cite throughput wins such as ~2,700 output tokens/sec on GPT‑OSS‑120B versus ~900 tokens/sec on an NVIDIA DGX B200 Blackwell cluster in the same tests. (cerebras.ai) OpenAI’s Feb. 12, 2026 research preview for GPT‑5.3‑Codex‑Spark notes the model running on Cerebras hardware at more than 1,000 tokens/sec and marks the first production deployment of an OpenAI inference variant on Cerebras systems. (openai.com) Independent benchmarking sites and multi‑vendor aggregates report much wider variance for the same models, with some public runs of GPT‑OSS‑120B averaging ~187 tokens/sec across many tests as of mid‑March 2026. (llm-benchmarks.com)
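As a rough sanity check, the throughput figures quoted above can be turned into speedup ratios and set beside the ‘15x’ claim. The sketch below is illustrative arithmetic only: the dictionary keys and the `speedup` and `cited_ratios` helpers are hypothetical names, the inputs are the approximate numbers cited in this briefing, and the multi‑vendor average is not a Blackwell‑specific measurement, so its ratio is not a like‑for‑like comparison.

```python
# Approximate output tokens/sec for GPT-OSS-120B, as quoted in this briefing.
# Key names are hypothetical labels chosen for this sketch.
CITED_TOKENS_PER_SEC = {
    "cerebras": 2700.0,              # Cerebras-published benchmark
    "dgx_b200": 900.0,               # NVIDIA DGX B200 figure from the same tests
    "multi_vendor_average": 187.0,   # independent aggregate, mid-March 2026
}

CLAIMED_SPEEDUP = 15.0  # the CEO's "15x faster" claim


def speedup(fast: float, slow: float) -> float:
    """Return how many times higher `fast` throughput is than `slow`."""
    if slow <= 0:
        raise ValueError("baseline throughput must be positive")
    return fast / slow


def cited_ratios() -> dict[str, float]:
    """Compute Cerebras' cited throughput relative to each other cited figure."""
    cerebras = CITED_TOKENS_PER_SEC["cerebras"]
    return {
        name: speedup(cerebras, value)
        for name, value in CITED_TOKENS_PER_SEC.items()
        if name != "cerebras"
    }


if __name__ == "__main__":
    for name, ratio in cited_ratios().items():
        print(f"cerebras vs {name}: {ratio:.1f}x (claimed: {CLAIMED_SPEEDUP:.0f}x)")
```

On these inputs, the Cerebras‑published head‑to‑head works out to about 3x, while the ratio against the multi‑vendor average is about 14x. Neither number confirms or refutes the CEO’s claim on its own, since the workloads, batch sizes and hardware behind each figure differ.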
