Big bets on inference silicon

Social posts and press coverage have flagged large bets by hyperscalers and AI labs on non‑GPU inference silicon: OpenAI, Meta, and Anthropic are reportedly investing heavily in AMD, Cerebras, Broadcom, and Trainium hardware and in custom software stacks (x.com). The trend is creating a market for inference alternatives that could fragment deployment paths and vendor choices.

OpenAI announced a multi‑year, roughly $10 billion agreement with Cerebras on Jan. 14, 2026 to provision about 750 megawatts of wafer‑scale inference capacity through 2028 (cnbc.com).

Separately, OpenAI signed a collaboration with Broadcom on Oct. 13, 2025 to co‑develop and deploy 10 gigawatts of custom accelerators and Ethernet‑based racks, with rollout slated to start in the second half of 2026 and complete by the end of 2029 (openai.com).

Amazon Web Services announced a Cerebras partnership on March 13, 2026 that pairs AWS Trainium 3 for prefill with Cerebras Wafer‑Scale Engines for decode. AWS said the integrated Bedrock service will begin rolling out in the second half of 2026 (press.aboutamazon.com).

Anthropic disclosed an expanded collaboration with AWS on Nov. 22, 2024 that included a $4 billion tranche (bringing Amazon’s total Anthropic stake to $8 billion) and deep technical work optimizing the Neuron/Trainium stack. AWS reported that Anthropic was expected to run more than one million Trainium2 chips by the end of 2025 (anthropic.com).

NVIDIA and OpenAI announced a letter of intent to deploy at least 10 gigawatts of NVIDIA systems. NVIDIA said it would invest up to $100 billion progressively as each gigawatt is deployed, with the first NVIDIA‑powered gigawatt targeted for the second half of 2026 (investor.nvidia.com).

Cerebras has published benchmark and partner results showing inference at thousands of tokens per second on OpenAI’s gpt‑oss‑120B (reports cite ~3,000 t/s and sub‑second time‑to‑first‑token). Cerebras says this performance profile enables real‑time agentic and long‑context workloads that are hard to run on GPU clusters (cerebras.ai).
