New 'CPU Yellow Pages' Ranks AI Datacenter Chips
An industry analysis provides a comparative ranking of 14 CPUs for AI datacenters, including chips from Nvidia, AMD, Intel, and hyperscalers like AWS and Microsoft, across nine metrics for reasoning workloads. The report, dubbed the "AI Datacenter CPU Yellow Pages," argues that CPUs are not commodities in the agentic AI era, with orchestration and memory bandwidth becoming decisive performance factors. A social media summary of the report is also available.
- The "Yellow Pages" report focuses on CPU performance in reasoning workloads, where the CPU orchestrates tasks for the GPU, making per-core performance and CPU-to-GPU interconnect bandwidth critical. For instance, Nvidia's Grace Hopper superchip links its CPU and GPU over a 900 GB/s NVLink-C2C interconnect, which offers 7 times the bandwidth of PCIe Gen5 and significantly speeds up workloads that require large memory.
- AMD's Instinct MI300X accelerator is a key competitor, emphasizing high memory capacity and bandwidth with 192 GB of HBM3 memory and 5.3 TB/s of bandwidth, which is advantageous for large language models. However, while the MI300X has a higher theoretical compute capacity, studies have shown it achieves 37-66% of the Nvidia H100/H200's performance in real-world LLM inference.
- Intel's Gaudi 3 is positioned as a strong competitor on a price-performance basis, with some tests showing it outperforming the Nvidia H100 in specific inference scenarios and offering up to a 1.6x performance-per-dollar advantage over the H200. Gaudi 3 features 128 GB of HBM memory and a bidirectional network bandwidth of 1,200 GB/s.
- Hyperscalers are increasingly designing their own custom silicon to optimize for their specific workloads and reduce reliance on third-party chipmakers. This "build vs. buy" trend includes chips like AWS's Trainium and Microsoft's Maia.
- Microsoft's Azure Maia 100 is an AI accelerator built on a 5-nanometer process with 105 billion transistors, designed for large language model training and inference. Microsoft is taking a vertically integrated approach, co-designing the chip, custom server boards, racks, and cooling systems to maximize efficiency for its AI workloads.
- AWS's Trainium chips are purpose-built for high-performance deep learning training, with the goal of offering up to 50% cost-to-train savings over comparable EC2 instances.
  Trn1 instances, which are powered by first-generation Trainium, can be scaled up to 30,000 Trainium chips, providing 6 exaflops of compute power.
- The push for more powerful AI data centers is fueling a record investment surge, with global spending reaching $61 billion in 2025. This investment is driven by the need for infrastructure that can handle the massive electrical and cooling requirements of high-density AI accelerators.
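
Two of the figures quoted in the list above can be sanity-checked with simple arithmetic. The Python sketch below is only a back-of-envelope illustration, not part of the report: the PCIe Gen5 x16 figure of about 128 GB/s bidirectional is an assumption that does not appear in the article, and the helper names `speedup` and `per_chip_teraflops` are hypothetical.

```python
# Back-of-envelope checks on figures quoted in the article.

NVLINK_C2C_GB_PER_S = 900       # from the article
# Assumption (not from the article): a PCIe Gen5 x16 link moves roughly
# 64 GB/s in each direction, i.e. about 128 GB/s bidirectional.
PCIE_GEN5_X16_GB_PER_S = 128

TRN1_CLUSTER_EXAFLOPS = 6       # from the article
TRN1_CLUSTER_CHIPS = 30_000     # from the article


def speedup(fast_gb_per_s: float, slow_gb_per_s: float) -> float:
    """Return how many times more bandwidth the faster link offers."""
    return fast_gb_per_s / slow_gb_per_s


def per_chip_teraflops(cluster_exaflops: float, chips: int) -> float:
    """Spread a cluster's aggregate compute evenly across its chips."""
    teraflops_per_exaflop = 1_000_000  # 1 exaflop = 10^6 teraflops
    return cluster_exaflops * teraflops_per_exaflop / chips


if __name__ == "__main__":
    ratio = speedup(NVLINK_C2C_GB_PER_S, PCIE_GEN5_X16_GB_PER_S)
    print(f"NVLink-C2C vs PCIe Gen5 x16: {ratio:.2f}x")  # 7.03x

    tflops = per_chip_teraflops(TRN1_CLUSTER_EXAFLOPS, TRN1_CLUSTER_CHIPS)
    print(f"Implied compute per Trainium chip: {tflops:.0f} TFLOPS")  # 200
```

Under that PCIe assumption, the ratio comes out at roughly 7, in line with the "7 times" claim, and the cluster figure implies about 200 teraflops per chip. The numeric precision behind the 6-exaflop figure is not stated in the article, so the per-chip number is only an aggregate average.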