CPUs join the bottleneck
Agentic AI workloads are driving a sharp rise in demand for server CPUs, not just accelerators, making CPUs a new constraint for data‑centre builders and buyers. TrendForce and others report that orchestration‑heavy, real‑time inference systems are shifting the CPU:GPU balance, and Amazon reportedly tripled its CPU server fleet and still ran short as agentic AI consumed available processors (insights.trendforce.com).
The chip crunch in artificial intelligence has spread beyond graphics processors: data-center operators now say server central processing units are running short too. (amd.com)
Agentic artificial intelligence systems do more than generate text. Amazon Web Services says they are software agents that “reason, act, adapt, and collaborate,” which means more database calls, more scheduling, more memory traffic, and more control work outside the model itself. (aws.amazon.com) That extra work lands on central processing units, the general-purpose chips that move data, schedule tasks, manage input and output, and keep accelerators fed. Advanced Micro Devices said on March 13 that inference is becoming a multistep workflow and that central processing units handle scheduling, data preparation, memory and input-output, and control flow in modern clusters. (amd.com)
That is a change from the first wave of generative artificial intelligence infrastructure, when one central processing unit often supported several graphics processing units and most of the spending focus went to accelerators. Reporting this week, citing SemiAnalysis chief analyst Dylan Patel, said the ratio is moving much closer as reinforcement learning and agentic inference consume more host compute. (wccftech.com)
The result is that buyers building artificial intelligence capacity now have two balancing problems instead of one: enough accelerators to run models, and enough central processing units to orchestrate them. Intel said on April 8 that “GPU only inference architectures” are hitting limits, and pitched Xeon 6 processors as host and “action” central processing units in a mixed system with other chips. (intel.com)
Nvidia made the same argument from the other side of the market. At its March 2026 GTC conference, the company launched the Vera central processing unit and said reasoning and agentic artificial intelligence are making performance, scale, and cost depend more heavily on the infrastructure that plans tasks, runs tools, interacts with data, runs code, and validates results. (nvidianews.nvidia.com)
Amazon is the clearest sign of how fast demand has moved. A widely circulated April 14 report, again attributing the claim to Dylan Patel, said Amazon tripled its central processing unit server fleet and still ran short as agentic artificial intelligence workloads consumed available processors. (wccftech.com)
Some of the loudest shortage claims still rest on analyst commentary and secondary reports rather than detailed procurement data from cloud providers themselves. But the product roadmaps now line up with the same thesis: Advanced Micro Devices, Intel, Nvidia, and Arm are all talking about central processing units as a first-order part of artificial intelligence system design, not just support hardware. (amd.com)
That leaves data-center builders chasing a less glamorous part of the stack. In 2026, the machine that decides which task runs next is starting to look almost as scarce as the machine that writes the answer. (nvidianews.nvidia.com)