AI hardware widens

- Demand for AI compute is broadening beyond GPUs into CPUs, memory and orchestration layers.
- Morgan Stanley says “agentic AI” will widen chip spending beyond graphics processors to CPUs.
- That shift redistributes bottlenecks across the stack and changes procurement and data‑centre planning. (reuters.com)

Artificial intelligence spending is starting to spread beyond the graphics chips that powered the first boom, with Morgan Stanley saying newer “agentic” systems will pull in more central processors and memory. (reuters.com)

In a note published April 20, Morgan Stanley said agentic AI could add $32.5 billion to $60 billion to a data-center central processing unit market that it expects to exceed $100 billion by 2030. The bank said demand for graphics processing units remains strong. (reuters.com)

Agentic AI means software that does more than answer a prompt once: it plans steps, calls tools, and hands work to other models or services. OpenAI’s Agents documentation describes agents as systems that plan, call tools, collaborate, and keep state across multi-step tasks. (openai.com; developers.openai.com)

That changes the hardware mix inside a data center. Morgan Stanley said the bottleneck shifts toward CPUs and memory as AI moves from generating text or images to taking actions across multiple steps. (reuters.com)

The reason is simple: GPUs do the heavy math, but CPUs often act as the control layer that schedules jobs, moves data, manages networking, and runs the surrounding software. Intel said on April 8 that “GPU only inference architectures” are hitting limits for agentic workloads, and its new design with SambaNova uses Xeon 6 chips as host and action CPUs. (intel.com)

Memory also moves closer to the center of the spending plan. Micron said on June 10, 2025 that it had shipped 36-gigabyte 12-high high-bandwidth memory 4 samples to multiple customers, calling AI training and inference a major driver of demand for faster memory. (micron.com)

Chip companies have already been building systems around that wider stack. Nvidia’s GB200 NVL72 rack combines 72 Blackwell GPUs with Grace CPUs and an NVLink fabric that the company says delivers 130 terabytes per second of GPU-to-GPU bandwidth in one rack. (nvidia.com)

Advanced Micro Devices is making a similar pitch from the processor side. AMD says its EPYC server CPUs can handle small- to medium-size inference jobs on their own and serve as host processors for larger graphics-processor clusters, including multi-agent pipelines. (amd.com)

Cloud operators are adjusting around the same idea. Intel and Google said on April 9 they would deepen work on Xeon processors and custom infrastructure processing units for artificial intelligence and cloud workloads, with Google continuing to use Xeon across inference and general-purpose computing. (intel.com)

The immediate result is not a retreat from GPUs. It is a broader shopping list: more CPUs to coordinate work, more memory to keep models fed, and more networking and orchestration gear to keep those systems busy. (reuters.com; nvidia.com; micron.com)

