FourWeekMBA flags CPU shortage

- FourWeekMBA said on May 23 that AI infrastructure shortages are spreading from GPUs to CPUs, as orchestration, preprocessing and serving workloads consume conventional compute.
- Jensen Huang told investors on May 21 Nvidia is gaining share in inference “very, very quickly” and expects frontier AI companies to adopt Vera Rubin.
- Computex 2026 in Taipei opens this week, where Nvidia is expected to discuss Vera processors and broader inference infrastructure.

FourWeekMBA said on May 23 that the AI compute squeeze is widening beyond graphics processors to central processors, arguing that orchestration, preprocessing, storage and serving are now straining conventional infrastructure as much as accelerators. The analysis, published a day after Nvidia’s latest earnings commentary, pointed to CPU availability as a new pressure point in AI systems.

Yahoo Finance reported on May 24 that Nvidia Chief Executive Jensen Huang said the company is gaining share in inference “very, very quickly” and expects major frontier AI companies to adopt its Vera Rubin platform. Together, the two items describe a market in which AI bottlenecks are shifting from a single chip class to the rest of the stack.

### Where is the shortage showing up if not only on GPUs?

FourWeekMBA wrote that the shortage is now a “full-stack infrastructure story,” citing AMD’s push to accelerate production timelines as AI demand tightens the CPU market. The piece said the constraint is showing up in the non-model parts of AI systems, including orchestration, preprocessing and serving. Mixed video workloads are one example of that pattern. (fourweekmba.com)

In video systems, CPU time, memory bandwidth and input-output capacity are consumed by transcoding, container handling, metadata extraction and file movement before a model runs or after it finishes, according to the FourWeekMBA analysis.

### Why do video platforms run into CPU and I/O limits so quickly?

Video pipelines combine several compute-heavy steps that do not all belong on GPUs. (fourweekmba.com) FFmpeg transcodes, archive reprocessing, clip exports, metadata jobs and queue orchestration often run in parallel, which means the bottleneck can move to host CPUs, storage throughput or network transfer even when accelerator capacity is available, according to FourWeekMBA’s description of mixed workloads.

That matters because inference growth does not remove those surrounding tasks. Yahoo Finance reported that Huang said Nvidia’s inference business is expanding rapidly, a sign that more AI systems are moving into production rather than staying in training environments. More production inference means more data preparation, model serving and output handling around the accelerator layer. (fourweekmba.com)

### What did Jensen Huang actually say about the next phase?

Jensen Huang said in Nvidia’s earnings commentary last week that the company is growing share in inference “very, very quickly,” according to Yahoo Finance. He also said he expects every major frontier AI company to adopt the Vera Rubin platform.

Digitimes reported on May 23 that Huang, arriving in Taiwan ahead of Computex, described Vera Rubin as potentially the largest product rollout in Taiwan’s electronics industry and Nvidia’s most successful product generation. (finance.yahoo.com) That reporting adds to the picture of Nvidia pushing an inference-centered platform that extends beyond GPUs alone. (finance.yahoo.com)

### What does that change for operators of video infrastructure?

Queue design becomes more important when CPU-bound and GPU-bound jobs compete for the same system resources. FourWeekMBA said batch transcodes, archive reprocessing and non-urgent renders should be separated from latency-sensitive tasks such as transcript search, clip extraction and rapid exports.

FFmpeg stages, file movement and metadata services also become targets for optimization when the constraint sits in the data plane rather than in accelerator supply. (digitimes.com) FourWeekMBA said operators may get better returns from tuning those stages than from adding more GPU pools alone.

### What should readers watch next?

Computex 2026 opens in Taipei this week, with Nvidia expected to discuss Vera processors and its next inference platform. (fourweekmba.com) Digitimes reported Huang arrived in Taiwan on May 23 ahead of the event, and Yahoo Finance said investors are already hearing Nvidia frame Vera Rubin as the next major system in its AI roadmap. (digitimes.com)
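
As an illustration of the queue separation FourWeekMBA describes for video operators, the following is a minimal Python sketch and not a description of any platform's actual scheduler. The class name `TwoLaneScheduler`, the job-kind labels and the `batch_slots` cap are all hypothetical; the only idea taken from the analysis is that batch work (transcodes, archive reprocessing, non-urgent renders) is kept apart from latency-sensitive work (transcript search, clip extraction, rapid exports).

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical job-kind labels, modeled on the examples in the article.
LATENCY_SENSITIVE = {"transcript_search", "clip_extraction", "rapid_export"}
BATCH = {"batch_transcode", "archive_reprocess", "render"}


@dataclass
class Job:
    name: str
    kind: str


class TwoLaneScheduler:
    """Keeps latency-sensitive and batch jobs in separate queues.

    Assumption: workers are interchangeable "slots". batch_slots caps how
    many batch jobs may run at once, so CPU-heavy background work can
    never occupy every worker.
    """

    def __init__(self, total_slots: int, batch_slots: int):
        if not 0 <= batch_slots <= total_slots:
            raise ValueError("batch_slots must be between 0 and total_slots")
        self.total_slots = total_slots
        self.batch_slots = batch_slots
        self.interactive = deque()
        self.batch = deque()
        self.running_interactive = 0
        self.running_batch = 0

    def submit(self, job: Job) -> None:
        """Route a job to its lane based on its kind."""
        if job.kind in LATENCY_SENSITIVE:
            self.interactive.append(job)
        elif job.kind in BATCH:
            self.batch.append(job)
        else:
            raise ValueError(f"unknown job kind: {job.kind}")

    def next_job(self):
        """Return the next job to start, or None if nothing may start."""
        running = self.running_interactive + self.running_batch
        if running >= self.total_slots:
            return None
        # Latency-sensitive work always goes first.
        if self.interactive:
            self.running_interactive += 1
            return self.interactive.popleft()
        # Batch work starts only while it is under its cap.
        if self.batch and self.running_batch < self.batch_slots:
            self.running_batch += 1
            return self.batch.popleft()
        return None

    def finish(self, job: Job) -> None:
        """Release the slot held by a completed job."""
        if job.kind in LATENCY_SENSITIVE:
            self.running_interactive -= 1
        else:
            self.running_batch -= 1
```

With three slots and a batch cap of one, a clip extraction submitted after two batch jobs would still start first, and the second batch job would wait until the first one finishes. A real deployment would likely need preemption, per-stage resource accounting for CPU, I/O and GPU, and persistence, none of which this sketch attempts.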
