Nvidia squeeze and Rubin delays

Demand for Nvidia’s Blackwell GPUs remains “off the charts” while analysts warn that Rubin GPU rollouts may be delayed, which could slow the next phase of AI infrastructure upgrades. (ibtimes.com.au) That tight, lumpy supply profile means builders should expect constrained, expensive compute for agentic and inference-heavy products in the near term. (networkworld.com)

Nvidia’s problem is not weak demand. It is that the hottest artificial intelligence chips are arriving in waves, with one generation still sold out while the next one may slip. (nvidianews.nvidia.com) (networkworld.com)

Blackwell is Nvidia’s current family of data-center chips, and Jensen Huang said on November 19, 2025 that “Blackwell sales are off the charts” while cloud graphics processing units were sold out. Nvidia reported $57.0 billion in quarterly revenue that day, with data center revenue at $51.2 billion. (nvidianews.nvidia.com)

Rubin is the next family after Blackwell, and Nvidia said on January 5, 2026 that Rubin is built to cut inference token cost by as much as 10 times versus Blackwell. Nvidia also said Rubin could train mixture-of-experts models with one-quarter as many graphics processing units as Blackwell. (nvidianews.nvidia.com)

Inference is the part where a model answers users instead of learning from data, and that is now the expensive part for chatbots, coding tools, and software agents that stay online all day. Nvidia’s January 2026 Rubin launch explicitly tied the new system to “agentic AI reasoning” and lower token costs. (nvidianews.nvidia.com) That is why a Rubin delay matters more than an ordinary product slip. If the cheaper-per-answer machine shows up late, companies keep buying time on the older machine that is already scarce. (networkworld.com) (nvidianews.nvidia.com)

TrendForce said on April 8, 2026 that Blackwell’s share of Nvidia’s high-end graphics processing unit shipments in 2026 is now expected to rise to 71%, up from an earlier 61% forecast. In the same update, Rubin’s share fell to 22% from 29%. (trendforce.com)

The bottleneck is not one missing part. TrendForce said Rubin faces four separate hurdles at once: validating new high-bandwidth memory called HBM4, moving network links from ConnectX-8 to ConnectX-9, handling higher power draw, and tuning more advanced liquid cooling. (trendforce.com)

High-bandwidth memory is the stack of ultra-fast memory chips sitting next to the processor, like putting ingredients on the chef’s cutting board instead of in a pantry across the room. Nvidia’s own 2025 roadmap slide said Blackwell Ultra uses HBM3e in the second half of 2025, while Vera Rubin moves to HBM4 in the second half of 2026. (s201.q4cdn.com)

The power jump is large enough to change the building around the chip. Nvidia’s March 2025 keynote slide showed a Vera Rubin NVL144 rack system with 144 Rubin graphics processing units, 144 NVLink switches, and 576 ConnectX-9 network cards, compared with a Blackwell NVL72 system built around 72 graphics processing units. (s201.q4cdn.com) When systems get denser like that, hyperscalers usually get first access because they can redesign power, cooling, and networking fastest. Network World reported on April 9, 2026 that enterprises often wait another 6 to 12 months to see those chips appear through cloud services and application programming interfaces. (networkworld.com)

So the near-term picture is awkward but clear. Nvidia is still selling everything it can make, Blackwell is likely to dominate 2026 shipments, and anyone building inference-heavy products should expect expensive, reserved, and uneven compute until Rubin moves from roadmap slide to volume racks. (nvidianews.nvidia.com) (trendforce.com) (networkworld.com)
