Nvidia’s next-gen squeeze

Supply-chain snags risk delaying Nvidia’s Rubin GPUs and shifting demand back toward Blackwell accelerators, meaning next-generation capacity may arrive later than customers expect. Bottlenecks in HBM4 memory and advanced packaging, including possible CoWoS capacity limits, could keep systems scarce and expensive even if some chips enter mass production. (theregister.com, news.futunn.com, markets.financialcontent.com)

Nvidia’s newest artificial intelligence chip may be “in production” and still be hard to buy, because the bottleneck is no longer just making the chip itself. TrendForce said on April 8 that Nvidia’s 2026 shipment mix is shifting back toward Blackwell, with Rubin’s share cut to 22% from 29%. (dramexchange.com)

That sounds backwards until you look at how these systems are built. A modern artificial intelligence server is less like one processor on one board and more like a custom-built rack that has to arrive with the graphics processors, memory, networking cards, power delivery, and cooling all matching each other. (nvidia.com)

Blackwell is the current Nvidia family already shipping into those giant racks. Nvidia’s GB200 NVL72 system packs 72 Blackwell graphics processors into one liquid-cooled rack and links them with 130 terabytes per second of NVLink traffic, which is why buyers can order a whole system instead of waiting for a new design to stabilize. (nvidia.com)

Rubin is the next family after Blackwell, and it adds a new memory generation called High Bandwidth Memory 4. High Bandwidth Memory 4 is the stacked memory sitting right next to the processor like pantry shelves built against the stove, so the chip can grab data faster than it could from regular server memory. (financialcontent.com) TrendForce says that memory is one reason Rubin may slip, because High Bandwidth Memory 4 still needs validation. In plain terms, Nvidia and its suppliers have to prove that the memory stacks, the processor, and the full server behave correctly together before cloud customers will deploy thousands of them. (dramexchange.com)

The networking piece is changing too. TrendForce says Rubin systems are moving from ConnectX-8 to ConnectX-9, and Nvidia’s own ConnectX-9 documentation says the card supports up to 800 gigabits per second for Ethernet and InfiniBand links inside huge artificial intelligence clusters. (dramexchange.com) (docs.nvidia.com)

Power is another problem hiding inside the spec sheet. TrendForce says Rubin brings “significantly higher power consumption,” which means a data center cannot just swap old boxes for new ones if the electrical feeds, rack layout, and backup systems were sized for Blackwell-era gear. (dramexchange.com)

Cooling gets harder at the same time. TrendForce says Rubin needs more advanced liquid cooling, and Nvidia’s Blackwell rack already uses liquid cooling at rack scale, so the next step up is not a small tweak but a tighter plumbing and thermal design problem across the whole cabinet. (dramexchange.com) (nvidia.com)

Then there is packaging, which is the factory step that bolts the processor and stacked memory together into one finished module. The April 8 report cited possible limits in Chip-on-Wafer-on-Substrate capacity, and TrendForce reported in December that TSMC’s CoWoS-L and CoWoS-S lines were already described as fully booked while capacity expansion continued through 2026. (theregister.com) (trendforce.com)

That is why “mass production” does not automatically mean broad availability. One bullish April 8 market article said Rubin R100 entered mass production ahead of schedule, but TrendForce’s forecast says the systems built around it can still land later and in smaller volumes because memory, networking, cooling, and packaging all have to clear at once. (financialcontent.com) (dramexchange.com)

So the near-term winner may be the chip Nvidia is already trying to replace. TrendForce now expects Blackwell to take 71% of Nvidia’s high-end graphics processor shipments in 2026, up from 61%, because customers that cannot wait for Rubin will keep buying the more mature platform that can actually be assembled and delivered. (dramexchange.com)

The practical result is simple: the next Nvidia generation may arrive first on roadmaps, press releases, and a small number of early systems, while the bulk of real compute in 2026 still comes from Blackwell racks. In the artificial intelligence server market, the scarce part is often not the brain but the memory stacks, the network plumbing, and the factory slots needed to turn a pile of parts into one working machine. (dramexchange.com) (nvidia.com)
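
For readers who want to check the arithmetic behind the shipment-mix shift, here is a minimal sketch in Python. It uses only the TrendForce percentages quoted above. The variable names are illustrative, and the "other" line rests on an assumption not stated in the report: that the Rubin and Blackwell shares are measured against the same total, so whatever is left over belongs to every other product line.

```python
# TrendForce shares quoted in the article, in percent of Nvidia's 2026
# high-end GPU shipments: (earlier forecast, revised forecast).
forecast = {
    "Rubin": (29, 22),
    "Blackwell": (61, 71),
}

# Change in percentage points for each family.
changes = {name: after - before for name, (before, after) in forecast.items()}

# ASSUMPTION (not from the report): both shares use the same base, so the
# remainder is the combined share of all other product lines.
other_before = 100 - sum(before for before, _ in forecast.values())
other_after = 100 - sum(after for _, after in forecast.values())

for name, delta in changes.items():
    print(f"{name}: {delta:+d} percentage points")
print(f"Other (implied): {other_before}% -> {other_after}%")
```

Read this way, Blackwell's gain of 10 points is larger than Rubin's loss of 7, which would imply that the remaining product lines also gave up about 3 points of share, if the same-base assumption holds.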
