NVIDIA's Next-Gen AI Chip Taps Samsung & SK Hynix
NVIDIA's upcoming Vera Rubin AI accelerator will exclusively use HBM4 memory from Samsung and SK Hynix, excluding Micron from the initial launch. The new hardware, with 576GB of memory capacity, is set to be unveiled at the GTC 2026 conference in Silicon Valley this month.
The move to High Bandwidth Memory 4 (HBM4) is driven by a critical bottleneck in AI: memory bandwidth. As GPU cores become more powerful, they need faster access to data to avoid sitting idle. HBM4 provides a significant leap, with the official JEDEC specification doubling the interface width to 2,048 bits, enabling bandwidth of up to 2 terabytes per second per stack.
NVIDIA's Vera Rubin platform is more than just a GPU; it's a full system architecture. The flagship NVL72 rack configuration combines 72 Rubin GPUs with 36 new "Vera" CPUs, which are custom Arm-based cores designed for data movement and agentic reasoning. The entire system is interconnected with NVLink 6, providing 260 TB/s of scale-up bandwidth.
This supplier decision highlights the intense competition in the advanced memory market. SK Hynix has been a dominant HBM supplier for NVIDIA, a position that helped SK Hynix surpass Samsung as the top memory maker by revenue for the first time in Q2 2025. However, Samsung was the first to announce the start of HBM4 mass production in February 2026, positioning itself to secure a major share of the Vera Rubin orders.
While Micron was a key HBM supplier for NVIDIA's Hopper and Blackwell GPUs, it appears to have been left out of the initial top-tier Vera Rubin accelerator. Industry sources suggest Micron's HBM4 will likely be used in midrange products within the broader Rubin family, rather than the flagship AI accelerator.
The Vera Rubin platform, built on a 3nm process, is expected to offer up to 5 times the AI inference performance of the Blackwell generation it succeeds. NVIDIA is claiming the new architecture can deliver up to a 10x reduction in the cost per million tokens for inference tasks, a critical metric for any startup building LLM-based products.
The GTC 2026 conference, scheduled for March 16-19 in San Jose, serves as NVIDIA's primary stage for major data center and AI announcements.
The formal unveiling of the Vera Rubin architecture is in keeping with NVIDIA's aggressive new one-year release cadence, a strategy designed to outpace competitors in the rapidly evolving AI hardware landscape.
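
For readers who want to sanity-check the bandwidth figures cited above, the arithmetic can be sketched in a few lines of Python. This is a rough back-of-the-envelope calculation, not an official breakdown: the 8 Gb/s per-pin data rate is an assumption that does not appear in this article, the function name is hypothetical, and the per-GPU NVLink number is a simple even split of the 260 TB/s rack total across 72 GPUs.

```python
def hbm_stack_bandwidth_gbs(interface_bits: int, pin_rate_gbps: float) -> float:
    """Peak bandwidth of one HBM stack in GB/s (bits per second / 8)."""
    return interface_bits * pin_rate_gbps / 8


# Interface width cited in the article for HBM4.
HBM4_INTERFACE_BITS = 2048
# ASSUMPTION: per-pin data rate in Gb/s; not stated in the article.
ASSUMED_PIN_RATE_GBPS = 8.0

per_stack_gbs = hbm_stack_bandwidth_gbs(HBM4_INTERFACE_BITS, ASSUMED_PIN_RATE_GBPS)

# Rack-level NVLink 6 figure and GPU count cited in the article.
NVLINK_RACK_TBS = 260.0
GPUS_PER_RACK = 72
# Naive even split per GPU; an illustration, not an official per-GPU spec.
nvlink_per_gpu_tbs = NVLINK_RACK_TBS / GPUS_PER_RACK

print(f"HBM4 per stack: {per_stack_gbs:.0f} GB/s (about {per_stack_gbs / 1000:.1f} TB/s)")
print(f"NVLink share per GPU: {nvlink_per_gpu_tbs:.2f} TB/s")
```

Under that assumed pin rate, a 2,048-bit interface lands at roughly the 2 TB/s per stack quoted above; a different pin rate would scale the result proportionally.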