NVIDIA's next‑gen stack

NVIDIA updated its data‑center roadmap with Groq LPUs, stacked Feynman GPUs, the Vera Rubin CPU and optical NVLink, signaling a push toward specialized inference accelerators and ultra‑high‑bandwidth interconnects for AI workloads. Bloomberg's recent coverage frames this as part of a broader 'agentic AI' compute arms race and a radical increase in infrastructure scale. (tomshardware.com) (bloomberg.com)

NVIDIA’s Vera Rubin NVL72 rack, slated for 2026, ships the Vera CPU and Rubin GPU alongside five additional processors: Groq LP30, BlueField‑4 DPU, an NVLink‑6 scale‑up switch, Spectrum‑X with co‑packaged optics, and the ConnectX‑9 1600G SuperNIC. (tomshardware.com) NVIDIA explicitly added a seventh chip to the Rubin lineup, the Groq 3 LPX low‑latency inference accelerator, in a Vera Rubin platform blog update published March 16, 2026. (developer.nvidia.com)

NVIDIA’s integration of Groq tech follows a roughly $20 billion deal that moved Groq leadership into NVIDIA and coincided with public job listings for a new LPU team focused on packaging, optics, and system software. (datacenterdynamics.com) Korean outlets and industry reporting say Groq 3 will be produced at Samsung Foundry and is targeted to begin shipping in the second half of 2026. (trendforce.com)

Groq 3’s on‑die SRAM is reported at roughly 500 MB delivering about 150 TB/s of internal bandwidth, compared with Rubin GPU HBM4 figures cited around 288 GB per GPU and roughly 22 TB/s of bandwidth. (trendforce.com)

NVIDIA’s 2027 Rubin Ultra roadmap calls for accelerators with four compute chiplets and 1 TB of HBM4E per package, paired with Groq LP35 supporting the NVFP4 data format, and a Kyber NVL144 rack that packs 144 Rubin Ultra packages to deliver at least 4× the performance of Oberon NVL72. (tomshardware.com)

NVIDIA’s public materials frame the roadmap as a shift to rack‑scale co‑design where co‑packaged optics and NVLink scale‑up switches enable higher sustained token throughput for “agentic” multi‑agent workloads, and the company describes treating the data center, not a single server, as the unit of compute. (developer.nvidia.com)
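To put the reported Groq 3 SRAM and Rubin HBM4 figures side by side, the short Python sketch below works out the ratios they imply. It is back‑of‑envelope arithmetic only: the inputs are the approximate numbers quoted above, decimal units (1 GB = 10^9 bytes) are an assumption, and the helper name `sweep_time_s` is made up for illustration. It says nothing about real‑world performance.

```python
# Back-of-envelope comparison using the approximate figures quoted in the text.
# Assumption: decimal units (1 MB = 10**6 bytes, 1 GB = 10**9, 1 TB = 10**12).
MB, GB, TB = 10**6, 10**9, 10**12

groq3_sram_bytes = 500 * MB   # reported on-die SRAM capacity (approximate)
groq3_sram_bw = 150 * TB      # reported internal bandwidth, bytes per second
rubin_hbm4_bytes = 288 * GB   # cited HBM4 capacity per Rubin GPU (approximate)
rubin_hbm4_bw = 22 * TB       # cited HBM4 bandwidth, bytes per second


def sweep_time_s(capacity_bytes, bandwidth_bytes_per_s):
    """Hypothetical helper: time to read the whole memory once at the quoted bandwidth."""
    return capacity_bytes / bandwidth_bytes_per_s


bandwidth_ratio = groq3_sram_bw / rubin_hbm4_bw       # SRAM bandwidth vs HBM4
capacity_ratio = rubin_hbm4_bytes / groq3_sram_bytes  # HBM4 capacity vs SRAM

print(f"SRAM bandwidth is about {bandwidth_ratio:.1f}x the HBM4 figure")
print(f"HBM4 capacity is about {capacity_ratio:.0f}x the SRAM figure")
print(f"Full SRAM sweep: {sweep_time_s(groq3_sram_bytes, groq3_sram_bw) * 1e6:.2f} microseconds")
print(f"Full HBM4 sweep: {sweep_time_s(rubin_hbm4_bytes, rubin_hbm4_bw) * 1e3:.2f} milliseconds")
```

On these figures the trade‑off is roughly seven times the bandwidth in exchange for several hundred times less capacity, which is consistent with the low‑latency inference positioning described above.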

