Vera Rubin: 10x cheaper AI

NVIDIA’s new Vera Rubin platform promises roughly 10x lower training costs on trillion‑parameter models, 3–4x the compute density of Blackwell and about 10x cheaper inference. Jensen Huang tied it to as much as $1 trillion in AI infrastructure demand. NVIDIA flagged CPUs as the next bottleneck for agentic AI, and partners like HPE are shipping full‑stack racks: HPE’s Cray GX5000 references Vera Rubin NVL72 plus Quantum‑X800 InfiniBand and Blackwell in a single system.

The NVL72 rack combines 72 Rubin GPUs with 36 Vera CPUs and an NVLink‑6 switch to create a single rack‑scale system for large models. ( nvidia.com ) Vendor breakdowns describe the Rubin GPU as packing 288 GB of HBM4 memory and roughly 50 PFLOPS of NVFP4 compute per GPU, which works out to about 3.6 exaflops of NVFP4 performance per NVL72 rack. ( hashrateindex.com )

Nvidia says the Vera Rubin launch is a seven‑chip ecosystem in full production that ties Rubin GPUs to Vera CPUs, NVLink‑6 switches, ConnectX‑9 SuperNICs, BlueField‑4 DPUs, Spectrum‑6 Ethernet and Groq 3 LPUs. ( nvidianews.nvidia.com ) Nvidia also brought external low‑latency inference accelerators into the platform, incorporating Groq 3 LPU decode acceleration in rack configurations aimed at agentic, real‑time workloads. ( storagereview.com )

HPE updated its Cray GX5000 lineup to offer NVL72 rack options and new Quantum‑X800 InfiniBand networking, and says its GX240 compute blades will include up to 16 Vera CPUs for high‑density installations. ( storagereview.com ) HPE materials and earlier vendor disclosures show that a GX5000 rack fully populated with GX240 blades can scale to hundreds of Vera CPUs (HPE cites configurations of up to 640 CPUs and tens of thousands of Arm‑compatible cores per rack). ( t.co / theregister.com )

Nvidia first previewed Rubin systems at CES in early January and expanded the platform at GTC on March 16, 2026, while product pages and company statements list NVL72 and rack SKUs as entering production, with shipments targeted for the second half of 2026. ( tomshardware.com / nvidianews.nvidia.com )
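The per‑rack figures above follow from simple multiplication of the quoted per‑GPU and per‑blade numbers. The short Python sketch below reproduces that arithmetic as a sanity check. The constants are the vendor‑reported figures cited in this briefing; the function name `rack_totals`, the aggregate HBM4 total and the implied blade count are illustrative derivations that assume every slot is fully populated, not numbers published by Nvidia or HPE.

```python
# Vendor-reported figures quoted in the briefing (not independently verified).
GPUS_PER_NVL72 = 72
CPUS_PER_NVL72 = 36
HBM4_GB_PER_GPU = 288
NVFP4_PFLOPS_PER_GPU = 50

# HPE GX5000 figures quoted in the briefing.
MAX_VERA_CPUS_PER_GX5000_RACK = 640
MAX_VERA_CPUS_PER_GX240_BLADE = 16


def rack_totals(gpus, cpus, hbm_gb_per_gpu, pflops_per_gpu):
    """Aggregate per-GPU figures to rack level (hypothetical helper)."""
    return {
        # 1 exaflop = 1,000 petaflops
        "nvfp4_exaflops": gpus * pflops_per_gpu / 1000,
        # Derived here, assuming decimal terabytes (1 TB = 1,000 GB)
        "hbm4_tb": gpus * hbm_gb_per_gpu / 1000,
        "gpus_per_cpu": gpus / cpus,
    }


nvl72 = rack_totals(
    GPUS_PER_NVL72, CPUS_PER_NVL72, HBM4_GB_PER_GPU, NVFP4_PFLOPS_PER_GPU
)

# Implied blade count, assuming every GX240 blade carries the full 16 CPUs.
implied_gx240_blades = (
    MAX_VERA_CPUS_PER_GX5000_RACK // MAX_VERA_CPUS_PER_GX240_BLADE
)

print(nvl72)
print("Implied GX240 blades per GX5000 rack:", implied_gx240_blades)
```

Under these assumptions the 3.6 exaflops figure is consistent with 72 GPUs at roughly 50 PFLOPS each, and the 640‑CPU configuration would correspond to 40 fully populated GX240 blades.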
