NVIDIA’s Vera Rubin push
NVIDIA unveiled a seven-chip Vera Rubin platform built around the Vera CPU, the Rubin GPU and a Groq LPU, with NVLink 6. The company claimed roughly 10x the performance per watt of Grace Blackwell and 5x inference gains, and cited roughly $1 trillion in AI chip orders through 2027. Jensen Huang framed GTC around model scale, perf/watt and enterprise APIs, while sessions also pushed Nemotron 3 for secure, multi-modal enterprise LLMs and NVQLink for early quantum-classical hybrids.
The flagship NVL72 rack configuration ties 72 Rubin GPUs to 36 Vera CPUs in a non-blocking NVLink 6 fabric that NVIDIA describes as delivering roughly 260 TB/s of rack-scale switching bandwidth and 3.6 TB/s per-GPU switch capacity. (developer.nvidia.com) Rubin GPU die-level details released by NVIDIA show 288 GB of HBM4 per GPU and about 22 TB/s of HBM bandwidth, with company slides calling out multi-PFLOPS NVFP4 peak numbers for inference and training workloads. (videocardz.com)
The Groq integration stems from a late-2025, roughly $20 billion non-exclusive licensing and talent agreement that brought Groq leadership into NVIDIA and produced the Groq 3 LPX rack, a liquid-cooled inference appliance built from 256 Groq 3 LPUs with very large on-chip SRAM and multi-hundred-TB/s rack interconnects. (groq.com)
NVIDIA’s Nemotron 3 family was published as a three-tier release (Nano, Super, Ultra) with technical papers describing Mixture-of-Experts hybrids, support for context windows up to 1 million tokens, and optimizations (including NVFP4) aimed at agentic, multi-modal enterprise deployments. (arxiv.org) NVQLink, NVIDIA’s low-latency quantum-classical interconnect, is being adopted by national labs and quantum vendors for microsecond-scale control loops, with PNNL, Atom Computing and others demonstrating integrations that use CUDA-Q APIs for hybrid workflows. (nvidia.com)
NVIDIA’s launch materials and partner disclosures list major model developers and cloud providers as early Vera Rubin customers, and several cloud vendors, including Microsoft, have begun validating the NVL72 rack for production AI workloads. (venturebeat.com) The Groq licensing deal and its rapid product rollout have drawn regulatory and congressional attention, with U.S. senators querying the structure and scale of the transaction even as NVIDIA positions the technology across its rack portfolio. (bloomberg.com)
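As a rough sanity check on the NVL72 figures quoted above, the per-GPU numbers can be multiplied out to rack level. The sketch below is back-of-envelope arithmetic only. It assumes the rack totals are a plain sum of the per-GPU values and uses decimal units (1 TB = 1000 GB). The function name `rack_totals` is hypothetical, and the inputs are NVIDIA's stated claims rather than measurements.

```python
# Per-GPU figures as quoted in the text (vendor claims, not measurements).
GPUS_PER_RACK = 72
NVLINK_TBPS_PER_GPU = 3.6   # TB/s of NVLink 6 switch capacity per GPU
HBM_GB_PER_GPU = 288        # GB of HBM4 per GPU
HBM_TBPS_PER_GPU = 22       # TB/s of HBM bandwidth per GPU (approximate)


def rack_totals(gpus: int = GPUS_PER_RACK) -> dict:
    """Sum per-GPU figures across a rack (assumes simple linear scaling)."""
    return {
        # Aggregate NVLink switching bandwidth across all GPUs, in TB/s.
        "nvlink_tbps": round(gpus * NVLINK_TBPS_PER_GPU, 1),
        # Total HBM4 capacity in TB, assuming decimal units.
        "hbm_tb": round(gpus * HBM_GB_PER_GPU / 1000, 3),
        # Aggregate HBM bandwidth in TB/s.
        "hbm_tbps": gpus * HBM_TBPS_PER_GPU,
    }


if __name__ == "__main__":
    for name, value in rack_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions, 72 GPUs at 3.6 TB/s each come to 259.2 TB/s, which is consistent with the "roughly 260 TB/s" rack-scale figure. The same multiplication gives about 20.7 TB of HBM4 and about 1,584 TB/s of aggregate HBM bandwidth per rack. Neither of those two rack totals is stated in the sources cited here.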