Microsoft validates Nvidia NVL72
Microsoft is the first cloud provider to validate Nvidia's Vera Rubin NVL72 in its cloud environment, a notable sign that high-performance AI silicon is reaching managed cloud stacks. That validation narrows the performance gap for compute-heavy workloads that used to run exclusively on-prem.
The NVL72 rack pairs 72 Rubin GPUs with 36 Vera CPUs. Nvidia specifies it to deliver about 3.6 exaFLOPS of NVFP4 inference and 2.5 exaFLOPS of training, while exposing roughly 20.7 TB of HBM4 and 54 TB of LPDDR5x at rack scale (videocardz.com).
Microsoft's Azure engineering blog identifies its Fairwater AI superfactory sites in Wisconsin and Atlanta as ready targets for Rubin-style racks, and cites prior large GB200/GB300 NVL72 rollouts as the operational precedent for those upgrades (azure.microsoft.com).
Nvidia's Rubin co-design combines an NVLink 6 switch fabric, ConnectX-9 SuperNICs and BlueField-4 DPUs into a rack-local pool, with an aggregate NVL72 scale-up bandwidth of about 260 TB/s quoted in Nvidia briefings (nvidianews.nvidia.com). Lambda's Supercluster notes that an NVL72 behaves like "one massive GPU" inside a single NVLink domain: KV caches and model state stay on rack HBM4, which enables model-parallel runs and more predictable, low-latency actor-learner and RL loops (lambda.ai).
Nvidia told partners that Rubin is in full production and named cloud partners and hyperscalers, including CoreWeave, Lambda, AWS, Google Cloud, Nebius and OCI, as early Rubin adopters, with partner availability slated through 2026 (investor.nvidia.com).
Both Nvidia and Microsoft highlight substantial datacenter upgrades for Rubin-class racks. Microsoft cites power, thermal and networking modernization at Fairwater sites, and Lambda specifies 100% direct-to-chip liquid cooling and BlueField-4 DPU offload for inter-rack scale-out (azure.microsoft.com).
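
To put the rack-level figures quoted above in per-device terms, the short Python sketch below divides them by the GPU and CPU counts. It is illustrative arithmetic only, not a vendor specification. It assumes the totals split evenly across devices, that 1 TB = 1000 GB, and that the LPDDR5x pool is attached to the Vera CPUs (the article does not say so). The function name `rack_breakdown` and the constant names are hypothetical.

```python
# Rack-level figures as quoted in the article (vendor-stated, approximate).
GPUS_PER_RACK = 72
CPUS_PER_RACK = 36
INFERENCE_EXAFLOPS = 3.6    # NVFP4 inference
TRAINING_EXAFLOPS = 2.5     # training
HBM4_TB = 20.7              # total HBM4 at rack scale
LPDDR5X_TB = 54.0           # total LPDDR5x at rack scale
SCALE_UP_TB_PER_S = 260.0   # aggregate NVLink scale-up bandwidth


def rack_breakdown():
    """Divide rack totals evenly across devices (an assumption, not a spec)."""
    return {
        # 1 exaFLOPS = 1000 petaFLOPS
        "inference_pflops_per_gpu": INFERENCE_EXAFLOPS * 1000 / GPUS_PER_RACK,
        "training_pflops_per_gpu": TRAINING_EXAFLOPS * 1000 / GPUS_PER_RACK,
        # Decimal units: 1 TB = 1000 GB
        "hbm4_gb_per_gpu": HBM4_TB * 1000 / GPUS_PER_RACK,
        # Assumes LPDDR5x is CPU-attached; the article only gives a rack total.
        "lpddr5x_gb_per_cpu": LPDDR5X_TB * 1000 / CPUS_PER_RACK,
        "scale_up_tb_per_s_per_gpu": SCALE_UP_TB_PER_S / GPUS_PER_RACK,
        "gpus_per_cpu": GPUS_PER_RACK / CPUS_PER_RACK,
    }


if __name__ == "__main__":
    for name, value in rack_breakdown().items():
        print(f"{name}: {value:.1f}")
```

Under those assumptions, each GPU gets about 50 petaFLOPS of NVFP4 inference, about 35 petaFLOPS of training, roughly 288 GB of HBM4 and about 3.6 TB/s of scale-up bandwidth, with two GPUs per CPU. These are derived estimates from rounded rack totals, so they may differ slightly from officially published per-GPU figures.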