CoreWeave becomes first AI cloud to deploy NVIDIA Vera Rubin NVL72

- CoreWeave said on June 1 it had completed the first AI-cloud bring-up and validation of NVIDIA’s Vera Rubin NVL72 for production use.
- The central claim is economic: CoreWeave and NVIDIA said Vera Rubin can cut cost per token by 10x versus Blackwell-class systems.
- NVIDIA has said Rubin-based partner systems will be available in the second half of 2026, including through CoreWeave.

CoreWeave said Monday it had completed the first AI-cloud bring-up and validation of NVIDIA’s Vera Rubin NVL72, making the rack-scale system operational on CoreWeave Cloud. The announcement matters less as a product-launch headline than as an infrastructure signal: the next fight in AI cloud is moving from who has GPUs to who can stand up the newest racks fastest and run them efficiently. CoreWeave said the system has passed full rack-scale validation, and NVIDIA has separately said Rubin-based products will ship through partners in the second half of 2026.

The hardware itself is a rack-scale package, not a single chip. NVIDIA’s Vera Rubin NVL72 combines 72 Rubin GPUs and 36 Vera CPUs in one system, and NVIDIA describes it as a platform built for large-scale reasoning and inference workloads. CoreWeave said its validation covered the full rack architecture, which is the harder part operationally because power, cooling, networking and system software all have to work together under load. (coreweave.com)

### What did CoreWeave actually announce?

CoreWeave’s June 1 statement said it had brought up Vera Rubin NVL72 on its cloud and completed system-level validation for the entire rack-scale architecture. That is more specific than saying it had ordered Rubin hardware or planned future availability; CoreWeave had already said in January that it expected to be among the first cloud providers to deploy Rubin in the second half of 2026. (nvidia.com)

Monday’s update was the company saying the system is now up and validated. NVIDIA had previewed that timeline in January when it said Rubin was in full production and that partners including CoreWeave would offer Rubin-based instances in 2026. That earlier roadmap matters because it shows Monday’s announcement was the execution step on a deployment plan already laid out by both companies. (coreweave.com)

### Why are the “10x” claims getting so much attention?

CoreWeave said Vera Rubin NVL72 delivers up to 10 times better inference per watt and one-tenth the cost per million tokens compared with NVIDIA Blackwell. NVIDIA used similar language in its January Rubin announcement, saying the platform could deliver up to 10 times lower cost per token than Blackwell. Those are vendor claims, but they go directly to the part of AI economics cloud buyers now watch most closely: how much useful inference they can buy for a fixed power and budget envelope. (investor.nvidia.com)

The comparison also suggests why rack design now matters as much as chip design. CoreWeave tied the Rubin deployment to its own operating software, cooling controls and observability tooling, arguing those layers are what let customers use a new rack at production scale rather than as a lab demo. (finance.yahoo.com)

### What is inside an NVL72 system?

NVIDIA says Vera Rubin NVL72 unifies 72 Rubin GPUs, 36 Vera CPUs, ConnectX-9 SuperNICs and BlueField-4 DPUs in one rack-scale system. CoreWeave said the rack uses a sixth-generation NVLink fabric with 260 terabytes per second of bandwidth, and industry coverage described the platform as fully liquid-cooled. (finance.yahoo.com)

Those details matter because AI clouds are no longer just renting out individual accelerators. They are selling tightly integrated systems whose performance depends on interconnects, thermals, orchestration software and uptime as much as on the GPU itself. That is an inference from the architecture both companies described. (nvidia.com)

### Who is this aimed at?

CoreWeave said Rubin is intended for customers building agentic AI, reasoning and large-scale inference workloads. Jane Street’s head of quantitative research, Craig Falls, said in the company release that his team was interested in the rack-scale efficiency gains translating into faster training runs and shorter iteration cycles. (nvidia.com)

NVIDIA, in its January announcement, linked Rubin to trillion-parameter and mixture-of-experts workloads and named CoreWeave among early cloud partners. That positions the system for AI labs, enterprises and startups that are already bottlenecked less by model availability than by serving cost, power draw and cluster efficiency. (investors.coreweave.com)

### What happens next?

NVIDIA said Rubin-based products will be available from partners in the second half of 2026, and CoreWeave has said it will add Rubin technology across its platform for customers running production AI workloads. On Monday, CoreWeave shares were trading sharply higher intraday, with Yahoo Finance showing the stock up more than 12% by early afternoon U.S. trading. (investor.nvidia.com)
