Vera CPU & Rubin
NVIDIA unveiled the Vera CPU and the Vera Rubin architecture, aimed at agentic AI workloads and positioned to accelerate on-device and low-latency agent execution (x.com) (tomshardware.com). GTC recaps also named LPUs (for faster responses) and Dynamo 1.0 as parts of the stack that NVIDIA says will improve efficiency for AI at scale (x.com) (tomshardware.com).
NVIDIA's new Vera CPU is an 88-core design, and the company showed a Vera CPU rack that aggregates 256 liquid-cooled Vera CPUs, a configuration NVIDIA says delivers up to a 6× gain in CPU throughput for CPU-heavy workloads. (tomshardware.com)
The Vera Rubin platform announcement lists seven new chips now in production and a rack-scale architecture of five rack types: Vera Rubin NVL72 GPU racks, Vera CPU racks, Groq 3 LPX inference racks, BlueField-4 STX storage racks and Spectrum-6 SPX networking racks. (investor.nvidia.com)
NVIDIA presents the Groq 3 LPX as a rack-scale, low-latency inference accelerator with 256 LPUs per rack, co-designed with Rubin systems. The company claims up to 35× higher inference throughput per megawatt and a 10× greater revenue opportunity for trillion-parameter models. (developer.nvidia.com)
Dynamo 1.0 was released as an open-source "inference operating system" aimed at datacenter-scale generative and agentic workloads. NVIDIA says Dynamo integrates with TensorRT-LLM and is available to developers now. (nvidianews.nvidia.com)
Onstage at GTC, CEO Jensen Huang projected that demand for Rubin and related systems would be large enough to drive roughly $1 trillion in orders for NVIDIA products through 2027. (cnbc.com)