NVIDIA ships Vera CPU with 88 cores

- NVIDIA said on May 18 it began delivering its Vera data-center CPU, an Arm chip with 88 custom Olympus cores, to major AI customers.
- NVIDIA said Vera has 1.2 TB/s of memory bandwidth and 50% faster per-core performance, with Anthropic, OpenAI, SpaceX and Oracle among recipients.
- NVIDIA’s Vera Rubin platform pages and blog posts now list Vera CPUs paired with Rubin GPUs for training and long-context inference.

NVIDIA said on May 18 that it had started delivering its Vera data-center CPU to early customers including Anthropic, OpenAI, SpaceX and Oracle. The company described Vera as an Arm-compatible chip built around 88 custom “Olympus” cores and positioned it as a CPU for AI systems that spend time on orchestration, tool use and data movement rather than only GPU math. NVIDIA disclosed the shipments in a company blog post and on product pages for its Vera and Vera Rubin platforms.

The timing matters because NVIDIA is not presenting Vera as a general-purpose server CPU first. NVIDIA’s own description says the chip is aimed at “agentic” AI workloads, reinforcement learning, long-context inference and data orchestration, where CPU work can become a bottleneck around GPU clusters. A Georgia Tech and Intel paper cited in the broader discussion found that tool processing on CPUs could account for 50% to 90.6% of total latency in some agentic AI workloads. (blogs.nvidia.com)

### What exactly did NVIDIA say it shipped?

NVIDIA’s May 18 blog post said Ian Buck, the company’s vice president of hyperscale and high-performance computing, hand-delivered the first Vera CPU systems to Anthropic, OpenAI, SpaceX and Oracle. The post framed Vera as “a new class of CPU” built for concurrent, real-time AI tasks such as tool calls, orchestration layers and retrieval operations. (blogs.nvidia.com)

NVIDIA’s product page says Vera includes 88 Olympus cores, full Armv9.2 compatibility and support for FP8 precision. The same page says the chip delivers 2x the performance of its predecessor, while the May 18 blog post said Vera provides 1.2 terabytes per second of memory bandwidth and 50% faster per-core performance. (blogs.nvidia.com)

### Why is NVIDIA talking about CPUs in an AI cycle dominated by GPUs?

The Georgia Tech-Intel paper focused on agentic AI execution rather than classic model training. The authors said heterogeneous CPU-GPU systems handle these workloads because many external tools that give agents their capabilities either run on the CPU or are orchestrated by it. In a summary of the paper’s findings, tool processing on CPUs was reported to consume as much as 90.6% of total latency in some cases. (nvidia.com)

NVIDIA’s language tracks that argument closely. The company said “every tool call,” “every orchestration layer” and “every long-context retrieval operation” is CPU work, and said traditional core-density designs were not built for that pattern.

### How does Vera fit with Rubin?

NVIDIA’s Vera Rubin platform materials show Vera paired directly with Rubin GPUs in superchips and rack-scale systems. (arxiv.org) The DGX Vera Rubin NVL72 page says the system combines Vera CPUs with Rubin GPUs to scale training and long-context inference within existing power envelopes. A January NVIDIA technical blog described Vera as one of the core chips in the Rubin platform and said the CPU uses 88 custom Olympus cores optimized for next-generation AI factories. (blogs.nvidia.com) That post placed Vera alongside Rubin GPUs, NVLink switching and networking components as part of one integrated platform. (nvidia.com)

### Which customers are attached to the first wave?

Anthropic, OpenAI, SpaceX and Oracle were named in NVIDIA’s May 18 delivery post. Earlier NVIDIA materials around Vera and Vera Rubin had also listed cloud providers, AI labs and infrastructure companies including Oracle Cloud Infrastructure, OpenAI and Anthropic among companies working with the systems. NVIDIA has not, in the materials reviewed, disclosed unit volumes, pricing or deployment dates for those first deliveries. (developer.nvidia.com)

The company’s live product pages now position Vera as part of the broader Vera Rubin rollout, with DGX Vera Rubin NVL72 and related rack-scale systems serving as the next visible milestone for customers building AI infrastructure around the new CPU-GPU pairing. (nvidia.com) (blogs.nvidia.com)
