Nvidia's Blackwell GPUs Dominate LLM Benchmarks

New benchmarks show that Nvidia's latest Blackwell-series GPUs, including the B200, significantly outperform previous generations in LLM inference tasks. The new chips lead in both throughput and energy efficiency, which could change the economics for financial firms building in-house model-serving capabilities. Separately, Nvidia insiders have reportedly sold over $100 million in company stock so far in 2026.

- The Blackwell architecture is built on a custom TSMC 4NP process and features a dual-die design, connecting two chips with a 10 TB/s interconnect to function as a single, unified GPU with 208 billion transistors.
- The DGX B200 system, which incorporates eight Blackwell GPUs, delivers up to 3 times the training performance and 15 times the inference performance compared to the previous generation DGX H100 system.
- Blackwell GPUs feature fifth-generation NVLink, which provides 1.8 TB/s of GPU-to-GPU bandwidth, doubling the capability of the NVLink 4.0 used in the Hopper generation.
- The GB200 Grace Blackwell Superchip combines a Grace CPU with two Blackwell B200 GPUs, and can be scaled in a liquid-cooled, rack-scale system called the NVL72, which connects 72 Blackwell GPUs to act as a single massive GPU.
- Despite a higher thermal design power (TDP) of 1000W compared to the H100's 700W, the Blackwell architecture can reduce energy consumption by up to 25 times for certain LLM inference workloads.
- The insider stock sales since the start of 2026 were primarily conducted by three executives: Jay Puri (EVP, Worldwide Field Operations) sold over $73 million, Colette Kress (EVP and CFO) sold over $17 million, and Donald Robertson (Principal Accounting Officer) sold $15.2 million.
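The TDP and energy figures above can look contradictory at first, so here is a rough back-of-the-envelope sketch of how they fit together. It assumes, as a simplification, that each GPU draws exactly its TDP and that energy per token is simply power divided by throughput; it ignores CPUs, cooling, networking, and numeric precision differences. The function names are illustrative, and the 100 tokens/s baseline in the test is a made-up placeholder, not a measured result.

```python
# Illustrative arithmetic only: assumes power draw equals TDP (a simplification).

def energy_per_token_joules(power_watts: float, tokens_per_second: float) -> float:
    """Energy per generated token in joules: power (W) / throughput (tokens/s)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return power_watts / tokens_per_second


def required_throughput_ratio(energy_reduction: float,
                              new_power_watts: float,
                              old_power_watts: float) -> float:
    """Throughput multiple the new chip needs to hit a given energy-per-token reduction."""
    return energy_reduction * new_power_watts / old_power_watts


H100_TDP_W = 700.0    # from the bullet above
B200_TDP_W = 1000.0   # from the bullet above

# Throughput gain implied by "up to 25 times" lower energy at a higher TDP.
ratio = required_throughput_ratio(25.0, B200_TDP_W, H100_TDP_W)
print(f"Implied throughput gain: {ratio:.1f}x")
```

Under these assumptions, a 25-fold cut in energy per token at roughly 1.43 times the power implies a throughput gain of about 36 times, which is why the claim applies only to certain workloads rather than across the board.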
