NVIDIA pushes new inference chips at GTC

At GTC 2026, NVIDIA revealed inference-focused AI chipsets that incorporate Groq’s LPU technology as it races to blunt custom ASIC rivals. The company is also broadening support for automakers and robotics customers. (digitimes.com) (techbuzz.ai)

The Groq 3 LPU die ships with roughly 500 MB of on-chip SRAM and is billed at about 150 TB/s of internal memory bandwidth per chip. (tomshardware.com) NVIDIA’s LPX rack configuration packs 256 LPUs for a stated 128 GB of aggregate on-chip SRAM and a claimed 640 TB/s of scale-up bandwidth per rack. (nvidia.com) Samsung Foundry is the manufacturing partner for Groq 3 and produces the LPU on a 4 nm process; NVIDIA confirmed a mass-production ramp and penciled in first shipments for Q3 2026. (koreajoongangdaily.joins.com)

NVIDIA frames the LPUs as decode-phase co-processors inside the Vera Rubin NVL72 rack, pairing Rubin GPUs for training and reasoning with LPUs that handle low-latency, large-context inference workloads. (storagereview.com) Rubin GPUs tied to the Vera Rubin platform use HBM4 stacks delivering about 22 TB/s of bandwidth per GPU, and Samsung showcased its new HBM4/HBM4E memory at GTC to support that demand. (tomshardware.com) (news.samsung.com) NVIDIA’s own technical materials claim the LPX+Rubin combination can yield up to 35× higher inference throughput per megawatt and about 10× lower token cost versus prior Blackwell systems. (developer.nvidia.com)

GTC announcements expanded DRIVE Hyperion Level-4 partnerships to BYD, Geely, Isuzu and Nissan and outlined a full-stack robotaxi rollout with Uber across 28 cities by 2028, alongside new robotics integrations with ABB, FANUC and other industrial players. (investor.nvidia.com) (nvidianews.nvidia.com)

Separately, OpenAI has told investors it is tempering infrastructure commitments and has scaled back an ambitious NVIDIA agreement as it prepares for an IPO, a move market analysts cite as moderating near-term hyperscaler GPU demand. (cnbc.com)
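As a quick sanity check on the LPX rack figures quoted above, the short Python sketch below recomputes the aggregate SRAM from the per-chip number. It assumes decimal units (1 GB = 1000 MB), which is what makes the quoted figures agree. The per-LPU scale-up share is a derived illustration that assumes the rack bandwidth is spread evenly across chips; it is not a figure NVIDIA has published. The function names are hypothetical.

```python
# Figures as quoted in the briefing; these are vendor claims, not measurements.
SRAM_PER_LPU_MB = 500          # "roughly 500 MB" of on-chip SRAM per Groq 3 LPU
LPUS_PER_RACK = 256            # LPX rack configuration
SCALE_UP_TBPS_PER_RACK = 640   # claimed scale-up bandwidth per rack


def aggregate_sram_gb(sram_mb: float, chips: int, mb_per_gb: int = 1000) -> float:
    """Total on-chip SRAM across a rack, in GB (decimal units by default)."""
    return sram_mb * chips / mb_per_gb


def scale_up_share_tbps(rack_tbps: float, chips: int) -> float:
    """Even per-chip share of rack scale-up bandwidth (an assumption, not a spec)."""
    return rack_tbps / chips


if __name__ == "__main__":
    print(aggregate_sram_gb(SRAM_PER_LPU_MB, LPUS_PER_RACK))            # 128.0
    print(scale_up_share_tbps(SCALE_UP_TBPS_PER_RACK, LPUS_PER_RACK))   # 2.5
```

With binary units (1 GB = 1024 MB) the same inputs give 125 GB, so the stated 128 GB is consistent with "roughly 500 MB" per chip only under decimal units.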

