NVIDIA's inference pivot
NVIDIA is previewing a next-generation inference push at GTC: Vera Rubin is positioned to replace parts of the H200 line, with the aim of cutting cost and HBM dependence. In parallel, NVIDIA is pitching IGX Thor at real-time AI in autonomous robots and industrial automation, signaling a hardware push into edge robotics. (openpr.com) (x.com)
At GTC 2026 (March 16, 2026), Microsoft announced that it had powered on NVIDIA Vera Rubin NVL72 systems and made its Foundry Agent Service generally available. (blogs.microsoft.com) NVIDIA's NVL72 rack pairs 72 Rubin GPUs with 36 Vera CPUs and is billed at 3.6 exaFLOPS of NVFP4 inference compute, with rack-level NVLink scale-up bandwidth in the hundreds of terabytes per second. (videocardz.com)
The Rubin architecture combines HBM4 on the GPU with large pools of LPDDR5x and system-level memory pooling to lower token cost; NVIDIA slides and partners tout up to ~10x lower inference token cost versus prior Blackwell-based systems. (supermicro.com) Rubin GPUs introduce a third-generation Transformer Engine with hardware-accelerated adaptive compression and the NVFP4 numeric format to boost throughput for large-context, low-latency inference workloads. (storagereview.com)
By contrast, the H200 remains a high-capex option built around HBM3e: its public spec lists 141 GB of HBM3e, and market price guides in early 2026 put purchase prices at roughly $30k–$40k and cloud rental at about $3.70–$10.60 per GPU-hour. (nvidia.com)
IGX Thor, aimed at robotics and the industrial edge, combines a Blackwell iGPU with an optional dGPU, delivers up to 5,581 FP4 teraflops of AI compute with 400 GbE connectivity, and is offered as IGX T5000 modules and IGX T7000 board kits with a stated 10-year lifecycle. (blogs.nvidia.com) Announced early adopters and ecosystem moves include Diligent Robotics, EndoQuest, Hitachi Rail, Joby Aviation, Nexcom's humanoid dev kit (MARS400 T20), and Advantech integrations; the platform explicitly targets industrial and medical functional-safety certification paths (ISO 26262, IEC 61508, IEC 62304). (blogs.nvidia.com)
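The NVL72 and H200 figures quoted above lend themselves to a quick back-of-envelope check. The Python sketch below is only illustrative: it divides the rack-level NVFP4 number evenly across the 72 GPUs (not an official per-GPU spec) and estimates how many rental hours would equal the H200 purchase price, assuming no power, hosting, networking, or depreciation costs. The function and constant names are hypothetical.

```python
# Figures are the ones quoted in the text; nothing here is measured.
NVL72_GPUS = 72
NVL72_CPUS = 36
NVL72_NVFP4_EXAFLOPS = 3.6

H200_PRICE_USD = (30_000, 40_000)        # rough purchase range, early 2026
H200_RENTAL_USD_PER_HOUR = (3.7, 10.6)   # rough cloud rental range

HOURS_PER_YEAR = 24 * 365


def per_gpu_petaflops(rack_exaflops: float, gpus: int) -> float:
    """Rack-level exaFLOPS split evenly across GPUs, in petaFLOPS."""
    return rack_exaflops * 1000 / gpus


def break_even_hours(price_usd: float, rental_usd_per_hour: float) -> float:
    """Rental hours whose cost equals the purchase price.

    Assumption: ignores power, hosting, networking, and depreciation.
    """
    return price_usd / rental_usd_per_hour


if __name__ == "__main__":
    pf = per_gpu_petaflops(NVL72_NVFP4_EXAFLOPS, NVL72_GPUS)
    print(f"Implied NVFP4 per GPU: {pf:.1f} PFLOPS")
    print(f"GPUs per CPU in the rack: {NVL72_GPUS / NVL72_CPUS:.0f}")

    # Shortest break-even: cheapest purchase against priciest rental.
    fastest = break_even_hours(H200_PRICE_USD[0], H200_RENTAL_USD_PER_HOUR[1])
    # Longest break-even: priciest purchase against cheapest rental.
    slowest = break_even_hours(H200_PRICE_USD[1], H200_RENTAL_USD_PER_HOUR[0])
    print(f"H200 break-even: {fastest:,.0f} to {slowest:,.0f} rental hours")
    print(f"  i.e. {fastest / HOURS_PER_YEAR:.2f} to "
          f"{slowest / HOURS_PER_YEAR:.2f} years of continuous use")
```

Under these assumptions the rack figure works out to about 50 PFLOPS of NVFP4 per GPU, and the quoted H200 ranges imply a break-even somewhere between roughly 2,800 and 10,800 rental hours, a spread wide enough that the buy-versus-rent answer depends mostly on which end of each range applies.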