GTC: Open models + token math

At NVIDIA’s GTC, Jensen Huang pushed a big thesis: open models are now “the second-largest model category globally” and are on track to dominate many industries, pointing to a hybrid future of proprietary and open systems (youtube.com). Panelists also shifted the budgeting conversation to tokens. Huang argued that a $500K/year engineer should be spending at least $250K a year on inference tokens, signaling a move from capex GPU purchases to opex token economics (youtube.com). Infrastructure players underscored the scale: Supermicro showed deployments exceeding 100,000 liquid-cooled GPUs as AI factories scale out (youtube.com).

NVIDIA held GTC in San Jose on March 16–19, 2026, where executives and partners framed the conference around an industry pivot from training to industrial-scale, always-on inference systems (nvidia.com; Datacenter Frontier). NVIDIA announced the Nemotron Coalition to co-develop open frontier foundation models, with eight founding members: Black Forest Labs, Cursor, LangChain, Mistral AI, Perplexity, Reflection AI, Sarvam, and Thinking Machines Lab (GlobeNewswire; The New Stack). The coalition’s first base model is being co-developed by NVIDIA and Mistral AI and will be trained on NVIDIA DGX Cloud infrastructure, the two companies said (Mistral AI; GlobeNewswire).

On an episode of the All-In Podcast recorded at GTC on March 19, 2026, Jensen Huang said an engineer paid $500,000 a year should be consuming at least $250,000 worth of inference tokens and that he would be “deeply alarmed” otherwise (All-In Podcast; CNBC). Huang told investors and developers at GTC that demand tied to NVIDIA’s Blackwell and Vera Rubin systems could approach $1 trillion in orders through 2027, a figure company spokespeople cited as the market signal for inference capacity (CNBC).

Supermicro said it has deployed more than 100,000 NVIDIA GPUs with its direct liquid-cooling solutions for large AI “factories” and is shipping over 100,000 GPUs per quarter in its rack-scale liquid-cooled servers (Supermicro press release; TechPowerUp). Panel discussions and analyst coverage at GTC explicitly reframed IT budgeting around token-based operating expenses rather than one-time GPU capex, with multiple outlets dubbing the moment the “inference inflection point” for enterprise procurement (TechRepublic; SiliconANGLE).

Shared from Scout.