NVIDIA Pushes 'AI Factory' Strategy with New Model

NVIDIA is framing its strategy around turning data centers into "AI Factories" that generate intelligence, powered by its Blackwell and next-generation Rubin platforms. As part of this push, the company launched Nemotron-Terminal, a new model family optimized for agentic AI that plans and carries out complex, multi-step tasks from the command line.

The next-generation Rubin platform, named for astronomer Vera Rubin, will feature six co-designed chips, including the Vera CPU and Rubin GPU. Compared to the Blackwell platform, NVIDIA is targeting a 10x reduction in inference token cost and a 4x reduction in the number of GPUs needed to train Mixture-of-Experts (MoE) models. Major cloud providers, including AWS, Google Cloud, and Microsoft, are slated to deploy Rubin-based systems in 2026.

The current Blackwell architecture is built on a dual-die chip with 208 billion transistors, manufactured on a custom TSMC 4NP process. The flagship GB200 NVL72 system connects 72 Blackwell GPUs and 36 Grace CPUs into a single liquid-cooled, rack-scale unit, which NVIDIA says delivers up to 30x faster inference for trillion-parameter models.

NVIDIA's "AI Factory" strategy extends beyond chips to architecting the entire data center stack. The company provides blueprints for partners covering everything from high-density compute to low-latency networking with Spectrum-X and data processing units like the BlueField-4, which acts as the "OS for an AI factory." This full-stack approach is necessary to manage power and cooling for AI racks that can exceed 100 kW.

Nemotron-Terminal is not a general-purpose chatbot but a family of models (8B, 14B, and 32B parameters) fine-tuned from the Qwen3 model specifically for autonomous command-line interaction. It reads the raw text state of a terminal via the lightweight Terminus 2 framework and outputs structured JSON containing its analysis, a step-by-step plan, and the exact keystrokes to execute. This inverts the typical agent design by putting intelligence in the model's weights rather than in complex external wiring.

This focus on agentic AI is also central to NVIDIA's robotics efforts, highlighted by Project GR00T (Generalist Robot 00 Technology). GR00T is a foundation model designed to serve as a general-purpose "brain" for humanoid robots, enabling them to understand natural language and learn skills by observing human actions on video. To accelerate development, GR00T is supported by a suite of simulation tools. These include the NVIDIA Isaac platform for generating synthetic data and a new open-source physics engine, Newton, which is being co-developed with Google DeepMind and Disney Research to better simulate real-world interactions for robot learning.
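
To make the Nemotron-Terminal output format described above more concrete, here is a minimal Python sketch of how a harness might validate one structured JSON reply and pull out the keystrokes. The field names (`analysis`, `plan`, `commands`, `keystrokes`) and the `AgentStep` and `parse_step` helpers are illustrative assumptions, not the documented Terminus 2 schema or any NVIDIA API.

```python
import json
from dataclasses import dataclass


@dataclass
class AgentStep:
    """One model turn: its reading of the terminal, its plan, and what to type."""
    analysis: str
    plan: str
    keystrokes: list[str]


def parse_step(raw: str) -> AgentStep:
    """Parse one JSON reply; the field names here are assumed for illustration."""
    data = json.loads(raw)
    for key in ("analysis", "plan", "commands"):
        if key not in data:
            raise ValueError(f"missing field: {key}")
    # Each command is assumed to carry the literal keystrokes to send.
    keystrokes = [cmd["keystrokes"] for cmd in data["commands"]]
    return AgentStep(data["analysis"], data["plan"], keystrokes)


# A made-up reply, standing in for what the model might emit.
reply = '''{
  "analysis": "The working directory is empty.",
  "plan": "1. Create a file. 2. List the directory.",
  "commands": [
    {"keystrokes": "touch notes.txt\\n"},
    {"keystrokes": "ls\\n"}
  ]
}'''

step = parse_step(reply)
print(step.keystrokes)
```

In a design like this, the harness only validates the reply and forwards keystrokes to the terminal, while the analysis and planning come from the model itself, which is the inversion the article describes.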
