NVIDIA ships Nemotron & NemoClaw
NVIDIA announced Nemotron 3 Super (an open MoE model) and an agent platform called NemoClaw, a clear hardware+software push to capture enterprise agent workloads and build a software moat on top of its silicon.
Nemotron 3 Super is a 120-billion-parameter hybrid Mixture-of-Experts model that activates 12B parameters at inference, as described in NVIDIA’s technical blog. developer.nvidia.com huggingface.co
NVIDIA reports that Nemotron 3 Super delivers over 5× higher throughput and a native 1,000,000-token context window, enabled by LatentMoE, Multi-Token Prediction (MTP) layers, and a hybrid Mamba-Transformer backbone. developer.nvidia.com arxiv.org
The model was pretrained in NVIDIA’s NVFP4 format, and NVIDIA reports inference roughly 4× faster on a Blackwell B200 than with FP8 on an H100. According to NVIDIA’s release notes, RL fine-tuning ran across more than 1.2 million environment rollouts. developer.nvidia.com
NVIDIA has published open weights, training recipes and a Hugging Face model page for Nemotron 3 Super (the model tag includes NVFP4, and the page shows dataset and pretraining-cutoff metadata), enabling on-prem customization and fast starts for enterprise pipelines. huggingface.co
Reporting indicates NVIDIA is developing an open-source agent orchestration platform codenamed “NemoClaw,” and that company executives have pitched the project to enterprise software vendors including Salesforce, Google, Adobe, Cisco and CrowdStrike ahead of GTC 2026. cnbc.com dataconomy.com
Coverage describes NemoClaw as intended to be open-source and hardware-agnostic while integrating with NVIDIA’s NeMo stack and NIM inference microservices (the NeMo docs detail unified endpoints, deployment via Helm/Kubernetes, and NIM proxy patterns). nemoclaw.bot docs.nvidia.com
The NeMo microservices docs also describe infrastructure features relevant to production agent fleets: NeMo Studio for visual workflow management, a NeMo Data Store API compatible with Hugging Face, and NIM deployment configuration options (including env vars to auto-pull PEFT/LoRA adapters) useful for observability and reproducible rollouts. docs.nvidia.com docs.nvidia.com
NVIDIA’s timing follows a wave of high-profile agent security incidents and a MITRE ATLAS investigation into OpenClaw, and published coverage says NemoClaw will include enterprise privacy and security controls as part of its pitch to partners. mitre.org dataconomy.com
Combining Nemotron 3 Super’s claimed >5× throughput with NeMo/NIM deployment patterns suggests a concrete path to lower per-agent inference cost for always-on, long-context agents (an inference drawn from NVIDIA’s throughput claims and the NeMo deployment docs). developer.nvidia.com docs.nvidia.com
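To make the cost reasoning above concrete, here is a rough back-of-envelope sketch in Python. Only the 120B/12B parameter counts and the ">5×" throughput claim come from the reporting above. The GPU hourly price and the baseline tokens-per-second figure are hypothetical placeholders chosen for illustration, and the sketch assumes hardware cost per hour stays fixed while throughput rises.

```python
# Back-of-envelope arithmetic on the figures quoted in this brief.
# Values marked HYPOTHETICAL are illustrative placeholders, not NVIDIA data.

TOTAL_PARAMS_B = 120.0    # total parameters in billions (from the brief)
ACTIVE_PARAMS_B = 12.0    # parameters active at inference, in billions (from the brief)
THROUGHPUT_GAIN = 5.0     # lower bound of the ">5x" throughput claim

GPU_HOUR_USD = 4.0                # HYPOTHETICAL hourly cost of one GPU
BASELINE_TOKENS_PER_SEC = 1000.0  # HYPOTHETICAL baseline throughput


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the model's parameters that are active at inference."""
    return active_b / total_b


def cost_per_million_tokens(gpu_hour_usd: float, tokens_per_sec: float) -> float:
    """USD per one million generated tokens at a steady throughput."""
    tokens_per_hour = tokens_per_sec * 3600
    return gpu_hour_usd / tokens_per_hour * 1_000_000


fraction = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
baseline_cost = cost_per_million_tokens(GPU_HOUR_USD, BASELINE_TOKENS_PER_SEC)
improved_cost = cost_per_million_tokens(
    GPU_HOUR_USD, BASELINE_TOKENS_PER_SEC * THROUGHPUT_GAIN
)
cost_ratio = improved_cost / baseline_cost

print(f"Active parameter fraction: {fraction:.0%}")
print(f"Baseline cost per 1M tokens: ${baseline_cost:.2f}")
print(f"Cost per 1M tokens at {THROUGHPUT_GAIN:.0f}x throughput: ${improved_cost:.2f}")
print(f"Cost ratio (improved / baseline): {cost_ratio:.2f}")
```

Under these assumptions the cost ratio is simply the inverse of the throughput gain, so the placeholder price and baseline cancel out. Real per-agent cost would also depend on utilization, batching, context length and hardware pricing, none of which are modeled here.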