Nvidia unveils 30B open-weight model and Vera CPU

- Nvidia on April 28 released Nemotron 3 Nano Omni, an open multimodal 30B-A3B model, after introducing its Vera CPU for agentic AI at GTC.
- The 30B-A3B model handles video, audio, images and text in one system, while Nvidia said Vera delivers twice the efficiency and 50% faster results.
- Nemotron 3 Nano Omni is available on Hugging Face, OpenRouter and build.nvidia.com; Vera deployments include Alibaba, Meta and Oracle Cloud Infrastructure.

Nvidia’s latest AI push has two parts that fit together. On April 28, the company released Nemotron 3 Nano Omni, an open multimodal model built to process video, audio, images and text in one system. Six weeks earlier, on March 16 at GTC in San Jose, Nvidia launched Vera, a data-center CPU it said was designed for “agentic AI” and reinforcement learning. Together, the announcements show how Nvidia is packaging models, chips and rack systems for customers building AI services rather than selling only standalone accelerators.

### What exactly is the 30B model Nvidia released?

Nemotron 3 Nano Omni is a 30B-A3B hybrid mixture-of-experts model, according to Nvidia’s April 28 product materials. Nvidia said it is an open multimodal model with open weights, and in its technical write-up said developers also get open datasets and recipes. The model is built to take in text, images, audio, video, documents, charts and graphical interfaces, and return text output. (blogs.nvidia.com) Nvidia described it as a multimodal perception and context sub-agent for larger agent systems, replacing separate stacks for vision, audio and language.

### Why does Nvidia keep calling this an agent system component?

Nvidia’s own description is narrower than a general chatbot. (blogs.nvidia.com) The company said Nemotron 3 Nano Omni is meant to act as the “eyes and ears” inside agentic systems, working alongside models such as Nemotron 3 Super and Ultra or other proprietary models. In its technical blog, Nvidia said fragmented model chains increase inference hops and orchestration complexity, which raises cost and weakens cross-modal consistency. (blogs.nvidia.com) That is why the company is pitching one model that can reason across screens, documents, audio, video and text inside a single perception-to-action loop.

### Where does Vera fit into that picture?

Vera is Nvidia’s data-center CPU for the infrastructure around those models.
On March 16, Nvidia said Vera was “purpose-built for the age of agentic AI and reinforcement learning,” and said the chip delivers results with twice the efficiency and 50% faster than traditional rack-scale CPUs. (developer.nvidia.com) Nvidia said the CPU is meant to handle the supporting work around AI models: planning tasks, running tools, interacting with data, running code and validating results. On its product page, the company says Vera acts as the host CPU in accelerated systems, directing data movement, managing memory and orchestrating system control. (investor.nvidia.com)

### Is Nvidia selling chips, or whole AI systems now?

Nvidia’s March 16 Vera Rubin announcement answered that directly in product terms. The company said the platform combines the Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU, Spectrum-6 Ethernet switch and Groq 3 LPU, and said the parts are “designed to operate together as one incredible AI supercomputer.” (investor.nvidia.com) Jensen Huang described that package as “seven breakthrough chips, five racks, one giant supercomputer.” Nvidia also said AI infrastructure is moving from “discrete chips and standalone servers” to “fully integrated rack-scale systems, POD-scale deployments, AI factories and sovereign AI.” That language comes from Nvidia, but it is the clearest statement of how the company is framing the business. (investor.nvidia.com)

### Who is using the new hardware and where can developers get the model?

Nvidia said customers collaborating to deploy Vera include Alibaba, ByteDance, Meta and Oracle Cloud Infrastructure, with manufacturing partners including Dell Technologies, HPE, Lenovo and Supermicro. In a later company blog post, Nvidia said first Vera CPU systems had been delivered to Anthropic, OpenAI, Oracle Cloud Infrastructure and SpaceXAI.
(investor.nvidia.com)

Nemotron 3 Nano Omni has been available since April 28 through Hugging Face, OpenRouter, build.nvidia.com and more than 25 partner platforms, Nvidia said. The company also listed Aible, Foxconn and Palantir among adopters, with Dell Technologies, Docusign, Infosys, Oracle and others evaluating the model.

### What comes next from here?

March 16 remains the key date for Nvidia’s broader infrastructure roadmap because that is when the company said the Vera Rubin platform entered full production. (investor.nvidia.com) Nvidia said the platform is aimed at every phase of AI, from pretraining and post-training to test-time scaling and real-time agentic inference. (blogs.nvidia.com)

April 28 is the operative date for developers who want the model itself. Nvidia said Nemotron 3 Nano Omni is already available through Hugging Face, OpenRouter and build.nvidia.com, while Vera deployments are proceeding with cloud and lab partners including Alibaba, Meta, Oracle Cloud Infrastructure, Anthropic and OpenAI. (blogs.nvidia.com) (investor.nvidia.com)
