Azure + NVIDIA agent engineering

Microsoft Azure and NVIDIA are co-engineering production tools for AI agents, including Microsoft Foundry paired with NVIDIA Nemotron models and new Azure hardware such as Vera Rubin NVL72 systems. The partnership also aims to connect agent tooling with Fabric and Omniverse for physical AI workflows, blurring the boundary between software and infrastructure in agent deployment (x.com).

An artificial intelligence agent is just software that can take a goal, break it into steps, call tools, and keep going without waiting for a human after every click. Microsoft said on March 16, 2026 that it is now building that kind of system with NVIDIA as one stack, not as separate model, cloud, and hardware layers. (blogs.microsoft.com)

The software layer in this story is Microsoft Foundry, which Microsoft describes as the system for building, deploying, and operating enterprise artificial intelligence. In the same March 16 announcement, Microsoft said Foundry is being expanded so developers can run production agents on NVIDIA accelerators and NVIDIA Nemotron models. (blogs.microsoft.com)

A model is the part that does the thinking, like the engine inside a car before you decide what roads it can drive on. Microsoft’s Foundry team said on March 16 that NVIDIA Nemotron 3 Super is now available in Foundry through NVIDIA Inference Microservices, giving developers an open reasoning model built for long context and multi-step agent work. (techcommunity.microsoft.com)

Inference is the moment a model is actually answering questions and making decisions, and that is where agents get expensive because they keep calling the model over and over. NVIDIA says its Vera Rubin platform is tuned for “agentic inference” and long-context reasoning, and Microsoft says Azure will be the first hyperscale cloud to power on Vera Rubin NVL72 systems for those workloads. (nvidianews.nvidia.com) (blogs.microsoft.com)

The Vera Rubin NVL72 is not one chip but a full rack-scale machine, which is closer to buying an entire kitchen than buying a single stove. NVIDIA says each NVL72 system combines 72 Rubin graphics processors, 36 Vera central processors, ConnectX-9 networking cards, BlueField-4 data processing units, and NVLink 6 switching in one platform. (nvidia.com)

Microsoft has been preparing Azure data centers for that hardware before this week’s announcement. In a January 2026 Azure post, Microsoft said its Fairwater sites in Wisconsin and Atlanta were being engineered so Vera Rubin NVL72 racks could be integrated into next-generation “AI superfactory” deployments. (azure.microsoft.com)

The new part is that Microsoft is not pitching this as faster chatbots. Microsoft said the goal is to connect Foundry with Microsoft Fabric and NVIDIA Omniverse so companies can build “physical AI” systems that move from data to simulation to real-world operations. (blogs.microsoft.com)

Microsoft Fabric is the company’s data platform, so it is where a business keeps the tables, dashboards, and pipelines an agent would need to read before acting. NVIDIA Omniverse is the simulation layer, and NVIDIA said in December 2025 that Omniverse libraries on Azure were already being used for industrial workflows such as factory operations and computer-aided engineering. (blogs.microsoft.com) (blogs.nvidia.com)

That means the partnership is starting to erase the old line between “model company” and “cloud company.” The same vendor pair is now offering the model through Nemotron, the runtime through Foundry and NVIDIA Inference Microservices, the rack through Vera Rubin NVL72, the data layer through Fabric, and the simulation layer through Omniverse. (techcommunity.microsoft.com) (blogs.microsoft.com) (nvidia.com)

NVIDIA has been framing Rubin as the hardware generation for this exact shift. In January 2026, NVIDIA said the Rubin platform could cut inference token cost by up to 10 times versus Blackwell, and lower token cost is what turns an always-on agent from a demo into something a company can afford to run all day. (investor.nvidia.com)

So the story is not just that Microsoft added another model to Azure. The March 2026 announcements show Microsoft and NVIDIA trying to sell agent building the way cloud companies once sold web apps: one platform, one bill, and one path from prototype to a rack of machines in a data center. (blogs.microsoft.com)
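
For readers who want to see why repeated model calls dominate an agent's bill, here is a minimal Python sketch of the loop described at the top of this piece: take a goal, break it into steps, call tools, and keep going. Everything in it is an assumption made for illustration. The planner is a stub rather than a real model, the tool names are hypothetical, and the token counts and prices are placeholders, not published Foundry, Nemotron, or Rubin figures. It does not use any Microsoft or NVIDIA API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentRun:
    """Bookkeeping for one agent run: actions taken and tokens consumed."""
    steps: List[str] = field(default_factory=list)
    model_calls: int = 0
    tokens_used: int = 0


def stub_planner(goal: str) -> List[str]:
    """Stand-in for a model call that breaks a goal into tool-sized steps."""
    return [f"lookup:{goal}", f"summarize:{goal}", f"report:{goal}"]


# Hypothetical tools; a real agent would call data or simulation services here.
TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup": lambda arg: f"rows for {arg}",
    "summarize": lambda arg: f"summary of {arg}",
    "report": lambda arg: f"report on {arg}",
}

TOKENS_PER_CALL = 2_000  # assumed average tokens per model call (placeholder)


def run_agent(goal: str) -> AgentRun:
    """Plan once, then run each step without pausing for a human."""
    run = AgentRun()
    plan = stub_planner(goal)  # one model call to produce the plan
    run.model_calls += 1
    run.tokens_used += TOKENS_PER_CALL
    for step in plan:
        tool_name, arg = step.split(":", 1)
        result = TOOLS[tool_name](arg)
        # Assume one further model call to interpret each tool result.
        run.model_calls += 1
        run.tokens_used += TOKENS_PER_CALL
        run.steps.append(f"{tool_name} -> {result}")
    return run


def cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Convert a token count into dollars at a given per-million-token price."""
    return tokens / 1_000_000 * usd_per_million_tokens


run = run_agent("factory downtime")
# Placeholder prices chosen only to illustrate a 10x gap in token cost.
baseline = cost_usd(run.tokens_used, usd_per_million_tokens=10.0)
reduced = cost_usd(run.tokens_used, usd_per_million_tokens=1.0)
print(f"model calls: {run.model_calls}, tokens: {run.tokens_used}")
print(f"baseline: ${baseline:.4f}, at one-tenth the token price: ${reduced:.4f}")
```

In this toy run a three-step plan already costs four model calls, and the bill scales linearly with token price, which is why a claimed reduction of up to 10 times in token cost matters more for always-on agents than for one-off chat replies.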
