Microsoft, NVIDIA unify stack for AI

- Microsoft and NVIDIA said on June 2 they are linking Windows devices, local systems and Azure cloud into one stack for agentic AI deployment.
- NVIDIA said DGX Station for Windows can run frontier models with up to 1 trillion parameters locally, using 20 petaflops and up to 748GB memory.
- DGX Station for Windows systems are due in Q4 from ASUS, Dell, GIGABYTE, HP, MSI and Supermicro.

Microsoft and NVIDIA used Microsoft Build on June 2 to lay out a shared software-and-hardware stack for building and deploying AI agents across Windows PCs, local systems and Azure cloud. NVIDIA said the package combines Windows AI tools, Microsoft Foundry services, NVIDIA models and runtimes, and new Windows hardware including DGX Station for Windows.

The announcements extend a partnership the companies have been building across cloud infrastructure, developer tools and PC AI. They also move more model execution closer to end users, including on deskside systems marketed for enterprise workflows.

### What exactly did Microsoft and NVIDIA announce?

NVIDIA said the companies are offering a “unified accelerated computing stack” spanning Windows devices, Azure cloud and local deployments. In NVIDIA’s June 2 Build post, the company said developers will be able to build, run and scale agentic and physical AI across that range using NVIDIA hardware, Microsoft Foundry services and Windows AI software.

Microsoft’s Build live coverage listed “NVIDIA, Microsoft join forces on unified stack” among the conference’s headline announcements. Microsoft’s Windows AI developer page says the Windows side includes Windows AI APIs, Windows ML and Foundry Local, which Microsoft describes as a platform for running open-source or custom models on-device across CPU, GPU and NPU hardware. (blogs.nvidia.com)

### Where does DGX Station for Windows fit in?

NVIDIA announced DGX Station for Windows on May 31 at GTC Taipei and described it as “the world’s most powerful deskside AI supercomputer” for Windows. The company said the system is designed to build, run and connect always-on AI agents to Windows applications and workflows, and is capable of running frontier models of up to 1 trillion parameters locally. (news.microsoft.com)

NVIDIA said the machine uses the GB300 Grace Blackwell Ultra Desktop Superchip, with up to 748GB of coherent memory and 20 petaflops of FP4 performance.
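
As a rough back-of-envelope check on how a 1-trillion-parameter model could fit in 748GB, the sketch below estimates weight storage at a few precisions. It assumes weights are stored at a fixed number of bits per parameter and ignores KV cache, activations and runtime overhead; neither company has published this breakdown, and the function name and precision list are illustrative only.

```python
# Back-of-envelope estimate of model weight storage.
# Assumptions (not from NVIDIA or Microsoft): uniform bits per parameter,
# decimal gigabytes (1 GB = 1e9 bytes), no KV cache or runtime overhead.

def weight_footprint_gb(num_params: float, bits_per_param: float) -> float:
    """Return approximate weight storage in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 1e12      # 1 trillion parameters, as cited by NVIDIA
MEMORY_GB = 748    # "up to 748GB of coherent memory", as cited by NVIDIA

if __name__ == "__main__":
    for label, bits in [("4-bit", 4), ("8-bit", 8), ("16-bit", 16)]:
        need = weight_footprint_gb(PARAMS, bits)
        verdict = "fits" if need <= MEMORY_GB else "does not fit"
        print(f"{label}: ~{need:,.0f} GB of weights, {verdict} in {MEMORY_GB} GB")
```

Under these assumptions, 4-bit weights come to about 500GB, which leaves headroom within 748GB, while 8-bit weights alone would need about 1,000GB.
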
Systems are expected in the fourth quarter from ASUS, Dell, GIGABYTE, HP, MSI and Supermicro, according to NVIDIA’s Build post and newsroom release. (nvidianews.nvidia.com)

### What software stack connects the PC, the workstation and the cloud?

NVIDIA said RTX Spark laptops and small desktops, along with DGX Station for Windows, are meant to let developers build, tune and run agents natively on Windows. The company also said both product lines run NVIDIA OpenShell, which it described as a secure-by-design runtime for autonomous agents. DGX Station for Windows will support OpenShell on Windows using what NVIDIA called new Windows security and containment primitives. (blogs.nvidia.com)

Microsoft’s side of the stack centers on Foundry and Windows AI. Microsoft says Foundry Local is generally available and supports local AI execution with no cloud dependency in the request path, while Windows ML is the local inferencing framework for Windows. NVIDIA said enterprises can also use NVIDIA, Anthropic and OpenAI models in Foundry Agent Service on Azure, with “built-in identity and governance.” (blogs.nvidia.com)

### Why does this change where AI work happens?

Microsoft and NVIDIA are explicitly moving AI execution across more locations. Microsoft says Foundry now spans “cloud to edge,” including cloud-hosted frontier models, on-premises and distributed deployments, and local execution on devices such as desktops and laptops. NVIDIA’s DGX Station release says workloads that previously sat in Linux data centers can now be brought directly into the Windows ecosystem. (devblogs.microsoft.com)

That means the same application stack can involve model download, local caching and on-device inference as well as cloud-hosted services. Microsoft says Foundry Local manages model download, load into device memory, inference management and unload, and that later runs can load the model from local cache on the user’s device.
(devblogs.microsoft.com)

### What are the governance and security questions?

NVIDIA said enterprises can use Foundry Agent Service on Azure with identity and governance controls, while OpenShell is positioned as a secure runtime. Microsoft says Windows AI and Foundry on Windows are intended to provide a “unified, reliable and secure platform” for model selection, fine-tuning, optimization and deployment. (devblogs.microsoft.com)

The architecture described by both companies also places models and related artifacts in more than one place: cloud services, local caches and enterprise deskside systems. Microsoft’s documentation says Foundry Local downloads optimized models to the device and reloads them from local cache on subsequent runs, while NVIDIA says DGX Station is meant for always-on enterprise agents tied to Windows workflows. (blogs.nvidia.com)

### What comes next from here?

NVIDIA said RTX Spark systems will arrive this fall from Microsoft Surface, ASUS, Dell, HP, Lenovo and MSI. The company separately said DGX Station for Windows systems are expected in Q4 from ASUS, Dell, GIGABYTE, HP, MSI and Supermicro. Microsoft’s Build 2026 materials point developers to sessions, demos and Windows AI documentation covering Foundry Local, Windows ML and related APIs. (devblogs.microsoft.com) (blogs.nvidia.com)

