Nvidia opens GPU tooling to Kubernetes
Nvidia donated its GPU Dynamic Resource Allocation (DRA) driver to the Kubernetes community to simplify large-scale AI workloads and improve GPU efficiency and isolation. The company also announced partnerships to develop grid-flexible “AI factories,” signaling tighter links between AI compute and energy infrastructure (blogs.nvidia.com) (nvidianews.nvidia.com).
The NVIDIA k8s-dra-driver-gpu repository is published under the Apache-2.0 license and shows 585 stars and 1,478 commits on GitHub as of this week, indicating active upstream development. The driver targets Kubernetes 1.32 or newer with DRA enabled and exposes features such as dynamic reconfiguration of GPUs, allocation of ComputeDomains for Multi-Node NVLink, and explicit support for NVIDIA Multi-Instance GPU (MIG) workflows. Recent commits add a ComputeDomainClique CRD and work on dynamic MIG device management, reflecting changes to the resource API that operators and schedulers must adopt.
At KubeCon, NVIDIA also posted updates to its Kubernetes AI (KAI) Scheduler and introduced GPU support for Kata Containers as a confidential-containers option to tighten isolation for GPU workloads.
Separately, a CERAWeek announcement on March 23, 2026 names AES, Constellation, Invenergy, NextEra Energy, Nscale Energy & Power and Vistra as partners in Emerald AI’s DSX reference architecture and Conductor orchestration work to make “AI factories” grid-flexible and faster to energize.
Canonical published a KubeCon commentary on March 24, 2026 describing how the new stack (GPU Operator, Modern Resource Stack around DRA, and KAI) will change GPU lifecycle automation, and industry writeups from Rafay and Phoronix detail vendor and distro plans to integrate the new APIs (docs.rafay.co/blog/2026/03/25/advancing-gpu-scheduling-and-isolation-in-kubernetes/) (phoronix.com/news/NVIDIA-Open-Source-KubeCon-2026).
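For readers who want a concrete picture of the Kubernetes 1.32 requirement and of what a DRA request might look like, here is a minimal Python sketch. It is illustrative only and not taken from NVIDIA's repository: the helper names are hypothetical, the `gpu.nvidia.com` device class name and the `resource.k8s.io/v1beta1` API version are assumptions, and the manifest shape follows the beta ResourceClaim layout, which differs in later API versions. Check the driver's documentation before relying on any of it.

```python
import re

# Minimum Kubernetes version the article states for the driver.
MIN_VERSION = (1, 32)


def parse_k8s_version(git_version: str) -> tuple[int, int]:
    """Extract (major, minor) from a string such as 'v1.32.3' or 'v1.33.1-gke.100'."""
    match = re.match(r"v?(\d+)\.(\d+)", git_version)
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {git_version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_dra_minimum(git_version: str) -> bool:
    """Return True if the cluster version is at least 1.32."""
    return parse_k8s_version(git_version) >= MIN_VERSION


def gpu_resource_claim(
    name: str,
    device_class: str = "gpu.nvidia.com",           # assumed device class name
    api_version: str = "resource.k8s.io/v1beta1",   # assumed API version
) -> dict:
    """Build a ResourceClaim manifest (as a dict) requesting one device.

    Hypothetical helper: the layout mirrors the beta DRA API and may not
    match the API version served by a given cluster.
    """
    return {
        "apiVersion": api_version,
        "kind": "ResourceClaim",
        "metadata": {"name": name},
        "spec": {
            "devices": {
                "requests": [
                    {"name": "gpu", "deviceClassName": device_class},
                ]
            }
        },
    }


if __name__ == "__main__":
    print(meets_dra_minimum("v1.32.0"))
    print(gpu_resource_claim("single-gpu"))
```

The version check compares only major and minor numbers, so vendor suffixes in the version string are ignored. The dict can be serialized to YAML or JSON and applied with standard cluster tooling.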