Cisco and NVIDIA Partner on Sovereign 'AI Factories'
Promoting a philosophy that "intelligence should be owned, not rented," Cisco is launching joint "AI factories" with NVIDIA. The partnership aims to provide enterprises with secure and sovereign agentic AI capabilities. The offerings will focus on agent orchestration, secure workflow automation, and ensuring data privacy.
- The "AI Factory" architecture is built on Cisco UCS servers, including the X-Series, which can house NVIDIA GPUs like the H100, H200, and the RTX PRO 6000 Blackwell Server Edition. These servers are connected via Cisco Nexus networking switches, designed to provide the high-bandwidth, low-latency fabric required for large-scale AI training and inference.
- A key technical aspect of the partnership is the integration of Cisco's Silicon One networking chips into NVIDIA's Spectrum-X Ethernet networking platform. This allows for a unified networking architecture that can be managed with existing enterprise tools, combining Cisco's enterprise networking dominance with NVIDIA's AI-optimized hardware.
- The software foundation of the offering is the NVIDIA AI Enterprise suite, a cloud-native platform that includes tools, frameworks, and pre-trained models. This stack features NVIDIA NIM (NVIDIA Inference Microservices) for deploying models and NVIDIA NeMo for customizing large language models, which can run on Kubernetes distributions like Red Hat OpenShift.
- The concept of "Sovereign AI" extends beyond data residency to include control over the entire infrastructure stack, from hardware to models and operational processes. This approach is designed to mitigate risks associated with foreign laws, vendor lock-in, and geopolitical instability, ensuring enterprises maintain control over their intellectual property and data.
- One of the primary use cases highlighted for this infrastructure is the acceleration of Retrieval-Augmented Generation (RAG) pipelines. The integration with storage partners like VAST Data aims to reduce RAG latency from minutes to seconds, enabling AI agents to access data and provide responses in near-real-time.
- Management of this converged infrastructure is handled through tools like Cisco Intersight, which provides unified, cloud-based management for the UCS servers and networking fabric. This is intended to simplify operations and automate the scaling of compute and GPU resources tailored to specific AI and ML demands.
- The partnership offers pre-validated reference architectures, referred to as Cisco AI PODs and Cisco Validated Designs (CVDs). These blueprints are designed to reduce deployment complexity and risk for enterprises building out their AI infrastructure.
- The security aspect of the "Secure AI Factory" involves a multi-layered approach. It integrates Cisco's security portfolio, such as Hypershield and AI Defense, with NVIDIA's software to provide monitoring, policy enforcement, and protection for AI models and applications against potential threats.
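
For readers unfamiliar with the RAG use case mentioned above, the retrieve-then-generate flow can be sketched in a few lines of standard-library Python. Everything here is an illustrative assumption: the function names, the sample documents, and the bag-of-words scoring are hypothetical stand-ins, and a production pipeline would typically use an embedding model and a vector store instead. Nothing in this sketch reflects the actual Cisco, NVIDIA, or VAST Data implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return up to k documents ranked by similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no terms with the query.
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Combine retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical sample corpus, paraphrased from the summary above.
DOCS = [
    "Cisco Intersight manages UCS servers and the networking fabric.",
    "NVIDIA NIM microservices are used to deploy models.",
    "Cisco Validated Designs are pre-validated reference architectures.",
]

if __name__ == "__main__":
    question = "Which tool manages UCS servers?"
    passages = retrieve(question, DOCS, k=1)
    print(build_prompt(question, passages))
```

The resulting prompt would then be sent to a language model for the generation step. The retrieval step is where storage and network latency matter most, which is presumably why the partnership emphasizes the storage integration for this use case.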