NVIDIA adds MRC to Spectrum‑X
- NVIDIA said on May 6 that Spectrum‑X Ethernet now supports Multipath Reliable Connection, or MRC, and submitted the transport as an open spec.
- MRC lets one RDMA connection spread traffic across many paths and planes, with hardware-speed failover already used by Microsoft and Oracle.
- That matters because AI networking is shifting from generic Ethernet plumbing into a competitive software-and-protocol layer.

Ethernet is supposed to be the boring part. Plug things in, move packets around, done. But giant AI clusters broke that neat story, because thousands of GPUs stop acting like normal servers once they all need to exchange model state at the same time. That is the gap NVIDIA is trying to close with MRC (Multipath Reliable Connection), which it added to its Spectrum‑X Ethernet stack on May 6 and also pushed into the Open Compute Project as an open specification.

### What is MRC, in plain English?

MRC is a transport protocol for RDMA over Ethernet. The useful idea is simple: instead of treating a connection like one lane on one road, MRC lets a single reliable connection use many network paths at once, then steer around congestion or failures without the application having to micromanage every detour. NVIDIA says path diversity and fast recovery stop being nice extras and become table stakes.

### Why wasn’t regular Ethernet enough?

Because AI training traffic is ugly. It comes in synchronized bursts, and a few slow flows can stall a whole job. Standard Ethernet can carry the traffic, and RoCEv2 already brought RDMA semantics onto Ethernet, but very large clusters still hit problems around congestion spreading, retransmissions, and failures. MRC is pitched as a way to make Ethernet behave more like a purpose-built AI fabric without abandoning Ethernet itself.

### What did NVIDIA actually announce?

Two things. First, Spectrum‑X now includes MRC as part of its AI networking stack. Second, NVIDIA says the protocol is no longer just an internal or proprietary trick; it has been released as an open specification through OCP. That second part matters because NVIDIA is trying to argue that this is not just “buy our cloud.”

### Is this already real, or still a lab demo?

NVIDIA’s pitch is that MRC is already running in production.
The company specifically pointed to Microsoft’s Fairwater and Oracle Cloud Infrastructure’s Abilene data center as large AI facilities using MRC for frontier-model training and deployment. That does not make it universal, but it does move the story out of slideware territory.

### What does Spectrum‑X add on top?

The protocol is only half the story. Spectrum‑X wraps MRC with hardware-accelerated load balancing, congestion handling, telemetry, and very fast path-failure bypass. ServeTheHome also highlighted multiplanar networking: multiple independent fabrics between GPUs