Inference now stresses networks

- As inference workloads scale, data-center bottlenecks have shifted from raw compute to networking and power. (x.com)
- Arista-style networking plays are benefiting: Arista (ANET) is cited at roughly $3.25B of AI revenue in FY26 estimates. (x.com)
- Operators are moving to denser racks and liquid cooling to pack more accelerators per pod and reduce space/power friction. (x.com)

Artificial intelligence is moving from training to serving answers, and that shift is turning data-center strain toward networks, power feeds, and cooling systems. (blogs.nvidia.com)

Inference is the part of artificial intelligence that runs after a model is built: a user asks a question, the system moves data between accelerators, and the hardware returns a result in milliseconds. Microsoft said its Fairwater facilities are linked by a dedicated network so sites in Wisconsin and Atlanta can work as one “AI superfactory” across hundreds of thousands of graphics processors. (news.microsoft.com)

That architecture changes what breaks first. Nvidia said older data centers were built around roughly 20 kilowatts per rack, while hyperscale artificial intelligence sites now support more than 135 kilowatts per rack, pushing heat removal and power delivery to the foreground. (blogs.nvidia.com)

Operators are responding by packing more chips into each rack and switching from moving cold air around a room to sending coolant directly to the chip, like running plumbing to the hottest part of the machine. Nvidia said its GB200 NVL72 and GB300 NVL72 rack systems are liquid-cooled and designed for large-language-model inference, while Microsoft said Atlanta uses advanced liquid cooling and a two-story layout to raise graphics-processor density. (blogs.nvidia.com) (news.microsoft.com)

Cooling is no longer a side system. Nvidia said cooling has historically accounted for as much as 40% of a data center’s electricity use, and Microsoft said cold-plate cooling can cut greenhouse-gas emissions and energy demand by about 15% and water consumption by 30% to 50% over a facility’s life, depending on design assumptions. (blogs.nvidia.com) (news.microsoft.com)

The companies selling the digital plumbing are gaining leverage. Arista Networks reported $9 billion in 2025 revenue on February 12, 2026, said it had shipped a cumulative 150 million ports, and has been pitching itself as a provider for “large AI” data-center networks. (investors.arista.com)

Arista has also been widening the hardware aimed at these clusters. On October 29, 2025, the company introduced its R4 family for artificial intelligence, data-center, and backbone routing deployments, built around 800-gigabit Ethernet products meant to move traffic inside and between large computing pods. (investors.arista.com)

The money flowing into the physical layer is spreading beyond servers and switches. Microsoft, BlackRock, Global Infrastructure Partners, and MGX said in September 2024 that their Global AI Infrastructure Investment Partnership could mobilize up to $100 billion for data centers and the power infrastructure needed to run them, with most of that investment slated for the United States. (news.microsoft.com)

The result is a different kind of artificial intelligence buildout. The next constraint is less likely to be whether a company can buy one more accelerator than whether it can feed, cool, and connect a whole rack of them fast enough to keep inference running. (blogs.nvidia.com) (news.microsoft.com)
