Cooling and Power Evolve for AI

Two new products highlight the physical infrastructure challenges of scaling AI. Akash Systems announced diamond-cooled servers to manage heat in dense GPU clusters, while Vertiv released a high-capacity busway system for flexible power delivery. Both address the growing problem of power and thermal management in AI data centers.

The push for greater AI capabilities is driving hardware power densities to levels that challenge traditional infrastructure, with AI server racks often requiring 30-100+ kW per rack compared to the 7-10 kW of traditional servers. This surge in power consumption strains both the electrical grid and the data center's internal power distribution and cooling systems. In response, the industry is projected to spend $1 trillion on data center upgrades to support AI.

Akash Systems' use of synthetic diamond aims to tackle the heat generated by this increased power density. Diamond possesses the highest thermal conductivity of any known material, transferring heat up to five times faster than copper, the current industry standard for heat management. This technology, previously developed with NASA for cooling satellites, is designed to be an additive layer to existing air and liquid cooling, reducing GPU hotspot temperatures by 10-20°C.

For power delivery, Vertiv's double-stack busway system addresses the need for higher capacity in a smaller footprint. This modular, overhead track system allows for flexible placement of "tap-off" boxes that connect racks to the main power supply, a design that preserves valuable floor space for revenue-generating IT equipment. The system can be reconfigured live without requiring a shutdown, a critical feature for maintaining uptime.

The operational impact of thermal issues is significant, as every 10°C increase in temperature can cut the lifespan of electronic components in half. Inefficient cooling not only risks hardware failure and costly downtime but also leads to performance throttling, where chips automatically reduce speed to prevent overheating. Effective heat management, like that proposed by Akash, can enable GPUs to be overclocked by up to 25% while also reducing energy used by fans.

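
To make the 10°C rule of thumb concrete, here is a minimal Python sketch of what a 10-20°C hotspot reduction would imply for component lifespan. It assumes the rule holds exactly (lifespan halves for every 10°C rise), which is a simplification; the function name `relative_lifespan` and its parameters are illustrative and do not come from Akash or Vertiv.

```python
def relative_lifespan(delta_t_celsius: float, halving_step: float = 10.0) -> float:
    """Return the lifespan multiplier for a temperature change.

    Assumes the rule of thumb cited in the article: lifespan halves for
    every `halving_step` degrees Celsius of temperature increase.
    A negative delta (cooler operation) yields a multiplier above 1.
    """
    return 2.0 ** (-delta_t_celsius / halving_step)


# Hotspot reductions quoted in the article: 10°C and 20°C cooler.
for drop in (10, 20):
    multiplier = relative_lifespan(-drop)
    print(f"{drop}°C cooler -> {multiplier:.1f}x expected lifespan")
# prints:
# 10°C cooler -> 2.0x expected lifespan
# 20°C cooler -> 4.0x expected lifespan
```

Under this simplified model, the quoted 10-20°C reduction would correspond to roughly two to four times the expected component lifespan. Real failure rates depend on many other factors, so treat the numbers as an illustration of the rule, not a prediction.
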
These infrastructure innovations are critical as global electricity consumption by data centers is projected to more than double by 2030, largely driven by AI. In the U.S. alone, data centers consumed over 4% of the country's total electricity in 2024, a figure expected to grow significantly. Some regions are already feeling the strain, with data centers in Virginia consuming about 26% of the state's total electricity supply in 2023.

Vertiv's busway system allows for vertical scaling of power distribution, increasing capacity without expanding the system's footprint across the ceiling. It supports configurations up to 2000 A under UL standards and 2500 A under IEC standards, accommodating the high-power requirements of modern AI and high-performance computing environments. Optional integrated metering provides real-time data on power usage to help operators with capacity planning.

Akash Systems has already secured a $300 million launch order for its diamond-cooled servers featuring AMD Instinct MI350X GPUs. The company has also delivered diamond-cooled NVIDIA GPUs to NxtGen, a large Indian cloud provider, under a $27 million contract. These servers are projected to increase computational output per watt by up to 15% in high ambient temperature environments.
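
For a rough sense of scale, the busway current ratings above can be related to the 30-100 kW rack figures with the standard three-phase power formula, P = √3 × V × I × PF. The sketch below is a back-of-the-envelope estimate only. The 415 V line-to-line voltage, the power factor of 1.0, and the 80% continuous-load derating are assumptions chosen for illustration, not Vertiv specifications, and the function names are hypothetical.

```python
import math


def three_phase_kw(amps: float, line_voltage: float = 415.0,
                   power_factor: float = 1.0) -> float:
    """Three-phase power in kW: sqrt(3) * V_line-to-line * I * PF.

    The 415 V and PF = 1.0 defaults are illustrative assumptions.
    """
    return math.sqrt(3) * line_voltage * amps * power_factor / 1000.0


def racks_supported(amps: float, rack_kw: float, derate: float = 0.8) -> int:
    """Whole racks a busway run could feed at a given per-rack load.

    `derate` is an assumed continuous-load margin, not a vendor figure.
    """
    usable_kw = three_phase_kw(amps) * derate
    return math.floor(usable_kw / rack_kw)


# Ratings from the article (2000 A UL, 2500 A IEC) against the
# 30 kW and 100 kW rack densities it quotes.
for amps in (2000, 2500):
    for rack_kw in (30, 100):
        print(f"{amps} A busway, {rack_kw} kW racks: "
              f"about {racks_supported(amps, rack_kw)} racks")
```

Different voltages, power factors, or derating policies change the result considerably, so the output is best read as an order-of-magnitude check on how few high-density racks a single run can serve.
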

