Data centers of all sizes are integrating advanced AI servers into their infrastructure as workload demands increase. Incorporating AI into legacy data centers is complex because those facilities weren't designed to handle the increased server loads and heat.
AI integration requires specialized designs similar to those in hyperscale data centers, which face the same high-load challenges. Advancements in processing speed and the growing demand for AI servers cause processors to generate more heat than the conventional cooling systems in legacy data centers can remove. Cooling systems must evolve to keep pace with rising temperatures. This involves infrastructure and design changes to floor loading, cabinet space, rack density and power management, as well as the integration of liquid cooling systems.
Floor loading and cabinets
Early rack cabinets were 24 inches (610 millimeters) square and weighed about 250 pounds (113 kilograms). Most building floors could support this when the load was spread by raised access floors. AI servers increase the weight and size of cabinets: modern cabinets can now support 2,500 to 3,000 pounds (1,134 to 1,361 kg) of equipment. Cabinets this heavy might require floor load ratings that exceed those of most new construction.
Admins should replace or reinforce cabinets to support the increased weight of AI servers. This requires an assessment of cabinet size and weight, the number of server racks, aisle design, cooling systems and raised floor weight capacity, which varies with floor height and equipment placement. Exceptionally deep cabinets typically do not fit into legacy row spacings, which complicates layout planning. IBM offers a floor load calculator that estimates the equipment floor load value.
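For a sense of the arithmetic behind such an assessment, here is a minimal Python sketch that spreads a loaded cabinet's weight over its footprint plus a share of the adjacent aisle to estimate pounds per square foot. The cabinet dimensions, weight and aisle share are assumptions for illustration and do not reflect IBM's calculator methodology.

```python
# Rough floor-load estimate for a loaded AI cabinet. The footprint and
# aisle-share figures below are illustrative assumptions, not values from
# IBM's floor load calculator.

def floor_load_psf(cabinet_weight_lb: float,
                   footprint_sqft: float,
                   shared_aisle_sqft: float) -> float:
    """Spread the cabinet weight over its footprint plus a share of aisle space."""
    return cabinet_weight_lb / (footprint_sqft + shared_aisle_sqft)

if __name__ == "__main__":
    # Assume a 2 ft x 4 ft cabinet loaded to 3,000 lb that shares about 8 sq ft of aisle.
    load = floor_load_psf(cabinet_weight_lb=3000, footprint_sqft=8, shared_aisle_sqft=8)
    print(f"Approximate distributed floor load: {load:.0f} lb/sq ft")
```

A result in this range can then be compared against the building's rated floor capacity; a structural engineer should confirm any figure before equipment is placed.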
Power
Standard data center cabinets contain 42 rack units, and the most common AI rack configurations draw between 100 kilowatts and 150 kW. Legacy data centers were designed for rack densities of 5 kW to 10 kW, while AI server integration demands a rack density of at least 50 kW. Conventional AC circuits and wiring can't efficiently deliver the electrical current necessary for larger AI arrays. Additionally, conventional power cords, plugs and receptacles aren't rated for the temperatures inside AI server cabinets.
Many AI platforms have standardized on 400-volt DC power, which requires special power supplies and integrated power distribution buses. Admins should hire experts to integrate all power into the cabinet complex alongside the computing racks and to install power buses that carry electrical distribution above the raised floor. This improves airflow and cooling efficiency.
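To illustrate why higher rack densities strain conventional AC circuits, the following sketch compares the current a single rack draws on a three-phase AC feed versus a 400-volt DC bus. The rack power, voltage and power factor values are assumptions chosen for the example, not platform specifications.

```python
import math

# Illustrative current-draw comparison for a single AI rack. The rack power,
# voltages and power factor below are assumptions for the sketch; real designs
# must follow electrical codes, derating rules and the platform's specifications.

def dc_current_amps(power_kw: float, volts_dc: float) -> float:
    """Current on a DC bus: I = P / V."""
    return power_kw * 1000 / volts_dc

def three_phase_current_amps(power_kw: float, volts_ll: float,
                             power_factor: float = 0.95) -> float:
    """Line current for a three-phase AC feed: I = P / (sqrt(3) * V_LL * PF)."""
    return power_kw * 1000 / (math.sqrt(3) * volts_ll * power_factor)

if __name__ == "__main__":
    rack_kw = 100  # a mid-range AI rack from the figures above
    print(f"{rack_kw} kW at 415 V three-phase AC: "
          f"{three_phase_current_amps(rack_kw, 415):.0f} A per phase")
    print(f"{rack_kw} kW on a 400 V DC bus: {dc_current_amps(rack_kw, 400):.0f} A")
```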
The circuit ratings in Figure 1 serve only one AI computing cluster platform, which is likely the most that a legacy data center will install. Larger installations use even higher voltages, and all power should be redundant.
Stable uninterruptible power supply (UPS) energy is critical for AI infrastructures, which run at 100% capacity around the clock. Admins must assess the additional power required to integrate and sustain AI server operations and update electrical systems to meet that demand. They should also assess backup generators and upgrade them so backup systems can handle the increased energy consumption and mitigate downtime.
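A back-of-the-envelope check like the one below can help frame that assessment: it scales the IT load by an assumed power usage effectiveness (PUE) and tests whether a given generator rating covers the demand with headroom. Every figure in it, including the PUE of 1.4 and the 20% margin, is a placeholder for illustration, not a recommendation.

```python
# Back-of-the-envelope facility power check. The PUE, headroom, rack count and
# generator rating below are assumptions for illustration; a real assessment
# requires a detailed electrical survey.

def facility_demand_kw(it_load_kw: float, pue: float = 1.4) -> float:
    """Total facility draw: IT load multiplied by power usage effectiveness (PUE)."""
    return it_load_kw * pue

def backup_covers_load(generator_kw: float, demand_kw: float,
                       headroom: float = 0.2) -> bool:
    """Require the generator to cover demand plus a safety margin."""
    return generator_kw >= demand_kw * (1 + headroom)

if __name__ == "__main__":
    demand = facility_demand_kw(it_load_kw=10 * 100)  # ten hypothetical 100 kW AI racks
    print(f"Estimated facility demand: {demand:.0f} kW")
    print("Existing 1,200 kW generator sufficient:", backup_covers_load(1200, demand))
```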
Cooling
Many AI servers support the integration of direct-to-chip liquid cooling loops, which cool the processors directly. Direct-to-chip systems remove up to 75% of the heat load, while conventional cooling systems manage the remainder.
For example, if supplemental air cooling must handle 25% of the load of a 60 kW cabinet, 15 kW of air cooling is required, which should be within the capacity of well-designed data center cooling equipment. A 150 kW cabinet requires 30 kW to 45 kW of air cooling, which is beyond the spare capacity of most legacy air systems. A 250 kW installation could require 50 kW to 75 kW or more of air cooling, which is practical only in hyperscale data centers.
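The sketch below reproduces that arithmetic, assuming, as above, that the liquid loop captures roughly 75% of each cabinet's heat and the air system handles the rest.

```python
# Supplemental air-cooling estimate, using the article's assumption that
# direct-to-chip loops capture roughly 75% of a cabinet's heat.

def air_cooling_kw(cabinet_kw: float, liquid_capture_fraction: float = 0.75) -> float:
    """Heat, in kW, that the room air system must still remove."""
    return cabinet_kw * (1 - liquid_capture_fraction)

if __name__ == "__main__":
    for cabinet in (60, 150, 250):
        print(f"{cabinet} kW cabinet -> {air_cooling_kw(cabinet):.1f} kW of air cooling")
```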
Alternative cooling systems to consider
Cooling systems must provide around-the-clock service with redundancy and exceptionally high reliability. The most effective cooling system depends on the data center's location. Evaporative cooling or separate cooling towers, for example, might be best in hot, dry climates, while dry cooling is best when water is scarce or in cooler climates.
Distributing water to computing equipment requires coolant distribution units (CDUs). These specialized heat exchangers connect the building’s facility water supply to the cabinet’s technology water supply.
Direct-to-chip cooling uses microchannels that contaminated water can obstruct. The CDU isolates the cabinet's technology water from the building's facility water, so the water reaching the servers can be thoroughly filtered and treated. Small CDUs are available for rack mounting, and large CDUs have balancing valves that connect equipment with different flow and pressure requirements.
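To show the kind of flow-and-capacity math a CDU must balance, the sketch below applies the basic water heat-transfer relation Q = m·cp·ΔT to estimate the coolant flow needed for a given load. The 100 kW heat load and 10 degrees Celsius temperature rise are illustrative assumptions.

```python
# Rough coolant flow estimate for a technology loop, based on Q = m_dot * cp * dT
# for water (cp ≈ 4.186 kJ/kg·K, density ≈ 1 kg/L). The 100 kW load and 10 °C
# temperature rise are illustrative assumptions.

WATER_CP_KJ_PER_KG_K = 4.186

def coolant_flow_lpm(heat_load_kw: float, delta_t_c: float) -> float:
    """Liters per minute of water needed to carry heat_load_kw at a delta_t_c rise."""
    kg_per_s = heat_load_kw / (WATER_CP_KJ_PER_KG_K * delta_t_c)
    return kg_per_s * 60  # roughly 1 liter per kilogram of water

if __name__ == "__main__":
    print(f"A 100 kW loop with a 10 °C rise needs about {coolant_flow_lpm(100, 10):.0f} L/min")
```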
Another alternative is active cabinet door coolers, which require chilled water. Active door coolers consume fan power but are generally more energy-efficient than large air cooling plants, which can make cabinets that combine chip and air cooling feasible.
Robert McFarlane is senior principal in charge of data center design for the international consulting firm Shen Milsom and Wilke LLC. McFarlane has spent more than 40 years in communications consulting and has experience in every segment of the data center industry.