Cooling a supercomputer consumes more electricity than is required to run the machine, and that holds even for a system as powerful as the IBM Blue Gene/P, called Intrepid, at the U.S. Department of Energy's (DOE) Argonne National Laboratory. Though Intrepid is one of the fastest and most energy-efficient computers in the world, researchers at the Argonne Leadership Computing Facility (ALCF) are continually looking for ways to further reduce the power needed to operate the machine.
"Electricity should never be wasted," said Pete Beckman, director of the ALCF. "Each improvement to our power efficiency makes us more environmentally friendly, reduces our carbon footprint and saves taxpayer dollars."
In the winter, the ALCF saves as much as $25,000 every month in electricity costs by using the Chicago area's cold climate to chill, at no cost, the water that cools Intrepid. That is in addition to the millions of dollars saved by the energy-efficient architecture of the Blue Gene/P, which uses about one-third as much electricity as a comparable supercomputer.
The latest in a series of energy-efficient innovations at the ALCF focuses on cooling the machine itself. Varying the chilled water temperature to match the demand of the machine allows the chillers to use less energy. This effort involves mapping the optimum chilled water temperature to the machine load to determine the sweet spot for energy-efficient cooling.
"The trick," said Jeff Sims, ALCF project manager, "is to find the warmest chilled water temperature you can live with at a given machine load, thus reducing the electric load on the chillers and maximizing the free cooling period. While the energy consumed to cool the machine has been reduced, there has been no impact on Intrepid's performance."
It is still too early in the process to calculate the exact energy savings, but the change is noticeable. As data are collected and analyzed, the ALCF team will start experimenting with the temperature of the data center itself to see how far the temperature can be pushed without causing any degradation of the system.
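The mapping of chilled water temperature to machine load described above can be illustrated with a minimal sketch. It is not the ALCF's actual procedure: the function name, the load bins, the Fahrenheit temperatures and the pass/fail flag are all assumptions made for illustration, and the sample readings are invented. For each load range, the sketch keeps the warmest water temperature that stayed acceptable and sat below every temperature that caused a problem.

```python
from collections import defaultdict


def warmest_safe_setpoints(samples, bin_width=20):
    """Return {load_bin_start: warmest acceptable water temperature}.

    samples: iterable of (load_percent, water_temp, ok) tuples, where
    ok is True if the machine ran without problems at that setting.
    All names and units here are hypothetical.
    """
    ok_temps = defaultdict(list)
    bad_temps = defaultdict(list)
    for load, temp, ok in samples:
        # Group observations into load ranges, e.g. 20-39%, 80-99%.
        bin_start = int(load // bin_width) * bin_width
        (ok_temps if ok else bad_temps)[bin_start].append(temp)

    setpoints = {}
    for bin_start, temps in ok_temps.items():
        # Never go as warm as the coldest temperature that failed.
        ceiling = min(bad_temps.get(bin_start, []), default=float("inf"))
        safe = [t for t in temps if t < ceiling]
        if safe:
            setpoints[bin_start] = max(safe)
    return setpoints


# Invented readings: (machine load in %, water temperature, acceptable?)
SAMPLES = [
    (35, 50, True), (35, 55, True), (35, 60, True), (35, 65, False),
    (90, 50, True), (90, 55, True), (90, 60, False),
]

print(warmest_safe_setpoints(SAMPLES))
```

With these invented readings, the lightly loaded range tolerates warmer water than the heavily loaded one, which is the kind of sweet spot the mapping is meant to expose.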
Other enhancements include using smart power management functions to turn off chips and storage systems when they are not in use, as well as scheduling intensive compute jobs to run at night when the power grid has more capacity and temperatures are lower.
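The night scheduling idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the ALCF scheduler: the 22:00 to 06:00 window, the node-hour threshold and the function names are hypothetical.

```python
from datetime import datetime, time

NIGHT_START = time(22, 0)  # assumed start of the off-peak window
NIGHT_END = time(6, 0)     # assumed end of the off-peak window


def in_night_window(t):
    """True if clock time t falls in the window that wraps past midnight."""
    return t >= NIGHT_START or t < NIGHT_END


def earliest_start(submitted, node_hours, threshold=1000):
    """Return when a job may start.

    Small jobs, and jobs submitted during the night window, start
    immediately. Intensive daytime jobs wait until the window opens
    that evening. The threshold is a hypothetical cutoff.
    """
    if node_hours < threshold or in_night_window(submitted.time()):
        return submitted
    return datetime.combine(submitted.date(), NIGHT_START)


# An intensive job submitted mid-afternoon is held until 22:00 that day.
print(earliest_start(datetime(2024, 1, 15, 14, 30), node_hours=5000))
```

A production scheduler would also weigh queue priority, reservations and actual grid conditions; the sketch captures only the deferral rule itself.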
"Supporting breakthrough science is our main goal at the ALCF," added Sims. "We will not sacrifice performance for energy efficiency, but we will keep trying everything we can to reduce power consumption. Every little bit helps."
As Argonne continues to focus on energy-efficient enhancements today, it also has an eye to the future. Currently, Argonne is working with IBM and Lawrence Livermore National Laboratory in designing the next-generation supercomputer platform.
"Energy efficiency and computational power are not mutually exclusive," said Beckman. "Future supercomputers will push performance boundaries and deliver unprecedented design innovations that will use merely a trickle of the electricity that is required today."
That team has already worked together successfully, designing the initial Blue Gene system from the ground up specifically for scientific applications. Each processor board was designed to reduce energy consumption, a goal reflected in the system's low-power, system-on-a-chip architecture and communications fabric. This approach enables science applications to scale efficiently to the highest performance by increasing the system's parallelism and using more power-efficient voltages and clock speeds.
The ALCF is operated by Argonne for the DOE Office of Science as part of the larger DOE Leadership Computing Facility strategy. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy's Office of Science.