Just as Fashion Week elicits howls of "I can't wear that!" twice a year, release of the semiannual TOP500 list draws commentary questioning whether its rankings of the world's fastest supercomputers--based solely on peak flops according to the LINPACK benchmark--are relevant to applications such as weather modeling and genomics. Hidden in the TOP500 rankings is the fact that these machines are already being built for specific purposes, with success in TOP500 as a side benefit. As TOP500 founder and University of Tennessee computer science professor Jack Dongarra says, "You shouldn't think of these highly ranked computers as just a bunch of processors on a commodity network. They're codesigned by computer architects, applications people, and software designers for a particular set of problems."
Now this process of collaboration, also known as "codesign," is going even further. "Hardware/Software Co-design of Global Cloud System Resolving Models," a recent paper in the Journal of Advances in Modeling Earth Systems, describes how climatologists, system designers, and chipmakers are collaborating to build "Green Flash," a supercomputer highly optimized for cloud modeling.
Their goals are ambitious: if successful, the resulting computer will be able to model cloud systems covering the entire globe at a one-kilometer resolution. Doing so would require 20 million processors achieving a sustained rate of more than 28 petaflops. That's almost 30 times as many cores as the November 2011 TOP500 list's leader, operating continually at about 2.5 times its peak performance, while consuming less than half its power. But its creators believe this will be possible through a combination of early expert involvement, "auto-tuning" of algorithms during the design phase, novel use of processes borrowed from the embedded systems industry, and natural advances in the industry before what one of its developers considers a "conservative" release date of 2016.
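The headline numbers above also explain why many simple cores can suffice: dividing the sustained-performance target by the core count yields only a modest per-core rate. The figures come from the article; the division itself is our own back-of-envelope check, not a calculation from the paper.

```python
# Back-of-envelope: what sustained rate must each of Green Flash's
# cores deliver? (Figures from the article; arithmetic is ours.)
cores = 20_000_000               # 20 million processors
sustained_flops = 28e15          # 28 petaflops sustained

per_core = sustained_flops / cores
print(f"{per_core / 1e9:.1f} Gflops per core")  # about 1.4 Gflops per core
```

A rate of roughly 1.4 gigaflops per core is well within reach of low-power embedded-class processors, which is consistent with the embedded-systems approach described below.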
Speed Through Simplicity
Michael F. Wehner, a climatologist and staff scientist at Lawrence Berkeley National Laboratory who's working on Green Flash, appreciates the advantages that such codesign promises. "The way it works now, application scientists basically react to the machines that we get," he says. "As computational or numerical experimentalists, we need to take a lesson from real experimentalists, who design machines to answer specific questions. That's a very different philosophical viewpoint."
Doing so, however, requires overcoming obstacles to cooperation among scientists in multiple fields. "To codesign successfully, you have to bring together hardware designers, software designers, and end users," Wehner says. "It's interesting sociologically because these are people who normally wouldn't talk to each other. We all speak different languages: That's part of the challenge."
But according to the paper, the payoffs of codesign can be tremendous. For example, an analysis of the number of grid vertices the model requires dictated how many processor cores would be needed, while the characteristics of the fluid dynamics that govern clouds pointed to ways of optimizing cache sizes and interconnections for specific algorithms. Once domain experts have defined the project's scope and goals, the next step is to create designs that address them most directly. For Green Flash, the answer is "auto-tuning," a process that iteratively tests design variations to find the best one. (In one example, auto-tuning reduced cache size from 100 Kbytes to 1 Kbyte.)
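The essence of auto-tuning can be sketched as a search over candidate hardware configurations scored by a cost model. The sketch below is only illustrative: the configuration parameters, value ranges, and especially the `stand_in_cost` function are invented for the example, not taken from the Green Flash team's actual tools, which evaluate real climate kernels on emulated hardware.

```python
# Minimal sketch of auto-tuning: exhaustively score design variations
# and keep the best one. The cost model is entirely invented.
from itertools import product

def stand_in_cost(cache_kb, cores):
    """Invented cost model (lower is better): big caches replicated
    across many cores cost area and power; too few cores cost runtime."""
    area_power = cache_kb * cores
    runtime = 1e10 / cores
    return area_power + runtime

def auto_tune(cache_sizes, core_counts):
    """Evaluate every (cache size, core count) pair, return the cheapest --
    the iterative design-space search described above."""
    return min(product(cache_sizes, core_counts),
               key=lambda cfg: stand_in_cost(*cfg))

best_cache_kb, best_cores = auto_tune([1, 8, 32, 100], [1_000, 10_000, 100_000])
print(best_cache_kb, best_cores)  # prints: 1 100000
```

Under this toy model the search, like the paper's example, lands on the smallest cache: a tiny cache on many cores minimizes combined area, power, and runtime cost.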
Finally, the design is ready for production. While supercomputers today are usually built using commodity parts--such as Intel and AMD CPUs and NVIDIA GPUs as accelerators--the design of Green Flash calls for an embedded system-on-a-chip (SoC) solution more common to consumer devices.
According to John Shalf, a staff computer scientist at Lawrence Berkeley National Laboratory who's the principal investigator on Green Flash, there are several advantages to this approach. "Tools from the embedded space can quickly create many different variations that we can use to analyze design tradeoffs," he says. Unnecessary, resource-sapping parts of commodity chips are left off the resulting chip. "The big win from the Green Flash project was in what we removed from the design, not what we added," he says. As is typical in embedded systems design, Green Flash calls for licensed intellectual property for such components as memory controllers and the communication subsystem.
Ultimately, Shalf says, the slimmer design will lead to two kinds of cost savings: first, by producing chips at a cost that commodity processors can't match; second, through substantially reduced energy demands. (Even though it's not designed specifically to run LINPACK for the TOP500, Green Flash's proposed energy usage using a 22-nanometer chip lithography process comes to about 5,500 megaflops per watt, almost three times as "green" as the most energy-efficient system on the latest TOP500 list.) For Wehner, such cost savings ultimately lead to the increased use of supercomputers in science. "I'm pretty confident that purpose-built supercomputers like Green Flash would be cheaper than just going out and buying a general-purpose supercomputer," he says. "For the same amount of money, we think you could buy several machines, each customized for a different discipline."
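Combining the article's two figures gives a rough sense of the machine's power budget. Note this is only an approximate consistency check of ours: the efficiency figure is quoted in LINPACK-style terms while the 28-petaflops target is a sustained application rate, so pairing them is not a claim from the paper.

```python
# Rough power estimate from the article's figures (arithmetic is ours).
sustained_flops = 28e15      # 28 petaflops sustained target
efficiency = 5_500e6         # about 5,500 megaflops per watt

power_watts = sustained_flops / efficiency
print(f"{power_watts / 1e6:.1f} MW")  # about 5.1 MW
```

A budget on the order of 5 megawatts would indeed be a fraction of what a conventional general-purpose machine of comparable scale was drawing at the time.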
Further Reading

Wehner, M.F., Oliker, L., Shalf, J., Donofrio, D., Drummond, L.A., Heikes, R., Kamil, S., Kono, C., Miller, N., Miura, H., Mohiyuddin, M., Randall, D., and Yang, W. "Hardware/software co-design of global cloud system resolving models." Journal of Advances in Modeling Earth Systems, 3, October 2011. http://james.agu.org/index.php/JAMES/article/view/v3n12/v3n12HTML

Shalf, J. "Exascale: Past, Present, Future." Presentation at DOE Challenges in Exascale Computing Architecture, August 3, 2011. http://www.orau.gov/archI2011/presentations/shalfj.pdf

Proposed "Green Flash" supercomputer: http://www.lbl.gov/cs/html/greenflash.html
Tom Geller is an Oberlin, Ohio-based science, technology, and business writer.