In the Olympic spirit of "Citius, Altius, Fortius," SC10 has embraced a couple of ranked lists of high-performance computers beyond the original Top500. The Green500 was launched by Wu Feng (Virginia Tech) and others in 2007 to increase awareness of the need for power efficiency in high-performance computing systems. The list has since evolved into a vehicle for bragging rights among computer vendors and supercomputing centers. In his introduction to the list, Wu expressed a concern that the list can be abused or gamed, and outlined some goals for making it more relevant, including expanding the application set beyond LINPACK and setting stricter requirements on how power is measured and reported.
Wu invited Craig Steffen of NCSA to talk about the methodology used to measure the power of their Green500 entry, called EcoG. Craig had some great photos that showed how they embedded a clamp-based current probe into one of the power distribution units (PDUs), whose output was connected to an off-the-shelf data sampler that collected instantaneous power consumption at one-second intervals. Don't try this at home: the PDUs are running at 208V. The measurement PDU serves 8 of the 128 nodes, as the Green500 rules allow total power to be reported by measuring a subset of the machine and extrapolating to the size of the full computer. Craig showed some really cool graphs of power versus time that encompass multiple back-to-back LINPACK runs. Even within a single run, power varies by about 15% (peak-to-peak variation), and average power actually decreased over the duration of the run. Craig indicated that they want to capture the data at a much finer grain, starting at every 200 microseconds (the limit of their current sampler), so that they can correlate the power variation with application behavior. Another interesting aspect is that EcoG decided to report performance/Watt for 80% of a LINPACK run (starting 10% of the way into the run) rather than the minimum requirement of 20% for the Green500 list. They felt that the middle 80%, which prunes out the startup and wrap-up phases at the beginning and end of the run, is more representative than selecting the best 20% of the run. I'd be curious to see the time-varying power graphs for other machines to see what kind of variation they exhibit across the run.
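To make the measurement arithmetic concrete, here is a minimal Python sketch of how a windowed, extrapolated performance/Watt figure could be computed. It is only an illustration under stated assumptions, not NCSA's actual tooling: the function names are hypothetical, the power samples and the LINPACK score are made up, and the extrapolation is assumed to be simple linear scaling from the measured nodes to the full machine.

```python
from statistics import mean


def window_average(samples, start_frac=0.10, length_frac=0.80):
    """Average the power samples inside a window of the run.

    The window is given as fractions of the run: the defaults start 10%
    of the way in and cover 80% of the samples (the "middle 80%").
    """
    n = len(samples)
    start = int(round(n * start_frac))
    stop = int(round(n * (start_frac + length_frac)))
    return mean(samples[start:stop])


def extrapolate_power(subset_watts, measured_nodes, total_nodes):
    """Scale power measured on a subset of nodes to the whole machine.

    Assumes every node draws the same power (linear scaling).
    """
    return subset_watts * total_nodes / measured_nodes


def mflops_per_watt(linpack_mflops, total_watts):
    """Energy-efficiency metric: LINPACK MFlops divided by Watts."""
    return linpack_mflops / total_watts


# Made-up one-second samples (Watts) from a PDU feeding 8 of 128 nodes:
# a startup phase, a steady middle, and a wrap-up phase.
samples = [2000.0] * 10 + [2300.0] * 80 + [1900.0] * 10

subset_avg = window_average(samples)                 # middle 80% only
total_power = extrapolate_power(subset_avg, 8, 128)  # whole machine
efficiency = mflops_per_watt(33_000_000.0, total_power)  # hypothetical score

print(f"subset average: {subset_avg:.1f} W")
print(f"extrapolated total: {total_power:.1f} W")
print(f"efficiency: {efficiency:.1f} MFlops/Watt")
```

Changing `start_frac` and `length_frac` shows how much the reported number depends on which slice of the run is averaged, which is exactly the concern about picking the best 20%.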
After keeping the audience in suspense for a while, Wu announced the top 10 of the Green500 (soon to appear, if not posted already). Eight of the top 10 are heterogeneous systems (using either Cell- or GPU-based accelerators). He then made three Green500 awards:
- "Greenest Supercomputer in the World" to the IBM BlueGene/Q prototype located at IBM Research. This computer topped the list at a reported 1684 MegaFlops/Watt (38 KW total). I stopped by the IBM booth afterward to take a look at the hardware on display. While not as physically imposing as the Blue Waters shelf I mentioned in my post a couple of days ago, BlueGene/Q employs an array of custom technologies, including a custom node card with a BlueGene chip and up to 16GB of directly attached local memory. The system is also water cooled, eliminating things like fan power and likely reducing circuit leakage energy by running at low temperature.
- "Greenest Production Supercomputer in the World" to Tsubame 2.0 at the Tokyo Institute of Technology, which registered second overall at 958 MegaFlops/Watt (1244 KW total). Tsubame 2.0 has already been deployed; it employs three NVIDIA Tesla 20-series GPUs for each of two Intel Westmere chips, a higher ratio than the other GPU-based machines lower down on the list.
- "Greenest Self-Built Computer in the World" to EcoG at the National Center for Supercomputing Applications (NCSA). Logging in at 933 MegaFlops/Watt (36 KW total) and 3rd overall, EcoG was built as a student project in close conjunction with NVIDIA Research (I mentioned this machine a couple of days ago). EcoG employs a 1:1 ratio of GPU to CPU chips, but instead of high-end CPUs it uses Core i3s, which provide better CPU energy efficiency while sacrificing some serial performance. It's interesting to note that EcoG was assembled in a matter of days from commodity components that one can order on the web.
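As a quick sanity check on the three award figures, multiplying each reported efficiency by the reported total power gives the LINPACK performance those numbers imply. The short Python sketch below does only that arithmetic; the helper name `implied_tflops` is hypothetical, and because the published power figures are rounded, the results should be read as rough estimates rather than official scores.

```python
# (MegaFlops/Watt, total power in KW) as reported for the three award winners.
entries = {
    "BlueGene/Q prototype": (958 + 726, 38),  # 1684 MegaFlops/Watt
    "Tsubame 2.0": (958, 1244),
    "EcoG": (933, 36),
}


def implied_tflops(mflops_per_watt, kilowatts):
    """Implied LINPACK performance in TFlops.

    MFlops/Watt * Watts = MFlops, and 1 TFlops = 1,000,000 MFlops.
    """
    watts = kilowatts * 1000
    return mflops_per_watt * watts / 1_000_000


for name, (efficiency, power_kw) in entries.items():
    print(f"{name}: about {implied_tflops(efficiency, power_kw):.1f} TFlops")
```

The spread is striking: the two smallest machines are tens of TFlops each, while Tsubame 2.0 works out to more than a PFlops at nearly the same efficiency.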
I had a few more minutes to walk around the exhibit hall this afternoon. Instead of boring you with a replay of the barrage of technical and marketing pitches one gets there, I decided to make a few awards of my own:
Best cones of silence: Los Alamos National Labs for the plexiglass "audio domes" that facilitate a somewhat immersive single-person audio-visual experience.
Best living room: National Center for Atmospheric Research (NCAR) for their comfy curved couch and space-age housing for the large high-definition display.
Best cookies: Convey Computer for their tasty peanut butter cookie packs. Their heterogeneous CPU/FPGA systems are pretty interesting, and I'm looking forward to a demonstration of a scaled-up version.
Best theater of the absurd: Brocade for the dude wriggling out of a straitjacket while riding a unicycle and talking about networking solutions.
Best swag: I actually didn't see anything that was too outrageous. My modest favorite was the luggage tag-making machine at the NASA booth. I am constantly losing my luggage tags, so it was nice to stock up!
I did not have enough time to even walk by all of the exhibit booths, so I encourage any of you who also attended to add your own booth awards via the comment field below.
I swung by the SC10 celebratory party this evening at Mardi Gras World, which sounds a bit like an adult-oriented ride at Disneyland. In fact, it is the studio that creates and stores most of the Mardi Gras floats. While walking toward food, drink, and a New Orleans brass band, we had the pleasure of viewing (in supersized form) a torso of Mark Twain, a bear in diapers, a satyr eating what looked like an ice cream bar, and more jesters than you can shake a stick at. All in all, a great party to wrap up a great conference. Kudos to the SC10 organizing committee and all of the volunteers for their hard work in making SC10 a resounding success.
Well, that about wraps up SC10. If you enjoyed my blog posts and the small glimpses of SC that they have provided, you might consider planning to attend SC11 next year in Seattle. Until next time, Laissez les Bons Temps Rouler!
Steve Keckler is the Director of Architecture Research at NVIDIA and Professor of Computer Science and Electrical and Computer Engineering at the University of Texas at Austin. He has conducted research in parallel computer architecture for 20 years and has co-authored more than 100 publications on the subject. Keckler's research and teaching awards include a Sloan Foundation Fellowship, an NSF CAREER award, the ACM Grace Murray Hopper award, the Edith and Peter O'Donnell award for Engineering, and the President's Associates Teaching Excellence Award at UT-Austin.