Greetings from New Orleans and SC10--the annual supercomputing conference and trade show. I'll be filing blog posts over the next couple of days on the highlights of my experiences at the conference. SC is a true spectacle with more than 300 exhibitors (who, by the way, fill only a small fraction of the New Orleans convention center; this place is huge) and nearly 400 scheduled technical events (paper presentations, tutorials, birds-of-a-feather sessions, etc.). That number does not even include the numerous technology presentations hosted in the vendor booths. I can only seek to provide you with a small window into SC10.
The opening keynote speaker today was Clayton Christensen of the Harvard Business School, renowned for his research into innovation and author of The Innovator's Dilemma. Christensen gave a fantastic and humorous lecture on how even the rational decisions of technical business leaders (such as those who might be educated at Harvard) can often fail when confronted with disruptive technologies. He gave numerous examples, but the one I liked the most was that of the early days of transistors. RCA invested a lot of money in solid-state devices, but at the time transistors could not be made nearly as capable (e.g., in performance or power) as the incumbent vacuum tube. RCA made a perfectly rational decision to continue to make its products, such as radios and televisions, out of tubes. Meanwhile, companies such as Sony started putting the lousy transistors into new products such as portable radios; here the transistor was competing against a much easier target: nonconsumption. Of course, transistors got better and eventually killed off the vacuum tube, and RCA with it.
Christensen commented on a more contemporary example--that of cheap solar generators in rural parts of the developing world. In places with no regular electricity service, even lousy solar generation (providing consumers with only intermittent access to radio or television, for example) is selling like hot cakes (at least in Mongolia, where he observed it) because it is competing against the nonconsumption of electricity. His talk was filled with many other gems and I encourage anyone to make a point of seeing Christensen speak. His presentation was all the more impressive given that he is recovering from a recent stroke. Here's to wishing him a complete recovery.
On the technical side, heterogeneous high-performance computing is a huge topic here, and one that the conference organizers selected as one of three major thrust areas of SC10. Part of the buzz is still resonating from the announcement a couple of weeks ago that Tianhe-1A in Tianjin, China achieved more than 2.5 PetaFLOPs on LINPACK, exceeding the performance of the Cray XT5 homogeneous CPU machine at ORNL (Jaguar) by more than 45%. The latest edition of the Top500 list was officially announced this evening, and Tianhe-1A is joined by 3 other heterogeneous systems in the top 7 (Nebulae, Tsubame 2.0, and Roadrunner), all achieving more than 1 LINPACK PetaFLOP. In all, I counted 18 heterogeneous computing systems on the Top500, with 10 powered by NVIDIA GPUs, 6 by IBM PowerXCells, and 2 by AMD/ATI GPUs. Eight of these 18 systems are new to the list and all eight are based on GPUs.
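For anyone who wants to check the arithmetic behind the "more than 45%" figure and the tally of heterogeneous systems, here is a minimal Python sketch. The two Rmax values (2.566 and 1.759 PFlop/s) are my assumption of what the November 2010 list reports for Tianhe-1A and Jaguar; they are not quoted above, and the helper name `percent_faster` is purely illustrative.

```python
# Assumed LINPACK Rmax values in PFlop/s (taken from the November 2010
# Top500 list as I recall it; they are not stated in the text above).
TIANHE_1A_RMAX = 2.566
JAGUAR_RMAX = 1.759


def percent_faster(new: float, old: float) -> float:
    """Return how much faster `new` is than `old`, as a percentage."""
    return (new / old - 1.0) * 100.0


speedup_pct = percent_faster(TIANHE_1A_RMAX, JAGUAR_RMAX)

# Accelerator breakdown of the heterogeneous systems, as counted above.
heterogeneous_systems = {
    "NVIDIA GPU": 10,
    "IBM PowerXCell": 6,
    "AMD/ATI GPU": 2,
}
total_heterogeneous = sum(heterogeneous_systems.values())

print(f"Tianhe-1A vs. Jaguar: {speedup_pct:.1f}% faster on LINPACK")
print(f"Heterogeneous systems counted: {total_heterogeneous}")
```

Under those assumed Rmax values the gap comes out just under 46%, consistent with "more than 45%", and the three accelerator families sum to 18 systems.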
Erich Strohmaier, one of the folks who tabulate the Top500, presented an analysis of the machines on the list. It is of course striking that in terms of total Top500 Flops, China has gone from insignificant to being behind only the United States and the EU. Given all of the activity going on in China, I wonder when the total compute capability there will top all others. I also find it interesting that the uptake of heterogeneous computing at the very top end is dominated by GPU-based machines in China and Japan. The first EU GPU-based machine appears at #22 while the first U.S. GPU-based machine appears at #72. At the end of the session, an audience member asked how representative LINPACK is for predicting the usefulness of supercomputers (particularly heterogeneous ones) on real applications. Jack Dongarra responded by challenging the new GPU-based supercomputers to come back next year with Gordon Bell prize nominations, demonstrating their capability to solve real scientific problems.
My sentimental favorite computer on the list is the new student-assembled GPU cluster created by a partnership between NVIDIA Research and NCSA. The 128-node cluster was designed to be extremely energy-efficient and I'm looking forward to the Thursday announcement of the Green500 list of the most efficient supercomputers to see how that machine stacks up against the competition on MFlops/Watt.
I wandered the show floor a bit this afternoon; here are a few quick highlights:
- SGI was showing off some new heterogeneous computing "sticks" that are 2U high, 1/3 the width of a standard rack, and about 30 inches long. Each stick is rated at 1 kW and includes two mid-range CPUs for energy efficiency and two PCI slots for accelerators. They claim to be offering these with GPU or Tilera accelerator options. Imagine populating a 60 kW rack with the potential of more than 60 TFlops.
- IBM had an impressive piece of engineering on display--the Power7-based "shelf" to be used in their Blue Waters installation at NCSA. Weighing in at more than 350 pounds and occupying about 6 square feet of area, the shelf is a sight to behold. IBM pulled out all the stops on this machine with custom water-cooled CPU/network/DRAM modules, multiple multi-chip module packages, and an integrated optical network. They showed a bunch of these packed into an extra-wide and extra-deep rack for a peak of 94 TFlops, with a guaranteed maximum power draw of 20 kW per shelf (about 240 kW for the rack, assuming the rack could handle all of the power and cooling).
- On the slightly wackier side, Green Revolution Cooling is demonstrating an immersion cooling system in which the computer components sit in a mineral oil bath (back to the days of the Cray-2). They claim to be able to reduce system cooling costs by half. I went looking for more info this evening and found another company (Iceotope) in the same space and a pretty humorous case study of a do-it-yourself aquarium PC.
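The SGI and IBM rack numbers above can be turned into a rough density comparison. The Python sketch below is back-of-envelope only: it uses the vendors' peak and "potential" figures rather than measured LINPACK results, it treats SGI's "more than 60 TFlops" as a lower bound of exactly 60, and it assumes the 94 TFlops IBM quoted is for the whole rack. The helper `mflops_per_watt` and all constant names are made up for illustration.

```python
def mflops_per_watt(tflops: float, kilowatts: float) -> float:
    """Convert a (TFlops, kW) pair into MFlops/Watt."""
    return (tflops * 1e6) / (kilowatts * 1e3)


# SGI "sticks": figures quoted on the show floor.
# 60 TFlops is treated as a lower bound ("more than 60 TFlops").
SGI_STICK_KW = 1
SGI_RACK_KW = 60
SGI_RACK_TFLOPS = 60

sgi_sticks_per_rack = SGI_RACK_KW // SGI_STICK_KW
sgi_tflops_per_stick = SGI_RACK_TFLOPS / sgi_sticks_per_rack
sgi_efficiency = mflops_per_watt(SGI_RACK_TFLOPS, SGI_RACK_KW)

# IBM Power7 shelves: assumes the 94 TFlops peak is per rack, and uses
# the not-to-exceed power numbers (20 kW per shelf, 240 kW per rack).
IBM_SHELF_KW = 20
IBM_RACK_KW = 240
IBM_RACK_TFLOPS = 94

ibm_shelves_per_rack = IBM_RACK_KW // IBM_SHELF_KW
ibm_efficiency = mflops_per_watt(IBM_RACK_TFLOPS, IBM_RACK_KW)

print(f"SGI: {sgi_sticks_per_rack} sticks/rack, "
      f"at least {sgi_tflops_per_stick:.1f} TFlops/stick, "
      f"at least {sgi_efficiency:.0f} MFlops/Watt (potential)")
print(f"IBM: {ibm_shelves_per_rack} shelves/rack, "
      f"about {ibm_efficiency:.0f} MFlops/Watt (peak, at max power)")
```

Under these assumptions a full SGI rack holds 60 sticks that each need to deliver at least 1 TFlop, and the IBM rack holds 12 shelves. The two efficiency numbers are not directly comparable to each other or to Green500 entries, since one is accelerator "potential", the other is CPU peak at a worst-case power bound, and neither is a measured result.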
I'll have more comments from the floor exhibits as the week progresses.
Steve Keckler is the Director of Architecture Research at NVIDIA and Professor of Computer Science and Electrical and Computer Engineering at the University of Texas at Austin. He has conducted research in parallel computer architecture for 20 years and has co-authored more than 100 publications on the subject. Keckler's research and teaching awards include a Sloan Foundation Fellowship, an NSF CAREER award, the ACM Grace Murray Hopper award, the Edith and Peter O’Donnell award for Engineering, and the President's Associates Teaching Excellence Award at UT-Austin.