
Learning About Parallel and Distributed Computing

Joel Adams, Calvin College

Let’s start with some history…

The Era of the Free Lunch

From the time of the first manufactured integrated circuits until the mid-2000s, two phenomena occurred in the manufacturing of computer central processing units (CPUs):

Every 18-24 months, manufacturers reduced the size of their transistors by 50%, so that they could fit twice as much circuitry into the same chip area as before. Gordon Moore famously observed this trend, which has come to be known as Moore’s Law.

Every 18-24 months, manufacturers doubled the clock speeds of their CPUs. I first noticed this when playing a game back in the early 1980s. The game had characters that spoke, and when I moved the game from an older to a newer machine, the characters sounded like one of the Chipmunks because of the new processor’s faster clock speed.

The faster clock speeds in the latter phenomenon were a windfall for software developers: if their software was sluggish on current-generation hardware, within two years CPUs would be twice as fast, and their software might well perform acceptably. As a result, this time period has sometimes been called "the era of the free lunch" for software developers.

The era of the free lunch came to an abrupt end in the mid-2000s. The doubling of CPU clock frequencies was unsustainable due to heat generation, power consumption, current leakage, and related problems. High-end CPU clock speeds have plateaued (generally under 4 GHz), and many current CPU models are clocked much slower than this, to reduce heat production, reduce power consumption, and so on.

The Multicore Era

While manufacturers were forced to stop increasing CPU clock speeds, Moore’s Law continued without interruption, and by 2005, manufacturers could fit all of a traditional CPU’s functionality into half of a chip’s area. What to do with the vacated space? Add the functionality of a second CPU! And so in 2006, dual-core CPUs appeared, in which one CPU chip contained the core functionality of two traditional CPUs. By 2008, quad-core CPUs were available, in which one chip contained the functionality of four traditional CPUs. Since then, this trend has continued, as Moore’s Law has let manufacturers produce CPUs containing more and more cores. As examples, AMD’s Interlagos, Abu Dhabi, and Delhi processor lines offer 4-, 8-, 12-, and 16-core models; Intel’s Xeon E7-88xx series offers 4-, 10-, 16-, and 18-core models; and its Xeon Phi coprocessor currently offers up to 61 cores, with a 72-core standalone model expected this year.

As a result of this shift, sequential processors have been replaced by parallel multiprocessors. Other factors being equal, a dual-core CPU has twice the computational potential of a single-core CPU, a quad-core CPU has four times the potential of a single-core CPU, and so on.

The problem is that a traditional, sequential program will only use one of a multicore processor’s cores; that is, running a sequential program on a quad-core CPU uses only 25% of that CPU’s potential, and the other 75% is squandered. To take advantage of a multicore CPU’s potential, software must be re-engineered using parallel computing techniques.
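To give a concrete sense of what that re-engineering looks like, here is a minimal sketch of my own (not taken from any particular curriculum or product documentation) that uses OpenMP, one of the technologies mentioned later in this post, to estimate pi by numerical integration. It assumes a C compiler with OpenMP support, such as gcc with the -fopenmp flag. A single pragma asks the runtime to divide the loop’s iterations among all available cores; remove that line, and the same program runs sequentially on a single core.

    /* pi.c: a minimal sketch of shared-memory parallelism with OpenMP.
     * Assumed build command: gcc -fopenmp -O2 pi.c -o pi               */
    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        const long STEPS = 100000000;          /* number of rectangles  */
        const double width = 1.0 / STEPS;      /* width of each one     */
        double sum = 0.0;

        double start = omp_get_wtime();

        /* The pragma divides the loop iterations among the cores;
         * 'reduction' gives each thread its own partial sum and safely
         * combines them when the loop finishes. Delete the pragma and
         * the identical code runs sequentially on one core.            */
        #pragma omp parallel for reduction(+:sum)
        for (long i = 0; i < STEPS; i++) {
            double x = (i + 0.5) * width;
            sum += 4.0 / (1.0 + x * x);        /* midpoint rule         */
        }

        double pi = sum * width;
        printf("pi ~= %.9f  (%.3f seconds, up to %d threads)\n",
               pi, omp_get_wtime() - start, omp_get_max_threads());
        return 0;
    }

Other factors being equal, on a quad-core machine this loop can run up to four times faster than its sequential counterpart, which is exactly the potential that a purely sequential program leaves untapped.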

Implications for CS Education

Prior to 2006, parallel computing was a niche area, mainly for researchers and specialists in high performance computing. As a result, parallel (or high performance) computing was an elective area in the 2001 ACM/IEEE CS Curriculum, and relatively few universities offered undergraduate courses on the subject.

However, with the advent of the multicore era in 2006, parallel multiprocessors suddenly became inexpensive. Virtually all CPUs today are multicore CPUs, and because of this ubiquity, all computer science graduates need at least a basic understanding of parallel computing. Put differently, multiprocessors are the de facto hardware foundation on which virtually all of today’s software runs. Developers who are still writing sequential software are writing for yesterday’s hardware foundation, not today’s; and CS programs that teach their students to think solely in terms of sequential computing are doing their students a disservice.

In recognition of this seismic shift in the computing landscape, the ACM/IEEE CS Curriculum 2013 (CS 2013) contains a new knowledge area named Parallel and Distributed Computing (PDC), with 15 hours of topics in the core CS curriculum. The Systems Fundamentals knowledge area also includes additional hours of PDC topics.

It is worth mentioning that PDC topics extend well beyond the concurrency-related topics (multithreading, synchronization primitives, monitors, etc.) that have been in the core CS curriculum for years. The new PDC knowledge area includes topics like performance, speedup, computational efficiency, scalability, Amdahl’s law, and so on.
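To make one of those topics concrete: Amdahl’s law says that if a fraction p of a program’s running time can be parallelized, the best possible speedup on N cores is 1 / ((1 - p) + p/N). The short sketch below is my own illustration (plain C, with an assumed parallel fraction of 90% chosen purely for the example); it tabulates that bound and the corresponding efficiency as the core count grows.

    /* amdahl.c: tabulate the speedup and efficiency bounds given by
     * Amdahl's law for a program whose parallel fraction is 90%.     */
    #include <stdio.h>

    /* Best-case speedup on n cores when fraction p of the work is parallel. */
    double amdahl_speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    int main(void) {
        const double p = 0.90;               /* assumed parallel fraction */
        printf("cores  speedup  efficiency\n");
        for (int n = 1; n <= 64; n *= 2) {
            double s = amdahl_speedup(p, n);
            printf("%5d  %7.2f  %9.2f%%\n", n, s, 100.0 * s / n);
        }
        return 0;
    }

Even with 90% of the work parallelized, the speedup levels off well below the core count (it can never exceed 1/(1 - p) = 10 here, no matter how many cores are added), which is why speedup, efficiency, and scalability deserve explicit treatment in the core curriculum.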

A Professional Development Opportunity

This shift of PDC from an elective to the CS core may seem like a daunting challenge for computer science educators, since few CS faculty members have prior experience, much less expertise, in parallel computing. Fortunately, there are professional development opportunities to help faculty members learn about this exciting area.

One such opportunity is the CSinParallel Chicago 2015 Regional CS Educators Workshop, which is taking place Aug. 3-6, 2015 at Loyola University in Chicago, IL. The workshop is organized by CSinParallel, an NSF-funded project that is one of the parallel computing resources mentioned in the CS 2013 report. (Full disclosure: I am one of the PIs on the CSinParallel project.)

This 3-day workshop will introduce attendees to software technologies such as OpenMP for shared-memory multithreading; MPI for distributed-memory multiprocessing; WebMapReduce, a front-end for Hadoop; Nvidia’s CUDA for GPU computing; and others. The workshop schedule includes introductions to these technologies, different models for integrating them into the CS curriculum, and time for self-paced, hands-on exploration of them. There will also be a session on the many parallel and distributed hardware platform options available today, including networks of workstations, local inexpensive Beowulf cluster options, remote high performance Beowulf clusters, national supercomputing resources like XSEDE, and cloud services like Amazon’s EC2. Regional travel support, housing, and food will be available on a first-come-first-served basis. More information is available at the CSinParallel website, using the link at the beginning of this paragraph.
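For readers who have not yet used these technologies, the difference between the shared-memory and distributed-memory styles is easy to see in a few lines of code. The sketch below is my own illustration rather than workshop material; it redoes the earlier pi calculation with MPI and assumes an MPI implementation such as MPICH or Open MPI, compiled with mpicc and launched with mpirun. Every process runs the same program and computes its own slice of the work, and the partial results are combined by an explicit message-passing call.

    /* mpi_pi.c: a distributed-memory sketch of the same computation, using MPI.
     * Assumed build and launch commands:
     *   mpicc -O2 mpi_pi.c -o mpi_pi
     *   mpirun -np 4 ./mpi_pi                                                  */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char** argv) {
        const long STEPS = 100000000;
        const double width = 1.0 / STEPS;
        int id, numProcs;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &id);        /* which process am I? */
        MPI_Comm_size(MPI_COMM_WORLD, &numProcs);  /* how many of us?     */

        /* Each process sums only its own share of the rectangles. */
        double localSum = 0.0;
        for (long i = id; i < STEPS; i += numProcs) {
            double x = (i + 0.5) * width;
            localSum += 4.0 / (1.0 + x * x);
        }

        /* Combine the partial sums on process 0; unlike the shared-memory
         * OpenMP version above, the communication here is explicit.       */
        double globalSum = 0.0;
        MPI_Reduce(&localSum, &globalSum, 1, MPI_DOUBLE,
                   MPI_SUM, 0, MPI_COMM_WORLD);

        if (id == 0) {
            printf("pi ~= %.9f using %d processes\n", globalSum * width, numProcs);
        }
        MPI_Finalize();
        return 0;
    }

The workshop explores these and the other technologies listed above in much greater depth, and provides hands-on time to experiment with each of them.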

If you wish to attend this workshop, please complete this online application form. Whether you are experienced or just getting started in PDC, this workshop offers a chance to spend three days immersed in PDC, learning new skills, and building relationships with other CS educators who are interested in preparing our students for our parallel world. I hope to see you there!
