Home → Magazine Archive → September 2014 (Vol. 57, No. 9) → The Network Is Reliable → Abstract

The Network Is Reliable

By Peter Bailis, Kyle Kingsbury

Communications of the ACM, Vol. 57 No. 9, Pages 48-55

[article image]

back to top 

"The network is reliable" tops Peter Deutsch's classic list of "Eight fallacies of distributed computing," all [of which] "prove to be false in the long run and all [of which] cause big trouble and painful learning experiences" (https://blogs.oracle.com/jag/resource/Fallacies.html). Accounting for and understanding the implications of network behavior is key to designing robust distributed programs—in fact, six of Deutsch's "fallacies" directly pertain to limitations on networked communications. This should be unsurprising: the ability (and often requirement) to communicate over a shared channel is a defining characteristic of distributed programs, and many of the key results in the field pertain to the possibility and impossibility of performing distributed computations under particular sets of network conditions.

For example, the celebrated FLP impossibility result9 demonstrates the inability to guarantee consensus in an asynchronous network (that is, one facing indefinite communication partitions between processes) with one faulty process. This means that, in the presence of unreliable (untimely) message delivery, basic operations such as modifying the set of machines in a cluster (that is, maintaining group membership, as systems such as Zookeeper are tasked with today) are not guaranteed to complete in the event of both network asynchrony and individual server failures. Related results describe the inability to guarantee the progress of serializable transactions,7 linearizable reads/writes,11 and a variety of useful, programmer-friendly guarantees under adverse conditions.3 The implications of these results are not simply academic: these impossibility results have motivated a proliferation of systems and designs offering a range of alternative guarantees in the event of network failures.5 However, under a friendlier, more reliable network that guarantees timely message delivery, FLP and many of these related results no longer hold:8 by making stronger guarantees about network behavior, we can circumvent the programmability implications of these impossibility proofs.


No entries found