Research and Advances
Artificial Intelligence and Machine Learning Review Articles

A Linearizability-based Hierarchy for Concurrent Specifications

Two linearizability-style correctness conditions that can be used to argue safety properties of progressively more concurrent behaviors of objects.
Posted
  1. Introduction
  2. Key Insights
  3. Specifying and Implementing a Concurrent Object
  4. Set-Linearizability
  5. Interval-Linearizability
  6. Conclusion
  7. Acknowledgments
  8. References
  9. Authors
  10. Footnotes
  11. Sidebar: 1. Sequential Specifications
  12. Sidebar: 2. Recalling the Notion of Linearization
  13. Sidebar: 3. The Nonblocking Property and Nonblocking Progress Conditions
colored wires, illustration

In the early 2000s, the multicore revolution began, when it became difficult to increase the clock speed of microprocessors, and manufacturers shifted to the approach of increasing performance through multiple processing cores per chip. The “free lunch” of relying on increasing clock speed to obtain faster programs was over. Unfortunately, developing multi-process programs can be dramatically more difficult. The difficulty of reasoning about many things happening at the same time is compounded by the fact that processes are executed in a highly asynchronous and unpredictable way, possibly even crashing, and furthermore, affected by low-level architectural details. An approach strongly advocated to cope with these difficulties was elegantly presented by Shavit:36

Back to Top

Key Insights

  • While is the de facto correctness condition to reason about safety properties in concurrent it deals only with sequential specifications.
  • Reducing the complexity of reasoning about concurrency through sequential thinking has limitations, which can be dealt with through two natural, progressively more general generalizations of linearizability.
  • Set-linearizability is a correctness condition used to reason about specifications of sets of operations happening at the time. Interval-linearizabilty can be used for specifications about operations that can overlap in time in arbitrary ways. Both retain the properties of linearizability that have made it so popular: Proving the correctness of individual objects implies the correctness of the whole system; it is possible to reason about nonblocking implementations; they are state based.

“It is infinitely easier and more intuitive for us humans to specify how abstract data structures behave in a sequential setting, where there are no interleavings. Thus, the standard approach to arguing the safety properties of a concurrent data structure is to specify the structure’s properties sequentially.”

Providing the illusion of a sequential computation from the users’ perspective has been used since the early seminal works that paved the way of modern distributed systems, for example, by Lamport,27 as well as in the subsequent advances in defining general, practical correctness conditions, most notably, linearizability introduced by Herlihy and Wing.24 The successful history of reducing the complexity of concurrency through sequential thinking spans over half a century but may be reaching its limits.31

First limitation: Inherently concurrent problems. It is not clear that providing users with the illusion of a sequential computation should go as far as implementing solutions only to sequential problems. It is not obvious because some problems are inherently concurrent. Consider for example a ticket reservation system, say for seats at a concert. There is nothing wrong with taking care of two reservations of different seats concurrently, and it may be more efficient than serializing the two reservations. Furthermore, it may be acceptable to let both reservations happen at the same time from the users’ perspective, namely, none of them went first. Linearizability precludes such concurrent problems from being implemented in a distributed system simply because, by definition, linearizability is a tool to show that a concurrent algorithm implements a problem specified through a sequential specification (see sidebars 1 and 2). In the case of the ticket reservation example, a sequential version would be a queue. In what sense is a queue a “sequential version” of the data structure of the example? This is an interesting philosophical question in its own, in any case, a sequential version may artificially change the semantics of the problem.

There are examples of distributed problems that do not have any sequential version that could even remotely mimic their behavior. Consider Java’s Exchanger object, which allows two processes (threads) to atomically exchange a value if invocations are concurrent. In a sequential version, the object would always return that the exchange failed to take place.

Second limitation: The penalty of sequential specifications. The second reason for doubting the axiom of illusion of sequentiality is that linearizable implementations of sequential specifications may be expensive, or even impossible to implement. A classic result is the impossibility of solving consensus by asynchronous processes that may crash, using only simple read/write primitives.15,28 It is impossible to build concurrent implementations of some of the classic sequential specifications (for example, sets, queues, stacks) that completely eliminate the use of expensive synchronization primitives.5 Finally, there are formal complexity lower bounds for some sequential objects.11 As a result, distributed system designers in some cases have to give up either the idealized goals of scalability and availability, or relax the consistency linearizability requirement.

Three benefits of linearizability. Despite these limitations, programmers are afraid of relaxing consistency for very real and concrete technical reasons. Building large systems demands composability, and up to now linearizability is the de facto standard because it allows us to do so: it is sufficient to prove that each of the implementations of concurrent objects is linearizable to guarantee that the whole system of multiple objects is also linearizable. Additionally, linearizability is particularly attractive for reasoning about nonblocking implementations (see Sidebar 3). Finally, in any linearizable implementation there is a well-defined notion of the state of the system at any time, which in turn facilitates writing correctness proofs.22

The universe of concurrent object specifications. Numerous correctness conditions have been proposed over the years. More recently, algorithms implementing concurrent objects have been adapted to cope with multicore processors with relaxed memory architectures, requiring new correctness conditions. For example, Viotti and Vukolic39 present a formal framework for defining correctness conditions for multicore architectures, covering both standard conditions for totally ordered memory and newer conditions for relaxed memory. Yet, the sequential paradigm is so entranced that correctness of concurrent implementations is understood in terms of conditions that determine relationships between concurrent executions of an implementation and sequential executions of the object being implemented. Even recently proposed correctness conditions for recoverable objects in multicore architectures with durable persistent memory are based on sequential specifications, for example, strict linearizability,1 recoverable linearizability4 and durable linearizability.25

Concurrent specifications. The desire for truly concurrent semantics is old, going back to Lamport,27 where a specification of a concurrent object is simply the set of all the concurrent executions that are considered correct. Reasons for the desire of concurrent specifications have been argued since at least the seminal work of Montanari:29 concurrent specifications are more informative, testing sequences defining partial orderings may carry the same information as an exponentially larger number of interleaving traces. Another reason is that in truly concurrent models the existing fine parallelism of the application is fully specified. In some situations, truly concurrent semantics is the most natural, as in Petri nets. Other important works presenting concurrent semantics have been proposed.14

Linearizability-based concurrent specifications. The Linearizability framework (see sidebars 1 and 2) can be progressively extended to specify more concurrent objects, as illustrated in Figure 1. The automaton used in a sequential specification is modified to specify set-sequential objects, where a transition can be labeled with more than one operation taking place at the same time. Further, interval-sequential objects are defined with an automaton where operations overlap in arbitrary ways. The aim is to define concurrent specifications while preserving the notion of state. Set-linearizability and interval-linearizability, the associated correctness conditions, define a way of associating a concurrent execution with a concurrent specification, either of a set-sequential or an interval-sequential object. Intuitively, we move from linearization points to linearization sets of points, and more generally to linearization intervals, representing the overlapping in time among concurrently executed operations.

f1.jpg
Figure 1. A linearizability-based hierarchy.

The goal of this article is to describe set-linearizability and interval-linearizability, stressing the benefits of linearizability are kept—both are composable, state-based, and nonblocking conditions. First, set-linearizability,20,30 where more than one operation can be linearized at the same point. Then, interval-linearizability,8 where operations can overlap in arbitrary ways. In fact, interval-linearizable specifications have been shown to be the most expressive ones.17

Back to Top

Specifying and Implementing a Concurrent Object

The lattice agreement problem has been actively investigated recently. Unlike consensus, it is implementable in an asynchronous system where processes can fail. Several papers have presented replicated state machine implementations based on lattice agreement,12,40 instead of the usual consensus-based implementations. Recently, Kuznetsov, Rieutord, and Tucci-Piergiovanni26 showed how to implement re-configurable lattice agreement and explained how it can directly be used to obtain re-configurable versions of several sequential types such as max-register, conflict detector, and in fact, any state-based commutative abstract data type (ADT).

A join-semilattice is a tuple (L, ⊑), where L is a set partially ordered by the binary relation ⊑, such that for all elements of x, yL, there exists a least upper bound, called join. For the purposes of this exposition, we take L as the set with all finite sets of natural numbers and ⊑ as the subset relation, with the join of x, y being xy.

Specifying lattice agreement. The lattice agreement abstraction is presented in26 in the style that has been traditionally used in distributed computing: a list of requirements that operations must satisfy. An operation propose (x) invoked by process p with input xL, returns a value v’L, such that:

Validity. If a propose (v) operation returns v’ then v’ is the join of some proposed values including v and all values returned by previous operations.

Consistency. The values returned are totally ordered by ⊑.

In addition, there is a progress requirement, that will not be central here; a common one is if a process invokes a propose operation and does not fail then the operation eventually returns (see Sidebar 3).

This specification is not very formal. The idea of using this style of specification started with the goal of describing objects, by which it is usually meant a sequential specification (Sidebar 1). Consider an automaton defining our lattice agreement example as a sequential object. A transition from s to s’ would be labeled with propose (v) → v’, meaning that if the object is in state s, and a process that invokes propose (v) could get back v’ in case the object moves to state s’.

The first challenge is to come up with such a sequential automaton, which would be a formal specification of the above informal list of requirements. A natural one would identify each state with a pair of subsets of L, to remember all the elements that have been proposed and those that have been returned so far. The initial state is s0 = (∅, ∅), and for the transition from s = (s1, s2) to s’, we would have that s’ = (s1v, s2v’). The reader can easily complete the formal specification of the sequential automaton.

Identifying correct implementations. The second challenge is to check the correctness of an execution of the object against the sequential automaton specification, following the linearizability definition (Sidebar 2). In Figure 2, an example of an execution for three processes, p1, p2 and p3, is represented, where one can see that operation calls (from invocation to response) overlap in time. Linearizability requires to find a linearization point in between each invocation and response, so that these points define a valid sequential execution of the lattice agreement automaton. In the example of Figure 2, the linearization points do not produce a valid sequential execution because the operation call by p1 cannot possibly return the value 2, which has not yet been proposed. Furthermore, from a linearizability point of view, the execution is incorrect, because there are no linearization points that satisfy the automaton specification: any way we order the operations by p1 and p2 would give an incorrect response for one of them. The same problem occurs with respect to any sequential specification of lattice agreement.a

f2.jpg
Figure 2. Example of a non-linearizable lattice agreement execution.

But what is wrong with the execution of Figure 2? It certainly satisfies the lattice agreement consistency requirement stated above. The problem is the validity requirement—the execution would violate it with respect to any sequential specification of lattice agreement. Validity seems to assume a priori that no operations are invoked concurrently. But the whole point of the lattice agreement state machine replication idea was to avoid using consensus to order operations!

There are several lattice agreement implementations.26,40 For illustration, consider the simple one-shot lattice agreement implementation using read/write primitives on a shared memory in Figure 3 (adapted from 8); one-shot means that each process invokes only once the propose operation.b In the algorithm, each process first writes its proposal in a dedicated entry of the shared memory (Line 1), and then repeatedly reads the whole memory and computes the join of the proposals that have been written so far, until it sees no new relevant proposal (Lines 3 to 7). The execution of Figure 2 can be produced by this algorithm, if p1 and p2 write their proposals (in Line 1, in any order), and then both execute the loop of Line 4 twice, both returning {1, 2}, before p3 starts executing its code.

f3.jpg
Figure 3. A one-shot lattice agreement implementation based on read/write primitives (code of process pi).

From sequential objects to truly concurrent objects. What is lattice agreement, given that it has no sequential specification? What problem is the algorithm of Figure 3 solving? One encounters publications with similar situations: a list of requirements are used to specify a problem with no sequential specification, and for the lack of a name to such an entity, researchers have either used different ad hoc names such as “abstraction” or “problem” or “type,” or simply called it an “object with no sequential specification,” without further explanation of what this might be.

Finally, in 1994, Neiger30 came up with an idea largely overlooked in the literature. So overlooked, that some 20 years later Hemed, Rinetzky and Vafeiadis20 independently rediscovered it—the idea of a set-sequential specification (called concurrency-aware specification20). The transitions of the automaton specification are labeled with sets of operation invocations, each one together with its response, as in Figure 4.

f4.jpg
Figure 4. Part of a set-sequential automaton.

The corresponding correctness condition is set-linearizability. Its aim is to allow for the simultaneity of some operations: one can put linearization points grouping together several operations at the same moment of time in a set. Figure 5 shows a set-linearization of the execution we have been considering. It can be tested against the set-sequential automaton illustrated in Figure 4 to show that it is a correct execution, that is, set-linearizable.

f5.jpg
Figure 5. A set-linearizable lattice agreement execution.

But now let us consider the execution of Figure 6. Again, it seems to satisfy the consistency requirement of lattice agreement, and it seems correct with respect to an intuitive interpretation of the validity requirement. Furthermore, again there is an algorithm that can produce it, namely the one in Figure 3. As before, p1 and p2 execute their write operations (Line 1) concurrently, but now p1 executes alone the loop of Line 4 twice, while p2 is delayed, then p3 executes its write operation, and finally both p2 and p3 execute the loop of Line 4 twice.

f6.jpg
Figure 6. An interval-linearizable execution of lattice agreement that is not set-linearizable.

However, the execution of Figure 6 is not only not linearizable but also is not set-linearizable. The reason is the operations by p1 and p3 are not concurrent, and hence they cannot be set-linearized together. But the operation of p2 must be set-linearized with both because its proposed value has been returned by the operation of p1, and it has returned the value proposed by the operation of p3. Namely, the operation of p2 could not have taken effect at a single point of time.

This type of example motivated us to propose in Castañeda et al.8 one further generalization of linearizability—interval-linearizability. The corresponding generalization of a set-sequential object is an interval-sequential object. It is defined in terms of an automaton whose transitions are labeled with sets of operation invocations, but each such invocation is not necessarily matched with a response; the response can appear later in another transition. The interval of each operation is now marked with either one linearization point (in case it appears to be executed instantaneously) or with two linearization points (in case it overlaps with at least two other non-overlapping operations). An example of part of such an automaton for lattice agreement is presented in Figure 7. This automaton validates the execution of Figure 6. Notice that the operation of p2 is invoked in the first transition, and its corresponding response appears in the second transition, concurrently with the operation of p3 (a single point).

f7.jpg
Figure 7. Part of an interval-sequential automaton.

Hence, interval-linearizability extends set-linearizability by allowing time-ubiquity of operations—an operation can appear as being executed concurrently with several consecutive, non-overlapping operations.


Interval-linearizability extends set-linearizability by allowing time-ubiquity of operations—an operation can appear as being executed concurrently with several consecutive, non-overlapping operations.


Keeping the benefits of linearizability. We stress that the extensions of linearizability to set-linearizability and interval-linearizability are not done at the price of losing any of its three properties, as proved in Castañeda et al.8 First, both are state-based specifications, which is useful for documentation and correctness proofs. Second, they are composable, can safely use several linearizable, set-linearizable or even interval-linearizable object implementations because their composition will maintain the corresponding property. Third, the non-blocking property of linearizability also is preserved (see Sidebar 3).

We proceed now to explore in more detail set-linearizability and then interval-linearizability, with additional examples.

Back to Top

Set-Linearizability

As discussed, the idea is to allow pre-defined subsets of operations to be seen as occurring simultaneously; such a set of operations is called a concurrency class. Hence, set linearizability is associated with operation simultaneity. A set-sequential object is specified by an automaton whose transitions are labeled with concurrency classes. It defines a set of valid set-sequential executions, each one consisting of a sequence of concurrency classes. The corresponding correctness notion, set-linearizability, allows several operation calls to be linearized at the same linearization point, namely, all the operations belong to the same concurrency class.

Observe that when each concurrency class consists of a single operation, set-linearizability boils down to linearizability (recall Figure 1). Moreover, the containment is strict, since there are set-sequential objects with no sequential specification, such as the two different set-sequential objects.

The exchanger object. The Java documentation provides the following specification:

“A synchronization point at which threads can pair and swap elements within pairs. Each thread presents some object on entry to the exchange method, matches with a partner thread, and receives its partner’s object on return.”

Clearly there is no sequential specification of an exchanger. Such a specification is outside the domain of linearizability simply because linearizability rules out concurrency. An exchanger however can be specified as a set-sequential object whose set-sequential executions contain a concurrency class for each pair of operation calls exchanging elements, and a concurrency class for every operation call that is not able to exchange its element. Figure 8 depicts an example of a set-linearizable execution of a concurrent exchanger implementation.

f8.jpg
Figure 8. Example of a set-linearizable exchanger execution.

Exchangers are useful synchronization objects that have been used in several concurrent implementations,23,35,37 however, the lack of a sequential specification of an exchanger makes correctness proofs intricate. As a concrete example, consider the scalable and linearizable elimination backoff stack implementation of Hendler, Shavit, and Yerushalmi.23 Very briefly, the idea in this stack implementation is as follows: whatever the state of the stack, two concurrent push(x) and pop() invocations can be “eliminated” if the pop() operation returns x, since the pop operation can be linearized right after push(x); if an operation does not find a concurrent operation to be eliminated with, it uses a “slower” stack implementation to complete its invocation. The elimination scheme is implemented through an array of exchanger objects where operations try to exchange elements.

Having a formal specification of the exchanger object is important for developing modular verification techniques for concurrent implementations. The set-sequential specification of the exchanger object has been exploited in Hemed et al.20 to obtain a modular proof of the elimination backoff stack. Namely, the exchanger objects used in the elimination scheme are independently shown to be correct (that is, set-linearizable) and then the elimination back-off stack is shown to be linearizable assuming the elimination scheme is made of set-linearizable exchanger objects. Thus, the correctness of the elimination back-off stack does not rely on any implementation of the elimination scheme.

A relaxed queue. Despite of its benefits, linearizability has drawbacks, beyond the fact there are inherently concurrent objects without a sequential specification. There are impossibility results for several concurrent objects, like sets, stacks, queues, and work-stealing, showing that any linearizable implementation must use synchronization mechanisms that can be implemented only through expensive instructions of current multicore architectures. These synchronization mechanisms are the read-modify-write primitives, like fetch&inc, swap, and compare&swap, and the read-after-write synchronization pattern (also known as the flag principle22,32), in which a process writes in a shared variable A and then reads another shared variable B.c

Herlihy21 showed that any linearizable nonblocking implementation of a queue or stack cannot use only the simple read/write primitives, it must use more powerful read-modify-write primitives. In the same direction, Attiya et al.5 proved that any linearizable implementation with the minimal progress guarantees of a set, stack, queue, or work stealing must use either read-after-write synchronization patterns or read-modify-write primitives.

Recently, set-linearizability has been used to define relaxations of queues and stacks that admit set-linearizable implementations that use only read/write primitives and without read-after-write synchronization patterns, hence evading the impossibility results. Intuitively, in a queue with multiplicity,9 the usual definition of a sequential queue is relaxed in a way that distinct dequeue operation calls can return the same item, but this can happen only if they are concurrent. Then, dequeue operations returning the same item belong to the same concurrency class. In all other cases, the object behaves like a usual sequential queue. The expressiveness of set-linearizability allows to precisely specify that the relaxation can happen only in case of concurrency.

As an example of a set-linearizable implementation, consider the simple implementation of a single-enqueuer queue with multiplicity in Figure 9. Single-enqueuer means there is one distinguished process, called the enqueuer, that can invoke the enqueue operation. As we will explain, the implementation uses only read/write primitives and is devoid of read-after-write synchronization patterns. (For clarity, we consider here the single-enqueuer case, but it has been shown there are set-linearizable implementations for the multi-enqueuer case with similar properties9). It merits mention that the impossibility results in Attiya et al.5 and Herlihy21 apply also for the single-enqueuer case. The implementation in Figure 9 is derived from the implementations in Castañeda et al.,7 where work-stealing with multiplicity is studied, and shown to be useful to derive relaxed work-stealing implementations with better performance than classic (that is, non-relaxed) work-stealing solutions, when solving problems such as parallel spanning tree.

f9.jpg
Figure 9. A set-linearizable implementation of a single-enqueuer queue with multiplicity (code for process pi).

The simple-enqueuer implementation in Figure 9 uses a shared array ITEMS where items are stored in/removed from, and two shared integers, TAIL and HEAD, to store the current head and tail of the queue. ITEMS and TAIL are manipulated through simple read/write primitives, while HEAD is manipulated by dequeuers through the max _ read and max _ write linearizable operations: max _ read returns the maximum value written so far in HEAD and max _ write writes a new value in HEAD only if it is greater than the largest value that has been written so far. Aspnes, Attiya, and Censor-Hillel have proposed wait-free (see Sidebar 3) linearizable implementations of max _ read and max _ write that use only read/write primitives and are devoid of read-after-write synchronization patterns,2 and thus the implementation in Figure 9 possesses these properties too.

In the implementation, whenever the enqueuer wants to enqueue an item, it first reads the current value t of TAIL, then stores its item x in ITEMS[t] and finally increments TAIL by one (Lines 9 to 11). A dequeue operation first reads the current value h of HEAD using max _ read and then reads the value x in ITEMS[h] (Line 13 and 14); if x is distinct from ⊥, then x is an item that has been enqueued and the operation return x, after it increments HEAD by one using max _ write which logically “marks” the item in position ITEMS[h] as taken (Lines 16 and 17), otherwise x is equal to ⊥, which means that the queue is empty as HEAD has “surpassed” TAIL, and hence the operations returns empty (Line 19).

Two or more concurrent enqueue operation calls can return the same item. For example, the operations can read one after the other, in some arbitrary order, the same value h from HEAD in Line 13 and then read one after the other, again in some order, the value in ITEMS[h] in Line 14. Namely, all these operations read the item in ITEMS[h] before the first of them “marks” the item in ITEMS[h] as taken by updating HEAD using max _ write in Line 16. Due to the semantics of max _ read and max _ write, HEAD only “moves forward,” hence a “slow” dequeue operation cannot write a small value in HEAD that could cause another (possibly non-concurrent) dequeue operation to return an item that has already been dequeued. Therefore, enqueue operation calls that return the same item can be linearized at the same linearization point, that is, in the same concurrency class.

Back to Top

Interval-Linearizability

Set-sequential specifications are more expressive than sequential ones, but there are situations where an operation appears as being executed concurrently with a sequence of several sequentially executed operations, as discussed previously. An interval-sequential specification8 defines a sequence of concurrency classes, with the possibility that an operation has its invocation in one concurrency class and its response in a later concurrency class, implying that it executed over an interval of time instead of a single point. Figure 10 takes the view of a poset determined by the operation intervals, and the corresponding order diagram, for the three types of specifications.d

f10.jpg
Figure 10. Interval poset vs. order diagram.

The batched counter. The rapid increase of data production nowadays naturally asks for parallelization of computations in big data processing systems, to achieve timely responsiveness. A common task in such systems is that of counting events in batches (for example, number of queries of a website) for doing some statistical analysis later. The task is modeled in the batched counter, a sequential object that stores an integer R initialized to 0, and provides two operations, update(x) that increments R by x, and query() that returns the current value of R. One would like to have a concurrent implementation that allows several processes to update and query the counter concurrently and rapidly. It turns out that linearizability prevents the existence of efficient linearizable implementations of the batched counter using the simple read/write primitives. Rinberg and Keidar33 proved that for any linearizable read/write implementation of batched counter for n processes that is wait-free, the update operation (arguably the most frequently called operation in a big data processing system) has step complexitye Ω(n). This lower bound calls for well-defined relaxations of the objects that admit efficient implementations.


Set-sequential specifications are more expressive than sequential ones, but there are situations where an operation appears as being executed concurrently with a sequence of several sequentially executed operations.


Indeed, Rinberg and Keidar proposed a relaxation of the batched counter that has a wait-free read/write implementation with constant step complexity of its update operation. The implementation appears in Figure 11. The relaxation is formally defined through the intermediate value (IV) linearizability formalism introduced in Rinberg et al.,33 an extension of linearizability for quantitative data processing sequential objects. Loosely speaking, IV-linearizability admits update operation calls to return a value that approximates the correct response; the sequential specification of the object we seek to implement defines the correct responses. An execution is IV-linearizable if the output of every update operation lies in an interval defined by two sequential executions of the object.

f11.jpg
Figure 11. An interval-linearizable implementation of a batched counter (code for process pi).

The relaxed batched counter can be specified by an interval sequential automaton. In each valid interval-sequential execution, update operations happen atomically and sequentially, namely, each such operation spans a single concurrency class, and no more than one update operation appears in a concurrency class. A query operation, however, can span several concurrency classes, and its output value lies in the interval defined by the contents of the counter when the operation starts and terminates, respectively; a query operation can also appear in a single concurrency class, denoting that it is not concurrent with any other operation, and hence it must return the current value of the counter in this case.

Figure 12 depicts an interval-linearizable execution of the relaxed batched counter. The first query operation by p3 returns value 10 because it is implemented by reading the update(2) of p1 and then the two update operations of p2. This query() → 10 operation cannot be linearized at a single point: update(1) of p1 must happen before the query() → 8 of p2, which in turn happens before the update(3) of p2.

f12.jpg
Figure 12. An interval-linearizable execution of a batched counter object.

The implementation of the batched counter in Figure 11 from Rinberg et al.33 is indeed interval-linearizable. In an interval-linearization of an execution of the implementation, the update operation calls are sequentially ordered according to the moment they execute Line 21 (each operation has its own concurrency class), while each query operation call is interval-linearized to the interval of that sequence that spans the update operations that are concurrent to it.

To conclude our example, we observe that the implementation in Figure 11 has a drawback—the step complexity of its query operation is linear in the number of processes n. This drawback can be solved as follows.

First, we note that using the atomic read-modify-write fetch&inc primitive, it is easy to obtain a linearizable implementation of the batched counter with constant step complexity in both its update and query operations. The fetch&inc(R, d) primitive atomically returns the current value of R and adds d to R. In a simple linearizable implementation of the batched counter, there is a shared variable A initialized to zero, and update(xi) simply performs fetch&inc(A, xi), while update(xi) simply reads A, that is, read(A). Despite the good theoretical properties of this simple implementation, it does not perform well in practice as all processes work on the shared variable A which becomes a bottleneck, creating high contention in real multi-core architectures.

An intermediate solution between the linearizable solution, and the one in Figure 11 consists in having an array A of length K (instead of length n as in the implementation in Figure 11), where K is a system-dependent constant (or maybe a sublinear function); update(xi) first randomly picks an entry A[ki] of A and performs fetch&inc(A[ki], xi), while update(xi) returns the sum of the K entries of A, similarly to update in Figure 11. The idea is to randomly spread the contention over the distinct components of A. The implementation has good properties: it retains wait-freedom and interval-linearizability and has a constant step complexity in both operations.

Completeness results for interval-linearizability. It is known that interval-sequential specifications are complete in the sense that they are powerful enough to specify any concurrent object given by a set of concurrent executions, that is, sequences of invocations and responses. We will say that such specifications are set-based. Arguably, set-based specifications, proposed by Lamport,27,f are the most general way to define a concurrent object. Such a set specifies all the concurrent behaviors that are considered valid for any concurrent algorithm implementing the object. For example, the set-based specification of a FIFO queue contains all executions that are linearizable, while the set-based specification of a FIFO queue with multiplicity contains all executions that are set-linearizable.

It turns out that interval-sequential specifications can model any set-based specification having some reasonable properties, like non-emptiness, prefix-closure and well-formedness (that is, each process alternates between issuing invocations and responses, starting with an invocation). The result was originally proved in8 under some assumptions, and later generalized by Goubault, Ledent, and Mimram.17 Furthermore, they prove that in any reasonable computational shared memory model, every algorithm for a given set-based specification must satisfy all properties. Therefore, in a formal sense, interval-sequential specifications are fully general.

Back to Top

Conclusion

This article has presented two known extensions of linearizability—set-linearizability (that captures simultaneity) and interval-linearizability (that captures time-ubiquity), together with the corresponding formalisms to define more general concurrent objects: set-sequential automaton and interval-sequential automaton. This extended linearizability framework preserves the benefits of composability, nonblockingness, and the notion of state. We surveyed recent work that has already been taking advantage of this approach, but there seem to be many more opportunities.

There is a very active current trend by practitioners to move away from sequential specifications, due to the performance limitations, and even for simplicity, where it would be interesting to explore the use of the extended linearizability framework. Notable is the history of blockchain technology, which started with Bitcoin and its paradigm of sequentializing all monetary transactions in the system via tremendously energy consuming consensus mining algorithms, toward recent efforts of allowing for concurrent ledgers cooperation,38 and ledgers restricted to only monetary transactions that do not need consensus.6,10 The project CALM of Hellerstein and Alvaro focuses on the class of programs that can achieve distributed consistency without the use of coordination.19 Conflictfree replicated data types34 provide another interesting direction for future work.16,26 Their benefits of commutativity have been extended to composable libraries and languages, enabling programmers to reason about correctness of whole programs in languages like Bloom.19 In the context of distributed storage systems, large-fragmented objects with relaxed read operations have been introduced in Fernández et al.,13 which admit efficient implementations. Another recent trend is relaxed specifications.18 There have been several studies on relaxation in the shared-memory context, focusing on SkipLists, log-structured merge trees and other sequential data structures. Another line of research consists of looking for possible links between the presented linearizability hierarchy and the notions of strict linearizability,1 durable linearizability,25 recoverable linearizability.4

Back to Top

Acknowledgments

This work has been partially supported by the French projects BYBLOS (ANR-20-CE25-0002-01) and PriCLeSS (ANR-10-LABX-07- 81) devoted to the design of modular distributed computing building blocks, and UNAM-PAPIIT projects IN106520 and IN108720.

Back to Top

Back to Top

Back to Top

Back to Top

Back to Top

Back to Top

    1. Aguilera, M.K. and Frølund, S. Strict linearizability and the power of aborting. Technical Report HPL-2003-241. Hewlett-Packard Labs.

    2. Aspnes, J., Attiya, H., and Censor-Hillel, K. Polylogarithmic concurrent data structures from monotone circuits. J. ACM 59, 1 (2012), 2:1–2:24.

    3. Afek, Y., Attiya, H., Dolev, D., Gafni, E., Merritt, M., and Shavit, N. Atomic snapshots of shared memory. J. ACM 40, 4 (1993), 873–890.

    4. Attiya, H., Ben-Baruch, O., and Hendler, D. Nesting-Safe Recoverable Linearizability: Modular Constructions for Non-Volatile Memory. In Proceedings of the ACM Symp. Principles of Distributed Computing. ACM Press, 2018, 7–16.

    5. Attiya, H., Guerraoui, R., Hendler, D., Kuznetsov, P., Michael, M.M., and Vechev, M.T. Laws of order: expensive synchronization in concurrent algorithms cannot be eliminated. In Proceedings of the 38th ACM SIGPLAN-SIGACT Symp. Principles of Programming Languages. ACM Press, 2011, 487–498.

    6. Auvolat, A., Frey, D., Raynal, M., and Taïani, F. Money transfer made simple. Bulletin of the European Assoc. Theo. Computer Science, 2020, 132:22–43.

    7. Castañeda, A. and Piña, M. Fully read/write fence-free work-stealing with multiplicity. In Proceedings of 35th Symp. Distributed Computing. LIPICS Series, 2021, 16:1–16:20.

    8. Castañeda, A., Rajsbaum, S., and Raynal, M. Unifying concurrent objects and distributed tasks: interval-linearizability. J. ACM 65, 6 (2018), Article 45.

    9. Castañeda, A., Rajsbaum, S., and Raynal, M. Relaxed queues and stacks from read/write operations. In Proceedings of 24th Intern Conf. on Principles of Distributed Systems. LIPICS 184 (2020), 13:1–13:19.

    10. Collins, D., et al. Online payments by merely broadcasting messages. In Proceedings of the 50th IEEE/IFIP Intern. Conf. on Dependable Systems and Networks. IEEE Press, 2020, 26–38.

    11. Ellen, F., Hendler, D., and Shavit, N. On the inherent sequentiality of concurrent objects. SIAM J. on Computing 41, 3 (2012), 519–536.

    12. Faleiro, J. M., Rajamani, S., Rajan, K., Ramalingam, G., and Vaswani, K. Generalized lattice agreement. In Proceedings of the 2012 ACM Symp. Principles of Distributed Computing. ACM Press, 2012, 125–134.

    13. Fernández Anta, F., Georgiou, C., Hadjistasi, T., Nicolaou, N., Stavrakis, E., and Trigeorgi, A. Fragmented objects: boosting concurrency of shared large objects. In Proceedings of the 28th Intern. Colloquium on Structural Information and Communication Complexity. Springer LNCS 12129 (2021), 106–126.

    14. Filipovic, I., O'Hearn, P., Rinetzky, N., and Yang, H. Abstraction for concurrent objects. Theoretical Computer Science 411, 51–52 (2010), 4379–4398.

    15. Fischer, M., Lynch, N., and Paterson, M. Impossibility of distributed consensus with one faulty process. J. ACM 32, 2 (1985), 374–382.

    16. Frey, D., Guillou, L., Raynal, M., and Taïani, F. Consensus-free ledgers: when operations of distinct processes are commutative. In Proceedings of the 16th Intern. Conf. on Parallel Computing Technologies. Springer LNCS, 2021.

    17. Goubault, E., Ledent, J., and Mimram, S. Concurrent specifications beyond linearizability. In Proceedings of. 22nd Intern Conf. on Principles of Distributed Systems. LIPICS 125 (2018), 28:1–28:16.

    18. Henzinger, T., Kirsch, C., Payer, H., Sezgin, A., and Sokolova, A. Quantitative relaxation of concurrent data structures. In Proceedings of the. 40th Annual ACM SIGPLAN-SIGACT Symp. Principles of Programming Languages, (2013), 317–328.

    19. Hellerstein, J. and Alvaro, P. Keeping CALM: When distributed consistency is easy. Commun. ACM 63, 9 (Sept. 2020), 72–81.

    20. Hemed, N., Rinetzky, N., and Vafeiadis, V. Modular verification of concurrency-aware linearizability. In Proceedings of 29th Symp. Distributed Computing. Springer LNCS 9363 (2015), 371–387.

    21. Herlihy, M. Wait-free synchronization. ACM Trans. on Progr. Languages and Systems 13, 1 (1991), 24–149.

    22. Herlihy, M. and Shavit, N. The Art of Multiprocessor Programming. Morgan Kaufmann, 2008.

    23. Hendler, D., Shavit, N., and Yerushalmi, L. A scalable lock-free stack algorithm. J. of Parallel and Distributed Computing 70, 1 (2010), 1–12.

    24. Herlihy, M. and Wing, J. Linearizability: A correctness condition for concurrent objects. ACM Trans. on Progr. Languages and Systems 12, 3 (1990), 463–492.

    25. Izraelevitz, J., Mendes, H., and Scott, M. Linearizability of persistent memory objects Under a full-system-crash failure model. In Proc. of the 30th Intern. Symposium on Distributed Computing (DISC 2016), Springer LNCS 9888, 313–327 (2016).

    26. Kuznetsov, P., Rieutord, T.H., and Tucci Piergiovanni, S., Reconfigurable lattice agreement and applications. In Proceedings of 23th Intern. Conf. on Principles of Distributed Systems. LIPICS, 31:1–31:17 (2020).

    27. Lamport, L. On inter-process communications, part I: basic formalism, part II: algorithms. Distributed Computing 1, 2 (1986), 77–101.

    28. Loui, M. and Abu-Amara, H. Memory requirements for agreement among unreliable asynchronous processes. Advances in Computing Research 4 (1987), 163–183.

    29. Montanari, U. True concurrency: theory and practice. Mathematics of Program Construction. Springer Verlag LNCS 669, (1993), 14–17.

    30. Neiger, G. Set linearizability (Brief announcement). In Proceedings 13th Aannual ACM Symp. Principles of distributed computing. ACM Press, 1994, 396.

    31. Rajsbaum, S. and Raynal, M. Mastering concurrent computing through sequential thinking: A half-century evolution. Comm. ACM 63, 1 (Jan. 2020), 78–87.

    32. Raynal, M. Concurrent Programming: Algorithms, principles and foundations. Springer, 2013.

    33. Rinberg, A. and Keidar, I. Intermediate value linearizability: a quantitative correctness criterion. In Proceedings of 34th Int'l Sym. Distributed Computing. LIPICs 179 (2020), 2:1–2:17.

    34. Shapiro, M., Preguiça, N., Baquero, C. and Zawirski, M. Conflict-free replicated data types. In Proceedings Symp. Self-Stabilization, Safety, and Security of Distributed Systems. Springer LNCS 6976, 2011, 386–400.

    35. Scherer III, W.N., Lea D., and Scott M.L. Scalable synchronous queues. Commun. ACM 52, 5 (May 2009), 100–111.

    36. Shavit, N. Data structures in the multicore age. Commun. ACM 54, 3 (Mar. 2011), 76–84.

    37. Shavit, N. and Touitou D., Elimination trees and the construction of pools and stacks. Theory Computing Systems 30, 6 (1997), 645–670.

    38. Sompolinsky, Y. and Zohar, A. PHANTOM: A scalable blockDAG protocol. Cryptology ePrint Archive Report 104, (2018).

    39. Viotti, P. and Vukolic, M. Consistency in non-transactional distributed storage systems. ACM Computing Surveys 49, 1 (2016), Article 19.

    40. Zheng, X., Hu, G., and Garg, V. Lattice agreement in message passing systems. In Proceedings 32nd Intern. Symp. Distributed Computing. LIPICs 121 (2018), 23:1–23:17.

    a. However, one can define sequential specifications where the automaton somehow "predicts" the values that will be proposed in the future. The execution in Figure 2 is indeed linearizable with respect to these arguably unreasonable sequential specifications.

    b. One-shot lattice agreement is equivalent to one-shot atomic snapshot.3

    c. The read-after-write synchronization pattern has been used in many algorithms, starting with the first mutual exclusion solutions.

    d. There has been much work on interval orders, both due to their mathematical interest and because of their applications to biology, algorithms, psychology, among others. The interval order for a collection of intervals on the real line is the partial order corresponding to their left-to-right precedence relation, one interval being considered less than another if it is completely to the left of the other.

    e. The step complexity of an operation is the worst-case number of primitive steps required to complete the operation.

    f. Lamport originally used his happened-before relation, while from Herlihy and Wing24 on the string approach is predominant in the literature.

Join the Discussion (0)

Become a Member or Sign In to Post a Comment

The Latest from CACM

Shape the Future of Computing

ACM encourages its members to take a direct hand in shaping the future of the association. There are more ways than ever to get involved.

Get Involved

Communications of the ACM (CACM) is now a fully Open Access publication.

By opening CACM to the world, we hope to increase engagement among the broader computer science community and encourage non-members to discover the rich resources ACM has to offer.

Learn More