

Software Tools for Technology Transfer manuscript No. (will be inserted by the editor)

Runtime Verification of Concurrency-Specific Correctness Criteria

Shaz Qadeer1, Serdar Tasiran2

1 Microsoft Research, Redmond, WA, USA
2 Koc University, Istanbul, Turkey

Received: date / Revised version: date

Abstract. We give an overview of correctness criteria specific to concurrent shared-memory programs, and runtime verification techniques for verifying these criteria. We cover a spectrum of criteria, from ones focusing on low-level thread interference, such as races, to higher-level ones such as linearizability. We contrast these criteria in the context of runtime verification. We present the key ideas underlying the runtime verification techniques for these criteria, and summarize the state of the art. Finally, we discuss the issue of coverage in runtime verification of concurrent programs and present techniques that improve the set of covered thread interleavings.

1 Introduction

This paper is an overview of runtime verification criteria and techniques for shared-memory multithreaded programs. The focus of the work covered in this paper is on errors caused by undesirable interference between threads, not on sequential functional correctness. As such, these techniques do not detect violations of a set of program-specific properties. Instead, the correctness properties sought are generic, applicable to all shared-memory concurrent programs or concurrently-accessed data structures. We focus on safety properties. Liveness properties such as termination and deadlock-freedom are outside the scope of this paper.

We cover a spectrum of techniques ranging from race detectors to refinement checkers. This corresponds to definitions of undesirable interference ranging from lower-level, more syntactic notions to higher-level, more semantic ones. We present and contrast the different definitions of thread interference, and summarize the state of the art in tools checking for them. By focusing on a particular concurrency-specific correctness criterion, the runtime verification techniques in this paper provide much better observability than simple testing, with reasonable runtime overhead. Bugs that are triggered are detected closer to their source, and thus diagnosed and debugged more easily.

Checking for lower-level concurrency errors is analogous to certain sanity checks performed by language runtimes, such as checks for null pointer dereferences and out-of-bounds array indexing. The absence of such lower-level errors is typically required for a program-wide safety property, such as sequential consistency. Furthermore, a lower-level error is almost always undesirable; the few exceptions, such as races on performance counters, are considered benign races. A lower-level error is sometimes a bug in itself, and sometimes symptomatic of a higher-level bug.

In order to define thread interference at a higher level, concurrent program correctness criteria require certain relationships between concurrent program executions and more serial (less interleaved, sometimes sequential) ones. Atomicity and serializability require that program executions be equivalent to ones interleaved at a coarser level of granularity, where atomic blocks are executed serially, without interruption. Linearizability and refinement make use of sequential specifications for data structure operations, and require a correspondence between concurrent program executions and executions of the sequential specification.

Higher-level correctness criteria are widely applicable not only to concurrently-accessed data structures, which provide to their concurrent clients the illusion of instantaneous, sequential access, but also to other concurrent code, where the design follows a separation of concerns. In the latter, synchronization mechanisms are used to ensure a certain kind of interference-freedom for certain code blocks, and functional reasoning within those code blocks is carried out sequentially. Higher-level runtime verification techniques for concurrent programs do not only target bug symptoms or check desirable sanity properties, but also verify more global properties of a given trace that come closer to full functional correctness, assuming that a more sequential version of the code is correct.

As correctness criteria become higher level, and are able to disregard apparent low-level conflicts, they become more permissive, and declare correct more highly-concurrent and optimized implementations. We examine correctness criteria in the context of runtime verification, where full coverage is often impractical. A correctness criterion that is sufficiently stringent in the context of static proofs may therefore not provide enough information when only a subset of the program's executions is covered. In this setting, getting the most validation information from the examined traces is desirable.

In the context of concurrent programs, coverage of the different possible interleavings is of central concern, rather than coverage of different program inputs, which sequential verification is concerned with. We present a number of techniques aimed at improving the coverage of interesting and important thread interleavings. Some of these techniques control the thread scheduler and lead the program execution towards unexplored, interesting new interleavings. Others gather data from a single program execution and infer other possible interleavings and possible concurrency errors in them.

Section 2 presents a very basic formalization of concurrent program executions. In Section 3, we cover runtime race-detection techniques. Section 4 covers atomicity, serializability, and runtime techniques for checking them. Linearizability and refinement are presented in Section 5. Section 6 is concerned with techniques for improving the coverage of runtime verification for concurrent programs.

2 Multithreaded Programs: Formalization

In the following, we present a formalization of executions of multithreaded programs that has been adapted from that in [15]. Rather than a completely general formalization that models all features of a multithreaded program, we opt for one that is simple and allows us to illustrate the key ideas for each runtime verification approach. We do not provide a formal syntax and semantics for multithreaded programs, and reason only in terms of their executions in this paper.

A program consists of a number of concurrently executing threads, each of which has an associated unique thread identifier t ∈ Tid.

The set of possible actions that a thread t can perform includes:

– rd(t, x, v) and wr(t, x, v), which read a value v from variable x and write a value v to x, respectively.

The possible effects of the read and write actions on the local and global stores are given by the language specification and memory model, and are left unspecified in this paper.

– acq(t, m) and rel(t, m), which acquire and release a lock m.

– begin(t, l, s) and end(t), which mark the beginning and end of the execution of a designated code block consisting of the program statement s. It might be the programmer's intention that this block be atomic, serializable, etc. The block label l uniquely identifies a designated code block.

– fork(t, t′) and join(t, t′), which create a new thread t′ and wait until a thread t′ terminates, respectively.

– call(t, l, m, p) and ret(t, l, m, r), which represent the call and return actions of an operation m. These actions are thread-local, i.e., they do not modify the global store. The label l uniquely identifies a particular execution of the operation m.

In code examples, we often omit t and l when they are clear from the context or irrelevant, and we use more familiar syntax, such as x = v, for reads and writes. We use the function tid(a) to extract the thread identifier from an action.

A trace is a finite sequence of actions α = a1, a2, ..., an. A particular occurrence of an action (i, ai) in a trace is called an event. The behavior of a trace α = a1, a2, ..., an is defined by the relation Σ0 →α Σn, which holds if there exist intermediate states Σ1, ..., Σn−1 such that Σ0 →a1 Σ1 →a2 · · · →an Σn.

An event ei = (i, ai) is referred to by the type of the action ai, e.g., as a write or read event. The write-predecessor of a read event ej with aj = rd(t, x, v) in a trace is the event ei with the largest i < j such that ei is a write event that writes to x.

An intended-atomic block in a trace α is the sequence of actions executed by a thread t starting with a begin(t, l, s) action and containing all of t's actions up to and including a matching end(t) action. If an action a by thread t does not occur within an intended-atomic block for t, then the action a by itself is considered a (unary) intended-atomic block. This terminology was chosen in order to emphasize that atomicity is a specification (an annotation) and not a guarantee provided by the platform.

Similarly, an operation execution in a trace α is the sequence of actions executed by a thread t starting with call(t, l, m, p) and ending with the matching ret(t, l, m, r) action. For simplicity, we restrict our attention to executions in which, for every begin(t, l, s) action (resp. every call(t, l, m, p) action), a matching end(t) action (resp. a ret(t, l, m, r) action) exists.

Two actions (or events) in a trace conflict if:

1. they access (read or write) the same variable, and at least one of the accesses is a write;

2. they operate on (acquire or release) the same lock; or


3. they are performed by the same thread.

If two actions (or events) do not conflict, they commute.

A trace β is said to be a permutation of a trace α iff α and β both consist of n actions and there is a permutation f of {1, ..., n} such that, if α = a1, a2, ..., an, then β = af(1), af(2), ..., af(n); we then also call β an f-permutation of α. For each i, ai and af(i) are said to be corresponding actions, and (i, ai) and (f(i), af(i)) are said to be corresponding events.

3 Data-race detection

Data-race detection is perhaps the most well-studied problem in the area of verification of concurrent software. Informally, a data race occurs in an execution whenever there are conflicting accesses to the same variable without proper synchronization. However, defining a data race formally has been contentious, with many competing definitions in the literature.

The first popular algorithm for data-race detection was the Eraser algorithm [37]. This algorithm is based on the insight that programmers use locks to ensure exclusive access to shared variables; hence, it equates a data race with the absence of a consistent locking protocol. The following execution obeys the consistent locking protocol of always accessing X while holding the lock Lck.

T1                  T2
--                  --
acq(Lck)
tmp1 := X
tmp1 := tmp1 + 1
X := tmp1
rel(Lck)
                    acq(Lck)
                    tmp2 := X
                    tmp2 := tmp2 + 1
                    X := tmp2
                    rel(Lck)

However, the following execution has a data race, because there is no common lock held by the two threads while accessing the variable X.

T1                  T2
--                  --
acq(Lck1)
tmp1 := X
tmp1 := tmp1 + 1
X := tmp1
rel(Lck1)
                    acq(Lck2)
                    tmp2 := X
                    tmp2 := tmp2 + 1
                    X := tmp2
                    rel(Lck2)

To detect an inconsistency in the locking protocol, Eraser maintains for each variable x a lockset LS, initialized to the set of all locks in the program. It also maintains for each thread t a variable LH containing the current set of locks held by thread t. At each access of x by thread t, LS is updated to the intersection of LS and LH, and a data race is reported if LS becomes empty. The emptiness of LS indicates that accesses to x have not been consistently protected by a single unique lock.
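
The lockset update just described can be sketched in a few lines of Python. The class and method names here are illustrative, not Eraser's actual interface:

```python
class LocksetMonitor:
    """Sketch of the Eraser lockset discipline (not the tool itself)."""

    def __init__(self, all_locks):
        self.all_locks = frozenset(all_locks)
        self.lockset = {}   # LS: variable -> candidate protecting locks
        self.held = {}      # LH: thread -> locks currently held

    def acquire(self, t, m):
        self.held.setdefault(t, set()).add(m)

    def release(self, t, m):
        self.held.get(t, set()).discard(m)

    def access(self, t, x):
        """On each read or write of x by t: LS(x) := LS(x) ∩ LH(t).
        Returns False (a race warning) when the lockset becomes empty."""
        ls = self.lockset.get(x, self.all_locks) & self.held.get(t, set())
        self.lockset[x] = ls
        return bool(ls)
```

Replaying the two executions above through this sketch yields no warning for the first, and an empty lockset for X, i.e., a race warning, for the second.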

The simplicity of Eraser and its intuitive appeal made it very popular. However, Eraser suffered from the problem of false alarms, i.e., it occasionally reported races that the programmer did not consider errors. There were two main reasons for this.

1. Some natural programming idioms change the locking discipline over time. For example, a variable may be allocated and initialized by one thread without holding any lock, and then made available via a global pointer which is protected by a lock. Another example is the common producer-consumer pattern, in which the producer thread creates and initializes an object representing a work item without holding any locks and then puts it into a queue. The consumer then takes the object out of the queue and again accesses it without holding any locks.

2. Programmers often use custom synchronization primitives instead of locks to synchronize access to data. Examples of such custom primitives are hand-crafted spin locks and double-checked locking.

Researchers have tried to address the false alarms described above by adding embellishments to the basic Eraser algorithm. These embellishments usually take the form of a state machine attached to each shared variable [37,44]. A data race is reported only if the lockset becomes empty and the state machine enters a particular state. The state-machine approach has been unsatisfactory: although the specification being checked becomes more complicated, the problem of false alarms is still not fully eliminated.

Recently, consensus has emerged around a precise definition of a data race based on the happens-before relation [24]. A large factor in the forming of this consensus was the use of this definition by the Java memory model [28]. The happens-before relation of an execution is a partial order on the events in the execution. Intuitively, the happens-before relation captures the causal relationships between the events in an execution; there is an edge from an event a to another event b if the execution of a enables b to happen. Formally, the happens-before relation is the transitive closure of the following set of edges:

1. the set of edges from one instruction to the next one executed by the same thread.

2. the set of edges from a lock release to the subsequent acquire of that lock.

3. the set of edges from the fork in one thread to the first operation of the forked thread.

4. the set of edges to the join operation in a thread from the last operation of the thread being joined with.
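
As an illustration, the four edge sets and their transitive closure can be computed offline from a recorded trace. The sketch below assumes a simple (thread, operation, argument) event encoding of our own choosing; real detectors compute this information incrementally rather than by explicit closure:

```python
def happens_before(trace):
    """Happens-before of a trace as a set of event-index pairs: the
    transitive closure of the four edge sets above. Events are
    (thread, op, arg) with op in {acq, rel, fork, join, rd, wr}."""
    edges = set()
    last_of, last_rel, fork_of = {}, {}, {}
    for i, (t, op, arg) in enumerate(trace):
        if t in last_of:
            edges.add((last_of[t], i))        # 1. program order
        elif t in fork_of:
            edges.add((fork_of[t], i))        # 3. fork -> first action
        if op == "acq" and arg in last_rel:
            edges.add((last_rel[arg], i))     # 2. release -> acquire
        if op == "join" and arg in last_of:
            edges.add((last_of[arg], i))      # 4. last action -> join
        if op == "rel":
            last_rel[arg] = i
        if op == "fork":
            fork_of[arg] = i
        last_of[t] = i
    hb = set(edges)
    while True:                               # naive transitive closure
        new = {(a, d) for (a, b) in hb for (c, d) in hb if b == c}
        if new <= hb:
            return hb
        hb |= new

def races(trace, hb):
    """Pairs of conflicting accesses left unordered by happens-before."""
    out = []
    for i, (ti, oi, xi) in enumerate(trace):
        for j in range(i + 1, len(trace)):
            tj, oj, xj = trace[j]
            if (oi in ("rd", "wr") and oj in ("rd", "wr") and xi == xj
                    and "wr" in (oi, oj)
                    and (i, j) not in hb and (j, i) not in hb):
                out.append((i, j))
    return out
```

For example, two writes to a variable Z placed outside two lock-protected critical sections are reported as a race exactly when neither write reaches the other through program-order and release-acquire edges.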


Two accesses to a variable x are conflicting if at least one of them writes to x. Two conflicting accesses to a variable x in an execution form a data race if they are not ordered by the happens-before relation of the execution.

The most well-known algorithm to check for the presence of data races defined using the happens-before relation is the vector-clock algorithm [29]. A vector clock is a map VC : Tid → Nat that maintains a clock, a non-negative number, for each thread in the program. An important relationship between vector clocks is the ⊑ relation:

V1 ⊑ V2 iff ∀t ∈ Tid. V1(t) ≤ V2(t)

The vector-clock algorithm dynamically assigns to each event in the execution a particular vector clock, preserving the following invariant:

For any pair of distinct events e and f, the vector clock assigned to e is ordered before the vector clock assigned to f according to ⊑ iff the event e is ordered before the event f according to the happens-before relation.

The algorithm achieves the above goal by maintaining the following collection of vector clocks, each initialized to the map λt ∈ Tid. 0.

1. Ct for each thread t. The value of Ct is the vector clock for the last event executed by thread t. It is updated whenever thread t executes an event.

2. Lm for each lock m. If the last thread to release lock m was t, then the value of Lm is the vector clock of t when it released m.

3. Wx for each variable x. If the last thread to write variable x was t, then the value of Wx is the vector clock of t when the write happened.

4. Rx for each variable x. If the last thread to read variable x was t, then the value of Rx is the vector clock of t when the read happened.

The vector clocks are updated as events occur in an execution according to the scheme described below. The scheme uses the following operations on vector clocks:

V1 ⊔ V2 = λt ∈ Tid. max(V1(t), V2(t))
incu(V) = λt ∈ Tid. ite(t = u, V(t) + 1, V(t))

The update operations are as follows:

1. When a lock m is released by thread t, Lm is updated to Ct and Ct is updated to inct(Ct).

2. When a lock m is acquired by thread t, Ct is updated to Ct ⊔ Lm.

3. When a thread t creates a thread u, Cu is updated to Ct and Ct is updated to inct(Ct).

4. When a thread t joins with a thread u, Ct is updated to Ct ⊔ Cu.

5. When a thread t reads variable x, a race on x is declared unless Wx ⊑ Ct, and Rx is updated to Ct.

6. When a thread t writes variable x, a race on x is declared unless Wx ⊑ Ct and Rx ⊑ Ct, and Wx is updated to Ct.

Together these rules serve to compute the happens-beforerelation indirectly.
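
The update rules for locks and accesses (rules 1, 2, 5 and 6; fork and join are analogous) can be sketched directly. The thread set is fixed up front and, as is conventional in vector-clock algorithms though left implicit above, each thread's clock starts with its own entry at 1 so that initial events of different threads are unordered:

```python
THREADS = ("T1", "T2")       # fixed thread set, for the sketch only

def zero():
    return {t: 0 for t in THREADS}

def leq(v1, v2):             # the ⊑ relation
    return all(v1[t] <= v2[t] for t in THREADS)

def vjoin(v1, v2):           # V1 ⊔ V2
    return {t: max(v1[t], v2[t]) for t in THREADS}

def inc(v, u):               # inc_u(V)
    return {t: v[t] + 1 if t == u else v[t] for t in THREADS}

class VectorClockDetector:
    def __init__(self):
        self.C = {t: inc(zero(), t) for t in THREADS}
        self.L = {}          # lock -> vector clock
        self.W = {}          # variable -> last-write clock
        self.R = {}          # variable -> last-read clock
        self.races = []

    def release(self, t, m):                       # rule 1
        self.L[m] = self.C[t]
        self.C[t] = inc(self.C[t], t)

    def acquire(self, t, m):                       # rule 2
        self.C[t] = vjoin(self.C[t], self.L.get(m, zero()))

    def read(self, t, x):                          # rule 5
        if not leq(self.W.get(x, zero()), self.C[t]):
            self.races.append(("rd", t, x))
        self.R[x] = self.C[t]

    def write(self, t, x):                         # rule 6
        if not (leq(self.W.get(x, zero()), self.C[t])
                and leq(self.R.get(x, zero()), self.C[t])):
            self.races.append(("wr", t, x))
        self.W[x] = self.C[t]
```

On a lock-protected double increment this sketch reports no race, while two unsynchronized writes to the same variable produce one.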

While the precision of happens-before detection serves to reduce false alarms, in the context of dynamic verification this precision has often been considered a weakness. To understand the problem, consider the following execution:

T1                  T2
--                  --
Z := 0
acq(Lck)
tmp1 := X
tmp1 := tmp1 + 1
X := tmp1
rel(Lck)
                    acq(Lck)
                    tmp2 := X
                    tmp2 := tmp2 + 1
                    X := tmp2
                    rel(Lck)
                    Z := 1

While happens-before detection will not report a race in this execution, lockset-based detection will report a race on Z. The program clearly has a race in it, and even happens-before detection would have caught it had the ordering of the critical sections in T1 and T2 been reversed. By being too precise, happens-before detection actually reduces its chances of detecting a potential race in an execution similar to the one being examined. This attribute might be considered undesirable, since dynamic verification gets to execute only a few executions out of the set of all possible executions. On the other hand, by checking for a stronger property, lockset-based detection has the ability to generalize from a given execution to a set of similar executions. Having understood this trade-off between the two algorithms, many researchers have tried to combine the best elements of both [44].

While the presence of data races in an execution may be indicative of a programming error, absence of data races by itself is neither a necessary nor a sufficient condition for the correctness of a concurrent program [16]. Lock-free data structures are full of data races, yet they provide semantically correct behavior from the point of view of a client. On the other hand, consider the following execution, in which the code in thread T1 is intended to implement an atomic increment of the shared variable X.

T1                  T2
--                  --
acq(Lck)
tmp1 := X
rel(Lck)
                    acq(Lck)
                    X := 42
                    rel(Lck)
acq(Lck)
X := tmp1 + 1
rel(Lck)

However, the granularity of the locking in T1 is erroneous; hence, this execution is buggy in spite of being free of data races. The next two sections describe higher-level correctness conditions that have the ability to capture programmer intent more precisely.

Still, absence of data races is highly desirable in concurrent programs when they are executed on multiprocessors, a situation that is increasingly common with the advent of commodity multi-cores. Multiprocessor executions of concurrent programs, in the presence of data races, may have consequences surprising to the programmer. In the following program, a programmer naturally expects at least one of tmp1 or tmp2 to be 42 at the end.

T1                  T2
--                  --
X := 42             Y := 42
tmp1 := Y           tmp2 := X

This expectation is based on the intuition that, in each thread, an instruction that appears before another instruction is also executed before it; the operational semantics implied by this expectation is known as sequential consistency [25]. On a multiprocessor, due to the reordering of instructions by the compiler or the hardware, if this program starts from a state in which X = Y = 0, it is possible for it to end in a state with both tmp1 = 0 and tmp2 = 0.
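
For a program this small, the sequential-consistency guarantee can be checked by brute force: enumerate every interleaving of the two threads' instruction streams and execute it over a shared store. This is only an illustrative sketch, with an instruction encoding of our own:

```python
def interleavings(a, b):
    """All merges of sequences a and b preserving each one's order."""
    if not a:
        yield list(b)
    elif not b:
        yield list(a)
    else:
        for rest in interleavings(a[1:], b):
            yield [a[0]] + rest
        for rest in interleavings(a, b[1:]):
            yield [b[0]] + rest

# Each instruction is (op, variable, value-or-destination).
T1 = [("wr", "X", 42), ("rd", "Y", "tmp1")]
T2 = [("wr", "Y", 42), ("rd", "X", "tmp2")]

def run(schedule):
    mem, tmp = {"X": 0, "Y": 0}, {}
    for op, var, v in schedule:
        if op == "wr":
            mem[var] = v
        else:
            tmp[v] = mem[var]
    return tmp

results = [run(s) for s in interleavings(T1, T2)]
# Under sequential consistency, at least one read observes 42 ...
assert all(r["tmp1"] == 42 or r["tmp2"] == 42 for r in results)
# ... so the relaxed-memory outcome tmp1 = tmp2 = 0 never appears.
assert {"tmp1": 0, "tmp2": 0} not in results
```

The outcome tmp1 = tmp2 = 0, observable on relaxed hardware, corresponds to no entry of this enumeration, which is precisely why programmers find it surprising.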

For performance reasons, typical runtime systems for concurrent programs do not provide sequential consistency for all executions. But consensus has emerged that sequential consistency should be ensured at least for data-race-free programs [28,2]. In most programming languages, data-race freedom cannot be checked statically, and dynamic data-race detection remains the only alternative for eliminating surprising behaviors such as those described above. For these reasons, researchers have also suggested that data races be treated as runtime exceptions, similar to null-dereference or array-out-of-bounds exceptions [8].

4 Atomicity and Serializability

This and the next section are about runtime verification of criteria which define the correctness of a concurrent trace by relating it to a sequential trace, or to a concurrent one interleaved at a coarser level. We focus on race-free, and therefore sequentially consistent, programs from here on.

4.1 Serializability Preliminaries

Given a trace α and a code block b, we say that b is serial in α if all actions of b appear contiguously in α. Otherwise, b is non-serial in α. A trace α is said to be serial if all intended-atomic blocks are serial in α.

A code block b is then called atomic or view-serializable if, for each trace α, there exists an equivalent trace (modulo different definitions of equivalence) β such that b is serial in β. This makes it possible for programmers to restrict their attention to a subset of traces in which b appears to be executed serially, which has two key benefits. First, within the intended-atomic block b, programmers can locally reason as if b were sequential code. Second, it is frequently the programmer's intent to guarantee, using synchronization, that b appear to execute serially. Thus, it is useful to draw the programmer's attention to atomicity and serializability violations in an execution, even when these have not led to a violation of any data properties in that particular execution. Fixing such violations may prevent more catastrophic consequences in other executions, after the deployment of the code.

At the lowest level, atomicity at the concrete level of fine-grained actions, as used in [16,43], makes use of conflict-equivalence between traces.

Definition 1. Two traces α and β are conflict-equivalent iff β is an f-permutation of α and, if ai and aj are two conflicting actions, then they appear in the same order in α and β, i.e., i < j iff f(i) < f(j).

A trace α is said to be atomic (equivalently, conflict-serializable) iff it is conflict-equivalent to a serial trace. A program is called conflict-serializable iff every trace of the program is conflict-equivalent to a serial one. In this case, all code blocks are said to be atomic.
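
Checking conflict-equivalence of two given traces is straightforward. The sketch below uses a (thread, operation, target) action encoding of our own and, for simplicity, assumes that no action occurs twice in a trace:

```python
def conflicts(a, b):
    """Two actions conflict if they are by the same thread, operate on
    the same lock, or access the same variable with at least one write."""
    (t1, o1, x1), (t2, o2, x2) = a, b
    if t1 == t2:
        return True
    if {o1, o2} <= {"acq", "rel"}:
        return x1 == x2
    if {o1, o2} <= {"rd", "wr"}:
        return x1 == x2 and "wr" in (o1, o2)
    return False

def conflict_equivalent(alpha, beta):
    """Definition 1: beta is a permutation of alpha that preserves the
    relative order of every conflicting pair of actions."""
    if sorted(alpha) != sorted(beta):      # beta must be a permutation
        return False
    pos = {a: i for i, a in enumerate(beta)}
    return all(pos[alpha[i]] < pos[alpha[j]]
               for i in range(len(alpha))
               for j in range(i + 1, len(alpha))
               if conflicts(alpha[i], alpha[j]))
```

For example, reordering a read past a conflicting write breaks conflict-equivalence, while reordering two reads of the same variable does not.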

A more relaxed definition of equivalence is view-equivalence.

Definition 2. Two traces α and β are view-equivalent iff the following three conditions hold.

– β is an f-permutation of α.

– Corresponding read events in α and β have corresponding write-predecessors in α and β. More precisely, if ei is the write-predecessor of ej in α, then ef(i) is the write-predecessor of the read event ef(j) in β.

– For each variable v, f maps the final write event to v in α to the final write event to v in β.

A trace α is said to be view-serializable iff it is view-equivalent to a serial trace. Conflict-serializability of a trace implies view-serializability, but not vice versa. A program is called view-atomic iff every trace of the program is view-equivalent to a serial one. Thus, if a program is conflict-atomic, then it is view-atomic.
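
View-equivalence of two given traces is likewise directly checkable from Definition 2; it is finding an equivalent serial trace, not comparing two given ones, that is hard. A sketch under the same distinct-actions assumption and (thread, operation, variable) encoding as before:

```python
def view_equivalent(alpha, beta):
    """Definition 2: same permutation of read/write actions, matching
    write-predecessors for reads, and matching final writes."""
    def write_pred(trace, j):
        # Last write to trace[j]'s variable before position j, if any.
        x = trace[j][2]
        ws = [i for i in range(j)
              if trace[i][1] == "wr" and trace[i][2] == x]
        return trace[max(ws)] if ws else None

    def final_writes(trace):
        return {a[2]: a for a in trace if a[1] == "wr"}

    if sorted(alpha) != sorted(beta):      # beta must be a permutation
        return False
    for j, a in enumerate(alpha):
        if a[1] == "rd" and \
           write_pred(alpha, j) != write_pred(beta, beta.index(a)):
            return False
    return final_writes(alpha) == final_writes(beta)
```

On the blind-write trace of Figure 1 (a read of x, then writes by three threads), this accepts the serial order T1, T2, T3 as view-equivalent even though the two traces are not conflict-equivalent.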

4.2 Discussion of Different Notions of Serializability and Atomicity

The example trace in Figure 1 highlights the difference between the two notions of serializability using an example that contains "blind writes."

T1          T2          T3
---------   ---------   ---------
begin
Read(x)
            begin
            Write(x)
            end
Write(x)
end
                        begin
                        Write(x)
                        end

Fig. 1. A thread interleaving that is view serializable but not conflict serializable

This trace is not conflict serializable, but it is view serializable, as witnessed by the serial trace in which all actions of T1 are followed by all actions of T2, and then those of T3.

On a number of benchmarks, Wang and Stoller [42] find no cases where an algorithm contains a view-serializability violation but not an atomicity violation. At the time of this work, the use of atomicity and serializability as specifications was uncommon outside the concurrent data structure design community; thus, this experimental study uses scientific computing benchmarks and concurrency libraries to which atomicity specifications were added afterwards. So, while the common belief appears to be that the added flexibility due to view-serializability is not needed in shared-memory programs, it is premature to reach that conclusion.

Most work on atomicity in the literature requires the existence of a trace in which all intended-atomic blocks appear serial, i.e., all actions of each atomic block appear consecutively in the serial trace. Farzan et al. [10] explore the notion of a single individual intended-atomic block being atomic, formalized by the term causal atomicity. This differs from the standard notion of atomicity in that it requires, for each individual block b in a concurrent execution, a possibly different equivalent concurrent execution αb in which b appears serially. The weaker criterion of causal atomicity is still sufficient to prove pre- and postcondition pairs about a single intended-atomic block, provided that the pre- and postconditions only refer to variables accessed in b.

4.3 Runtime Atomicity and Serializability Checkers

There are numerous approaches to the runtime checking of serializability and atomicity violations. A thorough review of all of these approaches is outside the scope of this paper. In the following, we provide an overview, highlight some representative work, and broadly present the underlying ideas. The state of the art is that most tools check for atomicity, and there are recent tools that are able to do this with no false alarms, while even inferring atomicity violations in feasible traces other than the one encountered. While improvements are certainly possible, in our view, the runtime detection of serializability violations in a single trace is an efficiently (e.g., approximately a factor of 10 slowdown for Java code) solved problem in practice. We expect future work to focus more on improving the coverage of runtime verification.

In terms of worst-case complexity, checking conflict-serializability of a trace is a polynomially-solvable problem [1], while checking view-serializability is NP-complete [34]. Most earlier work has focused on checking sufficient conditions for atomicity, which leads to the potential for false alarms.

A widely-used way of determining whether a trace is equivalent to a serial one builds on Lipton's theory of movers. An action is a left (right) mover if, in every trace of a given program, it can be swapped with the action preceding (following) it and an equivalent trace is obtained. Actions that are both left- and right-movers are called both-movers, and the remaining actions are called non-movers. Atomizer [13], and later tools by Wang and Stoller [43], are based on identifying in a trace particular sequences of movers within an intended-atomic block b that are known not to be atomic. Improvements to the basic approach in these studies involve using more sophisticated analyses to determine whether an action is a certain kind of mover. Block-based algorithms (Wang and Stoller [43]) look for unserializable patterns of read and write accesses that involve multiple intended-atomic blocks. Vaziri et al. [39] present a set of 11 problematic access patterns which they prove to be complete, i.e., if a trace exhibits none of these, then it is guaranteed to be serializable.
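
The core of the reduction-based check can be sketched once actions have been classified as movers; the hard part, the classification itself (e.g., acquires as right-movers, releases as left-movers, race-free accesses as both-movers), is assumed to be done by a separate analysis:

```python
def is_reducible(kinds):
    """Reduction pattern check in the style of Atomizer: a block whose
    mover kinds ('R' right, 'L' left, 'B' both, 'N' non-mover) match
    (R|B)* [N] (L|B)* appears serial in some equivalent trace."""
    i, n = 0, len(kinds)
    while i < n and kinds[i] in ("R", "B"):   # pre-commit phase
        i += 1
    if i < n and kinds[i] == "N":             # at most one non-mover
        i += 1
    while i < n and kinds[i] in ("L", "B"):   # post-commit phase
        i += 1
    return i == n
```

The erroneous atomic increment shown earlier, acq; rd; rel; acq; wr; rel inside one intended-atomic block, classifies as R, B, L, R, B, L and fails the pattern (a right-mover after a left-mover), while the properly locked acq; rd; wr; rel, i.e., R, B, B, L, passes.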

Graph-based runtime techniques for detecting atomicity violations maintain a monitoring state (called the "instrumentation store" in [15]) that is obtained by processing every action in the trace in order. This monitoring state is typically a directed graph with a certain structure. In [42], algorithms are presented for maintaining a "conflict- or view-forest" during an execution. Every intended-atomic block corresponds to a tree, and the trace is conflict (resp., view) serializable if the runtime verification algorithm assigns a single commit node to each intended-atomic block in the corresponding conflict (view) forest. Their algorithms are, in the worst case, linear in the number of blocks in the trace, and quadratic in the number of accesses in the longest block and the number of accesses in the entire trace.

In [15], the instrumentation store is a graph whose nodes represent intended-atomic blocks, and whose edges indicate synchronization dependencies and precedence relationships between write and read accesses. An atomicity violation exhibits itself in the form of a cycle in the graph. In this case, the acyclicity of the graph is a precise (necessary and sufficient) condition for the conflict serializability of the execution. This graph is garbage collected and optimized as the trace is being processed. As a result, the tool is able to achieve acceptable overhead (a slowdown of about 10, which is similar to slowdowns of tools that check sufficient conditions for atomicity) on a number of benchmarks. Velodrome is the state of the art in runtime atomicity checking: in addition to being precise and having competitive overheads, it has a "blame assignment" feature which, given a trace with a serializability violation, in most cases identifies an intended-atomic block that does not appear serially in any equivalent trace.

4.4 Limitations of Atomicity and View-Serializability

Correctness criteria defined based on low-level, syntactic notions of conflicts between actions are sometimes too restrictive, and declare serializability and atomicity violations in traces that, from the programmer's standpoint, do not contain any undesirable interference.

For instance, any reads and writes to a variable X conflict with each other, although, when commuted, they may result in the same end state. As a result, it is not possible to re-order the two independent reads of X in the following trace in order to prove the containing blocks conflict or view serializable, although, from the programmer's standpoint, this execution appears serial with all of T2's actions preceding all of T1's actions.

T1                        T2
--                        --
atomBegin
                          atomBegin
acq(Lck)
tmp1 := X
tmp1 := tmp1 + 1
X := tmp1
rel(Lck)
                          acq(Lck)
                          tmp2 := X
                          tmp2 := tmp2 + 1
                          X := tmp2
                          rel(Lck)
                          Z := 1
                          atomEnd
Z := 2
atomEnd

This example shows that, when applied at too low a level, conflict-serializability and view-serializability are sometimes too restrictive. Considering conflicts at a higher, more semantic level by ignoring the local variables tmp1 and tmp2, and observing that the increment of X by T1 commutes to the right of the increment of X by T2, the example above does not contain a concurrency error from the standpoint of the programmer.

Purity [14] can be seen as a way of disregarding certain low-level conflicts while reasoning about the atomicity of a block. The programmer marks certain code blocks as "pure". When these blocks terminate normally, they have no net effect on the program state.

Reasoning about atomicity using purity (i) disregards accesses by normally-terminating pure blocks and, in essence, allows them to be conflict-free with respect to all other accesses, (ii) allows one to reason about the atomicity of a more abstract version of the program in which reads within normally-terminating pure blocks can return non-deterministic values, and (iii) disregards accesses to "unstable variables", such as performance counters, which do not factor into the correctness requirement for the program and which may have races on them. Purity is a useful relaxation of atomicity. When using purity, the programmer, while performing sequential reasoning within an atomic block, must argue the correctness of a more abstract program where pure reads and accesses to unstable variables may return non-deterministic values.

5 Linearizability and Refinement

Correctness criteria in this section are specific to concurrently-accessed data structures. These criteria relate a non-serial trace to a serial trace of a reference model, or specification. This reference model may be a separate, abstract realization of a data structure whose operations run serially. Alternatively, this model may be obtained from an implementation by forcing the implementation to operate at a coarser level of interleaving granularity (e.g., methods must run serially) and by hiding some of the internal state. In the definitions below, we represent the reference model R by its set of serial traces.

The distinction between criteria expressed using a separate reference model and criteria such as atomicity and serializability can be important for concurrent systems, as the latter are unnecessarily restrictive in some cases. Consider, for example, an implementation in which a method may terminate exceptionally because resource contention between concurrent threads prevents it from completing its job. In a serial execution, there is only one thread executing a method at any given time and thus no resource contention between methods; therefore, method executions never terminate exceptionally. Consequently, executions of this implementation containing exceptional terminations of this method are not equivalent to any atomic execution of the system and will be declared erroneous according to the atomicity criterion.

In the next section, we present some formalism and define linearizability, the correctness criterion that the flavors of refinement in the later sections are based upon.

5.1 Preliminaries

In order to be able to define the correctness criteria in this section using a common formalism, we declare a subset of actions Vis ⊆ A to be observable, and define the correspondence between executions over observable actions. Given a subset of actions B ⊆ A, the projection of a trace α onto B, denoted α|B, is the subsequence of α obtained by removing all actions that are not in B. Given a subset of actions Vis, a Vis-trace β is a projection of a trace α onto Vis, i.e., it is the subsequence of α consisting of all and only actions from Vis. Given a thread t, the projection of the trace α onto the thread t, denoted α|t, is the subsequence of α consisting of only and all actions executed by t.

Every trace α defines a partial order ≺α on intended-atomic blocks and operation executions, referred to as the precedence order. Formally, l ≺α l′ if l and l′ are labels of intended-atomic blocks or operation executions, and the return or end action of the block labeled l comes before the call or beginning action of l′ in α. Note that all actions are part of an intended-atomic block (sometimes a unary one) and, thus, ≺α subsumes the program order for each thread.

Linearizability is a widely-used, compositional correctness criterion for concurrent objects. A linearizable trace or program provides the illusion that each execution of an operation of a concurrent object takes place instantaneously between its invocation and its response, implying that the meaning of the operation can be provided by pre- and post-conditions. In our formalism, linearizability is expressed as follows.

Definition 3. Let CR be the set of call and return actions for all methods in a program P and let α be a CR-trace of P. Let R be a reference model encapsulating the sequential specifications of operations in P. Traces of R are all serial CR-traces where each operation execution satisfies its sequential specification. A CR-trace α of P is said to be linearizable if there exists a CR-trace β of R such that β preserves the precedence order ≺α.
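
Definition 3 can be turned into a (worst-case exponential) checker by enumerating serial orders of the operation executions, keeping only those consistent with the precedence order, and replaying each against the sequential specification. The tuple encoding of operation executions and the register specification below are illustrative assumptions, not taken from any cited tool.

```python
from itertools import permutations

# Brute-force linearizability check: a history is linearizable if some
# serial order of its operation executions respects the precedence order
# (non-overlapping operations keep their order) and is accepted by the
# sequential specification.
def linearizable(ops, spec_step, init_state):
    # ops: list of (op_id, call_time, ret_time, op, arg, result)
    def precedes(a, b):                      # a returns before b is called
        return a[2] < b[1]
    for order in permutations(ops):
        if any(precedes(b, a)                # violates the precedence order
               for i, a in enumerate(order) for b in order[i + 1:]):
            continue
        state, ok = init_state, True
        for (_, _, _, op, arg, res) in order:
            state, expected = spec_step(state, op, arg)
            if expected != res:
                ok = False
                break
        if ok:
            return True
    return False

# Sequential spec of a register: write returns None, read returns the value.
def reg_spec(state, op, arg):
    return (arg, None) if op == "write" else (state, state)
```

For a write of 1 over interval [0, 4] overlapping a read over [1, 3], the read may return 1; if it returned 2, no serial order would be accepted by the specification.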

Atomicity and serializability require each implementation trace to be equivalent to some atomic trace of the implementation. This makes these criteria sometimes unnecessarily restrictive, because low-level conflicts may not correspond to conflicts at the more "semantic" level. The more semantic level is given by the sequential specification.

The definition of linearizability does not specify how the sequential trace β is to be constructed. In the absence of any other information, all possible serial traces consisting of re-orderings of the actions in α may need to be examined. This makes linearizability not amenable to efficient runtime checking. For efficient runtime verification of linearizability, candidate trace(s) β need to be generated from α using heuristics and/or programmer hints, either during actual execution or afterwards, from a logged execution. Furthermore, an efficient mechanism is needed to determine whether operation executions in β are consistent with their sequential specifications.

In the following, we present flavors of runtime refinement checking as practical sufficient conditions for verifying the linearizability of a trace at runtime.

5.2 Runtime Refinement Checking

5.2.1 I/O Refinement

As a correctness criterion, I/O refinement [9] is a small variation on linearizability. I/O refinement is Vis-refinement, where Vis consists of call and return actions of operations. In I/O refinement, differently from how linearizability is verified in [19], instead of an axiomatic sequential specification for each data structure, an executable sequential specification for the data structure is used as the reference model R. The specification is given in the form of a serially-executed method for each data structure operation. Given a return value r of the operation, the (deterministic) specification method updates the state of the data structure specification consistently with r, or signals an error if r is not an allowed return value for the operation at the current data structure state. This specification can be provided separately, or a non-concurrent, serial interpretation of the implementation can serve as the specification.

I/O refinement requires that for each trace of the implementation, there is a serial trace (the "witness interleaving") of the specification consisting of the same operation calls (and arguments) and return actions (and values). Runtime checking of I/O refinement requires little instrumentation and logging while still providing a more rigorous check than pure testing. To enable efficient runtime checking of I/O refinement, the programmer annotates the implementation code so that in every operation execution, a unique action is marked as the "commit action". The analysis uses the order of occurrence of commit actions during execution to generate the witness interleaving. At each commit action, the operation that is committing and its return value (derived by looking ahead in the implementation's execution) are used to execute the specification. From its current state, the specification must take a transition associated with the committing operation and its return value. If this is not possible, a refinement violation is detected.
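
The commit-action replay described above can be sketched as follows: commit actions are consumed in trace order, and the executable sequential specification is driven by each committing operation and its looked-ahead return value. The counter specification and its interface are illustrative assumptions.

```python
# A sketch of commit-action replay for I/O refinement checking.
class SpecCounter:
    """Executable sequential spec of a counter with an 'inc' operation."""
    def __init__(self):
        self.value = 0
    def apply(self, op, ret):
        # Update spec state for 'op'; reject 'ret' if the spec cannot
        # produce it from the current state.
        if op == "inc":
            self.value += 1
            return ret == self.value       # inc returns the new value
        return False

def check_io_refinement(commit_log, spec):
    # commit_log: [(op, return_value)] in commit-action order; this is the
    # witness interleaving generated from the implementation trace.
    for op, ret in commit_log:
        if not spec.apply(op, ret):
            return False                   # refinement violation detected
    return True
```

A lost update, where two increments both return 1, is rejected because the specification cannot take a transition returning 1 twice.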

The runtime refinement check described above could fail either because the implementation truly does not refine the specification, or because the witness interleaving obtained using the commit actions is wrong. Comparing the witness interleaving with the implementation trace reveals which of these is the case.

If a separate specification does not exist, one can use an "atomized version" of the implementation code as the specification. In an atomized version, the program is forced to have only serial executions. The atomized version's methods are also modified so that, in addition to the original set of arguments, they take the return value of the original operation as an additional argument and compute the new specification state, or declare a refinement error.


In [9], I/O refinement was able to successfully detect concurrency errors in a number of benchmarks, including industrial examples.

5.2.2 View Refinement

If a static proof were carried out or exhaustive examination of all implementation traces were possible, I/O refinement would suffice as a correctness requirement, as it would require all observable behavior of the program to be consistent with a sequential specification. However, when the concurrent executions examined do not achieve high coverage, I/O refinement used alone as a correctness criterion may miss errors that have been triggered, but whose effects (i) have not made a difference observable in the return value of an operation in the execution investigated, (ii) but might in another execution, possibly one obtained by extending the one being examined.

Consider a bug scenario described in [9]. A storage manager uses the main memory as a cache, and occasionally writes the contents of the cache onto a disk for permanent storage. Because of a concurrency error, a data block written onto the disk is corrupted while the copy in the cache is correct. At this point, the concurrency error has been triggered and has resulted in a discrepancy in the program state, but this has not yet been observed in the return value of an operation. Most test scenarios would perform read and write accesses that hit in the cache, and would be unlikely to uncover this error while checking I/O refinement. However, if, after a large number of accesses, the block in the cache were to be evicted and then reloaded from the disk, or if the program were to be stopped and re-started with an empty cache, then the read operation associated with that block would return an incorrect value. Testing or I/O refinement alone is unlikely to detect this bug, even though the bug may be triggered frequently. Furthermore, the point in the program trace at which a discrepancy from an expected output is detected may be far from the point at which it was triggered, which makes diagnosing and fixing the bug difficult.

View refinement addresses this shortcoming by requiring a correspondence between the states of the implementation and the specification at commit points of operations. The key idea in view refinement is to introduce hypothetical "view" variables into the implementation and specification. Intuitively, the values of the view variables extract the contents of the data structure in a canonical way from the implementation and specification states. These variables are computed as a function of the data structure state and abstract away information that is not relevant to what applications can observe. For example, for a binary search tree, the view variable might be defined as the ordered list of the (key, value) pairs, thus abstracting away the structure of the tree. If a hashtable with atomic operations is given as the specification for the binary tree, the specification view variable might again be the ordered list of (key, value) pairs, while the hash function and the collision resolution mechanism are abstracted away. Having both view variables be sorted lists makes the abstract set contents canonical, and allows a direct comparison of the view variables. When using view refinement, the programmer is asked to specify how the view variables are to be computed.

To check view refinement, at each commit action, the values of the view variables in the implementation and the specification are constructed. Formally, each method updates the view variable once, atomically with its commit action. The specification view variable is updated once, atomically, between the call and return of each method. During runtime verification, it is checked whether the same sequence of updates is performed on the specification and implementation view variables. Formally, we declare as visible all commit actions and annotate each of them with the corresponding updated value of the view variable.
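
The comparison of view-variable update sequences can be sketched directly. The sorted (key, value) list below is the canonical view from the binary-tree example; the function names and the representation of commit-time states are illustrative assumptions.

```python
# A sketch of the view-refinement check: the view variable is computed at
# each commit action on both the implementation and specification side,
# and the two update sequences must match.
def view(contents):
    # Canonical view of a map-like structure: its sorted (key, value) pairs.
    return sorted(contents.items())

def check_view_refinement(impl_commit_states, spec_commit_states):
    # Each argument: data structure contents observed at successive commit
    # actions. The check compares the induced view-variable sequences.
    impl_views = [view(s) for s in impl_commit_states]
    spec_views = [view(s) for s in spec_commit_states]
    return impl_views == spec_views
```

Because the view is canonical, a tree holding {"b": 2, "a": 1} and a hashtable holding {"a": 1, "b": 2} compare equal, while a corrupted value is flagged at the first commit action that exposes it.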

The granularity of checking done using view variables is intermediate between two extremes. At one extreme, one can require a state correspondence only at quiescent states of the implementation, as in [38] or commit atomicity [12]. But most industrial-scale concurrent data structures are built to be used by large numbers of threads continuously, and during any realistic execution, quiescent points are very rare. Checking only at these points might cause errors to be overwritten or to be discovered too late. At the other extreme, asking the programmer to write a refinement map that relates implementation and specification states after each action is impractical. Our choice of view variables that are updated only with commit actions strikes a good compromise: a check is performed for each method execution, and the refinement map is easier to write.

Elmas et al. [9] find that, at the cost of some more manual effort to provide the view functions and additional instrumentation overhead in order to be able to compute the view functions, view refinement is a lot more effective in detecting errors in concurrent data structures. Fewer, shorter traces are sufficient to detect concurrency errors, and errors are detected at the commit action immediately following the method that results in an incorrect data structure state.

Tasiran et al. introduced rollback atomicity, a correctness criterion that is a variation on view refinement and is a more appropriate correctness criterion for certain programs. Rollback atomicity is a special case of view refinement where part of the state of a concurrently-accessed data structure is projected away. The set of shared variables is partitioned into focus variables and peripheral variables. A version of the program where intended-atomic blocks are forced to run serially constitutes the specification. Rollback atomicity uses the focus variables as the set of view variables. View computation in the implementation makes use of a "rollback function" which undoes the effects of partially executed but not-yet-committed atomic blocks. Using rollback atomicity, it is possible to pose a more relaxed requirement on peripheral variables than the one on focus variables. Peripheral variables are merely required to be consistent, while focus variables uniquely determine the abstract data structure state given a consistent assignment to peripheral variables.

6 Improving Coverage

All specifications discussed so far in this paper could be checked either statically or dynamically. While static analysis attempts to perform a symbolic analysis for all inputs, dynamic analysis focuses its attention on a single input. Since concurrent programs are usually nondeterministic, even a single input may result in a number of interleavings, each exhibiting a different behavior of the program. Dynamic analyses can be grouped based on the degree of coverage they provide with respect to the set of interleavings for a single input. We discuss the various approaches in increasing order of coverage, with a concomitant increase in analysis overhead and complexity.

Focusing on a single run. Early dynamic analyses for concurrent programs were based on observing a single execution. A program could be run multiple times to get more executions, but there is no communication between the analyses performed for each execution. Examples include data-race detection based on the happens-before relation [8,44], atomicity-violation detection based on conflict serializability [15,11], linearizability checking [40], and refinement checking [9].

Generalizing from a single run. While dynamic approaches that focus on a single run have been effective at detecting concurrency errors, they provide little coverage of the set of behaviors of the program. Consequently, researchers have investigated techniques that analyze a given execution to predict bugs in other related executions. An early example of such a technique is the Eraser data-race detection algorithm [37]. Instead of computing the precise happens-before relation of an execution, Eraser checks for the existence of a consistent locking protocol. For each shared variable, the algorithm maintains a lockset which is initialized to the universal set and intersected with the currently held set of locks at an access to the variable. An error is reported if the lockset ever becomes empty. Compared to happens-before data-race detection, Eraser is much less sensitive to the exact schedule taken by the program. For atomicity violations, the Atomizer [13] tool uses the theory of movers [27] to check for a stronger condition than conflict serializability.
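
The lockset discipline described above fits in a few lines. The trace encoding, with each access paired with the set of locks held at that moment, is an illustrative assumption; real tools track this via instrumentation.

```python
# A sketch of the Eraser lockset algorithm: each shared variable's candidate
# lockset starts as the universal set of locks and is intersected with the
# locks held at each access; an empty lockset means no single lock
# consistently protects the variable, i.e., a potential data race.
def eraser(accesses, all_locks):
    # accesses: [(var, locks_held_at_access)] in trace order
    locksets, warnings = {}, []
    for var, held in accesses:
        ls = locksets.get(var, set(all_locks)) & set(held)
        locksets[var] = ls
        if not ls:
            warnings.append(var)           # potential data race on var
    return warnings
```

Two accesses to X under the same lock leave the lockset non-empty; accesses under different locks empty it and produce a warning, regardless of the schedule observed.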

Most techniques, including those described above, that use generalization suffer from the problem of false alarms. For data-race detection, the problem of false races has been addressed by using suitable combinations of the lockset algorithm and the happens-before relation [5,44]. The Sidetrack tool [36] performs generalized yet precise atomicity-violation detection. The jPredictor tool [4] can perform generalization for any trace property, including absence of data races and atomicity violations.

Concurrency fuzzing. As mentioned before, generalizing from a single run offers greater coverage than just focusing on a single run. However, this extra coverage comes at the cost of imprecision. An orthogonal technique for improving coverage is concurrency fuzzing, where the program is run repeatedly with the goal of exercising a variety of schedules. Of course, if done naively, we are unlikely to get much variation in schedules. Therefore, researchers have introduced techniques that perturb the executions in a directed fashion to increase the likelihood of buggy schedules. The main advantage of fuzzing over generalization is precision in error reports.

The pioneer in concurrency fuzzing is the tool ConTest [7], which introduces randomly inserted delays at certain points in the execution. Subsequent tools have tried to improve on ConTest by making the introduction of delays more directed. The CalFuzzer tool [22] detects data races in the execution and then attempts to re-execute with the intention of making the race happen in the other order; it achieves this race flipping by introducing a delay before the first access. The Cuzz tool [30] is based on a randomized priority-based scheduler that tries to expose bugs with small ordering depth.
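
A toy harness in the spirit of random delay injection is sketched below: a seeded random sleep is inserted between the read and write of an unsynchronized increment, making the rare lost-update interleaving more likely to surface across repeated runs. The racy counter and the choice of delay point are illustrative assumptions, not ConTest's actual mechanism.

```python
import random
import threading
import time

# Toy delay-injection fuzzer: perturb the schedule at a chosen point so
# that the lost update on 'counter' becomes more likely to occur.
def fuzz_once(seed):
    rng = random.Random(seed)
    state = {"counter": 0}
    def unsynchronized_inc():
        tmp = state["counter"]            # read
        if rng.random() < 0.5:            # injected random delay point
            time.sleep(0.001)
        state["counter"] = tmp + 1        # write; may lose an update
    threads = [threading.Thread(target=unsynchronized_inc) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"]               # 2 if serial, 1 on a lost update

def fuzz(runs=50):
    # Different seeds perturb the schedule differently across runs.
    return {fuzz_once(seed) for seed in range(runs)}
```

Running the harness repeatedly typically observes both outcomes, whereas without the injected delays the buggy interleaving is much harder to hit.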

Algorithmic enumeration of executions. Finally, the most exhaustive coverage is provided by techniques that attempt to systematically enumerate all possible schedules of the concurrent program for a given input. Work in this direction has been carried out in the research areas of software testing and model checking. In the software testing community, Carver and Tai [3] proposed repeatable deterministic testing by running a program with an input and an explicit thread schedule. The idea of systematic generation of thread schedules came later under the rubric of reachability testing [20,26]. In the model checking community, there have been two approaches to this problem. The stateful approach [41,6,31,23] attempts to cache visited states in order to avoid exploring from them again. The stateless approach [18,33] attempts to enumerate schedules without performing any state caching.

In both stateless and stateful model checking, reduction and prioritization techniques have been used. Partial-order reduction techniques [17] are based on the notion of commuting operations. Symmetry reduction techniques [21] are based on the notion of bisimulation-equivalent states. These reduction techniques are sound because they are guaranteed to prune only those behaviors or states that are redundant. On the other hand, prioritization techniques are based on exploring the state space up to some bound. The bound is then iteratively increased and provides partial coverage guarantees. The most well-known prioritization techniques are based on depth-bounding [35] and preemption-bounding [32]. While depth-bounding bounds the depth of the execution, preemption-bounding bounds the number of preemptions in the execution without bounding the depth.

References

1. P. A. Bernstein, V. Hadzilacos, and N. Goodman. Concurrency control and recovery in database systems. Addison-Wesley, 1987.

2. Hans-Juergen Boehm and Sarita V. Adve. Foundations of the C++ concurrency memory model. In PLDI 08: Programming Language Design and Implementation, pages 68–78, 2008.

3. Richard H. Carver and Kuo-Chung Tai. Replay and testing for concurrent programs. IEEE Softw., 8(2):66–74, 1991.

4. Feng Chen, Traian Florin Serbanuta, and Grigore Rosu. jPredictor: a predictive runtime analysis tool for Java. In ICSE: International Conference on Software Engineering, pages 221–230. ACM, 2008.

5. Jong-Deok Choi, Keunwoo Lee, Alexey Loginov, Robert O'Callahan, Vivek Sarkar, and Manu Sridharan. Efficient and precise datarace detection for multithreaded object-oriented programs. In PLDI '02: Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation, pages 258–269, New York, NY, USA, 2002. ACM Press.

6. Matthew B. Dwyer, John Hatcliff, Robby, and Venkatesh Prasad Ranganath. Exploiting object escape and locking information in partial-order reductions for concurrent object-oriented programs. Formal Methods in System Design, 25(2-3):199–240, 2004.

7. Orit Edelstein, Eitan Farchi, Evgeny Goldin, Yarden Nir, Gil Ratsaby, and Shmuel Ur. Framework for testing multi-threaded Java programs. Concurrency and Computation: Practice and Experience, 15(3–5):485–499, 2003.

8. Tayfun Elmas, Shaz Qadeer, and Serdar Tasiran. Goldilocks: a race and transaction-aware Java runtime. In PLDI 07: Programming Language Design and Implementation, pages 245–255, 2007.

9. Tayfun Elmas, Serdar Tasiran, and Shaz Qadeer. VYRD: Verifying concurrent programs by runtime refinement-violation detection. In PLDI '05: Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 27–37, New York, NY, USA, 2005. ACM Press.

10. Azadeh Farzan and P. Madhusudan. Causal atomicity. In CAV: Computer Aided Verification, pages 315–328, 2006.

11. Azadeh Farzan and P. Madhusudan. Monitoring atomicity in concurrent programs. In CAV 08: Computer Aided Verification, pages 52–65, 2008.

12. Cormac Flanagan. Verifying commit-atomicity using model checking. In SPIN, pages 252–266, 2004.

13. Cormac Flanagan and Stephen N. Freund. Atomizer: A dynamic atomicity checker for multithreaded programs. Sci. Comput. Program., 71(2):89–109, 2008.

14. Cormac Flanagan, Stephen N. Freund, and Shaz Qadeer. Exploiting purity for atomicity. IEEE Trans. Softw. Eng., 31(4):275–291, 2005.

15. Cormac Flanagan, Stephen N. Freund, and Jaeheon Yi. Velodrome: a sound and complete dynamic atomicity checker for multithreaded programs. In PLDI 08: Programming Language Design and Implementation, pages 293–303, 2008.

16. Cormac Flanagan and Shaz Qadeer. A type and effect system for atomicity. In PLDI '03: Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation, pages 338–349, New York, NY, USA, 2003. ACM Press.

17. Patrice Godefroid. Partial-order methods for the verification of concurrent systems: an approach to the state-explosion problem, volume 1032. Springer-Verlag Inc., New York, NY, USA, 1996.

18. Patrice Godefroid. Model checking for programming languages using VeriSoft. In POPL 97: Principles of Programming Languages, pages 174–186. ACM Press, 1997.

19. Maurice P. Herlihy and Jeannette M. Wing. Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst., 12(3):463–492, 1990.

20. G. Hwang, K. Tai, and T. Huang. Reachability testing: An approach to testing concurrent software. International Journal of Software Engineering and Knowledge Engineering, 5(4):493–510, 1995.

21. Radu Iosif. Exploiting heap symmetries in explicit-state model checking of software. In ASE 01: Automated Software Engineering, pages 254–261, 2001.

22. Pallavi Joshi, Mayur Naik, Chang-Seo Park, and Koushik Sen. An extensible active testing framework for concurrent programs. In CAV 09: Computer Aided Verification, 2009.

23. Charles Edwin Killian, James W. Anderson, Ranjit Jhala, and Amin Vahdat. Life, death, and the critical transition: Finding liveness bugs in systems code. In NSDI 07: Networked Systems Design and Implementation, pages 243–256, 2007.

24. Leslie Lamport. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 21(7):558–565, 1978.

25. Leslie Lamport. How to make a multiprocessor computer that correctly executes multiprocess programs. IEEE Transactions on Computers, C-28(9):690–691, 1979.

26. Yu Lei and Richard H. Carver. Reachability testing of concurrent programs. IEEE Trans. Software Eng., 32(6):382–403, 2006.

27. Richard J. Lipton. Reduction: a method of proving properties of parallel programs. Commun. ACM, 18(12):717–721, 1975.

28. Jeremy Manson, William Pugh, and Sarita V. Adve. The Java memory model. In POPL '05: Proceedings of the 32nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 378–391, New York, NY, USA, 2005. ACM.

29. Friedemann Mattern. Virtual time and global states of distributed systems. In Parallel and Distributed Algorithms: Proceedings of the International Workshop on Parallel and Distributed Algorithms, 1988.

30. Madanlal Musuvathi, Sebastian Burckhardt, Pravesh Kothari, and Santosh Nagarakatte. A randomized scheduler with probabilistic guarantees of finding bugs. In ASPLOS: Architectural Support for Programming Languages and Operating Systems. ACM, 2010.


31. Madanlal Musuvathi, David Y. W. Park, Andy Chou, Dawson R. Engler, and David L. Dill. CMC: A pragmatic approach to model checking real code. In OSDI, 2002.

32. Madanlal Musuvathi and Shaz Qadeer. Iterative context bounding for systematic testing of multithreaded programs. In PLDI 07: Programming Language Design and Implementation, pages 446–455, 2007.

33. Madanlal Musuvathi, Shaz Qadeer, Thomas Ball, Gerard Basler, Piramanayagam Arumuga Nainar, and Iulian Neamtiu. Finding and reproducing heisenbugs in concurrent programs. In OSDI 08: Operating Systems Design and Implementation, pages 267–280, 2008.

34. Christos H. Papadimitriou. The serializability of concurrent database updates. J. ACM, 26(4):631–653, 1979.

35. Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, Inc., 2009.

36. Caitlin Sadowski, Stephen N. Freund, and Cormac Flanagan. SingleTrack: A dynamic determinism checker for multithreaded programs. In ESOP, pages 394–409, 2009.

37. Stefan Savage, Michael Burrows, Greg Nelson, Patrick Sobalvarro, and Thomas Anderson. Eraser: A dynamic data race detector for multithreaded programs. ACM Transactions on Computer Systems, 15(4):391–411, 1997.

38. Serdar Tasiran, Andrej Bogdanov, and Minwen Ji. Detecting concurrency errors in file systems by runtime refinement checking, 2004.

39. Mandana Vaziri, Frank Tip, and Julian Dolby. Associating synchronization constraints with data in an object-oriented language. In POPL '06: Conference Record of the 33rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 334–345, New York, NY, USA, 2006. ACM.

40. Martin T. Vechev, Eran Yahav, and Greta Yorsh. Experience with model checking linearizability. In SPIN Workshop on Model Checking of Software, pages 261–278, 2009.

41. Willem Visser, Klaus Havelund, Guillaume Brat, and Seung-Joon Park. Model checking programs. In Proc. of the 15th IEEE International Conference on Automated Software Engineering, 2000.

42. Liqiang Wang and Scott D. Stoller. Accurate and efficient runtime detection of atomicity errors in concurrent programs. In PPoPP '06: Proceedings of the Eleventh ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 137–146, New York, NY, USA, 2006. ACM.

43. Liqiang Wang and Scott D. Stoller. Runtime analysis of atomicity for multi-threaded programs. IEEE Transactions on Software Engineering, 32:93–110, February 2006.

44. Yuan Yu, Tom Rodeheffer, and Wei Chen. RaceTrack: efficient detection of data race conditions via adaptive tracking. In SOSP 05: Symposium on Operating Systems Principles, pages 221–234, 2005.