new declarative programming over eventually consistent data stores · 2020. 9. 1. · levels, with...

Consist

ent *Complete *

Well D

ocumented*Easyt

oR

euse* *

Evaluated

*PLDI*

Artifact *

AEC

Declarative Programming overEventually Consistent Data Stores

KC Sivaramakrishnan ∗

University of Cambridge, [email protected]

Gowtham KakiPurdue University, [email protected]

Suresh JagannathanPurdue University, [email protected]

AbstractUser-facing online services utilize geo-distributed data stores tominimize latency and tolerate partial failures, with the intention ofproviding a fast, always-on experience. However, geo-distributiondoes not come for free; application developers have to contendwith weak consistency behaviors, and the lack of abstractions tocomposably construct high-level replicated data types, necessitatingthe need for complex application logic and invariably exposinginconsistencies to the user. Some commercial distributed data storesand several academic proposals provide a lattice of consistencylevels, with stronger consistency guarantees incurring increasedlatency and throughput costs. However, correctly assigning the rightconsistency level for an operation requires subtle reasoning and isoften an error-prone task.

In this paper, we present QUELEA, a declarative programmingmodel for eventually consistent data stores (ECDS), equipped witha contract language, capable of specifying fine-grained application-level consistency properties. A contract enforcement system analysescontracts, and automatically generates the appropriate consistencyprotocol for the method protected by the contract. We describean implementation of QUELEA on top of an off-the-shelf ECDSthat provides support for coordination-free transactions. Severalbenchmarks including two large web applications, illustrate theeffectiveness of our approach.

Categories and Subject Descriptors D.1.3 [Concurrent Program-ming]: Distributed Programming; C.2.4 [Distributed Systems]: Dis-tributed databases; D.3.2 [Language Classifications]: Applicative(Functional) Languages; F.3.1 [Logics and Meanings of Programs]:Specifying and Verifying and Reasoning about Programs

General Terms Languages, Performance

Keywords Eventual Consistency, Availability, CRDTs, AxiomaticContracts, Contract Classification, Distributed Transactions, SMTsolvers, Decidable Logic, Quelea, Cassandra, Haskell

∗ This work was done at Purdue University, USA.

Permission to make digital or hard copies of all or part of this work for personal orclassroom use is granted without fee provided that copies are not made or distributedfor profit or commercial advantage and that copies bear this notice and the full citationon the first page. Copyrights for components of this work owned by others than ACMmust be honored. Abstracting with credit is permitted. To copy otherwise, or republish,to post on servers or to redistribute to lists, requires prior specific permission and/or afee. Request permissions from [email protected]’15, , June 13–17, 2015, Portland, OR, USA.Copyright © 2015 ACM 978-1-4503-3468-6/15/06. . . $15.00.http://dx.doi.org/10.1145/2737924.2737981

1. IntroductionMany real-world web services — such as those built and maintainedby Amazon, Facebook, Google, Twitter, etc. — replicate applica-tion state and logic across multiple replicas within and across datacenters. Replication is intended not only to improve applicationthroughput and reduce user-perceived latency, but also to toleratepartial failures without compromising overall service availability.Traditionally programmers have relied on strong consistency guar-antees such as linearizability [15] or serializability [21] in orderto build correct applications. While strong consistency is an eas-ily stated property, it masks the reality underlying large-scale dis-tributed systems with respect to non-uniform latency, availability,and network partitions [8, 14]. Indeed, modern web services, whichaim to provide an "always-on" experience, overwhelmingly favoravailability and partition tolerance over strong consistency. To thisend, several weak consistency models such as eventual consistency,causal consistency, session guarantees, and timeline consistencyhave been proposed.

Under weak consistency, the developer needs to be aware ofconcurrent conflicting updates, and has to pay careful attentionto avoid unwanted inconsistencies (e.g., negative balances in abank account, or having an item appear in a shopping cart afterit has been removed [13]). Oftentimes, these inconsistencies leakfrom the application and are witnessed by the user. Ultimately,the developer must decide the consistency level appropriate for aparticular operation; this is understandably an error-prone processrequiring intricate knowledge of both the application as well as thesemantics and implementation of the underlying data store, whichtypically have only informal descriptions. Nonetheless, picking thecorrect consistency level is critical not only for correctness butalso for scalability of the application. While choosing a weakerconsistency level than required may introduce program errors andanomalies, choosing a stronger one than necessary can negativelyimpact program scalability and performance.

Weak consistency also hinders compositional reasoning aboutprograms. Although an application might be naturally expressedin terms of well-understood and expressive data types such asmaps, trees, queues, or graphs, geo-distributed stores typically onlyprovide a minimal set of data types with in-built conflict resolutionstrategies such as last-writer-wins (LWW) registers, counters, andsets [16, 25]. Furthermore, while traditional database systemsenable composability through transactions, geo-distributed storestypically lack unrestricted serializable transactional access to thedata. Working in this environment thus requires application stateto be suitably coerced to function using only the capabilities of thestore.

To address these issues, we describe QUELEA, a declarativeprogramming model and implementation for ECDS. The key noveltyof QUELEA is an expressive contract language to declare and

Eventually Consistent Data Store

Replica�Replica� Replican

......x → {wx� , wx

� }y → {w y

� , w y� }.

.

.

w y�

wx�

y → {w y� } x → {wx

� }

...

Session� Session�

.

.

. ....SessionOrder

v�← x . f oo(arg�); �wx� �

v�← x .bar(arg�); �wx� �

Figure 1: QUELEA system model.

verify fine-grained application-level consistency properties. Theprogrammer uses the contract language to axiomatically specifythe set of legal executions allowed over the replicated data type.Contracts are constructed using primitive consistency relationssuch as visibility and session order along with standard logicaland relational operators. A contract enforcement system staticallymaps operations over the datatype to a particular consistency levelavailable on the store, and provably validates the correctness of themapping. The paper makes the following contributions:

• We introduce QUELEA, a shallow extension of Haskell thatsupports the description and validation of replicated data typesfound in an ECDS. Contracts are used to specify fine-grainedapplication-level consistency properties, and are statically ana-lyzed to assign the most efficient and sound store consistencylevel to the corresponding operation.

• QUELEA supports coordination-free transactions over arbitrarydatatypes. We extend our contract language to express fine-grained transaction isolation guarantees, and utilize the contractenforcement system to automatically assign the correct isolationlevel for a transaction.

• We provide meta-theory that certifies the soundness of ourcontract enforcement system, and ensures that an operation isonly executed if the required conditions on consistency are met.

• We describe an implementation of QUELEA as a transparentshim layer over Cassandra [16], a well-known general-purposedata store. Experimental evaluation over a set of real-worldapplications, including a Twitter-like micro-blogging site and aneBay-like auction site illustrates the practicality of our approach.

The rest of the paper is organized as follows. The next sectiondescribes the system model. We describe the challenges in program-ming under eventual consistency, and introduce QUELEA contractsas a proposed solution to overcome these issues in § 3. § 4 pro-vides more details on the contract language, and its mapping tostore consistency levels, along with meta-theory for certifying thecorrectness of the mapping. § 6 introduces transaction contracts andtheir classification. § 7 describes the implementation of QUELEA ontop of Cassandra. § 8 discusses experimental evaluation. § 9 and 10present related work and conclusions.

2. System ModelIn this section, we describe the system model and introduce theprimitive relations that our contract language is seeded with. Figure 1presents a schematic diagram of our system model. The distributedstore is composed of a collection of replicas, each of which stores aset of objects (x, y, . . .). We assume that every object is replicated

at every replica in the store. The state of an object at any replica isthe set of all updates (effects) performed on the object. For example,the state of x at replica 1 is the set composed of effects wx1 and wx2 .

Each object is associated with a set of operations. The clientsinteract with the store by invoking operations on objects. Thesequence of operations invoked by a particular client on the storeis called a session. The data store is typically accessed by a largenumber of clients (and hence sessions) concurrently. Importantly,the clients are oblivious to which replica an operation is appliedto; the data store may choose to route the operation to any replicain order to minimize latency, balance load, etc. For example, theoperations foo and bar invoked by the same session on the sameobject, might end up being applied to different replicas becausereplica 1 (to which foo was applied) might be unreachable when theclient invokes bar.

When foo is invoked on a object x with arguments arg1 atreplica 1, it simply reduces over the current set of effects at thatreplica on that object (wx1 and wx2 ), produces a result v1 that issent back to the client, and emits a single new effect wx4 that isappended to the state of x at replica 1. Thus, every operation isevaluated over a snapshot of the state of the object on which it isinvoked. In this case, the effectswx1 andwx2 are visible towx4 , writtenlogically as vis(wx1 , w

x4 ) ∧ vis(wx2 , w

x4 ), where vis is the visibility

relation between effects. Visibility is an irreflexive and asymmetricrelation, and only relates effects produced by operations on the sameobject. Executing a read-only operation is similar except that nonew effects are produced. The effect added to a particular replicais asynchronously sent to other replicas, and eventually mergedinto all other replicas. Observe that this model does not assumea particular resolution strategy for concurrent conflicting updates,and instead preserves every update. Update conflicts are resolvedwhen an operation reduces over the set of effects on an object at aparticular replica.

Two effects wx4 and wx5 that arise from the same session are saidto be in session order (written logically as so(wx4 , w

x5 )). Session

order is an irreflexive, transitive relation. The effects wx4 and wx5arising from operations applied to the same object x are said to beunder the same object relation, written sameobj(wx4 , w

x5 ). Finally,

we can associate every effect with the operation that generatedthe effect with the help of a relation oper. In the current example,oper(wx4 , foo) and oper(wx5 , bar) hold. For simplicity, we assumeall operation names across all object are distinct.

This model admits all the inconsistencies associated with even-tual consistency. The goal of this work is to identify the preciseconsistency level for each operation such that application-level con-straints are not violated. In the next section, we will concretelydescribe the challenges associated with constructing a consistentbank account on top of an ECDS. Subsequently, we will illustratehow our contract and specification language, armed with the primi-tive relations vis, so, sameobj and oper, mitigates these challenges.

3. MotivationConsider how we might implement a highly available bank accounton top of an ECDS, with the integrity constraint that the balancemust be non-negative. We begin by implementing a bank accountreplicated data type (RDT) in QUELEA, and then describe themechanisms to obtain the desired correctness guarantees.

3.1 RDT SpecificationA key novelty in QUELEA is that it allows the addition of newRDTs to the store, which obviates the need for coercing applicationlogic to utilize store-provided data types. In addition, QUELEAtreats the convergence semantics (i.e., how conflicting updatesare resolved) of the data type separately from its consistencyproperties (i.e., when updates become visible). This separation of

data Acc = Deposit Int | Withdraw Int | GetBal

getBalance :: [Acc] → () → (Int, Maybe Acc)getBalance hist _ =let res = sum [x | Deposit x ← hist] -

sum [x | Withdraw x ← hist]in (res, Nothing)

deposit :: [Acc] → Int → ((), Maybe Acc)deposit hist amt = ((), Just $ Deposit amt)

withdraw :: [Acc] → Int → (Bool, Maybe Acc)withdraw hist v =if sel1 $ getBalance hist () ≥ vthen (True, Just $ Withdraw v)else (False, Nothing)

Figure 2: Definition of a bank account expressed in Quelea.

concerns permits operational reasoning for conflict resolution, anddeclarative reasoning for consistency. The combination of thesetechniques enhances the programmability of the store.

Let us assume that the bank account object provides three opera-tions: deposit, withdraw and getBalance, with the assumptionthat the withdraw fails if the account has insufficient balance. Everyoperation in QUELEA is of the following type, written in Haskellsyntax:

type Operation e a r = [e] → a → (r, Maybe e)

An operation takes a list of effects (the history of updates to thatobject), and an input argument, and returns a result along withan optional effect (read-only operations return Nothing). Thenew effect (if emitted) is added to the state of the object at thecurrent replica, and asynchronously sent to other replicas. Theimplementation of the bank account operations in QUELEA is givenin Figure 2.

The datatype Acc represents the effect type for the bank account.The function sum returns the sum of elements in the list, and sel1returns the first element of a tuple. For each operation, hist is asnapshot of the state of the object at some replica. In this sense, everyoperation on the RDT is atomic, and thus amenable to sequentialreasoning. Here, getBalance is a read-only operation, depositalways emits an effect, and withdraw only emits an effect if thereis sufficient balance in the account. We have implemented a largecorpus of RDTs for realistic benchmarks including shopping carts,auction and micro-blogging sites, etc. in a few tens of lines of code,expressed in this style.

3.1.1 SummarizationObserve that the definition of getBalance reduces over the entirehistory of updates to an account. If we are to realize an efficientimplementation of this bank account RDT, we need a summary of theaccount history. Intuitively, the current account balance summarizesthe state of an account. A bank account with the history [Deposit10, Withdraw 5] is observably equivalent to a bank account witha single deposit operation [Deposit 5]; we can replace the earlierhistory with the latter and a client of the store would not able to tellthe difference between the two.

This notion of observable equivalence can be generalized to otherRDTs as well. For example, a last-writer-wins register with multipleupdates is equivalent to a register with only the last write. Similarly,a set with a collection of add and remove operations is equivalent toa set with a series of additions of live elements from the original set.Since the notion of observable equivalence is specific to each RDT,programmers can provide a summarization function - of type [e]

-> [e] - as a part of the RDT specification. The summarizationfunction for the bank account is:

summarize hist =[Deposit $ sel1 $ getBalance hist ()]

Given a bank account history hist, the summarize function returnsa new history with a single deposit of the current account balance.Our implementation invokes the summarization function associatedwith an RDT to reduce the size of the effect sets maintained byreplicas.

3.2 Anomalies under Eventual ConsistencyOur goal is to choose the correct consistency level for each ofthe bank account operations such that (1) the balance remains non-negative and (2) the getBalance operation never incorrectly returnsa negative balance.

Session 1

withdraw (70)

Session 2vis

getBalance → -50

withdraw (80)

deposit (100)vis so

vis

vis so

(a) Unsafe withdraw

deposit (100)

Session 1

withdraw (50)

Session 2

getBalance → -50

Session 3

vis

vis

(b) Negative balance

deposit (100)

withdraw (50)

getBalance → 100

vis, so

so

vis

Session 1

(c) Missing update

Figure 3: Anomalies possible under eventual consistency for theget balance operation.

Consider the execution shown in Figure 3(a). Assume that alloperations in the figure are on the same bank account object withthe initial balance being zero. Session 1 performs a deposit of 100,followed by a withdraw of 80 in the same session. The withdrawoperation witnesses the deposit and succeeds1. Subsequently, session2 perform a withdraw operation, but importantly, due to eventualconsistency, only witnesses the deposit from session 1, but not thesubsequent withdraw. Hence, this withdraw also incorrectly suc-ceeds, violating the integrity constraint. A subsequent getBalanceoperation, that happens to witness all the previous operations, wouldreport a negative balance.

It is easy to see that preventing concurrent withdraw opera-tions eliminates this anomaly. This can be done by insisting thatwithdraw be executed as a strongly consistent operation. Despitethis strengthening, the getBalance operation may still incorrectlyreport a negative balance to the user. Consider the execution shownin fig. 3(b), which consists of three concurrent sessions performing adeposit, a withdraw, and a getBalance operation, respectively,on the same bank account object. As the vis edge indicates, operationwithdraw(50) in session 2 witnesses the effects of deposit(100)from session 1, concludes that there is sufficient balance, and com-pletes successfully. However, the getBalance operation may onlywitness this successful withdraw, but not the causally precedingdeposit, and reports the balance of negative 50 to the user.

Under eventual consistency, the users may also be exposed toother forms of inconsistencies. Figure 3(c) shows an executionwhere the getBalance operation in a session does not witnessthe effects of an earlier withdraw operation performed in thesame session, possibly because it was served by a replica that has

1 Although visibility and session order relations relate effects, we have abusedthe notation in these examples to relate operations, with the idea that therelations relate the effect emitted by those operations.

not yet merged the withdraw effect. This anomaly leads the userto incorrectly conclude that the withdraw operation failed to gothrough.

Although it is easy to understand the reasons behind the occur-rence of the aforementioned anomalies, finding the appropriate fixesis not readily apparent. Making getBalance a strongly consistentoperation is definitely sufficient to avert anomalies, but is it reallynecessary? Given the cost of enforcing strong consistency [25, 28],it is preferable to avoid imposing such stringent conditions unlessthere are no viable alternatives. Exploring the space of these alterna-tives requires understanding the subtle differences in semantics ofvarious kinds of weak consistency alternatives.

3.3 ContractsQUELEA helps facilitate the mapping of operations to appropriateconsistency levels by letting the programmer declare application-level consistency constraints as contracts2 (Figure 4) that axiomati-cally specify the set of allowed executions involving this operation.In the case of the bank account, any execution that does not exhibitthe anomalies described in the previous section is a well-formedexecution on the bank account object. By specifying the set of legalexecutions for each data type in terms of a trace of operation invo-cations on that type, QUELEA ensures that all executions over thattype are well-formed.

In our running example, it is clear that in order to preservethe critical integrity constraint, the withdraw operation must bestrongly consistent. That is, given two withdraw operations a andb, either a is visible to b or vice-versa. We express this application-level consistency requirement as a contract (ψw) over withdraw:

∀(a : withdraw).

sameobj(a, η̂)⇒ a = η̂ ∨ vis(a, η̂) ∨ vis(η̂, a)

Here, η̂ stands for the effect emitted by the withdraw operation.The syntax a : withdraw states that a is an effect emitted by awithdraw operation i.e., oper(a, withdraw) holds. The contractspecifies that if the current operation emits an effect η̂, then forany operation a which was emitted by a withdraw operation, itis the case that a = η̂ or a is visible to η̂, or vice versa. Anyexecution on a bank account object that preserves the above contractfor a withdraw operation is said to be derived from a correctimplementation of withdraw.

To prevent getBalance from ever showing a negative balance,it is necessary to prevent the scenario depicted in Figure 3(b).Let η̂ stand for the effect emitted by the getBalance operation.If the effect (b) of a withdraw operation is visible to η̂, and theeffect (a) of a deposit operation is visible to the effect (b) of thewithdraw operation, then it must be the case that a is also visibleto η̂. A contract (ψ1

gb) for getBalance that precisely captures thisapplication-level consistency requirement can be written thus:

∀(a : deposit), (b : withdraw).

(vis(a, b) ∧ vis(b, η̂)⇒ vis(a, η̂))

To prevent the missing update anomaly described in Figure 3(c), it isnecessary for a getBalance operation on a bank account to witnessthe effects of all previous deposit and withdraw operationsperformed on the same bank account in the same session. We canexpress an additional contract (ψ2

gb) for getBalance that capturesthis consistency requirement:

∀(c : deposit ∨ withdraw).

((so ∩ sameobj)(c, η̂)⇒ vis(c, η̂))

Our contract language provides operators to compose relations.The syntax (R1 ∩ R2)(a, b) is equivalent to R1(a, b) ∧ R2(a, b).

2 QUELEA exposes the contract construction language as a Haskell library

x, y, η̂ ∈ EffVar Op ∈ OperName

ψ ∈ Contract ::= ∀(x : τ).ψ | ∀x.ψ | πτ ∈ EffType ::= Op | τ ∨ τπ ∈ Prop ::= true | R(x, y) | π ∨ π

| π ∧ π | π ⇒ π

R ∈ Relation ::= vis | so | sameobj | =

| R ∪R | R ∩R | R+

Figure 4: Contract language.

The above contract (ψ2gb) says that if a deposit or a withdraw

operation precedes a getBalance operation in session order, andis applied on the same object as the getBalance operation, thenit must be the case that the getBalance operation witnesses theeffects of the preceding operations.

The final contract (ψgb) of the getBalance operation is merelya conjunction of the previous two versions (ψ1

gb and ψ2gb):

∀(a : deposit), (b : withdraw), (c : deposit ∨ withdraw).

(vis(a, b) ∧ vis(b, η̂)⇒ vis(a, η̂))

∧ ((so ∩ sameobj)(c, η̂)⇒ vis(c, η̂))

Intuitively, this specification prohibits both the getBalance anoma-lies described in Figures. 3(b) and 3(c) from occurring.

Finally, since there are no restrictions on when or how a depositoperation can execute, its contract is simply true.

3.4 From Contracts to ImplementationNotice that the contracts for withdraw and getBalance onlyexpress application-level consistency requirements, and make noreference to the semantics of the underlying store. To write contracts,a programmer only needs to reason about the semantics of theapplication under the QUELEA system model. The mapping ofapplication-level consistency requirements to appropriate store-levelguarantees is done automatically behind-the-scene. How mightone go about ensuring that an execution adheres to a contract?The challenge is that a contract provides a declarative (axiomatic)specification of an execution, while what is required is an operationalprocedure for enforcing its implicit constraints.

One strategy would be to execute operations speculatively. Here,operations are tentatively applied as they are received from theclient or other replicas. We can maintain a runtime manifestationof executions, and check well-formedness conditions at runtime,rolling back executions if they are ill-formed. However, the overheadof state maintenance and the complexity of user-defined contracts islikely to make this technique infeasible in practice.

We devise a static approach instead. Contracts are analyzed withthe help of a theorem prover, and statically mapped to a particularstore-level consistency property that the prover guarantees preservescontract semantics. We call this procedure contract classification.Given the variety and complexity of store level consistency prop-erties, the idea is that the system implementer parameterizes theclassification procedure by describing the store semantics in thesame contract language as the one used to express the contract onthe operations. In the next section, we describe the contract languagein detail and describe the classification procedure for a particularstore semantics.

4. Contract Language4.1 SyntaxThe syntax of our core contract language is shown in Figure 4. Thelanguage is based on first-order logic (FOL), and admits prenexuniversal quantification over typed and untyped effect variables.We use a special effect variable (η̂) to denote the effect of current

η ∈ Effect ψ ∈ Contract η ∈ Effect Set

A ∈ EffSoup ::= η

vis, so, sameobj ∈ Relations ::= A× AE ∈ ExecState ::= (A,vis,so,sameobj)

Figure 5: Axiomatic execution.

operation - the operation for which a contract is being written.Notice that η̂ occurs free in the contract. We fix its scope whenclassifying contracts (§ 4.4). The type of an effect is simply the nameof the operation (eg: withdraw) that induced the effect. We admitdisjunction in types to let an effect variable range over multipleoperation names. The contract ∀(a : τ1 ∨ τ2). ψ is just syntacticsugar for ∀a.(oper(a, τ1) ∨ oper(a, τ2)) ⇒ ψ. An untyped effectvariable ranges over all operation names.

Quantifier-free propositions in our contract language are con-junctions, disjunctions and implications of predicates expressingrelations between pairs of effect variables. The syntactic class ofrelations is seeded with primitive vis, so, and sameobj relations,and also admits derived relations that are expressible as union,intersection, or transitive closure3 of primitive relations. Com-monly used derived relations are the same object session order(soo = so ∩ sameobj), happens-before order (hb = (so ∪ vis)+)and the same object happens-before order (hbo = (soo ∪ vis)+).

4.2 SemanticsQUELEA contracts are constraints over axiomatic definitions ofprogram executions. Figure 5 summarizes artifacts relevant to definean axiomatic execution. We formalize an axiomatic execution as atuple (A,vis,so,sameobj), where A, called the effect soup, is theset of all effects generated during the program execution, andvis, so, sameobj ⊆ A × A are visibility, session order, and sameobject relations, respectively, witnessed over generated effects atrun-time.

Note that the axiomatic definition of an execution (E) providesinterpretations for primitive relations (eg: vis) that occur free incontract formulas, and also fixes the domain of quantification to setof all effects (A) observed during the program execution. As such,E is a potential model for any first-order formula (ψ) expressible inour contract language. If E is indeed a valid model for ψ (written asE |= ψ), we say that the execution E satisfied the contract ψ:

Definition 1. An axiomatic execution E satisfies a contract ψ if andonly if E |= ψ.

4.3 Capturing Store SemanticsAn important aspect of our contract language is its ability to capturestore-level consistency guarantees, along with application-levelconsistency requirements. Similar to [10], we can rigorously definea wide variety of store semantics including those that combine anysubset of session and causality guarantees, and multiple consistencylevels. However, for our purposes, we identify three particularconsistency levels – eventual, causal, and strong, commonly offeredby many distributed stores with tunable consistency, with increasingoverhead in terms of latency and availability.

• Eventual consistency: Eventually consistent operations can besatisfied as long as the client can reach at least one replica. Inthe bank account example, deposit is an eventually consistentoperation. While an ECDS typically offers basic eventual con-sistency with all possible anomalies, we assume that our store

3 Strictly speaking,R+ is not the transitive closure ofR, as transitive closureis not expressible in FOL. Instead, R+ in our language denotes a superset oftransitive closure ofR. Formally,R+ is any relationR′ such that forall x, y,and z, a) R(x, y)⇒ R′(x, y), and b) R′(x, y) ∧ R′(y, z)⇒ R′(x, z).

ψ ≤ ψsc

WellFormed(ψ)

ψ ≤ ψec

EventuallyConsistent(ψ)

ψ 6≤ ψec ψ ≤ ψcc

CausallyConsistent(ψ)

ψ 6≤ ψcc ψ ≤ ψsc

StronglyConsistent(ψ)

Figure 6: Contract classification.

provides stronger semantics that remain highly-available [2, 19];the store always exposes a causal cut of the updates. This seman-tics can be formally captured in terms of the following contractdefinition:

ψec = ∀a, b. hbo(a, b) ∧ vis(b, η̂)⇒ vis(a, η̂)

The above contract mandates that an effect a must be visible tothe current effect η̂ if some effect b that causally succeeds a isalso visible to η̂. Thus, if every store replica always maintainsand exposes a causal cut of updates to the client, then such astore will satisfy this contract. In such a system, an eventuallyconsistent operation, such as deposit, which requires weakerguarantees than offered by the store, can be satisfied as long assome replica is reachable.

• Causal consistency: Causally consistent operations are requiredto see a causally consistent snapshot of the object state, includingthe actions performed on the same session. The latter require-ment implies that if two operations o1 and o2 from the samesession are applied to two different replicas r1 and r2, the secondoperation cannot be discharged until the effect of o1 is includedin r2. The getBalance operation requires causal consistency,as it requires operations from the same session to be visible,which cannot be guaranteed under eventual consistency. Thecorresponding store semantics is captured by the contract ψcc

defined below:

ψcc = ∀a. hbo(a, η̂)⇒ vis(a, η̂)

• Strong consistency: Strongly consistent operations may blockindefinitely under network partitions. An example is the total-order contract on withdraw operation. The corresponding storesemantics is captured by the ψsc contract definition:

ψsc = ∀a. sameobj(a, η̂)⇒ vis(a, η̂) ∨ vis(η̂, a) ∨ a = η̂

4.4 Contract ClassificationOur goal is to map application-level consistency constraints onoperations to appropriate store-level consistency guarantees capableof satisfying these constraints. The ability to express both thesekinds of constraints as contracts in our contract language lets uscompare and determine if contract (ψop) of an operation (op) isweak enough to be satisfied under a store consistency level identifiedby the contract ψst. Towards this end, we define a binary weakerthan relation for our contract language as following:

Definition 2. A contract ψop is said to be weaker than ψst (writtenψop ≤ ψst ) if and only if ∆ ` ∀η̂.ψst ⇒ ψop.

The quantifier in the sequent binds η̂ that occurs free in ψst andψop. The context (∆) of the sequent is a conjunction of assumptionsabout the nature of primitive relations. A well-formed axiomaticexecution (E) is expected to satisfy these assumptions (i.e., E |= ∆).

Definition 3. An axiomatic execution E = (A,vis,so,sameobj) iswell-formed if the following axioms (∆) hold:

• The happens-before relation is acyclic: ∀a. ¬hbo(a, a).• Visibility only relates actions on the same object:

∀a, b. vis(a, b)⇒ sameobj(a, b).• Session order is a transitive relation:

∀a, b, c. so(a, b) ∧ so(b, c)⇒ so(a, c).• Same object is an equivalence relation:

∀a. sameobj(a, a).∀a, b. sameobj(a, b)⇒ sameobj(b, a).∀a, b, c. sameobj(a, b) ∧ sameobj(b, c)⇒ sameobj(a, c).

If the contract (ψop) of an operation (op) is weaker than a storecontract (ψst), then constraints expressed by the former are impliedby guarantees provided by the latter. The completeness of first-orderlogic allows us to assert that any well-formed execution (E) thatsatisfies ψst (i.e., E |= ψst) also satisfies ψop (i.e., E |= ψop).Consequently, it is safe to execute operation op under a storeconsistency level captured by ψst.

Observe that the contracts ψsc, ψcc and ψec are themselves totallyordered with respect to the ≤ relation: ψec ≤ ψcc ≤ ψsc. Thisconcurs with the intuition that any contract satisfiable under ψec

or ψcc is satisfiable under ψsc, and any contract that is satisfiableunder ψec is satisfiable under ψcc. We are interested in the weakestguarantee (among ψec, ψcc, and ψsc) required to satisfy the contract.We define the corresponding consistency level as the consistencyclass of the contract.

The classification scheme, presented formally in Figure 6, definesrules to judge the consistency class of a contact. For example, thescheme classifies the getBalance contract (ψgb) from § 3 as aCausallyConsistent contract, because the sequent ∆ ` ∀η̂.ψcc ⇒ ψgbis valid in first-order logic (therefore, ψgb ≤ ψcc), whereas thesequent ∆ ` ∀η̂.ψec ⇒ ψgb is invalid (therefore, ψgb 6≤ ψec). Sincewe confine our contract language to a decidable subset of the logic,validity of such sequents can be decided mechanically allowing usto automate the classification scheme in QUELEA.

Along with three straightforward rules that classify contracts intoconsistency classes, the classification scheme also presents a rulethat judges well-formedness of a contract. A contract is well-formedif and only if it is satisfiable under ψsc - the strongest possibleconsistency guarantee that the store can provide. Otherwise, it isconsidered ill-formed, and rejected statically.

4.5 Generality of ContractsIt is important to note that our contract language provides a genericway to capture application-level consistency properties and isnot tied to a particular store semantics. In particular, the sameapplication-level contracts can easily be mapped to a different storewith a varied consistency lattice. To illustrate this, let us considerthe consistency lattice proposed by Terry et al. [27] based on sessionguarantees. Terry et al. propose the following four incomparablesession guarantees, whose semantics is captured in the contractsbelow:

Read Your Writes (RYW) ::= ∀a. soo(a, η̂)⇒ vis(a, η̂)

Monotonic Reads (MR) ::= ∀a, b. vis(a, b) ∧ soo(b, η̂)

⇒ vis(a, η̂)

Monotonic Writes (MW) ::= ∀a, b. soo(a, b) ∧ vis(b, η̂)

⇒ vis(a, η̂)

Writes Follow Reads (WFR) ::= ∀a, b, c. vis(a, b) ∧ vis(c, η̂)

∧(soo ∪ =)(b, c)⇒ vis(a, η̂)

In this scheme, the consistency level of an operation is anycombination of the above guarantees, which form a partially orderedconsistency lattice show in Figure 7. Each element in this latticecorresponds to a store-consistency level, and is represented byits contract. An edge from an upper level element to a lowerlevel element corresponds to a weaker-than relation between thecorresponding contracts. Classifying a contract under this schemeis a directed search in the lattice, starting from the bottom, and

true

RYW MW MR WFR

RYW ∧ MW RYW ∧ MR RYW ∧ WFRMW ∧ MR MW ∧ WFR MR ∧ WFR

RYW ∧ MW ∧ MR

RYW ∧ MW ∧ WFR

RYW ∧ MR ∧ WFR

MW ∧ MR ∧ WFR

RYW ∧ MW ∧ MR ∧ WFR

Figure 7: Lattice of consistency levels under session guarantees.

determining the weakest consistency level under which the contractcan be satisfied. Under this scheme, deposit operations do notneed any guarantees, getBalance needs RYW and WFR (ψgb ≤RYW ∧WFR), and withdraw cannot be satisfied (ψw 6≤ RYW ∧MW ∧MR ∧WFR).

4.6 Soundness of Contract ClassificationWe now present a meta-theoretic result that certifies the soundnessof classification-based contract enforcement. To help us state theresult, we define an operational semantics of the system describedinformally in § 2:

op ∈ Operation

τ ∈ ConsistencyClass ::= ec, cc, sc

σ ∈ Session ::= · | 〈op, τ〉;σΣ ∈ Session Soup ::= σ ‖ Σ | ∅

Config ::= E,Σ

We model the system as a tuple E,Σ, where the axiomaticexecution E captures the data store’s current state, and session soupΣ is the set of concurrent client sessions interacting with the store.A session σ is a sequence of pairs composed of replicated data typeoperations op, tagged with the consistency class τ of their contracts(as determined by the contract classification scheme). We assume areduction relation of form:

E, 〈op, τ〉;σ ‖ Ση−→ E′, σ ‖ Σ

on the system state. The relation captures the progress of theexecution (from E to E′) due to the successful completion of aclient operation op from one of the sessions in Σ, generating a neweffect η. If the resultant execution E′ satisfies the store contract ψτ(i.e., E |= ψτ ), then we say that the store has enforced the contractψτ in the execution E′. With help of the operational semantics, wenow state the soundness of contract enforcement as follows:

Theorem 4 (Soundness of Contract Enforcement). Let ψ be a well-formed contract of a replicated data type operation op, and let τdenote the consistency class of ψ as determined by the contractclassification scheme. For all well-formed execution states E, E′

such that E, 〈op, τ〉;σ ‖ Ση−→ E′, σ ‖ Σ, if E′ |= ψτ [η/η̂], then

E′ |= ψ[η/η̂]

The theorem states that if a data store correctly enforces ψsc, ψcc,and ψec contracts in all well-formed executions, then the samestore, extended with the classification scheme shown in Figure 6,can enforce all well-formed QUELEA contracts. The proof of thetheorem is given below:

Proof. Hypothesis:

E, 〈op, τ〉;σ ‖ Ση−→ E′, σ ‖ Σ H0

E′ |= ψτ [η/η̂] H1

Since τ is the contract class of ψ, by inversion, we have ψ ≤ ψτ .By the definition of ≤ relation:

∆ ` ∀η̂.ψτ ⇒ ψ H2

Since η denotes new effect, it is a fresh variable that does not occurfree in ∆. From H3, after instantiating bound η̂ with η, we have:

∆ ` ψτ [η/η̂]⇒ ψ[η/η̂] H3

Due to the soundness of natural deduction for first-order logic, H4implies that for all modelsM such thatM |= ∆, ifM |= ψτ [η/η̂]thenM |= ψ[η/η̂]. Since E′ is well-formed, we have:

E′ |= ∆ H4

Proof follows from H1, H4, and H3.It is important to note that Theorem 4 does not ascribe any seman-

tics to the reduction relation (−→). As such, it makes no assumptionsabout how the store actually implements ec, cc and sc guarantees.The specific implementation strategy is determined by the opera-tional semantics of the store, which defines the reduction relationfor that particular store. The following section describes operationalsemantics of the store used by the QUELEA implementation.

5. Operational SemanticsWe now describe operational semantics of a data store that im-plements strong, causal and eventual consistency guarantees. Thesemantics also serves as a high-level description of our implementa-tion of the store underlying QUELEA.

Figure 8 presents operational semantics as rules defining thereduction relation (−→) over the execution state. Since we now have aconcrete store, we extend our system model with Θ, a representationof the store as a map from replicas to their local states. The local stateof a replica r (i.e., Θ(r)) is a set of effects that are currently visibleat r. An operation op performed at replica r can only witness the setof effects (Θ(r)) visible at r. To avoid too many parentheses in ourformal presentation, we represent operation op, whose contract is inconsistency class τ , as opτ instead of the usual 〈op, τ〉. For the sakeof clarity, we only consider a single replicated object of well-definedtype (for eg: a replicated object of type BankAccount) in ourformalization. Our semantics are parametric over the specificationof this replicated data type. Figure 8 formalizes replicated data type(RDT) specification as tuple (δ,Λ,Ψ), where δ is the data type, Λmaps labels (op) of operations on δ to their definitions, while Ψmaps them to their consistency contracts (ψ). The definition of anoperation is expected to be a lambda expression, although we do notenforce this in our formalization. For technical reasons, we tag eachsession with a session identifier (s) and the sequence number (i) ofthe next operation in the session.

The state of an operational execution (E) is a tuple (A,vis,so,sameobj), where A is a set of effects, and vis, so, sameobj ⊆A× A are visibility, session order, and same object relations overeffects, respectively. We define an effect (η) as a tuple (s, i, op, v),which records the fact that ith action in session with SessIDs, which is an operation op on the replicated object, has beensuccessfully executed on some replica yielding a return value v.Note that the combination of s and i uniquely identifies the effect.Session order relation (so) relates effects generated by the samesession. An effect η = (s, i, op, v) is said to precede another effectη′ = (s′, i′, op′, v′) in session order if and only if s′ = s and i′ ≥ i.Since we only consider one replicated object in our formalization,the sameobj relation relates every pair of effects in the effect soup(A). An effect generated at a replica becomes visible at rest ofthe replicas eventually. If we denote the effect generated by the

operation op as ηop, then Θ(r) × {ηop} ⊆ vis. Often, in ourformalization, we use vis and so binary relations to obtain a setof effects visible to a given effect η, or set of effects that precedea given effect η in the session order. As a syntactic convenience,whenever R is a binary relation, we write R(η) to denote the set ofall η′ such that (η, η′) ∈ R. Conversely, we writeR−1(η) to denotethe set of all η′ such that (η′, η) ∈ R.

Basic guarantee provided by the store is causal visibility, whichis captured by the rule [EFFVIS] as a condition for an effect to bevisible at a replica. The rule makes an effect (η) visible at a replicar only after all the effects that causally precede η are made visibleat r. It is important to note that that enforcing causal visibility doesnot require any inter-replica coordination. Any eventually consistentstore can provide causal visibility while being eventually consistent.Therefore, we do not lose any generality by assuming that the storeprovides causal visibility.

Rule [OPER] is an auxiliary reduction of the form:Θ ` (E, 〈s, i, op〉)

r↪−→ (E′, η)

Under the store configuration Θ, the rule captures the progress inexecution (from E to E′) due to the application of operation opto replica r resulting in a new effect η. The rule first constructs acontext for the application from the local state (Θ(r)) of the replica,by projecting4 relevant information from effects in Θ(r). It thensubstitutes the definition (Λ(op)) of the operation for its label (op),and relies on the reduction relation ( ) of the server-side languageto reduce the application Λ(op)(ctxt) to a value v′. Subsequently,the the attributes of execution state, namely A, vis, so, and sameobjare extended to incorporate the new effect (η).

If the operation op is EventuallyConsistent, we simply apply theoperation to any replica r. Since the store provides causal visibility,eventually consistent operations are satisfiable under any replica. Ifthe operation is CausallyConsistent, the operation can be applied toa replica r only if it already contains the effects of all the previousoperations from the same session. This guarantee can be satisfied byapplying all operations from the same session to the same logicalcopy of the database. If such a logical copy is untenable, thenthe operation might block. Since the store is assumed to convergeeventually, the blocked causally consistent operation guaranteed tounblock eventually.

A StronglyConsistent operation expects sequential consistency.That is, universe of all effects (A) in an execution (E) must bepartitionable into a set of effects that happened before η andanother set that happened after η, where η is the effect generatedby an strongly consistent operation. The rule [SC] enforces thissequencing in two steps; firstly, it insists that the the stronglyconsistent operation (op) witness effects of all operations executedso far by requiring the global set of effects A to be a subset oflocal state (Θ(r)) of the replica (r) executing op. Secondly, the rulerequires the effect (η) generated by op to be added to the local stateof every other replica in the store, so that further operations on thesereplicas can witness the effect of op. Since both these steps requireglobal coordination among replicas, the store is unavailable duringthe time it is executing op.

5.1 Soundness of Operational SemanticsWe now prove a meta-theoretic property that establishes the sound-ness of our operational semantics in enforcing ψec, ψcc, and ψsc

consistency guarantees at every reduction step. As a corollary ofthis result, and Theorem 4, we have the assurance that QUELEAcorrectly enforces all well-formed consistency contracts.

First, we prove a useful lemma:

4 ctxt* is auxiliary function ctxt extended straightforwardly to set of effects

RDT Specification Language

δ ∈ ReplicatedDatatype

v ∈ Value

e ∈ Expression

op ∈ Operation

Λ ∈ OperationDef. ::= op 7→ e

ψ ∈ Contract

Ψ ∈ OperationContract ::= op 7→ ψ

RDTSpecification ::= (δ,Λ,Ψ)

System Model

s ∈ SessID i ∈ SeqNo r ∈ ReplID

η ∈ Effect ::= (s, i, op, v)

A ∈ EffSoup ::= η

vis, so, sameobj ∈ Relations ::= A× AE ∈ ExecState ::= (A,vis,so,sameobj)

Θ ∈ Store ::= r 7→ η

τ ∈ ConsistencyClass ::= ec, cc, sc

σ ∈ Session ::= · | opτ :: σ

Σ ∈ Session Soup ::= 〈s, i, σ〉 ‖ Σ | ∅Config ::= (E,Θ,Σ)

Auxiliary Definitions

oper(s, i, op, v) = op ctxt(s, i, op, v) = (op, v)

Auxiliary Reductions Θ ` (E, 〈s, i, op〉) r↪−→ (E′, η)

[OPER]

r ∈ dom(Θ) ctxt = ctxt∗(Θ(r)) Λ(op)(ctxt) ∗v η = (s, i, op, v) {η′} = A(SessID=s, SeqNo=i−1)

A′ = {η} ∪ A vis′ = Θ(r)× {η} ∪ vis so′ = (so−1(η′) ∪ η′)× {η} ∪ so sameobj′ = A′ × A′

Θ ` ((A, vis, so, sameobj), 〈s, i, op〉r↪−→ ((A′, vis′, so′, sameobj′), η)

Operational Semantics (E,Θ,Σ)η−→ (E′,Θ′,Σ′)

[EFFVIS]

η ∈ A Θ′ = Θ ∪ [r 7→ {η} ∪Θ(r)]

η /∈ Θ(r) E.vis−1(η) ∪ E.so−1(η) ⊆ Θ(r)

(E,Θ,Σ)η−→ (E,Θ′,Σ)

[EC]

τ = EventuallyConsistent

Θ ` (E, 〈s, i, op〉)r↪−→ (E′, η)

(E,Θ, 〈s, i, opτ :: σ〉 ‖ Σ)η−→ (E′,Θ, 〈s, i+ 1, σ〉 ‖ Σ)

[CC]

τ = CausallyConsistent

Θ ` (E, 〈s, i, op〉)r↪−→ (E′, η) E′.so−1(η) ⊆ Θ(r)

(E,Θ, 〈s, i, opτ :: σ〉 ‖ Σ)η−→ (E′,Θ, 〈s, i+ 1, σ〉 ‖ Σ)

[SC]

τ = StronglyConsistent

Θ ` (E, 〈s, i, op〉)r↪−→ (E′, η) E.A ⊆ Θ(r)

dom(Θ′) = dom(Θ) ∀r′ ∈ dom(Θ′).Θ′(r′) = Θ(r′) ∪ {η}(E,Θ, 〈s, i, opτ :: σ〉 ‖ Σ)

η−→ (E,Θ′, 〈s, i+ 1, σ〉 ‖ Σ)

Figure 8: Operational semantics of a replicated data store.

Lemma 5 (Auxiliary Reduction Preserves Well-Formedness). Forevery well-formed execution state E, if Θ ` (E, 〈s, i, op〉) r

↪−→(E′, η), then E′ is well-formed.

Proof. Let us denote E.A, E.vis, E.so, and E.sameobj as A, vis,so, and sameobj respectively. Likewise, let us denote E′.A, E′.vis,E′.so, and E′.sameobj as A′, vis′, so′, and sameobj′, respectively.By inversion on Θ ` (E, 〈s, i, op〉) r

↪−→ (E′, η), we have followinghypotheses:

r ∈ dom(Θ) H0ctxt = ctxt∗(Θ(r)) H1η = (s, i, op, v) H2

{η′} = A(SessID=s, SeqNo=i−1) H3A′ = {η} ∪ A H4

vis′ = Θ(r)× {η} ∪ vis H5so′ = (so−1(η′) ∪ η′)× {η} ∪ so H6

sameobj′ = A′ × A′ H7

Since E is well-formed, from the definition of well-formedness andmodels relation, we have the following:

∀(a ∈ A).¬hb(a, a) H8∀(a, b ∈ A).vis(a, b)⇒ sameobj(a, b) H9

∀(a, b, c ∈ A).so(a, b) ∧ so(b, c)⇒ so(a, c) H10∀(a ∈ A).sameobj(a, a) H11

∀(a, b ∈ A).sameobj(a, b)⇒ sameobj(b, a) H12∀(a, b, c ∈ A).so(a, b) ∧ so(b, c)⇒ so(a, c) H13

Since hb = (vis ∪ so)+, H8 is equivalent to conjunction offollowing assertions:

∀(a ∈ A).¬vis(a, a) H14∀(a ∈ A).¬so(a, a) H15

∀(a, b ∈ A).¬(vis(a, b) ∧ so(b, a)) H16

Since η is fresh, η /∈ Θ(r). From H4, H5, and H14, we have theacyclicity property for vis′:

∀(a ∈ A′).¬vis′(a, a) H17

Also, η /∈ A, and from H4, H6 and H15, we have acyclicity forso′:

∀(a ∈ A′).¬so′(a, a) H18

Similarly, from the uniqueness of η, and H5, H6, and H16, wehave the following:

∀(a, b ∈ A′).¬(vis′(a, b) ∧ so′(b, a)) H19

From H17− 19, we prove the acyclicity of hb′:∀(a ∈ A′).¬hb(a, a) G0

The sameobj′ relation is simply the cross product A′ × A′. Hencefollowing trivially hold:

∀(a, b ∈ A′).vis′(a, b)⇒ sameobj′(a, b) G1∀(a ∈ A′).sameobj′(a, a) G2

∀(a, b ∈ A′).sameobj′(a, b)⇒ sameobj′(b, a) G3

Finally, from H3, H4, H6 and H10, we have transitivity for so′:∀(a, b, c ∈ A′).so′(a, b) ∧ so′(b, c)⇒ so′(a, c) G4

Well-formedness of E′ follows from G0− 4.We now define causal consistency property of the store formally:

Definition 6. A store Θ is said to be causally consistent under anexecution E = (A,vis,so,sameobj) if and only if:

∀(r ∈ dom(Θ)).∀(η ∈ Θ(r)).∀(a ∈ A).hbo(a, η)⇒ a ∈ Θ(r) H20

Where, hbo = (vis ∪ soo)+

The following theorem proves that our operational semanticscorrectly enforce ψec, ψcc, and ψsc guarantees:

Theorem 7 (Soundness Modulo Classification). For every well-formed execution state E, for every store Θ that is causally consis-tent under E, and for every contract class τ ∈ {ec, cc, sc}, if:

(E,Θ, 〈s, i, opτ :: σ〉 ‖ Σ)η−→ (E′,Θ′, 〈s, i+ 1, σ〉 ‖ Σ)

then (i) E′ is well-formed, and (ii) E′ |= ψτ [η/η̂]

Proof. By case analysis on the derivation:


Cases:• Case [EC]: Hypotheses:

τ = EventuallyConsistent H0

Θ ` (E, 〈s, i, op〉)r↪−→ (E′, η) H1

Goal (i) follows from H1 and lemma 5. Goal (ii) is the follow-ing:

E′ |= ∀a, b. hbo(a, b) ∧ vis(b, η)⇒ vis(a, η) G0

Let A′ = E′.A, vis′ = E′.vis, so′ = E′.so, and hbo′ = (so′ ∪vis′)+. By inversion on H1, we get the following hypotheses:

A′ = A ∪ {η} H2vis′ = Θ(r)× η ∪ vis H3

r ∈ dom(Θ) H4sameobj′ = A′ × A′ H5

Since E′ defines A′ as the universe of values. Therefore, the goalcan be rewritten:∀(a, b ∈ A′).E′ |= hbo(a, b) ∧ vis(b, η)⇒ vis(a, η) G1

New hypotheses after intros:a ∈ A′ H6b ∈ A′ H7

And new goal:E′ |= hbo(a, b) ∧ vis(b, η)⇒ vis(a, η) G2

Since (M |= A ⇒ B) ⇔ (M |= A ⇒ M |= B), we proveG0 by proving:

(E′ |= hbo(a, b) ∧ vis(b, η))⇒ (E′ |= vis(a, η)) G3

After intros:E′ |= hbo(a, b) ∧ vis(b, η) H8

Since E′ defines hbo′ and vis′ as interpretations for hbo and visrespectively, we have:

hbo′(a, b) ∧ vis′(b, η) H9

And the goal is:vis′(a, η) G4

Inversion on H9:hbo′(a, b) H10vis′(b, η) H11

Since η is unique, from H3 and H11 we have the following:b ∈ Θ(r) H12

Since a, b 6= η, we have that hbo′(a, b)⇒ hbo(a, b). Since Θis causally consistent under E, using H10 and H12 we derivethe following:

a ∈ Θ(r) H13

Now, from H3 and H13, we deduce:(a, η) ∈ vis′

which is what needs to be proven (G4).• Case [CC]: Hypotheses:

τ = CausallyConsistent H14

Θ ` (E, 〈s, i, op〉)r↪−→ (E′, η) H15

E′.so(η) ⊆ Θ(r) H16

Goal (i) follows from H15 and lemma 5. We now prove Goal(ii). Expanding the definition of ψcc, goal is the following:

E′ |= ∀a. hbo(a, η)⇒ vis(a, η) G5

Let A′ = E′.A, vis′ = E′.vis, so′ = E′.so, and hbo′ =(so′ ∪ vis′)+. By inversion on H15, we get the following:

A′ = A ∪ {η} H17vis′ = Θ(r)× η ∪ vis H18

{η′} = A(SessID=s, SeqNo=i−1)

so′ = (so−1(η′) ∪ η′)× {η} ∪ so H19sameobj′ = A′ × A′ H20

Expanding |= followed by intros on G5 yields a context withfollowing hypotheses:

a ∈ A′ H21hbo′(a, η) H22

And the goal is the following:vis′(a, η) G6

Since happens-before is transitive, by inversion on H22, we gettwo cases5:

SCase 1: Hypotheses:(so′ ∪ vis′)(a, η) H23

Note that we Inversion on H23 leads to two subcases. Inone case, we assume vis′(a, η) and try to prove the goalG6. The proof for this case mimics the proof for Case [EC].Alternatively, in second case, we assume:

so′(a, η) H24

and prove G6. From H24 and H16, we infer:a ∈ Θ(r) H25

Now, from H25 and H18 we know:(a, η) ∈ vis′

which is the goal (G6). 26

5 Recall that we only consider single replicated object in our formaliza-tion. Accordingly, for any execution E = (A,vis,so,sameobj), we havesameobj = A× A. Since, soo = so ∩ sameobj and so ⊆ A× A, we useso and soo interchangeably in proofs.

SCase 2: Hypotheses (after abbreviating the occurrenceof(so′ ∪ vis′)+ as hbo’):

∃(c ∈ A′).hbo′(a, c) ∧ (so′ ∪ vis′)(c, η) H27

Inverting H27, followed by expanding so′ ∪ vis′:c ∈ A′ H28

hbo′(a, c) ∧ (so′(c, η) ∨ vis′(c, η)) H29

Inverting the disjunction in H29, we get two cases:− SSCase R: Hypothesis is

hbo′(a, c) ∧ vis′(c, η) H30

Observe that hypothesis H30 and current goal (G6) aresame as hypothesis H9 and goal (G4) in Case [EC]. Theproof for this SSCase is also the same.

− SSCase L: Hypothesis ishbo′(a, c) ∧ so′(c, η) H31

Inverting H31:hbo′(a, c) H32so′(c, η) H33

From H33 and H16, we infer:c ∈ Θ(r) H34

Since a, c 6= η, we know that hbo′(a, c) ⇒ hbo(a, c).Since Θ is causally consistent under E, using H32 andH34 we derive the following:

a ∈ Θ(r) H35

Proof follows from H35 and H18.• Case [SC]: Hypotheses:

τ = StronglyConsistent H36

Θ ` (E, 〈s, i, op〉)r↪−→ (E′, η) H37

E.A ⊆ Θ(r) H38dom(Θ′) = dom(Θ) H39

∀r′ ∈ dom(Θ′).Θ′(r′) = Θ(r′) ∪ {η} H40

Let A′ = E′.A, vis′ = E′.vis, so′ = E′.so, and hbo′ =(so′ ∪ vis′)+. Inversion on H37 gives:

A′ = A ∪ {η} H41vis′ = Θ(r)× η ∪ vis H42



Goal (i) follows from H37 and Lemma 5. Expanding thedefinition of ψsc, followed by expanding |= relation, and thendoing intros, we get the following context:

a ∈ A′ H45

And the goals are following:sameobj′(a, η)⇒ hbo′(a, η) ∨ hbo′(η, a) ∨ a = η) G7

hbo′(a, η)⇒ vis′(a, η) G8hbo′(η, a)⇒ vis′(η, a) G9

From H41 and H45, we know that either a = η or a ∈ A.When a = η:

G7 follows trivially

From lemma 5, we know that hbo′ is acyclic. Hencehbo′(η, η) = false. Therefore, G8 − 9 are valid vacu-ously.

When a ∈ A:Intros on G7 gives following hypothesis:

sameobj′(a, η)

From H38 we know that a ∈ Θ(r). Using H42, we derive:vis′(a, η) H46

Introducing disjunction:(vis′ ∪ so′)(a, η) H47

Now, since hbo′ = (vis′ ∪ so′)+, proof follows from lasthypothesis.Intros on G8 gives following hypothesis:

hbo′(a, η)

From H38 we know that a ∈ Θ(r). Using H42, we derive:vis′(a, η) H48

Which proves G8.Intros on G9 gives:

hbo′(η, a) H49

From lemma 5, we know that E′ is well-formed. Hence:¬hbo′(a, a) H50

Since sameobj′ = A′ × A′, we have:sameobj′(a, η) H51

Using previous hypothesis, we can reuse the proof for G7 toderive:

hbo′(a, η) H52

Since hbo′ is a transitive relation, from H49 and H52, wederive:

hbo′(a, a) H53

H53 and H50 are contradicting hypothesis. Proof follows.

We now show that every configuration of the store that isreachable via the reduction relation (−→) is causally consistent.

Theorem 8 (Causal Consistency Preservation). For every well-formed execution state E, and a store Θ that is causally consistentunder E, if:


then Θ′ is causally consistent under E′.

Proof. By case analysis on on:


Cases:• Case [EFFVIS]: Final execution state is same as the initial (i.e.,

E′ = E). Therefore, we need to prove that new store configu-ration Θ is causally consistent under E = (A,vis,so,sameobj).Hypotheses:

r ∈ ReplID H0η ∈ A H1η /∈ Θ(r) H2

E.vis−1(η) ∪ E.so−1(η) ⊆ Θ(r) H3Θ′ = Θ ∪ [r 7→ {η} ∪Θ(r)] H4

∀(r ∈ dom(Θ)).∀(η ∈ Θ(r)).∀(a ∈ A).hbo(a, η)⇒ a ∈ Θ(r) H5

From H4 and H5, it suffices to prove:∀(a ∈ A).hbo(a, η)⇒ a ∈ Θ′(r) G0

After intros, hypotheses:a ∈ A H6

hbo(a, η) H7

Goal:a ∈ Θ′(r) G1

Inversion on H7 leads to two cases:SCase a directly precedes η : Hypothesis:

(vis ∪ so)(a, η) H8

From H3 and H8, we conclude that a ∈ Θ′(r).

SCase a transitively precedes η: Hypothesis:

∃(c ∈ A).hbo(a, c) ∧ (vis ∪ so)(c, η) H9

Inverting H9:

c ∈ A H10hbo(a, c) H11

(vis ∪ so)(c, η) H12

From H3 and H12, we have:

c ∈ Θ′(r) H13

From H5, H13 and H11, we conclude that a ∈ Θ′(r).• Cases [EC] and [CC]: Store configuration (Θ) remains un-

changed. Further, no new happens before order is added eitheramong existing effects, or from the newly generated effect toexisting effects. Consequently, proof is trivial.

• Case [SC]: Let E = (A,vis,so,sameobj) and E′ = (A′, vis′, so′,sameobj′). Hypotheses:

r ∈ ReplID H14τ = StronglyConsistent(ψ) H15

Θ ` (E, 〈s, i, op〉)r↪−→ (E′, η) H16

A ⊆ Θ(r) H17dom(Θ′) = dom(Θ) H18

∀r′ ∈ dom(Θ′).Θ′(r′) = Θ(r′) ∪ {η} H19∀(r′ ∈ dom(Θ)).∀(η′ ∈ Θ(r′)).

∀(a ∈ A).hbo(a, η′)⇒ a ∈ Θ(r′) H20

The goal:

∀(r′ ∈ dom(Θ′)).∀(η′ ∈ Θ′(r′)).∀(a ∈ A′).hbo′(a, η′)⇒ a ∈ Θ′(r′) G2

Inverting H16:

A′ = A ∪ {η} H21vis′ = Θ(r)× η ∪ vis H22



Using H19 and H20, we can reduce the goal (G2) to:

∀(r′ ∈ dom(Θ′)).∀(a ∈ A′).hbo′(a, η)⇒ a ∈ Θ′(r′) G3

After intros, hypotheses:

r′ ∈ dom(Θ′) H25a ∈ A′ H26

hbo′(a, η) H27

Goal:

a ∈ Θ′(r′) G4

From H26 and H21, we know that either a ∈ A or a = η.

If a ∈ A, then fromH17, we know that a ∈ Θ(r′). However,from H19 we know that Θ(r′) ⊂ Θ′(r′), which lets usconclude that a ∈ Θ′(r′).

If a = η, then H27 is hbo′(η, η). However, from lemma 5we know that E′ is well-formed, which means that hbo′ isacyclic. Hence, a contradiction. ex_falso_quodlibet.

Corollary 9 (Soundness). For every well-formed execution stateE, for every store Θ that is causally consistent under E, for everycontract class τ ∈ {ec, cc, sc}, and for every consistency contractψ in the contract class τ , If:


then (i) E′ is well-formed, (ii) Θ′ is causally consistent under E′,and (iii) E′ |= ψ[η/η̂]

Proof. Follows from Theorems 4, 8, and 7.

6. Transaction ContractsWhile contracts on individual operations offer the programmerobject-level declarative reasoning, real-world scenarios often in-volve operations that span multiple objects. In order to address thisproblem, several recent systems [2, 9, 26] have proposed eventuallyconsistent transactions in order to compose operations on multipleobjects. However, given that classical transaction models such asserializability and snapshot isolation require inter-replica coordi-nation, these systems espouse coordination-free transactions thatremain available under network partitions, but only provide weakerisolation guarantees. Coordination-free transactions have intricateconsistency semantics and widely varying runtime overheads. Aswith operation-level consistency, the onus is on the programmer topick the correct transaction kind. This choice is further complicatedby the consistency semantics of individual operations.

6.1 Syntax and Semantics ExtensionsQUELEA automates the choice of assigning the correct and mostefficient transaction isolation level. Similar to contracts on individualoperations, the programmer associates contracts with transactions,declaratively expressing consistency specifications. We extend thecontract language with a new term under quantifier-free propositions- txn S1 S2, where S1 and S2 are sets of effects, and introduce anew primitive equivalence relation sametxn that holds for effectsfrom the same transaction. txn{a, b}{c, d} is just syntactic sugar forsametxn(a, b) ∧ sametxn(c, d) ∧ ¬sametxn(a, c), where a and bconsidered to belong to the current transaction.

We assume that operations not part of any transaction belong totheir own unique transaction. While transactions may have varyingisolation guarantees, we make the standard assumption that alltransactions provide atomicity. Hence, we include the followingaxiom in ∆:

∀a, b, c. txn{a}{b, c} ∧ sameobj(b, c) ∧ vis(b, a)⇒ vis(c, a)

The semantics of this contract is illustrated in Figure 9(a).

6.2 Transactional Bank AccountIn order to illustrate the utility of declarative reasoning for trans-actions, consider an extension of our running example to use twoaccounts (objects) – current (c) and savings (s). Each account pro-vides operations withdraw, deposit and getBalance, with thesame contracts as defined previously. We consider two transactions– save(amt), which transfers amt from current to savings, andtotalBalance, which returns the sum of the balances of individ-ual accounts. Our goal is to ensure that totalBalance returns theresult obtained from a consistent snapshot of the object states. TheQUELEA code for these transactions is given below:

save amt =x ← $(classify ψsv)atomically x $ dob ← withdraw c amtwhen b $ deposit s amt

totalBalance =x ← $(classify ψtb)atomically x $ do

b1 ← getBalance cb2 ← getBalance sreturn b1 + b2

ψsv and ψtb are the contracts on the corresponding transactions.The function classify assigns the contracts statically to one ofthe transaction isolation levels offered by the store; $() is meta-programming syntax for splicing the result into the program. Theatomically construct invokes the enclosing operations at the givenisolation level x, ensuring that the effects of the operations are madevisible atomically.

While making both transactions serializable would ensure cor-rectness, distributed stores rarely offer serializable transactionsgiven their availability requirements and implications for scalabil-

x.b x.c

Txn 2x.a

vis vis

Txn 1

(a) Atomicity

x.c y.d

Txn 2x.a

vis vis

Txn 1

y.bso

(b) MonotonicAtomic View

x.c y.d

Txn 2x.a

vis vis

Txn 1

y.b

(c) RepeatableRead

Figure 9: Semantics of transaction contracts. x and y are distinctobjects. The dotted line represents the visibility requested by thecontracts.

ity [2]. Fortunately, these transactions can be satisfied with muchweaker isolation guarantees. However, despite the atomicity of-fered by the transaction, anomalies are still possible. For example,the two getBalance operations in a totalBalance transactionmight be served by different replicas with a distinct set of com-mitted save transactions. If the first(second) getBalance oper-ation witnesses a save transaction that is not witnessed by thesecond(first) getBalance operation, then the balance returned willbe less(greater) than the actual balance. It is not immediately appar-ent how to choose the weakest isolation guarantee that would besufficient to prevent the anomaly.

Instead, QUELEA requires the programmer to simply state theconsistency requirement as a contract. Since we would like boththe getBalance operations to witness the same set of save trans-actions, we define the constraint on the totalBalance transactionψtb as:

ψtb = ∀a : getBalance, b : getBalance,

(c : withdraw ∨ deposit), (d : withdraw ∨ deposit).

txn{a, b}{c, d} ∧ vis(c, a) ∧ sameobj(d, b)⇒ vis(d, b)

The key idea in the above definition is that the txn primitive allowsus to relate operations on different objects.

The save transaction only needs to ensure that the two writesit performs are made visible atomically. Since this is ensuredby combining them in a transaction, save does not require anyadditional constraints, and ψsv is simply true.

6.3 Coordination-free TransactionsIn order to illustrate the utility of transaction contract classification,we identify three well-understood coordination-free transaction se-mantics – Read Committed (RC) [7], Monotonic Atomic View(MAV) [2] and Repeatable Read (RR) [7], and illustrate the classifi-cation strategy. Our technique can indeed be applied to a differentisolation-level lattice.

A transaction with ANSI RC semantics only witnesses commit-ted operations. Let us assume that a replica will buffer transactionalupdates until all the updates from the same transaction are availableat that replica. Once all the updates from a transaction are available,the buffered updates are made visible to subsequent client requests.This ensures atomicity of transactions. Importantly, RC does notentail any other guarantees. As a result, a store implementing RCdoes not require inter-replica coordination. We can express RC asfollows:

ψrc = ∀a, b, c. txn{a}{b, c} ∧ sameobj(b, c)∧ vis(b, a)⇒ vis(c, a)

Notice that the above definition is the same as the atomicity guar-antee of transaction described in § 6.1. The save operation is anexample of an RC transaction.

Quelea Replicated Store

Off-the-shelf Distributed Store • Off-the-shelf store• Failure handling• Persistence (on-disk)• Eventual consistency

• Soft-state (in-mem)• Datatype operations• Summarization• Stronger consistency

select insert

Shim Layer (RDTs)

Clients

obj.oper(args)

res Business Logic (incl. Txns)

REST API

Figure 10: Implementation Model.

MAV semantics ensures that if some operation in a transactionT1 witnesses the effects of another transaction T2, then subsequentoperations in T1 will also witness the effects of T2. MAV semanticsis useful for maintaining the integrity of foreign key constraints,materialized views and secondary updates [2]. In order to implementMAV, a store only needs to keep track of the set of transactionsSt witnessed by the running transaction, and before performingan operation at some replica, ensure that the replica includes allthe transactions in St. Hence, MAV is coordination-free. MAVsemantics is captured with the following contract:

ψmav = ∀a, b, c, d. txn{a, b}{c, d} ∧ so(a, b) ∧ vis(c, a)

∧ sameobj(d, b)⇒ vis(d, b)

whose semantics is illustrated in the Figure 9(b).ANSI RR semantics requires that the transaction witness a

snapshot of the data store state. Importantly, this snapshot can beobtained from any replica, and hence RR is also coordination-free.An example for such a transaction is the totalBalance transaction.The semantics of RR is captured by the following contract:

ψrr = ∀a, b, c, d. txn{a, b}{c, d} ∧ vis(c, a)

∧ sameobj(d, b)⇒ vis(d, b)

whose semantics is illustrated in the Figure 9(c).

6.4 ClassificationSimilar to operation-level contracts, with respect to ≤ relation, thecoordination-free transaction semantics described here form a totalorder: ψrc ≤ ψmav ≤ ψrr. The transaction classification is alsosimilar to the operation-level contract classification presented inFigure 6; given a contract ψ on a transaction, we start from theweakest transaction contract ψrc, and progressively compare itsstrength to the known transaction contracts until we find a isolationlevel under which ψ can be safely discharged. Otherwise, we reporta type error.

7. ImplementationQUELEA is implemented as a shallow extension of GHC Haskelland runs on top of Cassandra, an off-the-shelf eventually consistentdistributed data (or backing) store responsible for all data man-agement issues (i.e., replication, fault tolerance, availability, andconvergence). Template Haskell is used to implement static contractclassification, and proof obligations are discharged with the helpof the Z3 [30] SMT solver. Figure 10 illustrates the overall systemarchitecture.

Replicated data types and various consistency semantics areimplemented and enforced in the shim layer. Our implementationsupports eventual, causal, and strong consistency for data typeoperations, and RC, MAV, and RR semantics for transactions. Thisfunctionality is implemented entirely on top of the standard interfaceexposed by Cassandra. From an engineering perspective, leveraging

1: atomically { 2: o1.oper1(v1); //Emits effect e1 3: o2.oper2(v2); //Emits effect e2 4: o3.oper3(v3); //Emits effect e3 5: }

m

e1 e2

After executing oper2

m

e1 e2 e3

After transaction completion

Figure 11: Implementing atomicity semantics.

an off-the-shelf data store enables an implementation comprisingroughly only 2500 lines of Haskell code, which is packaged as alibrary.

7.1 Operation ConsistencyThe shim layer maintains a causally consistent in-memory snapshotof a subset of objects in the backing store, by explicitly trackingdependencies introduced between effects due to visibility, sessionand same transaction relations. Dependence tracking is similar to thetechniques presented in [3] and [20]. Because Cassandra providesdurability, convergence, and fault tolerance, each shim layer nodesimply acts as a soft-state cache, with no inter-node communication,and can safely be terminated at any point. Similarly, new shim layernodes can be spawned on demand.

Each effect generated as a result of an effectful operation onan object inserts a new row (o, e, txn, val, deps) into the backingstore, where o and e are object and unique effect identifiers, txnis an optional transaction identifier, and val is the value associatedwith the effect (eg: Withdraw 50). deps is the set of identifiersof dependencies of this operation and is defined as deps(e) ={e1 | vis(e1, e) ∧ ¬(∃e2.vis(e1, e2) ∧ vis(e2, e))}. At any shimlayer node, an effect is included only if all of its dependencies arealso included in that node. This ensures that the state maintained bythe shim layer node is causally consistent. Hence, our dependencetracking strategy ensures that QUELEA does not track every effectas the number of writes in the system grows.

The shim layer nodes periodically fetch updates from the back-ing store for eventually consistent operations, and on-demand forcausally consistent and strongly consistent operations. Strongly con-sistent operations are performed after obtaining exclusive leaseson objects. The lease mechanism is implemented with the help ofCassandra’s support for conditional updates and expiring columns.

7.2 TransactionsCassandra does not provide general-purpose transactions. Sincethe transaction guarantees provided by QUELEA are coordination-free [2], we realize efficient implementations by explicitly trackingdependencies between operations and transactions. Importantly,the weaker isolation semantics of transactions in QUELEA permittransactions to be discharged if at least one shim layer node isreachable.

QUELEA implements atomic visibility by exploiting shim layercausality guarantees – an effect is included only if all the effectsif depends on are also included. Consider the example given inFigure 11. In the figure, the graphs represent the state of the store,where the circles represent effects and the edges represent the

summarize [o1,o2,o3] = [n1,n2]

o1 o2

o3

Before summarization

o1 o3o2

During summarization

sn2 n2

After summarization, before deletion

o1 o3o2

sn2 n2

Figure 12: Summarization in the backing store.

dependence between effects. The dotted circle represents effectsthat are not yet inserted into the store. The graph on the left showsthat state of the store after executing oper2. For every transactionin QUELEA, we instantiate a special transaction marker effect mthat is importantly not inserted into the backing store. Marker m isincluded as a dependence to every effect generated in the transaction.Since the causally preceding effect m has not yet been written to thestore, no operation will witness e1 and e2 while the transaction inprogress. After the transaction has finished execution, we insert minto the backing store, marking all the effects from the transactionsas a dependence for m, as shown in the graph on the right. Now,any replica which includes one of the effects from the transactionmust include m, and transitively must include every effect from thetransaction. This ensures atomicity and satisfies the RC requirement.

The above scheme prevents a transaction from witnessing itsown effects. This might conflict with causality requirements onthe operations. Hence, transactions piggy-back the previous effectsfrom the same transaction for each request. MAV semantics isimplemented by keeping track of the set of transaction markers Mwitnessed by the transaction, and before performing an operation atsome replica, ensuring that M is a subset of the transaction markersincluded at that replica. If not, the missing effects are synchronouslyfetched. RR semantics is realized by capturing a optimized snapshotof the state of some replica; each operation from an RR transactionis applied to this snapshot state. Any generated effects are added tothis snapshot.

7.3 SummarizationWe utilize the summarize function (§ 3.1.1) to summarize the objectstate both in the shim layer node and the backing store, typicallywhen the number of effects on an object crosses a tunable threshold.Shim layer summarization is straight-forward; a summarizationthread takes the local lock on the cached object, and replaces itsstate with the summarized state. The shim layer node only remainsunavailable for that particular object during summarization (usuallya few milliseconds).

Performing summarization in the backing store is more compli-cated since the whole process needs to be atomic from a client’sperspective, but Cassandra does not provide multi-row transactions.Summarization in the backing store involves deleting previouslyinserted rows and inserting new rows, where each row correspondsto an effect. It is essential that concurrent client operations are per-mitted, but are not allowed to witness the intermediate state of thesummarization process.

To this end, we adopt a novel summarization strategy that buildson the causality property of the store. Figure 12 illustrates thesummarization strategy. Suppose the original set of effects on anobject are o1, o2 and o3. When summarized, the new effects yieldedare n1 and n2. We first instantiate a summarization marker s, andsimilar to transaction marker, we do not insert it into the storeimmediately. We insert the new effects n1 and n2, with strongconsistency, including s as a dependence. Since s is not yet in thestore, the new effects are not made visible to the clients. Then we

Table 1: The distribution of classified contracts. #T refers to thenumber of tables in the application. The columns 4-6 (7-9) representoperations (transactions) assigned to this consistency (isolation)level.

Benchmark LOC #T EC CC SC RC MAV RRLWW Reg 108 1 2 2 2 0 0 0

DynamoDB 126 1 3 1 2 0 0 0Bank Account 155 1 1 1 1 1 0 1Shopping List 140 1 2 1 1 0 0 0

Online store 340 4 9 1 0 2 0 1RUBiS 640 6 14 2 1 4 2 0

Microblog 659 5 13 6 1 6 3 1

insert s with strong consistency, including the original effects o1,o2 and o3 as dependence. Strongly consistent insertions ensure thata shim layer node witnessing s on some object must also witness n1and n2 on the same object. A shim layer node which witnesses allthe effects removes the original effects from its cache since they aresuperseded by the new effects. Finally, the old effects are deletedfrom the backing store. This process ensures that clients eitherwitness the old or the new effects, but not both; the summarizationprocess appears to be atomic from the clients perspective.

8. EvaluationWe present an evaluation study of our implementation, report con-tract profiles of benchmark programs, and illustrate the performancebenefits of fine-grained consistency classification on operations andtransactions. We also evaluate the impact of the summarization. Wehave implemented the following applications, which includes in-dividual RDTs as well as larger applications composed of severalRDTs:

• LWW register: A last-write-wins register that provides read andwrite operations, where the read returns the value of the latestwrite.

• DynamoDB register: An integer register that allows eventualand strong puts and gets, conditional puts, increment and decre-ment operations.

• Bank account: Our running example.• Shopping list: A collaborative shopping list that allows concur-

rent addition and deletion of items.• Online store: An online store with shopping cart functionality

and dynamically changing item prices. The checkout processverifies that the customer only pays the accepted price.

• RUBiS: An eBay-like auction site [23]. The application allowsusers to browse items, bid for items on sale, and pay for itemsfrom a wallet modeled after a bank account.

• Microblog: A twitter-like microblogging site, modeled afterTwissandra [29]. The application allows adding new users,adding and replying to tweets, following, unfollowing andblocking users, and fetching a user’s timeline, userline, followersand following.

The distribution of contracts in these applications is given inTable 1. We see that majority of the operations and transactions areclassified as eventually consistent and RC, respectively. Operationcontracts are used to enforce integrity and visibility constraints onindividual fields in the tables. Transactions are mainly used to con-sistently modify and access related fields across tables. In QUELEA,the contract classification process is completely performed at com-

pile time and incurs no runtime overhead. The proof obligationsassociated with contract classification is discharged through the Z3SMT Solver. Across our benchmarks, classifying a contract took11.5 milliseconds on average.

For our performance evaluation, we deploy QUELEA applicationsin clusters, where each cluster is composed of five fully replicatedCassandra replicas within the same datacenter. We instantiate oneshim layer node co-located on the same VM as a Cassandra replica.Clients are instantiated within the same data center as the store,and run transactions. We deploy each cluster and client node ona c3.4xlarge Amazon EC2 instance. We call this a 1DC config-uration. For our geo-distributed experiments (2DC), we instantiate2 clusters, each with five nodes, and place the clusters on US-east(Virginia) and US-west (Oregon) locations. The average inter-regionlatency was 85ms.

Figure 13 shows throughput vs. latency of operations in thebank account example as we increase the number of clients in a1DC configuration. Our client workload was generated using theYCSB benchmark [12]. The benchmark uniformly chooses from100,000 keys, where the operation spread was 25% withdraw, 25%deposit and 50% getBalance, which corresponds to the default 50:50read:write mix in YCSB. We increased the number of clients from128 to 1024, and each experiment ran for 180 seconds.

The lines marked EC and CC correspond to all operations(including withdraw) being assigned EC and CC consistencylevels. These levels compromise correctness as withdraw has tobe an SC operation. The SC line corresponds to a configurationwhere all operations are strongly consistent; this ensures applicationcorrectness, at the cost of performance. QUELEA corresponds to ourimplementation, which classifies operations based on their contractspecifications. With 512 clients, the QUELEA implementation waswithin 41% of the latency and 18% of the throughput of EC, whereasSC operations had 162% higher latency and 52% lower throughputthan EC operations. Observe that there is a point in each caseafter which the latency increases while the throughput decreases;these correspond to points where the store becomes saturatedwith client requests. In a 2DC configuration (not shown here), theaverage latency of SC operations with 512 clients increased by 9.4×due to the cost of geo-distributed coordination, whereas QUELEAoperations were only 2.2× slower, mainly due to the increased costof withdraw operations. Importantly, the latency of getBalanceand deposit remained almost the same, illustrating the benefit offine-grained contract classification.

We compare the performance of different transaction isolationlevel choices in Figure 14 using the LWW register. The numberswere obtained under a 1DC configuration. The YCSB workloadwas modified to issue 10 operations per transaction, with a default50:50 read:write mix. Each operation is assumed to be eventuallyconsistent. NoTxn corresponds to a configuration that does notuse transactions. Compared to this, RC is only 12% shower interms of latency with 512 clients, whereas RR is 2.3× slower. Thedifference between RC and NoTxn is due to the meta-data overheadof recording transaction information in the object state. For RRtransactions, the cost of capturing and maintaining a snapshot is thebiggest source of overhead.

We also compared (not shown) the performance of EC LWWoperations directly against Cassandra (our backing store), whichuses last-writer-wins as the only convergence semantics. While Cas-sandra provides no stronger-than-eventual consistency properties,QUELEA was within 30%(20%) of latency(throughput) of Cassan-dra with 512 clients, illustrating that the programmers only have topay a minimal overhead for the expressive and stronger QUELEAprogramming model.

Figure 15 compares the QUELEA implementation of RUBiS in1DC configuration against a single replica (NoRep) and strongly

0 128 256 384 512 640 768 896 1024Clients

0

40

80

120

160

200

Late

ncy

(m

s)

EC

CC

SC

Quelea

(a) Latency

0 128 256 384 512 640 768 896 1024Clients

1

2

3

4

5

6

7

8

9

Thro

ughput

(X1

00

0 o

ps/

s) EC

CC

SC

Quelea

(b) Throughput

0 1 2 3 4 5 6 7 8 9Throughput (X 1000 ops/s)

0

20

40

60

80

100

120

140

160

180

Late

ncy

(m

s)

EC

CC

SC

Q

(c) Throughput vs. Latency

Figure 13: Bank account performance.

0 128 256 384 512 640 768 896 1024Clients

0

20

40

60

80

100

120

140

160

180

Thro

ughput

(X1

00

0 o

ps/

s) NoTxn

RC

MAV

RR

(a) Latency

0 128 256 384 512 640 768 896 1024Clients

1

2

3

4

5

6

7

8

9

Thro

ughput

(X1

00

0 o

ps/

s) NoTxn

RC

MAV

RR

(b) Throughput

0 1 2 3 4 5 6 7 8 9Throughput (X 1000 ops/s)

0

20

40

60

80

100

120

140

160

180

Late

ncy

(m

s)

NoTxn

RC

MAV

RR


Figure 14: LWW register transaction performance.

0 128 256 384 512 640 768 896 1024Clients

0

20

40

60

80

100

120

140

160

180

Thro

ughput

(X1

00

0 o

ps/

s) Quelea

NoRep

StrongRep

(a) Latency

0 128 256 384 512 640 768 896 1024Clients

0

1

2

3

4

5

6

7

Thro

ughput

(X1

00

0 o

ps/

s) Quelea

NoRep

StrongRep

(b) Throughput

0 1 2 3 4 5 6 7Throughput (X 1000 ops/s)

0

20

40

60

80

100

120

140

160

180

Late

ncy

(m

s)

Quelea

NoRep

StrongRep


Figure 15: Rubis bidding mix performance.

0 1 2 3 4 5 6 7Operations (X 10,000)

0

2

4

6

8

10

Late

ncy

(X

10

0 m

s)

No Sum

Mem Only

Mem & Disk

Figure 16: Impact of summarization.

replicated (StrongRep) 1DC deployment. The benchmark was RU-BiS bidding mix, which has 15% read-write interactions, which isrepresentative of the auction workload. Without replication, NoReptrivially provides strong consistency. However, this deploymentdoes not scale beyond 1750 operations per second. Strong replica-tion offers better throughput at the cost of greater latency due tointer-replica coordination. QUELEA deployment offers the benefitof replication, while only paying the cost of coordination whennecessary.

Finally, we study the impact of summarization in Figure 16. Weuse 128 clients and a single QUELEA replica, with all clients oper-ating on the same LWW register to stress test the summarizationmechanism. The shim layer cache (memory) is summarized every64 updates, while the updates in the backing store (disk) are sum-marized every 4096 updates. Each point in the graph represents theaverage latency of the previous 1000 operations. Each experiment isrun for one minute. Without summarization, the average latency ofoperations increases exponentially to almost one second, and only13K operations were performed in a minute. Since every operationhas to reduce over the set of all previous operations, operations takeincreasingly more time to complete since they must contend withan ever growing set. With summarization only in memory, perfor-mance still degrades due to the cost of fetching all previous updatesfrom the backing store into the shim layer. Fetching the latest up-dates from the backing store is essential for SC operations. Withsummarization enabled on both disk and memory, latency does notincrease over time, and the implementation realizes throughput of67K operations/minute.

9. Related WorkOperation-based RDTs have been widely studied in terms of their al-gorithmic properties [10, 24], and several systems utilize this modelto construct distributed data structures [5, 16, 22]. These systemstypically propose to implement the datatypes directly over a clus-ter of nodes, and only focus on basic eventual consistency. Hence,these systems implement custom solutions for durability and fault-tolerance. QUELEA realizes stronger consistency models for RDTson top of off-the-shelf eventually consistent distributed stores. Inthis respect, QUELEA is similar to [3] where causal consistency isachieved through a shim layer layered on top of Cassandra, whichexplicitly tracks and enforces dependencies between updates. How-ever, [3] does not support user-defined RDTs, automatic contractclassification and transactions.

Since eventual consistency alone is insufficient to build cor-rect applications, several systems [17, 22, 28] propose a lattice ofstronger consistency levels. Similarly, traditional database process-ing systems [7] and their replicated variants [2] propose weakerisolation levels for performance. In these systems, the onus is on the

developer to choose the correct consistency(isolation) level for oper-ations(transactions). QUELEA relieves the developer of this burden,and instead only requires contracts to express declarative visibilityrequirements.

Our contract language and system model is inspired by theaxiomatic description of RDT semantics proposed in [10]. While[10] uses axioms for formal verification of an RDT implementation,we utilize them as a means for the user to express desired consistencyguarantees in the application. The operational semantics of QUELEAdescribes an operational manifestation of our system model. [11]also presents an operational model of a replicated data store that isbased on the abstract system model presented in [10]. Their claimsabout the expressivity and practicality of the system model vindicateours. However, both approaches differ in the way they aim toempower the application programmer. [11] exposes the operationalmodel to the programmer, along with primitives, such as push, pulland fetch, that make it easy to implement required consistency andisolation guarantees. In contrast, QUELEA completely abstracts itsoperational model, and instead relies on contract classification toautomatically choose appropriate consistency and isolation levels.Similar to [10] and [11], our system model does not incorporate real(i.e., wall-clock) time. Hence, our contract language cannot describestore semantics such as recency or bounded-staleness guaranteesoffered by certain stores [28].

Several conditions have been proposed to judge whether an op-eration on a replicated data object needs coordination or not. [1]defines logical monotonicity as a sufficient condition for coordi-nation freedom, and proposes a consistency analysis that markscode regions performing non-monotonic reasoning (eg: aggrega-tions, such as COUNT) as potential coordination points. [4] and [18]define invariant confluence and invariant safety, respectively, as con-ditions for safely executing an operation without coordination. [18]also proposes a program analysis that conservatively marks certainoperations as blue (coordination not required), while marking theremaining as red (coordination required). [6] requires programmersto declare application semantics, and the desired application-specificinvariants as formulas in first-order logic. It performs static analysison these formulas to determine I-offender sets - sets of operations,which, when performed concurrently, result in violation of one ormore of the stated invariants. For each offending set of operations,if the programmer chooses invariant-violation avoidance over viola-tion repair, the system employs various techniques, such as escrowreservation, to ensure that the offending set is effectively serialized.Unlike QUELEA, these approaches focus on a coarse-grained clas-sification of consistency as eventual or strong, and do not considerfiner-grained transaction isolation levels. However, these analysesdo relieve programmers of the burden of tagging operations withconsistency levels. We consider automatic inference of consistencycontracts from application-specific integrity constraints as an obvi-ous next step for QUELEA.

10. ConclusionsThis paper presents QUELEA, a shallow Haskell extension fordeclarative programming over ECDS. The key idea underlyingQUELEA’s design is the automatic classification of fine-grainedconsistency contracts on operations and distributed transactionswith respect various consistency and isolation levels offered by thestore. Our contract language is carefully crafted from a decidablesubset of first-order logic, enabling the use of automated verificationtools to discharge the proof obligations associated with contractclassification. We realize an instantiation of QUELEA on top of anoff-the-shelf distributed store, Cassandra, and illustrate the benefit offine-grained contract classification by implementing and evaluatingseveral scalable applications.

AcknowledgementsWe thank the anonymous reviewers for their detailed comments.This work is supported in part by the National Science Foundationunder grants CCF-1216613 and CCF-1318227.

References[1] P. Alvaro, N. Conway, J. Hellerstein, and W. R. Marczak. Consistency

Analysis in Bloom: a CALM and Collected Approach. In CIDR2011, Fifth Biennial Conference on Innovative Data Systems Research,Asilomar, CA, USA, January 9-12, 2011, Online Proceedings, pages249–260, 2011. URL http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper35.pdf.

[2] P. Bailis, A. Davidson, A. Fekete, A. Ghodsi, J. M. Hellerstein, andI. Stoica. Highly Available Transactions: Virtues and Limitations.PVLDB, 7(3):181–192, 2013.

[3] P. Bailis, A. Ghodsi, J. M. Hellerstein, and I. Stoica. Bolt-on CausalConsistency. In Proceedings of the 2013 ACM SIGMOD InternationalConference on Management of Data, SIGMOD ’13, pages 761–772,New York, NY, USA, 2013. ACM. ISBN 978-1-4503-2037-5. doi: 10.1145/2463676.2465279.

[4] P. Bailis, A. Fekete, M. J. Franklin, A. Ghodsi, J. M. Hellerstein,and I. Stoica. Coordination-Avoiding Database Systems. CoRR,abs/1402.2237, 2014. URL http://arxiv.org/abs/1402.2237.

[5] M. Balakrishnan, D. Malkhi, T. Wobber, M. Wu, V. Prabhakaran,M. Wei, J. D. Davis, S. Rao, T. Zou, and A. Zuck. Tango: DistributedData Structures over a Shared Log. In Proceedings of the Twenty-FourthACM Symposium on Operating Systems Principles, SOSP ’13, pages325–340, New York, NY, USA, 2013. ACM. ISBN 978-1-4503-2388-8.doi: 10.1145/2517349.2522732.

[6] V. Balegas, N. Preguiça, R. Rodrigues, S. Duarte, C. Ferreira,M. Najafzadeh, and M. Shapiro. Putting the Consistency backinto Eventual Consistency. In Proceedings of the Tenth EuropeanConference on Computer System, EuroSys ’15, Bordeaux, France,2015. URL http://lip6.fr/Marc.Shapiro/papers/putting-consistency-back-EuroSys-2015.pdf.

[7] H. Berenson, P. Bernstein, J. Gray, J. Melton, E. O’Neil, and P. O’Neil.A Critique of ANSI SQL Isolation Levels. In Proceedings of the 1995ACM SIGMOD International Conference on Management of Data,SIGMOD ’95, pages 1–10, New York, NY, USA, 1995. ACM. ISBN0-89791-731-6. doi: 10.1145/223784.223785.

[8] E. Brewer. Towards Robust Distributed Systems (Invited Talk), 2000.[9] S. Burckhardt, D. Leijen, M. Fähndrich, and M. Sagiv. Eventually

Consistent Transactions. In Proceedings of the 21st European Confer-ence on Programming Languages and Systems, ESOP’12, pages 67–86,Berlin, Heidelberg, 2012. Springer-Verlag. ISBN 978-3-642-28868-5.doi: 10.1007/978-3-642-28869-2_4.

[10] S. Burckhardt, A. Gotsman, H. Yang, and M. Zawirski. ReplicatedData Types: Specification, Verification, Optimality. In Proceedingsof the 41st ACM SIGPLAN-SIGACT Symposium on Principles ofProgramming Languages, POPL ’14, pages 271–284, New York,NY, USA, 2014. ACM. ISBN 978-1-4503-2544-8. doi: 10.1145/2535838.2535848.

[11] S. Burckhardt, D. Leijen, J. Protzenko, and M. Fähndrich. GlobalSequence Protocol: A Robust Abstraction for Replicated SharedState. In Proceedings of the 29th European Conference onObject-Oriented Programming, ECOOP ’15, Prague, Czech Repub-lic, 2015. URL http://research.microsoft.com/pubs/240462/gsp-tr-2015-2.pdf.

[12] B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears.Benchmarking Cloud Serving Systems with YCSB. In Proceedingsof the 1st ACM Symposium on Cloud Computing, SoCC ’10, pages143–154, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0036-0.doi: 10.1145/1807128.1807152.

[13] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman,A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo:Amazon’s Highly Available Key-value Store. In Proceedings of Twenty-

first ACM SIGOPS Symposium on Operating Systems Principles, SOSP’07, pages 205–220, New York, NY, USA, 2007. ACM. ISBN 978-1-59593-591-5. doi: 10.1145/1294261.1294281.

[14] S. Gilbert and N. Lynch. Brewer’s Conjecture and the Feasibility ofConsistent, Available, Partition-tolerant Web Services. SIGACT News,33(2):51–59, June 2002. ISSN 0163-5700. doi: 10.1145/564585.564601.

[15] M. P. Herlihy and J. M. Wing. Linearizability: A Correctness Con-dition for Concurrent Objects. ACM Transactions on ProgrammingLanguages and Systems, 12(3):463–492, July 1990. ISSN 0164-0925.doi: 10.1145/78969.78972.

[16] A. Lakshman and P. Malik. Cassandra: A Decentralized StructuredStorage System. SIGOPS Operating Systems Review, 44(2):35–40, Apr.2010. ISSN 0163-5980. doi: 10.1145/1773912.1773922.

[17] C. Li, D. Porto, A. Clement, J. Gehrke, N. Preguiça, and R. Rodrigues.Making Geo-replicated Systems Fast As Possible, Consistent whenNecessary. In Proceedings of the 10th USENIX Conference onOperating Systems Design and Implementation, OSDI’12, pages 265–278, Berkeley, CA, USA, 2012. USENIX Association. ISBN 978-1-931971-96-6. URL http://dl.acm.org/citation.cfm?id=2387880.2387906.

[18] C. Li, J. a. Leitão, A. Clement, N. Preguiça, R. Rodrigues, andV. Vafeiadis. Automating the Choice of Consistency Levels in Repli-cated Systems. In Proceedings of the 2014 USENIX Conference onUSENIX Annual Technical Conference, USENIX ATC’14, pages 281–292, Berkeley, CA, USA, 2014. USENIX Association. ISBN 978-1-931971-10-2. URL http://dl.acm.org/citation.cfm?id=2643634.2643664.

[19] W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen. Don’tSettle for Eventual: Scalable Causal Consistency for Wide-area Storagewith COPS. In Proceedings of the Twenty-Third ACM Symposium onOperating Systems Principles, SOSP ’11, pages 401–416, New York,NY, USA, 2011. ACM. ISBN 978-1-4503-0977-6. doi: 10.1145/2043556.2043593.

[20] W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen. StrongerSemantics for Low-latency Geo-replicated Storage. In Proceedingsof the 10th USENIX Conference on Networked Systems Design andImplementation, nsdi’13, pages 313–328, Berkeley, CA, USA, 2013.USENIX Association. URL http://dl.acm.org/citation.cfm?id=2482626.2482657.

[21] C. H. Papadimitriou. The Serializability of Concurrent DatabaseUpdates. Journal of the ACM, 26(4):631–653, Oct. 1979. ISSN 0004-5411. doi: 10.1145/322154.322158.

[22] K. Petersen, M. J. Spreitzer, D. B. Terry, M. M. Theimer, and A. J. De-mers. Flexible Update Propagation for Weakly Consistent Replication.In Proceedings of the Sixteenth ACM Symposium on Operating SystemsPrinciples, SOSP ’97, pages 288–301, New York, NY, USA, 1997.ACM. ISBN 0-89791-916-5. doi: 10.1145/268998.266711.

[23] RUBiS. Rice University Bidding System, 2014. URL http://rubis.ow2.org/. Accessed: 2014-11-4 13:21:00.

[24] M. Shapiro, N. Preguiça, C. Baquero, and M. Zawirski. Conflict-Free Replicated Data Types. In X. Défago, F. Petit, and V. Villain,editors, Stabilization, Safety, and Security of Distributed Systems,volume 6976 of Lecture Notes in Computer Science, pages 386–400. Springer Berlin Heidelberg, 2011. ISBN 978-3-642-24549-7.doi: 10.1007/978-3-642-24550-3_29.

[25] S. Sivasubramanian. Amazon dynamoDB: A Seamlessly Scalable Non-relational Database Service. In Proceedings of the 2012 ACM SIGMODInternational Conference on Management of Data, SIGMOD ’12, pages729–730, New York, NY, USA, 2012. ACM. ISBN 978-1-4503-1247-9.doi: 10.1145/2213836.2213945.

[26] Y. Sovran, R. Power, M. K. Aguilera, and J. Li. Transactional Storagefor Geo-replicated Systems. In Proceedings of the Twenty-Third ACMSymposium on Operating Systems Principles, SOSP ’11, pages 385–400, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0977-6.doi: 10.1145/2043556.2043592.

[27] D. B. Terry, A. J. Demers, K. Petersen, M. Spreitzer, M. Theimer, andB. W. Welch. Session Guarantees for Weakly Consistent Replicated

http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper35.pdf

http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper35.pdf

http://dx.doi.org/10.1145/2463676.2465279

http://dx.doi.org/10.1145/2463676.2465279

http://arxiv.org/abs/1402.2237

http://arxiv.org/abs/1402.2237

http://dx.doi.org/10.1145/2517349.2522732

http://lip6.fr/Marc.Shapiro/papers/putting-consistency-back-EuroSys-2015.pdf

http://lip6.fr/Marc.Shapiro/papers/putting-consistency-back-EuroSys-2015.pdf

http://dx.doi.org/10.1145/223784.223785

http://dx.doi.org/10.1007/978-3-642-28869-2_4

http://dx.doi.org/10.1145/2535838.2535848

http://dx.doi.org/10.1145/2535838.2535848

http://research.microsoft.com/pubs/240462/gsp-tr-2015-2.pdf

http://research.microsoft.com/pubs/240462/gsp-tr-2015-2.pdf

http://dx.doi.org/10.1145/1807128.1807152

http://dx.doi.org/10.1145/1294261.1294281

http://dx.doi.org/10.1145/564585.564601

http://dx.doi.org/10.1145/564585.564601

http://dx.doi.org/10.1145/78969.78972

http://dx.doi.org/10.1145/1773912.1773922

http://dl.acm.org/citation.cfm?id=2387880.2387906




http://dx.doi.org/10.1145/2043556.2043593

http://dx.doi.org/10.1145/2043556.2043593



http://dx.doi.org/10.1145/322154.322158

http://dx.doi.org/10.1145/268998.266711

http://rubis.ow2.org/

http://rubis.ow2.org/

http://dx.doi.org/10.1007/978-3-642-24550-3_29

http://dx.doi.org/10.1145/2213836.2213945

http://dx.doi.org/10.1145/2043556.2043592

Data. In Proceedings of the Third International Conference onParallel and Distributed Information Systems, PDIS ’94, pages 140–149, Washington, DC, USA, 1994. IEEE Computer Society. ISBN0-8186-6400-2. URL http://dl.acm.org/citation.cfm?id=645792.668302.

[28] D. B. Terry, V. Prabhakaran, R. Kotla, M. Balakrishnan, M. K. Aguilera,and H. Abu-Libdeh. Consistency-based Service Level Agreements forCloud Storage. In Proceedings of the Twenty-Fourth ACM Symposium

on Operating Systems Principles, SOSP ’13, pages 309–324, New York,NY, USA, 2013. ACM. ISBN 978-1-4503-2388-8. doi: 10.1145/2517349.2522731.

[29] Twissandra. Twitter clone on Cassandra, 2014. URL http://twissandra.com/. Accessed: 2014-11-4 13:21:00.

[30] Z3. High-performance Theorem Prover, 2014. URL http://z3.codeplex.com/. Accessed: 2014-11-4 13:21:00.



http://dx.doi.org/10.1145/2517349.2522731

http://dx.doi.org/10.1145/2517349.2522731

http://twissandra.com/

http://twissandra.com/

http://z3.codeplex.com/

http://z3.codeplex.com/

new declarative programming over eventually consistent data stores · 2020. 9. 1. · levels, with...

Documents