causalconsistencyalgorithmsforpartially ...ajayk/ccalgos.pdfcausalconsistencyalgorithmsforpartially...

72
Causal Consistency Algorithms for Partially Replicated and Fully Replicated Systems TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at Chicago TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at Chicago Causal Consistency Algorithms for Partially Replicated and Fully Replicated Sys 1 / 72

Upload: others

Post on 23-Sep-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Causal Consistency Algorithms for PartiallyReplicated and Fully Replicated Systems

TaYuan Hsu, Ajay D. Kshemkalyani, Min ShenUniversity of Illinois at Chicago

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems1 / 72

Page 2: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Outline

1 Introduction

2 System Model

3 AlgorithmsFull-Track Algorithm: partially replicated memoryOpt-Track Algorithm: partially replicated memoryOpt-Track-CRP: fully replicated memory

4 Complexity Measures and Performance

5 Approximate Causal ConsistencyApprox-Opt-TrackPerformance

6 Conclusions and Future Work

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems2 / 72

Page 3: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Introduction

Motivation for Replication

Replication – makes copies of data/services on multiple sites.Replication improves ...

Reliability (by redundancy)If primary File Server crashes, standby File Server still worksPerformanceReduce communication delaysScalabilityPrevent overloading a single server (size scalability)Avoid communication latencies (geographic scale)

However, concurrent updates become complexConsistency models define which interleavings of operations are valid(admissible)

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems3 / 72

Page 4: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Introduction

Consistency Models

Spectrum of consistency models trade-off: cost vs. convenient semanticsLinearizability (Herlihy and Wing 1990)

non-overlapping ops seen in order of occurrence +overlapping ops seen in common order

Sequential consistency (Lamport 1979)all processes see the same interleaving of executions

Causal consistency (Ahamad et al. 1991)all processes see the same order of causally related writes

PRAM consistency (Lipton and Sandberg 1988)pipeline per pair of processes, for updates

Slow memory (Hutto et al. 1990)pipeline per variable per pair of processes, for updates

Eventual consistency (Johnson et al. 1975)updates are guaranteed to eventually propagate to all replicas

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems4 / 72

Page 5: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Introduction

The Motivation of Causal Consistency

Figure: Causal consistency is good for users in social networks.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems5 / 72

Page 6: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Introduction

Related Works

Causal consistency in distributed shared memory systems (Ahamad etal. [1])

Causal consistency has been studied (by Baldoni et al., Mahajan et al.,Belaramani et al., Petersen et al.).Since 2011,

ChainReaction (S. Almeida et al.)Bolt-on causal consistency (P. Bailis et al.)Orbe and GentleRain (J. Du et al.)Wide-Area Replicated Storage (K. Lady et al.)COPS, Eiger (W. Lloyd et al.)

The above works assume full replication.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems6 / 72

Page 7: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Introduction

Partial Replication

Figure: Case for Partial Replication.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems7 / 72

Page 8: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Introduction

Case for Partial Replication

Partial replication is more natural for some applications. See previous fig.

With p replicas at some p of the total of n DCs, each write operation thatwould have triggered an update broadcast now becomes a multicast to justp of the n DCs.

Savings in storage and infrastructure costs

For write-intensive workloads, partial replication gives a direct savings inthe number of messages.

Allowing flexibility in the number of DCs required in causally consistentreplication remains an interesting aspect of future work.

The supposedly higher cost of tracking dependency metadata is relativelysmall for applications such as Instagram.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems8 / 72

Page 9: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Introduction

Causal Consistency

Causal consistency: writes that are potentially causally related must beseen by all processors in that same order. Concurrent writes may be seenin a different order on different machines.

causally related writes: the write comes after a read that returned thevalue of the other write

Examples

Figure: Should enforce W(x)a < W(x)b ordering?

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems9 / 72

Page 10: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

System Model

System Model

n application processes - ap1, ap2,...,apn - interacting through a sharedmemory Q composed of q variables x1,x2,...,xq

Each api can perform either a read or a write on any of the q variables.ri (xj )v : a read operation by api on variable xj returns value vwi (xj )v : a write operation by api on variable xj writes value v

local history hi : a series of read and write operations generated byprocess api

global history H : the set of local histories hi from all n processes

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems10 / 72

Page 11: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

System Model

Causality Order

Program Order : a local operation o1 precedes another local operationo2, denoted as o1 �po o2

Read-from Order : read operation o2 = r(x )v retrieves the value vwritten by the write operation o1 = w(x )v from a distinct process,denoted as o1 �ro o2

Causality Order : o1 �co o2 iff one of the following conditions holds:9api s.t. o1 �po o2 (program order)9api , apj s.t. o1 �ro o2 (read-from order)9o3 2 OH s.t. o1 �co o3 and o3 �co o2 (transitive closure)

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems11 / 72

Page 12: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

System Model

Underlying Distributed Communication System

The shared memory abstraction and its causal consistency model isimplemented on top of the distributed message passing system.

With n sites (connected by FIFO channels), each site si hosts anapplication process api and holds only a subset of variables xh 2 Q.When api performs a write operation w(x1)v , it invokes theMulticast(m) to deliver the message m containing w(x1)v to all sitesreplicating x1.

send event, receive event, apply event in msg-passing layer

When api performs a read operation r(x2)v , it invokes theRemoteFetch(m) to deliver the message m containing r(x2)v to apre-designated site replicating x2 to fetch its value.

fetch event, remote_return event, return event in msg-passing layer

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems12 / 72

Page 13: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

System Model

Activation Predicate

Baldoni et al. [2] defined relation !co on send events .sendi (mw(x)a) !co sendj (mw(y)b) iff one of the following conditionsholds:

1 i = j and sendi (mw(x)a) locally precedes sendj (mw(y)b)2 i 6= j and returnj (x ; a) locally precedes sendj (mw(y)b)3 9sendk (mw(z)c), s.t. sendi (mw(x)a) !co sendk (mw(z)c) !co sendj (mw(y)b)

!co � ! (Lamport’s happened before relation)A write can be applied when its activation predicate becomes true

there is no earlier message (under !co) which has not been locally applied

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems13 / 72

Page 14: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

System Model

Implementing the Activation Predicate

Figure: Can the write for M3 can be applied at s1? Only after the earlier write forM is applied. The dependency "s1 is a destination of M" is needed as meta-data onM3.

Meta-data grows as computation progresses

Objective: to reduce the size of meta-data

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems14 / 72

Page 15: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms

Overview of Algorithms

Two algorithms [3, 4] implement causal consistency in a partiallyreplicated distributed shared memory system.

Full-TrackOpt-Track (a message and space optimal algorithm)

Implement the !co relation; adopt the activation predicateA special case of Opt-Track – for full replication.

Opt-Track-CRP (optimal) : a lower message size, time, space complexitythan the Baldoni et al. algorithm [2]

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems15 / 72

Page 16: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms Full-Track Algorithm: partially replicated memory

Algorithm 1: Full-Track

Algorithm 1 is for a non-fully replicated system.

Each application process performing write operation will only write to asubset of all the sites.

Each site si needs to track the number of write operations performed byevery apj to every site sk , denoted as Writei [j ][k ].

the Write clock piggybacked with messages generated by theMulticast(m) should not be merged with the local Write clock at themessage reception, but only at a later read operation reading the valuethat comes with the message.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems16 / 72

Page 17: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms Full-Track Algorithm: partially replicated memory

Algorithm 1: Full-Track

Data structures1 Writei - the Write clock

Writei [j ; k ] : the number of updates sent by apj to site sk that causallyhappened before under the !co relation.

2 Applyi - an array of integersApplyi [j ] : the number of updates written by apj that have been appliedat site si .

3 LastWriteOni hvariable id, Writei - a hash map of Write clocksLastWriteOni hhi : the Write clock value associated with the last writeoperation on variable xh locally replicated at site si .

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems17 / 72

Page 18: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms Full-Track Algorithm: partially replicated memory

Algorithm 1: Full-Track

Figure:TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems18 / 72

Page 19: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms Full-Track Algorithm: partially replicated memory

Algorithm 1: Full-Track

The activation predicate is implemented.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems19 / 72

Page 20: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms Opt-Track Algorithm: partially replicated memory

Algorithm 2: Opt-Track

Each message corresponding to a write operation piggybacks an O(n2)

matrix in Algorithm 1.Algorithm 2 uses propagation constraints to further reduce themessage size and storage cost.

Exploits the transitive dependency of causal deliveries of messages (ideasfrom the KS algorithm [5]).

Each site keeps a record of the most recently received message from eachother site (along with the list of its destinations).

optimal in terms of log space and message space overheadachieve another optimality that no redundant destination information isrecorded.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems20 / 72

Page 21: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms Opt-Track Algorithm: partially replicated memory

Propagation Constraints: Two Situations forDestination Information to be Redundant

Figure: Meta-data information � = "s2 is a destination of m". The causal future ofthe relevant apply and return events are shown in dotted lines.

� must not exist in the causal future ofapply(w): (Propagation Constraint 1)apply(w’): (Propagation Constraint 2)

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems21 / 72

Page 22: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms Opt-Track Algorithm: partially replicated memory

Algorithm 2: Opt-Track

Data Structures1 clocki

local counter at site si for write operations performed by api .2 Applyi - an array of integers

Applyi [j ] : the number of updates written by apj that have been appliedat site si .

3 LOGi = fhj ; clockj ;Destsig - the local logEach entry indicates a write operation in the causal past.

4 LastWriteOni hvariable id, LOGi - a hash map of LOGsLastWriteOni hhi : the piggybacked LOG from the most recent updateapplied at site si for locally replicated variable xh .

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems22 / 72

Page 23: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms Opt-Track Algorithm: partially replicated memory

Algorithm 2: Opt-Track

Figure: Write process at site siTaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems23 / 72

Page 24: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms Opt-Track Algorithm: partially replicated memory

Algorithm 2: Opt-Track

Figure: Read, receiving processes at site siTaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems24 / 72

Page 25: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms Opt-Track Algorithm: partially replicated memory

Procedures used in Opt-Track

Figure: PURGE and MERGE functions at site si

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems25 / 72

Page 26: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms Opt-Track Algorithm: partially replicated memory

The Propagation Constraints: An example

M

P1

P2

P3

P4

P5

P6

M

M

MM

4,3

4,3

2,2

4,2M

4,2

5,1M

1

1

1

1

1

causal past contains event (6,1)

21 3 4

2

32

2

M2,3

M3,3

43

2 3

32

of multicast at event (5,1) propagatesas piggybacked information and in Logs

information about P as a destination6

5

M3,3

M 5,26,25,1

M

(a)

M5,1

to P

M4,2

to P3

,P2

2,2M to P

1

M6,2

to P1

M4,3

to P6

M4,3

to P3

M5,2

to P6

M2,3

to P

M

1

3,3to P

2,P

6

{ P4

{ P6

{ P6

{ P4

{ P6

{ }

{ P ,4

P6

{ P

{ }

}

}

}

}

}

64P }

Message to dest.

piggybacked

M 5,1 .Dests

,6

,P

}6

(b)

Figure: Meta-data information � = "P6 is a destination of M5;1". The propagationof explicit information and the inference of implicit information.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems26 / 72

Page 27: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms Opt-Track Algorithm: partially replicated memory

If the Destination List Becomes ;, then ...

Figure: Illustration of why it is important to keep a record even if its destination listbecomes empty.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems27 / 72

Page 28: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms Opt-Track-CRP: fully replicated memory

Algorithm 3: Opt-Track-CRP

Special case of Algorithm 2 for full replication; same optimizations.

Every write operation will be sent to exactly the same set of sites; thereis no need to keep a list of the destination information with each write.

Represent each write operation as hi ; clocki i at site si .

the cost of a write operation down from O(n) to O(1).d entries in local log, where d = no. of write operations in local log

Local log always gets reset after each writeEach read will add at most one new entry into the local log

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems28 / 72

Page 29: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms Opt-Track-CRP: fully replicated memory

Further Improved Scalability

In Algorithm 2, keeping entries with empty destination list is important.

In the fully replicated case, we can also decrease this cost.

s1

s2

s3

m(w(x1)v)

m(w0(x2)u)

LOG3 = {w}

LOG1 = {w}

LOG3 = {w0}return3(x1, v)

send1(m(w))

receive3(m(w))

receive1(m(w0))

send3(m(w0))

LastWriteOn1h2i = {w0}

LastWriteOn3h1i = {w}

Figure: In fully replicated systems, the local log will be reset after each write.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems29 / 72

Page 30: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Algorithms Opt-Track-CRP: fully replicated memory

Algorithm 3: Opt-Track-CRP

Figure: There is no need to maintain the destination list for each write operation inthe local log.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems30 / 72

Page 31: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Parameters

n : number of sites in the system

q : number of variables in the system

p: replication factor, i.e., the number of sites where each variable isreplicated

w : number of write operations performed in the system

r : number of read operations performed in the system

d : number of write operations stored in local log (used only inOpt-Track-CRP algorithm)

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems31 / 72

Page 32: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Complexity

Table: Complexity measures of causal memory algorithms.

Metric Full-Track Opt-Track Opt-Track-CRP optP [2]

Message Count ((p � 1) + n�pn )w + 2r

(n�p)n ((p � 1) + n�p

n )w + 2r(n�p)

n (n � 1)w (n � 1)w

Message O(n2pw + nr(n � p)) O(n2pw + nr(n � p)) O(nwd) O(n2w)Size amortized O(npw + r(n � p))Time write O(n2) write O(n2p) write O(d) write O(n)Complexity read O(n2) read O(n2) read O(d) read O(n)Space O(max(n2

; npq)) O(max(n2; npq)) O(max(d; q)) O(nq)

Complexity amortized O(max(n; pq))

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems32 / 72

Page 33: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Message Count as a Function of wrate

Define write rate as wrate = ww+r

Partial replication gives a lower message count than full replication if

pw + 2r(n � p)

n< nw ) w > 2

rn

(1)

) wrate >2

1+ n(2)

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems33 / 72

Page 34: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Message count: Partial Replication vs. FullReplication

0

1

2

3

4

5

6

7

8

9

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Mes

sage

Cou

nt

Write Rate (w/(w+r))

Partial Replication versus Full Replication (n=10)

p = 1p = 3p = 5p = 7

p = 10

Figure: The graph illustrates message count for partial replication vs. fullreplication, by plotting message count as a function of wrate .

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems34 / 72

Page 35: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Message meta-data structures

Figure: Partial Replication

Figure: Full Replication (only SM)

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems35 / 72

Page 36: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Simulation Methodology

Time interval Te between two consecutive events from 5ms � 2000ms .

Propagation time Tt from 100ms � 3000ms .

Number of processes n is varied from 5 up to 40.

wrate is set to be 0.2, 0.5, and 0.8, respectively.

Replica factor rate pn for partial replication is defined as 0.3.

Message meta-data size (ms): The total size of all the meta-datatransmitted over all the processes.

Each simulation execution runs 600n operation events totally.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems36 / 72

Page 37: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Meta-Data Space Overhead in Partial Replication

0

2

4

6

8

10

12

14

16

5 10 20 30 40

Avera

ge M

ess

ag

e S

ize (

KB

)

Processes

SM(Full-Track)FM(Full-Track)RM(Full-Track)SM(Opt-Track)FM(Opt-Track)RM(Opt-Track)

Figure: wrate = 0.2.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems37 / 72

Page 38: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Meta-Data Space Overhead in Partial Replication

0

2

4

6

8

10

12

14

16

5 10 20 30 40

Avera

ge M

ess

ag

e S

ize (

KB

)

Processes

SM(Full-Track)FM(Full-Track)RM(Full-Track)SM(Opt-Track)FM(Opt-Track)RM(Opt-Track)

Figure: wrate = 0.5.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems38 / 72

Page 39: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Meta-Data Space Overhead in Partial Replication

0

2

4

6

8

10

12

14

16

5 10 20 30 40

Avera

ge M

ess

ag

e S

ize (

KB

)

Processes

SM(Full-Track)FM(Full-Track)RM(Full-Track)SM(Opt-Track)FM(Opt-Track)RM(Opt-Track)

Figure: wrate = 0.8.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems39 / 72

Page 40: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Ratio of Message Overhead of Opt-Track to Full-Track

0

0.2

0.4

0.6

0.8

1

0 5 10 15 20 25 30 35 40

Ra

tio

of

Mes

sag

e O

verh

ead

of

Op

t-T

rack

to

Fu

ll-T

rack

Processes

write rate: .2write rate: .5write rate: .8

Figure: Total message meta-data space overhead as a function of n and wrate inpartial replication protocols.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems40 / 72

Page 41: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Meta-Data Size in Partial Replication

With increasing n , the ratio rapidly decreases. For 40 processes, theOpt-Track overheads are only around 10 � 20 % of Full-Track overheadson different write rates.

In Full-Track protocol, the average message space overheads of SM andRM quadratically increases with n . However, the increasingly loweroverhead of SM and RM in Opt-Track are linear in n .

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems41 / 72

Page 42: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Meta-Data Space Overhead in Full Replication

0

100

200

300

400

500

600

700

5 10 20 30 35 40

Avera

ge M

ess

ag

e S

ize (

Byte

)

Processes

SM(optP)SM(Opt-Track-CRP)

Figure: wrate = 0.2.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems42 / 72

Page 43: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Meta-Data Space Overhead in Full Replication

0

100

200

300

400

500

600

700

5 10 20 30 35 40

Avera

ge M

ess

ag

e S

ize (

Byte

)

Processes

SM(optP)SM(Opt-Track-CRP)

Figure: wrate = 0.5.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems43 / 72

Page 44: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Meta-Data Space Overhead in full Replication

0

100

200

300

400

500

600

700

5 10 20 30 35 40

Avera

ge M

ess

ag

e S

ize (

Byte

)

Processes

SM(optP)SM(Opt-Track-CRP)

Figure: wrate = 0.8.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems44 / 72

Page 45: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Ratio of Message Overhead of Opt-Track-CRP to optP

0.2

0.4

0.6

0.8

1

1.2

0 5 10 15 20 25 30 35 40

Ra

tio

of

Mes

sag

e O

verh

ead

of

Op

t-T

rack

-CR

P t

o o

ptP

Processes

write rate: .2write rate: .5write rate: .8

Figure: Total message meta-data space overhead as a function of n and wrate in fullreplication protocols.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems45 / 72

Page 46: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Message Count: Partial Replication vs. FullReplication

Total message size overhead= Total message count � (meta-data size + replicated data size)

X meta-data size � replicated data size

Figure: Total message count for Full Replication (Opt-Track-CRP) VS. PartialReplication (Opt-Track).

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems46 / 72

Page 47: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Complexity Measures and Performance

Message Size: Partial Replication vs. Full Replication

(a) when n = 40, p = 12, wrate = 0.5, in the worst case.

(b) when n = 40, p = 12, wrate = 0.5, in the real case.

Figure: Total message size for Full Replication (Opt-Track-CRP) and PartialReplication (Opt-Track).

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems47 / 72

Page 48: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency

Approximate Causal Consistency

For some applications where the data size is small (e.g, wall posts inFacebook), the size of the meta-data can be a problem.

Can further reduce meta-data overheads at the risk of some (rare)violations of causal consistency.

As dependencies age, w.h.p. the messages they represent get deliveredand the dependencies need not be carried around and stored.

Amount of violations can be made arbitrarily small by controlling aparameter called credits.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems48 / 72

Page 49: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Approx-Opt-Track

Notion of Credits: Case 1

Figure: Reduce the meta-data at the cost of some possible violations of causalconsistency. The amount of violations can be made arbitrarily small by controlling atunable parameter (credits).

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems49 / 72

Page 50: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Approx-Opt-Track

Notion of Credits: Case 2

Figure: Illustration of meta-data reduction when credits are exhausted.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems50 / 72

Page 51: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Approx-Opt-Track

Approx-Opt-Track

Integrate the notion of credits into the Opt-Track algorithm, to give analgorithm – Approx-Opt-Track [6] – that can fine-tune the amount ofcausal consistency by trading off the size of meta-data overhead.

Give three instantiations of credits (hop count, time-to-live, and metricdistance)Violation Error Rate: Re = ne

mc

ne : number of messages applied by the remote replicated sites withviolation of causal consistencymc : total number of transmitted messages

Meta-Data Saving Rate: Rs = 1- ms(cr 6=1)ms(Opt�Track)

ms : message meta-data size, the total size of all the meta-data transmittedover all the processes

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems51 / 72

Page 52: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Simulation Methodology in Approx-Opt-Track

Time interval Te between two consecutive events from 5ms � 2000ms .

Propagation time Tt 100ms � 3000ms .

Number of processes n is varied from 5 up to 40.

wrate is set to 0.2, 0.5, and 0.8, respectively.

Replica factor rate pn for partial replication defined as 0.3.

Number of variables q used is 100.

(?) Hop Count Credit (cr): denotes the hop count available before theentry meta-data ages out and is removed.

(?) Initial credit cr is specified from one to a critical value cr0, withwhich there is no message transmission violating causal consistency.

Each simulation execution runs 600n operation events totally.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems52 / 72

Page 53: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Violation Error Rates

0

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

0.18

1 2 3 4 5 6 7 8 9

Vio

lati

on

Erro

r R

ate

Initial Credits

5 Peers

10 Peers

20 Peers

30 Peers

40 Peers

Figure: wrate = 0.2.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems53 / 72

Page 54: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Violation Error Rates

0

0.05

0.1

0.15

0.2

1 2 3 4 5 6 7 8 9

Vio

lati

on

Erro

r R

ate

Initial Credits

5 Peers

10 Peers

20 Peers

30 Peers

40 Peers

Figure: wrate = 0.5.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems54 / 72

Page 55: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Violation Error Rates

0

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

1 2 3 4 5 6 7 8 9

Vio

lati

on

Erro

r R

ate

Initial Credits

5 Peers

10 Peers

20 Peers

30 Peers

40 Peers

Figure: wrate = 0.8.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems55 / 72

Page 56: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Critical Initial Credits

Figure: Summary of the critical values of cr0 and cr�0:5%.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems56 / 72

Page 57: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Impact of initial cr on Re

cr0 : major critical initial credit (Re = 0)

cr�0:5% : minor critical initial credit (Re � 0.5%).

cr�0:5% seems not to highly increase as n .

By setting the initial cr to a small finite value but enough, most of thedependencies will become aged and can be removed without violatingcausal consistency after the associated meta-data is transmitted across afew hops (even for a large number of processes.)

The correlation coefficients of cr0 and n are around 0.94 � 0.95. Themajor critical credit values increase as n to avoid causal violations.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems57 / 72

Page 58: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Average Meta-Data Size (KB)

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

2.2

1 2 3 4 5 6 7 8

Ave

rag

e M

eta

-Da

ta S

ize

(KB

)

Initial Credits

5Peers

10Peers

20Peers

30Peers

40Peers

Figure: wrate = 0.2.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems58 / 72

Page 59: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Average Meta-Data Size (KB)

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1 2 3 4 5 6 7 8

Ave

rag

e M

eta

-Da

ta S

ize

(KB

)

Initial Credits

5Peers

10Peers

20Peers

30Peers

40Peers

Figure: wrate = 0.5.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems59 / 72

Page 60: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Average Meta-Data Size (KB)

0.2

0.4

0.6

0.8

1

1.2

1.4

1 2 3 4 5 6 7 8

Ave

rag

e M

eta

-Da

ta S

ize

(KB

)

Initial Credits

5Peers

10Peers

20Peers

30Peers

40Peers

Figure: wrate = 0.8.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems60 / 72

Page 61: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Critical Average Message Meta-Data Size mave (KB)

Figure: Summary of the critical average message meta-data sizes.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems61 / 72

Page 62: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Critical Average Message Meta-Data Size

With increasing number of processes, mave linearly increases.mave becomes smaller with a higher wrate for more processes.

A read operation will invoke a MERGE function to merge the piggybackedlog list. So, a higher read rate may increase the likelihood to make the logsize enlarged.Furthermore, although a write operation results in the increase of explicitinformation, it comes with a PURGE function to delete the redundantinformation.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems62 / 72

Page 63: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Message Meta-Data Saving Rate

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1 2 3 4 5 6 7 8 9

Met

a-D

ata

Siz

e S

avi

ng

Ra

te

Initial Credits

5 Peers10 Peers20 Peers30 Peers40 Peers

Figure: wrate = 0.2.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems63 / 72

Page 64: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Message Meta-Data Saving Rate

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1 2 3 4 5 6 7 8 9

Met

a-D

ata

Siz

e S

avi

ng

Ra

te

Initial Credits

5 Peers10 Peers20 Peers30 Peers40 Peers

Figure: wrate = 0.5.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems64 / 72

Page 65: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Message Meta-Data Saving Rate

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1 2 3 4 5 6 7 8 9

Met

a-D

ata

Siz

e S

avi

ng

Ra

te

Initial Credits

5 Peers10 Peers20 Peers30 Peers40 Peers

Figure: wrate = 0.8.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems65 / 72

Page 66: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Critical Message Meta-Data Size Saving Rates Rs

Figure: Summary of the critical message meta-data size saving rates.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems66 / 72

Page 67: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Approximate Causal Consistency Performance

Critical Message Meta-Data Size Saving Rate

Focus on the case of 40 processes:Rs is around 40% � 60% at a very slight cost of violating causalconsistency.Rs reaches around 5% � 20% without violating causality order in differentwrite rates.

This evidence proves that if the initial credit allocation is just a smalldigit, when the corresponding meta-data is removed, the associatedmessage would already (very likely) have reached its destination.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems67 / 72

Page 68: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Conclusions and Future Work

Conclusions

Opt-Track has a better network capacity utilization and betterscalability than Full-Track for causal consistency in partial replication.

The meta-data overhead of Opt-Track is linear in n .These improvements increase in higher write-intensive workloads.For the case of 40 processes, the overheads of Opt-Track are only around10 � 20 % of those of Full-Track for different write rates.

Opt-Track-CRP can perform better than optP (Baldoni et. al 2006) infull replication.

For 40 processes, the overheads of Opt-Track-CRP are only around 50 �55 % of those of optP for different write rates.

Showed the advantages of partial replication and provided the conditionsunder which partial replication can provide less overhead than fullreplication.Modification of Opt-Track, called Approx-Opt-Track, to provideapproximate causal consistency by reducing the size of the meta-data.

By controlling a parameter cr , we can trade-off the level of potentialinaccuracy by the size of meta-data.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems68 / 72

Page 69: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Conclusions and Future Work

Future Work

Reduce the size of the meta-data for maintaining causal consistency inpartially replicated systems.

Dynamic and Data-driven Replication Mechanism (Optimize thereplication mechanism).

Which file?How many replicas?Where?When?(i.e., the replica factor rate p

n is a variable.)Hierarchical Causal Consistency Protocol in Partial Replication.

A client-cluster model (two-level architecture) for causal consistency underpartial replication.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems69 / 72

Page 70: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Conclusions and Future Work

References I

M. Ahamad, G. Neiger, J. Burns, P. Kohli, and P. Hutto.Causal memory: Definitions, implementation and programming.Distributed Computing, 9(1):37–49, 1994.

R. Baldoni, A. Milani, and S. Tucci-piergiovanni.Optimal propagation based protocols implementing causal memories.Distributed Computing, 18(6):461–474, 2006.

Min Shen, Ajay D. Kshemkalyani, and Ta Yuan Hsu.OPCAM: optimal algorithms implementing causal memories in sharedmemory systems.In Proceedings of the 2015 International Conference on DistributedComputing and Networking, ICDCN 2015, Goa, India, January 4-7,2015, pages 16:1–16:4.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems70 / 72

Page 71: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Conclusions and Future Work

References II

Min Shen, Ajay D. Kshemkalyani, and Ta Yuan Hsu.Causal consistency for geo-replicated cloud storage under partialreplication.In IPDPS Workshops, pages 509–518. IEEE, 2015.

A. Kshemkalyani and M. Singhal.Necessary and sufficient conditions on information for causal messageordering and their optimal implementation.Distributed Computing, 11(2):91–111, April 1998.

Ta Yuan Hsu and Ajay D. Kshemkalyani.Performance of approximate causal consistency for partially replicatedsystems.In Workshop on Adaptive Resource Management and Scheduling forCloud Computing (ARMS-CC), pages 7–13. ACM, 2016.

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems71 / 72

Page 72: CausalConsistencyAlgorithmsforPartially ...ajayk/CCalgos.pdfCausalConsistencyAlgorithmsforPartially ReplicatedandFullyReplicatedSystems TaYuanHsu,AjayD.Kshemkalyani,MinShen UniversityofIllinoisatChicago

Conclusions and Future Work

Questions

Thank You!

TaYuan Hsu, Ajay D. Kshemkalyani, Min Shen University of Illinois at ChicagoCausal Consistency Algorithms for Partially Replicated and Fully Replicated Systems72 / 72