A Multiversion Update-Serializable Protocol for Genuine Partial Data Replication

Sebastiano Peluso, Pedro Ruivo, Paolo Romano, Francesco Quaglia and Luís Rodrigues

Euro-TM Workshop on Transactional Memory (WTM 2012), Bern, Switzerland

Distributed STMs

STMs are being employed in new scenarios:
- Database caches in three-tier web apps (FénixEDU)
- HPC programming languages (X10)
- In-memory cloud data grids (Coherence, Infinispan)

New challenges:
- Scalability
- Fault-tolerance

REPLICATION

Partial Replication

Each site stores a partial copy of the data.

Genuine partial replication schemes maximize scalability by ensuring that:
- Only the sites that replicate data items read or written by a transaction T exchange messages for executing/committing T.

Existing 1-Copy Serializable implementations enforce distributed validation of read-only transactions [SRDS10]: considerable overheads in typical workloads.
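The genuineness property can be illustrated with a small sketch. The placement map `REPLICAS` and the helper `participants` are hypothetical names introduced here, not taken from any cited system. The only point is that the sites contacted for T are derived from T's read-set and write-set, not from the full membership.

```python
# Hypothetical placement: which sites hold a replica of each data item.
REPLICAS = {
    "X": {1, 3},
    "Y": {2, 3},
    "Z": {0},
}

def participants(read_set, write_set, replicas=REPLICAS):
    """Return the sites that must exchange messages for a transaction.

    Under genuine partial replication this is exactly the set of sites
    replicating some item the transaction read or wrote.
    """
    involved = set()
    for key in set(read_set) | set(write_set):
        involved |= replicas[key]
    return involved

# A transaction that reads X and writes Y never contacts site 0.
sites = participants(read_set={"X"}, write_set={"Y"})
```

A non-genuine scheme would instead return every site for every transaction, which is what limits scalability.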

Issues with Partial Replication

Extending existing local multiversion (MV) STMs is not enough.

Local MV STMs rely on a single global counter to track version advancement.

Problem:
- With a global counter, the commit of every transaction would have to involve ALL NODES
- NO GENUINENESS = POOR SCALABILITY

GMU: Genuine Multiversion Update-Serializable Replication [ICDCS12]

- In the execution/commit phase of a transaction T, ONLY nodes which store data items accessed by T are involved.
- It uses multiple versions for each data item.
- It builds visible snapshots = freshest consistent snapshots, taking into account:
  1. causal dependencies vs. previously committed transactions at the time a transaction began,
  2. previous reads executed by the same transaction.
- Vector clocks are used to establish visible snapshots.

High Level Overview (i)

Transactions commit using a vector clock.
Each node stores a log of committed vector clocks.

Initial view of the visible snapshot
- When a transaction T begins on node N, it acquires the most recent vector clock in N's commit log.

View extension of the visible snapshot
- When T reads on a node N:
  - T's vector clock can be modified according to N's commit log.
  - Three reading rules are applied using T's vector clock.
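A minimal sketch of the initial view, under stated assumptions: the class names `Node` and `Transaction`, the `has_read` bookkeeping, and the all-ones initial clock are illustrative choices, not details given in the slides.

```python
class Node:
    """A node holding a commit log of vector clocks, oldest first."""

    def __init__(self, node_id, num_nodes):
        self.node_id = node_id
        # Assumed initial state: a single all-ones vector clock.
        self.commit_log = [tuple([1] * num_nodes)]

    def most_recent_vc(self):
        return self.commit_log[-1]


class Transaction:
    def __init__(self, origin):
        # Initial view: copy the most recent vector clock of the origin node.
        self.vc = list(origin.most_recent_vc())
        # Assumed bookkeeping: which nodes this transaction has read from.
        self.has_read = [False] * len(self.vc)


n0 = Node(node_id=0, num_nodes=3)
n0.commit_log.append((1, 2, 2))  # some update transaction has committed
t1 = Transaction(origin=n0)      # t1 starts from the clock (1, 2, 2)
```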

High Level Overview (ii)

Write operation
- When a transaction T writes V on data item O, it inserts <O,V> in T's write-set.

Commit operation
- Read-only transactions always commit.
- Update transactions run a genuine 2-Phase Commit:
  - Upon prepare message reception (participant-side): acquire read/write locks and validate the read-set, then send back a tentative commit vector clock.
  - If all replies are positive (coordinator-side): multicast the write-set and the final commit vector clock.
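The commit path for update transactions can be sketched as a toy, single-process 2-Phase Commit. Everything here is an assumption made for illustration: the `Participant` class, one exclusive lock per key standing in for read/write locks, validation by comparing version numbers, and a final clock computed as the entry-wise maximum of the proposals.

```python
class Participant:
    def __init__(self, node_id, num_nodes, keys):
        self.node_id = node_id
        self.vc = [1] * num_nodes            # most recent committed clock
        self.latest = {k: 1 for k in keys}   # key -> newest version number
        self.values = {k: None for k in keys}
        self.locked = {}                     # key -> id of the lock holder

    def prepare(self, txn_id, read_versions, write_keys):
        """Lock local keys, validate local reads, propose a clock."""
        mine = [k for k in list(read_versions) + list(write_keys)
                if k in self.latest]
        if any(self.locked.get(k, txn_id) != txn_id for k in mine):
            return False, None               # lock conflict: vote no
        for k in mine:
            self.locked[k] = txn_id
        for k, seen in read_versions.items():
            if k in self.latest and self.latest[k] != seen:
                self.release(txn_id)
                return False, None           # stale read: vote no
        proposal = list(self.vc)
        proposal[self.node_id] += 1          # tentative commit clock
        return True, proposal

    def commit(self, txn_id, write_set, commit_vc):
        for k, v in write_set.items():
            if k in self.latest:             # apply only locally stored keys
                self.values[k] = v
                self.latest[k] = commit_vc[self.node_id]
        self.vc = [max(a, b) for a, b in zip(self.vc, commit_vc)]
        self.release(txn_id)

    def release(self, txn_id):
        self.locked = {k: t for k, t in self.locked.items() if t != txn_id}


def two_phase_commit(txn_id, read_versions, write_set, nodes):
    """Coordinator: `nodes` are only the participants storing accessed keys."""
    votes = [n.prepare(txn_id, read_versions, write_set.keys()) for n in nodes]
    if not all(ok for ok, _ in votes):
        for n in nodes:
            n.release(txn_id)
        return None                          # abort
    commit_vc = [max(col) for col in zip(*(vc for _, vc in votes))]
    for n in nodes:
        n.commit(txn_id, write_set, commit_vc)
    return commit_vc


n1 = Participant(1, 3, keys={"X"})
n2 = Participant(2, 3, keys={"Y"})
t0_vc = two_phase_commit("T0", {}, {"X": "v", "Y": "w"}, [n1, n2])
```

Read-only transactions never enter this path, which is why they always commit.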

Rule 1: Reading Lower Bound

[Figure: timeline across Node 0, Node 1 (it stores X) and Node 2 (it stores Y). T0 performs W(X,v) and W(Y,w) and commits, creating X(2) and Y(2); the most recent VC in each VCLog moves from (1,1,1) to (1,2,2). T1, with T1.VC = (1,2,2), performs R(X) and R(Y) and observes X(2) and Y(2).]

Rule 2: Reading Upper Bound

[Figure: timeline across Node 0, Node 1 (it stores X) and Node 2 (it stores Y). Versions shown: X(1), X(3), Y(1), Y(2), Y(3). Vector clocks shown: (1,1,1), (1,1,2) and (1,3,3). T0 performs W(X,v) and W(Y,w) and commits. T1 performs R(X), observing X(1), then R(Y), observing Y(2), with T1.VC = (1,1,2), and commits. Legend: most recent VC in VCLog, T1.VC.]
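The two reading bounds are only named on the slides, so the sketch below is one plausible reading reconstructed from the figure values (1,1,1), (1,1,2) and (1,3,3), not a statement of the protocol. The function names and the `has_read` list are hypothetical. The assumed lower bound: node i serves a read only once its own clock entry has reached T.VC[i]. The assumed upper bound: on T's first read at node i, T.VC is merged with the most recent logged clock that does not exceed T.VC on any node T has already read from.

```python
def lower_bound_ok(txn_vc, node_id, commit_log):
    """Assumed Rule 1: the node must have caught up with T.VC[node_id]."""
    return commit_log[-1][node_id] >= txn_vc[node_id]


def extend_snapshot(txn_vc, has_read, node_id, commit_log):
    """Assumed Rule 2: bound the visible snapshot on a first read at a node.

    Returns the (possibly extended) vector clock and the updated has_read.
    """
    if not has_read[node_id]:
        for vc in reversed(commit_log):      # newest entry first
            if all(vc[j] <= txn_vc[j]
                   for j, already in enumerate(has_read) if already):
                txn_vc = [max(a, b) for a, b in zip(txn_vc, vc)]
                break
        has_read = has_read[:node_id] + [True] + has_read[node_id + 1:]
    return txn_vc, has_read


# T1 has already read X(1) on node 1 and now reads Y on node 2.
node2_log = [(1, 1, 1), (1, 1, 2), (1, 3, 3)]
t1_vc, t1_read = extend_snapshot([1, 1, 1], [False, True, False], 2, node2_log)
```

In this run the clock (1,3,3) is skipped because its entry for node 1 exceeds what T1 already observed there, so T1.VC becomes (1,1,2).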


Rule 3: Selection of Data Versions

Informally: observe the most recent consistent version of data item id on node i based on T’s history (previous reads).

Formally: iterate over the versions of id and return the most recent one s.t.

id.version.VN <= T.VC[i]
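The selection rule translates almost directly into code. In this sketch, the representation of versions as `(VN, value)` pairs and the name `select_version` are assumptions made for illustration.

```python
def select_version(versions, txn_vc, node_id):
    """Return the newest (vn, value) with vn <= T.VC[node_id], or None."""
    visible = [(vn, val) for vn, val in versions if vn <= txn_vc[node_id]]
    return max(visible, key=lambda pair: pair[0]) if visible else None


# Node 2 stores three versions of Y; T.VC[2] == 2 hides version 3.
y_versions = [(1, "y1"), (2, "y2"), (3, "y3")]
chosen = select_version(y_versions, txn_vc=[1, 1, 2], node_id=2)
```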


Building the commit Vector Clock

Based on a variant of Skeen's total order multicast algorithm [SKEEN85].

Intuition: serialize all-and-only conflicting transactions, tracking:
- direct and transitive conflict dependencies,
- causal relationships.
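A Skeen-style agreement on the commit vector clock might look as follows. This is a sketch under assumptions: each participant proposes its local clock with its own entry incremented, the coordinator takes the entry-wise maximum, and the entries of all participants are then raised to a common value. The slides do not spell out these steps, and the function names are hypothetical.

```python
def propose(node_vc, node_id):
    """Participant side: local clock with the node's own entry incremented."""
    proposal = list(node_vc)
    proposal[node_id] += 1
    return proposal


def final_commit_vc(proposals, participant_ids):
    """Coordinator side: merge the proposals into one commit clock."""
    merged = [max(column) for column in zip(*proposals)]
    # As in Skeen's algorithm, the largest proposed number wins, so all
    # participants order this transaction at the same position.
    agreed = max(merged[j] for j in participant_ids)
    for j in participant_ids:
        merged[j] = agreed
    return merged


p1 = propose([1, 1, 1], node_id=1)   # node 1 stores X
p2 = propose([1, 1, 2], node_id=2)   # node 2 stores Y and is one commit ahead
commit_vc = final_commit_vc([p1, p2], participant_ids=[1, 2])
```

Only the participants' entries change, so nodes that store none of the accessed data never take part in the agreement.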

Consistency Criterion

GMU ensures Extended Update Serializability.

Update Serializability (US) [ICDT86] ensures:
- 1-Copy-Serializability (1CS) on the history restricted to committed update transactions;
- 1CS on the history restricted to committed update transactions and any single read-only transaction.
But it can admit non-1CS histories containing at least 2 read-only transactions.

Extended Update Serializability [Adya99]:
- ensures the US property also for executing transactions;
- analogous to opacity in STMs.
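A small worked example shows how two read-only transactions can make a history non-1CS even though each one alone is fine. The transactions and values below are made up for illustration, and the brute-force check over serial orders is a simple stand-in for a formal serializability test.

```python
from itertools import permutations

INITIAL = {"x": 0, "y": 0}

# Each transaction is (name, reads, writes); values are illustrative.
T1 = ("T1", {}, {"x": 1})             # update transaction
T2 = ("T2", {}, {"y": 1})             # update transaction
RA = ("RA", {"x": 1, "y": 0}, {})     # read-only: sees T1 but not T2
RB = ("RB", {"x": 0, "y": 1}, {})     # read-only: sees T2 but not T1


def serializable(txns):
    """True if some serial order reproduces every transaction's reads."""
    for order in permutations(txns):
        state = dict(INITIAL)
        consistent = True
        for _, reads, writes in order:
            if any(state[k] != v for k, v in reads.items()):
                consistent = False
                break
            state.update(writes)
        if consistent:
            return True
    return False


updates_only = serializable([T1, T2])
with_ra = serializable([T1, T2, RA])
with_rb = serializable([T1, T2, RB])
with_both = serializable([T1, T2, RA, RB])
```

RA forces the order T1, RA, T2 while RB forces T2, RB, T1, so no single serial order fits both readers, yet each reader on its own is serializable together with the updates.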

Experiments on private cluster

- 8-core physical nodes
- TPC-C: 90% read-only xacts, 10% update xacts
- 4 threads per node
- moderate contention (15% abort rate at 20 nodes)


Thanks for the attention


References

[Adya99] A. Adya. “Weak Consistency: A Generalized Theory and Optimistic Implementations for Distributed Transactions”. PhD Thesis, Massachusetts Institute of Technology, 1999.

[ICDCS12] Sebastiano Peluso, Pedro Ruivo, Paolo Romano, Francesco Quaglia, Luís Rodrigues. “When Scalability Meets Consistency: Genuine Multiversion Update-Serializable Partial Replication”. The IEEE 32nd International Conference on Distributed Computing Systems, June 2012.

[ICDT86] R. C. Hansdah and L. M. Patnaik. “Update Serializability in Locking”. International Conference on Database Theory, vol. 243 of Lecture Notes in Computer Science, pp. 171–185, Springer Berlin / Heidelberg, 1986.

[SKEEN85] D. Skeen. “Unpublished communication”, 1985. Referenced in K. Birman, T. Joseph. “Reliable Communication in the Presence of Failures”. ACM Trans. on Computer Systems, pp. 47–76, 1987.

[SRDS10] Nicolas Schiper, Pierre Sutra, Fernando Pedone. “P-Store: Genuine Partial Replication in Wide Area Networks”. Proc. of the 29th Symposium on Reliable Distributed Systems, 2010.