Post on 24-Feb-2016
A Multiversion Update-Serializable Protocol for Genuine Partial Data Replication
Sebastiano Peluso, Pedro Ruivo, Paolo Romano, Francesco Quaglia and Luís Rodrigues
Euro-TM Workshop on Transactional Memory (WTM 2012), Bern, Switzerland
Distributed STMs
STMs are being employed in new scenarios:
- Database caches in three-tier web apps (FénixEDU)
- HPC programming languages (X10)
- In-memory cloud data grids (Coherence, Infinispan)
New challenges:
- Scalability
- Fault-tolerance
REPLICATION
Partial Replication
Each site stores a partial copy of the data.
Genuine partial replication schemes maximize scalability by ensuring that:
- Only the sites that replicate data items read or written by a transaction T exchange messages for executing/committing T.
Existing 1-Copy Serializable implementations enforce distributed validation of read-only transactions [SRDS10], which causes considerable overhead in typical workloads.
Issues with Partial Replication
Extending existing local multiversion (MV) STMs is not enough.
Local MV STMs rely on a single global counter to track version advancement.
Problem:
- With a global counter, committing a transaction would have to involve ALL NODES
- NO GENUINENESS = POOR SCALABILITY
GMU: Genuine Multiversion Update-Serializable Replication [ICDCS12]
In the execution/commit phase of a transaction T, ONLY the nodes that store data items accessed by T are involved.
It uses multiple versions for each data item.
It builds visible snapshots (the freshest consistent snapshots), taking into account:
1. causal dependencies on the transactions already committed at the time the transaction began,
2. previous reads executed by the same transaction.
Vector clocks are used to establish visible snapshots.
High Level Overview (i)
Transactions commit using a vector clock.
Each node stores a log of committed vector clocks.
Initial view of the visible snapshot
When a transaction T begins on a node N, it acquires the most recent vector clock in N’s commit log.
View extension of the visible snapshot
When T reads on a node N:
- T’s vector clock can be modified according to N’s commit log.
- Three reading rules are applied using T’s vector clock.
High Level Overview (ii)
Write operation
When a transaction T writes value V to data item O, it inserts <O,V> into T’s write-set.
Commit operation
Read-only transactions always commit.
Update transactions run a genuine 2-Phase Commit:
- Upon receiving a prepare message (participant side): acquire read/write locks, validate the read-set, and send back a tentative commit vector clock.
- If all replies are positive (coordinator side): multicast the write-set and the final commit vector clock.
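The commit path above can be sketched in Python as a small single-process model. It is an illustration under stated assumptions, not the authors' implementation: the `Participant` class, the `two_phase_commit` function, the use of one exclusive lock per key in place of read/write locks, the validation test (a read is valid only if the version it saw is still the newest) and the entrywise-maximum merge of the tentative clocks are all hypothetical choices.

```python
from typing import Dict, List, Optional, Set


class Participant:
    """A node taking part in the commit of an update transaction (hypothetical model)."""

    def __init__(self, node_id: int, num_nodes: int) -> None:
        self.node_id = node_id
        self.last_vc: List[int] = [0] * num_nodes  # newest clock in the commit log
        self.latest_vn: Dict[str, int] = {}        # newest version number per key
        self.locked: Set[str] = set()              # keys held by prepared transactions

    def prepare(self, read_vns: Dict[str, int], write_keys: Set[str]) -> Optional[List[int]]:
        """Lock, validate and vote: a tentative clock means yes, None means no."""
        keys = set(read_vns) | write_keys
        if keys & self.locked:
            return None  # assumed policy: a lock conflict is a negative vote
        if any(self.latest_vn.get(k, 0) != vn for k, vn in read_vns.items()):
            return None  # a newer version was committed after the read
        self.locked |= keys
        tentative = list(self.last_vc)
        tentative[self.node_id] += 1
        return tentative

    def finish(self, keys: Set[str], write_keys: Set[str], final_vc: Optional[List[int]]) -> None:
        """Apply the writes if the transaction committed, then release the locks."""
        if final_vc is not None:
            for k in write_keys:
                self.latest_vn[k] = final_vc[self.node_id]
            self.last_vc = list(final_vc)
        self.locked -= keys


def two_phase_commit(participants, reads, writes):
    """reads: node_id -> {key: version read}; writes: node_id -> set of written keys."""
    votes = {}
    for p in participants:
        votes[p.node_id] = p.prepare(reads.get(p.node_id, {}), writes.get(p.node_id, set()))
    voted_yes = [p for p in participants if votes[p.node_id] is not None]
    final_vc = None
    if len(voted_yes) == len(participants):
        # Assumed merge: entrywise maximum of the tentative clocks.
        final_vc = [max(entries) for entries in zip(*votes.values())]
    for p in voted_yes:
        r = reads.get(p.node_id, {})
        w = writes.get(p.node_id, set())
        p.finish(set(r) | w, w, final_vc)
    return final_vc


node1, node2 = Participant(1, 3), Participant(2, 3)
first = two_phase_commit([node1, node2], reads={}, writes={1: {"X"}, 2: {"Y"}})
# The second transaction read X at version 0, before the first one committed.
second = two_phase_commit([node1, node2], reads={1: {"X": 0}}, writes={2: {"Y"}})
print(first, second)  # [0, 1, 1] None
```

Only the nodes passed in `participants` are contacted, which is the genuineness property in miniature. The second transaction aborts because its read of X fails validation on node 1, and node 2 releases the lock it took during prepare.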
Rule 1: Reading Lower Bound
[Figure: timelines for Node 0, Node 1 (which stores X) and Node 2 (which stores Y). T0 performs W(X,v) and W(Y,w) and commits with vector clock (1,2,2), producing versions X(2) and Y(2). T1 performs R(X) and R(Y) and observes X(2) and Y(2). The vector clocks (1,1,1) and (1,2,2) are annotated as "Most recent VC in VCLog" and as T1.VC.]
Rule 2: Reading Upper Bound
[Figure: timelines for Node 0, Node 1 (which stores X) and Node 2 (which stores Y). T0 performs W(X,v) and W(Y,w) and commits with vector clock (1,3,3), producing versions X(3) and Y(3). T1 performs R(X) and R(Y), observes X(1) and Y(2), and commits. The vector clocks (1,1,1), (1,1,2) and (1,3,3) are annotated as "Most recent VC in VCLog" and as T1.VC; versions Y(1) and Y(2) also appear on Node 2.]
Rule 3: Selection of Data Versions
Informally: T observes the most recent consistent version of data item id on node i, based on T’s history (its previous reads).
Formally: iterate over the versions of id and return the most recent one such that
id.version.VN <= T.VC[i]
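The selection rule can be written as a short Python function. This is a minimal sketch: the `Version` dataclass, the `select_version` name and the oldest-to-newest ordering of the version list are assumptions made for illustration; only the comparison `vn <= t_vc[i]` comes from the rule itself.

```python
from dataclasses import dataclass
from typing import Any, Optional, Sequence


@dataclass(frozen=True)
class Version:
    value: Any
    vn: int  # version number assigned on the storing node at commit time


def select_version(versions: Sequence[Version], t_vc: Sequence[int], i: int) -> Optional[Version]:
    """Return the most recent version on node i whose vn does not exceed t_vc[i]."""
    # versions is assumed to be ordered from oldest to newest
    for version in reversed(versions):
        if version.vn <= t_vc[i]:
            return version
    return None  # no version is visible to this transaction


# Data item Y is stored on node 2 and has three committed versions.
y_versions = [Version("y1", 1), Version("y2", 2), Version("y3", 3)]
chosen = select_version(y_versions, t_vc=(1, 1, 2), i=2)
print(chosen)  # Version(value='y2', vn=2)
```

With T.VC = (1,1,2), the entry for node 2 is 2, so version 3 is skipped even though it is the newest one stored.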
Building the commit Vector Clock
Based on a variant of Skeen’s total order multicast algorithm [SKEEN85].
Intuition: serialize all-and-only conflicting transactions, tracking:
- direct and transitive conflict dependencies,
- causal relationships.
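In Skeen's algorithm each destination proposes a timestamp and the maximum proposal becomes the final one. A vector-clock flavour of that idea might look like the sketch below. The function names, the rule that a participant bumps its own entry by one, and the final step that raises the entries of all involved nodes to their common maximum are assumptions; the last one is suggested by the commit clocks (1,2,2) and (1,3,3) shown in the Rule 1 and Rule 2 figures, and is not stated on this slide.

```python
from typing import Iterable, Sequence, Tuple


def propose(node_id: int, last_committed_vc: Sequence[int]) -> Tuple[int, ...]:
    """Participant side: propose a clock one step past the node's latest commit."""
    proposal = list(last_committed_vc)
    proposal[node_id] += 1
    return tuple(proposal)


def final_commit_vc(proposals: Sequence[Sequence[int]], involved: Iterable[int]) -> Tuple[int, ...]:
    """Coordinator side: merge the proposals into the final commit vector clock."""
    involved = list(involved)
    # Skeen-style step: keep the maximum of the proposals, entry by entry.
    merged = [max(entries) for entries in zip(*proposals)]
    # Assumed step: give every involved node the same (largest) entry.
    commit_vn = max(merged[n] for n in involved)
    for n in involved:
        merged[n] = commit_vn
    return tuple(merged)


# Hypothetical state: nodes 1 and 2 participate; node 2 has already committed once more.
logs = {1: (1, 1, 1), 2: (1, 1, 2)}
proposals = [propose(node, vc) for node, vc in logs.items()]
commit_vc = final_commit_vc(proposals, involved=logs.keys())
print(proposals)  # [(1, 2, 1), (1, 1, 3)]
print(commit_vc)  # (1, 3, 3)
```

Node 0 does not take part, so its entry stays at 1: only transactions that touch common nodes are ordered against each other.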
Consistency Criterion
GMU ensures Extended Update Serializability.
Update Serializability (US) [ICDT86] ensures:
- 1-Copy Serializability (1CS) on the history restricted to committed update transactions;
- 1CS on the history restricted to committed update transactions and any single read-only transaction.
However, it can admit non-1CS histories that contain at least 2 read-only transactions.
Extended Update Serializability [Adya99]:
- ensures the US property for executing transactions as well;
- is analogous to opacity in STMs.
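A small brute-force check makes the gap between US and 1CS concrete. The history below is a hypothetical example, not taken from the talk: two independent update transactions U1 and U2, and two read-only transactions R1 and R2 that each observe only one of the updates. The checker uses a simplified single-copy, value-based model and looks for a serial order in which every read returns the latest preceding write.

```python
from itertools import permutations

INITIAL = {"X": 0, "Y": 0}
WRITES = {"U1": {"X": 1}, "U2": {"Y": 1}}                   # update transactions
READS = {"R1": {"X": 1, "Y": 0}, "R2": {"X": 0, "Y": 1}}    # read-only transactions


def serializable(txns):
    """True if some serial order of txns explains every value that was read."""
    for order in permutations(txns):
        state = dict(INITIAL)
        consistent = True
        for t in order:
            if any(state[k] != v for k, v in READS.get(t, {}).items()):
                consistent = False
                break
            state.update(WRITES.get(t, {}))
        if consistent:
            return True
    return False


print(serializable(["U1", "U2"]))              # True
print(serializable(["U1", "U2", "R1"]))        # True, e.g. U1, R1, U2
print(serializable(["U1", "U2", "R2"]))        # True, e.g. U2, R2, U1
print(serializable(["U1", "U2", "R1", "R2"]))  # False
```

R1 requires U1 before U2, while R2 requires U2 before U1. Each read-only transaction is serializable with the updates on its own, but no single order satisfies both, which is the kind of history US admits and 1CS forbids.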
Experiments on a private cluster
8-core physical nodes
TPC-C
- 90% read-only xacts
- 10% update xacts
- 4 threads per node
- moderate contention (15% abort rate at 20 nodes)
Thanks for the attention
References
[Adya99] A. Adya. “Weak Consistency: A Generalized Theory and Optimistic Implementations for Distributed Transactions”. PhD Thesis, Massachusetts Institute of Technology, 1999.
[ICDCS12] Sebastiano Peluso, Pedro Ruivo, Paolo Romano, Francesco Quaglia, Luís Rodrigues. “When Scalability Meets Consistency: Genuine Multiversion Update-Serializable Partial Replication”. The IEEE 32nd International Conference on Distributed Computing Systems, June 2012.
[ICDT86] R. C. Hansdah and L. M. Patnaik. “Update Serializability in Locking”. International Conference on Database Theory, vol. 243 of Lecture Notes in Computer Science, pp. 171–185, Springer Berlin / Heidelberg, 1986.
[SKEEN85] D. Skeen. “Unpublished communication”, 1985. Referenced in K. Birman, T. Joseph. “Reliable Communication in the Presence of Failures”. ACM Trans. on Computer Systems, 47-76, 1987.
[SRDS10] Nicolas Schiper, Pierre Sutra, Fernando Pedone. “P-Store: Genuine Partial Replication in Wide Area Networks”. Proc. of the 29th Symposium on Reliable Distributed Systems, 2010.