replication: optimistic approaches
Post on 05-Jan-2016
Replication: optimistic approaches
Marc Shapiro, with Yasushi Saito (HP Labs)
Cambridge Distributed Systems Group
Bases de Données Avancées -- 2002-10-23 Replication: optimistic approaches 2
Motivations for this work
Peer-to-peer, decentralised write sharing
Lessons and commonalities
Understand limitations
Different solutions: spectrum or discrete points?
Simple formal model
Optimistic replication
Replicas of shared objects on sites
Without synchronisation: peer-to-peer read and update!
Consistency: a posteriori, offline
  Merge independent updates
Applications:
  high-latency networks
  disconnected operation
  cooperative work
Improves availability & performance
Example: cooperative engineering with CVS
CVS: developing shared code
Local, disconnected replica: no interference
Conflicts:
  Write same file = syntactic
  Overlap in file = violates edit semantics
  Doesn’t compile or fails tests = violates application semantics
Both sides of a conflict are excluded
Manual repair
Example: Bayou
General-purpose database
Any replica can update, log actions
  action = { dependency check, operation, merge-procedure }
Optimistic replication:
  epidemic exchange of logs
  { roll-back, replay }*; commit
  dep-check: semantic check for conflict
  merge-proc: semantic repair
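The three-part action above can be sketched as follows. This is a minimal illustration, not Bayou's actual interface: the `Action` class, `apply_action`, the `reserve` helper and the room-booking data are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Database = Dict[str, str]  # hypothetical schema: time slot -> owner


@dataclass
class Action:
    dep_check: Callable[[Database], bool]   # True when no conflict is detected
    operation: Callable[[Database], None]   # the intended update
    merge_proc: Callable[[Database], None]  # semantic repair, run on conflict


def apply_action(db: Database, action: Action) -> bool:
    """Run one action; True if the operation ran, False if merge_proc ran."""
    if action.dep_check(db):
        action.operation(db)
        return True
    action.merge_proc(db)
    return False


def reserve(slot: str, fallback: str, who: str) -> Action:
    """Book `slot` for `who`; on conflict, try `fallback` instead."""
    def book(db: Database) -> None:
        db[slot] = who

    def repair(db: Database) -> None:
        db.setdefault(fallback, who)  # only takes the fallback if it is free

    return Action(dep_check=lambda db: slot not in db,
                  operation=book,
                  merge_proc=repair)


db: Database = {}
first = apply_action(db, reserve("mon-10h", "mon-11h", "alice"))
second = apply_action(db, reserve("mon-10h", "mon-11h", "bob"))
print(first, second, db)
```

The second reservation fails its dependency check, so its merge procedure books the fallback slot rather than overwriting the first one.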
Basic vocabulary
While isolated: tentative updates
When connected, reconcile:
  Propagate & collect updates
  (Conceptually) Restart from initial state
  Replay updates (if possible)
Overriding goal: consistency
1. Consistency
Study component issues of consistency
What is consistency?
Consistent with user intents
  apply operations
  according to user scenario
Consistent with data invariants
  dependent actions
  pre- and post-conditions
  conflict resolution
Replicas consistent with each other
  converge towards same values
Consistency: problem taxonomy
1. Objects & updates
  Internal vs. external consistency
  Value / value log / operation log
  Single master / multi-master
2. Detecting dependence vs. concurrency
3. Concurrency control
4. Laziness of concurrency control
  Pessimistic / advanced concurrency / optimistic
5. Convergence
Operation-based reconciliation
Updates: concurrent, unsynchronised
Local log of actions = operation descriptions
  object identifier, method, arguments
Multi-log collects local + remote logs
Reconciliation schedule: merge multi-log & run sequentially
Scheduling issues:
  Include vs. exclude
  Execution order
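A minimal sketch of this reconciliation loop, under simplifying assumptions: the multi-log is merged by plain concatenation, objects are integer counters, and the only precondition is the hypothetical rule that a counter may not go negative. The `add`/`sub` methods and the `reconcile` function are illustrative names, not taken from any of the systems discussed.

```python
from typing import Dict, List, Tuple

State = Dict[str, int]
Action = Tuple[str, str, int]  # (object identifier, method, argument)


def run(state: State, action: Action) -> bool:
    """Apply one action in place; return False if its precondition fails."""
    obj, method, arg = action
    current = state.get(obj, 0)
    if method == "add":
        state[obj] = current + arg
        return True
    if method == "sub":
        if current < arg:  # hypothetical precondition: no negative values
            return False
        state[obj] = current - arg
        return True
    raise ValueError(f"unknown method {method!r}")


def reconcile(initial: State, logs: List[List[Action]]):
    """Concatenate the logs, replay from the initial state, exclude failures."""
    state = dict(initial)
    schedule: List[Action] = []
    excluded: List[Action] = []
    for log in logs:
        for action in log:
            (schedule if run(state, action) else excluded).append(action)
    return state, schedule, excluded


log_a = [("x", "sub", 8)]
log_b = [("x", "sub", 5), ("x", "add", 4)]
state, schedule, excluded = reconcile({"x": 10}, [log_a, log_b])
print(state, schedule, excluded)
```

Both scheduling issues show up here: the second `sub` is excluded only because of the execution order chosen, and a different order could include it.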
Operation-based model
[Figure: operation-based model]
Dependence vs. concurrency
Two actions either have a dependency or are commutative / concurrent
Dependent actions:
  do not conflict
  must be scheduled in dependence order
Concurrent actions
  potentially conflict
Dependence / concurrency detection is a fundamental mechanism
Concurrency control
Concurrent & no conflict, so the actions commute: execute both, in arbitrary order
Conflict detection options
Conflict resolution options
Convergence
Liveness: sites receive same/all actions
Safety: given same actions, sites compute the same value
Stability: actions eventually not undone
2. Dependency & Concurrency
Mechanisms to detect if actions are dependent or concurrent
Scalar clocks and timestamps
Wall clock: total order
Lamport clock: total order, consistent with causal dependence
Schedule in timestamp order
Can’t detect concurrency
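A small sketch of a Lamport clock illustrates both properties. The `LamportClock` class and the site names are illustrative; ties between equal counters are broken by site identifier, which is one common convention and an assumption here.

```python
from typing import Tuple

Stamp = Tuple[int, str]  # (counter, site id); tuples compare lexicographically


class LamportClock:
    def __init__(self, site: str) -> None:
        self.site = site
        self.time = 0

    def tick(self) -> Stamp:
        """Timestamp a local event."""
        self.time += 1
        return (self.time, self.site)

    def receive(self, stamp: Stamp) -> Stamp:
        """Timestamp the receipt of a message carrying `stamp`."""
        self.time = max(self.time, stamp[0])
        return self.tick()


a, b = LamportClock("A"), LamportClock("B")
w1 = a.tick()       # local write at A
w2 = b.tick()       # independent write at B, concurrent with w1
w3 = b.receive(w1)  # B learns of w1, so w3 causally follows w1
order = sorted([w3, w2, w1])
print(order)
```

The sorted order respects causality (`w1` before `w3`), but it also places `w1` before `w2` even though neither caused the other: the timestamps alone cannot reveal that the two are concurrent.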
Happens-before
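$e_1 \rightarrow e_2$ if:
  $e_1$ precedes $e_2$ in the same process, or
  $e_1$ sends, $e_2$ receives
$\neg(e_1 \rightarrow e_2) \land \neg(e_2 \rightarrow e_1) \Leftrightarrow e_1 \parallel e_2$
$e_1 \parallel e_2$: $e_1$ does not cause $e_2$
$e_1 \rightarrow e_2$: $e_1$ might cause $e_2$
Partial order, consistent with causal dependence
Schedule consistent with $\rightarrow$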
Syntactic vs. semantic mechanisms
Scalar timestamps
  no concurrency detection
  very conservative approx. of causality
Vector timestamps
  detect concurrency
  conservative approx. of causality
Alternative: explicit semantic constraints
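To make the contrast with scalar timestamps concrete, here is a minimal sketch of vector-timestamp comparison. The dictionary representation (site id to counter, missing entries read as zero) and the `compare` function are assumptions made for the example.

```python
from typing import Dict

VectorClock = Dict[str, int]  # site id -> number of events seen from that site


def leq(u: VectorClock, v: VectorClock) -> bool:
    """True if every entry of u is <= the matching entry of v."""
    return all(count <= v.get(site, 0) for site, count in u.items())


def compare(u: VectorClock, v: VectorClock) -> str:
    """Classify two vector timestamps under happens-before."""
    if leq(u, v) and leq(v, u):
        return "equal"
    if leq(u, v):
        return "before"
    if leq(v, u):
        return "after"
    return "concurrent"


r1 = compare({"A": 1}, {"A": 1, "B": 1})
r2 = compare({"A": 2}, {"A": 1, "B": 1})
print(r1, r2)
```

The second pair is reported as concurrent because each vector is ahead of the other in one entry, which is exactly the case a scalar timestamp would silently order.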
Locks as semantic constraints
Read(x) depends on
  previous Write(x) in same process, or
  previously-received Write(x),
  whichever is later
Write(x) depends on
  previous Read(*) in same process
More semantic information than Happens-Before
Step in the right direction, but still too coarse
IceCube: Primitive constraints
Declarative (“static”):
  MustHave: $a \triangleright b$
    if $a \in s$ and $a \triangleright b$ then $b \in s$
    (not necessarily contiguous nor in order)
  Order: $a \rightarrow b$
    if $a, b \in s$ and $a \rightarrow b$ then $a$ before $b$ in $s$
    (not necessarily both nor contiguous)
  Within log, across logs
Imperative (dynamic): preCondition (State)
Log constraints
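[Figure: parcel, predecessor-successor and alternatives constraints]
Express user intents:
  Predecessor/successor: $a \rightarrow b \land b \triangleright a$
    b uses effect of a; “a causes b”
  Parcel: $a \triangleright b \land b \triangleright a$
    transaction
  Alternatives: $a \rightarrow b \land b \rightarrow a$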
3. Concurrency control & scheduling
Policies for dealing with concurrent actions
Optimistic concurrency control & scheduling
Two actions are either:
  dependent: schedule in dependence order
  concurrent and non-conflicting, or commutative: schedule in any order
  concurrent and conflicting: schedule in non-conflicting order, or exclude one, the other, or both
Concurrency control
Concurrent & no conflict, so the actions commute: execute both, in arbitrary order
Conflict detection options: 2 concurrent actions conflict
  only if operate on same object
  only if both write
  only if violate semantic invariant
Conflict resolution options:
  exclude both
  exclude 1st, include 2nd (or vice-versa)
  execute both in favorable order
  (rewrite and execute both)
What is a conflict?
1 site executes code + pre/post-conditions
Pre/post-conditions often unknown
Dependency between successive actions
Schedule execution must satisfy pre/post-conditions
Violation = conflict
[Figure: three annotated steps]
  x1 := f(x0), with pre(x0) and post(x0, f(x0))
  x’1 := g(x0), with pre(x0) and post(x’1, g(x0))
  x2 := g(x1), with pre(x1) and post(x1, g(x1))
Thomas’ Write Rule
Pre- / post-conditions unknown
Scalar clocks
  no concurrency detection
  implicit concurrency control
  schedule in clock order
  a later action excludes earlier ones
Lost updates
Delete ambiguity: “tombstone” state
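A minimal sketch of Thomas' Write Rule with tombstones, assuming `(counter, site id)` timestamps that are totally ordered. The `TwrStore` class and its method names are illustrative only.

```python
from typing import Any, Dict, Tuple

Timestamp = Tuple[int, str]  # (clock value, site id), totally ordered
TOMBSTONE = object()         # marker kept in place of a deleted value


class TwrStore:
    def __init__(self) -> None:
        self.cells: Dict[str, Tuple[Timestamp, Any]] = {}

    def apply(self, key: str, stamp: Timestamp, value: Any) -> bool:
        """Keep the write only if it is newer than what is stored."""
        current = self.cells.get(key)
        if current is not None and current[0] >= stamp:
            return False  # the older write is excluded: a lost update
        self.cells[key] = (stamp, value)
        return True

    def read(self, key: str) -> Any:
        current = self.cells.get(key)
        if current is None or current[1] is TOMBSTONE:
            return None
        return current[1]


ops = [("x", (1, "A"), "red"), ("x", (2, "B"), "blue"), ("x", (3, "A"), TOMBSTONE)]
r1, r2 = TwrStore(), TwrStore()
for key, stamp, value in ops:
    r1.apply(key, stamp, value)
for key, stamp, value in reversed(ops):  # same writes, opposite delivery order
    r2.apply(key, stamp, value)
late = r2.apply("x", (2, "C"), "green")  # late write, older than the delete
print(r1.read("x"), r2.read("x"), late)
```

Both replicas converge regardless of delivery order. Without the tombstone, the replica would have no timestamp left to compare against, and the late write would resurrect the deleted object.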
Value-based Version Vector concurrency control
Pre- / post-conditions unknown
Independent objects
  actions to different objects commute
  VV = per-object vector timestamp
  any concurrent writes to object conflict
Resolution:
  Manual
  Values: “Resolver” per data type
Bayou scheduling
Disjoint databases; 1 primary / database
Transaction: single database
Action = { dependency check, operation, merge-procedure }
Optimistic replication:
  epidemic exchange of logs
  { roll-back, replay }*; commit
Conflict = dependency check fails: run merge-procedure
Bayou dependency checks
Write-write conflicts: on replay check that data unchanged
Read-write conflicts: check input data
  can detect concurrent updates
  semantic: only relevant changes
Application-specific checks
  bank account balance > £100
  fine grain
IceCube: Object constraints
Shared data type advertises static semantics
  mutually exclusive: $a \rightarrow b \land b \rightarrow a$
  best order (e.g. bank: credits before debits): $a \rightarrow b$
Only between concurrent actions
Also: dynamic constraints
[Figure: commute, best order and mutually exclusive constraints]
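The bank example of a best-order constraint can be sketched as follows. This is a toy illustration, not IceCube's scheduler: it assumes all actions in the log are mutually concurrent (so reordering is allowed), and the `execute` and `best_order` functions are hypothetical names. The overdraft test plays the role of a dynamic constraint.

```python
from typing import List, Tuple

Action = Tuple[str, int]  # ("credit" | "debit", amount)


def execute(balance: int, schedule: List[Action]):
    """Replay a schedule; a debit that would overdraw is excluded."""
    included: List[Action] = []
    for kind, amount in schedule:
        if kind == "debit":
            if amount > balance:
                continue  # dynamic constraint fails: exclude this action
            balance -= amount
        else:
            balance += amount
        included.append((kind, amount))
    return balance, included


def best_order(actions: List[Action]) -> List[Action]:
    """Static hint: credits before debits (stable sort keeps log order otherwise)."""
    return sorted(actions, key=lambda action: action[0] != "credit")


log = [("debit", 80), ("credit", 50), ("debit", 10)]
naive = execute(50, log)
hinted = execute(50, best_order(log))
print(len(naive[1]), len(hinted[1]))
```

Replaying in log order excludes the first debit, while the hinted order includes all three actions.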
IceCube scheduling
Insight: a conflict is a choice of which action to exclude
  maximise value
IceCube execution model
[Figure: IceCube execution model, with log constraints, object constraints and dynamic constraints between actions]
Search vs. syntactic order
[Figure: solution size versus number of actions (5 to 250); series: Optimal, Concatenate, IceCube, Single log]
Performance of IceCube heuristics
[Figure: execution time (ms) versus number of actions (1000 to 10000); series: Total]
4. Convergence
Can a peer-to-peer system converge?
Hard in the general case
Formalise to understand limitations, trade-offs
Convergence
Liveness: sites receive same/all operations
  epidemic multicast
  quickly
Safety: sites compute the same value
  equivalent schedules
Stability: actions eventually not undone
  stable schedules
  Users, external world dependency
  Garbage collection
Schedule soundness & equivalence
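$s$ sound:
  Closed for MustHave: $a \in s \land a \triangleright b \Rightarrow b \in s$
  Consistent with Order: $(a, b \in s \land a \rightarrow b) \Rightarrow a$ before $b$ in $s$
Equivalence: $s \equiv t$
  $s$, $t$ sound
  $a \in s \Leftrightarrow a \in t$
  ordering is irrelevant!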
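These two definitions translate almost directly into code. The sketch below assumes schedules are lists of distinct action names and constraints are sets of pairs, where `(a, b)` in `must_have` means a MustHave b and `(a, b)` in `order` means a is ordered before b; the function names are illustrative.

```python
from typing import List, Set, Tuple

Pairs = Set[Tuple[str, str]]


def is_sound(schedule: List[str], must_have: Pairs, order: Pairs) -> bool:
    """Closed for MustHave and consistent with Order."""
    position = {action: index for index, action in enumerate(schedule)}
    for a, b in must_have:
        if a in position and b not in position:
            return False  # a is scheduled but the action it needs is missing
    for a, b in order:
        if a in position and b in position and position[a] > position[b]:
            return False  # both scheduled, but in the wrong order
    return True


def equivalent(s: List[str], t: List[str], must_have: Pairs, order: Pairs) -> bool:
    """Both sound and containing the same actions; ordering is irrelevant."""
    return (is_sound(s, must_have, order)
            and is_sound(t, must_have, order)
            and set(s) == set(t))


must_have: Pairs = {("b", "a")}          # b MustHave a
order: Pairs = {("a", "b"), ("a", "c")}  # a before b, a before c

checks = {
    "a,b,c": is_sound(["a", "b", "c"], must_have, order),
    "b only": is_sound(["b"], must_have, order),
    "b,a": is_sound(["b", "a"], must_have, order),
}
same = equivalent(["a", "b", "c"], ["a", "c", "b"], must_have, order)
print(checks, same)
```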
Stability
Peer-to-peer, indefinite tentative update + advisory reconciliation OK
But stability needed:
  Users, external world depend on it
  Garbage collect multilog
Stable: eventually decisions not changed
  committed: definitely included in all schedules
  aborted: definitely excluded
Correctness of stability
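Actions known to be stable at site $i$:
  $\mathit{stable}_i = \mathit{committed}_i \cup \mathit{aborted}_i$
Live: $\forall$ action $a$, site $i$: eventually $a \in \mathit{stable}_i$
Safe:
  $\forall$ site $i$, schedule $s_i$: $s_i$ sound $\land\ \mathit{committed}_i \subseteq s_i$
  $\forall$ sites $i, k$: $\mathit{committed}_i \cap \mathit{aborted}_k = \emptyset$
Safety invariant: strong, global!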
Maintaining disjointness
$\forall$ sites $i, k$: $\mathit{committed}_i \cap \mathit{aborted}_k = \emptyset$
Different possibilities:
  Unilateral abort
    TWR, Holliday 2000
  Unilateral commit
  Deterministic abort / commit rule
    TWR
  Primary (only one) site decides
    Bayou, CVS
  Consensus before deciding
    Deno, Holliday 2000-2002
Maintaining soundness
$\forall$ site $i$, schedule $s_i$: $s_i$ sound $\land\ \mathit{committed}_i \subseteq s_i$
When aborting a, also abort actions that MustHave a
When committing a, also abort uncommitted actions that are ‘Order’ed before a
Maintain both soundness and disjointness.
Peer-to-peer commitment is hard!
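The two rules above can be sketched for a single site as follows. This is an illustrative model under stated assumptions, not a protocol from the talk: constraints are sets of pairs, where `(a, b)` in `must_have` means a MustHave b and `(a, b)` in `order` means a is ordered before b, and the hypothetical `Site` class refuses any decision that would undo an earlier commit.

```python
from typing import Iterable, Set, Tuple

Pairs = Set[Tuple[str, str]]


class Site:
    def __init__(self, must_have: Pairs, order: Pairs) -> None:
        self.must_have = must_have
        self.order = order
        self.committed: Set[str] = set()
        self.aborted: Set[str] = set()

    def _abort_closure(self, roots: Iterable[str]) -> Set[str]:
        """The roots plus, transitively, every action that MustHave one of them."""
        closure: Set[str] = set()
        pending = list(roots)
        while pending:
            x = pending.pop()
            if x in closure or x in self.aborted:
                continue
            closure.add(x)
            pending.extend(a for a, b in self.must_have if b == x)
        return closure

    def abort(self, action: str) -> None:
        doomed = self._abort_closure([action])
        if doomed & self.committed:
            raise ValueError("abort would undo a committed action")
        self.aborted |= doomed

    def commit(self, action: str) -> None:
        if action in self.aborted:
            raise ValueError("action is already aborted")
        # Uncommitted actions ordered before `action` can no longer be scheduled.
        early = [a for a, b in self.order if b == action and a not in self.committed]
        doomed = self._abort_closure(early)
        if doomed & (self.committed | {action}):
            raise ValueError("commit would break soundness")
        self.committed.add(action)
        self.aborted |= doomed


must_have: Pairs = {("b", "a")}          # b MustHave a
order: Pairs = {("a", "b"), ("c", "b")}  # a before b, c before b

site = Site(must_have, order)
site.commit("a")
site.commit("b")  # c is ordered before b and still uncommitted: it is aborted

other = Site(must_have, order)
other.abort("a")  # b MustHave a, so b is aborted too

try:
    Site(must_have, order).commit("b")  # b before its predecessor a
    refused = False
except ValueError:
    refused = True
print(site.committed, site.aborted, other.aborted, refused)
```

Even this local version has to reject some decisions. Keeping `committed` at one site disjoint from `aborted` at every other site additionally requires the sites to agree, which is where the difficulty lies.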
Stability with TWR
Independent objects
Independent writes (no MustHave nor Order)
All sites take same decision:
  Given two writes to same object, abort the earlier
  Whether concurrent or not
  Write stable when seen by all sites
Disjointness: $\mathit{committed}_i = \emptyset$
Soundness: no MustHave (no transactions)
Stability in Bayou
Databases:
  Disjoint
  Independent: no multi-DB transaction
  1 primary / database
Log constraints: transactions, time order
Disjointness: Only 1 site decides about a: the primary for the database that a updates
Soundness: whole transaction commits or aborts
Holliday’s pre-commit protocol
Log constraints:
  multi-object transactions
  happens-before order
Read transactions commit locally
Read-Write transactions: consensus to commit
  convert locks to intentions
  pre-commit, vote
  commit if quorum ‘yes’
  abort if anti-quorum ‘no’ or conflict with committed
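The voting rule in the last two bullets can be sketched as a pure decision function. The slide does not give quorum sizes, so this example assumes a simple majority quorum and treats an anti-quorum as enough ‘no’ votes that a ‘yes’ quorum can no longer be reached; `decide` and its parameters are hypothetical names, not Holliday's protocol interface.

```python
from typing import Dict


def decide(votes: Dict[str, bool], n_sites: int, quorum: int,
           conflicts_with_committed: bool = False) -> str:
    """Return 'commit', 'abort' or 'pending' for one pre-committed transaction."""
    if conflicts_with_committed:
        return "abort"
    yes = sum(1 for vote in votes.values() if vote)
    no = len(votes) - yes
    if yes >= quorum:
        return "commit"
    if no > n_sites - quorum:  # anti-quorum: a 'yes' quorum is now unreachable
        return "abort"
    return "pending"


N_SITES, QUORUM = 5, 3  # assumed: majority quorum over five sites
d1 = decide({"A": True, "B": True}, N_SITES, QUORUM)
d2 = decide({"A": True, "B": True, "C": True}, N_SITES, QUORUM)
d3 = decide({"A": False, "B": False, "C": False}, N_SITES, QUORUM)
d4 = decide({"A": True, "B": True, "C": True}, N_SITES, QUORUM,
            conflicts_with_committed=True)
print(d1, d2, d3, d4)
```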
Trade-offs
Deterministic rule
  fast, inflexible
Partition + primary
  single point of failure
  no MustHave across partition boundaries
Consensus
  slow
  scalability
  impossibility of consensus in asynchronous systems with failure
5. Conclusions
Need for optimistic replication (OR) not going away
“Network technology improving: keep everything consistent pessimistically.”
True, but:
  Constant latency; unavailable bandwidth
  Mobile access: unbounded latency
  Increasing numbers of replicas
“Conflicts are rare.”
True, but:
  Do occur
  Very high cost
OR pros & cons
Peer-to-peer read/write sharing
OR accepts more updates:
  Performance despite latency
  Availability despite failures
Increased complexity
  Semantic information
  Not transparent
Bottleneck moved to commit
  Hard to make peer-to-peer
  Unless (unacceptable?) restrictions
Unavoidable
The end