introduction to concurrency &lamportclocks · •landmark 1978 paper by leslie lamport...
TRANSCRIPT
IntroductiontoConcurrency& Lamport Clocks
COS316:PrinciplesofComputerSystemDesignLecture14
AmitLevy&WyattLloyd
Concurrency
• Multiplethingshappeningatthesametime
• Primarybenefitisbetterperformance• Domoreworkinthesameamountoftime• Completefixedamountworkinlesstime• Betterutilizeresources
• Primarycostiscomplexity• Hardtoreasonabout• Hardtogetright• (Systemsdealwithit,notapplications,… tosomeextent)
ConcurrencyExamples
• Runacomputationonallcoresinamachine• Runacomputationonallcoreson100Kmachine!
• Allowanapplicationtohavemultipleoutstandingwritestodisk• Allowmanyapplicationstowritetodisks
• Manydevicesusewirelessbandwidth• MillionsofTCPconnectionsuseinternetbackbonelinks
DistributedSystems,What?
1)Multiplecomputers2) Connectedbyanetwork3) Doingsomethingtogether
ConcurrencyisInevitable!
• ANewYork-basedbankwantstomakeitstransactionledgerdatabaseresilienttowhole-sitefailures
• Replicate thedatabase,keeponecopyinsf,oneinnyc
Motivation:Multi-sitedatabasereplication
New YorkSan Francisco
5
• Replicatethedatabase,keeponecopyinsf,oneinnyc• Clientsendsquerytothenearestcopy• Clientsendsupdatetobothcopies
Theconsequencesofconcurrentupdates
“Deposit$100”
“Pay 1%interest”
$1,000$1,000
$1,100
$1,111
$1,010
$1,110
Inconsistent replicas!Updates should have been performed in the same order at each copy
6
RFC677“TheMaintenanceofDuplicateDatabases”(1975)• “Totheextentthatthecommunicationpathscanbemadereliable,andtheclocksusedbytheprocesseskeptclosetosynchrony,theprobabilityofseeminglystrangebehaviorcanbemadeverysmall.However,thedistributednatureofthesystemdictatesthatthisprobabilitycanneverbezero.”
Idea:Logicalclocks
• Landmark1978paperbyLeslieLamport
• Insight:onlytheeventsthemselvesmatter
8
Idea: Disregard the precise clock timeInstead, capture just a “happens before” relationship between a pair of events
• Considerthreeprocesses:P1,P2,andP3
• Notation:Eventa happensbefore eventb(aà b)
Defining“happens-before”(à)
Physical time ↓
P1 P2P3
9
• CanobserveeventorderatasingleprocessDefining“happens-before”(à)
Physical time ↓
P1 P2P3
a
b
10
1. Ifsameprocessanda occursbeforeb,thenaà b
Defining“happens-before”(à)
Physical time ↓
P1 P2P3
a
b
11
1. Ifsameprocessanda occursbeforeb,thenaà b
2. Canobserveorderingwhenprocessescommunicate
Defining“happens-before”(à)
P1 P2P3
a
bc
12
Physical time ↓
1. Ifsameprocessanda occursbeforeb,thenaà b
2. Ifc isamessagereceiptofb,thenbà c
Defining“happens-before”(à)
P1 P2P3
a
bc
13
Physical time ↓
1. Ifsameprocessanda occursbeforeb,thenaà b
2. Ifc isamessagereceiptofb,thenbà c
3. Canobserveorderingtransitively
Defining“happens-before”(à)
P1 P2P3
a
bc
14
Physical time ↓
1. Ifsameprocessanda occursbeforeb,thenaà b
2. Ifc isamessagereceiptofb,thenbà c
3. Ifaà b andbà c,thenaà c
Defining“happens-before”(à)
P1 P2P3
a
bc
15
Physical time ↓
• Notalleventsarerelatedbyà
• a,dnotrelatedbyà soconcurrent, writtenasa||d
Concurrentevents
16
P1
a
bc
P2P3
Physical time ↓
d
• WeseekaclocktimeC(a) foreveryeventa
• Clockcondition:Ifaà b,thenC(a)<C(b)
Lamportclocks:Objective
17
Plan: Tag events with clock times; use clock times to make distributed system correct
• EachprocessPi maintainsalocalclockCi
1. Beforeexecutinganevent,Ci ß Ci +1
TheLamportClockalgorithm
P1C1=0
a
bc
P2C2=0 P3
C3=0
18
Physical time ↓
1. Beforeexecutinganeventa,Ci ß Ci +1:
• SeteventtimeC(a)ß Ci
TheLamportClockalgorithm
P1C1=1
a
bc
P2C2=0 P3
C3=0C(a) = 1
19
Physical time ↓
1. Beforeexecutinganeventb,Ci ß Ci +1:
• SeteventtimeC(b)ß Ci
TheLamportClockalgorithm
P1C1=2
a
bc
P2C2=0 P3
C3=0
C(b) = 2
C(a) = 1
20
Physical time ↓
1. Beforeexecutinganeventb,Ci ß Ci +1
2. Sendthelocalclockinthemessagem
TheLamportClockalgorithm
P1C1=2
a
bc
P2C2=0 P3
C3=0
C(b) = 2
C(a) = 1
C(m) = 2
21
Physical time ↓
3. OnprocessPj receivingamessagem:
• SetCj andreceiveeventtimeC(c)ß1+max{Cj,C(m)}
TheLamportClockalgorithm
P1C1=2
a
bc
P2C2=3 P3
C3=0
C(b) = 2
C(a) = 1
C(m) = 2
C(c) = 3
22
Physical time ↓
LamportTimestamps:Orderingallevents
• Breaktiesbyappendingtheprocessnumbertoeachevent:
1. ProcessPi timestampseventewithCi(e).i
2. C(a).i <C(b).j when:• C(a)<C(b),orC(a)=C(b)andi <j
• Now,foranytwoeventsaandb,C(a)<C(b)orC(b)<C(a)• Thisiscalledatotalorderingofevents
23
Orderalltheseevents
P1C1=0
a
b
c
P2C2=0
P3C3=0
Physical time ↓
P4C3=0
d
e
f
g
h
i
• Clientsendsupdatetoonereplicasitej• ReplicaassignsitLamporttimestampCj .j
• Keyidea:Placeeventsintoasortedlocalqueue• SortedbyincreasingLamporttimestamps
Totally-OrderedMulticast
P1
%1.2
$1.1Example: P1’s
local queue:
25
Goal: All sites apply updates in (same) Lamport clock order
ß Timestamps
1. Onreceivinganupdatefromclient,broadcasttoothers(includingyourself)
2. Onreceivinganupdatefromreplica:a) Addittoyourlocalqueueb) Broadcastanacknowledgementmessagetoeveryreplica(includingyourself)
3. Onreceivinganacknowledgement:• Markcorrespondingupdateacknowledgedinyourqueue
4. Removeandprocessupdateseveryone hasack’edfromhead ofqueue
Totally-OrderedMulticast(Almostcorrect)
26
• P1queues$,P2queues%
• P1queuesandack’s %• P1marks%fullyack’ed
• P2marks% fullyack’ed
Totally-OrderedMulticast(Almostcorrect)
P1 P2$ 1.1
%1.2
$1.1
%1.2
%ack
$1.1
%1.2
%
✔ ✔✔
(Ack’sto self not shown here)27
✘ P2 processes %
1. Onreceivinganupdatefromclient,broadcasttoothers(includingyourself)
2. Onreceivingorprocessinganupdate:a) Addittoyourlocalqueue,ifreceivedupdateb) Broadcastanacknowledgementmessagetoeveryreplica(includingyourself)only
fromheadofqueue
3. Onreceivinganacknowledgement:• Markcorrespondingupdateacknowledgedinyourqueue
4. Removeandprocessupdateseveryone hasack’edfromhead ofqueue
Totally-OrderedMulticast (Correctversion)
28
29
Totally-OrderedMulticast(Correctversion)
P1 P2$ 1.1
%1.2
$1.1
%1.2
%ack
ack $
%1.2
$%
%
$
✔✔ ✔
(Ack’sto self not shown here)
$1.1
✔
• Doestotally-orderedmulticastsolvetheproblemofmulti-sitereplicationingeneral?
• Notbyalongshot!
1. Ourprotocolassumed:• Nonodefailures• Nomessageloss• Nomessagecorruption
2. Alltoallcommunicationdoesnotscale3. Waitsforeverformessagedelays(performance?)
So,arewedone?
30
• Cantotally-ordereventsinadistributedsystem:that’suseful!• WesawanapplicationofLamportclocksfortotally-orderedmulticast
• But:whilebyconstruction,aà bimpliesC(a)<C(b),• Theconverseisnotnecessarilytrue:
• C(a)<C(b)doesnotimplyaà b(possibly,a||b)
31
Take-awaypoints:Lamportclocks
Can’t use Lamport clock timestamps to infer causal relationships between events
IntrotoConcurrencyConclusion
• Concurrencyisgreatforperformance,hardtoreasonabout,andoftenunavoidableinsystems
• ReplicatedDBexample• Concurrentupdatescanleadtoinconsistencybetweenreplicas• Lamport clockscanordereventsinadistributedsystem• Lamport clocks+carefulprotocol=correctreplication
• Whatis“correct”?