9/24/16
Concurrent systems
Lecture 7: Crash recovery, lock-free programming, and transactional memory
Dr Robert N. M. Watson
Reminder from last time
• History graphs; good (and bad) schedules
• Isolation vs. strict isolation; enforcing isolation
• Two-phase locking; rollback
• Timestamp ordering (TSO)
• Optimistic concurrency control (OCC)
• Isolation and concurrency summary
This time
• Transactional durability: crash recovery and logging
  – Write-ahead logging
  – Checkpoints
  – Recovery
• Advanced topics
  – Lock-free programming
  – Transactional memory
• A few notes on supervision exercises
Crash Recovery & Logging
• Transactions require ACID properties
  – So far have focused on I (and implicitly C).
• How can we ensure Atomicity & Durability?
  – Need to make sure that a transaction is always done entirely or not at all
  – Need to make sure that a transaction reported as committed remains so, even after a crash
• Consider for now a fail-stop model:
  – If system crashes, all in-memory contents are lost
  – Data on disk, however, remains available after reboot
The small print: we must keep in mind the limitations of fail-stop, even as we assume it. Failing hardware/software do weird stuff. Pay attention to hardware price differentiation.
Using persistent storage
• Simplest “solution”: write all updated objects to disk on commit, read back on reboot
  – Doesn’t work, since crash could occur during write
  – Can fail to provide Atomicity and/or Consistency
• Instead split update into two stages
  1. Write proposed updates to a write-ahead log
  2. Write actual updates
• Crash during #1 => no actual updates done
• Crash during #2 => use log to redo, or undo
Write-ahead logging
• Log: an ordered, append-only file on disk
• Contains entries like <txid, obj, op, old, new>
  – ID of transaction, object modified, (optionally) the operation performed, the old value and the new value
  – This means we can both “roll forward” (redo operations) and “roll back” (undo operations)
• When persisting a transaction to disk:
  – First log a special entry <txid, START>
  – Next log a number of entries to describe operations
  – Finally log another special entry <txid, COMMIT>
• We build composite-operation atomicity from fundamental atomic unit: single-sector write.
  – Much like building high-level primitives over LL/SC or CAS!
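A minimal Python sketch of the log format described above may help make it concrete. It keeps the log in memory rather than on disk, omits the optional op field, and uses hypothetical names (`LogEntry`, `WriteAheadLog`, `log_transaction`, `roll_forward`, `roll_back`) that do not come from the lecture.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass(frozen=True)
class LogEntry:
    """One log record: a START/COMMIT marker or an update <txid, obj, old, new>."""
    txid: str
    kind: str                   # "START", "UPDATE", or "COMMIT"
    obj: Optional[str] = None   # object modified (UPDATE only)
    old: Any = None             # value before the update
    new: Any = None             # value after the update


class WriteAheadLog:
    """An ordered, append-only sequence of entries (held in memory here)."""

    def __init__(self) -> None:
        self.entries: List[LogEntry] = []

    def append(self, entry: LogEntry) -> None:
        self.entries.append(entry)


def log_transaction(log: WriteAheadLog, txid: str,
                    updates: List[Tuple[str, Any, Any]]) -> None:
    """Log START, then one UPDATE per (obj, old, new), then COMMIT."""
    log.append(LogEntry(txid, "START"))
    for obj, old, new in updates:
        log.append(LogEntry(txid, "UPDATE", obj, old, new))
    log.append(LogEntry(txid, "COMMIT"))


def roll_forward(store: dict, entry: LogEntry) -> None:
    """Redo an update by installing its new value."""
    store[entry.obj] = entry.new


def roll_back(store: dict, entry: LogEntry) -> None:
    """Undo an update by restoring its old value."""
    store[entry.obj] = entry.old


wal = WriteAheadLog()
log_transaction(wal, "T0", [("x", 1, 2)])
```

Because each update record carries both the old and the new value, the same entry can be applied in either direction.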
Using a write-ahead log
• When executing transactions, perform updates to objects in memory with lazy write back
  – I.e. the OS can delay disk writes to improve efficiency
• Invariant: write log records before corresponding data
• But when we wish to commit a transaction, must first synchronously flush a commit record to the log
  – Assume there is a fsync() or fdatasync() operation or similar which allows us to force data out to disk
  – Only report transaction committed when fsync() returns
• Can improve performance by delaying flush until we have a number of transactions to commit: batching
  – Hence at any point in time we have some prefix of the write-ahead log on disk, and the rest in memory
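One way the commit rule might look in practice is sketched below, assuming a line-oriented JSON log file. `DurableLog` and its methods are hypothetical; `os.fsync` is the real Python call for forcing a file's data to disk. Transactions are only reported as committed once `flush()` returns, and a single flush can cover a whole batch.

```python
import json
import os
import tempfile


class DurableLog:
    """Append-only log file; records are buffered until flush()."""

    def __init__(self, path: str) -> None:
        self.file = open(path, "a", encoding="utf-8")
        self.pending = []          # txids whose COMMIT is not yet on disk

    def append(self, record: dict) -> None:
        # Buffered write: may still be in memory after this returns.
        self.file.write(json.dumps(record) + "\n")

    def commit(self, txid: str) -> None:
        # Queue the commit record; the transaction is NOT yet durable.
        self.append({"txid": txid, "kind": "COMMIT"})
        self.pending.append(txid)

    def flush(self) -> list:
        """Force the log to disk, then report the batch as committed."""
        self.file.flush()              # user-space buffer -> OS
        os.fsync(self.file.fileno())   # OS cache -> disk
        done, self.pending = self.pending, []
        return done

    def close(self) -> None:
        self.file.close()


log_path = os.path.join(tempfile.mkdtemp(), "wal.log")
log = DurableLog(log_path)
for txid in ("T0", "T1"):
    log.append({"txid": txid, "kind": "START"})
    log.commit(txid)
committed = log.flush()   # one fsync covers both transactions
log.close()
```

Batching trades commit latency for throughput: each transaction waits a little longer, but the cost of the synchronous flush is shared.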
The Big Picture
[Figure: state of RAM and disk, with the log spanning both]
RAM
  Object values: x = 3, y = 27
  Log entries (newer), oldest first:
    T1, x, 2, 3
    T2, y, 17, 27
    T2, ABORT
    T3, START
Disk
  Object values: x = 1, y = 17, z = 42
  Log entries (older), oldest first:
    T0, START
    T0, x, 1, 2
    T0, COMMIT
    T1, START
    T2, START
    T2, z, 40, 42
• RAM acts as a cache of disk (e.g. no in-memory copy of z)
• On-disk values may be older versions of objects, or new uncommitted values as long as on-disk log describes rollback (e.g., z)
• Log conceptually infinite, and spans RAM & Disk
Checkpoints
• As described, log will get very long
  – And need to process every entry in log to recover
• Better to periodically write a checkpoint
  – Flush all current in-memory log records to disk
  – Write a special checkpoint record to log which contains a list of active transactions
  – Flush all ‘dirty’ objects (i.e. ensure object values on disk are up to date)
  – Flush location of new checkpoint record to disk
    • (Not fatal if crash during final write)
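The four checkpoint steps can be modelled with a toy in-memory store, sketched below. Everything here (`Store`, its fields, the tuple record layout) is an illustrative assumption. The "disk" is just a second dictionary and a second list, so the sketch shows the ordering of steps only, not real durability.

```python
class Store:
    """Toy model: a RAM cache, a disk image, and a log spanning both."""

    def __init__(self):
        self.ram = {}              # in-memory object values
        self.disk = {}             # on-disk object values
        self.dirty = set()         # objects changed in RAM but not on disk
        self.log_ram = []          # log records not yet flushed
        self.log_disk = []         # log records already on disk
        self.active = set()        # transactions started but not finished
        self.checkpoint_at = None  # on-disk location of latest checkpoint

    def start(self, txid):
        self.active.add(txid)
        self.log_ram.append((txid, "START"))

    def update(self, txid, obj, new):
        old = self.ram.get(obj, self.disk.get(obj))
        self.log_ram.append((txid, obj, old, new))   # log before data
        self.ram[obj] = new
        self.dirty.add(obj)

    def commit(self, txid):
        self.log_ram.append((txid, "COMMIT"))
        self.flush_log()           # commit record must reach disk first
        self.active.discard(txid)

    def flush_log(self):
        self.log_disk.extend(self.log_ram)
        self.log_ram.clear()

    def checkpoint(self):
        self.flush_log()                                # 1. flush log records
        self.log_ram.append(("CHECKPOINT", sorted(self.active)))
        self.flush_log()                                # 2. checkpoint record
        for obj in self.dirty:                          # 3. flush dirty objects
            self.disk[obj] = self.ram[obj]
        self.dirty.clear()
        self.checkpoint_at = len(self.log_disk) - 1     # 4. record its location


s = Store()
s.start("T1")
s.update("T1", "x", 1)
s.commit("T1")
s.start("T2")
s.update("T2", "y", 17)
s.checkpoint()
```

After the checkpoint, the disk holds T2's uncommitted value of y, which is safe only because the log record describing how to undo it was flushed first.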
Checkpoints and recovery
• Key benefit of a checkpoint is it lets us focus our attention on possibly affected transactions
[Figure: timeline of transactions T1 to T5 along a Time axis, with Checkpoint Time and Failure Time marked]
• T1: no action required
• T2: REDO. Active at checkpoint. Has since committed; and record in log.
• T3: UNDO. Active at checkpoint; in progress at crash.
• T4: REDO. Not active at checkpoint. But has since committed, and commit record in log.
• T5: UNDO. Not active at checkpoint, and still in progress.
Recovery algorithm
• Initialize undo list U = {set of active transactions}
• Also have redo list R, initially empty
• Walk log forward from checkpoint record:
  – If see a START record, add transaction to U
  – If see a COMMIT record, move transaction from U -> R
• When hit end of log, perform undo:
  – Walk backward and undo all records for all Tx in U
• When reach checkpoint record again, Redo:
  – Walk forward, and re-do all records for all Tx in R
• After recovery, we have effectively checkpointed
  – On-disk store is consistent, so can truncate the log
The order in which we apply undo/redo records is important to properly handling cases where multiple transactions touch the same data
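The recovery walk above can be sketched in a few lines of Python. The record layout (tuples), the function name `recover`, and the example log are all assumptions made for illustration. Two simplifications are worth flagging: ABORT records are not handled, and the backward walk is allowed to continue past the checkpoint record, since a transaction that was active at the checkpoint may have earlier updates that the checkpoint flushed to disk.

```python
def recover(log, checkpoint_index, disk):
    """Apply the undo/redo recovery walk to `disk` (a dict), in place.

    `log` is a list of tuples: (txid, "START"), (txid, "COMMIT"),
    (txid, obj, old, new) for updates, and ("CHECKPOINT", [active txids]).
    """
    _, active = log[checkpoint_index]
    undo = set(active)      # U: assumed unfinished until proven otherwise
    redo = set()            # R: committed after the checkpoint

    # Forward scan from the checkpoint to classify transactions.
    for rec in log[checkpoint_index + 1:]:
        if len(rec) == 2 and rec[1] == "START":
            undo.add(rec[0])
        elif len(rec) == 2 and rec[1] == "COMMIT":
            undo.discard(rec[0])
            redo.add(rec[0])

    # Backward walk: restore old values written by transactions in U.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] in undo:
            disk[rec[1]] = rec[2]

    # Forward walk from the checkpoint: reinstall new values for R.
    for rec in log[checkpoint_index + 1:]:
        if len(rec) == 4 and rec[0] in redo:
            disk[rec[1]] = rec[3]
    return undo, redo


# A log shaped like the T1..T5 timeline; object names are made up.
example_log = [
    ("T1", "START"), ("T1", "a", 0, 1), ("T1", "COMMIT"),
    ("T2", "START"), ("T2", "b", 0, 2),
    ("T3", "START"), ("T3", "c", 0, 3),
    ("CHECKPOINT", ["T2", "T3"]),
    ("T2", "b", 2, 20), ("T2", "COMMIT"),
    ("T4", "START"), ("T4", "d", 0, 4), ("T4", "COMMIT"),
    ("T3", "c", 3, 30),
    ("T5", "START"), ("T5", "e", 0, 5),
]
# Disk as found after the crash: some later writes made it, some did not.
disk_image = {"a": 1, "b": 2, "c": 30, "d": 0, "e": 5}
undo_set, redo_set = recover(example_log, 7, disk_image)
```

Undoing in reverse log order and redoing in forward log order is what keeps the result correct when several transactions have written the same object.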
Write-ahead logging: assumptions
• What can go wrong writing commits to disk?
• Even if sector writes are atomic:
  – All affected objects may not fit in a single sector
  – Large objects may span multiple sectors
  – Trend towards copy-on-write, rather than journaled, FSes
  – Many of the problems seen with in-memory commit (ordering and atomicity) apply to disks as well!
• Contemporary disks may not be entirely honest about sector size and atomicity
  – E.g., unstable write caches to improve efficiency
  – E.g., larger or smaller sector sizes than advertised
  – E.g., non-atomicity when writing to mirrored disks
• These assume fail-stop, which is not true for some media
Transactions: summary
• Standard mutual exclusion techniques not great for dealing with >1 object
  – intricate locking (& lock order) required, or
  – single coarse-grained lock, limiting concurrency
• Transactions allow us a better way:
  – potentially many operations (reads and updates) on many objects, but should execute as if atomically
  – underlying system deals with providing isolation, allowing safe concurrency, and even fault tolerance!
• Transactions used in databases + filesystems
Advanced Topics
• Will briefly look at two advanced topics
  – lock-free data structures, and
  – transactional memory
• Then, next time, onto a case study
Lock-free programming
• What’s wrong with locks?
  – Difficult to get right (if locks are fine-grained)
  – Don’t scale well (if locks too coarse-grained)
  – Don’t compose well (deadlock!)
  – Poor cache behavior (e.g. convoying)
  – Priority inversion
  – And can be expensive
• Lock-free programming involves getting rid of locks ... but not at the cost of safety!
• Recall TAS, CAS, LL/SC from our first lecture: what if we used them to implement something other than locks?
Assumptions
• We have a shared memory system
• Low-level (assembly instructions) include:
    val = read(addr);            // atomic read from memory
    (void) write(addr, val);     // atomic write to memory
    done = CAS(addr, old, new);  // atomic compare-and-swap
• Compare-and-Swap (CAS) is atomic
  – reads value of addr (‘val’), compares with ‘old’, and updates memory to ‘new’ iff old==val, without interruption!
  – something like this instruction common on most modern processors (e.g. cmpxchg on x86, or LL/SC on RISC)
• Typically used to build spinlocks (or mutexes, or semaphores, or whatever...)
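To show how the three primitives combine into a retry loop, here is a small Python model. Python does not expose a hardware compare-and-swap, so the hypothetical `AtomicCell` class emulates one with a private lock. It models the semantics of CAS only and is not itself lock-free. The `increment` helper is likewise an illustrative name.

```python
import threading


class AtomicCell:
    """Models one memory word supporting read, write and CAS.

    A private lock stands in for the atomicity that the hardware
    instruction would provide.
    """

    def __init__(self, value):
        self._value = value
        self._guard = threading.Lock()

    def read(self):
        with self._guard:
            return self._value

    def write(self, value):
        with self._guard:
            self._value = value

    def cas(self, old, new):
        """Set to `new` iff the current value equals `old`; report success."""
        with self._guard:
            if self._value == old:
                self._value = new
                return True
            return False


def increment(cell):
    """Retry loop: re-read and try again whenever the CAS loses a race."""
    while True:
        seen = cell.read()
        if cell.cas(seen, seen + 1):
            return seen + 1


def worker(cell, times):
    for _ in range(times):
        increment(cell)


counter = AtomicCell(0)
threads = [threading.Thread(target=worker, args=(counter, 1000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

No caller ever holds a lock across the read and the update. A thread that loses the race simply observes a failed CAS and retries with a fresh value.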
Lock-free approach
• Directly use CAS to update shared data
• As an example consider a lock-free linked list of integer values
  – list is singly linked, and sorted
  – Use CAS to update pointers
  – Handle CAS failure cases (i.e., races)
• Represents the ‘set’ abstract data type, i.e.
  – find(int) -> bool
  – insert(int) -> bool
  – delete(int) -> bool
• Assumption: hardware supports atomic operations on pointer-size types
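A rough Python sketch of find and insert on such a list follows. As an assumption of this sketch, the CAS on a pointer is emulated by the hypothetical `AtomicRef` class using a private lock, because Python has no hardware CAS. The class names and the use of infinite sentinel keys for H and T are illustrative choices. delete is left out, since removing nodes safely needs additional machinery that this sketch does not attempt.

```python
import threading


class AtomicRef:
    """A pointer-sized cell with CAS, emulated with a private lock."""

    def __init__(self, target=None):
        self._target = target
        self._guard = threading.Lock()

    def get(self):
        with self._guard:
            return self._target

    def cas(self, old, new):
        with self._guard:
            if self._target is old:     # pointer (identity) comparison
                self._target = new
                return True
            return False


class Node:
    def __init__(self, key, nxt=None):
        self.key = key
        self.next = AtomicRef(nxt)


class SortedSet:
    """Sorted singly linked list with head (H) and tail (T) sentinels."""

    def __init__(self):
        self.tail = Node(float("inf"))
        self.head = Node(float("-inf"), self.tail)

    def _search(self, key):
        """Return (prev, curr) with prev.key < key <= curr.key."""
        prev = self.head
        curr = prev.next.get()
        while curr.key < key:
            prev, curr = curr, curr.next.get()
        return prev, curr

    def find(self, key):
        _, curr = self._search(key)
        return curr.key == key

    def insert(self, key):
        while True:
            prev, curr = self._search(key)
            if curr.key == key:
                return False             # already present
            node = Node(key, curr)
            if prev.next.cas(curr, node):
                return True              # pointer swung: prev -> node
            # CAS failed: another thread changed prev.next; search again.

    def keys(self):
        out, curr = [], self.head.next.get()
        while curr is not self.tail:
            out.append(curr.key)
            curr = curr.next.get()
        return out


s = SortedSet()
s.insert(10)
s.insert(30)
racers = [threading.Thread(target=s.insert, args=(k,)) for k in (20, 25)]
for t in racers:
    t.start()
for t in racers:
    t.join()
```

If both racing inserts read the same predecessor, only one CAS on its next pointer can succeed. The loser repeats the search and then links in behind or ahead of the winner as the sort order requires.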
Searching a sorted list
• find(20):
[Figure: list H -> 10 -> 30 -> T; the search for 20 reaches 30 without finding it]
find(20) -> false
Non-blocking data structures and transactional memory
Inserting an item with CAS
• insert(20):
[Figure: list H -> 10 -> 30 -> T; a new node 20 is prepared pointing at 30, then a CAS on the next pointer of 10 (30 → 20) succeeds ✓]
insert(20) -> true
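Inserting an item with CAS
• insert(20):
• insert(25):
[Figure: list H -> 10 -> 30 -> T; two threads race to insert after 10. insert(20) attempts CAS 30 → 20 and insert(25) attempts CAS 30 → 25 on the same next pointer; only one CAS succeeds (✓) and the other fails (✗)]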
Concurrent find+insert
• find(20) -> false
• insert(20) -> true
[Figure: list H -> 10 -> 30 -> T, with node 20 being inserted between 10 and 30 while find(20) is searching for 20]
Concurrent find+insert
• find(20) -> false
• insert(20) -> true
[Figure: list H -> 10 -> 30 -> T, with node 20 inserted between 10 and 30. Annotation on find(20): "This thread saw 20 was not in the set..." Annotation on insert(20): "...but this thread succeeded in putting it in!"]
• Is this a correct implementation of a set?
• Should the programmer be surprised if this happens?
• What about more complicated mixes of operations?
Linearisability
• As with transactions, we return to a conceptual model to define correctness
  – a lock-free data structure is ‘correct’ if all changes (and return values) are consistent with some serial view: we call this a linearisable schedule
• Hence in the previous example, we were ok:
  – can just deem the find() to have occurred first
• Gets a lot more complicated for more complicated data structures & operations!
• NB: On current hardware, synchronisation does more than just provide atomicity
  – Also provides ordering: “happens-before”
  – Lock-free structures must take this into account as well
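For very small histories, the "consistent with some serial view" condition can be checked by brute force, as in the Python sketch below. The history encoding (name, argument, result, start time, end time), the function names, and the timestamps are all assumptions made for illustration. The search is exponential in the number of operations, so this is a way to test understanding rather than a practical tool.

```python
from itertools import permutations


def apply(state, name, arg):
    """Sequential 'set' semantics; returns (result, new_state)."""
    if name == "find":
        return arg in state, state
    if name == "insert":
        return arg not in state, state | {arg}
    if name == "delete":
        return arg in state, state - {arg}
    raise ValueError(name)


def linearisable(history, initial=frozenset()):
    """history: list of (name, arg, result, start, end) tuples."""
    n = len(history)
    for order in permutations(range(n)):
        pos = {op: i for i, op in enumerate(order)}
        # Respect real time: an operation that finished before another
        # began must also come first in the serial order.
        respects_time = all(
            pos[a] < pos[b]
            for a in range(n) for b in range(n)
            if history[a][4] < history[b][3]
        )
        if not respects_time:
            continue
        state = initial
        for i in order:
            name, arg, result, _, _ = history[i]
            got, state = apply(state, name, arg)
            if got != result:
                break            # this serial order contradicts a result
        else:
            return True          # every recorded result was reproduced
    return False


start_set = frozenset({10, 30})
# find(20) overlaps insert(20) in time, as in the earlier example.
overlapping = [("find", 20, False, 0, 3), ("insert", 20, True, 1, 2)]
# Here insert(20) has fully completed before find(20) starts.
sequential = [("insert", 20, True, 0, 1), ("find", 20, False, 2, 3)]
ok_overlapping = linearisable(overlapping, start_set)
ok_sequential = linearisable(sequential, start_set)
```

The overlapping history is accepted because find() can be deemed to have occurred first. The second history is rejected, since no serial order that respects real time lets find(20) return false after a completed insert(20).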
Transactional Memory (TM)
• Steal idea from databases!
• Instead of:
    lock(&mylock);
    shared[i] *= shared[j] + 17;
    unlock(&mylock);
• Use:
    atomic {
        shared[i] *= shared[j] + 17;
    }
• Has “obvious” semantics, i.e. all operations within block occur as if atomically
• Transactional since under the hood it looks like:
    do {
        txid = tx_begin(&thd);
        shared[i] *= shared[j] + 17;
    } while (!tx_commit(txid));
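The begin/commit retry loop can be imitated with a toy software transactional memory in Python, sketched below under several assumptions. `TMemory` buffers writes, records the version of everything it reads, and validates those versions at commit time in the style of OCC. A small internal lock serialises reads and commits. The method names mirror the pseudocode above but the signatures are invented for this sketch.

```python
import threading


class TMemory:
    """Toy transactional memory over a list, validated at commit (OCC)."""

    def __init__(self, values):
        self.values = list(values)
        self.versions = [0] * len(self.values)
        self._commit_lock = threading.Lock()   # guards reads and commits

    def tx_begin(self):
        return {"reads": {}, "writes": {}}

    def read(self, tx, i):
        if i in tx["writes"]:
            return tx["writes"][i]             # read our own buffered write
        with self._commit_lock:
            value, version = self.values[i], self.versions[i]
        tx["reads"].setdefault(i, version)     # remember first version seen
        return value

    def write(self, tx, i, value):
        tx["writes"][i] = value                # buffered until commit

    def tx_commit(self, tx):
        with self._commit_lock:
            # Validate: fail if anything we read has changed since.
            for i, version in tx["reads"].items():
                if self.versions[i] != version:
                    return False
            for i, value in tx["writes"].items():
                self.values[i] = value
                self.versions[i] += 1
            return True


def multiply_add(tm, i, j):
    """Equivalent of: atomic { shared[i] *= shared[j] + 17; }"""
    while True:
        tx = tm.tx_begin()
        tm.write(tx, i, tm.read(tx, i) * (tm.read(tx, j) + 17))
        if tm.tx_commit(tx):
            return


shared = TMemory([2, 3])
multiply_add(shared, 0, 1)
```

A transaction whose reads have been overtaken by another commit fails validation and leaves memory untouched, which is why the caller loops until `tx_commit` succeeds.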
TM advantages
• Simplicity:
  – Programmer just puts atomic { } around anything he/she wants to occur in isolation
• Composability:
  – Unlike locks, atomic { } blocks nest, e.g.:
    credit(a, x) = atomic {
        setbal(a, readbal(a) + x);
    }
    debit(a, x) = atomic {
        setbal(a, readbal(a) - x);
    }
    transfer(a, b, x) = atomic {
        debit(a, x);
        credit(b, x);
    }
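The nesting behaviour can be demonstrated in Python with a deliberately simple stand-in for atomic { }. This sketch assumes a single global re-entrant lock, which gives the "as if atomically" semantics and lets blocks nest, but has none of the concurrency of a real optimistic TM implementation. The `atomic` context manager, the `balances` dictionary and `total` are illustrative names. The account functions follow the pseudocode above.

```python
import threading
from contextlib import contextmanager

_global = threading.RLock()   # re-entrant, so atomic blocks can nest


@contextmanager
def atomic():
    """Stand-in for atomic { }: one coarse re-entrant lock."""
    with _global:
        yield


balances = {"a": 100, "b": 0}


def readbal(acct):
    return balances[acct]


def setbal(acct, value):
    balances[acct] = value


def credit(acct, x):
    with atomic():
        setbal(acct, readbal(acct) + x)


def debit(acct, x):
    with atomic():
        setbal(acct, readbal(acct) - x)


def transfer(src, dst, x):
    with atomic():            # outer block encloses two nested blocks
        debit(src, x)
        credit(dst, x)


def total():
    with atomic():
        return readbal("a") + readbal("b")


def worker():
    for _ in range(200):
        transfer("a", "b", 1)
        transfer("b", "a", 1)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The point of the example is that transfer is built from debit and credit without either of them knowing about the other, and no observer running total() inside its own atomic block can see money in flight.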
TM advantages
• Cannot deadlock:
  – No locks, so don’t have to worry about locking order
  – (Though may get livelock if not careful)
• No races (mostly):
  – Cannot forget to take a lock (although you can forget to put atomic { } around your critical section ;-))
• Scalability:
  – High performance possible via OCC
  – No need to worry about complex fine-grained locking
• There is still a simplicity vs. performance tradeoff
  – Too much atomic { } and implementation can’t find concurrency. Too little, and race conditions.
TM is very promising…
• Essentially does ‘ACI’ but no D
  – no need to worry about crash recovery
  – can work entirely in memory
  – some hardware support emerging (or promised)
• But not a panacea
  – Contention management can get ugly
  – Difficulties with irrevocable actions (e.g. IO)
  – Still working out exact semantics (type of atomicity, handling exceptions, signaling, ...)
• Recent x86 hardware has started to provide direct support for transactions; not widely used
  – … And promptly withdrawn in errata
  – Now back on the street again, but very new
Supervision questions + exercises
• Supervision questions
  – S1: Threads and synchronisation
    • Semaphores, priorities, and work distribution
  – S2: Transactions
    • ACID properties, 2PL, TSO, and OCC
  – Other C&DS topics also important, of course!
• Optional Java practical exercises
  – Java concurrency primitives and fundamentals
  – Threads, synchronisation, guarded blocks, producer-consumer, and data races
Concurrent systems: summary
• Concurrency is essential in modern systems
  – overlapping I/O with computation
  – exploiting multi-core
  – building distributed systems
• But throws up a lot of challenges
  – need to ensure safety, allow synchronization, and avoid issues of liveness (deadlock, livelock, ...)
• Major risk of over-engineering
  – generally worth building sequential system first
  – and worth using existing libraries, tools and design patterns rather than rolling your own!
Summary + next time
• Transactional durability: crash recovery and logging
  – Write-ahead logging; checkpoints; recovery
• Advanced topics
  – Lock-free programming
  – Transactional memory
• Notes on supervision exercises
• Next time:
  – Concurrent system case study: the FreeBSD kernel
  – Brief history of kernel concurrency
  – Primitives and debugging tools
  – Applications to the network stack