MapReduce: Simplified Data Processing on Large Clusters
(Without the Agonizing Pain)
Presented by Aaron Nathan (Cornell)
The Problem
- Massive amounts of data: >100 TB (the internet); needs simple processing
- Computers aren't perfect: slow, unreliable, misconfigured
- Requires complex (i.e. bug-prone) code
MapReduce to the Rescue!
- Common functional programming model
- Map step:
    map(in_key, in_value) -> list(out_key, intermediate_value)
    Split a problem into a lot of smaller subproblems
- Reduce step:
    reduce(out_key, list(intermediate_value)) -> list(out_value)
    Combine the outputs of the subproblems to give the original problem's answer
- Each function is independent: highly parallelizable
Algorithm Picture
[Diagram: DATA is split across several MAP tasks, each emitting intermediate key/value pairs (K1:v, K2:v, K3:v); an aggregator groups values by key (K1:v,v,v; K2:v,v,v; K3:v,v,v); REDUCE tasks then combine each key's list into the final Answer.]
Some Example Code

map(String input_key, String input_value):
    // input_key: document name
    // input_value: document contents
    for each word w in input_value:
        EmitIntermediate(w, "1");

reduce(String output_key, Iterator intermediate_values):
    // output_key: a word
    // output_values: a list of counts
    int result = 0;
    for each v in intermediate_values:
        result += ParseInt(v);
    Emit(AsString(result));
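The word-count pseudocode above can be sketched as a runnable single-process simulation. The names `map_fn`, `shuffle`, and `reduce_fn` are hypothetical stand-ins for the library's EmitIntermediate/Emit machinery and the framework's hidden grouping step:

```python
from collections import defaultdict

def map_fn(doc_name, doc_contents):
    # Emit ("word", "1") for every word, as in the slide's map().
    return [(w, "1") for w in doc_contents.split()]

def shuffle(intermediate):
    # Group all values by intermediate key (done by the framework).
    grouped = defaultdict(list)
    for key, value in intermediate:
        grouped[key].append(value)
    return grouped

def reduce_fn(word, counts):
    # Sum the "1" strings, as in the slide's reduce().
    return str(sum(int(c) for c in counts))

docs = {"d1": "the quick fox", "d2": "the lazy dog the"}
intermediate = [kv for name, text in docs.items() for kv in map_fn(name, text)]
word_counts = {k: reduce_fn(k, vs) for k, vs in shuffle(intermediate).items()}
print(word_counts)  # "the" appears three times across both documents
```

Each call to `map_fn` and `reduce_fn` is independent, which is what makes the real system parallelizable.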
Some Example Applications
- Distributed Grep
- URL Access Frequency Counter
- Reverse Web-Link Graph
- Term Vector per Host
- Distributed Sort
- Inverted Index
The Implementation
- Google clusters: 100s-1000s of dual-core x86 commodity machines
- Commodity networking (100 Mbps / 1 Gbps)
- GFS
- Google job scheduler
- Library linked in C++
Execution
The Master
- Maintains the state and identity of all workers
- Manages intermediate values
- Receives signals from Map workers upon completion
- Broadcasts signals to Reduce workers as they work
- Can retask completed Map workers as Reduce workers
In Case of Failure
- Periodic pings from Master -> Workers
- On failure, the Master resets the state of the dead worker's assigned tasks
- The simple system proves resilient: works in the case of 80 simultaneous machine failures!
- Master failure is unhandled
- Worker failure doesn't affect output (output is identical whether a failure occurs or not)
- Each map writes to local disk only; if a mapper is lost, the data is just reprocessed
- Nondeterministic map functions aren't guaranteed identical output
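The re-execution idea above can be sketched in a few lines. This is a simplified model, not the actual master protocol; the worker-naming scheme and `run_with_retries` helper are hypothetical:

```python
def run_with_retries(task_id, execute, is_alive, max_attempts=5):
    # Sketch of the master's failure handling: if the worker assigned to a
    # map task dies, the task's state is reset and the task is simply
    # re-run on another worker. With a deterministic map function the
    # output is identical whether or not a failure occurred.
    for attempt in range(max_attempts):
        worker = f"worker-{attempt}"  # hypothetical worker naming
        if not is_alive(worker):
            continue                  # reset task state, reassign elsewhere
        return execute(task_id)
    raise RuntimeError(f"task {task_id} failed on all workers")

# Simulate: the first two workers are dead, the third succeeds.
dead = {"worker-0", "worker-1"}
result = run_with_retries("map-7", execute=lambda t: f"{t}:done",
                          is_alive=lambda w: w not in dead)
print(result)  # map-7:done
```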
PreservingBandwidth
MachinesareinrackswithsmallinterconnectsUselocaKoninformaKonfromGFSAhemptstoputtasksforworkersandinputslicesonthesamerack
UsuallyresultsinLOCALreads!
Backup Execution Tasks
- What if one machine is slow? It can delay the completion of the entire MR operation!
- Answer: backup (redundant) executions; whoever finishes first completes the task!
- Enabled toward the end of processing
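The straggler argument can be made concrete with a toy model, assuming (hypothetically) that a task is a fixed amount of work divided by machine speed:

```python
def finish_time(primary_speed, backup_speed, work=100.0):
    # Speculative-execution sketch: near the end of the job the master
    # launches a redundant copy of each in-flight task; whichever copy
    # finishes first wins, so a straggler cannot delay the whole job.
    primary = work / primary_speed
    backup = work / backup_speed
    return min(primary, backup)

# A straggler at 1 unit/sec would take 100 sec alone; with a backup copy
# on a healthy 10 units/sec machine the task completes in 10 sec.
print(finish_time(primary_speed=1.0, backup_speed=10.0))  # 10.0
```

The cost is a small amount of duplicated work, which is why backups are only enabled toward the end of processing.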
Partitioning
- M = number of Map tasks (the number of input splits)
- R = number of Reduce tasks (the number of intermediate key splits)
- W = number of worker computers
In general:
- M = sizeof(Input) / 64 MB
- R = W * n (where n is a small number)
Typical scenario: Input size = 12 TB, M = 200,000, R = 5,000, W = 2,000
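Plugging the typical scenario into the rules above checks out; the value of n below is an assumption chosen to reproduce the slide's R, and 64 MB matches the GFS chunk size:

```python
TB = 1024**4
MB = 1024**2

input_size = 12 * TB
W = 2000                     # worker machines

M = input_size // (64 * MB)  # one map task per 64 MB input split
n = 2.5                      # "a small number" (assumed, to match R = 5000)
R = int(W * n)

print(M, R)  # 196608 (~200,000) map tasks, 5000 reduce tasks
```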
CustomParKKoning
DefaultParKKonedonintermediatekeyHash(intermediate_key)modR
Whatifuserhasaprioriknowledgeaboutthekey?AllowforuserdefinedhashingfuncKonEx.Hash(Hostname(url_key))
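A minimal sketch of both partitioners, assuming `zlib.crc32` as a deterministic stand-in for the (unspecified) hash function:

```python
from urllib.parse import urlparse
import zlib

R = 8  # number of reduce tasks

def default_partition(key):
    # Default scheme: Hash(intermediate_key) mod R.
    return zlib.crc32(key.encode()) % R

def host_partition(url_key):
    # User-defined scheme: Hash(Hostname(url_key)) mod R, so all pages
    # from one site land in the same reduce partition (and thus the same
    # output file).
    return zlib.crc32(urlparse(url_key).hostname.encode()) % R

a = host_partition("http://example.com/page1")
b = host_partition("http://example.com/page2")
print(a == b)  # True: same host -> same partition
```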
TheCombiner
IfreducerisassociaKveandcommuniviKve (2+5)+4=11or2+(5+4)=11 (15+x)+2=2+(15+x)
RepeatedintermediatekeyscanbemergedSavesnetworkbandwidthEssenKallylikealocalreducetask
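A sketch of that local merge, using word counts (the `combine` helper is hypothetical). Because addition is associative and commutative, partial sums computed on the map worker don't change the final result:

```python
from collections import Counter

def combine(map_output):
    # Combiner sketch: merge repeated intermediate keys on the map worker
    # before anything crosses the network.
    merged = Counter()
    for key, count in map_output:
        merged[key] += count
    return list(merged.items())

map_output = [("the", 1), ("quick", 1), ("the", 1), ("the", 1)]
combined = combine(map_output)
print(combined)  # 4 records shrink to 2: [('the', 3), ('quick', 1)]
```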
I/OAbstracKons
HowtogetiniKalkeyvaluepairstomap?DefineaninputformatMakesuresplitsoccurinreasonableplacesEx:Text
Eachlineisakey/pair CancomefromGFS,bigTable,oranywherereally!
Outputworksanalogously
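"Splits in reasonable places" for a text input format means cutting only at line boundaries, so no record straddles two map tasks. A minimal sketch (the `line_splits` helper is hypothetical):

```python
def line_splits(data: bytes, target_split_size: int):
    # Cut the byte stream into roughly equal splits, but extend each
    # split to the next newline so every line stays within one split.
    splits, start = [], 0
    while start < len(data):
        end = min(start + target_split_size, len(data))
        while end < len(data) and data[end - 1:end] != b"\n":
            end += 1  # advance to the next record boundary
        splits.append(data[start:end])
        start = end
    return splits

data = b"line one\nline two\nline three\n"
splits = line_splits(data, target_split_size=10)
print([s.decode() for s in splits])
```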
SkippingBadRecords
Whatifausermakesamistakeinmap/reduce Andonlyapparentonfewjobs..
WorkersendsmessagetoMaster Skiprecordon>1workerfailureandtellotherstoignorethisrecord
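The master's side of this protocol fits in a few lines. This is a simplified model, not the real interface; the `Master` class and method names are hypothetical:

```python
from collections import defaultdict

class Master:
    # Workers report which record they were processing when the user's
    # map/reduce crashed; after a record fails on more than one worker,
    # the master tells everyone to skip it.
    def __init__(self):
        self.failures = defaultdict(int)

    def report_failure(self, record_id):
        self.failures[record_id] += 1

    def should_skip(self, record_id):
        return self.failures[record_id] > 1

master = Master()
master.report_failure("record-42")      # first crash: just retry the record
print(master.should_skip("record-42"))  # False
master.report_failure("record-42")      # second crash, on a different worker
print(master.should_skip("record-42"))  # True
```

Requiring more than one failure distinguishes a genuinely bad record from a transient worker problem.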
Removing Unnecessary Development Pain
- Local MapReduce implementation that runs on a development machine
- The Master has an HTTP page with the status of the entire operation; it shows bad records
- Provides a counter facility: the Master aggregates counts and displays them on its HTTP page
A Look at the UI (in 2004)
http://labs.google.com/papers/mapreduceosdi04slides/indexauto0013.html
Performance Benchmarks
- Sorting AND Searching

Search (Grep)
- Scan through 10^10 100-byte records (1 TB)
- M = 15,000, R = 1
- Startup time: GFS localization, program propagation
- Peak > 30 GB/sec!
Sort
- 50 lines of code
- Map -> key + text line
- Reduce -> identity
- M = 15,000, R = 4,000
- Partition on initial bytes of intermediate key
- Sorts in 891 sec!
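Partitioning the sort on the initial bytes of the key makes each reduce partition a contiguous key range, so sorting within each partition and concatenating the R output files yields globally sorted output. A sketch using only the first byte (the helper name is hypothetical):

```python
R = 4

def byte_range_partition(key: bytes):
    # Range partitioning on the first byte: every key in partition i
    # sorts before every key in partition i+1.
    return key[0] * R // 256

keys = [b"zebra", b"apple", b"mango", b"tango"]
# Sorting by (partition, key) is equivalent to a plain lexicographic sort,
# which is why per-partition sorted files concatenate to a global sort.
parts = sorted(keys, key=lambda k: (byte_range_partition(k), k))
print(parts == sorted(keys))  # True
```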
What about Backup Tasks?
- And wait, it's useful!
- NB: August 2004
OpenSourceImplementaKon
Hadoop hhp://hadoop.apache.org/core/
ReliesonHDFSAllinterfaceslookalmostexactlylikeMapReducepaper
Thereisevenatalkaboutittoday!4:15B17CSColloquium:MikeCafarella(Uwash)
Active Disks for Large-Scale Data Processing
The Concept
- Use aggregate processing power
- Networked disks allow for higher throughput
- Why not move part of the application onto the disk device? Reduce data traffic; increase parallelism further
- Shrinking support hardware
Example Applications
- Media database: find similar media data by fingerprint
- Real-time applications: collect multiple sensor data quickly
- Data mining: POS analysis required ad hoc database queries
Approach
Leveragetheparallelismavailableinsystemswithmanydisks
Operatewithasmallamountofstate,processingdataasitstreamsoffthedisk
ExecuterelaKvelyfewinstrucKonsperbyteofdata
Results: Nearest-Neighbor Search
- Problem: determine the k items closest to a particular item in a database
- Perform the comparisons on the drive
- Each disk returns its closest matches; the server does the final merge
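The disk-side scan and server-side merge can be sketched as two small steps; the function names and the use of absolute difference as the distance metric are illustrative assumptions:

```python
import heapq

def disk_local_top_k(items, query, k):
    # On-drive step: each active disk scans only its own items and
    # returns just its k closest matches, not the full data.
    return heapq.nsmallest(k, items, key=lambda x: abs(x - query))

def server_merge(per_disk_results, query, k):
    # Host step: merge the small candidate lists and keep the global top k.
    candidates = [x for results in per_disk_results for x in results]
    return heapq.nsmallest(k, candidates, key=lambda x: abs(x - query))

disks = [[1, 50, 90], [49, 200, 7], [53, 48, 300]]
query, k = 50, 3
nearest = server_merge([disk_local_top_k(d, query, k) for d in disks], query, k)
print(nearest)  # [50, 49, 48]
```

Only k candidates per disk cross the interconnect, which is the bandwidth argument for doing the comparisons on the drive.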
Media Mining Example
- Perform low-level image tasks on the disk!
- Edge detection performed on disk; sent to the server as an edge image
- The server does higher-level processing
Why not just use a bunch of PCs?
- The performance increase is similar; in fact, the paper essentially used this setup to benchmark its results!
- Supposedly this could be cheaper, but the paper doesn't really give a good argument for this
- Possibly reduced bandwidth on the disk I/O channel, but who cares?
Some Questions
- What could a disk possibly do better than the host processor?
- What added cost is associated with this mediocre processor on the HDD?
- Are new dependencies introduced on hardware and software?
- Perhaps there are other (better) places to do this type of local parallel processing?
- Maybe in 2001 this made more sense?