Google’s MapReduce
Connor Poske, Florida State University


Post on 16-Jan-2016


TRANSCRIPT

  • Google’s MapReduce
    Connor Poske, Florida State University

  • Outline

    Part I: History; MapReduce architecture and features; how it works
    Part II: The MapReduce programming model and an example

  • Initial History

    There is a demand for large-scale data processing. The folks at Google have discovered certain common themes for processing very large input sizes:
    - Multiple machines are needed
    - There are usually two basic operations on the input data:
      1) Map
      2) Reduce

  • Map

    Similar to the Lisp primitive: apply a single function to multiple inputs.

    In the MapReduce model, the map function applies an operation to a list of pairs of the form (input_key, input_value), and produces a set of INTERMEDIATE key/value tuples.

    Map(input_key, input_value) -> (output_key, intermediate_value) list

  • Reduce

    Accepts the set of intermediate key/value tuples as input and applies a reduce operation to all values that share the same key.

    Reduce(output_key, intermediate_value list) -> output list

  • Quick example

    The pseudo-code below counts the number of occurrences of each word in a large collection of documents.

    Map(String fileName, String fileContents)
      // fileName is the input key, fileContents is the input value
      for each word w in fileContents
        EmitIntermediate(w, 1)

    Reduce(String word, Iterator values)
      // word: the input key; values: a list of counts
      int count = 0
      for each v in values
        count += v
      Emit(AsString(count))
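    The pseudo-code above can be exercised as ordinary C++ by simulating the library's group-by-key (shuffle) step in memory. This is an illustrative sketch, not the real MapReduce API; `WordCountMap`, `WordCountReduce`, and `CountWords` are made-up names.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Map: emit an intermediate (word, 1) pair for every word in the file
// contents. (Stands in for the slides' EmitIntermediate call.)
std::vector<std::pair<std::string, int>> WordCountMap(
    const std::string& file_name, const std::string& file_contents) {
  (void)file_name;  // the input key is unused for word counting
  std::vector<std::pair<std::string, int>> intermediate;
  std::istringstream stream(file_contents);
  std::string word;
  while (stream >> word) intermediate.emplace_back(word, 1);
  return intermediate;
}

// Reduce: sum the list of counts emitted for one word.
int WordCountReduce(const std::string& word, const std::vector<int>& values) {
  (void)word;
  int count = 0;
  for (int v : values) count += v;
  return count;
}

// Driver: stands in for the library's shuffle/group-by-key step.
std::map<std::string, int> CountWords(
    const std::vector<std::pair<std::string, std::string>>& files) {
  std::map<std::string, std::vector<int>> groups;
  for (const auto& [name, contents] : files)
    for (const auto& [word, one] : WordCountMap(name, contents))
      groups[word].push_back(one);
  std::map<std::string, int> result;
  for (const auto& [word, values] : groups)
    result[word] = WordCountReduce(word, values);
  return result;
}
```

    Calling `CountWords({{"a.txt", "to be or not to be"}})` groups the emitted pairs by word and reduces each group to a count.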

  • The idea sounds good, but...

    We can't forget about the problems arising from large-scale, multiple-machine data processing:
    How do we parallelize everything?
    How do we balance the input load?
    How do we handle failures?

    Enter the MapReduce model

  • MapReduce

    The MapReduce implementation is an abstraction that hides these complexities from the programmer. The user defines the Map and Reduce functions; the MapReduce implementation automatically distributes the data, then applies the user-defined functions to it. The actual code is slightly more complex than the previous example.

  • MapReduce Architecture

    A user program with Map and Reduce functions runs on a cluster of ordinary PCs. Upon execution, the cluster is divided into:
    - One master worker
    - Map workers
    - Reduce workers

  • Execution Overview

    Split up the input data and start the program on all machines.
    The master machine assigns M Map tasks and R Reduce tasks to idle worker machines.
    The map function is executed and its results are buffered locally.
    Periodically, data in local memory is written to disk, and the on-disk locations of the data are forwarded to the master.

    --Map phase complete

    A Reduce worker uses RPCs to read the intermediate data from the Map machines; the data is sorted by key.
    The Reduce worker iterates over the data and passes each unique key, along with its associated values, to the reduce function.
    The master wakes up the user program, and the MapReduce call returns.
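    Which of the R Reduce tasks receives a given intermediate key is decided by a partitioning function; the default described in the MapReduce paper is hash(key) mod R. A minimal sketch, using std::hash as a stand-in for the real hash:

```cpp
#include <cassert>
#include <functional>
#include <string>

// Partitioning function: routes an intermediate key to one of the
// R reduce tasks. All values for the same key land on the same
// reduce worker, which is what makes the grouping step possible.
int PartitionForKey(const std::string& key, int num_reduce_tasks) {
  return static_cast<int>(std::hash<std::string>{}(key) %
                          static_cast<size_t>(num_reduce_tasks));
}
```

    Because the function is deterministic, every Map worker independently routes a given key to the same Reduce task without any coordination.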


  • Master worker

    Stores state information about Map and Reduce tasks: idle, in-progress, or completed.
    Stores the locations and sizes on disk of the intermediate file regions on the Map machines, and pushes this information incrementally to workers with in-progress Reduce tasks.
    Displays the status of the entire operation via an internal HTTP server: progress, bytes of intermediate data, bytes of output, processing rates, etc.

  • Parallelization

    Map() runs in parallel, creating different intermediate output from different input keys and values.
    Reduce() runs in parallel, each instance working on a different key.
    All data is processed independently by different worker machines.
    The Reduce phase cannot begin until the Map phase is completely finished!
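    Because each map task touches only its own input split, the map phase can be sketched with plain threads; the join at the end is the barrier that enforces "Reduce cannot start until Map is finished". A toy example, where the "map" work is just computing string lengths:

```cpp
#include <cassert>
#include <string>
#include <thread>
#include <vector>

// Each map task works on its own split and writes to its own output
// slot, so no locking is needed; join() is the barrier before the
// reduce phase would begin.
std::vector<int> ParallelMapLengths(const std::vector<std::string>& splits) {
  std::vector<int> results(splits.size());
  std::vector<std::thread> workers;
  for (size_t i = 0; i < splits.size(); ++i)
    workers.emplace_back([&results, &splits, i] {
      results[i] = static_cast<int>(splits[i].size());  // the "map" work
    });
  for (auto& w : workers) w.join();  // map phase complete
  return results;
}
```

    The real library farms splits out to machines rather than threads, but the independence argument is the same: no map task reads another's output.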

  • Load Balancing

    The user defines a MapReduce spec object:

    MapReduceSpecification spec;
    spec.set_machines(2000);
    spec.set_map_megabytes(100);
    spec.set_reduce_megabytes(100);

    That's it! The library will automatically take care of the rest.

  • Fault Tolerance

    The master pings workers periodically.

    switch (ping response) {
      case idle:        assign a task if possible
      case in-progress: do nothing
      case completed:   reset to idle
      case no response: reassign the task
    }
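    The master's reaction to a ping can be written as a small state machine. A compilable sketch with illustrative names (the real library's bookkeeping is richer than a single enum):

```cpp
#include <cassert>

// Task states as tracked by the master for each worker's task.
enum class TaskState { kIdle, kInProgress, kCompleted };

// Master's reaction to one ping. A non-responding worker's task goes
// back to idle so it can be rescheduled on another machine; a
// completed task frees the worker for new work; an in-progress task
// on a healthy worker is left alone.
TaskState HandlePing(TaskState current, bool responded) {
  if (!responded) return TaskState::kIdle;  // reassign the task
  if (current == TaskState::kCompleted) return TaskState::kIdle;
  return current;  // idle or in-progress: nothing to change
}
```

    Resetting a failed worker's task to idle is what makes re-execution, the slides' main recovery mechanism, automatic.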

  • Fault Tolerance

    What if a Map task completes, but the machine fails before the intermediate data is retrieved via RPC?
    Re-execute the Map task on an idle machine.
    What if the intermediate data is partially read, but the machine fails before all Reduce operations can complete?
    Reset the Reduce task to idle and reschedule it; completed Reduce tasks need no re-execution, since their output is already in the global file system.
    What if the master fails? PWNED

  • Fault Tolerance

    Skipping bad records: an optional parameter changes the mode of execution. When enabled, the MapReduce library detects records that cause crashes and skips them.

    Bottom line: MapReduce is very robust in its ability to recover from failure and handle errors

  • Part II: Programming Model

    The MapReduce library is extremely easy to use. A program involves setting up only a few parameters and defining the map() and reduce() functions:
    - Define map() and reduce()
    - Define and set parameters for the MapReduceInput object
    - Define and set parameters for the MapReduceOutput object
    - Write the main program

  • Map()

    class WordCounter : public Mapper {
     public:
      virtual void Map(const MapInput &input) {
        const string &text = input.value();
        const int n = text.size();
        for (int i = 0; i < n; ) {
          // Skip past leading whitespace
          while (i < n && isspace(text[i])) i++;
          // Find word end
          int start = i;
          while (i < n && !isspace(text[i])) i++;
          // Emit (word, "1") for each word parsed
          if (start < i)
            Emit(text.substr(start, i - start), "1");
        }
      }
    };
    REGISTER_MAPPER(WordCounter);

  • Reduce()

    class Adder : public Reducer {
      virtual void Reduce(ReduceInput *input) {
        // Iterate over all entries with the same key
        // and add the values
        int64 value = 0;
        while (!input->done()) {
          value += StringToInt(input->value());
          input->NextValue();
        }
        // Emit the sum for input->key()
        Emit(IntToString(value));
      }
    };
    REGISTER_REDUCER(Adder);

  • Main()

    int main(int argc, char **argv) {
      MapReduceSpecification spec;
      MapReduceInput *input;

      // Store the list of input files in the spec
      // (start at 1: argv[0] is the program name)
      for (int i = 1; i < argc; i++) {
        input = spec.add_input();
        input->set_format("text");
        input->set_filepattern(argv[i]);
        input->set_mapper_class("WordCounter");
      }

  • Main()

      // Specify the output files:
      //   /gfs/test/freq-00000-of-00100
      //   /gfs/test/freq-00001-of-00100
      //   ...
      MapReduceOutput *out = spec.output();
      out->set_filebase("/gfs/test/freq");
      out->set_num_tasks(100);
      out->set_format("text");
      out->set_reducer_class("Adder");

  • Main()

      // Tuning parameters and the actual MapReduce call
      spec.set_machines(2000);
      spec.set_map_megabytes(100);
      spec.set_reduce_megabytes(100);

      MapReduceResult result;
      if (!MapReduce(spec, &result)) abort();

      return 0;
    }  // end main

  • Other possible uses

    Distributed grep: Map emits a line if it matches a supplied pattern; Reduce simply copies the intermediate data to the output.
    Count URL access frequency: Map processes logs of web-page requests and emits (URL, 1); Reduce adds all values for each URL and emits (URL, count).
    Inverted index: Map parses each document and emits a sequence of (word, document ID) pairs; Reduce accepts all pairs for a given word, sorts the list by document ID, and emits (word, list(document ID)).
    Many more...
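    For instance, the distributed-grep pair can be written in the same shape as word count. An in-memory, single-machine sketch with made-up names; since Reduce only copies intermediate data to the output, it is omitted here and the emitted lines are the result:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Distributed grep's map function: emit a line when it contains the
// supplied pattern. In the real system each map task would scan one
// input split; here we scan a vector of lines.
std::vector<std::string> GrepMap(const std::vector<std::string>& lines,
                                 const std::string& pattern) {
  std::vector<std::string> matches;
  for (const auto& line : lines)
    if (line.find(pattern) != std::string::npos)
      matches.push_back(line);  // Emit(line)
  return matches;
}
```

    The identity Reduce is what makes this case trivial: all the work happens in the parallel Map phase.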

  • Conclusion

    MapReduce provides an easy-to-use, clean abstraction for large-scale data processing.
    It is very robust in fault tolerance and error handling, and it can be used in many scenarios.
    Restricting the programming model to the Map and Reduce paradigms makes it easy to parallelize computations and make them fault-tolerant.

