MapReduce: Simplified Data Processing on Large Clusters
Papers We Love, Bucharest Chapter, October 12, 2015
What is MapReduce?
◉ MapReduce is a programming model and implementation for processing large data sets
◉ Users specify a Map function that processes a key-value pair to generate a set of intermediate key-value pairs, and a Reduce function that aggregates all intermediate values sharing the same intermediate key, combining the derived data appropriately
map: (k1, v1) -> [(k2, v2)]
reduce: (k2, [v2]) -> [(k3, v3)]
Where else have we seen this?
Map
◉ function (mathematics)
◉ map (Java)
◉ Select (.NET)
Reduce
◉ fold (functional programming)
◉ reduce (Java)
◉ Aggregate (.NET)
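Both halves of the analogy can be sketched in a few lines; Python is used here purely for brevity (the deck's own examples name the Java and .NET equivalents):

```python
from functools import reduce

words = ["foo", "bar", "foo"]

# map: apply a function to every element (cf. map in Java, Select in .NET)
lengths = list(map(len, words))

# reduce/fold: collapse a sequence into one value
# (cf. reduce in Java, Aggregate in .NET)
total = reduce(lambda acc, n: acc + n, lengths, 0)
```

Here `lengths` is `[3, 3, 3]` and `total` is `9`: map transforms elements independently, while fold threads an accumulator through all of them.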
Other analogies
"To draw an analogy to SQL, map is like the group-by clause of an aggregate query. Reduce is analogous to the aggregate function that is computed over all the rows with the same group-by attribute"
D.J. DeWitt & M. Stonebraker
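The SQL analogy can be made concrete with a small grouping sketch (the department/salary rows are hypothetical data, chosen only for illustration):

```python
from itertools import groupby

# Hypothetical rows of (department, salary)
rows = [("eng", 100), ("ops", 80), ("eng", 120)]

# The "group-by attribute": sort, then group rows sharing the same key
rows.sort(key=lambda r: r[0])

# The "aggregate function": computed over all rows with the same key
totals = {dept: sum(salary for _, salary in group)
          for dept, group in groupby(rows, key=lambda r: r[0])}
```

The sort-then-group step plays the role of MapReduce's shuffle, and the `sum` plays the role of the reduce; `totals` ends up as `{"eng": 220, "ops": 80}`.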
Divide-and-conquer algorithms
"recursively breaking down a problem into two or more sub-problems of the same (or related) type (divide), until these become simple enough to be solved directly (conquer). The solutions to the sub-problems are then combined to give a solution to the original problem."
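A minimal sketch of that definition, using a recursive sum (a deliberately trivial problem):

```python
def dc_sum(xs):
    # Divide-and-conquer sum: split the input (divide), recurse until
    # trivially small (conquer), then combine the partial results.
    if len(xs) <= 1:
        return xs[0] if xs else 0
    mid = len(xs) // 2
    return dc_sum(xs[:mid]) + dc_sum(xs[mid:])
```

The two recursive calls are independent of each other, which is exactly the property MapReduce exploits to run sub-problems on different machines.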
History
◉ April 1960: John McCarthy introduced the concept of “maplist”
◉ September 4, 1998: Google founded
◉ 1998-2003: hundreds of special-purpose large-data computation programs at Google
◉ February 2003: 1st version of MapReduce
◉ August 2003: significant MapReduce enhancements
◉ June 18, 2004: Patent US7650331 B1 filed
◉ December 6, 2004: 1st public MapReduce presentation
◉ 2005: Hadoop implementation started in Java (Douglass R. Cutting & Michael J. Cafarella)
◉ September 4, 2007: Hadoop 0.14.1
◉ January 19, 2010: Patent US7650331 B1 published
◉ July 6, 2015: Hadoop 2.7.1
Distribution issues
◉ Communication and routing
  which nodes should be involved?
  what transport protocol should be used?
  threads/events/connections management
  remote execution of your processing code?
◉ Fault tolerance and fault detection
◉ Load balancing / partitioning of data
  heterogeneity of nodes
  skew in data
  network topology
◉ Parallelization strategy
  algorithmic issues of work splitting
"without having to deal with failures, the rest of the support code just isn't so complicated"
S. Ghemawat
MapReduce system
◉ application-independent Map Module
◉ application-independent Reduce Module
◉ application-specific Map Operation
◉ application-specific Reduce Operation
Input Data -> Intermediate Data -> Output Data
MapReduce model in practice
map(String key, String value):
  // key: document name; value: document contents
  for each word w in value:
    EmitIntermediate(w, "1");

reduce(String key, Iterator values):
  // key: a word; values: a list of counts
  int result = 0;
  for each v in values:
    result += ParseInt(v);
  Emit(key, AsString(result));
map: (k1, v1) -> [(k2, v2)]
reduce: (k2, [v2]) -> [(k3, v3)]
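The word-count pseudocode above can be run end to end with a toy single-machine driver. This is only a sketch of the model (the in-memory dictionary stands in for the distributed shuffle; names like `map_fn` and `mapreduce` are ours, not the paper's):

```python
from collections import defaultdict

def map_fn(key, value):
    # key: document name; value: document contents
    for w in value.split():
        yield (w, 1)

def reduce_fn(key, values):
    # key: a word; values: all counts emitted for that word
    yield (key, sum(values))

def mapreduce(documents, map_fn, reduce_fn):
    # Map phase: emit intermediate (k2, v2) pairs
    intermediate = defaultdict(list)
    for name, contents in documents.items():
        for k2, v2 in map_fn(name, contents):
            intermediate[k2].append(v2)  # "shuffle": group values by k2
    # Reduce phase: aggregate each key's values
    output = {}
    for k2, values in sorted(intermediate.items()):
        for k3, v3 in reduce_fn(k2, values):
            output[k3] = v3
    return output

counts = mapreduce({"doc1": "to be or not to be"}, map_fn, reduce_fn)
# counts == {"be": 2, "not": 1, "or": 1, "to": 2}
```

The grouping step between the two phases is the `(k2, [v2])` in the type signature: all values for one intermediate key are collected before a single reduce call sees them.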
How long does it take to go through 1 TB?
Sequentially: 3 hours
MapReduce: 70 seconds startup overhead + 80 seconds computation
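The sequential figure is roughly what single-disk throughput predicts. A back-of-envelope check, assuming about 100 MB/s streaming throughput (our assumption for hardware of that era, not a number from the slides):

```python
# Back-of-envelope: how long does one disk take to stream 1 TB?
terabyte = 10**12          # bytes
throughput = 100 * 10**6   # bytes/second, assumed ~100 MB/s
seconds = terabyte / throughput   # 10,000 seconds
hours = seconds / 3600            # ~2.8 hours, i.e. about 3 hours
```

Spreading the same scan over ~1800 machines is what collapses the computation to under two minutes, at the cost of the fixed startup overhead.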
Environment
◉ 1800 Linux dual-processor x86 machines, 2-4 GB memory
◉ Fast Ethernet / Gigabit Ethernet
◉ Inexpensive IDE disks and the distributed Google File System
Tianhe-2, the #1 supercomputer: 3,120,000 cores, 1.5 PB total memory
Execution diagram
◉ The master process is a task itself, initiated by the WQM, and is responsible for assigning all other tasks to worker processes
◉ Each worker invokes at least one map thread and one reduce thread
◉ If a worker fails, its tasks are reassigned to another worker process
◉ When the WQM receives a job, it allocates the job to the master, which calculates and requests M+R+1 processes to be allocated to the job
◉ The WQM responds with the process allocation info (possibly fewer processes than requested) to the master, which then manages the execution of the job
◉ Reduce tasks begin work when the master informs them that intermediate files are ready
◉ Input data (files/DB/memory) are split into data blocks (16-64 MB), either automatically or as configured
◉ The worker to which a map task has been assigned applies the map() operator to its input data block
◉ When the worker completes the task, it informs the master of the status
◉ The master informs workers where to find intermediate data and schedules their reading
◉ Workers (3 & 4) sort the intermediate key-value pairs, then merge them (by applying reduce()) and write the output
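How an intermediate pair finds its reduce worker: the original paper's default partitioning function is hash(key) mod R, which guarantees all values for one key reach the same reduce task. A sketch (R = 4 is an arbitrary choice):

```python
import zlib

R = 4  # number of reduce tasks (arbitrary for this sketch)

def partition(key, R):
    # Route an intermediate key to one of R reduce tasks.
    # Python's built-in hash() is randomized per process for strings,
    # so use a stable checksum to keep routing deterministic.
    return zlib.crc32(key.encode()) % R

# Every occurrence of a key lands in the same partition,
# and every partition index is in range.
assert partition("the", R) == partition("the", R)
assert 0 <= partition("quick", R) < R
```

Because the routing depends only on the key, map workers can write their R partition files independently, with no coordination beyond agreeing on R and the hash.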
Workflow diagram
◉ When a process completes a task, it informs the WQM, which updates the status tables
◉ When the WQM discovers that a process has failed, it assigns the process's tasks to a new process and updates the status tables

Task Status Table
◉ TaskID
◉ Status (InProgress, Waiting, Completed, Failed)
◉ ProcessID
◉ InputFiles (Input, Intermediate)
◉ OutputFiles

Process Status Table
◉ ProcessID
◉ Status (Idle, Busy, Failed)
◉ Location (CPU ID, etc.)
◉ Current (TaskID, WQM)
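A minimal sketch of the two tables and the failure-handling rule above (field subset and dictionary layout are our simplification, not the patent's schema):

```python
# Task Status Table: TaskID -> status + assigned process
tasks = {1: {"status": "InProgress", "process": 10},
         2: {"status": "Completed", "process": 10},
         3: {"status": "InProgress", "process": 11}}

# Process Status Table: ProcessID -> status
processes = {10: "Busy", 11: "Busy", 12: "Idle"}

def handle_failure(failed, spare):
    # Mark the process failed and hand its in-flight tasks to a spare,
    # updating both status tables, as the workflow slide describes.
    processes[failed] = "Failed"
    for t in tasks.values():
        if t["process"] == failed and t["status"] == "InProgress":
            t["process"] = spare
            t["status"] = "Waiting"  # re-queued for the new process
    processes[spare] = "Busy"

handle_failure(10, 12)
# Task 1 is re-queued on process 12; task 2 stays Completed.
```

This sketch only reassigns in-flight work; a full system must also decide when completed work (e.g. map output stored on the failed machine's local disk) needs re-execution.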
Questions from the audience @ original paper presentation
◉ Q: Is there any task that could not be handled using MapReduce?
  A: Join operations could not be performed with the current model
◉ Q: How does MapReduce differ from parallel databases?
  A: MapReduce data is stored across a large number of machines compared to parallel databases, the abstractions are fairly simple to use, and MapReduce also benefits greatly from locality optimizations
Bibliography
◉ J. Dean, S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters", OSDI'04, Dec. 6, 2004
◉ J. Dean, S. Ghemawat, "System and method for efficient large-scale data processing", Patent US 7650331 B1, Jan. 19, 2010
◉ S. Ghemawat, J. Dean, J. Zhao, M. Austern, A. Spector, "Google Technology RoundTable: Map Reduce", Aug. 21, 2008, YouTube
◉ P. Mahadevan, "OSDI'04 Conference Reports", ;LOGIN: Vol. 30, No. 2, Apr. 2005, p. 61
◉ R. Jacotin, "Lecture: The Google MapReduce", SlideShare, Oct. 3, 2014