vgu bis2010 mapreduce and batch processing
Post on 08-May-2015
MapReduce and Batch Processing
VGU BIS2010, Group 13
Son Pham: phamtranthaison@gmail.com
Phong Le: bigbangvn@gmail.com
Lam Pham: lam.pts.vn@gmail.com
Chuong Nguyen: chuongit@gmail.com
Chapter 4
Content
Part 1: Son Pham (Batch Layer)
Part 2: Phong Le (MapReduce)
Part 3: Lam Pham (MapReduce)
Part 4: Chuong Nguyen (Demo)
Batch Layer
Lambda Architecture
Batch Layer
• Precomputation
• High latency
• Linearly scalable
Batch Layer
On-the-fly computation:
Precomputation:
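The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PAGEVIEWS` dataset, the URL-count query, and all function names are hypothetical and do not come from the slides. On-the-fly computation scans the whole dataset for every query, while precomputation builds a batch view once and answers each query with a lookup.

```python
from collections import Counter

# Hypothetical master dataset: one record per page view (assumed for illustration).
PAGEVIEWS = [
    {"url": "/home", "user": "a"},
    {"url": "/about", "user": "b"},
    {"url": "/home", "user": "c"},
]


def query_on_the_fly(url):
    """On-the-fly: scan the entire dataset every time a query arrives."""
    return sum(1 for view in PAGEVIEWS if view["url"] == url)


def build_batch_view(pageviews):
    """Precomputation: one batch pass that counts views for every URL."""
    return Counter(view["url"] for view in pageviews)


# The batch view is rebuilt periodically, so it lags behind new data (high latency).
BATCH_VIEW = build_batch_view(PAGEVIEWS)


def query_precomputed(url):
    """Answer from the batch view with a single lookup."""
    return BATCH_VIEW[url]


print(query_on_the_fly("/home"), query_precomputed("/home"))
```

Both queries return the same answer; the difference is where the cost is paid. The batch pass is slow but runs ahead of time, so the query itself stays cheap.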
Batch Layer – Linear Scalability
“Scalability is the ability of a system to maintain performance under increased
load by adding more resources”
Linear vs. Non-Linear Scalability
[Figure: Linear Scalability (left) and Non-Linear Scalability (right)]
“A linearly scalable system can maintain performance under increased load by adding resources in proportion to the increased load”
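As a rough worked example of the definition, the sketch below assumes an idealized system in which every machine handles a fixed load. The function name and the numbers are hypothetical. Under that assumption, doubling the load doubles the machines needed, which is what "in proportion" means.

```python
import math


def machines_needed(load, capacity_per_machine):
    """Machines required if each one handles a fixed load (idealized assumption)."""
    return math.ceil(load / capacity_per_machine)


# Hypothetical numbers: each machine handles 100 requests per second.
baseline = machines_needed(1000, 100)
doubled = machines_needed(2000, 100)
print(baseline, doubled)
```

In a non-linearly scalable system the per-machine capacity would shrink as the cluster grows, so doubling the load would need more than double the machines.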
MapReduce
A distributed computing paradigm originally pioneered by Google
Inspired by the “Map” and “Reduce” functions commonly used in functional programming (LISP)
Operating on data stored in a distributed filesystem (HDFS…)
A popular free implementation is Apache Hadoop.
MapReduce
MapReduce - “Word count” Example
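The word count example can be sketched as a single-process Python simulation of the three steps: map, shuffle, and reduce. This is not Hadoop code. The function names and the sample input are assumptions made for illustration, and a real cluster would run the map and reduce calls in parallel on different machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield (word, 1)


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: sum the list of counts collected for one word."""
    return (word, sum(counts))


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


result = word_count(["the quick fox", "the lazy dog", "the fox"])
print(result)
```

Each map call sees only one line and each reduce call sees only one word, which is what lets a framework spread them across machines.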
MapReduce
Scalability
Automatically parallelize the computation across the cluster of machines
Fault-Tolerance
Reassign failed tasks
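The idea of reassigning failed tasks can be illustrated with a toy scheduler. This is a simplified sketch, not how Hadoop is implemented: the worker names, the `run_with_reassignment` helper, and the simulated failure are all hypothetical. When a task fails on one worker, the scheduler retries it on the next worker.

```python
def run_with_reassignment(tasks, workers, run_task, max_attempts=3):
    """Run each task, moving it to another worker whenever an attempt fails."""
    results = {}
    for task_id, task in enumerate(tasks):
        for attempt in range(max_attempts):
            # Pick a different worker on each attempt (round-robin).
            worker = workers[(task_id + attempt) % len(workers)]
            try:
                results[task_id] = run_task(worker, task)
                break
            except RuntimeError:
                continue  # this worker failed; try the next one
        else:
            raise RuntimeError(f"task {task_id} failed after {max_attempts} attempts")
    return results


def flaky_run(worker, task):
    """Simulated task execution: 'worker-1' is assumed to be down."""
    if worker == "worker-1":
        raise RuntimeError("worker-1 is down")
    return sum(task)


workers = ["worker-0", "worker-1", "worker-2"]
results = run_with_reassignment([[1, 2], [3, 4], [5, 6]], workers, flaky_run)
print(results)
```

Retrying is safe here because each task is a pure function of its input, so running it again on another machine gives the same result.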
Q&A
THANK YOU