![Page 1: IEEE 2009 Sangwon Seo(KAIST), Ingook Jang Kyungchang Woo, Inkyo Kim Jin-Soo Kim, Seungyoul Maeng 2013.04.25 파일처리 특론 김태훈](https://reader036.vdocuments.mx/reader036/viewer/2022062322/5697c0251a28abf838cd4e88/html5/thumbnails/1.jpg)
HPMR: Prefetching and Pre-shuffling in Shared MapReduce Computation Environment
IEEE 2009
Sangwon Seo(KAIST), Ingook Jang
Kyungchang Woo, Inkyo Kim
Jin-Soo Kim, Seungyoul Maeng
2013.04.25 Advanced Topics in File Processing
Taehoon Kim
Contents
1. Introduction
2. Related Work
3. Design
4. Implementation
5. Evaluations
6. Conclusion
Introduction
Internet services are difficult to handle because of their enormous volumes of data
• They generate a large amount of data that needs to be processed every day
To solve this problem, the MapReduce programming model is used
• Supports distributed and parallel processing for large-scale data-intensive applications
• Data-intensive applications, e.g. data mining, scientific simulation
Introduction
Hadoop: based on MapReduce
• Hadoop's distributed storage layer is HDFS (Hadoop Distributed File System)
An HDFS cluster consists of
• A single NameNode: a master server that manages the namespace of the file system and regulates clients' access to files
• A number of DataNodes: each manages the storage directly attached to it
HDFS placement policy (three replicas): one replica on a node in the local rack, another on a different node in the local rack, and the last on a node in a different rack
• Advantage: improves write performance by cutting down inter-rack write traffic
Introduction
It is essential to reduce the shuffling overhead to improve the overall performance of the MapReduce computation. The network bandwidth between nodes is also an important factor in the shuffling overhead.
[Figure: MapReduce data flow on two nodes (Node1, Node2). On each node: files loaded from HDFS stores → InputFormat → Splits → RecordReaders (RR) → map → Combiner / Partitioner → "shuffling" process (over the network) → (sort) → reduce → OutputFormat → write back to local HDFS store]
Introduction
Hadoop's basic principle: moving computation is better than moving data
• It is better to migrate the computation closer to where the data is located
• This matters most when the data set is huge: migrating the computation minimizes network congestion and increases the overall throughput 1) of the system.
1) Throughput: the amount of work processed (transferred) within a given time
Introduction
HOD (Hadoop-On-Demand, developed by Yahoo!): a management system for provisioning virtual Hadoop clusters over a large physical cluster
• All physical nodes are shared by more than one Yahoo! engineer
• This increases the utilization of physical resources
When the computing resources are shared by multiple users, Hadoop's policy ("moving computation") is not effective
• Because the resources are shared
• Resources, e.g. computing, network, and hardware resources
Introduction
To solve this problem, two optimization schemes are proposed
Prefetching
• Intra-block prefetching
• Inter-block prefetching
Pre-shuffling
Related work
J. Dean and S. Ghemawat
Traditional prefetching techniques
• V. Padmanabhan and J. Mogul; T. Kroeger and D. Long; P. Cao, E. Felten et al.: prefetching methods to reduce I/O latency
Related work
Zaharia et al.: LATE (Longest Approximate Time to End)
• Performs speculative execution more efficiently in a shared environment
Dryad (Microsoft)
• A job can be expressed as a directed acyclic graph
The degree of data locality is highly related to MapReduce performance
Design(Prefetching Scheme)
Intra 2)-block prefetching
• Bi-directional processing
• A simple prefetching technique that prefetches data within a single block while a complex computation is being performed
[Fig. 1. The intra-block prefetching in Map Phase: within the assigned input split for a map task, computation is in progress on one side and prefetching is in progress on the other]
[Fig. 2. The intra-block prefetching in Reduce Phase: within the expected data for a reduce task, computation is in progress on one side and prefetching is in progress on the other]
2) Intra: inside, within
Design(Prefetching Scheme)
While a complex job is performed on the left side, the data that will be required are prefetched and assigned in parallel to the corresponding task
Advantages of intra-block prefetching
1. It uses the concept of a processing bar that monitors the current status of each side and raises a signal if synchronization is about to be broken
2. It tries to find the appropriate prefetching rate at which 3) performance is maximized while prefetching overhead is minimized, so the network overhead can be minimized
3) At which: when, where
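The bi-directional idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `process_block_bidirectional`, the `prefetch_rate` parameter, and the step-based model are hypothetical and are not taken from HPMR's implementation. Computation consumes records from the front of a block while the prefetcher pulls records from the back, and the two sides stop when they meet.

```python
def process_block_bidirectional(records, prefetch_rate=2):
    """Simulate one block: compute from the front, prefetch from the back.

    prefetch_rate is a hypothetical knob: how many records the prefetcher
    pulls in per computation step.
    """
    left = 0               # next record the computation side will consume
    right = len(records)   # records[right:] have already been prefetched
    computed, prefetched = [], []
    while left < right:
        # Computation side advances by one record.
        computed.append(records[left])
        left += 1
        # Prefetching side advances from the other end, never crossing `left`.
        step = min(prefetch_rate, right - left)
        prefetched.extend(reversed(records[right - step:right]))
        right -= step
    # The two sides have met: every record was either computed or prefetched.
    return computed, prefetched


computed, prefetched = process_block_bidirectional(list(range(10)), prefetch_rate=2)
```

A higher `prefetch_rate` moves the meeting point toward the front of the block. In a real system that rate would have to be tuned against disk and network load, which is the trade-off point 2 above describes.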
Design(Prefetching Scheme)
Inter-block prefetching
• Runs at the block level, prefetching the expected block replica 4) to a local rack
• A2, A3, and A4 prefetch the required blocks
[Figure: nodes n1, n2, n3 holding block replicas at distances D=1, D=5, D=8 (D = distance)]
4) Replica: a copy
Design(Prefetching Scheme)
Inter-block prefetching algorithm
1. Assign the map task to the node that is nearest to the required blocks
2. The predictor generates the list of data blocks, B, to be prefetched for the target task t
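The two steps above can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function names, the `distance` table, the node and block names, and the rule "a block is local when its distance is at most `local_distance`" are all assumptions made for the example.

```python
def assign_task(required_blocks, nodes, distance):
    """Step 1: choose the node with the smallest total distance to the blocks."""
    return min(nodes, key=lambda n: sum(distance[(n, b)] for b in required_blocks))


def predict_prefetch_list(node, required_blocks, distance, local_distance=1):
    """Step 2: build the list B of blocks that are not yet local to the node."""
    return [b for b in required_blocks if distance[(node, b)] > local_distance]


# Hypothetical distance table: distance[(node, block)] = D
distance = {
    ("n1", "b1"): 1, ("n1", "b2"): 5, ("n1", "b3"): 8,
    ("n2", "b1"): 5, ("n2", "b2"): 8, ("n2", "b3"): 8,
}
blocks = ["b1", "b2", "b3"]

node = assign_task(blocks, ["n1", "n2"], distance)   # node nearest to the blocks
B = predict_prefetch_list(node, blocks, distance)    # blocks to prefetch for task t
```

Here the task lands on `n1`, and `B` holds the two blocks that are still remote. Those are the blocks a prefetcher would pull toward the local rack.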
Design(Pre-Shuffling Scheme)
Pre-shuffling processing
• The pre-shuffling module in the task scheduler looks over the input split or candidate data in the map phase and predicts which reducer the key-value pairs will be partitioned into.
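A rough sketch of this prediction step is shown below. It assumes a simple hash partitioner (`zlib.crc32` is used only to keep the example deterministic) and assumes that the scheduler then places the map task on the node hosting the predicted reducer. The function names and the placement rule are illustrative assumptions, not HPMR's actual code.

```python
import zlib
from collections import Counter


def partition(key, num_reducers):
    """Deterministic hash partitioner (stand-in for the job's real partitioner)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def predict_reducer(split_keys, num_reducers):
    """Predict the reducer that would receive most key-value pairs of a split."""
    counts = Counter(partition(k, num_reducers) for k in split_keys)
    return counts.most_common(1)[0][0]


def place_map_task(split_keys, reducer_nodes):
    """Assumed placement rule: run the map task where the predicted reducer lives."""
    reducer = predict_reducer(split_keys, len(reducer_nodes))
    return reducer_nodes[reducer]


keys_in_split = ["apple", "apple", "apple", "banana"]
chosen_node = place_map_task(keys_in_split, ["node1", "node2"])
```

If most keys of a split go to one reducer, running the map task close to that reducer means fewer key-value pairs have to cross the network during shuffling.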
Design(Optimization)
LATE (Longest Approximate Time to End) algorithm
• Addresses how to robustly perform speculative execution to maximize performance in a heterogeneous environment
• Does not consider data locality, which can accelerate the MapReduce computation further
D-LATE (Data-aware LATE) algorithm
• Almost the same as LATE, except that a task is assigned as near as possible to the location where the needed data are present
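The difference between the two algorithms can be sketched in a few lines. The time-to-end estimate below follows the usual LATE heuristic (remaining progress divided by the observed progress rate). The task dictionaries, the `distance` table, and the function names are hypothetical, and the data-aware step is only one plausible reading of "as near as possible to the data". The sketch also assumes every task has made some progress, so the rate is never zero.

```python
def time_to_end(progress, elapsed):
    """LATE heuristic: remaining time estimated from the observed progress rate."""
    rate = progress / elapsed          # progress score gained per second
    return (1.0 - progress) / rate


def pick_straggler(tasks):
    """Choose the running task with the longest approximate time to end."""
    return max(tasks, key=lambda t: time_to_end(t["progress"], t["elapsed"]))


def d_late_assign(task, free_nodes, distance):
    """Data-aware step: run the speculative copy on the free node nearest the data."""
    return min(free_nodes, key=lambda n: distance[(n, task["block"])])


tasks = [
    {"id": "t1", "progress": 0.8, "elapsed": 40, "block": "b1"},
    {"id": "t2", "progress": 0.2, "elapsed": 40, "block": "b2"},
]
distance = {("n1", "b2"): 8, ("n2", "b2"): 1}   # hypothetical distances

straggler = pick_straggler(tasks)
target = d_late_assign(straggler, ["n1", "n2"], distance)
```

Plain LATE would stop after `pick_straggler`. The extra `d_late_assign` step is what makes the choice of node depend on where the needed block is stored.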
Implementation (Optimized scheduler)
Optimized scheduler
Predictor module
• Not only finds stragglers, but also predicts candidate data blocks and the reducers into which the key-value pairs are partitioned
D-LATE
• Based on these predictions, the optimized scheduler performs the D-LATE algorithm
Implementation (Optimized scheduler)
Prefetcher
• Monitors the status of worker threads and manages the prefetching synchronization with the processing bar
Load balancer
• Checks the logs (including disk usage per node and current network traffic per data block)
• Is invoked to maintain load balancing based on disk usage and network traffic
Evaluation
Yahoo! Grid, which consists of 1670 nodes
• Two dual-core 2.0 GHz AMD processors, 4 GB main memory
• 400 GB ATA hard disk drives
• Gigabit Ethernet network interface card
• The nodes are divided into 40 racks, which are connected with L3 routers
All tests are configured so that HDFS maintains four replicas for each data block, whose size is 128 MB
Three types of workload: wordcount, search log aggregator, similarity calculator
Evaluation
Fig. 8, test set #1: the smallest ratio of the number of nodes to the number of map tasks.
Test set #5: due to a significant reduction in shuffling overhead
Fig. 7: we can observe that HPMR shows significantly better performance than the native Hadoop for all of the test sets
Evaluation
The prefetching latency is affected by disk overhead or network congestion
Therefore, a long prefetching latency indicates that the corresponding node is heavily loaded
Prefetching rate increases beyond 60%
Evaluation
This means that HPMR assures consistent performance even in a shared environment such as Yahoo! Grid, where the available bandwidth fluctuates severely.
4Kbps ~ 128Kbps
Conclusion
Two innovative schemes
The prefetching scheme
• Exploits data locality
The pre-shuffling scheme
• Reduces the network overhead required to shuffle key-value pairs
HPMR is implemented as a plug-in type component for Hadoop
HPMR improves the overall performance by up to 73% compared to the native Hadoop
As a next step, we plan to evaluate more complicated workloads such as HAMA (an open-source Apache incubator project)
Appendix : MapReduce Example
MapReduce example: analyzing a weather data set
• Each record is stored as one line, in ASCII format
• Within a file, the fields are stored at fixed widths with no delimiters
• Example record: 0057332130999991950010103004+51317+028783FM-12+017199999V0203201N00721004501CN0100001N9-01281-01391102681
Query
• From the NCDC data files written between 1901 and 2001, find the highest temperature (F) for each year
Input: data files in chunk (64MB) units
1st Map: extract <offset, record> from the file
2nd Map: extract <year, temperature> from each record
Shuffle: organize the data into per-year groups
Reduce: merge and return the final result
Appendix : MapReduce Example
1st Map: extract <offset, record> from the file; <Key_1, Value> = <offset, record>
<0, 0067011990999991950051507004...9999999N9+00001+99999999999...>
<106, 0043011990999991950051512004...9999999N9+00221+99999999999...>
<212, 0043011990999991950051518004...9999999N9-00111+99999999999...>
<318, 0043012650999991949032412004...0500001N9+01111+99999999999...>
<424, 0043012650999991949032418004...0500001N9+00781+99999999999...>
...
2nd Map: extract the year and temperature from each record; <Key_2, Value> = <year, temp>
<1950, 0>
<1950, 22>
<1950, −11>
<1949, 111>
<1949, 78>
…
(Key: year, Value: temperature)
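The 2nd Map step can be sketched in Python using the full example record from the appendix. The character offsets (year at positions 15 to 18, signed temperature at positions 87 to 91) follow the fixed-width layout used in the Hadoop: The Definitive Guide example and should be treated as assumptions here. The function name `second_map` is hypothetical.

```python
# Full example record from the slide (one fixed-width ASCII line).
RECORD = ("0057332130999991950010103004+51317+028783FM-12+017199999V0203201"
          "N00721004501CN0100001N9-01281-01391102681")


def second_map(offset, record):
    """Turn one <offset, record> pair into a <year, temp> pair."""
    year = int(record[15:19])   # assumed position of the 4-digit year
    temp = int(record[87:92])   # assumed signed 5-character temperature field
    return year, temp


pair = second_map(0, RECORD)
```

The offset key from the 1st Map is not needed to compute the result, so it is simply dropped. Only the year and the raw temperature value are passed on to the shuffle.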
Appendix : MapReduce Example
Shuffle
• Because the 2nd Map produces so many results, they are regrouped into per-year data groups
• This reduces the processing cost of merging in the Reduce phase
Reduce: merge the candidate sets from all Maps and return the final result
2nd Map output: <1950, 0> <1950, 22> <1950, −11> <1949, 111> <1949, 78>
Shuffle output: <1949, [111, 78]> <1950, [0, 22, −11]>
Mapper_1: (1950, [0, 22, −11]), (1949, [111, 78])
Mapper_2: (1950, [25, 15]), (1949, [30, 45])
Reducer: (1950, [0, 22, −11, 25, 15]) → (1950, 25); (1949, [111, 78, 30, 45]) → (1949, 111)
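The shuffle and reduce steps of this example can be reproduced with a few lines of plain Python. This is a single-process sketch with hypothetical function names (`shuffle`, `reduce_max`), not Hadoop code. The input values are the ones shown on the slide.

```python
from collections import defaultdict


def shuffle(pairs):
    """Group <year, temp> pairs into per-year lists."""
    groups = defaultdict(list)
    for year, temp in pairs:
        groups[year].append(temp)
    return dict(groups)


def reduce_max(mapper_outputs):
    """Merge the per-mapper groups and keep the highest temperature per year."""
    merged = defaultdict(list)
    for groups in mapper_outputs:
        for year, temps in groups.items():
            merged[year].extend(temps)
    return {year: max(temps) for year, temps in merged.items()}


# Output of the 2nd Map on Mapper_1, as listed on the slide.
mapper_1 = shuffle([(1950, 0), (1950, 22), (1950, -11), (1949, 111), (1949, 78)])
# Already-grouped output of Mapper_2, as listed on the slide.
mapper_2 = {1950: [25, 15], 1949: [30, 45]}

result = reduce_max([mapper_1, mapper_2])
```

With these inputs `result` maps 1950 to 25 and 1949 to 111, which matches the reducer output shown above.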
Appendix: Hadoop: The Definitive Guide, pp. 19-20