On The [Ir]relevance of Network Performance for Data Processing
Animesh Trivedi, Patrick Stuedi, Jonas Pfefferle, Radu Stoica, Bernard Metzler, Ioannis Koltsidas, Nikolas Ioannou
IBM Research, Zurich
The 8th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud '16) 2
How [Ir]relevant is the Network?
[Figure: bar chart of runtime in seconds (y-axis 0 to 300) for TeraSort, PageRank, SQL, WordCount, and GroupBy at 1 Gbps, 10 Gbps, and 40 Gbps; bar annotations: 60%, 64%, 47%, 33%, 28%]
Network IO is very relevant - up to 64%
Network IO is very relevant - up to 64% ??
Is It Spark Specific?
[Figure: bar chart of runtime in seconds (y-axis 0 to 300) for Flink-TS, Flink-PR, GraphLab, and Timely at 1 Gbps, 10 Gbps, and 40 Gbps; one off-scale bar is labeled 725s]
Spark TeraSort: The Shuffle Story
[Diagram: distributed sorting, from input to output]
- simple
- shuffle data is input data
- highest chance of improvements
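[Diagram: Map tasks (input) produce shuffle data that is consumed by Reduce tasks (output). On each core, a reduce task has a "net" phase (reading in shuffle data) followed by a "CPU" phase (sorting shuffle data); the diagram also carries the label "performance".]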
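The shuffle pattern on this slide can be sketched in a few lines of single-process Python. This is only an illustration of range-partitioned sorting, not Spark's implementation: the function names (`map_task`, `reduce_task`, `terasort_sketch`), the partition boundaries, and the random input are all made up for the example.

```python
import random
from bisect import bisect_right

def map_task(records, boundaries):
    """Route every record to a reduce partition by key range (nothing is dropped)."""
    buckets = [[] for _ in range(len(boundaries) + 1)]
    for key in records:
        buckets[bisect_right(boundaries, key)].append(key)
    return buckets

def reduce_task(shuffle_inputs):
    """'net' phase: gather this partition's buckets; 'CPU' phase: sort them."""
    fetched = [key for bucket in shuffle_inputs for key in bucket]  # reading in shuffle data
    return sorted(fetched)                                          # sorting shuffle data

def terasort_sketch(input_splits, boundaries):
    """Run all map tasks, then one reduce task per key range."""
    map_outputs = [map_task(split, boundaries) for split in input_splits]
    num_reducers = len(boundaries) + 1
    outputs = [reduce_task([m[r] for m in map_outputs]) for r in range(num_reducers)]
    shuffled = sum(len(bucket) for m in map_outputs for bucket in m)
    return outputs, shuffled

# Hypothetical input: 4 splits of 250 random keys, 4 reduce partitions.
random.seed(0)
splits = [[random.randrange(1000) for _ in range(250)] for _ in range(4)]
outputs, shuffled = terasort_sketch(splits, boundaries=[250, 500, 750])
flat = [key for part in outputs for key in part]
print(shuffled, flat == sorted(key for split in splits for key in split))
```

The number of shuffled records equals the number of input records, which is the "shuffle data is input data" property, and concatenating the reduce outputs in partition order gives a globally sorted result.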
How Important is the Network?
[Figure: stacked bars showing the CPU and network shares (0% to 100%) at 1 Gbps, 10 Gbps, 40 Gbps, and 100 Gbps*; annotations: 52% and 48% at 1 Gbps, 8% and 92% at 10 Gbps, 97% at 40 Gbps, 99% at 100 Gbps*]
Gains from the networks are shadowed by the high CPU footprint
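A back-of-the-envelope model shows why the CPU share grows so quickly. The sketch below assumes (this is not stated on the slide) that the 52% annotation at 1 Gbps is the network share and 48% the CPU share, that CPU time stays fixed, and that network time scales inversely with link bandwidth. The function name `cpu_share` is made up for the example.

```python
def cpu_share(bandwidth_gbps, net_share_at_1g=0.52):
    """CPU share of total time under a fixed-CPU, 1/bandwidth-network model.

    Assumption: at 1 Gbps the split is 52% network and 48% CPU.
    Times are normalised so that the 1 Gbps runtime equals 1.0.
    """
    cpu_time = 1.0 - net_share_at_1g              # unchanged by a faster link
    net_time = net_share_at_1g / bandwidth_gbps   # shrinks with bandwidth
    return cpu_time / (cpu_time + net_time)

for gbps in (1, 10, 40, 100):
    print(f"{gbps:>3} Gbps: CPU share {cpu_share(gbps):.1%}")
```

Under these assumptions the model gives roughly 90%, 97%, and 99% CPU at 10, 40, and 100 Gbps, close to the 92%, 97%, and 99% annotated on the chart. It also bounds the benefit: even an infinitely fast network removes at most the 52% of runtime that was network time at 1 Gbps.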
What Exactly is the CPU Doing?
[Figure: stacked bars (0% to 100%), labeled Spark, breaking down CPU time for Map and Reduce into Misc., Iterator, Serialization, Sorting, IO, JVM, and Linux]
Overheads are spread across the entire stack - serialization, abstraction, execution model, etc.
The Balancing Act: CPU vs Network
I. Balance out the CPU with the network time
Sorting: O(n log(n))
Network: O(n)
use smaller 'n'
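The effect of a smaller 'n' can be estimated with a simple cost model. The sketch below assumes equal-sized partitions, counts sorting as n log2(n) comparisons per partition and the network as one transfer per record, and ignores all constant factors. The function name and the record counts are hypothetical.

```python
import math

def sort_to_network_ratio(total_records, partitions):
    """Sorting work per record moved, assuming equal-sized partitions."""
    n = total_records / partitions               # records per partition
    sort_work = partitions * n * math.log2(n)    # O(n log n) summed over partitions
    network_work = total_records                 # every record crosses the network once
    return sort_work / network_work              # algebraically equal to log2(n)

total = 2**30  # hypothetical total record count
for partitions in (2**4, 2**8, 2**12, 2**16):
    ratio = sort_to_network_ratio(total, partitions)
    print(f"{partitions:>6} partitions: sort/network ratio {ratio:.0f}")
```

In this model the ratio is just log2(n), so going from 16 to 65,536 partitions only lowers it from 26 to 14. Shrinking partitions therefore rebalances CPU against network slowly, and it does nothing for per-record costs such as serialization.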
[Figure: "Smaller Partitions", runtime in seconds (y-axis 0 to 100)]
II. Use more cores to scale up
if a single core cannot do 40 Gbps then use more
Needs a more careful analysis of the entire stack
[Figure: bandwidth in Gbps (y-axis 0 to 60) versus number of cores (1, 2, 4, 8, 16), ideal versus measured]
[Figure: runtime in seconds (y-axis 0 to 300) versus number of cores (1, 2, 4, 8, 16), split into reduce and map]
runtime = 9 + 260 / cores
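Reading the formula on this slide as runtime = 9 + 260 / cores (in seconds), it can be evaluated to see how far adding cores helps. The sketch below treats the 9 s term as a serial part and the 260 s term as perfectly parallel work; that interpretation and the function names are assumptions made for the example.

```python
def runtime_secs(cores, serial=9.0, parallel=260.0):
    """Runtime model from the slide: a fixed part plus a part divided across cores."""
    return serial + parallel / cores

def speedup(cores):
    """Speedup relative to a single core under the same model."""
    return runtime_secs(1) / runtime_secs(cores)

for cores in (1, 2, 4, 8, 16):
    print(f"{cores:>2} cores: {runtime_secs(cores):7.2f} s, speedup {speedup(cores):5.2f}x")

# Upper bound as cores grow without limit: only the fixed 9 s remains.
max_speedup = runtime_secs(1) / 9.0
print(f"limit: {max_speedup:.1f}x")
```

With 16 cores the model gives 25.25 s, a speedup of about 10.7x rather than 16x, and no number of cores can push the speedup past roughly 30x because the 9 s term does not shrink.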
Classical techniques are ineffective
Conclusion
Faster networks (IO) are very relevant
- as long as you have CPU cycles
- differentiate between user vs framework CPU usage
Framework's CPU usage is bad
- CPU-network imbalance: sorting, serialization, volcano execution model, etc.
- scalability (serial vs parallel components)
- ineffective classical balancing techniques
Knowing today's usec-era IO and CPU hardware, how would you re-design a modern data processing framework?
Backup
[Figure: "Runtime" (y-axis 0 to 300) for x-axis values 1, 2, 4, 8, 16, split into reduce and map]
What Exactly is the CPU Doing?
[Figure: stacked bars (0% to 100%), labeled Spark, breaking down CPU time for Map, Reduce, and Reduce/Count into Misc., Iterator, Serialization, Sorting, IO, JVM, and Linux]