kollaps/thunderstorm: reproducible evaluation of distributed … · 2020-07-13 · log media driver...
TRANSCRIPT
DISCOTEC/DAIS KOLLAPS/THUNDERSTORM
Kollaps/Thunderstorm: Reproducible Evaluation of Distributed Systems
Miguel Matos U. Lisbon & INESC-ID Portugal
Tutorial @ DISCOTEC/DAIS 2020
OUTLINE
• Overview of Kollaps & Thunderstorm
• Hands-on tutorial: basics
• Hands-on tutorial: advanced features
MOTIVATION
• Performance depends heavily on the underlying network
• Variability and failures are the norm
• Need for tools for systematic evaluation of distributed applications
• Ability to answer key questions:
  • What is the impact of halving the network latency on application throughput?
  • What is the effect of packet loss?
  • What if …
RELATED WORK
Main limitations:
- scalability
- accuracy
- dynamics
KOLLAPS IN A NUTSHELL
• Applications are concerned with the end-to-end properties:
  • latency, jitter, bandwidth, packet loss
• Rather than with the internal network state leading to those properties
• Emulate only the end-to-end properties
  • Allows decentralized, highly scalable emulation
NETWORK COLLAPSING
[Figure: target topology vs. collapsed topology. Target: nodes c1, sv1, sv2 connected through routers s1 and s2, with link labels 10 Mb/s / 10 ms, 100 Mb/s / 20 ms, 50 Mb/s / 5 ms, 50 Mb/s / 5 ms. Collapsed: direct links sv1 to sv2 at 50 Mb/s / 10 ms, and c1 to sv1 and c1 to sv2 at 10 Mb/s / 35 ms each. Legend: node, router, throughput, latency.]
• Collapsed bandwidth: minimum bandwidth of all links on the path
• Collapsed latency: sum of latencies of all links on the path
• Pre-computation of static properties
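The two collapsing rules (minimum bandwidth, summed latency) can be sketched in a few lines of Python. This is only an illustration, not Kollaps code: the function names are hypothetical, and the assignment of the four link values to specific links is an assumption, since the slide lists the values without tying them to links.

```python
from collections import deque

# Target topology as undirected links: (bandwidth in Mb/s, latency in ms).
# ASSUMPTION: which value sits on which link is chosen for illustration only.
LINKS = {
    ("c1", "s1"): (10, 10),
    ("s1", "s2"): (100, 20),
    ("s2", "sv1"): (50, 5),
    ("s2", "sv2"): (50, 5),
}


def neighbours(node):
    """Yield every node directly connected to `node`."""
    for a, b in LINKS:
        if a == node:
            yield b
        elif b == node:
            yield a


def link_properties(a, b):
    """Look up a link regardless of the direction it was declared in."""
    if (a, b) in LINKS:
        return LINKS[(a, b)]
    return LINKS[(b, a)]


def find_path(src, dst):
    """Breadth-first search returning the list of nodes from src to dst."""
    parents = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in neighbours(node):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    raise ValueError(f"no path from {src} to {dst}")


def collapse(src, dst):
    """Collapse a multi-hop path into one (bandwidth, latency) pair."""
    path = find_path(src, dst)
    hops = [link_properties(a, b) for a, b in zip(path, path[1:])]
    bandwidth = min(bw for bw, _ in hops)   # bottleneck link
    latency = sum(lat for _, lat in hops)   # latencies add up
    return bandwidth, latency
```

Under the assumed link assignment, `collapse("c1", "sv1")` yields the 10 Mb/s / 35 ms pair and `collapse("sv1", "sv2")` the 50 Mb/s / 10 ms pair shown in the collapsed topology.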
ARCHITECTURE
[Figure: architecture diagram. Labels: hosts host-A, host-B, host-C on a physical network; per host an emulation manager, containers, emulation cores (em. core), TCAL, overlay, qdisc, Aeron Media Driver, and a log in shared memory; tooling: dashboard, input, deployment generator, monitor, design.]
EMULATION MANAGER (EM)
• One instance per physical machine
• Enforces topology properties
  • static properties
  • dynamic properties
EM: DYNAMIC PROPERTIES
[Figure: the target and collapsed topologies from NETWORK COLLAPSING, plus a diagram of c1, sv1, and sv2 with two 20 Mb/s labels.]
• How to enforce bandwidth sharing under congestion?
• RTT-Aware Min-Max model
  • Intuition: available bandwidth is inversely proportional to the RTT
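The intuition can be illustrated with a small Python sketch that splits a shared link's capacity among competing flows in proportion to 1/RTT. It covers only the proportionality stated on the slide; the function name, the flow names, and the example numbers are hypothetical, and any further steps of the model (for example redistributing bandwidth a flow does not use) are not shown.

```python
def rtt_aware_shares(capacity_mbps, rtts_ms):
    """Split `capacity_mbps` among flows with weights proportional to 1/RTT.

    `rtts_ms` maps a flow name to its round-trip time in milliseconds.
    ASSUMPTION: all flows are bottlenecked on the same link.
    """
    weights = {flow: 1.0 / rtt for flow, rtt in rtts_ms.items()}
    total = sum(weights.values())
    return {flow: capacity_mbps * w / total for flow, w in weights.items()}


# Hypothetical example: two flows share a 30 Mb/s link.
# The flow with half the RTT receives roughly twice the bandwidth.
shares = rtt_aware_shares(30.0, {"f1": 10.0, "f2": 20.0})
```

The shares always sum to the link capacity, so the sketch models a fully congested link where every flow wants more than it gets.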
EMULATION CORE
• Spawned by the Emulation Manager
• One instance per container
• Collects the container's usage
• Exchanges metadata with EC through shared memory
  • no bandwidth overhead for local containers
LINUX TC ABSTRACTION LAYER (TCAL)
[Figure: per-destination (Destination X) qdiscs managed by TCAL: a netem qdisc for delay and packet loss, and an htb qdisc for bandwidth shaping.]
• Gather usage statistics
• Enforce dynamic properties
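To make the htb/netem split concrete, the sketch below builds the kind of `tc` command lines one could use by hand to shape traffic towards a single destination. It is an assumption-laden illustration: TCAL may well program the kernel differently, and the function name, device name, handles, and addresses are hypothetical. The function only returns strings and executes nothing.

```python
def tc_commands(dev, class_id, dst_ip, rate_mbit, delay_ms, loss_pct):
    """Return tc command lines for one destination (strings only).

    htb limits the bandwidth, netem adds delay and packet loss, and a
    u32 filter steers packets for `dst_ip` into the htb class.
    ASSUMPTION: hand-written equivalent for illustration, not TCAL itself.
    """
    classid = f"1:{class_id}"
    return [
        # Root htb qdisc that holds one class per destination.
        f"tc qdisc add dev {dev} root handle 1: htb",
        # Bandwidth shaping for this destination.
        f"tc class add dev {dev} parent 1: classid {classid} "
        f"htb rate {rate_mbit}mbit",
        # Delay and packet loss, attached below the htb class.
        f"tc qdisc add dev {dev} parent {classid} handle {class_id}0: "
        f"netem delay {delay_ms}ms loss {loss_pct}%",
        # Match traffic to the destination and send it to the class.
        f"tc filter add dev {dev} protocol ip parent 1: prio 1 "
        f"u32 match ip dst {dst_ip}/32 flowid {classid}",
    ]
```

For example, `tc_commands("eth0", 1, "10.0.0.2", 10, 35, 0.5)` describes a 10 Mb/s, 35 ms, 0.5% loss path towards the hypothetical address 10.0.0.2.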
DESCRIBE DYNAMIC EXPERIMENTS
• Emulation of network dynamics
[Figure: four snapshots of one topology (containers c1 and sv1, routers s1 to s5, links labelled 100 Mb/s / 10 ms and 20 Mb/s / 20 ms) at the initial state, t=10s, t=20s, and t=30s. The snapshots differ in the number of 100 Mb/s links and in the presence of sv1, and a 150 Mb/s / 5 ms link appears in a later snapshot. Legend: bandwidth, latency, container, router, broken link, offline node.]
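A timeline of topology changes like the one in the snapshots can be modelled as a list of timed events applied to a link table. The Python sketch below is not Thunderstorm syntax; the event format, link names, and values are hypothetical and only show the idea of replaying a dynamic experiment deterministically.

```python
import copy


def apply_events(links, events, until_s):
    """Return the link table after applying all events with time <= until_s.

    `links` maps a link id to its properties; `events` is a list of
    (time_s, action, link_id, props) tuples with action "add", "set",
    or "remove". ASSUMPTION: this event format is made up for the sketch.
    """
    state = copy.deepcopy(links)
    for time_s, action, link_id, props in sorted(events, key=lambda e: e[0]):
        if time_s > until_s:
            break
        if action == "remove":
            state.pop(link_id, None)
        elif action in ("add", "set"):
            state[link_id] = dict(props)
        else:
            raise ValueError(f"unknown action: {action}")
    return state


# Hypothetical experiment: a link breaks at t=10s, a faster one appears at t=30s.
initial = {("s1", "s2"): {"bw_mbps": 100, "lat_ms": 10}}
events = [
    (10, "remove", ("s1", "s2"), None),
    (30, "add", ("s1", "s2"), {"bw_mbps": 150, "lat_ms": 5}),
]
```

Because the state at any time depends only on the initial table and the event list, two runs of the same description see the same topology at the same instant, which is the property that makes such experiments reproducible.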
THUNDERSTORM DESCRIPTION LANGUAGE
EVALUATION
• Link-level emulation
• Scalability and metadata overhead
• Short- and long-lived connections
• Cubic and Reno congestion control algorithms
• Dynamic behavior
• Large-scale topologies
• Reproducing published results
• Geo-replicated systems
• What-if use cases
LARGE-SCALE TOPOLOGIES
• Scale-free networks with random ping requests
• Mean-square error w.r.t. theoretical RTT:

Size (#nodes + #switches)   KOLLAPS   Mininet   Maxinet
1000                        0.0261    0.0079    28.0779
2000                        0.0384    N/A       347.5303
4000                        0.0721    N/A       N/A
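For reference, a mean-square error against the theoretical RTT can be computed as sketched below. The slide does not state the units or any normalization, so this assumes the plain definition (average of squared differences); the function name and the sample values are hypothetical.

```python
def mean_square_error(measured_ms, theoretical_ms):
    """Average of squared differences between measured and theoretical RTTs.

    ASSUMPTION: plain MSE over paired samples, with no normalization.
    """
    if not measured_ms or len(measured_ms) != len(theoretical_ms):
        raise ValueError("need two non-empty lists of equal length")
    squared = [(m - t) ** 2 for m, t in zip(measured_ms, theoretical_ms)]
    return sum(squared) / len(squared)


# Hypothetical samples: one ping matches the theoretical RTT, one is 2 ms off.
error = mean_square_error([10.0, 20.0], [10.0, 22.0])
```

A lower value means the emulated round-trip times track the values implied by the topology description more closely.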
GEO-REPLICATED SYSTEM
• Cassandra on EC2 (replication factor: 2)
• 4 replicas in Frankfurt
• 4 replicas in Sydney
• 4 YCSB clients in Frankfurt
[Figure: "Geo-replicated Cassandra and YCSB: EC2 vs. Kollaps". Latency (ms, 0 to 400) against throughput (ops/sec, 0 to 5000); series: EC2, Kollaps.]
WHAT-IF SCENARIO
• Cassandra on EC2 (replication factor: 2)
• 4 replicas in Frankfurt
• 4 replicas in Sydney (Seoul)
• 4 YCSB clients in Frankfurt
[Figure: latency (ms, 0 to 600) against throughput (ops/sec, 0 to 5000); series: Read and Update, each for the original and the halved latency.]
CONCLUSION AND FUTURE WORK
• KOLLAPS: a decentralized topology emulator
  • Focuses on end-to-end properties
  • Relies on network collapsing techniques
  • Leverages Linux tc and container technologies
• Thunderstorm
  • Language to concisely write dynamic experiments
  • Precise description of experiments
  • Key to reproducibility
• Team
  • Miguel Matos, Valerio Schiavoni, Shady Issa, Paulo Gouveia, João Neves, Carlos Segarra, Luca Liechti
https://github.com/miguelammatos/Kollaps
PART II: HANDS-ON TUTORIAL
• Goal
  • Install Kollaps/Thunderstorm
• Assumptions
  • Linux
  • Docker and Docker Swarm
• Run a simple experiment
  • iPerf3 server
  • iPerf3 client
  • measure bandwidth
  • measure ping
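When measuring bandwidth in the simple experiment, it can help to check the result automatically. The sketch below assumes the iPerf3 client was run with its `--json` option and reads the received rate from the report's `end.sum_received.bits_per_second` field; the helper names, the sample report, and the 10% tolerance are hypothetical choices for illustration.

```python
import json


def received_mbps(iperf3_json_text):
    """Extract the receiver-side throughput in Mb/s from an iPerf3 JSON report."""
    report = json.loads(iperf3_json_text)
    return report["end"]["sum_received"]["bits_per_second"] / 1e6


def within_tolerance(measured, expected, tolerance=0.1):
    """True if `measured` is within `tolerance` (a fraction) of `expected`."""
    return abs(measured - expected) <= tolerance * expected


# Hypothetical, heavily trimmed report for a path expected to give 10 Mb/s.
sample_report = '{"end": {"sum_received": {"bits_per_second": 9500000.0}}}'
measured = received_mbps(sample_report)
ok = within_tolerance(measured, expected=10.0)
```

Comparing against the expected end-to-end bandwidth of the emulated path turns the manual measurement into a repeatable check.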
IPERF3 TOPOLOGY
[Figure: first topology with containers c1, c2, c3, and sv connected through routers s1 and s2; link labels 50 Mb/s / 10 ms, 100 Mb/s / 5 ms (twice), 100 Mb/s / 10 ms, and 10 Mb/s / 5 ms. Legend: bandwidth, latency, container, router.]
[Figure: second topology with containers c1, c2, c3, sv1, sv2, and sv3 connected through routers s1 and s2; link labels 50 Mb/s / 10 ms, 100 Mb/s / 5 ms (four times), 100 Mb/s / 10 ms, and 10 Mb/s / 5 ms.]
https://github.com/miguelammatos/Kollaps
https://github.com/miguelammatos/Kollaps/wiki