Inferring a Network Congestion Map with Zero Traffic Overhead
Florin Dinu, T. S. Eugene Ng
Rice University
The Vision: Passively Inferred Congestion Map
[Figure: router topology (R0 to R6) spanning AS1 and AS2, with elements labeled X7 and X8]
Without any dedicated measurement (probing) traffic
At fine time granularities (seconds)
Good accuracy
How does it work? Why does it work? Where is it applicable?
Benefits of Passive Inference
Solutions compared: Active Reporting (SNMP) and Passive Inference
Challenges:
Has reasonable accuracy
Does not need access rights to routers
Does not exacerbate existing congestion
Detects congestion in a timely manner
Passive inference – complementary to active reporting
Overview – Passively Inferring Congestion Maps
[Figure: router topology (R0 to R6) spanning AS1 and AS2, with elements labeled X7 and X8]
Step 1: Use congestion markings from existing traffic
Get path-level congestion information
Routers are AQM/ECN capable and can mark existing traffic
Overview – Passively Inferring Congestion Maps
[Figure: router topology (R0 to R6) with paths P06, P04 and the unknown P46]
Expand on Step 1: path-level congestion from AQM/ECN markings
Step 2: Use topological information to complete congestion map
$P_{46} = \mathrm{func}(P_{06}, P_{04}) = \dfrac{P_{06} - P_{04}}{1 - P_{04}}$
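Step 2 can be sketched in a few lines. The sketch assumes that markings on the two segments are independent, so that the unmarked fractions multiply along the path: 1 - P06 = (1 - P04)(1 - P46). The function name `infer_segment_mp` and its argument names are illustrative and do not come from the slides.

```python
def infer_segment_mp(whole_path_mp: float, prefix_mp: float) -> float:
    """Infer the marking probability (MP) of the remaining path segment.

    Assumes independent marking on the two segments, so that
    (1 - whole) = (1 - prefix) * (1 - segment).
    """
    if not (0.0 <= whole_path_mp <= 1.0 and 0.0 <= prefix_mp < 1.0):
        raise ValueError("MPs must lie in [0, 1] and the prefix MP must be below 1")
    segment = (whole_path_mp - prefix_mp) / (1.0 - prefix_mp)
    # Measurement noise can push the result slightly outside [0, 1].
    return min(1.0, max(0.0, segment))


# Example: P06 = 0.28 and P04 = 0.10 correspond to P46 of about 0.20.
p46 = infer_segment_mp(0.28, 0.10)
```

The clamp to [0, 1] is a design choice of this sketch: inferred path-level values are noisy, so the raw quotient may fall just outside the valid range.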
AQM Background
AQM = Active Queue Management
Router marks/drops packets probabilistically as a function of congestion severity
Many different definitions of congestion severity
[Figure: marking probability (MP) versus congestion severity, with curves labeled "RED, PI" and "REM"]
We use marking probability (MP) as the congestion measure
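As an illustration of "marking probability as a function of congestion severity", here is a minimal sketch of a RED-style linear marking function. It is a simplification: it takes the average queue length as given and omits RED's queue averaging and its spacing of marks. The parameter names `min_th`, `max_th` and `max_p` follow common RED terminology and are not taken from the slides.

```python
def red_marking_probability(avg_queue: float, min_th: float,
                            max_th: float, max_p: float) -> float:
    """Simplified RED-style marking probability for a given average queue length."""
    if avg_queue < min_th:
        return 0.0            # no congestion: never mark
    if avg_queue >= max_th:
        return 1.0            # severe congestion: mark every packet
    # In between, the probability grows linearly from 0 to max_p.
    return max_p * (avg_queue - min_th) / (max_th - min_th)


mp_light = red_marking_probability(5, min_th=10, max_th=30, max_p=0.1)
mp_medium = red_marking_probability(20, min_th=10, max_th=30, max_p=0.1)
mp_heavy = red_marking_probability(30, min_th=10, max_th=30, max_p=0.1)
```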
ECN Background – Marking Data Packets
ECN = Explicit Congestion Notification
[Figure: data packets from S to D pass through an AQM/ECN router]
Data packets are marked probabilistically
Use of the Data Markings
[Figure: router topology (R0 to R6) with paths P30, P40 and P60]
Data markings describe congestion on routers’ ingress paths
Data packet marking is probabilistic => Use ratio of marked data packets to obtain MP on the ingress path
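The ratio estimate for data markings can be sketched as below. The standard error is the usual binomial one and assumes packets are marked independently; the function name and return shape are illustrative.

```python
import math


def mp_from_data_markings(marked: int, total: int) -> tuple[float, float]:
    """Estimate MP on an ingress path as the fraction of marked data packets.

    Returns (estimate, standard_error), assuming independent marking.
    """
    if total <= 0:
        raise ValueError("need at least one observed data packet")
    p = marked / total
    stderr = math.sqrt(p * (1.0 - p) / total)
    return p, stderr


# Example: 30 marked packets out of 100 observed.
p_hat, p_err = mp_from_data_markings(30, 100)
```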
ECN Background - Echoing
Echoing the markings from data packets to ACKs:
[Figure: DATA packets travel from S to D, ACKs travel from D back to S]
The ACK markings are an altered version of the data packet markings
ECN Background – Responding to Markings
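Responding to marked ACKs:
[Figure: S receives a marked ACK and sends a DATA packet carrying CWR]
Stopping the echoing after receiving a CWR packet:
[Figure: D stops marking ACKs once a DATA packet carrying CWR arrives from S]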
The ACK markings are an altered version of the data packet markings
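A small sketch shows why ACK markings differ from data markings. It models a receiver that starts echoing when a marked data packet arrives and stops when a CWR packet arrives, as in standard ECN. It assumes one ACK per data packet (no delayed ACKs); the tuple layout is illustrative.

```python
def echo_markings(data_packets):
    """Return, per data packet, whether the ACK sent for it is marked.

    data_packets: iterable of (congestion_marked, cwr) booleans in the
    order in which the receiver sees them.
    """
    acks = []
    echoing = False
    for congestion_marked, cwr in data_packets:
        if cwr:
            echoing = False      # the sender has reacted: stop echoing
        if congestion_marked:
            echoing = True       # a new marking (re)starts the echo
        acks.append(echoing)
    return acks


# One marked data packet, then a CWR two packets later.
data = [(False, False), (True, False), (False, False), (False, True), (False, False)]
acks = echo_markings(data)
```

In this example a single marked data packet produces two marked ACKs, so counting marked ACKs does not directly count router markings.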
Groups - Effect of ECN Echoing
Groups of unmarked ACKs of “size zero”:
[Figure: DATA, ACK and CWR packets exchanged with D, with a "Group of size zero" label]
Groups of marked and unmarked ACKs:
[Figure: DATA, ACK and CWR packets exchanged with D]
Use of the ACK Markings
[Figure: router topology (R0 to R6) with paths P03, P04 and P05]
ACK markings describe congestion on forward paths of the flows
ACK markings describe congestion on routers’ egress paths
Ratio of marked ACKs is an inaccurate measure
ACK markings are very important and more challenging to use
Obtaining MP from ACK Markings
p = MP on the forward path
$\mathrm{AVG\_SZ\_UNMARKED} = \mathrm{func}(p) = \sum_{n=0}^{\infty} n\,(1-p)^{n}\,p = \dfrac{1-p}{p}$
[Figure: DATA, ACK and CWR packets exchanged with D]
To get MP need to compute average size of groups of unmarked ACKs
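For reference, the closed form and its inverse can be derived as follows. The derivation assumes that a group of unmarked ACKs has size n with probability (1-p)^n p, i.e. that the group size is geometrically distributed.

```latex
\begin{align*}
\mathrm{AVG\_SZ\_UNMARKED}
  &= \sum_{n=0}^{\infty} n\,(1-p)^{n}\,p
   = p\,(1-p)\sum_{n=1}^{\infty} n\,(1-p)^{n-1} \\
  &= p\,(1-p)\cdot\frac{1}{\bigl(1-(1-p)\bigr)^{2}}
   = \frac{1-p}{p}, \\
p &= \frac{1}{1+\mathrm{AVG\_SZ\_UNMARKED}} .
\end{align*}
```

For example, an average group size of 4 corresponds to p = 1/(1+4) = 0.2.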
Average Size of Groups of Unmarked ACKs
[Figure: timeline from start to end of an Estimation Interval (EI), showing the training period, the Sampling Intervals (SI), flows Flow1 to Flow5 and a "Not selected" label]
Select flows until a limit is reached
During training period only select flows, do not compute samples
For each following SI:
Sample = avg size of groups of unmarked ACKs that finish in that SI
Discard groups that start or end in a different EI
At end of EI use AVG(SAMPLES) = (1-p)/p to obtain p
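The per-EI procedure can be sketched as follows. This is a minimal sketch under stated assumptions: groups are given as hypothetical `(start_time, end_time, size)` tuples, flow selection and the training period are not modelled, and SIs without any finished group contribute no sample.

```python
from collections import defaultdict


def estimate_mp(groups, ei_start, ei_end, si=0.5):
    """Estimate p for one Estimation Interval [ei_start, ei_end).

    groups: iterable of (start_time, end_time, size) for groups of
    unmarked ACKs (a hypothetical representation).
    """
    sizes_per_si = defaultdict(list)
    for start, end, size in groups:
        # Discard groups that start or end in a different EI.
        if start < ei_start or end >= ei_end:
            continue
        si_index = int((end - ei_start) // si)   # SI in which the group finishes
        sizes_per_si[si_index].append(size)
    if not sizes_per_si:
        return None                              # nothing observed in this EI
    samples = [sum(s) / len(s) for s in sizes_per_si.values()]
    avg = sum(samples) / len(samples)            # AVG(SAMPLES) = (1 - p) / p
    return 1.0 / (1.0 + avg)


groups = [(0.1, 0.2, 1), (0.3, 0.4, 3), (0.6, 0.9, 2),
          (-0.1, 0.1, 100), (0.9, 1.2, 100)]     # the last two straddle the EI
p_est = estimate_mp(groups, ei_start=0.0, ei_end=1.0, si=0.5)
```

Both SIs yield a sample of 2, so the estimate is 1/(1+2).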
Optimization – the Use of Groups of Size Zero
Probability of a group being of size zero is: $(1-p)^{0}\,p = p$
If p is high, most groups will be of size zero
Better statistical significance if groups of size zero are used
Routers need to be on both the data and ACK path of a flow
[Figure: DATA, ACK and CWR packets exchanged with D, with a "Group of size zero" label]
Use of groups of size zero increases accuracy
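The two properties used here, that a fraction p of groups has size zero and that the mean group size is (1-p)/p, can be checked with a short simulation. It uses an idealized model in which every packet is marked independently with probability p and each marking closes exactly one group; echoing dynamics are ignored.

```python
import random


def simulate_group_sizes(p, num_packets, seed=0):
    """Sizes of the runs of unmarked packets that precede each marking."""
    rng = random.Random(seed)
    sizes, current = [], 0
    for _ in range(num_packets):
        if rng.random() < p:     # a marking closes the current group
            sizes.append(current)
            current = 0
        else:
            current += 1
    return sizes


sizes = simulate_group_sizes(p=0.3, num_packets=200_000)
zero_fraction = sizes.count(0) / len(sizes)   # expected to be close to p
mean_size = sum(sizes) / len(sizes)           # expected to be close to (1 - p) / p
```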
Evaluation – Parameter Settings
ns-2 simulations, 500s simulation time
AQM algorithms (RED, PI, REM) – RED by default
SI = 0.5s (one congestion sample computed every 0.5s)
Monitor at most 1000 flows per EI/path
Groups of size zero used in all experiments
Evaluation – Traffic & Topology
5ms link delay, 500Mbps link bandwidth
Metric: 50th and 90th percentiles of |inferred MP – real MP| for each link
R0 to Ri : 250*i2 TCP flows
Ri to Ri+2: 100 TCP flows
[Figure: chain topology R0, R1, R2, ..., R8, R9, R10 with UDP sources attached; the last hop is labeled "Hop 10"]
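The error metric can be computed as in the sketch below. The slides do not say which percentile definition was used; the nearest-rank method here is an assumption, and the function name is illustrative.

```python
import math


def error_percentile(inferred, real, q):
    """q-th percentile (nearest-rank) of |inferred MP - real MP| for one link."""
    errors = sorted(abs(i - r) for i, r in zip(inferred, real))
    if not errors:
        raise ValueError("no samples")
    rank = max(1, math.ceil(q / 100.0 * len(errors)))
    return errors[rank - 1]


inferred = [0.11, 0.22, 0.33, 0.44, 0.70]
real = [0.10, 0.20, 0.30, 0.40, 0.50]
p50 = error_percentile(inferred, real, 50)
p90 = error_percentile(inferred, real, 90)
```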
Evaluation – vs Baseline Solution
Our group-based solution (GROUP):
[Figure: DATA, ACK and CWR packets exchanged with D]
Baseline solution, no alteration (REFERENCE):
[Figure: DATA, ACK and CWR packets exchanged with D]
[Figure: GROUP vs REFERENCE]
Sensitivity to the Length of the EI
[Figure: plot with x-axis "Value of EI (s)", log scale]
Accuracy decreases with hop count but is within 0.1 for most cases
Sensitivity to Drastic Changes
UDP sources vary their sending rate by 50Mbps between 250Mbps and 750Mbps
Every 10s we start 3000 TCP flows between random nodes, for a random time (0-10s)
How well does our solution track these sudden and large variations?
Sensitivity to Drastic Changes
[Figure: 50th and 90th percentile curves for EI = 3s and EI = 10s]
Accuracy decreases with hop count but is within 0.1 to 0.15 for most cases
Sensitivity to AQM Marking Function
A linear marking function allows better inference for our solution
Why does REM perform much worse?
Abrupt variations in marking probability
Limited visibility
[Figure: marking/drop probability versus congestion severity, with curves labeled "RED, PI" and "REM"]
Limited Visibility
[Figure: chain of routers R0, R1, R2 with paths P10, P20 and P12 (P12 = ??)]
R1 marks 100% of packets
R2 marks 30% of packets
If P20 = P10 = 100%, P12 is unknown (any value is possible)
At high MP (below 100%) the problem still exists because very few packets are left unmarked
Limited visibility appears at high MP. More probable for REM.
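The effect can be quantified with a small sketch. It assumes independent marking, so an unknown segment is recovered as (whole - known) / (1 - known), and it shows how a fixed error in the measured whole-path MP is amplified as the known segment's MP approaches 100%. The function name and the chosen numbers are illustrative.

```python
def segment_error(known_mp, true_segment_mp, measurement_error):
    """Error in the inferred segment MP caused by an error in the whole-path MP."""
    # Whole-path MP under independent marking on the two segments.
    whole = 1.0 - (1.0 - known_mp) * (1.0 - true_segment_mp)
    inferred = (whole + measurement_error - known_mp) / (1.0 - known_mp)
    return inferred - true_segment_mp


# Same measurement error of 0.005, increasingly congested known segment.
errors = {known: segment_error(known, 0.30, 0.005) for known in (0.30, 0.90, 0.99)}
```

The error grows as 1 / (1 - known MP): it is below 0.01 when the known segment marks 30% of packets, 0.05 at 90% and 0.5 at 99%. At exactly 100% the division is undefined, which matches the case where any value is possible.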
Sensitivity to Dropped ACKs - Numerical
Dropped ACKs can modify the average size of groups of unmarked ACKs
Group sizes 4, 5, 1, 5: average size 3.75
Group sizes 8, 1, 4: average size 4.33
ACKs can be dropped by non-AQM/ECN routers
Pure ACKs can be dropped even by AQM/ECN routers
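The numbers on this slide can be turned into MP estimates to see how much the drops shift the result. The sketch applies p = 1 / (1 + average group size), which follows from AVG = (1-p)/p; the helper name is illustrative.

```python
def mp_from_group_sizes(sizes):
    """Return (average group size, MP estimate) for groups of unmarked ACKs."""
    avg = sum(sizes) / len(sizes)
    return avg, 1.0 / (1.0 + avg)


avg_a, p_a = mp_from_group_sizes([4, 5, 1, 5])   # average size 3.75
avg_b, p_b = mp_from_group_sizes([8, 1, 4])      # average size 4.33
shift = p_a - p_b                                # change in the MP estimate
```

With these two sets of group sizes the estimate moves from about 0.21 to about 0.19.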
Sensitivity to Dropped ACKs - Numerical
At reasonable drop probabilities the additional error is low
Other Advantages of Our Solution
Incremental deployment
On specific paths
Around non-AQM/ECN routers
Useful in heterogeneous environments
Different AQM types
Related Work
Re-ECN [SIGCOMM 2005], ConEx IETF WG
Extends ECN with one step
Sources re-echo congestion information from ACK markings
A router on the forward path has upstream, downstream and whole-path congestion
Useful for traffic policing or traffic management
Lower precision. Limited by header space bits.
Needs modifications to ECN and headers
Does not address the challenge posed by ACK markings
Does not go beyond path-level congestion inference
Conclusion
Novel method for inferring congestion with zero network overhead
Does not require changes to hosts, headers or protocols
Incrementally deployable and useful in heterogeneous environments
Good accuracy even in very congested environments
Thank you
Credits for the pictures
• http://networkequipment.net/wp-content/uploads/2011/02/voip-telephone.jpg
• http://www.freefoto.com/images/04/28/04_28_50---US-Dollar-Bills_web.jpg
• http://www.ciscorouting.com/routing_engine.jpg
• http://www.rvoice.co.uk/uploads/Image/Green%20Tick.jpg
Why not Use Ratio for ACK Markings?
The ratio of marked ACKs is very inaccurate. Need a better solution.
Granularity of Inference
[Figure: router topology (R0 to R6) with paths P40 and P06]
Estimation Interval (EI)
Sampling Interval (SI)
estimate(P06) = AVG( {samples(P06)} )
Implementation
Counters per path: length and number of all groups of unmarked ACKs
Counters per flow: current group of unmarked ACKs
Prefix matching for source and destination
Transport protocol header matching for flow identification
Sequence numbers for CWR
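One possible arrangement of the per-path and per-flow counters is sketched below. It is a simplified illustration, not the authors' implementation: class and method names are hypothetical, ACKs are assumed to arrive in order, the run of unmarked ACKs before a flow's first marking is counted as a group, and the sequence-number matching for CWR is reduced to a plain `on_cwr` event.

```python
from dataclasses import dataclass


@dataclass
class PathCounters:
    total_length: int = 0      # summed sizes of all closed groups of unmarked ACKs
    num_groups: int = 0        # number of closed groups

    def average(self):
        return self.total_length / self.num_groups if self.num_groups else None


@dataclass
class FlowState:
    current_group: int = 0     # unmarked ACKs seen since the last marked run
    in_marked_run: bool = False


class PathMonitor:
    def __init__(self):
        self.path = PathCounters()
        self.flows = {}

    def on_ack(self, flow_id, marked):
        flow = self.flows.setdefault(flow_id, FlowState())
        if not marked:
            flow.current_group += 1
            flow.in_marked_run = False
        elif not flow.in_marked_run:
            # The first marked ACK of a run closes the current unmarked group.
            self.path.total_length += flow.current_group
            self.path.num_groups += 1
            flow.current_group = 0
            flow.in_marked_run = True

    def on_cwr(self, flow_id):
        # A CWR ends the marked run; a marked ACK arriving next closes a
        # group of size zero.
        flow = self.flows.get(flow_id)
        if flow is not None:
            flow.in_marked_run = False


monitor = PathMonitor()
for marked in (False, False, True, True):
    monitor.on_ack("f", marked)
monitor.on_cwr("f")
for marked in (True, False, True):
    monitor.on_ack("f", marked)
```

In this trace the closed groups have sizes 2, 0 and 1, so the per-path counters hold a total length of 3 over 3 groups.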