
Mohammad Alizadeh

Adel Javanmard and Balaji Prabhakar

Stanford University

Analysis of DCTCP: Stability, Convergence, and Fairness

Data Center Packet Transport

• Transport inside the DC
– TCP rules (99.9% of traffic in some DCs)

• But, TCP:
– Needs large buffers for high throughput
– Induces large queuing delays
– Does not handle bursty traffic well (Incast)

• DCTCP was proposed to address these shortcomings (SIGCOMM’10).

TCP Buffer Requirement

• Bandwidth-delay product rule of thumb:
– A single flow needs C×RTT buffers for 100% throughput.

[Figure: queue occupancy for three buffer sizes B]
– B < C×RTT: throughput loss!
– B = C×RTT
– B > C×RTT: more latency!

To lower the buffering requirements, we must reduce sending rate variations.

DCTCP: Main Ideas

1. React in proportion to the extent of congestion. Reduce window size based on fraction of marked packets.

2. Mark based on instantaneous queue length. Fast feedback to better deal with bursts. Simplifies hardware.

ECN Marks              TCP                  DCTCP
1 0 1 1 1 1 0 1 1 1    Cut window by 50%    Cut window by 40%
0 0 0 0 0 0 0 0 0 1    Cut window by 50%    Cut window by 5%
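The two rows above can be reproduced with a short calculation. This is only a sketch: it assumes the DCTCP cut equals half the marked fraction of the ten ACKs shown, which matches the numbers on the slide, and the function names are illustrative rather than taken from the talk.

```python
def tcp_cut(marks):
    """Fraction by which TCP cuts its window: 50% if any packet is marked."""
    return 0.5 if any(marks) else 0.0


def dctcp_cut(marks):
    """Fraction by which DCTCP cuts its window: half the marked fraction."""
    return 0.5 * sum(marks) / len(marks)


heavy = [1, 0, 1, 1, 1, 1, 0, 1, 1, 1]  # 8 of 10 packets marked
light = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]  # 1 of 10 packets marked

for marks in (heavy, light):
    print(f"TCP cuts {tcp_cut(marks):.0%}, DCTCP cuts {dctcp_cut(marks):.0%}")
```

TCP reacts to the presence of marks, so both rows give the same 50% cut, while DCTCP reacts to their extent.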

DCTCP: Algorithm

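Switch side:
– Mark packets when Queue Length > K.

[Figure: switch buffer of size B with marking threshold K; packets are marked while the queue is above K and not marked below it.]

Sender side:
– Maintain running average of fraction of packets marked (α):

each RTT: $F = \frac{\#\text{ of marked ACKs}}{\text{Total } \#\text{ of ACKs}}$, then $\alpha \leftarrow (1 - g)\,\alpha + g\,F$

– Adaptive window decreases: $W \leftarrow (1 - \alpha/2)\,W$

– Note: decrease factor between 1 and 2.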
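A minimal sketch of the sender-side bookkeeping may help make the update concrete. Several things here are assumptions, not statements from the slides: the class and method names, the additive increase of one packet per unmarked RTT, and the choice to apply at most one cut per RTT.

```python
class DctcpSender:
    """Toy DCTCP sender state: window W (packets) and marking estimate alpha."""

    def __init__(self, window=10.0, g=1 / 16):
        self.window = window
        self.alpha = 0.0
        self.g = g

    def on_rtt(self, marked_acks, total_acks):
        """Process one RTT worth of ACKs and return the new window."""
        frac = marked_acks / total_acks if total_acks else 0.0
        # Running average of the marked fraction: alpha <- (1 - g) alpha + g F
        self.alpha = (1 - self.g) * self.alpha + self.g * frac
        if marked_acks > 0:
            # Proportional decrease: W <- (1 - alpha / 2) W
            self.window *= 1 - self.alpha / 2
        else:
            # Assumption: standard additive increase of one packet per RTT
            self.window += 1.0
        return self.window


sender = DctcpSender(window=100.0, g=1 / 16)
sender.on_rtt(marked_acks=0, total_acks=10)   # no marks: window grows
sender.on_rtt(marked_acks=10, total_acks=10)  # all marked: small cut while alpha is small
print(f"alpha = {sender.alpha:.4f}, window = {sender.window:.2f}")
```

Because α is an average, a single fully marked RTT only produces a small cut; the cut approaches 50% only if marking persists.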

DCTCP vs TCP

Setup: Win 7, Broadcom 1Gbps Switch

Scenario: 2 long-lived flows, K = 30KB

Analysis of DCTCP

Steady State Analysis

• What is the effect of the various network and algorithm parameters on system throughput and latency?
– Network: Capacity, Round-trip Time, Number of flows
– Algorithm: Marking threshold (K), Averaging parameter (g)

• The standard approach is to study control loop behavior via fluid models.
– Kelly et al., Low et al., Misra et al., Srikant et al., …

DCTCP Fluid Model

[Figure: block diagram of the fluid model. Source and Switch blocks in a feedback loop, with labels N/RTT(t), p(t), Delay, and p(t – R*).]

Fluid Model vs ns2 simulations

• Parameters: N = {2, 10, 100}, C = 10Gbps, d = 100μs, K = 65 pkts, g = 1/16.

[Figure: fluid model vs. ns2 simulation, one panel each for N = 2, N = 10, N = 100.]
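The model equations are not legible in this transcript, so the following is a sketch under stated assumptions rather than the authors' code. It assumes the usual form of the DCTCP fluid model: dW/dt = 1/R − W·α·p(t − R*)/(2R), dα/dt = (g/R)·(p(t − R*) − α), dq/dt = N·W/R − C, with p(t) = 1 when q(t) > K, R = d + q/C, and R* = d + K/C. It also assumes 1500-byte packets and a simple forward-Euler integration; all names are illustrative.

```python
from collections import deque


def simulate_dctcp_fluid(n_flows, capacity_pps, prop_delay, k_pkts, g,
                         duration, dt=1e-6):
    """Forward-Euler integration of the assumed DCTCP fluid model.

    Returns the queue length (packets) at every time step.
    """
    r_star = prop_delay + k_pkts / capacity_pps      # feedback delay R*
    delay_steps = max(1, round(r_star / dt))
    # p_history[0] is always the marking signal from delay_steps steps ago.
    p_history = deque([0.0] * delay_steps, maxlen=delay_steps)
    w, alpha, q = 1.0, 0.0, 0.0
    queue_trace = []
    for _ in range(round(duration / dt)):
        rtt = prop_delay + q / capacity_pps
        p_delayed = p_history[0]
        dw = 1.0 / rtt - w * alpha * p_delayed / (2.0 * rtt)
        dalpha = g / rtt * (p_delayed - alpha)
        dq = n_flows * w / rtt - capacity_pps
        p_history.append(1.0 if q > k_pkts else 0.0)  # mark when q > K
        w += dw * dt
        alpha += dalpha * dt
        q = max(q + dq * dt, 0.0)                     # queue cannot go negative
        queue_trace.append(q)
    return queue_trace


C_PPS = 10e9 / (8 * 1500)  # 10 Gbps in 1500-byte packets per second (assumed size)
queue = simulate_dctcp_fluid(n_flows=2, capacity_pps=C_PPS, prop_delay=100e-6,
                             k_pkts=65, g=1 / 16, duration=0.05)
tail = queue[len(queue) // 2:]
print(f"queue over second half: mean {sum(tail) / len(tail):.1f} pkts, "
      f"max {max(tail):.1f} pkts")
```

Changing `n_flows` to 10 or 100 gives the other two parameter settings listed above; the queue oscillates around the marking threshold K rather than settling at it.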

Normalization of Fluid Model

• We make the following change of variables:

• The normalized system:

• The normalized system depends on only two parameters:

Equilibrium Characterization Case 1:

• Very large N: system (globally) converges to a unique fixed point:

$(\tilde{W}, \tilde{\alpha}, \tilde{q}) \to (2,\ 1,\ 2/\bar{w} - 1)$, i.e. $(W, \alpha, q) \to (2,\ 1,\ 2N - Cd)$

Example: g = 1/16.

Equilibrium Characterization Case 2:

• System has a periodic limit cycle solution.

Example: g = 1/16.

Stability of Limit Cycles

• Let X* = set of points on the limit cycle.

• A limit cycle is locally asymptotically stable if δ > 0 exists s.t.:

Poincaré Map

$x_2 = P(x_1)$

Stability of Poincaré Map ↔ Stability of limit cycle

$x^*_\alpha = P(x^*_\alpha)$

Stability Criterion

• Theorem: The limit cycle of the DCTCP system:

is locally asymptotically stable if and only if $\rho(Z_1 Z_2) < 1$.

- $J_F$ is the Jacobian matrix with respect to x.

- $T = (1 + h_\alpha) + (1 + h_\beta)$ is the period of the limit cycle.

• Proof: Show that $P(x^*_\alpha + \delta) = x^*_\alpha + Z_1 Z_2 \delta + O(|\delta|^2)$.
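Checking the criterion numerically amounts to computing a spectral radius. The sketch below shows one dependency-free way to do that, using Gelfand's formula ρ(A) = lim ‖A^k‖^(1/k) with repeated squaring. The matrices `Z1` and `Z2` are hypothetical placeholders chosen for illustration; the definitions of the real matrices are not recoverable from this transcript.

```python
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def frobenius(a):
    """Frobenius norm of a matrix."""
    return math.sqrt(sum(x * x for row in a for x in row))


def spectral_radius(a, squarings=40):
    """Estimate rho(A) as ||A^k||^(1/k) with k = 2**squarings."""
    log_norm = 0.0  # invariant: A^(2^i) == exp(log_norm) * m
    m = a
    for _ in range(squarings):
        norm = frobenius(m)
        if norm == 0.0:
            return 0.0  # some power of A vanished (nilpotent, or underflow)
        m = [[x / norm for x in row] for row in m]
        log_norm += math.log(norm)
        m = matmul(m, m)
        log_norm *= 2.0
    norm = frobenius(m)
    if norm == 0.0:
        return 0.0
    return math.exp((log_norm + math.log(norm)) / 2.0 ** squarings)


# Hypothetical upper-triangular placeholders, NOT the matrices from the theorem.
Z1 = [[0.9, 0.1, 0.0], [0.0, 0.8, 0.2], [0.0, 0.0, 0.5]]
Z2 = [[1.0, 0.3, 0.0], [0.0, 0.9, 0.1], [0.0, 0.0, 0.7]]

rho = spectral_radius(matmul(Z1, Z2))
print("locally asymptotically stable" if rho < 1 else "criterion not met")
```

The placeholders are triangular so the answer can be checked by hand: the eigenvalues of the product are the products of the diagonal entries (0.9, 0.72, 0.35), so the spectral radius is 0.9.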

We have numerically checked this condition for:

Parameter Guidelines

• How big does the marking threshold K need to be to avoid queue underflow?
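As a rough worked example, the Conclusion of this talk quotes 17% of C×RTT as the buffering DCTCP needs for full throughput. Treating that figure as a lower bound on K is an assumption made here for illustration, as are the 1500-byte packet size and the function names.

```python
def bdp_packets(capacity_bps, rtt_s, packet_bytes=1500):
    """Bandwidth-delay product C x RTT, expressed in packets."""
    return capacity_bps * rtt_s / (8 * packet_bytes)


def min_marking_threshold(capacity_bps, rtt_s, fraction=0.17, packet_bytes=1500):
    """Assumed guideline: K should be at least `fraction` of C x RTT."""
    return fraction * bdp_packets(capacity_bps, rtt_s, packet_bytes)


# Illustrative link: C = 10 Gbps, RTT = 100 microseconds.
bdp = bdp_packets(10e9, 100e-6)
k_min = min_marking_threshold(10e9, 100e-6)
print(f"C x RTT = {bdp:.1f} pkts, K should be at least about {k_min:.1f} pkts")
```

For this link C×RTT is about 83 packets, so the assumed guideline puts K at roughly 14 packets, well below the K = 65 packets used in the ns2 comparison.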

Throughput-Latency Tradeoff

Throughput > 94% as K → 0

• Parameters: C = 10Gbps, d = 480μs, g = 0.05.

For TCP: Throughput → 75%

Convergence Analysis

• How long does it take for DCTCP sources to converge to their “fair share” rate (C/N)?
– DCTCP is slower to converge than TCP since it cuts its window by smaller factors.

• The fluid model is not suitable for transient analyses.

• We use a hybrid (continuous- and discrete-time) model.
– The model is inspired by the AIMD models of Baccelli et al. and Shorten et al.

The Hybrid Model

Rate of Convergence (Theorem)

Assume N DCTCP flows with arbitrary Wi(0) and αi(0), evolving according to the Hybrid Model, with:

Define function , and let 0 < α*≤ 1 be the unique positive solution to

Then: Also:

where:

Consequences

• DCTCP converges at most 40% slower than TCP:

• The parameter g should not be too small:

(g = 0.07) (g = 0.025) (g = 0.005)

Convergence: ns2 Simulations

Conclusion

• Our analysis shows DCTCP:
– requires 17% of C×RTT for full throughput.
– achieves 94% throughput as K → 0.
– converges at most 1.4 times slower than TCP.

• We provide guidelines for setting the DCTCP parameters.

• The analysis suggests a simple modification that improves the RTT-fairness of DCTCP.
– Achieves linear RTT-fairness (throughput ∝ 1/RTT), like TCP-RED.
