Origins of Long Range Dependence Myths and Legends Aleksandar Kuzmanovic 01/08/2001


Origins of Long Range Dependence

Myths and Legends

Aleksandar Kuzmanovic

01/08/2001


Outline

Definitions

Why is LRD important?

Heavy tails

Producing self-similar traffic

Physical interpretation in LAN and WAN networks

– Different hypotheses from around 10 papers


On the Self-Similar Nature of Ethernet Traffic, W. Willinger, 1994


Definitions

Long-range dependent (LRD) process

– its autocorrelation function is nonsummable

Self-similar process

– scaling behavior of finite-dimensional distributions

– X = m^(1-H) * X^(m) in distribution, where X^(m) is X averaged over non-overlapping blocks of size m

Second-order self-similar process

– aggregated processes possess the same non-degenerate AC functions as the original process

– X and m^(1-H) * X^(m) have the same AC function

Self-similar processes have hyperbolically decaying autocorrelation functions

– LRD can be characterized by a single parameter H
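The definitions above can be tried out numerically. The following is a minimal Python sketch, not something taken from the slides: it assumes X^(m) means block averages over non-overlapping blocks of size m, uses the variance-time relation Var(X^(m)) ~ m^(2H-2) as one possible estimator of H, and uses the standard closed-form autocorrelation of an exactly second-order self-similar process. All function names are hypothetical.

```python
import math
import random


def aggregate(x, m):
    """Block means X^(m): average x over non-overlapping blocks of size m."""
    n_blocks = len(x) // m
    return [sum(x[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance(x):
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def hurst_variance_time(x, block_sizes):
    """Estimate H from the log-log slope of Var(X^(m)) against m.

    Assumes Var(X^(m)) ~ m^(2H - 2), so H = 1 + slope / 2.
    """
    pts = [(math.log(m), math.log(variance(aggregate(x, m)))) for m in block_sizes]
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    sxy = sum((px - mx) * (py - my) for px, py in pts)
    sxx = sum((px - mx) ** 2 for px, _ in pts)
    return 1.0 + (sxy / sxx) / 2.0


def fgn_autocorrelation(k, hurst):
    """Autocorrelation at lag k >= 1 of an exactly second-order self-similar process."""
    e = 2.0 * hurst
    return 0.5 * ((k + 1) ** e - 2 * k ** e + (k - 1) ** e)


if __name__ == "__main__":
    rng = random.Random(1)
    iid = [rng.gauss(0.0, 1.0) for _ in range(2 ** 14)]
    # Independent noise has no long-range dependence, so H should come out near 0.5.
    print("H (i.i.d. noise):", hurst_variance_time(iid, [2 ** j for j in range(8)]))
    # For H > 0.5 the partial sums of the autocorrelation keep growing (nonsummable).
    for n_lags in (10 ** 3, 10 ** 4):
        print(n_lags, sum(fgn_autocorrelation(k, 0.8) for k in range(1, n_lags + 1)))
```

For H = 0.5 the autocorrelation is zero at every lag k >= 1, while for H = 0.8 the partial sums grow without bound as more lags are added, which is the nonsummability in the LRD definition.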


Heavy tails (Noah effect)

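Heavy-tailed distributions

$\bar{F}(x) = P[X > x] \sim x^{-\alpha}$ as $x \to \infty$, with $0 < \alpha < 2$

– LLCD (log-log complementary distribution) plot: $d \log \bar{F}(x) / d \log x \approx -\alpha$ for large $x$

– $\alpha \le 1$: infinite mean; $\alpha \le 2$: infinite variance

Pareto a typical example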
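A small Python sketch of the LLCD idea, under stated assumptions: samples are drawn from a Pareto law with P[X > x] = (k/x)^alpha by the inverse-CDF method, and the tail index is read off as the least-squares slope of the empirical log-log complementary distribution. The function names and the choice of fitting over all points are illustrative, not from the slides.

```python
import math
import random


def pareto_sample(alpha, k, rng):
    """Inverse-CDF draw from a Pareto law with P[X > x] = (k / x)^alpha, x >= k."""
    u = rng.random()  # uniform on [0, 1), so 1 - u is in (0, 1]
    return k * (1.0 - u) ** (-1.0 / alpha)


def llcd_points(samples):
    """(log10 x, log10 empirical P[X > x]) pairs.

    The largest sample is dropped because its empirical tail probability is 0.
    """
    xs = sorted(samples)
    n = len(xs)
    return [(math.log10(x), math.log10((n - i - 1) / n)) for i, x in enumerate(xs[:-1])]


def llcd_slope(samples):
    """Least-squares slope of the LLCD plot; roughly -alpha for a Pareto sample."""
    pts = llcd_points(samples)
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    sxy = sum((px - mx) * (py - my) for px, py in pts)
    sxx = sum((px - mx) ** 2 for px, _ in pts)
    return sxy / sxx


if __name__ == "__main__":
    rng = random.Random(7)
    data = [pareto_sample(1.2, 1.0, rng) for _ in range(20000)]
    print("LLCD slope:", llcd_slope(data))  # expected to be close to -1.2
```

With measured data only the upper tail is expected to be linear, so in practice the fit would be restricted to values above some threshold; the exact Pareto sample here is linear over its whole range.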


Producing Self-Similar Traffic

1. Multiplexing ON/OFF sources that have a fixed rate in ON periods and ON/OFF period lengths that are heavy tailed.

– Aggregate traffic is fBm with $H = (3 - \min(\alpha_1, \alpha_2)) / 2$

2. M/G/$\infty$ queue model

– implies that multiplexing constant-rate connections with Poisson connection arrivals and a heavy-tailed distribution for connection lifetimes would result in self-similar traffic

3. Inter-arrival packet times are i.i.d. Pareto with shape parameter close to 1

– if we then consider the corresponding count process (the number of arrivals in consecutive intervals), we have “pseudo self-similar” traffic (Paxson, Floyd) (or even self-similar (L. Lipsky)?)
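As an illustration of the second construction, here is a hedged Python sketch of an M/G/infinity-style traffic generator: connections arrive as a Poisson process, each stays active for a Pareto-distributed lifetime, and the traffic in a slot is the number of connections active at the start of that slot. The function name, the unit slot length, and the empty initial system are assumptions made for this example.

```python
import math
import random


def mg_infinity_counts(rate, alpha, k, horizon, rng):
    """Number of active connections at the start of each unit slot.

    rate    : Poisson connection arrival rate per slot
    alpha,k : Pareto lifetime parameters, P[L > x] = (k / x)^alpha for x >= k
    horizon : number of slots to simulate (the system starts empty)
    """
    diff = [0] * (horizon + 1)          # difference array of activations
    t = rng.expovariate(rate)           # first arrival time
    while t < horizon:
        life = k * (1.0 - rng.random()) ** (-1.0 / alpha)
        start = math.ceil(t)            # first slot boundary the connection covers
        end = min(horizon, math.ceil(t + life))
        if start < end:
            diff[start] += 1
            diff[end] -= 1
        t += rng.expovariate(rate)      # exponential gaps give Poisson arrivals
    counts, active = [], 0
    for s in range(horizon):
        active += diff[s]
        counts.append(active)
    return counts


if __name__ == "__main__":
    traffic = mg_infinity_counts(rate=2.0, alpha=1.4, k=1.0, horizon=2000,
                                 rng=random.Random(11))
    print(traffic[:20])
```

Because the system starts empty, the first part of the trace is a warm-up transient and would normally be discarded before estimating H.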


Questions we want to answer

What physical activity causes LRD?

What is the role of protocols (TCP and MAC layer protocols)?

What is the role of limited resources (i.e. bandwidth)?

What model fits best to each of the assumptions?

What is the largest time-scale over which the correlation is present?

Self-similarity vs. pseudo self-similarity and relevance


Statistical Analysis of Ethernet LAN Traffic at the Source Level, W. Willinger, 1997, I


Statistical Analysis of Ethernet LAN Traffic at the Source Level, W. Willinger, 1997, II

Model 1 (heavy-tailed ON/OFF activity at the source level) is widely accepted

Result proven theoretically

Noah effect (heavy-tailed periods)

– ON periods alpha = 1.7

– OFF periods alpha = 1.2

TCP traffic measured most of the time...

Higher load - H increases

WAN measurements do not fit into this model

– connections typically do not stay long
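To make Model 1 concrete, the sketch below superposes ON/OFF sources with Pareto period lengths and evaluates the Model 1 relation H = (3 - min(alpha_on, alpha_off)) / 2 for the tail indices quoted on this slide. It is a toy generator under assumptions of my own (unit slots, unit rate while ON, period lengths truncated to whole slots, random initial state); the function names are hypothetical.

```python
import random


def pareto(alpha, k, rng):
    """Pareto draw with P[X > x] = (k / x)^alpha for x >= k (inverse-CDF method)."""
    return k * (1.0 - rng.random()) ** (-1.0 / alpha)


def hurst_on_off(alpha_on, alpha_off):
    """Model 1 relation for the aggregate: H = (3 - min(alpha_on, alpha_off)) / 2."""
    return (3.0 - min(alpha_on, alpha_off)) / 2.0


def on_off_source(alpha_on, alpha_off, horizon, rng):
    """Per-slot activity (1 = ON, 0 = OFF) of one source over `horizon` unit slots."""
    slots = []
    on = rng.random() < 0.5  # random initial state (assumption)
    while len(slots) < horizon:
        alpha = alpha_on if on else alpha_off
        # Period length in whole slots, capped at what is left of the horizon.
        length = min(horizon - len(slots), max(1, int(pareto(alpha, 1.0, rng))))
        slots.extend([1 if on else 0] * length)
        on = not on
    return slots


def aggregate_on_off(n_sources, alpha_on, alpha_off, horizon, rng):
    """Number of sources that are ON in each slot."""
    total = [0] * horizon
    for _ in range(n_sources):
        for t, active in enumerate(on_off_source(alpha_on, alpha_off, horizon, rng)):
            total[t] += active
    return total


if __name__ == "__main__":
    print("H from the Model 1 relation:", hurst_on_off(1.7, 1.2))
    print(aggregate_on_off(20, 1.7, 1.2, 50, random.Random(3)))
```

With ON alpha = 1.7 and OFF alpha = 1.2 the heavier tail (1.2) dominates, and the relation gives H = (3 - 1.2) / 2 = 0.9.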


Wide Area Traffic: The Failure of Poisson Modeling, V. Paxson, S. Floyd, 1995

Summary of ways to produce LRD traffic

WAN (TCP) traffic for TELNET and FTP applications

– TELNET connection arrivals appear to be Poisson, but packet arrivals are not

– Single TELNET connection is LRD (Model 3: inter-arrival times are i.i.d. Pareto)

– Aggregate is also LRD, but there is no analytical proof (*)

FTP traffic is also LRD, yet none of the models fit because of limited resources.

Aggregated traffic is not fBm (a single H is not enough)
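Model 3 can be sketched in a few lines of Python: packet arrivals form a renewal process with i.i.d. Pareto inter-arrival times, and the traffic trace is the count of arrivals per fixed-width bin. The shape value 1.05 in the usage lines is only an illustrative choice, and the function name and bin layout are assumptions for this example.

```python
import random


def pareto_renewal_counts(alpha, k, bin_width, n_bins, rng):
    """Arrivals per bin for a renewal process with i.i.d. Pareto inter-arrival times.

    Inter-arrival times satisfy P[T > x] = (k / x)^alpha for x >= k.
    """
    counts = [0] * n_bins
    horizon = bin_width * n_bins
    t = k * (1.0 - rng.random()) ** (-1.0 / alpha)      # time of the first arrival
    while t < horizon:
        index = min(n_bins - 1, int(t // bin_width))    # guard against float edge cases
        counts[index] += 1
        t += k * (1.0 - rng.random()) ** (-1.0 / alpha)
    return counts


if __name__ == "__main__":
    # alpha = 1.05 is an illustrative value, not one taken from the slides.
    trace = pareto_renewal_counts(alpha=1.05, k=1.0, bin_width=10.0, n_bins=100,
                                  rng=random.Random(2))
    print(trace)
```

Rebinning the same arrival times at coarser bin widths and comparing the burstiness across widths is the usual way to see over which range of time scales the trace looks self-similar.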


Explaining WWW Traffic Self-Similarity, M. Crovella, 1995

WWW traffic is self-similar

– but only when load is high (i.e. in busiest hours)

Authors force model 1 (ON/OFF model)

– Heavy-tailed distributions of:

transfer times (alpha = 1.21)

user requests for documents (alpha = 1.06)

document sizes available in the Web (alpha = 1.05)

user think times (alpha = 1.5)

H increases as the load increases (same as in LAN)


On the Relationship Between File Sizes, Transport Protocols, and Self-Similar Network Traffic, M. Crovella, 1996

Model 1: The success of this simple model is surprising given that it ignores nonlinearities arising in real networks

Hypothesis:

– Heavy-tailed file size distributions together with TCP are responsible for LRD

– if UDP is used, there is little or no LRD

Explanation

– “In some sense, the effect of the unaccounted for nonlinearity is reflected back as a stretching in time effect, thus conforming to the model’s original suppositions”

Other interesting stuff: mix of Pareto and exp. background traffic


On the Propagation of LRD in the Internet, A. Veres, 2000, I

Not about roots, but about propagation of self-similarity by TCP

A(t) = C - B(t)

TCP is a linear system beyond a characteristic time scale

– if it adapts well to background traffic, it itself becomes self-similar

[Figure: Forward Traffic and Backward Traffic]
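One way to read A(t) = C - B(t) is that the adaptive flow A takes whatever the background traffic B leaves of the capacity C (this reading of the symbols is an assumption; the slide does not define them). Under that reading A is an affine function of B, so it has exactly the same autocorrelation function, and any long-range dependence in B carries over to A. The Python sketch below checks this on an arbitrary sample path; the function names are hypothetical.

```python
import random


def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var


def leftover_rate(capacity, background):
    """A(t) = C - B(t): rate left over for the adaptive flow in each interval."""
    return [capacity - b for b in background]


if __name__ == "__main__":
    rng = random.Random(9)
    # Arbitrary correlated background trace (a moving sum of uniform noise).
    noise = [rng.uniform(0.0, 1.0) for _ in range(2010)]
    background = [sum(noise[i:i + 10]) for i in range(2000)]
    adaptive = leftover_rate(10.0, background)
    for lag in (1, 5, 20):
        print(lag, autocorrelation(background, lag), autocorrelation(adaptive, lag))
```

The two columns agree up to floating-point rounding, because subtracting from a constant flips the sign of every deviation from the mean and the signs cancel in each product.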


On the Propagation of LRD in the Internet, A. Veres, 2000, II

Experimental proof:

– NY-Budapest file transfer, source is not LRD - traffic is LRD (H=0.76)

– Max time scale = 8 min

Also, if there is a number of on-off TCP connections, they can spread LRD

W. Willinger obviously does not like this paper:

– “This is a fraud and has no relevance for LRD observed on link level...”

– “Protocols have no impact on LRD, they just have to send the data generated by applications...”


TCP Congestion Control and Heavy-Tails, M. Crovella, 2000, I

Switch to Model 3 (Heavy-tailed inter-packet arrivals)

Although heavy-tailed flow lengths are commonly associated with heavy-tailed file sizes, there is no strong correlation between file sizes and transmission times

It has been shown that TCP can show heavy-tailed inter-arrival times under some conditions

Because most of the connections are short lived (!), only slow start and exp. back-off were considered


TCP Congestion Control and Heavy-Tails, M. Crovella, 2000, II

Simple Markov chain model for exp. backoff and slow start, with the probability of loss as a parameter

State probability with different loss rates

For alpha to be between 1 and 2, p has to be between 1/8 and 1/4 ... but for a different model

p increases => H increases


TCP Congestion Control and Heavy-Tails, M. Crovella, 2000, III

Pathological TCP connections: 15 packets

Analytical model not that good (borders are loose)

For this set-up, correlation up to 1000 sec

For larger file sizes, up to 200-300 sec

Under certain conditions, heavy-tailed transmission times can occur even in the absence of any variability in file sizes

Future work: to consider the variability in round-trip time estimation


On the Autocorrelation Structure of TCP Traffic, Don Towsley, 2000, I

Answer to previous two papers:

– TCP can create self-similarity, but over a finite range of time scales - “pseudo self similarity”; but everything in nature is finite (thus “pseudo”)

– They also criticize the pathological model of the previous paper, but they themselves use a pathological model of a different kind (“always packets” model, i.e. the source always has data to send)

Separate Markovian models for Congestion Avoidance (CA) and Time Out (TO)

Simulated these two models with different loss probability parameters


On the Autocorrelation Structure of TCP Traffic, Don Towsley, 2000, II

Range of time scales observed from the simulation (2^6*RTT*(2.5 to 10)) => 2^9*RTT

Explanation of why the aggregate is self-similar

– independent bottlenecks (at the edge)

– aggregate of independent pseudo-self-similar flows should be self-similar itself (**)


On the Autocorrelation Structure of TCP Traffic, Don Towsley, 2000, III

About the Veres paper (!)

– compute loss probability (0.08 to 0.14)

– TO model predicts H=0.69-0.72 (really measured 0.74)

– Time scale goes up to 2^6 RTO (also near measured value)

Experiments (file transfers)

– North-South America

Measurements: p = 0.13, H = 0.77, ts = (2^7 to 2^8)*RTT

TO model: p = 0.12, H = 0.72, ts = (2^7 to 2^9)*RTT

– East - West Coast

Measurements: p = 0.018, H = 0.86, ts = 2^6*RTT

CA model: p = 0.018, H = 0.75, ts = 2^4*RTT

One should be careful when attributing the origin of traffic characteristics to a specific cause


Protocols Can Make Traffic Appear Self-Similar, Jon Peha, 1997. I

How a basic retransmission mechanism can cause self-similarity

No model, only experimental investigation

Simple single queue (bottleneck) model

Input traffic - Poisson; retransmissions are bursty

As time-scale gets larger, burstiness from original Poisson traffic decreases, but burstiness from retransmissions stays the same!

Unlikely that traffic from the retransmission mechanism causes truly self-similar traffic, rather pseudo self-similarity
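The mechanism described above can be mimicked with a toy simulation. The sketch below is not the setup from the paper: it is a hypothetical slotted model in which new packets arrive as Poisson traffic at a bottleneck that serves one packet per slot, the buffer is finite, and every packet lost to overflow is offered again exactly `timeout` slots later. All names and parameter values are assumptions for illustration.

```python
import math
import random


def poisson(lam, rng):
    """Poisson draw by Knuth's multiplication method (adequate for small lam)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_bottleneck(load, buffer_size, timeout, n_slots, rng):
    """Return (new, retransmitted) packets offered to the queue in each slot.

    load        : mean number of new packets per slot (Poisson)
    buffer_size : packets the queue can hold; excess arrivals are lost
    timeout     : slots after which a lost packet is retransmitted (>= 1)
    """
    retry_at = [0] * (n_slots + timeout)
    new_pkts, retx_pkts = [], []
    queue = 0
    for t in range(n_slots):
        fresh = poisson(load, rng)
        retries = retry_at[t]
        arrivals = fresh + retries
        accepted = min(arrivals, buffer_size - queue)
        queue += accepted
        retry_at[t + timeout] += arrivals - accepted  # lost packets return later
        if queue > 0:
            queue -= 1                                # serve one packet per slot
        new_pkts.append(fresh)
        retx_pkts.append(retries)
    return new_pkts, retx_pkts


if __name__ == "__main__":
    new_pkts, retx_pkts = simulate_bottleneck(0.9, 2, 50, 5000, random.Random(5))
    print("new:", sum(new_pkts), "retransmitted:", sum(retx_pkts))
```

Losses happen in clusters while the buffer is full, so retransmissions come back in clusters one timeout later; comparing the variability of the two returned traces at increasing aggregation levels is one way to look for the effect the slide describes.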


Protocols Can Make Traffic Appear Self-Similar, Jon Peha, 1997. II

Pictorial “proof” (figure missing)


Protocols Can Make Traffic Appear Self-Similar, Jon Peha, 1997. III

Cut-off time scales observed:

– 150Mbps link rate, 500-bit packets, RTT 60 msec: TS = 5 minutes

– 10Mbps Ethernet, No. of retransmissions=5, To=125: TS in range of minutes

– For larger To, it is possible to reach time scales measured at Bellcore

– I have computed cut-off time-scale for Veres paper: 128 Kbps, Tout=10*RTT=2 sec, TS=8min

If this effect is found to be as strong in more complex models, this could be a significant cause


The Second-order Characteristics of TCP, J.Y.Boudec, 1996, I

Pseudo self similarity (TS=20-30 sec)

– Minimum bottleneck bandwidth 34Mbps (?)

Two main reasons (both heavy-tailed)

– Burst length arrivals

– Round trip time

Real network measurements

Figure - missing


The Second-order Characteristics of TCP, J.Y.Boudec, 1996, II

Even for 34Mbps link and utilization of 25%, the arrival bursts are eliminated and the inter packet times are dependent on the round trip times

The aggregate of TCP connections has the same H as a single TCP connection (***)

“It seems likely that the heavy tailed distributions observed in Willinger’s work were a result of, among other things, the heavy tailed distribution of a round trip time”


More on RTTs

Why are round trip times heavy-tailed?

– Because of TCP congestion control?

– Because of retransmissions?

– Because of variety of destinations?

It can be heavy-tailed even without any congestion protocol or different destinations!

– Measurement and Analysis of LRD Behavior of Internet Packet Delay, M. Borella, Infocom 97

Constant UDP transmissions - LRD response

Is cross-traffic heavy-tailed? Or multiple bottlenecks assumption?

– Simple example (not through bandwidth adaptation, but through RTT adaptation)


Summary

Heavy-tailed parameters

– File sizes

– Connection life-times

– Inter-arrival packet times

– Document sizes available in the web

– User think times

– TELNET packet arrivals

– Round trip times

Pseudo self-similarity

– it should be clear that the range of time scales covered is far beyond the dominant time scales, and as far as packet loss is concerned, this is relevant


Conclusions

One should be careful when attributing the origin of traffic characteristics to a specific cause

There is more than one physical activity causing LRD

The influence of protocols (TCP) is more than relevant

– Time scales covered are relevant in the generation, time-stretching and propagation hypotheses

Model 3 (inter-arrival times i.i.d. Pareto) plus heavy-tailed file sizes (introducing congestion) is promising

Analytical proof for aggregate is missing (simulation proof reported in 3 papers)

Round-trip time hypothesis might be promising - supports Veres's idea in a slightly different way