automated modeling of broadband network data using the qtes … · 2005-02-12 · traffic modeling...


Page 1: Automated Modeling of Broadband Network Data Using the QTES … · 2005-02-12 · Traffic modeling is a key element in simulating communications networks. A clear understanding of

Automated Modeling of Broadband Network Data Using the QTES Methodology

G. Klaoudatos, BEng. Electrical Engineering, University of Patras, Greece

A thesis submitted to the Faculty of Graduate Studies and Research

in partial fulfilment of the requirements for the degree of

Master of Engineering

Ottawa-Carleton Institute for Electrical Engineering
Department of Systems and Computer Engineering

Carleton University
Ottawa, Ontario
October 8, 1997

© copyright 1997, G. Klaoudatos


National Library of Canada
Acquisitions and Bibliographic Services
395 Wellington Street, Ottawa ON K1A 0N4, Canada

The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.


Abstract

As new and more advanced communication services evolve, the demand grows for better models that predict system performance more accurately. A number of these services, supported by high-speed networks, give rise to bursty (autocorrelated) traffic streams. A typical example is VBR (variable bit rate) compressed video.

The objective of this thesis is to investigate a new modeling methodology called QTES (Quantized Transform-Expand-Sample), which can be used to model network traffic taking into consideration both the marginal distribution and the autocorrelation function of the empirical data. An effort is made towards an algorithmic procedure rather than a heuristic search, thereby largely automating QTES modeling.


Acknowledgements

I would like to express my sincere thanks to my thesis supervisors, Prof. I. Lambadaris and Prof. A. R. Kaye, as well as Prof. M. Devetzikiotis, for helping me achieve a focused and organized approach to research and for their guidance throughout the course of this thesis. I wish also to profusely thank Prof. Makios from the Electrical Engineering department at the University of Patras for his help and support. His encouragement has been deeply appreciated.

I wish to profusely thank my family for their enduring patience, support and encouragement during this research.

I gratefully acknowledge the funding support provided by the Telecommunications Research Institute of Ontario.


Table of Contents

Acceptance Sheet
Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Acronyms

1 Introduction
  1.1 Background
  1.2 Previous Research
  1.3 Motivation and Objectives of this Thesis
  1.4 Thesis Organization

2 Transform Expand Sample Models
  2.1 TES Processes
  2.2 Discretized and Quantized TES Processes
    2.2.1 Discretized TES Processes (DTES)
    2.2.2 Quantized TES Processes (QTES)
  2.3 Summary

3 Automated Modeling of Broadband Network Data Using the QTES Methodology
  3.1 Problem Formulation
  3.2 Algorithmic Approach
  3.3 Implementation of the Exhaustive Search
    3.3.1 Next Composition
    3.3.2 Global Search
  3.4 Local Search
    3.4.1 Constraint Optimization
    3.4.2 Sequential Quadratic Programming (SQP)
    3.4.3 Partial Derivatives of the objective function
  3.5 Examples and Results
  3.6 Summary

4 Random Search
  4.1 Random Search Algorithm
    4.1.1 Random k-subset of an n-Set
    4.1.2 Random Composition of n into k parts (rancom)
  4.2 Examples and Results
  4.3 Summary

5 Comparison Between Exhaustive Search and Random Search
  5.1 Summary

6 QTES Generator
  6.1 Inverse Transform Method
  6.2 QTES algorithm
    6.2.1 Discrete number generator (dis-genn)
    6.2.2 Background QTES generator (qtes)
    6.2.3 Foreground QTES generator
  6.3 Results
  6.4 Summary

7 Conclusions and Recommendations for Further Study
  7.1 Conclusions
  7.2 Recommendations for Further Study
    7.2.1 Experimental Design and Design Optimization
    7.2.2 Other Recommendations


List of Tables

5.1 Required running time (hours) for obtaining the traffic models from the empirical VBR video samples

7.1 Generic 3 factor design analysis


List of Figures

2.1 Probability Density of Innovation Sequence
2.2 Migration of correlated background sequence on the unit circle
2.3 Probability Density of Innovation Sequence
3.1 Flowchart for the Next k-Subset of an n-Set routine
3.2 Flowchart for the Next Composition routine
3.3 Autocorrelation Coefficients for the BBC news empirical traffic sample and the QTES models for K=10 and K=30 using the exhaustive search
3.4 Autocorrelation Coefficients for the Last Action Hero empirical traffic sample and the QTES models for K=10 and K=20 using the exhaustive search
3.5 Autocorrelation Coefficients for the Football empirical traffic sample and the QTES models for K=10 and K=20 using the exhaustive search
4.1 Balls in cells model
4.2 Balls in cells model example
4.3 Flowchart for the ranksb algorithm
4.4 Autocorrelation Coefficients for the BBC News empirical traffic sample and the QTES models for K=10 and K=20 using random search

Autocorrelation Coefficients for the Last Action Hero empirical traffic sample and the QTES models for K=10 and K=20 using random search
Autocorrelation Coefficients for the Football empirical traffic sample and the QTES models for K=10 and K=20 using random search
Autocorrelation Coefficients for the empirical data stream and the QTES (K=20) models of example 1 (BBC video sample) using exhaustive and random search
Autocorrelation Coefficients for the empirical data stream and the QTES (K=20) models of example 1 (Last Action Hero video sample) using exhaustive and random search
Autocorrelation Coefficients for the empirical data stream and the QTES (K=20) models of example 3 (Last Action Hero video sample) using exhaustive and random search
Autocorrelation Coefficients for the empirical data stream and the QTES (K=20) models of example 1 (football video sample) using exhaustive and random search
Autocorrelation Coefficients for the QTES (K=10) models of example 3 (football video sample) for N=30 and N=70 using random search
Autocorrelation Coefficients for the QTES (K=20) models of example 3 (football video sample) for N=100 and N=500 using random search
QTES generator
Inverse-transform method for continuous random variables
Continuous, piecewise-linear empirical distribution function from grouped data
QTES models (K = 10 and K = 30) of the BBC news video traffic sample
QTES models (K = 10 and K = 20) of the Last Action Hero video traffic sample
QTES models (K = 10 and K = 20) of the football video traffic sample


List of Acronyms

ATM     Asynchronous Transfer Mode
BISDN   Broadband Integrated Services Digital Network
DCT     Discrete Cosine Transform
DES     Discrete Event Simulation
DTES    Discrete Transform Expand Sample
FR      Frame Relay
FDDI    Fiber Distributed Data Interface
IEEE    Institute of Electrical and Electronic Engineers
IID     Independent Identically Distributed
ISDN    Integrated Services Digital Network
KT      Kuhn-Tucker
LAN     Local Area Network
MMPP    Markov-Modulated Poisson Processes
MPEG    Motion Picture Experts Group
QoS     Quality of Service
QP      Quadratic Programming
QTES    Quantized Transform Expand Sample
SQP     Sequential Quadratic Programming
TES     Transform Expand Sample
VBR     Variable Bit Rate
WAN     Wide Area Network


Chapter 1

Introduction

1.1 Background

One of the vital elements in today's technological environment is the efficient and fast flow of information through telecommunications networks. This flow is supported by complex computer and communications structures that, if properly designed and operated, are invisible to the end users. High-speed network transport mechanisms, such as asynchronous transfer mode (ATM) cell relay and frame relay (FR), allow transmission speeds of 1.5 Mb/s to 150 Mb/s as seen by the end user, while they also serve as enabling technologies for new classes of communication services, such as multimedia and video on demand, that are typically grouped under the heading of B-ISDN [1].

As these new communications services evolve and the needs of users change, the enterprise must respond by modifying existing communications systems or by implementing entirely new ones. To this end, telecommunications professionals are being called upon to design and manage these systems in the face of fast-moving technology and a climate of increasing customer expectations. Design and management decisions require predictions of network performance. Quality of Service (QoS) considerations have to take into account the fact that such services are both delay-sensitive and loss-sensitive. In the face of all this, network and traffic designers must decide how to set a number of parameters, for instance the threshold values, and evaluate the expected performance under a wide variety of traffic conditions.

All this brings us to two issues that constitute the core subject of emerging networks: performance evaluation and traffic modeling. Monte Carlo computer simulation [1] is a flexible performance prediction tool used widely in science and engineering. Its flexibility comes from the fact that it consists of a computer program that "behaves" like the system under study. Analytical methods are also used as an alternative, to avoid the huge computing load associated with conventional Monte Carlo simulation. Unfortunately, analytical models require many assumptions and are too restrictive for most real-world systems. Most simulation tools for telecommunications networks, though, are based on Discrete Event Simulation (DES) [2, 3]. The most important feature of DES models is that they keep track of time via simulation clocks, which change by random increments. The basic executable unit in DES models is an event (a program that is executed at discrete simulation times).
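The event-driven clock described above can be illustrated with a small sketch. The single-server queue, the exponential interarrival and service times, and the name `simulate_single_queue` are all assumptions chosen for illustration; none of them come from the thesis.

```python
import heapq
import random


def simulate_single_queue(arrival_rate, service_rate, horizon, seed=0):
    """Toy discrete-event simulation of a single-server FIFO queue.

    Returns (customers served, time of the last processed event).
    """
    rng = random.Random(seed)
    clock = 0.0          # simulation clock
    events = []          # future event list, ordered by event time
    heapq.heappush(events, (rng.expovariate(arrival_rate), "arrival"))
    in_system = 0        # customers waiting or in service
    served = 0

    while events:
        time, kind = heapq.heappop(events)
        if time > horizon:
            break
        clock = time     # the clock jumps to the next event time
        if kind == "arrival":
            in_system += 1
            # schedule the next arrival
            heapq.heappush(events, (clock + rng.expovariate(arrival_rate), "arrival"))
            if in_system == 1:
                # server was idle, so service starts immediately
                heapq.heappush(events, (clock + rng.expovariate(service_rate), "departure"))
        else:
            in_system -= 1
            served += 1
            if in_system > 0:
                # start serving the next waiting customer
                heapq.heappush(events, (clock + rng.expovariate(service_rate), "departure"))
    return served, clock
```

Each pass through the loop executes one event and advances the clock by a random amount, which is the behaviour the paragraph attributes to DES models.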

Traffic modeling is a key element in simulating communications networks. A clear understanding of the nature of the traffic which goes through a particular system, its statistical properties and the selection of an appropriate traffic model are vital issues for the success of the modeling enterprise. In ATM and FR networks, buffer overflow is the major cause of losses. Therefore, the probability and the correlations of such losses, together with the queueing delays, constitute major issues of the QoS criteria. Such criteria are extremely sensitive to the accuracy of the traffic models used in their prediction.


Many of the existing traffic models do not represent the traffic in networks with sufficient accuracy. In particular, they may not fit either the first-order statistics (marginal probability distribution) or the second-order statistics (autocorrelation function) of the empirical time series. Temporal dependence, which is proxied by the autocorrelation function, is a major cause of burstiness in telecommunications traffic, especially in emerging high-speed communications networks; typical examples are file transfers and compressed variable bit rate (VBR) video. The autocorrelation function is a convenient measure of temporal dependence in real-valued stochastic processes, frequently used by engineers. For a discrete-time, real-valued stationary stochastic process $\{X_n\}_{n=0}^{\infty}$, the autocorrelation function $\rho_X(\tau)$ consists of the lagged correlation coefficients

$$\rho_X(\tau) = \frac{E[X_n X_{n+\tau}] - \mu_X^2}{\sigma_X^2}, \qquad \tau = 1, 2, \ldots,$$

where $\mu_X < \infty$ and $\sigma_X^2 < \infty$ are the mean and variance, respectively, of the $X_n$. Burstiness, which is most commonly caused by the presence of significant positive autocorrelations in the interarrival process, can make waiting times arbitrarily high without increasing the arrival rate. Various studies [4] have shown that when autocorrelated traffic is offered to a queueing system, the resulting performance measures are considerably worse than those corresponding to renewal traffic. In fact, mean waiting times can differ by orders of magnitude.
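For a finite empirical record, the lagged correlation coefficients have to be estimated from the data. The sketch below shows one common sample estimator; the function name `autocorrelation` and the choice of normalising each lag by the number of available pairs are assumptions for illustration, not the estimator used in the thesis.

```python
def autocorrelation(x, max_lag):
    """Sample autocorrelation coefficients of x for lags 1..max_lag."""
    n = len(x)
    if not 0 < max_lag < n:
        raise ValueError("max_lag must satisfy 0 < max_lag < len(x)")
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0:
        raise ValueError("a constant series has no defined autocorrelation")
    rho = []
    for lag in range(1, max_lag + 1):
        # average product of deviations over all pairs that are `lag` apart
        cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / (n - lag)
        rho.append(cov / var)
    return rho
```

Applied to a strictly periodic series, the estimate is close to 1 at a lag equal to the period, which is a quick sanity check before using it on traffic traces.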

As a solution to this type of problem, we describe and investigate the use of the Quantized Transform-Expand-Sample (QTES) methodology to capture both the first- and second-order statistics of empirical time series simultaneously and accurately. The criteria for a good traffic model should be the following [5]:

• The marginal distribution of the model should match its empirical counterpart.

• The autocorrelation function of the model should approximate its empirical counterpart. Since the empirical data is finite, the model needs to approximate the significant leading autocorrelations.

• Sample paths generated by a simulation of the model should "resemble" the empirical data.

We should mention that the first two requirements constitute quantitative criteria, requiring that the first-order and second-order properties of the empirical data are adequately captured. The third requirement, though, is a qualitative requirement which cannot be defined with mathematical accuracy. It is mainly a heuristic procedure, and if a model can imitate the qualitative character of the empirical data, so much the better, as our confidence in the model is increased. It is important to realize that the qualitative similarity should not substitute for the first two requirements, which approximate the statistical signature of the empirical data [5].

1.2 Previous Research

There are many traffic models which are commonly used in traffic modeling [1]. Such models are used either as part of an analytical model or to drive discrete-event simulations. Among the quantities that are of particular interest to the modeler, interarrival times, batch sizes and work-load processes, in either continuous or discrete (slotted) time, are the most important.


Renewal Traffic Models. Renewal models have been widely used for traffic modeling because of their mathematical simplicity. In a renewal traffic process, the interarrival times $\{A_n\}$ are independent, identically distributed (IID), but their distribution is allowed to be general. Renewal processes, while mathematically simple, have a major disadvantage: the autocorrelation function of $\{A_n\}$ vanishes identically for all nonzero lags. This poses a serious problem for the performance evaluation of broadband networks, which are mainly dominated by bursty traffic. When bursty traffic is offered to a queueing system, it gives rise to much worse performance (such as mean waiting times) than renewal traffic, which lacks temporal dependence [4]. Poisson processes are the oldest renewal processes used in traffic modeling.

Poisson models are the oldest traffic models, dating back to the advent of telephony and the renowned pioneering telephone engineer A. K. Erlang. A Poisson process can be characterized as a renewal process whose interarrival times $\{A_n\}$ are exponentially distributed with rate parameter $\lambda$: $P\{A_n \le t\} = 1 - \exp(-\lambda t)$ [6]. Equivalently, it is a counting process satisfying $P\{N(t) = n\} = \exp(-\lambda t)(\lambda t)^n / n!$, and the numbers of arrivals in disjoint intervals are statistically independent (a property known as independent increments).
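The equivalence between the two characterizations can be checked numerically. The following sketch builds arrival epochs from IID exponential interarrival times and counts arrivals in disjoint unit intervals; for a Poisson process the mean and the variance of those counts should both be close to the rate. The function names, the rate of 2.0 and the horizon are illustrative assumptions.

```python
import random


def poisson_arrival_times(rate, horizon, rng):
    """Arrival epochs in [0, horizon) from IID exponential interarrival times."""
    times = []
    t = rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def counts_per_interval(times, horizon, width):
    """Number of arrivals in each disjoint interval of the given width."""
    bins = [0] * int(horizon / width)
    for t in times:
        idx = int(t / width)
        if idx < len(bins):
            bins[idx] += 1
    return bins


rng = random.Random(1)
arrivals = poisson_arrival_times(rate=2.0, horizon=5000.0, rng=rng)
counts = counts_per_interval(arrivals, horizon=5000.0, width=1.0)
mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / len(counts)
```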

Poisson processes enjoy some elegant analytical properties. First, the superposition of independent Poisson processes results in a new Poisson process whose rate is the sum of the component rates. Second, the independent increments property renders Poisson a memoryless process. This, in turn, greatly simplifies queueing problems involving Poisson arrivals. Third, Poisson processes are fairly common in traffic applications that physically comprise a large number of independent traffic streams, each of which may be quite general.

Time-dependent Poisson processes are defined by letting the rate parameter $\lambda$ depend on time.

The discrete-time analogs of Poisson processes are Bernoulli models. Here the probability of an arrival in any time slot is p, independent of any other slot. It follows that the number of arrivals in k slots is binomial, and the time between arrivals is geometric with parameter p.
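A short simulation makes the geometric interarrival property concrete. In this sketch the slot probability of 0.25, the number of slots and the helper names are illustrative assumptions; with p = 0.25 the mean gap between successive arrivals should be close to 1/p = 4 slots.

```python
import random


def bernoulli_arrivals(p, n_slots, rng):
    """One arrival indicator (0 or 1) per slot, independent across slots."""
    return [1 if rng.random() < p else 0 for _ in range(n_slots)]


def interarrival_slots(arrivals):
    """Number of slots between successive arrivals."""
    gaps, last = [], None
    for slot, has_arrival in enumerate(arrivals):
        if has_arrival:
            if last is not None:
                gaps.append(slot - last)
            last = slot
    return gaps


rng = random.Random(7)
p = 0.25
slots = bernoulli_arrivals(p, 200_000, rng)
gaps = interarrival_slots(slots)
mean_gap = sum(gaps) / len(gaps)      # geometric on {1, 2, ...} has mean 1/p
arrival_fraction = sum(slots) / len(slots)
```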

Markov Traffic Models. Markov processes are more general than renewal processes, since they introduce dependence into the random sequence $\{A_n\}$ [6]. Despite the fact that Markov processes may have different parameters, the interarrival times are still exponentially distributed. Due to the memoryless property of the exponential distribution, Markov traffic models have good analytical properties and are widely known. In order to build a Markov process, though, we need to know the transition matrix, which is very difficult to obtain when too many states are assumed. A way to reduce the complexity of Markov processes, while maintaining their flexibility, is to integrate Markov processes that have few states with other processes to form a new process. Markov-Modulated Poisson Processes (MMPP) constitute an extremely important class of such traffic models. These models have been widely used to model voice traffic sources [7].
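As an illustration of the idea, a two-state MMPP can be simulated by letting a two-state continuous-time Markov chain select the rate of a Poisson arrival stream. The sketch below is a minimal version under assumed parameters; the function name `simulate_mmpp2` and all numerical rates are hypothetical and not taken from the thesis or from [7].

```python
import random


def simulate_mmpp2(lam, switch, horizon, seed=0):
    """Arrival epochs of a two-state Markov-modulated Poisson process.

    lam[s]    : Poisson arrival rate while the modulating chain is in state s
    switch[s] : rate at which the chain leaves state s
    """
    rng = random.Random(seed)
    t, state, arrivals = 0.0, 0, []
    while t < horizon:
        # exponential sojourn time in the current state
        sojourn_end = min(t + rng.expovariate(switch[state]), horizon)
        if lam[state] > 0:
            # Poisson arrivals at rate lam[state] until the state changes
            a = t + rng.expovariate(lam[state])
            while a < sojourn_end:
                arrivals.append(a)
                a += rng.expovariate(lam[state])
        t = sojourn_end
        state = 1 - state
    return arrivals


arrivals = simulate_mmpp2(lam=(1.0, 10.0), switch=(0.1, 0.5), horizon=20000.0, seed=3)
mean_rate = len(arrivals) / 20000.0
```

With these assumed parameters the chain spends 5/6 of the time in the low-rate state and 1/6 in the high-rate state, so the long-run arrival rate is about 2.5, while arrivals cluster into bursts during the high-rate sojourns.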

Autoregressive Traffic Models. In autoregressive models, the next random variable in the sequence is defined as a function of previous ones within a time window stretching from the present into the past. Such models are particularly suitable for modeling VBR-coded video, a major consumer of bandwidth in today's high-speed networks. The nature of video frames is such that successive frames within a video scene vary visually very little (there are 30 frames per second in a high-quality video). Only scene changes can cause abrupt changes in frame bit rate. Thus the sequence of bit rates (frame sizes) may be modeled by an autoregressive scheme, while scene changes can be modeled by some modulating mechanism, such as a Markov chain. A linear autoregressive model has been used to model variable bit rate (VBR) coded video [8]. There are many different kinds of autoregressive traffic models, such as the linear autoregressive, MA, ARMA and ARIMA models, which are outside the scope of this thesis. One type of autoregressive traffic model, though, is of particular interest to us, since the whole modeling methodology used in this thesis is based on it: Transform-Expand-Sample models. These models will be described in detail in the sequel.
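The simplest member of this family, a first-order linear autoregressive scheme, already shows how dependence on the past produces positive autocorrelation. The sketch below is a generic AR(1) illustration with Gaussian noise, not the video model of [8]; the coefficient 0.8 and the helper names are assumptions. For such a process the lag-τ autocorrelation is the coefficient raised to the power τ.

```python
import random


def ar1_sample(phi, n, seed=0, burn_in=500):
    """Sample path of X_n = phi * X_{n-1} + e_n with standard normal noise e_n."""
    rng = random.Random(seed)
    x, path = 0.0, []
    for i in range(n + burn_in):
        x = phi * x + rng.gauss(0.0, 1.0)
        if i >= burn_in:          # discard the start-up transient
            path.append(x)
    return path


def lag_correlation(x, lag):
    """Sample autocorrelation coefficient of x at a single lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / (n - lag)
    return cov / var


path = ar1_sample(phi=0.8, n=100_000, seed=11)
rho1 = lag_correlation(path, 1)   # expected near 0.8
rho2 = lag_correlation(path, 2)   # expected near 0.8 ** 2 = 0.64
```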

1.3 Motivation and Objectives of this Thesis

This thesis presents a consolidated study that is based on the generation of traffic models from empirical traffic streams which go through high-speed networks. The aim is to produce traffic models in an automated way using the QTES methodology [9], which will capture both the autocorrelation structure and the marginal distribution of the empirical traffic streams simultaneously.

The main objective of this thesis is to develop an algorithmic procedure which will automate traffic modeling, making use of the QTES modeling methodology. The algorithm will use a nonlinear programming setting with the objective of minimizing a weighted sum of squared differences between the empirical autocorrelations and their candidate QTES model counterparts.
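The objective described above can be written down directly. In the sketch below, the function name `weighted_acf_distance` and the default of equal weights are assumptions for illustration; the thesis defines its own objective function and parameters in Chapter 3.

```python
def weighted_acf_distance(rho_empirical, rho_model, weights=None):
    """Weighted sum of squared differences between two autocorrelation sequences.

    Both sequences hold the coefficients for lags 1, 2, ..., T in order.
    """
    if len(rho_empirical) != len(rho_model):
        raise ValueError("both sequences must cover the same lags")
    if weights is None:
        weights = [1.0] * len(rho_empirical)   # assumed default: equal weights
    if len(weights) != len(rho_empirical):
        raise ValueError("one weight per lag is required")
    return sum(w * (e - m) ** 2
               for w, e, m in zip(weights, rho_empirical, rho_model))
```

A search procedure would evaluate this distance for each candidate model and keep the candidate with the smallest value; larger weights on the leading lags emphasise the short-range correlation structure.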

QTES processes are a class of stochastic processes which can be used to model a large class of stochastic processes accurately. The QTES modeling methodology approximates a continuous-state TES process by a discrete-state QTES counterpart. TES is a class of models [8, 11, 12, 13] which can accurately model empirical sequences and, in particular, can effectively capture traffic burstiness [10]. The TES modeling methodology (to be reviewed in Chapter 2) can accurately capture both first-order and second-order statistics of empirical data: more specifically, the model exactly captures the empirical marginal distribution (histogram) and approximately captures the empirical autocorrelation function, simultaneously. One problem associated with the use of TES processes in traffic modeling is that TES-based queueing models have been analytically intractable, mainly because TES processes are essentially transformed Markov sequences over an uncountable state space. In this thesis, we use another class of stochastic processes, the QTES (Quantized TES) processes, which approximate a continuous-state TES process by a discrete-state variant. In this way we obtain a tractable queueing model which is still quite accurate, in the sense that the traffic model captures the first-order and second-order statistics of the empirical traffic data. A detailed analysis of both TES and QTES processes is presented in Chapter 2.

The specific objectives of this thesis are to:

1. Develop an algorithmic approach for capturing the correlation structure of the empirical traffic using the Exhaustive Search Local Optimization algorithm. As the name suggests, this algorithm combines an exhaustive search with a local nonlinear programming technique to minimize an objective function consisting of the distance between the empirical autocorrelation function and its candidate model counterpart.

2. Validate the efficiency of the Exhaustive Search Local Optimization algorithm with results obtained from MPEG-compressed VBR video data streams. Compare the accuracy of the approximation of the empirical autocorrelation structure by its QTES counterpart for different quantization levels.

3. Modify the automated algorithm by replacing the exhaustive search with a random search.

4. Validate the efficiency of the algorithm when a random search is used, with results obtained from the same MPEG-compressed VBR video data streams as in the case of the exhaustive search.

5. Compare the accuracy of the approximation of the empirical autocorrelation structure by its QTES counterpart between the random and the exhaustive search.

6. Develop a QTES traffic generator which will accept empirical traffic data as an input and return a traffic model which captures both the autocorrelation structure and the marginal distribution of the empirical data. The parameters which ensure the match in the autocorrelation structure between the empirical data stream and the model are calculated by the automated algorithm outlined above. The match in the marginal distribution is achieved by developing an algorithm which makes use of the Inverse Transform method.

7. Validate the accuracy of the QTES generator with results obtained from the same MPEG-compressed VBR video data streams as above.
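The inverse transform step mentioned in objective 6 can be sketched for a histogram marginal: a uniform variate is mapped through the inverse of the piecewise-linear empirical distribution function. The function name `make_histogram_sampler` and the toy two-bin histogram are assumptions for illustration; the generator actually developed in the thesis is described in Chapter 6.

```python
import bisect
import random


def make_histogram_sampler(edges, probs):
    """Return the inverse CDF of a histogram with the given bin edges and bin masses."""
    if len(edges) != len(probs) + 1:
        raise ValueError("need exactly one more edge than bins")
    total = sum(probs)
    if total <= 0 or any(p < 0 for p in probs):
        raise ValueError("bin masses must be non-negative with a positive sum")
    cum, running = [], 0.0
    for p in probs:
        running += p / total
        cum.append(running)
    cum[-1] = 1.0                          # guard against rounding drift

    def inverse_cdf(u):
        j = min(bisect.bisect_left(cum, u), len(probs) - 1)  # bin containing u
        lower = cum[j - 1] if j > 0 else 0.0
        mass = cum[j] - lower
        frac = (u - lower) / mass if mass > 0 else 0.0
        # linear interpolation inside the bin
        return edges[j] + frac * (edges[j + 1] - edges[j])

    return inverse_cdf


inverse_cdf = make_histogram_sampler(edges=[0.0, 1.0, 3.0], probs=[0.5, 0.5])
rng = random.Random(5)
samples = [inverse_cdf(rng.random()) for _ in range(10_000)]
```

Feeding the function independent uniform variates reproduces the histogram; feeding it a correlated sequence with uniform marginals preserves the marginal while carrying the correlation through, which is the role the objective assigns to this step.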

1.4 Thesis Organization

The remaining chapters of the thesis are organized as follows:

Chapter 2: Reviews the fundamentals of the Transform Expand Sample models (TES, DTES, QTES), their mathematical properties and characteristics, and the reasons for creating the DTES and QTES processes from the TES processes.

Chapter 3: Formulates the optimization problem by introducing the objective function, which consists of the distance between the empirical autocorrelation function and its candidate model counterpart. The optimization parameters and the parametric space are defined. This chapter presents the Exhaustive Search Local Optimization algorithm. The implementation of the exhaustive search and the local optimization is explained in detail, starting from the mathematical prerequisites and going on to examine the implementation from a programming point of view. The chapter ends by validating the efficiency of the Exhaustive Search Local Optimization algorithm using three examples from the domain of compressed VBR video traffic.

Chapter 4: Replaces the exhaustive search u-ith a random search over the

parametric space. The chapter explains first the mathematical backgound that is

needed to develop a random search a lpr i thm and then it proceeds with explaining in

Page 24: Automated Modeling of Broadband Network Data Using the QTES … · 2005-02-12 · Traffic modeling is a key element in simulating communications networks. A clear understanding of

gea t detail the implementation of the random search algorithm from the program-

ming point of rien-. Finally the efficiency of t his approach is validated bu using the

same three VBR video data streams as in the previous chapter.

Chapter 5: Compares the efficient- of the exhaustive and the random search

with respect to the approximation of the empirical autocorelation tvith its QTES

counterpart . This chap t er invest ipoates the ris ks which are associated ivi t h the random

sampling procedure and shows through an example that bu increasing the number of

the randomly chosen points. iye have bet ter chances of getting "good- initial points.

The price we par for this increase is that the random search takes much longer.

Chapter 6: Implements a QTES generator tvhose purpose is to generate

sample pat h realizat ions which capture the first order st atis t ics (marginal probabiliry

distribution) and the second order statistics (autocorelation function) of a given em-

pirical data sample. The QTES generator is presented as a foreground/background

scherne. This chaprer esplains in detaii how an exact match to the empirical his-

togram is achieved by the use of the Inverse Transform method. Finally the accuracy

of the QTES generator is validated by using the same three eramples from the domain

of the compressed \'BR video traffic.

Chapt er 7: Conclusions and Recommendations for furrher st udy.


Chapter 2

Transform Expand Sample Models

2.1 TES Processes

This subsection provides a brief overview of TES processes and the TES modeling methodology. For detailed discussion see [11, 12, 13]. TES modeling constitutes a different input analysis approach whose principal merit is its potential to capture the empirical marginal distribution (histogram) and to simultaneously approximate the leading empirical autocorrelations via a stationary stochastic process.

The construction of TES models involves two random sequences in lockstep. The first sequence, called a background TES process, plays an auxiliary role, and is defined either as $\{U_n^+\}_{n=0}^{\infty}$ or $\{U_n^-\}_{n=0}^{\infty}$; these have the form

$$U_n^+ = \begin{cases} U_0, & n = 0 \\ \langle U_{n-1}^+ + V_n \rangle, & n > 0 \end{cases} \tag{2.1}$$

and

$$U_n^- = \begin{cases} U_n^+, & n \text{ even} \\ 1 - U_n^+, & n \text{ odd} \end{cases} \tag{2.2}$$

respectively, where $U_0$ is distributed uniformly on $[0,1)$, $\{V_n\}_{n=1}^{\infty}$ is a sequence of iid random variables, independent of $U_0$, called the innovation sequence, and the angular brackets denote the modulo-1 (fractional part) operator $\langle x \rangle = x - \max\{\text{integer } n : n \le x\}$. The superscript notation in Eqs. 2.1 and 2.2 is due to the fact that $\{U_n^+\}_{n=0}^{\infty}$ and $\{U_n^-\}_{n=0}^{\infty}$ can generate lag-1 autocorrelations in the range $[0,1]$ and $[-1,0]$, respectively. For example, the innovation density function illustrated in figure 2.1 can be viewed as causing the background sequence to migrate slowly and consistently around the circle in a clockwise direction (figure 2.2). A background sequence of this type will exhibit strong positive autocorrelations over several lags. As well, a periodic component can be introduced into the background sequence autocorrelation by shifting the density function in figure 2.1 away from the y-axis. Figure 2.3 illustrates a different innovation density function, such that the value of $V_n$ varies uniformly on $[-0.5, 0.5]$. In this case, the corresponding background sequence moves arbitrarily on the circle and is therefore uncorrelated.
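The recursion in Eqs. 2.1 and 2.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`frac`, `tes_background`, `lag1_autocorrelation`) are hypothetical, and the two uniform innovation densities are stand-ins loosely modeled on figures 2.1 and 2.3, not the densities used in the thesis.

```python
import random

def frac(x):
    """Modulo-1 (fractional part) operator <x>, mapped into [0, 1)."""
    y = x % 1.0
    return 0.0 if y >= 1.0 else y  # guard against rounding up to 1.0

def tes_background(n, innovation, sign="+", seed=1):
    """Return n values of a TES+ (sign='+') or TES- background sequence."""
    rng = random.Random(seed)
    u = rng.random()                      # U_0 uniform on [0, 1)
    plus = [u]
    for _ in range(n - 1):
        u = frac(u + innovation(rng))     # U_n^+ = <U_{n-1}^+ + V_n>
        plus.append(u)
    if sign == "+":
        return plus
    # TES-: keep even-indexed values, reflect odd-indexed ones
    return [x if i % 2 == 0 else 1.0 - x for i, x in enumerate(plus)]

def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation (biased estimator)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1)) / n
    return cov / var

# Assumed innovation densities (illustrative only):
narrow = lambda rng: rng.uniform(0.0, 0.05)   # slow, consistent drift
wide = lambda rng: rng.uniform(-0.5, 0.5)     # uniform over the whole circle

if __name__ == "__main__":
    print(lag1_autocorrelation(tes_background(5000, narrow)))       # strongly positive
    print(lag1_autocorrelation(tes_background(5000, narrow, "-")))  # strongly negative
    print(lag1_autocorrelation(tes_background(5000, wide)))         # close to zero
```

With the narrow innovation the TES+ sequence drifts slowly around the circle and its lag-1 autocorrelation is high, the TES- variant flips the sign, and the wide innovation gives an essentially uncorrelated sequence.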

The second sequence, called a foreground TES process and denoted by $\{X_n^+\}_{n=0}^{\infty}$ or $\{X_n^-\}_{n=0}^{\infty}$, respectively, is a transformed version of a background TES process,

$$X_n^+ = D(U_n^+), \qquad X_n^- = D(U_n^-), \tag{2.3}$$

where $D$ is a measurable transformation from $[0,1)$ to the reals, called a distortion. Eq. 2.3 defines two classes of TES models, denoted TES$^+$ and TES$^-$, respectively, and these foreground sequences are the end-product TES models.

Figure 2.1: Probability density of innovation sequence $V_n$

Figure 2.2: Migration of correlated background sequence on the unit circle

Figure 2.3: Probability density of innovation sequence $V_n$

An important family of distortions is based on an empirical time series, $Y = \{Y_n\}$; these distortions consist of composite transformations of the form

$$D_{Y,\xi}(x) = \hat{H}_Y^{-1}(S_\xi(x)), \qquad x \in [0,1). \tag{2.4}$$

Here, the inner transformation, $S_\xi$, is a "smoothing" operation, called a stitching transformation, parameterized by $0 \le \xi \le 1$, and given by

$$S_\xi(y) = \begin{cases} y/\xi, & 0 \le y < \xi \\ (1-y)/(1-\xi), & \xi \le y \le 1 \end{cases} \tag{2.5}$$

The outer transformation, $\hat{H}_Y^{-1}$, is the inverse of the empirical (histogram) distribution function computed from an empirical time series, $Y = \{Y_n\}$, as

$$\hat{H}_Y^{-1}(x) = \sum_{j=1}^{J} 1_{[\hat{C}_{j-1}, \hat{C}_j)}(x) \left[ l_j + (x - \hat{C}_{j-1}) \frac{w_j}{\hat{p}_j} \right], \qquad 0 \le x \le 1, \tag{2.6}$$

where $J$ is the number of histogram cells, $[l_j, r_j)$ is the support of cell $j$ with width $w_j = r_j - l_j > 0$, $\hat{p}_j$ is the probability estimator of cell $j$ and $\{\hat{C}_j\}_{j=0}^{J}$ is the cdf of $\{\hat{p}_j\}_{j=1}^{J}$, i.e., $\hat{C}_j = \sum_{i=1}^{j} \hat{p}_i$ with $\hat{C}_0 = 0$.
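A composite distortion of the form in Eq. 2.4 can be sketched as follows. This is a minimal illustration, assuming the histogram is supplied as a list of `(l_j, r_j, p_j)` triples and that the inverse histogram distribution is piecewise linear within each cell; the names `stitch`, `inverse_histogram` and `distortion` are hypothetical.

```python
def stitch(y, xi):
    """Stitching transformation S_xi(y) for y in [0, 1) and 0 <= xi <= 1."""
    if y < xi:
        return y / xi
    return (1.0 - y) / (1.0 - xi)   # xi == 1 never reaches this branch for y < 1

def inverse_histogram(x, cells):
    """Piecewise-linear inverse of the histogram cdf.

    cells: list of (l_j, r_j, p_j) with cell support [l_j, r_j) and
    estimated probability p_j; the p_j are assumed to sum to 1.
    """
    c_prev = 0.0
    for l, r, p in cells:
        c_next = c_prev + p
        if p > 0.0 and x < c_next:
            # interpolate linearly inside the cell that contains x
            return l + (x - c_prev) * (r - l) / p
        c_prev = c_next
    return cells[-1][1]             # x at the top of the cdf (or rounding)

def distortion(x, xi, cells):
    """Composite distortion: inverse histogram applied to the stitched value."""
    return inverse_histogram(stitch(x, xi), cells)

if __name__ == "__main__":
    cells = [(0.0, 1.0, 0.5), (1.0, 3.0, 0.5)]
    print(stitch(0.25, 0.5), inverse_histogram(0.75, cells), distortion(0.25, 0.5, cells))
```

Applying `distortion` to each value of a background sequence that is uniform on $[0,1)$ yields a foreground sequence whose marginal distribution follows the supplied histogram.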

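The rationale for TES processes stems from the following facts. First, all background TES sequences are stationary Markovian, and their marginal distribution is uniform on $[0,1)$, regardless of the probability law of the innovations $\{V_n\}$ selected. It can be shown that the $\tau$-step transition density of $\{U_n^+\}$ is

$$g_\tau^+(y \mid x) = \begin{cases} \sum_{\nu=-\infty}^{\infty} \tilde{f}_V^{\tau}(i2\pi\nu)\, e^{i2\pi\nu(y-x)}, & 0 \le y, x < 1 \\ 0, & \text{otherwise} \end{cases} \tag{2.7}$$

and that of $\{U_n^-\}$ is

$$g_\tau^-(y \mid x) = \begin{cases} g_\tau^+(y \mid x), & 0 \le y, x < 1,\ \tau \text{ even} \\ \sum_{\nu=-\infty}^{\infty} \tilde{f}_V^{\tau}(i2\pi\nu)\, e^{i2\pi\nu(-y-x)}, & 0 \le y, x < 1,\ \tau \text{ odd} \\ 0, & \text{otherwise} \end{cases} \tag{2.8}$$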

The tilde denotes the Laplace transform. Second, the inversion method enables us to transform any uniform variable to one with an arbitrary prescribed distribution as follows: if $F$ is any distribution function and $U$ is uniform on $(0,1)$, then the random variable $X = F^{-1}(U)$ satisfies $X \sim F$. For detailed discussion see [2]. And third, for $0 < \xi < 1$, the sample paths of background TES sequences are forced to be more "continuous looking" due to the $S_\xi$ transformation. The inversion method can still be applied to stitched background processes $\{S_\xi(U_n)\}$, since stitching transformations preserve uniformity [11]. Therefore, any foreground TES sequence of the form

$$X_n^+ = \hat{H}_Y^{-1}(S_\xi(U_n^+)), \qquad X_n^- = \hat{H}_Y^{-1}(S_\xi(U_n^-)) \tag{2.9}$$

is always guaranteed to have the empirical distribution $\hat{H}_Y$, regardless of the innovation density $f_V$ and the stitching parameter $\xi$ selected. The choice of a pair $(f_V, \xi)$ will determine the autocorrelation function of 2.9. Therefore, TES modeling manages to capture both the marginal distribution and the autocorrelation function of the empirical data simultaneously.

2.2 Discretized and Quantized TES Processes

Despite the fact that TES-based models have proven to be accurate in a variety of application domains, TES-based queueing models have been analytically intractable. Queueing models with TES-based traffic have only been studied via Monte Carlo simulation. The analysis of such models is difficult because TES processes are transformed Markov sequences over an uncountable state space. The reason, therefore, for introducing discretized and quantized TES processes is to reduce the continuous state space of background TES processes (the unit circle) to a finite state space.

2.2.1 Discretized TES Processes (DTES)

A DTES process is a discrete variant of a TES process over a finite state space. This is achieved by partitioning the unit circle into a finite number of subintervals (called slices), and "collapsing" each slice subinterval into a new state. Intuitively, if the slice subintervals are "small enough", then the exact location of the original process within a slice is "well approximated" by the slice index. Let $Z_M = \{0, 1, \ldots, M-1\}$, for some integer $M > 0$. A quantization of the unit circle, $[0,1)$, is a partition,

$$\Gamma(M) = \{\Gamma_m = [m\delta, (m+1)\delta) : m \in Z_M\}, \tag{2.10}$$

of the interval $[0,1)$ into $M$ subintervals (slices), $\Gamma_m$, each of length $\delta = \delta(M) = 1/M$. The integer-valued modulo-$M$ operator from the integers onto $Z_M$ is denoted by

$$\langle m \rangle_M = m - M \left\lfloor \frac{m}{M} \right\rfloor, \tag{2.11}$$

where $\lfloor x \rfloor = \max\{n : n \text{ is an integer and } n \le x\}$.

Let $C_0$ be an integer-valued random variable, distributed uniformly over $Z_M$. Let, further, $\{J_n\}_{n=1}^{\infty}$ be an iid sequence of integer-valued random variables, independent of $C_0$, called the innovation sequence, analogously to the setting of TES processes. The probability distribution of the $J_n$ is denoted by

$$f_J(j) = P\{J_n = j\} = \alpha_j, \qquad j \in Z_M. \tag{2.12}$$

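The background DTES processes are defined by

$$C_n^+ = \begin{cases} C_0, & n = 0 \\ \langle C_{n-1}^+ + J_n \rangle_M, & n > 0 \end{cases} \tag{2.13}$$

and

$$C_n^- = \begin{cases} C_n^+, & n \text{ even} \\ M - 1 - C_n^+, & n \text{ odd} \end{cases} \tag{2.14}$$

Like TES background processes, the DTES background processes are also stationary and their marginal distributions are all uniform on $Z_M$. Their $\tau$-step transition probabilities are given by

$$h_\tau^+(k \mid j) = \frac{1}{M} \sum_{\nu=0}^{M-1} \tilde{f}_J^{\tau}(i2\pi\nu)\, e^{i2\pi\nu(k-j)/M} \tag{2.15}$$

and

$$h_\tau^-(k \mid j) = \begin{cases} h_\tau^+(k \mid j), & \tau \text{ even} \\ \frac{1}{M} \sum_{\nu=0}^{M-1} \tilde{f}_J^{\tau}(i2\pi\nu)\, e^{-i2\pi\nu(k+j+1)/M}, & \tau \text{ odd} \end{cases} \tag{2.16}$$

where $\tilde{f}_J(i2\pi\nu)$ is the discrete Fourier transform of the discrete probability distribution function $f_J$, for $\nu = 0, 1, \ldots, M-1$. For more details, refer to [11].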
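The role of the discrete Fourier transform in the $\tau$-step transition probabilities of the DTES$^+$ process can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the DFT convention $\tilde{f}_J(i2\pi\nu) = \sum_{m} \alpha_m e^{-i2\pi\nu m/M}$, computes the probability of a net displacement $d = \langle k - j \rangle_M$ in $\tau$ steps from the DFT, and compares it with a direct $\tau$-fold circular convolution of the innovation distribution. The function names are hypothetical.

```python
import cmath

def dft(alpha):
    """DFT of the innovation pmf alpha[0..M-1] (assumed sign convention)."""
    M = len(alpha)
    return [sum(a * cmath.exp(-2j * cmath.pi * nu * m / M)
                for m, a in enumerate(alpha))
            for nu in range(M)]

def transition_plus(alpha, tau):
    """h[d] = P{<C_{n+tau} - C_n>_M = d} for the DTES+ process, via the DFT."""
    M = len(alpha)
    f = dft(alpha)
    return [sum((f[nu] ** tau) * cmath.exp(2j * cmath.pi * nu * d / M)
                for nu in range(M)).real / M
            for d in range(M)]

def transition_direct(alpha, tau):
    """Same quantity by tau-fold circular convolution of alpha."""
    M = len(alpha)
    dist = [1.0] + [0.0] * (M - 1)       # zero displacement after 0 steps
    for _ in range(tau):
        new = [0.0] * M
        for d, p in enumerate(dist):
            for j, a in enumerate(alpha):
                new[(d + j) % M] += p * a
        dist = new
    return dist

if __name__ == "__main__":
    alpha = [0.5, 0.3, 0.2, 0.0]
    print(transition_plus(alpha, 3))
    print(transition_direct(alpha, 3))
```

Both routines return the same vector up to rounding, which is the finite-dimensional counterpart of the Fourier-series representation used for TES background processes.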

2.2.2 Quantized TES Processes (QTES)

Quantized TES processes (referred to as QTES processes) are obtained from DTES processes by extending the latter to the unit circle, in such a way that the corresponding transition density has a quantized domain of conditioning.

We have mentioned in the previous section that the background DTES processes, $\{C_n\}$, are marginally uniform on $Z_M$, i.e.,

$$P\{C_n = m\} = 1/M = \delta, \qquad m \in Z_M. \tag{2.17}$$

The corresponding QTES process, $\{Q_n\}$, is a continuous extension of $\{C_n\}$ defined by

$$Q_n = \sum_{m \in Z_M} 1\{C_n = m\}\,(m + W_n)\,\delta, \tag{2.18}$$

where the sequence $\{W_n\}$ is iid with uniform marginals on $[0,1)$. In other words, $Q_n$ is conditionally uniform on $\Gamma_m$, given that $C_n = m$. From 2.17 and 2.18, it can be concluded that any QTES background sequence is uniform on $[0,1)$. The $\tau$-step transition density of $\{Q_n\}$ is given by

$$q_\tau(y \mid x) = \frac{1}{\delta}\, h_\tau\big(I_\Gamma(y) \mid I_\Gamma(x)\big), \tag{2.19}$$

where $I_\Gamma(z)$ is the unique slice index, $k = k(z)$, in the partition, $\Gamma$, such that $z \in \Gamma_k$. Therefore, $q_\tau(y \mid x)$ is only a function of the partition $\Gamma$, in the sense that it does not depend on $x$ and $y$, but only on the partition indices in which they lie. In other words, $q_\tau$ is constant in each variable over partition slices, implying that it maps a continuous domain to a finite range. As a result, $q_\tau$ only requires a finite number of values for its representation, while in a TES background process the transition function $g_\tau$ requires two continuous variables. It is this reduction in complexity that motivated the QTES processes, as can be seen by comparing 2.7 with 2.15 and 2.8 with 2.16.

Let $D$ be a distortion, and let $\{Z_n\}$ be the corresponding foreground sequence, with elements $Z_n = D(Q_n)$. Thus, the QTES foreground sequence, $\{Z_n\}$, has the same marginal distribution, $\hat{H}_Y$, as the TES foreground sequence $\{X_n\}$, with $X_n = \hat{H}_Y^{-1}(S_\xi(U_n))$. Let, further, the partial mean of $Z_n$ on each slice $\Gamma_m$ be defined by

and notice that

The autocorrelation function $\rho_Z(\tau)$ of $\{Z_n\}$, for $\tau = 0, 1, \ldots$, is given by

and

PW> Â €t'En

+ xYL;l fj( i 2 7 L , ) ~ 2 ( ; . 7 , y ) e - i2 rv - . Ï odd

2.3 Summary

The TES modeling methodology has the attractive property of decoupling the distribution of the foreground process from the autocorrelation function. Because the distribution of $X_n$ is guaranteed to match the empirical distribution, the model needs only to be optimized to fit the empirical autocorrelation function. Fitting the TES model to the empirical autocorrelation is accomplished by varying the innovation density, $f_V$, and the stitching parameter, $\xi$.

Although TES-based models have proven to be accurate in a variety of application domains, TES-based queueing models have been analytically intractable, mainly because TES processes are transformed Markov sequences over an uncountable state space. This is the reason for introducing the discretized TES processes (DTES). By using DTES processes, we manage to reduce the continuous state space of background TES processes to a finite state space. Finally, we obtain quantized TES processes (QTES) by extending DTES processes to the unit circle, in such a way that the corresponding transition density has a quantized domain of conditioning.


Chapter 3

Automated Modeling of Broadband Network Data Using the QTES Methodology

Despite the fact that TES modeling has been proven effective in many application domains, it has several disadvantages. Firstly, the modeler has to understand in depth the qualitative nature of TES processes in order to use the TES modeling methodology effectively and get accurate results. Secondly, the scope and the speed of the search are fundamentally subject to the human response. Thirdly, modeling precision is constrained by screen resolution as perceived by the human eye. By shifting the modeling burden from the modeler to the computer, we can largely automate the modeling process and alleviate the above problems. As we have already mentioned in section 2.2, TES-based queueing models have been analytically intractable, due to the fact that TES processes are transformed Markov processes over an uncountable state space. Therefore, we are going to use QTES processes in this algorithmic modeling, achieving in that way a reduction in the complexity of the algorithm. An algorithmic procedure using TES processes is discussed in [15].


Our modeling approach combines a global search with a local nonlinear technique to minimize the objective function consisting of the distance between the empirical autocorrelation function and its candidate model counterpart. By distance, we mean the weighted sum of squared differences between autocorrelation coefficients of corresponding lags. This approach was mainly motivated by the fact that there are fast and numerically stable analytical formulas for the calculation of the objective function and its partial derivatives. Moreover, the constraints in the local nonlinear optimization problem are very simple.

3.1 Problem Formulation

As we have already mentioned, the QTES modeling methodology tries to produce accurate traffic models from empirical data by capturing first-order and second-order statistics of empirical time series simultaneously. The first order parameters are represented by the distortions $D_{Y,\xi}$, which are given by equation 2.4, while the second order parameters are represented by the pairs $(f_V, \xi)$ of step-function innovation densities and stitching parameters.

Since an exact match to the empirical histogram is guaranteed by using the inversion method, the problem reduces to one of approximating the empirical autocorrelation function, $\hat{\rho}_Y$, by some QTES model autocorrelation function, $\rho_{f_V,\xi}$, which will be determined by the choice of $(f_V, \xi)$. Here $f_V$ is the discrete innovation used in the QTES model and $\xi$ is the stitching factor.

We have discrete innovation densities of the form $f_V = P = \{P_1, P_2, \ldots, P_K\}$, with the restrictions

$$P_k \ge 0, \quad k = 1, \ldots, K, \qquad \sum_{k=1}^{K} P_k = 1, \tag{3.1}$$

and the stitching factor, $\xi$, lies in the interval $[0,1]$. Therefore, we can define the parameter space as

$$G_K = \{(P, \xi) : P = (P_1, \ldots, P_K),\ \xi \in [0,1]\}. \tag{3.2}$$

Now, we need a metric on the space of autocorrelation functions. The metric which will be used is

$$g_K(P, \xi) = \sum_{\tau=1}^{T} a_\tau \left[ \rho_{P,\xi}(\tau) - \hat{\rho}_Y(\tau) \right]^2, \tag{3.3}$$

where $T$ is the maximal autocorrelation lag to be approximated. The metric should also take into consideration the fact that it is more important to approximate the lower-lag autocorrelations than the higher-lag ones. This is the reason for using the weighted sum of the squared differences between the empirical and modeled autocorrelations in 3.3. In the above metric (3.3), the $0 < a_\tau \le 1$ are weight coefficients.

Our task now is to search for a pair $(P, \xi)$ which minimizes the autocorrelation distance in 3.3. Formally, we seek to solve the following nonlinear optimization problem:

For a fixed inverse histogram distribution, find an optimal DTES innovation density and stitching parameter, $(f_V^*, \xi^*)$, such that

$$(f_V^*, \xi^*) = \arg\min_{(f_V, \xi) \in G_K} g_K(f_V, \xi), \tag{3.4}$$

where $g_K(f_V, \xi)$ is given in equation 3.3.
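The objective in 3.3 is a plain weighted sum of squares, which the following sketch makes concrete. It assumes the model autocorrelations are already available as a list (computing them from a QTES model is not shown), uses a biased sample estimator for the empirical autocorrelations, and the function names and the example weights are hypothetical.

```python
def sample_autocorrelation(series, max_lag):
    """Biased sample autocorrelations for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    rho = []
    for tau in range(1, max_lag + 1):
        cov = sum((series[i] - mean) * (series[i + tau] - mean)
                  for i in range(n - tau)) / n
        rho.append(cov / var)
    return rho

def autocorrelation_distance(model_rho, empirical_rho, weights):
    """Weighted sum of squared differences over lags 1..T (cf. Eq. 3.3)."""
    return sum(a * (m - e) ** 2
               for a, m, e in zip(weights, model_rho, empirical_rho))

if __name__ == "__main__":
    empirical = sample_autocorrelation([1.0, 2.0, 3.0, 4.0], max_lag=1)
    # Assumed weights that emphasize the lower lags, e.g. a_tau = 1 / tau.
    print(empirical, autocorrelation_distance([0.9, 0.8], [0.8, 0.6], [1.0, 0.5]))
```

Decreasing weights such as $a_\tau = 1/\tau$ are one way to express that lower lags matter more; the text only requires $0 < a_\tau \le 1$.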


As we can observe from the above analysis, the optimization problem that we have to tackle has a finite dimensionality due to the use of the QTES processes, it is subject to simple linear constraints, and its objective function has an analytical, closed formula, as can be seen from equations 2.23 and 2.24.

3.2 Algorithmic Approach

The algorithmic approach which will be used for the solution of the optimization problem consists of two parts.

Exhaustive Search: We select the $B$ most promising points of the parametric space $G_K$. In order to do that, we discretize the parameter space into a finite number of points, evaluate the objective function $g_K$ at those points and keep the points which give the $B$ smallest values of $g_K$.

Local Search: We use each of these $B$ points as an initial point to find a minimum of the objective function $g_K$. Then, we select among them the point $x^*$ that gives the smallest value for $g_K$.

By using the exhaustive search as the first step of the optimization procedure, we get rid of all the obviously "bad" points right from the beginning, while we increase the chance that the best local minimum is close to the global minimum.

3.3 Implementation of the Exhaustive Search

As we have already pointed out, the Exhaustive Search selects the $B$ most promising initial points for the local search. This means that the algorithm finds the $B$ points $(P, \xi)$ which give the smallest values of the objective function. In order to simplify the notation, we write $(P_1, \ldots, P_K, \xi)$ rather than $((P_1, \ldots, P_K), \xi)$, interchangeably with $(P, \xi)$. Therefore, what we have to do first is to find a way to produce all the possible combinations of $(P_1, \ldots, P_K, \xi)$ where $P_1 + \ldots + P_K = 1$, $0 \le P_k \le 1$ and $\xi \in [0,1]$.

In addition to $K$ and $B$, the exhaustive search algorithm requires two additional parameters, $N_P$ and $N_\xi$. These are the number of equidistant values that each $P_k$ and $\xi$ can assume, respectively. For example, if we have $N_P = 4$, it means that each one of the $P_1, \ldots, P_K$ can be equal to 0, 0.25, 0.50, 0.75 or 1. There is of course the restriction that $P_1 + \ldots + P_K = 1$. The same rule applies to $\xi$ also. If, for example, we have $N_\xi = 10$, it means that the allowable range for the smoothing transform is $[0,1]$ in increments of 0.1. The reason for establishing these specifications is twofold: first, we decrease the number of searches for both the discrete innovation and the stitching factor and second, given that we have an initial "good" innovation and a stitching factor, we get more exact values in the local optimization step of the algorithm.

We used two routines in order to implement the Global Search Algorithm: the Next Composition and the Global Search. Both of them are explained in the sequel.

3.3.1 Next Composition

We observed that the way that each initial discrete innovation is formulated is the same as the composition of $n$ into $k$ parts. Let $n$ and $k$ be fixed positive integers. By a composition of $n$ into $k$ parts, we mean a representation of the form

$$n = r_1 + r_2 + \ldots + r_k \tag{3.5}$$

in which $r_i \ge 0$ $(i = 1, \ldots, k)$ and the order of summands is important. By using the same idea in our case, we can say that if we have a discretization level $K$ and $N_P$ is the number of the equidistant values that each one of the $P_1, \ldots, P_K$ can take, then this is the same as if we have to accommodate $N_P$ balls in $K$ cells. In [16] it is proven that there are

$$\binom{n+k-1}{n} \tag{3.6}$$

ways of arranging $n$ balls in $k$ cells.

The algorithm for generating compositions of $n$ into $k$ parts is an extension of another algorithm that we made which returns all the combinations of $n$ things taken $k$ at a time. Therefore, we will explain this algorithm first.

In a lexicographic sequence, we can obtain the successor of a given $k$-subset $\{a(1), \ldots, a(k)\}$ as follows: search for the smallest $h$ such that $a(k+1-h) < n+1-h$; then increase $a(k+1-h)$ by 1 and set $a(j) \leftarrow a(j-1) + 1$ $(j = k+2-h, \ldots, k)$.

In figure 3.1, we can see how we obtain a successor of a given $k$-subset of an $n$-set. It is interesting to notice that the index $h$ can be found without searching, since at each transition from a subset to its successor, $h$ increases by 1 unless $a(k+1-h) < n-h$, where $h$ is reset to 1 on the next transition.
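The successor rule for $k$-subsets described above can be sketched directly. The version below searches for $h$ explicitly rather than carrying it between calls, which is simpler but not the search-free variant mentioned in the text; subsets are held as sorted lists of integers from $1$ to $n$, and the function names are hypothetical.

```python
def next_k_subset(a, n):
    """Return the lexicographic successor of the k-subset a of {1..n}, or None.

    a is a sorted list; a[k - h] below is a(k + 1 - h) in 1-based notation.
    """
    k = len(a)
    for h in range(1, k + 1):            # smallest h with a(k+1-h) < n+1-h
        if a[k - h] < n + 1 - h:
            b = list(a)
            b[k - h] += 1
            for j in range(k - h + 1, k):
                b[j] = b[j - 1] + 1      # each later element: predecessor + 1
            return b
    return None                          # a was the last subset

def all_k_subsets(n, k):
    """List all k-subsets of {1..n} in lexicographic order."""
    subsets = []
    current = list(range(1, k + 1))      # first subset {1, 2, ..., k}
    while current is not None:
        subsets.append(current)
        current = next_k_subset(current, n)
    return subsets

if __name__ == "__main__":
    print(all_k_subsets(4, 2))
```

For $n = 4$ and $k = 2$ this lists the six 2-subsets from $\{1, 2\}$ to $\{3, 4\}$ in lexicographic order.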

Figure 3.1: Flowchart for the Next k-Subset of an n-Set routine

We can generate compositions of $n$ into $k$ parts by going back to our lexicographic algorithm and translating it into a direct algorithm for compositions. Recall that in that algorithm, if $\{a(1), \ldots, a(k)\}$ is a $(k-1)$-subset, we go to the next one by finding the smallest $h$ for which

$$a(k+1-h) < n+1-h. \tag{3.7}$$

We then increase $a(k-h+1)$ by 1 and set each succeeding $a(r+1)$ equal to one more than its predecessor $a(r)$.

As far as the composition (3.5) which is associated with the subset is concerned, the relations (3.7) tell us that $r(k) = r(k-1) = \ldots = r(k+1-h) = 0$ and $r(k-h) > 0$. Increasing $a(k-h+1)$ by 1 and setting each following $a(r)$ equal to one more than its predecessor will increase $r(k-h+1)$ by 1, set $r(k) = r(k-h) - 1$ and set $r(k-h) = 0$.

The notion of subsets can then be eliminated and the algorithm can be stated directly in terms of compositions. When all of this is done, what remains is to search the last composition $r_1, \ldots, r_k$ to find the first nonzero part $r_h$. We then put

$$t \leftarrow r_h, \qquad r_h \leftarrow 0, \qquad r_1 \leftarrow t - 1, \qquad r_{h+1} \leftarrow r_{h+1} + 1.$$

In figure 3.2, we can see the flowchart for the complete algorithm. If we divide all the elements of each different combination by the discretization level $K$, we obtain the DTES innovation.
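The update rule for compositions stated above ($t \leftarrow r_h$, $r_h \leftarrow 0$, $r_1 \leftarrow t - 1$, $r_{h+1} \leftarrow r_{h+1} + 1$) can be sketched as follows. The starting composition $(n, 0, \ldots, 0)$ and the stopping rule (the first nonzero part is the last one) are assumptions of this sketch, as is normalizing each composition by its total $n$ so that the resulting probabilities sum to one; the function names are hypothetical.

```python
def next_composition(r):
    """Return the composition that follows r, or None if r is the last one."""
    k = len(r)
    h = next((i for i, part in enumerate(r) if part > 0), None)
    if h is None or h == k - 1:          # all mass already in the last part
        return None
    s = list(r)
    t = s[h]                             # t <- r_h
    s[h] = 0                             # r_h <- 0
    s[0] = t - 1                         # r_1 <- t - 1
    s[h + 1] += 1                        # r_{h+1} <- r_{h+1} + 1
    return s

def all_compositions(n, k):
    """All compositions of n into k non-negative parts."""
    result = []
    current = [n] + [0] * (k - 1)        # assumed starting composition
    while current is not None:
        result.append(current)
        current = next_composition(current)
    return result

def innovations(n, k):
    """Candidate innovation vectors: each composition divided by n."""
    return [[part / n for part in r] for r in all_compositions(n, k)]

if __name__ == "__main__":
    for r in all_compositions(2, 3):
        print(r)
```

For $n = 2$ and $k = 3$ the routine produces six compositions, in agreement with $\binom{n+k-1}{n} = \binom{4}{2} = 6$.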

3.3.2 Global Search

The objective function is calculated for each one of the candidate DTES innovations and stitching factors $\xi$. The computed value is then compared with the current best set, namely the running set of the best (at most $B$) values, kept in sorted order. If the newly computed value improves on the worst value in the current best set, then the worst value is discarded and the new value is added in the sorted order. We continue this way until we have searched the whole discretized space.
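The bookkeeping of the current best set can be sketched with a bounded heap. This is one possible realization, not necessarily the one used in the thesis: the worst of the kept values sits at the top of a max-heap (implemented by negating values in Python's min-heap), so each new candidate is compared against it in constant time. The names `global_search`, `candidates` and `objective` are hypothetical.

```python
import heapq

def global_search(candidates, objective, B):
    """Return the B candidates with the smallest objective values, best first."""
    best = []                                   # heap of (-value, index, point)
    for idx, point in enumerate(candidates):
        value = objective(point)
        item = (-value, idx, point)             # idx breaks ties between values
        if len(best) < B:
            heapq.heappush(best, item)
        elif value < -best[0][0]:               # improves on the current worst
            heapq.heapreplace(best, item)       # discard worst, insert new
    ranked = [(-neg, point) for neg, _, point in best]
    return sorted(ranked, key=lambda vp: vp[0])

if __name__ == "__main__":
    # Toy objective over a one-dimensional grid of candidate points.
    result = global_search(range(10), lambda x: (x - 3.2) ** 2, B=3)
    print([point for _, point in result])
```

In the setting of this chapter, `candidates` would enumerate every pair of a discretized innovation and a grid value of $\xi$, and `objective` would evaluate $g_K$ at that pair.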

Figure 3.2: Flowchart for the Next Composition routine

3.4 Local Search

The local optimization of the objective function in 3.4 can be solved by many standard nonlinear programming techniques. We implemented it using the sequential quadratic programming (SQP) technique, which is considered state-of-the-art in nonlinear constrained optimization methods [17, 18, 19].

3.4.1 Constraint Optimization

The optimization problem that we have can be described as a constrained problem as follows:

$$\begin{aligned} \text{minimize} \quad & g(\mathbf{x}) \\ \text{subject to:} \quad & r_1(\mathbf{x}) = 0, \\ & r_i(\mathbf{x}) \le 0, \quad i = 2, \ldots, 2K+3, \end{aligned} \tag{3.9}$$

where $\mathbf{x}$ is the vector of the parameters which will be optimized, $g$ is the objective function in eq. 3.3 ($g : \mathbb{R}^{K+1} \rightarrow \mathbb{R}$), and $\mathbf{r}$ is the vector of inequality and equality constraints (lower-case boldface letters indicate vectors and capital-case boldface letters indicate matrices).

In constrained optimization, we want to transform the problem into an easier subproblem which can be solved and used as the basis of an iterative process. Most of the methods that are used today in nonlinear optimization depend heavily on the solution of the Kuhn-Tucker (KT) equations. The KT equations are necessary conditions for the optimality of a constrained optimization problem [20, 21].

The Kuhn-Tucker equations can be described as follows:

$$\begin{aligned} \nabla g(\mathbf{x}^*) + \sum_{i=1}^{2K+3} \lambda_i^* \nabla r_i(\mathbf{x}^*) &= 0, \\ \lambda_i^* r_i(\mathbf{x}^*) &= 0, \quad i = 1, \ldots, 2K+3, \\ \lambda_i^* &\ge 0, \quad i = 2, \ldots, 2K+3. \end{aligned} \tag{3.10}$$

The first equation indicates the fact that at the solution point the gradients of the objective function and of the active constraints cancel. The Lagrangian multipliers serve this purpose, i.e., they balance the deviations in magnitude between the gradient of the objective function and the gradients of the constraints. Non-active constraints have Lagrangian multipliers equal to zero, since they do not contribute to the canceling process. This is stated in the last two equations of Eq. 3.10. A more detailed analysis of constrained optimization and the Kuhn-Tucker equations can be found in [20, 21].

Many algorithms are based on the direct computation of the Lagrangian multipliers. There are some more efficient methods, however, which make use of second order information about the KT equations by means of updating procedures. These methods are known as Sequential Quadratic Programming (SQP) methods, because a quadratic subproblem (QP) is solved at each iteration.

3.4.2 Sequential Quadratic Programming (SQP)

In SQP methods, at each iteration, we approximate the Hessian of the Lagrangian function

$$L(\mathbf{x}, \boldsymbol{\lambda}) = g(\mathbf{x}) + \sum_{i=1}^{2K+3} \lambda_i r_i(\mathbf{x}) \tag{3.11}$$

using a quasi-Newton updating procedure. Then, we use this approximation to obtain a feasible direction $\mathbf{d} = (d_1, \ldots, d_K, d_{K+1})$ through a quadratic problem and, once a feasible direction has been obtained, we apply a line search procedure to find

the optimal feasible step size. A more detailed description of SQP can be found in

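Given the problem in eq. 3.4, the SQP implementation consists of three steps:

1) Update the Hessian Matrix

At each iteration, $\mathbf{H}$, a positive-definite quasi-Newton approximation of the Hessian of the Lagrangian function, is calculated using the Broyden [22], Fletcher [23], Goldfarb [24], Shanno [25] (BFGS) formula, which is given by

$$\mathbf{H}_{i+1} = \mathbf{H}_i + \frac{\mathbf{q}_i \mathbf{q}_i^T}{\mathbf{q}_i^T \mathbf{s}_i} - \frac{\mathbf{H}_i \mathbf{s}_i \mathbf{s}_i^T \mathbf{H}_i}{\mathbf{s}_i^T \mathbf{H}_i \mathbf{s}_i}, \tag{3.12}$$

where $\mathbf{s}_i = \mathbf{x}_{i+1} - \mathbf{x}_i$ and $\mathbf{q}_i = \nabla_{\mathbf{x}} L(\mathbf{x}_{i+1}, \boldsymbol{\lambda}) - \nabla_{\mathbf{x}} L(\mathbf{x}_i, \boldsymbol{\lambda})$.

2) Quadratic Problem

At each iteration a QP problem is solved to yield a search direction in which the solution is estimated to lie:

$$\begin{aligned} \text{minimize} \quad & \tfrac{1}{2} \mathbf{d}^T \mathbf{H}_i \mathbf{d} + \nabla g(\mathbf{x}_i)^T \mathbf{d} \\ \text{subject to:} \quad & \nabla r_1(\mathbf{x}_i)^T \mathbf{d} + r_1(\mathbf{x}_i) = 0, \\ & \nabla r_j(\mathbf{x}_i)^T \mathbf{d} + r_j(\mathbf{x}_i) \le 0, \quad j = 2, \ldots, 2K+3. \end{aligned} \tag{3.13}$$

The QP sub-problem can be solved using any QP algorithm.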

3) Line Search

The solution of the QP sub-problem produces a vector $\mathbf{d}_i$. Now, we have to find the minimum along the line formed from this search direction. This minimum is generally approximated using a search procedure (e.g., Fibonacci, Golden Section) or by a polynomial method involving interpolation or extrapolation (e.g., quadratic, cubic). The problem is to find a new iterate $\mathbf{x}_{i+1}$ of the form

$$\mathbf{x}_{i+1} = \mathbf{x}_i + \alpha \mathbf{d}_i, \tag{3.14}$$

where $\mathbf{x}_i$ is the current iterate, $\mathbf{d}_i$ is the search direction obtained by the QP subproblem and $\alpha$ is a scalar step length parameter which is the distance to the minimum.

The algorithm terminates when the optimal value of the objective function falls below a prescribed threshold.
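The line search step can be illustrated with a golden-section search over the step length $\alpha$. This is a sketch under stated assumptions: the objective is taken to be unimodal along the search direction on $[0, \alpha_{\max}]$, the bracket $\alpha_{\max}$ and the tolerance are arbitrary choices, and the function names are hypothetical. It is not a full SQP implementation; the Hessian update and the QP subproblem are not shown.

```python
import math

def golden_section(phi, lo, hi, tol=1e-8):
    """Minimize a unimodal scalar function phi on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0      # reciprocal of the golden ratio
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = phi(c), phi(d)
    while b - a > tol:
        if fc < fd:                             # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = phi(c)
        else:                                   # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = phi(d)
    return (a + b) / 2.0

def line_search_step(g, x, d, alpha_max=1.0):
    """Return the new iterate x + alpha * d and the step length alpha."""
    phi = lambda alpha: g([xi + alpha * di for xi, di in zip(x, d)])
    alpha = golden_section(phi, 0.0, alpha_max)
    return [xi + alpha * di for xi, di in zip(x, d)], alpha

if __name__ == "__main__":
    g = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2
    print(line_search_step(g, [0.0, 0.0], [1.0, -2.0], alpha_max=2.0))
```

In the toy example the direction points straight at the minimizer $(1, -2)$, so the search returns a step length close to 1.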

3.4.3 Partial Derivatives of the objective function

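This section derives the partial derivatives of the objective function $g_K$ in (3.3) which are used by the SQP optimization algorithm for the local search. To simplify the notation, we write $(P_1, \ldots, P_K, \xi)$ rather than $((P_1, \ldots, P_K), \xi)$, interchangeably with $(P, \xi)$. By differentiating the objective function with respect to all optimization variables, we obtain

$$\frac{\partial g_K}{\partial P_k} = 2 \sum_{\tau=1}^{T} a_\tau \left[ \rho_{P,\xi}(\tau) - \hat{\rho}_Y(\tau) \right] \frac{\partial \rho_{P,\xi}(\tau)}{\partial P_k}, \qquad k = 1, \ldots, K, \tag{3.15}$$

$$\frac{\partial g_K}{\partial \xi} = 2 \sum_{\tau=1}^{T} a_\tau \left[ \rho_{P,\xi}(\tau) - \hat{\rho}_Y(\tau) \right] \frac{\partial \rho_{P,\xi}(\tau)}{\partial \xi}. \tag{3.16}$$

By differentiating the autocorrelation function in (2.23) with respect to all optimization variables, we obtain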

Page 48: Automated Modeling of Broadband Network Data Using the QTES … · 2005-02-12 · Traffic modeling is a key element in simulating communications networks. A clear understanding of

A s it can be seen from equations (3.17) and (3.1s): n-e need to calculate the

a j J ( i l n u ) a ~ l b ~ i 2 r v ) 1 l = partial derivatives apfi;. and ac in order to find the gradient of the objective

function wit h respect to the opt imizatior variables.

By using synbolic differentiation rather than a numerical one, ive make the

optimizat ion algorit hm faster.

In order to find the partial derivative $\partial \tilde{f}(i2\pi\nu)/\partial P_k$, we can express the DTES innovation as follows:

By taking the Discrete Fourier Transform of the above DTES innovation, we have

$$\tilde{f}(i2\pi\nu) = \sum_{k=1}^{K} P_k \exp\left(\frac{-i2\pi(k-1)(\nu-1)}{K}\right), \qquad \nu = 1, 2, \ldots, K \tag{3.20}$$

It can be proved by simple mathematics that the partial derivatives of $\tilde{f}(i2\pi\nu)$ with respect to $P_1, P_2, \ldots, P_K$ are given by the vectors:

$$\frac{\partial \tilde{f}(i2\pi\nu)}{\partial P_k} = \left[1,\ \exp\left(\frac{-i2\pi(k-1)(2-1)}{K}\right),\ \ldots,\ \exp\left(\frac{-i2\pi(k-1)(K-1)}{K}\right)\right]$$
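Because the transform in (3.20) is linear in the $P_k$, the derivative with respect to one $P_k$ is simply the corresponding vector of complex exponentials. The following sketch checks this numerically with a finite difference. The probability vector is made up for illustration, the constraint that the $P_k$ sum to one is ignored during the perturbation, and the function names are hypothetical.

```python
import cmath

def innovation_dft(P):
    """Evaluate sum_k P_k exp(-i 2 pi (k-1)(nu-1)/K) for nu = 1..K."""
    K = len(P)
    return [
        sum(P[k] * cmath.exp(-2j * cmath.pi * k * v / K) for k in range(K))
        for v in range(K)  # k and v are the zero-based (k-1) and (nu-1)
    ]

def dft_partial(k, K):
    """Analytic derivative with respect to P_k (k is 1-based) for nu = 1..K."""
    return [cmath.exp(-2j * cmath.pi * (k - 1) * (v - 1) / K)
            for v in range(1, K + 1)]

P = [0.1, 0.4, 0.3, 0.2]   # hypothetical innovation probabilities
K = len(P)
k = 3                      # differentiate with respect to P_3
h = 1e-6                   # finite-difference step

P_shift = list(P)
P_shift[k - 1] += h
numeric = [(b - a) / h for a, b in zip(innovation_dft(P), innovation_dft(P_shift))]
analytic = dft_partial(k, K)
max_err = max(abs(n - a) for n, a in zip(numeric, analytic))
```

The first component of the analytic vector is always 1, since the exponent vanishes for $\nu = 1$.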

In order to calculate the partial derivative $\partial |\tilde{D}(i2\pi\nu)|^2/\partial \xi$, we have to take into consideration the fact that every complex number can be expressed as the sum of a real and an imaginary part. In that respect we can express the Fourier transform of the distortion $\tilde{D}(i2\pi\nu)$ as follows:

where $\tilde{D}(i2\pi\nu)$ is the discrete Fourier transform of the distortion. We have to recall that the distortion is a column vector whose elements are the partial means of the corresponding foreground sequence given by (2.21).

3.5 Examples and Results

This section illustrates the efficacy of the automated QTES modeling methodology via some examples from the domain of compressed video traffic. More specifically, we used MPEG-compressed VBR video data streams.

Video traffic is inherently variable bit-rate (VBR). The information associated with each frame is a function of the contents of the scene, the motion of objects in the frame and the motion of the camera itself. Data compression is extensively used to reduce the transmission bandwidth requirements of telecommunication traffic. Data is coded at the source, where it is compressed to a fraction of its original size, and then it is transported over the network. Finally it is decoded at its destination. Coding standards like H.261 and MPEG compress the raw image data of each frame by taking advantage of the correlation in the temporal as well as spatial domain of a video sequence [El. The compression achieved for each frame is not of the same magnitude and this leads to an even greater burstiness in the compressed video traffic.

Compression in MPEG (Moving Picture Experts Group) is achieved by using three frame encoding types: intraframe frames (I frames), predictive frames (P frames) and bi-directionally interpolated frames (B frames). I frames are self-contained and can be decoded independently, whereas P frames need a past I or P frame for decoding. B frames are encoded with motion vectors from a past (I or P) and a future (I or P) frame, hence both are needed for decoding. B frames are not used for prediction of other frames, hence the errors in them do not propagate, as they do in I and P frames. This makes the B frames the least critical of the three frame types. Errors in I and P frames propagate until the arrival of another I frame, when the whole frame is effectively refreshed. The MPEG coding standard makes heavy use of the Discrete Cosine Transform (DCT) in order to achieve the compression of video units.

We used three different video sequences as an input to our automated algorithm: BBC video, Last Action Hero video and football video, whose qualitative and visual aspects are explained in the sequel.

Example 1: BBC Video

The BBC Video sample consists of 25000 frames played at 30 frames/sec showing a human figure talking into the camera. There is almost no video movement in the scene, but picture brightness is high, implying high picture detail. The background consists of smoothly varying brightness and colour information without sharp visual boundaries.

The marginal distribution of the traffic stream is exponential with parameter $\lambda = 1$. The number of equidistant values for each $P_k$ and $\xi$ were set to 4 and 10 respectively. The autocorrelation coefficients were calculated for 50 lags. We kept the $B = 30$ best points out of all points that we got from the exhaustive search and we optimized the cost function only for those 30 points. We then took the point which gives the smallest value of the objective function.
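The empirical autocorrelation coefficients used as the fitting target can be estimated directly from a frame-size sequence. The thesis does not state which estimator was used, so the sketch below assumes the standard biased sample estimator, and it uses a synthetic AR(1) sequence as a stand-in for a real video trace.

```python
import random

def autocorrelation(x, max_lag):
    """Biased sample autocorrelation coefficients for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    return [
        sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / (n * var)
        for lag in range(max_lag + 1)
    ]

# Synthetic stand-in for an empirical trace: AR(1) with coefficient 0.8.
rng = random.Random(0)
x = [0.0]
for _ in range(4999):
    x.append(0.8 * x[-1] + rng.gauss(0.0, 1.0))

rho = autocorrelation(x, 50)  # rho[0] is the lag-0 coefficient, equal to 1
```

With 50 lags, as in the examples, `rho[1:]` holds the coefficients that a candidate QTES model autocorrelation would be compared against.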

Figure 3.3 displays the results of the algorithmic modelling of a P-frame empirical video sequence for a QTES+ model. Note the very good agreement of the QTES model autocorrelation function with its empirical counterpart. As expected, when we increase the discretization level from $K = 10$ to $K = 20$, the empirical autocorrelation is more accurately approximated by the autocorrelation of the QTES model.

Example 2: Last Action Hero Video

The Last Action Hero sample is encoded into a rather compact MPEG stream, in part due to the wide-screen format employed on the source video tape. This wide-screen format effectively uses 70% of the middle range of the picture area, with the remaining 30% contributing nothing to the information content. The Last Action Hero Video consists of 25000 frames.

The number of equidistant values for each $P_k$ and $\xi$ were set to 4 and 10 respectively. The autocorrelation coefficients were calculated for 50 lags. We kept the $B = 30$ best points out of all points that we got from the exhaustive search and we optimized the cost function only for those 30 points. We then took the point which gives the smallest value of the objective function.

In figure 3.4, we can see how well the empirical autocorrelation of a P-frame video sequence is approximated by its QTES counterpart. On closer inspection, we can observe again that the approximation is better for $K = 20$ than for $K = 10$.

Example 3: Football Video

The football segment is a high action segment with very saturated monotonic areas of colour. The video areas which are subject to movement get larger as the scene progresses, and this corresponds nicely to an increasing function of frame sizes. The Football Video consists of 25000 frames.

We kept the parameters of the optimization algorithm the same as in the two previous examples, i.e. the number of equidistant values for each $P_k$ and $\xi$ were set to 4 and 10 respectively, the autocorrelation coefficients were calculated for 50 lags, we kept the $B = 30$ best points out of all points that we got from the exhaustive search and we optimized the cost function only for those 30 points. We then took the point which gives the smallest value of the objective function.

By comparing the accuracy of the approximation in the three examples above, we can see that in the case of the BBC and the Last Action Hero video samples the approximation of the empirical autocorrelation with its QTES counterpart is very good. In the case of the football video sample, though, we notice that as we move to higher lags the approximation becomes less accurate. This happens mainly because the football video stream is a high action segment and the video areas which are subject to movement get larger as the scene progresses. This increases the burstiness of the video data stream and leads to a less accurate approximation as we move to higher lags. Moreover, the presence of the exponentially distributed weight coefficients in the objective function forces the optimization to focus on the lower-lag autocorrelation coefficients.
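The effect of the weights on the fit can be seen in a small sketch. The exact form of the objective function (3.3) is not reproduced in this section, so the code assumes a weighted sum of squared differences with weights that decay exponentially in the lag; the decay rate `decay` and the function name `weighted_distance` are hypothetical.

```python
import math

def weighted_distance(rho_emp, rho_model, decay=0.1):
    """Weighted squared distance between two autocorrelation sequences.

    rho_emp[0] and rho_model[0] correspond to lag 1. The weight of lag tau
    is assumed to be exp(-decay * tau), which is an illustrative choice.
    """
    return sum(
        math.exp(-decay * lag) * (m - e) ** 2
        for lag, (e, m) in enumerate(zip(rho_emp, rho_model), start=1)
    )

# A made-up empirical autocorrelation over 50 lags.
rho_emp = [0.9 ** lag for lag in range(1, 51)]

# The same absolute error of 0.1, placed once at lag 2 and once at lag 40.
early = list(rho_emp)
early[1] += 0.1
late = list(rho_emp)
late[39] += 0.1

cost_early = weighted_distance(rho_emp, early)
cost_late = weighted_distance(rho_emp, late)
```

Under these assumptions the error at lag 2 costs far more than the identical error at lag 40, which is why a minimiser of such an objective tends to match the low lags first.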

Figure 3.3: Autocorrelation Coefficients for the BBC news empirical traffic sample and the QTES models for K=10 and K=20 using the exhaustive search

Figure 3.4: Autocorrelation Coefficients for the Last Action Hero empirical traffic sample and the QTES models for K=10 and K=20 using the exhaustive search

Figure 3.5: Autocorrelation Coefficients for the Football empirical traffic sample and the QTES models for K=10 and K=20 using the exhaustive search

3.6 Summary

The design of an automated algorithm which approximates an empirical autocorrelation as closely as possible using the QTES modeling methodology was investigated in this chapter. This was accomplished by finding an optimal innovation density, $f$, and an optimal stitching parameter, $\xi$.

Our automated modeling approach combines an exhaustive search with a non-linear technique to minimize the objective function consisting of the distance between the empirical autocorrelation function and its candidate QTES model counterpart.

In the exhaustive search we select the $B$ most promising points of the parametric space $G_K$ by discretizing the parameter space into a finite number of points, evaluating the cost function $g_K$ at those points and keeping those points which give the $B$ smallest values of $g_K$. Since the exhaustive search is of a combinatoric nature, it is very time consuming.

In the local search we use each one of the $B$ points as initial points to find a minimum of the objective function $g_K$. Then we select among them the point $x^*$ which gives the smallest value of $g_K$. We implemented the local optimization by using the sequential quadratic programming (SQP) technique. The choice of this method was based mainly on the following facts: there are fast and numerically stable analytical formulas for the calculation of the cost function and its partial derivatives, the constraints imposed on the objective function are linear and very simple, and finally the SQP technique makes use of second order information, which leads to more accurate results.

Finally, the efficiency of the automated algorithm was verified by using three empirical traffic streams from the domain of MPEG compressed VBR video traffic. In all cases the approximation of the empirical autocorrelation with its QTES model counterpart was very accurate. As expected, it was verified that when we increase the discretization level $K$ of the QTES model, the accuracy is better.

Chapter 4

Random Search

So far, we have used an exhaustive search in the implementation of the automated algorithm in order to find all the initial points for the local search.

In algorithms of the exhaustive search type, we often have a list of combinatorial objects and we want to search the entire list, or perhaps to search sequentially until we find an object which meets certain conditions. Random sampling, on the other hand, is done when the exact determination of the quantity by exhaustive search would be so time consuming as to be impracticable.

For the exhaustive search of our optimization problem, we have used the "next composition" subroutine which, each time we call upon it, returns one object on our list. We process the object and call the subroutine again to obtain the next object. Therefore our subroutine (a) realizes when it is being called for the first time, (b) remembers enough about its previous output so that it can construct the next member of the list, (c) realizes, at the end, that there are no objects left and (d) informs the calling program that the end has been reached.
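The four responsibilities (a) to (d) map naturally onto a generator, which keeps its own state between calls and signals exhaustion by itself. The sketch below is one classical way to organise such an enumerator of the compositions of $n$ into $k$ parts. It is an assumption about how a "next composition" routine could be written, not a transcription of the routine used in the thesis.

```python
import math

def next_compositions(n, k):
    """Yield every composition of n into k non-negative parts, one per call."""
    r = [0] * k
    r[0] = n
    t = n   # value most recently moved out of a cell
    h = 0   # 1-based index of the cell most recently emptied
    yield tuple(r)           # (a) first call: start from (n, 0, ..., 0)
    while r[k - 1] != n:     # (c) the last composition is (0, ..., 0, n)
        # (b) the state (r, t, h) is enough to build the next member
        if t > 1:
            h = 0
        h += 1
        t = r[h - 1]
        r[h - 1] = 0
        r[0] = t - 1
        r[h] += 1
        yield tuple(r)
    # (d) returning here tells the caller that the list is exhausted

all_comps = list(next_compositions(4, 3))
expected_count = math.comb(4 + 3 - 1, 4)  # number of compositions of 4 into 3 parts
```

A caller simply iterates over `next_compositions(n, k)` and processes each tuple as it arrives, so the full list never has to be stored.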

The logic of random sampling is much simpler. Given the input parameters, the subroutine is expected to select at random just one object from the list specified by the input parameters. Of course, each object on the list has equal a priori probability of being selected.

Therefore, in the random search approach, we first specify how many points $N$ of the parametric space $G_K$ we want to have randomly chosen, we evaluate the objective function $g_K$ at those $N$ points and keep those which give the $B$ smallest values of $g_K$.
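The selection step can be sketched in a few lines. The objective and the point sampler below are toy placeholders (a quadratic bowl and uniform points in a square), chosen only so that the example runs; in the thesis the points come from the parametric space $G_K$ and the objective is $g_K$. The function name `random_search` is hypothetical.

```python
import heapq
import random

def random_search(objective, sample_point, n_points, n_best, rng):
    """Draw n_points random candidates and keep the n_best with smallest objective."""
    candidates = [sample_point(rng) for _ in range(n_points)]
    return heapq.nsmallest(n_best, candidates, key=objective)

rng = random.Random(1)
sample_point = lambda r: (r.uniform(-1.0, 1.0), r.uniform(-1.0, 1.0))  # toy sampler
objective = lambda p: p[0] ** 2 + p[1] ** 2                            # toy objective

# N = 70 random points, B = 30 kept as starting points for the local search.
starts = random_search(objective, sample_point, 70, 30, rng)
```

The returned list is ordered from best to worst, so `starts[0]` is the most promising initial point found by the sampling phase.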

4.1 Random Search Algorithm

The random search algorithm is also based on the "next composition" idea that was described in section 3.3.1. We will elaborate a bit more on this idea, in order to understand fully how the random search algorithm works. In section 3.3.1, we mentioned that there are

$$J(n, k) = \binom{n+k-1}{n} \tag{4.1}$$

ways of arranging $n$ balls in $k$ cells.

Let $n + k + 1$ spaces be marked on a sheet of paper, and suppose that in the first space and the last space we mark a vertical bar, as shown in figure 4.1. In the remaining $n+k-1$ spaces, distribute the $n$ balls with no more than one ball occupying any space. There are obviously $J(n, k)$ ways of doing this. In each of the other $k-1$ spaces place a vertical bar. We now have a pattern like the one shown in figure 4.2.

Now we can think of the vertical bars as representing cell boundaries. Hence, in figure 4.2 there are 5 cells containing respectively 2, 0, 1, 3, 1 balls. It is now clear that the number of compositions of $n$ into $k$ parts is given by equation 4.1.

Figure 4.1: Balls in cells model

Figure 4.2: Balls in cells model example

The algorithm for generating compositions of $n$ into $k$ parts is based on the "balls in cells" model [16] which was described above. What we must do in order to generate compositions of $n$ into $k$ parts is to generate $(k-1)$-subsets of $(n+k-1)$ objects and to interpret each such subset as the set of locations of the interior vertical bars in figure 4.2. From the bar locations we can, by subtraction, find the number of balls between each consecutive pair of bars and therefore determine the composition which corresponds to the given subset.
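The correspondence between bar positions and compositions can be checked mechanically. In the sketch below the $n+k-1$ interior spaces are numbered from 1, with the two outer bars placed at positions 0 and $n+k$; this numbering and the function names are assumptions made for the example.

```python
import itertools
import math

def bars_to_composition(bars, n, k):
    """Convert sorted interior bar positions (in 1..n+k-1) into a composition."""
    edges = [0] + list(bars) + [n + k]   # add the two outer bars
    # Balls in a cell = number of spaces strictly between two consecutive bars.
    return tuple(edges[i + 1] - edges[i] - 1 for i in range(k))

def all_compositions(n, k):
    """Enumerate compositions by enumerating (k-1)-subsets of the interior spaces."""
    return [
        bars_to_composition(b, n, k)
        for b in itertools.combinations(range(1, n + k), k - 1)
    ]

# 7 balls in 5 cells with interior bars at spaces 3, 4, 6 and 10.
example = bars_to_composition((3, 4, 6, 10), n=7, k=5)

comps = all_compositions(7, 5)
count_formula = math.comb(7 + 5 - 1, 7)   # equation (4.1) with n = 7, k = 5
```

The example pattern decodes to the cell contents 2, 0, 1, 3, 1, and the number of enumerated compositions agrees with equation (4.1).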

We are interested in random compositions of $n$ into $k$ parts. Therefore, we can choose the positions of the cell boundaries at random, then by differencing we find out how many balls are in each cell. In other words, we have to find an algorithm which can find random $k$-subsets of an $n$-set.

4.1.1 Random k-subset of an n-Set

Suppose that integers $k$, $n$ are given, $1 \le k \le n$, and we want to select $k$ distinct elements $a_1, a_2, \ldots, a_k$ from $\{1, 2, \ldots, n\}$ at random. In our case, the $n$-set can be a very large one, since it represents the discretization level for the discrete innovation, while $k$ is the number of equidistant probability values. We tried therefore to produce an algorithm that takes the minimum memory storage and the minimum labour. The output set contains $k$ words, and so it seems reasonable to impose the requirement that

(i) no more than $k$ words of array storage should be used by the algorithm, each word to hold an integer between 0 and $n$.

Next, it seems that most of the effort is needed for the selection of $k$ integers and perhaps some removal of duplicates, and so our second requirement is

(ii) the average labor required should be $O(k)$.

The main problem is that we select, one at a time, integers at random between 1 and $n$, and we need to know if the integer just chosen has been chosen before. If so, the new integer is discarded, otherwise it is kept.

But how shall we discover if the latest integer is "new"? If we examine all integers so far chosen, we end with $O(k^2)$ labor, in violation of (ii). If we arrange the integers in a linked tree, the labor may drop, still not to $O(k)$, but the links force a violation of (i). If we keep an array whose $i$th entry tells us whether $i$ has been chosen, then (i) and (ii) will be violated.

Our algorithm, which meets all the requirements, is, in summary, to divide the range $[1, n]$ into $k$ subintervals $R_l$ ($l = 1, 2, \ldots, k$) of approximately equal sizes, and choose the cardinalities $|B_l|$ of the sets $B_l$ of elements to be chosen from each subinterval. We determine them by a rejection method which simulates the choosing and recording of the members of the $B_l$. When this step is finished, we must choose the actual members of the $B_l$ from the interval $R_l$, again uniformly at random. The algorithm, in more detail, is the following.

Divide the range $[1, n]$ into $k$ subranges

The random $k$-set to be chosen will consist of members $B_l$ of the subranges $R_l$. As the number of sets $B_l$ equals $k$, they will contain very few elements; some will be empty.

First determine the cardinalities $|B_l|$, without worrying about exactly which elements of $R_l$ will be members of $B_l$. In doing this, the $k$ storage locations $a_l$ can be used, one for each $B_l$. We draw a random number $x$ in the range $[1, n]$, determine the $R_l$ to which it belongs by (4.3), and accept or reject $x$ depending on whether it "duplicates" an element already accepted. Suppose $m$ members of $R_l$ have already been accepted, while the total number of members of $R_l$ is $q$; then the probability that $x$ is rejected is $m/q$. With this in mind, we may simply reject $x$ with probability $m/q$ without even checking $x$ against any element that has been accepted. In fact we only maintain a count of the elements that have been accepted. All this is accomplished as follows: initially, we store in $a_l$ the number $\lfloor (l-1)n/k \rfloor$. This is one unit less than the smallest element in $R_l$. When an $x$ has been chosen, and $R_l$ is determined by (4.3), we accept $x$ if $x > a_l$ and reject it if $x \le a_l$. If $x$ is accepted, we increase $a_l$ by one unit, but drop $x$.

When $k$ such $x$ have been accepted, we scan the $a_l$ ($l = 1, \ldots, k$) and move those $a_l$ that no longer have their initial values (they represent the nonempty $B_l$) to the leftmost positions of the array, say $(a_1, \ldots, a_p)$. Next, for $j = p, p-1, \ldots, 1$ we reserve space for each of the $B_l$ in $(a_1, \ldots, a_k)$, starting from the right, and store the value of $l$ in the rightmost of these spaces; the others are zeros.

Again, scanning from the right, we now place random elements of $R_l$ into the space reserved for $B_l$, listing them in order. Duplications are avoided by choosing at random $m' = 1 + \lfloor \xi m \rfloor$, where $\xi$ here denotes a uniform random number in $[0, 1)$ and $m$ is the number of members of $R_l$ not yet chosen, and $x$ is the $m'$th member of this list of unchosen elements.
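The bucket-and-rejection procedure described above is not reproduced here. As a point of comparison, the following sketch uses a different and simpler technique, Floyd's sampling algorithm, which also draws a uniformly random $k$-subset of $\{1, \ldots, n\}$ with exactly $k$ random draws. It relies on a hash set rather than on a $k$-word integer array, so it relaxes requirement (i) as stated in the text; the function name `floyd_sample` is hypothetical.

```python
import random

def floyd_sample(n, k, rng):
    """Return a sorted, uniformly random k-subset of {1, ..., n} (Floyd's algorithm)."""
    chosen = set()
    for j in range(n - k + 1, n + 1):
        t = rng.randint(1, j)          # uniform on 1..j, both ends inclusive
        # If t was already picked, j itself cannot have been picked yet, so take j.
        chosen.add(j if t in chosen else t)
    return sorted(chosen)              # sorting adds O(k log k) work

rng = random.Random(42)
subset = floyd_sample(1000, 10, rng)
```

No candidate is ever rejected and redrawn, which keeps the number of random draws at exactly $k$ regardless of how large $n$ is.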

4.1.2 Random Composition of n into k parts (rancom)

Our algorithm for random compositions is based on the "balls in cells" model which was described above. Briefly, we choose the positions of the cell boundaries at random using the routine which was described in the previous section, then by differencing we find out how many balls are in each cell. The flowchart of the algorithm is shown in figure 4.3.
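A random composition can be sketched by combining a random subset routine with the differencing step. The code below uses Python's `random.sample` as a stand-in for the subset routine of the previous section, which is an assumption made to keep the example short; the function name `random_composition` is hypothetical.

```python
import random

def random_composition(n, k, rng):
    """Return a uniformly random composition of n into k non-negative parts."""
    # Choose k-1 distinct interior bar positions among the n+k-1 interior spaces.
    bars = sorted(rng.sample(range(1, n + k), k - 1))
    edges = [0] + bars + [n + k]       # add the two outer bars
    # Differencing: balls in a cell = spaces strictly between consecutive bars.
    return [edges[i + 1] - edges[i] - 1 for i in range(k)]

rng = random.Random(7)
parts = random_composition(7, 5, rng)
```

Since every $(k-1)$-subset of bar positions is equally likely and maps to exactly one composition, every composition is equally likely as well.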

4.2 Examples and Results

This section illustrates that our automated QTES algorithm is equally efficient when we replace the exhaustive search with the random search which was described in detail in section 4.1.

Figures 4.4 and 4.5 display the results of the random algorithmic modeling for the BBC and Last Action Hero video samples. The number of equidistant values for each $P_k$ and $\xi$ were set to 4 and 10 respectively. The autocorrelation coefficients were calculated for 50 lags. For a discretization level $K = 10$, we chose randomly $N = 30$ points of the parametric space $G_K$ and we kept the $B = 10$ best points out of all those 30 points which were randomly chosen. Finally, we optimized the cost function for those $B$ points and took the point which gives the smallest value of the objective function. For a discretization level $K = 20$, we used $N = 70$ and $B = 30$. Figure 4.6 illustrates the results of the random algorithmic modeling for the football video sample. For $K = 10$, we used $N = 70$ and $B = 30$, while for $K = 20$, we used $N = 500$ and $B = 50$.

Figure 4.3: Flowchart for the ranksb algorithm

Figure 4.4: Autocorrelation Coefficients for the BBC News empirical traffic sample and the QTES models for K=10 and K=20 using random search

In most cases the agreement between the QTES model autocorrelation functions and their empirical counterparts is quite good. As expected, when we increase the discretization level from $K = 10$ to $K = 20$, the empirical autocorrelations for both video sequences are more accurately approximated by the autocorrelation of the QTES models.

Figure 4.5: Autocorrelation Coefficients for the Last Action Hero empirical traffic sample and the QTES models for K=10 and K=20 using random search

Figure 4.6: Autocorrelation Coefficients for the Football empirical traffic sample and the QTES models for K=10 and K=20 using random search

4.3 Summary

The automated algorithm was modified by replacing the exhaustive search with a random sampling algorithm. Random sampling is done when the exact determination of the quantity by exhaustive search is so time consuming as to be impracticable.

In the random search algorithm we first specify how many points $N$ of the parametric space $G_K$ we want to have randomly chosen. Then we keep those points which give the $B$ smallest values of $g_K$. Finally, we evaluate the objective function $g_K$ at those $B$ points and we select among them the point $x^*$ which gives the smallest value of $g_K$.

The efficiency of the automated algorithm was verified by using the same three MPEG compressed VBR video traffic samples as in the previous chapter.

Chapter 5

Comparison Between Exhaustive Search and Random Search

As we have already mentioned, random sampling is often preferred when the exact determination of the quantity by exhaustive search is time consuming. Each point has equal a priori probability of being selected. Since random sampling selects at random a number of points $N$ from the parametric space $G_K$ and the automated algorithm chooses the best $B$ of those as initial points for the local search, it is reasonable to assume that random sampling will not bring results as accurate as the exhaustive search, which uses indiscriminately all the points of the parametric space $G_K$. Figures 5.1, 5.2 and 5.3 compare the efficacy of the automated algorithm when random and exhaustive search are used respectively. In all cases, we used a discretization level $K = 20$. For the random search, we used $N = 70$ and we chose the best $B = 30$ points as initial points for the local search. In the exhaustive search, we also used $B = 70$. As can be seen, the results that we got from the automated modeling using the random search are very close to those we got by using the exhaustive search.

The risk that is associated with random sampling in the automated algorithm is obvious: the random sampling algorithm may not select $N$ "good" points, which means that the best $B$ of those $N$ points, which will be used as initial points for the local search, will not be "good" enough either. Figure 5.4 illustrates this for the football video sample. In this example, we used a discretization level of $K = 10$. For the random search, we used $N = 70$ and we optimized the best $B = 30$ points. For the exhaustive search, we used $B = 30$.

Figure 5.1: Autocorrelation Coefficients for the empirical data stream and the QTES (K=20) models of example 1 (BBC video sample) using exhaustive and random search

Figure 5.2: Autocorrelation Coefficients for the empirical data stream and the QTES (K=20) models of example 2 (Last Action Hero video sample) using exhaustive and random search

Figure 5.3: Autocorrelation Coefficients for the empirical data stream and the QTES (K=20) models of example 3 (Last Action Hero video sample) using exhaustive and random search

Since random sampling selects at random points from the parametric space $G_K$, it is reasonable to assume that the greater the number $N$ of randomly chosen points we specify from the beginning is, the greater the chances of getting "good" initial points are. Figures 5.5 and 5.6 illustrate this in the case of the football video sample. For a discretization level of $K = 10$, the results are more accurate when we use $N = 70$ and optimize locally the best $B = 30$ points of those, rather than when we use $N = 30$ and optimize locally the best $B = 10$ points of those. For a discretization level $K = 20$, we derive the same conclusion, as can be seen in figure 5.6.

Figure 5.4: Autocorrelation Coefficients for the empirical data stream and the QTES (K=20) models of example 3 (football video sample) using exhaustive and random search

From the above analysis, we can conclude that in the case of random search, the greater the parameters $B$ and $N$ are, the more accurate the traffic model will be. The parameter $B$ represents the number of points that are given to the local search as initial points for the non-linear local optimization to start. Therefore, by increasing the number $B$, we increase the number of local searches that the automated algorithm is going to perform. This increases the required running time for the automated algorithm. The running time is also increased when we increase the parameter $N$, because we increase the number of random searches in the first phase of the optimization.
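The claim that a larger $N$ improves the chances of obtaining "good" initial points can be quantified under a simple model. The sketch below assumes that the $N$ points are distinct and drawn uniformly from a finite grid of `total` points, of which `good` are considered good starting points. These quantities and the numbers used are hypothetical and are not taken from the thesis.

```python
import math

def prob_at_least_one_good(total, good, n_samples):
    """Probability that a uniform sample without replacement hits a good point."""
    if n_samples > total - good:
        return 1.0   # more draws than bad points: a good point is unavoidable
    # 1 - P(all sampled points are bad), a hypergeometric probability.
    return 1.0 - math.comb(total - good, n_samples) / math.comb(total, n_samples)

# Hypothetical grid of 10000 points, 100 of which count as "good".
p30 = prob_at_least_one_good(10000, 100, 30)
p70 = prob_at_least_one_good(10000, 100, 70)
p500 = prob_at_least_one_good(10000, 100, 500)
```

Under this model the probability rises monotonically with the number of sampled points, at the price of more objective function evaluations in the first phase.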

Figure 5.5: Autocorrelation Coefficients for the QTES (K=10) models of example 3 (football video sample) for N=30 and N=70 using random search

Figure 5.6: Autocorrelation Coefficients for the QTES (K=20) models of example 3 (football video sample) for N=100 and N=500 using random search

Table 5.1: Required running time (hours) for obtaining the traffic models from the empirical VBR video samples

Columns: Video Sample | Exhaustive Search, K=10 | Random Search, K=10 | Exhaustive Search, K=20 | Random Search, K=20
Rows: BBC; Action Hero; Football
Cell values in extraction order: 2.S2, 5.105, 2.36, 0.86, 55.13, 13.63, 53.74, 36.15, 1.51, 1.05, 95.37, 50.17

The comparison between the exhaustive search and the random search showed that in most cases the random search is as efficient as the exhaustive search. Moreover, the random search is much faster than the exhaustive search. In table 5.1, we can see the time it took us in hours to obtain the QTES traffic models for both the exhaustive and random search and for discretization levels of $K = 10$ and $K = 20$.

The comparison also revealed the risk that is associated with random sampling: the random sampling algorithm may not select a number $N$ of "good" points. Of course, the greater the number $N$ of the randomly chosen points is, the higher the chances of getting "good" initial points are and the longer the automated algorithm is going to take. Therefore, the accuracy of the approximation of the empirical autocorrelation with its QTES model counterpart is a tradeoff between the number $N$ of the randomly chosen points and the time that the algorithm is going to take.

From table 5.1 it is also obvious that the burstiness which is involved in a video traffic stream can increase the running time of the algorithm significantly, while at the same time it can reduce the positive impact of the use of the random search. In the case of the BBC and Last Action Hero video samples the difference in the running time between the random and exhaustive search is significant, mainly due to the fact that the random search is faster than the exhaustive search. In the case of the Football video stream, though, the random search is still faster than the exhaustive search, but the difference is not as significant as it is in the case of the two other video samples. This happens because the Football traffic stream contains more burstiness than the other two, and this means that the local search needs more time to find a local minimum which will be as close as possible to the global minimum of the objective function.


Chapter 6

QTES Generator

Traffic generators are a key element in studying communication networks and evaluating their performance. Traffic generators are most often employed in two fundamental ways: either as part of an analytical model, or to drive a discrete event simulation. The most common modeling context is queueing, where traffic generators generate the traffic which goes to a queue or a network of queues and various performance measures are calculated.

The purpose of the QTES generator is to generate sample path realizations which simultaneously capture the first-order statistics (marginal probability distribution) and the second-order statistics (autocorrelation function) of the given empirical data sample, using the QTES methodology. In chapter 3, we developed an algorithm which calculates in an automated way the parameters which largely determine the autocorrelation structure. These two parameters, namely the DTES innovation and the stitching factor, are given as inputs to the QTES generator algorithm, which is described later in this chapter, to ensure that the autocorrelation structure of the synthetic QTES traffic is as close as possible to the empirical autocorrelation structure. As we will see in the sequel, an exact match to the empirical distribution (histogram) can be guaranteed algorithmically through the inversion method [2, 26].

A QTES generator which can generate autocorrelated stochastic sequences is composed of a foreground/background scheme like the one shown in figure 6.1 [12]. The generator generates two autocorrelated sequences in lockstep: an auxiliary sequence $\{Q_n\}$ and the target sequence $\{Z_n\}$. The background sequence $\{Q_n\}$ is a QTES background sequence generated recursively by the transition function $Q_n = T(Q_{n-1}, J_{n-1})$, where the transition function $T$ is given by 2.18 and $\{J_n\}$ is a sequence of independent identically distributed (i.i.d.) variates which have the same marginal distribution as the DTES innovation sequence given by equation 2.12. All the parameters which are needed to produce the background sequence $\{Q_n\}$, namely the DTES innovation sequence and the stitching factor, are determined by the automated algorithm which was described in detail in chapter 3. The sequence $\{Z_n\}$ is obtained in lockstep from the sequence $\{Q_n\}$ via $Z_n = D(Q_n)$, where the mapping $D$ is the distortion given by equations 2.4 and 2.9.

6.1 Inverse Transform Method

The Inverse Transform method is one of the most widely used techniques for generating variates which follow the marginal distribution of a given data sample [2, 26]. It can be used to sample from the exponential, the uniform, the Weibull and the triangular distributions, but most often it is used when we want to sample from empirical distributions, which is what we have in our case. Additionally, it is the underlying principle for sampling from discrete distributions, as we will see in the sequel.

Suppose that we wish to generate a random variate $X$ that is continuous and has a distribution function $F$ that is continuous and strictly increasing when $0 < F(x) < 1$.


Figure 6.1: QTES generator

Let $F^{-1}$ denote the inverse of the function $F$. Then the algorithm for generating a random variate $X$ having distribution function $F$ is as follows ($\sim$ is read "distributed as"):

1. Generate $U \sim U(0,1)$

2. Return $X = F^{-1}(U)$

where $U(0,1)$ denotes the uniform distribution on $[0,1]$. Note that $F^{-1}(U)$ will always be defined, since $0 \le U \le 1$ and the range of $F$ is $[0,1]$. Figure 6.2 illustrates the algorithm graphically, where the random variable corresponding to this distribution function can take either positive or negative values; the particular value of $U$ determines which will be the case. In the figure, the random number $U_1$ results in the positive random variate $X_1$, while the random number $U_2$ leads to the negative variate $X_2$.

A more detailed description of the Inverse Transform Method can be found in [2, 26].
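The two steps of the continuous inverse transform method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exponential distribution is used because its inverse has a closed form, and the rate value, the seed and the function names are hypothetical choices that do not come from the thesis.

```python
import math
import random


def exponential_inverse_cdf(u, rate):
    """Closed-form inverse of F(x) = 1 - exp(-rate * x) for 0 <= u < 1."""
    return -math.log(1.0 - u) / rate


def sample_by_inversion(inverse_cdf, rng):
    """Step 1: draw U ~ U(0,1). Step 2: return X = F^{-1}(U)."""
    u = rng.random()  # uniform on [0, 1)
    return inverse_cdf(u)


rng = random.Random(12345)  # fixed seed so the run is repeatable
rate = 2.0  # hypothetical rate parameter, chosen only for illustration
samples = [sample_by_inversion(lambda u: exponential_inverse_cdf(u, rate), rng)
           for _ in range(20000)]
sample_mean = sum(samples) / len(samples)  # should be close to 1 / rate
```

Any other distribution with a computable inverse could be substituted for `exponential_inverse_cdf` without changing `sample_by_inversion`.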


Figure 6.2: Inverse-transform method for continuous random variables

6.2 QTES algorithm

The QTES generator algorithm consists of three components: a discrete number generator, dis-genn, which generates a sequence of i.i.d. variates which have the same marginal distribution as the DTES innovation; a background QTES generator, qtes, which generates the uniform QTES background sequence; and finally a foreground QTES generator, qtesgen, which generates the foreground QTES sequence which captures both the autocorrelation structure and the marginal distribution of the empirical data set. In figure 6.1 the three generators are denoted by $J_n$, $Q_n$ and $Z_n$ respectively.

6.2.1 Discrete number generator (dis-genn)

Our situation here is the following: we have a probability mass function $p_1, p_2, \ldots, p_K$ on the nonnegative integers, which represents the innovation sequence we got from the automated algorithm described in chapter 3, and we want to generate discrete random variates $J$ with the corresponding distribution.

The direct inverse transform method is as follows:

1. Generate $U \sim U(0,1)$

2. Return the nonnegative integer $J = I$ satisfying

$$\sum_{j=1}^{I-1} p_j \le U < \sum_{j=1}^{I} p_j$$
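A minimal Python sketch of the direct inverse transform for a discrete distribution follows. It assumes the probabilities are indexed from 1 and that the returned index is the first one whose cumulative probability exceeds the uniform draw; the probability values and the name `make_discrete_sampler` are hypothetical and are not taken from the dis-genn routine itself.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def make_discrete_sampler(pmf):
    """Return a function that maps u in [0, 1) to the smallest index I
    whose cumulative probability exceeds u (indices start at 1 here)."""
    cumulative = list(accumulate(pmf))

    def sample(u):
        index = bisect_right(cumulative, u)  # 0-based position in pmf
        # Guard against rounding in the last cumulative sum.
        return min(index, len(pmf) - 1) + 1

    return sample


pmf = [0.2, 0.5, 0.3]  # hypothetical innovation probabilities p_1..p_3
sample = make_discrete_sampler(pmf)

rng = random.Random(2024)
draws = [sample(rng.random()) for _ in range(30000)]
freq = [draws.count(i) / len(draws) for i in (1, 2, 3)]
```

The binary search keeps each draw at logarithmic cost in the number of probability values, which matters little for K = 10 or K = 20 but costs nothing to include.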

6.2.2 Background QTES generator (qtes)

This routine generates the background QTES sequence $\{Q_n\}$ recursively by using the transition function given by equation 2.18. The innovation sequence and the stitching factor that are used for the generation of the background QTES variates have been calculated by the automated algorithm described in chapter 3. The background sequence $\{Q_n\}$ which is generated has the same correlation structure as the empirical data stream. Recall the important fact that the background QTES variates are uniformly distributed, as mentioned in chapter 2.

6.2.3 Foreground QTES generator

The situation here is the following: we have an empirical data sample in the form of a histogram and we want to generate a sequence of numbers which has the same distribution as the empirical data set. In order to do that we will use the inverse transform method.

As we have mentioned, we have the empirical data sample in the form of a histogram. That means that the empirical data is grouped into $k$ adjacent intervals $[a_0, a_1), [a_1, a_2), \ldots, [a_{k-1}, a_k)$, so that the $j$th interval contains $s_j$ observations, where $s$ is the size of the empirical sample and $s_1 + s_2 + \cdots + s_k = s$. A reasonable piecewise-linear empirical distribution function $G$ could be specified by first letting $G(a_0) = 0$ and $G(a_j) = (s_1 + s_2 + \cdots + s_j)/s$ for $j = 1, 2, \ldots, k$. Then, interpolating linearly between the $a_j$'s, we can define

$$G(x) = \begin{cases} 0 & \text{if } x < a_0 \\ G(a_{j-1}) + \dfrac{x - a_{j-1}}{a_j - a_{j-1}}\,[G(a_j) - G(a_{j-1})] & \text{if } a_{j-1} \le x < a_j,\ j = 1, 2, \ldots, k \\ 1 & \text{if } a_k \le x \end{cases}$$

Figure 6.3: Continuous, piecewise-linear empirical distribution function from grouped data

Figure 6.3 illustrates this specification of an empirical distribution for k = 4.

The following inverse-transform algorithm generates a random variate which

has the same marginal distribution as the empirical data set:

1. Generate a background QTES variate $Q$. Recall that $Q \sim U(0,1)$

2. Find the nonnegative integer $I$ ($0 \le I \le k-1$) such that $G(a_I) \le Q < G(a_{I+1})$, and return $Z = a_I + [Q - G(a_I)]\,(a_{I+1} - a_I)/[G(a_{I+1}) - G(a_I)]$
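The inversion of a piecewise-linear empirical distribution built from a histogram can be sketched as below. The sketch assumes every histogram interval holds at least one observation (so that no segment of the distribution function is flat), and the interval edges, counts and the name `build_empirical_inverse` are hypothetical illustrations rather than values or routines from the thesis.

```python
from bisect import bisect_right
from itertools import accumulate


def build_empirical_inverse(edges, counts):
    """edges: a_0 < a_1 < ... < a_k; counts: s_1..s_k, all assumed > 0."""
    total = sum(counts)
    # cdf[j] = G(a_j), with G(a_0) = 0 and G(a_k) = 1
    cdf = [0.0] + [c / total for c in accumulate(counts)]

    def inverse(q):
        # Find I with G(a_I) <= q < G(a_{I+1}); clamp so q = 1.0 still works.
        i = min(bisect_right(cdf, q) - 1, len(counts) - 1)
        slope = (edges[i + 1] - edges[i]) / (cdf[i + 1] - cdf[i])
        return edges[i] + (q - cdf[i]) * slope

    return inverse


# Hypothetical histogram with k = 4 intervals (values are illustrative only).
edges = [0.0, 10.0, 20.0, 40.0, 80.0]
counts = [2, 5, 2, 1]
distortion = build_empirical_inverse(edges, counts)

z_low = distortion(0.1)   # falls in [0, 10): 0 + 0.1 * 10 / 0.2
z_mid = distortion(0.45)  # falls in [10, 20): 10 + 0.25 * 10 / 0.5
```

Feeding a uniformly distributed background variate into `distortion` yields a value whose distribution follows the histogram, which is the role the text assigns to the mapping from background to foreground.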

In figure 6.1, the inverse transform algorithm is denoted by the mapping $D$. The foreground sequence $\{Z_n\}$ that is generated has the same marginal distribution as the empirical data sample and simultaneously captures the autocorrelation structure of the empirical data stream.

6.3 Results

This section illustrates the efficacy of the QTES generator algorithm via the same three examples that we used in the previous sections, namely the BBC news, Last Action Hero and football MPEG compressed VBR video samples.

In all cases, as can be seen in figures 6.4, 6.5 and 6.6, the Inverse Transform Method guarantees an exact match of the QTES traffic model histograms to their empirical counterparts. As far as the match to the empirical autocorrelation function is concerned, in the case of the BBC news and Last Action Hero video samples we used the DTES innovation sequence and stitching factor we got from the automated algorithm using the Global search algorithm, while in the case of the football video sample we used the results that the automated algorithm gave us using the random search algorithm. We have used a line with "---" to represent the empirical autocorrelation function, a dashed line to represent the autocorrelation function of the QTES model with a discretization level of K = 10, and a line with "'" for a discretization level of K = 20. In all cases, we calculated the autocorrelation coefficients for 50 lags.
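For readers who want to reproduce this kind of comparison, the sketch below estimates autocorrelation coefficients up to a given lag. The estimator shown (lagged covariance divided by the lag-0 covariance) is one common choice and is an assumption here, since the exact estimator used for the figures is not stated in this section; the trace is synthetic.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation coefficients r(0), r(1), ..., r(max_lag)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered)  # n times the sample variance
    coefficients = []
    for lag in range(max_lag + 1):
        covariance = sum(centered[t] * centered[t + lag] for t in range(n - lag))
        coefficients.append(covariance / variance)
    return coefficients


# Hypothetical frame-size trace with a strong period-4 pattern.
trace = [100.0, 400.0, 250.0, 50.0] * 50
acf = autocorrelation(trace, 50)
```

Applying the same function to an empirical trace and to a synthetic QTES sample path gives two coefficient lists that can be compared lag by lag.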

Figure 6.4: QTES models (K = 10 and K = 20) of the BBC news video traffic sample. [Panels: sample path for empirical and QTES data (serial number); histogram for empirical data (value); autocorrelation for empirical and QTES data (lag number); histogram for QTES data (value).]

As can be inferred from figures 6.4, 6.5 and 6.6, the match of the QTES models generated by the QTES generator algorithm with their empirical counterparts is very accurate. It can also be seen that the approximation is more accurate for a discretization level of K = 20 than it is for a discretization level of K = 10. In the case of the football video sample this is more visible, since there is long-term dependence in the autocorrelation structure.

Figure 6.5: QTES models (K = 10 and K = 20) of the Last Action Hero video traffic sample. [Panels: sample path for empirical and QTES data (serial number); histogram for empirical data (value); autocorrelation for empirical and QTES data (lag number); histogram for QTES data (value).]

Figure 6.6: QTES models (K = 10 and K = 20) of the football video traffic sample. [Panels: sample path for empirical and QTES data (serial number); histogram for empirical data (value); autocorrelation for empirical and QTES data (lag number); histogram for QTES data (value).]

Summary

The QTES traffic generator generates traffic models from empirical data streams which capture both the marginal probability distribution and the autocorrelation function of the given empirical data sample simultaneously.

In chapters 3 and 4 we saw how we can produce a QTES autocorrelation function which approximates the empirical one as closely as possible. We achieved that by finding an optimal DTES innovation sequence and an optimal stitching parameter. These two parameters are given as inputs to the QTES generator algorithm and they are responsible for capturing the empirical autocorrelation. The Inverse Transform method, which is the major component of the QTES generator algorithm, guarantees an exact match of the QTES marginal distribution to the empirical one.

The efficiency and accuracy of the QTES generator was verified by using the same three MPEG compressed VBR video traffic samples as in the previous chapters.


Chapter 7

Conclusions and Recommendations for Further Study

7.1 Conclusions

The ATM transport concept of B-ISDN allows for an efficient and flexible use of network resources for many different services with high transmission rates. One major traffic source which will make use of this capability will be video and multimedia applications. The changes in the cell rate of such sources are caused by the compression of the digitized video frames. The MPEG (ISO Moving Picture Expert Group) coding scheme is expected to be the major compression algorithm for the first ATM video applications. Thus there is a need to find source traffic descriptors which can be used efficiently for connection admission control and usage parameter control for sources like MPEG video sources, where burstiness is involved.

This thesis presented a consolidated study of modeling bursty traffic over high speed networks, having investigated two of the principal challenging statistical properties of an empirical traffic sample that have to be captured so that the traffic model can be an accurate approximation of the empirical data stream: the marginal distribution and the autocorrelation structure of the empirical traffic.

Leaving the autocorrelation structure out of the traffic model can be dangerous, since temporal dependence is a major cause of burstiness in telecommunication traffic. The modeling was done by using the QTES methodology in an automated approach (heuristic/random search local search algorithm). Although the evaluation of the methodology and the automated algorithm was limited to MPEG video traffic sources, the proposed approach is generic and can be applied to any traffic stream which goes through a network.

Previous research [5] specified a first approach to capturing both the autocorrelation structure and the marginal distribution of the empirical traffic stream, namely the Global Search Local Optimization (GSLO) algorithm [15], which is an automated procedure for modeling an empirical traffic source using TES processes. The problem with that approach, as we have already mentioned in chapter 2, is that TES-based queueing models have been analytically intractable, since TES processes are transformed Markov sequences over an uncountable state space. Queueing models with TES-based traffic have only been studied via Monte Carlo simulation. Moreover, this approach uses an exhaustive search over the parameter space to get the most promising points, which are given to the local optimization algorithm in order to get the optimal innovation and stitching factor. This makes the whole approach very time consuming, since the exhaustive search algorithm is of combinatorial nature.

This thesis, on the other hand, outlines an algorithmic automated procedure for modeling an empirical traffic source using QTES processes. A QTES process approximates a continuous-state TES process by a discrete-state variant. In this way we can have a tractable queueing model which is still quite accurate, in the sense that the traffic model captures first-order and second-order statistics of the empirical traffic data. The algorithm combines an initial search over the whole parameter space to get the best initial points with a local nonlinear programming technique to minimize an objective function consisting of the distance between the empirical autocorrelation function and its candidate model counterpart. As far as the initial search over the whole parameter space is concerned, we used an exhaustive search at the beginning and then we developed an algorithm which uses a random search to get the initial points for the local non-linear optimization. The use of the random search significantly reduced the total amount of time that the algorithm needs to bring results, as shown in table 5.1.
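The overall shape of a random search followed by local optimization can be sketched in Python as follows. Everything concrete in this sketch is an assumption made for illustration: the two-parameter toy autocorrelation model `a * b**lag` stands in for the QTES autocorrelation formula, a simple coordinate pattern search stands in for the SQP routine, and the function names, point counts and seed are hypothetical.

```python
import random


def distance(params, target):
    """Sum of squared gaps between a toy model ACF a * b**lag and the target."""
    a, b = params
    return sum((a * b ** lag - rho) ** 2 for lag, rho in enumerate(target, start=1))


def local_search(params, target, step=0.1, tol=1e-6):
    """Coordinate pattern search on [0, 1]^2; a simple stand-in for SQP."""
    best, best_cost = list(params), distance(params, target)
    while step > tol:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] = min(1.0, max(0.0, trial[i] + delta))
                cost = distance(trial, target)
                if cost < best_cost:
                    best, best_cost, improved = trial, cost, True
        if not improved:
            step /= 2.0  # shrink the step once no neighbour improves
    return best, best_cost


def random_search_then_local(target, n_points, n_best, rng):
    """Sample n_points at random, keep the n_best, refine each locally."""
    candidates = [(rng.random(), rng.random()) for _ in range(n_points)]
    candidates.sort(key=lambda p: distance(p, target))
    refined = [local_search(p, target) for p in candidates[:n_best]]
    return min(refined, key=lambda result: result[1])


# Hypothetical "empirical" ACF, generated from a = 0.8, b = 0.9 for lags 1..20.
target = [0.8 * 0.9 ** lag for lag in range(1, 21)]
params, cost = random_search_then_local(target, n_points=50, n_best=5,
                                        rng=random.Random(7))
```

Raising `n_points` improves the odds that a good starting point is sampled, and raising `n_best` increases the number of local optimization runs, which mirrors the speed and accuracy trade-off discussed in this chapter.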

An exhaustive/random search local optimization algorithm was chosen for this automated modeling of empirical data due to the existence of fast, analytical formulas for the autocorrelation function and its partial derivatives, and also due to the simplicity of the constraints imposed on the objective function. More specifically, sequential quadratic programming (SQP) was chosen for the implementation of the local search optimization. This choice was made on the basis of the efficiency of the optimization algorithm, which makes use of second-order information, as far as the KT equations are concerned, using updating procedures. This second-order information was easy to obtain, again for the reasons mentioned at the beginning of the current paragraph.

The approximation of the empirical autocorrelation by its QTES model counterpart is shown to be very good when exhaustive search is used. It is also shown that by increasing the discretization level K of the model the approximation becomes more accurate. As expected, the exhaustive search is very time consuming due to the fact that the algorithm which implements it, namely the next composition routine, is of combinatorial nature, as has been shown in section 3.3.1. From expression 3.6, it can be concluded that any increase in the discretization level K of the QTES model leads to a combinatorial increase in the total number of searches in the exhaustive algorithm.

We replaced the exhaustive search with a random sampling algorithm and we managed to reduce significantly the runtime of the whole algorithm, as can be seen from table 5.1. It was shown in chapter 5 that the use of a random search generates models almost as accurate as those we get from the exhaustive search over the parametric space. Since random sampling selects points at random from the parametric space, it is reasonable to assume that the greater the number of randomly chosen points, the higher the chances of getting "good" initial points. This intuitive assumption was confirmed in the case of the football video sample. The approximation of the empirical autocorrelation was much more accurate when we produced 70 initial randomly chosen points than with 30 initial randomly chosen points. Of course, the more randomly chosen initial points one generates, the more time consuming the automated algorithm is going to be.

On the basis of the general efficacy of the automated algorithm, it is inferred that many parameters can play an important role in the accuracy of the approximation and in the time that the automated algorithm needs in order to give results. Firstly, the discretization level K of the QTES model seems to be a determining factor of the accuracy of the model. The more we increase the discretization level, the more accurate the QTES traffic model is. Also the parameters $N_P$ and $N_\xi$, namely the number of equidistant values that each $P_i$ and $\xi$ can assume respectively, are of much importance, since they determine the accuracy of the approximation of the innovation sequence and the stitching factor in the local optimization. These three parameters, although they are the most crucial ones in determining the accuracy of the approximation, are also the ones which mainly determine how fast the automated algorithm is going to be. Since they are the only participants in expression 3.6, every slight increase in any of them will increase the total number of searches in a combinatorial way, making the automated algorithm time consuming. The number B of the best initial points which will be optimized locally also plays an important role in the speed of the algorithm, since it determines how many times the local optimization routine is going to run. As can be seen, the efficacy of the algorithm, as far as the speed and the accuracy of the approximation are concerned, is a trade-off among all the above parameters.

7.2 Recommendations for Further Study

7.2.1 Experimental Design and Design Optimization

The matching accuracy of the QTES models obtained from empirical traffic samples through the QTES modeling methodology was based on informal ad-hoc testing. Experimentation with the traffic models provided evidence that changes to the discretization level K, the numbers $N_P$ and $N_\xi$ of equidistant values that each $P_i$ and $\xi$ can assume, the number B of the best initial points which are optimized locally and the number of randomly selected points, when random sampling is used, would affect the accuracy of the approximation of the empirical autocorrelation function, as measured by the cost function 3.3, and the time that the automated algorithm needs to bring results. A more formal qualitative analysis could be based on determining experimentally which of the above parameters (or combinations of parameters) have the most significant influence on the matching accuracy of the QTES traffic model and the time that it takes for such a traffic model to be produced. This could be done by performing a $2^k$ factorial analysis [26]. After having determined which parameters play the most important role in the matching accuracy of the QTES traffic model and the duration of the automated algorithm, we can use these parameters as potential candidates for an optimization of the cost function with respect to those parameters. A more detailed description of the procedure is given in the sequel.

Experimental Design

Factorial analysis supports the characterization of several design parameters (factors) with reference to a standard system response measure: the cost function. Two levels (discrete values or ranges) are selected for each of the k factors. Simulations are run for each of the $2^k$ possible factor-level combinations, from which the effect of each factor and of combinations of factors can be determined. Table 7.1 illustrates a notional experiment involving three factors ($F_1, F_2, F_3$).

The two levels associated with each factor are denoted "+" (high) and "-" (low). Eight different combinations of factors and levels are therefore possible and each combination defines a design point. The simulation result for each design point is expressed as a cost function value ($C_1, C_2, C_3, \ldots$). The main effect for a particular factor is determined by computing the average difference in simulation results when the factor level is high and low, and the levels for all other factors remain fixed. Combined effects of two factors can also be defined.
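The main-effect and two-factor calculations for a three-factor, two-level design can be sketched as follows. The cost model `toy_cost` is a hypothetical stand-in for a simulation run, and the function names are illustrative; the two-factor effect is computed with the usual sign-product contrast, which is an assumption about the definition intended here.

```python
from itertools import product


def main_effects(cost_by_point, n_factors):
    """Main effect of each factor: mean cost at '+' minus mean cost at '-'."""
    half = len(cost_by_point) / 2
    effects = []
    for factor in range(n_factors):
        high = sum(c for levels, c in cost_by_point.items() if levels[factor] == +1)
        low = sum(c for levels, c in cost_by_point.items() if levels[factor] == -1)
        effects.append((high - low) / half)
    return effects


def interaction_effect(cost_by_point, first, second):
    """Two-factor interaction: contrast on the product of the two sign columns."""
    half = len(cost_by_point) / 2
    contrast = sum(levels[first] * levels[second] * c
                   for levels, c in cost_by_point.items())
    return contrast / half


# Hypothetical cost model used only to fill the 2^3 design points.
def toy_cost(f1, f2, f3):
    return 10.0 + 3.0 * f1 - 1.0 * f2 + 0.5 * f1 * f2 + 0.0 * f3


design = list(product((-1, +1), repeat=3))          # the 8 design points
costs = {levels: toy_cost(*levels) for levels in design}
effects = main_effects(costs, 3)
e12 = interaction_effect(costs, 0, 1)
```

In this toy setting the third factor has no influence, so its main effect comes out as zero, which is the kind of result that would remove a parameter from the list of optimization candidates.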

Table 7.1: Generic 3 factor design analysis

When several simulation results are collected for each design point, the main effects ($e_1, e_2, \ldots, e_k$) and combined effects ($e_{12}, e_{13}$, etc.) can be calculated as confidence intervals, which are influenced by the variability of the measured effect and the number of simulations performed for each factor-level combination. A completely positive or negative confidence interval indicates that the factor, or combination of factors, has a statistically significant influence on the accuracy of the model and the duration of the algorithm and is a good candidate for optimization efforts.

Therefore, we can use the above procedure for our case and determine which factors play the most important role in the matching accuracy of the autocorrelation function and in the time that the automated algorithm needs to bring results.

Design Optimization

Optimization of a cost function can be achieved by locating the global minimum of the cost function, which has a multidimensional solution surface. The optimization effort involves computing the value of the cost function at several points on the solution surface and then applying some methodology, such as gradient descent, to locate the minimum value. For this particular optimization problem, though, Mean Field Annealing seems to be a good choice, although many of the previously mentioned techniques can be applied.

Mean Field Annealing (MFA), a variant of standard simulated annealing, is a statistically based technique which is well suited to the general problem of optimizing stochastic cost functions over a multi-dimensional parameter space [XI. In particular, MFA is resistant to variation in the cost estimate from simulation, is not susceptible to local minima in the solution surface and converges rapidly to near-optimal solutions. The MFA algorithm randomly selects a dimension $N_i$ in the parameter space and generates estimates of the cost function $C_j$ at evenly spaced increments over the interval $[N_{\min}, N_{\max}]$. This process is repeated iteratively by selecting new values for the $N_i$ axis, while the values for all other dimensions remain fixed, until equilibrium is reached.

A parameter T, which is commonly referred to as the temperature, is involved in all those calculations, and provides a mechanism for escaping from local minima in the solution surface. By gradually lowering the temperature over time, we can avoid local minima and relax to an optimal or near-optimal solution.
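The role of the temperature can be illustrated with a short Python sketch. Note that this is plain simulated annealing with Metropolis acceptance, not Mean Field Annealing itself; the cost surface `bumpy`, the cooling schedule and all parameter values are hypothetical and serve only to show how a falling temperature reduces the chance of accepting uphill moves.

```python
import math
import random


def anneal(cost, start, rng, t_start=1.0, t_end=1e-3, cooling=0.95, moves=50):
    """Plain simulated annealing on one real parameter (not MFA itself)."""
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    temperature = t_start
    while temperature > t_end:
        for _ in range(moves):
            trial = current + rng.uniform(-0.5, 0.5)
            trial_cost = cost(trial)
            delta = trial_cost - current_cost
            # Always accept improvements; accept uphill moves with
            # probability exp(-delta / T), which shrinks as T is lowered.
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = trial, trial_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        temperature *= cooling
    return best, best_cost


def bumpy(x):
    """Hypothetical cost surface with several local minima."""
    return 0.1 * x * x + math.sin(3.0 * x) + 1.0


start = 4.0
best_x, best_value = anneal(bumpy, start, random.Random(99))
```

At high temperature the search moves freely between basins, and as the temperature falls it settles into the best basin it has found, which is the escape mechanism the paragraph above describes.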

Therefore, after having determined which parameters are the most significant ones, we can follow the MFA algorithm and estimate the optimal values of those parameters so that we can have a very good accuracy in the approximation of the empirical autocorrelation and an optimal duration for the automated algorithm.

7.2.2 Other Recommendations

The thesis demonstrated the efficacy and suitability of the QTES models and of the QTES automated algorithm for producing traffic models from empirical traffic streams which capture both the empirical marginal distribution and the empirical autocorrelation function. This research was focused only on modeling the traffic that is given as an input to a network or to a queueing system. Taking into consideration the fact that traffic models are mainly used to predict accurately different aspects of network performance, such as cell loss statistics, further study could be done in the analysis of the performance of queueing systems and networks when this kind of input is used. Such a study could be very useful, since it would provide us with information about the accuracy of a traffic model that is needed to lead to an improved network performance. In our case, for example, let us assume that we have a queueing system and we want to compare its output performance when we use a QTES traffic input with a discretization level of K = 10 and when we use a discretization level of K = 20. If there is no significant improvement in the output performance between the two cases, then there may be no practical reason to use K = 20, given the fact that even a small increase in the discretization level of the QTES model increases the number of initial searches that the algorithm performs in a combinatorial way.

Many different traffic models have been proposed for the data generated by variable bit rate (VBR) coders. It has been observed in simulation studies that the use of different coders results in a wide variation in predicted cell loss statistics. Therefore, there is a need to study several traffic models (such as TES models, QTES models and self-similar traffic models) and compare their accuracy in capturing the statistical signature of the empirical data. Then their predictions of network performance can be examined, so that we can determine which models give a better prediction. Of course, the same procedure can be followed for other types of data as well, given the fact that certain traffic models are more appropriate for certain types of data than others.

In this thesis, we used an exhaustive algorithm in order to find the best initial

points which will be used for local optimization. We also developed a random search

algorithm in our effort to reduce the duration of the automated algorithm. Apart

from these two approaches, one could investigate the possibility of using more

sophisticated sampling methods to replace the exhaustive or random search. Methods

such as importance sampling and mean field annealing, for example, can be much

faster and more accurate at the same time.
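
The two strategies mentioned above can be contrasted in a minimal sketch. It is not the algorithm developed in this thesis: the objective is a toy double-well function, the local optimizer is a simple coordinate search standing in for a real local optimization routine, and all names (`local_minimize`, `exhaustive_starts`, `random_starts`, `multistart`) are hypothetical.

```python
import itertools
import random


def objective(x):
    """Toy multimodal objective (assumption): a tilted double well per coordinate."""
    return sum((t * t - 1.0) ** 2 + 0.3 * t for t in x)


def local_minimize(f, start, step=0.25, tol=1e-6):
    """Derivative-free coordinate search; a stand-in for a real local optimizer."""
    x = list(start)
    fx = f(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x.copy()
                trial[i] += delta
                ft = f(trial)
                if ft < fx:  # accept only strict improvements
                    x, fx, improved = trial, ft, True
        if not improved:
            step /= 2.0  # refine the step once no neighbour improves
    return x, fx


def exhaustive_starts(levels, dim):
    """Every grid point: len(levels) ** dim candidates."""
    return itertools.product(levels, repeat=dim)


def random_starts(low, high, dim, count, rng):
    """A fixed budget of uniformly sampled candidates."""
    for _ in range(count):
        yield tuple(rng.uniform(low, high) for _ in range(dim))


def multistart(f, starts, keep=3):
    """Score all starts, keep the best few, refine each locally, return the best."""
    best_starts = sorted(starts, key=lambda s: f(list(s)))[:keep]
    results = [local_minimize(f, s) for s in best_starts]
    return min(results, key=lambda r: r[1])


if __name__ == "__main__":
    grid = exhaustive_starts([-2.0, -1.0, 0.0, 1.0, 2.0], dim=2)
    print("exhaustive:", multistart(objective, grid))
    rng = random.Random(0)
    print("random:    ", multistart(objective, random_starts(-2.0, 2.0, 2, 15, rng)))
```

The exhaustive variant evaluates every grid point, so its cost grows with the number of levels raised to the number of parameters, whereas the random variant fixes the budget in advance at the risk of missing the best basin. A more sophisticated sampler would replace only the function that generates the starts.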
