modeling call holding times of public safety network

International Journal on Computational Sciences & Applications (IJCSA) Vol.3, No.3, June 2013

DOI:10.5121/ijcsa.2013.3301 1

MODELING CALLHOLDING TIMES OF PUBLICSAFETYNETWORK

Tuyatsetseg Badarch*, Otgonbayar Bataa†

*Department of Electrical and Computer Engineering,Northeastern University, Boston, USA

[email protected]†Wireless Communication and Broadcasting Technology, School of Information and

Communications Technology (SICT), Mongolian University of Science and Technology(MUST), Mongolia.

[email protected]

ABSTRACT

This paper presents parametric probability density models for call holding time (CHT)based on the actualdata collected for over a week from the IP based public Emergency Information Network (EIN) inMongolia. When the set of chosen candidates of Gamma distribution family is fitted to the call holdingtime data, it is observed that the whole area in the CHT empirical histogram is underestimated due tospikes of higher probability and long tails of lower probability in the histogram. Therefore, we provide theGaussian parametric model of a mixture of lognormal distributions with explicit analytical expressions forthe modeling of CHTs of PSN. Finally, we show that the CHT for the IP based PSN is fitted reasonably bya mixture of lognormal distributions via the simulation of expectation maximization algorithm. This resultis significant as it expresses a useful mathematical tool in an explicit manner of a mixture of lognormaldistributions.

KEYWORDS

A Mixture of Lognormal Distributions, EM algorithm, Modeling Call Holding Times, Public SafetyNetworks.

1. INTRODUCTION

Emergency communication networks that serve various safety personnel, including medicalresponders, police, hazard and fire fighters, play a critical role in responding to emergency callsand managing voice traffic. In such time-conscious networks, the CHT of incoming voice callsfrom the affected population contribute to the maximum volume of traffic, which may not besupported by existing infrastructure because of logistical constraints of system resources.Studying call holding time provides the ability to estimate the probability of an ongoing call tohang up the phone during the next t seconds [1]. Therefore, the call holding time distributionplays a prominent role that is used to analyze the traffic and accurate design of the systemresources, simulation, performance, and resource allocation strategy.The call duration over fixed, cellular, and trunked radio networks is traditionally assumed to benegative exponentially distributed. However, this distribution approximation is not valid forcommunication networks because many empirical approaches have proved that generalizedgamma, lognormal distributions fit empirical data much better [2]-[9]. For instance, theexponential distribution in cellular networks is quite inaccurate in capturing the empirical data

mailto:[email protected]

mailto:[email protected]


2

compared to the mixture of lognormal distributions. It is observed that the probability of veryshort occurrences is overestimated, while the area with the highest probability in the empiricalhistogram is underestimated [3], [10].

Whereas there exists a literature such as call holding time distribution modeling in fixed PSTNnetwork [4], in PCS network [6], in cellular networks [7], in private mobile radio PAMR [8], andpublic safety network [11]. Studying call holding time of a public safety network providesinsight into the operation of the special purposed network [11].

In this paper, we present the study of more flexible distribution of CHTs with its mathematicaltools in PSNs, which has not has been presented before. It is proved that the parametric mixturemodel has been implemented to show that average CHTs of PSNs may be approximatedaccurately by a mixture of lognormal distributions compared to the fitting of the statisticalprobability distributions of Lognormal, shifted Lognormal, and Weibull distributions, whichprovides the example of visual view of non-superimposed distribution graphs (Fig.2).

This article is structured as follows: Section 2 presents the IP based EIN, its system structure andcomponents. Section 3 describes the statistical methods of deriving CHT distributions. Section 4describes the proposed modeling of a mixture of lognormal distributions. Section 5 presents theEM algorithm performance for the proposed model of a mixture of lognormal distributions.Section 6 presents a data exploration of the existing PSN for the performance of the proposedmodel. Section 7 shows the results of the standard probability models. In section 8, wedemonstrate the numerical results based on the proposed mixture models for the PSN comparedto the statistical models such as Lognormal model, and Weilbull model. Finally, we reach aconclusion on the research.

2. IP BASED EMERGENCY INFORMATION NETWORK

The ability to access emergency services by dialing fixed numbers is a vital component of publicsafety and emergency preparedness. The Federal Communications Commission (FCC) definesVoice over Internet Protocol (VoIP) as a technology that supports some IP based services toallow a user to call anyone who has a telephone number, including mobile and fixed networknumbers. Gradually, the Emergency telephonic services are migrating to interconnected VoIP.The IP based EIN is the national level, integrated PSN with state-of-the-art ICT solutonscompared with previous emergency services which were handled by the national medical, police,and emergency management agencies in Mongolia separately. Four emergency telephonenumbers are used throughout the country. They are:"103" (ambulance), "102" (police), "101"(fire), and "105" (hazards, disaster). When dialing any one of these numbers fromtelecommunications networks (PSTN, GSM, CDMA) through PSTN, the emergency call ispushed/forwarded to an emergency call center agency desk in the EIN.

Figure 1 presents the EIN communication architecture that provides emergency call servicesthrough wired (PSTN) and wireless (GSM, CDMA) system services for the main capitalUlaanbaatar city as well as provinces within 1100 km, and high speed data, voice, and imagetransfer services through its main networks. The EIN system is composed of four main networksincluding a fiber optical (F/O) backbone network, IP based telecommunications and computernetworks (IP PABX/VoIP/LAN/WAN/WiMax/WiFi), digital and analog Trunk Radio System(TRS), and CCTV( Digital Camera TV) networks for voice, data, video, and image transfers inurban area coverage.


3

The EIN automatically associates a location with the origination of the Emergency call though aGPS system with Geographic Information System (GIS) using a Digital Map of shape type(**.shp). Additionally, a variety of Emergency data information, such as data and image

Figure 1. Communication Architecture of EIN

transfers, are being handled by the EIN within the network as real-time applications without apeak load for the system performance.

However, the life-threatening emergency voice calls which are transferred through the bulk ofthe existing optical E1 trunks across long haul telecommunication networks are the main part ofthe system used for the peak period network performance and traffic analysis because of thededicated limited number of inbound and outbound E1 trunks under the government regulation.If the number of calls addressed to the system is too large during the peak period, the incomingtrunks will be overloaded and operators cannot handle the volume of emergency calls. Hence,one aspect of particular interest is the peak period network analysis to model the call holdingtimes and to estimate the call holding time parameters for the system performance.

3. GENERAL METHODS ON DERIVING CALL HOLDING TIME

DISTRIBUTIONS

In this section, as the preliminary study, the set considered is of probability density functions(pdf) as main statistical candidates to fit the CHT empirical distribution for ambulance, police,fire, and hazards calls.

For the fitting of these candidate distributions, we use three tests: K-S tends to be more sensitivenear the center of the distribution than at the tails. Due to this limitation above, we may prefer touse the Anderson-Darling goodness-of-fit test which gives more weight to the tails than does theK-S test. The Chi- Square(Chi-Sq) test presents a statistic via the observed frequency and theexpected frequency for bins, which can be calculated based on the sample size. Test results showthat these candidate distributions are not best probability models of average CHTs for PSNs.


4

1. The pdf of Exponential distribution:

( ; ) exp( )f x x = − (1)

where > 0 is the rate parameter of distribution.2. The pdf of Weibull distribution:Weibull distribution is a continuous probability distribution with wide applicability, primarilydue to its relation to the Gamma distribution family.The pdf of Weibull distribution has the form:

( )( ; , ) exp xF x

= (2)

where and are the scale and shape parameters of distribution, respectively.3. The pdf of Shifted Lognormal distribution:

The pdf of a shifted lognormal distribution has the form:

2

21 ( ( ) )( ) exp

2( ) 2x

Ln xxx

− −= − − (3)

where >0, We use the likelihood function :

11

( ( )) ( ) ( )n n

x x i x iii

L x x Ln x ==

= = ∑∏ (4)

where > 0,4. The pdf of Lognormal Distribution:

Lognormal distribution is a probability distribution of any random variable whose logarithm isnormally distributed. The pdf of a lognormal distribution has the form:

2

21 ( )( ) exp

22x

Lnxxx

−= −

(5)

The log-likelihood function is given by

2

21

22

1 1

1 ( )( ( )) exp22

12 ( )

2 2

ni

xii

n n

i ii i

LnxLn x Lnx

nnLn Ln Lnx Lnx

=

= =

−= −

= − − − − −

∑

∑ ∑(6)

The parameter values are computed through Maximum Likelihood Estimate (MLE) for alognormal distribution [18], [19].

Based on the assumption of smooth histograms of an empirical data, the standard distributionsfor the empirical data, or workloads fit quite accurate. However, in practice, a jagged probabilityhistogram with random spikes and long tails frequently occurred. Therefore, the spikes maycause the call holding time to have a jagged histogram. The most comprehensive method of a


5

mixture of lognormal distributions with two or more parameters can be used to approximate thewhole call duration, including the spikes and tails, detailed in the following section.

4. PROPOSED MODEL OF MIXED LOGNORMAL DISTRIBUTIONS BASED ON

EXPECTATION MAXIMIZATION ALGORITHM

In the EM approach, when CHTs , are observed, we may consider thecomponent indicators 1( ,..., )ny y y= as missing like as in the usual mixture situation, so that

becomes the complete data [20]. In literature it might happen that the CHTs innetworks are not explicitly described by a mixture of parametric multivariate distributions [14].When each mixture component is mapped to a lognormal call holding time distribution usingpriority probabilities , we notice that the n independent and identically distributed ( . . )i i d call

holding time observations come from a finite mixture of k lognormal CHTcomponents as follows:

1 1

( ) ( ) ( | , )j

k k

j j k kj j

x x N x = =

= =∑ ∑ (7)

where are the component lognormal densities,

are the parameters, and are the component

weights satisfying . Therefore, the probability density function of the average CHTs

may be modelled by a mixture of k random variables of Gaussian densities on a logarithmic timescale:

21

1 21 1

2

2

( )1 1( ) [ exp ...

2 2

( )1exp ]

2k

kk k

Lnxx

x

Lnx

−= − +

−−

(8)

It is often assumed that the parametric MLE is to estimate the set of parameters θ for the densityof the samples that maximizes the likelihood function [15].Hence the likelihood of the lognormal CHT distribution becomes:

21

1 21 11

2

2

( )1 1[ exp ...

2 2

( )1exp

2

n

ii

kk

k k

LnxL

x

Lnx

=

−= − +

−−

∏(9)

The complete CHT data set (observed CHTs plus unobserved CHTs datasets ) existswith density where . When a many- to-one

mapping from z to is the unobserved component of the

origin of the n call holding times. Hence, it is clear that the indicator implies

Bernoulli random variable indicating that the CHT observation comes from

the exact lognormal distribution with parameter .


6

The parametric MLE problem is to estimate the set of Gaussian parameters for the lognormaldistribution that maximizes the likelihood function [15]. In this multivariate model case, thelikelihood function of this complete density for one data observation becomes

11 1

2

211

( ) ( , ) ( )

( )exp .

2 2

ij j

ij

n n k

i i j y iji i

yn k

j i j

i j jji

L z L x y I x

Lnx

x

== =

==

= = =

− −

∑∏ ∏

∑∏(10)

where the indicator function for individual i that comes from the lognormal component

with j and is zero elsewhere.

Since the logarithm is a convex increasing function, maximizing the likelihood is equivalent tomaximizing the log-likelihood, thus the likelihood (10) is updated in the logarithm form as log-likelihood function for the complete CHT data:

1

22

1 1

1 1

22

( ) [ 22

1( ) ]

2

1[ ( ) 2

2

1( ) ]

2

ij

ij

k

y j jj

n n

i i jji i

n k

y j j ii j

i jj

nLn L z I nLn nLn Ln

Lnx Lnx

I Ln Ln Lnx Ln

Lnx

=

= =

= =

= − −

− − −

= − + −

− −

∑

∑ ∑

∑∑(11)

where n ( . . )i i d andThe second component of the expected value of the complete CHT data log-likelihood is themarginal distribution of the unobserved CHT data on both the observed CHTdata and on the current estimates ( . The

weight is often assumed that .Therefore, we use Bayesian theorem tocompute the expression for the distribution of the unobserved CHT:

( ) ( 1 | ; , )

( ) ( )

( )( )

t

t tj j

t

tj

t tij j i ij i

t tj i j i

ki

ijj

r x p y x

x x

xx

= = =

= =

∑

(12)

where is simply the CHT lognormal distribution evaluated at and likelihood functiongiven which is similar to the expression in the right side of (9). Given , a mixture oflognormal distribution is computed for each and .

The expected value of the ”complete data ” log- likelihood would be theiterative process. It is formed as a function of the estimate θ from (11) and (12) where is thecurrent value at iteration


7

1 1

1 1 1 1

22

( , ) ( ) ( )

( ) ( ) [

1 12 ( ) ] ( )

2 2

j

k nt t

j i j ij i

k n k nt

j j i j i jj i j i

ti j j i

j

q Ln x x

Ln x Ln Lnx Ln

Ln Lnx x

= =

= = = =

=

= + − −

− − −

∑∑

∑∑ ∑∑ (13)

Therefore, taking derivatives with respect to these and equating them to zero, we definethe new estimates respectively as follows:

1. The weights of CHT lognormal components:To derive the expression for , we need the Lagrange multiplier λ with the constraint that

, then we can iteratively get the result:

1

( )n

tj i

t ij

x

n

==

∑(14)

2. The parameter of CHT lognormal components:

1 1

22

[ (

1 12 ( ) )] ( ) 0

2 2

k n

j i jjj i

ti j j i

j

dLn Lnx Ln

d

Ln Lnx x

= =

− −

− − − =

∑∑(15)

, it is clear that the parameter can be written in the form:

1

1

( )

( )

ntj i i

t ij n

tj i

i

x Lnx

x

=

=

=∑

∑(16)

3. The parameter of CHT lognormal components:

1 1

22

[ (

1 12 ( ) )] ( ) 0

2 2

k n

j i jjj i

ti j j i

j

dLn Lnx Ln

d

Ln Lnx x

= =

− −

− − − =

∑∑(17)

we get the result:

1

1

( )( )( )

( )

nt t t Tj i i j i j

t ij n

tj i

i

x Lnx Lnx

x

=

=

− −=∑

∑(18)


8

5. IMPLEMENTATION OF EM ALGORITHM FOR THE PROPOSED MODEL OF

MIXED LOGNORMAL DISTRIBUTIONS

The EM algorithm provides computational techniques for distributions that are almostcompletely unspecified 14], [16].

The algorithm performs the process with respect to the parametric part of this expected value ofthe completed CHT data log-likelihood by assuming the existence of the hidden variables andmaking a guess at the initial parameters of the distribution [15]. The EM algorithm iterativelymaximizes by the following two steps.

1. E-step: compute

E- step calculates the expected value of the "complete data" log likelihood from Equation (13)with respect to the unobserved call holding times for all and .Itmeans we calculate the expected value of the unobserved CHT data using the observedincomplete CHT data. Computing this expectation requires the posterior probability as inequation (12) and for the parameter value (14).

2. M-step: set

This step maximizes the expectation we performed in E-step. More specifically, M stepperforms the first part containing and the second part containing iteratively from (13).

We note that the parts are not related; we can maximize independently.

These two steps are iterated as necessary until the saturation value of the expected completelog-likelihood. At the saturation value, the M step performs the set of parameters , otherwisewe repeat the E-step for the next iteration.

Although the iteration increases the marginal log-likelihood function, the EM algorithm uses arandom restart approach to avoiding a local maximum of the observed data log-likelihoodfunction.

6. DATA EXPLORATION OF THE EXISTING SYSTEM

We explore the real data from a number of incoming bundled digital trunks that connect into themain edge port of IP PBX of EIN, the existing national public safety network .

The statistics indicate that the mean of CHT of Emergency ambulance, police, fire, and hazardsincoming calls are 61.16s ( 0.14cv = ), 42.56s ( cv = 0.22), 27.17s ( cv = 0.42), and 27.76s ( cv =0.58), respectively. The mean CHT data for all emergency incoming call samples measured is39.66s. This is a shorter call duration and less value of coefficient of variance cv compared to201s with cv = 1.23 in non-VoIP call center [13], 110s with cv = 2.7 in Taiwan-mobile [5], 113swith cv = 1.4 in public telephone system [4], 63.3s with cv = 2.91 in PAMR-PCS [10], 40.6swith cv = 1.7 in non VoIP cellular network [3].

From this comparison, emergency VoIP calls are considered more efficient time-wise than thoseof commercial networks. The ambulance customers keep longer holding call behavior than theother emergency customers, such as police, fire and hazards; 35s ambulance's mean CHT islonger than the fire's mean CHT.


9

5. RESULTS ON GENERAL STANDARD DISTRIBUTION MODELS

Before the proposed model performance of a mixture of lognormals using the set of empiricaldata obtained from the EIN, first, we proceed to compare the statistical fitting of the candidatestandard distributions (Exponential, Weibull, Lognormal, and Shifted Lognormal) on the basis ofthis data set of the existing system.

For determination of the best fit distribution function, a typical the significance resulting from theKolmogorov- Smirnov (K-S) goodness-of-fit test which is described by the modified K-Sdistance , where for the maximum difference between the fittingdistribution and the empirical cdf and the level of significance [12],[17]. After getting the set of significance resulting from the K-S goodness-of-fit test for allempirical data, the easiest way to make a decision validating which candidate is best is to plot thepdf of the set of statistics. The pdf gives us a rough but quick visualized result about the best-fitted distribution function, which fits best at the fixed significance level.

As a result of such graphs, the statistical fitting results of police, fire, and hazard's CHTs aresimilar to the ambulance's CHT pdf which is depicted in Fig.2. The pdf graphs except that of the

-10 0 10 20 30 40 50 60 70 80 900

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

Average Holding Time(secs)

Prob

abilit

y De

nsity

Weibull modelLognormal modelExp modelLognormal model

Figure 2. PDF of CHT of Emergency Ambulance Calls

ambulance CHT, along with their fitting function curves are not necessarily shown in detail inthis paper due to the proposed expected result with the model of a mixture of lognormaldistributions. Looking at the graphs and goodness-of-fit test results, contradicting what weexpected, all of these four distributions do not show the expected best fit for the empirical data.The reason is explained to be that the spikes and long-tails in the set of data histograms are notestimated by these standard distribution models (Figure 2).

However, the lognormal distribution among these four candidates is assumed to be the betterfitted distribution with D max value and with p value at the = 0.02 significance level underK-S, Chi-Sq, and A-D tests (see Table 1.). In this table, we summarized statistics for the fourdistributions for the emergency ambulance ("103") CHTs. The fitting statistics of the otheremergency CHTs are not shown in the paper.In addition to the spikes, the relative tail frequencies of the real data are much larger than thevalues of the Exponential, Weibull, Lognormal and Shifted Lognormal models. Therefore, wepropose a mixture of lognormal distributions instead of a single distribution next.However MLE does not simplify a solution of a mixture of lognormal distributions for callholding time. Therefore, the EM approach [14], a specialized approach designed for MLEproblems, is described for a mixture of lognormal distributions in this paper.


10

Table 1. “103” CHTs: Parameters of the candidate standarddistributions at significance level = 0.2.

8. RESULTS ON THE PROPOSED PARAMETRIC MIXTURE MODELS AND

MODEL VALIDATION

This part presents the illustrative examples to show how analytical computational resultsobtained in section III via the EM algorithm may be used to perform modeling CHTs of IP basedPSN.

When simulations of a mixture model are carried out as a result of the analytical approach insection III, we observe that a mixture of lognormal distributions represents significantimprovements to hold the spikes and as well to estimate tails of frequencies.

We believe that many other similar modeling problems in a network area can be approximatedby the proposed parametric approach in this paper.

In the performance, we performed 50-321 iterations to find reasonable fits of a mixture oflognormals. The EM algorithm fits a mixture of lognormal distributions to a given distribution,aiming to accurately capture spikes and tails over a finite interval between possible start CHTsand possible large CHTs of the empirical data.

As a model validation, pdf, cdf, and ccdf plots are depicted. Whereas pdf plots the probability ofoccurrence of the random variable under study, the cdf plots the probability that the randomvariable will not exceed specific values. More specifically, small values of the cdf reveal theaccuracy in tracking the head portion of the pdf, while small values of the ccdf reveal theaccuracy in tracking the tail portion of the pdf. These percentile-percentile plots (P-P plots) showhow well the rank statistics of a sample match a model distribution. We show that the CHTs ofPSN are modeled by a mixture of lognormal distributions, as well as that the CHTs have long-tailed behavior.

Model Validation Results on the Body Fitting

We start to look at how long a call conversation time into emergency ambulance incoming callservices in the PSN.

m=61.16cν=0.14

“D. max”K-S

“D. max”Chi-sq

p valueK-S

p valueChi-sq

ShiftedLognormalLog normal

Weibull

ShiftedLognormalLog normal

Weibull

0.0370.0380.067p val(A-D)

0.19360.19592.156

2.2322.2314.7

“scale”

0.14σ=0.15β=64.45

0.96310.960.4

“shape”

µ=4.1γ=4.1

α=8.64

0.94580.94590.0388

“loc”

-1.6300


11

Figure 3 shows the probability results of the distribution fittings, with the Lognormal-5 as thebest fitted distribution. The values of maxD = 0.0274377, 42.3 10 −= × , and componentparameters of the best fitted lognormal-5 are its main parameter estimates (see Table 2).

In a practical matter of models, it is always very difficult to fit the spikes and tails of frequencies.We demonstrate the mixture models of other emergency call types similar as the mixture of fivelognormal distributions of "103". We observe clearly that the ambulance CHT pdf has extremespikes around 56 and 66 seconds and long tails around more than 93 seconds, which shows whythe CHT is not accurately modeled by Lognormal, shifted Lognormal, Exponential, and Weibulldistributions, while a mixture of lognormal distributions can be used to approximate accuratelythe empirical data with the random spikes and long tails as a best fitted model (Fig. 3a).

The CHT distributions for emergency police ("102"), fire ("105"), and hazards ("101") are fittedquite well by five, five, and four lognormal distributions, respectively as a pdf fitting (Fig. 3).We described the MLE parameters of a mixture of lognormal distributions in each case. Table IIhas shown the parameters (all , ) of lognormal components with their maxD values throughthe K-S test and of the fitting of a mixture of lognormal distributions.

(a)

(b)Figure 3. Comparison of candidates and the proposed model for CHTs : (a). Ambulance (“103”) andPolice (“102”) CHTs: fitted with the fitting lognormal-5 curves (b). Fire (“101”) and Hazard (“105”)

CHTs: fitted with the fitting lognormal-5 and lognormal-4 curves

To hold the spikes and tails, the second fit corresponds to the Lognormal-3, which is not like thebest fitted Lognormal-5.

From equation (8), we express lognormal-5 model of the call holding time of "103" using itscomponent parameters, which fits better than the other candidate distributions based on its

maxD value and as follows:


12

This empirical data is fitted by lognormal-5 model accurately compared to lognormal-3 model(Figure 3(a)). As a detailed explanation, the best fitted lognormal-5 distribution is composed ofthe parameters of its component lognormals.(see Table 2).

Figure 3(b) shows the pdf of the most fitted lognormal-4 model for the emergency hazard CHTs.Its parameters are detailed in Table 2. The model validation P-P results show quite good fittingof the lognormal-4 model with maxD = 0.025939 compared to lognormal model with maxD = 0.06.The Lognormal distribution is fitted with the scale parameter =93.25, shape parameter =4.18,mean 29.3, standard deviation 19.8, and coefficient of variation cv =0.67.

Model Validation Results on the Tail Fitting

Emergency ambulance CHT pdf, which is derivative of the cdf; the highest probability intervalsare easily recognized as the peaks in the pdf in 3(a), while cdf value is compared to thedifference between empirical data and fitting models; the steeper the slope of the cdf, the higherthe probability of values.

Emergency Ambulance CHTs

Figure 4(a), 4(b) show the model validation P-P results with the best fitting of the lognormal-5model with maxD = 0.0274377 compared to lognormal model with maxD = 0.038. The Lognormaldistribution is fitted with the scale parameter =0.14, shape parameter =4.1, mean 61.2,standard deviation 8.8, and coefficient of variation cv =0.14.

30 40 50 60 70 80 90 1000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Call Holding Time (seconds)

Cum

ulat

ive

dist

ribut

ion

func

tion

Lognormal modelEmpirical dataLognormal-5 model

30 40 50 60 70 80 90 100-7

-6

-5

-4

-3

-2

-1

0

Holding time in seconds

CC

DF

LogNormal modelEmpirical dataLogNormal-5 model

(b) (b)Figure 4. Comparison of candidates and the proposed model for emergency ambulance CHTs: (a) The cdf

values with the small differences between empirical data and fitting models (b). The skewed- rightempirical histogram of emergency ambulance CHTs and its best fitting lognormal-5 curve

The top 40 seconds of empirical data, along with the best fitted lognormal-5 and fitted lognormaldistributions is the start time of the total ambulance CHTs, by average. The top 80 seconds of theempirical data, along with the fitted distributions above named account for more than 95 % of thetotal CHTs. Although the cdf fitting looks perfect, the discrepancy in the set of distributions isclear, while the tail values are not seen clearly well.


13

In a practical matter of difficulties of tails to distinguish for a model, to illustrate the tail fittingmore clearly, we test the ccdf that the large variability of the most fitted model is inherited by theapproximating lognormal-5 model, so that the best fit is done. Specifically, the fitting of the tailpart of frequencies presents that the empirical data fits accurately lognormal-5 distribution.(Fig.4(b)).

Result on Emergency Police CHTs

We present the best model with the tail fitting based on how long a call conversation time ofemergency policy incoming call stays in the PSN.

20 30 40 50 60 70 80 900

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


CD

F

LogNormal modelEmpirical dataLogNormal-5 model

(a) (b)Figure 5. Comparison of candidates and the proposed model for emergency police CHTs :(a). The cdf


We claim that the fitted lognormal-5 model is long-tailed model based on the observation of cdfand ccdf (Figure 5(a), 5(b)). From Figure 5(b), the top 25 seconds of empirical data, along withthe best fitted lognormal-5 and fitted lognormal distributions, is the start time of the totalambulance CHTs, by average. The top 65 seconds of the empirical data, along with the fitteddistributions, account for more than 95 % of the total CHTs. From the cdf, there is some clearview for the Lognormal distribution models of CHTs, but the extreme tail behavior isdistinguished in ccdf of long-tailed distributions.

Except the long tails and spikes, the lognormal model is a good match for the empirical data.Therefore, the empirical data fits a lognormal-5 accurately compared to the candidate Lognormalmodel. From the result, it can be seen that the advantage of modeling long tail patterns aslognormal is so that the multiplicative property holds, i.e. the product of independentlognormally distributed call holding times is itself lognormally distributed. If components of thecall holding times are lognormally distributed, then call holding time is also lognormallydistributed.

However, in this paper, the lognormal distribution does not adequately predict peak period callholding times of policy calls, while a mixture of lognormal distributions do adequateapproximation for the jagged distribution with random spikes and long tails for the peak periodcall holding time modeling.Therefore, the empirical data fits a lognormal-5 accurately compared to the candidate lognormalmodel. To examine tail behavior, we use the ccdf on log axis that shows that the divergence ofthe tail from the model is more accurately distributed by the lognormal-5 model (Figure 5(b)).

20 30 40 50 60 70 80 90-7

-6

-5

-4

-3

-2

-1

0

Holding time in

CCDF

LogNormalEmpirical dataLogNormal-5 model


14

Result on Emergency Fire CHTs

Figure 3 (a) shows how the probability density of the emergency fire incoming conversationholding time from the existing network distributes over a one week continuous measurementperiod and how its distribution model fits with best model, while Figure 6(a), Figure 6(b)presents the associated P-P plots for them. Lognormal-5 model provides good approximation tothe cdf in the range of probabilities from 0 to 1 (Figure 6(a)). The accuracy in tails is observed;long holding time probabilities correspond to small values of the pdf, while small values of thecomplementary cdf (ccdf) reveal the accuracy in tracking the tail portion of the pdf.

0 10 20 30 40 50 600

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


CD

F

Weibull modelEmpirical dataLogNormal-5 modelLogNormal model

0 10 20 30 40 50 60-6

-5

-4

-3

-2

-1

0


CC

DF


(a) (b)Figure 6. Comparison of candidates and the proposed model for Emergency Fire CHTs (a). The cdf


The less than 3 seconds of empirical average CHT data is the start time of the total fire CHTs,while the top 50 seconds of the empirical data, along with the fitted distributions, account formore than 85 % of the total CHTs.

The Weibull model is considered as a good matched model for the empirical data, except thelong tail and spikes. The Weibull distribution is fitted with the scale parameter =30.46, shapeparameter =2.4, the mean 27, standard deviation 11.9, and maxD =0.089. Due to thecomparably large scale parameter, it does not have a long-tailed behavior. It clearly cannot fitaccurately (Figure 6(b)).

Therefore, we present that the CHT of emergency incoming fire calls can be described bylognormal-5 distribution, which ultimately leads to the long tailed model of a emergency callduration.

Results on Emergency Hazard CHTs

Lognormal-4 model provides good approximation to the cdf in the range of probabilities from 0to 1 (Figure 7(a)). The accuracy in tails is observed; long holding time probabilities correspondto small values of the pdf, while small values of the complementary cdf (ccdf) reveal theaccuracy in tracking the tail portion of the pdf.


15

0 20 40 60 80 100 120 1400

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1


CDF


0 20 40 60 80 100 120 140-14

-12

-10

-8

-6

-4

-2

0


CCDF


(b) (b)Figure 7. Comparison of candidates and the proposed model for Emergency Hazard CHTs: (a). The cdf


The less than 10 seconds of empirical data, along with the best fitted lognormal-5 and fittedLognormal and Weibull distributions is the start time of the total fire CHTs, by average, whilethe top 50 seconds of the empirical data, along with the fitted distributions, account for morethan 91 % of the total CHTs.

Except the long tail and spikes, the Weibull model is a good match for the empirical data, whilethe lognormal model is far away the empirical hazard CHT data. The Weibull distribution isfitted with the scale parameter =32.41, shape parameter =1.95, the mean 28.7, standarddeviation 15.3, and maxD =0.118. The tail portion does not fit accurately by Lognormal modelcompared to the models of Lognormal-4 and Weibull (Figure 7(b)).


16

Table 2. Parameters of fitted mixture of lognormal distributions

Result Comparison

As a general workload computation, the analysis shows that among the total number ofemergency calls during the peak period, there is a big gap of the ambulance calls (49.26 %),police calls (45.02 %), fire calls (4.14 %), and hazard calls (1.59 %) in terms of the loads to thesystem performance.

The daily analysis shows the arrival process of emergency ambulance and police calls have aperiodic behavior with daily spikes which show the repeating frequency of everyday use.While emergency ambulance and police calls increase between 20PM and 24PM, fire callsincrease around morning 10AM and 11AM, and hazards calls have a flat rate. The ambulancecalls have a greater number of calls than the police calls during rush hours.Similar behavior of spikes and long-tailed with other classes of emergency CHTs is observed forthe emergency police CHT.

As a result of the detailed curve fitting computation, the EM based method is computed to be thebest algorithm to express a mixture of lognormal distributions.

By comparing the parameters in Table 1 and Table 2, we show that moving from Gamma familydistributions to a mixture of lognormals leads us to the more the accurate result and less error.As stated in section III, although the values of distance maxD in the Table 1 represent asuccessful outcome of measurements that the empirical CHT data may come from the Gammadistribution family, with respect to the all types of emergency incoming call holding times.However, we observe the peak period empirical data that the conversation time histograms havethe spikes and long tails. Thus, the mixture of lognormal distributions can be used toapproximate the empirical CHT data accurately.

CHTsD.max

“102”0.2172569

0.00013

“103”0.0374377

0.00023

“101”0.0364377

0.00023

“105”0.0259390.00016

µ1µ2µ3µ4µ5γ1γ2γ3γ4γ5π1π2π3π4π5

3.5902573.8237473.7015743.9685493.9283480.1842370.0178760.0509270.0897310.2539970.4212890.1156880.2207270.1000680.142225

4.0096604.1607924.1566124.1815884.0054970.011857

0.17883970.1028260.0179860.130829

0.07472360.1783270.410079

0.04415210.292717

2.7872043.1059663.5083553.3332184.0136810.3428120.0574090.1956270.239888

0.04141640.2372780.0600630.3153530.351031

0.0362751

2.8415442.8407513.1311993.703959

N/A0.2726280.0290380.2007540.355409

N/A0.1366570.0427470.4658860.354708

N/A


17

In Table 2, the parameters (all , ) of components with their maxD values are listed throughthe K-S test and of fitting a mixture of lognormal distributions. The values of maxD are 0.0217,0.0374, 0.0364 and 0.0259 for the police, ambulance, fire, and hazards CHTs, respectively. Theweights of component lognormal distributions are 0.0747236, 0.178327, 0.410079, 0.0441521,and 0.292717, respectively.

As described in the previous section, we should have a desired equation for the chosen CHTmodel by a mixture of $k$ random variables of Gaussian densities on a logarithmic time scale:

2

2

( )( ) exp

2 2k i k

k k

Lnxx

x

−

= − (19)

According (19), we would be able to obtain mathematical models that can accurately capture thespikes and heavy tails of call conversation time distribution of modern networks.

9. Conclusion

In summary, the research demonstrates that the mixture of lognormal distributions based on EMmethod presents the best statistical CHTs fitting model for the peak period analysis of PSN. Amixture of lognormal distributions with explicit analytical expressions is described for theincoming CHTs of PSN due to spikes of higher probability and long tails of lower probability inthe histogram.

The results contribute to the existing literature in two important ways: First, compared to thecandidate standard skewed - right distributions, the finite mixture lognormal modeling presents abest example of a statistically suitable method for modeling the distribution function when thelong tails and peak spikes randomly and frequently occurred in the frequencies of occurrence.Second, the result is important as it provides a useful mathematical tool in an explicit manner fora mixture of lognormal distributions. We emphasize the EM algorithm is a powerful method tomodel the CHTs of IP based PSNs.

ACKNOWLEDGEMENTS

Financial support from the International Fulbright Science and Technology PhD program ofUSA government is gratefully acknowledged.

REFERENCES

[1] A. M. Law, W. D. Kelton, Simulation Modeling and Analysis, New York: McGraw-Hill, 1991.[2] Al. Ajarmeh, J. Yu, M. Amezziane, “Framework for Modeling Call Holding Time for VoIP

Tandem Networks: Introducing the Call Cease Rate Function,” in Proc. GLOBECOM 2011,Chicago, IL, USA, 5-9 Dec. 2011, pp. 1- 6.

[3] F. Barcelo, J. Jordan, “Channel Holding Time Distribution in Cellular Telephony,” Elect. Letter, vol.34, no. 2, pp.146-147, 1998.

[4] V. A. Bolotin, “Modeling call holding time distributions for CCS network design and performanceanalysis,” IEEE J. Select. Areas Commun., vol. 12, no. 3, pp. 433-438, 1994.

[5] W-E Chen, H. Hung, Y. Lin, “Modeling VoIP Call Holding Times for Telecom,” IEEE Network, vol.21, no. 6, pp. 22-28, 2007

[6] Y. Fang and I. Chlamtic, “Teletraffic Analysis and Mobility Modeling of the PCS networks,” IEEETrans. on Communications, vol. 47, no.7, July 1999.


18

[7] C. Jedrzycky, V. C. M. Leung, “Probability distribution of channel holding time in cellular telephonysystem,” in Proc. IEEE Vech. Technol. Conf., Atlanta, GA, May. 1996, pp. 247-251.

[8] J. Jordan, F. Barcelo, “Statistical of Channel Occupancy in Trunked PAMR systems,” TeletrafficContributions for the Information Age (ITC’15), Elsevier Science, pp. 1169-1178, 1997.

[9] D. Sharp, N. Cackov., and L.Trajkovic, “Analysis of Public Safety Traffic on Trunked Land MobileRadio Systems,” IEEE J. Select. Areas Commun., vol. 22, no.7, pp. 1197-1202, 2004.

[10] F. Barcelò, J. Jordan, “Channel Holding Time Distribution in Public Telephony System (PAMR andPCS),” IEEE Transaction on Vechular Technology, vol. 49, no.5, pp.1615-1625, 2000.

[11] B. Vujicic, H. Chen, and Lj. Trajkovic, "Prediction of traffic in a public safety network," in Proc.IEEE Int. Symp. Circuits and Systems, Kos, Greece, May 2006, pp. 2637-2640.

[12] E. Chelbus, “Empirical validation of call holding time distribution in cellular communicationsystems,” Teletraffic Contributions of the Information Age, Elsevier Science, pp.1179 -1189, 1997.

[13] L. Brown, N. Gans., and L. Zhao, “Statistical Analysis of a Telephone Call Center: A Queueing -Science Perspective,” Journal of the American Statistical Association,vol. 100, no. 469, March.2005.

[14] Dempster AP, Laird NM, Rubin DB, “Maximum Likelihood from Incomplete Data Via the EMAlgorithm,” Journal of the Royal Statistical Society, no. 39. vol.1, pp.1-38, 1977.

[15] T. Benaglia, D. Chauveau, DR. Hunter, and DS. Young, “Mixtools: An R package for analyzingfinite mixture models,” Journal of Statistical Software, vol. 32, no. 6, pp.1-29, 2009.

[16] G. J. McLachlan, T. Krishnan, The EM Algorithm and Extensions, 2nd ed, John Wiley and Sons,New York, 2008.

[17] I. Tang, X. Wei, “Existence of maximum likelihood estimation for Three-parameter lognormalDistribution, reliability and safety,” in Proc. 8th International Conference ICRMS, 20-24 July, 2009.

[18] L. Eckhard, A. Werner, A. Markus, “Log-normal Distributions across the Sciences: Keys and Clues,”BioScience, vol. 51, no. 5, pp. 341-351, May. 2001.

[19] R. Vernic, S. Teodorescu, E. Pelican, “Two Lognormal Models for Real Data,” Annals of OvidiusUniversity, Series Mathematics, vol. 17, no. 3, pp. 263-277, 2009.

[20] J.A.Bilmes, “A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation forGaussian Mixture and Hiden Markov Models,” International Computer Science Institute, Berkeley,California, April, 1998.

Author

Ms. Tuyatsetseg Badarch received both a Bachelor and Master’s degree from the Mongolian Scienceand Technology University in 1996 and 1998, respectively as well as a postgraduate degree from theSpace Science and Technology Center in India in 2005. She received prestigious government scholarshipssuch as Mongolia, China, India, Taiwan, as well as USA government Fulbright S&T PhD scholarship.She studied as a researcher Beijing University of Post and Telecommunications in China from 1998 to2000 and served as researcher at National Tsinhua University in Taiwan from 2007 to 2008. She workedas lecturer and senior lecturer at the leading universities of Mongolia: Mongolian University of Scienceand Technology (MUST) and National University of Mongolia (NUM) from 1996 to now, and she is withthe Department of Electrical and Computer Engineering, Northeastern University, USA, on leave fromthe NUM. She is focused on the modeling network traffic, traffic distribution phenomena, networkresource allocation, advanced computational analysis and performance.

Dr. Otgonbayar Bataa is Professor and Head of the Department of Wireless Communications, Schoolof Information and Communications Technology (SICT), Mongolian University of Science andTechnology (MUST), Mongolia. She is the author of over many journals, as well as numerous proceedingpapers of highly ranked international conferences, technical papers, and projects. Her research focuses onthe performance analysis and design of advanced mobile and wireless systems, as well as network trafficmodels.