
Smoothing Parameter Selection for Nonparametric Density Estimation for Length-biased Data: A Bayesian Perspective

Yogendra P. Chaubey

Department of Mathematics and Statistics, Concordia University, Montreal, Canada H3G 1M8

E-mail: yogen.chaubey@concordia.ca

∗ Based on joint work with Jun Li, Nanjing Audit University, Nanjing, PRC, and Isha Dewan, Indian Statistical Institute, New Delhi, India

Talk to be presented at the IASSL-2014 Conference, Colombo, Sri Lanka, December 28-30, 2014

Yogendra Chaubey (Concordia University) Department of Mathematics & Statistics December 28-30, 2014 1 / 45

Abstract

Nonparametric estimation of densities defined over non-negative observations using asymmetric kernels is of special interest, as it has the potential to remove the spill-over effect at the boundary. One important problem in this context is the selection of the smoothing parameter. The purpose of this talk is to review some recent work on the application of a Bayes criterion for this purpose and to investigate its application in the context of length-biased data.


Outline

1 Introduction

2 Asymmetric Kernel Density Estimators

3 Bayesian Estimation of h

4 Optimal Selection of Parameters α and β

5 Numerical Studies


1. Introduction

For a given random sample X_1, ..., X_n from some unknown density f(x), the popular "kernel method" provides the following density estimator:

$$f_n(x) = \frac{1}{n}\sum_{i=1}^n k_h(x - X_i) = \frac{1}{nh}\sum_{i=1}^n k\!\left(\frac{x - X_i}{h}\right), \qquad (1.1)$$

where the function k(·), called the kernel function, is basically a symmetric probability density function on the entire real line, i.e.

$$\text{(i) } k(-x) = k(x), \qquad \text{(ii) } \int_{-\infty}^{\infty} k(x)\,dx = 1,$$

and

$$k_h(x) = \frac{1}{h}\, k\!\left(\frac{x}{h}\right).$$
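As a quick illustration (added here, not part of the original slides), estimator (1.1) with a Gaussian kernel can be sketched in a few lines; the sample and bandwidth below are arbitrary choices for the example.

```python
import numpy as np

def kde(x, data, h):
    """Kernel density estimator (1.1) with a standard Gaussian kernel k."""
    k = lambda u: np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return float(np.mean(k((x - data) / h)) / h)

rng = np.random.default_rng(0)
sample = rng.normal(size=1000)
# At x = 0 the estimate should be close to the standard normal density there.
print(kde(0.0, sample, h=0.3))
```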

Introduction

h is known as the bandwidth; it is also commonly known as the smoothing parameter.

Its choice is an important one, as it exhibits a strong influence on the estimator: a very small value of h may lead to a very rugged estimator (with many modes), whereas a very large value may produce an overly smooth estimator and hide some important features of the data.

For asymptotic analysis, h is made to depend on n, i.e. h ≡ h_n, such that h_n → 0 and n h_n → ∞ as n → ∞.


Kernel Density Estimators

Source:

http://en.wikipedia.org/wiki/Kernel_density_estimation

Figure: 1. Kernel density estimate (KDE) with different bandwidths of a random sample of 100 points from a standard normal distribution. Grey: true density (standard normal). Red: KDE with h = 0.05. Black: KDE with h = 0.337. Green: KDE with h = 2.


Kernel Density Estimators for Suicide Data

A problem with the general kernel density estimator in the case of non-negative data is that it may assign positive mass to the zero-probability region; see the next figure. [x: days in hospital under treatment, for 86 treatment spells.]

[Figure omitted: kernel density estimates of the suicide data over x ∈ (0, 600), with density values up to about 0.006, for four bandwidth selectors: Default, SJ, UCV, BCV.]

Figure: 2. Kernel Density Estimators for Suicide Data [Silverman (1986)]


Asymmetric Kernel Estimators

In order to overcome this defect, many new methods for estimating the underlying density of non-negative random variables have been proposed, particularly in recent years.

Bagai and Prakasa Rao (1995) proposed replacing the symmetric kernel k by a pdf k* with non-negative support.

This method avoids the problem of positive mass in the negative region; however, only the first r order statistics are used for estimating f(x), where X_(r) < x ≤ X_(r+1), X_(i) denoting the ith order statistic, which is an undesirable feature.

Chaubey and Sen (1996) proposed smoothing based on Poisson weights using an approximation lemma due to Feller (1966); this has been adapted to the length-biased set-up by Chaubey, Sen and Li (2010).

This method uses the whole data, in contrast to the method of Bagai and Prakasa Rao (1995).


Asymmetric Kernel Estimators

A class of asymmetric kernel estimators was proposed by Chen (2000) in the i.i.d. setup, using gamma kernels.

Similar estimators were proposed by Scaillet (2004) using inverse Gaussian and reciprocal inverse Gaussian densities.

Another class of asymmetric kernel estimators has been proposed recently by Chaubey et al. (2012), motivated by a generalization of the estimator in Chaubey and Sen (1996).


Biased Data

In many applications the recorded observation may be assumed to have a probability density function g(x) of the form

$$g(x) = \mu_w^{-1}\, w(x)\, f(x), \quad x \in \mathbb{R}^+, \qquad (1.2)$$

where f(x) is the original density and w(x) is a non-negative known function called the weighting function, with

$$\mu_w^{-1} = E_g\!\left(\frac{1}{w(X)}\right), \qquad (1.3)$$

where X ∼ g(·).

Patil and Rao (1977) cite several examples, including those generated by PPS (probability proportional to size) sampling schemes (common in sample surveys), damage models and sub-sampling [see also Rao (1977) and Patil and Rao (1978)].

In such applications, the common choices for w(x) are (i) w(x) = x and (ii) w(x) = x², known as length-biased and area-biased data respectively.


Length Biased Data

Here, we concentrate on the case w(x) = x, a situation known as giving length-biased data, where typically the observations are non-negative.

Thus we consider a random sample {X_1, ..., X_n} of n nonnegative independent and identically distributed (i.i.d.) random variables (r.v.) having a continuous probability density function (pdf) g(x), x ∈ ℝ⁺ = [0, ∞), given by

$$g(x) = \mu^{-1}\, x\, f(x), \quad x \in \mathbb{R}^+, \qquad (1.4)$$

where µ = 1/E_g[1/X] is the harmonic mean of X with respect to the density g(·); estimation of f(x), x ∈ ℝ⁺, itself is of central importance. Here it is tacitly assumed that 0 < µ < ∞.
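As a small numerical check (an illustration added here, not from the talk): if f is the Gamma(2, 1) density, then g(x) ∝ x f(x) is Gamma(3, 1) and µ = E_f[X] = 2, so the harmonic-mean estimator computed from the biased sample should recover µ.

```python
import numpy as np

rng = np.random.default_rng(1)
# Length-biased sampling: f = Gamma(2, 1) has mean mu = 2, and
# g(x) = x f(x) / mu is the Gamma(3, 1) density, so we can sample g directly.
x = rng.gamma(shape=3.0, scale=1.0, size=5000)

# Harmonic-mean estimator of mu: 1 / mean(1/X_i) estimates 1 / E_g[1/X] = mu.
mu_hat = 1.0 / np.mean(1.0 / x)
print(mu_hat)  # should be close to 2
```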


Density Estimation

Bhattacharyya et al. (1988) studied the kernel density estimator for f(x) obtained by using the corresponding estimator for g(x) and the relation (1.4), replacing the unknown value µ by its harmonic mean estimator as proposed by Cox (1969).

Cox (1969) also gave a direct estimator of F(x), which has been used by Jones (1991) in proposing an alternative density estimator; however, the density estimator in the background used symmetric kernels.

The problem of estimating f(x) near x = 0 is of concern, as the kernel estimator may have serious bias near x = 0. This problem has long been recognized in density estimation in the context of i.i.d. data [see Silverman (1986)]; however, it becomes more pertinent for length-biased data, as the observations are necessarily non-negative.


Density Estimation

In a more recent paper, Chaubey and Li (2013) adapted the smoothing estimator for the i.i.d. setup developed in Chaubey et al. (2012) to length-biased density estimation and proposed two new asymmetric kernel estimators, one based on the usual empirical distribution function and the other based on Cox's estimator.

Here also, determination of the smoothing parameter h, vis-à-vis the asymmetric kernel, is important.

Traditionally, the smoothing parameter h is obtained from the data by minimizing an estimate of a global measure of error, the mean integrated squared error:

$$\mathrm{MISE}(\hat f) = E\left[\int (\hat f(x) - f(x))^2\, dx\right].$$


Density Estimation

Kulasekera and Padgett (2006) motivate the asymmetric kernel estimators, although in the context of censored data, by the general form of the nonparametric density estimator given by

$$\hat f(x) = \int k(x; u, h)\, dF_n(u). \qquad (1.5)$$

In that paper the following inverse Gaussian (IG) choice of the kernel (with mean u and dispersion parameter 1/h) was found useful:

$$k_{KP}(x; u, h) = \frac{1}{\sqrt{2\pi h x^3}}\, \exp\!\left\{-\frac{1}{2h}\,\frac{(x-u)^2}{u^2 x}\right\}. \qquad (1.6)$$

Kulasekera and Padgett (2006) note that the estimator f̂ based on the above kernel is a true density, while the one using the reciprocal IG is not. Yet, in these methods, selection of the smoothing parameter may be numerically challenging. Here we explore the idea of using a Bayesian method of smoothing parameter selection, inspired by the paper of Kulasekera and Padgett (2006).


Density Estimation

In this approach, we contrast some alternative choices of asymmetric kernels.

Along with some preliminary notions, the proposed smooth estimators of f(·) are given in Section 2.

Section 3 explains the Bayesian method for determining the smoothing parameter, along with various alternative choices of asymmetric kernels.

Section 4 is devoted to a discussion of determining the parameters of the prior used, and Section 5 presents a numerical study contrasting the use of different kernels.


2. Asymmetric Kernel Density Estimators

There are basically two strategies for obtaining a smooth estimator of f(x). One is based on a smooth version of Cox's estimator F_n(x), given below:

$$F_n(x) = \frac{\sum_{i=1}^n \frac{1}{X_i}\, I\{X_i \le x\}}{\sum_{i=1}^n \frac{1}{X_i}}. \qquad (2.1)$$

The other is to first estimate g(x), obtained as the derivative of a smooth version of the empirical distribution function

$$G_n(x) = \frac{1}{n}\sum_{i=1}^n I\{X_i \le x\}, \qquad (2.2)$$

and then make some adjustments to obtain the estimator of the underlying density f(x). Chaubey and Li (2013) and Jones (1991) found that the density estimators based on smoothing of F_n(x) have better performance. Thus the density estimators we consider here are based on F_n.
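A minimal sketch of Cox's estimator (2.1) (illustrative code added here, with an arbitrary Gamma test case): each observation is down-weighted by 1/X_i, which undoes the length bias.

```python
import numpy as np

def cox_edf(x, data):
    """Cox's weighted empirical distribution function (2.1)."""
    w = 1.0 / data                      # weights proportional to 1/X_i
    return float(np.sum(w * (data <= x)) / np.sum(w))

rng = np.random.default_rng(2)
data = rng.gamma(shape=3.0, scale=1.0, size=4000)  # length-biased Gamma(2,1) data
# F_n estimates the cdf of the unbiased density f = Gamma(2, 1);
# the true value at x = 2 is 1 - 3 exp(-2), about 0.594.
print(cox_edf(2.0, data))
```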


2. Asymmetric Kernel Density Estimators

These estimators are of the form

$$\hat f_h(x) = \sum_{i=1}^n s_i\, k(x; X_i, h), \qquad (2.3)$$

where

$$s_i = \frac{1/X_i}{\sum_j (1/X_j)}, \qquad (2.4)$$

k is a kernel function and h is a smoothing parameter.

This expression is motivated by the general form of the nonparametric density estimators given by (1.5).

In this talk, we explore an alternative form of k(x; u, h) following Chaubey and Li (2013), where k(x; u, h) is required to be a density with respect to the argument u.


2. Asymmetric Kernel Density Estimators

The smooth density estimator is derived from the derivative of the following smooth estimator of the distribution function F(x):

$$\tilde F_n(x) = \int_0^\infty F_n(t)\, dQ_{h_n}(t/x) \qquad (2.5)$$

$$= 1 - \frac{\sum_{i=1}^n \frac{1}{X_i}\, Q_{h_n}\!\left(\frac{X_i}{x}\right)}{\sum_{i=1}^n \frac{1}{X_i}}, \qquad (2.6)$$

where Q_h(·) is a distribution function with mean 1 and variance h². The corresponding smooth estimator of f, obtained by differentiating F̃_n(x), is given by

$$\tilde f_n(x) = \frac{\frac{1}{x^2}\sum_{i=1}^n q_{h_n}\!\left(\frac{X_i}{x}\right)}{\sum_{i=1}^n \frac{1}{X_i}}, \qquad (2.7)$$

where q_{h_n}(u) = (d/du)\, Q_{h_n}(u). This estimator is of the same form as (2.3), with k(x; u, h) given by

$$k(x; u, h) = \frac{u}{x^2}\, q_h\!\left(\frac{u}{x}\right). \qquad (2.8)$$


2. Asymmetric Kernel Density Estimators

As noted in Chaubey et al. (2012), (2.7) may not be defined at x = 0, except in cases where lim_{x→0} f̃_n(x) exists. Modifications have been discussed in Chaubey and Li (2013) [see also Chaubey, Sen and Sen (2007)]. However, this will not be pursued here, and we may think of the methods discussed here as estimating f(x) for x > 0 only.

The following two alternative choices of k(x; u, h) will be investigated here.

The first is obtained by choosing q_h(·) to be the inverse Gaussian density with parameters µ = 1 and λ = 1/h, i.e.

$$q_h(u) = \sqrt{\frac{1}{2\pi h u^3}}\, \exp\!\left\{-\frac{1}{2hu}(u-1)^2\right\}.$$

This provides the following choice of k(x; u, h):

$$k_{CL}(x; u, h) = \frac{u}{x^2}\, q_h\!\left(\frac{u}{x}\right) = \frac{1}{\sqrt{2\pi h u x}}\, \exp\!\left\{-\frac{1}{2hux}(x-u)^2\right\}. \qquad (2.9)$$


Asymmetric Kernels

k_CL represents a density with respect to the argument x as well as with respect to the argument u; for fixed x, k_CL(x; u, h) is the density of IG(µ, λ) with µ = x, λ = x/h, and for fixed u it represents the density of a reciprocal IG with µ = 1/u and λ = 1/(hu).

Thus u is the mode of k_CL for fixed x, as opposed to the mean in the case of k_KP.

The second choice is obtained from the density of the RRIG (or co-Gaussian, as suggested by Professor Mudholkar recently) that has mode 1, i.e. we take

$$q_h(u) = \sqrt{\frac{2}{\pi h}}\, \exp\!\left\{-\frac{1}{2h}\left(u - \frac{1}{u}\right)^2\right\}. \qquad (2.10)$$

This gives

$$k_{CG}(x; u, h) = \frac{u}{x^2}\, q_h(u/x) = \sqrt{\frac{2}{\pi h}}\, \frac{u}{x^2}\, \exp\!\left\{-\frac{1}{2hx^2}\left(u - \frac{x^2}{u}\right)^2\right\}. \qquad (2.11)$$


Strong Consistency of fh(x)

The next theorem establishes strong consistency of f̂_h using the CL kernel, though we believe that it will also hold for the other kernels considered here.

Theorem

If f(x) is Lipschitz continuous on [0, ∞) and $\int_0^\infty \frac{1}{x}\, f(x)\, dx < \infty$, then as $n \to \infty$, with $h \to 0$ and $nh(\log n)^{-(1+\theta)} \to \infty$ ($\theta > 0$),

$$|\hat f_h(x) - f(x)| \to 0 \quad \text{a.s.}$$

for any x > 0.

Remark 2.1:

A similar theorem is offered in Kulasekera and Padgett (2006); however, the only condition on h they assumed is that it goes to zero. We believe that the condition $nh(\log n)^{-(1+\theta)} \to \infty$ imposed here is essential, as otherwise the absolute value under the integral sign is not properly accounted for.


3. Bayesian Estimation of h

Consider the approximation f_h(x) for f(x):

$$f_h(x) = \int k(x; t, h)\, dF(t),$$

and let ξ(h) represent a prior on h; then the posterior of h at the point x is given by

$$\xi(h|x) = \frac{f_h(x)\,\xi(h)}{\int f_h(x)\,\xi(h)\, dh}.$$

In the spirit of empirical Bayes methodology, using the estimate f̂_h(x) in place of f_h(x), the posterior of h based on the data X may be estimated by

$$\hat\xi(h|x, \mathcal{X}) = \frac{\hat f_h(x)\,\xi(h)}{\int \hat f_h(x)\,\xi(h)\, dh}.$$


3. Bayesian Estimation of h

Then, under squared error loss, the Bayes estimator of h is given by

$$\hat h = \frac{\int h\, \hat f_h(x)\,\xi(h)\, dh}{\int \hat f_h(x)\,\xi(h)\, dh} = \frac{\sum s_i \int h\, k(x; X_i, h)\,\xi(h)\, dh}{\sum s_i \int k(x; X_i, h)\,\xi(h)\, dh}.$$

We follow Kulasekera and Padgett (2006) to obtain explicit expressions for the Bayes estimator of h under the two choices of kernels mentioned earlier. We will employ the inverted gamma prior with parameters α and β, i.e.

$$\xi(h) = \frac{1}{\beta^\alpha \Gamma(\alpha)}\, h^{-(\alpha+1)}\, e^{-\frac{1}{\beta h}}.$$


KP-Kernel

In this case, the (estimated) posterior density is given by

$$\hat\xi(h|x, X_1, \ldots, X_n) = \frac{\sum_{i=1}^n (s_i/h^{\alpha^*+1})\, e^{-1/(\beta_i^* h)}}{\Gamma(\alpha^*) \sum_{i=1}^n s_i\, (\beta_i^*)^{\alpha^*}}, \qquad (3.1)$$

where α* = α + 1/2 for α > 1/2, and

$$\beta_i^* = \left[\frac{1}{\beta} + \frac{(x - X_i)^2}{2 x X_i^2}\right]^{-1}.$$

Then the Bayes estimator of h is given by

$$\hat h(x) = \frac{\sum s_i\, (\beta_i^*)^{\alpha^*-1}}{(\alpha^* - 1) \sum s_i\, (\beta_i^*)^{\alpha^*}}. \qquad (3.2)$$
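A direct transcription of (3.2) (illustrative; the data, evaluation point x, and prior parameters α, β below are arbitrary choices):

```python
import numpy as np

def h_bayes_kp(x, data, alpha=2.0, beta=5.0):
    """Local Bayes bandwidth (3.2) for the KP kernel under the
    inverted gamma prior (requires alpha > 1/2)."""
    w = 1.0 / data
    s = w / w.sum()                                   # weights s_i of (2.4)
    a_star = alpha + 0.5
    b_star = 1.0 / (1.0 / beta + (x - data) ** 2 / (2.0 * x * data ** 2))
    num = np.sum(s * b_star ** (a_star - 1.0))
    den = (a_star - 1.0) * np.sum(s * b_star ** a_star)
    return float(num / den)

rng = np.random.default_rng(4)
data = rng.gamma(shape=3.0, scale=1.0, size=500)
print(h_bayes_kp(1.0, data))  # a positive, point-dependent bandwidth
```

Since ĥ depends on x, this is a local bandwidth choice; the CL version (3.6) differs only in the extra 1/X_i^{1/2} factor and in the form of β_i.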


KP-Kernel

If we were to use the improper prior

$$\xi(h) \propto \frac{1}{h^2}, \qquad (3.3)$$

then the (proper) posterior density can be obtained as

$$\hat\xi(h|x, X_1, \ldots, X_n) = \frac{2}{\sqrt{\pi}}\, \frac{\sum_{i=1}^n s_i\, h^{-5/2}\, e^{-1/(\beta_i^{**} h)}}{\sum_{i=1}^n s_i\, (\beta_i^{**})^{3/2}},$$

and the resulting estimator of h is given by

$$\hat h(x) = 2\, \frac{\sum s_i\, (\beta_i^{**})^{1/2}}{\sum s_i\, (\beta_i^{**})^{3/2}}, \qquad (3.4)$$

where

$$\beta_i^{**} = \left[\frac{(x - X_i)^2}{2 x X_i^2}\right]^{-1}.$$


CL-Kernel

In this case, the (estimated) posterior density using the inverted gamma prior is given by

$$\hat\xi(h|x, X_1, \ldots, X_n) = \frac{\sum_{i=1}^n (s_i/X_i^{1/2})\, h^{-(\alpha^*+1)}\, e^{-1/(\beta_i h)}}{\Gamma(\alpha^*) \sum_{i=1}^n (s_i/X_i^{1/2})\, (\beta_i)^{\alpha^*}}, \qquad (3.5)$$

where

$$\beta_i = \left[\frac{1}{\beta} + \frac{(x - X_i)^2}{2 x X_i}\right]^{-1}.$$

This results in the following estimator of h:

$$\hat h(x) = \frac{\sum (s_i/X_i^{1/2})\, \beta_i^{\alpha^*-1}}{(\alpha^* - 1) \sum (s_i/X_i^{1/2})\, \beta_i^{\alpha^*}}. \qquad (3.6)$$


CL-Kernel

For the improper prior given in Eq. (3.3), the estimator of h is given by

$$\hat h(x) = 2\, \frac{\sum (s_i/X_i^{1/2})\, (\beta_i^*)^{1/2}}{\sum (s_i/X_i^{1/2})\, (\beta_i^*)^{3/2}}, \qquad (3.7)$$

where

$$\beta_i^* = \left[\frac{(x - X_i)^2}{2 x X_i}\right]^{-1}.$$


Co-Gaussian Kernel:

Using the co-Gaussian kernel k_CG(x; u, h), the Bayes estimator of h under the inverted gamma prior is given by

$$\hat h(x) = \frac{\sum s_i X_i\, (\beta_i^*)^{\alpha^*-1}}{(\alpha^* - 1) \sum s_i X_i\, (\beta_i^*)^{\alpha^*}}, \qquad (3.8)$$

where

$$\beta_i^* = \left[\frac{1}{\beta} + \frac{(X_i - x^2/X_i)^2}{2 x^2}\right]^{-1}.$$

The estimator with the improper prior is given by

$$\hat h(x) = 2\, \frac{\sum s_i X_i\, (\beta_i^{**})^{1/2}}{\sum s_i X_i\, (\beta_i^{**})^{3/2}}, \qquad (3.9)$$

where

$$\beta_i^{**} = \left[\frac{(X_i - x^2/X_i)^2}{2 x^2}\right]^{-1}.$$


CL/KP Estimators of the χ²(4) density

The following figure gives a qualitative idea of the nature of the two estimators; the CL kernel seems to have an edge over the KP kernel.

[Figure omitted: the true density and the two smooth estimates plotted over x ∈ (0, 30), with f̂_h(x) values up to about 0.30; legend: True Density, KP Kernel, CL Kernel.]

Figure: 3. Two Smooth Estimators of the Length-biased χ²(4) density, n = 500.


The choice of the parameters α and β for the two kernels may be quite different; for example, in this case we chose α = 0.8, β = 7 for the KP kernel and α = 2, β = 5 for the CL kernel.

Kulasekera and Padgett (2006) choose a value of β > 3 and then choose α so that the average value of h calculated over a range of values equals 1/((α − 1)β).

We take the direct approach of choosing α and β so that the estimated MISE is minimized. This is achieved by minimizing the unbiased cross-validation criterion, as explained in the next section.


Optimal selection of parameters α and β

Here we describe the process of selecting h by developing a data-driven selection of the parameters α and β. One popular criterion measuring the accuracy of a density estimator f̂(x) relative to the true density f(x) is

$$\mathrm{ISE}(\hat f, f) = \int_0^\infty [\hat f(x) - f(x)]^2\, dx = \int_0^\infty \hat f^2(x)\, dx - 2 \int_0^\infty \hat f(x) f(x)\, dx + \int_0^\infty f^2(x)\, dx, \qquad (4.1)$$

which is referred to as the integrated squared error (ISE).

An optimal choice of bandwidth could be obtained by making the value of (4.1) as small as possible.

However, since the true density f(x) is unknown, (4.1) is not a practical objective function for optimization, so some modifications must be made.


UCV Criterion:

Ignoring the last (constant) term in (4.1) and replacing the second term with a leave-one-out estimator, an objective function for practical optimization arises as an approximation of (4.1); it is called the unbiased cross-validation (UCV) criterion.

In the length-biased scheme, the objective function is given by

$$\mathrm{UCV} = \int_0^\infty \hat f^2(x)\, dx - 2 \sum_{i=1}^n \hat f_{-i}(X_i)/Z_i, \qquad (4.2)$$

where f̂_{-i}(x) represents the density estimator built on the data set excluding X_i, and Z_i = Σ_{j≠i} X_i/X_j [see Wu and Mao (1996) and Chaubey et al. (2012)].
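A brute-force sketch of (4.2) (an illustration added here, using the CL kernel with a fixed global h in place of the local Bayes bandwidth; the grid and sample are arbitrary):

```python
import numpy as np

def f_hat_cl(x, data, h):
    """CL-kernel estimator (2.3)/(2.9) at a scalar point x."""
    w = 1.0 / data
    s = w / w.sum()
    k = np.exp(-(x - data) ** 2 / (2.0 * h * data * x)) \
        / np.sqrt(2.0 * np.pi * h * data * x)
    return float(np.sum(s * k))

def ucv(data, h, grid):
    """UCV criterion (4.2) for a fixed bandwidth h."""
    # First term: trapezoidal integral of f_hat^2 over the grid.
    vals = np.array([f_hat_cl(x, data, h) ** 2 for x in grid])
    term1 = np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(grid))
    # Second term: leave-one-out estimates, weighted by 1/Z_i.
    term2 = 0.0
    for i in range(len(data)):
        rest = np.delete(data, i)
        z_i = np.sum(data[i] / rest)    # Z_i = sum_{j != i} X_i / X_j
        term2 += f_hat_cl(data[i], rest, h) / z_i
    return term1 - 2.0 * term2

rng = np.random.default_rng(5)
data = rng.gamma(shape=3.0, scale=1.0, size=100)
grid = np.linspace(0.05, 12.0, 240)
# Compare two candidate bandwidths; the smaller UCV value is preferred.
print(ucv(data, 0.05, grid), ucv(data, 0.5, grid))
```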


Choice of α and β

Note that the Bayesian method gives a local bandwidth choice, as ĥ depends on x.

After the local bandwidth ĥ is plugged into (4.2), the UCV becomes a function of the parameters α and β.

The optimum choices of α and β are then obtained by minimizing the UCV function (4.2), which leads to the optimal selection of the Bayesian bandwidth under criterion (4.1). The following plot is an example of the UCV function with independent variables α and β. The surface of the UCV function is convex (concave up) near the optimum, which means that a minimum of the UCV function exists.


[Figure omitted: surface plot of the UCV function over α ∈ [2.5, 5.0] and β ∈ [3, 8], with UCV values ranging from about −0.665 to −0.655.]

Figure: 4. Plot of UCV function.


5. Numerical Studies

We compare the KP estimator with the CL and CG estimators based on a simulation study that considers a class of Weibull densities:

$$f(x) = \theta x^{\theta-1} \exp\{-x^\theta\}\, I\{x \ge 0\}$$

with θ = 0.5, 1, 1.5; these densities are plotted in the following figure.

[Figure omitted: the three Weibull densities plotted over x ∈ (0, 6), with values up to 1.0; legend: θ = 0.5, θ = 1.0, θ = 1.5.]

Figure: 5. Plot of various underlying pdf's


We use the UCV method mentioned earlier to choose the global parameters α and β in the KP, CL and CG priors. We then compare their performance based on the ratios R_KP,CL and R_KP,CG, where

$$R_{KP,CL} = \frac{\mathrm{EMSE}(\hat f_{KP}(x))}{\mathrm{EMSE}(\hat f_{CL}(x))} \quad \text{and} \quad R_{KP,CG} = \frac{\mathrm{EMSE}(\hat f_{KP}(x))}{\mathrm{EMSE}(\hat f_{CG}(x))}$$

for a range of values of x; here the estimated MSE (EMSE) for a density estimator f̂(x) is defined as

$$\mathrm{EMSE}(\hat f(x)) = \frac{1}{N} \sum_{i=1}^N [\hat f^{(i)}(x) - f(x)]^2,$$

f being the true density function, f̂^{(i)} the estimate from the ith simulated sample, and N the number of simulations. Here we use 500 repetitions to estimate the MSE. The ratios R_KP,CL and R_KP,CG are plotted over a range of x-values for different values of θ in Figure 6.
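The EMSE computation itself is straightforward; the replicates below are synthetic stand-ins (not the talk's simulation), added only to show how the ratio is formed:

```python
import numpy as np

def emse(estimates, true_value):
    """Estimated MSE at a point: mean squared error over N replicates."""
    estimates = np.asarray(estimates, dtype=float)
    return float(np.mean((estimates - true_value) ** 2))

rng = np.random.default_rng(6)
true_f = 0.27                                       # a hypothetical true density value
est_kp = true_f + rng.normal(scale=0.02, size=500)  # stand-in replicates of f_KP(x)
est_cl = true_f + rng.normal(scale=0.01, size=500)  # stand-in replicates of f_CL(x)
ratio = emse(est_kp, true_f) / emse(est_cl, true_f)
print(ratio)  # a ratio > 1 here favours the second estimator at this x
```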


[Figure omitted: six panels of the ratios plotted over x ∈ (0.5, 2.0); solid line: n = 50, dotted line: n = 200. (a) R_KP,CL for θ = 0.5; (b) R_KP,CL for θ = 1.0; (c) R_KP,CL for θ = 1.5; (d) R_KP,CG for θ = 0.3; (e) R_KP,CG for θ = 1.0; (f) R_KP,CG for θ = 1.5.]

Figure: 6. Plots of the Ratios R_KP,CL and R_KP,CG for various values of θ.


Conclusion and Discussion

For smaller values of x, as well as in the tails, the CL kernel outperforms the KP kernel; the KP kernel may have a slight advantage over the CL kernel in the middle range, but this advantage disappears for larger values of n.

The co-Gaussian kernel may be slightly better than the KP kernel only for larger sample sizes and large values of x. The co-Gaussian kernel was later dropped from further investigation after realizing that the resulting k_CG(x; u, h) may be a density neither in x nor in u. However, when the mean of the generating kernel q_h(u) is close to one [which would be true for small values of h], q_h(u) is a density with approximate mean 1, and the CG kernel will behave as a density in u with mean x.

The co-Gaussian kernel with mode x and dispersion h, that is,

$$k(x; u, h) = \sqrt{\frac{2}{\pi h}}\, \exp\!\left\{-\frac{1}{2h}\left(\frac{x^2}{u} - u\right)^2\right\},$$

may be employed directly, but we have not investigated this here.


Conclusion and Discussion

We also apply the density estimators considered here to real length-biased data consisting of the widths of 46 shrubs given in Muttlak and McDonald (1990). These estimators are plotted in Figure 7. Their behaviour in the extreme tails is similar; however, the KP and CG estimators do not perform well at the edge.


[Figure omitted: histogram of the 46 shrub widths over x ∈ (0, 3.5) with the three density estimates overlaid, values up to about 1.5; legend: CL, KP, CG.]

Figure: 7. Plots of density estimators and histogram for 46 shrubs.


References

1 Bagai, I. and Prakasa Rao, B.L.S. (1995). Kernel Type Density Estimates for Positive Valued Random Variables. Sankhya A, 57, 56-67.

2 Bhattacharyya, B.B., Franklin, L.A. and Richardson, G.D. (1988). A Comparison of Nonparametric Unweighted and Length-Biased Density Estimation of Fibres. Communications in Statistics - Theory and Methods, 17, 3629-3644.

3 Chaubey, Y.P. and Li, J. (2013). Asymmetric kernel density estimator for length biased data. In Contemporary Topics in Mathematics and Statistics with Applications, Vol. 1 (Eds.: A. Adhikari, M.R. Adhikari and Y.P. Chaubey), Asian Books Pvt. Ltd., New Delhi, India.

4 Chaubey, Y.P., Li, J., Sen, A. and Sen, P.K. (2012). A New Smooth Density Estimator for Non-Negative Random Variables. Technical Report No. 01/07, Department of Mathematics and Statistics, Concordia University, Montreal.


References

5 Chaubey, Y.P. and Sen, P.K. (1996). On Smooth Estimation of Survival and Density Function. Statistics and Decisions, 14, 1-22.

6 Chaubey, Y.P., Sen, P.K. and Li, J. (2010). Smooth Density Estimation for Length-biased Data. Journal of the Indian Society of Agricultural Statistics, 64, 145-155.

7 Chaubey, Y.P., Sen, A. and Sen, P.K. (2012). A New Smooth Density Estimator for Non-Negative Random Variables. Technical Report No. 01/07, Department of Mathematics and Statistics, Concordia University, Montreal.

8 Chen, S.X. (2000). Probability Density Function Estimation Using Gamma Kernels. Annals of the Institute of Statistical Mathematics, 52, 471-480.

9 Cox, D.R. (1969). Some Sampling Problems in Technology. In New Developments in Survey Sampling (Eds.: N.L. Johnson and H. Smith), John Wiley, New York, 506-527.


References

10 Feller, W. (1966). An Introduction to Probability Theory and Its Applications, Vol. II. John Wiley, New York.

11 Jones, M.C. (1991). Kernel Density Estimation for Length Biased Data. Biometrika, 78, 511-519.

12 Kulasekera, K.B. and Padgett, W.J. (2006). Bayes Bandwidth Selection in Kernel Density Estimation with Censored Data. Journal of Nonparametric Statistics, 18, 129-143.

13 Muttlak, H.A. and McDonald, L.L. (1990). Ranked set sampling with size-biased probability of selection. Biometrics, 46, 435-446.

14 Patil, G.P. and Rao, C.R. (1977). Weighted distributions: a survey of their applications. In Applications of Statistics (Ed.: P.R. Krishnaiah), North Holland Publishing Company, 383-405.


References

15 Patil, G.P. and Rao, C.R. (1978). Weighted Distributions and Size-Biased Sampling with Applications to Wildlife Populations and Human Families. Biometrics, 34, 179-189.

16 Rao, C.R. (1977). A Natural Example of a Weighted Binomial Distribution. American Statistician, 31, 24-26.

17 Scaillet, O. (2004). Density Estimation Using Inverse and Reciprocal Inverse Gaussian Kernels. Journal of Nonparametric Statistics, 16, 217-226.

18 Silverman, B.W. (1986). Density Estimation for Statistics and Data Analysis. Chapman and Hall, London.

19 Wu, C.O. and Mao, A.Q. (1996). Minimax Kernels for Density Estimation with Biased Data. Annals of the Institute of Statistical Mathematics, 48, 451-467.


Talk slides are available on SlideShare: http://www.slideshare.net/YogendraChaubey/TS-IASSL2014

THANKS!!
