
Page 1: TS-IASSL2014

Smoothing Parameter Selection for Nonparametric Density Estimation for Length-biased Data: A Bayesian Perspective

Yogendra P. Chaubey

Department of Mathematics and Statistics
Concordia University, Montreal, Canada H3G 1M8

E-mail: [email protected]

∗ Based on joint work with Jun Li, Nanjing Audit University, Nanjing, PRC and Isha Dewan, Indian Statistical Institute, New Delhi, India

Talk to be presented at the IASSL-2014 Conference, Colombo, Sri Lanka, December 28-30, 2014

Yogendra Chaubey (Concordia University) Department of Mathematics & Statistics December 28-30, 2014 1 / 45


Abstract

Nonparametric estimation of densities defined over non-negative observations using asymmetric kernels is of special interest, as it has the potential to remove the spill-over effect at the boundary. One important problem in this context is the selection of the smoothing parameter. The purpose of this talk is to review some recent work on the application of a Bayes criterion for this purpose and to investigate its application in the context of length-biased data.


Outline

1 Introduction

2 Asymmetric Kernel Density Estimators

3 Bayesian Estimation of h

4 Optimal Selection of Parameters α and β

5 Numerical Studies


1. Introduction

For a given random sample X_1, ..., X_n from some unknown density f(x), a popular method known as the "kernel method" provides the following density estimator:

    f_n(x) = (1/n) ∑_{i=1}^n k_h(x − X_i) = (1/(nh)) ∑_{i=1}^n k((x − X_i)/h),    (1.1)

where the function k(·), called the kernel function, is basically a symmetric probability density function on the entire real line, i.e.

    (i) k(−x) = k(x),    (ii) ∫_{−∞}^{∞} k(x) dx = 1,

and

    k_h(x) = (1/h) k(x/h).


Introduction

h is known as the bandwidth; it is also commonly called the smoothing parameter.

Its choice is an important one, as it exhibits a strong influence on the estimator: a very small value of h may make the estimator very rugged (with many modes), whereas a very large value may produce an oversmoothed estimator that hides some important features of the data.

For asymptotic analysis h is made to depend on n, i.e. h ≡ h_n, such that h_n → 0 and nh_n → ∞ as n → ∞.
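The estimator (1.1) and the effect of the bandwidth can be sketched in a few lines of Python (an illustrative sketch with made-up names, not code from the talk), using a Gaussian kernel and the three contrasting bandwidths of Figure 1:

```python
import numpy as np

def kde(x, data, h):
    """Symmetric-kernel density estimator (1.1) with a Gaussian kernel k."""
    x = np.asarray(x, dtype=float)[:, None]       # evaluation points, shape (m, 1)
    u = (x - np.asarray(data, dtype=float)) / h   # (x - X_i)/h, shape (m, n)
    k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)  # symmetric kernel k((x - X_i)/h)
    return k.mean(axis=1) / h                     # (1/(nh)) sum_i k((x - X_i)/h)

rng = np.random.default_rng(0)
sample = rng.normal(size=100)                     # 100 points from N(0, 1), as in Figure 1
grid = np.linspace(-8.0, 8.0, 1601)
dx = grid[1] - grid[0]
for h in (0.05, 0.337, 2.0):                      # three contrasting bandwidths (cf. Figure 1)
    f_hat = kde(grid, sample, h)
    print(f"h = {h}: integral ~ {f_hat.sum() * dx:.3f}")  # each estimate is a proper density
```

With h = 0.05 the estimate is very rugged, with h = 2 it is oversmoothed, and an intermediate h recovers the shape of the standard normal; all three integrate to one.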


Kernel Density Estimators

Source:

http://en.wikipedia.org/wiki/Kernel_density_estimation

Figure: 1. Kernel density estimate (KDE) with different bandwidths for a random sample of 100 points from a standard normal distribution. Grey: true density (standard normal). Red: KDE with h = 0.05. Black: KDE with h = 0.337. Green: KDE with h = 2.


Kernel Density Estimators for Suicide Data

A problem with the general kernel density estimator in the case of non-negative data is that it may place positive mass on the zero-probability region; see the next figure. [x: days in hospital under treatment, for 86 treatment spells.]

Figure: 2. Kernel density estimators for the suicide study data of Silverman (1986); the curves correspond to the Default, SJ, UCV and BCV bandwidth choices.


Asymmetric Kernel Estimators

In order to overcome this defect, many new methods for estimating the underlying density of non-negative random variables have been proposed, particularly in recent years.

Bagai and Prakasa Rao (1995) proposed replacing the symmetric kernel k by a pdf k∗ with non-negative support.

This method avoids the problem of positive mass in the negative region; however, only the first r order statistics are used for estimating f(x), where X_(r) < x ≤ X_(r+1), X_(i) denoting the i-th order statistic, which is an undesirable feature.

Chaubey and Sen (1996) proposed smoothing based on Poisson weights using an approximation lemma due to Feller (1966); this approach has been adapted to the length-biased set-up by Chaubey, Sen and Li (2010).

This method uses the whole data, in contrast to the method of Bagai and Prakasa Rao (1995).


Asymmetric Kernel Estimators

A class of asymmetric kernel estimators was proposed by Chen (2000) in the i.i.d. setup, using gamma kernels.

Similar estimators were proposed by Scaillet (2004) using inverse Gaussian and reciprocal inverse Gaussian densities.

Another class of asymmetric kernel estimators has been proposed recently by Chaubey et al. (2012), motivated by a generalization of the estimator in Chaubey and Sen (1996).


Biased Data

In many applications the recorded observation may be assumed to have a probability density function g(x) of the form

    g(x) = μ_w^{−1} w(x) f(x),    x ∈ R⁺,    (1.2)

where f(x) is the original density, w(x) is a non-negative known function called the weighting function, and

    1/μ_w = E_g(1/w(X))    (1.3)

with X ∼ g(·).

Patil and Rao (1977) cite several examples, including those generated by the PPS (probability proportional to size) sampling scheme (common in sample surveys), damage models and sub-sampling [see also Rao (1977) and Patil and Rao (1978)].

In such applications, the common choices for w(x) are (i) w(x) = x and (ii) w(x) = x². These are known as length-biased and area-biased data respectively.


Length Biased Data

Here we concentrate on the case w(x) = x, a situation giving what is known as length-biased data, where typically the observations are non-negative.

Thus we consider a random sample {X_1, ..., X_n} of n non-negative independent and identically distributed (i.i.d.) random variables (r.v.) having a continuous probability density function (pdf) g(x), x ∈ R⁺ = [0, ∞), given by

    g(x) = μ^{−1} x f(x),    x ∈ R⁺,    (1.4)

where μ = 1/E_g[1/X] is the harmonic mean of X with respect to the density g(·), and estimation of f(x), x ∈ R⁺, itself is of central importance. Here it is tacitly assumed that 0 < μ < ∞.


Density Estimation

Bhattacharyya et al. (1988) studied the kernel density estimator of f(x) obtained from the corresponding estimator of g(x) and the relation (1.4), replacing the unknown value μ by its harmonic mean estimator as proposed by Cox (1969).

Cox (1969) also gave a direct estimator of F(x) that has been used by Jones (1991) in proposing an alternative density estimator; however, the density estimator in the background used symmetric kernels.

The problem of estimating f(x) near x = 0 is of concern, as the kernel estimator may have serious bias near x = 0. This problem has long been recognized in density estimation in the context of i.i.d. data [see Silverman (1986)]; however, it becomes more pertinent for length-biased data, as the observations are necessarily non-negative.


Density Estimation

In a more recent paper, Chaubey and Li (2013) adapted the smoothing estimator for the i.i.d. setup developed in Chaubey et al. (2012) to length-biased density estimation and proposed two new asymmetric kernel estimators, one based on the usual empirical distribution function and the other based on Cox's estimator.

Here also, determination of the smoothing parameter h, vis-à-vis the asymmetric kernel, is of importance.

Traditionally, the smoothing parameter h is obtained from the data by minimizing an estimate of a global measure of error, the mean integrated squared error:

    MISE(f̂) = E[∫ (f̂(x) − f(x))² dx]


Density Estimation

Kulasekera and Padgett (2006) motivate the asymmetric kernel estimators, although in the context of censored data, by the general form of the nonparametric density estimator given by

    f̂(x) = ∫ k(x; u, h) dF_n(u).    (1.5)

In that paper the following inverse Gaussian (IG) choice of the kernel (with mean u and dispersion parameter 1/h) was found useful:

    k_KP(x; u, h) = (1/√(2πhx³)) exp{−(1/(2h)) (x − u)²/(u²x)}.    (1.6)

Kulasekera and Padgett (2006) note that the estimator f̂ based on the above kernel is a true density, while that using the reciprocal IG is not. Yet in these methods the selection of the smoothing parameter may be numerically challenging. Here we explore the idea of using a Bayesian method of smoothing parameter selection, inspired by the paper of Kulasekera and Padgett (2006).


Density Estimation

In this approach, we contrast some alternative choices of asymmetric kernels.

Along with some preliminary notions, the proposed smooth estimators of f(·) are given in Section 2.

Section 3 explains the Bayesian method for determining the smoothing parameter, along with various alternative choices of asymmetric kernels.

Section 4 is devoted to a discussion of determining the parameters of the prior used, and Section 5 presents a numerical study contrasting the use of different kernels.


2. Asymmetric Kernel Density Estimators

There are basically two strategies for obtaining a smooth estimator of f(x). One is based on a smooth version of Cox's estimator F_n(x), given by

    F_n(x) = [∑_{i=1}^n (1/X_i) I{X_i ≤ x}] / [∑_{i=1}^n (1/X_i)].    (2.1)

The other is to first estimate g(x), via the derivative of a smooth version of the empirical distribution function

    G_n(x) = (1/n) ∑_{i=1}^n I{X_i ≤ x},    (2.2)

and then make some adjustments to obtain the estimator of the underlying density f(x). Chaubey and Li (2013) and Jones (1991) found that the density estimators based on smoothing of F_n(x) have better performance. Thus the density estimators we consider here are based on F_n.
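As a sketch of how (2.1) is computed in practice (Python, with illustrative names, not code from the talk):

```python
import numpy as np

def cox_edf(x, data):
    """Cox's weighted empirical distribution function F_n of (2.1)."""
    data = np.asarray(data, dtype=float)
    w = 1.0 / data                           # length-bias correction weights 1/X_i
    x = np.asarray(x, dtype=float)[:, None]  # evaluation points, shape (m, 1)
    ind = data <= x                          # indicator I{X_i <= x}
    return (w * ind).sum(axis=1) / w.sum()

X = np.array([0.5, 1.0, 2.0, 4.0])           # toy length-biased sample
print(cox_edf(np.array([1.0, 4.0]), X))      # F_n(1) = 3/3.75 = 0.8, F_n(4) = 1.0
```

The smaller observations receive the larger weights 1/X_i, undoing the length bias; the ordinary EDF G_n of (2.2) would weight all observations equally.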


2. Asymmetric Kernel Density Estimators

These estimators are of the form

    f̂_h(x) = ∑_{i=1}^n s_i k(x; X_i, h),    (2.3)

where

    s_i = (1/X_i) / ∑_i (1/X_i),    (2.4)

k is a kernel function and h is a smoothing parameter.

This expression is motivated by the general form of the nonparametric density estimators given by (1.5).

In this talk, we explore an alternative form of k(x; u, h) following Chaubey and Li (2013), where k(x; u, h) is required to be a density with respect to the argument u.


2. Asymmetric Kernel Density Estimators

The smooth density estimator is derived from the derivative of the following smooth estimator of the distribution function F(x):

    F̃_n(x) = ∫_0^∞ F_n(t) dQ_{h_n}(t/x)    (2.5)
           = 1 − [∑_{i=1}^n (1/X_i) Q_{h_n}(X_i/x)] / [∑_{i=1}^n (1/X_i)],    (2.6)

where Q_h(·) is a distribution function with mean 1 and variance h². The corresponding smooth estimator of f, obtained by differentiating F̃_n(x), is given by

    f̃_n(x) = [(1/x²) ∑_{i=1}^n q_{h_n}(X_i/x)] / [∑_{i=1}^n (1/X_i)],    (2.7)

where q_{h_n}(u) = (d/du) Q_{h_n}(u). This estimator is of the same form as (2.3), with k(x; u, h) given by

    k(x; u, h) = (u/x²) q_h(u/x).    (2.8)


2. Asymmetric Kernel Density Estimators

As noted in Chaubey et al. (2012), (2.7) may not be defined at x = 0, except in cases where lim_{x→0} f̃_n(x) exists. Modifications have been discussed in Chaubey and Li (2013) [see also Chaubey, Sen and Sen (2007)]. However, this will not be pursued here, and we may think of the methods discussed here as estimating f(x) for x > 0 only.

The following two alternative choices of k(x; u, h) will be investigated here.

The first one is obtained by choosing q_h(·) to be the inverse Gaussian density with parameters μ = 1 and λ = 1/h, i.e.

    q_h(u) = √(1/(2πhu³)) exp{−(1/(2hu)) (u − 1)²}.

This provides the following choice of k(x; u, h):

    k_CL(x; u, h) = (u/x²) q_h(u/x) = (1/√(2πhux)) exp{−(1/(2hux)) (x − u)²}.    (2.9)
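A minimal Python sketch (illustrative names, not the authors' code) of the estimator (2.3) with the CL kernel (2.9); here a χ²(6) sample stands in for length-biased χ²(4) data, since x f_{χ²(4)}(x) ∝ f_{χ²(6)}(x):

```python
import numpy as np

def k_cl(x, u, h):
    """CL kernel (2.9): (2*pi*h*u*x)^(-1/2) * exp(-(x - u)^2 / (2*h*u*x))."""
    return np.exp(-((x - u) ** 2) / (2.0 * h * u * x)) / np.sqrt(2.0 * np.pi * h * u * x)

def f_hat(x, data, h):
    """Length-biased density estimator (2.3): sum_i s_i * k(x; X_i, h)."""
    data = np.asarray(data, dtype=float)
    s = (1.0 / data) / (1.0 / data).sum()        # weights s_i of (2.4)
    x = np.asarray(x, dtype=float)[:, None]      # evaluation points, shape (m, 1)
    return (s * k_cl(x, data, h)).sum(axis=1)

rng = np.random.default_rng(1)
X = rng.chisquare(6, size=500)                   # plays the role of length-biased chi^2(4) data
grid = np.linspace(1e-3, 80.0, 8000)
est = f_hat(grid, X, h=0.1)
print(est.sum() * (grid[1] - grid[0]))           # ~ 1: the estimate is (numerically) a density in x
```

The bandwidth h = 0.1 here is arbitrary; its Bayesian selection is the subject of the next section.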


Asymmetric Kernels

kCL represents a density with respect to argument x as well as withrespect to argument u; for fixed x, kCL(x;u, h) is the density ofIG(µ, λ) with µ = x, λ = x/h and it represents the density ofreciprocal IG with µ = 1/u and λ = 1/hu for fixed u.

Thus u is the mode of kCL for fixed x, as opposed to the mean incase of kKP .

The second choice is obtained from the density of RRIG (orco-Gaussian as suggested by Professor Mudholkar recently) that hasmode 1, i.e. we take

qh(u) =

√2

πhexp{− 1

2h

(u− 1

u

)2

} (2.10)

This gives

kCG(x;u, h) =u

x2qh(u/x) =

√2

πh

u

x2exp{− 1

2hx2

(u− x2

u

)2

}

(2.11)


Strong Consistency of f̂_h(x)

The next theorem establishes strong consistency of f̂_h using the CL kernel, though we believe that it will hold for the other kernels considered here as well.

Theorem

If f(x) is Lipschitz continuous on [0, ∞) and ∫_0^∞ (1/x) f(x) dx < ∞, then as n → ∞, h → 0 and nh(log n)^{−(1+θ)} → ∞ (θ > 0),

    |f̂_h(x) − f(x)| → 0 a.s.

for any x > 0.

Remark 2.1:

A similar theorem is offered in Kulasekera and Padgett (2006); however, the only condition on h they assumed is that it goes to zero. We believe that the condition nh(log n)^{−(1+θ)} → ∞ imposed here is essential, since otherwise the absolute value under the integral sign is not properly accounted for.


3. Bayesian Estimation of h

Consider the approximation f_h(x) of f(x):

    f_h(x) = ∫ k(x; t, h) dF(t),

and let ξ(h) represent a prior on h; then the posterior of h at the point x is given by

    ξ(h|x) = f_h(x) ξ(h) / ∫ f_h(x) ξ(h) dh.

In the spirit of empirical Bayes methodology, using the estimate f̂_h(x) in place of f_h(x), the posterior of h based on the data X may be estimated by

    ξ̂(h|x, X) = f̂_h(x) ξ(h) / ∫ f̂_h(x) ξ(h) dh.


3. Bayesian Estimation of h

Then under the squared error loss, the Bayes estimator of h is given by

    ĥ = ∫ h f̂_h(x) ξ(h) dh / ∫ f̂_h(x) ξ(h) dh
      = [∑ s_i ∫ h k(x; X_i, h) ξ(h) dh] / [∑ s_i ∫ k(x; X_i, h) ξ(h) dh].

We follow Kulasekera and Padgett (2006) to obtain explicit expressions for the Bayes estimator of h under the two choices of kernels mentioned earlier. We will employ the inverted gamma prior with parameters α and β, i.e.

    ξ(h) = (1/(β^α Γ(α) h^{α+1})) e^{−1/(βh)}.
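A quick numerical sanity check on this prior (a Python sketch; α = 2, β = 5 are merely illustrative values, used later in the talk for the CL kernel): it integrates to one and has mean 1/((α − 1)β):

```python
import numpy as np
from math import gamma

def inv_gamma_prior(h, alpha, beta):
    """Inverted gamma prior: xi(h) = h^-(alpha+1) exp(-1/(beta*h)) / (beta^alpha * Gamma(alpha))."""
    return np.exp(-1.0 / (beta * h)) / (beta ** alpha * gamma(alpha) * h ** (alpha + 1))

alpha, beta = 2.0, 5.0
h = np.linspace(1e-4, 50.0, 500_000)
dh = h[1] - h[0]
xi = inv_gamma_prior(h, alpha, beta)
print(np.sum(xi) * dh)       # ~ 1.0: a proper prior density
print(np.sum(h * xi) * dh)   # ~ 1/((alpha - 1) * beta) = 0.2: the prior mean of h
```

The prior mean 1/((α − 1)β) is the quantity used on a later slide when discussing how Kulasekera and Padgett calibrate α and β.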


KP-Kernel

In this case, the (estimated) posterior density is given by

    ξ̂(h|x, X_1, ..., X_n) = [∑_{i=1}^n (s_i/h^{α∗+1}) e^{−1/(β∗_i h)}] / [Γ(α∗) ∑_{i=1}^n s_i (β∗_i)^{α∗}],    (3.1)

where α∗ = α + 1/2 for α > 1/2 and

    β∗_i = [1/β + (x − X_i)²/(2xX_i²)]^{−1}.

Then the Bayes estimator of h is given by

    ĥ(x) = [∑ s_i (β∗_i)^{α∗−1}] / [(α∗ − 1) ∑ s_i (β∗_i)^{α∗}].    (3.2)


KP-Kernel

If we were to use the improper prior

    ξ(h) ∝ 1/h²,    (3.3)

then the (proper) posterior density can be obtained as

    ξ̂(h|x, X_1, ..., X_n) = (2/√π) [∑_{i=1}^n s_i h^{−5/2} e^{−1/(β∗∗_i h)}] / [∑_{i=1}^n s_i (β∗∗_i)^{3/2}],

and the resulting estimator of h is given by

    ĥ(x) = 2 [∑ s_i (β∗∗_i)^{1/2}] / [∑ s_i (β∗∗_i)^{3/2}],    (3.4)

where

    β∗∗_i = [(x − X_i)²/(2xX_i²)]^{−1}.


CL-Kernel

In this case, the (estimated) posterior density using the inverted gamma prior is given by

    ξ̂(h|x, X_1, ..., X_n) = [∑_{i=1}^n (s_i/X_i^{1/2}) h^{−(α∗+1)} e^{−1/(β_i h)}] / [Γ(α∗) ∑_{i=1}^n (s_i/X_i^{1/2}) (β_i)^{α∗}],    (3.5)

where

    β_i = [1/β + (x − X_i)²/(2xX_i)]^{−1}.

This results in the following estimator of h:

    ĥ(x) = [∑ (s_i/X_i^{1/2}) (β_i)^{α∗−1}] / [(α∗ − 1) ∑ (s_i/X_i^{1/2}) (β_i)^{α∗}].    (3.6)
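The closed form (3.6) is straightforward to compute; a Python sketch (illustrative names; α = 2, β = 5 are the illustrative CL-kernel values mentioned later in the talk) of this local bandwidth:

```python
import numpy as np

def h_bayes_cl(x, data, alpha=2.0, beta=5.0):
    """Local Bayes bandwidth (3.6) for the CL kernel under the inverted gamma prior."""
    data = np.asarray(data, dtype=float)
    s = (1.0 / data) / (1.0 / data).sum()            # weights s_i of (2.4)
    a_star = alpha + 0.5                             # alpha* = alpha + 1/2
    # beta_i = [1/beta + (x - X_i)^2 / (2 x X_i)]^(-1)
    beta_i = 1.0 / (1.0 / beta + (x - data) ** 2 / (2.0 * x * data))
    w = s / np.sqrt(data)                            # s_i / X_i^(1/2)
    num = (w * beta_i ** (a_star - 1.0)).sum()
    den = (a_star - 1.0) * (w * beta_i ** a_star).sum()
    return num / den

rng = np.random.default_rng(2)
X = rng.chisquare(6, size=200)                       # plays the role of length-biased data
for x in (1.0, 5.0, 10.0):
    print(x, h_bayes_cl(x, X))                       # the bandwidth adapts to x (a local choice)
```

Note that ĥ(x) varies with the evaluation point x, which is the "local bandwidth" feature discussed in Section 4.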


CL-Kernel

For the improper prior given in (3.3), the estimate of h is given by

    ĥ(x) = 2 [∑ (s_i/X_i^{1/2}) (β∗_i)^{1/2}] / [∑ (s_i/X_i^{1/2}) (β∗_i)^{3/2}],    (3.7)

where

    β∗_i = [(x − X_i)²/(2xX_i)]^{−1}.


Co-Gaussian Kernel:

Using the co-Gaussian kernel k_CG(x; u, h), the Bayes estimator of h under the inverted gamma prior is given by

    ĥ(x) = [∑ s_i X_i (β∗_i)^{α∗−1}] / [(α∗ − 1) ∑ s_i X_i (β∗_i)^{α∗}],    (3.8)

where

    β∗_i = [1/β + (X_i − x²/X_i)²/(2x²)]^{−1}.

The estimator under the improper prior is given by

    ĥ(x) = 2 [∑ s_i X_i (β∗∗_i)^{1/2}] / [∑ s_i X_i (β∗∗_i)^{3/2}],    (3.9)

where

    β∗∗_i = [(X_i − x²/X_i)²/(2x²)]^{−1}.


CL/KP Estimators of the χ²(4) Density

The following figure gives a qualitative idea of the nature of the two estimators; the CL kernel seems to have an edge over the KP kernel.

Figure: 3. Two smooth estimators of the length-biased χ²(4) density, n = 500 (curves: true density, KP kernel, CL kernel).


The choice of the parameters α and β for the two kernels may be quite different; for example, in this case we chose α = 0.8, β = 7 for the KP kernel and α = 2, β = 5 for the CL kernel.

Kulasekera and Padgett (2006) choose a value of β > 3 and then choose α so that the average value of ĥ, calculated over a range of values, equals 1/((α − 1)β).

We take the direct approach of choosing α and β so that the estimated MISE is minimized. This is achieved by minimizing the unbiased cross-validation criterion, as explained in the next section.


Optimal selection of parameters α and β

Here we describe the process of selecting h by developing a data-driven selection of the parameters α and β. One popular criterion measuring the accuracy of a density estimator f̂(x) relative to the true density f(x) is

    ISE(f̂, f) = ∫_0^∞ [f̂(x) − f(x)]² dx
              = ∫_0^∞ f̂²(x) dx − 2 ∫_0^∞ f̂(x) f(x) dx + ∫_0^∞ f²(x) dx,    (4.1)

which is referred to as the integrated squared error (ISE).

An optimal choice of bandwidth could be obtained by making the value of (4.1) as small as possible.

However, since the true density f(x) is unknown, (4.1) is not a practical objective function for optimization, so some modifications must be made.


UCV Criterion:

Ignoring the last (constant) term in (4.1) and replacing the second term with a leave-one-out estimator, an objective function for practical optimization arises as an approximation of (4.1); it is called the unbiased cross-validation (UCV) criterion.

In the length-biased scheme, the objective function is given by

    UCV = ∫_0^∞ f̂²(x) dx − 2 ∑_{i=1}^n f̂_{−i}(X_i)/Z_i,    (4.2)

where f̂_{−i}(x) represents the density estimator built on the data set excluding X_i, and Z_i = ∑_{j≠i} X_i/X_j [see Wu and Mao (1996) and Chaubey et al. (2012)].
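A sketch (Python, illustrative names, not the authors' code) of evaluating (4.2) for the CL-kernel estimator; for simplicity this version scores a fixed global h, whereas in the talk the local Bayes bandwidth ĥ(x; α, β) is plugged in, making the UCV a function of α and β:

```python
import numpy as np

def k_cl(x, u, h):
    """CL kernel (2.9)."""
    return np.exp(-((x - u) ** 2) / (2.0 * h * u * x)) / np.sqrt(2.0 * np.pi * h * u * x)

def ucv(data, h, grid):
    """UCV criterion (4.2) for the length-biased CL-kernel estimator with bandwidth h."""
    data = np.asarray(data, dtype=float)
    n = data.size
    s = (1.0 / data) / (1.0 / data).sum()                      # weights s_i of (2.4)
    # first term: integral of f_hat^2, by a Riemann sum on the grid
    f_hat = (s * k_cl(grid[:, None], data, h)).sum(axis=1)
    term1 = np.sum(f_hat ** 2) * (grid[1] - grid[0])
    # second term: leave-one-out sum with Z_i = sum_{j != i} X_i / X_j
    term2 = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        s_loo = (1.0 / data[keep]) / (1.0 / data[keep]).sum()
        f_loo = (s_loo * k_cl(data[i], data[keep], h)).sum()   # f_hat_{-i}(X_i)
        z_i = (data[i] / data[keep]).sum()
        term2 += f_loo / z_i
    return term1 - 2.0 * term2

rng = np.random.default_rng(3)
X = rng.chisquare(6, size=100)
grid = np.linspace(1e-3, 60.0, 3000)
for h in (0.05, 0.2, 1.0):
    print(h, ucv(X, h, grid))        # the value of h (or of (alpha, beta)) minimizing UCV is selected
```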


Choice of α and β

Note that the Bayesian method gives a local bandwidth choice, as ĥ depends on x.

After the local bandwidth ĥ is plugged into (4.2), the UCV becomes a function of the parameters α and β.

In this case, the optimal choices of α and β are obtained by minimizing the UCV function (4.2), which leads to the optimal selection of the Bayesian bandwidth under criterion (4.1). The following plot is an example of the UCV function with independent variables α and β. The surface is bowl-shaped, indicating that a minimum of the UCV function exists.


Figure: 4. Plot of the UCV function over α ∈ [2.5, 5.0] and β ∈ [3, 8] (UCV values range from about −0.665 to −0.655).


5. Numerical Studies

We compare the KP estimator with the CL and CG estimators in a simulation study based on a class of Weibull densities:

    f(x) = θ x^{θ−1} exp{−x^θ} I{x ≥ 0}

with θ = 0.5, 1, 1.5; the figure below displays these densities.

Figure: 5. Plot of the various underlying pdf's (θ = 0.5, 1.0, 1.5).


We use the UCV method, as described earlier, to choose the global parameters α and β for the KP, CL and CG kernels. We then compare their performance based on the ratios R_{KP,CL} and R_{KP,CG}, where

    R_{KP,CL} = EMSE(f̂_KP(x)) / EMSE(f̂_CL(x))

and

    R_{KP,CG} = EMSE(f̂_KP(x)) / EMSE(f̂_CG(x))

for a range of values of x. Here the estimated MSE (EMSE) of a density estimator f̂(x) is defined as

    EMSE(f̂(x)) = ∑_{i=1}^N [f̂_i(x) − f(x)]²/N,

f̂_i being the estimate from the i-th simulation, f the true density, and N the number of simulations. Here we use 500 repetitions to estimate the MSE. The ratios R_{KP,CL} and R_{KP,CG} are plotted over a range of x-values for different values of θ in Figure 6.


Figure: 6. Plots of the ratios R_{KP,CL} and R_{KP,CG} for various values of θ: (a) R_{KP,CL}, θ = 0.5; (b) R_{KP,CL}, θ = 1.0; (c) R_{KP,CL}, θ = 1.5; (d) R_{KP,CG}, θ = 0.5; (e) R_{KP,CG}, θ = 1.0; (f) R_{KP,CG}, θ = 1.5. In each panel, solid line: n = 50; dotted line: n = 200.


Conclusion and Discussion

For smaller values of x, as well as in the tails, the CL kernel outperforms the KP kernel; the KP kernel may have a slight advantage over the CL kernel in the middle range, but this advantage disappears for larger values of n.

The co-Gaussian kernel may be a bit better than the KP kernel only for larger sample sizes and large values of x. The co-Gaussian kernel was later dropped from further investigation upon realizing that the resulting k_CG(x; u, h) may be a density neither in x nor in u. However, when the mean of the generating kernel q_h(u) is close to one [as is the case for small values of h], q_h(u) is a density with approximate mean 1 and the CG kernel behaves as a density in u with mean x.

The co-Gaussian kernel with mode x and dispersion h, that is,

    k(x; u, h) = √(2/(πh)) exp{−(1/(2h)) (x²/u − u)²},

may be employed directly, but we have not investigated this here.


Conclusion and Discussion

We also apply the density estimators considered here to real length-biased data consisting of the widths of 46 shrubs given in Muttlak and McDonald (1990). These estimators are plotted in Figure 7. Their behaviour in the extreme tails is similar; however, the KP and CG estimators do not perform well at the edge.


Figure: 7. Plots of the density estimators (CL, KP, CG) and histogram for the widths of 46 shrubs.


References

1 Bagai, I. and Prakasa Rao, B.L.S. (1995). Kernel Type Density Estimates for Positive Valued Random Variables. Sankhyā A, 57, 56-67.

2 Bhattacharyya, B.B., Franklin, L.A. and Richardson, G.D. (1988). A Comparison of Nonparametric Unweighted and Length-Biased Density Estimation of Fibres. Communications in Statistics - Theory and Methods, 17, 3629-3644.

3 Chaubey, Y.P. and Li, J. (2013). Asymmetric kernel density estimator for length biased data. In Contemporary Topics in Mathematics and Statistics with Applications, Vol. 1 (Eds.: A. Adhikari, M.R. Adhikari and Y.P. Chaubey), Asian Books Pvt. Ltd., New Delhi, India.

4 Chaubey, Y.P., Li, J., Sen, A. and Sen, P.K. (2012). A New Smooth Density Estimator for Non-Negative Random Variables. Technical Report No. 01/07, Department of Mathematics and Statistics, Concordia University, Montreal.


References

5 Chaubey, Y.P. and Sen, P.K. (1996). On Smooth Estimation of Survival and Density Function. Statistics & Decisions, 14, 1-22.

6 Chaubey, Y.P., Sen, P.K. and Li, J. (2010). Smooth Density Estimation for Length-biased Data. Journal of the Indian Society of Agricultural Statistics, 64, 145-155.

7 Chaubey, Y.P., Sen, A. and Sen, P.K. (2012). A New Smooth Density Estimator for Non-Negative Random Variables. Technical Report No. 01/07, Department of Mathematics and Statistics, Concordia University, Montreal.

8 Chen, S.X. (2000). Probability Density Function Estimation Using Gamma Kernels. Annals of the Institute of Statistical Mathematics, 52, 471-480.

9 Cox, D.R. (1969). Some Sampling Problems in Technology. In New Developments in Survey Sampling (Eds.: N.L. Johnson and H. Smith), John Wiley, New York, 506-527.


References

10 Feller, W. (1966). An Introduction to Probability Theory and Its Applications, Vol. II. John Wiley, New York.

11 Jones, M.C. (1991). Kernel Density Estimation for Length Biased Data. Biometrika, 78, 511-519.

12 Kulasekera, K.B. and Padgett, W.J. (2006). Bayes Bandwidth Selection in Kernel Density Estimation with Censored Data. Journal of Nonparametric Statistics, 18, 129-143.

13 Muttlak, H.A. and McDonald, L.L. (1990). Ranked Set Sampling with Size-Biased Probability of Selection. Biometrics, 46, 435-446.

14 Patil, G.P. and Rao, C.R. (1977). Weighted Distributions: A Survey of Their Applications. In Applications of Statistics (Ed.: P.R. Krishnaiah), North Holland Publishing Company, 383-405.


References

15 Patil, G.P. and Rao, C.R. (1978). Weighted Distributions and Size-Biased Sampling with Applications to Wildlife Populations and Human Families. Biometrics, 34, 179-189.

16 Rao, C.R. (1977). A Natural Example of a Weighted Binomial Distribution. American Statistician, 31, 24-26.

17 Scaillet, O. (2004). Density Estimation Using Inverse and Reciprocal Inverse Gaussian Kernels. Journal of Nonparametric Statistics, 16, 217-226.

18 Silverman, B.W. (1986). Density Estimation for Statistics and Data Analysis. Chapman and Hall, London.

19 Wu, C.O. and Mao, A.Q. (1996). Minimax Kernels for Density Estimation with Biased Data. Annals of the Institute of Statistical Mathematics, 48, 451-467.


Talk slides are available on SlideShare: http://www.slideshare.net/YogendraChaubey/TS-IASSL2014

THANKS!!
