Exploring the Use of Asymmetric Maximum Likelihood, Quantile & M-Quantile Regression for Small Area Estimation of Counts Nikos Tzavidis Southampton Statistical Sciences Research Institute University of Southampton joint work with M.G. Ranalli 1 , N.Salvati 2 , E. Dreassi 3 , R. Chambers 4 Graybill 2013 Colorado State University 1 University of Perugia 2 University of Pisa 3 University of Florence 4 University of Wollongong


Page 1: Exploring the Use of Asymmetric Maximum Likelihood, Quantile & M-Quantile Regression ... · 2013-09-23 · Exploring the Use of Asymmetric Maximum Likelihood, Quantile & M-Quantile

Exploring the Use of Asymmetric Maximum Likelihood, Quantile & M-Quantile Regression for Small Area Estimation of Counts

Nikos Tzavidis

Southampton Statistical Sciences Research Institute, University of Southampton

Joint work with M.G. Ranalli (1), N. Salvati (2), E. Dreassi (3), R. Chambers (4)

Graybill 2013, Colorado State University

(1) University of Perugia; (2) University of Pisa; (3) University of Florence; (4) University of Wollongong


Outline

• SAE for continuous outcomes

• Asymmetric Maximum Likelihood for counts

• Quantile regression for counts

• M-Quantile regression for counts - Extending Cantoni & Ronchetti (2001, JASA)

• SAE for counts

• Empirical study

• Final remarks


Small Area Estimation: Preliminaries and Notation

• Units indexed by j, Areas indexed by d

• Variable of interest y, for now continuous

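• Linear Mixed Model (LMM) approach to estimation, the industry standard:

$$y_{jd} = x_{jd}^T\beta + u_d + \varepsilon_{jd}, \quad j = 1, \dots, n_d, \; d = 1, \dots, D$$

• Small area estimator of the area mean

$$\hat{Y}_d^{LMM} = N_d^{-1}\Big[\sum_{j \in s_d} y_{jd} + \sum_{k \in r_d} \big(x_{kd}^T\beta + u_d\big)\Big]$$

where $s_d$ and $r_d$ denote the sampled and non-sampled units in area d, and $\beta$ and $u_d$ are replaced by their estimates.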


SAE: Relevant Literature - Continuous Case

• Robust Estimation

    • Ghosh et al. (2008, Bioka)
    • Sinha & Rao (2009, CJS)
    • Chambers & Tzavidis (2006, Bioka)
    • Chambers et al. (2013, JRSS B)
    • Dongmo Jiongo et al. (2013, Bioka)

• Empirical Best Predictor (EBP, Molina & Rao, 2010, CJS)


M-Quantile Regression for Continuous Data: A Review

• Regression: model for the mean of y given x → $E(y|x) = x^T\beta$.

• Quantile regression: model for the quantiles of y given x → $Q_y(q|x) = x^T\beta(q)$ (Koenker and Bassett, 1978).

• M-Quantile regression: $Q_y(q|x;\psi) = x^T\beta(q)$ (Breckling and Chambers, 1988).

• An M-type generalization of quantile regression


M-Quantile Estimating Equation

For fixed q and influence function ψ, compute β(q) by solving

$$\sum_{j=1}^{n} \psi_q(r_{jq})\, x_j = 0$$

• residuals: $r_{jq} = (y_j - Q_y(q|x_j;\psi))/\sigma_{q\psi}$;

• $\sigma_{q\psi}$ scale parameter;

• $\psi_q(t) = 2\psi(t)\{q I(t > 0) + (1-q) I(t \le 0)\}$.

• Solved via Iterative Weighted Least Squares (IWLS)
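The IWLS step can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a Huber influence function with tuning constant `c`, a normalised median absolute deviation as the scale estimate, and a least-squares starting value. The function names `huber_psi` and `mquantile_iwls` are hypothetical.

```python
import numpy as np

def huber_psi(t, c=1.345):
    """Huber influence function: identity on [-c, c], constant outside."""
    return np.clip(t, -c, c)

def mquantile_iwls(X, y, q=0.5, c=1.345, tol=1e-8, max_iter=200):
    """Solve sum_j psi_q(r_jq) x_j = 0 for beta(q) by IWLS (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # least-squares start
    for _ in range(max_iter):
        res = y - X @ beta
        # Assumed scale estimate: normalised median absolute deviation.
        scale = max(np.median(np.abs(res - np.median(res))) / 0.6745, 1e-12)
        r = res / scale
        side = np.where(r > 0, q, 1.0 - q)  # q above the fit, 1 - q below
        psi_q = 2.0 * huber_psi(r, c) * side
        # IWLS weights psi_q(r) / r, using the limit 2 * side where r is ~0.
        w = np.divide(psi_q, r, out=2.0 * side, where=np.abs(r) > 1e-10)
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta
```

With q = 0.5 and a very large `c` all weights equal one, so the fit coincides with ordinary least squares; a small `c` moves the fit towards quantile regression.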


Small Area Estimation with M-Quantile Regression

Main idea of SAE with M-Quantile regression

• Quantiles/M-Quantiles used for describing group (domain) heterogeneity

• Similar role to random effects BUT

• Estimation is semiparametric

• If a hierarchical structure does explain part of the variability in the data, units within the same domain will be clustered in the same part of f(y|x)


Small Area Estimation with M-Quantile Regression

[Figure: scatter plot of y against x with fitted M-quantile lines q10, q50, q75, q90 and q99.]


Estimation

• Define $q_{jd}$ such that $y_{jd} = x_{jd}^T\beta(q_{jd})$

• Estimate the individual M-Quantile coefficient $q_{jd}$ by solving

$$y_{jd} = x_{jd}^T\beta(q_{jd})$$

• Estimate the area d M-Quantile coefficient $\theta_d = E[q_{jd}|d]$.

• If there are no sample observations in area d, then set $\theta_d = 0.5$
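One way to carry out these steps numerically is to fit the M-quantile model on a grid of q-values and interpolate. The sketch below assumes that approach, with the area coefficient taken as the plain average of the unit coefficients; the function names, the grid and the input layout are hypothetical and not taken from the slides.

```python
import numpy as np

def unit_q_coefficients(y, fitted, q_grid):
    """Solve y_jd = x_jd^T beta(q) for q by linear interpolation on a grid.

    fitted[j, g] holds x_j^T beta(q_grid[g]); each row is assumed to be
    increasing in g. Values of y outside a row's range are clamped to the
    ends of the grid by np.interp.
    """
    y = np.asarray(y, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    return np.array([np.interp(y_j, f_j, q_grid) for y_j, f_j in zip(y, fitted)])

def area_coefficients(q_unit, area_of_unit, all_areas):
    """Average the unit coefficients within each area; 0.5 for unsampled areas."""
    q_unit = np.asarray(q_unit, dtype=float)
    area_of_unit = np.asarray(area_of_unit)
    theta = {}
    for d in all_areas:
        in_d = area_of_unit == d
        theta[d] = float(q_unit[in_d].mean()) if in_d.any() else 0.5
    return theta
```

For example, with grid (0.1, 0.5, 0.9) and fitted values (0, 10, 20) for one unit, an observed y of 15 is assigned q = 0.7.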


Estimation

• The M-Quantile predictor of $\hat{Y}_d$ is given by

$$\hat{Y}_d^{LMQ} = N_d^{-1}\Big[\sum_{j \in s_d} y_{jd} + \sum_{k \in r_d} Q_y(\theta_d|x_{kd})\Big],$$

where $Q_y(\theta_d|x_{kd}) = x_{kd}^T\beta(\theta_d)$
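The predictor is simple arithmetic once $\beta(\theta_d)$ is available. A minimal sketch, assuming the covariates of all non-sampled units in the area are known; the function and argument names are hypothetical.

```python
import numpy as np

def mq_area_mean(y_sample, X_nonsample, beta_theta, N_d):
    """N_d^{-1} [ sum of sampled y + sum of x_kd^T beta(theta_d) over non-sampled units ]."""
    y_sample = np.asarray(y_sample, dtype=float)
    X_nonsample = np.asarray(X_nonsample, dtype=float)
    beta_theta = np.asarray(beta_theta, dtype=float)
    if len(y_sample) + len(X_nonsample) != N_d:
        raise ValueError("sampled and non-sampled units must add up to N_d")
    predicted = X_nonsample @ beta_theta  # Q_y(theta_d | x_kd) for each k in r_d
    return (y_sample.sum() + predicted.sum()) / N_d
```

With sampled values (1, 2), two non-sampled units with covariates (1, 3) and (1, 5), and coefficients (1, 0.5), the predictions are 2.5 and 3.5 and the area mean is (3 + 6) / 4 = 2.25.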


An Alternative View of Quantile Regression

Quantile Regression - A Parametric Link

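A continuous random variable y follows an asymmetric Laplace distribution, $y \sim ALD(\mu, \sigma, q)$, with pdf

$$p(y|\mu, \sigma, q) = \frac{q(1-q)}{\sigma}\exp\Big\{-\rho_q\Big(\frac{y-\mu}{\sigma}\Big)\Big\}, \qquad \rho_q(t) = |t|\,\{q I(t > 0) + (1-q) I(t \le 0)\}$$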

• Geraci and Bottai (2007): Quantile random effects regression

• p(y, v|β, σ,Γ) = p(y|β, σ, v)p(v|Γ)

• y|v ∼ ALD(Xβ + v, σ)

• p(v|Γ), Normal random effects

• p(v|Γ), Robust random effects
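The parametric link can be checked numerically. The sketch below assumes the standard asymmetric Laplace density with the check function $\rho_q(t) = t\{q - I(t < 0)\}$ in the exponent, and shows that, for fixed σ, maximising the ALD likelihood in µ is the same as minimising the quantile-regression check loss. Function names are hypothetical.

```python
import numpy as np

def check_loss(t, q):
    """Quantile check function rho_q(t) = t * (q - I(t < 0))."""
    t = np.asarray(t, dtype=float)
    return t * (q - (t < 0))

def ald_pdf(y, mu, sigma, q):
    """Asymmetric Laplace density with location mu, scale sigma, skewness q."""
    t = (np.asarray(y, dtype=float) - mu) / sigma
    return q * (1.0 - q) / sigma * np.exp(-check_loss(t, q))

def ald_negloglik(y, mu, sigma, q):
    """Negative log-likelihood; in mu it is the check loss plus constants."""
    return -np.sum(np.log(ald_pdf(y, mu, sigma, q)))
```

Under this density a fraction q of the probability mass lies below µ, which is why µ plays the role of the q-th quantile.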


Quantile/M-Quantile Regression for Counts

1 Machado & Santos Silva (JASA, 2005; Quantile regression; Frequentist paradigm)

2 Efron (JASA, 1992; Expectile regression via Asymmetric Maximum Likelihood)

3 Our proposal - M-Quantile regression: Extending Cantoni & Ronchetti (JASA, 2001)


Quantile Regression for Counts

Machado & Santos Silva (JASA, 2005); Lee & Neocleous (JRSS C, 2010)

The problem with estimating conditional quantiles of counts is caused by the combination of a non-differentiable sample objective function with a discrete dependent variable.

• Jittering: Artificial smoothness by adding noise from a Uniform(0, 1) to the count

• One-to-one relationship between the conditional quantiles of the count and those of the jittered outcome

• Linear model (using Asymmetric Laplace Distribution) for quantiles of jittered outcome
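The one-to-one relationship can be illustrated at the population level for a single Poisson count. This is only a sketch of the idea, not the regression estimator of Machado & Santos Silva: it assumes y ~ Poisson(mu), z = y + U with U ~ Uniform(0, 1) independent of y, and uses the relation Q_y(q) = ceil(Q_z(q) - 1). Function names are hypothetical.

```python
import math

def poisson_pmf(k, mu):
    """P(y = k) for y ~ Poisson(mu), computed on the log scale."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def count_quantile(q, mu):
    """Smallest k with P(y <= k) >= q."""
    k, cdf = 0, poisson_pmf(0, mu)
    while cdf < q:
        k += 1
        cdf += poisson_pmf(k, mu)
    return k

def jittered_quantile(q, mu):
    """q-th quantile of z = y + U: the cdf of z is linear between integers."""
    k = count_quantile(q, mu)
    cdf_below = sum(poisson_pmf(i, mu) for i in range(k))
    return k + (q - cdf_below) / poisson_pmf(k, mu)

def count_from_jittered(qz):
    """Map a quantile of the jittered outcome back to the count scale."""
    return math.ceil(qz - 1)
```

The jittered outcome has a continuous distribution, so its quantiles vary smoothly in q, while the mapping back to the count scale recovers the integer-valued quantile.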


Asymmetric Maximum Likelihood - Expectile Regression

Efron (JASA, 1992)

Asymmetric Maximum Likelihood Estimation: can be seen as the result of smoothing the objective function used to define the quantile regression estimator. Denote by D the deviance under a population model and define

$$D_w(y, \mu) = \begin{cases} D(y, \mu), & y \le \mu \\ w\,D(y, \mu), & y > \mu \end{cases}$$

Solution: minimize $\sum_{i=1}^{n} D_w(y_i, \mu_i)$

• Leads to expectile regression (Newey & Powell, 1987) for counts


Robust Estimation for Generalized Linear Models

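Cantoni & Ronchetti (JASA, 2001)

• $y_j$ from the Exponential Family

• $E(y_j) = \mu_j$; $V(y_j) = V(\mu_j)$; $g(\mu_j) = x_j^T\beta$

• $\sum_{j=1}^{n} \frac{(y_j - \mu_j)}{V(\mu_j)} \frac{\partial}{\partial\beta}\mu_j = 0$

• Large deviations of $y_j$ from $\mu_j$, or leverage points → influence on the estimates

• $\sum_{j=1}^{n} \Big[\psi(r_j)\, w(x_j)\, \frac{1}{V^{1/2}(\mu_j)} \frac{\partial}{\partial\beta}\mu_j - a(\beta)\Big] = 0$ (Huber quasi-likelihood)

• $r_j$ Pearson residuals; $w(x_j)$ controls leverage points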


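M-Quantiles for Count Data: An Estimating Equation Approach

Let $Q_y(q|x_j;\psi) = \exp(x_j^T\beta(q)) = \mu_j(\beta(q))$. Estimate β(q) by using the following estimating equation

$$\sum_{j=1}^{n}\left[\psi_q\left\{\frac{y_j - \mu_j(\beta(q))}{V^{1/2}(\mu_j(\beta(q)))}\right\} w(x_j)\, \frac{1}{V^{1/2}(\mu_j(\beta(q)))} \left(\frac{\partial\mu_j(\beta(q))}{\partial\beta(q)}\right) - a(\beta(q))\right] = 0$$

• Estimation: Fisher scoring algorithm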


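Linking AML with the M-Quantile Extension of Cantoni & Ronchetti

Start with Efron (1992): Minimize

$$D_w(\beta_w) = 2\sum_{j=1}^{n}\big[y_j \log(y_j/\mu_j(\beta_w)) - (y_j - \mu_j(\beta_w))\big]\, w^{I(y_j > \mu_j(\beta_w))}$$

$$\frac{\partial D_w(\beta_w)}{\partial\beta_w} \propto \sum_{j=1}^{n}\big[(y_j - \mu_j(\beta_w))\, x_j\big]\, w^{I(y_j > \mu_j(\beta_w))} = 0$$

w = 1 → corresponds to MLE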
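The AML estimating equation for a Poisson model with log link can be solved by a Newton-type iteration with asymmetric weights. This is a minimal sketch under stated assumptions, not Efron's or the authors' code: the first column of `X` is taken to be an intercept, the starting value is the log of the sample mean, and no step-halving safeguard is included. The function name `aml_poisson` is hypothetical.

```python
import numpy as np

def aml_poisson(X, y, w=1.0, tol=1e-12, max_iter=100):
    """Solve sum_j (y_j - mu_j) x_j w^{I(y_j > mu_j)} = 0 with mu_j = exp(x_j^T beta)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # assumes column 0 is an intercept and mean(y) > 0
    for _ in range(max_iter):
        mu = np.exp(X @ beta)
        a = np.where(y > mu, w, 1.0)           # weight w above the fit, 1 below
        score = X.T @ (a * (y - mu))           # the AML estimating function
        info = X.T @ (X * (a * mu)[:, None])   # weighted Poisson information
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta
```

Setting `w=1.0` reproduces the Poisson maximum likelihood fit; `w > 1` penalises observations above the fit more heavily and so pushes the fitted curve upwards.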


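Linking AML with the M-Quantile Extension of Cantoni & Ronchetti

Consider the extension of Cantoni & Ronchetti:

$$\sum_{j=1}^{n}\left[\psi_q\left\{\frac{y_j - \mu_j(\beta(q))}{V^{1/2}(\mu_j(\beta(q)))}\right\} w(x_j)\, \frac{1}{V^{1/2}(\mu_j(\beta(q)))} \left(\frac{\partial\mu_j(\beta(q))}{\partial\beta(q)}\right) - a(\beta(q))\right] = 0$$

With a large tuning constant in the Huber influence function this becomes

$$\sum_{j=1}^{n}\big[(y_j - \mu_j(\beta(q)))\, w_{qj}\, x_j\big] = 0$$

with

$$w_{qj} = \Big[\Big(\frac{q}{1-q}\Big) I(y_j > \mu_j(\beta(q))) + I(y_j \le \mu_j(\beta(q)))\Big],$$

which corresponds to Efron (1992) with $w = q/(1-q)$.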
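The reduction can be traced term by term for the Poisson case. The following is a sketch under explicit assumptions that go beyond what the slide states: $V(\mu_j) = \mu_j$, a log link so that $\partial\mu_j/\partial\beta(q) = \mu_j x_j$, $w(x_j) = 1$, a tuning constant large enough that $\psi(t) = t$ for every residual, and the correction term $a(\beta(q))$ dropped, as the simplified equation implies. Writing $\mu_j$ for $\mu_j(\beta(q))$ and $r_j = (y_j - \mu_j)/\mu_j^{1/2}$:

```latex
\begin{align*}
\psi_q(r_j)\,\frac{1}{V^{1/2}(\mu_j)}\,\frac{\partial\mu_j}{\partial\beta(q)}
  &= 2\{q I(r_j > 0) + (1-q) I(r_j \le 0)\}\,
     \frac{y_j - \mu_j}{\mu_j^{1/2}}\,\frac{\mu_j x_j}{\mu_j^{1/2}} \\
  &= 2(1-q)\Big\{\frac{q}{1-q}\, I(y_j > \mu_j) + I(y_j \le \mu_j)\Big\}\,(y_j - \mu_j)\,x_j \\
  &= 2(1-q)\, w_{qj}\,(y_j - \mu_j)\,x_j .
\end{align*}
```

Summing over j and dividing by the constant $2(1-q)$ gives the simplified estimating equation, whose weights are 1 below the fit and $q/(1-q)$ above it, matching Efron's $w^{I(y_j > \mu_j)}$ with $w = q/(1-q)$.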


Comparing Different Approaches to Quantile Regression for Counts: An Example

• Generate data under a Poisson model

• $\eta = 0.8 + 0.1x$; $y \sim Poisson(\exp(\eta))$

• Fit M-Quantile regression, AML, Quantile regression (using jittering)


Comparing Different Approaches to Quantile Regression for Counts: An Example

Table: Parameter estimates: comparing different approaches to quantile regression for counts

Est.           MQ (c = 0.8)   QR     MQ (c = 100)   AML
β0, q = 0.25   0.51           0.43   0.57           0.57
β1, q = 0.25   0.08           0.09   0.08           0.08
β0, q = 0.50   0.77           0.62   0.83           0.83
β1, q = 0.50   0.09           0.12   0.08           0.08
β0, q = 0.75   1.10           1.08   1.10           1.10
β1, q = 0.75   0.08           0.08   0.07           0.07


GLMMs for Counts

$$y_{jd}|u_d \sim Poisson(\mu_{jd})$$

with

$$\log(\mu_{jd}) = \eta_{jd} = x_{jd}^T\beta + u_d, \qquad u_d \sim N(0, \Sigma_u)$$


Estimation

The Empirical Conditional Mean Predictor (ECMP) of $Y_d$ is

$$\hat{Y}_d = N_d^{-1}\Big[\sum_{j \in s_d} y_{jd} + \sum_{k \in r_d} y_{kd}\Big],$$

where $y_{kd} = \exp\{\eta_{kd}\}$ and $\eta_{kd} = x_{kd}^T\beta + u_d$

Notes on the use of GLMMs in SAE

• Standard methods for fitting GLMMs can be sensitive to outliers

• Prediction of the random effects with GLMMs is computationally complicated


SAE for Counts Using M-Quantile Regression

$$E[y|x, d] = \exp\{x_{jd}^T\beta(\theta_d)\}$$

• with $\theta_d = E[q_{jd}|d]$,

• and $q_{jd}$ random variables such that

$$y_{jd} = \exp\{x_{jd}^T\beta(q_{jd})\}$$


Estimation

• Estimate the individual M-Quantile coefficient $q_{jd}$ by solving

$$y_{jd} = \exp\{x_{jd}^T\beta(q_{jd})\}$$

• Estimate the area d M-Quantile coefficient $\theta_d = E[q_{jd}|d]$.

• The M-Quantile estimator of $Y_d$ is given by

$$\hat{Y}_d^{MQ} = N_d^{-1}\Big[\sum_{j \in s_d} y_{jd} + \sum_{k \in r_d} y_{kd}\Big], \qquad y_{kd} = \exp\{x_{kd}^T\beta(\theta_d)\}$$

• MSE estimation with bootstrap


Estimation of qjd

For count data, $y_{jd} = Q_y(q_{jd}|x_{jd})$ does not have a solution when $y_{jd} = 0$. We will define $q_{jd}$ as the solution to

$$Q_y(q_{jd}|x_{jd}) = \begin{cases} k(x_{jd}) & y_{jd} = 0 \\ y_{jd} & y_{jd} = 1, 2, \dots \end{cases}$$

where

✗ $k(x_{jd}) = Q_y(q_{min}|x_{jd})$, where $q_{min}$ denotes the smallest q-value in the grid of q-values used. However, this implies that $q_{jd} = q_{min}$ whenever $y_{jd} = 0$, irrespective of the value of $x_{jd}$.

✓ We want an observation with value $y_1 = 0$ to correspond to a smaller q-value than another with value $y_2 = 0$ when $Q_y(0.5|x_1) > Q_y(0.5|x_2)$. A way to achieve this is by setting

$$k(x_{jd}) = \min\{1 - \varepsilon, [Q_y(0.5|x_{jd})]^{-1}\}, \quad \varepsilon > 0.$$
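The target value that replaces a zero count can be computed directly from the fitted median. A minimal sketch, assuming the fitted values $Q_y(0.5|x_{jd})$ are already available as a strictly positive array; the function names and the default value of `eps` are hypothetical.

```python
import numpy as np

def zero_count_target(median_fit, eps=1e-3):
    """k(x) = min{1 - eps, 1 / Q_y(0.5 | x)} for observations with y = 0."""
    median_fit = np.asarray(median_fit, dtype=float)
    return np.minimum(1.0 - eps, 1.0 / median_fit)

def q_equation_target(y, median_fit, eps=1e-3):
    """Right-hand side of Q_y(q_jd | x_jd) = ...: k(x) when y = 0, otherwise y."""
    y = np.asarray(y, dtype=float)
    return np.where(y == 0, zero_count_target(median_fit, eps), y)
```

A zero observed where the fitted median is 4 gets the target 0.25, while a zero observed where the fitted median is 0.5 gets 0.999, so the first is matched to a lower quantile curve, as required.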


Illustration

• Generate data under the Poisson GLMM model

• u ∼ N(0, 1); η = 0.3 + 0.5x+ u; y ∼ Poisson(exp(η))

• Fit MQ regression, estimate area effects

[Figure: scatter plot of y against x for the simulated data, with fitted M-quantile curves q10, q50, q75, q95 and q99.]


Estimation of θd

[Figure: "Estimating M-quantile coefficients". Scatter plot of y against x with fitted M-quantile curves q10, q25, q50, q75 and q95. Annotations: θ1 = 0.11, θ2 = 0.6, u1 = −0.5, u2 = 0.59.]


Prediction for Small Areas

[Figure: "Predictions for two areas using alternative models for counts". y against x. Legend: GLMM predictions; MQ predictions, c = 100; MQ predictions, c = 0.8; ALD model predictions.]


Model-Based Simulations

• Compare: M-Quantile, ECMP & Direct Predictors

• Scenarios follow Ronchetti & Lo (2009, JMA)

Simulation Specifications

• $u \sim N(0, 0.5)$; $x \sim N(0, 1)$

• $y_{jd} \sim Poisson(\mu_{jd})$; $\mu_{jd} = \exp(x + u_d)$

• D = 50, $n_d$ = 10, MC = 500

• Scenario 1 - No contamination

• Scenario 2 - Contamination - 2%, 5%, 10%

• Contamination mechanism, Ronchetti & Lo (2009): $y = (1 - \alpha)\,Poisson(\mu) + \alpha\,Poisson(5\mu)$
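A data-generating sketch for these scenarios follows. It rests on two assumptions that the slide does not settle: the 0.5 in N(0, 0.5) is read as a variance, and the contamination mechanism is read as a mixture in which each unit is drawn from Poisson(5µ) with probability α and from Poisson(µ) otherwise. The function name and return layout are hypothetical.

```python
import numpy as np

def simulate_counts(D=50, n_d=10, alpha=0.0, seed=0):
    """One sample under the Poisson model with area effects and contamination."""
    rng = np.random.default_rng(seed)
    area = np.repeat(np.arange(D), n_d)          # area label of each unit
    u = rng.normal(0.0, np.sqrt(0.5), size=D)    # assumes 0.5 is a variance
    x = rng.normal(0.0, 1.0, size=D * n_d)
    mu = np.exp(x + u[area])
    # Assumed mixture reading: contaminated units have their mean inflated to 5 * mu.
    contaminated = rng.random(D * n_d) < alpha
    y = rng.poisson(np.where(contaminated, 5.0 * mu, mu))
    return area, x, y, contaminated
```

Calling `simulate_counts(alpha=0.05, seed=m)` for m = 0, ..., 499 would give the 500 Monte Carlo samples of the 5% scenario under these assumptions.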


Model-Based Simulations Results

Table: Model-based simulation results - $y = (1 - \alpha)\,Poisson(\mu) + \alpha\,Poisson(5\mu)$

                      Scenario (contamination)
Predictor    0        2%       5%       10%

Mean Values of Bias
ECMP         0.010    0.019    0.004    0.007
MQ           0.011    0.042    -0.060   -0.145
Direct       0.005    0.003    0.002    0.002

Mean Values of MSE
ECMP         0.139    0.340    0.664    1.190
MQ           0.202    0.327    0.556    0.824
Direct       0.719    1.122    1.749    2.820


Final Remarks

1. Explore robust prediction for binary outcomes

2. Extensions to the case where only aggregate data are available - Disease mapping (Chambers, Salvati, & Dreassi, 2013)

3. Define bias-corrected predictors (extension of Chambers et al., 2013, JRSS B)

4. Robust prediction using GLMMs (Maiti, 2001, JSPI; Sinha, 2004, CJS)

5. More work on the numerical stability of estimating $q_{jd}$ is needed

6. Alternative approach to robust prediction for counts via ALD using jittering