
Page 1

Functional Calibration Estimation by the Maximum Entropy on the Mean Principle

Statistics: A Journal of Theoretical and Applied Statistics, 49, 989-1004. 2015.

Santiago Gallón

(Joint work with F. Gamboa and J-M. Loubes)

Departamento de Matemáticas y Estadística

Facultad de Ciencias Económicas − Universidad de Antioquia

Institut de Mathématiques de Toulouse, Université Toulouse III − Paul Sabatier

Seminario Institucional − Escuela de Estadística

Facultad de Ciencias − Universidad Nacional de Colombia

September 5, 2016

Page 2

Outline

The problem of calibration estimation

Functional calibration estimation

Maximum entropy on the mean −MEM− principle

Simulation study

References

Page 3

The problem of calibration estimation I

• Let U_N = {1, . . . , N} be a finite survey population.

• Assume that there is a survey variable Y_i, i ∈ U_N.

The goal: estimate the finite population mean

µ_Y = N⁻¹ Σ_{i∈U_N} Y_i,

based on a sample a ⊂ U_N of n elements.

• The unbiased Horvitz-Thompson −HT− estimator:

µ̂_Y^HT = N⁻¹ Σ_{i∈a} π_i⁻¹ Y_i = N⁻¹ Σ_{i∈a} d_i Y_i,  π_i = P(i ∈ a).

• The HT estimator can have low precision when the sample is badly selected.

• Inference can be improved by modifying the weights d_i = π_i⁻¹.
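As a quick illustration, the HT estimator takes only a few lines. The sketch below is not from the talk: it assumes simple random sampling without replacement, so π_i = n/N for every unit, and all data and names are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n = 1000, 120
Y = rng.normal(5.0, 2.0, size=N)           # survey variable Y_i over U_N (synthetic)

# Under SRSWOR every unit has inclusion probability pi_i = n / N.
pi = np.full(N, n / N)
a = rng.choice(N, size=n, replace=False)   # the sample

# HT estimator: mu_HT = N^{-1} sum_{i in a} d_i Y_i with d_i = 1 / pi_i.
d = 1.0 / pi[a]
mu_ht = np.sum(d * Y[a]) / N
print(mu_ht)                               # unbiased for the population mean Y.mean()
```

Under SRSWOR the HT estimator reduces to the sample mean; unequal π_i would make the d_i differ across units.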

Page 4

The problem of calibration estimation II

• Precision can be improved by incorporating auxiliary information X_i ∈ ℝ^q, i ∈ U_N, q ≥ 1 (Deville and Särndal (1992)).

Sources of auxiliary information:

✓ Census data

✓ Administrative registers

✓ Previous surveys

• Calibration estimation: replace the weights d_i in µ̂_Y^HT by new weights w_i that are close to the d_i according to some distance D(w, d), subject to

N⁻¹ Σ_{i∈a} w_i X_i = µ_X = N⁻¹ Σ_{i∈U_N} X_i  (calibration equation).

• The weights w_i generally yield estimates with smaller variance.

• The calibration estimator of µ_Y is given by

µ̂_Y = N⁻¹ Σ_{i∈a} w_i Y_i.
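The slides do not fix a particular distance D(w, d). The sketch below, on synthetic data, uses the chi-square distance Σ_{i∈a} (w_i − d_i)²/d_i, for which the calibrated weights of Deville and Särndal have the closed form w_i = d_i(1 + X_iᵀλ̂); all variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n, q = 1000, 120, 2
X = rng.normal(size=(N, q)) + np.array([1.0, -0.5])   # auxiliary variables X_i
a = rng.choice(N, size=n, replace=False)
d = np.full(n, N / n)                                 # design weights under SRSWOR

# Chi-square-distance calibration: w_i = d_i (1 + X_i' lam), where lam solves
#   (sum_a d_i X_i X_i') lam = T_X - sum_a d_i X_i,  with T_X = sum_{U_N} X_i.
T_X = X.sum(axis=0)
Xa = X[a]
lam = np.linalg.solve(Xa.T @ (d[:, None] * Xa), T_X - Xa.T @ d)
w = d * (1.0 + Xa @ lam)

print(np.allclose(Xa.T @ w, T_X))   # True: the calibration equation holds exactly
```

Other distances (e.g. raking) give different weight forms but the same calibration constraint.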

Page 5

Functional calibration estimation

Our aim: calibration estimation for µ_Y, where Y(t) ∈ C([0, T]), using functional auxiliary information X(t) ∈ C([0, T]; ℝ^q).

• The functional calibration constraint:

N⁻¹ Σ_{i∈a} w_i(t) X_i(t) = µ_X(t) = N⁻¹ Σ_{i∈U_N} X_i(t),  ∀t ∈ [0, T].

• The weights w_i(t) are obtained by matching the calibration estimation problem with the maximum entropy on the mean principle.

• Example (Chaouch and Goga (2012)): estimating the mean electricity consumption (recorded every 30 min.) during the second week, using first-week consumption as auxiliary information, from 18,902 meters transmitting consumption over the two weeks.

Page 6

Maximum entropy on the mean −MEM− principle I

• Proposed by Navaza (1985, 1986) to solve an inverse problem in crystallography.

• It is an alternative to Tikhonov regularization for ill-posed inverse problems.

• Maximum entropy solutions provide a simple and natural way to incorporate constraints on the solution.

• Consider the linear inverse problem

y = Kx,

where K : X → Y is a known bounded linear operator.

• The MEM approach regards x as the expectation of a random variable X, so that

y = K E_ν(X),

where ν is an unknown probability measure.

Page 7

Maximum entropy on the mean −MEM− principle II

• The solution: x̂ = E_{ν*}(X), where ν* maximizes the entropy

S(ν ‖ υ) = −D(ν ‖ υ)  subject to  y = K E_ν(X),

where D(ν ‖ υ) is the Kullback-Leibler divergence,

D(ν ‖ υ) = ∫ log(dν/dυ) dν  if ν ≪ υ,  and  +∞ otherwise.

• The MEM principle thus reconstructs the measure ν* maximizing S(ν ‖ υ) over feasible measures ν absolutely continuous with respect to a given reference measure υ, subject to a linear constraint.
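For intuition, here is the discrete analogue of D(ν ‖ υ) (a sketch of ours, not code from the paper): it returns +∞ exactly when ν fails to be absolutely continuous with respect to υ.

```python
import numpy as np

# Discrete Kullback-Leibler divergence D(nu || ups); +inf when nu is not
# absolutely continuous w.r.t. ups (some nu_k > 0 where ups_k == 0).
def kl_divergence(nu, ups):
    nu, ups = np.asarray(nu, float), np.asarray(ups, float)
    if np.any((nu > 0) & (ups == 0)):
        return np.inf
    mask = nu > 0                       # 0 * log 0 is taken as 0
    return float(np.sum(nu[mask] * np.log(nu[mask] / ups[mask])))

uniform = np.ones(4) / 4
print(kl_divergence(uniform, uniform))        # 0.0: entropy S = -D is maximal
print(kl_divergence([1, 0, 0, 0], uniform))   # log(4) ~ 1.3863
```

Maximizing S(ν ‖ υ) = −D(ν ‖ υ) therefore pulls ν* as close to the prior υ as the linear constraint allows.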

Page 8

Reconstruction of ν∗ I

• ν* is rebuilt by the random measure method for infinite-dimensional inverse problems (Gzyl and Velásquez (2011)).

• The calibration constraint is expressed as an infinite-dimensional linear inverse problem, writing w_i(t) as

w_i(t) = ∫₀¹ K(s, t) ψ_i(s) ds + d_i  for each i ∈ a,

where K(s, t) is a kernel function, ψ_i(s) ds = E_ν[dW_i(s)], and W is a stochastic process.

Page 9

Reconstruction of ν∗ II

• The inverse problem takes the form of a Fredholm integral equation of the first kind,

E_ν[KW] = E_ν{ Σ_{i∈a} [ ∫₀¹ K(s, t) dW_i(s) + d_i ] X_i(t) }

= ∫₀¹ Σ_{i∈a} K(s, t) X_i(t) ψ_i(s) ds + Σ_{i∈a} d_i X_i(t)

= N µ_X(t).

• The functions ψ*_i(s) that solve E_ν[KW] = N µ_X(t) are obtained by regarding ψ_i(s) as the density of the measure

dW_i(s) = ψ_i(s) ds,  i ∈ a.

Page 10

The main theorem

Theorem (Csiszár (1984)). Let υ be a prior probability measure, λ = λ(t) a measure in the class M(C([0, 1]; ℝ^q)), and V = {ν ≪ υ : Z_υ(λ) < +∞}, where Z_υ(λ) = E_υ[exp{⟨λ, KW⟩}], with

⟨λ, KW⟩ = ∫₀¹ λᵀ(dt) ( ∫₀¹ Σ_{i∈a} K(s, t) X_i(t) dW_i(s) + Σ_{i∈a} d_i X_i(t) ).

Then there exists a unique probability measure

ν* = argmax_{ν∈V} S(ν ‖ υ)  subject to  E_ν[KW] = N µ_X(t),

which is achieved at

dν*/dυ = Z_υ(λ*)⁻¹ exp{⟨λ*, KW⟩},

where λ*(t) minimizes the functional

H_υ(λ) = log Z_υ(λ) − ⟨λ, N µ_X⟩.

Page 11

Our results

Lemma 1. Let υ be a prior centered stationary Gaussian measure on (C([0, 1]), B(C([0, 1]))), and λ(t) ∈ M(C([0, 1])^q). Then w_i(t) = ∫₀¹ K(s, t) ψ*(s) ds + d_i, i ∈ a, where

ψ*(s) = Σ_{i′∈a} ∫₀¹ K(s, t′) X_{i′}ᵀ(t′) λ*(dt′).

Lemma 2. Let W_i(s) = Σ_{k=1}^{N(s)} ξ_{ik} be a compound Poisson process, where N(s) is a Poisson process on [0, 1] with intensity γ > 0 and the ξ_{ik}, k ≥ 1, are i.i.d. random variables for each i ∈ a, with distribution u, independent of N(s). Then w_i(t) = ∫₀¹ K(s, t) ψ*_i(s) ds + d_i, i ∈ a, where

ψ*_i(s) = γ ∫_ℝ ξ_i exp{ ξ_i Σ_{i′∈a} ∫₀¹ K(s, t′) X_{i′}ᵀ(t′) λ*(dt′) } u(dξ_i).
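Lemma 1 makes the Gaussian case directly computable: on a grid, the calibration equation becomes a linear system in λ*. The sketch below is our own finite-dimensional illustration with a scalar auxiliary curve (q = 1) and synthetic data; the grid size, kernel bandwidth, and variable names are our assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, N = 40, 12, 100
t = np.linspace(0, 1, m)
dt = t[1] - t[0]
K = np.exp(-(t[:, None] - t[None, :]) ** 2 / (2 * 0.01))   # kernel on the grid

# Synthetic scalar auxiliary curves X_i(t) = U_i + f(t); SRSWOR design weights.
U = 0.5 * rng.normal(size=(N, 1))
X = U + 2.0 + np.sin(2 * np.pi * t)[None, :]               # (N, m), q = 1
a = rng.choice(N, size=n, replace=False)
d = N / n
mu_X = X.mean(axis=0)
S = X[a].sum(axis=0)                                       # sum_{i in a} X_i(t)

# Discretized Lemma 1 (q = 1): with psi*(s) = sum_{i'} int K(s,t') X_{i'}(t')
# lam*(dt'), the calibration equation reads, per grid point t_j,
#   S(t_j) * (A c)(t_j) + d S(t_j) = N mu_X(t_j),  c_k = S_k lam_k dt,
# where A[j, k] ~ int K(s, t_j) K(s, t_k) ds.
A = K @ K * dt
c = np.linalg.lstsq(A, (N * mu_X - d * S) / S, rcond=None)[0]
w = A @ c + d          # w(t) = int K(s,t) psi*(s) ds + d, common to all i here

err = np.max(np.abs(w * S - N * mu_X))   # residual of the calibration equation
print(err)
```

On this toy grid the residual is small, illustrating that the Gaussian prior turns the MEM problem into linear algebra.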

Page 12

Simulation study I

Y_i(t) = α(t) + X_i(t)ᵀ β(t) + ε_i(t),  i ∈ U_N,

where

• α(t) = 1.2 + 2.3 cos(2πt) + 4.2 sin(2πt),

• β(t) = (β₁(t), β₂(t))ᵀ with β₁(t) = cos(10t) and β₂(t) = t sin(15t),

• ε_i(t) ~ i.i.d. N(0, σ²_ε(1 + t)) with σ²_ε = 0.1, independent of X_i(t),

• X_i(t) = (X_{i1}(t), X_{i2}(t))ᵀ, where X_{i1}(t) = U_{i1} + f₁(t) and X_{i2}(t) = U_{i2} + f₂(t), with f₁(t) = 3 sin(3πt + 3), f₂(t) = −cos(πt), U_{i1} ~ i.i.d. U[−1, 1.3], and U_{i2} ~ i.i.d. U[−0.5, 0.5],

• a uniform fixed-size sampling design: a sample a ⊂ U_N of n = 0.12N elements drawn without replacement, with N = 1000,

• K(t, s) = exp{−|t − s|²/(2σ²)} with σ² = 0.5,

• for the compound Poisson case, ξ_i ~ i.i.d. U[−1, 1] and γ = 1.
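The design above is fully specified, so the simulated population can be generated directly; a sketch (the grid size m and the seed are our choices):

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 1000, 101                 # population size; grid resolution is our choice
t = np.linspace(0, 1, m)

alpha = 1.2 + 2.3 * np.cos(2 * np.pi * t) + 4.2 * np.sin(2 * np.pi * t)
beta1, beta2 = np.cos(10 * t), t * np.sin(15 * t)

U1 = rng.uniform(-1.0, 1.3, size=(N, 1))
U2 = rng.uniform(-0.5, 0.5, size=(N, 1))
X1 = U1 + 3 * np.sin(3 * np.pi * t + 3)
X2 = U2 - np.cos(np.pi * t)
eps = rng.normal(0.0, np.sqrt(0.1 * (1 + t)), size=(N, m))   # Var = 0.1 (1 + t)

Y = alpha + X1 * beta1 + X2 * beta2 + eps    # (N, m): one curve per unit
mu_Y = Y.mean(axis=0)                        # finite population mean curve
```

Each row of Y is one population curve Y_i(t); estimators are then compared against mu_Y over repeated samples a.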

Page 13

Simulation study II

Table: Bias-variance decomposition of the MSE

Functional estimator    MSE      Bias²    Variance
Horvitz-Thompson        0.2391   0.0005   0.2386
MEM (Gaussian)          0.2001   0.0006   0.1995
MEM (Poisson)           0.2333   0.0084   0.2249
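The decomposition in the table is the standard identity MSE = Bias² + Variance over replicated samples. A self-contained check on a scalar toy population (not the paper's functional setting; all numbers are synthetic):

```python
import numpy as np

rng = np.random.default_rng(3)
N, n, R = 1000, 120, 500
Y = rng.gamma(2.0, 2.0, size=N)          # one fixed finite population
mu = Y.mean()                            # the target

# R replicated SRSWOR samples, one sample-mean estimate per replication.
est = np.array([Y[rng.choice(N, n, replace=False)].mean() for _ in range(R)])

mse = np.mean((est - mu) ** 2)
bias2 = (est.mean() - mu) ** 2
var = est.var()                          # ddof=0, matching the identity exactly
print(mse, bias2 + var)                  # the two agree: MSE = Bias^2 + Variance
```

For the functional estimators, the same decomposition is applied pointwise in t and integrated over [0, 1].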

Page 14

[Figure: simulated auxiliary curves X_{i1}(t) and X_{i2}(t) over [0, 1], and mean curves of Y_i(t) estimated under the Gaussian and the compound Poisson measures: theoretical, HT, and MEM.]

Page 15

References

M. Chaouch and C. Goga. Using complex surveys to estimate the l1-median of a functional variable: application to electricity load curves. International Statistical Review, 80:40–59, 2012.

I. Csiszár. I-divergence geometry of probability distributions and minimization problems. The Annals of Probability, 3:146–158, 1975.

I. Csiszár. Sanov property, generalized I-projection and a conditional limit theorem. The Annals of Probability, 12:768–793, 1984.

J. C. Deville and C. E. Särndal. Calibration estimators in survey sampling. Journal of the American Statistical Association, 87:376–382, 1992.

H. Engl, M. Hanke, and A. Neubauer. Regularization of Inverse Problems, volume 375 of Mathematics and Its Applications. Kluwer Academic Publishers, Dordrecht, 2000.

W. A. Fuller. Sampling Statistics. Wiley Series in Survey Methodology. Wiley, 2009.

F. Gamboa. Méthode du maximum d'entropie sur la moyenne et applications. PhD thesis, Université de Paris-Sud, Orsay, 1989.

H. Gzyl and Y. Velásquez. Linear Inverse Problems: The Maximum Entropy Connection, volume 83 of Series on Advances in Mathematics for Applied Sciences. World Scientific, 2011.

D. G. Horvitz and D. J. Thompson. A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47:663–685, 1952.

J. Navaza. On the maximum entropy estimate of the electron density function. Acta Crystallographica, A41:232–244, 1985.

J. Navaza. The use of non-local constraints in maximum-entropy electron density reconstruction. Acta Crystallographica, A42:212–223, 1986.

Page 16

Thanks!!!

Page 17

Z_υ(λ) = exp{ E_υ[⟨λ, KW⟩] + (1/2) Var_υ[⟨λ, KW⟩] }

= exp{ Σ_{i∈a} d_i ∫₀¹ λᵀ(dt) X_i(t) + (1/2) ∫₀¹ ( Σ_{i∈a} ∫₀¹ K(s, t) λᵀ(dt) X_i(t) )² ds },

since E_υ[dW_i(s)] = 0 and Var_υ[dW_i(s)] = ds, i ∈ a.

Page 18

H_υ(λ) = log Z_υ(λ) − ⟨λ, N µ_X⟩

= (1/2) ∫₀¹ ( Σ_{i∈a} ∫₀¹ K(s, t) λᵀ(dt) X_i(t) ) ( Σ_{i′∈a} ∫₀¹ K(s, t′) λᵀ(dt′) X_{i′}(t′) ) ds + ∫₀¹ λᵀ(dt) ( Σ_{i∈a} d_i X_i(t) − N µ_X(t) ).

The minimizer λ*(dt) of H_υ(λ) satisfies

Σ_{i∈a} Σ_{i′∈a} ∫₀¹ ∫₀¹ K(s, t) K(s, t′) X_i(t) X_{i′}ᵀ(t′) λ*(dt′) ds + Σ_{i∈a} d_i X_i(t) = N µ_X(t),

which can be rewritten as

Σ_{i∈a} [ ∫₀¹ K(s, t) ( Σ_{i′∈a} ∫₀¹ K(s, t′) X_{i′}ᵀ(t′) λ*(dt′) ) ds + d_i ] X_i(t) = N µ_X(t).

Page 19

The increments of W_i over (a, b] ⊂ [0, 1] are

m_i((a, b]) := W_i(b) − W_i(a) = Σ_{k=N(a)+1}^{N(b)} ξ_{ik}.

By the Lévy-Khintchine formula for Lévy processes, the m.g.f. of W(s) is

E_υ[exp{⟨α, W(s)⟩}] = exp{ γ s ∫_{ℝⁿ} ( e^{⟨α, ξ⟩} − 1 ) u(dξ) },  α ∈ ℝⁿ,

so that

E_υ[exp{⟨g(s), dW_i⟩}] = exp{ γ ∫₀¹ ds ∫_ℝ ( exp{g(s) ξ_i} − 1 ) u(dξ_i) },

where

⟨λ, KW⟩ = ∫₀¹ λᵀ(dt) ∫₀¹ Σ_{i∈a} K(s, t) X_i(t) dW_i(s) + ∫₀¹ λᵀ(dt) Σ_{i∈a} d_i X_i(t)

and g(s) = ∫₀¹ λᵀ(dt) Σ_{i∈a} K(s, t) X_i(t).
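The Lévy-Khintchine identity above can be checked empirically for the paper's simulation setting (ξ ~ U[−1, 1], γ = 1). The sketch below is ours; it uses the closed form E[e^{αξ}] = sinh(α)/α for uniform jumps, and the tolerance is a Monte Carlo assumption.

```python
import numpy as np

rng = np.random.default_rng(4)
gam, s, alpha, R = 1.0, 1.0, 0.8, 100_000

# Simulate W(s) = sum_{k=1}^{N(s)} xi_k, N(s) ~ Poisson(gam * s), xi_k ~ U[-1, 1].
counts = rng.poisson(gam * s, size=R)
W = np.array([rng.uniform(-1.0, 1.0, size=c).sum() for c in counts])

emp = np.exp(alpha * W).mean()
# Levy-Khintchine: E exp(alpha W(s)) = exp(gam * s * (E[e^{alpha xi}] - 1)),
# with E[e^{alpha xi}] = sinh(alpha) / alpha for xi ~ U[-1, 1].
theory = np.exp(gam * s * (np.sinh(alpha) / alpha - 1.0))
print(emp, theory)
```

The empirical m.g.f. matches the exponent γs ∫(e^{αξ} − 1) u(dξ) up to Monte Carlo error.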

Page 20

Z_υ(λ) = exp{ γ ∫₀¹ ds ∫_ℝ ( exp{g(s) ξ_i} − 1 ) u(dξ_i) + ⟨λ, Σ_{i∈a} d_i X_i(t)⟩ }

H_υ(λ) = γ ∫₀¹ ds ∫_ℝ ( exp{ ∫₀¹ λᵀ(dt) Σ_{i∈a} K(s, t) ξ_i X_i(t) } − 1 ) u(dξ_i) + ∫₀¹ λᵀ(dt) ( Σ_{i∈a} d_i X_i(t) − N µ_X(t) ).

The minimizer λ*(dt) satisfies

Σ_{i∈a} [ ∫₀¹ K(s, t) ( γ ∫_ℝ ξ_i exp{ ξ_i Σ_{i′∈a} ∫₀¹ K(s, t′) X_{i′}ᵀ(t′) λ*(dt′) } u(dξ_i) ) ds + d_i ] X_i(t) = N µ_X(t).