Modeling and Inverse Problems in the Presence of Uncertainty

Lecture 1. Probability Measure Estimation in Nonparametric Models (Chapter 5 of BanksHuThompson)

Lecture 2. Propagation of Uncertainty in Continuous Time Dynamical Systems (Chapter 7 of BanksHuThompson)

Modeling and Inverse Problems in the Presence of Uncertainty

Authors/Affiliations

H. T. Banks, North Carolina State University, Raleigh, USA

Shuhua Hu, North Carolina State University, Raleigh, USA

W. Clayton Thompson, North Carolina State University, Raleigh, USA

This book collects recent research—including the authors’ own substantial projects—on uncertainty propagation and quantification. It covers two sources of uncertainty: where uncertainty is present primarily due to measurement errors and where uncertainty is present due to the modeling formulation itself. With many examples throughout addressing problems in physics, biology, and other areas, the book is suitable for applied mathematicians as well as scientists in biology, medicine, engineering, and physics.

Key Features

• Reviews basic probability and statistical concepts, making the book self-contained

• Presents many applications and theoretical results from engineering, biology, and physics

• Covers the general relationship of differential equations driven by white noise (stochastic differential equations) and the ones driven by colored noise (random differential equations) in terms of their resulting probability density functions

• Describes the Prohorov metric framework for nonparametric estimation of a probability measure

• Contains numerous examples and end-of-chapter references to research results, including the authors’ technical reports that can be downloaded from North Carolina State University’s Center for Research in Scientific Computation

Selected Contents Introduction. Probability and Statistics Overview. Mathematical and Statistical Aspects of Inverse Problems. Model Comparison Criteria. Estimation of Probability Measures Using Aggregate Population Data. Optimal Design. Propagation of Uncertainty in a Continuous Time Dynamical System. A Stochastic System and Its Corresponding Deterministic System. Frequently Used Notations and Abbreviations. Index.


Catalog no. K21506 April 2014, 405 pp.

ISBN: 978-1-4822-0642-5 $89.95 / £57.99

http://www.ncsu.edu/crsc/reports.html

Lecture 1: Probability Measure Estimators in Nonparametric Models

H. T. BANKS

Center for Research in Scientific Computation

Center for Quantitative Sciences in Biomedicine

N. C. STATE UNIVERSITY

Raleigh, N. C.

Uncertainty Quantification Summer School

Univ. Southern California

Los Angeles

August 11-13, 2014

UNCERTAINTY IN INVERSE PROBLEMS IMPORTANT IN:

1) Data acquisition: (sensor) observation error

2) Modeling intra- and inter-individual variability in systems and data

Applications from composite materials, biology, E&M, and non-destructive testing where one seeks “effective” parameters averaged over variability or heterogeneity

Random (stochastic) parameters and mechanisms (to treat inter-individual variability in aggregate data and observations): “random effects”/“mixing distributions” modeling

Stochastic models, aggregate dynamics

OUTLINE

Inverse Problems with Uncertainty

i) Individual vs. Aggregate Data

ii) Individual Dynamics: mixing distributions in statistical inverse problems (NPML) or Prohorov Based Methods (PMF)

iii) Aggregate Dynamics: measure dependent dynamics and PMF (Prohorov Metric Framework)

Ref: H.T. Banks, S. Hu and W.C. Thompson, Modeling and Inverse Problems in the Presence of Uncertainty, Taylor/Francis-Chapman/Hall-CRC Press, Boca Raton, FL, 2014.

Prohorov Based Methods (PMF) in Inverse or Parameter Estimation Problems

• Developed in mid 1980’s thru 2010’s

• Useful in

i) Individual dynamics/individual data formulations (patient care)

ii) Individual dynamics/aggregate data formulations (mosquitofish, shrimp immune response in biodefense apps, PBPK cellular models)

iii) Aggregate dynamics/aggregate data: HIV-cellular progression models, E&M with polarization (biotissue), viscoelastic materials, NDE in materials

GENERIC INVERSE PROBLEM I: Aggregate Data-Individual Dynamics

Aggregate Data: $d_i \sim \mathcal{E}[C x(t_i; q) : P]$

Individual Dynamics: $\dfrac{dx}{dt} = f(t, x(t), q), \quad q \in Q$

where f can represent ordinary, functional, or partial differential equation

Minimize

$$J(P) = \sum_i \left| \mathcal{E}[C x(t_i; q) : P] - d_i \right|^2$$

over $P \in \mathcal{P}(Q)$ = probability measures over Q

Includes as special cases usual problems with constant R.V.’s (i.e., usual vector or function space parameters, as opposed to Random Differential Equations)

Here

$$x(t; P) = \mathcal{E}[x(t; q) : P] \equiv \int_Q x(t; q)\, dP(q)$$

In this case, one has individual dynamics for each realization q of a random variable with distribution P. One solves the system many times for these realizations and then computes the expected value of x with respect to P. This is then used with the data in the estimation, i.e., optimization of

$$J(P) = \sum_i \left| \mathcal{E}[C x(t_i; q) : P] - d_i \right|^2$$

over $P \in \mathcal{P}(Q)$ in Ordinary Least Squares (OLS)
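The aggregate-data cost can be illustrated with a short Python sketch. Everything specific in it is an assumption made for illustration only: the individual model dx/dt = -q x with x(0) = 1 (so that x(t; q) = exp(-q t) in closed form), the scalar observation C = 1, the discrete two-point measure, and the names `x_individual`, `expected_state`, and `ols_cost`.

```python
import math

def x_individual(t, q, x0=1.0):
    """Solution of the hypothetical individual model dx/dt = -q*x, x(0) = x0."""
    return x0 * math.exp(-q * t)

def expected_state(t, nodes, weights):
    """E[x(t; q) : P] for a discrete P = sum_j weights[j] * delta_{nodes[j]}."""
    return sum(p * x_individual(t, q) for q, p in zip(nodes, weights))

def ols_cost(times, data, nodes, weights):
    """J(P) = sum_i |E[C x(t_i; q) : P] - d_i|^2 with C = 1 (scalar observation)."""
    return sum((expected_state(t, nodes, weights) - d) ** 2
               for t, d in zip(times, data))

# Synthetic, noise-free aggregate data generated from a two-point measure
# (illustrative values only, not data from any study).
times = [0.0, 0.5, 1.0, 2.0]
true_nodes, true_weights = [0.5, 2.0], [0.3, 0.7]
data = [expected_state(t, true_nodes, true_weights) for t in times]

cost_true = ols_cost(times, data, true_nodes, true_weights)   # generating measure
cost_wrong = ols_cost(times, data, true_nodes, [0.5, 0.5])    # different weights
```

The individual model is solved once per node q_j, and only the weighted average enters the cost, which is the sense in which the data are aggregate.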

Applied Math literature

**HTB, L. Botsford, F. Kappel, C. Wang - Proc. Math Ecology, Trieste, 1986

HTB, LB, FK, CW - Proc. 5th IFAC, Perpignan, 1989

**HTB, B. Fitzpatrick - CAMS-TR90-2; Quart. Appl. Math, 49 (1991), p. 215-235

HTB - CRSC-TR92-11, 1992

BF - CRSC-TR93-20, 1993

HTB, BF, Y. Zhang - CRSC-TR94-12, 1994

HTB, L. Potter, YZ - Memoria Congress Biomath, Panama, 1997

HTB, BF, LP, YZ - CRSC-TR98-06, 1998

HTB - CRSC-TR98-39, 1998; Math. Comp. Mod 33 (2001), 39-47

**HTB, K. Bihari - CRSC-TR99-40, 1999; Inv. Prob. 17 (2001), p. 1-17

Initial efforts carried out in context of control of mosquitofish populations: Sinko-Streifer size-structured population models

Mixing densities: $v(t, x)$,

$$\frac{\partial v}{\partial t} + \frac{\partial}{\partial x}(g v) = -\mu v, \quad x_0 < x < x_1, \; t > 0$$

$$v(0, x) = \Phi(x)$$

$$g(t, x_0)\, v(t, x_0) = \int_{x_0}^{x_1} k(t, \xi)\, v(t, \xi)\, d\xi$$

$$g(t, x_1) = 0$$

t = time, x = size, $x_0$ = min size, $x_1$ = max size, k = fecundity

Individual growth dynamics: $\dfrac{dx}{dt} = g(t, x)$

$v(t, x; g)$, $g \in \mathcal{G}$, correspond to individual growth rates

$\mathcal{G}$ = admissible set of growth rates, compact $\mathcal{G} \subset H^1(x_0, x_1)$

Total population density

$$u(t, x) = \int_{\mathcal{G}} v(t, x; g)\, dP(g)$$

where P is a probability measure on $\mathcal{G}$

Distributions of growth rates produce dispersion and cohort development in total population

Needs: (to carry out a careful mathematical analysis)

i) Topology on $\mathcal{P} = \mathcal{P}(Q)$

ii) Continuity of $P \to J(P)$

iii) Compactness of $\mathcal{P}(Q)$

Brief summary of theory

Possible topologies on $\mathcal{P}(Q)$: Levy (R), Prohorov (Q), Bounded Lipschitz (Q); Total variation (Q), Kolmogorov (R)

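Let $(Q, d)$ be a complete metric space. For any closed $F \subset Q$ and $\varepsilon > 0$, define

$$F^{\varepsilon} = \{ q \in Q : d(\tilde{q}, q) < \varepsilon \text{ for some } \tilde{q} \in F \}.$$

Then define the Prohorov metric $\rho : \mathcal{P}(Q) \times \mathcal{P}(Q) \to \mathbb{R}^+$ by

$$\rho(P_1, P_2) \equiv \inf\{ \varepsilon > 0 : P_1[F] \le P_2[F^{\varepsilon}] + \varepsilon, \; F \text{ closed}, \; F \subset Q \}.$$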

RANDOM VARIABLES and ASSOCIATED METRIC SPACES

$\mathcal{P}(Q) = \{P : P \text{ are probability measures on } Q\}$ is a metric space with the Prohorov metric $\rho$. $(\mathcal{P}(Q), \rho)$ is a complete metric space and it is compact if Q is compact.

PROHOROV METRIC (weak* convergence for $\mathcal{P}(Q) \subset C^*(Q)$)

$$\rho(P_k, P) \to 0 \;\Leftrightarrow\; \int_Q g\, dP_k \to \int_Q g\, dP \text{ for all } g \in C(Q) \quad \text{(convergence in expectation)}$$

$$\Leftrightarrow\; P_k[A] \to P[A] \text{ for all Borel } A \subset Q \text{ with } P[\partial A] = 0$$

For details on Prohorov metric and an approximation theory, see [1].

[1] H.T. Banks and K.L. Bihari, Modeling and estimating uncertainty in parameter estimation, CRSC-TR99-40, NCSU, Dec., 1999; Inverse Problems 17 (2001), 1-17.

GENERAL THEORETICAL FRAMEWORK

Application here to ODE systems that include population models

System:

$$\frac{dx}{dt} = f(t, x(t), q), \quad q \in Q, \qquad x(0) = x_0$$

Argue that $(t, x, q) \to f(t, x, q)$ is continuous from $[0, T] \times \mathbb{R}^n \times Q$ to $\mathbb{R}^n$, locally Lipschitz in x.

Then by "standard" continuous dependence on parameters results for ODE, we obtain that $q \to x(t; q)$ is continuous from Q to $\mathbb{R}^n$ for each t.

This yields that

$$P \to J(P) = \sum_i \left| \mathcal{E}[C x(t_i; q) : P] - d_i \right|^2$$

is continuous from $\mathcal{P}(Q)$ to $\mathbb{R}$ with respect to the Prohorov metric $\rho$, and $(\mathcal{P}(Q), \rho)$ is compact. Then the general theory of Banks-Bihari (Inverse Problems, 2001) for inverse problems can be followed to obtain existence and stability (continuous dependence with respect to observations) of solutions of the inverse problem. Moreover, an approximation theory as a basis for computational methods is obtained.

METHOD STABILITY UNDER APPROXIMATION

Let $Q_M = \{q_j^M\}_{j=1}^{M} \subset Q$ be such that $\bigcup_M Q_M$ is dense in Q. Define

$$\mathcal{P}^M(Q) = \Big\{ P \in \mathcal{P}(Q) : P = \sum_{j=1}^{M} p_j \delta_{q_j^M}, \; q_j^M \in Q_M, \; p_j \in \mathbb{R}, \; p_j \ge 0 \Big\}.$$

Let $d^k = \{d_i^k\}$, $\hat{d} = \{\hat{d}_i\}$ be sets of data observations such that $d^k \to \hat{d}$.

Define $P^*_M(d^k)$ = set of minimizers for $J^k(P)$ (the cost functional with data $d^k$) over $\mathcal{P}^M(Q)$, and $P^*(\hat{d})$ = set of minimizers for $J(P)$ over $\mathcal{P}(Q)$. Let dist(A, B) be the Hausdorff distance between sets A and B.

Theorem: $\mathrm{dist}(P^*_M(d^k), P^*(\hat{d})) \to 0$ as $M \to \infty$, $d^k \to \hat{d}$, so that solutions depend continuously on data and approximate problems are "method stable".

EXAMPLE

Uptake of trichloroethylene (TCE) in fat tissue (PBPK models)

PBPK Models for TCE in Fat Cells

Millions of cells with varying size, residence time, vasculature, geometry: “Axial-dispersion” type adipose tissue compartments to embody uncertain physiological heterogeneities in single organism (rat) = intra-individual variability

Inter-individual variability treated with parameters (including dispersion parameters) as random variables; estimate distributions from aggregate data (multiple rat data) which also contains uncertainty (noise)

Whole-body system of equations: ordinary differential equations for the compartment concentrations (subscripts v, br, k, l, m, t) coupled with dispersion-type partial differential equations for the adipose concentrations $C_B$, $C_I$, $C_A$, plus boundary conditions and initial conditions (the full system is given in reference 1 of the list that follows this example)

Estimation of Non Parameterized Distributions in PBPK/TCE Models

No prior assumptions as to form (shape) of distribution

Use the Prohorov based approximation and convergence theory!!

Example: Bimodal Gaussian $P^*$: $\mu_1 = 1$, $\sigma_1 = .1667$, $\mu_2 = 3$, $\sigma_2 = .2$

Figure: curves labeled P* and P32.

References:

1) R.A. Albanese, H.T. Banks, M.V. Evans, and L.K. Potter, PBPK models for the transport of trichloroethylene in adipose tissue, CRSC-TR01-03, NCSU, Jan. 2001; Bull. Math Biology, 64 (2002), p. 97-131.

2) H.T. Banks and L.K. Potter, Well-posedness results for a class of toxicokinetic models, CRSC-TR01-18, NCSU, July, 2001; Dynamical Systems and Applications, 14 (2005), p. 297-322.

3) L.K. Potter, Physiologically based pharmacokinetic models for the systemic transport of Trichloroethylene, Ph.D. Thesis, NCSU, August, 2001.

4) H.T. Banks and L.K. Potter, Model predictions and comparisons for three toxicokinetic models for the systemic transport of TCE, CRSC-TR01-23, NCSU, August, 2001; Mathematical and Computer Modeling, 35 (2002), p. 1007-1032.

5) H.T. Banks and L.K. Potter, Probabilistic methods for addressing uncertainty and variability in biological models: Application to a toxicokinetic model, CRSC-TR02-27, Sept., 2002; Math. Biosci., 192 (2004), p. 193-225.

GENERIC INVERSE PROBLEM II: “Aggregate” dynamics/aggregate data

Data: $d_i \sim C \bar{x}(t_i; P)$

Dynamics: $\dfrac{d\bar{x}}{dt} = g(t, \bar{x}(t), P)$

where g can represent ordinary, functional, or partial differential equation and $\bar{x}$ is the expected value of “states” x

Minimize

$$J(P) = \sum_i \left| C \bar{x}(t_i; P) - d_i \right|^2$$

over $P \in \mathcal{P}(Q)$ = probability measures over Q

Remark: Individual dynamics may not be available

GENERAL THEORETICAL FRAMEWORK

Application here to FDE systems that include HIV models discussed below

System:

$$\frac{dx}{dt} = f(t, x_t, P), \quad P \in \mathcal{P}(Q), \qquad x_0 = \phi, \quad x_t(\theta) = x(t + \theta), \; -r \le \theta \le 0$$

Argue that $(t, \psi, P) \to f(t, \psi, P)$ is continuous from $[0, T] \times C[-r, 0] \times \mathcal{P}(Q)$ to $\mathbb{R}^n$, locally Lipschitz in $\psi$.

Then by "continuous dependence on parameters" results for FDE's (extension of standard ODE results to FDE's; see [HTB, SIAM J. Control 1968, 1969]), we obtain that $P \to x(t; P)$ is continuous from $\mathcal{P}(Q)$ to $\mathbb{R}^n$ for each t. Then one proceeds with the theory outlined above.

EXAMPLE

HIV pathogenesis in infection

Involves systems of equations of the form (generally nonlinear)

$$\frac{dV}{dt}(t) = -cV(t) + n_a A(t - \tau) + n_c C(t) - n_v V(t) T(t)$$

where $\tau$ is a production delay (distributed across the population of cells). That is, one should write

$$\frac{dV}{dt}(t) = -cV(t) + n_a \int_0^{\infty} A(t - \tau)\, k(\tau)\, d\tau + n_c C(t) - n_v V(t) T(t)$$

where k is a probability density to be estimated from aggregate data.

Even if k is given, these systems are nontrivial to simulate and require development of fundamental techniques.

HIV Model:

$$\dot{V}(t) = -cV(t) + n_A \int_0^r A(t - \tau)\, dP_1(\tau) + n_C C(t) - p(V, T)$$

$$\dot{A}(t) = (r_v - \delta_A - \delta X(t)) A(t) - \gamma \int_0^r A(t - \tau)\, dP_2(\tau) + p(V, T)$$

$$\dot{C}(t) = (r_v - \delta_C - \delta X(t)) C(t) + \gamma \int_0^r A(t - \tau)\, dP_2(\tau)$$

$$\dot{T}(t) = (r_u - \delta_u - \delta X(t)) T(t) - p(V, T) + S$$

where

$\tau_1$ = delay from acute infection to viral production, $\tau_2$ = delay from acute infection to chronic infection

$$V(t) = \mathcal{E}_1[V(t; \tau_1)] = \int_0^r V(t; \tau_1)\, dP_1(\tau_1)$$

$$C(t) = \mathcal{E}_2[C(t; \tau_2)] = \int_0^r C(t; \tau_2)\, dP_2(\tau_2)$$

A = acute cells, T = target cells, X = A + C + T = total (infected+uninfected) cells

Typical model variables:

V = infectious viral population (expected values)

A = acutely infected cells

C = chronically infected cells (expected values)

T = uninfected target cells

X = A+C+T = total cell population

Some models (especially those involving treatment and control) also entail immune response variables

References:

1) D. Bortz, R. Guy, J. Hood, K. Kirkpatrick, V. Nguyen, and V. Shimanovich, Modeling HIV infection dynamics using delay equations, in 6th CRSC Industrial Math Modeling Workshop for Graduate Students, NCSU (July, 2000), CRSC-TR00-24, NCSU, Oct., 2000.

2) H.T. Banks, D.M. Bortz, and S.E. Holte, Incorporation of variability into the modeling of viral delays in HIV infection dynamics, CRSC-TR01-25, Sept., 2001; Math Biosciences, 183 (2003), p. 63-91.

3) H.T. Banks, Incorporation of uncertainty in inverse problems, CRSC-TR02-08, March, 2002; in Proc. Intl. Conf. Inverse Problems (Hong Kong, Jan. 9-12, 2002), World Scientific Press.

4) H.T. Banks and D.M. Bortz, A parameter sensitivity methodology in the context of HIV delay equation models, CRSC-TR02-24, August, 2002; J. Math Biology, 50 (2005), p. 607-625.

5) D.M. Bortz, Modeling, Analysis, and Estimation of an in vivo HIV Infection Using Functional Differential Equations, Ph.D. Thesis, N.C. State University, August, 2002.

Aggregate Dynamics--Other Applications:

i) Viscoelasticity in composite materials such as tissue

ii) Electromagnetics in materials with dispersion

HTB and G.A. Pinter, A probabilistic multiscale approach to hysteresis in shear wave propagation in Biotissue, CRSC-TR04-03, January, 2004; SIAM J. Multiscale Modeling and Simulation, 3 (2005), 395-412.

HTB and N.L. Gibson, Electromagnetic inverse problems involving distributions of dielectric mechanisms and parameters, CRSC-TR05-29, August, 2005; Quarterly of Applied Mathematics, 64 (2006), 749-795.

Asymptotic Properties of Probability Measure Estimators in a Nonparametric Model

H.T. Banks, Jared Catenacci, and Shuhua Hu

Center for Quantitative Sciences in Biomedicine

Center for Research in Scientific Computation

Department of Mathematics

North Carolina State University

Raleigh, NC

July, 2014

Introduction

We consider nonparametric estimation of an unknown probability measure in the case where the regression function is dependent on this measure. More precisely, the statistical model (the model describing the observation process) is given by

$$Y_j = f(t_j; P_0) + E_j, \quad j = 1, 2, 3, \ldots, N. \qquad (1)$$

• $f(t_j; P_0)$ denotes the observed part of the solution of a mathematical model with the true (unknown) probability measure $P_0$ at the measurement point $t_j$

• $E_j$ is the measurement error at $t_j$

• N is the total number of observations, where $t_j \in [t_s, t_f]$, $j = 1, 2, 3, \ldots, N$, with $t_s$ and $t_f$ being some real numbers


Equation (1) is often referred to as a nonparametric statistical model (a model with all the unknown parameters being in an infinite-dimensional parameter space) in the statistics literature. Such models are motivated by a number of applications arising in biology and physics, for example, in modeling mosquitofish populations [BanksFitzpatrick1991] and shrimp populations [BanksDavisErnstbergerHuArtimovichDhar2009], in wave propagation in biotissue [BanksPinter2005], in modeling of complex nonmagnetic dielectric materials [BanksGibson2006], [BanksCatenacciHuKenz2013], and in HIV cellular models [BanksBortz2005]. Here we elaborate on only one of the motivating examples, a recent project [BanksCatenacciHuKenz2013] investigated by our group.


In this project, the goal is to develop a noninvasive technique to characterize the changes or degradation of a complex nonmagnetic dielectric material (such as tissue or inorganic glasses) by assessing the small physical and chemical changes in the material using reflectance spectroscopy. This involves determining the components of the permittivity of the dielectric medium using the measured spectral responses. The relative permittivity of the dielectric medium is described by

$$\varepsilon_r(k; P_0) = \varepsilon_\infty - \int_{\Omega_\theta} \frac{k_p^2}{k^2 - ik/\tau - k_0^2}\, dP_0(\theta). \qquad (2)$$

• $\varepsilon_\infty$ denotes the relative permittivity of the dielectric medium at infinite frequency

• k is the wavenumber ($k = \omega/(2\pi c)$, where $\omega$ is the angular frequency and c is the speed of light)

• $k_0$ represents the resonance wavenumber, $\tau$ denotes the relaxation time

• the composite parameter $k_p$ is given by $k_p = k_0\sqrt{\varepsilon_s - \varepsilon_\infty}$, with $\varepsilon_s$ being the relative permittivity of the medium at zero frequency, and $\theta = k_0 \in \Omega_\theta \subset \mathbb{R}$


Figure: A monochromatic uniform wave is incident at an angle θ on a plane interface between a free space and a nonmagnetic dielectric medium, where ω = 2πck denotes the frequency of the wave and k is the wavenumber.

Assume a monochromatic uniform wave is incident at an angle zero on a plane interface between free space and a nonmagnetic dielectric medium with the electric field polarized perpendicular to the plane of incidence. Then the reflection coefficient is given by

$$r_s(k; P_0) = \frac{1 - \sqrt{\varepsilon_r(k; P_0)}}{1 + \sqrt{\varepsilon_r(k; P_0)}}, \qquad (3)$$

where $\varepsilon_r$ is defined by (2). The observations $f_j$ are the reflectance (the square of the magnitude of the reflection coefficient) at different wavenumbers $k_j$; that is, $f_j = |r_s(k_j; P_0)|^2$. The goal is then to use these observations to estimate the unknown probability measure $P_0$.
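A minimal Python sketch of the forward map from a probability measure to the reflectance in (2) and (3) follows. It assumes a discrete measure on the resonance wavenumber k0 (a finite set of nodes with weights summing to one); the function names and all numerical values are hypothetical, dimensionless, and chosen only for illustration.

```python
import cmath

def relative_permittivity(k, nodes, weights, eps_inf, eps_s, tau):
    """eps_r(k; P) from (2) for P = sum_j weights[j] * Dirac measure at k0 = nodes[j]."""
    total = 0j
    for k0, a in zip(nodes, weights):
        kp_sq = k0 ** 2 * (eps_s - eps_inf)      # k_p^2 = k_0^2 (eps_s - eps_inf)
        total += a * kp_sq / (k ** 2 - 1j * k / tau - k0 ** 2)
    return eps_inf - total

def reflectance(k, nodes, weights, eps_inf, eps_s, tau):
    """|r_s(k; P)|^2 with r_s from (3); cmath.sqrt is the principal square root."""
    root = cmath.sqrt(relative_permittivity(k, nodes, weights, eps_inf, eps_s, tau))
    r_s = (1 - root) / (1 + root)
    return abs(r_s) ** 2

# Hypothetical two-node measure and material constants (illustration only).
nodes, weights = [1.0, 1.5], [0.5, 0.5]
wavenumbers = [0.2 * j for j in range(1, 16)]
f = [reflectance(k, nodes, weights, eps_inf=2.0, eps_s=3.0, tau=10.0)
     for k in wavenumbers]
```

Because the principal square root has nonnegative real part, each computed reflectance lies in [0, 1]. The map from the weights to the reflectance is nonlinear, unlike the expected-value models of the individual-dynamics setting.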


• The problem here is different from those, for example, in pharmacokinetics studies and HIV studies, where one estimates both individual-specific parameters θ (such as clearance rate of the virus and infection rate in HIV studies) and their associated probability distribution function $P_0$ from blood samples taken serially in time from individuals in the population; there the data $f_j$ depend on θ instead of $P_0$

• In those studies one has individual longitudinal data instead of aggregate longitudinal data (i.e., data collected by sampling from the population at large)

• The methods used to solve these two types of problems are fundamentally different

We refer the interested reader to [BanksFA2012; BanksHuThompson, Modeling and Inverse Problems in the Presence of Uncertainty, 2014] for more details on this topic.


Theoretical and Computational Framework for Probability Measure Estimation

• observations $Y_j$ in (1) are scalar (the multi-dimensional case can be treated similarly)

• measurement errors $E_j$, $j = 1, 2, 3, \ldots, N$, are independent and identically distributed (i.i.d.) with zero mean and constant variance $\sigma_0^2$, defined on some probability space $(\Omega, \mathcal{F}, \text{Prob})$

• $f(t; P_0)$ correctly describes the observed part of the dynamical system (that is, the underlying mathematical model is correct)


With the i.i.d. assumption on the measurement errors, the estimator of $P_0$ can be obtained using the ordinary least-squares method as defined by

$$P^N = \arg\min_{P \in \mathcal{P}(\Omega_\theta)} \sum_{j=1}^{N} (Y_j - f(t_j; P))^2, \qquad (4)$$

where $\mathcal{P}(\Omega_\theta)$ denotes the set of probability measures on the space $\Omega_\theta \subset \mathbb{R}^{\kappa_\theta}$ with $\kappa_\theta$ being a positive integer. We remark that $P^N$ itself is random in that it is a function of random variables $Y_j$ (and hence $E_j$) on a probability space $(\Omega, \mathcal{F}, \text{Prob})$.

The corresponding realization $\hat{P}^N$ of $P^N$ can be calculated through

$$\hat{P}^N = \arg\min_{P \in \mathcal{P}(\Omega_\theta)} \sum_{j=1}^{N} (y_j - f(t_j; P))^2, \qquad (5)$$

where $y_j$ is a realization of $Y_j$, $j = 1, 2, 3, \ldots, N$. Thus, we can view $P^N$ as a stochastic process (i.e., $P^N(\theta; \cdot)$ as a one parameter ($\theta \in \Omega_\theta$) family of random variables on the probability space $(\Omega, \mathcal{F}, \text{Prob})$) since each of its realizations yields a probability measure $\hat{P}^N \in \mathcal{P}(\Omega_\theta)$.


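The existence of a minimizer to the least-squares optimization problem (4) or (5) can be established under the Prohorov Metric Framework [Prohorov 1956; BanksFA 2012; BanksHuThompson 2014].

Definition

Let $F \subset \Omega_\theta$ be any closed set and define $F^\epsilon$ as follows:

$$F^\epsilon = \Big\{ \theta \in \Omega_\theta : \inf_{\tilde{\theta} \in F} d(\theta, \tilde{\theta}) < \epsilon \Big\},$$

where d denotes the metric on $\Omega_\theta$. For $P, Q \in \mathcal{P}(\Omega_\theta)$, the Prohorov metric is given by

$$\rho(P, Q) = \inf\{ \epsilon > 0 \mid Q(F) \le P(F^\epsilon) + \epsilon \text{ and } P(F) \le Q(F^\epsilon) + \epsilon \},$$

where the two inequalities are required to hold for all F closed in $\Omega_\theta$.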
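As a small worked example of the definition (not taken from the source), consider two Dirac measures with atoms a and b in Ωθ, reading the two inequalities as required for every closed F. The sketch below, which uses only the definition, gives ρ = min{d(a, b), 1}; the symbols a, b, and δ are introduced here for illustration.

```latex
\begin{align*}
&\text{Let } P = \Delta_a,\quad Q = \Delta_b,\quad \delta = d(a, b). \\
&\text{If } \epsilon > \delta:\ \text{every closed } F \ni b \text{ has } a \in F^{\epsilon},
  \text{ so } Q(F) \le P(F^{\epsilon}) + \epsilon,
  \text{ and symmetrically with } a \text{ and } b \text{ exchanged.} \\
&\text{If } \epsilon \ge 1:\ \text{both inequalities hold trivially, since probabilities are at most } 1. \\
&\text{If } \epsilon \le \delta \text{ and } \epsilon < 1:\ \text{take } F = \{b\};
  \text{ then } a \notin F^{\epsilon} \text{ and } Q(F) = 1 > 0 + \epsilon = P(F^{\epsilon}) + \epsilon. \\
&\text{Hence } \rho(\Delta_a, \Delta_b)
  = \inf\big( (\delta, \infty) \cup [1, \infty) \big) = \min\{\delta,\, 1\}.
\end{align*}
```

In particular, Dirac measures whose atoms are close in Ωθ are close in the Prohorov metric, even though they are mutually singular.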


• The meaning of the Prohorov metric is far from intuitive, but it has several useful characterizations

• Convergence in the Prohorov metric is equivalent to weak* convergence if we view $\mathcal{P}(\Omega_\theta) \subset C_B^*(\Omega_\theta)$, where $C_B^*(\Omega_\theta)$ denotes the topological dual of the space $C_B(\Omega_\theta)$ of bounded and continuous functions on $\Omega_\theta$

• That is, $\rho(P_j, P) \to 0$ is equivalent to the statement

$$\int_{\Omega_\theta} h(\theta)\, dP_j(\theta) \to \int_{\Omega_\theta} h(\theta)\, dP(\theta) \quad \text{for any } h \in C_B(\Omega_\theta).$$

The Prohorov metric also possesses many useful and important properties. For example, if we assume that $\Omega_\theta$ is compact, then $\mathcal{P}(\Omega_\theta)$ is a compact metric space when taken with the Prohorov metric $\rho$. Based on these discussions, we see that if $\Omega_\theta$ is compact and f is continuous with respect to P, then there exists a solution to (4) or (5).


Consistency of the Probability Measure Estimator

The ideas for establishing the consistency of probability measure estimators follow closely those given in [BanksFitzpatrick1991] and [BanksHuThompson2014.book].

Theorem

Under the usual assumptions on the data and how they are collected, $\rho(P^N, P_0) \xrightarrow{a.s.} 0$ as $N \to \infty$, where $\xrightarrow{a.s.}$ denotes convergence almost surely in $(\Omega, \mathcal{F}, \text{Prob})$. That is,

$$\text{Prob}\Big\{ \omega \in \Omega \;\Big|\; \lim_{N \to \infty} \rho(P^N(\omega), P_0) = 0 \Big\} = 1.$$


Approximation Schemes for Probability Measure Estimation

We note that (5) is an infinite-dimensional optimization problem. Hence, the infinite-dimensional space $\mathcal{P}(\Omega_\theta)$ must be approximated by some finite-dimensional space $\mathcal{P}^M(\Omega_\theta)$ so that one has a computationally tractable finite-dimensional optimization problem given by

$$\hat{P}^N_M = \arg\min_{P \in \mathcal{P}^M(\Omega_\theta)} \sum_{j=1}^{N} (y_j - f(t_j; P))^2. \qquad (6)$$

However, one needs to choose $\mathcal{P}^M(\Omega_\theta)$ in a meaningful way so that $\hat{P}^N_M$ approaches the solution to (5) as $M \to \infty$.


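Dirac Measure Approximations

Theorem

Assume $\Omega_\theta \subset \mathbb{R}^{\kappa_\theta}$ is compact. Let $\Omega_{\theta D} = \{\theta_j\}_{j=1}^{\infty}$ be an enumeration of a countable dense subset of $\Omega_\theta$. Define

$$\mathcal{P}_D(\Omega_\theta) = \Big\{ P \in \mathcal{P}(\Omega_\theta) \;\Big|\; P = \sum_{j=1}^{M} a_j \Delta_{\theta_j}, \; \theta_j \in \Omega_{\theta D}, \; a_j \in [0, 1] \cap \mathbb{Q}, \; \sum_{j=1}^{M} a_j = 1 \Big\},$$

where $\Delta_{\theta_j}$ is the Dirac measure with atom at $\theta_j$, $M \in \mathbb{N}$, and $\mathbb{Q} \subset \mathbb{R}$ denotes the set of all rational numbers. (That is, $\mathcal{P}_D(\Omega_\theta)$ is the collection of all convex combinations of Dirac measures on $\Omega_\theta$ with atoms $\theta_j \in \Omega_{\theta D}$ and rational weights.) Then $\mathcal{P}_D(\Omega_\theta)$ is dense in $(\mathcal{P}(\Omega_\theta), \rho)$, and thus $\mathcal{P}(\Omega_\theta)$ is separable.

Under this Dirac measure approximation framework, we define $\mathcal{P}^M(\Omega_\theta)$ to be the set of all atomic probability measures with nodes placed at the first M elements in the enumeration of the countable dense subset of $\Omega_\theta$; that is,

$$\mathcal{P}^M(\Omega_\theta) = \Big\{ P \in \mathcal{P}(\Omega_\theta) \;\Big|\; P = \sum_{j=1}^{M} a_j \Delta_{\theta_j}, \; a_j \ge 0 \text{ and } \sum_{j=1}^{M} a_j = 1 \Big\}. \qquad (7)$$

By the theorem above we know that we can approximate any element $P \in \mathcal{P}(\Omega_\theta)$ by a sequence $\{P_{M_j}\}$, $P_{M_j} \in \mathcal{P}^{M_j}(\Omega_\theta)$, such that $\rho(P_{M_j}, P) \to 0$ as $M_j \to \infty$. We also see that this Dirac measure approximation method can be used regardless of the smoothness of probability measures. This is especially useful in situations where one has no knowledge of the sought-after probability measures.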
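The finite-dimensional problem over Dirac weights can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model f(t; P) is taken to be the expected value of exp(-θ t) under P (so the cost is quadratic in the weights), the node grid and the synthetic noise-free data are hypothetical, and projected gradient descent is swapped in as a simple constrained solver; the source does not say which optimizer was used.

```python
import math

def project_to_simplex(v):
    """Euclidean projection of v onto {a : a_l >= 0, sum_l a_l = 1}."""
    u = sorted(v, reverse=True)
    running, shift = 0.0, 0.0
    for i, u_i in enumerate(u, start=1):
        running += u_i
        candidate = (running - 1.0) / i
        if u_i - candidate > 0.0:
            shift = candidate
    return [max(x - shift, 0.0) for x in v]

def predictions(G, a):
    """f(t_j; sum_l a_l Delta_{theta_l}) = sum_l a_l * G[j][l] for each t_j."""
    return [sum(g * w for g, w in zip(row, a)) for row in G]

def cost(G, a, y):
    """Least-squares cost sum_j (y_j - f(t_j; a))^2."""
    return sum((p - y_j) ** 2 for p, y_j in zip(predictions(G, a), y))

def fit_dirac_weights(G, y, iterations=5000):
    """Projected gradient descent on the weights, started from uniform weights."""
    m = len(G[0])
    a = [1.0 / m] * m
    # Step 1/L, with L bounded by twice the squared Frobenius norm of G.
    step = 1.0 / (2.0 * sum(g * g for row in G for g in row))
    for _ in range(iterations):
        resid = [p - y_j for p, y_j in zip(predictions(G, a), y)]
        grad = [2.0 * sum(row[l] * r for row, r in zip(G, resid)) for l in range(m)]
        a = project_to_simplex([w - step * g_l for w, g_l in zip(a, grad)])
    return a

# Hypothetical setup: M = 6 nodes, 12 observation times, noise-free data.
times = [0.25 * j for j in range(1, 13)]
nodes = [0.5 + 0.5 * l for l in range(6)]
G = [[math.exp(-theta * t) for theta in nodes] for t in times]
true_weights = [0.0, 0.6, 0.0, 0.0, 0.4, 0.0]
y = predictions(G, true_weights)
a_hat = fit_dirac_weights(G, y)
```

The returned weights always lie in the constraint set of (7). For models that are nonlinear in P, such as the reflectance model, the same constraint set applies but a general constrained nonlinear least-squares solver would be needed.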


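Linear Spline Approximations (when the probability measures are absolutely continuous, so that their corresponding probability density functions exist) [BanksPinter 2005]

Theorem

Assume $\Omega_\theta \subset \mathbb{R}^{\kappa_\theta}$ is compact. Define

$$\mathcal{P}_S(\Omega_\theta) = \Big\{ P \in \mathcal{P}(\Omega_\theta) \;\Big|\; P'(\theta) = \sum_{j=1}^{M} a_j l_j^M(\theta), \; a_j \in [0, \infty) \cap \mathbb{Q}, \; \sum_{j=1}^{M} a_j \int_{\Omega_\theta} l_j^M(\xi)\, d\xi = 1, \; M \in \mathbb{N} \Big\},$$

where $P'$ denotes the derivative of P with respect to θ, the $l_j^M$ denote the usual piecewise linear splines, and $\mathbb{Q} \subset \mathbb{R}$ denotes the set of all rational numbers. Then $\mathcal{P}_S(\Omega_\theta)$ is dense in $\mathcal{P}(\Omega_\theta)$.

Under this linear spline approximation framework, we define $\mathcal{P}^M(\Omega_\theta)$ to be

$$\mathcal{P}^M(\Omega_\theta) = \Big\{ P \in \mathcal{P}(\Omega_\theta) \;\Big|\; P'(\theta) = \sum_{j=1}^{M} a_j l_j^M(\theta), \; a_j \ge 0, \; \sum_{j=1}^{M} a_j \int_{\Omega_\theta} l_j^M(\xi)\, d\xi = 1 \Big\}. \qquad (8)$$

By the above theorem, we know that we can approximate any element $P \in \mathcal{P}(\Omega_\theta)$ by a sequence $\{P_{M_j}\}$, $P_{M_j} \in \mathcal{P}^{M_j}(\Omega_\theta)$, such that $\rho(P_{M_j}, P) \to 0$ as $M_j \to \infty$.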


The following theorem provides the desired convergence result, which follows immediately from the Prohorov metric framework and convergence theorems of [BanksKunisch1989] as well as the results above.

Theorem

Assume $\Omega_\theta$ is compact and $\mathcal{P}(\Omega_\theta)$ is taken with the Prohorov metric. If f is continuous with respect to P, then there exists a minimizer $\hat{P}^N_M$ to (6), where $\mathcal{P}^M(\Omega_\theta)$ is chosen as either (7) or (8). Moreover, the sequence $\{\hat{P}^N_M\}$ has at least one convergent subsequence, and the limit $\hat{P}^{N*}$ of such a subsequence is a minimizer to the least-squares problem (5).

Remark

• Dirac measure approximation methods and the spline-based approximation methods have been successfully used to estimate probability measures in a number of applications

• it was demonstrated in [BanksDavis2007] that if the sought-after probability measure is absolutely continuous, then the spline-based approximation methods converge much faster than do the Dirac measure approximation methods (in terms of the value of M)

• moreover it was observed that spline-based approximation methods also provide convergence for associated probability density functions while the Dirac measure approximation methods do not do this

• in spline-based approximation methods one directly approximates associated probability density functions instead of cumulative distribution functions


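Bias and Variance in Probability Measure Estimation

As we discussed in the above section, what one actually does in practice is to minimize the cost functional in a finite-dimensional space; that is, one solves the optimization problem

$$P^N_M = \arg\min_{P \in \mathcal{P}^M(\Omega_\theta)} \sum_{j=1}^{N} (Y_j - f(t_j; P))^2. \qquad (9)$$

For example, if one uses the Dirac measure approximation methods, then $P^N_M = \Delta^T A^N_M$, where $\Delta = \Delta(\theta) = (\Delta_{\theta_1}, \Delta_{\theta_2}, \ldots, \Delta_{\theta_M})^T$, and

$$A^N_M = \arg\min_{a^N_M \in R^M} \sum_{j=1}^{N} \Big[ Y_j - f\Big(t_j; \sum_{l=1}^{M} a^N_{M,l} \Delta_{\theta_l}\Big) \Big]^2. \qquad (10)$$

Here

$$R^M = \Big\{ a^M = (a^M_1, a^M_2, \ldots, a^M_M)^T \;\Big|\; a^M_j \ge 0, \; j = 1, 2, \ldots, M, \; \sum_{j=1}^{M} a^M_j = 1 \Big\}.$$

The corresponding realization of (9) is given by

$$\hat{P}^N_M = \arg\min_{P \in \mathcal{P}^M(\Omega_\theta)} \sum_{j=1}^{N} (y_j - f(t_j; P))^2; \qquad (11)$$

that is, $\hat{P}^N_M = \Delta^T \hat{a}^N_M$, where $\Delta = (\Delta_{\theta_1}, \Delta_{\theta_2}, \ldots, \Delta_{\theta_M})^T$, and

$$\hat{a}^N_M = \arg\min_{a^N_M \in R^M} \sum_{j=1}^{N} \Big[ y_j - f\Big(t_j; \sum_{l=1}^{M} a^N_{M,l} \Delta_{\theta_l}\Big) \Big]^2. \qquad (12)$$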


In essence, one presumes that the data was generated using the following statistical model

$$Y_j = f(t_j; a^{0,M}) + E_j, \quad j = 1, 2, 3, \ldots, N. \qquad (13)$$

In the above equation, $f(t_j; a^{0,M}) = f(t_j; P_{0,M})$, where $P_{0,M} = \Delta^T a^{0,M} \in \mathcal{P}^M(\Omega_\theta)$, and $a^{0,M} \in R^M$ is the one that minimizes

$$J^0(a^M) = \sigma_0^2 + \int_{t_s}^{t_f} (f(t; P_0) - f(t; a^M))^2\, d\mu(t) \qquad (14)$$

over $R^M$. In other words, the functional $J^0$ defined by

$$J^0(P) = \sigma_0^2 + \int_{t_s}^{t_f} (f(t; P_0) - f(t; P))^2\, d\mu(t) \qquad (15)$$

has a minimizer $P_{0,M}$ in $\mathcal{P}^M(\Omega_\theta)$ for each fixed M.


Thus, we have a model “misspecification”, which is due to the approximation of the infinite-dimensional space $\mathcal{P}(\Omega_\theta)$ by the finite-dimensional space $\mathcal{P}^M(\Omega_\theta)$. Under this framework, the total error between the true model (1) and the approximating model (13) can be characterized by (illustrated in Figure 1)

$$\rho(P_0, P_{0,M}) + \rho(P_{0,M}, \hat{P}^N_M),$$

where the first term $\rho(P_0, P_{0,M})$ is a measure of the accuracy of the approximating model and is often called bias in the statistics literature, and the second term $\rho(P_{0,M}, \hat{P}^N_M)$ is a measure of estimation accuracy and is often called variance.

Figure: Illustration of the bias and the variance in the probability measure approximation.

Using properties of the Prohorov metric and the results above, we find that the bias $\rho(P_0, P_{0,M})$ approaches zero as $M \to \infty$. However, for fixed N the variance in general increases as the value of M increases (e.g., see [BurnhamAnderson]); that is, we have less confidence in the parameter estimates as the number of approximating parameters increases. Hence, there is a trade-off between the bias and the variance.

Figure: Illustration of the trade-off between the bias and the variance.

• Model selection criteria (Akaike Information Criterion and Bayesian Information Criterion) have been widely used to select a best approximating model from a prior set of candidate models

• all are based to some extent on the principle of parsimony (see [BHT.book, BurnhamAnderson])

• goal in model selection is to simultaneously minimize both bias (modeling error) and variance (estimation error)

• use a model selection criterion to select a best M value (i.e., a best approximating model)


Pointwise Asymptotic Normality of the Approximate Probability Measure Estimator

In this section, we consider the pointwise asymptotic normality of the least-squares estimator $P^N_M$. Since for any given θ, $P^N_M(\theta)$ is linearly dependent on $A^N_M$ (for example, in the case where the Dirac measure approximation is used, $P^N_M(\theta) = (\Delta(\theta))^T A^N_M$), we first consider the asymptotic normality of $A^N_M$.

Theorem

Under standard assumptions, for each fixed M we have

$$\sqrt{N}\left( A^N_M - a^{0,M} \right) \xrightarrow{d} Z \sim \mathcal{N}(0, \Sigma^{0,M}), \quad \text{as } N \to \infty, \qquad (18)$$

where $\xrightarrow{d}$ denotes convergence in distribution, $\Sigma^{0,M} = (H(a^{0,M}))^{-1} F(a^{0,M}) (H(a^{0,M}))^{-1}$, and $\mathcal{N}(0, \Sigma^{0,M})$ represents a multivariate normal distribution with zero mean and covariance matrix $\Sigma^{0,M}$.

If one uses the Dirac measure approximation method, then by (22) we know that for any sufficiently large N

$$P^N_M(\theta) \sim \mathcal{N}\Big( (\Delta(\theta))^T \hat{a}^N_M, \; \tfrac{1}{N} (\Delta(\theta))^T \Sigma^N_M \Delta(\theta) \Big) \qquad (26)$$

holds for any fixed $\theta \in \Omega_\theta$. Similarly, one can use (22) to obtain the pointwise asymptotic result for $P^N_M(\theta)$ in the case where the linear spline approximation method is employed.


Numerical Results

We use the motivating example in the Introduction to demonstrate our theoretical results through simulated data. Specifically, we consider the following nonparametric model

$$Y_j = |r_s(k_j; P_0)|^2 + E_j, \quad j = 1, 2, 3, \ldots, N, \qquad (27)$$

with $r_s$ given by (3). The simulated data is then generated by simulating

$$y_j = |r_s(k_j; P_0)|^2 + \epsilon_j, \quad j = 1, 2, 3, \ldots, N. \qquad (28)$$


Optimal Value of M

can use Akaike Information Criterion (AIC) (1973) to determine the optimal value of M, where, e.g., probability measure is obtained by the linear spline approximation method

one of the most widely used model selection criteria, based on Kullback-Leibler information (a well-known measure of "distance" between two probability density functions) and maximum likelihood estimation

can be used to compare both nested models and non-nested models, and it can also be used to compare multiple models at a time


For the least squares case, it can be found (e.g., see [BurnhamAnderson2002], [BanksHuThompson2014]) that if the measurement errors are i.i.d. normally distributed, then the AIC is given by

$$\mathrm{AIC} = N \log\left(\frac{\mathrm{RSS}}{N}\right) + 2(M + 1). \tag{29}$$


Here M + 1 is the total number of estimated parameters, including the coefficients for the splines and the variance of measurement errors, and RSS denotes the residual sum of squares given by

$$\mathrm{RSS} = \sum_{j=1}^{N} \left(y_j - |r_s(k_j; P^N_M)|^2\right)^2.$$


given a prior set of candidate models, one calculates AIC value for each model; best approximating model is one with minimum AIC value

AIC value depends on data set used and one must use same data set to calculate AIC values for each of models

AIC may perform poorly if sample size N is small relative to the total number of estimated parameters; it is suggested [BurnhamAnderson2002] that AIC should be used only if sample size is at least 40 times total number of estimated parameters


Otherwise, one needs to use the small sample AIC, the so-called $\mathrm{AIC}_c$, which is given by

$$\mathrm{AIC}_c = \mathrm{AIC} + \frac{2(M + 1)(M + 2)}{N - M - 2}. \tag{30}$$

For more information on the AIC and its variations, we refer the interested reader to [BurnAnd] and [BHT, Chapter 4].
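The selection procedure described by (29) and (30) can be sketched in a few lines. This is a minimal illustration under the slide's assumptions (least squares, i.i.d. normal errors, M + 1 estimated parameters). The RSS values and the sample size below are made up for the example and are not the results reported in these slides.

```python
import math


def aic_least_squares(rss, n, m):
    """AIC from (29): N log(RSS/N) + 2(M + 1)."""
    return n * math.log(rss / n) + 2 * (m + 1)


def aicc_least_squares(rss, n, m):
    """Small-sample AICc from (30): AIC + 2(M+1)(M+2)/(N - M - 2)."""
    if n - m - 2 <= 0:
        raise ValueError("AICc requires N > M + 2")
    return aic_least_squares(rss, n, m) + 2 * (m + 1) * (m + 2) / (n - m - 2)


def select_model(rss_by_m, n):
    """Return the M with minimum AICc, plus all scores.

    rss_by_m maps each candidate M to the RSS of its fit; every fit
    must use the same data set of size n.
    """
    scores = {m: aicc_least_squares(rss, n, m) for m, rss in rss_by_m.items()}
    best_m = min(scores, key=scores.get)
    return best_m, scores


# Hypothetical RSS values for five candidate models (illustration only).
rss_by_m = {5: 0.90, 10: 0.40, 15: 0.20, 20: 0.19, 25: 0.185}
best_m, scores = select_model(rss_by_m, n=100)
```

Only differences between scores are meaningful, so the (typically negative) absolute values carry no interpretation on their own.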


set of our candidate models is chosen as model (13) with M = 5, 10, 15, 20, 25 and 30, and $E_j$, j = 1, 2, 3, ..., N, being i.i.d. normally distributed with zero mean and constant variance

use the $\mathrm{AIC}_c$ to select the best model (sample size is less than 40 times total number of estimated parameters)

model with M = 15 is one with minimum $\mathrm{AIC}_c$ value

Figure: The AICc values with M = 5, 10, 15, 20, 25 and 30. (Horizontal axis: M; vertical axis: AICc, with ticks from −860 to −660.)


Pointwise Confidence Band

Can construct the pointwise confidence band for $P^N_M$ by using the asymptotic normality results presented above, where M is chosen as the optimal value M = 15 obtained in the above analysis. That is, by (22) we know that for any sufficiently large N

$$P^N_M(k_0) \sim \mathcal{N}\left(\hat{P}^N_M(k_0), \ \frac{1}{N} (L(k_0))^T \Sigma^N_M L(k_0)\right) \tag{31}$$

holds for any fixed $k_0 \in [\underline{k}_0, \bar{k}_0]$. In the above equation, $\hat{P}^N_M(k_0) = (L(k_0))^T a^N_M$, where

$$L(k_0) = \left(\int_{\underline{k}_0}^{k_0} l_1(\xi)\,d\xi, \ \int_{\underline{k}_0}^{k_0} l_2(\xi)\,d\xi, \ \ldots, \ \int_{\underline{k}_0}^{k_0} l_M(\xi)\,d\xi\right)^T.$$


One can then use (31) to construct the pointwise 100(1 − α)% level confidence band, which is given by

$$\left[\hat{P}^N_M(k_0) - t_{1-\alpha/2}\,\mathrm{SE}_{\mathrm{PAN}}(k_0), \ \hat{P}^N_M(k_0) + t_{1-\alpha/2}\,\mathrm{SE}_{\mathrm{PAN}}(k_0)\right], \quad k_0 \in [\underline{k}_0, \bar{k}_0].$$

Here $\mathrm{SE}_{\mathrm{PAN}}(k_0) = \sqrt{\frac{1}{N} (L(k_0))^T \Sigma^N_M L(k_0)}$, and the critical value $t_{1-\alpha/2}$ is determined by $\mathrm{Prob}\{T \geq t_{1-\alpha/2}\} = \alpha/2$, where T has a Student's t distribution $t_{N-M}$ with N − M degrees of freedom. For the simulations illustrated below, $l_j$ is the j-th piecewise linear spline element using equally spaced nodes, and central difference schemes are used to approximate the first and second order derivatives involved in the covariance matrix $\Sigma^N_M$.
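The band construction can be sketched as below. Several things here are assumptions rather than statements from the slides: the spline elements are taken to be unnormalized hat functions with peak value 1 centered at M equally spaced nodes on the interval, the weights `a_hat` and covariance `sigma_hat` are made-up placeholders, and the critical value `t_crit` is supplied by the caller (if SciPy is available, `scipy.stats.t.ppf(1 - alpha / 2, df=N - M)` would give it).

```python
import math


def hat_integral(u, center, h):
    """Integral over (-inf, u] of a hat function with peak 1 at `center`
    and half-width h (assumed form of the linear spline element)."""
    if u <= center - h:
        return 0.0
    if u <= center:
        return (u - center + h) ** 2 / (2.0 * h)
    if u <= center + h:
        return h - (center + h - u) ** 2 / (2.0 * h)
    return h


def spline_integrals(k0, k_lo, k_hi, m):
    """Vector L(k0): integrals of l_1, ..., l_M from k_lo to k0."""
    h = (k_hi - k_lo) / (m - 1)
    x = min(max(k0, k_lo), k_hi)  # clamp k0 to the interval
    centers = [k_lo + j * h for j in range(m)]
    return [hat_integral(x, c, h) - hat_integral(k_lo, c, h) for c in centers]


def pointwise_band(k0, k_lo, k_hi, a_hat, sigma_hat, n, t_crit):
    """Return (lower, estimate, upper) of the pointwise band at k0."""
    L = spline_integrals(k0, k_lo, k_hi, len(a_hat))
    estimate = sum(l * a for l, a in zip(L, a_hat))
    quad = sum(L[i] * sigma_hat[i][j] * L[j]
               for i in range(len(L)) for j in range(len(L)))
    se = math.sqrt(quad / n)  # standard error: sqrt((1/N) L^T Sigma L)
    return estimate - t_crit * se, estimate, estimate + t_crit * se


# Made-up example: M = 3, N = 13, alpha = 0.1, so N - M = 10 degrees of
# freedom and t_{0.95} is approximately 1.812.
a_hat = [0.4, 0.6, 0.4]
sigma_hat = [[0.13, 0.0, 0.0], [0.0, 0.13, 0.0], [0.0, 0.0, 0.13]]
lower, estimate, upper = pointwise_band(1.0, 0.0, 2.0, a_hat, sigma_hat,
                                        n=13, t_crit=1.812)
```

Evaluating `pointwise_band` on a grid of k0 values traces out the lower and upper curves; the band is pointwise, so it makes no simultaneous coverage claim across k0.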


The pointwise confidence bands for $P^N_M$

pointwise asymptotic normality vs. Monte Carlo (MC) simulations (α = 0.1, K = 1000). The two bands are similar, except that in the plateau regions the band from the asymptotic results is wider than the MC band.

Figure: pointwise confidence bands for cumulative distribution function: pointwise asymptotic normality results (left) and those using MC simulations (right). (Both panels: probability distribution against k0, 1/cm, from 400 to 1100; curves: True Distribution, Estimated Distribution, Lower Band, Upper Band.)


Figure: Model fit (left) and the estimated distribution (right) from the full inverse problem where N = 25 for Vitreous Germania. (Left panel: |rs(k)|² against k, cm⁻¹, with Data and Model Fit; right panel: probability distribution against k0, 1/cm.)


Figure: The estimated distributions for all values of N considered from the Vitreous Germania data. (Probability distribution against k0, 1/cm; curves for N = 5, 10, 15, 20, 25.)


The parameter estimates for εs, ε∞ and τ are given in the Table below.

N     εs       ε∞       τ (cm)
5     2.7677   2.1518   0.0275
10    2.5768   1.9634   0.0418
15    2.5999   1.9904   0.0525
20    2.4677   1.8341   0.0578
25    2.4361   1.7732   0.0581

Table: Estimates obtained using the reflectivity data for Vitreous Germania using various numbers of Dirac measures.


H.T. Banks, A Functional Analysis Framework for Modeling, Estimation and Control in Science and Engineering, Chapman and Hall/CRC Press, Boca Raton, FL, 2012.

H.T. Banks and K.L. Bihari, Modeling and estimating uncertainty in parameter estimation, Inverse Problems, 17 (2001), 95–111.

H.T. Banks, S. Hu and W.C. Thompson, Modeling and Inverse Problems in the Presence of Uncertainty, Taylor/Francis-Chapman/Hall-CRC Press, Boca Raton, FL, 2014.

A.M. Efimov, Optical Constants of Inorganic Glasses, CRC Press, Boca Raton, FL, 1995.

Yu.V. Prohorov, Convergence of random processes and limit theorems in probability theory, Theor. Prob. Appl., 1 (1956), 157–214.

