Modeling and Inverse Problems in the Presence of Uncertainty
Lecture 1. Probability Measure Estimation in Nonparametric Models (Chapter 5 of Banks, Hu, and Thompson)
Lecture 2. Propagation of Uncertainty in Continuous Time Dynamical Systems (Chapter 7 of Banks, Hu, and Thompson)
Modeling and Inverse Problems in the Presence of Uncertainty
Authors/Affiliations
H. T. Banks, North Carolina State University, Raleigh, USA
Shuhua Hu, North Carolina State University, Raleigh, USA
W. Clayton Thompson, North Carolina State University, Raleigh, USA
This book collects recent research—including the authors’ own substantial projects—on uncertainty propagation and quantification. It covers two sources of uncertainty: where uncertainty is present primarily due to measurement errors and where uncertainty is present due to the modeling formulation itself. With many examples throughout addressing problems in physics, biology, and other areas, the book is suitable for applied mathematicians as well as scientists in biology, medicine, engineering, and physics.
Key Features
• Reviews basic probability and statistical concepts, making the book self-contained
• Presents many applications and theoretical results from engineering, biology, and physics
• Covers the general relationship of differential equations driven by white noise (stochastic differential equations) and the ones driven by colored noise (random differential equations) in terms of their resulting probability density functions
• Describes the Prohorov metric framework for nonparametric estimation of a probability measure
• Contains numerous examples and end-of-chapter references to research results, including the authors' technical reports that can be downloaded from North Carolina State University's Center for Research in Scientific Computation
Selected Contents Introduction. Probability and Statistics Overview. Mathematical and Statistical Aspects of Inverse Problems. Model Comparison Criteria. Estimation of Probability Measures Using Aggregate Population Data. Optimal Design. Propagation of Uncertainty in a Continuous Time Dynamical System. A Stochastic System and Its Corresponding Deterministic System. Frequently Used Notations and Abbreviations. Index.
Catalog no. K21506 April 2014, 405 pp.
ISBN: 978-1-4822-0642-5 $89.95 / £57.99
Lecture 1: Probability Measure Estimators in Nonparametric Models
H. T. BANKS
Center for Research in Scientific Computation
Center for Quantitative Sciences in Biomedicine
N. C. STATE UNIVERSITY
Raleigh, N. C.
Uncertainty Quantification Summer School
Univ. Southern California
Los Angeles
August 11-13, 2014
UNCERTAINTY IN INVERSE PROBLEMS IMPORTANT IN:
1) Data acquisition: (sensor) observation error
2) Modeling intra- and inter-individual variability in systems and data
Applications from composite materials, biology, E&M, and non-destructive testing where one seeks "effective" parameters averaged over variability or heterogeneity
Random (stochastic) parameters and mechanisms (to treat inter-individual variability in aggregate data and observations): "random effects"/"mixing distributions" modeling
Stochastic models, aggregate dynamics
OUTLINE
Inverse Problems with Uncertainty
i) Individual vs. Aggregate Data
ii) Individual Dynamics: mixing distributions in statistical inverse problems (NPML) or Prohorov Based Methods (PMF)
iii) Aggregate Dynamics: measure dependent dynamics and PMF (Prohorov Metric Framework)
Ref: H.T. Banks, S. Hu and W.C. Thompson, Modeling and Inverse Problems in the Presence of Uncertainty, Taylor/Francis-Chapman/Hall-CRC Press, Boca Raton, FL, 2014.
Prohorov Based Methods (PMF) in Inverse or Parameter Estimation Problems
• Developed in mid 1980's thru 2010's
• Useful in
i) Individual dynamics/individual data formulations (patient care)
ii) Individual dynamics/aggregate data formulations (mosquitofish, shrimp immune response in biodefense apps, PBPK cellular models)
iii) Aggregate dynamics/aggregate data: HIV cellular progression models, E&M with polarization (biotissue), viscoelastic materials, NDE in materials
GENERIC INVERSE PROBLEM I: Aggregate Data-Individual Dynamics
Aggregate Data: $d_i \sim \mathcal{E}[C x(t_i; q) : P]$
Individual Dynamics: $\frac{dx}{dt} = f(t, x(t), q), \quad q \in Q$
where f can represent an ordinary, functional, or partial differential equation
Minimize $J(P) = \sum_i \left| \mathcal{E}[C x(t_i; q) : P] - d_i \right|^2$
over $P \in \mathcal{P}(Q)$ = probability measures over Q
Includes as special cases the usual problems with constant R.V.'s (i.e., usual vector or function space parameters, as opposed to Random Differential Equations)
Here $x(t; P) = \mathcal{E}[x(t; q) : P] \equiv \int_Q x(t; q)\, dP(q)$.
In this case, one has individual dynamics for each realization q of a random variable with distribution P. One solves the system many times for these realizations and then computes the expected value of x with respect to P. This is then used with the data in the estimation, i.e., optimization of
$J(P) = \sum_i \left| \mathcal{E}[C x(t_i; q) : P] - d_i \right|^2$
over $P \in \mathcal{P}(Q)$ in Ordinary Least Squares (OLS).
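The expectation-based cost can be sketched numerically when P is a discrete measure. The following is a minimal illustration under stated assumptions, not code from the lecture: the individual dynamics are taken (hypothetically) to be dx/dt = -q x with x(0) = 1, so that x(t; q) = exp(-q t); the observation operator is C = 1; and P is given by nodes and weights. All function names are illustrative.

```python
import math

def x_individual(t, q, x0=1.0):
    """Closed-form solution of the assumed dynamics dx/dt = -q*x, x(0) = x0."""
    return x0 * math.exp(-q * t)

def expected_state(t, nodes, weights):
    """E[x(t; q) : P] for a discrete measure P = sum_j weights[j] * Dirac(nodes[j])."""
    return sum(w * x_individual(t, q) for q, w in zip(nodes, weights))

def ols_cost(nodes, weights, times, data):
    """J(P) = sum_i |E[x(t_i; q) : P] - d_i|^2 with observation operator C = 1."""
    return sum((expected_state(t, nodes, weights) - d) ** 2
               for t, d in zip(times, data))

# Hypothetical "true" measure: two subpopulations with decay rates 0.5 and 2.0.
true_nodes, true_weights = [0.5, 2.0], [0.3, 0.7]
times = [0.0, 0.5, 1.0, 2.0, 4.0]
data = [expected_state(t, true_nodes, true_weights) for t in times]  # noise-free

cost_true = ols_cost(true_nodes, true_weights, times, data)
# A single constant parameter placed at the mean of q (0.3*0.5 + 0.7*2.0 = 1.55).
cost_point = ols_cost([1.55], [1.0], times, data)
```

In this toy setting the cost vanishes at the generating measure, while a single point mass at the mean rate leaves a positive residual, which is the reason for estimating a distribution rather than one parameter value.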
Applied Math literature
**HTB, L. Botsford, F. Kappel, C. Wang - Proc. Math Ecology, Trieste, 1986
HTB, LB, FK, CW - Proc. 5th IFAC, Perpignan, 1989
**HTB, B. Fitzpatrick - CAMS-TR90-2; Quart. Appl. Math, 49 (1991), p. 215-235
HTB - CRSC-TR92-11, 1992
BF - CRSC-TR93-20, 1993
HTB, BF, Y. Zhang - CRSC-TR94-12, 1994
HTB, L. Potter, YZ - Memoria Congress Biomath, Panama, 1997
HTB, BF, LP, YZ - CRSC-TR98-06, 1998
HTB - CRSC-TR98-39, 1998; Math. Comp. Mod. 33 (2001), 39-47
**HTB, K. Bihari - CRSC-TR99-40, 1999; Inv. Prob. 17 (2001), p. 1-17
Initial efforts carried out in context of control of mosquitofish populations: Sinko-Streifer size-structured population models

Mixing densities:
$$\frac{\partial v}{\partial t} + \frac{\partial (g v)}{\partial x} = -\mu v, \quad x_0 < x < x_1, \quad t > 0$$
$$v(0, x) = \Phi(x)$$
$$g(t, x_0)\, v(t, x_0) = \int_{x_0}^{x_1} k(t, \alpha)\, v(t, \alpha)\, d\alpha$$
$$g(t, x_1) = 0$$
t = time, x = size, $x_0$ = min size, $x_1$ = max size, k = fecundity
Individual growth dynamics: $\frac{dx}{dt} = g(t, x)$
$v(t, x; g)$, $g \in \mathcal{G}$, correspond to individual growth rates
$\mathcal{G}$ = admissible set of growth rates, $\mathcal{G} \subset H^1(x_0, x_1)$ compact
Total population density:
$$u(t, x) = \int_{\mathcal{G}} v(t, x; g)\, dP(g)$$
where $P \in \mathcal{P}(\mathcal{G})$ is a probability measure on $\mathcal{G}$
Distributions of growth rates produce dispersion and cohort development in total population
Needs: (to carry out a careful mathematical analysis)
i) Topology on $\mathcal{P} = \mathcal{P}(Q)$
ii) Continuity of $P \to J(P)$
iii) Compactness of $\mathcal{P}(Q)$
Brief summary of theory
Possible topologies on $\mathcal{P}(Q)$: Levy (R), Prohorov (Q), Bounded Lipschitz (Q); Total variation (Q), Kolmogorov (R)
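Let (Q, d) be a complete metric space. For any closed $F \subset Q$ and $\varepsilon > 0$, define
$$F^\varepsilon = \{ q \in Q : d(\tilde{q}, q) < \varepsilon \text{ for some } \tilde{q} \in F \}.$$
Then define the Prohorov metric $\rho : \mathcal{P}(Q) \times \mathcal{P}(Q) \to R^+$ by
$$\rho(P_1, P_2) \equiv \inf \{ \varepsilon > 0 : P_1[F] \le P_2[F^\varepsilon] + \varepsilon, \ F \text{ closed}, \ F \subset Q \}.$$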
$\mathcal{P} = \mathcal{P}(Q) = \{P : P \text{ are probability measures on } Q\}$ is a metric space with the Prohorov metric $\rho$.
$(\mathcal{P}(Q), \rho)$ is a complete metric space, and it is compact if Q is compact.
RANDOM VARIABLES and ASSOCIATED METRIC SPACES
PROHOROV METRIC (weak* convergence for $\mathcal{P}(Q) \subset C^*(Q)$)
$$\rho(P_k, P) \to 0 \iff \int_Q g\, dP_k \to \int_Q g\, dP \ \text{ for all } g \in C(Q) \quad \text{("convergence in expectation")}$$
$$\iff P_k[A] \to P[A] \ \text{ for all Borel } A \subset Q \text{ with } P(\partial A) = 0$$
For details on the Prohorov metric and an approximation theory, see [1].
[1] H.T. Banks and K.L. Bihari, Modeling and estimating uncertainty in parameter estimation, CRSC-TR99-40, NCSU, Dec., 1999; Inverse Problems 17 (2001), 1-17.
GENERAL THEORETICAL FRAMEWORK
Application here to ODE systems that include population models
System:
$$\frac{dx}{dt} = f(t, x(t), q), \quad q \in Q, \qquad x(0) = x_0$$
Argue that $(t, x, q) \to f(t, x, q)$ is continuous from $[0, T] \times R^n \times Q$ to $R^n$, locally Lipschitz in x.
Then by "standard" continuous dependence on parameters results for ODE we obtain that $q \to x(t; q)$ is continuous from Q to $R^n$ for each t.
This yields that
$$P \to J(P) = \sum_i \left| \mathcal{E}[C x(t_i; q) : P] - d_i \right|^2$$
is continuous from $\mathcal{P}(Q)$ to $R^1$ with respect to the Prohorov metric $\rho$, and $(\mathcal{P}(Q), \rho)$ is compact.
Then the general theory of Banks-Bihari (Inverse Problems, 2001) can be followed to obtain existence and stability (continuous dependence with respect to observations) of solutions of the inverse problem. Moreover, an approximation theory as a basis for computational methods is obtained.
METHOD STABILITY UNDER APPROXIMATION
Let $Q_M = \{q_j^M\}_{j=1}^M \subset Q$ be such that $\bigcup_M Q_M$ is dense in Q. Define
$$\mathcal{P}^M(Q) = \Big\{ P \in \mathcal{P}(Q) : P = \sum_{j=1}^M p_j \delta_{q_j^M}, \ q_j^M \in Q_M, \ p_j \in R, \ p_j \ge 0 \Big\}.$$
Let $d^k = \{d_i^k\}$ and $\hat{d} = \{\hat{d}_i\}$ be sets of data observations such that $d^k \to \hat{d}$.
Define $P_M^*(d^k)$ = set of minimizers for $J^k(P)$ (the cost with data $d^k$) over $\mathcal{P}^M(Q)$, and $P^*(\hat{d})$ = set of minimizers for $J(P)$ (the cost with data $\hat{d}$) over $\mathcal{P}(Q)$. Let dist(A, B) be the Hausdorff distance between sets A and B.
Theorem: $\mathrm{dist}(P_M^*(d^k), P^*(\hat{d})) \to 0$ as $M \to \infty$, $d^k \to \hat{d}$, so that solutions depend continuously on data and approximate problems are "method stable".
EXAMPLE
Uptake of trichloroethylene (TCE) in fat tissue (PBPK models)
PBPK Models for TCE in Fat Cells
Millions of cells with varying size, residence time, vasculature, geometry: "axial-dispersion" type adipose tissue compartments to embody uncertain physiological heterogeneities in a single organism (rat) = intra-individual variability
Inter-individual variability treated with parameters (including dispersion parameters) as random variables: estimate distributions from aggregate data (multiple rat data) which also contains uncertainty (noise)
Whole-body system of equations
$$V_v \frac{dC_v(t)}{dt} = Q_f C_B(t, \pi - \varepsilon_2) + \frac{Q_{br}}{P_{br}} C_{br}(t) + \frac{Q_k}{P_k} C_k(t) + \frac{Q_l}{P_l} C_l(t) + \frac{Q_m}{P_m} C_m(t) + \frac{Q_t}{P_t} C_t(t) - Q_c C_v(t)$$
$$C_a(t) = \frac{Q_c C_v(t) + Q_p C_c(t)}{Q_c + Q_p / P_b}$$
$$V_{br} \frac{dC_{br}(t)}{dt} = Q_{br} \left( C_a(t) - C_{br}(t)/P_{br} \right)$$
[Adipose tissue compartments: three coupled axial-dispersion partial differential equations for the concentrations $C_B$, $C_I$, $C_A$. The displayed equations are not recoverable from this transcript; see reference 1) below.]
$$V_k \frac{dC_k(t)}{dt} = Q_k \left( C_a(t) - C_k(t)/P_k \right)$$
$$V_l \frac{dC_l(t)}{dt} = Q_l \left( C_a(t) - C_l(t)/P_l \right) - \frac{v_{max}\, C_l(t)/P_l}{k_M + C_l(t)/P_l}$$
$$V_m \frac{dC_m(t)}{dt} = Q_m \left( C_a(t) - C_m(t)/P_m \right)$$
$$V_t \frac{dC_t(t)}{dt} = Q_t \left( C_a(t) - C_t(t)/P_t \right)$$
Plus boundary conditions and initial conditions
Estimation of Non Parameterized Distributions in PBPK/TCE Models
No prior assumptions as to form (shape) of distribution
Use the Prohorov based approximation and convergence theory!!
Example: Bimodal Gaussian $P^*$: $\mu_1 = 1$, $\sigma_1 = .1667$, $\mu_2 = 3$, $\sigma_2 = .2$
[Figure: plot labels P* and P32]
References:
1) R.A. Albanese, H.T. Banks, M.V. Evans, and L.K. Potter, PBPK models for the transport of trichloroethylene in adipose tissue, CRSC-TR01-03, NCSU, Jan. 2001; Bull. Math Biology, 64 (2002), p. 97-131.
2) H.T. Banks and L.K. Potter, Well-posedness results for a class of toxicokinetic models, CRSC-TR01-18, NCSU, July, 2001; Dynamical Systems and Applications, 14 (2005), p. 297-322.
3) L.K. Potter, Physiologically based pharmacokinetic models for the systemic transport of Trichloroethylene, Ph.D. Thesis, NCSU, August, 2001.
4) H.T. Banks and L.K. Potter, Model predictions and comparisons for three toxicokinetic models for the systemic transport of TCE, CRSC-TR01-23, NCSU, August, 2001; Mathematical and Computer Modeling, 35 (2002), p. 1007-1032.
5) H.T. Banks and L.K. Potter, Probabilistic methods for addressing uncertainty and variability in biological models: Application to a toxicokinetic model, CRSC-TR02-27, Sept., 2002; Math. Biosci., 192 (2004), p. 193-225.
GENERIC INVERSE PROBLEM II: "Aggregate" dynamics/aggregate data
Data: $d_i \sim C \bar{x}(t_i; P)$
Dynamics: $\frac{d\bar{x}}{dt} = g(t, \bar{x}(t), P)$
where g can represent an ordinary, functional, or partial differential equation and $\bar{x}$ is the expected value of "states" x
Minimize $J(P) = \sum_i \left| C \bar{x}(t_i; P) - d_i \right|^2$
over $P \in \mathcal{P}(Q)$ = probability measures over Q
Remark: Individual dynamics may not be available
GENERAL THEORETICAL FRAMEWORK
Application here to FDE systems that include HIV models discussed below
System:
$$\frac{dx}{dt} = f(t, x_t, P), \quad P \in \mathcal{P}(Q), \qquad x_0 = \phi, \quad x_t(\theta) = x(t + \theta), \ -r \le \theta \le 0$$
Argue that $(t, \psi, P) \to f(t, \psi, P)$ is continuous from $[0, T] \times C[-r, 0] \times \mathcal{P}(Q)$ to $R^n$, locally Lipschitz in $\psi$.
Then by "continuous dependence on parameters" results for FDE's (extension of standard ODE results to FDE's; see [HTB, SIAM J. Control 1968, 1969]), we obtain that $P \to x(t; P)$ is continuous from $\mathcal{P}(Q)$ to $R^n$ for each t. Then one proceeds with the theory outlined above.
EXAMPLE
HIV pathogenesis in infection
Involves systems of equations of the form (generally nonlinear)
$$\frac{dV}{dt} = -cV(t) + n_A A(t - \tau) + n_C C(t) - n_V V(t) T(t)$$
where $\tau$ is a production delay (distributed across the population of cells). That is, one should write
$$\frac{dV}{dt} = -cV(t) + n_A \int_0^\infty A(t - \tau)\, k(\tau)\, d\tau + n_C C(t) - n_V V(t) T(t)$$
where k is a probability density to be estimated from aggregate data.
Even if k is given, these systems are nontrivial to simulate; they require development of fundamental techniques.
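To see what the distributed delay changes, the integral term can be approximated by quadrature. The sketch below is only an illustration under assumptions that are not in the slides: a hypothetical smooth history A(s) = exp(0.5 s), a delay density k that is uniform on [0, r] (so the integral is truncated at r), and a midpoint rule. The function name `delayed_production` is illustrative.

```python
import math

def delayed_production(A, t, k, r, n=2000):
    """Midpoint-rule approximation of the integral of A(t - tau) * k(tau) over [0, r]."""
    h = r / n
    return sum(A(t - (i + 0.5) * h) * k((i + 0.5) * h) for i in range(n)) * h

def A(s):
    """Hypothetical history of acutely infected cells (assumption for illustration)."""
    return math.exp(0.5 * s)

r = 2.0

def k_uniform(tau):
    """Assumed delay density: uniform on [0, r]."""
    return 1.0 / r

t = 3.0
distributed = delayed_production(A, t, k_uniform, r)  # delay spread over [0, r]
single = A(t - r / 2)                                 # one delay fixed at the mean of k
```

For this convex history the distributed-delay term exceeds the value obtained with a single delay at the mean, so replacing k by a point delay changes the production term even when the mean delay is matched.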
HIV Model:
$$\dot{V}(t) = -cV(t) + n_A \int_0^r A(t - \tau)\, dP_1(\tau) + n_C C(t) - p(V, T)$$
$$\dot{A}(t) = (r_v - \delta_A - \delta X(t)) A(t) - \gamma \int_0^r A(t - \tau)\, dP_2(\tau) + p(V, T)$$
$$\dot{C}(t) = (r_v - \delta_C - \delta X(t)) C(t) + \gamma \int_0^r A(t - \tau)\, dP_2(\tau)$$
$$\dot{T}(t) = (r_u - \delta_u - \delta X(t)) T(t) - p(V, T) + S$$
where
$$V_A(t) = \mathcal{E}_1[V_A(t; \tau_1)] = \int_0^r V_A(t; \tau_1)\, dP_1(\tau_1)$$
$$C(t) = \mathcal{E}_2[C(t; \tau_2)] = \int_0^r C(t; \tau_2)\, dP_2(\tau_2)$$
$$V(t) = V_A(t) + V_C(t)$$
$\tau_1$ = delay from acute infection to viral production
$\tau_2$ = delay from acute infection to chronic infection
A = acute cells, T = target cells, X = total (infected + uninfected) cells
Typical model variables:
V = infectious viral population (expected values)
A = acutely infected cells
C = chronically infected cells (expected values)
T = uninfected target cells
X = A+C+T = total cell population
Some models (especially those involving treatment and control) also entail immune response variables
References:
1) D. Bortz, R. Guy, J. Hood, K. Kirkpatrick, V. Nguyen, and V. Shimanovich, Modeling HIV infection dynamics using delay equations, in 6th CRSC Industrial Math Modeling Workshop for Graduate Students, NCSU (July, 2000), CRSC-TR00-24, NCSU, Oct, 2000.
2) H.T. Banks, D.M. Bortz, and S.E. Holte, Incorporation of variability into the modeling of viral delays in HIV infection dynamics, CRSC-TR01-25, Sept, 2001; Math Biosciences, 183 (2003), p. 63-91.
3) H.T. Banks, Incorporation of uncertainty in inverse problems, CRSC-TR02-08, March, 2002; in Proc. Intl. Conf. Inverse Problems (Hong Kong, Jan. 9-12, 2002), World Scientific Press.
4) H.T. Banks and D.M. Bortz, A parameter sensitivity methodology in the context of HIV delay equation models, CRSC-TR02-24, August, 2002; J. Math Biology, 50 (2005), p. 607-625.
5) D.M. Bortz, Modeling, Analysis, and Estimation of an in vivo HIV Infection Using Functional Differential Equations, Ph.D. Thesis, N.C. State University, August, 2002.
Aggregate Dynamics--Other Applications:
i) Viscoelasticity in composite materials such as tissue
ii) Electromagnetics in materials with dispersion
HTB and G.A. Pinter, A probabilistic multiscale approach to hysteresis in shear wave propagation in Biotissue, CRSC-TR04-03, January, 2004; SIAM J. Multiscale Modeling and Simulation, 3 (2005), 395-412.
HTB and N.L. Gibson, Electromagnetic inverse problems involving distributions of dielectric mechanisms and parameters, CRSC-TR05-29, August, 2005; Quarterly of Applied Mathematics, 64 (2006), 749-795.
Asymptotic Properties of Probability Measure Estimators in a Nonparametric Model
H.T. Banks, Jared Catenacci, and Shuhua Hu
Center for Quantitative Sciences in Biomedicine
Center for Research in Scientific Computation
Department of Mathematics
North Carolina State University
Raleigh, NC
July, 2014
Banks,Catenacci,Hu Asymptotic Properties
Introduction
We consider nonparametric estimation of an unknown probability measure in the case where the regression function is dependent on this measure. More precisely, the statistical model (the model describing the observation process) is described by
$$Y_j = f(t_j; P_0) + E_j, \quad j = 1, 2, 3, \ldots, N. \qquad (1)$$
• $f(t_j; P_0)$ denotes the observed part of the solution of a mathematical model with the true (unknown) probability measure $P_0$ at the measurement point $t_j$
• $E_j$ is the measurement error at $t_j$
• N is the total number of observations, where $t_j \in [t_s, t_f]$, $j = 1, 2, 3, \ldots, N$, with $t_s$ and $t_f$ being some real numbers
Equation (1) is often referred to as a nonparametric statistical model (a model with all the unknown parameters being in an infinite-dimensional parameter space) in the statistics literature. Such models are motivated by a number of applications arising in biology and physics, for example, in modeling mosquitofish populations [BanksFitzpatrick1991] and shrimp populations [BanksDavisErnstbergerHuArtimovichDhar2009], in wave propagation in biotissue [BanksPinter2005], in modeling of complex nonmagnetic dielectric materials [BanksGibson2006], [BanksCatenacciHuKenz2013] and in HIV cellular models [BanksBortz2005]. Here we elaborate on only one of the motivating examples, a recent project [BanksCatenacciHuKenz2013] investigated by our group.
In this project, the goal is to develop a noninvasive technique to characterize the changes or degradation of a complex nonmagnetic dielectric material (such as tissue or inorganic glasses) by assessing the small physical and chemical changes in the material using reflectance spectroscopy. This involves determining the components of the permittivity of the dielectric medium using the measured spectral responses. The relative permittivity of the dielectric medium is described by
$$\varepsilon_r(k; P_0) = \varepsilon_\infty - \int_{\Omega_\theta} \frac{k_p^2}{k^2 - ik/\tau - k_0^2}\, dP_0(\theta). \qquad (2)$$
• $\varepsilon_\infty$ denotes the relative permittivity of the dielectric medium at infinite frequency
• k is the wavenumber ($k = \omega/(2\pi c)$, where $\omega$ is the angular frequency and c is the speed of light)
• $k_0$ represents the resonance wavenumber, $\tau$ denotes the relaxation time
• the composite parameter $k_p$ is given by $k_p = k_0 \sqrt{\varepsilon_s - \varepsilon_\infty}$, with $\varepsilon_s$ being the relative permittivity of the medium at zero frequency, $\theta = k_0 \in \Omega_\theta \subset R$
Figure: A monochromatic uniform wave is incident at an angle $\theta$ on a plane interface between free space and a nonmagnetic dielectric medium, where $\omega = 2\pi c k$ denotes the frequency of the wave and k is the wavenumber.
Assume a monochromatic uniform wave is incident at an angle zero on a plane interface between free space and a nonmagnetic dielectric medium with the electric field polarized perpendicular to the plane of incidence. Then the reflection coefficient is given by
$$r_s(k; P_0) = \frac{1 - \sqrt{\varepsilon_r(k; P_0)}}{1 + \sqrt{\varepsilon_r(k; P_0)}}, \qquad (3)$$
where $\varepsilon_r$ is defined by (2). The observations $f_j$ are the reflectance (the square of the magnitude of the reflection coefficient) at different wavenumbers $k_j$; that is, $f_j = |r_s(k_j; P_0)|^2$. The goal is then to use these observations to estimate the unknown probability measure $P_0$.
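Equations (2) and (3) can be evaluated directly once the measure is discrete. The sketch below is an illustration under assumptions, not the authors' code: the measure is a finite sum of Dirac masses over resonance wavenumbers, all numerical values are hypothetical and unitless, and the principal branch of the complex square root is used.

```python
import cmath

def relative_permittivity(k, nodes, weights, eps_inf, eps_s, tau):
    """Equation (2) for a discrete measure P = sum_j weights[j] * Dirac(nodes[j]).

    Each node is a resonance wavenumber k0, and kp^2 = k0^2 * (eps_s - eps_inf).
    """
    total = 0j
    for k0, w in zip(nodes, weights):
        kp_sq = k0 ** 2 * (eps_s - eps_inf)
        total += w * kp_sq / (k ** 2 - 1j * k / tau - k0 ** 2)
    return eps_inf - total

def reflectance(k, nodes, weights, eps_inf, eps_s, tau):
    """|r_s|^2 with r_s = (1 - sqrt(eps_r)) / (1 + sqrt(eps_r)), equation (3)."""
    root = cmath.sqrt(relative_permittivity(k, nodes, weights, eps_inf, eps_s, tau))
    r_s = (1 - root) / (1 + root)
    return abs(r_s) ** 2

# Hypothetical two-point measure over resonance wavenumbers (values are illustrative).
R_example = reflectance(1.2, [1.0, 1.5], [0.4, 0.6], eps_inf=2.0, eps_s=5.0, tau=10.0)
# Sanity case: eps_s == eps_inf removes the resonance term, so eps_r = eps_inf = 4.
R_flat = reflectance(1.2, [1.0], [1.0], eps_inf=4.0, eps_s=4.0, tau=10.0)
```

In the sanity case the reflection coefficient reduces to (1 - 2)/(1 + 2), so the reflectance is 1/9; this gives a quick check on the sign and branch conventions before any fitting is attempted.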
• The problem here is different from those, for example, in pharmacokinetics studies and HIV studies that estimate both individual-specific parameters $\theta$ (such as the clearance rate of the virus and the infection rate in HIV studies) and their associated probability distribution function $P_0$ from blood samples taken serially in time from individuals in the population; there the data $f_j$ is dependent on $\theta$ instead of $P_0$
• In those studies one has individual longitudinal data instead of aggregate longitudinal data (i.e., data collected by sampling from the population at large)
• The methods used to solve these two types of problems are fundamentally different
We refer the interested reader to [BanksFA2012; BanksHuThompson, Modeling and Inverse Problems in the Presence of Uncertainty, 2014] for more details on this topic.
Theoretical and Computational Framework for Probability Measure Estimation
• observations $Y_j$ in (1) are scalar (the multi-dimensional case can be treated similarly)
• measurement errors $E_j$, $j = 1, 2, 3, \ldots, N$, are independent and identically distributed (i.i.d.) with zero mean and constant variance $\sigma_0^2$, defined on some probability space $(\Omega, \mathcal{F}, \text{Prob})$
• $f(t; P_0)$ correctly describes the observed part of the dynamical system (that is, the underlying mathematical model is correct)
With the i.i.d. assumption on the measurement errors, the estimator of $P_0$ can be obtained using the ordinary least-squares method as defined by
$$P^N = \arg\min_{P \in \mathcal{P}(\Omega_\theta)} \sum_{j=1}^N (Y_j - f(t_j; P))^2, \qquad (4)$$
where $\mathcal{P}(\Omega_\theta)$ denotes the set of probability measures on the space $\Omega_\theta \subset R^{\kappa_\theta}$ with $\kappa_\theta$ being a positive integer. We remark that $P^N$ itself is random in that it is a function of random variables $Y_j$ (and hence $E_j$) on a probability space $(\Omega, \mathcal{F}, \text{Prob})$.
The corresponding realization $\hat{P}^N$ of $P^N$ can be calculated through
$$\hat{P}^N = \arg\min_{P \in \mathcal{P}(\Omega_\theta)} \sum_{j=1}^N (y_j - f(t_j; P))^2, \qquad (5)$$
where $y_j$ is a realization of $Y_j$, $j = 1, 2, 3, \ldots, N$. Thus, we can view $P^N$ as a stochastic process (i.e., $P^N(\theta; \cdot)$ as a one parameter ($\theta \in \Omega_\theta$) family of random variables on the probability space $(\Omega, \mathcal{F}, \text{Prob})$) since each of its realizations yields a probability measure $\hat{P}^N \in \mathcal{P}(\Omega_\theta)$.
The existence of a minimizer to the least-squares optimizationproblem (4) or (5) can be established under the ProhorovMetric Framework. [Prohorov 1956; BanksFA 2012;BanksHuThompson 2014]
Definition
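Let $F \subset \Omega_\theta$ be any closed set and define $F^\epsilon$ as follows:
$$F^\epsilon = \Big\{ \theta \in \Omega_\theta : \inf_{\tilde{\theta} \in F} d(\tilde{\theta}, \theta) < \epsilon \Big\},$$
where d denotes the metric on $\Omega_\theta$. For $P, Q \in \mathcal{P}(\Omega_\theta)$, the Prohorov metric is given by
$$\rho(P, Q) = \inf \{ \epsilon > 0 \mid Q(F) \le P(F^\epsilon) + \epsilon \text{ and } P(F) \le Q(F^\epsilon) + \epsilon \},$$
where the inequalities are required to hold for all F closed in $\Omega_\theta$.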
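As a small worked illustration of the definition (a standard computation, not taken from the slides), consider two Dirac measures $\Delta_{\theta_1}$ and $\Delta_{\theta_2}$ with $\theta_1 \ne \theta_2$. If $\epsilon > d(\theta_1, \theta_2)$, then every closed F containing one atom has the other atom in $F^\epsilon$, so both inequalities hold. If $\epsilon \le d(\theta_1, \theta_2)$ and $\epsilon < 1$, the choice $F = \{\theta_1\}$ violates the inequality, while any $\epsilon \ge 1$ is always admissible. Hence:

```latex
\[
  \rho(\Delta_{\theta_1}, \Delta_{\theta_2})
  = \min\{\, d(\theta_1, \theta_2),\ 1 \,\},
  \qquad\text{since for } \epsilon \le d(\theta_1,\theta_2),\ \epsilon < 1:\quad
  \Delta_{\theta_1}(\{\theta_1\}) = 1
  > \Delta_{\theta_2}\big(\{\theta_1\}^{\epsilon}\big) + \epsilon = \epsilon .
\]
```

So two point masses are close in the Prohorov metric exactly when their atoms are close, which is the behavior one wants when approximating measures by atoms placed on a grid.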
• the meaning of the Prohorov metric is far from intuitive
• there are several useful characterizations
• convergence in the Prohorov metric is equivalent to weak* convergence if we view $\mathcal{P}(\Omega_\theta) \subset C_B^*(\Omega_\theta)$, where $C_B^*(\Omega_\theta)$ denotes the topological dual of the space $C_B(\Omega_\theta)$ of bounded and continuous functions on $\Omega_\theta$
• i.e., $\rho(P_j, P) \to 0$ is equivalent to the statement
$$\int_{\Omega_\theta} h(\theta)\, dP_j(\theta) \to \int_{\Omega_\theta} h(\theta)\, dP(\theta) \quad \text{for any } h \in C_B(\Omega_\theta).$$
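The weak* characterization can be checked numerically for one test function. The sketch below is illustrative only and makes its own assumptions: the limit measure is uniform on [0, 1], the approximating measures place equal mass on the midpoints of M equal cells, and h is a single hypothetical bounded continuous function, so this demonstrates the statement rather than proving convergence.

```python
import math

def dirac_expectation(h, M):
    """Integral of h against P_M, the equal-weight atomic measure on M cell midpoints of [0, 1]."""
    return sum(h((j + 0.5) / M) for j in range(M)) / M

def h(theta):
    """A bounded continuous test function (hypothetical choice)."""
    return math.cos(3.0 * theta)

exact = math.sin(3.0) / 3.0  # integral of h against the uniform measure on [0, 1]
errors = [abs(dirac_expectation(h, M) - exact) for M in (4, 16, 64, 256)]
```

The list `errors` shrinks as M grows, which is what convergence of the integrals for this particular h looks like in practice.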
The Prohorov metric also possesses many useful and important properties. For example, if we assume that $\Omega_\theta$ is compact, then $\mathcal{P}(\Omega_\theta)$ is a compact metric space when taken with the Prohorov metric $\rho$. Based on these discussions, we see that if $\Omega_\theta$ is compact and f is continuous with respect to P, then there exists a solution to (4) or (5).
Consistency of the Probability Measure Estimator
The ideas for establishing the consistency of probability measure estimators follow closely those given in [BanksFitzpatrick1991] and [BanksHuThompson2014.book].
Theorem
Under the usual assumptions on the data and how they are collected, $\rho(P^N, P_0) \xrightarrow{a.s.} 0$ as $N \to \infty$, where $\xrightarrow{a.s.}$ denotes convergence almost surely in $(\Omega, \mathcal{F}, \text{Prob})$. That is,
$$\text{Prob}\Big\{ \omega \in \Omega \ \Big| \ \lim_{N \to \infty} \rho(P^N(\omega), P_0) = 0 \Big\} = 1.$$
Approximation Schemes for Probability Measure Estimation
We note that (5) is an infinite-dimensional optimization problem. Hence, the infinite-dimensional space $\mathcal{P}(\Omega_\theta)$ must be approximated by some finite-dimensional space $\mathcal{P}^M(\Omega_\theta)$ so that one has a computationally tractable finite-dimensional optimization problem given by
$$\hat{P}^N_M = \arg\min_{P \in \mathcal{P}^M(\Omega_\theta)} \sum_{j=1}^N (y_j - f(t_j; P))^2. \qquad (6)$$
However, one needs to choose $\mathcal{P}^M(\Omega_\theta)$ in a meaningful way so that $\hat{P}^N_M$ approaches the solution to (5) as $M \to \infty$.
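Dirac Measure Approximations
Theorem
Assume $\Omega_\theta \subset R^{\kappa_\theta}$ is compact. Let $\Omega_{\theta D} = \{\theta_j\}_{j=1}^\infty$ be an enumeration of a countable dense subset of $\Omega_\theta$. Define
$$\mathcal{P}_D(\Omega_\theta) = \Big\{ P \in \mathcal{P}(\Omega_\theta) \ \Big| \ P = \sum_{j=1}^M a_j \Delta_{\theta_j}, \ \theta_j \in \Omega_{\theta D}, \ a_j \in [0, 1] \cap \mathbb{Q}, \ \sum_{j=1}^M a_j = 1 \Big\},$$
where $\Delta_{\theta_j}$ is the Dirac measure with atom at $\theta_j$, $M \in \mathbb{N}$ and $\mathbb{Q} \subset R$ denotes the set of all rational numbers. (That is, $\mathcal{P}_D(\Omega_\theta)$ is the collection of all convex combinations of Dirac measures on $\Omega_\theta$ with atoms $\theta_j \in \Omega_{\theta D}$ and rational weights.) Then $\mathcal{P}_D(\Omega_\theta)$ is dense in $(\mathcal{P}(\Omega_\theta), \rho)$, and thus $\mathcal{P}(\Omega_\theta)$ is separable.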
Under this Dirac measure approximation framework, we define $\mathcal{P}^M(\Omega_\theta)$ to be the set of all atomic probability measures with nodes placed at the first M elements in the enumeration of the countable dense subset of $\Omega_\theta$; that is,
$$\mathcal{P}^M(\Omega_\theta) = \Big\{ P \in \mathcal{P}(\Omega_\theta) \ \Big| \ P = \sum_{j=1}^M a_j \Delta_{\theta_j}, \ a_j \ge 0 \text{ and } \sum_{j=1}^M a_j = 1 \Big\}. \qquad (7)$$
By the theorem above we know that we can approximate any element $P \in \mathcal{P}(\Omega_\theta)$ by a sequence $\{P_{M_j}\}$, $P_{M_j} \in \mathcal{P}^{M_j}(\Omega_\theta)$, such that $\rho(P_{M_j}, P) \to 0$ as $M_j \to \infty$. We also see that this Dirac measure approximation method can be used regardless of the smoothness of probability measures. This is especially useful in situations where one has no knowledge of the sought-after probability measures.
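With Dirac nodes fixed, problem (6) over the set (7) becomes a least-squares problem in the weights constrained to the probability simplex. The sketch below shows one way this could be set up; it is not the authors' algorithm. It assumes a hypothetical model that is linear in the measure, f(t; P) = integral of exp(-theta t) dP(theta), and it uses projected gradient descent with the sort-based Euclidean projection onto the simplex, a technique swapped in here for illustration. All names and numerical values are illustrative.

```python
import math

def project_to_simplex(v):
    """Euclidean projection of v onto {a : a_j >= 0, sum_j a_j = 1} (sort-based method)."""
    u = sorted(v, reverse=True)
    cumulative, shift = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        candidate = (cumulative - 1.0) / i
        if ui - candidate > 0:
            shift = candidate
    return [max(x - shift, 0.0) for x in v]

def model(t, nodes, a):
    """f(t; P) for P = sum_l a[l] * Dirac(nodes[l]), with assumed kernel exp(-theta * t)."""
    return sum(al * math.exp(-th * t) for th, al in zip(nodes, a))

def cost(a, nodes, times, y):
    """Least-squares cost in (6) for the weight vector a."""
    return sum((yj - model(t, nodes, a)) ** 2 for t, yj in zip(times, y))

def fit_weights(nodes, times, y, iterations=2000):
    """Projected gradient descent on the simplex, started from uniform weights."""
    B = [[math.exp(-th * t) for th in nodes] for t in times]
    step = 1.0 / (2.0 * sum(b * b for row in B for b in row))  # 1 / (bound on Lipschitz constant)
    a = [1.0 / len(nodes)] * len(nodes)
    for _ in range(iterations):
        resid = [sum(b * al for b, al in zip(row, a)) - yj for row, yj in zip(B, y)]
        grad = [2.0 * sum(row[l] * rj for row, rj in zip(B, resid)) for l in range(len(nodes))]
        a = project_to_simplex([al - step * g for al, g in zip(a, grad)])
    return a

nodes = [0.5, 1.0, 1.5, 2.0, 2.5]              # first M = 5 nodes (hypothetical)
times = [0.2 * j for j in range(1, 21)]        # N = 20 measurement points
y = [model(t, nodes, [0.0, 0.3, 0.0, 0.7, 0.0]) for t in times]  # noise-free synthetic data

a_hat = fit_weights(nodes, times, y)
cost_uniform = cost([0.2] * 5, nodes, times, y)
cost_fit = cost(a_hat, nodes, times, y)
```

The constraint handling is the point of the sketch: every iterate is a valid probability vector, so each intermediate result is itself an element of the approximating set. For models that are nonlinear in P, such as the reflectance model, a general constrained optimizer would be needed instead.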
Linear Spline Approximations (when probability measures are absolutely continuous, so that their corresponding probability density functions exist) [BanksPinter 2005].
Theorem
Assume $\Omega_\theta \subset R^{\kappa_\theta}$ is compact. Define
$$\mathcal{P}_S(\Omega_\theta) = \Big\{ P \in \mathcal{P}(\Omega_\theta) \ \Big| \ P'(\theta) = \sum_{j=1}^M a_j l_j^M(\theta), \ a_j \in [0, \infty) \cap \mathbb{Q}, \ \sum_{j=1}^M a_j \int_{\Omega_\theta} l_j^M(\xi)\, d\xi = 1, \ M \in \mathbb{N} \Big\},$$
where $P'$ denotes the derivative of P with respect to $\theta$, the $l_j^M$ denote the usual piecewise linear splines, and $\mathbb{Q} \subset R$ denotes the set of all rational numbers. Then $\mathcal{P}_S(\Omega_\theta)$ is dense in $\mathcal{P}(\Omega_\theta)$.
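Under this linear spline approximation framework, we define $\mathcal{P}^M(\Omega_\theta)$ to be
$$\mathcal{P}^M(\Omega_\theta) = \Big\{ P \in \mathcal{P}(\Omega_\theta) \ \Big| \ P'(\theta) = \sum_{j=1}^M a_j l_j^M(\theta), \ a_j \ge 0, \ \sum_{j=1}^M a_j \int_{\Omega_\theta} l_j^M(\xi)\, d\xi = 1 \Big\}. \qquad (8)$$
By the above Theorem, we know that we can approximate any element $P \in \mathcal{P}(\Omega_\theta)$ by a sequence $\{P_{M_j}\}$, $P_{M_j} \in \mathcal{P}^{M_j}(\Omega_\theta)$, such that $\rho(P_{M_j}, P) \to 0$ as $M_j \to \infty$.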
The following theorem provides the desired convergence result, which follows immediately from the Prohorov metric framework and convergence theorems of [BanksKunisch1989] as well as the results above.
Theorem
Assume $\Omega_\theta$ is compact and $\mathcal{P}(\Omega_\theta)$ is taken with the Prohorov metric. If f is continuous with respect to P, then there exists a minimizer $\hat{P}^N_M$ to (6), where $\mathcal{P}^M(\Omega_\theta)$ is chosen as either (7) or (8). Moreover, the sequence $\{\hat{P}^N_M\}$ has at least one convergent subsequence, and the limit $\hat{P}^{N*}$ of such a subsequence is a minimizer to the least-squares problem (5).
Remark
• Dirac measure approximation methods and the spline-based approximation methods have been successfully used to estimate probability measures in a number of applications
• it was demonstrated in [BanksDavis2007] that if the sought-after probability measure is absolutely continuous, then the spline-based approximation methods converge much faster than do the Dirac measure approximation methods (in terms of the value of M)
• moreover, it was observed that spline-based approximation methods also provide convergence for associated probability density functions while the Dirac measure approximation methods do not do this
• in spline-based approximation methods one directly approximates associated probability density functions instead of cumulative distribution functions
Bias and Variance in Probability Measure Estimation
As we discussed in the above section, what one actually does in practice is to minimize the cost functional in a finite-dimensional space; that is, one solves the optimization problem
$$P^N_M = \arg\min_{P \in \mathcal{P}^M(\Omega_\theta)} \sum_{j=1}^N (Y_j - f(t_j; P))^2. \qquad (9)$$
For example, if one uses the Dirac measure approximation methods, then $P^N_M = \Delta^T A^N_M$, where $\Delta = \Delta(\theta) = (\Delta_{\theta_1}, \Delta_{\theta_2}, \ldots, \Delta_{\theta_M})^T$, and
$$A^N_M = \arg\min_{a^N_M \in R^M} \sum_{j=1}^N \Big[ Y_j - f\Big(t_j; \sum_{l=1}^M a^N_{M,l} \Delta_{\theta_l}\Big) \Big]^2. \qquad (10)$$
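Here $R^M = \Big\{ a^M = (a_1^M, a_2^M, \ldots, a_M^M)^T \ \Big| \ a_j^M \ge 0, \ j = 1, 2, \ldots, M, \ \sum_{j=1}^M a_j^M = 1 \Big\}$.
The corresponding realization of (9) is given by
$$\hat{P}^N_M = \arg\min_{P \in \mathcal{P}^M(\Omega_\theta)} \sum_{j=1}^N (y_j - f(t_j; P))^2; \qquad (11)$$
that is, $\hat{P}^N_M = \Delta^T \hat{a}^N_M$, where $\Delta = (\Delta_{\theta_1}, \Delta_{\theta_2}, \ldots, \Delta_{\theta_M})^T$, and
$$\hat{a}^N_M = \arg\min_{a^N_M \in R^M} \sum_{j=1}^N \Big[ y_j - f\Big(t_j; \sum_{l=1}^M a^N_{M,l} \Delta_{\theta_l}\Big) \Big]^2. \qquad (12)$$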
In essence, one presumes that the data was generated using the following statistical model
$$Y_j = f(t_j; a_{0,M}) + E_j, \quad j = 1, 2, 3, \ldots, N. \qquad (13)$$
In the above equation, $f(t_j; a_{0,M}) = f(t_j; P_{0,M})$, where $P_{0,M} = \Delta^T a_{0,M} \in \mathcal{P}^M(\Omega_\theta)$, and $a_{0,M} \in R^M$ is the one that minimizes
$$J_0(a^M) = \sigma_0^2 + \int_{t_s}^{t_f} (f(t; P_0) - f(t; a^M))^2\, d\mu(t) \qquad (14)$$
over $R^M$. In other words, the functional $J_0$ defined by
$$J_0(P) = \sigma_0^2 + \int_{t_s}^{t_f} (f(t; P_0) - f(t; P))^2\, d\mu(t) \qquad (15)$$
has a minimizer $P_{0,M}$ in $\mathcal{P}^M(\Omega_\theta)$ for each fixed M.
Thus, we have a model "misspecification", which is due to the approximation of the infinite-dimensional space $\mathcal{P}(\Omega_\theta)$ by the finite-dimensional space $\mathcal{P}^M(\Omega_\theta)$. Under this framework, the total error between the true model (1) and the approximating model (13) can be characterized by (illustrated in Figure 1)
$$\rho(P_0, P_{0,M}) + \rho(P_{0,M}, \hat{P}^N_M),$$
where the first term $\rho(P_0, P_{0,M})$ is a measure of the accuracy of the approximating model and is often called bias in the statistics literature, and the second term $\rho(P_{0,M}, \hat{P}^N_M)$ is a measure of estimation accuracy and is often called variance.
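The reason this sum controls the distance between the true measure and the computed estimate is the triangle inequality for the Prohorov metric. Written out as a one-line check (the hat marks the estimate obtained from the finite-dimensional problem, a notational assumption used here for illustration):

```latex
\[
  \rho\big(P_0, \hat{P}^N_M\big)
  \;\le\;
  \underbrace{\rho\big(P_0, P_{0,M}\big)}_{\text{bias (approximation)}}
  \;+\;
  \underbrace{\rho\big(P_{0,M}, \hat{P}^N_M\big)}_{\text{variance (estimation)}} .
\]
```

Driving the left-hand side to zero therefore requires controlling both terms, which is why M and N have to be chosen together.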
Figure: Illustration of the bias and the variance in the probabilitymeasure approximation.
Using properties of the Prohorov metric and the results above, we find the bias $\rho(P_0, P_{0,M})$ approaches zero as $M \to \infty$. However, for fixed N the variance in general increases as the value of M increases (e.g., see [BurnhamAnderson]); that is, we have less confidence in the parameter estimates as the number of approximating parameters increases. Hence, there is a trade-off between the bias and the variance.
Figure: Illustration of the trade-off between the bias and the variance.
• Model selection criteria (Akaike Information Criterion and Bayesian Information Criterion) have been widely used to select a best approximating model from a prior set of candidate models
• all are based to some extent on the principle of parsimony (see [BHT.book, BurnhamAnderson])
• the goal in model selection is to simultaneously minimize both bias (modeling error) and variance (estimation error)
• use a model selection criterion to select a best M value (i.e., a best approximating model)
Pointwise Asymptotic Normality of the Approximate Probability Measure Estimator
In this section, we consider the pointwise asymptotic normality of the least-squares estimator $P^N_M$. Since for any given $\theta$, $P^N_M(\theta)$ is linearly dependent on $A^N_M$ (for example, in the case where the Dirac measure approximation is used, $P^N_M(\theta) = (\Delta(\theta))^T A^N_M$), we first consider the asymptotic normality of $A^N_M$.
Theorem
Under standard assumptions, for each fixed M we have
$$\sqrt{N} \big( A^N_M - a_{0,M} \big) \xrightarrow{d} Z \sim \mathcal{N}(0, \Sigma_{0,M}), \quad \text{as } N \to \infty, \qquad (18)$$
where $\xrightarrow{d}$ denotes convergence in distribution, $\Sigma_{0,M} = \big( H(a_{0,M}) \big)^{-1} F(a_{0,M}) \big( H(a_{0,M}) \big)^{-1}$, and $\mathcal{N}(0, \Sigma_{0,M})$ represents a multivariate normal distribution with zero mean and covariance matrix $\Sigma_{0,M}$.
If one uses the Dirac measure approximation method, then by (22) we know that for any sufficiently large N
$$P^N_M(\theta) \sim \mathcal{N}\Big( (\Delta(\theta))^T \hat{a}^N_M, \ \tfrac{1}{N} (\Delta(\theta))^T \Sigma^N_M \Delta(\theta) \Big) \qquad (26)$$
holds for any fixed $\theta \in \Omega_\theta$. Similarly, one can use (22) to obtain the pointwise asymptotic result for $P^N_M(\theta)$ in the case where the linear spline approximation method is employed.
Numerical Results
We use the motivating example in the Introduction to demonstrate our theoretical results through simulated data. Specifically, we consider the following nonparametric model
$$Y_j = |r_s(k_j; P_0)|^2 + \mathcal{E}_j, \quad j = 1, 2, 3, \dots, N, \tag{27}$$
with $r_s$ given by (3). The simulated data are then generated by simulating
$$y_j = |r_s(k_j; P_0)|^2 + \epsilon_j, \quad j = 1, 2, 3, \dots, N. \tag{28}$$
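To make (28) concrete, the following is a minimal Python sketch of how such a data set could be simulated. It is not the authors' code: the function `placeholder_reflectance` is a hypothetical stand-in for $|r_s(k; P_0)|^2$ (equation (3) is not reproduced here), and the wavenumber grid, the noise standard deviation, and the choice of normally distributed errors are assumptions made only for illustration.

```python
import random


def simulate_observations(model, wavenumbers, noise_sd, seed=0):
    """Return y_j = model(k_j) + eps_j with eps_j i.i.d. N(0, noise_sd**2)."""
    rng = random.Random(seed)  # local generator so that runs are reproducible
    return [model(k) + rng.gauss(0.0, noise_sd) for k in wavenumbers]


def placeholder_reflectance(k):
    """Hypothetical smooth stand-in for |r_s(k; P0)|^2; NOT equation (3)."""
    return 0.1 + 0.5 / (1.0 + ((k - 700.0) / 100.0) ** 2)


# Assumed grid of N = 101 wavenumbers between 400 and 1100 1/cm.
wavenumbers = [400.0 + 7.0 * j for j in range(101)]
data = simulate_observations(placeholder_reflectance, wavenumbers,
                             noise_sd=0.01, seed=1)
```

Passing the model as a function keeps the noise step separate from the forward model, so the placeholder can be swapped for an actual evaluation of (3).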
Optimal Value of M
can use the Akaike Information Criterion (AIC) (1973) to determine the optimal value of M, where, e.g., the probability measure is obtained by the linear spline approximation method
one of the most widely used model selection criteria, based on Kullback-Leibler information (a well-known measure of "distance" between two probability density functions) and maximum likelihood estimation
can be used to compare both nested models and non-nested models, and it can also be used to compare multiple models at a time
For the least squares case, it can be found (e.g., see [BurnhamAnderson2002], [BanksHuThompson2014]) that if the measurement errors are i.i.d. normally distributed, then the AIC is given by
$$\mathrm{AIC} = N \log\Bigl(\frac{\mathrm{RSS}}{N}\Bigr) + 2(M + 1). \tag{29}$$
Here M + 1 is the total number of estimated parameters, including the coefficients for the splines and the variance of the measurement errors, and RSS denotes the residual sum of squares given by
$$\mathrm{RSS} = \sum_{j=1}^{N} \bigl(y_j - |r_s(k_j; P^N_M)|^2\bigr)^2.$$
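As a small illustration of (29), the sketch below computes the RSS from observed and fitted values and then the least-squares AIC. The function names and the toy numbers are ours, chosen only for illustration; the fitted values would in practice come from evaluating the model at the estimated measure.

```python
import math


def residual_sum_of_squares(observed, fitted):
    """RSS = sum over j of (y_j - fitted_j)^2."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    return sum((y - f) ** 2 for y, f in zip(observed, fitted))


def aic_least_squares(rss, n_obs, n_coeffs):
    """AIC = N log(RSS / N) + 2 (M + 1), with M = n_coeffs.

    The extra parameter accounts for the estimated error variance.
    """
    if rss <= 0 or n_obs <= 0:
        raise ValueError("rss and n_obs must be positive")
    return n_obs * math.log(rss / n_obs) + 2 * (n_coeffs + 1)


# Toy numbers (made up, not from the slides).
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
rss = residual_sum_of_squares(observed, fitted)
aic = aic_least_squares(rss, n_obs=len(observed), n_coeffs=2)
```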
given a prior set of candidate models, one calculates the AIC value for each model; the best approximating model is the one with the minimum AIC value
the AIC value depends on the data set used, and one must use the same data set to calculate the AIC values for each of the models
the AIC may perform poorly if the sample size N is small relative to the total number of estimated parameters (it is suggested in [BurnhamAnderson2002] that the AIC should be used only if the sample size is at least 40 times the total number of estimated parameters)
Otherwise, one needs to use the small sample AIC, the so-called AICc, which is given by
$$\mathrm{AIC}_c = \mathrm{AIC} + \frac{2(M + 1)(M + 2)}{N - M - 2}. \tag{30}$$
For more information on the AIC and its variations, we refer the interested reader to [BurnAnd] and [BHT, Chapter 4].
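The correction (30) and the 40-times rule of thumb quoted above can be sketched as follows. The helper names are ours, and applying AICc to every candidate (rather than mixing AIC and AICc) is a design choice of this sketch, made so that all candidates are scored on the same scale.

```python
def aicc_from_aic(aic, n_obs, n_coeffs):
    """AICc = AIC + 2 (M + 1)(M + 2) / (N - M - 2), with M = n_coeffs."""
    denom = n_obs - n_coeffs - 2
    if denom <= 0:
        raise ValueError("AICc requires N > M + 2")
    return aic + 2.0 * (n_coeffs + 1) * (n_coeffs + 2) / denom


def needs_small_sample_correction(n_obs, n_coeffs, ratio=40):
    """Rule of thumb: plain AIC only if N >= ratio * (M + 1)."""
    return n_obs < ratio * (n_coeffs + 1)


def select_best_m(aic_by_m, n_obs):
    """Return (M, AICc) for the candidate with the smallest AICc.

    aic_by_m maps each candidate M to its AIC value; all values must
    have been computed from the same data set of size n_obs.
    """
    scores = {m: aicc_from_aic(a, n_obs, m) for m, a in aic_by_m.items()}
    best_m = min(scores, key=scores.get)
    return best_m, scores[best_m]


# Made-up AIC values, only to show that the penalty can change the ranking.
best_m, best_score = select_best_m({5: -10.0, 10: -12.0}, n_obs=100)
```

In this toy case the model with M = 10 has the smaller AIC, but its larger correction term makes M = 5 the minimizer of AICc.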
set of our candidate models is chosen as model (13) with M = 5, 10, 15, 20, 25 and 30, and Ej, j = 1, 2, 3, . . . , N, being i.i.d. normally distributed with zero mean and constant variance
use the AICc to select the best model (sample size is less than 40 times total number of estimated parameters)
model with M = 15 is the one with minimum AICc value
Figure: The AICc values with M = 5, 10, 15, 20, 25 and 30. (Plot of AICc against M.)
Pointwise Confidence Band
We can construct the pointwise confidence band for $P^N_M$ by using the asymptotic normality results presented above, where M is chosen as the optimal value M = 15 obtained in the above analysis. That is, by (22) we know that for any sufficiently large N
$$P^N_M(k_0) \sim \mathcal{N}\Bigl(\hat{P}^N_M(k_0),\ \tfrac{1}{N}\,(L(k_0))^T \Sigma^N_M L(k_0)\Bigr) \tag{31}$$
holds for any fixed $k_0 \in [\underline{k}_0, \bar{k}_0]$. In the above equation, $\hat{P}^N_M(k_0) = (L(k_0))^T a^N_M$ is the estimate, where
$$L(k_0) = \Bigl(\int_{\underline{k}_0}^{k_0} l_1(\xi)\,d\xi,\ \int_{\underline{k}_0}^{k_0} l_2(\xi)\,d\xi,\ \dots,\ \int_{\underline{k}_0}^{k_0} l_M(\xi)\,d\xi\Bigr)^T.$$
One can then use (31) to construct the pointwise 100(1 − α)% level confidence band, centered at the estimate $\hat{P}^N_M(k_0)$, which is given by
$$\bigl[\hat{P}^N_M(k_0) - t_{1-\alpha/2}\,\mathrm{SEPAN}(k_0),\ \hat{P}^N_M(k_0) + t_{1-\alpha/2}\,\mathrm{SEPAN}(k_0)\bigr], \quad k_0 \in [\underline{k}_0, \bar{k}_0].$$
Here $\mathrm{SEPAN}(k_0) = \sqrt{\tfrac{1}{N}\,(L(k_0))^T \Sigma^N_M L(k_0)}$, and the critical value $t_{1-\alpha/2}$ is determined by $\mathrm{Prob}\{T \geq t_{1-\alpha/2}\} = \alpha/2$, where T has a Student's t distribution $t_{N-M}$ with N − M degrees of freedom. For the simulations illustrated below, $l_j$ is the jth piecewise linear spline element using equally spaced nodes, and central difference schemes are used to approximate the first and second order derivatives involved in the covariance matrix $\Sigma^N_M$.
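The band computation at a single point k0 reduces to an inner product, a quadratic form, and a critical value. A minimal Python sketch under stated assumptions: the vector L(k0), the coefficient estimate, and the covariance estimate are taken as given inputs, and the critical value `t_crit` is supplied by the caller (for example, it could be obtained with `scipy.stats.t.ppf(1 - alpha / 2, df=N - M)`, which is not used here so that the block needs only the standard library). All names and the toy inputs are illustrative.

```python
import math


def quadratic_form(vec, matrix):
    """Return vec^T matrix vec for a list vector and a list-of-lists matrix."""
    size = len(vec)
    return sum(vec[i] * matrix[i][j] * vec[j]
               for i in range(size) for j in range(size))


def pointwise_band(l_vec, a_hat, sigma_hat, n_obs, t_crit):
    """Return (lower, upper) of the pointwise band at one k0.

    center = L(k0)^T a_hat
    se     = sqrt( (1/N) L(k0)^T Sigma_hat L(k0) )
    """
    center = sum(l * a for l, a in zip(l_vec, a_hat))
    se = math.sqrt(quadratic_form(l_vec, sigma_hat) / n_obs)
    return center - t_crit * se, center + t_crit * se


# Toy inputs (made up): two coefficients, diagonal covariance, N = 100.
lower, upper = pointwise_band(
    l_vec=[1.0, 0.0],
    a_hat=[0.3, 0.7],
    sigma_hat=[[4.0, 0.0], [0.0, 1.0]],
    n_obs=100,
    t_crit=2.0,
)
```

Evaluating `pointwise_band` on a grid of k0 values, each with its own L(k0), traces out the lower and upper bands.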
The pointwise confidence bands for $P^N_M$
Pointwise asymptotic normality vs. Monte Carlo (MC) simulations (α = 0.1, K = 1000). The two bands are similar, except that in the plateau regions the band from the asymptotic results is wider than the one from MC.
Figure: Pointwise confidence bands for the cumulative distribution function: pointwise asymptotic normality results (left) and those using MC simulations (right). Both panels plot the probability distribution against k0 (1/cm) and show the true distribution, the estimated distribution, the lower band, and the upper band.
Figure: Model fit (left) and the estimated distribution (right) from the full inverse problem where N = 25 for Vitreous Germania. Left panel: data and model fit, |rs(k)|^2 against k (cm^-1). Right panel: probability distribution against k0 (1/cm).
Banks,Catenacci,Hu Composite Material Reflectivity
Figure: The estimated distributions for all values of N considered from the Vitreous Germania data. Probability distribution against k0 (1/cm), with one curve for each of N = 5, 10, 15, 20 and 25.
The parameter estimates for εs, ε∞ and τ are given in the Table below.
N     εs       ε∞       τ (cm)
5     2.7677   2.1518   0.0275
10    2.5768   1.9634   0.0418
15    2.5999   1.9904   0.0525
20    2.4677   1.8341   0.0578
25    2.4361   1.7732   0.0581
Table: Estimates obtained from the reflectivity data for Vitreous Germania using various numbers of Dirac measures.
H.T. Banks, A Functional Analysis Framework for Modeling, Estimation and Control in Science and Engineering, Chapman and Hall/CRC Press, Boca Raton, FL, 2012.
H.T. Banks and K.L. Bihari, Modeling and estimating uncertainty in parameter estimation, Inverse Problems, 17 (2001), 95–111.
H.T. Banks, S. Hu and W.C. Thompson, Modeling and Inverse Problems in the Presence of Uncertainty, Taylor/Francis-Chapman/Hall-CRC Press, Boca Raton, FL, 2014.
A.M. Efimov, Optical Constants of Inorganic Glasses, CRC Press, Boca Raton, FL, 1995.
Yu.V. Prohorov, Convergence of random processes and limit theorems in probability theory, Theor. Prob. Appl., 1 (1956), 157–214.