
SIAM/ASA J. UNCERTAINTY QUANTIFICATION, Vol. 1, pp. 164–191. © 2013 Society for Industrial and Applied Mathematics and American Statistical Association.

Propagation of Uncertainties Using Improved Surrogate Models∗

T. Butler†, C. Dawson‡, and T. Wildey§

Abstract. We study the effect of various sources of error on the propagation of uncertain parameters and data through surrogate response surfaces approximating quantities of interest from stochastic differential equations. The main result centers on a novel approach for improving the pointwise accuracy of a surrogate with the use of an adjoint-based a posteriori estimate of its error. A general error analysis on propagated distribution functions for both forward and inverse problems is derived. To provide concrete examples focusing on the use of the improved surrogate, we consider standard polynomial spectral methods to approximate the surrogate. However, neither the definition of the improved surrogate nor the general error analysis for the computed distribution functions requires a specific surrogate formulation. Numerical results comparing pointwise errors in propagated distributions using a surrogate versus its improved counterpart demonstrate global decreases in both actual error and in error bounds for both forward and inverse problems.

Key words. a posteriori error analysis, adjoint problem, polynomial chaos, stochastic spectral methods, Bayesian inference, distribution function

AMS subject classifications. 60H15, 60H35, 65M32, 65M15, 62F15, 35K57

DOI. 10.1137/120888399

1. Introduction. There is considerable interest in developing efficient and accurate methods to quantify the uncertainty in computational differential equation models [48, 47, 40]. Often, a two stage approach for solving this problem is formulated. First, a large number of samples of model parameters or input data are determined in terms of realizations of a stochastic process. Second, the probability distribution is approximately propagated through the computational model to the output or observable data.

The first stage involves a priori knowledge and is often a modeling decision of the user, e.g., choosing to model a parameter as a random process with some specified distribution. The majority of the computational burden is in the second stage, involving the propagation of inputs to outputs. A simple approach is Monte Carlo simulation [43, 57], where the model is solved for each randomly generated input sample resulting in samples of the output distribution. From these output samples, statistics or density estimates on specific quantities of interest may be computed. While the implementation is straightforward, this method can become computationally prohibitive for complex models where a large number of runs of the computational model may be infeasible.

∗Received by the editors August 17, 2012; accepted for publication (in revised form) March 18, 2013; published electronically May 28, 2013. http://www.siam.org/journals/juq/1/88839.html

†Department of Statistics, Colorado State University, Fort Collins, CO 80523 ([email protected]).

‡Institute for Computational Engineering and Sciences (ICES), University of Texas at Austin, Austin, TX 78712 ([email protected]).

§Sandia National Labs, Albuquerque, NM 87185 ([email protected]). Sandia is a multiprogram laboratory operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94-AL85000.


Further complicating the task of quantifying the uncertainty reliably is that each output sample is polluted by discretization error in the computational model. Often, we can estimate and correct for this deterministic source of error in the computed quantities of interest using traditional error estimation techniques such as the well-known adjoint-based a posteriori error estimation techniques [26, 22, 23, 24, 65, 33, 53, 42, 58]. Except in special cases, correcting for the deterministic error at each sample of a quantity of interest requires solving an adjoint problem. In general, this implies that if we can afford at least a two-fold increase in the computational cost, then the primary source of error in using a standard Monte Carlo method is the statistical error resulting from the use of finite sample sizes. Unfortunately, this statistical error can be quite large if only a limited number of samples can be computed.

An alternative approach for propagation of distributions is to construct a surrogate response surface and propagate samples using this surrogate rather than the full computational model. This approach essentially eliminates the statistical error component from the computed distribution since each sample of the surrogate model output has a very low computational cost, implying a large number of samples may be taken. However, each of these samples may be contaminated by deterministic error from numerically constructing the surrogate approximation. Thus, the trade-off is that statistical error may be neglected at the cost of possibly large deterministic sources of error. Recent work has shown that these deterministic sources of error can be efficiently estimated using a surrogate to the adjoint model [10, 9].

The purpose of this paper is to demonstrate the benefits of using an improved surrogate in terms of improved pointwise accuracy of propagated distributions compared to propagation of identical distributions through the original surrogate response surface. The improved surrogate is defined using a posteriori estimates for the deterministic error of the original surrogate. Thus, the deterministic error is computed as a function of the variable input parameter. We combine and modify existing analysis to derive general error bounds on propagated distributions through surrogate models for both forward and inverse problems. In the numerical examples, we apply the error bounds to both surrogate and improved surrogate response surfaces, demonstrating the improved pointwise accuracy of distributions for both forward and inverse problems.

We note that most of the growing literature addresses issues of error from the use of surrogates in terms of convergence of statistical moments or using the Kullback–Leibler divergence for an analysis of posterior error in a Bayesian framework (see, e.g., [66, 51] as excellent examples of the former and latter, respectively). This is different from the type of error analysis discussed here. We are primarily interested in the accurate computation of probabilities of events. Thus, the focus is on the pointwise error in the propagated distributions. This work is not meant to displace these other methods for addressing a different type of error but rather complement the growing work in quantifying error in a propagated distribution. Besides the authors' recent work [10, 9] on deriving a posteriori pointwise estimates of deterministic error in surrogate models based on stochastic spectral approaches, we direct the interested reader to [2] for an alternative approach using sparse grids and stochastic collocation that also addresses mesh adaptivity.

We observe that there are many ways to compute surrogate models. In this paper, we focus on surrogate models based on polynomial approximations computed using stochastic spectral methods [41, 66, 62, 61, 48, 40, 10].


The use of stochastic spectral methods has become an increasingly popular way to construct surrogate response surfaces that reduce the computational burden of Monte Carlo [41, 66, 63, 13, 51, 48, 47, 40]. These methods have been used for both the forward propagation of distributions [66, 48, 47, 40] and the subsequent statistical inference (Bayesian) problems [51]. In practice, the spectral approximation is truncated and the resulting coupled system is solved numerically, both of which introduce errors. While these representation and discretization errors are uncertain, they are completely deterministic and should not be treated as random variables with assumed distributions [56]. We address the effect of these errors in both the propagated distributions and also in the formulation of the (inverse) problem to account for these deterministic uncertainties.

This paper is structured as follows. In section 2, we present the basic abstract model framework for propagation of uncertainties through a response surface, some general notation, and sources of error. The deterministic sources of error considered are representation error resulting from the use of a surrogate in place of the exact response surface and discretization error from solving, for example, differential equations. We outline in section 3 and its subsections how to compute a specific type of surrogate response surface resulting from a solution to a stochastic differential equation. In section 4, we describe how to compute an error estimate for these surrogate quantities of interest. An additional estimation of an adjoint solution is used in the error estimate to define an improved surrogate response surface in section 4.3. We then provide computational details for the abstract forward and inverse problems in sections 5.1 and 6.1, respectively. The effect of the deterministic error on the propagated uncertainties in forward and inverse problems is presented in sections 5.2.1 and 6.2.1. For completeness and context, we show bounds on the statistical component of error due to finite sampling in forward and inverse problems in sections 5.2.2 and 6.2.2, respectively. In section 7, we provide numerical examples.

2. Model framework and sources of error. We study the problem of quantifying the uncertainty associated with solving a stochastic differential equation. Specifically, we consider a differential equation with uncertain input parameters and/or data, e.g., a diffusion tensor and/or external forcing, modeled as random processes (we use r.v. as an abbreviation for a random process, variable, and vector). Often, the motivation for solving a differential equation is to compute a small number of low-dimensional quantities of interest (i.e., output data) involving linear functionals of the solution, e.g., the values of the solution at particular points in the space-time domain. While certain components of the model are treated as random processes, the model itself is deterministic in the sense that specified and fixed choices of parameters and input data produce unique output data.

We consider the forward (and inverse) sensitivity analysis problem from the point of view of sensitivity of output (and input) data to uncertainty in input (and output) data. The main focus is how probability distributions on input (and output) data propagate through response surfaces to output (and input) data. In this paper, we use modern techniques to compute a surrogate response surface for a quantity of interest and we address the following issues:

1. Using a posteriori techniques to compute an improved surrogate.
2. Distinguishing between the deterministic and stochastic sources of error.
3. Quantifying the effect of these errors in propagated probability distributions.


To make the framework for both the forward and inverse sensitivity analysis problems more precise, assume we are given the following: (1) input parameters λ in parameter space Λ ⊂ R^k, (2) a deterministic model with solution u(λ) depending on parameters λ, and (3) linear functional(s) q(λ) = q(u(λ)) taking values in an output data space D. We assume that the map q : Λ → D is at least piecewise smooth. The error estimate we derive can be used to correct for certain sources of discontinuities in the response surface [9], and in practice piecewise smoothness is often required for certain representation error bounds referenced in the numerical results.

Often, there are two types of error considered in the approximation of probability distributions for both the forward and inverse problems: deterministic and statistical errors. One source of deterministic error is representation error in using a surrogate. For the numerical results in this paper, we use polynomial spectral approximations to compute surrogate response surfaces. In practice, the order of the spectral approximation is truncated, leading to representation error in the computed surrogate q̂(λ). Second, there is numerical error introduced from the numerical computation of the coefficients of the surrogate response surface q̂(λ), e.g., as happens if computing q̂(λ) involves solving a differential equation. We refer to this error as discretization error. In section 3, we describe more precisely these sources of error in q̂(λ) and how we may estimate and correct for the total deterministic error, denoted by εD(λ). Using this notation, we write

(2.1)  q(λ) = q̂(λ) + εD(λ).

The main thrusts of this paper are to estimate εD(λ) using a posteriori techniques, to use this error estimate to improve the surrogate response surface, and to analyze the general effects of deterministic error on the propagation of uncertainty using the improved surrogate response surface.

Computable bounds on statistical error introduced from finite sampling are included for completeness, and the interested reader can use these bounds to determine the size of the sample set required for accurate computations of probabilities. In general, statistical error can be reduced by using more samples or by utilizing a more effective sampling strategy, e.g., importance sampling. We assume the generation of samples in Λ is straightforward, and the use of the surrogate q̂(λ) (and the improved surrogate) makes evaluation of the response surface relatively cheap. Thus, we assume that we may generate sample distributions using sample sizes sufficiently large in order to render the statistical error negligible.

3. Defining the surrogate model. We briefly describe below how to compute the surrogate models we consider in this paper. As discussed in previous work (see [9]), we make note that the error estimation results derived below that are used to define the improved surrogate do not require the use of polynomial spectral approximations, which are sometimes referred to as polynomial chaos (PC) expansions. Furthermore, the general error analysis of the distributions presented below does not require that such spectral approximations be used for defining the surrogate model. The theory of PC expansions is well established, and the technique provides a convenient way to illustrate the results numerically.

Below, let {Ω, F, P} denote a probability space, where Ω is the space of outcomes, F is a σ-algebra over Ω, and P is a probability measure on F.


Let {Z_i(ω)}_{i=1}^∞ denote a set of r.v.'s, i.e., Z_i : ω ∈ Ω → R. We denote particular realizations of r.v.'s with lower case letters so that z = Z(ω) for a particular ω ∈ Ω. Let {Φ_m(z)}_{m=1}^∞ denote (multivariate) orthogonal polynomials with respect to the choice of r.v. Z. We form multivariate orthogonal polynomials by computing tensor products of univariate orthogonal polynomials. In the interest of space we skip over many details of this well-established theory and direct the interested reader to [63, 13, 66, 62, 41, 61, 16, 48, 47, 40, 10, 9, 15] for development and analysis of PC expansions. For details on multivariate polynomials see [36, 18]. We focus our attention on the spectral Galerkin method, and we observe that the pseudospectral projection method is one of many possible useful alternatives that could also be applied for computing surrogate response surfaces of this type [9, 15].

3.1. Spectral Galerkin approximation of random variables. We are interested in expanding a general finite-variance r.v. X(ω) in terms of a basis {Φ_m(z)}_{m=1}^∞ orthogonal with respect to the density ρ(z) of r.v. Z. For a particular choice of basis {Φ_m(z_{i_1}, . . . , z_{i_m})}_{m=1}^∞, orthogonality holds with respect to the inner product ⟨·,·⟩ on L²(Ω), i.e.,

(3.1)  ⟨Φ_m, Φ_n⟩ = ∫_Ω Φ_m(Z(ω)) Φ_n(Z(ω)) dP(ω) = ∫ Φ_m(z) Φ_n(z) ρ(z) dz = δ_{mn} ‖Φ_m‖².

Here, ‖·‖ is the norm induced from the inner product on this Hilbert space. Let {Z_i}_{i=1}^∞ denote a set of r.v.'s with associated basis {Φ̂_m(z)}_{m=1}^∞, where Φ̂_m is the (multivariate) polynomial of order m. We expand the random process X(ω) in terms of this basis and use the compact representation

(3.2)  X(ω) = ∑_{i=0}^∞ X_i Φ̂_i(z).

There is a one-to-one correspondence between Φ̂_i(z) and Φ_m(z_{i_1}, . . . , z_{i_m}) [41], where z is the vector (Z_1(ω), Z_2(ω), · · ·), and

(3.3)  X_i = (1/‖Φ̂_i‖²) ∫_Ω X(ω) Φ̂_i(z) dP(ω).

Since the polynomials form a complete basis in the Hilbert space determined by their support, a generalized result of the Cameron–Martin theorem shows that the above expansion converges to X(ω) in the L² sense. The compact form for this generalized Fourier expansion is commonly used in practice (for recent examples, see [66, 51, 10, 9]). Thus, we drop the hat notation and consider the basis to be properly reordered.

In practice, we truncate the order p and dimension n of the polynomial expansion, resulting in P + 1 = (n + p)!/(n! p!) total terms so that

(3.4)  X(ω) ≈ ∑_{i=0}^P X_i Φ_i(z).

Using a finite expansion introduces representation error.
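
To make the truncation concrete, the following Python sketch (not part of the original presentation) computes the number of terms P + 1 = (n + p)!/(n! p!) and approximates the coefficients (3.3) for a scalar function of a single uniform r.v. using Gauss–Legendre quadrature, i.e., the pseudospectral projection mentioned above as an alternative to the Galerkin approach. The response function g is a hypothetical stand-in.

    # Minimal 1-D sketch of the truncated expansion (3.4), assuming Z is uniform
    # on [-1, 1] with Legendre polynomials (rho(z) = 1/2). The coefficients (3.3)
    # are approximated by Gauss-Legendre quadrature (pseudospectral projection);
    # the function g is a hypothetical stand-in for a response depending on Z.
    import numpy as np
    from math import factorial
    from numpy.polynomial.legendre import Legendre, leggauss

    def num_pc_terms(n, p):
        """Number of terms P + 1 = (n + p)! / (n! p!) in a truncated expansion."""
        return factorial(n + p) // (factorial(n) * factorial(p))

    def legendre_pc_coefficients(g, p, n_quad=50):
        """Coefficients X_i, i = 0, ..., p, of g(Z) w.r.t. the Legendre basis."""
        z, w = leggauss(n_quad)          # Gauss-Legendre nodes and weights on [-1, 1]
        coeffs = np.zeros(p + 1)
        for i in range(p + 1):
            phi = Legendre.basis(i)(z)   # Legendre polynomial of degree i
            norm_sq = 1.0 / (2 * i + 1)  # int_{-1}^{1} P_i(z)^2 (1/2) dz
            coeffs[i] = 0.5 * np.sum(w * g(z) * phi) / norm_sq
        return coeffs

    if __name__ == "__main__":
        g = lambda z: np.exp(z)          # hypothetical response
        X = legendre_pc_coefficients(g, p=4)
        print("number of terms for n = 1, p = 4:", num_pc_terms(1, 4))
        z_test = np.linspace(-1.0, 1.0, 5)
        approx = sum(X[i] * Legendre.basis(i)(z_test) for i in range(len(X)))
        print("max pointwise representation error:", np.abs(approx - g(z_test)).max())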


3.2. Spectral Galerkin approximation and surrogate response surfaces. We briefly outline the spectral Galerkin method applied to the solution of stochastic differential equations as presented in [41, 66, 51, 61, 16, 48, 47, 40, 10]. This method uses a (truncated) polynomial expansion for an uncertain parameter modeled as an r.v. in a differential equation, expanding the solution in terms of the finite-dimensional polynomial basis and requiring the residual to be orthogonal to the span of this basis. The result is a set of deterministic equations describing the propagation of uncertainty from input parameters to the output solution. Nonintrusive methods exist (the pseudospectral projection method is one such example; see [9, 15] for some specific details), but for the sake of simplicity and focus, we limit the presentation to the spectral Galerkin method.

Consider the general stochastic differential equation

(3.5)  D(x, t, λ; u) = f(x, t; λ),

where u := u(x, t; λ) is the solution, f(x, t; λ) is the source term, and λ is the random parameter reflecting uncertainty in model and source parameters. The solution operator's dependency on λ implies that u is also uncertain and may be modeled as a random process for which we compute a spectral approximation with respect to the polynomial basis used to expand the parameter. First, suppose λ and Z are r.v.'s over the same probability space, and define the (truncated) polynomial expansion

(3.6)  λ = ∑_{i=0}^P λ_i Φ_i(z).

The spectral approximation of u is

(3.7)  u(x, t; λ) = ∑_{i=0}^P u_i(x, t) Φ_i(z).

Substitution of (3.7) and (3.6) into (3.5) and computing the Galerkin projection yields the following set of coupled, deterministic equations:

(3.8)  ⟨D(x, t, ∑_{i=0}^P λ_iΦ_i; ∑_{i=0}^P u_i(x, t)Φ_i), Φ_k⟩ = ⟨f(x, t; ∑_{i=0}^P λ_iΦ_i), Φ_k⟩,  k = 0, 1, . . . , P.

We wish to compute a small number of low-dimensional quantities of interest involving the solution u(x, t; λ). These quantities of interest are assumed to be linear functionals of the solution, e.g.,

(3.9)  q(λ) = ∫_0^T (ψ, u(x, t; λ))_{L²} dt

for some particular ψ in an appropriate space. Here, we write q(λ) to emphasize the implicit dependence of this response surface on the uncertain parameter. Below, we discuss a method for numerical computation of the spectral approximation to u(x, t; λ), i.e., a method for numerically solving (3.8). This numerically computed spectral approximation to u(x, t; λ) is substituted into (3.9) to define the surrogate response surface, denoted by q̂(λ).


4. Deterministic error estimation and an improved surrogate. We apply the general framework described above and extend recent work (see [10, 9]) to derive computable a posteriori estimates of the deterministic error in quantities of interest, i.e., εD as noted in (2.1). These estimates of the deterministic error are functions of the parameter λ and are used to define an improved surrogate.

4.1. A fully discrete model problem. We study as an example the fully discrete model problem originally presented in [10]. We make use of the following notation:

• Let Q = S × [0, T], where T > 0 and S ⊂ R^d is a (convex, polygonal) spatial domain with Lipschitz boundary ∂S and ∂Q = ∂S × (0, T).
• Let (·, ·)_{L²} denote the L² inner product on S, and recall the classical Sobolev space (see [1]), H^n(S) = {v ∈ L²(S) : ∂^k v ∈ L²(S) ∀ |k| ≤ n}.
• Let T_h be a quasi-uniform triangulation of S consisting of either simplices or quadrilateral (in 2-dimensional) elements, E, with maximal element diameter h.
• Let I_n = (t_{n−1}, t_n) and time steps k_n = t_n − t_{n−1} denote the discretization of [0, T] as 0 = t_0 < t_1 < · · · < t_N = T, and let Q_n = S × I_n denote a space-time slab.
• Let the space of piecewise linear continuous functions over the spatial domain be defined by V_h = {v ∈ C(S) ∩ H¹(S) : ∀ E ∈ T_h, v|_E ∈ P¹(E)} and the corresponding space of piecewise polynomials over each space-time slab Q_n by W_n^(q) = {w(x, t) : w(x, t) = ∑_{j=0}^q t^j v_j(x), v_j ∈ V_h, (x, t) ∈ Q_n}.
• Let W^(q) denote the space of functions on Q such that w|_{Q_n} ∈ W_n^(q) for 1 ≤ n ≤ N, and for w ∈ W^(q) let w_n^+(x) = lim_{t↓t_n} w(x, t), w_n^−(x) = lim_{t↑t_n} w(x, t), and [w]_n(x) = w_n^+(x) − w_n^−(x).

We consider the (nonlinear) stochastic diffusive transport equation

(4.1)  ∂u/∂t − ∇ · (A(x, t, λ)∇u) = g(x, t; u) + f(x, t), (x, t) ∈ Q,
       A∇u · n = 0 on ∂Q,
       u(x, 0) = 0, x ∈ S,

where A(x, t, λ) is a symmetric, uniformly bounded, and uniformly positive definite tensor; i.e., each eigenvalue, η_i, of A is real, and there exist positive constants η_min > 0 and η_max > 0 such that

η_min ≤ η_i(x, t, λ) ≤ η_max

for a.e. (x, t) ∈ Q and λ ∈ Λ. We assume f ∈ L²(Q) and g is sufficiently smooth such that for fixed λ there exists u ∈ L²([0, T]; H¹(S)) such that

(4.2)  ∫_0^T [(u̇, v)_{L²} + (A(x, t, λ)∇u, ∇v)_{L²}] dt − ∫_0^T (g(x, t; u), v)_{L²} dt = ∫_0^T (f(x, t), v)_{L²} dt

for all v ∈ L²([0, T]; H¹(S)) with v(x, T) = 0.


We assume λ is modeled as an r.v. for which we compute a polynomial spectral approximation with respect to r.v. Z, as shown in section 3.1. We then apply the stochastic Galerkin method to (4.1). Let {Φ_i(z)} be the orthogonal polynomial basis with respect to the density of Z. We numerically solve a variational form of (3.8) to approximate u = ∑_{k=0}^P u_k(x, t)Φ_k(z). Specifically, we compute U ∈ W^(q), where the approximation U is computed on each Q_n starting at n = 1. We set U_0 = U(x, 0) = 0 and compute U ∈ W_n^(q) successively for n = 1, 2, . . . , N satisfying

(4.3)  ∫_{I_n} (U̇_k, v)_{L²} dt
       + (1/‖Φ_k‖²) ∫_{I_n} (⟨A(x, t, ∑_{i=0}^P λ_iΦ_i(z)) ∑_{j=0}^P ∇U_jΦ_j(z), Φ_k⟩, ∇v)_{L²} dt
       − (1/‖Φ_k‖²) ∫_{I_n} (⟨g(x, t; ∑_{j=0}^P U_jΦ_j(z)), Φ_k(z)⟩, v)_{L²} dt + ([U_k]_{n−1}, v_{n−1}^+)_{L²}
       = ∫_{I_n} (f_k(x, t), v)_{L²} dt   ∀ v ∈ W_n^(q) and k = 0, 1, . . . , P,

where [·]_n represents the standard jump operator at time t_n. We remark that the choice of continuous or discontinuous polynomials in time along with an appropriate choice of quadrature scheme usually leads to one of the standard time integration rules, e.g., backwards Euler or Crank–Nicolson. For more details, see [26, 21, 22].

4.2. Computable estimate of εD(λ). Let U ∈ W_n^(q) denote the approximation to (4.1), and let e = u − U, where u solves (4.1). Following [26, 4], let ϑ denote the adjoint solution corresponding to u, and define the linear adjoint operator such that

∫_0^T (D(x, t, λ; u), ϑ) dt − ∫_0^T (D(x, t, λ; U), ϑ) dt = ∫_0^T (D̄(u,U) e, ϑ) dt = ∫_0^T (e, D̄(u,U)* ϑ) dt,

where D̄(u,U) satisfies

D̄(u,U) = ∫_0^1 ∂_u D(x, t, λ; su + (1 − s)U) ds.

The strong form of the adjoint to the nonlinear stochastic operator given by (4.1) may be written as

(4.4)  −∂ϑ/∂t − ∇ · (A^T(x, t, λ)∇ϑ) = ḡ(u,U) ϑ + ψ_1, x ∈ S, T > t ≥ 0,
       A^T∇ϑ · n = 0 on ∂Q,
       ϑ(x, T) = ψ_2, x ∈ S,


where ḡ(u,U) = ∫_0^1 ∂_u g(x, t; su + (1 − s)U) ds. The quantity of interest q(λ) defines the adjoint data; i.e., the choice of the quantity of interest determines ψ_1 and ψ_2.

In [10], we show that if the adjoint is solved using a polynomial spectral approximation and the representation error is small, then we can use a polynomial expansion of the a posteriori error estimate to accurately estimate the discretization error in the surrogate quantity of interest. This was verified by comparison to standard a posteriori error estimates where the forward and adjoint problems were solved at various fixed λ values. Motivated by the recent work in [9], we stop short of projecting the error estimate onto the polynomial basis and consider an estimate for the entire deterministic error; i.e., we estimate both the representation and the discretization error.

Solving the variational form of the adjoint (4.4) at a fixed value of λ and substituting this adjoint solution for the spectral approximation used in the derivation of the error estimate in [10] gives the following computation of the deterministic error at a fixed value of λ:

(4.5)  εD = (e(x, 0), ϑ(x, 0))_{L²}
       − ∑_{n=1}^N ∫_{I_n} (A(x, t; ∑_{k=0}^P λ_kΦ_k(z)) ∑_{i=0}^P ∇U_iΦ_i(z), ∇ϑ)_{L²} dt
       − ∑_{n=1}^N ∫_{I_n} (∑_{i=0}^P U̇_iΦ_i(z), ϑ)_{L²} dt + ∑_{n=2}^N (∑_{i=0}^P [U_i]_{n−1}Φ_i(z), ϑ(x, t_{n−1}))_{L²}
       + ∑_{n=1}^N ∫_{I_n} (g(x, t; ∑_{i=0}^P U_iΦ_i(z)), ϑ)_{L²} dt + ∑_{n=1}^N ∫_{I_n} (f, ϑ)_{L²} dt.

If continuous polynomials are used in time, then we use the same computable estimate for the error (4.5), but the jump terms [U_i]_{n−1} are all zero. In practice, we do not have access to ḡ(u,U) = ∫_0^1 ∂_u g(x, t; su + (1 − s)U) ds and must linearize around U, so even if the adjoint problem is solved exactly, (4.5) represents a computable approximation to the deterministic error. However, the effect of this linearization error on the estimate can be analyzed (see, e.g., [24]) and is generally not significant.

4.3. An improved surrogate quantity of interest. We use the computation of εD in (4.5) to improve the surrogate response surface q̂(λ). However, we now consider the computation for varying values of λ compared to (4.5), which assumes the value of λ to be fixed. This is an extension of our recent work improving linear functionals using error estimates for parameterized linear systems [9]. Using the more abstract operator notation we have that

εD(λ) = ∫_0^T (ψ, e(λ)) dt = ∫_0^T (D̄(u,U)* ϑ(λ), e(λ)) dt = ∫_0^T (ϑ(λ), D̄(u,U) e(λ)) dt.

We note that D̄(u,U) e(λ) is the residual, which implies that if we have access to ϑ(λ), then we can compute the deterministic error exactly.

Since we are interested in computing accurate probabilities of events, we are fundamentally interested in improving the pointwise error of q̂(λ). Specifically, we assume that for the numerically computed spectral approximation of u there exists an error bound in the space L^∞(Ω; L²([0, T]; H¹(S))) such that

(4.6)  ‖e(λ)‖_{L^∞(Ω;L²([0,T];H¹(S)))} ≤ ε_1(P, h, Δt).


Here, P indicates the order of the polynomial expansion and ε_1(P, h, Δt) ≥ 0, where h is the maximal spatial element diameter and Δt is the maximal time step.

Let Θ_M(λ) be any Mth-order polynomial spectral approximation of ϑ(λ) over Λ; then

(4.7)  εD(λ) = ∫_0^T (Θ_M(λ), D̄(u,U) e(λ)) dt + ∫_0^T (ϑ(λ) − Θ_M(λ), D̄(u,U) e(λ)) dt.

As shown in [9] for parameterized linear systems, the second term is higher order, and the first term can be added back into the computed surrogate quantity of interest to give an improved functional value. We define this as the improved surrogate quantity of interest and write q̂(λ) + εD(λ), where it is understood that we use the approximation to εD(λ) to define the improved surrogate. Using a Cauchy–Schwarz inequality proves the following theorem.

Theorem 4.1. If the pointwise errors for the forward and adjoint PC approximations satisfy

(4.8)  ‖e(λ)‖_{L^∞(Ω;L²([0,T];H¹(S)))} ≤ ε_1(P, h, Δt)

and

(4.9)  ‖ϑ(λ) − Θ_M(λ)‖_{L^∞(Ω;L²([0,T];H¹(S)))} ≤ ε_2(M),

where ε_1(P, h, Δt), ε_2(M) ≥ 0, then the pointwise error in the improved surrogate quantity of interest satisfies

(4.10)  ‖q(λ) − (q̂(λ) + εD(λ))‖_{L^∞(Ω;L²([0,T];H¹(S)))} ≤ C ε_1(P, h, Δt) ε_2(M),

where C > 0 depends only on D̄(u,U); i.e., C depends on linearization errors and is frequently O(1) if the forward numerical approximation is relatively close to the true forward solution.

Remark 4.1. For fixed P, ε_1(P, h, Δt) converges at the usual rates for h and Δt to the representation error of the exactly computed truncated spectral approximation. For (4.7), we are typically forced to use numerical approximations to the adjoint solution, so ε_2(M) also depends on h and Δt. However, we often use higher-order numerical methods for the solution of the adjoint to obtain reliable a posteriori estimates of spatial-temporal discretization errors. We therefore consider only the effect of discretization errors in the forward solution error bound.

In [9], several general cases for ε_1(P) (note: there was no dependence on h or Δt in these examples) and ε_2(M) are considered. We see in section 7 that solving the forward and adjoint problems with the same polynomial order, computing the estimate to εD(λ) as above, and using this estimate to compute the improved surrogate quantity of interest has a reduced deterministic error comparable to the case where the forward problem is solved with twice the polynomial order and the discretization error has been corrected. This is consistent with results shown in [9] and extends these results beyond parameterized linear systems.
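
To illustrate the construction in a setting that can be stated in a few lines, the following Python sketch mimics the parameterized linear-system case of [9]: the quantity of interest is q(λ) = ψᵀu(λ) with A(λ)u = b, a coarse surrogate is corrected by the adjoint-weighted residual, and the pointwise errors of the surrogate and of the improved surrogate are compared. The matrix A(λ), data b, functional ψ, and interpolation nodes are hypothetical stand-ins, not taken from the paper.

    # Synthetic sketch of an improved surrogate for a parameterized linear system:
    # q(lam) = psi^T u(lam) with A(lam) u = b. The surrogate interpolates the
    # forward and adjoint solutions at a few nodes; the adjoint-weighted residual
    # supplies the error estimate defining the improved surrogate. All problem
    # data are hypothetical illustrations.
    import numpy as np

    def A(lam):                          # parameterized system matrix (SPD here)
        return np.array([[2.0 + lam, -1.0], [-1.0, 2.0 + lam**2]])

    b = np.array([1.0, 0.5])
    psi = np.array([1.0, 1.0])           # functional defining the quantity of interest

    def q_exact(lam):
        return psi @ np.linalg.solve(A(lam), b)

    nodes = np.linspace(0.0, 1.0, 3)     # coarse parameter grid for the surrogate
    u_nodes = np.array([np.linalg.solve(A(l), b) for l in nodes])        # forward solves
    phi_nodes = np.array([np.linalg.solve(A(l).T, psi) for l in nodes])  # adjoint solves

    def interp(table, lam):
        return np.array([np.interp(lam, nodes, table[:, j]) for j in range(table.shape[1])])

    def q_surrogate(lam):
        return psi @ interp(u_nodes, lam)

    def q_improved(lam):
        U = interp(u_nodes, lam)           # surrogate forward solution
        Theta = interp(phi_nodes, lam)     # surrogate adjoint solution
        residual = b - A(lam) @ U          # computable residual
        return psi @ U + Theta @ residual  # surrogate value plus estimated error

    if __name__ == "__main__":
        lams = np.linspace(0.0, 1.0, 11)
        print("max error, surrogate :", max(abs(q_exact(l) - q_surrogate(l)) for l in lams))
        print("max error, improved  :", max(abs(q_exact(l) - q_improved(l)) for l in lams))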

5. Forward propagation of uncertainty and errors. We follow [60, 7] to formulate the forward sensitivity analysis problem for deterministic models with uncertain input parameters using the law of total probability. Let σ_Λ(λ) denote a density on Λ and ρ_D(q(λ)) = ρ_D(q(u(λ))) denote a density on D. Both σ_Λ(λ) and ρ_D(q(λ)) define probability measures P_Λ(λ) and P_D(q), respectively.


The deterministic model can be expressed in terms of a likelihood function L(q | λ) of the output q values given the input parameter values λ, where L(q | λ) = δ(q − q(λ)) is the Dirac distribution at q = q(λ). The law of total probability implies

(5.1)  ρ_D(q) = ∫_Λ L(q | λ) σ_Λ(λ) dλ = ∫_Λ δ(q − q(λ)) σ_Λ(λ) dλ.

The goal of computing ρ_D(q), from our point of view, is to compute certain statistics on output events. Specifically, we are interested in computing accurately the cumulative distribution function (CDF) F_D(s) defined by the density.

5.1. Forward problem computational details. The forward sensitivity problem is to determine ρ_D(q) given σ_Λ(λ) and the likelihood L(q | λ) = δ(q − q(λ)) satisfying (5.1). We note that while probability densities describe r.v.'s, the densities themselves are not random; i.e., ρ_D(q) has a well-defined value for each q ∈ D. Common approaches to approximating probability densities involve, but do not require, Monte Carlo random sampling [34, 43, 44, 57]. The forward problem has two clearly defined stages. First, we produce samples of r.v. λ. Second, we propagate through the response surface to obtain samples of the output r.v. q. This problem is straightforward in that each λ produces exactly one output datum q(λ). We use a surrogate response surface q̂(λ) that is computationally cheap to evaluate for a large number of samples of parameter λ. While we focus on the spectral Galerkin method in this paper, the general error analysis below can easily be extended to other types of surrogates where accurate a posteriori error estimates can be computed (see, for example, [11], where a different, but complete, error analysis is developed for an inverse formulation using a different type of surrogate model).

Since q̂(λ) is cheap to evaluate for any given λ, Monte Carlo simulation [57, 43] is an efficient means of generating random output samples. We generate a set of independent and identically distributed (i.i.d.) random samples {λ^(n)}_{n=1}^N, where λ^(n) ∼ σ_Λ(λ), and propagate through q̂(λ) to obtain (approximate) output samples {q̂^(n)}_{n=1}^N, where q̂^(n) = q̂(λ^(n)). The output CDF F_D(s) is approximated by the sample CDF F̂_{D,N}(s) defined as

(5.2)  F̂_{D,N}(s) := (1/N) ∑_{n=1}^N 1(q̂^(n) ≤ s).

In section 5.2, we consider the effect of deterministic error in q̂(λ) on the accuracy of (5.2). For completeness, we also present computable error bounds to account for the effect of finite sampling on the distribution.
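
As a concrete illustration of this two-stage sampling, the following Python sketch draws i.i.d. parameter samples, propagates them through a cheap surrogate, and assembles the sample CDF (5.2). The density σ_Λ and the surrogate q_hat are hypothetical stand-ins for the objects defined above.

    # Minimal sketch of forward propagation through a surrogate: draw i.i.d.
    # parameter samples, evaluate the (cheap) surrogate, and form the sample
    # CDF (5.2). The prior density and surrogate are hypothetical.
    import numpy as np

    rng = np.random.default_rng(0)

    def q_hat(lam):
        """Hypothetical surrogate response surface (e.g., a truncated PC expansion)."""
        return 1.0 + 0.5 * lam + 0.1 * lam**2

    def sample_cdf(samples):
        """Return a function s -> F_N(s) built from output samples, cf. (5.2)."""
        sorted_q = np.sort(samples)
        return lambda s: np.searchsorted(sorted_q, s, side="right") / len(sorted_q)

    # Stage 1: sample lambda ~ sigma_Lambda (uniform on [-1, 1] for illustration)
    lam_samples = rng.uniform(-1.0, 1.0, size=100_000)
    # Stage 2: propagate through the surrogate
    q_samples = q_hat(lam_samples)

    F_hat = sample_cdf(q_samples)
    print("estimated P(q <= 1.2):", F_hat(1.2))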

5.2. Forward problem error analysis. We describe how to separate the statistical and deterministic sources of error in the computed CDF. First, we define two approximate output CDFs to F_D(s). The sample output CDF F_{D,N}(s) is defined as in (5.2) but using error-free random samples {q^(n)}_{n=1}^N. Second, the approximate sample output CDF F̂_{D,N}(s) is defined by (5.2) with the approximate random samples {q̂^(n)}_{n=1}^N. The random samples {q̂^(n)}_{n=1}^N are approximate in the sense that they are generated by propagation of random samples {λ^(n)}_{n=1}^N through the surrogate q̂(λ). Since the goal is to compute probabilities of events on observable output data, i.e., the quantities of interest, we are particularly interested in the global approximation of F_D(s) by F̂_{D,N}(s).


The Kolmogorov–Smirnov distance [59, 27] given by

(5.3)  sup_{s∈D} |F̂_{D,N}(s) − F_D(s)|

is a useful measure of the quality of a sample CDF F̂_{D,N}(s). We decompose the error in the CDF as

(5.4)  |F̂_{D,N}(s) − F_D(s)| ≤ |F̂_{D,N}(s) − F_{D,N}(s)| + |F_{D,N}(s) − F_D(s)|,

where the first term on the right is denoted I and the second II. Term I represents the effect of the deterministic (i.e., discretization and representation) error in computing the surrogate response surface q̂(λ) on the computed CDF. Term II denotes the statistical error due to finite sampling, and extensive literature exists for this term; e.g., see [59]. We summarize some typical bounds for term II in section 5.2.2. The bounds for this term converge monotonically to zero as the number of samples increases. In this paper, the cost of generating a large number of samples is greatly reduced by the use of a surrogate q̂(λ). Thus, in the numerical results we focus on the effect of deterministic error on the CDF and verify the bound we derive below for term I.

5.2.1. Analysis of deterministic error. Let e_n denote the deterministic error in q̂^(n), i.e., e_n = εD(λ^(n)), so that q̂(λ^(n)) + e_n = q̂^(n) + e_n = q^(n). In section 4, we extended the results of recent work [10, 9] to derive a computable a posteriori estimate to the error εD(λ) for any λ ∈ Λ when q̂(λ) is computed from a spectral approximation. We use the error estimates to compute an improved surrogate quantity of interest. Depending on the error estimate used (i.e., whether or not a spectral approximation is used in computing the adjoint solution), the improved surrogate may still contain a representation error, but this error will be smaller in magnitude than the representation error of the original surrogate. We assume there exists a computable bound on the remaining representation error. With no loss of generality, let E_n ≥ 0 denote a bound on e_n so that |e_n| ≤ E_n, and let I_{q^(n)} := [q̂^(n) + e_n − E_n, q̂^(n) + e_n + E_n]; i.e., I_{q^(n)} is the interval defined by the improved surrogate datum and its associated error bound.

We use the analysis presented in [27] to arrive at the computable bound

(5.5)  I ≤ |(1/N) ∑_{n=1}^N 1_{I_{q^(n)}}(s)|.

Observe that (5.5) is an expected value on the random samples.

Estimating and correcting for the deterministic error implies E_n is a local bound on the representation error. In the numerical results, we make use of a global bound on the representation error; i.e., we set E_n = E for all n for some fixed E > 0. Then, for any fixed value of s, the above bound can be rewritten as

(5.6)  I ≤ |(1/N) ∑_{n=1}^N 1_{[s−E, s+E]}(q̂^(n) + e_n)|.


Observe that we can compute this expectation at any value of s ∈ D. Thus, the effect of the representation error on the accuracy of the numerically computed sample CDF at a fixed s ∈ D is bounded by the probability that the numerically computed output samples are within the interval [s − E, s + E].
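
A minimal sketch of evaluating the bound (5.6) follows; the improved-surrogate output samples and the global bound E are hypothetical inputs.

    # Computable bound (5.6) on term I: at a fixed s, the fraction of
    # improved-surrogate output samples that land in [s - E, s + E].
    import numpy as np

    def term_I_bound(q_improved, s, E):
        """Bound (5.6): sample probability of the interval [s - E, s + E]."""
        q_improved = np.asarray(q_improved)
        return np.mean((q_improved >= s - E) & (q_improved <= s + E))

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        q_improved = rng.normal(1.0, 0.3, size=50_000)  # stand-in improved samples
        print("bound at s = 1.0:", term_I_bound(q_improved, s=1.0, E=0.01))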

5.2.2. Analysis of statistical error. In [19, 52], it was shown that there is a constant 0 < C ≤ 2 (not depending on F_D) such that for any η > 0,

(5.7)  P( sup_{s∈D} |F_D(s) − F_{D,N}(s)| > η ) ≤ C exp(−2η²N).

Using properties of the probability measure, we have that for any η > 0,

(5.8)  P( sup_{s∈D} |F_D(s) − F_{D,N}(s)| ≤ η ) ≥ 1 − 2 exp(−2η²N).

This is an a priori bound. It is possible to prove other forms of this bound; e.g., in [27], defining η in terms of the number of samples as η = (log(2/ε)/(2N))^{1/2} gives

(5.9)  sup_{s∈D} |F_D(s) − F_{D,N}(s)| ≤ (log(2/ε)/(2N))^{1/2}

with probability greater than 1 − ε for ε ∈ (0, 1). Observe that (5.9) is a computable a posteriori error bound for (5.4), and for fixed ε > 0, the error bound in (5.9) goes to zero at a rate of O(N^{−1/2}).
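
For instance, a short helper (a sketch, not from the paper) evaluates the right-hand side of (5.9) or inverts it for the number of samples needed to reach a tolerance η with confidence 1 − ε.

    # Evaluate the a posteriori bound (5.9) and invert it for the sample size N.
    import math

    def ks_bound(N, eps):
        """Right-hand side of (5.9); holds with probability at least 1 - eps."""
        return math.sqrt(math.log(2.0 / eps) / (2.0 * N))

    def samples_needed(eta, eps):
        """Smallest N with (log(2/eps) / (2N))^(1/2) <= eta."""
        return math.ceil(math.log(2.0 / eps) / (2.0 * eta**2))

    print(ks_bound(N=10_000, eps=0.05))        # approximately 0.0136
    print(samples_needed(eta=0.01, eps=0.05))  # 18445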

If we consider multiple quantities of interest, i.e., D ⊂ R^d, then a variant of (5.7) is available [45]. In this case, for any 0 < a < 2, there exists a constant C > 0 depending only on d and a > 0 such that for any η > 0,

(5.10)  P( sup_{s∈D} |F_D(s) − F_{D,N}(s)| > η ) ≤ C exp(−aη²N).

In [17], it was shown that for any η > 0, if N is sufficiently large to satisfy √N η ≥ d, then

(5.11)  P( sup_{s∈D} |F_D(s) − F_{D,N}(s)| > η ) ≤ 2^{d+1} e² N^d exp(−2η²N).

We note that (5.11) goes to zero faster than (5.10) for fixed η > 0 as N → ∞. Again, this is an a priori error bound, but it is possible to manipulate this error bound to achieve an a posteriori bound similar to (5.9).

6. Inverse propagation of uncertainty and errors. We use a common formulation for the inverse sensitivity problem with either an imposed or observed output density ρ_D(q) [60, 51, 5, 44]. The goal is to determine what is commonly referred to as a posterior density σ_Λ(λ | q) on Λ given a likelihood function and a prior density on Λ. In this formulation, the likelihood of λ given q is defined as

(6.1)  L(λ | q) = ∫_D ρ_D(q) θ(q | λ) dq,


where θ(q | λ) defines the theoretic probability density of output value q given λ (for a thorough pedagogical development, we refer the interested reader to [60]). Given our assumptions of a deterministic model, we have that θ(q | λ) = δ(q − q(λ)). We observe that θ plays a role analogous to that of L(q | λ) in the forward sensitivity problem. Thus, the likelihood function for a deterministic map is given by L(λ | q) = ρ_D(q(λ)). We let ρ_Λ(λ) denote the prior density on Λ. The posterior is defined as

(6.2)  σ_Λ(λ | q) = ν ρ_Λ(λ) L(λ | q),

where ν is a normalizing constant.

As before, we are interested primarily in computing statistics on events in Λ. Thus, the object of interest is the CDF F_Λ(s) defined by the posterior density. Since the goal is to compute accurate probabilities, we consider the pointwise accuracy of the CDF computed using a surrogate response surface.

6.1. Inverse problem computational details. Given output probability density ρ_D(q), prior parameter density ρ_Λ(λ), and likelihood L(λ | q), we seek to compute the posterior parameter density σ_Λ(λ | q) satisfying (6.2). Direct implementation of Monte Carlo simulation as in the forward problem is infeasible since there is no clear way to invert a random sample of outputs {q^(n)}_{n=1}^N to a finite set of inputs {λ^(n)}_{n=1}^N; i.e., the inverse map is not well defined. However, there are indirect ways of generating samples {λ^(n)}_{n=1}^N of the density σ_Λ(λ | q). A convenient approach often employed to interrogate the posterior indirectly is the use of Markov chain Monte Carlo (MCMC) sampling [57, 43, 60].

A popular method for generating a Markov chain of samples is application of the famous Metropolis–Hastings algorithm [57, 60]. In this algorithm, the prior density is used to generate proposed samples that are accepted or rejected using the likelihood. In all MCMC methods it is important to note that even though proposed samples from the prior density may be generated independently, the resulting samples of the posterior are serially correlated. A drawback of the standard Metropolis–Hastings algorithm is that the acceptance ratio can be small if the prior differs too much from the posterior. If this is the case, then the sample distribution either converges slowly or may not converge at all to the posterior; see [57] for more details. We may increase the acceptance rate by using the so-called random-walk Metropolis algorithm [57, 38, 39, 43], where a symmetric transition kernel k(· | ·) is centered on the current position of the chain.
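
The following Python sketch implements a random-walk Metropolis sampler in which the likelihood is evaluated through a cheap surrogate, i.e., L̂(λ | q) = ρ_D(q̂(λ)); the surrogate, the prior, and the observed density are hypothetical stand-ins.

    # Random-walk Metropolis sketch with a surrogate-based likelihood
    # L_hat(lam | q) = rho_D(q_hat(lam)). The surrogate, prior, and observed
    # density are hypothetical illustrations.
    import numpy as np

    rng = np.random.default_rng(2)

    def q_hat(lam):                  # hypothetical surrogate response surface
        return 1.0 + 0.5 * lam + 0.1 * lam**2

    def rho_D(q):                    # hypothetical observed output density (Gaussian)
        return np.exp(-0.5 * ((q - 1.2) / 0.05) ** 2)

    def prior(lam):                  # uniform prior on [-1, 1]
        return 1.0 if -1.0 <= lam <= 1.0 else 0.0

    def likelihood(lam):
        return rho_D(q_hat(lam))

    def random_walk_metropolis(n_steps, step=0.1, lam0=0.0):
        chain = np.empty(n_steps)
        lam, post = lam0, prior(lam0) * likelihood(lam0)
        for i in range(n_steps):
            prop = lam + step * rng.standard_normal()   # symmetric proposal kernel
            post_prop = prior(prop) * likelihood(prop)
            if post > 0 and rng.uniform() < min(1.0, post_prop / post):
                lam, post = prop, post_prop             # accept
            chain[i] = lam
        return chain

    if __name__ == "__main__":
        chain = random_walk_metropolis(50_000)
        print("posterior mean estimate:", chain[5_000:].mean())  # discard burn-in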

As before, q̂(λ) denotes the surrogate approximation to q(λ) obtained, for example, by a spectral method. This surrogate is used in computing the likelihood, which we denote by L̂(λ | q). Specifically, the surrogate response surface is evaluated by the theoretic probability density θ(q | λ), which uses the model to determine the probability of an output value q given an input value λ. We let F_Λ(s) denote the CDF defined by the posterior density and F̂_{Λ,N}(s) denote the sample surrogate CDF defined by

(6.3)  F̂_{Λ,N}(s) := (1/N) ∑_{n=1}^N 1(λ^(n) ≤ s),

where {λ^(n)}_{n=1}^N indicates a set of samples computed from an accept/reject algorithm with L̂(λ | q).

where {λ(n)}Nn=1 indicates a set of samples computed from an accept/reject algorithm withL(λ | q).D

ownl

oade

d 11

/18/

14 to

128

.235

.8.1

70. R

edis

trib

utio

n su

bjec

t to

SIA

M li

cens

e or

cop

yrig

ht; s

ee h

ttp://

ww

w.s

iam

.org

/jour

nals

/ojs

a.ph

p

Page 15: Propagation of Uncertainties Using Improved Surrogate Models

Copyright © by SIAM and ASA. Unauthorized reproduction of this article is prohibited.

178 T. BUTLER, C. DAWSON, AND T. WILDEY

The CDF F_Λ(s) can be written as

(6.4)  F_Λ(s) = ν ∫_{λ≤s} L(λ | q) ρ_Λ(λ) dλ.

We can rewrite (6.4) for F̂_Λ(s) by replacing L(λ | q) with L̂(λ | q) and ν with ν̂. The normalizing constant ν needed for evaluation of (6.4) is given by

(6.5)  ν^{−1} = ∫_Λ L(λ | q) ρ_Λ(λ) dλ.

Evaluation of either (6.5) or (6.4) is often analytically impossible because of the dependence of the integrals on the likelihood function [60]. Numerical estimates of these integrals can be computed using a straightforward Monte Carlo integration approach where independent samples are generated from the prior, the likelihood is evaluated at each of these samples, and the average of the likelihood values is taken as the numerical estimate of these integrals. If a sufficient number of samples is used, then the computational error in evaluating these integrals can be made negligible. This approach is oftentimes disregarded because of several shortcomings [51, 60], including, but not limited to, the following: (1) the number of samples required for accurate numerical integration is often large, so the cost of evaluating the likelihood at this large number of points is computationally expensive; and (2) the samples used to compute these integrals are fundamentally not from the same distribution as the posterior.

Since we use a surrogate response surface, the computational expense of evaluating the likelihood at a large number of samples is less burdensome. The integrals defined by (6.4) and (6.5) are useful in determining the effect of the deterministic error on the CDF computed using the surrogate response surface, as shown in section 6.2.
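
A sketch of the straightforward Monte Carlo estimate of (6.5) is given below, using independent prior samples and a surrogate-based likelihood; the prior and likelihood are the same hypothetical stand-ins used above.

    # Monte Carlo estimate of the normalizing constant (6.5):
    # nu^{-1} is approximated by the average likelihood over i.i.d. prior samples.
    # The prior and the surrogate-based likelihood are hypothetical stand-ins.
    import numpy as np

    rng = np.random.default_rng(3)

    def likelihood(lam):                               # L_hat(lam | q) = rho_D(q_hat(lam))
        q = 1.0 + 0.5 * lam + 0.1 * lam**2             # hypothetical surrogate
        return np.exp(-0.5 * ((q - 1.2) / 0.05) ** 2)  # hypothetical observed density

    J = 200_000
    lam_prior = rng.uniform(-1.0, 1.0, size=J)         # i.i.d. samples from the prior
    nu_inv = np.mean(likelihood(lam_prior))            # estimate of nu^{-1}
    print("estimated normalizing constant nu:", 1.0 / nu_inv)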

6.2. Inverse problem error analysis. As with the forward problem error analysis, we describe how to separate the statistical and deterministic sources of error in the computed CDF. We define two approximate posterior CDFs. However, we cannot directly compare two different sample CDFs, as was done in the forward problem, if MCMC sampling is used. To illustrate why, we define the sample CDFs of the posteriors defined by the exact computational model and the surrogate model as F_{Λ,N}(s) and F̂_{Λ,N}(s), respectively. Assume random samples for each sample CDF are computed from the same MCMC algorithm. Even if the likelihood functions were identical (they are not, as discussed in section 6.2.1), the sets of samples defined by the two Markov chains would not be guaranteed to have any samples in common, nor would there exist any functional relationship between the two sets of samples. It is therefore impractical to investigate the error between these two sample CDFs using a methodology similar to the forward problem error analysis.

We circumvent this problem of comparing two sample CDFs of the posterior by decomposing the error in the CDF as

(6.6)  |F_Λ(s) − F̂_{Λ,N}(s)| ≤ |F_Λ(s) − F̂_Λ(s)| + |F̂_Λ(s) − F̂_{Λ,N}(s)|,

where, as before, the first term on the right is denoted I and the second II. Terms I and II play roles similar to the terms denoted in (5.4) in terms of accounting for the effect of deterministic and statistical errors, respectively.

Terms I and II play roles similar to the terms denoted in (5.4) in terms of accounting for theeffect of deterministic and statistical errors, respectively. We discuss in section 6.2.1 the effectD

ownl

oade

d 11

/18/

14 to

128

.235

.8.1

70. R

edis

trib

utio

n su

bjec

t to

SIA

M li

cens

e or

cop

yrig

ht; s

ee h

ttp://

ww

w.s

iam

.org

/jour

nals

/ojs

a.ph

p

Page 16: Propagation of Uncertainties Using Improved Surrogate Models

Copyright © by SIAM and ASA. Unauthorized reproduction of this article is prohibited.

PROPAGATION OF UNCERTAINTIES USING IMPROVED SURROGATE MODELS 179

We discuss in section 6.2.1 the effect of deterministic error in the map q̂(λ) on the likelihood function and how we may compute estimates for the effect of this error on the CDF. We can bound the statistical error in a way analogous to the forward problem analysis if we assume a lagged chain is used, as discussed in section 6.2.2. For more about convergence of sample distributions computed using an MCMC method, see, e.g., [57, 43, 39]. As with the forward problem analysis, the bounds for this term converge monotonically to zero as the number of samples increases, and we focus on verifying the bound we derive for term I in the numerical results.

6.2.1. Analysis of deterministic error. As in the forward problem, we may estimate and correct for the deterministic error εD(λ) to obtain an improved surrogate, as shown in section 4. We assume that in the worst case we can bound the representation error of the improved surrogate only globally. As before, we let E ≥ 0 denote a bound for this error over the set Λ. We can account for this error in the formulation of the inverse problem in several ways. We present two formulations for which we provide numerical results in section 7. The first formulation we consider takes the likelihood as given by the deterministic map defined by the improved surrogate. The second formulation we consider alters the likelihood to account for the error in the improved surrogate. The error analysis for either formulation is similar, with only the likelihood function L̂(λ | q) changing below. Thus, we present the error analysis for the second formulation below and follow with the result for the first formulation.

Let I_{q̂(λ^(p))} := (q̂(λ^(p)) + εD(λ^(p)) − E, q̂(λ^(p)) + εD(λ^(p)) + E). The implication is that the exact value q(λ^(p)) ∈ I_{q̂(λ^(p))}. Given such an error bound, we may consider the theoretic probability density of output value q, given λ, to be uniform on I_{q̂(λ^(p))} to account for the uncertainty in the surrogate map, i.e., θ̂(q | λ^(p)) = (1/(2E)) 1_{I_{q̂(λ^(p))}}(q), so that

(6.7)  L̂(λ^(p) | q) = (1/(2E)) ∫_{I_{q̂(λ^(p))}} ρ_D(q) dq.

Remark 6.1. There exist other methods for accounting for uncertainty in the evaluation of a response surface in the presence of error. For example, we may consider the error function itself as a statistical process. This suggests the use of a hierarchical Bayesian model where q̂(λ) is taken as the mean value of a random process (often chosen to be Gaussian) with variance taken to be a hyperparameter representing the error in the map evaluation. We do not intend to displace these methods with the approach suggested here, and in fact, they suggest that the model and its error are both statistical instead of deterministic. The approach suggested here uses only (1) the knowledge of a well-defined estimable bound of the deterministic error, and (2) the modeling assumption that the exact response surface is within this error bound. If more information is known about the structure of the deterministic error, then it is straightforward to modify (6.7) to reflect the corresponding change in θ̂(q | λ^(p)).
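
For a concrete evaluation of (6.7), the sketch below integrates a Gaussian observed density over the interval I_{q̂(λ)} via its CDF; the surrogate, the error estimate, and the bound E are hypothetical stand-ins.

    # Error-aware likelihood (6.7): integrate the observed density rho_D over the
    # interval centered at the improved surrogate value. rho_D is assumed Gaussian
    # here so the integral is a difference of CDFs; q_hat, eps_D, and E are
    # hypothetical stand-ins.
    from scipy.stats import norm

    def q_hat(lam):                  # hypothetical surrogate
        return 1.0 + 0.5 * lam + 0.1 * lam**2

    def eps_D(lam):                  # hypothetical adjoint-based error estimate
        return 0.01 * lam

    def likelihood_with_error(lam, E=0.02, q_mean=1.2, q_std=0.05):
        center = q_hat(lam) + eps_D(lam)        # improved surrogate value
        lower, upper = center - E, center + E   # interval I_{q_hat(lam)}
        mass = norm.cdf(upper, q_mean, q_std) - norm.cdf(lower, q_mean, q_std)
        return mass / (2.0 * E)

    print(likelihood_with_error(0.4))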

We rewrite term I in (6.6) by

(6.8)  |F_Λ(s) − F̂_Λ(s)| = | ∫_{Λ∩{λ≤s}} ( ν ρ_D(q(λ)) − ν̂ (1/(2E_λ)) ∫_{I_{q̂(λ)}} ρ_D(q) dq ) ρ_Λ(λ) dλ |.

Here, ν and ν̂ are the normalizing constants of the exact and approximate posterior densities. Since the inverses of these normalizing constants are integrals of L(λ | q)ρ_Λ(λ) and L̂(λ | q)ρ_Λ(λ) over Λ, we can use the same set of independent samples drawn from the prior ρ_Λ(λ) to numerically compute ν̂, ν_min, and ν_max such that ν ∈ [ν_min, ν_max].


Since the inverses of these normalizing constants are integrals of the exact and approximate likelihoods, L(λ | q)ρΛ(λ) and L̂(λ | q)ρΛ(λ), over Λ, we can use the same set of independent samples drawn from the prior ρΛ(λ) to numerically compute ν̂, νmin, and νmax such that ν ∈ [νmin, νmax]. There is no accept/reject scheme applied in this straightforward Monte Carlo approach; see, e.g., [51]. A (possibly different) set of samples drawn independently from the prior may then be used to interrogate (6.8) in much the same way as obtaining νmin and νmax. Specifically, let {λ^(j)}_{j=1}^J be a set of independently drawn samples from the prior density (restricted to the set Λ ∩ {λ ≤ s} for simplicity); then an estimate of (6.8) is given by

\[
(6.9) \qquad \left| F_\Lambda(s) - \widehat{F}_\Lambda(s) \right|
\approx \left| \frac{1}{J} \sum_{j=1}^{J} \left[ \nu \, \rho_D(q(\lambda^{(j)})) - \widehat{\nu} \, \frac{1}{2E_{\lambda^{(j)}}} \int_{I_q(\lambda^{(j)})} \rho_D(q) \, dq \right] \right| ,
\]

and we may use the interval [νmin, νmax] containing ν and the interval Iq(λ^(j)) containing q(λ^(j)), determined by the deterministic error bound E at each λ, to maximize this sum. Note that by maximizing this sum we are not attempting to compute a "tight" error bound. Rather, in the numerical results below, we demonstrate that the error bound improves (i.e., decreases) with the use of the improved surrogate and provides reasonable bounds of the error in the computed distribution. The error in the computed distribution from the improved surrogate is often at least an order of magnitude smaller than the bound. Moreover, this error bound for the improved surrogate is also generally smaller than the actual error incurred by using the original surrogate, despite using a constant bound for the representation error in these computations.
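A possible Monte Carlo realization of (6.9), maximizing over the admissible interval [νmin, νmax] for ν and over Iq(λ^(j)) for the exact likelihood value, is sketched below. All ingredients (surrogate, error estimate, data density, and the precomputed ν̂, νmin, νmax) are hypothetical placeholders; the endpoint evaluation of ρD is sufficient here only because ρD is uniform and the interval is short.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical ingredients (placeholders, not the paper's actual surrogate).
q_surrogate = lambda lam: 2.0 + np.cos(np.pi * lam)
eps_D       = lambda lam: 0.05 * np.sin(np.pi * lam)
E           = 0.01
rho_D       = stats.uniform(loc=1.5, scale=0.5)
nu_hat, nu_min, nu_max = 1.45, 1.40, 1.50   # assumed precomputed by Monte Carlo

def term_I_bound(s, J=20_000):
    """Bound on |F_Lambda(s) - F_Lambda_hat(s)| by maximizing the sum in (6.9)
    over the admissible intervals for nu and for the exact likelihood value."""
    lam = rng.uniform(-1.0, 1.0, size=J)       # samples from the prior on [-1, 1]
    lam = lam[lam <= s]                        # keep lambda <= s; dividing by J below
                                               # estimates the restricted integral
    center = q_surrogate(lam) + eps_D(lam)     # improved surrogate values
    # Modified-likelihood term, known exactly from the surrogate:
    approx = nu_hat * (rho_D.cdf(center + E) - rho_D.cdf(center - E)) / (2.0 * E)
    # Exact-likelihood term only known to lie in an interval: rho_D somewhere in
    # I_q(lambda), scaled by some nu in [nu_min, nu_max]. Endpoint values bound
    # rho_D over the short interval for this uniform density.
    rho_lo = np.minimum(rho_D.pdf(center - E), rho_D.pdf(center + E))
    rho_hi = np.maximum(rho_D.pdf(center - E), rho_D.pdf(center + E))
    diff_hi = np.sum(nu_max * rho_hi - approx) / J
    diff_lo = np.sum(nu_min * rho_lo - approx) / J
    return max(abs(diff_hi), abs(diff_lo))

print(term_I_bound(0.5))
```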

If we consider the original formulation of the inverse problem with the deterministic likelihood L(λ | q) = ρD(q(λ) + εD(λ)), then the computation of (6.9) is similar. The only change is to the rightmost term. Computing νmin and νmax remains unchanged since these require only maximizing or minimizing (respectively) ρD(q) for q ∈ Iq(λ^(n)) for each independent sample of the prior λ^(n). In section 7, we use the statistical formulation of the likelihood for a 1-dimensional inverse problem and use the original formulation of the deterministic likelihood for a 2-dimensional inverse problem.

6.2.2. Analysis of statistical error. As in section 5.2.2, we seek uniform bounds on the Kolmogorov–Smirnov distance. An implicit assumption of both the a priori bounds, e.g., (5.11), and the computable a posteriori bounds, e.g., (5.9), is that the samples used to compute the sample distributions are i.i.d. While this is relatively straightforward to satisfy in the forward problem using traditional Monte Carlo sampling (see section 5.1), this condition is clearly not satisfied for samples of a posterior distribution generated according to an accept/reject scheme; i.e., MCMC sampling produces serially correlated samples. There are methods to reduce the serial correlation so that the samples from a Markov chain can be treated as if they are independent. Ergodicity of the chain implies that if we wait a sufficient time between samples, then we may consider the samples to be independently generated [57, 39, 43]. A common approach is to introduce a lag in the MCMC algorithm so that the autocorrelation is reduced; see, e.g., [43] for implementation details and [51] for a practical example.

Remark 6.2. Introducing a lag provides other benefits, such as improved mixing of the chain [43], but at a cost of generating more samples (typically one or two orders of magnitude more) than used in a nonlagged chain.


Even when using a surrogate to generate samples cheaply, the costs of generating many more samples should not be ignored. However, we may mitigate this cost by running several chains in parallel and combining the results of each of these chains [37, 39]. Running several long chains and mixing the results appropriately is one such method of obtaining independent samples [37].
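A lagged (thinned) random-walk Metropolis sampler driven by a cheap surrogate likelihood might look as follows. The surrogate-based log-likelihood, the proposal width, and the lag are illustrative choices, not the settings used in the numerical results.

```python
import numpy as np

rng = np.random.default_rng(1)

# Placeholder ingredients: a uniform prior on [-1, 1] and a cheap surrogate-based
# likelihood (any inexpensive function of lambda would do for this sketch).
def log_likelihood(lam):
    q = 2.0 + np.cos(np.pi * lam)           # hypothetical surrogate value
    return -0.5 * ((q - 1.75) / 0.1) ** 2   # hypothetical Gaussian data model

def in_prior_support(lam):
    return -1.0 <= lam <= 1.0

def lagged_metropolis(n_keep, lag=50, step=0.2):
    """Random-walk Metropolis with thinning: only every `lag`-th state is kept,
    so the retained samples are approximately independent."""
    kept, lam, ll = [], 0.0, log_likelihood(0.0)
    for i in range(n_keep * lag):
        prop = lam + step * rng.standard_normal()
        if in_prior_support(prop):
            ll_prop = log_likelihood(prop)
            if np.log(rng.uniform()) < ll_prop - ll:   # uniform prior cancels
                lam, ll = prop, ll_prop
        if (i + 1) % lag == 0:
            kept.append(lam)
    return np.array(kept)

samples = lagged_metropolis(2000)
print(samples.mean(), samples.std())
```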

We assume a lagged MCMC is employed and treat the samples independently. Following steps similar to those in section 5.2.2, we have that for any η > 0, if N is sufficiently large to satisfy √N η ≥ k, then

\[
(6.10) \qquad P\left( \sup_{s \in \Lambda} \left| \widehat{F}_\Lambda(s) - \widehat{F}_{\Lambda,N}(s) \right| \le \eta \right) \ge 1 - 2^{k+1} e^{2} N^{k} \exp\left( -2\eta^{2} N \right).
\]

As in section 5.2.2, we may use this a priori bound to derive a computable a posteriori bound on the Kolmogorov–Smirnov distance. One such bound is given by

\[
(6.11) \qquad \sup_{s \in \Lambda} \left| \widehat{F}_\Lambda(s) - \widehat{F}_{\Lambda,N}(s) \right| \le \frac{\sqrt{\log(2)}}{N^{1/4}}
\]

with probability greater than 1 − 2^(k+1) e^2 N^k 4^(−√N).
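The a posteriori bound (6.11) and the probability level attached to it depend only on the number of samples N and the parameter dimension k; a minimal sketch:

```python
import numpy as np

def ks_bound(N, k):
    """A posteriori Kolmogorov-Smirnov bound (6.11) and the probability with
    which it holds, 1 - 2^(k+1) * e^2 * N^k * 4^(-sqrt(N))."""
    eta = np.sqrt(np.log(2.0)) / N ** 0.25
    prob = 1.0 - 2.0 ** (k + 1) * np.exp(2.0) * N ** k * 4.0 ** (-np.sqrt(N))
    return eta, prob

for N in (10_000, 100_000, 1_000_000):
    eta, prob = ks_bound(N, k=2)
    print(f"N={N}: eta={eta:.4f}, probability >= {prob:.6f}")
```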

7. Numerical results. We illustrate the error analysis with a contaminant source problem modeled as transient diffusive transport on S = [0, 1]^2,

\[
(7.1) \qquad
\begin{cases}
\dfrac{\partial u}{\partial t} = \nabla \cdot (A(\lambda_1) \nabla u) + \dfrac{s}{2\pi\sigma^2} \exp\!\left( \dfrac{-|x - \bar{x}|^2}{2\sigma^2} \right) \left[ 1 - H(t - g(\lambda_2)) \right], & S \times [0, 0.21],\\[1.5ex]
A(\lambda_1) \nabla u \cdot n = 0, & \partial S \times (0, 0.21),\\[0.5ex]
u(x, 0) = 0,
\end{cases}
\]

where the diffusion A(λ1) is anisotropic and given by

\[
(7.2) \qquad A(\lambda_1) = \begin{pmatrix} \exp(\cos(\pi\lambda_1)) & 0 \\ 0 & \exp(\sin(2\pi\lambda_1)) \end{pmatrix}.
\]

The model parameter λ1 is uncertain and modeled as a uniform r.v. on [−1, 1]. There is a single localized source of contaminant at known location x̄ = (0.5, 0.5), active on t ∈ [0, g(λ2)], with a Gaussian profile where the strength s = 10 and the width σ = 0.1 are known. We assume the source parameter λ2 is modeled as a uniform r.v. on [−1, 1], but g(λ2) is the discontinuous map

\[
(7.3) \qquad g(\lambda_2) =
\begin{cases}
0.025, & \lambda_2 < -2/3,\\
0.05, & -2/3 < \lambda_2 < 1/3,\\
0.075, & 1/3 < \lambda_2.
\end{cases}
\]
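The two parameter-to-coefficient maps (7.2) and (7.3) are straightforward to code; a minimal sketch (NumPy is an assumed dependency):

```python
import numpy as np

def diffusion_tensor(lam1):
    """Anisotropic diffusion A(lambda_1) from (7.2)."""
    return np.diag([np.exp(np.cos(np.pi * lam1)),
                    np.exp(np.sin(2.0 * np.pi * lam1))])

def shutoff_time(lam2):
    """Discontinuous source shut-off time g(lambda_2) from (7.3)."""
    if lam2 < -2.0 / 3.0:
        return 0.025
    elif lam2 < 1.0 / 3.0:
        return 0.05
    return 0.075

print(diffusion_tensor(0.25), shutoff_time(0.5))
```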

This is chosen to model the situation where decisions on stopping the flow of contaminant are made in real time, but the system can shut off the flow of contaminant only at discrete times. While somewhat artificial, this choice allows us to illustrate certain advantages of computing an improved surrogate.


Specifically, this problem is particularly ill suited for typical surrogate models since the response surface has jump discontinuities across two separate curves in the parameter space; i.e., a polynomial surrogate will not converge pointwise to the exact response surface due to the Gibbs phenomenon. However, we demonstrate that the improved surrogate correctly captures the discontinuities and provides an accurate propagation of uncertainty. This is because the discontinuity is not present in the adjoint equation: while the residual of the forward problem is discontinuous, the adjoint solution is smooth, so the corrections to the response surface accurately capture the discontinuity.

We compute forward (resp., adjoint) solutions using continuous piecewise linear (resp., quadratic) finite element schemes in space and continuous piecewise linear polynomials in time. The grid spacing, i.e., the spacing of the nodes in the triangulation of S, is either h = 0.1 or h = 0.05, with Δt = 0.005 or Δt = 0.0025, respectively. Since the number of spatial degrees of freedom is relatively modest and the spectral approximations we use are of low order, we assemble the global matrix corresponding to the discretized stochastic system and apply a sparse direct solver. Otherwise, a preconditioned iterative solver, such as conjugate gradients, should be employed.

7.1. 1-dimensional example: Random model parameter.

Forward propagation of uncertainty. Our quantity of interest is the measurement u(x, t) at t = 0.05 and x = (0.5, 0.5). Here, we set h = 0.1 and Δt = 0.005. We compute a 10th-order spectral approximation in λ1 with respect to the r.v. Z ∼ U([−1, 1]) using Legendre polynomials for both the forward and adjoint solutions. We fix λ2 so that g(λ2) = 0.05 in this example. It is clear from (7.1) that the forward problem and its adjoint are both of the general form (3.5), so we solve deterministic coupled systems of equations to obtain spectral approximations for both the forward and adjoint solutions.
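Once the Legendre coefficients of the surrogate are available, propagating the uniform distribution of λ1 forward amounts to evaluating the expansion at Monte Carlo samples and forming the empirical CDF FD,N(s). The sketch below uses made-up coefficients standing in for those produced by the deterministic coupled (stochastic Galerkin) solve.

```python
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(2)

# Hypothetical Legendre coefficients of a 10th-order surrogate q(lambda);
# in practice these come from the coupled deterministic solve.
coeffs = np.array([2.0, 0.3, -0.15, 0.05, 0.02, -0.01,
                   0.005, 0.002, -0.001, 0.0005, 0.0002])

def surrogate(lam):
    """Evaluate the polynomial chaos surrogate at lambda in [-1, 1]."""
    return legendre.legval(lam, coeffs)

def empirical_cdf(values, s):
    """Empirical CDF F_{D,N}(s) from surrogate samples."""
    return np.mean(values <= s)

lam_samples = rng.uniform(-1.0, 1.0, size=20_000)   # lambda_1 ~ U([-1, 1])
q_samples = surrogate(lam_samples)
print(empirical_cdf(q_samples, s=2.0))
```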

We observe that increasing the order of the surrogate approximation has a minimal effect on the magnitude of the deterministic error even though the representation error decreases to zero pointwise in stochastic space. The left-hand plot of Figure 1 shows the surrogate and improved surrogate response curves. The solid curve in the right-hand plot of Figure 1 shows the deterministic a posteriori error estimate εD(λ) used to compute the improved surrogate. The magnitude of the deterministic error in the surrogate is typically around one order of magnitude less than the magnitude of the surrogate value. As the order of the polynomial expansion is increased, the deterministic error in the surrogate converges to the discretization error of the "exactly" solved quantity of interest, which is also shown in the right-hand plot of Figure 1 as the dashed curve.

We consider the problem of bounding the representation error in order to apply the error bounds on the computed CDFs presented in sections 5.2.1 and 6.2.1. Since the deterministic error estimate converges to the discretization error as the order of the spectral approximation increases, we consider the difference between the two curves in the right-hand plot of Figure 1 as a plot of the representation error. In Figure 2, the top-left plot shows this representation error in the surrogate, and the top-right plot is of the last term used in the expansion defining the surrogate. In this case, we may also compute a bound of the representation error as 0.0419 using an integration-by-parts technique applied to the generalized Rodrigues formula defining the orthogonal polynomials. For more details, we direct the interested reader to [3, 54].

Figure 1. Left: 10th-order surrogate approximation (dashed line) and the improved surrogate (solid line). Right: numerical error in the exact quantity of interest solved by fixing λ and using traditional adjoint-based error analysis (dashed line) and the a posteriori estimate for the deterministic error εD(λ) in the surrogate (solid line).

Remark 7.1. Techniques such as the generalized Rodrigues formula for deriving computable bounds rely on the strong assumption that the surrogate has been computed without discretization error. In practice, this is often impossible to satisfy, and the bound can prove inaccurate by underestimating the error at low order and overestimating the error by several orders of magnitude at higher order. As shown below, even if the representation error of the numerically computed surrogate is larger than the bound, the representation error of the improved surrogate may be much smaller. Only in the improved surrogate case can we reasonably expect such error bounds to hold, since the effect of the discretization error is corrected for by the a posteriori estimate. For this reason, we generally use empirically determined estimates for the representation error, which we show below can result in tighter error bounds.

In Figure 2, the bottom plot shows the representation error in the improved surrogate, and the right-hand plot shows the representation error in a 20th-order polynomial expansion for the surrogate; i.e., the forward problem is solved with a 20th-order polynomial basis. Computing the bound of the representation error of a 20th-order spectral approximation using the technique of [3] gives 0.015, clearly bounding the representation error in the improved surrogate; i.e., estimating and correcting for the deterministic error gives an improved surrogate that has the pointwise accuracy of a higher-order surrogate approximation. For a tighter bound on the representation error in the improved surrogate, we can estimate it empirically. We do not seek an optimal error bound of the representation error, so we solve the forward problem at a selected number of points in the stochastic domain, e.g., the points where the next polynomial in the expansion would have local maxima, and choose a conservative bound for a splined curve through these points. In this case, a bound of 0.01 is sufficient and yields results comparable to 0.015. Therefore, we present results below comparing the a posteriori error bounds on the distribution functions using the loose bound on the representation error, 0.0419, and the tight bound on the representation error, 0.01.
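The empirical procedure just described might be sketched as follows: evaluate the error of the improved surrogate at a few points (e.g., near the extrema of the next Legendre polynomial), spline those values, and take a conservative bound. The "exact" quantity of interest and the improved surrogate below are placeholders; in practice the former requires a forward/adjoint solve at each point.

```python
import numpy as np
from numpy.polynomial import legendre
from scipy.interpolate import CubicSpline

# Placeholders: an "exact" (discretization-error-corrected) quantity of interest
# and the improved surrogate; not the paper's actual maps.
exact_qoi          = lambda lam: 2.0 + np.cos(np.pi * lam)
improved_surrogate = lambda lam: (2.0 + np.cos(np.pi * lam)
                                  + 0.004 * np.sin(6.0 * np.pi * lam))

# Candidate points: interior extrema of the next (11th) Legendre polynomial,
# i.e., the roots of its derivative, plus the endpoints of [-1, 1].
next_coeffs = np.zeros(12)
next_coeffs[11] = 1.0
extrema = np.sort(np.real(legendre.legroots(legendre.legder(next_coeffs))))
pts = np.concatenate(([-1.0], extrema, [1.0]))

errors = np.abs(exact_qoi(pts) - improved_surrogate(pts))
spline = CubicSpline(pts, errors)
fine = np.linspace(-1.0, 1.0, 1001)
bound = 1.5 * np.max(spline(fine))   # conservative safety factor on the splined error
print(bound)
```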

Figure 2. Top left: the computed representation error in the 10th-order surrogate. Top right: the last term in the polynomial expansion defining the surrogate, which can be used to bound the representation error. Bottom: the computed representation error of the 10th-order improved surrogate.

In Figure 3, we show numerically computed CDFs FD,N(s) computed using N = 500 and N = 20,000. The effect of error from finite sampling is reduced quickly and is effectively removed with 20,000 samples, so all results for the forward propagation are for the case where 20,000 samples are used.

We study the effect of the deterministic errors below.

Figure 3. Left: FD,500(s). Right: FD,20,000(s).

In Figure 4, the left-hand plot shows the error and error bounds in the finite sample CDFs, |Fq,N(s) − FD,N(s)|, for N = 20,000 using the loose and tight bounds on the representation error for the improved surrogate. We first observe the right-hand plot, which demonstrates the severe effect of the uncorrected deterministic error in the original surrogate on the CDF. We compare this to the left-hand plot showing the error and error bounds of the CDF for the improved surrogate. For the improved surrogate, the error in the CDF is significantly less than 1% over most of the parameter space, with an average error of approximately 0.07% (see the bottom curve of the left-hand plot in Figure 4). When the deterministic error is not estimated and used to improve the surrogate, the average error is approximately 3.7% (see the right-hand plot of Figure 4). In other words, an error of 0.037 in these plots at a fixed value of q(λ) = q indicates that the computed probability of the event {q(λ) ≤ q} is off by 3.7%. Thus, not using an improved surrogate for this example results in an average error in the CDF that is approximately 53 times larger than the average error in the CDF computed using the improved surrogate. This implies that if output points q are chosen (uniformly) at random in the output space in order to compute the probability of the event {q(λ) ≤ q}, then the average error in probability is 53 times worse using the original surrogate to propagate the distribution instead of the improved surrogate.

Figure 4. Left: The bottom curve is the true pointwise error in the numerically computed CDF, the middle curve is a bound on this error using the representation error bound of 0.01, and the top curve is a bound on this error using a representation error bound of 0.0419. Right: the pointwise error in the numerically computed CDF when the numerical error is not corrected for in the surrogate.

Inverse propagation of uncertainty. For the inverse formulation, we use the representation error bound of 0.01 and the alternative formulation of the likelihood function, as shown in (6.7). We impose the uniform distribution U([1.5, 2]) on the quantity of interest; i.e., the measurable data is assumed to be within [1.5, 2] with equal probability. The prior density is assumed uniformly distributed on [−1, 1].


We use 1,000,000 independently generated samples of the prior to compute Monte Carlo estimates for ν̂, νmin, and νmax and the subsequent bound (6.9). This gives ν̂ ≈ 1.45, νmin ≈ 1.40, and νmax ≈ 1.50. Next, we use this same set of independently generated samples to compute the surrogate CDF F̂Λ(s) at 1,000 uniformly distributed points s ∈ Λ using Monte Carlo integration. We use Monte Carlo integration with 20,000 independently generated samples of the prior to compute the normalizing constant ν ≈ 1.47 for the exact CDF FΛ(s), as well as to evaluate this CDF at the same points s ∈ Λ as above. We summarize these results in Figure 5. The left-hand plot shows FΛ(s), and the right-hand plot shows the pointwise error in F̂Λ(s) as well as the pointwise error bound (6.9). The total compute time of FΛ(s) is dominated by the evaluation of the likelihood at the 20,000 samples, which took 38,671 seconds, with all other computation times negligible by comparison. The total compute time for F̂Λ(s) is divided between the 82.6 seconds to compute the improved surrogate and the 98.5 seconds to evaluate this function at the 1,000,000 samples of the prior. We again observe that use of the improved surrogate quantity gives a CDF that is pointwise accurate over the entire sample space and that the error bound never exceeds 0.07. Without correcting for the deterministic error, the magnitude of the error in the distribution function makes computations of probabilities of events unreliable.
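With the modified likelihood (6.7), the surrogate posterior CDF F̂Λ(s) is a likelihood-weighted Monte Carlo average over prior samples. A minimal sketch, again with placeholder ingredients rather than the paper's actual surrogate:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical improved surrogate, error bound, and imposed data density.
improved = lambda lam: 2.0 + np.cos(np.pi * lam)
E = 0.01
rho_D = stats.uniform(loc=1.5, scale=0.5)        # U([1.5, 2]) data density

lam = rng.uniform(-1.0, 1.0, size=1_000_000)     # prior samples on [-1, 1]
center = improved(lam)
L = (rho_D.cdf(center + E) - rho_D.cdf(center - E)) / (2.0 * E)   # likelihood (6.7)

nu_hat = 1.0 / np.mean(L)                        # 1/nu_hat = integral of L * prior

def posterior_cdf(s):
    """F_Lambda_hat(s) = nu_hat * E_prior[ L(lambda|q) * 1{lambda <= s} ]."""
    return nu_hat * np.mean(L * (lam <= s))

for s in (-0.5, 0.0, 0.5):
    print(s, posterior_cdf(s))
```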

Figure 5. Left: The exact CDF FΛ(s) for the inverse problem is estimated by Monte Carlo integration with 20,000 independently generated samples of the prior, where the problem defined by (7.1) is solved at each sample along with an adjoint problem to correct for discretization error in the quantity of interest. Right: The bottom curve is of |FΛ(s) − F̂Λ(s)|, where F̂Λ(s) is estimated using Monte Carlo integration with 1,000,000 independently generated samples of the prior. The top curve is of the bound (6.9).

7.2. 2-dimensional example: Random model and source parameters. We now consider the case where both the model parameter λ1 and the source parameter λ2 vary. We set h = 0.05 and Δt = 0.0025. In the previous example, discretization error was the main source of error, but in this example, with this choice of h and Δt and the structure of g(λ2), we show that the representation error is the main source of deterministic error in the quantity of interest, which is taken as u(x, t) at t = 0.075 and x = (0.5, 0.5).

The left-hand plot of Figure 6 shows the 5th-order spectral approximation for the surrogate, and the right-hand plot shows the exact response surface. Increasing the order of the spectral approximation used for the surrogate fails to improve the pointwise accuracy because of the discontinuities in the exact response surface. In fact, as the order of the polynomial basis is increased, the surrogate exhibits Gibbs-type phenomena, i.e., oscillations near the jump discontinuities of the exact response surface.

Figure 6. Left: the 5th-order spectral approximation for the surrogate response surface. Right: the exact quantity of interest.

The left-hand plot of Figure 7 shows the computable deterministic error estimate using a 10th-order spectral approximation for the adjoint. We note that since the discontinuity in the sample space occurs only for the forward problem, a spectral approximation of the adjoint solution converges pointwise as the order of the polynomial basis increases. Thus, according to Theorem 4.1, we expect that the improved surrogate will have better pointwise accuracy. This is verified in the right-hand plot of Figure 7.

Figure 7. Left: estimate of εD(λ) computed using a 10th-order spectral approximation to the adjoint. Right: the improved surrogate response surface.

We consider the inverse problem with the first formulation for the likelihood, i.e., a likelihood derived from a fully deterministic map. We assume the prior is uniform on [−1, 1]^2 and the quantity of interest is also uniform on the set [1, 2.5].


We use 10^6 i.i.d. samples of the prior and the improved surrogate quantity of interest to estimate ν̂ = 0.3705, νmin = 0.3555, and νmax = 0.3778. Using a different set of 20,000 i.i.d. samples of the prior, we compute the deterministic error on the sample space between the CDFs computed using the exact (discretization error corrected) quantity of interest and the improved surrogate. Additionally, we compute the bound (6.9) with the deterministic likelihood function using a representation error of 0.0314 determined empirically from the improved surrogate. The results are shown in Figure 8.

Figure 8. The top transparent surface is the computed error bound on the deterministic error component for the CDF computed using an improved surrogate quantity of interest. The surface underneath the transparent surface is the absolute value of the deterministic error in the CDFs computed using the exact quantity of interest and the improved surrogate.
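For the deterministic-likelihood formulation used here, the likelihood is ρD evaluated at the improved surrogate value, and νmin and νmax follow from minimizing or maximizing ρD over each interval Iq(λ). A sketch with a placeholder two-parameter response surface:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Placeholder improved surrogate on [-1, 1]^2 and its representation-error bound.
improved = lambda lam: 2.0 + np.cos(np.pi * lam[:, 0]) + 0.5 * np.sign(lam[:, 1])
E = 0.0314
rho_D = stats.uniform(loc=1.0, scale=1.5)        # imposed U([1, 2.5]) data density

lam = rng.uniform(-1.0, 1.0, size=(1_000_000, 2))    # i.i.d. prior samples
center = improved(lam)

# Deterministic likelihood: rho_D evaluated at the improved surrogate value.
L = rho_D.pdf(center)
nu_hat = 1.0 / np.mean(L)

# Bounds on the exact normalizing constant: min/max of rho_D over I_q(lambda).
# For this uniform rho_D and a short interval, endpoint values suffice.
L_hi = np.maximum(rho_D.pdf(center - E), rho_D.pdf(center + E))
L_lo = np.minimum(rho_D.pdf(center - E), rho_D.pdf(center + E))
nu_min = 1.0 / np.mean(L_hi)
nu_max = 1.0 / np.mean(L_lo)
print(nu_hat, nu_min, nu_max)
```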

REFERENCES

[1] R. A. Adams, Sobolev Spaces, Academic Press, New York, 1975.
[2] R. C. Almeida and J. T. Oden, Solution verification, goal-oriented adaptive methods for stochastic advection–diffusion problems, Comput. Methods Appl. Mech. Engrg., 199 (2010), pp. 2472–2486.
[3] M. Bain and L. M. Delves, The convergence rates of expansions in Jacobi polynomials, Numer. Math., 27 (1977), pp. 219–225.
[4] W. Bangerth and R. Rannacher, Adaptive Finite Element Methods for Differential Equations, Birkhäuser Verlag, Basel, 2003.
[5] J. M. Bernardo, Reference posterior distributions for Bayesian inference, J. Roy. Statist. Soc. Ser. B, 41 (1979), pp. 113–147.
[6] P. Billingsley, Probability and Measure, John Wiley & Sons, New York, 1995.
[7] J. Breidt, T. Butler, and D. Estep, A measure-theoretic computational method for inverse sensitivity problems I: Method and analysis, SIAM J. Numer. Anal., 49 (2011), pp. 1836–1859.
[8] T. Butler, Computational Measure Theoretic Approach to Inverse Sensitivity Analysis: Methods and Analysis, Ph.D. thesis, Department of Mathematics, Colorado State University, Fort Collins, CO, 2009.
[9] T. Butler, P. Constantine, and T. Wildey, A posteriori error analysis of parameterized linear systems using spectral methods, SIAM J. Matrix Anal. Appl., 33 (2012), pp. 195–209.
[10] T. Butler, C. Dawson, and T. Wildey, A posteriori error analysis of stochastic differential equations using polynomial chaos expansions, SIAM J. Sci. Comput., 33 (2011), pp. 1267–1291.
[11] T. Butler, D. Estep, and J. Sandelin, A computational measure theoretic approach to inverse sensitivity problems II: A posteriori error analysis, SIAM J. Numer. Anal., 50 (2012), pp. 22–45.
[12] D. Cacuci, Sensitivity and Uncertainty Analysis: Theory, Vol. I, Chapman & Hall/CRC, Boca Raton, FL, 2003.
[13] R. Cameron and W. Martin, The orthogonal development of non-linear functionals in series of Fourier-Hermite functionals, Ann. of Math. (2), 48 (1947), pp. 385–392.
[14] V. Carey, D. Estep, A. Johansson, M. Larson, and S. Tavener, Blockwise adaptivity for time-dependent problems based on coarse scale adjoint solutions, SIAM J. Sci. Comput., 32 (2010), pp. 2121–2145.
[15] P. G. Constantine, D. F. Gleich, and G. Iaccarino, Spectral methods for parameterized matrix equations, SIAM J. Matrix Anal. Appl., 31 (2010), pp. 2681–2699.
[16] M. K. Deb, I. Babuška, and J. T. Oden, Solution of stochastic partial differential equations using Galerkin finite element techniques, Comput. Methods Appl. Mech. Engrg., 190 (2001), pp. 6359–6372.
[17] L. P. Devroye, A uniform bound for the deviation of empirical distribution functions, J. Multivariate Anal., 7 (1977), pp. 594–597.
[18] C. F. Dunkl and Y. Xu, Orthogonal Polynomials of Several Variables, Cambridge University Press, Cambridge, UK, 2001.
[19] A. Dvoretzky, J. Kiefer, and J. Wolfowitz, Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator, Ann. Math. Statist., 27 (1956), pp. 642–669.
[20] K. Eriksson, D. Estep, P. Hansbo, and C. Johnson, Introduction to adaptive methods for differential equations, in Acta Numerica, 1995, Acta Numer. 4, Cambridge University Press, Cambridge, UK, 1995, pp. 105–158.
[21] K. Eriksson, D. Estep, P. Hansbo, and C. Johnson, Computational Differential Equations, Cambridge University Press, Cambridge, UK, 1996.
[22] D. Estep, A posteriori error bounds and global error control for approximation of ordinary differential equations, SIAM J. Numer. Anal., 32 (1995), pp. 1–48.
[23] D. Estep and D. French, Global error control for the continuous Galerkin finite element method for ordinary differential equations, RAIRO Modél. Math. Anal. Numér., 28 (1994), pp. 815–852.
[24] D. Estep, V. Ginting, D. Ropp, J. N. Shadid, and S. Tavener, An a posteriori–a priori analysis of multiscale operator splitting, SIAM J. Numer. Anal., 46 (2008), pp. 1116–1146.
[25] D. Estep, M. J. Holst, and A. Målqvist, Nonparametric density estimation for randomly perturbed elliptic problems III: Convergence, computational cost, and generalizations, J. Appl. Math. Comput., 38 (2012), pp. 367–387.
[26] D. Estep, M. G. Larson, and R. D. Williams, Estimating the error of numerical solutions of systems of reaction-diffusion equations, Mem. Amer. Math. Soc., 146 (2000), 696.
[27] D. Estep, A. Målqvist, and S. Tavener, Nonparametric density estimation for randomly perturbed elliptic problems I: Computational methods, a posteriori analysis, and adaptive error control, SIAM J. Sci. Comput., 31 (2009), pp. 2935–2959.
[28] D. Estep, A. Målqvist, and S. Tavener, Nonparametric density estimation for randomly perturbed elliptic problems II: Applications and adaptive modeling, Internat. J. Numer. Methods Engrg., 80 (2009), pp. 846–867.
[29] D. Estep, B. McKeown, D. Neckels, and J. Sandelin, GAASP: Globally Accurate Adaptive Sensitivity Package, 2006.
[30] D. Estep and D. Neckels, Fast and reliable methods for determining the evolution of uncertain parameters in differential equations, J. Comput. Phys., 213 (2006), pp. 530–556.
[31] D. Estep and D. Neckels, Fast methods for determining the evolution of uncertain parameters in reaction-diffusion equations, Comput. Methods Appl. Mech. Engrg., 196 (2007), pp. 3967–3979.
[32] D. Estep and A. Stewart, The dynamical behavior of the discontinuous Galerkin method and related difference schemes, Math. Comp., 71 (2002), pp. 1075–1103.
[33] D. Estep, S. Tavener, and T. Wildey, A posteriori error estimation and adaptive mesh refinement for a multiscale operator decomposition approach to fluid-solid heat transfer, J. Comput. Phys., 229 (2010), pp. 4143–4158.
[34] J. P. Huelsenbeck, B. Larget, R. E. Miller, and F. Ronquist, Potential applications and pitfalls of Bayesian inference of phylogeny, Syst. Biol., 51 (2002), pp. 673–688.
[35] W. Gautschi, Orthogonal polynomials: Applications and computation, in Acta Numerica, 1996, Acta Numer. 5, Cambridge University Press, Cambridge, UK, 1996, pp. 45–119.
[36] W. Gautschi, Orthogonal Polynomials: Computation and Approximation, Clarendon Press, Oxford, UK, 2004.
[37] A. Gelman and D. B. Rubin, Inference from iterative simulation using multiple sequences, Statist. Sci., 7 (1992), pp. 457–472.
[38] J. E. Gentle, Random Number Generation and Monte Carlo Methods, Springer, New York, 2003.
[39] C. J. Geyer, Practical Markov chain Monte Carlo, Statist. Sci., 7 (1992), pp. 473–483.
[40] R. Ghanem and J. Red-Horse, Propagation of probabilistic uncertainty in complex physical systems using a stochastic finite element approach, Phys. D, 133 (1999), pp. 137–144.
[41] R. Ghanem and P. Spanos, Stochastic Finite Elements: A Spectral Approach, Springer-Verlag, New York, 1991.
[42] M. B. Giles and E. Süli, Adjoint methods for PDEs: A posteriori error analysis and postprocessing by duality, in Acta Numerica, 2002, Acta Numer. 11, Cambridge University Press, Cambridge, UK, 2002, pp. 145–236.
[43] W. R. Gilks, S. Richardson, and D. J. Spiegelhalter, Markov Chain Monte Carlo in Practice, Chapman & Hall, Boca Raton, FL, 1996.
[44] J. Kaipio and E. Somersalo, Statistical and Computational Inverse Problems, Springer, New York, 2005.
[45] J. Kiefer, On large deviations of the empiric D.F. of vector chance variables and a law of the iterated logarithm, Pacific J. Math., 11 (1961), pp. 649–660.
[46] C. Lanczos, Linear Differential Operators, Dover, New York, 1997.
[47] O. Le Maître, R. Ghanem, O. Knio, and H. Najm, Multi-resolution analysis of Wiener-type uncertainty propagation schemes, J. Comput. Phys., 197 (2004), pp. 502–531.
[48] O. Le Maître, R. Ghanem, O. Knio, and H. Najm, Uncertainty propagation using Wiener-Haar expansions, J. Comput. Phys., 197 (2004), pp. 28–57.
[49] G. I. Marchuk, Adjoint Equations and Analysis of Complex Systems, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1995.
[50] G. I. Marchuk, V. I. Agoshkov, and V. P. Shutyaev, Adjoint Equations and Perturbation Algorithms in Nonlinear Problems, CRC Press, Boca Raton, FL, 1996.
[51] Y. Marzouk, H. Najm, and L. Rahn, Stochastic spectral methods for efficient Bayesian solution of inverse problems, J. Comput. Phys., 224 (2007), pp. 560–586.
[52] P. Massart, The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality, Ann. Probab., 18 (1990), pp. 1269–1283.
[53] L. Mathelin and O. P. Le Maître, Dual-based error analysis for uncertainty quantification in a chemical system, PAMM, 7 (2007), pp. 2010007–2010008.
[54] K. O. Mead and L. M. Delves, On the convergence rate of generalized Fourier expansions, J. Inst. Math. Appl., 12 (1973), pp. 247–259.
[55] D. Neckels, Variational Methods for Uncertainty Quantification, Ph.D. thesis, Department of Mathematics, Colorado State University, Fort Collins, CO, 2005.
[56] W. L. Oberkampf and C. J. Roy, Verification and Validation in Scientific Computing, Cambridge University Press, Cambridge, UK, 2010.
[57] C. P. Robert and G. Casella, Monte Carlo Statistical Methods, Springer, New York, 2004.
[58] J. Sandelin, Global Estimate and Control of Model, Numerical, and Parameter Error, Ph.D. thesis, Department of Mathematics, Colorado State University, Fort Collins, CO, 2006.
[59] R. J. Serfling, Approximation Theorems of Mathematical Statistics, John Wiley & Sons, New York, 1980.
[60] A. Tarantola, Inverse Problem Theory and Methods for Model Parameter Estimation, SIAM, Philadelphia, 2005.
[61] X. Wan and G. E. Karniadakis, An adaptive multi-element generalized polynomial chaos method for stochastic differential equations, J. Comput. Phys., 209 (2005), pp. 617–642.
[62] X. Wan and G. Karniadakis, Beyond Wiener-Askey expansions: Handling arbitrary PDFs, J. Sci. Comput., 27 (2006), pp. 455–464.
[63] N. Wiener, The homogeneous chaos, Amer. J. Math., 60 (1938), pp. 897–936.
[64] T. Wildey, Software Documentation: Adaptive Coupled Equation Solver (ACES), Technical report, 2010.
[65] T. Wildey, D. Estep, and S. Tavener, A posteriori estimation of approximate boundary fluxes, Comm. Numer. Methods Engrg., 24 (2008), pp. 421–434.
[66] D. Xiu and G. E. Karniadakis, The Wiener–Askey polynomial chaos for stochastic differential equations, SIAM J. Sci. Comput., 24 (2002), pp. 619–644.
