Bayesian inverse problems in measure spaces with application to Burgers and
Hamilton–Jacobi equations with white noise forcing
2012 Inverse Problems 28 025009
(http://iopscience.iop.org/0266-5611/28/2/025009)
IOP PUBLISHING INVERSE PROBLEMS
Inverse Problems 28 (2012) 025009 (29pp) doi:10.1088/0266-5611/28/2/025009
Viet Ha Hoang
Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
E-mail: [email protected]
Received 1 May 2011, in final form 4 November 2011. Published 20 January 2012. Online at stacks.iop.org/IP/28/025009
Abstract
This paper formulates Bayesian inverse problems for inference in a topological measure space given noisy observations. Conditions for the validity of Bayes' formula and the well-posedness of the posterior measure are studied. The abstract theory is then applied to Burgers and Hamilton–Jacobi equations on a semi-infinite time interval with forcing functions which are white noise in time. Inference is made on the white noise forcing, assuming the Wiener measure as the prior.
1. Introduction
Bayesian inverse problems are attracting considerable attention due to their importance in many applications (see the extensive survey by Stuart [25]). The unknown quantities are to be determined from a set of data, which are subject to observational error. Given a functional G : X → R^m, where X is a function space, the observation y ∈ R^m of G is subject to an unbiased noise σ:

y = G(x) + σ. (1.1)
The inverse problem determines x given the noisy observation y. Given x, the quantity y can be found by solving the forward (or direct) problem, which is a well-posed partial differential equation: for each x ∈ X, there is a unique solution y which depends smoothly on x. However, the inverse problem (1.1) may not have a unique solution; a solution may not even exist; or it can depend sensitively on y.
The solution x of the ill-posed problem (1.1) may reasonably be defined as a minimizer of the functional

‖y − G(x)‖_Y, (1.2)

where ‖·‖_Y is a measuring norm, for example the R^m norm in our setting. However, (1.2) may not possess a minimizer, as minimizing sequences can be highly oscillatory and diverge.
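This sensitivity to the data can be seen numerically. The following sketch (not from the paper; all names are illustrative) uses the Hilbert matrix as a hypothetical discretized smoothing forward operator and shows how naive inversion amplifies a tiny observational noise:

```python
import numpy as np

# Discretized smoothing forward operator: the n x n Hilbert matrix,
# a classic severely ill-conditioned example (entries 1/(i + j + 1)).
n = 8
G = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])

x_true = np.ones(n)
y_clean = G @ x_true

# Tiny observational noise in the data y ...
rng = np.random.default_rng(0)
y_noisy = y_clean + 1e-8 * rng.standard_normal(n)

# ... is amplified enormously by naive inversion: the condition number
# of G bounds the worst-case relative-error amplification.
x_naive = np.linalg.solve(G, y_noisy)
cond_G = np.linalg.cond(G)
```

For the 8 × 8 Hilbert matrix the condition number is of order 10^10, so a perturbation of size 10^−8 in y can change the reconstructed x by an amount many orders of magnitude larger.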
Classical approaches solve a well-posed regularized problem which approximates (1.1). The Tikhonov method looks for a minimizer of the well-posed regularized functional

F_δ(x) = ‖y − G(x)‖_Y² + δ‖x‖_E²

for a chosen norm ‖·‖_E of x. Another method is the truncated singular value decomposition, which restricts the problem to an approximating finite-dimensional space in which the problem becomes well posed. Details about classical methods can be found in [9, 15, 17].
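For a linear forward map, the Tikhonov minimizer can be computed from the normal equations. A minimal sketch, assuming (for illustration only) that both ‖·‖_Y and ‖·‖_E are Euclidean norms and reusing the Hilbert matrix as a hypothetical forward operator:

```python
import numpy as np

# Tikhonov regularization for a linear forward map G: minimize
#   F_delta(x) = |y - G x|^2 + delta |x|^2,
# whose unique minimizer solves the well-posed normal equations
#   (G^T G + delta I) x = G^T y.
n = 8
G = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
x_true = np.ones(n)
rng = np.random.default_rng(1)
y = G @ x_true + 1e-8 * rng.standard_normal(n)

delta = 1e-6  # regularization parameter (illustrative choice)
x_tik = np.linalg.solve(G.T @ G + delta * np.eye(n), G.T @ y)
```

Larger δ makes the regularized problem better conditioned but biases the solution; classical theory ties the choice of δ to the noise level.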
Classical methods regard the observational error σ as deterministic, and the approximate solution to (1.2) is found as a single deterministic function. Ill-posedness is removed in a rather ad hoc manner [17]. The Bayesian approach to inverse problems treats σ as a random variable whose distribution is known, and x as a random variable with values in the function space X, which is equipped with a predetermined prior probability. The information we seek on the unknown quantity x is encoded in the conditional probability of x given the observation y, which is corrupted by an observational noise σ whose realization is unknown but whose statistics are known. This conditional probability is the solution of the inverse problem.
Kaipio and Somersalo [17] provide a foundation for the Bayesian approach to inverse problems. In finite-dimensional spaces, the existence of a solution is guaranteed by Bayes' theorem. The approach of Kaipio and Somersalo requires discretizing the problem first and then applying the Bayesian framework to the discrete problem. However, for the discrete problem to represent the original accurately, the dimension of the approximating space must be very high. Further, in many problems, as discussed by Stuart [25], important insights can be obtained by deferring discretization to the last possible moment. With this viewpoint, in their pioneering work, Cotter et al [7] formulate a rigorous mathematical framework for Bayesian inverse problems in Banach spaces X which are equipped with a prior probability μ0. When G is μ0-measurable, they show that Bayes' formula (2.3) holds for the conditional probability μ^y = P(x|y) in this infinite-dimensional setting. When μ0 is Gaussian, assuming that G(x) grows polynomially with respect to the norm ‖x‖_X, they show that μ^y, with respect to the Hellinger distance, depends continuously on y. The problem is thus well posed.
In some situations, inverse problems in general metric spaces need to be considered, for which the theory in [7] does not readily apply. In fluid turbulence, the behavior of solutions of stochastic partial differential equations after a long time, when initial conditions have negligible effect, is of interest. The behavior of the solution then depends solely on the realization of the random forcing. The problems are thus considered over a semi-infinite time interval (−∞, T]; see [22] for Navier–Stokes equations and [8] and [4] for the Burgers equation. Inferring the forcing (as considered in [7]) requires formulating a Bayesian inverse problem in C(−∞, T], which is a metric space. Another typical example is inference of a positive random field (e.g. a random conductivity) which is strictly bounded away from zero, or of a random field which can be expressed as the sum of an infinite series of random variables (e.g. the Karhunen–Loève expansion) which are distributed in a finite domain (see e.g. [6, 14, 23]). The unknown x then belongs to a topological measure space.
In this paper, we formulate Bayesian inverse problems in general measure spaces. We demonstrate that while the validity of Bayes' theorem for general measure spaces follows from the results of Cotter et al [7], the well-posedness of the posterior measure can be significantly more complicated. For Banach spaces X, Cotter et al [7] assume that the functional Φ in (2.2) grows polynomially with respect to the norm ‖x‖_X and is locally Lipschitz with respect to y, with a Lipschitz constant that also grows polynomially with respect to ‖x‖_X. The well-posedness of the posterior measure is then deduced from the Fernique theorem. In general measure spaces, conditions for well-posedness become abstract. Unlike in Banach spaces, in metric spaces the functions M(r) and G(r, x) in assumption 2.3 generally cannot be expressed in
terms of the metric. There is no general description of how to identify them; their identification is problem dependent. We illustrate these points by considering inference of the white noise forcing for Burgers and Hamilton–Jacobi equations on the semi-infinite time interval (−∞, T]. The semi-infinite time interval and the use of the Wiener measure as the prior combine to necessitate the use of a metric structure.
As shown below, for problems in a spatially compact domain, in particular the periodic torus T^d, the proof of the well-posedness of the posterior measure is not so complicated, as the function G(r, x) in assumption 2.3 can be formulated in terms of a seminorm. It is the problems in noncompact domains that highlight the complexity of Bayesian problems in measure spaces. For the case where the forcing potential F(x) in (3.2) has a large maximum and a small minimum in a compact set in comparison to the potential outside, G(r, x) depends on the first time a fluid particle trajectory hits this compact set. Showing the required condition that the function G(r, x) is square integrable with respect to μ0 is highly nontrivial. Burgers and Hamilton–Jacobi equations in general noncompact domains are far more complicated than this particular case where the forcing potential has a large maximum and a small minimum; similar Bayesian inverse problems in general noncompact domains should be far more technically demanding. This shows that when the well-posedness conditions cannot be formulated in terms of a norm, as in the Banach space setting, Bayesian inverse problems can become far more difficult.
The Burgers equation plays an important role in turbulence theory, cosmology, acoustics and non-equilibrium statistical physics (see, e.g., [11, 8, 4]). It is simpler than the Navier–Stokes equations due to the absence of the pressure but still keeps the essential nonlinear features. The Burgers equation with white noise forcing in the spatially periodic case is studied by E et al [8] in one dimension and by Iturriaga and Khanin [16] in more than one dimension. They prove that almost surely, there is a unique solution that exists for all time. They establish many properties of shocks and of the dynamics of the Lagrange minimizers of the Lax–Oleinik functional. Much less is understood in the non-compact case. Hoang and Khanin [13] study the special case where the forcing potential has a large maximum and a small minimum. They prove that there is a solution that exists for all time; it is the limit of finite-time solutions with the zero initial condition.
When spatially the forcing is the gradient of a potential, and the velocity is assumed also to be a gradient, the velocity potential satisfies a Hamilton–Jacobi equation with random forcing. This is the Kardar–Parisi–Zhang (KPZ) equation that describes the lateral growth of an interface. In the dissipative case, the inverse problem of determining the dissipative coefficient of the KPZ equation was studied by Lam and Sander [19] using the classical least-squares fitting method. The forcing for the KPZ equation considered in [19] is white in both space and time. The dissipative coefficient is inferred from the expectation of the observed heights of the growing interface at some spatial and temporal points with respect to the random forcing's distribution. This expectation is taken as the Monte Carlo average of the surface height at these locations, obtained from a large number of independent simulations of the forcing. This approach is not feasible when the expectation is not available and data are instead obtained from a single realization of the forcing, as occurs in practical situations. The white noise forcing may then need to be inferred. This viscous (or dissipative) problem requires consideration of a stochastic control problem which 'becomes' the variational problem in this paper when the viscosity converges to 0 (see [12]). Though we do not assume a spatially white forcing, and we only consider the inviscid problems, our approach may well be adopted for this problem.
The Burgers equation and equations of Burgers type are employed as models for studying data assimilation in Bardos and Pironneau [2], Lundvall et al [21], Apte et al [1] and Freitag et al [10].
For the Hamilton–Jacobi equation (3.3), we assume that the solution (i.e. the velocity potential φ) is observed at fixed spatial points at fixed moments. This assumption is reasonable as φ is continuous and well defined everywhere (though it is determined within an additive constant). The inviscid Burgers equation possesses many shocks, at which the solution is not defined, but for a given time the solution is in L^1_loc(R^d), so observations in the form of a continuous functional on L^1_loc(R^d) are considered.

In section 2, we establish Bayes' formula for the posterior measures of Bayesian inverse
problems in measure spaces, and derive conditions under which the problems are well posed. Under these conditions, we show that the Hellinger and the Kullback–Leibler distances are locally Lipschitz with respect to y. The results are generalizations of those in the Banach space setting by Cotter et al in [7]. When the prior measure μ0 is a Radon Gaussian measure in a locally convex topological space (see, for example, Bogachev [5] or Lifshits [20]), we show the existence of the maximum a posteriori (MAP) estimator. Section 3 introduces the randomly forced Burgers and Hamilton–Jacobi equations and formulates the Bayesian inverse problems. The spatially periodic case is considered in section 4, using the results for random Burgers and Hamilton–Jacobi equations by E et al [8] and Iturriaga and Khanin [16]. We establish the validity of Bayes' formula and the posterior measure's local Lipschitzness with respect to the Hellinger and the Kullback–Leibler distances; the main results are recorded in theorems 4.6 and 4.12. We also show the existence of a MAP estimator in theorems 4.7 and 4.13. The non-periodic case is studied in section 5, using the results by Hoang and Khanin [13], where we restrict our consideration to forcing potentials that satisfy assumption 5.1, with a large maximum and a small minimum. Under this assumption, for both the Hamilton–Jacobi and Burgers equations, Bayes' formula is valid and the posterior measure is locally Lipschitz with respect to the Hellinger and the Kullback–Leibler distances. The main results are presented in theorems 5.8 and 5.14. Solutions to the Burgers and Hamilton–Jacobi equations are shown to exist almost surely in Hoang and Khanin [13]. It is not clear whether a solution exists for the forcing function's realizations in the Cameron–Martin space, so the existence of a MAP estimator is open for this case. We discuss this in remark 5.9. We devote section 6 to a conclusion summarizing the paper. In section 7, we prove some technical results that are used in the previous two sections.
Throughout the paper, we define 〈·, ·〉_Σ = 〈Σ^{−1/2}·, Σ^{−1/2}·〉_{R^m} for a positive definite matrix Σ; the corresponding norm |Σ^{−1/2}·|_{R^m} is denoted |·|_Σ. The R^m norm is denoted |·|. We denote by c various positive constants that do not depend on the white noise; the value of c can differ from one appearance to the next.
2. Bayesian inverse problems in measure spaces
We study Bayesian inverse problems in measure spaces, in particular the conditions for the validity of Bayes' formula and the well-posedness of the problems. In [7], Cotter et al require the functional G in (1.1) to be measurable with respect to the prior measure for the validity of the Bayes formula. Restricting their consideration to Banach spaces, they note that this requirement holds when G is continuous; however, this implication is true for any topological space X. Cotter et al [7] impose a polynomial growth on G to prove well-posedness. Using Fernique's theorem, they show that when the prior measure is Gaussian, the posterior measure is, with respect to the Hellinger metric, locally Lipschitz with respect to y. We will generalize this result to measure spaces and introduce a general condition for the local Lipschitzness of the posterior measure, without imposing a Gaussian prior.
2.1. Problem formulation
For a topological space X with a σ-algebra B(X), we consider a functional G : X → R^m. An observation y of G is subject to a multivariate Gaussian noise σ, i.e.

y = G(x) + σ, (2.1)

where σ ∼ N(0, Σ).

Let μ0 be the prior probability measure on (X, B(X)). Our purpose is to determine the conditional probability P(x|y). Let

Φ(x; y) = (1/2)|y − G(x)|_Σ². (2.2)

We have the following theorem.
Theorem 2.1 (Cotter et al [7], theorem 2.1). If G : X → R^m is μ0-measurable, then the posterior measure μ^y(dx) = P(dx|y) is absolutely continuous with respect to the prior measure μ0(dx), and the Radon–Nikodym derivative satisfies

(dμ^y/dμ0)(x) ∝ exp(−Φ(x; y)). (2.3)
Although Cotter et al [7] assume that X is a Banach space, their proof holds for any topological measure space X.
Following [7], corollary 2.2, we have the following corollary.
Corollary 2.2. If G : X → R^m is continuous, then the measure μ^y(dx) = P(dx|y) is absolutely continuous with respect to μ0(dx), and the Radon–Nikodym derivative satisfies (2.3).

Proof. If G : X → R^m is continuous, then it is measurable ([18], lemma 1.5). The conclusion follows. □
2.2. Well-posedness of the problem
For Banach spaces X, when the prior measure μ0 is Gaussian, the well-posedness of the inverse problem is established by Cotter et al [7] by assuming that the function Φ(x; y) has polynomial growth in ‖x‖_X for fixed y and is Lipschitz in y with a Lipschitz coefficient bounded by a polynomial of ‖x‖_X. The proof uses Fernique's theorem. For a general measure space X, we make the following assumption.
Assumption 2.3. The function Φ : X × R^m → R satisfies:

• For each r > 0, there is a constant M(r) > 0 and a set X(r) ⊂ X of positive μ0 measure such that for all x ∈ X(r) and for all y with |y| ≤ r,

0 ≤ Φ(x; y) ≤ M(r).

• There is a function G : R × X → R such that for each r > 0, G(r, ·) ∈ L²(X, dμ0), and if |y| ≤ r and |y′| ≤ r, then

|Φ(x; y) − Φ(x; y′)| ≤ G(r, x)|y − y′|.
To study the well-posedness of the Bayesian inverse problem (2.1), following Cotter et al [7], we employ the Hellinger metric

d_Hell(μ, μ′) = ( (1/2) ∫_X ( √(dμ/dμ0) − √(dμ′/dμ0) )² dμ0 )^{1/2},
where μ and μ′ are measures on (X, B(X)). The probability measure μ^y is determined by

dμ^y/dμ0 = (1/Z(y)) exp(−Φ(x, y)),

where the normalization constant is

Z(y) = ∫_X exp(−Φ(x, y)) dμ0(x). (2.4)
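For a toy one-dimensional problem, the normalization constant (2.4) and the posterior density (2.3) can be approximated by averaging over prior samples. A sketch (not from the paper), assuming a standard Gaussian prior and a hypothetical forward map G(x) = x³:

```python
import numpy as np

# Toy posterior on X = R: prior mu_0 = N(0, 1), hypothetical forward map
# G(x) = x^3, Gaussian noise with variance 0.1^2 (all choices illustrative).
rng = np.random.default_rng(2)
sigma2 = 0.1 ** 2

def Phi(x, y):
    # Phi(x; y) = (1/2) |y - G(x)|^2 / sigma^2
    return 0.5 * (y - x ** 3) ** 2 / sigma2

# Z(y) = int_X exp(-Phi(x; y)) dmu_0(x), estimated as an average over
# prior samples; the posterior density w.r.t. mu_0 is then exp(-Phi)/Z(y).
xs = rng.standard_normal(100_000)  # samples from the prior
y = 0.5
Z = np.mean(np.exp(-Phi(xs, y)))

def posterior_density(x, y, Z):
    # Radon-Nikodym derivative of the posterior w.r.t. mu_0, evaluated at x
    return np.exp(-Phi(x, y)) / Z
```

Since Φ ≥ 0 here, the estimate satisfies 0 < Z ≤ 1, and by construction the density averages to 1 over the prior samples.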
With respect to the Hellinger metric, the measure μ^y depends continuously on y, as shown in the following theorem.
Theorem 2.4. Under assumption 2.3, the measure μ^y is locally Lipschitz in the data y with respect to the Hellinger metric: for each positive constant r, there is a positive constant c(r) such that if |y| ≤ r and |y′| ≤ r, then

d_Hell(μ^y, μ^{y′}) ≤ c(r)|y − y′|.
Proof. First we show that for each r > 0, there is a positive constant α(r) such that Z(y) ≥ α(r) when |y| ≤ r. It is obvious from (2.4) and from assumption 2.3(i) that when |y| ≤ r,

Z(y) ≥ μ0(X(r)) exp(−M(r)). (2.5)

Using the inequality |exp(−a) − exp(−b)| ≤ |a − b| for a > 0 and b > 0, we have

|Z(y) − Z(y′)| ≤ ∫_X |Φ(x; y) − Φ(x; y′)| dμ0(x).

From assumption 2.3(ii), when |y| ≤ r and |y′| ≤ r,

|Φ(x; y) − Φ(x; y′)| ≤ G(r, x)|y − y′|.

As G(r, x) is μ0-integrable,

|Z(y) − Z(y′)| ≤ c(r)|y − y′|. (2.6)

The Hellinger distance satisfies

2 d_Hell(μ^y, μ^{y′})² = ∫_X ( Z(y)^{−1/2} exp(−Φ(x, y)/2) − Z(y′)^{−1/2} exp(−Φ(x, y′)/2) )² dμ0(x) ≤ I₁ + I₂,

where

I₁ = (2/Z(y)) ∫_X ( exp(−Φ(x, y)/2) − exp(−Φ(x, y′)/2) )² dμ0(x)

and

I₂ = 2 |Z(y)^{−1/2} − Z(y′)^{−1/2}|² ∫_X exp(−Φ(x, y′)) dμ0(x).

Using again the inequality |exp(−a) − exp(−b)| ≤ |a − b|, we deduce from (2.5) that

I₁ ≤ c(r) ∫_X |Φ(x, y) − Φ(x, y′)|² dμ0(x) ≤ c(r) ∫_X (G(r, x))² dμ0(x) |y − y′|² ≤ C(r)|y − y′|².

Furthermore,

|Z(y)^{−1/2} − Z(y′)^{−1/2}|² = |Z(y) − Z(y′)|² / ( (Z(y)^{1/2} + Z(y′)^{1/2})² Z(y) Z(y′) ).

From (2.5) and (2.6),

|Z(y)^{−1/2} − Z(y′)^{−1/2}|² ≤ c(r)|Z(y) − Z(y′)|² ≤ c(r)|y − y′|².

The conclusion then follows. □
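The Hellinger distance between two posteriors sharing the prior μ0 can be estimated by Monte Carlo directly from its definition. A sketch (not from the paper) with a standard Gaussian prior, a hypothetical forward map G(x) = x³ and noise variance 0.1²:

```python
import numpy as np

# Monte Carlo estimate of the Hellinger distance between posteriors
# mu^y and mu^{y'} sharing the prior mu_0 = N(0, 1); the forward map
# G(x) = x^3 and the noise variance 0.1^2 are illustrative.
rng = np.random.default_rng(3)
sigma2 = 0.1 ** 2

def Phi(x, y):
    return 0.5 * (y - x ** 3) ** 2 / sigma2

def hellinger(y, yp, xs):
    # d_Hell^2 = (1/2) int (sqrt(dmu^y/dmu_0) - sqrt(dmu^{y'}/dmu_0))^2 dmu_0
    Z = np.mean(np.exp(-Phi(xs, y)))
    Zp = np.mean(np.exp(-Phi(xs, yp)))
    d2 = 0.5 * np.mean(
        (np.sqrt(np.exp(-Phi(xs, y)) / Z) - np.sqrt(np.exp(-Phi(xs, yp)) / Zp)) ** 2
    )
    return np.sqrt(d2)

xs = rng.standard_normal(200_000)  # common prior samples
d_near = hellinger(0.5, 0.51, xs)  # |y - y'| = 0.01
d_far = hellinger(0.5, 0.70, xs)   # |y - y'| = 0.20
```

Consistent with theorem 2.4, the estimated distance shrinks with |y − y′| for nearby data.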
The Kullback–Leibler distance

d_KL(μ, μ′) = ∫_X log(dμ/dμ′) dμ

is larger than the Hellinger distance. However, we can also show a similar result for the Kullback–Leibler distance of μ^y and μ^{y′}.
Theorem 2.5. Under assumption 2.3, the posterior measure μ^y is locally Lipschitz in y with respect to the Kullback–Leibler distance: for each positive constant r, there is a positive constant c(r) such that if |y| ≤ r and |y′| ≤ r, then

d_KL(μ^y, μ^{y′}) ≤ c(r)|y − y′|.
Proof. We have

d_KL(μ^y, μ^{y′}) = ∫_X [ log(Z(y′)/Z(y)) − (Φ(x, y) − Φ(x, y′)) ] dμ^y
≤ ∫_X log(Z(y′)/Z(y)) dμ^y + ∫_X |Φ(x, y) − Φ(x, y′)| dμ^y.

From assumption 2.3(ii), we have

∫_X |Φ(x, y) − Φ(x, y′)| dμ^y ≤ ( ∫_X G(r, x) (1/Z(y)) exp(−Φ(x, y)) dμ0 ) |y − y′|
≤ (1/α(r)) ( ∫_X G(r, x) dμ0 ) |y − y′| ≤ c(r)|y − y′|.

We note that

Z(y′)/Z(y) ≤ 1 + |Z(y′)/Z(y) − 1| = 1 + (1/Z(y))|Z(y′) − Z(y)| ≤ 1 + (1/α(r))|Z(y) − Z(y′)|.

Therefore,

log(Z(y′)/Z(y)) ≤ log( 1 + (1/α(r))|Z(y) − Z(y′)| ) ≤ (1/α(r))|Z(y) − Z(y′)| ≤ c(r)|y − y′|,

where the last inequality is deduced from (2.6). □
Remark 2.6. To prove theorem 2.5, it is sufficient to assume that G(r, ·) ∈ L¹(X, dμ0) instead of the stronger condition G(r, ·) ∈ L²(X, dμ0) in assumption 2.3(ii).
2.3. MAP estimator
In infinite-dimensional spaces, a MAP estimator is identified as a point such that small balls centered at it have the largest probability in the asymptotic limit as the radius converges to 0. To show the existence of a MAP estimator, it is essential to employ the asymptotic behavior of the probability of these small balls. For Gaussian probability measures in Hausdorff locally convex topological spaces, this has been well established. However, for general probability measures, similar small deviation results are still underdeveloped, so we restrict our consideration to the case of a Radon Gaussian prior probability μ0 in a Hausdorff locally convex topological space X. The result in this section is a generalization of that for Banach spaces by Cotter et al [7].
We assume that the mean m of μ0 is in the Cameron–Martin space E. Let A be a convex, symmetric about zero, weakly bounded Borel subset of X. Let x be a vector in the Cameron–Martin space E. We then have

μ0(x + rA) ∼ exp(−‖x − m‖_E²/2) μ0(m + rA)

when r → 0 (see Lifshits [20], p 270). We therefore define the probability maximizers to be the minimizers of the functional

I(x) = Φ(x, y) + (1/2)‖x − m‖_E².
Indeed, when r → 0, the probability of the set x + rA is maximized when x is a minimizer of this functional. We have the following existence result.
Theorem 2.7. Let X be a Hausdorff locally convex topological space and μ0 be a Radon Gaussian measure on X with mean m in the Cameron–Martin space E. Assume that G(x) is continuous with respect to the topology of X. Then there exists a minimizer in the Cameron–Martin space E for the functional I.
Proof. As Φ(x, y) is nonnegative, the functional I(x) has a nonnegative infimum. Let {x_n}_n ⊂ E be a minimizing sequence of I. Then ‖x_n‖_E is bounded. As μ0 is a Radon Gaussian measure, the unit ball in the space E is compact in X (see Bogachev [5], corollary 3.2.4); therefore, we can extract a subsequence x_{n_i} that converges strongly in the topology of X to a limit x̄ in E. On the other hand, from the Alaoglu theorem, we can choose the subsequence {x_{n_i}} so that it converges weakly to x̄ in the Hilbert space E, with ‖x̄ − m‖_E ≤ lim inf_{n→∞} ‖x_n − m‖_E. As Φ(x, y) is continuous, we deduce that I(x̄) ≤ lim inf_{n→∞} I(x_n), so I(x̄) = inf_{x∈E} I(x). □
Remark 2.8. Following the same procedure as in Cotter et al [7], we can show that the minimizing subsequence in the proof of theorem 2.7 converges strongly in E to the minimizer.
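In finite dimensions with a linear forward map and Gaussian noise, the minimizer of I can be written down explicitly. A sketch (not from the paper), taking for illustration only the Euclidean norm in place of the Cameron–Martin norm:

```python
import numpy as np

# MAP sketch: I(x) = Phi(x, y) + (1/2)|x - m|^2 with a linear forward
# map G and noise N(0, Sigma); the Euclidean norm stands in for the
# Cameron-Martin norm. Stationarity of I gives the linear system
#   G^T Sigma^{-1} (G x - y) + (x - m) = 0.
rng = np.random.default_rng(4)
n_obs, n_par = 3, 5
G = rng.standard_normal((n_obs, n_par))
Sigma_inv = np.linalg.inv(0.1 * np.eye(n_obs))
m = np.zeros(n_par)  # prior mean
y = rng.standard_normal(n_obs)

x_map = np.linalg.solve(G.T @ Sigma_inv @ G + np.eye(n_par),
                        G.T @ Sigma_inv @ y + m)

def I(x):
    # the (finite-dimensional) Onsager-Machlup-type functional
    r = y - G @ x
    return 0.5 * r @ Sigma_inv @ r + 0.5 * np.sum((x - m) ** 2)
```

Because I is strictly convex here, the stationary point is the unique global minimizer; existence in the infinite-dimensional setting is exactly what theorem 2.7 provides.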
3. Bayesian inverse problems for equations with white noise forcing
3.1. Stochastic Burgers and Hamilton–Jacobi equations
We consider the inviscid Burgers equation

∂u/∂t + (u · ∇)u = f^W(x, t), (3.1)

where u(x, t) ∈ R^d is the velocity of a fluid particle in R^d. The forcing function f^W(x, t) ∈ R^d depends on t via a one-dimensional white noise Ẇ and is of the form

f^W(x, t) = f(x)Ẇ(t).
We study the case where f(x) is a gradient, i.e. there is a potential F(x) such that

f(x) = −∇F(x). (3.2)

Throughout this paper, F(x) is three times differentiable with bounded derivatives up to the third order. We consider the case where u(x, t) is also a gradient, i.e. u(x, t) = ∇φ(x, t). The velocity potential φ satisfies the Hamilton–Jacobi equation

∂φ(x, t)/∂t + (1/2)|∇φ(x, t)|² + F^W(x, t) = 0, (3.3)

where the forcing potential F^W(x, t) is

F^W(x, t) = F(x)Ẇ(t).
The viscosity solution φ of (3.3) that corresponds to the solution u of the Burgers equation (3.1) is determined by the Lax–Oleinik variational principle. With the initial condition φ0 at time 0, φ(x, t) is given by

φ(x, t) = inf { φ0(γ(0)) + ∫_0^t ( (1/2)|γ̇(τ)|² − F^W(γ(τ), τ) ) dτ }, (3.4)

where the infimum is taken over all absolutely continuous curves γ : [0, t] → R^d such that γ(t) = x.

When the forcing function is spatially periodic, problem (3.3) possesses a unique solution (up to an additive constant) that exists for all time t (see [8, 16]). The non-compact setting is far more complicated. In [13], it is proved that when F(x) has a large maximum and a small minimum, (3.3) possesses a solution that exists for all time. This implies the existence of a solution to the Burgers equation that exists for all time.
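The Lax–Oleinik formula (3.4) can be approximated by dynamic programming over piecewise-linear paths on a spatial grid, with the white-noise integral replaced by Brownian increments. A rough sketch (the discretization choices and the forcing potential F are illustrative, not from the paper):

```python
import numpy as np

# Dynamic-programming discretization of the Lax-Oleinik formula (3.4):
#   phi_{k+1}(x) = min_y [ phi_k(y) + (x - y)^2 / (2 dt) - F(y) dW_k ],
# i.e. piecewise-linear paths on a 1D grid, kinetic cost (x - y)^2/(2 dt),
# and the time integral of F against the white noise replaced by the
# Brownian increments dW_k.
rng = np.random.default_rng(5)
xs = np.linspace(-1.0, 1.0, 41)                  # spatial grid
dt, nsteps = 0.05, 20                            # time discretization
dW = np.sqrt(dt) * rng.standard_normal(nsteps)   # Brownian increments

def F(x):
    return np.cos(np.pi * x)  # illustrative smooth forcing potential

phi = np.zeros_like(xs)  # initial condition phi_0 = 0
for k in range(nsteps):
    cost = (phi[None, :]
            + (xs[:, None] - xs[None, :]) ** 2 / (2 * dt)
            - F(xs)[None, :] * dW[k])
    phi = cost.min(axis=1)  # infimum over the previous grid position
```

Each step of this recursion is a grid analogue of the Lax operator; refining the grid and time step is expected to recover the viscosity solution in the compact setting.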
We employ the framework for measure spaces developed in section 2 to study Bayesian inverse problems for the Burgers equation (3.1) and the Hamilton–Jacobi equation (3.3) with white noise forcing.
3.2. Bayesian inverse problems
We study the Bayesian inverse problems that make inference on the white noise forcing given observations at a finite set of times.
For the Hamilton–Jacobi equation (3.3), fixing m spatial points x_i and m times t_i, i = 1, …, m, the function φ is observed at (x_i, t_i). The observations are subject to an unbiased Gaussian noise:

z_i = φ(x_i, t_i) + σ′_i,

where σ′_i are independent Gaussian random variables.
Since φ is determined within an additive constant, we assume that it is measured at a further point (x_0, t_0) where, for simplicity only, t_0 < t_i for all i = 1, …, m.
Let

y_i = z_i − z_0 = φ(x_i, t_i) − φ(x_0, t_0) + σ′_i − σ′_0.

Let σ_i = σ′_i − σ′_0. We assume that σ = {σ_i : i = 1, …, m} ∈ R^m follows a Gaussian distribution with zero mean and covariance Σ. The vector

G_HJ(W) = {φ(x_i, t_i) − φ(x_0, t_0) : i = 1, …, m} ∈ R^m (3.5)

is uniquely determined by the Wiener process W. We recover W given

y = G_HJ(W) + σ.
The solution u(x, t) of the Burgers equation (3.1) is not defined at shocks. However, in the cases studied here, u(·, t) ∈ L^1_loc(R^d) for all time t. We assume that noisy measurements

y_i = l_i(u(·, t_i)) + σ_i, i = 1, …, m,

are made for l_i(u(·, t_i)) (i = 1, …, m), where l_i : L^1_loc(R^d) → R are continuous and bounded functionals. The noise σ = (σ_1, …, σ_m) ∼ N(0, Σ). Letting

G_B(W) = (l_1(u(·, t_1)), …, l_m(u(·, t_m))) ∈ R^m, (3.6)

given

y = G_B(W) + σ,

we determine W.

Let t_max = max{t_1, …, t_m}. Let X be the metric space of continuous functions W on (−∞, t_max] with W(t_max) = 0. The space is equipped with the metric

D(W, W′) = ∑_{n=1}^∞ (1/2^n) · sup_{−n≤t≤t_max} |W(t) − W′(t)| / ( 1 + sup_{−n≤t≤t_max} |W(t) − W′(t)| ) (3.7)

(see Stroock and Varadhan [24]). Let μ0 be the Wiener measure on X. Let μ^y be the conditional probability defined on X given y. For G_HJ in (3.5), we define the function Φ_HJ : X × R^m → R by

Φ_HJ(W; y) = (1/2)|y − G_HJ(W)|_Σ², (3.8)

and for G_B in (3.6), we define

Φ_B(W; y) = (1/2)|y − G_B(W)|_Σ². (3.9)

We will prove that the Radon–Nikodym derivative of μ^y satisfies

dμ^y/dμ0 ∝ exp(−Φ_HJ(W; y))

and

dμ^y/dμ0 ∝ exp(−Φ_B(W; y)),

respectively, and that the posterior measure μ^y depends continuously on y ∈ R^m.
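The metric (3.7) is straightforward to evaluate numerically for sampled paths. A sketch (not from the paper), truncating the series at N terms and approximating each supremum on a grid; the example paths are illustrative and are not required to satisfy W(t_max) = 0:

```python
import numpy as np

# Truncated evaluation of the path-space metric (3.7):
#   D(W, W') = sum_n 2^{-n} s_n / (1 + s_n),
#   s_n = sup over [-n, t_max] of |W(t) - W'(t)|,
# with the series cut at N terms and each sup taken over a sample grid.
t_max, N = 1.0, 30

def D(W, Wp, npts=2000):
    total = 0.0
    for n in range(1, N + 1):
        ts = np.linspace(-n, t_max, npts)
        s = np.max(np.abs(W(ts) - Wp(ts)))  # approximate sup norm on [-n, t_max]
        total += (s / (1.0 + s)) / 2 ** n
    return total

# Illustrative continuous paths (genuine elements of X would also
# vanish at t_max; the formula applies to any continuous paths).
W1 = lambda t: np.sin(t)
W2 = lambda t: np.sin(t) + 0.1 * np.cos(t)
d12 = D(W1, W2)
```

Because each summand is at most 2^{−n}, D is bounded by 1; the metric metrizes uniform convergence on compact subsets of (−∞, t_max].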
4. Periodic problems
We first consider the case where the problems are set in the torus T^d; the forcing potential F^W(x, t) is spatially periodic. Randomly forced Burgers and Hamilton–Jacobi equations in a torus have been studied thoroughly by E et al [8] in one dimension and by Iturriaga and Khanin [16] in higher dimensions. We first review their results.
4.1. Existence and uniqueness for periodic problems
The mean value

b = ∫_{T^d} u(x, t) dx

is unchanged for all t. We denote the solution φ of the Hamilton–Jacobi equation (3.3) by φ^W_b. It is of the form

φ^W_b(x, t) = b · x + ψ^W_b(x, t), (4.1)
where ψ^W_b is spatially periodic. Let C(s, t; T^d) be the space of absolutely continuous functions from (s, t) to T^d. For each vector b ∈ R^d, we define the operator A^{W,b}_{s,t} : C(s, t; T^d) → R as

A^{W,b}_{s,t}(γ) = ∫_s^t ( (1/2)|γ̇(τ) − b|² − F^W(γ(τ), τ) − b²/2 ) dτ. (4.2)
The variational formulation for the function ψ^W_b(x, t) in (4.1) is

ψ^W_b(x, t) = inf_{γ∈C(s,t;T^d)} ( ψ^W_b(γ(s), s) + A^{W,b}_{s,t}(γ) ), (4.3)

where s < t. This is written in terms of the Lax operator as

K^{W,b}_{s,t} ψ^W_b(·, s) = ψ^W_b(·, t).
Using integration by parts, we get

A^{W,b}_{s,t}(γ) = ∫_s^t ( (1/2)|γ̇(τ) − b|² + ∇F(γ(τ)) · γ̇(τ)(W(τ) − W(t)) − b²/2 ) dτ − F(γ(s))(W(t) − W(s)). (4.4)
We study the functional A^{W,b}_{s,t} via minimizers, which are defined as follows [16].
Definition 4.1. A curve γ : [s, t] → R^d with γ(s) = x and γ(t) = y is called a minimizer over [s, t] if it minimizes the action A^{W,b}_{s,t} among all absolutely continuous curves with endpoints x and y at times s and t, respectively.

For a function ψ, a curve γ : [s, t] → R^d with γ(t) = x is called a ψ minimizer over [s, t] if it minimizes the action ψ(γ(s)) + A^{W,b}_{s,t}(γ) among all curves with endpoint x at time t.

A curve γ : (−∞, t] → R^d with γ(t) = x is called a one-sided minimizer if it is a minimizer over all time intervals [s, t].
Iturriaga and Khanin [16] proved the following theorem.
Theorem 4.2 ([16], theorem 1).

(i) For almost all W, all b ∈ R^d and T ∈ R, there exists a unique (up to an additive constant) function φ^W_b(x, t), x ∈ R^d, t ∈ (−∞, T], such that

φ^W_b(x, t) = b · x + ψ^W_b(x, t),

where ψ^W_b(·, t) is a T^d periodic function and, for all −∞ < s < t < T,

ψ^W_b(·, t) = K^{W,b}_{s,t} ψ^W_b(·, s).

(ii) The function ψ^W_b(·, t) is Lipschitz in T^d. If x is a point of differentiability of ψ^W_b(·, t), then there exists a unique one-sided minimizer γ^{W,b}_{x,t} at (x, t) and its velocity is given by the gradient of φ: γ̇^{W,b}_{x,t}(t) = ∇φ^W_b(x, t) = b + ∇ψ^W_b(x, t). Further, any one-sided minimizer is a ψ^W_b minimizer on finite-time intervals.

(iii) For almost all W and all b ∈ R^d:

lim_{s→−∞} sup_{η∈C(T^d)} min_{C∈R} max_{x∈T^d} | K^{W,b}_{s,t} η(x) − ψ^W_b(x, t) − C | = 0.

(iv) The unique solution u^W_b(x, t) of the Burgers equation (3.1) that exists for all time is determined by u^W_b(·, t) = ∇φ^W_b(·, t).
We will employ these results to study the Bayesian inverse problems formulated in section 2.
4.2. Bayesian inverse problem for the spatially periodic Hamilton–Jacobi equation (3.3)
We first study the Bayesian inverse problem for the Hamilton–Jacobi equation (3.3) in the torus T^d. The observation G_HJ(W) in (3.5) becomes

G_HJ(W) = {ψ^W_b(x_i, t_i) − ψ^W_b(x_0, t_0) : i = 1, …, m}.

To establish the validity of the Bayes formula (2.3), we first show that G_HJ(W), as a map from X to R^m, is continuous with respect to the metric D in (3.7). The following proposition holds.
Proposition 4.3. The map G_HJ : X → R^m is continuous with respect to the metric (3.7).
Proof. Let W_k converge to W in the metric (3.7). There are constants C_k, which do not depend on i, such that

lim_{k→∞} | φ^{W_k}_b(x_i, t_i) − φ^W_b(x_i, t_i) − C_k | = 0 (4.5)

for all i = 0, …, m. The proof of (4.5) is technical; we present it in section 7.1. From this we deduce that

lim_{k→∞} | ( φ^{W_k}_b(x_i, t_i) − φ^{W_k}_b(x_0, t_0) ) − ( φ^W_b(x_i, t_i) − φ^W_b(x_0, t_0) ) | = 0,

so

lim_{k→∞} |G_HJ(W_k) − G_HJ(W)| = 0.

The conclusion then follows. □
To establish the continuous dependence of μ^y on y, we verify assumption 2.3. First we prove a bound for G_HJ(W).
Lemma 4.4. There is a positive constant c, which depends only on the potential F, such that

|G_HJ(W)| ≤ c ( 1 + ∑_{i=1}^m max_{t_0−1≤τ≤t_i} |W(τ) − W(t_i)|² ).
Proof. Let γ be a ψ^W_b(·, t_0 − 1) minimizer starting at (x_i, t_i). Then,

ψ^W_b(x_i, t_i) = ψ^W_b(γ(t_0 − 1), t_0 − 1) + A^{W,b}_{t_0−1,t_i}(γ).

From (4.4), there is a constant c, which depends only on b and F, such that

A^{W,b}_{t_0−1,t_i}(γ) ≥ (1/(2(t_i − t_0 + 1))) ( ∫_{t_0−1}^{t_i} |γ̇(τ)| dτ )² − c ( 1 + max_{t_0−1≤τ≤t_i} |W(τ) − W(t_i)| ) ∫_{t_0−1}^{t_i} |γ̇(τ)| dτ − c ( 1 + |W(t_i) − W(t_0 − 1)| ).

The right-hand side is a quadratic form in ∫_{t_0−1}^{t_i} |γ̇(τ)| dτ, so

A^{W,b}_{t_0−1,t_i}(γ) ≥ −c ( 1 + max_{t_0−1≤τ≤t_i} |W(τ) − W(t_i)|² ).

Therefore,

ψ^W_b(x_i, t_i) ≥ ψ^W_b(γ(t_0 − 1), t_0 − 1) − c ( 1 + max_{t_0−1≤τ≤t_i} |W(τ) − W(t_i)|² ).
Let $\gamma'$ be the linear curve connecting $(x_0,t_0)$ and $(\gamma(t_0-1),t_0-1)$. We have

$$\psi^W_b(x_0,t_0) \le \psi^W_b(\gamma(t_0-1),t_0-1) + A^{W,b}_{t_0-1,t_0}(\gamma') \le \psi^W_b(\gamma(t_0-1),t_0-1) + c\Bigl(1+\max_{t_0-1\le\tau\le t_0}|W(\tau)-W(t_0)|\Bigr).$$

Therefore,

$$\psi^W_b(x_i,t_i) - \psi^W_b(x_0,t_0) \ge -c\Bigl(1+\max_{t_0-1\le\tau\le t_i}|W(\tau)-W(t_i)|^2\Bigr). \quad (4.6)$$

Let $\gamma$ be the linear curve connecting $(x_0,t_0)$ and $(x_i,t_i)$. We then find that

$$\psi^W_b(x_i,t_i) \le \psi^W_b(x_0,t_0) + A^{W,b}_{t_0,t_i}(\gamma).$$

Thus,

$$\psi^W_b(x_i,t_i) \le \psi^W_b(x_0,t_0) + c\Bigl(1+\max_{t_0\le\tau\le t_i}|W(\tau)-W(t_i)|\Bigr). \quad (4.7)$$

From (4.6) and (4.7), we conclude that there exists a positive constant $c$ such that

$$\bigl|\psi^W_b(x_i,t_i) - \psi^W_b(x_0,t_0)\bigr| \le c\Bigl(1+\max_{t_0-1\le\tau\le t_i}|W(\tau)-W(t_i)|^2\Bigr). \quad (4.8)$$

Therefore,

$$|G^{HJ}(W)| \le c\Bigl(1+\sum_{i=1}^m\max_{t_0-1\le\tau\le t_i}|W(\tau)-W(t_i)|^2\Bigr). \quad (4.9)$$

□
Proposition 4.5. Assumption 2.3 holds.
Proof. Fix a constant $M$. From (4.9), there is a set of positive Wiener measure on which

$$|G^{HJ}(W)| \le M.$$

Therefore, there is a constant $M(r)$ and a set of positive measure such that when $|y| \le r$,

$$|\Phi^{HJ}(W;y)| = \tfrac12|y-G^{HJ}(W)|^2_\Sigma \le \tfrac12\|\Sigma^{-1/2}\|^2_{\mathbb{R}^m,\mathbb{R}^m}(|y|+M)^2 \le M(r).$$

Assumption 2.3(i) holds.

Now we prove assumption 2.3(ii). We have

$$|\Phi^{HJ}(W;y)-\Phi^{HJ}(W;y')| = \tfrac12\bigl|\langle\Sigma^{-1/2}(y-G^{HJ}(W)),\Sigma^{-1/2}(y-y')\rangle + \langle\Sigma^{-1/2}(y-y'),\Sigma^{-1/2}(y'-G^{HJ}(W))\rangle\bigr| \le \tfrac12\|\Sigma^{-1/2}\|^2_{\mathbb{R}^m,\mathbb{R}^m}(|y|+|y'|+2|G^{HJ}(W)|)|y-y'|. \quad (4.10)$$

Letting

$$G(r,W) = \|\Sigma^{-1/2}\|^2_{\mathbb{R}^m,\mathbb{R}^m}\Bigl[r + c\Bigl(1+\sum_{i=1}^m\max_{t_0-1\le\tau\le t_i}|W(\tau)-W(t_i)|^2\Bigr)\Bigr],$$

clearly $G(r,\cdot)\in L^2(X,\mu_0)$. □

From propositions 4.3 and 4.5, corollary 2.2 and theorem 2.4, we have
Theorem 4.6. For the spatially periodic randomly forced Hamilton–Jacobi equation (3.3), the measure $\mu^y(dW) = \mathbb{P}(dW|y)$ is absolutely continuous with respect to the Wiener measure $\mu_0(dW)$; the Radon–Nikodym derivative satisfies

$$\frac{d\mu^y}{d\mu_0} \propto \exp(-\Phi^{HJ}(W;y)).$$

When $|y|\le r$ and $|y'|\le r$, there is a constant $c(r)$ such that

$$d_{\mathrm{Hell}}(\mu^y,\mu^{y'}) \le c(r)|y-y'| \quad\text{and}\quad d_{\mathrm{KL}}(\mu^y,\mu^{y'}) \le c(r)|y-y'|.$$
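The Radon–Nikodym derivative above is exactly the quantity that function-space sampling methods evaluate. As an illustration outside the paper, the following sketch runs a preconditioned Crank–Nicolson (pCN) chain, whose proposal preserves the Wiener prior, on a discretized path; the forward map `G` here is a hypothetical stand-in, since the actual observation operator $G^{HJ}$ requires solving (3.3).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200                          # grid points for the path on [0, 1]
dt = 1.0 / n
sigma2 = 0.1 ** 2                # noise covariance Sigma = sigma2 * I (assumed)

def sample_prior():
    """Draw a discretized Brownian path from the Wiener prior mu_0."""
    return np.cumsum(rng.normal(0.0, np.sqrt(dt), n))

def G(W):
    """Hypothetical forward map standing in for G^HJ (a real one solves (3.3))."""
    return np.array([W[n // 2], W[-1]])

y = np.array([0.3, -0.1])        # synthetic observations

def Phi(W):
    """Potential Phi(W; y) = 0.5 * |Sigma^{-1/2}(y - G(W))|^2."""
    r = y - G(W)
    return 0.5 * float(r @ r) / sigma2

# pCN proposal W' = sqrt(1 - beta^2) W + beta * xi with xi ~ mu_0;
# it preserves mu_0, so the acceptance ratio involves only the potential Phi.
beta = 0.2
W = sample_prior()
accepted = 0
for _ in range(2000):
    Wp = np.sqrt(1.0 - beta ** 2) * W + beta * sample_prior()
    if np.log(rng.uniform()) < Phi(W) - Phi(Wp):
        W, accepted = Wp, accepted + 1
```

Because the proposal is reversible with respect to the prior, the acceptance probability depends only on the difference of potentials, mirroring the structure of the theorem.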
We establish the existence of a MAP estimator in the following theorem.
Theorem 4.7. For the spatially periodic randomly forced Hamilton–Jacobi equation (3.3), there exists a MAP estimator for the posterior measure $\mu^y$.
Proof. Iturriaga and Khanin [16] establish the almost sure existence of a solution for the Hamilton–Jacobi equation with white noise forcing in periodic domains. A close examination of their proofs shows that for a solution to exist, it is required that the velocity of finite-time minimizers be uniformly bounded and that there exist an $\varepsilon$-narrow place satisfying the hypothesis of lemma 7.1, i.e. an interval $[s_1,s_2]$ of length $T$ such that

$$\max_{s_1\le\tau\le s_2}|W(\tau)-W(s_2)| \le \frac{1}{T^2}$$

for $T$ arbitrarily large.

Iturriaga and Khanin [16] show that the velocity is uniformly bounded as long as $W$ is continuous. This condition is satisfied by all the functions $W$ in the Cameron–Martin space of the Wiener space $X$.

For $\varepsilon$-narrow places, we note that when $W$ is in the Cameron–Martin space, for a fixed $T$,

$$\max_{s_1\le\tau\le s_2}|W(\tau)-W(s_2)| \le \int_{s_1}^{s_2}|\dot W(\tau)|\,d\tau \le T^{1/2}\Bigl(\int_{s_1}^{s_2}|\dot W(\tau)|^2\,d\tau\Bigr)^{1/2},$$

which can be made arbitrarily small when $s_2$ is small, as $\int_{-\infty}^0|\dot W(\tau)|^2\,d\tau$ is finite. This shows the existence of an $\varepsilon$-narrow place. Therefore, a unique solution exists for all time for the forcing realizations in the Cameron–Martin space. The Wiener space $X$ with the metric (3.7) is Fréchet, so all the requirements of theorem 2.7 hold. Thus, there exists a MAP estimator. □
4.3. Bayesian inverse problem for the spatially periodic Burgers equation
We consider the Bayesian inverse problem for the Burgers equation (3.1) on the torus $\mathbb{T}^d$. We first establish the continuity of

$$G^B(W) = \bigl(l_1(u^W_b(\cdot,t_1)),\dots,l_m(u^W_b(\cdot,t_m))\bigr)$$

as a map from $X$ to $\mathbb{R}^m$, to prove the validity of the Bayes formula (2.3).
First we prove the uniform boundedness of $|\dot\gamma(t)|$ for any minimizer $\gamma$ on $[t-1,t]$. The result is proved in [8] and [16] using Gronwall's inequality. We present another proof that avoids this inequality and establishes explicitly the dependence of the bound on $W$.
Lemma 4.8. For any minimizer $\gamma$ of $A^{W,b}_{t-1,t}$,

$$|\dot\gamma(t)| \le c\Bigl(1+\max_{t-1\le\tau\le t}|W(\tau)-W(t)|^2\Bigr).$$
Proof. Let $\bar\gamma$ be the linear curve connecting $(\gamma(t),t)$ and $(\gamma(t-1),t-1)$. As $|\gamma(t)-\gamma(t-1)| \le d^{1/2}$,

$$A^{W,b}_{t-1,t}(\bar\gamma) \le c\Bigl(1+\max_{t-1\le\tau\le t}|W(\tau)-W(t)|\Bigr).$$

We have

$$A^{W,b}_{t-1,t}(\gamma) \ge \frac12\Bigl(\int_{t-1}^t|\dot\gamma(\tau)|\,d\tau\Bigr)^2 - c\Bigl(1+\max_{t-1\le\tau\le t}|W(\tau)-W(t)|\Bigr)\int_{t-1}^t|\dot\gamma(\tau)|\,d\tau - c\bigl(1+|W(t-1)-W(t)|\bigr).$$
Because $A^{W,b}_{t-1,t}(\gamma) \le A^{W,b}_{t-1,t}(\bar\gamma)$, solving the resulting quadratic inequality, we get

$$\int_{t-1}^t|\dot\gamma(\tau)|\,d\tau \le c\Bigl(1+\max_{t-1\le\tau\le t}|W(\tau)-W(t)|\Bigr). \quad (4.11)$$
The minimizer $\gamma$ satisfies (see [8] and [16])

$$\dot\gamma(t) = \dot\gamma(s) + \int_s^t \nabla^2F(\gamma(\tau))\dot\gamma(\tau)(W(\tau)-W(t))\,d\tau + \nabla F(\gamma(s))(W(s)-W(t)) \quad (4.12)$$

for all $t-1 \le s \le t$. Therefore,

$$|\dot\gamma(s)| \ge |\dot\gamma(t)| - c\max_{t-1\le\tau\le t}|W(\tau)-W(t)|\Bigl(1+\int_{t-1}^t|\dot\gamma(\tau)|\,d\tau\Bigr) \ge |\dot\gamma(t)| - c\Bigl(1+\max_{t-1\le\tau\le t}|W(\tau)-W(t)|^2\Bigr).$$
From this and (4.11), we deduce that

$$|\dot\gamma(t)| \le c\Bigl(1+\max_{t-1\le\tau\le t}|W(\tau)-W(t)|^2\Bigr) + c\Bigl(1+\max_{t-1\le\tau\le t}|W(\tau)-W(t)|\Bigr) \le c\Bigl(1+\max_{t-1\le\tau\le t}|W(\tau)-W(t)|^2\Bigr). \qquad\square$$

We have the following lemma on the Lipschitz continuity of the solution $\phi^W_b(\cdot,t)$.
Lemma 4.9. The solution $\phi^W_b(\cdot,t)$ satisfies

$$\bigl|\phi^W_b(x,t)-\phi^W_b(x',t)\bigr| \le c\Bigl(1+\max_{t-1\le\tau\le t}|W(\tau)-W(t)|^2\Bigr)|x-x'|.$$
The lemma is proved in [16]; to exhibit the explicit Lipschitz constant, we briefly outline the proof here.
Proof. Let $\gamma$ be a one-sided minimizer starting at $(x,t)$. We consider the curve

$$\bar\gamma(\tau) = \gamma(\tau) + (x'-x)(\tau-t+1)$$

(modulo $\mathbb{T}^d$) connecting $x'$ and $\gamma(t-1)$. As

$$\psi^W_b(x,t) = \psi^W_b(\gamma(t-1),t-1) + A^{W,b}_{t-1,t}(\gamma)$$

and

$$\psi^W_b(x',t) \le \psi^W_b(\gamma(t-1),t-1) + A^{W,b}_{t-1,t}(\bar\gamma),$$

we have

$$\psi^W_b(x',t) - \psi^W_b(x,t) \le A^{W,b}_{t-1,t}(\bar\gamma) - A^{W,b}_{t-1,t}(\gamma) \le c\Bigl[1+\max_{t-1\le s\le t}|W(s)-W(t)|\Bigl(1+\int_{t-1}^t|\dot\gamma(\tau)|\,d\tau\Bigr)\Bigr]|x-x'|.$$

The same estimate holds for $\psi^W_b(x,t) - \psi^W_b(x',t)$. From (4.11), we have

$$|\phi^W_b(x,t)-\phi^W_b(x',t)| \le c\Bigl(1+\max_{t-1\le\tau\le t}|W(\tau)-W(t)|^2\Bigr)|x-x'|. \qquad\square$$

We now establish the continuity of $G^B(W)$ as a map from $X$ to $\mathbb{R}^m$.
Proposition 4.10. The function $G^B : X \to \mathbb{R}^m$ is continuous with respect to the metric $D$ in (3.7).
Proof. From lemma 4.9, when $W_k\to W$ in the metric $D$, the functions $\phi^{W_k}_b$ are uniformly Lipschitz in $C^{0,1}(\mathbb{T}^d)/\mathbb{R}$. We can therefore extract a subsequence that converges in $C^{0,1}(\mathbb{T}^d)/\mathbb{R}$. From (4.5), every such subsequence converges to $\phi^W_b$. Thus $u^{W_k}_b \to u^W_b$ wherever $\phi^{W_k}_b$ and $\phi^W_b$ are differentiable ($\phi^{W_k}_b$ and $\phi^W_b$ are differentiable almost everywhere). From lemma 4.9, the $|u^{W_k}_b(x,t)|$ are uniformly bounded. The dominated convergence theorem implies that $u^{W_k}_b(\cdot,t_i) \to u^W_b(\cdot,t_i)$ in $L^1(\mathbb{T}^d)$ for $i=1,\dots,m$. □
For the well-posedness of the Bayesian inverse problem, we prove assumption 2.3.
Proposition 4.11. For the spatially periodic randomly forced Burgers equation (3.1), assumption 2.3 holds.
Proof. From lemma 4.9, we deduce that

$$|G^B(W)| \le c\Bigl(1+\sum_{i=1}^m\max_{t_i-1\le\tau\le t_i}|W(\tau)-W(t_i)|^2\Bigr). \quad (4.13)$$

Therefore, for each positive constant $M$, there exists a subset of $X$ of positive Wiener measure such that for all $W$ in that set, $|G^B(W)| \le M$. By the same argument as in the proof of proposition 4.5, assumption 2.3(i) holds.

From (4.10) and (4.13), assumption 2.3(ii) holds with

$$G(r,W) = \|\Sigma^{-1/2}\|^2_{\mathbb{R}^m,\mathbb{R}^m}\Bigl[r + c\Bigl(1+\sum_{i=1}^m\max_{t_i-1\le\tau\le t_i}|W(\tau)-W(t_i)|^2\Bigr)\Bigr],$$

which is in $L^2(X,d\mu_0)$. □
From propositions 4.10 and 4.11, corollary 2.2 and theorem 2.4, we have
Theorem 4.12. For the spatially periodic randomly forced Burgers equation (3.1), the posterior measure $\mu^y$ is absolutely continuous with respect to the prior measure $\mu_0$ and satisfies

$$\frac{d\mu^y}{d\mu_0} \propto \exp(-\Phi^B(W;y)),$$

where $\Phi^B(W;y)$ is defined in (3.9), and is locally Lipschitz in $y$ with respect to the Hellinger and Kullback–Leibler distances, i.e.

$$d_{\mathrm{Hell}}(\mu^y,\mu^{y'}) \le c(r)|y-y'| \quad\text{and}\quad d_{\mathrm{KL}}(\mu^y,\mu^{y'}) \le c(r)|y-y'|$$

when $|y|\le r$ and $|y'|\le r$; the constant $c(r)$ depends on $r$.
Similarly to the Hamilton–Jacobi equation, we can show that there exists a MAP estimator for the posterior measure $\mu^y$.
Theorem 4.13. For the spatially periodic randomly forced Burgers equation (3.1), there is a MAP estimator for the posterior measure $\mu^y$.
5. Non-periodic problems
We consider Bayesian inverse problems making inference on the white noise forcing for Burgers and Hamilton–Jacobi equations in non-compact domains. Burgers and Hamilton–Jacobi equations with white noise forcing are reasonably well understood when the forcing potential $F(x)$ has a large maximum and a small minimum (Hoang and Khanin [13]). We first present the setup of the problem and introduce the basic results that will be used.
5.1. Existence and uniqueness for non-periodic problems
Let $W(t)$ be a standard Brownian motion starting at 0 at time 0. Let

$$E_1 = \mathbb{E}\{|W(1)|\},$$

where $\mathbb{E}$ denotes expectation with respect to the Wiener measure. For all $0<t<1$, there is a constant $E_2$ not depending on $t$ such that

$$\mathbb{E}\Bigl\{\max_{1-t\le\tau\le1}|W(\tau)-W(1)|\Bigr\} = E_2 t^{1/2}.$$

We also denote

$$E_3 = \mathbb{E}\Bigl\{\max_{0\le\tau\le1}|W(\tau)-W(1)|^2\Bigr\}.$$
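The constants $E_1$, $E_2$ and $E_3$ are explicit Brownian-motion moments; for instance $E_1 = \mathbb{E}|W(1)| = \sqrt{2/\pi} \approx 0.798$, and the $t^{1/2}$ scaling behind $E_2$ follows from Brownian scaling after time reversal. A small Monte Carlo check (an illustration, not part of the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps = 20000, 400
dt = 1.0 / n_steps

# Brownian paths on [0, 1], with W(0) = 0 prepended
W = np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps)), axis=1)
W = np.hstack([np.zeros((n_paths, 1)), W])

# E1 = E|W(1)|, known in closed form: sqrt(2/pi) ~ 0.7979
E1 = np.mean(np.abs(W[:, -1]))

# E2 scaling: E max_{1-t<=tau<=1} |W(tau)-W(1)| = E2 * t^{1/2}, so the mean max over
# the last quarter of the interval should be about half the mean max over all of it
dev = np.abs(W - W[:, -1:])
ratio = np.mean(np.max(dev[:, 3 * n_steps // 4:], axis=1)) / np.mean(np.max(dev, axis=1))

# E3 = E max_{0<=tau<=1} |W(tau)-W(1)|^2; at least E|W(1)|^2 = 1 since tau = 0 is allowed
E3 = np.mean(np.max(dev, axis=1) ** 2)
```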
Following Hoang and Khanin [13], we make the following assumption.
Assumption 5.1.
(i) The forcing potential $F$ has a maximum at $x_{\max}$ and a minimum at $x_{\min}$ such that

$$F(x_{\max}) > L \quad\text{and}\quad F(x_{\min}) < -L; \qquad |x_{\max}| < a,\ |x_{\min}| < a,$$

where $a$ and $L$ are positive constants; there is a constant $b>a$ such that

$$|\nabla F(x)| < K \quad\text{when } |x| < b,$$

and for $|x| \ge b$,

$$|\nabla F(x)| < K_1 \le K \quad\text{and}\quad |F(x)| \le L_1 \le L.$$

(ii) The function $F(x)$ is three times differentiable; $|\nabla^2F(x)|$ and $|\nabla^3F(x)|$ are uniformly bounded for all $x$.

(iii) The constants $L$, $K$, $L_1$, $K_1$ satisfy

$$\frac{128a^4E_2^2K^2}{E_1^3} < L^3,\qquad 8a^2 < LE_1,\qquad K_1^2 < \frac{LE_1}{3E_3}\qquad\text{and}\qquad L_1 < \frac{L}{16}.$$
We denote by $B(b)$ the ball with center 0 and radius $b$. The solution of the Hamilton–Jacobi equation (3.3), denoted by $\phi^W$ in this case, satisfies the Lax–Oleinik formula

$$\phi^W(x,t) = \inf_{\gamma\in C(s,t;\mathbb{R}^d)}\{\phi(\gamma(s),s) + A^W_{s,t}(\gamma)\}, \quad (5.1)$$

where $C(s,t;\mathbb{R}^d)$ is the space of absolutely continuous functions from $(s,t)$ to $\mathbb{R}^d$; the operator $A^W_{s,t}$ is defined as

$$A^W_{s,t}(\gamma) = \int_s^t\Bigl(\frac12|\dot\gamma(\tau)|^2 + \nabla F(\gamma(\tau))\dot\gamma(\tau)(W(\tau)-W(t))\Bigr)d\tau + F(\gamma(s))(W(s)-W(t)). \quad (5.2)$$

We denote (5.1) as

$$\phi^W(x,t) = K^W_{s,t}\phi^W(\cdot,s).$$

Minimizers for the operator $A^W_{s,t}$ are defined as in definition 4.1.
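For a sampled path $W$, the action (5.2) can be evaluated by simple quadrature, since the stochastic term is already written in the integrated-by-parts form with the factor $W(\tau)-W(t)$. A minimal sketch under an assumed one-dimensional potential $F(x)=\cos x$ (illustration only; the paper works on $\mathbb{T}^d$ or $\mathbb{R}^d$):

```python
import numpy as np

rng = np.random.default_rng(2)
s, t, n = 0.0, 1.0, 1000
tau = np.linspace(s, t, n + 1)
dt = (t - s) / n
W = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))])

F = np.cos                       # assumed one-dimensional potential, illustration only
dF = lambda x: -np.sin(x)        # gradient of F

def action(gamma):
    """Quadrature for A^W_{s,t}(gamma) in (5.2), with gamma sampled on the grid tau."""
    dgamma = np.gradient(gamma, tau)
    integrand = 0.5 * dgamma ** 2 + dF(gamma) * dgamma * (W - W[-1])
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dt   # trapezoidal rule
    return integral + F(gamma[0]) * (W[0] - W[-1])

# a constant curve has zero kinetic term, so A = F(x0) * (W(s) - W(t)) exactly
x0 = 0.3
a_const = action(np.full(n + 1, x0))

# a straight line additionally pays a kinetic cost on top of the stochastic term
a_line = action(np.linspace(0.0, 0.7, n + 1))
```

The constant-curve identity gives a cheap correctness check for the quadrature before using it inside any minimization over curves.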
Under assumption 5.1, the following existence and uniqueness results for the Hamilton–Jacobi equation (3.3) and Burgers equation (3.1) are established in [13].
Theorem 5.2.
(i) With probability 1, there is a solution $\phi^W$ to the Hamilton–Jacobi equation (3.3) such that for all $t>0$, $\phi^W(\cdot,t) = K^W_{s,t}\phi^W(\cdot,s)$ for all $s<t$. The solution $\phi^W(\cdot,t)$ is locally Lipschitz for all $t$. Let $\phi^W_s(x,t)$ be the solution to the Hamilton–Jacobi equation with the zero initial condition at time $s$; then there is a constant $C_s$ such that

$$\lim_{s\to-\infty}\sup_{|x|\le M}|\phi^W(x,t)-\phi^W_s(x,t)-C_s| = 0$$

for all $M>0$. The solution is unique in this sense, up to an additive constant.

(ii) For every $(x,t)$, there is a one-sided minimizer $\gamma^W_{x,t}$ starting at $(x,t)$; there is a sequence of times $s_i$ converging to $-\infty$ and zero minimizers $\gamma_{s_i}$ on $[s_i,t]$ starting at $(x,t)$ such that $\gamma_{s_i}$ converge uniformly to $\gamma^W_{x,t}$ on any compact time interval as $s_i\to-\infty$; further, $\gamma^W_{x,t}$ is a $\phi^W$ minimizer on any finite time interval. If $x\in\mathbb{R}^d$ is a point of differentiability of $\phi^W(\cdot,t)$, then such a one-sided minimizer starting at $x$ at time $t$ is unique and $\dot\gamma^W_{x,t}(t) = \nabla\phi^W(x,t)$.

(iii) The function $u^W(\cdot,t) = \nabla\phi^W(\cdot,t)$ is a solution of the Burgers equation (3.1) that exists for all time. It is the limit of the solutions to (3.1) on finite time intervals with the zero initial condition. In this sense, the solution $u^W$ is unique.
From now on, by one-sided minimizers we mean those that are limits of zero minimizers on finite time intervals as in theorem 5.2(ii).
The key fact that Hoang and Khanin [13] used to prove these results is that when $t-s$ is sufficiently large, any zero minimizer starting at $(x,t)$ over a time interval $(s,t)$ is inside the ball $B(b)$ at some time in an interval $(t(W),t)$, where $t(W)$ depends only on $W$, $x$ and $t$.
Let

$$\alpha = \min\{1,\, L^2E_1^2/(16a^2K^2E_2^2)\}.$$

We define for each integer $l$ the random variable

$$P_l = \frac{2a^2}{\alpha} + 2aK\max_{l+1-\alpha\le\tau\le l+1}|W(\tau)-W(l+1)| - L|W(l)-W(l+1)| + L_1|W(l)-W(l+1)| + \frac{K_1^2}{2}\max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2. \quad (5.3)$$

These random variables are independent and identically distributed. Their expectation satisfies

$$\mathbb{E}\{P_l\} < -\frac{LE_1}{2}, \quad (5.4)$$

which can be shown from assumption 5.1(iii) (see [13], p 830). Let $t'_i$ and $t''_i$ be the integers that satisfy

$$t_i \le t'_i < t_i+1 \quad\text{and}\quad t_i-2 < t''_i \le t_i-1. \quad (5.5)$$
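The content of (5.4) is that the partial sums of the i.i.d. variables $P_l$ drift to $-\infty$ at a linear rate, which is what makes the stopping integer constructed from them almost surely finite. A generic numerical illustration (the true $P_l$ depend on the constants of assumption 5.1, which we do not fix; a hypothetical negative-mean distribution is used instead):

```python
import numpy as np

rng = np.random.default_rng(3)
n, n_paths = 2000, 200
mean = -0.5                       # stand-in for E{P_l} < -L*E1/2 < 0

# i.i.d. sequence with negative mean; partial sums S_m drift linearly downwards
P = rng.normal(mean, 1.0, (n_paths, n))
S = np.cumsum(P, axis=1)

# last index m at which S_m still exceeds a fixed negative threshold;
# by the law of large numbers this index is almost surely finite
threshold = -10.0
last_exceed = np.array([idx.max() if idx.size else -1
                        for idx in (np.flatnonzero(row > threshold) for row in S)])
```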
The following result is shown in [13].
Proposition 5.3 ([13], pp 829–31). Fix an index $i$. There are constants $c(x_i,t_i,F)$ and $c(F)$ such that if $t_i(W)$ is the largest integer for which

$$\sum_{l=t}^{t''_i-1}P_l > -c(x_i,t_i,F)\Bigl(1+\sum_{j=t''_0}^{t'_i}\max_{j\le\tau\le j+1}|W(\tau)-W(j+1)|^2\Bigr) - c(F)\Bigl(1+\max_{t-2\le\tau\le t}|W(\tau)-W(t)|^2\Bigr) \quad (5.6)$$

for all integers $t \le t_i(W)+2$, then when $s$ is sufficiently small, every zero minimizer over $(s,t_i)$ starting at $(x_i,t_i)$ is inside the ball $B(b)$ at a time $s'\in[t_i(W),t''_0]$.
Indeed, Hoang and Khanin [13] prove the result for $s'\in[t_i(W),t_i-1]$, but the proof for the requirement $s'\in[t_i(W),t''_0]$ is identical. The law of large numbers guarantees the existence of such a value $t_i(W)$. It is obvious that all one-sided minimizers in theorem 5.2(ii) are in $B(b)$ at a time $s'\in[t_i(W),t''_0]$.

We employ these results to prove the validity of the Bayes formula and the well-posedness of the Bayesian inverse problems for the Burgers and Hamilton–Jacobi equations.
5.2. Bayesian inverse problem for the Hamilton–Jacobi equation (3.3)
We consider the Hamilton–Jacobi equation (3.3) with a forcing potential $F$ satisfying assumption 5.1. To prove the validity of the Bayes formula, we show the continuity of the function

$$G^{HJ}(W) = \bigl(\phi^W(x_1,t_1)-\phi^W(x_0,t_0),\dots,\phi^W(x_m,t_m)-\phi^W(x_0,t_0)\bigr).$$
Proposition 5.4. The function $G^{HJ}$ is continuous as a map from $X$ to $\mathbb{R}^m$ with respect to the metric $D$ in (3.7).
Proof. Let $W_k$ converge to $W$ in the metric $D$ of $X$. As shown in section 7.2, there are constants $C_k$ which do not depend on $i$ such that

$$\lim_{k\to\infty}|\phi^{W_k}(x_i,t_i) - \phi^W(x_i,t_i) - C_k| = 0 \quad (5.7)$$

for all $i = 1,\dots,m$. From this,

$$\lim_{k\to\infty}\bigl|(\phi^{W_k}(x_i,t_i)-\phi^{W_k}(x_0,t_0)) - (\phi^W(x_i,t_i)-\phi^W(x_0,t_0))\bigr| = 0.$$

The conclusion then follows. □
To show the continuous dependence of the posterior measure on the noisy data $y$, we verify assumption 2.3. We first prove the following lemmas.
Lemma 5.5. There is a constant $c$ that depends on $x_i$, $x_0$, $t_i$, $t_0$ and $F$ such that

$$\phi^W(x_i,t_i) - \phi^W(x_0,t_0) \le c\Bigl(1+\max_{t_0\le\tau\le t_i}|W(\tau)-W(t_i)|\Bigr).$$
Proof. Let $\gamma$ be the linear curve that connects $(x_i,t_i)$ with $(x_0,t_0)$. We have

$$\phi^W(x_i,t_i) \le \phi^W(x_0,t_0) + A^W_{t_0,t_i}(\gamma).$$

Since $\dot\gamma(\tau) = (x_i-x_0)/(t_i-t_0)$, from (5.2),

$$|A^W_{t_0,t_i}(\gamma)| \le \frac{|x_i-x_0|^2}{2|t_i-t_0|} + \max_x|\nabla F(x)|\max_{t_0\le\tau\le t_i}|W(\tau)-W(t_i)||x_i-x_0| + \max_x|F(x)||W(t_i)-W(t_0)|.$$

The conclusion then follows. □
Lemma 5.6. The following estimate holds:

$$\phi^W(x_i,t_i) - \phi^W(x_0,t_0) \ge -c(x_0,x_i,F)\sum_{l=t_i(W)}^{t'_i}\Bigl(1+\max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2\Bigr),$$

where the constant $c(x_0,x_i,F)$ depends on $x_0$, $x_i$ and $F$.
Proof. We prove this lemma in section 7.3. □
From these two lemmas, we have
Proposition 5.7. Assumption 2.3 holds.
Proof. From lemmas 5.5 and 5.6,

$$|G^{HJ}(W)| \le S(W),$$

where

$$S(W) = c + c\sum_{i=1}^m\max_{t_0\le\tau\le t_i}|W(\tau)-W(t_i)| + c\sum_{i=1}^m\sum_{l=t_i(W)}^{t'_i}\Bigl(1+\max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2\Bigr),$$

where the constants do not depend on the Brownian path $W$.

Let $I$ be a positive integer such that the subset $X_I$ of $X$ containing all $W\in X$ that satisfy $t_i(W) > I$ for all $i=1,\dots,m$ has positive probability. There is a constant $c$ such that the subset containing the paths $W\in X_I$ that satisfy $S(W) < c$ has positive probability. Assumption 2.3(i) holds.

To prove assumption 2.3(ii), we get from (4.10)

$$|\Phi^{HJ}(W;y)-\Phi^{HJ}(W;y')| \le \tfrac12\|\Sigma^{-1/2}\|^2_{\mathbb{R}^m,\mathbb{R}^m}(|y|+|y'|+2|G^{HJ}(W)|)|y-y'|.$$

Let

$$G(r,W) = \|\Sigma^{-1/2}\|^2_{\mathbb{R}^m,\mathbb{R}^m}(r+S(W)). \quad (5.8)$$

We prove that $G(r,W)$ is in $L^2(X,\mu_0)$ in section 7.4. □
From propositions 5.4, 5.7, corollary 2.2 and theorem 2.4, we get
Theorem 5.8. Under assumption 5.1, for the Hamilton–Jacobi equation (3.3) the posterior probability measure $\mu^y(dW) = \mathbb{P}(dW|y)$ is absolutely continuous with respect to the Wiener measure $\mu_0(dW)$; its Radon–Nikodym derivative satisfies

$$\frac{d\mu^y}{d\mu_0} \propto \exp(-\Phi^{HJ}(W;y)),$$

where $\Phi^{HJ}$ is defined in (3.8). For $|y|\le r$ and $|y'|\le r$, there is a constant $c(r)$ depending on $r$ such that

$$d_{\mathrm{Hell}}(\mu^y,\mu^{y'}) \le c(r)|y-y'| \quad\text{and}\quad d_{\mathrm{KL}}(\mu^y,\mu^{y'}) \le c(r)|y-y'|.$$
We now remark on the existence of a MAP estimator for the Hamilton–Jacobi equation in the noncompact setting. This remark also holds for the Burgers equation considered below.
Remark 5.9. The existence of a solution for the Hamilton–Jacobi and Burgers equations in [13] is almost sure. Unlike in the periodic case, the proof for the noncompact setting is purely probabilistic, and the law of large numbers plays a central role. It is not clear whether a solution exists when $W$ belongs to the Cameron–Martin space. Therefore, the existence of a MAP estimator in this case is open.
5.3. Bayesian inverse problem for the Burgers equation
We consider the Burgers equation (3.1) in the non-compact setting. We first prove that

$$G^B(W) = (l_1(u(\cdot,t_1)),\dots,l_m(u(\cdot,t_m)))$$

is continuous with respect to $W$. To this end, we prove a bound for $|\dot\gamma(t_i)|$, where $\gamma$ is a one-sided minimizer. Another bound is proved in [13] using Gronwall's inequality; that bound involves an exponential function of the Brownian motion, which makes it difficult to construct the function $G(r,W)$ in assumption 2.3.
Lemma 5.10. All one-sided minimizers $\gamma$ that start at $(x,t''_0)$ satisfy

$$|\dot\gamma(t_i)| \le c(t_i-t_i(W))\Bigl(1+|x|^2+\max_{t_i(W)\le\tau\le t_i}|W(\tau)-W(t_i)|^2\Bigr), \quad (5.9)$$

where the constant $c$ does not depend on $x$, $t_i$ and $t_i(W)$.
Proof. Let $s\in(t_i(W),t_i)$ be such that $\gamma(s)\in B(b)$. From (5.2), we have

$$A^W_{s,t_i}(\gamma) \ge \frac{1}{2(t_i-s)}\Bigl(\int_s^{t_i}|\dot\gamma(\tau)|\,d\tau\Bigr)^2 - c\max_{s\le\tau\le t_i}|W(\tau)-W(t_i)|\int_s^{t_i}|\dot\gamma(\tau)|\,d\tau - c|W(s)-W(t_i)|. \quad (5.10)$$

We consider the linear curve $\bar\gamma$ that connects $(x,t_i)$ and $(\gamma(s),s)$. As $|\dot{\bar\gamma}(\tau)| \le (|x|+b)/(t_i-s)$,

$$A^W_{s,t_i}(\bar\gamma) \le \frac{(|x|+b)^2}{2(t_i-s)} + c(|x|+b)\max_{s\le\tau\le t_i}|W(\tau)-W(t_i)| + c|W(s)-W(t_i)|. \quad (5.11)$$

The constants $c$ in (5.10) and (5.11) do not depend on $x$. From $A^W_{s,t_i}(\gamma) \le A^W_{s,t_i}(\bar\gamma)$, solving the resulting quadratic inequality, we deduce that

$$\int_s^{t_i}|\dot\gamma(\tau)|\,d\tau \le c(t_i-s)\Bigl(1+|x|+\max_{s\le\tau\le t_i}|W(\tau)-W(t_i)|\Bigr),$$

where the constant $c$ does not depend on $x$. The rest of the proof is similar to that of lemma 4.8. □
Lemma 5.11. For any $x$ and $x'$,

$$|\phi^W(x,t_i)-\phi^W(x',t_i)| \le c(t_i-t_i(W))^2\Bigl[1+|x|^2+|x'|^2+\max_{t_i(W)\le\tau\le t_i}|W(\tau)-W(t_i)|^2\Bigr]|x-x'|.$$
The proof of this lemma is similar to that of lemma 4.9.
Proposition 5.12. The function $G^B$ is a continuous map from $X$ to $\mathbb{R}^m$ with respect to the metric (3.7).

Proof. When $W_k\to W$ in the metric (3.7), $t_i(W_k)$ as determined in proposition 5.3 converges to $t_i(W)$. From lemma 5.11, the $\phi^{W_k}(\cdot,t_i)$ are uniformly bounded in $C^{0,1}(D)/\mathbb{R}$ for any compact set $D\subset\mathbb{R}^d$. Therefore, from (5.7), $\phi^{W_k}(\cdot,t_i)$ converges to $\phi^W(\cdot,t_i)$ in $C^{0,1}(D)/\mathbb{R}$. Thus, the $u^{W_k}(\cdot,t_i)$ are uniformly bounded in $D$ and converge to $u^W(\cdot,t_i)$ in $L^1(D)$. Therefore, $G^B(W_k)\to G^B(W)$. □
Now we prove the continuous dependence of $\mu^y$ on $y$.
Proposition 5.13. Assumption 2.3 holds.
Proof. From lemma 5.11, we find that at the points where $\phi^W$ is differentiable,

$$|u(x,t_i)| \le c(t_i-t_i(W))\Bigl[1+|x|^2+(t_i-t_i(W)+1)^2\sum_{l=t_i(W)}^{t'_i-1}\max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2\Bigr],$$

where $t'_i$ is defined in (5.5). The rest of the proof is similar to that for proposition 5.7. The function $G(r,W)$ is defined as

$$G(r,W) = \|\Sigma^{-1/2}\|^2_{\mathbb{R}^m,\mathbb{R}^m}\Bigl[r + c\sum_{i=1}^m(t_i-t_i(W))\Bigl(1+|x|^2+(t_i-t_i(W)+1)^2\sum_{l=t_i(W)}^{t'_i-1}\max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2\Bigr)\Bigr], \quad (5.12)$$

where $c$ does not depend on $x$, $t_i$ and $t_i(W)$. The proof that the function $G(r,W)$ in (5.12) is in $L^2(X,d\mu_0)$ is similar to that in section 7.4 for the function $G(r,W)$ defined in (5.8). □

From propositions 5.12 and 5.13, corollary 2.2 and theorem 2.4, we deduce
Theorem 5.14. Under assumption 5.1, for the randomly forced Burgers equation (3.1) in the non-compact setting, the posterior measure $\mu^y$ is absolutely continuous with respect to the prior measure $\mu_0$ and satisfies

$$\frac{d\mu^y}{d\mu_0} \propto \exp(-\Phi^B(W;y)),$$

where $\Phi^B$ is defined in (3.9). For $|y|\le r$ and $|y'|\le r$, there is a constant $c(r)$ depending on $r$ such that

$$d_{\mathrm{Hell}}(\mu^y,\mu^{y'}) \le c(r)|y-y'| \quad\text{and}\quad d_{\mathrm{KL}}(\mu^y,\mu^{y'}) \le c(r)|y-y'|.$$
6. Conclusions
We formulated in this paper general conditions for the existence and well-posedness of a posterior probability measure for Bayesian inverse problems in general topological measure spaces. The conditions for well-posedness are abstract and problem dependent: even when metrics and/or seminorms are available, we may not be able to express these conditions in terms of a metric or a seminorm, and identifying them can be very challenging. Under the well-posedness conditions, we showed that the posterior measure is locally Lipschitz with respect to the Hellinger and the Kullback–Leibler distances. We then applied the abstract framework to Burgers and Hamilton–Jacobi equations with white noise forcing on a semi-infinite time interval $(-\infty,T]$. This is the problem of inferring the forcing when the physical process has occurred over a long time and no longer depends on the particular initial condition it started with. Employing the available results in the literature for the Burgers and Hamilton–Jacobi equations, we considered the Bayesian inverse problems in both spatially periodic and nonperiodic domains.
7. Proofs of technical results
7.1. Proof of (4.5)
We use the concept of ε-narrow places defined in [16], in particular, the following result.
Lemma 7.1. For sufficiently large $T$, if $s_2-s_1 = T$ and

$$\max_{s_1\le\tau\le s_2}|W(\tau)-W(s_2)| < \frac{1}{T^2},$$

then for any minimizer $\gamma$ on $[s_1,s_2]$,

$$\Bigl|A^{W,b}_{s_1,s_2}(\gamma) - \frac{b^2T}{2}\Bigr| < \varepsilon.$$
For each $T>0$, for almost all $W$, ergodicity implies that there are infinitely many time intervals satisfying the assumption of lemma 7.1. The proof of lemma 7.1 can be found in Iturriaga and Khanin [16], lemma 12. We call such an interval $[s_1,s_2]$ an $\varepsilon$-narrow place. The definition of $\varepsilon$-narrow places in [16] is indeed more general, but lemma 7.1 is sufficient for our purpose.
To prove (4.5), we need a further result from [16].
Lemma 7.2 ([16], equation (42)). Let $[s_1,s_2]$ be an $\varepsilon$-narrow place, and let $\gamma$ be a $\psi$ minimizer on $[s_1,t]$, where $t>s_2$, for a function $\psi$. Then, for all $p\in\mathbb{T}^d$,

$$\psi(p) > \psi(\gamma(s_1)) - 2\varepsilon.$$
Proof of (4.5). Let $\gamma^{W_k}$ be a one-sided minimizer starting at $(x_i,t_i)$ for the action $A^{W_k,b}$. As $D(W_k,W)\to0$ when $k\to\infty$, from lemma 4.8, $\dot\gamma^{W_k}(t_i)$ is uniformly bounded.

Let $\gamma^W$ be a one-sided minimizer for $A^{W,b}$. Let $[s_1,s_2]$ be an $\varepsilon$-narrow place for the action $A^{W,b}$ as defined in lemma 7.1. When $D(W,W_k)$ is sufficiently small, $[s_1,s_2]$ is also an $\varepsilon$-narrow place for $A^{W_k,b}$.
Let $\gamma^W_1$ be an arbitrary $\psi^W_b(\cdot,s_1)$ minimizer on $[s_1,t_i]$. From theorem 4.2(ii), $\gamma^W$ is a $\psi^W_b(\cdot,s_1)$ minimizer on $[s_1,t_i]$, so we have from lemma 7.2

$$\psi^W_b(\gamma^W_1(s_1),s_1) > \psi^W_b(\gamma^W(s_1),s_1) - 2\varepsilon.$$

On the other hand, also from lemma 7.2,

$$\psi^W_b(\gamma^W(s_1),s_1) > \psi^W_b(\gamma^W_1(s_1),s_1) - 2\varepsilon.$$

Therefore,

$$\bigl|\psi^W_b(\gamma^W(s_1),s_1) - \psi^W_b(\gamma^W_1(s_1),s_1)\bigr| < 2\varepsilon.$$

Similarly, let $\gamma^{W_k}_1$ be an arbitrary $\psi^{W_k}_b(\cdot,s_1)$ minimizer on $[s_1,t_i]$. We then have

$$\bigl|\psi^{W_k}_b(\gamma^{W_k}(s_1),s_1) - \psi^{W_k}_b(\gamma^{W_k}_1(s_1),s_1)\bigr| < 2\varepsilon.$$

Fix $\gamma^W_1$ and $\gamma^{W_k}_1$. Letting $C_k = \psi^W_b(\gamma^W_1(s_1),s_1) - \psi^{W_k}_b(\gamma^{W_k}_1(s_1),s_1)$, we deduce

$$\bigl|\psi^W_b(\gamma^W(s_1),s_1) - \psi^{W_k}_b(\gamma^{W_k}(s_1),s_1) - C_k\bigr| < 4\varepsilon. \quad (7.1)$$
We have

$$\psi^{W_k}_b(x_i,t_i) = \psi^{W_k}_b(\gamma^{W_k}(s_1),s_1) + A^{W_k,b}_{s_1,s_2}(\gamma^{W_k}) + A^{W_k,b}_{s_2,t_i}(\gamma^{W_k}).$$

Let $\bar\gamma$ be the curve constructed as follows. For $\tau\in[s_2,t_i]$, $\bar\gamma(\tau) = \gamma^{W_k}(\tau)$. On $[s_1,s_2]$, $\bar\gamma$ is a minimizing curve for $A^{W,b}_{s_1,s_2}$ that connects $(\gamma^{W_k}(s_2),s_2)$ and $(\gamma^W(s_1),s_1)$. We then have

$$\psi^W_b(x_i,t_i) \le \psi^W_b(\gamma^W(s_1),s_1) + A^{W,b}_{s_1,t_i}(\bar\gamma) = \psi^W_b(\gamma^W(s_1),s_1) + A^{W,b}_{s_1,s_2}(\bar\gamma) + A^{W,b}_{s_2,t_i}(\gamma^{W_k}).$$

Therefore,

$$\psi^W_b(x_i,t_i) - \psi^{W_k}_b(x_i,t_i) \le \bigl(\psi^W_b(\gamma^W(s_1),s_1) - \psi^{W_k}_b(\gamma^{W_k}(s_1),s_1)\bigr) + \bigl(A^{W,b}_{s_2,t_i}(\gamma^{W_k}) - A^{W_k,b}_{s_2,t_i}(\gamma^{W_k})\bigr) + \bigl(A^{W,b}_{s_1,s_2}(\bar\gamma) - A^{W_k,b}_{s_1,s_2}(\gamma^{W_k})\bigr). \quad (7.2)$$
When $D(W_k,W)\to0$, the $|\dot\gamma^{W_k}(t_i)|$ are uniformly bounded, so from (4.12) we deduce that $|\gamma^{W_k}(\tau)|$ and $|\dot\gamma^{W_k}(\tau)|$ are uniformly bounded for all $s_2\le\tau\le t_i$. Therefore,

$$\lim_{k\to\infty}A^{W,b}_{s_2,t_i}(\gamma^{W_k}) - A^{W_k,b}_{s_2,t_i}(\gamma^{W_k}) = 0. \quad (7.3)$$

From lemma 7.1, as $[s_1,s_2]$ is an $\varepsilon$-narrow place for both $A^{W,b}$ and $A^{W_k,b}$,

$$\bigl|A^{W,b}_{s_1,s_2}(\bar\gamma) - A^{W_k,b}_{s_1,s_2}(\gamma^{W_k})\bigr| < 2\varepsilon. \quad (7.4)$$

From (7.1), (7.2), (7.3) and (7.4), we have

$$\psi^W_b(x_i,t_i) - \psi^{W_k}_b(x_i,t_i) - C_k \le 7\varepsilon$$

when $k$ is sufficiently large. Similarly,

$$\psi^{W_k}_b(x_i,t_i) - \psi^W_b(x_i,t_i) + C_k \le 7\varepsilon$$

when $k$ is large. From these, we deduce

$$\bigl|\psi^W_b(x_i,t_i) - \psi^{W_k}_b(x_i,t_i) - C_k\bigr| \le 7\varepsilon$$

when $k$ is sufficiently large. Hence, for this particular subsequence of $\{W_k\}$ (not renumbered),

$$\lim_{k\to\infty}\bigl|\phi^{W_k}_b(x_i,t_i) - \phi^W_b(x_i,t_i) - C_k\bigr| = 0 \quad (7.5)$$

for appropriate constants $C_k$ which do not depend on $x_i$ and $t_i$. From any subsequence of $\{W_k\}$ we can choose such a further subsequence. We then get the conclusion. □
7.2. Proof of (5.7)
The proof uses $\varepsilon$-narrow places for the nonperiodic setting, which are defined as follows.
Definition 7.3. An interval $[s_1,s_2]$ is called an $\varepsilon$-narrow place for the action $A^W$ defined in (5.2) if there are two compact sets $S_1$ and $S_2$ such that for all minimizing curves $\gamma(\tau)$ on an interval $[s'_1,s'_2]\supset[s_1,s_2]$ with $|\gamma(s'_1)|\le b$ and $|\gamma(s'_2)|\le b$, we have $\gamma(s_1)\in S_1$ and $\gamma(s_2)\in S_2$. Furthermore, if $\gamma$ is a minimizing curve over $[s_1,s_2]$ connecting a point in $S_1$ to a point in $S_2$, then $|A^W_{s_1,s_2}(\gamma)| < \varepsilon$.
In [13], theorem 7, such an $\varepsilon$-narrow place is constructed as follows. Assume that $s_2-s_1 = T$ and

$$\max_{s_1\le\tau\le s_2}|W(\tau)-W(s_2)| < \frac{1}{2T}.$$

Without loss of generality, we assume that $s_1$ and $s_2$ are integers. Fixing a positive constant $c$ that depends only on the function $F$ and $b$, the law of large numbers implies that if two integers $s''_1 < s_1$ and $s''_2 > s_2$ satisfy

$$\sum_{l=s_2}^{s''_2-1}P_l + \sum_{l=s''_1}^{s_1-1}P_l \le -c\Bigl[1+\max_{s''_1-2\le\tau\le s''_1}|W(\tau)-W(s''_1)|^2\Bigr] - c\Bigl[1+\max_{s''_2\le\tau\le s''_2+2}|W(\tau)-W(s''_2)|^2\Bigr],$$

then $s''_1 \ge s^*_1$ and $s''_2 \le s^*_2$, where $s^*_1$ and $s^*_2$ depend only on the path $W$ and the constant $c$ ($P_l$ is defined in (5.3)). Let

$$M_1 = \sup_{s^*_1-2\le\tau\le s_1}|W(\tau)-W(s_1)|, \qquad M_2 = \sup_{s_2\le\tau\le s^*_2+2}|W(\tau)-W(s_2)|.$$

Hoang and Khanin ([13], pp 833–5) show that there is a continuous function $R(M_1,M_2)$ and a constant $c$ that depends on $F$ such that if

$$\frac{c(R(M_1,M_2)^2+1)}{T} < \varepsilon,$$

then $[s_1,s_2]$ is an $\varepsilon$-narrow place for $A^W$. Using ergodicity, they show that with probability 1 an $\varepsilon$-narrow place constructed this way always exists, and $s_2$ can be chosen arbitrarily small. When $D(W,W')$ is sufficiently small, it is also an $\varepsilon$-narrow place for $A^{W'}$.
We then have the following lemma (Hoang and Khanin [13], in the proof of their lemma 3, show a similar result for the kicked forcing case; the proof for white noise forcing follows the same lines).
Lemma 7.4. If $\gamma$ is a $\phi(\cdot,s_1)$ minimizer on $[s_1,t]$ such that $\gamma(s_1)\in S_1$ and $\gamma(s_2)\in S_2$, then for all $p\in S_1$,

$$\phi(p,s_1) > \phi(\gamma(s_1),s_1) - 2\varepsilon.$$
Proof of (5.7). The proof is similar to that of (4.5). Assume that $[s_1,s_2]$ is an $\varepsilon$-narrow place for both $W$ and $W_k$. Any one-sided minimizer starting at $(x,t)$, where $x$ is in a compact set, when $s_2$ is sufficiently smaller than $t$, must intersect the ball $B(b)$ at a time larger than $s_2$ and at a time smaller than $s_1$ (see the proof of proposition 2 in [13]). Therefore, $\gamma(s_1)\in S_1$ and $\gamma(s_2)\in S_2$. We then proceed as in the proof of (4.5) in section 7.1. □
7.3. Proof of lemma 5.6
In this section, we prove lemma 5.6. Consider a one-sided minimizer $\gamma$ that starts at $(x_i,t_i)$. There is a time $s\in[t_i(W),t''_0]$ ($t_i(W)$ is defined in proposition 5.3) such that $\gamma(s)\in B(b)$. We then have

$$\phi^W(x_i,t_i) = \phi^W(\gamma(s),s) + A^W_{s,t_i}(\gamma).$$

Let $s'$ be the integer such that $1\le s'-s<2$. We then have

$$A^W_{s,t_i}(\gamma) = A^W_{s,s'}(\gamma) + \sum_{l=s'}^{t''_i-1}A^W_{l,l+1}(\gamma) + A^W_{t''_i,t_i}(\gamma), \quad (7.6)$$

where $t''_i$ is defined in (5.5). From (5.2), we have

$$A^W_{s,s'}(\gamma) \ge \frac{1}{2|s'-s|}\Bigl(\int_s^{s'}|\dot\gamma(\tau)|\,d\tau\Bigr)^2 - \max_x|\nabla F(x)|\max_{s\le\tau\le s'}|W(\tau)-W(s')|\int_s^{s'}|\dot\gamma(\tau)|\,d\tau - \max_x|F(x)||W(s)-W(s')|,$$

which is a quadratic form in $\int_s^{s'}|\dot\gamma(\tau)|\,d\tau$. Therefore, there is a constant $c$ that depends on $F$ such that

$$A^W_{s,s'}(\gamma) \ge -c\Bigl(1+\max_{s\le\tau\le s'}|W(\tau)-W(s')|^2\Bigr).$$

Arguing similarly for the other terms in (7.6), we find that there is a constant $c$ which does not depend on $x$, $x_0$, $t$, $t_0$ such that

$$A^W_{s,t_i}(\gamma) \ge -c\sum_{l=s''}^{t'_i}\Bigl(1+\max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2\Bigr),$$

where $t'_i$ is defined in (5.5) and $s''$ is the integer such that $0\le s-s''<1$. Thus,

$$\phi^W(x_i,t_i) \ge \phi^W(\gamma(s),s) - c\sum_{l=t_i(W)}^{t'_i}\Bigl(1+\max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2\Bigr). \quad (7.7)$$

Now we consider the straight line $\gamma_1$ that connects $(x_0,t_0)$ with $(\gamma(s),s)$. We have

$$\phi^W(x_0,t_0) \le \phi^W(\gamma_1(s),s) + A^W_{s,t_0}(\gamma_1).$$

Let $t''_0$ be the integer such that $1\le t_0-t''_0<2$. We write

$$A^W_{s,t_0}(\gamma_1) = A^W_{s,s'}(\gamma_1) + \sum_{l=s'}^{t''_0-1}A^W_{l,l+1}(\gamma_1) + A^W_{t''_0,t_0}(\gamma_1).$$

The velocity satisfies $|\dot\gamma_1(\tau)| \le b+|x_0|$, so there is a constant $c(x_0)$ such that

$$A^W_{s,t_0}(\gamma_1) \le c(x_0)\sum_{l=t_i(W)}^{t'_0}\Bigl(1+\max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|\Bigr).$$

Therefore,

$$\phi^W(x_0,t_0) \le \phi^W(\gamma(s),s) + c(x_0)\sum_{l=t_i(W)}^{t'_0}\Bigl(1+\max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|\Bigr). \quad (7.8)$$

From equations (7.7) and (7.8), we have

$$\phi^W(x_i,t_i) - \phi^W(x_0,t_0) \ge -c\sum_{l=t_i(W)}^{t'_i}\Bigl(1+\max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2\Bigr). \qquad\square$$
7.4. Proof that the function $G(r,W)$ in (5.8) is in $L^2(X,\mu_0)$

We now show that the function $G(r,W)$ defined in (5.8) is in $L^2(X,\mu_0)$. It suffices to show that

$$\sum_{l=t_i(W)}^{t'_i}\Bigl(1+\max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2\Bigr)$$

is in $L^2(X,\mu_0)$.

Let $\varepsilon$ be a small positive constant. Fixing an integer $k$, we consider the following events:
(A) $\displaystyle\Bigl|\sum_{l=t''_i-m}^{t'_i}\max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2 - (t'_i-t''_i+m+1)E_3\Bigr| < (t'_i-t''_i+m+1)\varepsilon$ for all $m\ge k$;

(B) $\displaystyle\Bigl|\sum_{l=t''_i-m}^{t''_i-1}\max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2 - mE_3\Bigr| < m\varepsilon$ for all $m\ge k$;

(C) $\displaystyle\Bigl|\sum_{l=t''_i-m}^{t''_i-1}P_l - m\mathbb{E}\{P_l\}\Bigr| < m\varepsilon$ for all $m\ge k$.
Assume that all of these events occur. Then, from (A) and (B) for $m=k$, we get

$$\sum_{l=t''_i}^{t'_i}\max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2 \le (t'_i-t''_i+1)E_3 + (t'_i-t''_i+2k+1)\varepsilon;$$

using (B) for $m=k$ and $m=k+1$, we have

$$\max_{t''_i-(k+1)\le\tau\le t''_i-k}|W(\tau)-W(t''_i-k)|^2 \le (2k+1)\varepsilon + E_3;$$

using (B) for $m=k+1$ and $m=k+2$, we have

$$\max_{t''_i-(k+2)\le\tau\le t''_i-(k+1)}|W(\tau)-W(t''_i-(k+1))|^2 \le (2k+3)\varepsilon + E_3;$$

and using (C) for $m=k$, we have

$$\sum_{l=t''_i-k}^{t''_i-1}P_l \le k(\mathbb{E}\{P_l\}+\varepsilon) \le k\Bigl(-\frac{LE_1}{8}+\varepsilon\Bigr)$$

(note from (5.4) that $\mathbb{E}\{P_l\} \le -LE_1/8$).

Now suppose that all the events (A), (B) and (C) hold for $k = t''_i - t_i(W)$; then, from (5.6) for $t = t_i(W)$, we get

$$(t''_i-t_i(W))\Bigl(-\frac{LE_1}{8}+\varepsilon\Bigr) \ge -c(x_i,t_i,F)\bigl\{1+(t'_i-t''_i+1)E_3+[t'_i-t''_i+2(t''_i-t_i(W))+1]\varepsilon\bigr\} - c(F)\bigl\{1+[4(t''_i-t_i(W))+4]\varepsilon+2E_3\bigr\}.$$

Thus,

$$(t''_i-t_i(W))\Bigl[-\frac{LE_1}{8}+\varepsilon\bigl(1+2c(x_i,t_i,F)+4c(F)\bigr)\Bigr] \ge -c(x_i,t_i,F)\bigl[1+(t'_i-t''_i+1)E_3+(t'_i-t''_i+1)\varepsilon\bigr] - c(F)(1+4\varepsilon+2E_3).$$

When $\varepsilon$ is sufficiently small, this holds only when $t''_i-t_i(W)$ is smaller than an integer $N$ which does not depend on the white noise, i.e. when $t_i(W) \ge t''_i-N$. If (5.6) holds but $t_i(W) < t''_i-N$, then at least one of the events (A), (B) and (C) must fail. We denote by $X_k$ the subset of $X$ containing all the Brownian paths $W$ for which at least one of the events (A), (B) and (C) does not hold. For each number $r$, there is a constant $C(r,\varepsilon)$ such that

$$\mathbb{P}\{X_k\} \le \frac{C(r,\varepsilon)}{k^r} \quad (7.9)$$
(see Baum and Katz [3], theorem 1). Therefore,
E
⎧⎪⎨⎪⎩⎡⎣ t ′i∑
l=ti(W )
(1 + max
l�τ�l+1|W (τ ) − W (l + 1)|2)
⎤⎦
2⎫⎪⎬⎪⎭
� E
⎧⎪⎨⎪⎩⎡⎣ t ′i∑
l=t ′′i −N
(1 + max
l�τ�l+1|W (τ ) − W (l + 1)|2)
⎤⎦
2⎫⎪⎬⎪⎭
27
Inverse Problems 28 (2012) 025009 V H Hoang
+∞∑
k=1
E
⎧⎪⎨⎪⎩⎡⎣ t ′i∑
l=t ′′i −k
(1 + max
l�τ�l+1|W (τ ) − W (l + 1)|2)
⎤⎦
2
1Xk
⎫⎪⎬⎪⎭ .
From (7.9), we have
\[
\begin{aligned}
\mathbb{E}\Biggl\{\Biggl[\sum_{l=t_i''-k}^{t_i'} \Bigl(1 + \max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2\Bigr)\Biggr]^2 \mathbf{1}_{X_k}\Biggr\}
&\le \frac{C(r,\varepsilon)}{k^{r/2}} \Biggl[\mathbb{E}\Biggl\{\Biggl[\sum_{l=t_i''-k}^{t_i'} \Bigl(1 + \max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2\Bigr)\Biggr]^4\Biggr\}\Biggr]^{1/2} \\
&\le \frac{C(r,\varepsilon)}{k^{r/2}} (t_i'-t_i''+k+1)^{3/2} \Biggl[\mathbb{E}\Biggl\{\sum_{l=t_i''-k}^{t_i'} \Bigl(1 + \max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2\Bigr)^4\Biggr\}\Biggr]^{1/2} \\
&\le C(r,\varepsilon)\, k^{-r/2} (t_i'-t_i''+k+1)^2.
\end{aligned}
\]
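The chain above combines two standard bounds: the Cauchy–Schwarz inequality against the indicator $\mathbf{1}_{X_k}$ together with (7.9), and the discrete power-mean inequality applied to the $n = t_i'-t_i''+k+1$ summands; schematically, with $A_l = 1 + \max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2$:

```latex
\mathbb{E}\Bigl\{\Bigl(\textstyle\sum_l A_l\Bigr)^2 \mathbf{1}_{X_k}\Bigr\}
  \;\le\; \Bigl[\mathbb{E}\Bigl\{\Bigl(\textstyle\sum_l A_l\Bigr)^{4}\Bigr\}\Bigr]^{1/2}
          \,\mathbb{P}\{X_k\}^{1/2},
\qquad
\Bigl(\sum_{l=1}^{n} A_l\Bigr)^{4} \;\le\; n^{3} \sum_{l=1}^{n} A_l^{4}.
```

Here $\mathbb{E}\{A_l^4\}$ is bounded uniformly in $l$ because the increments of $W$ over unit intervals are identically distributed, which yields the final factor $(t_i'-t_i''+k+1)^2$.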
When $r$ is sufficiently large, $\sum_{k=1}^{\infty} k^{-r/2}(t_i'-t_i''+k+1)^2$ is finite. The assertion follows.
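As a side check (not part of the paper), the two quantitative ingredients of this last estimate — the finiteness of the moments of $1+\max_{l\le\tau\le l+1}|W(\tau)-W(l+1)|^2$ and the convergence of $\sum_k k^{-r/2}(t_i'-t_i''+k+1)^2$ for large $r$ — can be verified numerically. The following sketch assumes a standard Brownian motion; the function names are illustrative.

```python
# Numerical sanity check (not from the paper). Assumes W is a standard
# Brownian motion on each unit interval; function names are illustrative.
import math
import random

def max_increment_squared(n_steps=1000):
    """One sample of max_{0<=tau<=1} |W(tau) - W(1)|^2 on a discrete grid."""
    w = [0.0]
    for _ in range(n_steps):
        w.append(w[-1] + random.gauss(0.0, math.sqrt(1.0 / n_steps)))
    w_end = w[-1]
    return max((x - w_end) ** 2 for x in w)

def tail_series(r, a, n_terms=100000):
    """Partial sum of sum_{k>=1} k^{-r/2} (a + k + 1)^2; converges when r > 6."""
    return sum(k ** (-r / 2) * (a + k + 1) ** 2 for k in range(1, n_terms + 1))

random.seed(0)
# The moments of 1 + max |W(tau) - W(l+1)|^2 are finite, so the empirical
# mean below stabilises at a finite value of order one as the sample count grows.
est = sum(max_increment_squared() for _ in range(2000)) / 2000
print("estimated E{max |W - W(1)|^2}:", est)
# For r = 8 > 6 the series is summable (here with t' - t'' = 5 for illustration).
print("partial sum of the tail series:", tail_series(8, a=5))
```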
Acknowledgments
The author thanks Andrew Stuart for suggesting the metric space framework and for helpful discussions. He also thanks Kostya Khanin for introducing him to Burgers turbulence. Financial support from the startup research grant from Nanyang Technological University is gratefully acknowledged.
References
[1] Apte A, Auroux D and Ramaswamy M 2010 Variational data assimilation for discrete Burgers equation Electron. J. Diff. Eqns 19 15–30
[2] Bardos C and Pironneau O 2005 Data assimilation for conservation laws Methods Appl. Anal. 12 103–34
[3] Baum L E and Katz M 1963 Convergence rates in the law of large numbers Bull. Am. Math. Soc. 69 771–2
[4] Bec J and Khanin K 2007 Burgers turbulence Phys. Rep. 447 1–66
[5] Bogachev V I 1998 Gaussian Measures (Mathematical Surveys and Monographs vol 62) (Providence, RI: American Mathematical Society)
[6] Cohen A, DeVore R and Schwab C 2010 Convergence rates of best N-term Galerkin approximations for a class of elliptic sPDEs Found. Comput. Math. 10 615–46
[7] Cotter S L, Dashti M, Robinson J C and Stuart A M 2009 Bayesian inverse problems for functions and applications to fluid mechanics Inverse Problems 25 115008
[8] E W, Khanin K, Mazel A and Sinai Y 2000 Invariant measures for Burgers equation with stochastic forcing Ann. Math. (2) 151 877–960
[9] Engl H W, Hanke M and Neubauer A 1996 Regularization of Inverse Problems (Mathematics and Its Applications vol 375) (Dordrecht: Kluwer)
[10] Freitag M, Nichols N K and Budd C J 2010 L1-regularisation for ill-posed problems in variational data assimilation Proc. Appl. Math. Mech. 10 665–8
[11] Frisch U and Bec J 2001 'Burgulence' Turbulence: Nouveaux Aspects/New Trends in Turbulence (Les Houches, 2000) (Les Ulis: EDP Sciences) pp 341–83
[12] Gomes D, Iturriaga R, Khanin K and Padilla P 2005 Viscosity limit of stationary distributions for the random forced Burgers equation Moscow Math. J. 5 613–31, 743
[13] Hoang V H and Khanin K 2003 Random Burgers equation and Lagrangian systems in non-compact domains Nonlinearity 16 819–42
[14] Hoang V H and Schwab C 2012 Analytic regularity and polynomial approximation of stochastic, parametric elliptic multiscale PDEs Analysis Appl. at press
[15] Isakov V 2006 Inverse Problems for Partial Differential Equations (Applied Mathematical Sciences vol 127) 2nd edn (New York: Springer)
[16] Iturriaga R and Khanin K 2003 Burgers turbulence and random Lagrangian systems Commun. Math. Phys. 232 377–428
[17] Kaipio J and Somersalo E 2005 Statistical and Computational Inverse Problems (Applied Mathematical Sciences vol 160) (New York: Springer)
[18] Kallenberg O 2002 Foundations of Modern Probability (Probability and Its Applications) 2nd edn (New York: Springer)
[19] Lam C H and Sander L M 1993 Inverse methods for interface problems Phys. Rev. Lett. 71 561–4
[20] Lifshits M A 1995 Gaussian Random Functions (Mathematics and Its Applications vol 322) (Dordrecht: Kluwer)
[21] Lundvall J, Kozlov V and Weinerfelt P 2006 Iterative methods for data assimilation for Burgers' equation J. Inverse Ill-Posed Problems 14 505–35
[22] Mattingly J C 1999 Ergodicity of 2D Navier–Stokes equations with random forcing and large viscosity Commun. Math. Phys. 206 273–88
[23] Schwab C and Stuart A M 2011 Sparse deterministic approximation of Bayesian inverse problems Technical Report, SAM, ETH Zürich
[24] Stroock D W and Varadhan S R S 2006 Multidimensional Diffusion Processes (Classics in Mathematics) (Berlin: Springer) (reprint of 1997 edn)
[25] Stuart A M 2010 Inverse problems: a Bayesian perspective Acta Numer. 19 451–559