A note on a nonlinear semigroup for controlled partially observed diffusions


Systems & Control Letters 10 (1988) 365-370 North-Holland


Qing ZHANG, Division of Applied Mathematics, Brown University, Providence, RI 02912, U.S.A.

Received 7 November 1987 Revised 25 January 1988

Abstract: This paper is concerned with the separated control problems for optimal stochastic controls under partial observations. The 'pathwise' method of nonlinear filtering is extended to the case when a function $h(X_t, Y_t)$ of state $X_t$ and observation $Y_t$ plus additive white noise is observed. Markov and continuity properties of the unnormalized conditional distribution measure are obtained and the Nisio nonlinear semigroup is found.

Keywords: Partial observation stochastic control, Zakai equation, Nonlinear semigroup, Pathwise method.

In this note, we are concerned with the control of diffusions under partial observations which depend both on their paths and controls. Let $X_t$ denote the state of the process being controlled, $Y_t$ the observation process and $U_t$ the control process, $t \ge 0$. The state and the observation processes are governed by the following stochastic differential equations:

$$dX_t = b(X_t, Y_t, U_t)\,dt + \sigma(X_t, Y_t)\,dW_t,$$

$$dY_t = h(X_t, Y_t)\,dt + d\tilde W_t. \tag{1}$$
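To make the dynamics (1) concrete, the following is a minimal Euler-Maruyama sketch in the scalar case $N = M = D = L = 1$. The coefficients `b`, `sigma`, `h`, the constant control `u`, and the function name `simulate` are illustrative assumptions (bounded, Lipschitz choices made for this example); they are not taken from the paper.

```python
import math
import random


def simulate(T=1.0, n_steps=1000, x0=0.0, y0=0.0, u=0.5, seed=0):
    """Euler-Maruyama discretization of system (1) in one dimension.

    All coefficients below are hypothetical choices for illustration only.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)

    def b(x, y, u):        # drift, affine in the control u
        return -math.tanh(x) + u

    def sigma(x, y):       # constant diffusion coefficient
        return 1.0

    def h(x, y):           # observation function depending on state and observation
        return math.tanh(x) * math.cos(y)

    xs, ys = [x0], [y0]
    for _ in range(n_steps):
        x, y = xs[-1], ys[-1]
        dW = rng.gauss(0.0, sqrt_dt)   # increment of W
        dV = rng.gauss(0.0, sqrt_dt)   # increment of the independent observation noise
        xs.append(x + b(x, y, u) * dt + sigma(x, y) * dW)
        ys.append(y + h(x, y) * dt + dV)
    return xs, ys
```

Calling `simulate()` returns one sample path of the state and one of the observation on a uniform time grid; only the observation path would be available to a controller.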

We shall assume that $X_t$ has values in $N$-dimensional space $\mathbb{R}^N$, $Y_t$ values in $\mathbb{R}^M$, and $U_t$ values in $\mathcal{U} \subset \mathbb{R}^L$. $X_0$ has a given distribution $\mu$, and $Y_0 = y_0 \in \mathbb{R}^M$. In (1), $W$ and $\tilde W$ are independent standard Wiener processes with values in $\mathbb{R}^D$ and $\mathbb{R}^M$ respectively; $\sigma$ in (1) is thus an $N \times D$ matrix. The purpose of this note is to generalize the method of the so-called 'pathwise' version of an unnormalized conditional distribution, which has been utilized by Fleming [3] and Fleming and Pardoux [4] in the case $h = h(X_t)$. We shall discuss continuity and Markov properties of the unnormalized conditional distribution $\Lambda_t$, the Zakai equation, and continuity and semigroup properties of some criteria $J$ with respect to initial distributions $(y_0, \mu)$. An essential point of our method is introducing a function $H: \mathbb{R}^N \times \mathbb{R}^M \to \mathbb{R}^M$ such that

$$H(x, y) + y\,\frac{\partial H(x, y)}{\partial y} = h(x, y).$$

For integers $l$, let $C_b(\mathbb{R}^l)$ denote the space of bounded, continuous functions on $\mathbb{R}^l$, let $C_b^k(\mathbb{R}^l)$ be the space of $f$ such that $f$ together with all partial derivatives of order $\le k$ are bounded and continuous on $\mathbb{R}^l$, and let $C_b^\infty(\mathbb{R}^l) = \bigcap_{k=1}^{\infty} C_b^k(\mathbb{R}^l)$.

We make the following hypotheses about the functions appearing in (1).

(A1) $\sigma$ is a bounded, Lipschitz $N \times D$ matrix-valued function on $\mathbb{R}^N$.
(A2) $b(x, u) = b^0(x) + b^1(x)u$, where $b^i$ are bounded Lipschitz functions on $\mathbb{R}^N$, $i = 0, 1$.
(A3) $h \in C_b^2(\mathbb{R}^N \times \mathbb{R}^M; \mathbb{R}^M)$. (This condition will be relaxed subsequently.)
(A4) $\mathcal{U}$ is a convex, compact subset of $\mathbb{R}^L$.

Let $Y$ denote an $\mathbb{R}^M$-valued function, and $U$ a $\mathcal{U}$-valued function of time $t \ge 0$. Let $\Omega_{y_0} = \{(Y, U): Y_0 = y_0,\ Y \in C([0, \infty); \mathbb{R}^M),\ U \in L^2([0, T]; \mathcal{U}) \text{ for each } T < \infty\}$, and $\Omega = \bigcup_{y_0 \in \mathbb{R}^M} \Omega_{y_0}$. Let $\Omega^T$ denote the set of restrictions to $[0, T]$ of functions $(Y, U) \in \Omega$. We give $\Omega$ a metric in which convergence of $(Y_n, U_n)$ means uniform convergence on $[0, T]$ of $Y_n$, and weak convergence of $U_n$ in $L^2([0, T]; \mathcal{U})$, for every $T < \infty$. Let $\mathcal{G}_t(Y) = \sigma\{Y_{t_2} - Y_{t_1};\ 0 \le t_1 < t_2 \le t\}$, $\mathcal{G}_t(U) = \sigma\{\int_0^s U_v\,dv;\ 0 \le s \le t\}$, $\mathcal{G}_t = \mathcal{G}_t(Y) \times \mathcal{G}_t(U)$, and $\mathcal{G}_\infty = \bigvee_{t \ge 0} \mathcal{G}_t$, the least $\sigma$-algebra containing $\mathcal{G}_t$ for all $t \ge 0$.

0167-6911/88/$3.50 © 1988, Elsevier Science Publishers B.V. (North-Holland)

Definition. An admissible control on $[0, T]$ is a probability measure $\pi^T$ on $(\Omega^T, \mathcal{G}_T)$ such that $Y_\cdot - y_0$ is a $\pi^T$, $\{\mathcal{G}_t\}$ Wiener process for $0 \le t \le T$. An admissible control is a probability measure $\pi$ on $(\Omega, \mathcal{G}_\infty)$ such that $Y_\cdot - y_0$ is a $\pi$, $\{\mathcal{G}_t\}$ Wiener process for $t \ge 0$. The totality of admissible controls $\pi$ is denoted by $\mathcal{A}$.

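Let $\mathcal{M} = \{\text{measures } \mu \ge 0 \text{ on } \mathcal{B}(\mathbb{R}^N): \mu(\mathbb{R}^N) < \infty\}$, and $\mathcal{M}_r = \{\mu \in \mathcal{M}: \mu(\mathbb{R}^N) \le r\}$; by convergence of a sequence in $\mathcal{M}$ we mean w*-convergence: $\mu_n \to \mu$ iff $\langle f, \mu_n\rangle \to \langle f, \mu\rangle$ for every $f \in C_b$ such that $f(x) \to 0$ as $|x| \to \infty$, where $\langle f, \mu\rangle = \int f(x)\,d\mu(x)$. Let

$$C_k(\mathbb{R}^M \times \mathcal{M}) = \Big\{\Phi \text{ continuous on } \mathbb{R}^M \times \mathcal{M}:\ \sup_{y \in \mathbb{R}^M}\,\sup_{\mu \in \mathcal{M}} \frac{|\Phi(y, \mu)|}{1 + \|\mu\|^k} < \infty\Big\}$$

and $C(\mathbb{R}^M \times \mathcal{M}) = \bigcup_{k \ge 0} C_k(\mathbb{R}^M \times \mathcal{M})$. We give $C(\mathbb{R}^M \times \mathcal{M})$ the following metric:

$$d(\Phi_1, \Phi_2) = \sum_{l=1}^{\infty} 2^{-l}\Big(\sup_{\mathbb{R}^M_l}\,\sup_{\mathcal{M}_l} |\Phi_1(y, \mu) - \Phi_2(y, \mu)| \wedge 1\Big),$$

where $\mathbb{R}^M_l = \{y: y \in \mathbb{R}^M,\ |y| \le l\}$, for $\Phi_1, \Phi_2 \in C(\mathbb{R}^M \times \mathcal{M})$. Let

$$\mathcal{D} = \{\Phi:\ \Phi(y, \mu) = F(y, \langle f_1, \mu\rangle, \ldots, \langle f_J, \mu\rangle),\ F \in C_b^\infty(\mathbb{R}^{M+J}),\ f_1, \ldots, f_J \in C_c^\infty(\mathbb{R}^N),\ J = 1, 2, \ldots\},$$

where $C_c^\infty(\mathbb{R}^N) \subset C_b^\infty(\mathbb{R}^N)$ consists of those $f$ with compact support. Let $J(t, y_0, \mu, \pi, \Phi) = E_\pi \Phi(Y_t, \Lambda_t)$, where $\Lambda_t = \Lambda^{Y,U}_{t,y_0,\mu}$ is the unnormalized conditional distribution (see [3,4]) and $\Phi \in C(\mathbb{R}^M \times \mathcal{M})$.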

Lemma 1. For each $t \ge 0$, $r < \infty$, $\Lambda^{Y,U}_{t,y_0,\mu}$ is continuous on $\mathbb{R}^M \times \mathcal{M}_r \times \Omega$.

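Proof. Let $H(x, y) = \int_0^1 h(x, yt)\,dt$, where $x \in \mathbb{R}^N$, $y \in \mathbb{R}^M$. For $h \in C_b^2(\mathbb{R}^{N+M})$, we have $H \in C_b^2(\mathbb{R}^{N+M})$ and $h = H + y(\partial H/\partial y)$; here

$$\frac{\partial H}{\partial y} = \nabla_y H(x, y) = \begin{pmatrix} \dfrac{\partial H_1}{\partial y_1} & \cdots & \dfrac{\partial H_1}{\partial y_M} \\ \vdots & & \vdots \\ \dfrac{\partial H_M}{\partial y_1} & \cdots & \dfrac{\partial H_M}{\partial y_M} \end{pmatrix}.$$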
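As a sanity check of the relation $h = H + y\,(\partial H/\partial y)$, the sketch below evaluates $H(x, y) = \int_0^1 h(x, yt)\,dt$ by Simpson's rule in the scalar case $M = 1$ and compares both sides numerically. The test function `h`, the quadrature size and the finite-difference step are assumptions made for this illustration only.

```python
import math


def h(x, y):
    # Hypothetical smooth bounded observation function (not from the paper).
    return math.tanh(x) * math.cos(y)


def H(x, y, n=200):
    """Composite Simpson approximation of int_0^1 h(x, y*t) dt (n must be even)."""
    step = 1.0 / n
    total = h(x, 0.0) + h(x, y)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight * h(x, y * i * step)
    return total * step / 3.0


def dH_dy(x, y, eps=1e-5):
    """Central finite difference for the y-derivative of H."""
    return (H(x, y + eps) - H(x, y - eps)) / (2.0 * eps)


def residual(x, y):
    """H + y * dH/dy - h, which should vanish up to discretization error."""
    return H(x, y) + y * dH_dy(x, y) - h(x, y)
```

For this choice of `h`, `residual(x, y)` is of the order of the quadrature and finite-difference errors at any moderate `(x, y)`.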

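We have

$$\int_0^t h(X_s, Y_s)\,dY_s = \int_0^t \Big(H(X_s, Y_s) + Y_s\,\frac{\partial H(X_s, Y_s)}{\partial y}\Big)\,dY_s$$

$$= H(X_t, Y_t)\,Y_t - \int_0^t Y_s L_x(s) H(X_s, Y_s)\,ds - \int_0^t Y_s \nabla_x H(X_s, Y_s)\,\sigma(X_s, Y_s)\,dW_s$$

$$\quad - \tfrac12 \int_0^t Y_s \Delta_y H(X_s, Y_s)\,ds - \int_0^t \mathrm{div}_y H(X_s, Y_s)\,ds,$$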


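where

$$L_x(s) = \tfrac12 \sum_{i,j=1}^{N} a_{ij}(x, Y_s)\,\frac{\partial^2}{\partial x_i\,\partial x_j} + \sum_{i=1}^{N} b_i(x, Y_s, U_s)\,\frac{\partial}{\partial x_i},$$

with $a = \sigma\sigma^T$, $\Delta_y = \partial^2/\partial y_1^2 + \cdots + \partial^2/\partial y_M^2$, and $\mathrm{div}_y H = \partial H_1/\partial y_1 + \cdots + \partial H_M/\partial y_M$. Set

$$e(s, x) = -\tfrac12 |h(x, Y_s)|^2 + \tfrac12 (aY_s\nabla_x H, Y_s\nabla_x H) - Y_s L_x(s) H - \tfrac12 Y_s \Delta_y H - \mathrm{div}_y H$$

and

$$Z_t = \exp\Big(-\int_0^t Y_s \nabla_x H(X_s, Y_s)\,\sigma(X_s, Y_s)\,dW_s - \tfrac12 \int_0^t \big(aY_s\nabla_x H(X_s, Y_s),\ Y_s\nabla_x H(X_s, Y_s)\big)\,ds\Big).$$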

Define $\tilde P^{Y,U}$ by $d\tilde P^{Y,U}/dP^{Y,U} = Z_T$ and the unnormalized conditional distribution $\Lambda^{Y,U}_{t,y_0,\mu}$ by

$$\langle f, \Lambda^{Y,U}_{t,y_0,\mu}\rangle = \tilde E^{Y,U}\Big(f(X_t)\,\exp\big(Y_t H(X_t, Y_t)\big)\,\exp\int_0^t e(s, X_s)\,ds\Big)$$

for all $f \in C_b(\mathbb{R}^N)$. $P^{Y,U}$ is the probability law of $(W, X)$ given $(Y, U)$ defined in [4]. Now, Lemma 1 can be verified with $Y_t h(x)$ replaced by $Y_t H(x, Y_t)$, following the proof of Lemma 3.2 in [4].

Lemma 2. For each $\Phi \in C_k(\mathbb{R}^M \times \mathcal{M})$, there exists a sequence $\Phi_n \in \mathcal{D}$ such that $\|\Phi_n\|_k \le \|\Phi\|_k + 1/n$ and $\Phi_n \to \Phi$ in metric $d$.

Remark. The proof of Lemma 2.2 in [3] remains valid, mutatis mutandis in general (see Zhang [6]); the proof of Lemma 2 will be omitted here.

Theorem 1. $\min_{\pi \in \mathcal{A}} J(t, y_0, \mu, \pi, \Phi)$ is continuous on $\mathbb{R}^M \times \mathcal{M}_r$ for each $t \ge 0$ and $\Phi \in C(\mathbb{R}^M \times \mathcal{M})$.

Proof. Theorem 1 follows from Lemma 1 and the proof of Theorem 3.1 in [3].

Proposition. If $h$ belongs to the closure of $C_b^2(\mathbb{R}^{N+M})$ in sup norm, Theorem 1 also holds. In particular, for $h$ bounded and uniformly continuous, the above claim applies.

Proof. Let $h_k \in C_b^2(\mathbb{R}^{N+M})$ with $\sup_{x,y} |h_k(x, y) - h(x, y)| \to 0$ as $k \to \infty$. For $h = h_k(x, y)$, (1) becomes

$$dX_t = b(X_t, Y_t, U_t)\,dt + \sigma(X_t, Y_t)\,dW_t,$$

$$dY_t = h_k(X_t, Y_t)\,dt + d\tilde W_t^k, \quad P^k\text{-a.s.},$$

where $dP^k/dP^0 = Z_t^k$, $\tilde W_t^k$ is a Wiener process independent of $W_t$, and

$$Z_t^k = \exp\Big(\int_0^t h_k(X_s, Y_s)\,dY_s - \tfrac12 \int_0^t |h_k(X_s, Y_s)|^2\,ds\Big).$$

Under $P^0$, $Y_\cdot - y_0$ is a Wiener process, and the unnormalized conditional distribution $\Lambda_t$ satisfies […], since the dynamics are $dX_t = b(X_t, Y_t, U_t)\,dt + \sigma(X_t, Y_t)\,dW_t$, and $Y_\cdot - y_0$ is a Wiener process. We claim that for fixed $\varepsilon > 0$, when $k$ is large, $|J^k(t, y_0, \mu, \pi) - J(t, y_0, \mu, \pi)| < \varepsilon$ for all $t \ge 0$, $(y_0, \mu, \pi) \in \mathbb{R}^M \times \mathcal{M}_r \times \mathcal{A}$.


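Via the definition of $J$, it suffices to show $E^0 |Z_t^k - Z_t|^2 \to 0$ as $k \to \infty$. We have

$$Z_t^k - Z_t = \exp\Big(\int_0^t h_s\,dY_s - \tfrac12 \int_0^t |h_s|^2\,ds\Big)\big(\exp(\theta_t + \delta_t) - 1\big),$$

where $\theta_t = \int_0^t (h_k - h)\,dY_s$, $\delta_t = -\tfrac12 \int_0^t (|h_k|^2 - |h|^2)\,ds$, and $|\delta_t| < \tfrac12 \varepsilon$ when $k$ is large. Then,

$$\big(E^0 |Z_t^k - Z_t|^2\big)^2 = \Big\{E^0\Big[\exp\Big(2\int_0^t h_s\,dY_s - \int_0^t |h_s|^2\,ds\Big)\big(\exp(\theta_t + \delta_t) - 1\big)^2\Big]\Big\}^2$$

$$\le \Big(E^0 \exp\Big(4\int_0^t h_s\,dY_s - 2\int_0^t |h_s|^2\,ds\Big)\Big)\Big(E^0 \big(\exp(\theta_t + \delta_t) - 1\big)^4\Big)$$

$$\le C_h\,E^0 \big(\exp(\theta_t + \delta_t) - 1\big)^4,$$

where $C_h = e^{6M_h^2 T}$, $M_h = \sup |h_s|$. Now

$$E^0 \big(\exp(\theta_t + \delta_t) - 1\big)^4 = E^0 \big(e^{\delta_t}(e^{\theta_t} - 1) + (e^{\delta_t} - 1)\big)^4$$

$$\le 8C\,E^0 (e^{\theta_t} - 1)^4 + 8E^0 (e^{\delta_t} - 1)^4 \le 8C\,E^0 (e^{\theta_t} - 1)^4 + 8\varepsilon^2,$$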

while

$$(e^{\theta_t} - 1)^4 \le \sum_{n \ge 2} \frac{4^n}{n!}\,\theta_t^n.$$

Applying Ito's differential rule to $\theta_t^{2l} = \big(\int_0^t (h_k - h)\,dY_s\big)^{2l}$ one has, by $|h_k - h| < \varepsilon$,

$$E^0 \theta_t^{2l} \le [l(2l - 1)]^l\,t^{l-1} \int_0^t E^0 |h_k - h|^{2l}\,ds \le [l(2l - 1)]^l\,T^l \varepsilon^{2l},$$

E°Ot 21+ I <_< ( E°O, 2 )I/2121(41- I)] lTle21.
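The even-moment estimate can be sanity-checked in the simplest situation, where the integrand $h_k - h$ is replaced by a constant of size $\varepsilon$ (an assumption made only for this illustration). Then $\theta_t$ is Gaussian with variance $\varepsilon^2 t$ under $P^0$, so $E^0\theta_t^{2l} = (2l-1)!!\,(\varepsilon^2 t)^l$ exactly, and this can be compared with the bound $[l(2l-1)]^l\,t^l\varepsilon^{2l}$. The function names below are hypothetical.

```python
def double_factorial_odd(l):
    """Return (2l-1)!! = 1 * 3 * 5 * ... * (2l-1)."""
    out = 1
    for j in range(1, 2 * l, 2):
        out *= j
    return out


def gaussian_even_moment(l, eps, t):
    """Exact 2l-th moment of a centered Gaussian with variance eps^2 * t."""
    return double_factorial_odd(l) * (eps ** 2 * t) ** l


def moment_bound(l, eps, t):
    """Right-hand side [l(2l-1)]^l * t^l * eps^(2l) of the even-moment estimate."""
    return (l * (2 * l - 1)) ** l * t ** l * eps ** (2 * l)
```

In this Gaussian case the bound holds with equality for $l = 1$ and becomes increasingly generous as $l$ grows, since $(2l-1)!!$ is a product of $l$ factors, each at most $2l-1$.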

Therefore

22' [2•(4•- 1)lie 2t. '-< E l> l

By Stirling's formula,

2 2/ t < (32 e2Te) t < O t

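for $32e^2T\varepsilon < \rho < 1$, when $\varepsilon \ll 1$. Thus, for every $\varepsilon > 0$ and large $k$, $|\min_{\pi \in \mathcal{A}} J^k(t, y_0, \mu, \pi) - \min_{\pi \in \mathcal{A}} J(t, y_0, \mu, \pi)| < \varepsilon$ for all $t \ge 0$, $(y_0, \mu) \in \mathbb{R}^M \times \mathcal{M}_r$. That is, $\min_{\pi \in \mathcal{A}} J^k \to \min_{\pi \in \mathcal{A}} J$ uniformly on $\mathbb{R}^M \times \mathcal{M}_r$. Via the conclusion of Theorem 1, $J^k(t, y_0, \mu) = \min_{\pi \in \mathcal{A}} J^k(t, y_0, \mu, \pi)$ depends continuously on $y_0$ and $\mu$ for each $k$. Hence, $J(t, y_0, \mu) = \min_{\pi \in \mathcal{A}} J(t, y_0, \mu, \pi)$ is continuous on $\mathbb{R}^M \times \mathcal{M}_r$.

For $h$ bounded and uniformly continuous, take $\rho(x) = (1/2\pi)^{N/2} e^{-|x|^2/2}$, $\rho_\delta(x) = \delta^{-N}\rho(x/\delta)$, and $h_\delta = h * \rho_\delta$; then $h_\delta \in C_b^\infty$, and $h_\delta \to h$ uniformly as $\delta \to 0$. In fact, $h_\delta(x) = \int_{\mathbb{R}^N} h(\xi)\,\rho_\delta(x - \xi)\,d\xi \in C_b^\infty$, and by choosing a compact set $Q$ such that $\int_{\mathbb{R}^N - Q} \rho < \varepsilon$, we get

$$\sup_x |h_\delta(x) - h(x)| \le \sup_{x \in \mathbb{R}^N,\ y \in Q} |h(x - \delta y) - h(x)| \int_Q \rho + 2\|h\|_\infty\,\varepsilon.$$

This proves the proposition.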
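The uniform convergence of the Gaussian mollification can be observed numerically. The sketch below works in one dimension with the hypothetical test function $h(x) = \min(|x|, 1)$, which is bounded and uniformly continuous but not differentiable, and approximates $h_\delta(x) = \int h(x - \delta y)\rho(y)\,dy$ by the trapezoidal rule; the truncation interval and grid sizes are assumptions of this example.

```python
import math


def h(x):
    # Bounded, uniformly continuous, with kinks at 0 and +-1 (illustrative choice).
    return min(abs(x), 1.0)


def rho(y):
    # Standard Gaussian density in dimension one.
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)


def mollify(x, delta, y_max=8.0, n=1600):
    """Trapezoidal approximation of h_delta(x) = int h(x - delta*y) rho(y) dy."""
    step = 2.0 * y_max / n
    total = 0.0
    for i in range(n + 1):
        y = -y_max + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * h(x - delta * y) * rho(y)
    return total * step


def sup_error(delta, x_max=3.0, m=60):
    """Maximum of |h_delta - h| over a uniform grid on [-x_max, x_max]."""
    xs = [-x_max + 2.0 * x_max * j / m for j in range(m + 1)]
    return max(abs(mollify(x, delta) - h(x)) for x in xs)
```

Because this `h` is Lipschitz with constant 1, the error is at most $\delta\,E|Z|$ for a standard Gaussian $Z$, so `sup_error(delta)` shrinks roughly linearly in `delta`.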

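Theorem 2. The Zakai equation holds. That is, for every $f \in C_b^2(\mathbb{R}^N)$,

$$d\langle f, \Lambda_t\rangle = \langle L_t f, \Lambda_t\rangle\,dt + \langle hf, \Lambda_t\rangle\,dY_t,$$

where $\Lambda_t = \Lambda^{Y,U}_{t,y_0,\mu}$, $L_t = \tfrac12 \sum_{i,j=1}^{N} a_{ij}(x, Y_t)\,\partial^2/\partial x_i\,\partial x_j + \sum_{i=1}^{N} b_i(x, Y_t, U_t)\,\partial/\partial x_i$, and $h = h(x, Y_t)$.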


Remark. Even in the case $dY_t = h(X_t, Y_t, U_t)\,dt + d\tilde W_t$, the Zakai equation still holds (see Davis and Marcus [2] and Zhang [6]).

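Theorem 3. Under the assumptions (A1)-(A4), let $\mathcal{T}_t\Phi(y_0, \mu) = \min_{\pi \in \mathcal{A}} J(t, y_0, \mu, \pi, \Phi)$; then for every $\Phi \in C(\mathbb{R}^M \times \mathcal{M})$, $s \ge 0$, $t \ge 0$,

$$\mathcal{T}_{s+t}\Phi = \mathcal{T}_s(\mathcal{T}_t\Phi).$$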

Proof of Theorem 2. One way to justify it is to utilize the argument in Davis and Marcus [2], with $h_s = h(X_s, Y_s, U_s)$; the other way is to use the methods in Fleming and Pardoux [4]. With the same notation as in [4], for $\eta(t, x) = f(x)\exp(Y_t H(x, Y_t))$, $\langle f, \Lambda_t\rangle = \langle \eta(t), \tilde\Lambda_t\rangle$, and

$$d\eta(t) = \eta(t)\,h(\cdot, Y_t)\,dY_t + \tfrac12 \big(\eta(t)\,\mathrm{div}_y h(\cdot, Y_t) + \eta(t)\,|h(\cdot, Y_t)|^2\big)\,dt.$$

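We have, by using Ito's differential rule,

$$\langle f, \Lambda_t\rangle - \langle f, \Lambda_0\rangle = \int_0^t \big\langle \tfrac12 f(x)\,\mathrm{div}_y h(x, Y_s) + L_x(s) f(x) - \tfrac12 f(x)\,Y_s \Delta_y H(x, Y_s) - f(x)\,\mathrm{div}_y H(x, Y_s),\ \Lambda_s\big\rangle\,ds + \int_0^t \langle h(x, Y_s) f(x), \Lambda_s\rangle\,dY_s.$$

By direct computation,

$$\tfrac12\,\mathrm{div}_y h(x, y) - \tfrac12\,y \Delta_y H(x, y) - \mathrm{div}_y H(x, y) = 0$$

for all $x \in \mathbb{R}^N$, $y \in \mathbb{R}^M$, which gives

$$\langle f, \Lambda_t\rangle = \langle f, \Lambda_0\rangle + \int_0^t \langle L_x(s) f(x), \Lambda_s\rangle\,ds + \int_0^t \langle h(x, Y_s) f(x), \Lambda_s\rangle\,dY_s.$$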
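The cancellation used in the proof of Theorem 2 can be checked numerically in the scalar case $M = 1$, where it reads $\tfrac12\,\partial_y h - \tfrac12\,y\,\partial_y^2 H - \partial_y H = 0$ and follows by differentiating $h = H + y\,\partial_y H$ once in $y$. The sketch below uses the hypothetical choice $h(x, y) = \tanh(x)\cos(y)$, for which $H(x, y) = \int_0^1 h(x, yt)\,dt = \tanh(x)\sin(y)/y$ in closed form, and central finite differences whose step size is an assumption of this example.

```python
import math


def h(x, y):
    # Hypothetical smooth bounded observation function (not from the paper).
    return math.tanh(x) * math.cos(y)


def H(x, y):
    # Closed form of int_0^1 h(x, y*t) dt for the h above.
    sinc = math.sin(y) / y if y != 0.0 else 1.0
    return math.tanh(x) * sinc


def d1(f, x, y, eps=1e-4):
    """Central difference for the first y-derivative of f."""
    return (f(x, y + eps) - f(x, y - eps)) / (2.0 * eps)


def d2(f, x, y, eps=1e-4):
    """Central difference for the second y-derivative of f."""
    return (f(x, y + eps) - 2.0 * f(x, y) + f(x, y - eps)) / eps ** 2


def correction_residual(x, y):
    """(1/2) h_y - (1/2) y H_yy - H_y, which should vanish up to rounding."""
    return 0.5 * d1(h, x, y) - 0.5 * y * d2(H, x, y) - d1(H, x, y)
```

The residual is limited only by finite-difference and rounding errors, which is consistent with the drift terms other than $L_x(s)f$ cancelling in the Zakai equation.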

Proof of Theorem 3. The proof is a modification of the proof of Theorem 4.1 in [3]; we only give an outline, indicating the changes needed. First of all, we modify Lemma 4.1 in [3] as follows. Under the assumptions that $\sigma$, $b^0$, $b^1$, $h$ are of class $C_b^\infty$, $Y \in C^1([0, \infty); \mathbb{R}^M)$, $U \in C([0, \infty), \mathcal{U})$, and the density $p_0$ of $\mu$ belongs to $C_c^\infty(\mathbb{R}^N)$, we claim that $\Lambda_t$ has a density $q \in C_e^{1,2}$ satisfying the partial differential equation (2) below. Here $C_e^{1,2}$ denotes the subset of $C^{1,2}$ such that for every $T > 0$ there exist $C, k > 0$ with $|r(t, x)| \le C\exp(-k|x|^2)$, $0 \le t \le T$, for $r$ being any of the partial derivatives of functions in $C^{1,2}$.

$$\frac{dq}{dt} = L^* q - \tfrac12 q\,|h_t|^2 - \tfrac12 q\,Y_t \Delta_y H_t - q\,\mathrm{div}_y H_t + h_t\,q\,\dot Y_t, \qquad q(0) = p_0, \tag{2}$$

where $h_t = h(\cdot, Y_t)$, $H_t = H(\cdot, Y_t)$, and $p_0$ is the density of $\mu$. As a matter of fact, we have

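$$\tilde L_t\big(f e^{Y_t H_t}\big) = e^{Y_t H_t}\big[L_x(t) f + f\,Y_t L_x(t) H_t - \tfrac12 f\,(aY_t\nabla_x H_t, Y_t\nabla_x H_t)\big].$$

Since

$$e(t) = -\tfrac12 |h_t|^2 + \tfrac12 (aY_t\nabla_x H_t, Y_t\nabla_x H_t) - Y_t L_x(t) H_t - \tfrac12 Y_t \Delta_y H_t - \mathrm{div}_y H_t,$$

we obtain

$$\tilde L_t\big(f e^{Y_t H_t}\big) = e^{Y_t H_t}\big[L_x(t) f - \tfrac12 f\,|h_t|^2 - \tfrac12 f\,Y_t \Delta_y H_t - f\,\mathrm{div}_y H_t - f\,e(t)\big].$$

By comparing $dq(t)/dt = (d/dt)\big(p(t, x)\exp(Y_t H_t)\big)$, where $p$ is defined as $p = q\exp(-Y_t H_t)$, we have

$$\frac{dq(t)}{dt} = (\tilde L_t^* p)\exp(Y_t H_t) + e(t)\,q(t) + h_t\,q(t)\,\dot Y_t,$$

which implies (2).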


The converse is also true, i.e. if $q \in C_e^{1,2}$ is a solution to (2), with $q(0)$ the density of $\mu$, then $q(t)$ is the density of $\Lambda_t$ for all $t \ge 0$. For $s \ge 0$, let $Y_t^s = Y_{s+t}$, $U_t^s = U_{s+t}$; then $(Y^s, U^s) \in \Omega$. We then continue much as in Fleming [3, pp. 255-256]: $\Lambda^{Y,U}_{s+t,y_0,\mu} = \Lambda^{Y^s,U^s}_{t,Y_s,\Lambda_s}$, where $\Lambda_s = \Lambda^{Y,U}_{s,y_0,\mu}$, and

$$J(s + t, y_0, \mu, \pi, \Phi) = \int J\big(t, Y_s, \Lambda_s, \pi_s^{Y,U}, \Phi\big)\,d\pi_s,$$

with $\pi_s$ the restriction to $\mathcal{G}_s$ of $\pi \in \mathcal{A}$ and $\pi_s^{Y,U}$ a regular conditional distribution for $(Y^s, U^s)$ given $\mathcal{G}_s$.

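Thus, $\mathcal{T}_{s+t}\Phi(y_0, \mu) \ge \mathcal{T}_s(\mathcal{T}_t\Phi)(y_0, \mu)$. To show the opposite inequality, we take $A_0 = \mathbb{R}^M \times \mathcal{M} - \mathbb{R}^M_\lambda \times \mathcal{M}_\rho$, where $\mathbb{R}^M_\lambda = \{y: |y| \le \lambda\}$. Using a similar argument as in Lemmas 3.5 and 3.6 of [3], with modified dependence on $y$, we have that $J(t, y_0, \mu, \pi, \Phi)$ is continuous on $\mathbb{R}^M \times \mathcal{M}_r \times \mathcal{A}$ for $t \ge 0$. Therefore, one can always choose disjoint Borel subsets $A_1, \ldots, A_m$ of $\mathbb{R}^M_\lambda \times \mathcal{M}_\rho$ with $\mathbb{R}^M_\lambda \times \mathcal{M}_\rho = A_1 \cup \cdots \cup A_m$, such that for $(y, \mu), (\bar y, \bar\mu) \in A_i$, $i = 1, \ldots, m$,

$$\sup_{\pi \in \mathcal{A}} |J(t, y, \mu, \pi, \Phi) - J(t, \bar y, \bar\mu, \pi, \Phi)| < \delta.$$

Take arbitrary $\pi_0 \in \mathcal{A}$, and $(y_i, \mu_i) \in A_i$, $\pi_i \in \mathcal{A}$, $i = 1, \ldots, m$, such that

$$J(t, y_i, \mu_i, \pi_i, \Phi) < \mathcal{T}_t\Phi(y_i, \mu_i) + \delta.$$

Then for all $(y, \mu) \in A_i$,

$$J(t, y, \mu, \pi_i, \Phi) < \mathcal{T}_t\Phi(y, \mu) + 3\delta,$$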

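$$J(s + t, y, \mu, \pi, \Phi) = \int J\big(t, Y_s, \Lambda_s, \pi_s^{Y,U}, \Phi\big)\,d\pi_s.$$

Since $A_0 \subset \{\|\Lambda_s\| > \rho\} \cup \big(\{|Y_s| \ge \lambda\} \cap \{\|\Lambda_s\| \le \rho\}\big)$,

$$\int_{A_0} (1 + \|\Lambda_s\|^k)\,d\pi_s \le \int_{\{\|\Lambda_s\| > \rho\}} (1 + \|\Lambda_s\|^k)\,d\pi_s + \int_{\{|Y_s| > \lambda\} \cap \{\|\Lambda_s\| \le \rho\}} (1 + \|\Lambda_s\|^k)\,d\pi_s$$

$$\le \frac{C}{\rho^k}(1 + r^{2k}) + (1 + \rho^k)\,\pi_s\{|Y_s| > \lambda\}.$$

Therefore, given $\varepsilon > 0$, we can choose $\rho$, $\lambda$ large enough and $\delta$ small enough that […] for all $y_0 \in \mathbb{R}^M$, $\mu \in \mathcal{M}_r$, and $\pi_0 \in \mathcal{A}$. Thus, $\mathcal{T}_{s+t}\Phi(y_0, \mu) \le \mathcal{T}_s(\mathcal{T}_t\Phi)(y_0, \mu) + \varepsilon$.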

Acknowledgement

I wish to thank Professor Wendell H. Fleming for his encouragement and many valuable suggestions.

References

[1] P. Billingsley, Convergence of Probability Measures (John Wiley & Sons, New York, 1968).
[2] M.H. Davis and S.I. Marcus, An introduction to nonlinear filtering, in: M. Hazewinkel and J.C. Willems, Eds., Stochastic Systems: The Mathematics of Filtering and Identification and Applications, NATO Advanced Study Institute Series (Reidel, Dordrecht, 1980).
[3] W.H. Fleming, Nonlinear semigroup for controlled partially observed diffusions, SIAM J. Control Optim. 20 (1982) 286-301.
[4] W.H. Fleming and E. Pardoux, Optimal control for partially observed diffusions, SIAM J. Control Optim. 20 (1982) 261-285.
[5] R.S. Liptser and A.N. Shiryayev, The Statistics of Random Processes (Springer, Berlin, 1977).
[6] Q. Zhang, Controlled partially observed diffusions with correlated noise, Preprint (Sept. 1987).