Atlanta University Center, DigitalCommons@Robert W. Woodruff Library
ETD Collection for AUC Robert W. Woodruff Library
7-1-1975

Recommended Citation: Smith, Annie Ruth, "Optimization in normed linear spaces" (1975). ETD Collection for AUC Robert W. Woodruff Library. Paper 2075.
OPTIMIZATION
IN
NORMED LINEAR SPACES
A THESIS
SUBMITTED TO THE FACULTY OF ATLANTA UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR
THE DEGREE OF MASTER OF SCIENCE
BY
ANNIE RUTH SMITH
DEPARTMENT OF MATHEMATICS
ATLANTA, GEORGIA
JULY 1975
TABLE OF CONTENTS

LIST OF FIGURES

Chapter
I. PRELIMINARIES
II. OPTIMUM NORMS IN HILBERT SPACES
   The Projection Theorem
   Minimization by Orthonormal Sets
   Minimization by Infinite-Dimensional Subspaces
   Minimization by Convex Sets
III. OPTIMUM NORMS IN NORMED LINEAR SPACES
   Minimization by Dual Spaces
   Minimization Involving Linear Operators
IV. DIFFERENTIATION IN NORMED LINEAR SPACES
   Gateaux and Frechet Differentials
   Frechet Derivatives
V. OPTIMIZATION BY ITERATIVE METHODS
   Methods for Solving Nonlinear Equations
   Descent Methods
   Conjugate Direction Methods

BIBLIOGRAPHY
LIST OF FIGURES

Figure
1. Successive Approximation Process
2. Technique of Newton's Method
3. Descent Process
CHAPTER I
PRELIMINARIES
Before beginning our discussion on optimization, we
will devote this chapter to presenting some necessary
preliminary material. Assuming that we are already familiar
with linear space theory and elementary functional analysis,
we will state only certain basic concepts which will be
directly related to the development of our topics. The
proofs in most cases are omitted. The reader is advised to
refer to Luenberger's Optimization by Vector Space Methods.
Lemma 1.1.1. (Cauchy-Schwarz Inequality) For all x1, x2 in an inner product space X, |(x1,x2)| <= ||x1|| ||x2||. Equality holds if and only if x1 = λx2 for some scalar λ or x2 = 0.

Lemma 1.1.2. (x1,x2) = 0 for all x2 ∈ X implies x1 = 0.

Lemma 1.1.3. (Parallelogram Law) For all x1, x2 ∈ X,

||x1 + x2||² + ||x1 - x2||² = 2||x1||² + 2||x2||².

Proposition 1.1.4. Let Z be a pre-Hilbert space. The function ||z|| = √(z,z) is a norm for z ∈ Z.

Lemma 1.1.5. (Continuity of the Inner Product) If zn → z and vn → v in Z, then (zn,vn) → (z,v).

Definition 1.1.6. z1 and z2 are orthogonal (symbolized z1 ⊥ z2) if (z1,z2) = 0, for z1, z2 ∈ Z. z1 is orthogonal to a set V if z1 ⊥ v for each v ∈ V.

Lemma 1.1.7. (Pythagorean Theorem) If z1 ⊥ z2, then ||z1 + z2||² = ||z1||² + ||z2||² for z1, z2 ∈ Z.

Definition 1.1.8. A sequence {hn} in a Hilbert space H is Cauchy if ||hn - hm|| → 0 as n, m → ∞.
CHAPTER II
OPTIMUM NORMS IN HILBERT SPACES
2.1. The Projection Theorem
The solution to our first minimum norm problem is
characterized by the projection theorem. Consequently, the
problem will be solved by proving two different versions of
the projection theorem.
Theorem 2.1.1. Let z be a vector in a pre-Hilbert space Z and M a subspace of Z. If there is an m0 ∈ M which minimizes ||z - m||, that is, ||z - m0|| <= ||z - m|| for all m ∈ M, then m0 is unique. Moreover, m0 is the minimizing vector if and only if (z - m0) ⊥ M.

Proof. Suppose there is an m ∈ M which is not orthogonal to z - m0; we may take ||m|| = 1 and (z - m0, m) = δ ≠ 0. Let m1 = m0 + δm ∈ M. Then

||z - m1||² = ||z - m0 - δm||²
= ||z - m0||² - (z - m0, δm) - (δm, z - m0) + |δ|² ||m||²
= ||z - m0||² - 2|δ|² + |δ|²
= ||z - m0||² - |δ|² < ||z - m0||².

Hence m0 is not a minimizing vector; it follows that if m0 minimizes, then (z - m0) ⊥ M.

Conversely, suppose (z - m0) ⊥ M and take any m ∈ M. By the Pythagorean theorem,

||z - m||² = ||z - m0 + m0 - m||² = ||z - m0||² + ||m0 - m||².

Therefore ||z - m0|| < ||z - m|| for m ≠ m0, and m0 is unique.
Theorem 2.1.2. Given a Hilbert space H and a closed subspace M, for any h ∈ H there is a unique m0 ∈ M which minimizes ||h - m||, that is, ||h - m0|| <= ||h - m|| for all m ∈ M. Moreover, m0 is the minimizing vector if and only if (h - m0) ⊥ M.

Proof. By theorem 2.1.1, a minimizing m0 ∈ M is unique and satisfies (h - m0) ⊥ M. Thus we need only show that m0 exists.

If h ∈ M, then m0 = h and m0 obviously exists. So suppose h ∉ M. Let δ = inf over m ∈ M of ||h - m||. We find an m0 ∈ M such that ||h - m0|| = δ by taking a sequence of vectors {m_i} in M such that ||h - m_i|| → δ. By the parallelogram law,

||(m_j - h) + (h - m_i)||² + ||(m_j - h) - (h - m_i)||² = 2||m_j - h||² + 2||h - m_i||².

Rearranging,

||m_j - m_i||² = 2||m_j - h||² + 2||h - m_i||² - 4||h - (m_i + m_j)/2||².

Since M is a subspace, (m_i + m_j)/2 ∈ M for all i, j. Thus, by the definition of δ, ||h - (m_i + m_j)/2||² >= δ² and

||m_j - m_i||² <= 2||m_j - h||² + 2||h - m_i||² - 4δ².

Since ||h - m_i||² → δ² as i → ∞, it follows that ||m_j - m_i||² → 0 as i, j → ∞. Therefore {m_i} is Cauchy. Since H is complete and M is closed, {m_i} has a limit m0 ∈ M. Hence ||h - m0|| = δ and m0 exists.
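The projection theorem can be checked concretely in the finite-dimensional Hilbert space R³. The sketch below (illustrative data, not from the thesis) computes the projection m0 of a point h onto a two-dimensional subspace M via an orthonormal basis, then verifies the orthogonality condition (h - m0) ⊥ M and the minimality of ||h - m0||:

```python
# Projection of h onto M = span{u1, u2} in R^3, where {u1, u2} is an
# orthonormal basis (illustrative data).
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return dot(a, a) ** 0.5

u1 = (1.0, 0.0, 0.0)
u2 = (0.0, 0.6, 0.8)          # unit vector orthogonal to u1
h = (2.0, 1.0, 3.0)

# m0 = (h,u1)u1 + (h,u2)u2 is the candidate minimizer.
c1, c2 = dot(h, u1), dot(h, u2)
m0 = tuple(c1 * a + c2 * b for a, b in zip(u1, u2))

# (h - m0) must be orthogonal to M (theorem 2.1.2).
residual = tuple(x - y for x, y in zip(h, m0))
assert abs(dot(residual, u1)) < 1e-12
assert abs(dot(residual, u2)) < 1e-12

# ||h - m0|| <= ||h - m|| for sampled m in M.
import random
random.seed(0)
for _ in range(100):
    a, b = random.uniform(-5, 5), random.uniform(-5, 5)
    m = tuple(a * p + b * q for p, q in zip(u1, u2))
    assert norm(residual) <= norm(tuple(x - y for x, y in zip(h, m))) + 1e-12
```

The random trial vectors only sample M; it is the orthogonality condition that guarantees minimality over all of M.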
2.2. Minimization by Orthonormal Sets
Definition 2.2.1. A set of vectors y1, y2, ..., yn in a linear space Y is linearly dependent if there exist scalars α1, α2, ..., αn, not all zero, such that α1y1 + α2y2 + ... + αnyn = 0. If α1y1 + α2y2 + ... + αnyn = 0 implies α1 = α2 = ... = αn = 0, then the set is linearly independent.

Definition 2.2.2. A set of vectors V in a Hilbert space H is said to be an orthogonal set if v1 ⊥ v2 for each v1, v2 ∈ V, v1 ≠ v2. V is orthonormal if, in addition, ||v|| = 1 for each v ∈ V.

Definition 2.2.3. Let y1, y2, ..., yn be in a Hilbert space H. g(y1, y2, ..., yn) is the determinant of the Gram matrix of y1, y2, ..., yn, whose (i,j) entry is (yj,yi), and is called the Gram determinant.
Theorem 2.2.4. Let y1, y2, ..., yn be linearly independent vectors generating a subspace M of a Hilbert space H. Given an arbitrary h ∈ H, there is a unique h0 ∈ M which minimizes ||h - h0||. If h0 is written in terms of the basis as h0 = α1y1 + α2y2 + ... + αnyn, then (h - h0) ⊥ (y1, y2, ..., yn), so the αi satisfy the normal equations

α1(y1,yi) + α2(y2,yi) + ... + αn(yn,yi) = (h,yi),  i = 1, 2, ..., n,

and the minimum distance δ = ||h - h0|| satisfies

δ² = g(y1, y2, ..., yn, h) / g(y1, y2, ..., yn).

Proof. By definition,

δ² = ||h - h0||² = (h - h0, h) - (h - h0, h0).

By theorem 2.1.1, (h - h0) ⊥ M, so that (h - h0, h0) = 0. Thus

δ² = (h - h0, h) = (h,h) - α1(y1,h) - α2(y2,h) - ... - αn(yn,h).

This equation and the normal equations yield n+1 linear equations for the n+1 unknowns α1, α2, ..., αn, δ². Applying Cramer's rule to this system and recognizing the resulting determinants as bordered Gram determinants gives

δ² = g(y1, y2, ..., yn, h) / g(y1, y2, ..., yn).

David Luenberger, Optimization by Vector Space Methods (New York, 1969), p. 56.
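For a concrete instance of theorem 2.2.4 in R³ (the vectors below are illustrative), the normal equations can be solved by Cramer's rule and the resulting δ² compared against the Gram-determinant quotient:

```python
# Theorem 2.2.4 in R^3 for y1, y2 spanning M and a point h (illustrative
# data). Solve the 2x2 normal equations by Cramer's rule and check
# delta^2 = g(y1, y2, h) / g(y1, y2).
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

y1, y2, h = (1.0, 1.0, 0.0), (0.0, 1.0, 1.0), (1.0, 2.0, 3.0)

# Normal equations: alpha1 (y1,yi) + alpha2 (y2,yi) = (h,yi).
g11, g12, g22 = dot(y1, y1), dot(y1, y2), dot(y2, y2)
b1, b2 = dot(h, y1), dot(h, y2)
det = g11 * g22 - g12 * g12                     # g(y1, y2)
a1 = (b1 * g22 - g12 * b2) / det
a2 = (g11 * b2 - b1 * g12) / det

h0 = tuple(a1 * p + a2 * q for p, q in zip(y1, y2))
delta_sq = dot(h, h) - a1 * b1 - a2 * b2        # delta^2 = (h - h0, h)

# (h - h0) is orthogonal to M.
r = tuple(x - y for x, y in zip(h, h0))
assert abs(dot(r, y1)) < 1e-12 and abs(dot(r, y2)) < 1e-12

# Bordered Gram determinant g(y1, y2, h), expanded by cofactors.
bh = dot(h, h)
g3 = (g11 * (g22 * bh - b2 * b2)
      - g12 * (g12 * bh - b2 * b1)
      + b1 * (g12 * b2 - g22 * b1))
assert abs(delta_sq - g3 / det) < 1e-12
```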
2.3. Minimization by Infinite-Dimensional Subspaces
The approach in theorem 2.2.4 is of limited practical importance, since the subspace M may not be finite dimensional, in which case it is generally impossible to reduce the problem to a finite set of linear equations analogous to the normal equations. We now turn to problems of this type, which involve a modification of the projection theorem applicable to linear varieties.
Definition 2.3.1. Let M0 be a subspace of a linear space Y and let y0 ∈ Y be fixed. Then the set M = y0 + M0 ⊂ Y is called a linear variety; M is a translation of M0.

Theorem 2.3.2. For a closed subspace M0 of a Hilbert space H, a fixed h0 ∈ H, and the linear variety M = h0 + M0, there is a unique h ∈ M of minimum norm, and it satisfies h ⊥ M0.

Proof. Translating M by -h0 gives the closed subspace M0: minimizing ||h0 + m|| over m ∈ M0 is the problem of approximating -h0 by elements of M0. By theorem 2.1.2 a unique minimizing m* exists and (-h0 - m*) ⊥ M0; hence h = h0 + m* is the unique element of M of minimum norm and h ⊥ M0.

Ibid., pp. 56-57.
Definition 2.3.3. Let M be a nonempty subset of a pre-Hilbert space Z. The set {x | (x,m) = 0 for all m ∈ M} is called the orthogonal complement of M, denoted M⊥.
Theorem 2.3.4. Consider the linear variety in a Hilbert space H consisting of all h ∈ H satisfying the equations (h,yi) = ci, i = 1, 2, ..., n, for a linearly independent set {yi} ⊂ H and fixed constants ci. If h0 is the element of minimum norm, then h0 = β1y1 + β2y2 + ... + βnyn, where the βi satisfy the equations

β1(y1,yj) + β2(y2,yj) + ... + βn(yn,yj) = cj,  j = 1, 2, ..., n.

Proof. Let M be the n-dimensional subspace generated by the yi, i = 1, 2, ..., n. The linear variety is a translation of M⊥. Since M⊥ is closed, the existence and uniqueness of an optimal solution follow from theorem 2.3.2, which also gives h0 ⊥ M⊥, and therefore h0 ∈ M⊥⊥. But M is closed, so M⊥⊥ = M. Therefore h0 ∈ M, that is, h0 = β1y1 + ... + βnyn. Choosing the βi so that h0 satisfies the constraints (h0,yj) = cj yields the stated equations, and the proof is complete.
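A small sketch of theorem 2.3.4 in R³, with two illustrative constraint vectors: the minimum-norm point lies in the span of the yi, with coefficients given by the Gram system:

```python
# Theorem 2.3.4 in R^3 (illustrative data): minimize ||h|| subject to
# (h, y1) = c1 and (h, y2) = c2. The minimizer is h0 = b1*y1 + b2*y2,
# where the b's solve the Gram system sum_i b_i (y_i, y_j) = c_j.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

y1, y2 = (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)
c1, c2 = 1.0, 3.0

g11, g12, g22 = dot(y1, y1), dot(y1, y2), dot(y2, y2)
det = g11 * g22 - g12 * g12
b1 = (c1 * g22 - g12 * c2) / det
b2 = (g11 * c2 - c1 * g12) / det
h0 = tuple(b1 * p + b2 * q for p, q in zip(y1, y2))

# h0 satisfies the constraints...
assert abs(dot(h0, y1) - c1) < 1e-12
assert abs(dot(h0, y2) - c2) < 1e-12

# ...and any other feasible point h0 + t*(0,0,1) has larger norm,
# since (0,0,1) spans the directions orthogonal to y1 and y2.
nsq = dot(h0, h0)
for t in (-2.0, -0.5, 0.5, 2.0):
    h = (h0[0], h0[1], h0[2] + t)
    assert dot(h, h) > nsq
```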
2.4. Minimization by Convex Sets
Definition 2.4.1. Given a set K in a linear space Y, if for all k1, k2 ∈ K every point of the form αk1 + (1-α)k2, 0 <= α <= 1, belongs to K, then K is said to be convex.

Proposition 2.4.2. If K, G are convex sets in a linear space, then
(1) αK = {x | x = αk, k ∈ K} is convex for any α;
(2) αK + βG is convex for any α, β.

Proof. (1) Let x1 = αk1, x2 = αk2 with k1, k2 ∈ K. For 0 <= λ <= 1,

λx1 + (1-λ)x2 = α[λk1 + (1-λ)k2] ∈ αK,

since λk1 + (1-λ)k2 ∈ K. Therefore αK is convex.
(2) The proof is similar to that of (1).
Theorem 2.4.3. Given a Hilbert space H, an h ∈ H, and a closed convex set K ⊂ H, there exists a unique k0 ∈ K which minimizes ||h - k||, that is, ||h - k0|| <= ||h - k|| for all k ∈ K. Moreover, k0 is the minimizing vector if and only if (h - k0, k - k0) <= 0 for all k ∈ K.

Proof. (1) We first show that k0 exists. Let δ = inf over k ∈ K of ||h - k||. We find a k0 ∈ K such that ||h - k0|| = δ by taking a sequence of vectors {k_i} in K such that ||h - k_i|| → δ. By the parallelogram law,

||(k_j - h) + (h - k_i)||² + ||(k_j - h) - (h - k_i)||² = 2||k_j - h||² + 2||h - k_i||².

Rearranging,

||k_j - k_i||² = 2||k_j - h||² + 2||h - k_i||² - 4||h - (k_i + k_j)/2||².

Since K is convex, (k_i + k_j)/2 ∈ K for all i, j. Thus, by the definition of δ, ||h - (k_i + k_j)/2|| >= δ and

||k_j - k_i||² <= 2||k_j - h||² + 2||h - k_i||² - 4δ².

Since ||h - k_i||² → δ² as i → ∞, it follows that ||k_j - k_i||² → 0 as i, j → ∞. Hence {k_i} is Cauchy and, K being closed in the complete space H, has a limit k0 ∈ K. Therefore ||h - k0|| = δ and k0 exists.

(2) Next we show that k0 is unique. Let k1 ∈ K be such that ||h - k1|| = δ. Take the sequence {k_n} with k_n = k0 if n is even and k_n = k1 if n is odd, so that ||h - k_n|| → δ. By the same argument as in (1), {k_n} is Cauchy and has a limit in K. This can only be true if k0 = k1; therefore k0 is unique.

(3) We now show that if k0 is the minimizing vector, then (h - k0, k - k0) <= 0 for all k ∈ K. Suppose, to the contrary, that there is a k1 ∈ K with (h - k0, k1 - k0) = ε > 0. Take the vectors kα = (1-α)k0 + αk1, 0 <= α <= 1; each kα ∈ K, since K is convex, and

||h - kα||² = ||(1-α)(h - k0) + α(h - k1)||²

is differentiable with respect to α, with

d/dα ||h - kα||² at α = 0 equal to -2(h - k0, k1 - k0) = -2ε < 0.

Thus ||h - kα|| < ||h - k0|| for some small α > 0, which is a contradiction. Therefore no such k1 exists.

(4) Finally, we show that if (h - k0, k - k0) <= 0 for all k ∈ K, then k0 is the unique minimizing vector. For any k ∈ K,

||h - k||² = ||h - k0 + k0 - k||² = ||h - k0||² + 2(h - k0, k0 - k) + ||k0 - k||² >= ||h - k0||² + ||k0 - k||²,

since (h - k0, k0 - k) = -(h - k0, k - k0) >= 0. Therefore ||h - k|| > ||h - k0|| for k ≠ k0, and k0 is the unique minimizer.
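In R² the projection onto a simple closed convex set can be written down explicitly; the sketch below uses the unit box (an illustrative choice), for which the projection is a componentwise clamp, and checks the characterization (h - k0, k - k0) <= 0 on sample points of K:

```python
# Theorem 2.4.3 in R^2: project h onto the closed convex set
# K = [0,1] x [0,1] (illustrative). The projection is the
# componentwise clamp, and it satisfies (h - k0, k - k0) <= 0
# for every k in K.
def clamp(t, lo=0.0, hi=1.0):
    return max(lo, min(hi, t))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

h = (2.0, -0.5)
k0 = tuple(clamp(t) for t in h)

# Check the variational inequality on a grid of points of K.
import itertools
for k in itertools.product([i / 4 for i in range(5)], repeat=2):
    d = tuple(p - q for p, q in zip(k, k0))
    r = tuple(p - q for p, q in zip(h, k0))
    assert dot(r, d) <= 1e-12
```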
CHAPTER III
OPTIMUM NORMS IN NORMED LINEAR SPACES
3.1. Minimization by Dual Spaces
Definition 3.1.1. Let N be a normed linear space and N* the space consisting of all bounded linear functionals on N. N* is called the dual of N.
Theorem 3.1.2. N* is a Banach space.

Proof. Since N* is a normed linear space, we show only that N* is complete. Let {xn*} be Cauchy in N*. Then ||xn* - xm*|| → 0 as n, m → ∞. For any x ∈ N, {xn*(x)} is a Cauchy sequence of scalars, since |xn*(x) - xm*(x)| <= ||xn* - xm*|| ||x||. Define a functional x* by x*(x) = lim xn*(x) for each x ∈ N. Now

x*(αx + βy) = lim xn*(αx + βy) = α lim xn*(x) + β lim xn*(y) = αx*(x) + βx*(y).

Hence x* is linear.

Since {xn*} is Cauchy, given ε > 0 there exists an M such that

|xn*(x) - xm*(x)| <= ε||x||

for all n, m > M and all x. But xn*(x) → x*(x), so |x*(x) - xm*(x)| <= ε||x|| for m > M. Therefore

|x*(x)| = |x*(x) - xm*(x) + xm*(x)| <= |x*(x) - xm*(x)| + |xm*(x)| <= (ε + ||xm*||) ||x||,

and x* is a bounded linear functional.

Further, |x*(x) - xm*(x)| <= ε||x|| for m > M gives ||x* - xm*|| <= ε, so xm* → x* ∈ N*. Thus N* is complete.

Angus Taylor, Introduction to Functional Analysis (New York, 1958), pp. 33-34, 162-163, 185-186.
We can solve a minimum norm problem by considering two
versions of the Hahn-Banach Extension theorem. The first
parallels the projection theorem and the conclusion is
formulated in a normed linear space as well as its dual.
The second is a geometric approach in which convex sets
are separated with hyperplanes.
Theorem 3.1.3. Let N be a normed linear space and M a subspace of N. For an x ∈ N,

d(x,M) = inf over m ∈ M of ||x - m|| = max {x*(x) : x* ∈ M⊥, ||x*|| <= 1},

where M⊥ denotes the annihilator of M (the bounded linear functionals vanishing on M) and the maximum is achieved by some x0* ∈ M⊥. If the infimum is achieved by some m0 ∈ M, then

x0*(x - m0) = ||x0*|| ||x - m0||.

Proof. Let d = d(x,M). Given ε > 0, let mε ∈ M satisfy ||x - mε|| <= d + ε. Then for any x* ∈ M⊥ with ||x*|| <= 1,

x*(x) = x*(x - mε) <= ||x*|| ||x - mε|| <= d + ε.

Since ε is arbitrary, x*(x) <= d for every such x*.

To show that this bound is attained, let S be the subspace [x + M], whose elements have the form n = αx + m, m ∈ M, α real. Let f be the linear functional on S defined by f(n) = αd. Then

||f|| = sup over n ∈ S, n ≠ 0 of |f(n)|/||n|| = sup |α|d/||αx + m|| = d / inf over m ∈ M of ||x + m|| = d/d = 1.

Form the Hahn-Banach extension x0* of f from S to N. Then ||x0*|| = 1 and x0* = f on S; in particular x0* vanishes on M, so x0* ∈ M⊥, and x0*(x) = d. The first statement is proved.

Finally, suppose there is an m0 ∈ M with ||x - m0|| = d. With x0* ∈ M⊥, ||x0*|| <= 1, and x0*(x) = d as above,

x0*(x - m0) = x0*(x) = d = ||x - m0|| >= ||x0*|| ||x - m0|| >= x0*(x - m0),

so x0*(x - m0) = ||x0*|| ||x - m0||, and the proof is complete.

David Luenberger, Optimization by Vector Space Methods (New York, 1969), pp. 110-113.
Theorem 3.1.4. Let N be a normed linear space and M a subspace of N. Then for an x* ∈ N*,

d(x*,M⊥) = min over m* ∈ M⊥ of ||x* - m*|| = sup {x*(x) : x ∈ M, ||x|| <= 1},

where the minimum is achieved by some m0* ∈ M⊥. If the supremum is achieved by some x0 ∈ M with ||x0|| = 1, then

(x* - m0*)(x0) = ||x* - m0*|| ||x0||.

Proof. For any m* ∈ M⊥,

||x* - m*|| = sup over ||x|| <= 1 of [x*(x) - m*(x)] >= sup over x ∈ M, ||x|| <= 1 of [x*(x) - m*(x)] = sup over x ∈ M, ||x|| <= 1 of x*(x),

since m* vanishes on M.

Now let ||x*||M denote the norm of the functional x* restricted to M, so that ||x*||M = sup {x*(x) : x ∈ M, ||x|| <= 1}. Let y* be a Hahn-Banach extension to the whole space of this restriction. Then ||y*|| = ||x*||M and x* - y* = 0 on M. Set m0* = x* - y*. Then m0* ∈ M⊥ and

||x* - m0*|| = ||y*|| = ||x*||M.

Combined with the inequality above, this proves the first statement, with the minimum achieved at m0*.

If d(x*,M⊥) = x*(x0) for some x0 ∈ M with ||x0|| = 1, then

||x* - m0*|| = x*(x0) = (x* - m0*)(x0),

and hence (x* - m0*)(x0) = ||x* - m0*|| ||x0||.
Definition 3.1.5. Let Y be a linear space and U a linear variety in Y with U ≠ Y. If the only linear varieties V satisfying U ⊂ V are V = U and V = Y, then U is called a hyperplane; that is, a hyperplane is a maximal proper linear variety.

Definition 3.1.6. Let N be a normed linear space and K a convex set in N. The functional h defined on N* by h(x*) = sup over x ∈ K of x*(x) is called the support functional of K.
Theorem 3.1.7. (Minimum Norm Duality) Let N be a normed linear space and K a convex set in N. Let x1 ∈ N with d(x1,K) > 0, and let h be the support functional of K. Then

d(x1,K) = inf over k ∈ K of ||x1 - k|| = max over ||x*|| <= 1 of [x*(x1) - h(x*)],

where the maximum is achieved by some x0* ∈ N*. If the infimum is achieved by some k0 ∈ K, then

-x0*(k0 - x1) = ||x0*|| ||k0 - x1||.

Proof. Let d = d(x1,K). We may translate so that x1 = 0; then d = inf over k ∈ K of ||k|| = max over ||x*|| <= 1 of [-h(x*)], and we need only be concerned with the x* for which h(x*) is negative. If h(x*) is negative, then K lies in the half space {x : x*(x) <= h(x*)}, which does not contain zero, since x*(0) = 0 > h(x*). Thus the hyperplane U = {x : x*(x) = h(x*)} separates K and 0 when h(x*) is negative.

Let S(t) be the sphere of radius t centered at 0. For x* ∈ N* with h(x*) <= 0 and ||x*|| = 1, let t* be the supremum of the ε's for which U separates K and S(ε). Since, for ||x*|| = 1, the distance from 0 to U is -h(x*), we have 0 <= t* = -h(x*) <= d. Thus -h(x*) <= d for every x* ∈ N* with ||x*|| <= 1.

On the other hand, K contains no interior points of S(d). Therefore some hyperplane U separates K and S(d), and for the corresponding x0* ∈ N* with ||x0*|| = 1 we have -h(x0*) = d. The proof of the first statement is complete.

Finally, let k0 ∈ K be such that ||k0|| = d. Since k0 ∈ K, x0*(k0) <= h(x0*) = -d. But -x0*(k0) <= ||x0*|| ||k0|| = d. Consequently -x0*(k0) = ||x0*|| ||k0|| (and in the original variables -x0*(k0 - x1) = ||x0*|| ||k0 - x1||), and the proof is complete.
3.2. Minimization Involving Linear Operators
Definition 3.2.1. Given linear spaces W and Y and a function A with domain D ⊂ W and range R ⊂ Y, A is a linear operator from W into Y if A(β1w1 + β2w2) = β1A(w1) + β2A(w2) for all w1, w2 ∈ W and any scalars β1, β2.

Definition 3.2.2. Let L and N be normed linear spaces. The space consisting of all bounded linear operators from L into N is denoted B(L,N).
If L and N are linear spaces and A is a linear operator from L into N, the equation Al = n, for a given n ∈ N, may (1) have one and only one solution l ∈ L (that is, A⁻¹ exists, and if Al = n then A⁻¹(n) = l), (2) have no solution, in which case an approximate solution can be sought, or (3) have many solutions, from which an optimal one is chosen. Only the latter two cases will be discussed, since they involve choosing an optimal solution.

Angus Taylor, Introduction to Functional Analysis (New York, 1958), pp. 85-86, 163, 213-215.
Theorem 3.2.3. Given Hilbert spaces G and H and A ∈ B(G,H), a vector g ∈ G minimizes ||h - Ag||, h fixed in H, if and only if A*Ag = A*h.

Proof. This is the case in which the equation Ag = h need have no solution; the problem is equivalent to minimizing ||h - ĥ|| over ĥ ∈ R(A) (the range of A). By theorem 2.1.1, ĥ = Ag is a minimizing vector if and only if h - ĥ ∈ [R(A)]⊥. Since [R(A)]⊥ = N(A*) (the nullspace of A*), this holds if and only if h - Ag ∈ N(A*), that is, 0 = A*(h - Ag) = A*h - A*Ag.
Theorem 3.2.4. Let G and H be Hilbert spaces and let A ∈ B(G,H) be such that R(A) is closed in H. Among the solutions of Ag = h, the one of minimum norm is g = A*f, where f is a solution of AA*f = h.

Proof. If g1 satisfies Ag1 = h, the general solution is g = g1 + u, u ∈ N(A). Since N(A) is closed, there is a unique g of minimum norm satisfying Ag = h, and it satisfies g ⊥ N(A). Since R(A) is closed, g ∈ [N(A)]⊥ = R(A*). Thus g = A*f for some f ∈ H. Since Ag = h, it follows that AA*f = h.
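A minimal illustration of theorem 3.2.4: for the 1 x 2 matrix A = [1 1] (an underdetermined system, chosen for illustration), the minimum-norm solution of Ag = h is obtained by solving AA*f = h and setting g = A*f:

```python
# Theorem 3.2.4 for the matrix A = [1 1] : R^2 -> R (illustrative).
# Solve A g = h with minimum norm: solve A A* f = h, then g = A* f.
h = 2.0
# A A* = 1*1 + 1*1 = 2, so f = h / 2 and g = A* f = (f, f).
f = h / 2.0
g = (f, f)

assert g[0] + g[1] == h          # A g = h holds
# Every solution has the form g + t*(1, -1), t real; the
# minimum-norm one is t = 0.
for t in (-1.0, -0.1, 0.1, 1.0):
    other = (g[0] + t, g[1] - t)
    assert other[0] ** 2 + other[1] ** 2 > g[0] ** 2 + g[1] ** 2
```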
Definition 3.2.5. Let G and H be Hilbert spaces and let A ∈ B(G,H) be such that R(A) is closed in H. For each h ∈ H, let g0 ∈ G be the unique vector of minimum norm among those g1 ∈ G which satisfy

||Ag1 - h|| = min over g of ||Ag - h||.

The operator A⁺: h → g0 is called the pseudoinverse of A.

The concept of the pseudoinverse offers another approach to solving the equation Ag = h; however, it will not be discussed here.
CHAPTER IV
DIFFERENTIATION IN NORMED LINEAR SPACES
4.1. Gateaux and Frechet Differentials
Definition 4.1.1. Let Y be a vector space, N a normed space, and T a (possibly nonlinear) transformation defined on a domain D ⊂ Y with range R ⊂ N. Let y ∈ D ⊂ Y and let h be arbitrary in Y. If the limit

(1) δT(y;h) = lim as α → 0 of (1/α)[T(y + αh) - T(y)]

exists, it is called the Gateaux differential of T at y with increment h. If the limit exists for each h ∈ Y, then T is said to be Gateaux differentiable at y.

The limit in (1) is taken in the usual sense of norm convergence in N, and is considered only if y + αh ∈ D for all α sufficiently small. For a fixed y ∈ D, with h regarded as a variable, the Gateaux differential defines a transformation from Y to N.
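The defining limit (1) can be approximated numerically by taking α small. The sketch below (an illustrative nonlinear map on R², not from the thesis) compares the difference quotient against the differential computed from the Jacobian:

```python
# Numerical Gateaux differential of a nonlinear map T: R^2 -> R^2,
# approximating dT(y; h) = lim (1/a)[T(y + a h) - T(y)] as a -> 0.
import math

def T(y):
    return (math.sin(y[0]) + y[1] ** 2, y[0] * y[1])

def gateaux(T, y, h, a=1e-6):
    ya = tuple(p + a * q for p, q in zip(y, h))
    return tuple((u - v) / a for u, v in zip(T(ya), T(y)))

y, h = (0.5, 2.0), (1.0, -1.0)
num = gateaux(T, y, h)

# Since T is differentiable here, the Gateaux differential equals
# the Jacobian of T at y applied to h.
exact = (math.cos(y[0]) * h[0] + 2 * y[1] * h[1],
         y[1] * h[0] + y[0] * h[1])
assert all(abs(n - e) < 1e-4 for n, e in zip(num, exact))
```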
Proposition 4.1.2. If T is linear, then δT(y;h) = T(h).

Proof. Assume T is linear. Then

δT(y;h) = lim as α → 0 of (1/α)[T(y + αh) - T(y)]
= lim as α → 0 of (1/α)[T(y) + αT(h) - T(y)]
= lim as α → 0 of (1/α)[αT(h)] = T(h).
Definition 4.1.3. Let T be a transformation defined on an open domain D in a normed space L, with range in a normed space N. If for a fixed y ∈ D and each h ∈ L there exists a δT(y;h) ∈ N, linear and continuous with respect to h, such that

lim as ||h|| → 0 of ||T(y + h) - T(y) - δT(y;h)|| / ||h|| = 0,

then T is said to be Frechet differentiable at y, and δT(y;h) is called the Frechet differential of T at y with increment h.

Note that δT(y;h) will be used to represent both the Gateaux and Frechet differentials, since it is clear from the context which is meant.
Proposition 4.1.4. If the transformation T has a Frechet differential, then it is unique.

Proof. Assume δT(y;h) and δ'T(y;h) are both Frechet differentials of T at y with increment h. Then

||δT(y;h) - δ'T(y;h)|| = ||[T(y+h) - T(y) - δ'T(y;h)] - [T(y+h) - T(y) - δT(y;h)]||
<= ||T(y+h) - T(y) - δT(y;h)|| + ||T(y+h) - T(y) - δ'T(y;h)||,

and by the definition of the Frechet differential each term on the right is o(||h||) as ||h|| → 0. Thus ||δT(y;h) - δ'T(y;h)|| = o(||h||). Since δT(y;h) - δ'T(y;h) is bounded and linear in h, it must be zero. Consequently δT(y;h) = δ'T(y;h), and the Frechet differential is unique.
Proposition 4.1.5. If the Frechet differential of T exists at y, then the Gateaux differential exists at y and they are equal.

Proof. Let δT(y;h) denote the Frechet differential, and fix h ≠ 0. By the linearity of δT(y;·) with respect to its increment, δT(y;αh) = αδT(y;h), so

||(1/α)[T(y + αh) - T(y)] - δT(y;h)|| = (1/|α|)||T(y + αh) - T(y) - δT(y;αh)||
= ||h|| ||T(y + αh) - T(y) - δT(y;αh)|| / ||αh||,

which approaches zero as α → 0, by the definition of the Frechet differential. Hence

lim as α → 0 of (1/α)[T(y + αh) - T(y)] = δT(y;h),

which is the Gateaux differential.
Proposition 4.1.6. If the transformation T defined on an open set D ⊂ Y has a Frechet differential at y, then T is continuous at y.

Proof. Since δT(y;·) is a bounded linear operator, there is an M with ||δT(y;h)|| <= M||h||. Given ε > 0, there exists a δ > 0 such that

||T(y+h) - T(y) - δT(y;h)|| <= ε||h||  whenever ||h|| <= δ.

Thus

||T(y+h) - T(y)|| <= ||T(y+h) - T(y) - δT(y;h)|| + ||δT(y;h)|| <= ε||h|| + M||h|| = (ε + M)||h||,

which tends to zero as h → 0. Therefore T is continuous at y.
4.2. Frechet Derivatives
If the transformation T defined on an open domain D ⊂ Y has a Frechet differential at each y ∈ D, then for a fixed point y ∈ D, δT(y;h) is a bounded linear operator in h ∈ Y. Thus δT(y;h) can be written as Ah, where A is a bounded linear operator from Y to N; that is, A ∈ B(Y,N), which is itself a normed linear space.

Since A depends on y ∈ D, the map y → A defines a transformation from D into B(Y,N). This transformation is called the Frechet derivative T' of T, and we write A as T'(y). Thus

δT(y;h) = T'(y)h.
Definition 4.2.1. Let U: D → B(Y,N) be defined by U(y) = T'(y), so that U = T'. U is continuous at y0 if and only if for every ε > 0 there exists a δ > 0 such that ||y - y0|| < δ implies ||T'(y) - T'(y0)|| < ε. If U is continuous on some open sphere S, then T is said to be continuously Frechet differentiable on S.
Proposition 4.2.2. Let S be a transformation mapping an open set D ⊂ Y into an open set E ⊂ Y, and let P be a transformation mapping E into a normed space N. Put T = PS. Suppose S is Frechet differentiable at y ∈ D and P is Frechet differentiable at z = S(y) ∈ E. Then T is Frechet differentiable at y and T'(y) = P'(z)S'(y).

Proof. For h ∈ Y small enough that y + h ∈ D,

T(y+h) - T(y) = PS(y+h) - PS(y) = P[S(y+h)] - P[S(y)].

Let g = S(y+h) - S(y). Since z = S(y),

P[S(y+h)] - P[S(y)] = P(g + z) - P(z).

By the definition of the Frechet derivative,

||P(g+z) - P(z) - P'(z)g|| = o(||g||)

and

||S(y+h) - S(y) - S'(y)h|| = ||g - S'(y)h|| = o(||h||).

Now

||T(y+h) - T(y) - P'(z)S'(y)h||
<= ||T(y+h) - T(y) - P'(z)g|| + ||P'(z)g - P'(z)S'(y)h||
<= ||P(g+z) - P(z) - P'(z)g|| + ||P'(z)|| ||g - S'(y)h||.

Since S is continuous at y, ||g|| = ||S(y+h) - S(y)|| = O(||h||), so each term on the right is o(||h||). Consequently δT(y;h) = P'(z)S'(y)h and T'(y) = P'(z)S'(y).
Proposition 4.2.3. Let T be Frechet differentiable on an open domain D. Let y ∈ D and y + αh ∈ D for all α, 0 <= α <= 1. Then

||T(y+h) - T(y)|| <= ||h|| sup over 0 <= α <= 1 of ||T'(y + αh)||.

Proof. Let z0 = T(y+h) - T(y) ∈ N, where N is a real normed linear space. If z0 = 0 the inequality is immediate. If z0 ≠ 0, then by a corollary of the Hahn-Banach extension theorem there exists a z* ∈ N* with ||z*|| = 1 and

z*(z0) = ||T(y+h) - T(y)||.

Define φ(α) = z*[T(y + αh)] on the interval [0,1]. By proposition 4.2.2, φ'(α) = z*[T'(y + αh)h]. By the mean value theorem for functions of a real variable,

φ(1) - φ(0) = φ'(α0),  0 < α0 < 1.

Thus

||T(y+h) - T(y)|| = z*[T(y+h) - T(y)] = z*[T'(y + α0 h)h]
<= ||z*|| ||T'(y + α0 h)h||
<= ||z*|| sup over 0 <= α <= 1 of ||T'(y + αh)|| ||h||.

Therefore ||T(y+h) - T(y)|| <= ||h|| sup over 0 <= α <= 1 of ||T'(y + αh)||.
Definition 4.2.4. If T: Y → N is Frechet differentiable on an open domain D ⊂ Y, then T' maps D into B(Y,N) and may itself be Frechet differentiable on a subset D1 ⊂ D. There the Frechet derivative of T' is denoted by T'' and is called the second Frechet derivative of T.
Proposition 4.2.5. Let T be twice Frechet differentiable on an open domain D. Let y ∈ D and suppose that y + αh ∈ D for all α, 0 <= α <= 1. Then

||T(y+h) - T(y) - T'(y)h|| <= (1/2)||h||² sup over 0 <= α <= 1 of ||T''(y + αh)||.

Proof. The proof follows from that of proposition 4.2.3.

S. C. Saxena and S. M. Shah, Introduction to Real Variable Theory (Scranton, 1972), pp. 168-169.
CHAPTER V
OPTIMIZATION BY ITERATIVE METHODS
5.1. Methods for Solving Nonlinear Equations
The first method which we will discuss is that of
successive approximation. It is used to solve equations
of the form y=T(y) where the solution y is said to be a
fixed point of the transformation T, since T leaves y
invariant. The process is illustrated by figure 1.
We find a fixed point by beginning with an initial trial vector y1 and computing y2 = T(y1). Graphically, this corresponds to locating the intersection of the curve T(y) with the forty-five degree line through the origin; y3 = T(y2) is obtained by moving along the curve as shown. Continuing in this manner, successive vectors y_{n+1} = T(y_n) are computed, and under the conditions given below the sequence {y_n} converges to a solution of the equation y = T(y).
Definition 5.1.1. Let S be a subset of a normed space N and T a transformation mapping S into S. Then T is said to be a contraction mapping if there is an α, 0 <= α < 1, such that ||T(y1) - T(y2)|| <= α||y1 - y2|| for all y1, y2 ∈ S.

A contraction mapping is uniformly continuous on S: given ε > 0 and taking δ = ε/α, ||y1 - y2|| < δ implies ||T(y1) - T(y2)|| <= α||y1 - y2|| < ε.

Fig. 1. Successive approximation process
Note that a transformation having ||T'(y)|| <= α < 1 on a convex set K is a contraction mapping, since by the mean value inequality (proposition 4.2.3),

||T(y1) - T(y2)|| <= sup ||T'(y)|| ||y1 - y2|| <= α||y1 - y2||;

that is, when a transformation is Frechet differentiable with derivative of norm less than one on a convex set, it is a contraction mapping.
Theorem 5.1.2. (Contraction Mapping Theorem) If T is a contraction mapping on a closed subset S of a Banach space, there is a unique vector y0 ∈ S satisfying y0 = T(y0). Furthermore, y0 can be obtained by the method of successive approximation starting from an arbitrary initial vector in S.

Proof. Take an arbitrary y1 ∈ S and define the sequence {y_n} by y_{n+1} = T(y_n). Then

||y_{n+1} - y_n|| = ||T(y_n) - T(y_{n-1})|| <= α||y_n - y_{n-1}||.

It follows that ||y_{n+1} - y_n|| <= α^{n-1}||y2 - y1||. In addition, for any p,

||y_{n+p} - y_n|| <= α^{n-1}(1 + α + ... + α^{p-1})||y2 - y1|| <= (α^{n-1}/(1 - α))||y2 - y1|| → 0

as n → ∞. Therefore {y_n} is Cauchy in N.

Since S is a closed subset of a complete space, there is a y0 ∈ S such that y_n → y0. We know that y_{n+1} = T(y_n); since T is continuous, y0 = lim y_{n+1} = lim T(y_n) = T(lim y_n) = T(y0). Therefore y0 = T(y0).

Assume that y0 and z0 are both fixed points. Since y0 = T(y0) and z0 = T(z0),

||y0 - z0|| = ||T(y0) - T(z0)|| <= α||y0 - z0||.

Therefore ||y0 - z0||(1 - α) <= 0, so y0 = z0 and y0 is unique.
Theorem 5.1.3. Let T be a continuous mapping from a closed subset S of a Banach space into S, and suppose Tⁿ is a contraction mapping for some positive integer n. Then T has a unique fixed point in S which can be found by successive approximation.

Proof. Let s1 be arbitrary in S and define the sequence {s_k} by s_{k+1} = T(s_k). Since Tⁿ is a contraction mapping, the subsequence {s_{nk+1}} = {T^{nk}(s1)} converges, by theorem 5.1.2, to an s0 ∈ S which is a fixed point of Tⁿ.

Since T is continuous,

T(s0) = T[lim as k → ∞ of T^{nk}(s1)] = lim as k → ∞ of T^{nk}[T(s1)].

Further, letting α < 1 be the contraction constant of Tⁿ,

||s0 - T(s0)|| = lim as k → ∞ of ||T^{nk}(s1) - T^{nk}[T(s1)]|| <= lim as k → ∞ of α^k ||s1 - T(s1)|| = 0.

Thus s0 = T(s0).

If s0 and t0 are fixed points of T, then both are fixed points of Tⁿ, and

||s0 - t0|| = ||Tⁿ(s0) - Tⁿ(t0)|| <= α||s0 - t0||.

Therefore s0 = t0, and s0 is the unique fixed point of T.
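Theorem 5.1.3 is genuinely stronger than theorem 5.1.2; for example (an illustration not from the thesis), T = cos on the real line is not a contraction, since sup |T'(y)| = sup |sin y| = 1, but (T²)'(y) = sin(cos y) sin y has absolute value at most sin 1 < 1, so T² is a contraction. Successive approximation still converges to the unique fixed point:

```python
# T = cos is not a contraction on the real line, but T^2 = cos(cos(.))
# is, with constant sin(1) < 1; by theorem 5.1.3 iteration of T still
# converges to the unique fixed point y = cos(y).
import math

y = 5.0                      # arbitrary starting point
for _ in range(200):
    y = math.cos(y)

assert abs(y - math.cos(y)) < 1e-12
```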
The next method which we will discuss is Newton's method. It is used for solving equations of the form P(y) = 0, and it has a direct extension applicable to a nonlinear transformation P on a normed space. The technique is illustrated in figure 2.

At a given point, the graph of the function P is approximated by its tangent, and an approximate solution of P(y) = 0 is taken as the point where the tangent crosses the horizontal axis. The process is then repeated iteratively from this new point, defining a sequence of points according to

y_{n+1} = y_n - [P'(y_n)]⁻¹ P(y_n).
Fig. 2. Technique of Newton's method
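A scalar instance of the Newton iteration g_{n+1} = g_n - P'(g_n)⁻¹P(g_n), for the illustrative equation P(y) = y² - 2 = 0:

```python
# Newton's method for P(y) = y^2 - 2 = 0; the iterates converge
# quadratically to sqrt(2).
def P(y):
    return y * y - 2.0

def Pprime(y):
    return 2.0 * y

y = 1.0                      # initial point where P'(y) is invertible
for _ in range(8):
    y = y - P(y) / Pprime(y)

assert abs(P(y)) < 1e-12
```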
Theorem 5.1.4. Let G and H be Banach spaces and let P be a mapping from G to H. Assume that:

(1) P is twice Frechet differentiable and ||P''(g)|| <= K.
(2) There is a point g1 ∈ G such that P1 = P'(g1) has a bounded inverse P1⁻¹ with ||P1⁻¹|| <= β1 and ||P1⁻¹P(g1)|| <= η1.
(3) The constant l1 = β1η1K satisfies l1 < 1/2.

Then the sequence g_{n+1} = g_n - P'(g_n)⁻¹P(g_n) exists for all n >= 1 and converges to a solution of P(g) = 0.

Proof. We first show that if g1 satisfies (1), (2), and (3), then g2 = g1 - P1⁻¹P(g1) satisfies the same conditions with new constants β2, η2, l2.

Since g2 - g1 = -P1⁻¹P(g1), we have ||g2 - g1|| <= η1. Writing P2 = P'(g2), the mean value inequality applied to P' gives

||P1⁻¹[P1 - P2]|| <= ||P1⁻¹|| sup over 0 <= α <= 1 of ||P''(g1 + α(g2 - g1))|| ||g2 - g1|| <= β1Kη1 = l1 < 1/2.

Since ||P1⁻¹[P1 - P2]|| <= l1 < 1/2, the operator I - P1⁻¹[P1 - P2] = P1⁻¹P2 has a bounded inverse, with ||(P1⁻¹P2)⁻¹|| <= 1/(1 - l1). Therefore P2⁻¹ = (P1⁻¹P2)⁻¹P1⁻¹ exists, and we may take

β2 = β1/(1 - l1).

To obtain a bound for ||P2⁻¹P(g2)||, consider the operator T1(g) = g - P1⁻¹P(g). Clearly T1(g1) = g2. Also

T1'(g) = I - P1⁻¹P'(g), so T1'(g1) = I - P1⁻¹P1 = 0, and T1''(g) = -P1⁻¹P''(g), so ||T1''(g)|| <= β1K.

Since T1(g2) = g2 - P1⁻¹P(g2), we have

P1⁻¹P(g2) = g2 - T1(g2) = T1(g1) - T1(g2) - T1'(g1)(g1 - g2),

and by proposition 4.2.5,

||P1⁻¹P(g2)|| <= (1/2)||g2 - g1||² sup ||T1''|| <= (1/2)η1²β1K = (1/2)l1η1.

Consequently

||P2⁻¹P(g2)|| = ||(P1⁻¹P2)⁻¹P1⁻¹P(g2)|| <= [1/(1 - l1)](1/2)l1η1 = η2.

Let l2 = β2η2K. Then

l2 = [β1/(1 - l1)][l1η1/(2(1 - l1))]K = l1²/(2(1 - l1)²) = (1/2)[l1/(1 - l1)]² < 1/2,

since l1 < 1/2 implies l1/(1 - l1) < 1. The conditions (1), (2), and (3) are thus satisfied for g2 with the constants β2, η2, l2. By induction, g_n exists for all n with constants β_n, η_n, l_n such that

(1) ||P''(g)|| <= K,
(2) P_n = P'(g_n) has a bounded inverse with ||P_n⁻¹|| <= β_n and ||P_n⁻¹P(g_n)|| <= η_n,
(3) l_n = β_nη_nK < 1/2.

Since η_{n+1} = l_nη_n/(2(1 - l_n)) and l_n < 1/2, we have η_{n+1} <= (1/2)η_n, hence η_n <= η1/2^{n-1}. In addition, since ||g_{n+1} - g_n|| <= η_n, it follows that for any k,

||g_{n+k} - g_n|| <= η_n + η_{n+1} + ... + η_{n+k-1} <= (η1/2^{n-1})(1 + 1/2 + 1/4 + ...) <= η1/2^{n-2}.

As n → ∞, ||g_{n+k} - g_n|| → 0 uniformly in k. Therefore {g_n} is Cauchy in G, and there exists a g0 ∈ G such that g_n → g0.

To show that g0 satisfies P(g0) = 0, note first that {||P_n||} is bounded, since

||P_n|| <= ||P'(g_n) - P'(g1)|| + ||P1|| <= K||g_n - g1|| + ||P1||,

and ||g_n - g1|| is bounded, {g_n} being convergent. From g_{n+1} - g_n = -P_n⁻¹P(g_n) we have, for each n, P_n(g_{n+1} - g_n) = -P(g_n). But ||g_{n+1} - g_n|| → 0 and ||P_n|| is bounded, so

||P(g_n)|| = ||P_n(g_{n+1} - g_n)|| <= ||P_n|| ||g_{n+1} - g_n|| → 0.

Hence lim ||P(g_n)|| = 0. Since P is differentiable, it is continuous, and P(g0) = 0.
5.2. Descent Methods

We now turn to descent methods, which iterate in such a way as to decrease the cost functional from one step to the next and thus ensure convergence of the functional values from an arbitrary starting point. The procedure for minimizing a functional f consists of taking a given initial point y1 and constructing iterations according to equations of the form

y_{n+1} = y_n + α_n p_n,

where α_n is a scalar and p_n is a direction vector. After selecting the vector p_n, the scalar α_n is chosen to minimize f(y_n + αp_n), regarded as a function of α. The arrangement is such that f(y_n + αp_n) < f(y_n) for some small positive α; α_n is then often taken as the smallest positive root of the equation

(d/dα) f(y_n + αp_n) = 0.

Then f(y_n + α_n p_n) is evaluated to verify that it has decreased from f(y_n); if it has not, a smaller value of α_n is chosen.
The descent process is illustrated in Figure 3, where the contours
represent the functional f in the space Y.  We start from a point y_1
and then move along the direction vector p_1 to the first point where
the line y_1 + α p_1 is tangent to a contour of f.  If f is bounded
below, note that the descent process defines a bounded decreasing
sequence of functional values, so the objective values tend toward a
limit f_0.
The descent method which we will discuss here is that of steepest
descent.  This method is used to minimize a functional f defined on a
Hilbert space Y.  The direction vector p_n at a given point y_n is
chosen to be the negative gradient of f at y_n.

An important application is to minimize the quadratic functional

    f(y) = (y, Qy) - 2(b, y),

where Q is a self-adjoint positive-definite operator on Y.  Assume that

    m = inf_{y≠0} (y, Qy)/(y, y)

and
Fig. 3. Descent process (contour annotation: f increasing)
    M = sup_{y≠0} (y, Qy)/(y, y)

are positive, finite numbers.  Then f is minimized by solving the linear
equation

    Qy_0 = b.

The vector

    r = b - Qy

is called the residual of the approximation.  Inspecting f(y + αl)
reveals that its derivative with respect to α, at α = 0, is (l, -2r).
Thus 2r is the negative gradient of f at the point y.  Therefore the
steepest descent method applied to f takes the form

    y_{n+1} = y_n + α_n r_n,

where r_n = b - Qy_n and α_n is chosen to minimize f(y_{n+1}).  The
value of α_n is found as follows: setting the derivative of
f(y_n + α r_n) with respect to α equal to zero gives

    2α(r_n, Qr_n) - 2(r_n, r_n) = 0.

Thus,

    α_n = (r_n, r_n)/(r_n, Qr_n),

where r_n = b - Qy_n.
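A small numerical check of this step length may help; the following is an editorial illustration (the matrix and right-hand side are made up, not from the text).

```python
# Check that a_n = (r_n, r_n)/(r_n, Qr_n) minimizes f(y_n + a r_n) along the
# residual direction, for f(y) = (y, Qy) - 2(b, y) in R^3.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
Q = A @ A.T + 3.0 * np.eye(3)            # self-adjoint and positive definite
b = rng.standard_normal(3)
f = lambda y: y @ Q @ y - 2.0 * b @ y

y = np.zeros(3)
r = b - Q @ y                            # residual: half the negative gradient
a_star = (r @ r) / (r @ Q @ r)

# f(y + a r) is a strictly convex parabola in a, so a_star should beat
# nearby step lengths.
vals = [f(y + a * r) for a in (0.5 * a_star, a_star, 1.5 * a_star)]
```

Since f(y + a r) = f(y) - 2a(r, r) + a^2(r, Qr), the middle value is the smallest of the three.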
Theorem 5.2.1.  For any y_1 ∈ Y, the sequence {y_n} defined by

    y_{n+1} = y_n + [(r_n, r_n)/(r_n, Qr_n)] r_n,    r_n = b - Qy_n,

converges (in norm) to the unique solution y_0 of Qy = b.  Furthermore,
defining

    F(y) = (y - y_0, Q(y - y_0)),

the rate of convergence satisfies

    ||z_n||^2 ≤ (1/m)(1 - m/M)^{n-1} F(y_1),

where z_n = y_0 - y_n.
Proof.  Note that

    F(y) = (y - y_0, Q(y - y_0))
         = (y, Qy) - (y, Qy_0) - (y_0, Qy) + (y_0, Qy_0)
         = (y, Qy) - 2(y, b) + (y_0, Qy_0)
         = f(y) + (y_0, Qy_0),

so that f and F achieve a minimum at y_0 and the gradients of f and F
are equal.  By direct computation,

    [F(y_n) - F(y_{n+1})]/F(y_n) = [f(y_n) - f(y_{n+1})]/F(y_n)
                                 = [2α_n(r_n, r_n) - α_n^2(r_n, Qr_n)]/F(y_n)
                                 = (r_n, r_n)^2 / [(r_n, Qr_n) F(y_n)].

We know that r_n = Qz_n, so

    [F(y_n) - F(y_{n+1})]/F(y_n) = (r_n, r_n)^2 / [(r_n, Qr_n)(z_n, Qz_n)].

From the definitions of m and M, where m > 0, it follows that

    (r_n, Qr_n) ≤ M(r_n, r_n)

and

    (r_n, r_n) = (Qz_n, Qz_n) ≥ m(z_n, Qz_n).

In addition,

    [F(y_n) - F(y_{n+1})]/F(y_n) ≥ m/M.

Then

    F(y_{n+1})/F(y_n) ≤ 1 - m/M,

and we have that

    F(y_{n+1}) ≤ (1 - m/M) F(y_n).

Continuing in the same manner,

    F(y_n) ≤ (1 - m/M)^{n-1} F(y_1),

so F(y_n) → 0.  We know that

    m(z_n, z_n) ≤ (z_n, Qz_n) = F(y_n).

Consequently,

    ||z_n||^2 ≤ (1/m) F(y_n) ≤ (1/m)(1 - m/M)^{n-1} F(y_1).

Therefore z_n → 0 and y_n → y_0, and the proof is complete.
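The per-step contraction derived in the proof can be observed numerically; this is an editorial illustration, and the diagonal matrix below (for which m = 1 and M = 10) is a made-up example.

```python
# Illustration of the bound F(y_{n+1}) <= (1 - m/M) F(y_n) for steepest
# descent on f(y) = (y, Qy) - 2(b, y).
import numpy as np

Q = np.diag([1.0, 4.0, 10.0])            # m = 1, M = 10 for this example
b = np.array([1.0, -2.0, 3.0])
y0 = np.linalg.solve(Q, b)               # exact minimizer y_0 = Q^{-1} b
F = lambda y: (y - y0) @ Q @ (y - y0)

y = np.zeros(3)
F_vals = [F(y)]
for _ in range(20):
    r = b - Q @ y
    y = y + (r @ r) / (r @ Q @ r) * r    # steepest descent step
    F_vals.append(F(y))

# Successive ratios F(y_{n+1})/F(y_n), skipping negligible denominators.
ratios = [F_vals[k + 1] / F_vals[k] for k in range(20) if F_vals[k] > 1e-14]
```

Every observed ratio stays at or below 1 - m/M = 0.9, in agreement with the theorem.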
Theorem 5.2.2.  Let f be a functional bounded below and twice Fréchet
differentiable on a Hilbert space H.  Given h_1 ∈ H, let S be the closed
convex hull of {h: f(h) ≤ f(h_1)}.  Assume that f''(h) is self-adjoint
and satisfies

    m||x||^2 ≤ (x, f''(h)x) ≤ M||x||^2,    0 < m ≤ M,

throughout S (i.e., f''(h) is uniformly bounded and positive definite).
If {h_n} is the sequence generated by steepest descent applied to f
starting at h_1, then f'(h_n) → 0.  Furthermore, there exists an
h_0 ∈ S such that h_n → h_0 and f(h_0) = inf{f(h): h ∈ H}.

    7 David Luenberger, Optimization by Vector Space Methods (New York,
1969), pp. 150-152.
Proof.  Given h ∈ S, apply Taylor's expansion(8) with remainder to the
function

    g(t) = f(th + (1 - t)h_1).

Then we have

    g(1) = f(h),
    g(0) = f(h_1),
    g'(0) = f'(h_1)(h - h_1),
    g(t) = g(0) + tg'(0) + (t^2/2)g''(τ),  where 0 < τ < t.

So

    g(1) = g(0) + g'(0) + (1/2)g''(τ),  where 0 < τ < 1.

Then

    g'(t) = f'(t(h - h_1) + h_1)(h - h_1)

implies that

    g'(0) = f'(h_1)(h - h_1)

and

    g''(t) = f''(t(h - h_1) + h_1)(h - h_1)(h - h_1).

Let h be such that f(h) ≤ f(h_1).  Then we have

    -f'(h_1)(h - h_1) ≤ |f'(h_1)(h - h_1)| ≤ ||f'(h_1)|| ||h - h_1||,

and, from the expansion above together with the lower bound on f'',

    (m/2)||h - h_1||^2 ≤ f(h) - f(h_1) + ||f'(h_1)|| ||h - h_1||.

So it follows that {h: f(h) ≤ f(h_1)} is bounded.

    8 G. E. Sherwood and Angus Taylor, Calculus (Englewood Cliffs, N.J.,
1954), pp. 395-398.
{f(h_n)} → f_0 implies that for an ε > 0, there is an N_0 such that

    |f(h_n) - f_0| < ε^2/(4M)

if n ≥ N_0.  So assume that f'(h_n) does not approach zero.  Then there
exists an ε > 0 such that for any N_1, there is an n ≥ N_1 with
||f'(h_n)|| ≥ ε.  Let N = max(N_0, N_1).  Then for some n ≥ N we have
both |f(h_n) - f_0| < ε^2/(4M) and ||f'(h_n)|| ≥ ε.

For an α > 0, let h_α = h_n - αf'(h_n).  By Taylor's expansion,

    f(h_α) = f(h_n) - α f'(h_n)f'(h_n) + (α^2/2) f''(ξ)f'(h_n)f'(h_n)
           ≤ f(h_n) - α||f'(h_n)||^2 + (α^2 M/2)||f'(h_n)||^2.

Hence, for α = 1/M,

    f(h_α) ≤ f(h_n) - (1/2M)||f'(h_n)||^2 ≤ f(h_n) - ε^2/(2M).

Since α_n is chosen to minimize over α, f(h_{n+1}) ≤ f(h_α), and thus

    f(h_{n+1}) ≤ f_0 + ε^2/(4M) - ε^2/(2M) < f_0.

Therefore f(h_{n+1}) < f_0.  But all f(h_n) ≥ f_0.  So we have a
contradiction, and therefore ||f'(h_n)|| → 0.
For any h, l ∈ S, by the one-dimensional mean value theorem,

    (f'(h) - f'(l))(h - l) = (h - l, f''(ξ)(h - l)),

where ξ = th + (1 - t)l, 0 ≤ t ≤ 1.  Consequently,

    ||f'(h) - f'(l)|| ||h - l|| ≥ m||h - l||^2,

and thus

    ||h - l|| ≤ (1/m)||f'(h) - f'(l)||.

Since {f'(h_n)} is Cauchy, so is {h_n}.  Therefore h_0 ∈ S and
h_n → h_0.  Clearly f'(h_0) = 0.  Let s be such that h_0 + s ∈ S.  Then
there is a t, 0 < t < 1, such that

    f(h_0 + s) = f(h_0) + (1/2)(s, f''(h_0 + ts)s)
               ≥ f(h_0) + (m/2)||s||^2.

Therefore h_0 minimizes f in S, and hence in H.  The proof is
complete.
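A minimal sketch of the situation of this theorem in R^2, as an editorial illustration: the functional below is made up, with f''(h) = diag(2 - cos h_0, 4), so one may take m = 1 and M = 4 everywhere; for simplicity the constant step 1/M from the proof is used in place of an exact line search.

```python
# Steepest descent h <- h - (1/M) f'(h) on a bounded-below functional whose
# second derivative satisfies m = 1 <= f'' <= M = 4 on all of R^2.
import numpy as np

f = lambda h: h[0]**2 + 2.0 * h[1]**2 + np.cos(h[0])
grad = lambda h: np.array([2.0 * h[0] - np.sin(h[0]), 4.0 * h[1]])

M = 4.0
h = np.array([2.0, -1.5])
for _ in range(200):
    h = h - (1.0 / M) * grad(h)          # constant step 1/M, as in the proof

# The unique stationary point is h_0 = (0, 0), where f attains its infimum 1.
```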
5.3. Conjugate Direction Methods

We will use Fourier series(9) here to minimize a quadratic functional f
on a Hilbert space H by an appropriate transformation.

Let f(y) = (y, Qy) - 2(y, b), where Q is a self-adjoint linear operator
satisfying (y, Qy) ≤ M(y, y) and (y, Qy) ≥ m(y, y) for all y ∈ H and
some M, m > 0.  Then the unique vector y_0 minimizing f is the unique
solution of the equation Qy = b.  This problem can be considered a
minimum norm problem by introducing an inner product [y, z] = (y, Qz),
since it is equivalent to minimizing ||y - y_0||_Q.
If we can generate a sequence of vectors {p_1, p_2, ...} that are
orthogonal with respect to the inner product [·,·], then the sequence is
said to be Q-orthogonal, or a sequence of conjugate directions.  The
vector y_0 can be expanded in a Fourier series with respect to this
sequence.  If the n-th partial sum of such an expansion is denoted by
y_n, then by the fundamental approximation property of Fourier series,
||y_n - y_0||_Q is minimized over the subspace [p_1, p_2, ..., p_n].
If {p_i} is a maximal orthogonal set in H, then by expanding
||y_n - y_0||_Q^2 and using Parseval's relation we have

    lim ||y_n - y_0||_Q^2 = ||y_0||_Q^2 - lim ||y_n||_Q^2
                          = ||y_0||_Q^2 - ||y_0||_Q^2 = 0.

Thus y_n → y_0, and ||y_n - y_0||_Q decreases as n increases.
Consequently, if {p_i} is complete, the process converges to y_0.

    9 David Luenberger, Optimization by Vector Space Methods (New York,
1969), pp. 58-60.
Theorem 5.3.1 (Method of Conjugate Directions).  Let {p_n} be a
sequence in H such that (p_i, Qp_j) = 0 for i ≠ j, and such that the
closed linear subspace generated by the sequence is H.  Then for any
y_1 ∈ H, the sequence generated by the recursion

    y_{n+1} = y_n + α_n p_n,    α_n = (r_n, p_n)/(p_n, Qp_n),
    r_n = b - Qy_n,

satisfies (r_n, p_k) = 0, k = 1, 2, ..., n-1.  In addition, y_n → y_0
(the unique solution of Qy = b).
Proof.  Define z_n = y_n - y_1.  The recursion is then equivalent to
z_1 = 0 and

    z_{n+1} = z_n + α_n p_n,    α_n = (r_n, p_n)/(p_n, Qp_n).

In terms of the inner product [·,·], knowing that
r_n = Q(y_0 - y_1 - z_n), we have

    α_n = [y_0 - y_1 - z_n, p_n]/[p_n, p_n].

Since z_n ∈ [p_1, p_2, ..., p_{n-1}], it follows that [z_n, p_n] = 0
and

    α_n = [y_0 - y_1, p_n]/[p_n, p_n].

Continuing in this manner,

    z_{n+1} = Σ_{k=1}^{n} ([y_0 - y_1, p_k]/[p_k, p_k]) p_k,

which is the n-th partial sum of the Fourier expansion of
z_0 = y_0 - y_1.  With our assumptions on Q, convergence with respect
to ||·|| is equivalent to convergence with respect to ||·||_Q.  Thus it
follows that z_n → z_0 and y_n → y_0.  Finally, (r_n, p_k) = 0 follows
from the fact that the error z_0 - z_n is Q-orthogonal to the subspace
[p_1, p_2, ..., p_{n-1}].  The proof is complete.
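In R^3 the theorem can be seen at work, as an editorial illustration (the matrix and right-hand side are made up): with Q-orthogonal directions spanning the space, the recursion reaches y_0 = Q^{-1}b after exactly three steps.

```python
# Conjugate directions in R^3: Gram-Schmidt in the inner product
# [y, z] = (y, Qz) applied to the standard basis, then y <- y + a_n p_n.
import numpy as np

Q = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])          # self-adjoint, positive definite
b = np.array([1.0, 2.0, 3.0])

# Build a Q-orthogonal (conjugate) set from the standard basis.
P = []
for v in np.eye(3):
    p = v - sum((v @ Q @ q) / (q @ Q @ q) * q for q in P)
    P.append(p)

y = np.zeros(3)
for p in P:
    r = b - Q @ y
    y = y + (r @ p) / (p @ Q @ p) * p    # a_n = (r_n, p_n)/(p_n, Qp_n)
```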
The next conjugate direction method, which we will discuss, consists of
obtaining Q-orthogonal direction vectors by applying the Gram-Schmidt
procedure to a sequence of vectors that generates a dense subspace of H.

Definition 5.3.2.  Let {v_i} be such a sequence in H, and set p_1 = v_1
and

    p_n = v_n - Σ_{i=1}^{n-1} ([v_n, p_i]/[p_i, p_i]) p_i

for n > 1, where [y, z] = (y, Qz).  Start with an initial vector v_1
and a bounded linear self-adjoint operator B, so that {v_i} is
generated by v_{n+1} = Bv_n.  This sequence is said to be a sequence of
moments of B.
Theorem 5.3.3.  Let {v_i} be a sequence of moments of a self-adjoint
operator B.  Then the sequence defined by p_1 = v_1 and

    p_{n+1} = Bp_n - ([Bp_n, p_n]/[p_n, p_n]) p_n
                   - ([Bp_n, p_{n-1}]/[p_{n-1}, p_{n-1}]) p_{n-1}

for n ≥ 2 defines an orthogonal sequence in H such that for each n,

    [p_1, p_2, ..., p_n] = [v_1, v_2, ..., v_n].

Proof.  It is clear that the theorem is true for p_1 and p_2.  So to
prove that it is true for n > 2, we assume that it is true for
{p_i}, i = 1, ..., n.  We will show it is true for
{p_i}, i = 1, ..., n+1.  Note that p_{n+1} is nonzero and is in the
subspace [v_1, v_2, ..., v_{n+1}].  Thus we need only show that p_{n+1}
is orthogonal to each p_i, i ≤ n.  By direct calculation,

    [p_{n+1}, p_i] = [Bp_n, p_i]
                     - ([Bp_n, p_n]/[p_n, p_n])[p_n, p_i]                (1)
                     - ([Bp_n, p_{n-1}]/[p_{n-1}, p_{n-1}])[p_{n-1}, p_i].  (2)

For i = n and i = n-1 the right side vanishes by construction.  If
i ≤ n-2, then (1) and (2) are zero.  Because

    Bp_i ∈ [p_1, p_2, ..., p_{i+1}],

it follows that

    [Bp_n, p_i] = [p_n, Bp_i] = 0.

Therefore p_{n+1} is orthogonal to each p_i, i ≤ n.  The proof is
complete.
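A minimal numerical sketch of the three-term recurrence, as an editorial illustration: for simplicity we take Q = I (so [y, z] is the ordinary inner product) and a made-up symmetric matrix for B, with the moments v_{n+1} = Bv_n implicit in the recurrence.

```python
# Three-term recurrence p_{n+1} = Bp_n - ([Bp_n,p_n]/[p_n,p_n]) p_n
#                                - ([Bp_n,p_{n-1}]/[p_{n-1},p_{n-1}]) p_{n-1}
# for a sequence of moments of a self-adjoint B, with Q = I.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = A + A.T                              # bounded and self-adjoint
v1 = rng.standard_normal(4)

p = [v1]
p.append(B @ p[0] - (p[0] @ B @ p[0]) / (p[0] @ p[0]) * p[0])
for _ in range(2):
    pn, pm = p[-1], p[-2]
    # Note [Bp_n, p_{n-1}] = (p_{n-1}, B p_n) since B is self-adjoint.
    p.append(B @ pn
             - (pn @ B @ pn) / (pn @ pn) * pn
             - (pm @ B @ pn) / (pm @ pm) * pm)
```

The four vectors produced are mutually orthogonal, even though each step orthogonalizes only against the two most recent directions.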
Finally, we will discuss the conjugate gradient method, which selects
the direction vectors while minimizing the functional

    f(y) = (y, Qy) - 2(b, y)

by choosing

    p_1 = r_1 = b - Qy_1

(the direction of the negative gradient of f at y_1).  A new negative
gradient direction,

    r_2 = b - Qy_2,

is then considered, and p_2 is chosen to be in the space spanned by
r_1, r_2 but Q-orthogonal to p_1.  The selection of the p_i's is
continued in this manner, so that each p_n lies in the span of
r_1, ..., r_n and is Q-orthogonal to p_1, ..., p_{n-1}.  This leads to
a recursion of the form given in the following theorem.
Theorem 5.3.4.  Given y_1 in a Hilbert space H, define

    (1) p_1 = r_1 = b - Qy_1,
    (2) r_n = b - Qy_n,
    (3) y_{n+1} = y_n + α_n p_n,
    (4) p_{n+1} = r_{n+1} + b_n p_n,
    (5) α_n = (r_n, p_n)/(p_n, Qp_n),
    (6) b_n = -(r_{n+1}, Qp_n)/(p_n, Qp_n).

Then {y_n} converges to y_0 = Q^{-1}b.
Proof.  We must first show that this is a method of conjugate
directions.  Assume, by induction, that p_1, ..., p_n are mutually
Q-orthogonal.  From (4) we have, for k ≤ n,

    (p_{n+1}, Qp_k) = (r_{n+1}, Qp_k) + b_n(p_n, Qp_k).      (7), (8)

If k = n, then (7) and (8) cancel by the choice of b_n.  If k < n, then
(8) is zero, and (r_{n+1}, Qp_k) can be written as (Qp_k, r_{n+1}).
But

    Qp_k ∈ [p_1, p_2, ..., p_{k+1}],

and for any conjugate direction method, (r_{n+1}, p_i) = 0, i ≤ n.
Therefore this method is a conjugate direction method.

Next, we must show that {y_n} converges to y_0.  So we define the
functional E by

    E(y) = (b - Qy, Q^{-1}(b - Qy)).

By direct computation, using (1) through (6), we have

    (9)  E(y_n) - E(y_{n+1}) = (r_n, p_n)^2/(p_n, Qp_n).

From (4) and the fact that (r_n, p_{n-1}) = 0,

    (r_n, p_n) = (r_n, r_n),

and from (4) and since p_n and p_{n-1} are Q-orthogonal,

    (10) (r_n, Qr_n) = (p_n, Qp_n) + b_{n-1}^2(p_{n-1}, Qp_{n-1})
                     ≥ (p_n, Qp_n).

Since m(y, y) ≤ (y, Qy) ≤ M(y, y), it follows that

    (11) (r_n, Qr_n) ≤ M(r_n, r_n)  and  E(y_n) ≤ (1/m)(r_n, r_n).

Combining (9), (10), and (11), we see that

    E(y_n) - E(y_{n+1}) ≥ (r_n, r_n)^2/(r_n, Qr_n)
                        ≥ (r_n, r_n)/M
                        ≥ (m/M) E(y_n).

Therefore E(y_n) → 0, which implies that r_n → 0 and hence y_n → y_0.
The proof is complete.
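The recursion (1)-(6) can be sketched in finite dimensions, as an editorial illustration (the matrix and right-hand side below are made up); for an n x n positive-definite Q the method terminates in at most n steps, up to rounding.

```python
# Conjugate gradient for Qy = b, following the recursion of Theorem 5.3.4.
import numpy as np

def conjugate_gradient(Q, b, y, n_steps):
    r = b - Q @ y
    p = r.copy()                          # (1) p_1 = r_1 = b - Qy_1
    for _ in range(n_steps):
        a = (r @ p) / (p @ Q @ p)         # (5) a_n = (r_n,p_n)/(p_n,Qp_n)
        y = y + a * p                     # (3) y_{n+1} = y_n + a_n p_n
        r = b - Q @ y                     # (2) new residual
        if np.linalg.norm(r) < 1e-14:
            break                         # converged; avoid a zero direction
        p = r - (r @ Q @ p) / (p @ Q @ p) * p   # (4) with b_n from (6)
    return y

Q = np.array([[6.0, 2.0, 1.0],
              [2.0, 5.0, 2.0],
              [1.0, 2.0, 4.0]])           # self-adjoint, positive definite
b = np.array([1.0, 0.0, -1.0])
y = conjugate_gradient(Q, b, np.zeros(3), 3)
```

After three steps the residual b - Qy is zero to machine precision.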
BIBLIOGRAPHY

Kantorovich, L. V., and G. P. Akilov.  Functional Analysis in Normed
    Spaces.  Translated by D. E. Brown.  Edited by A. P. Robertson.
    New York: Macmillan Company, 1964.

Luenberger, David G.  Optimization by Vector Space Methods.  New York:
    John Wiley and Sons, Inc., 1969.

Saxena, S. C., and S. M. Shah.  Introduction to Real Variable Theory.
    Scranton: International Textbook Company, 1972.

Sherwood, G. E., and Angus Taylor.  Calculus.  3rd ed.  Englewood
    Cliffs: Prentice-Hall, Inc., 1954.

Taylor, Angus.  Introduction to Functional Analysis.  New York: John
    Wiley and Sons, Inc., 1958.