Post on 09-Jul-2018


THEORY OF ORDINARY DIFFERENTIAL EQUATIONS

Review of Advanced Calculus Topics

John A. Burns

Center for Optimal Design And Control

Interdisciplinary Center for Applied Mathematics

Virginia Polytechnic Institute and State University

Blacksburg, Virginia 24061-0531

MATH 5245

FALL 2012

Topics

Review of Differentiation for Vector Valued Functions
• Partial and Directional Derivatives
• Derivatives (Fréchet)
• Gradients, Jacobians and Hessians

Matrix Theory

Calculus for $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$

We review calculus for vector-valued functions of n variables. We then go to the infinite dimensional case. The following references are good …

Robert G. Bartle, The Elements of Real Analysis, John Wiley & Sons, New York, 1976.
L. V. Kantorovich and G. P. Akilov, Functional Analysis in Normed Spaces, Pergamon Press, New York, 1964.
M. Z. Nashed, Differentiability and Related Properties of Nonlinear Operators: Some Aspects of the Role of Differentials in Nonlinear Functional Analysis, in Nonlinear Functional Analysis and Applications, Academic Press, New York, 1971.
J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.
A. E. Taylor, An Introduction to Functional Analysis, John Wiley & Sons, New York, 1967.
E. Zeidler, Applied Functional Analysis, Springer-Verlag, New York, 1995.

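Finite Dimensional Spaces

COLUMN VECTORS

\[ x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \in \mathbb{R}^n, \quad x_i \in \mathbb{R}, \qquad\qquad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \in \mathbb{C}^n, \quad x_i \in \mathbb{C} \]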

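NORMS (SAME for Complex Spaces)

For $x = [x_1, x_2, \dots, x_n]^T \in \mathbb{R}^n$:

\[ \|x\|_1 = \sum_{i=1}^n |x_i|, \qquad \|x\|_2 = \Big( \sum_{i=1}^n |x_i|^2 \Big)^{1/2}, \qquad \|x\|_p = \Big( \sum_{i=1}^n |x_i|^p \Big)^{1/p} \ (1 \le p < \infty), \qquad \|x\|_\infty = \max_{1 \le i \le n} |x_i| \]

Here $|r|$ is the absolute value of $r \in \mathbb{R}$ and $|c| = \sqrt{a^2 + b^2}$ is the modulus of the complex number $c = a + ib$.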

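Notation: $\mathbb{R}$ = Real Numbers, $\mathbb{R}_+ = [0, +\infty)$, $\mathbb{C}$ = Complex Numbers, $\mathbb{R}^n$ = n-dimensional Euclidean Space, $\mathbb{C}^n$ = n-dimensional Complex Space, $x^T = [x_1, x_2, \dots, x_n]$.

A Norm on $\mathbb{R}^n$ is a function $\|\cdot\| : \mathbb{R}^n \to \mathbb{R}_+$ satisfying

i) $\|x\| \ge 0$ and $\|x\| = 0$ if and only if $x = 0$
ii) $\|\alpha x\| = |\alpha| \, \|x\|$, for $\alpha \in \mathbb{R}$ (or $\mathbb{C}$) and $x \in \mathbb{R}^n$ (or $\mathbb{C}^n$)
iii) $\|x + y\| \le \|x\| + \|y\|$, for $x, y \in \mathbb{R}^n$ (or $\mathbb{C}^n$) (the triangle inequality)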

Finite Dimensional Spaces

The vector space $\mathbb{R}^n$ with a norm $\|\cdot\|$ is a normed linear space. In particular, the pair $(\mathbb{R}^n, \|\cdot\|_2)$ is called n-dimensional Euclidean space.

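[Figure: the unit spheres $\{x : \|x\|_2 = (\sum_{i=1}^n x_i^2)^{1/2} = 1\}$ and $\{x : \|x\|_1 = \sum_{i=1}^n |x_i| = 1\}$ drawn in the $(x_1, x_2)$ plane]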

The distance between two vectors x and y in Rn is given by dist(x,y) = ||x - y||.

The geometry and differentiability of different norms are DIFFERENT!

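Finite Dimensional Spaces

Equivalent Norm Theorem: If $(X, \|\cdot\|)$ is one of the finite dimensional normed linear spaces $(\mathbb{R}^n, \|\cdot\|)$ or $(\mathbb{C}^n, \|\cdot\|)$, then there are constants $m > 0$ and $M > 0$ such that for any $1 \le p \le \infty$ and all $x \in X$,

\[ m \|x\|_p \le \|x\| \le M \|x\|_p . \]

EXAMPLES

\[ \|x\|_2 \le \|x\|_1 \le \sqrt{n} \, \|x\|_2 \]
\[ \|x\|_\infty \le \|x\|_2 \le \sqrt{n} \, \|x\|_\infty \]
\[ \|x\|_\infty \le \|x\|_1 \le n \, \|x\|_\infty \]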
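The three example inequalities between the 1-, 2- and ∞-norms can be spot-checked numerically. The following is a minimal sketch in plain Python, assuming the inequalities $\|x\|_2 \le \|x\|_1 \le \sqrt{n}\|x\|_2$, $\|x\|_\infty \le \|x\|_2 \le \sqrt{n}\|x\|_\infty$ and $\|x\|_\infty \le \|x\|_1 \le n\|x\|_\infty$; the helper names (`norm_1`, `check_examples`, ...) are illustrative and not part of the lecture.

```python
import math
import random


def norm_1(x):
    """Sum of absolute values."""
    return sum(abs(xi) for xi in x)


def norm_2(x):
    """Euclidean norm."""
    return math.sqrt(sum(xi * xi for xi in x))


def norm_inf(x):
    """Largest absolute entry."""
    return max(abs(xi) for xi in x)


def check_examples(x, tol=1e-12):
    """Return True if x satisfies the three example inequalities (up to rounding)."""
    n = len(x)
    n1, n2, ninf = norm_1(x), norm_2(x), norm_inf(x)
    return (
        n2 <= n1 + tol and n1 <= math.sqrt(n) * n2 + tol
        and ninf <= n2 + tol and n2 <= math.sqrt(n) * ninf + tol
        and ninf <= n1 + tol and n1 <= n * ninf + tol
    )


random.seed(0)  # fixed seed so the experiment is repeatable
samples = [[random.uniform(-5.0, 5.0) for _ in range(4)] for _ in range(1000)]
all_ok = all(check_examples(x) for x in samples)
print(all_ok)
```

A random test like this cannot prove the theorem; it only illustrates that the constants $1$, $\sqrt{n}$ and $n$ are consistent with sampled vectors.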

Finite Dimensional Spaces

Open Ball About A Point: $B(\hat{x}, \delta) = \{ x \in X : \|x - \hat{x}\| < \delta \}$

Closed Ball About A Point: $\bar{B}(\hat{x}, \delta) = \{ x \in X : \|x - \hat{x}\| \le \delta \}$

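For $k = 1, 2, 3, \dots$ let $x^k = [x_1^k \ x_2^k \ \dots \ x_n^k]^T$ be a sequence of vectors in $\mathbb{R}^n$. We say that $\{x^k\}$ converges to $x$ in $\mathbb{R}^n$ if

\[ \lim_{k \to \infty} \|x^k - x\| = 0, \]

and we write $x^k \to x$ as $k \to \infty$. Equivalently: for each $\varepsilon > 0$ there is a $K(\varepsilon) > 1$ such that if $k \ge K(\varepsilon)$, then $\|x^k - x\| < \varepsilon$.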

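Finite Dimensional Spaces

NOTE: Because of the Equivalent Norm Theorem, if $\|\cdot\|_p$ is any norm on $\mathbb{R}^n$, then

\[ \lim_{k \to \infty} \|x^k - x\|_2 = 0 \quad \text{if and only if} \quad \lim_{k \to \infty} \|x^k - x\|_p = 0. \]

NOTE: $x^k \to x$ as $k \to \infty$ IMPLIES that $x_i^k \to x_i$ as $k \to \infty$ for $i = 1, 2, \dots, n$,

AND: $x_i^k \to x_i$ as $k \to \infty$ for all $i = 1, 2, \dots, n$ IMPLIES $x^k \to x$ as $k \to \infty$.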

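Inner Products

An inner product on $\mathbb{R}^n$ is a mapping $\langle \cdot , \cdot \rangle : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ such that

i) $\langle x, x \rangle \ge 0$ and $\langle x, x \rangle = 0$ if and only if $x = 0$, for $x \in \mathbb{R}^n$
ii) $\langle \alpha x, y \rangle = \alpha \langle x, y \rangle$, for $\alpha \in \mathbb{R}$ and $x, y \in \mathbb{R}^n$
iii) $\langle x, y \rangle = \langle y, x \rangle$, for $x, y \in \mathbb{R}^n$
iv) $\langle x + z, y \rangle = \langle x, y \rangle + \langle z, y \rangle$, for $x, y, z \in \mathbb{R}^n$

EXAMPLE: $\langle x, y \rangle = \sum_{i=1}^n x_i y_i = x_1 y_1 + \dots + x_n y_n = x^T y \ (= y^T x)$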

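An inner product on $\mathbb{C}^n$ is a mapping $\langle \cdot , \cdot \rangle : \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}$ such that

i) $\langle x, x \rangle$ is real, $\langle x, x \rangle \ge 0$ and $\langle x, x \rangle = 0$ if and only if $x = 0$, for $x \in \mathbb{C}^n$
ii) $\langle \alpha x, y \rangle = \alpha \langle x, y \rangle$, for $\alpha \in \mathbb{C}$ and $x, y \in \mathbb{C}^n$
iii) $\langle x, y \rangle = \overline{\langle y, x \rangle}$, for $x, y \in \mathbb{C}^n$
iv) $\langle x + z, y \rangle = \langle x, y \rangle + \langle z, y \rangle$, for $x, y, z \in \mathbb{C}^n$

EXAMPLE: $\langle x, y \rangle = \sum_{i=1}^n x_i \bar{y}_i = x_1 \bar{y}_1 + \dots + x_n \bar{y}_n = \bar{y}^T x \ (= y^* x)$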

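Inner Product Spaces

If $\langle \cdot , \cdot \rangle$ is an inner product on $\mathbb{R}^n$, then the pair $(\mathbb{R}^n, \langle \cdot , \cdot \rangle)$ is called an inner product space. Same for $(\mathbb{C}^n, \langle \cdot , \cdot \rangle)$.

NOTE: For all $x, y, z \in \mathbb{C}^n$ and $\alpha \in \mathbb{C}$,

\[ \langle x, y + z \rangle = \overline{\langle y + z, x \rangle} = \overline{\langle y, x \rangle} + \overline{\langle z, x \rangle} = \langle x, y \rangle + \langle x, z \rangle \]

and

\[ \langle x, \alpha y \rangle = \overline{\langle \alpha y, x \rangle} = \overline{\alpha \langle y, x \rangle} = \bar{\alpha} \, \overline{\langle y, x \rangle} = \bar{\alpha} \langle x, y \rangle . \]

A norm $\|\cdot\|$ and an inner product $\langle \cdot , \cdot \rangle$ are compatible if $\|x\| = \sqrt{\langle x, x \rangle}$, or $\|x\|^2 = \langle x, x \rangle$.

EXAMPLE: $\|x\|_2 = \big( \sum_{i=1}^n |x_i|^2 \big)^{1/2} = \sqrt{\langle x, x \rangle}$, where $\langle x, x \rangle = \sum_{i=1}^n x_i \bar{x}_i$.

REMARK: If $p \ne 2$ then there is NO inner product $\langle \cdot , \cdot \rangle$ with $\|x\|_p = \sqrt{\langle x, x \rangle}$.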

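Results for Inner Products

Schwarz Inequality: $|\langle x, y \rangle| \le \|x\| \, \|y\|$

Parallelogram Law: $\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2$

Two vectors $x, y \in \mathbb{R}^n$ (or $\mathbb{C}^n$) are called orthogonal if $\langle x, y \rangle = 0$.

Pythagorean Theorem: If $\langle x, y \rangle = 0$, then $\|x + y\|^2 = \|x\|^2 + \|y\|^2$.

[Figure: the vectors $x$, $y$ and $x + y$, drawn as a parallelogram and as a right triangle]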
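The parallelogram law gives a quick way to see why a p-norm with $p \ne 2$ cannot come from an inner product: any norm compatible with an inner product must satisfy $\|x+y\|^2 + \|x-y\|^2 = 2\|x\|^2 + 2\|y\|^2$. The sketch below, in plain Python with illustrative helper names, evaluates the gap between the two sides for the 2-norm and the 1-norm on the standard unit vectors of $\mathbb{R}^2$.

```python
def p_norm(x, p):
    """The p-norm (sum |x_i|^p)^(1/p) for 1 <= p < infinity."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)


def parallelogram_gap(x, y, p):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2, measured in the p-norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return (p_norm(s, p) ** 2 + p_norm(d, p) ** 2
            - 2 * p_norm(x, p) ** 2 - 2 * p_norm(y, p) ** 2)


x, y = [1.0, 0.0], [0.0, 1.0]
gap_2 = parallelogram_gap(x, y, 2)  # zero up to rounding: the law holds
gap_1 = parallelogram_gap(x, y, 1)  # 4 + 4 - 2 - 2 = 4: the law fails
print(gap_2, gap_1)
```

A single pair of vectors violating the law is enough to rule out a compatible inner product for that norm.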

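Matrix Notation

We use standard matrix notation ... for real matrices

\[ \mathbf{A} = [a_{i,j}] = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix} \in \mathbb{R}^{m \times n}, \qquad \mathbf{A}^T = [a_{i,j}]^T = \begin{bmatrix} a_{1,1} & a_{2,1} & \cdots & a_{m,1} \\ a_{1,2} & a_{2,2} & \cdots & a_{m,2} \\ \vdots & \vdots & & \vdots \\ a_{1,n} & a_{2,n} & \cdots & a_{m,n} \end{bmatrix} \in \mathbb{R}^{n \times m} \]

\[ \mathbf{I} = \mathbf{I}_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \in \mathbb{R}^{n \times n} \]

AND standard terms ... diagonal, tridiagonal, upper triangular, ...

Symmetric: $\mathbf{A}^T = \mathbf{A}$
Skew-symmetric: $\mathbf{A}^T = -\mathbf{A}$
Positive definite: $x^T \mathbf{A} x > 0$ for all $x \ne 0$, $x \in \mathbb{R}^n$
Orthogonal: $\mathbf{A}^T \mathbf{A} = \mathbf{I}$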

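Matrix Notation

We use standard matrix notation ... for complex matrices

\[ \mathbf{A} = [a_{i,j}] = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix} \in \mathbb{C}^{m \times n}, \qquad \mathbf{A}^T = \begin{bmatrix} a_{1,1} & a_{2,1} & \cdots & a_{m,1} \\ a_{1,2} & a_{2,2} & \cdots & a_{m,2} \\ \vdots & \vdots & & \vdots \\ a_{1,n} & a_{2,n} & \cdots & a_{m,n} \end{bmatrix} \in \mathbb{C}^{n \times m} \]

\[ \mathbf{A}^* = \bar{\mathbf{A}}^T = \begin{bmatrix} \bar{a}_{1,1} & \bar{a}_{2,1} & \cdots & \bar{a}_{m,1} \\ \bar{a}_{1,2} & \bar{a}_{2,2} & \cdots & \bar{a}_{m,2} \\ \vdots & \vdots & & \vdots \\ \bar{a}_{1,n} & \bar{a}_{2,n} & \cdots & \bar{a}_{m,n} \end{bmatrix} \in \mathbb{C}^{n \times m}, \qquad \mathbf{I} = \mathbf{I}_n \in \mathbb{R}^{n \times n} \]

AND standard terms ... diagonal, tridiagonal, upper triangular, ...

Self-adjoint (Hermitian): $\mathbf{A}^* = \mathbf{A}$
Skew-adjoint: $\mathbf{A}^* = -\mathbf{A}$
Positive definite: $x^* \mathbf{A} x > 0$ for all $x \ne 0$, $x \in \mathbb{C}^n$
Unitary: $\mathbf{A}^* \mathbf{A} = \mathbf{I}$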

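Matrix Notation

MATRIX NORMS ... for $\mathbf{A} = [a_{i,j}] \in \mathbb{R}^{m \times n}$

FROBENIUS NORM ...

\[ \|\mathbf{A}\|_F = \Big( \sum_{i=1}^m \sum_{j=1}^n |a_{i,j}|^2 \Big)^{1/2} \]

P - NORM ...

\[ \|\mathbf{A}\|_p = \sup_{x \ne 0} \frac{\|\mathbf{A} x\|_p}{\|x\|_p} \]

FACTS

\[ \|\mathbf{A}\|_2 \le \|\mathbf{A}\|_F \le \sqrt{n} \, \|\mathbf{A}\|_2, \qquad \max_{i,j} |a_{i,j}| \le \|\mathbf{A}\|_2 \le \sqrt{mn} \, \max_{i,j} |a_{i,j}| \]

\[ \|\mathbf{A}\|_2 \le \sqrt{\|\mathbf{A}\|_1 \, \|\mathbf{A}\|_\infty} \]

\[ \|\mathbf{A}\|_1 = \max_{1 \le j \le n} \sum_{i=1}^m |a_{i,j}|, \qquad \|\mathbf{A}\|_\infty = \max_{1 \le i \le m} \sum_{j=1}^n |a_{i,j}| \]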

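Matrix Notation

Let $\mathbf{A} = [a_{i,j}] \in \mathbb{R}^{m \times n}$ and $\mathbf{B} = [b_{i,j}] \in \mathbb{R}^{n \times s}$.

THESE MATRIX NORMS ARE MUTUALLY CONSISTENT ...

\[ \|\mathbf{A}\mathbf{B}\|_F \le \|\mathbf{A}\|_F \, \|\mathbf{B}\|_F, \qquad \|\mathbf{A}\mathbf{B}\|_p \le \|\mathbf{A}\|_p \, \|\mathbf{B}\|_p \]

IF $\mathbf{Q}$ AND $\mathbf{Z}$ ARE BOTH ORTHOGONAL, THEN

\[ \|\mathbf{Q}\mathbf{A}\mathbf{Z}\|_F = \|\mathbf{A}\|_F, \qquad \|\mathbf{Q}\mathbf{A}\mathbf{Z}\|_2 = \|\mathbf{A}\|_2 \]

Gene H. Golub and Charles F. Van Loan, Matrix Computations, third edition, The Johns Hopkins University Press, London, 1996.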
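The Frobenius norm and the 1- and ∞-norms have closed-form expressions (square root of the sum of squares, largest absolute column sum, largest absolute row sum), so the consistency inequality $\|AB\| \le \|A\|\,\|B\|$ can be checked directly on a small example. This is a minimal sketch in plain Python; the matrices and function names are illustrative choices, not taken from the lecture.

```python
import math


def matmul(A, B):
    """Product of an m-by-n and an n-by-s matrix stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def frobenius(A):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(a * a for row in A for a in row))


def norm_1(A):
    """Largest absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))


def norm_inf(A):
    """Largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


A = [[1.0, -2.0, 0.0],
     [3.0, 4.0, -1.0]]          # 2 x 3
B = [[2.0, 1.0],
     [0.0, -1.0],
     [5.0, 3.0]]                # 3 x 2
AB = matmul(A, B)               # [[2, 3], [1, -4]]

for name, norm in (("F", frobenius), ("1", norm_1), ("inf", norm_inf)):
    print(name, norm(AB), "<=", norm(A) * norm(B))
```

The 2-norm is left out on purpose: it has no entrywise formula and would need a singular value computation.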

Functions of n Variables

Let $D(F) \subseteq \mathbb{R}^n$ be a subset of $\mathbb{R}^n$ and let $F: D(F) \to \mathbb{R}^m$ be a function with domain $D(F) \subseteq \mathbb{R}^n$ and range $R(F)$ in $\mathbb{R}^m$.

One of the most fundamental problems in engineering and science is: Given $y \in \mathbb{R}^m$, find $x \in D(F) \subseteq \mathbb{R}^n$ such that

F(x) = y (1)

EXISTENCE: If F is onto $\mathbb{R}^m$ (i.e., if $R(F) = \mathbb{R}^m$), then there exists at least one solution to (1).

UNIQUENESS: If F is one-to-one (i.e., $F(x_1) = F(x_2)$ implies that $x_1 = x_2$), then the solution to (1) is unique.

MOST IMPORTANT PROBLEMS

ACTUALLY COMPUTING THE SOLUTION

FIND CHECKABLE CONDITIONS FOR EXISTENCE & UNIQUENESS

DEVELOP FAST & ACCURATE ALGORITHMS - SOFTWARE TOOLS

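Functions of n Variables

The function F is locally one-to-one at the point $x \in D(F) \subseteq \mathbb{R}^n$ if there is a $\delta > 0$ such that F restricted to the set $D(F) \cap B(x, \delta)$ is one-to-one.

The function F is one-to-one on the set $\Omega \subseteq D(F) \subseteq \mathbb{R}^n$ if F restricted to the set $\Omega$ is one-to-one.

If F is one-to-one on the set $\Omega \subseteq D(F) \subseteq \mathbb{R}^n$, then there exists a function $F^{-1}$ from $F(\Omega) \subseteq \mathbb{R}^m$ into $\mathbb{R}^n$ such that

$F(F^{-1}(y)) = y$ for all $y \in F(\Omega)$ AND $F^{-1}(F(x)) = x$ for all $x \in \Omega$.

The function $A: D(A) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is linear if

i) $D(A)$ is a linear subspace of $\mathbb{R}^n$

AND

ii) If $\alpha, \beta \in \mathbb{R}$ and $x_1, x_2 \in D(A)$, then $A(\alpha x_1 + \beta x_2) = \alpha A(x_1) + \beta A(x_2)$.

A linear function $\ell: \mathbb{R}^n \to \mathbb{R}$ is called a linear functional.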

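Functions of n Variables

Let $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ and $G: D(G) \subseteq \mathbb{R}^m \to \mathbb{R}^k$ be functions. The composite function $G \circ F : D(G \circ F) \subseteq \mathbb{R}^n \to \mathbb{R}^k$ is defined on the domain

\[ D(G \circ F) = \{ x \in D(F) : F(x) \in D(G) \} \subseteq D(F) \subseteq \mathbb{R}^n \]

by $[G \circ F](x) = G(F(x))$.

[Figure: diagram of the composition, $x \in \mathbb{R}^n \mapsto F(x) \in \mathbb{R}^m \mapsto G(F(x)) \in \mathbb{R}^k$, with arrows labelled $F$, $G$ and $G \circ F$]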

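Calculus

If $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is a function with domain $D(F) \subseteq \mathbb{R}^n$ and range $R(F)$ in $\mathbb{R}^m$, then F is continuous at $p \in \mathbb{R}^n$ if

i) $p \in D(F)$
ii) For each $\varepsilon > 0$, there is a $\delta = \delta(p, \varepsilon) > 0$ such that if $x \in D(F) \cap B(p, \delta)$, then $\|F(x) - F(p)\|_{\mathbb{R}^m} < \varepsilon$.

We say that F is continuous if F is continuous at each $x \in D(F)$.

NOTE: i) and ii) are equivalent to ...

\[ \lim_{\|x - p\|_{\mathbb{R}^n} \to 0} \|F(x) - F(p)\|_{\mathbb{R}^m} = 0 \]

The function $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is a contraction if there is a constant $\alpha$ with $0 \le \alpha < 1$ such that if $x_0, x_1 \in D(F)$, then $\|F(x_1) - F(x_0)\|_{\mathbb{R}^m} \le \alpha \, \|x_1 - x_0\|_{\mathbb{R}^n}$.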

Abuse of Notation

Let $D(F) \subseteq \mathbb{R}^n$ be a subset of $\mathbb{R}^n$ and let $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ be a function with domain $D(F)$ in $\mathbb{R}^n$ and range $R(F)$ in $\mathbb{R}^m$. With $x = [x_1 \ x_2 \ \dots \ x_n]^T$ we write

\[ F(x) = F\left( \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \right) = \begin{bmatrix} F_1(x) \\ F_2(x) \\ \vdots \\ F_m(x) \end{bmatrix} = \begin{bmatrix} F_1(x_1, x_2, \dots, x_n) \\ F_2(x_1, x_2, \dots, x_n) \\ \vdots \\ F_m(x_1, x_2, \dots, x_n) \end{bmatrix} = F(x_1, x_2, \dots, x_n). \]

Notation

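If $A: D(A) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is a linear function with domain $D(A)$, then one can show that A is continuous. Moreover, A is bounded. In particular, there is an $M \ge 0$ such that for all $x \in D(A)$ one has

\[ \|A(x)\|_{\mathbb{R}^m} \le M \, \|x\|_{\mathbb{R}^n}. \]

The operator norm of A is defined to be

\[ \|A\| = \sup_{x \ne 0} \frac{\|A(x)\|_{\mathbb{R}^m}}{\|x\|_{\mathbb{R}^n}} . \]

RECALL: All linear operators from $\mathbb{R}^n$ to $\mathbb{R}^m$ have a matrix representation. Actually, if one selects bases for $\mathbb{R}^n$ and $\mathbb{R}^m$, then there exists an $m \times n$ matrix

\[ \mathbf{A} = [a_{i,j}] = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix} \in \mathbb{R}^{m \times n} \]

such that

\[ A(x) = \mathbf{A} x \qquad (A \text{ is the OPERATOR}, \ \mathbf{A} \text{ is the MATRIX}). \]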

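Matrix Representations

The standard unit vectors are $e_i = [0 \ 0 \ \dots \ 0 \ 1 \ 0 \ \dots \ 0]^T$, with the 1 in the ith position.

EXAMPLE: Let $A(x)$ rotate $x$ clockwise by $90^\circ$, so that $x = [x_1, x_2]^T$ is mapped to $[x_2, -x_1]^T$. Then

\[ A(x) = \mathbf{A} x = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_2 \\ -x_1 \end{bmatrix} . \]

WARNING: The use of matrices as representations of linear operators is important. However, be sure not to confuse the representation (the MATRIX $\mathbf{A}$) with the operator $A$.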
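The distinction between the operator and its matrix can be made concrete: the operator is a rule acting on vectors, and the matrix (with respect to the standard basis) is assembled column by column from the images of the standard unit vectors. The sketch below uses the clockwise 90-degree rotation $[x_1, x_2]^T \mapsto [x_2, -x_1]^T$; the function names are illustrative and not part of the lecture.

```python
def rotate_clockwise_90(x):
    """The operator A: R^2 -> R^2, [x1, x2] -> [x2, -x1]."""
    x1, x2 = x
    return [x2, -x1]


def matrix_of(operator, n):
    """Matrix of a linear operator on R^n in the standard basis.

    Column i of the matrix is the image of the standard unit vector e_i.
    """
    columns = []
    for i in range(n):
        e_i = [1.0 if j == i else 0.0 for j in range(n)]
        columns.append(operator(e_i))
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def apply(matrix, x):
    """Matrix-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in matrix]


A_matrix = matrix_of(rotate_clockwise_90, 2)   # rows [0, 1] and [-1, 0]
x = [3.0, 4.0]
print(A_matrix, apply(A_matrix, x), rotate_clockwise_90(x))
```

Choosing a different basis would change `A_matrix` but not the operator `rotate_clockwise_90`, which is the point of the warning.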

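Partial Derivatives

Let $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^1$ be a real-valued function of n variables. The partial derivative $\partial_{x_i} F(p)$ at p is a number $D_i = D_i(p)$ satisfying: For each $\varepsilon > 0$, there is a $\delta = \delta(p, \varepsilon) > 0$ such that for each t with $-\delta < t < \delta$, $[p + t e_i] \in D(F)$ and

\[ | F(p + t e_i) - F(p) - D_i t | \le \varepsilon |t| \qquad (2) \]

or equivalently,

\[ | F(p_1, p_2, \dots, p_i + t, \dots, p_n) - F(p_1, p_2, \dots, p_i, \dots, p_n) - D_i t | \le \varepsilon |t|. \qquad (3) \]

That is,

\[ \lim_{t \to 0} \Big| \tfrac{1}{t} [ F(p + t e_i) - F(p) ] - D_i \Big| = 0 . \]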

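Partial Derivatives

NOTE: The partial derivatives will be denoted by several symbols ...

\[ D_i F(p) = \partial_{x_i} F(p) = \frac{\partial F(p)}{\partial x_i} = \frac{\partial F(x)}{\partial x_i} \Big|_{x = p} \]

APPLYING THE CHAIN RULE (IF POSSIBLE)

\[ \frac{d}{dt} F(p + t e_i) \Big|_{t = 0} = \partial_{x_i} F(p) = \frac{\partial F(x)}{\partial x_i} \Big|_{x = p} \]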

IMPORTANT REMARK: The partial derivative of a real-valued function F at a point p is a NUMBER!
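Since the partial derivative is the limit of the quotient $(1/t)[F(p + t e_i) - F(p)]$, it can be approximated by evaluating that quotient at a small $t$. The following minimal sketch does this for a hypothetical test function $F(x_1, x_2) = x_1^2 x_2 + 3 x_2$ (chosen only for illustration), whose exact partial derivatives at $p = (1, 2)$ are both 4.

```python
def partial(F, p, i, t=1e-6):
    """Difference quotient (1/t)[F(p + t e_i) - F(p)] for a small fixed t."""
    shifted = list(p)
    shifted[i] += t
    return (F(shifted) - F(p)) / t


def F(x):
    # Hypothetical test function: F(x1, x2) = x1^2 * x2 + 3 * x2
    return x[0] ** 2 * x[1] + 3.0 * x[1]


p = [1.0, 2.0]
d1 = partial(F, p, 0)   # exact value 2 * x1 * x2 = 4
d2 = partial(F, p, 1)   # exact value x1^2 + 3 = 4
print(d1, d2)
```

Each result is a single number, in line with the remark above; the step `t` trades truncation error against rounding error and is an arbitrary choice here.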

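HIGHER ORDER PARTIAL DERIVATIVES ...

\[ \partial_{x_i}^2 F(p) = \frac{\partial^2 F(x)}{\partial x_i^2} \Big|_{x = p} \]

NEED GENERAL NOTATION: Let $s = (s_1, s_2, \dots, s_n)$ be a multi-index, where each $s_i$ is a non-negative integer. Let $|s| = s_1 + s_2 + \dots + s_n$ and define the mixed partial derivative $D^s F(p)$ by

\[ D^s F(p) = \frac{\partial^{\,s_1 + s_2 + \dots + s_n}}{\partial x_1^{s_1} \, \partial x_2^{s_2} \cdots \partial x_n^{s_n}} F(p) . \]

AGREE THAT

\[ D^{(0, 0, \dots, 0)} F(p) = D^0 F(p) = F(p) . \]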

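Partial Derivatives

EXAMPLE: In $\mathbb{R}^2$ if $s = (1, 1)$, then

\[ D^s F(p) = D^{(1,1)} F(p) = \frac{\partial^{1+1}}{\partial x_1^1 \, \partial x_2^1} F(p) = \frac{\partial^2}{\partial x_1 \, \partial x_2} F(p) . \]

EXAMPLE: In $\mathbb{R}^2$ if $s = (2, 0)$, then

\[ D^s F(p) = D^{(2,0)} F(p) = \frac{\partial^{2+0}}{\partial x_1^2 \, \partial x_2^0} F(p) = \frac{\partial^2}{\partial x_1^2} F(p) . \]

EXAMPLE: In $\mathbb{R}^2$ if $s = (2, 0)$ and $r = (0, 2)$, then

\[ [D^s + D^r] F(p) = D^s F(p) + D^r F(p) = D^{(2,0)} F(p) + D^{(0,2)} F(p) = \frac{\partial^2}{\partial x_1^2} F(p) + \frac{\partial^2}{\partial x_2^2} F(p) . \]

EXAMPLE: In $\mathbb{R}^2$ if $s = (1, 0)$ and $r = (0, 1)$, then

\[ [D^s + D^r] F(p) = D^{(1,0)} F(p) + D^{(0,1)} F(p) = \frac{\partial}{\partial x_1} F(p) + \frac{\partial}{\partial x_2} F(p) . \]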

Variations and Differentials

Let $\eta \in \mathbb{R}^n$ be a "direction". If there is a $\delta = \delta(p, \eta) > 0$ such that for each t with $-\delta < t < \delta$, $[p + t\eta] \in D(F)$ and the limit (in $\mathbb{R}^1$)

\[ \frac{d}{dt} [F(p + t\eta)] \Big|_{t=0} = \lim_{t \to 0} \Big( \tfrac{1}{t} \Big) [F(p + t\eta) - F(p)] \]

exists, then this limit is called the directional derivative of F at p in the direction $\eta$.

There are various notations and terms used for this derivative:

\[ \delta F(p; \eta) = \frac{d}{dt} [F(p + t\eta)] \Big|_{t=0} = V(p, \eta) \]

• The first variation of F at p in the direction of $\eta$
• The directional derivative of F at p in the direction $\eta$ (misleading)
• The Gateaux variation at p in the direction $\eta$ ...

If $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^1$, then the partial derivative $\partial_{x_i} F(p)$ at p is the first variation of F in the direction of the unit vector $e_i$:

\[ \frac{\partial F(p)}{\partial x_i} = \partial_{x_i} F(p) = \lim_{t \to 0} \Big( \tfrac{1}{t} \Big) [F(p + t e_i) - F(p)] = \delta F(p; e_i) . \]

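Variations and Differentials

? WHEN DO THESE VARIATIONS EXIST ?

\[ \delta F(p; \eta) = \frac{d}{dt} [F(p + t\eta)] \Big|_{t=0} \]

REQUIRES: For sufficiently small t, $p + t\eta \in D(F)$ AND

\[ \lim_{t \to 0} \tfrac{1}{t} [F(p + t\eta) - F(p)] \quad \text{EXISTS.} \]

If for all $i = 1, 2, \dots, n$ the partial derivatives

\[ \partial_{x_i} F(p) = \frac{\partial F(p)}{\partial x_i} = \frac{\partial F(x)}{\partial x_i} \Big|_{x = p} \]

exist, then we can define the gradient vector

\[ \nabla F(p) = \begin{bmatrix} \partial_{x_1} F(p) \\ \partial_{x_2} F(p) \\ \vdots \\ \partial_{x_n} F(p) \end{bmatrix} . \]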

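Variations and Differentials

IF the CHAIN RULE can be used ...

\[ \frac{d}{dt} F(p_1 + t\eta_1, p_2 + t\eta_2, \dots, p_n + t\eta_n) = \sum_{i=1}^n \partial_{x_i} F(p_1 + t\eta_1, p_2 + t\eta_2, \dots, p_n + t\eta_n) \, \eta_i \]

At t = 0

\[ \frac{d}{dt} F(p_1 + t\eta_1, p_2 + t\eta_2, \dots, p_n + t\eta_n) \Big|_{t=0} = \sum_{i=1}^n \partial_{x_i} F(p_1, p_2, \dots, p_n) \, \eta_i = \langle \eta, \nabla F(p) \rangle \]

Thus

\[ \delta F(p; \eta) = \langle \eta, \nabla F(p) \rangle = [\nabla F(p)]^T \eta \qquad (4) \]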

WARNING: (4) does not always hold!
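The warning can be illustrated numerically by comparing the first variation $\delta F(p;\eta) \approx (1/t)[F(p+t\eta) - F(p)]$ with the gradient formula $\langle \eta, \nabla F(p) \rangle$ from (4). The sketch below uses two hypothetical functions that are not from the lecture: a polynomial, for which the two agree, and $F(x,y) = x^2 y/(x^2+y^2)$ with $F(0,0)=0$, for which both partial derivatives at the origin are zero while the first variation in the direction $(1,1)$ is $1/2$.

```python
def first_variation(F, p, eta, t=1e-6):
    """Difference quotient (1/t)[F(p + t*eta) - F(p)] approximating dF(p; eta)."""
    shifted = [pi + t * ei for pi, ei in zip(p, eta)]
    return (F(shifted) - F(p)) / t


def gradient(F, p, t=1e-6):
    """Vector of approximate partial derivatives (variations along e_i)."""
    n = len(p)
    return [first_variation(F, p, [1.0 if j == i else 0.0 for j in range(n)], t)
            for i in range(n)]


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def smooth(x):
    # Hypothetical smooth example: F(x, y) = x^2 + 3xy
    return x[0] ** 2 + 3.0 * x[0] * x[1]


def kinked(x):
    # Hypothetical example where (4) fails at the origin
    if x[0] == 0.0 and x[1] == 0.0:
        return 0.0
    return x[0] ** 2 * x[1] / (x[0] ** 2 + x[1] ** 2)


eta = [1.0, 1.0]
lhs_smooth = first_variation(smooth, [1.0, 2.0], eta)       # about 11
rhs_smooth = inner(eta, gradient(smooth, [1.0, 2.0]))       # about 11
lhs_kinked = first_variation(kinked, [0.0, 0.0], eta)       # about 0.5
rhs_kinked = inner(eta, gradient(kinked, [0.0, 0.0]))       # exactly 0
print(lhs_smooth, rhs_smooth, lhs_kinked, rhs_kinked)
```

For the second function every directional limit exists, yet the map $\eta \mapsto \delta F(0;\eta)$ is not linear, so the gradient cannot represent it.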

The Gateaux Derivative

NOTE: $\nabla F(p) \in \mathbb{R}^n$ is a vector. It is not a "derivative". So, what do we mean by a "derivative"? We start with a weak notion of the derivative and then extend to the stronger notion.

If $\delta F(p; \eta)$ exists for all $\eta \in \mathbb{R}^n$ AND is linear in $\eta$, then we can define the Gateaux Derivative.

Assume $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^1$ and $p \in \mathrm{int}[D(F)]$. If the first variation $\delta F(p; \eta)$ of F at p exists for all $\eta \in \mathbb{R}^n$ AND for $\eta_1, \eta_2 \in \mathbb{R}^n$,

\[ \delta F(p; \eta_1 + \eta_2) = \delta F(p; \eta_1) + \delta F(p; \eta_2), \]

(i.e. $\delta F(p; \eta)$ is linear in $\eta$), then we say that F is Gateaux-differentiable at p. If $\ell = \ell_p$ is the (unique) linear functional $\ell_p: \mathbb{R}^n \to \mathbb{R}^1$ defined by $\ell_p(\eta) = \delta F(p; \eta)$, then $\ell_p$ is called the Gateaux Derivative of F at p.

The Gateaux Derivative is a linear functional.

Gateaux Derivative

To make the Gateaux derivative something useful, we need the following version of the Riesz Representation Theorem.

Riesz Representation Theorem. If $\ell: \mathbb{R}^n \to \mathbb{R}^1$ is a linear functional on $\mathbb{R}^n$, then

i) $\ell$ is continuous,

and

ii) there exists a fixed vector $a \in \mathbb{R}^n$ (depending on $\ell$) such that for all $\eta \in \mathbb{R}^n$

\[ \ell(\eta) = \langle \eta, a \rangle \qquad (5) \]

If F is Gateaux differentiable at p we can apply the Riesz Representation Theorem to $\ell_p(\eta) = \delta F(p; \eta)$, so it follows that there is a unique vector $a = a(p) \in \mathbb{R}^n$ such that for all $\eta \in \mathbb{R}^n$,

\[ \delta F(p; \eta) = \ell_p(\eta) = \langle \eta, a(p) \rangle = [a(p)]^T \eta = \sum_{i=1}^n [a(p)]_i \, \eta_i . \qquad (6) \]

Gateaux Derivative

IMPORTANT: The linear functional $\ell_p(\eta) = \delta F(p; \eta)$ is the Gateaux derivative at p. The vector $a(p)$ is just a matrix representation of this linear functional.

Note that the linear functional $\delta F(p; \cdot) = \ell_p: \mathbb{R}^n \to \mathbb{R}^1$ is the Gateaux derivative of F at $p \in \mathrm{int}[D(F)]$ if and only if for each $\eta \in \mathbb{R}^n$

\[ \lim_{t \to 0} \Big| (1/t) [F(p + t\eta) - F(p)] - \ell_p(\eta) \Big| = 0 \qquad (7) \]

OR EQUIVALENTLY

\[ \lim_{t \to 0} \Big| (1/t) [F(p + t\eta) - F(p)] - [a(p)]^T \eta \Big| = 0 \qquad (8) \]

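Gateaux Derivative

If $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^1$ is Gateaux-differentiable at $p \in \mathrm{int}[D(F)]$, then the partial derivatives $\partial_{x_i} F(p)$ at p exist. Let $a(p) = [a(p)_1, a(p)_2, \dots, a(p)_n]^T \in \mathbb{R}^n$ be the vector defined by (5). It follows that

\[ a(p) = \nabla F(p) = \begin{bmatrix} \partial_{x_1} F(p) \\ \partial_{x_2} F(p) \\ \vdots \\ \partial_{x_n} F(p) \end{bmatrix} . \qquad (9) \]

We have the following theorem.

Theorem D1. If $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^1$ is Gateaux differentiable at $p \in \mathrm{int}[D(F)]$, then the gradient of F at p exists, and $\ell_p(\eta) = \delta F(p; \eta)$ has the representation

\[ \delta F(p; \eta) = \langle \eta, \nabla F(p) \rangle = \sum_{i=1}^n \partial_{x_i} F(p) \, \eta_i . \qquad (10) \]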

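Gateaux Derivative

RECALL ... If $A: D(A) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is a linear function with domain $D(A) \subseteq \mathbb{R}^n$ and range $R(A)$ in $\mathbb{R}^m$, then the operator norm of A is defined to be

\[ \|A\| = \sup_{x \ne 0} \frac{\|A(x)\|_{\mathbb{R}^m}}{\|x\|_{\mathbb{R}^n}} . \qquad (11) \]

If $\delta F(p; \cdot) = \ell_p : \mathbb{R}^n \to \mathbb{R}^1$ is the Gateaux derivative of F at p, then it is a linear functional and the (operator) norm of $\ell_p$ is given by

\[ \|\ell_p\| = \sup_{\eta \ne 0} \frac{|\ell_p(\eta)|}{\|\eta\|_{\mathbb{R}^n}} = \sup_{\eta \ne 0} \frac{|\langle \eta, \nabla F(p) \rangle|}{\|\eta\|_{\mathbb{R}^n}} = \sup_{\eta \ne 0} \frac{|[\nabla F(p)]^T \eta|}{\|\eta\|_{\mathbb{R}^n}} . \qquad (12) \]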

The Derivative

BUT … what is "the derivative" of a function $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^1$? Best to think of the derivative as a linear function that approximates F locally.

EXAMPLE: $F(x) = (x + 1)^2 - 1$, so that $\delta F(p; \eta) = 2(p + 1)\eta$.

Let $\ell_p$ denote the linear function with slope $F'(p) = 2(p + 1)$. If we look at p = 0, then $\ell_0(\eta) = 2\eta$ is a linear approximation to $F(\cdot)$ near p = 0.

[Figure: graph of $F(\eta)$ together with the line $\ell_0(\eta) = \delta F(0; \eta)$, which is tangent to the graph at p = 0]
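The sense in which a linear function "approximates F locally" can be checked for the example $F(x) = (x+1)^2 - 1$ with candidate linear map $\ell_p(\eta) = 2(p+1)\eta$: the error $F(p+\eta) - F(p) - \ell_p(\eta)$ equals $\eta^2$, so the error divided by $|\eta|$ tends to zero. A minimal sketch of that computation (variable names are illustrative):

```python
def F(x):
    return (x + 1.0) ** 2 - 1.0


def ell(p, eta):
    """Candidate linear approximation at p: slope F'(p) = 2(p + 1)."""
    return 2.0 * (p + 1.0) * eta


p = 0.0
etas = [1e-1, 1e-2, 1e-3]
# Approximation error relative to the size of the step eta.
quotients = [abs(F(p + eta) - F(p) - ell(p, eta)) / abs(eta) for eta in etas]
print(quotients)   # each quotient is close to the corresponding eta
```

The quotients shrink in proportion to $\eta$, which is exactly the behaviour required of a derivative in the definitions that follow.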

The Fréchet Derivative

BASIC IDEA … If $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^1$, then the (Fréchet) derivative of F at a point p is a linear function $D: \mathbb{R}^n \to \mathbb{R}^1$ that "approximates F near p".

Let $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^1$ be a real-valued function and assume $p \in \mathrm{int}[D(F)]$. We say that F is (Fréchet) differentiable at p if there exists a linear functional $D: \mathbb{R}^n \to \mathbb{R}^1$ such that for each $\varepsilon > 0$, there is a $\delta = \delta(p, \varepsilon) > 0$ such that if

\[ 0 < \|x - p\|_{\mathbb{R}^n} < \delta, \]

then

i) $x \in D(F)$ (13)

ii) $| F(x) - F(p) - D(x - p) | \le \varepsilon \, \|x - p\|_{\mathbb{R}^n}$ (14)

Notation varies and we shall use ... $D = [DF(p)] = D_x F(p) = F'(p)$

The (Fréchet) Derivative is the LINEAR OPERATOR $D_x F(p)$

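The Fréchet Derivative

Let $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^1$ be (Fréchet) differentiable at p with derivative $D_x F(p): \mathbb{R}^n \to \mathbb{R}^1$. Given any $\varepsilon > 0$, there is a $\delta = \delta(p, \varepsilon) > 0$ such that if $0 < \|x - p\|_{\mathbb{R}^n} < \delta$, then $x \in D(F)$ and

\[ | F(x) - F(p) - [D_x F(p)](x - p) | \le \varepsilon \, \|x - p\|_{\mathbb{R}^n} . \]

Let $\eta = x - p$. Note that $p + \eta = x \in D(F)$, $0 < \|\eta\|_{\mathbb{R}^n} < \delta$ and

\[ | F(p + \eta) - F(p) - [D_x F(p)](\eta) | \le \varepsilon \, \|\eta\|_{\mathbb{R}^n} \]

or equivalently,

\[ (1 / \|\eta\|_{\mathbb{R}^n}) \, | F(p + \eta) - F(p) - [D_x F(p)](\eta) | \le \varepsilon . \]

Thus, the linear functional $D_x F(p): \mathbb{R}^n \to \mathbb{R}^1$ is the (Fréchet) derivative of the function $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^1$ at x = p if and only if

\[ \lim_{\|\eta\|_{\mathbb{R}^n} \to 0} \frac{1}{\|\eta\|_{\mathbb{R}^n}} \Big| [F(p + \eta) - F(p)] - [D_x F(p)](\eta) \Big| = 0 . \qquad (15) \]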

The Fréchet Derivative

Theorem D2. If $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^1$ has a Fréchet derivative at $p \in \mathrm{int}[D(F)]$, then F has a Gateaux derivative at p and the two derivatives are equal. In particular, $[D_x F(p)](\eta) = \ell_p(\eta) = \delta F(p; \eta)$.

COMMENTS

There exist functions $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^1$ such that at $p \in \mathrm{int}[D(F)]$:

(1) F has a Gateaux differential (first variation) at p but F is not Gateaux differentiable at p.

(2) F has a Gateaux derivative at p but F is not Fréchet differentiable at p.

Problem (P1): Let $F: \mathbb{R}^2 \to \mathbb{R}^1$ be defined by

\[ F(x, y) = \begin{cases} \dfrac{x y^2}{x^2 + y^4}, & (x, y) \ne (0, 0) \\ 0, & (x, y) = (0, 0). \end{cases} \qquad (16) \]

Show that $\delta F(0; \eta)$ exists for all $\eta \in \mathbb{R}^2$ but F does not have a Gateaux derivative at $0 = [0, 0]^T$. Also, F is NOT continuous at 0!
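Before attempting the proof, the claims in Problem (P1) can be explored numerically. The sketch below assumes that the function in (16) is $F(x,y) = x y^2/(x^2 + y^4)$ away from the origin and $F(0,0) = 0$ (a reading of the formula, stated here as an assumption). It estimates the first variation at the origin in three directions and evaluates F along the parabola $x = y^2$.

```python
def F(x, y):
    # Assumed form of (16): x*y^2 / (x^2 + y^4), with F(0, 0) = 0.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y ** 2 / (x ** 2 + y ** 4)


def first_variation(eta, t=1e-6):
    """Difference quotient (1/t)[F(t*eta) - F(0)] at the origin."""
    return (F(t * eta[0], t * eta[1]) - F(0.0, 0.0)) / t


v1 = first_variation([1.0, 0.0])    # 0
v2 = first_variation([0.0, 1.0])    # 0
v12 = first_variation([1.0, 1.0])   # about 1, not v1 + v2
along_parabola = [F(s ** 2, s) for s in (1e-1, 1e-2, 1e-3)]   # all about 1/2
print(v1, v2, v12, along_parabola)
```

The variation in the direction $(1,1)$ is not the sum of the variations in the directions $(1,0)$ and $(0,1)$, which hints at the failure of linearity, and the values near $1/2$ along the parabola hint at the discontinuity at the origin. These are numerical observations, not a proof.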

The Fréchet Derivative

Problem (P2): Let $F: \mathbb{R}^2 \to \mathbb{R}^1$ be defined by

\[ F(x, y) = \begin{cases} \dfrac{x^3 y}{x^4 + y^2}, & (x, y) \ne (0, 0) \\ 0, & (x, y) = (0, 0). \end{cases} \qquad (17) \]

Show that F has a Gateaux derivative $\delta F(0; \eta)$ at $p = 0 = [0, 0]^T$, but F does not have a Fréchet derivative at $p = 0 = [0, 0]^T$.
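The difference between the two notions in Problem (P2) shows up when the Gateaux quotient (along a fixed straight direction) is compared with the Fréchet quotient (along an arbitrary approach to the origin). The sketch below assumes that (17) is $F(x,y) = x^3 y/(x^4 + y^2)$ with $F(0,0)=0$, and that the candidate derivative at the origin is the zero functional; both are stated assumptions of this illustration.

```python
import math


def F(x, y):
    # Assumed form of (17): x^3*y / (x^4 + y^2), with F(0, 0) = 0.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x ** 3 * y / (x ** 4 + y ** 2)


def gateaux_quotient(eta, t):
    """(1/t)[F(t*eta) - F(0)] along the straight line through eta."""
    return F(t * eta[0], t * eta[1]) / t


def frechet_quotient(h):
    """|F(h) - F(0) - 0| / ||h|| for the candidate derivative D = 0."""
    return abs(F(h[0], h[1])) / math.hypot(h[0], h[1])


steps = (1e-1, 1e-2, 1e-3)
line_values = [gateaux_quotient([1.0, 1.0], t) for t in steps]    # tend to 0
curve_values = [frechet_quotient([s, s ** 2]) for s in steps]     # tend to 1/2
print(line_values, curve_values)
```

Along the straight line the quotient decays like $t$, while along the curve $y = x^2$ the Fréchet quotient stays near $1/2$, so the limit in (15) cannot be zero.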

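Problem (P3): Let $F: \mathbb{R}^2 \to \mathbb{R}^1$ be defined by

\[ F(x, y) = \begin{cases} \dfrac{2 y \, e^{-x^{-2}}}{y^2 + e^{-2 x^{-2}}}, & x \ne 0 \\ 0, & x = 0. \end{cases} \qquad (18) \]

Show that F has a Gateaux derivative $\delta F(0; \eta)$ at $p = 0 = [0, 0]^T$, but F is not continuous at $p = 0 = [0, 0]^T$.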

The Fréchet Derivative

Problem (P4): Let $F: \mathbb{R}^2 \to \mathbb{R}^1$ be defined by

\[ F(x, y) = \begin{cases} \dfrac{x}{y}, & y \ne 0 \\ 0, & y = 0. \end{cases} \qquad (19) \]

Show that F has partial derivatives at $p = 0 = [0, 0]^T$, but $\delta F(0; \eta)$ does not exist unless $\eta_1 \eta_2 = 0$. Also, F is not continuous at (0, 0) = 0.

Problem (P5): Let $F: \mathbb{R}^2 \to \mathbb{R}^1$ be defined by

\[ F(x, y) = \begin{cases} x^2 + y^2, & \text{if both } x \text{ and } y \text{ are rational} \\ 0, & \text{otherwise.} \end{cases} \qquad (20) \]

Show that F is continuous at only one point $p = 0 = [0, 0]^T$, and yet F is Fréchet differentiable at $p = 0 = [0, 0]^T$.

The Fréchet Derivative

Let $D(F) \subseteq \mathbb{R}^n$ be a subset of $\mathbb{R}^n$ and let $F: D(F) \to \mathbb{R}^1$ be a real-valued function of n variables. We say that F is differentiable on the set $\Omega \subseteq D(F)$ if F is Fréchet differentiable at each point $x \in \Omega$.

Mean Value Theorem For Functionals. Let $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^1$. Assume that F is Fréchet differentiable at all points in the open set $\Omega \subseteq D(F)$. If x and y are two elements in $\Omega$ such that the line segment $\{ z = \lambda x + (1 - \lambda) y : 0 \le \lambda \le 1 \}$ is contained in $\Omega$, then there is a c with 0 < c < 1 such that

\[ F(y) - F(x) = [D_x F(c x + (1 - c) y)](y - x) \qquad (21) \]

… OR by defining $\mathbf{c} = c x + (1 - c) y$, we have

\[ F(y) - F(x) = [D_x F(\mathbf{c})](y - x) \qquad (22) \]

IMPLIES

\[ | F(y) - F(x) | = | [D_x F(\mathbf{c})](y - x) | \le \|D_x F(\mathbf{c})\| \, \|y - x\| \qquad (23) \]

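The Mean Value Theorem

[Figure: two sets containing points x and y with $\mathbf{c} = c x + (1 - c) y$ on the segment between them. Left panel, ASSUMPTIONS SATISFIED: the whole segment lies in $\Omega$. Right panel, ASSUMPTIONS FAIL: part of the segment leaves $\Omega$.]

If $\Omega$ is convex, then the theorem applies to all x and y in $\Omega$.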

Let $F: D(F) \to \mathbb{R}^1$ be a real valued function of n variables. If $\Omega \subseteq D(F)$, then we say that F is Hölder continuous on the set $\Omega$ if there exist constants $M \ge 0$ and $\alpha \in (0, 1]$ such that for all $x, y \in \Omega$,

\[ | F(x) - F(y) | \le M \, \|x - y\|^{\alpha} . \]

If $\alpha = 1$, then we say that F is Lipschitz continuous.

Functions $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$

If $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is a function with domain $D(F) \subseteq \mathbb{R}^n$ and range $R(F)$ in $\mathbb{R}^m$, then F has the form

\[ F(x) = \begin{bmatrix} F_1(x) \\ F_2(x) \\ \vdots \\ F_m(x) \end{bmatrix} = \begin{bmatrix} F_1(x_1, x_2, \dots, x_n) \\ F_2(x_1, x_2, \dots, x_n) \\ \vdots \\ F_m(x_1, x_2, \dots, x_n) \end{bmatrix} = F(x_1, x_2, \dots, x_n), \]

where for $i = 1, 2, \dots, m$, $F_i$ is a real valued function with domain $D(F_i) \subseteq \mathbb{R}^n$ and range $R(F_i) \subseteq \mathbb{R}^1$.

Each function $F_i: D(F) \to \mathbb{R}^1$ fits into the previous framework. For example, the definitions and results involving partial derivatives, Gateaux variations, and Fréchet derivatives hold. The notation is as expected … e.g.

\[ D_j F_i(p) = \partial_{x_j} F_i(p) = \frac{\partial F_i(p)}{\partial x_j} = \frac{\partial F_i(x)}{\partial x_j} \Big|_{x = p} . \]

Functions $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$

Let $\eta \in \mathbb{R}^n$ be a "direction". If there is a $\delta = \delta(p, \eta) > 0$ such that for each t with $-\delta < t < \delta$, $[p + t\eta] \in D(F)$ and the limit (in $\mathbb{R}^m$)

\[ \frac{d}{dt} [F(p + t\eta)] \Big|_{t=0} = \lim_{t \to 0} \Big( \tfrac{1}{t} \Big) [F(p + t\eta) - F(p)] \]

exists, then this limit is called the directional derivative of F at p in the direction $\eta$.

There are various notations and terms used for this derivative:

\[ \delta F(p; \eta) = \frac{d}{dt} [F(p + t\eta)] \Big|_{t=0} = V(p, \eta) \in \mathbb{R}^m \]

• The first variation of F at p in the direction of $\eta$
• The directional derivative of F at p in the direction $\eta$ (misleading)
• The Gateaux variation at p in the direction $\eta$ ...

NOTE: These variations are vectors in the range space $\mathbb{R}^m$.

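Variations and Differentials

? WHEN DO THESE VARIATIONS EXIST ?

\[ \delta F(p; \eta) = \frac{d}{dt} [F(p + t\eta)] \Big|_{t=0} \]

REQUIRES: For sufficiently small t, $p + t\eta \in D(F)$ AND

\[ \lim_{t \to 0} \tfrac{1}{t} [F(p + t\eta) - F(p)] = \delta F(p; \eta) \in \mathbb{R}^m \quad \text{EXISTS.} \]

If for all $i = 1, 2, \dots, m$ and $j = 1, 2, \dots, n$ the partial derivatives

\[ \partial_{x_j} F_i(p) = \frac{\partial F_i(p)}{\partial x_j} = \frac{\partial F_i(x)}{\partial x_j} \Big|_{x = p} \]

exist, then we can define the gradient vectors

\[ \nabla F_i(p) = \begin{bmatrix} \partial_{x_1} F_i(p) \\ \partial_{x_2} F_i(p) \\ \vdots \\ \partial_{x_n} F_i(p) \end{bmatrix}, \qquad i = 1, 2, \dots, m . \]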

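Jacobian Matrix

If for all $i = 1, 2, \dots, m$ and $j = 1, 2, \dots, n$ the partial derivatives

\[ \partial_{x_j} F_i(p) = \frac{\partial F_i(p)}{\partial x_j} = \frac{\partial F_i(x)}{\partial x_j} \Big|_{x = p} \]

exist, then we can define the Jacobian matrix

\[ JF(p) = JF(x) \big|_{x = p} = \begin{bmatrix} \partial_{x_1} F_1(p) & \partial_{x_2} F_1(p) & \cdots & \partial_{x_n} F_1(p) \\ \partial_{x_1} F_2(p) & \partial_{x_2} F_2(p) & \cdots & \partial_{x_n} F_2(p) \\ \vdots & \vdots & & \vdots \\ \partial_{x_1} F_m(p) & \partial_{x_2} F_m(p) & \cdots & \partial_{x_n} F_m(p) \end{bmatrix} \in \mathbb{R}^{m \times n} . \]

Also, since

\[ [\nabla F_i(p)]^T = \begin{bmatrix} \partial_{x_1} F_i(p) & \partial_{x_2} F_i(p) & \cdots & \partial_{x_n} F_i(p) \end{bmatrix}, \]

the rows of the Jacobian matrix are the transposed gradients:

\[ JF(p) = \begin{bmatrix} [\nabla F_1(p)]^T \\ [\nabla F_2(p)]^T \\ \vdots \\ [\nabla F_m(p)]^T \end{bmatrix} . \]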

NOTE: $JF(p) \in \mathbb{R}^{m \times n}$ is a matrix. It is not a derivative. So, again we have to define what we mean by a "derivative". The definitions are the same as for the real valued functions …
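When the partial derivatives exist, the Jacobian matrix can be approximated entry by entry with difference quotients, one column per coordinate direction. The following is a minimal sketch in plain Python for a hypothetical map $F: \mathbb{R}^2 \to \mathbb{R}^3$ (the function and the helper `jacobian` are illustrative, not from the lecture).

```python
import math


def jacobian(F, p, t=1e-6):
    """Forward-difference approximation of the m-by-n Jacobian of F at p.

    Entry (i, j) approximates the partial derivative of F_i with respect to x_j.
    """
    base = F(p)
    m, n = len(base), len(p)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        shifted = list(p)
        shifted[j] += t
        column = F(shifted)
        for i in range(m):
            J[i][j] = (column[i] - base[i]) / t
    return J


def F(x):
    # Hypothetical example: F(x1, x2) = (x1*x2, x1 + x2^2, sin(x1))
    return [x[0] * x[1], x[0] + x[1] ** 2, math.sin(x[0])]


p = [1.0, 2.0]
J = jacobian(F, p)
exact = [[2.0, 1.0], [1.0, 4.0], [math.cos(1.0), 0.0]]
print(J)
```

Row i of the computed matrix approximates the transposed gradient of $F_i$ at p. As the note says, the matrix by itself is only an array of numbers; whether it represents a derivative depends on the differentiability of F.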

The Gateaux Derivative

If $\delta F(p; \eta)$ exists for all $\eta \in \mathbb{R}^n$ AND is linear in $\eta$, then we can define the Gateaux Derivative.

Assume $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ and $p \in \mathrm{int}[D(F)]$. If the first variation $\delta F(p; \eta)$ of F at p exists for all $\eta \in \mathbb{R}^n$ AND for $\eta_1, \eta_2 \in \mathbb{R}^n$,

\[ \delta F(p; \eta_1 + \eta_2) = \delta F(p; \eta_1) + \delta F(p; \eta_2), \]

(i.e. $\delta F(p; \eta)$ is linear in $\eta$), then we say that F is Gateaux-differentiable at p. If $L = L_p$ is the (unique) linear function $L_p: \mathbb{R}^n \to \mathbb{R}^m$ defined by $L_p(\eta) = \delta F(p; \eta)$, then $L_p$ is called the Gateaux Derivative of F at p.

The Gateaux Derivative is a linear operator $L_p: \mathbb{R}^n \to \mathbb{R}^m$.

Gateaux Derivative

To make the Gateaux derivative something useful, we need the following version of the Riesz Representation Theorem.

Riesz Representation Theorem. If $L: \mathbb{R}^n \to \mathbb{R}^m$ is a linear operator on $\mathbb{R}^n$, then

i) L is continuous,

and

ii) there exists a matrix $\mathbf{L} \in \mathbb{R}^{m \times n}$ (depending on L and a basis for $\mathbb{R}^n$ and $\mathbb{R}^m$) such that for all $\eta \in \mathbb{R}^n$

\[ L(\eta) = \mathbf{L} \eta . \]

If F is Gateaux differentiable at p we can apply the Riesz Representation Theorem to $L_p(\eta) = \delta F(p; \eta)$, so it follows that there is a matrix $\mathbf{L} = \mathbf{L}(p) \in \mathbb{R}^{m \times n}$ such that for all $\eta \in \mathbb{R}^n$,

\[ \delta F(p; \eta) = L_p(\eta) = \mathbf{L}(p) \, \eta . \]

Gateaux Derivative

IMPORTANT: The linear function $L_p(\eta) = \delta F(p; \eta)$ is the Gateaux derivative of F at p. The matrix $\mathbf{L}(p)$ is just a matrix representation of this linear function.

Note that the linear operator $\delta F(p; \cdot) = L_p: \mathbb{R}^n \to \mathbb{R}^m$ is the Gateaux derivative of F at $p \in \mathrm{int}[D(F)]$ if and only if for each $\eta \in \mathbb{R}^n$

\[ \lim_{t \to 0} \Big\| (1/t) [F(p + t\eta) - F(p)] - L_p(\eta) \Big\|_{\mathbb{R}^m} = 0 \]

OR EQUIVALENTLY

\[ \lim_{t \to 0} \Big\| (1/t) [F(p + t\eta) - F(p)] - \mathbf{L}(p) \, \eta \Big\|_{\mathbb{R}^m} = 0 . \]

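Gateaux Derivative

If $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is Gateaux differentiable at $p \in \mathrm{int}[D(F)]$, then the partial derivatives $\partial_{x_j} F_i(p)$ at p exist. Select the standard basis for $\mathbb{R}^n$ and $\mathbb{R}^m$ and let

\[ JF(p) = \begin{bmatrix} \partial_{x_1} F_1(p) & \partial_{x_2} F_1(p) & \cdots & \partial_{x_n} F_1(p) \\ \partial_{x_1} F_2(p) & \partial_{x_2} F_2(p) & \cdots & \partial_{x_n} F_2(p) \\ \vdots & \vdots & & \vdots \\ \partial_{x_1} F_m(p) & \partial_{x_2} F_m(p) & \cdots & \partial_{x_n} F_m(p) \end{bmatrix} = \begin{bmatrix} [\nabla F_1(p)]^T \\ [\nabla F_2(p)]^T \\ \vdots \\ [\nabla F_m(p)]^T \end{bmatrix} . \]

Theorem D3. If $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is Gateaux differentiable at $p \in \mathrm{int}[D(F)]$, then the Jacobian matrix of F at p exists, and $L_p(\eta) = \delta F(p; \eta)$ has the representation

\[ \delta F(p; \eta) = [JF(p)] \, \eta . \]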

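Gateaux Derivative

RECALL ... If $A: D(A) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is a linear function with domain $D(A) \subseteq \mathbb{R}^n$ and range $R(A)$ in $\mathbb{R}^m$, then the operator norm of A is defined to be

\[ \|A\| = \sup_{x \ne 0} \frac{\|A(x)\|_{\mathbb{R}^m}}{\|x\|_{\mathbb{R}^n}} . \]

If $\delta F(p; \cdot) = L_p: \mathbb{R}^n \to \mathbb{R}^m$ is the Gateaux derivative of F at p, then it is a linear operator and the (operator) norm of $L_p$ is given by

\[ \|L_p\| = \sup_{\eta \ne 0} \frac{\|L_p(\eta)\|_{\mathbb{R}^m}}{\|\eta\|_{\mathbb{R}^n}} = \sup_{\eta \ne 0} \frac{\|[JF(p)] \, \eta\|_{\mathbb{R}^m}}{\|\eta\|_{\mathbb{R}^n}} = \|JF(p)\| . \]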

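The Fréchet Derivative

Let $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ be a vector-valued function and assume $p \in \mathrm{int}[D(F)]$. We say that F is (Fréchet) differentiable at p if there exists a linear operator $D: \mathbb{R}^n \to \mathbb{R}^m$ such that for each $\varepsilon > 0$, there is a $\delta = \delta(p, \varepsilon) > 0$ such that if

\[ 0 < \|x - p\|_{\mathbb{R}^n} < \delta, \]

then

i) $x \in D(F)$

and

ii) $\| [F(x) - F(p)] - D(x - p) \|_{\mathbb{R}^m} \le \varepsilon \, \|x - p\|_{\mathbb{R}^n}$.

Notation varies and we shall use ... $D = [DF(p)] = D_x F(p) = F'(p)$

The (Fréchet) Derivative is the LINEAR OPERATOR $D: \mathbb{R}^n \to \mathbb{R}^m$.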

The Fréchet Derivative

If F is Fréchet differentiable at p we can apply the Riesz Representation Theorem to $D(\eta) = [D_x F(p)](\eta)$ to obtain a matrix representation with respect to the standard basis …

\[ [D_x F(p)](\eta) = [JF(p)] \, \eta \qquad (D_x F(p) \text{ is the OPERATOR}, \ JF(p) \text{ is the MATRIX}) \]

Theorem D4. If $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is Fréchet differentiable at $p \in \mathrm{int}[D(F)]$, then the Jacobian matrix of F at p exists, F is continuous at p and $D_x F(p)$ has the representation

\[ [D_x F(p)](\eta) = [JF(p)] \, \eta . \]

Two Theorems

Theorem D5. If $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ and the Jacobian matrix of F exists and is continuous in an open ball

\[ B(p, \delta) = \{ x \in \mathbb{R}^n : \|x - p\|_{\mathbb{R}^n} < \delta \} \]

about $p \in \mathrm{int}[D(F)]$, then F is Fréchet differentiable at p.

Mean Value Theorem For Vector Valued Functions. Let $F: D(F) \to \mathbb{R}^m$. Assume that F is Fréchet differentiable at all points in an open set $\Omega \subseteq D(F)$. If x and y are two elements in $\Omega$ such that the line segment $\{ z = \lambda x + (1 - \lambda) y : 0 \le \lambda \le 1 \}$ is contained in $\Omega$, then there is a c with 0 < c < 1 such that for $\mathbf{c} = c x + (1 - c) y$,

\[ \|F(y) - F(x)\|_{\mathbb{R}^m} \le \|[D_x F(\mathbf{c})](y - x)\|_{\mathbb{R}^m} \le \|D_x F(\mathbf{c})\| \, \|y - x\|_{\mathbb{R}^n} . \]

CAN NOT SAY … $[F(y) - F(x)] = [D_x F(\mathbf{c})](y - x)$
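A standard illustration of why the equality form of the mean value theorem fails for vector valued functions is the circle map $F(t) = (\cos t, \sin t)$ from $\mathbb{R}$ to $\mathbb{R}^2$ on the segment from $x = 0$ to $y = 2\pi$ (an example chosen here for illustration; it does not appear in the lecture). The endpoints have the same image, so $F(y) - F(x) = 0$, while the derivative applied to $y - x$ has norm $2\pi$ at every intermediate point. The sketch checks this on a grid of intermediate points.

```python
import math


def F(t):
    return [math.cos(t), math.sin(t)]


def dF(t):
    """Derivative of F at t, a vector in R^2 (the 2-by-1 Jacobian)."""
    return [-math.sin(t), math.cos(t)]


x, y = 0.0, 2.0 * math.pi
difference = [a - b for a, b in zip(F(y), F(x))]
lhs = math.hypot(*difference)                      # ||F(y) - F(x)||, about 0

# ||DF(c_point)(y - x)|| at points c_point = c*x + (1 - c)*y with 0 < c < 1.
grid = [k / 1000.0 for k in range(1, 1000)]
rhs_values = [math.hypot(*[(y - x) * d for d in dF(c * x + (1.0 - c) * y)])
              for c in grid]
smallest_rhs = min(rhs_values)                     # about 2*pi for every c
print(lhs, smallest_rhs)
```

The inequality holds at every grid point, but no point makes the two sides equal, which is consistent with the "CAN NOT SAY" remark.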

Analysis of $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$

In order to use these concepts we need to know when the usual "calculus" results hold. The important results for this topic are:

• Taylor's Theorem
• Chain Rule
• Inverse Function Theorem
• Implicit Function Theorem
• Necessary conditions for optimization
• Higher order derivatives
• Convexity

We will cover these topics when needed and indicate where proofs can be found in the references.

We present two important results

Chain Rule

Chain Rule for Fréchet derivative. Let $F: D(F) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ and $G: D(G) \subseteq \mathbb{R}^m \to \mathbb{R}^k$ be functions. If F has a Fréchet derivative at p and G has a Fréchet derivative at y = F(p), then the composite function $G \circ F$ has a Fréchet derivative at p and

\[ [D_x (G \circ F)(p)](\eta) = [D_y G(F(p))] \big( [D_x F(p)](\eta) \big) \]

for all $\eta \in \mathbb{R}^n$. Or ….

\[ \frac{d}{dx} [(G \circ F)(p)] = \Big[ \frac{d}{dy} G(y) \Big] \Big|_{y = F(p)} \Big[ \frac{d}{dx} F(p) \Big] \]

In terms of Jacobian matrices,

\[ D_x [(G \circ F)(p)] = D_x [G(F(p))] = [JG(F(p))] \, [JF(p)] . \]

The chain rule DOES NOT HOLD for the Gateaux derivative.
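In matrix form the chain rule says that the Jacobian of the composition at p is the product of the Jacobian of G at F(p) with the Jacobian of F at p. The sketch below checks this with difference quotients for two hypothetical polynomial maps on $\mathbb{R}^2$; the maps and helper names are illustrative assumptions, and the Jacobians are approximated numerically rather than computed exactly.

```python
def jacobian(F, p, t=1e-6):
    """Forward-difference approximation of the Jacobian of F at p."""
    base = F(p)
    J = [[0.0] * len(p) for _ in range(len(base))]
    for j in range(len(p)):
        shifted = list(p)
        shifted[j] += t
        column = F(shifted)
        for i in range(len(base)):
            J[i][j] = (column[i] - base[i]) / t
    return J


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def F(x):
    # Hypothetical inner map R^2 -> R^2
    return [x[0] * x[1], x[0] + x[1]]


def G(y):
    # Hypothetical outer map R^2 -> R^2
    return [y[0] ** 2 + y[1], y[0] * y[1]]


def G_of_F(x):
    return G(F(x))


p = [1.0, 2.0]
J_composite = jacobian(G_of_F, p)                       # J(G o F)(p)
J_product = matmul(jacobian(G, F(p)), jacobian(F, p))   # JG(F(p)) * JF(p)
expected = [[9.0, 5.0], [8.0, 5.0]]                     # computed by hand
print(J_composite, J_product)
```

Both approximations agree with the hand-computed matrix up to the discretisation error of the difference quotients.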

Partial Fréchet Derivatives

Let $Z = \mathbb{R}^n \times \mathbb{R}^p$ be the product space with norm defined by

\[ \big\| [x, q]^T \big\| = \sqrt{ \|x\|_{\mathbb{R}^n}^2 + \|q\|_{\mathbb{R}^p}^2 } . \]

If $F: D(F) \subseteq \mathbb{R}^n \times \mathbb{R}^p \to \mathbb{R}^m$ is a function of the two variables x and q, then we can define the partial Gateaux and Fréchet derivatives. If $p = (x_0, q_0)^T \in \mathrm{int}[D(F)]$, then define the domain

\[ D(F_1) = \{ x \in \mathbb{R}^n : (x, q_0)^T \in \mathrm{int}[D(F)] \} \]

and the function $F_1: D(F_1) \subseteq \mathbb{R}^n \to \mathbb{R}^m$ by

\[ F_1(x) = F(x, q_0) \in \mathbb{R}^m . \]

If $F_1(x) = F(x, q_0)$ has a Gateaux (Fréchet) derivative at $x = x_0$, then we say that $F(x, q)$ has a first partial Gateaux (Fréchet) derivative at $z_0 = p$ and we denote this derivative by $D_x F(x_0, q_0)$ or $D_1 F(x_0, q_0)$ or $\partial_x F(x_0, q_0)$. Likewise, we can define $\partial_q F(x_0, q_0)$.

The partial Fréchet derivatives are LINEAR OPERATORS.

Partial Fréchet Derivatives

Note that at $p = (x_0, q_0)^T \in \mathrm{int}[D(F)]$, the partial Fréchet derivative $\partial_x F(x_0, q_0)$ is a continuous linear operator $[\partial_x F(x_0, q_0)]: \mathbb{R}^n \to \mathbb{R}^m$ and $\partial_q F(x_0, q_0)$ is a continuous linear operator $[\partial_q F(x_0, q_0)]: \mathbb{R}^p \to \mathbb{R}^m$.

Assume $F: D(F) \subseteq \mathbb{R}^n \times \mathbb{R}^p \to \mathbb{R}^m$ is a Fréchet differentiable function of the form

\[ F(z) = F(x, q) = \begin{bmatrix} F_1(x, q) \\ F_2(x, q) \\ \vdots \\ F_m(x, q) \end{bmatrix} = \begin{bmatrix} F_1(x_1, x_2, \dots, x_n, q_1, q_2, \dots, q_p) \\ F_2(x_1, x_2, \dots, x_n, q_1, q_2, \dots, q_p) \\ \vdots \\ F_m(x_1, x_2, \dots, x_n, q_1, q_2, \dots, q_p) \end{bmatrix}, \qquad z = [x, q]^T . \]

If all the partial derivatives

\[ \frac{\partial F_i(x, q)}{\partial x_j} \quad \text{and} \quad \frac{\partial F_i(x, q)}{\partial q_k} \]

exist and are continuous on the open set $\Omega$, then we say that F is smooth on $\Omega$ and write $F \in C^1(\Omega)$.

Partial Fréchet Derivatives

If the partial Fréchet derivatives $\partial_x F(x_0, q_0)$ and $\partial_q F(x_0, q_0)$ exist at the point $z_0 = p = (x_0, q_0)^T \in \mathrm{int}[D(F)]$, then we define the total derivative of F at p to be the continuous linear operator $d_z F(x_0, q_0): Z = \mathbb{R}^n \times \mathbb{R}^p \to \mathbb{R}^m$ given by

\[ [d_z F(p)](\eta, \xi) = [\partial_x F(x_0, q_0)](\eta) + [\partial_q F(x_0, q_0)](\xi) . \]

Theorem D6. If $F: D(F) \subseteq \mathbb{R}^n \times \mathbb{R}^p \to \mathbb{R}^m$ is Fréchet differentiable at a point $p = (x_0, q_0)^T \in \mathrm{int}[D(F)]$, with derivative $[D_z F(p)]: \mathbb{R}^n \times \mathbb{R}^p \to \mathbb{R}^m$, then both partial Fréchet derivatives exist at p and

\[ [D_z F(p)](\eta, \xi) = [d_z F(p)](\eta, \xi) = [\partial_x F(x_0, q_0)](\eta) + [\partial_q F(x_0, q_0)](\xi) . \]

Theorem D7. If the partial Fréchet derivatives $\partial_x F(x, q)$ and $\partial_q F(x, q)$ exist and are continuous at each point in a neighborhood $\Omega \subseteq \mathbb{R}^n \times \mathbb{R}^p$ of $z_0 = p = (x_0, q_0)^T \in \mathrm{int}[D(F)]$, then F is Fréchet differentiable at p and

\[ [D_z F(p)](\eta, \xi) = [\partial_x F(x_0, q_0)](\eta) + [\partial_q F(x_0, q_0)](\xi) . \]

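Implicit Function Theorem

Assume $F: D(F) \subseteq \mathbb{R}^n \times \mathbb{R}^p \to \mathbb{R}^n$ is a smooth function on a neighborhood of $z_0 = [x_0, q_0]^T \in \mathbb{R}^n \times \mathbb{R}^p$. If

\[ F(x_0, q_0) = 0 \]

and the partial Fréchet derivative

\[ \partial_x F(x_0, q_0): \mathbb{R}^n \to \mathbb{R}^n \]

is one-to-one and onto $\mathbb{R}^n$, then there exists an open neighborhood Q of $q_0$ and a function $w: Q \subseteq \mathbb{R}^p \to \mathbb{R}^n$ such that

i) $w(q_0) = x_0$ and

ii) $F(w(q), q) = 0$, for all $q \in Q$.

Moreover, the Fréchet derivative $D_q w(\hat{q})$ exists, is continuous at each point $\hat{q} \in Q$, and

iii) $[\partial_x F(w(\hat{q}), \hat{q})] \, [D_q w(\hat{q})] + [\partial_q F(w(\hat{q}), \hat{q})] = 0$, for all $\hat{q} \in Q$.

[Figure: the solution set $\{ (x, q) : F(x, q) = 0 \}$ drawn as the curve $x = w(q)$, with horizontal axis $\mathbb{R}^p$ and vertical axis $\mathbb{R}^n$]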
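A scalar example makes the three conclusions concrete. Take the hypothetical function $F(x, q) = x^2 + q^2 - 1$ (the unit circle, chosen here only for illustration) with $(x_0, q_0) = (1, 0)$, where $\partial_x F = 2x_0 = 2$ is invertible. The sketch computes $w(q)$ by Newton's method in x (an assumption of this illustration, not a method prescribed by the theorem) and compares the derivative obtained from iii), $D_q w = -[\partial_x F]^{-1} \partial_q F$, with a difference quotient of $w$.

```python
def F(x, q):
    # Hypothetical example: the unit circle x^2 + q^2 - 1 = 0
    return x * x + q * q - 1.0


def dF_dx(x, q):
    return 2.0 * x


def dF_dq(x, q):
    return 2.0 * q


def w(q, x_start=1.0, steps=50):
    """Solve F(x, q) = 0 for x by Newton's method, started at x0 = 1."""
    x = x_start
    for _ in range(steps):
        x -= F(x, q) / dF_dx(x, q)
    return x


q_hat = 0.3
x_hat = w(q_hat)                                   # sqrt(1 - 0.09)

# Conclusion iii) solved for the derivative of w.
derivative_from_iii = -dF_dq(x_hat, q_hat) / dF_dx(x_hat, q_hat)

# Independent estimate from a central difference quotient of w.
h = 1e-6
derivative_by_difference = (w(q_hat + h) - w(q_hat - h)) / (2.0 * h)
print(x_hat, derivative_from_iii, derivative_by_difference)
```

The neighborhood Q matters: for $|q| \ge 1$ there is no real solution near $x_0 = 1$, and at $q = \pm 1$ the partial derivative $\partial_x F$ vanishes, so the hypotheses of the theorem fail there.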
