
    Chapter 3 Supplementary

Another condition for convergence

Let A = N − P (N is invertible). Consider the iterative scheme:

N x_{k+1} = P x_k + f,  or  x_{k+1} = N^{-1} P x_k + N^{-1} f.

Theorem (Householder-John). Suppose that A and N + N^* − A are self-adjoint positive definite. (Recall that N^* denotes the conjugate transpose of N.) Then the iterative scheme x_{k+1} = N^{-1} P x_k + N^{-1} f converges.

Proof. We have M = N^{-1} P = N^{-1}(N − A) = I − N^{-1}A. It suffices to show that |λ| < 1 for every eigenvalue λ of M. Let x be a corresponding eigenvector. (λ and x could be complex.) We have:

Mx = λx  ⟺  (I − N^{-1}A)x = λx  ⟺  (N − A)x = λNx  ⟺  (1 − λ)Nx = Ax.

Note that λ ≠ 1; otherwise Ax = 0, contradicting that A is PD. Multiplying both sides by x^*:

(1 − λ) x^*Nx = x^*Ax,  i.e.  x^*Nx = 1/(1 − λ) · x^*Ax.  (*)

Taking the conjugate transpose of both sides:

(1 − λ̄) x^*N^*x = x^*A^*x = x^*Ax,  i.e.  x^*N^*x = 1/(1 − λ̄) · x^*Ax.  (**)

Adding (*) and (**) and subtracting x^*Ax from both sides:

x^*(N + N^* − A)x = ( 1/(1 − λ) + 1/(1 − λ̄) − 1 ) x^*Ax = (1 − |λ|^2)/|1 − λ|^2 · x^*Ax.

Since A and N + N^* − A are PD, we have x^*Ax > 0 and x^*(N + N^* − A)x > 0. Hence 1 − |λ|^2 > 0 and |λ| < 1.

So, ρ(N^{-1}P) < 1 and the iterative scheme converges.
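As a minimal numerical sketch of the theorem (assuming, for illustration only, the Gauss-Seidel splitting N = D + L, which is not specified here), the following checks that the hypotheses hold and that ρ(N^{-1}P) < 1 for a small symmetric positive definite A:

```python
# Sketch: check the Householder-John criterion on a small SPD example.
# The splitting N = D + L (Gauss-Seidel) is an assumption made for illustration.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B @ B.T + 5 * np.eye(5)      # self-adjoint (symmetric) positive definite

N = np.tril(A)                   # N = D + L
P = N - A                        # so that A = N - P

H = N + N.T - A                  # N^* = N^T here, since N is real
assert np.all(np.linalg.eigvalsh(A) > 0)   # A is PD
assert np.all(np.linalg.eigvalsh(H) > 0)   # N + N^* - A is PD

M = np.linalg.solve(N, P)        # iteration matrix M = N^{-1} P
print("rho(M) =", max(abs(np.linalg.eigvals(M))))   # prints a value < 1
```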

Example:

Suppose A is self-adjoint positive definite. Using the Householder-John Theorem, prove that the SOR method converges if and only if 0 < ω < 2.
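A quick numerical sanity check of this statement (a sketch, assuming the usual SOR splitting N = D/ω + L for A = D + L + U) is to compute the spectral radius of the SOR iteration matrix for several values of ω:

```python
# Sketch: rho of the SOR iteration matrix M_w = (D + w L)^{-1} ((1-w) D - w U)
# for an SPD tridiagonal test matrix; expect rho < 1 exactly for 0 < w < 2.
import numpy as np

n = 6
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # SPD example, A = D + L + U
D = np.diag(np.diag(A))
L = np.tril(A, -1)
U = np.triu(A, 1)

def rho_sor(w):
    M = np.linalg.solve(D + w * L, (1 - w) * D - w * U)
    return max(abs(np.linalg.eigvals(M)))

for w in [0.5, 1.0, 1.5, 1.9, 2.0, 2.5]:
    print(f"w = {w:3.1f}:  rho = {rho_sor(w):.4f}")
```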


It can be proved that SSOR is associated to the following splitting:

A = N_SSOR − P_SSOR,

where:

N_SSOR(ω) = 1/(ω(2 − ω)) · (D + ωL) D^{-1} (D + ωU), and

P_SSOR(ω) = 1/(ω(2 − ω)) · [(1 − ω)D − ωL] D^{-1} [(1 − ω)D − ωU].

(Verify that: A = N_SSOR − P_SSOR and N_SSOR^{-1} P_SSOR = M_SSOR !)

Theorem. If A is self-adjoint and positive definite, then SSOR converges if and only if 0 < ω < 2.
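The splitting identity and the convergence claim can be checked numerically; here is a minimal sketch (the tridiagonal test matrix and the value ω = 1.5 are arbitrary illustrative choices):

```python
# Sketch: verify A = N_SSOR - P_SSOR and that rho(N_SSOR^{-1} P_SSOR) < 1
# for an SPD test matrix and 0 < w < 2.
import numpy as np

n = 6
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # SPD example, A = D + L + U
D = np.diag(np.diag(A))
L = np.tril(A, -1)
U = np.triu(A, 1)
Dinv = np.linalg.inv(D)

w = 1.5
c = 1.0 / (w * (2 - w))
N = c * (D + w * L) @ Dinv @ (D + w * U)
P = c * ((1 - w) * D - w * L) @ Dinv @ ((1 - w) * D - w * U)

print("splitting error:", np.max(np.abs(A - (N - P))))        # ~ 1e-15
M = np.linalg.solve(N, P)                                      # M_SSOR
print("rho(M_SSOR)    :", max(abs(np.linalg.eigvals(M))))      # < 1
```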


So, we have:

∇f(x_{k+1}) · d_k = 0  ⟺  (A x_{k+1} − b) · d_k = 0  ⟺  (A(x_k + α_k d_k) − b) · d_k = 0  ⟺  (A x_k − b) · d_k + α_k d_k · A d_k = 0.

The optimal α_k is:

α_k = −((A x_k − b) · d_k)/(d_k · A d_k).
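In code, the descent iteration with this optimal step looks roughly as follows (a sketch, using the choice d_k = −(A x_k − b), i.e. the negative gradient):

```python
# Sketch: steepest descent for f(x) = 1/2 x.Ax - b.x with the optimal step size.
import numpy as np

def gradient_descent(A, b, x0, tol=1e-10, max_iter=1000):
    x = x0.copy()
    for k in range(max_iter):
        r = b - A @ x                       # r_k = -(A x_k - b) = -grad f(x_k)
        if np.linalg.norm(r) < tol:
            break
        d = r                               # steepest-descent direction
        alpha = (r @ d) / (d @ (A @ d))     # alpha_k = -((A x_k - b).d_k)/(d_k.A d_k)
        x = x + alpha * d
    return x, k

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x, its = gradient_descent(A, b, np.zeros(2))
print(x, np.linalg.solve(A, b), its)        # x approaches A^{-1} b
```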

Convergence of the gradient method

We consider the gradient method with a constant time step τ:

(*) x_{k+1} = x_k + τ d_k, where τ is chosen to be small enough, and

(**) d_k = −∇f(x_k) = −(A x_k − b).

Let x̄ be the solution. Then: x̄ = x̄ − τ(A x̄ − b). (***)

(*) − (***): e_{k+1} = (I − τA) e_k, where e_k = x_k − x̄ is the error vector. Similar to what we have discussed before, e_{k+1} = (I − τA)^{k+1} e_0, so we need ρ(I − τA) < 1 !

Let λ_1, λ_2, ..., λ_M be the eigenvalues of A. Then 1 − τλ_1, 1 − τλ_2, ..., 1 − τλ_M are the eigenvalues of I − τA, and

ρ(I − τA) < 1  ⟺  |1 − τλ_j| < 1 for all j  ⟺  −1 < 1 − τλ_j < 1 for all j.

The right inequality, 1 − τλ_j < 1, is always true since λ_j > 0 (and τ > 0); the left inequality gives τ < 2/λ_j. Hence the method converges if and only if 0 < τ < 2/λ_max(A).
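A small experiment illustrates the threshold (a sketch; the test matrix and the two step sizes just below and above 2/λ_max are arbitrary illustrative choices):

```python
# Sketch: the constant-step iteration x_{k+1} = x_k - tau (A x_k - b)
# converges iff 0 < tau < 2 / lambda_max(A).
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_bar = np.linalg.solve(A, b)
lam_max = max(np.linalg.eigvalsh(A))

for tau in [0.9 * 2 / lam_max, 1.1 * 2 / lam_max]:   # just below / above the threshold
    x = np.zeros(2)
    for k in range(200):
        x = x - tau * (A @ x - b)
    print(f"tau = {tau:.3f}:  error = {np.linalg.norm(x - x_bar):.2e}")
# First run: error ~ 0 (converges).  Second run: error blows up (diverges).
```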


The conjugate gradient method, with r_k = b − A x_k and ⟨u, v⟩ := u · Av (a sketch implementation follows the listing):

(a) x_{k+1} = x_k + α_k d_k

(b) α_k = (r_k · d_k)/⟨d_k, d_k⟩

(c) d_{k+1} = r_{k+1} + β_k d_k

(d) β_k = −⟨r_{k+1}, d_k⟩/⟨d_k, d_k⟩
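Here is the promised sketch of (a)-(d) in code; the initialization d_0 = r_0 is the standard choice and is assumed here:

```python
# Sketch: conjugate gradient following steps (a)-(d), with <u, v> = u . A v.
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-12):
    x = x0.copy()
    r = b - A @ x                      # r_0 = b - A x_0
    d = r.copy()                       # d_0 = r_0 (standard initialization)
    for k in range(len(b)):
        Ad = A @ d
        alpha = (r @ d) / (d @ Ad)     # (b)  alpha_k = (r_k . d_k) / <d_k, d_k>
        x = x + alpha * d              # (a)  x_{k+1} = x_k + alpha_k d_k
        r = r - alpha * Ad             # r_{k+1} = r_k - alpha_k A d_k
        if np.linalg.norm(r) < tol:
            break
        beta = -(r @ Ad) / (d @ Ad)    # (d)  beta_k = -<r_{k+1}, d_k> / <d_k, d_k>
        d = r + beta * d               # (c)  d_{k+1} = r_{k+1} + beta_k d_k
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b, np.zeros(2)), np.linalg.solve(A, b))
```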

Analysis of each step:

Recall that for the gradient descent method, the optimal time step α_k is:

α_k = −((A x_k − b) · d_k)/(d_k · A d_k) = (r_k · d_k)/(d_k · A d_k).

So (b) means that α_k is optimal. Now, (c) means ⟨d_{k+1}, d_k⟩ = ⟨r_{k+1} + β_k d_k, d_k⟩.

Requiring ⟨d_{k+1}, d_k⟩ = 0 implies: β_k = −⟨r_{k+1}, d_k⟩/⟨d_k, d_k⟩, which is (d).

But: is ⟨d_i, d_j⟩ = 0 for all i ≠ j ?? Yes!

Lemma 1. For m = 0, 1, 2, ..., we have:

Span(d_0, d_1, ..., d_m) = Span(r_0, r_1, ..., r_m) = Span(r_0, A r_0, ..., A^m r_0).

Recall: Span(v_0, v_1, ..., v_m) = { v ∈ R^M : v = a_0 v_0 + ⋯ + a_m v_m, a_j ∈ R }.

Proof. We use mathematical induction. When m = 0, it is obviously true. Suppose now that the equality holds for m = k. Multiplying (a) by A, we get A x_{k+1} − b = (A x_k − b) + α_k A d_k, which gives:

r_{k+1} = r_k − α_k A d_k. (★)

By the induction hypothesis: d_k ∈ Span{r_0, A r_0, ..., A^k r_0},

and so: A d_k ∈ Span{r_0, A r_0, ..., A^{k+1} r_0}.

From (★), we see that

Span{r_0, ..., r_{k+1}} ⊆ Span{r_0, A r_0, ..., A^{k+1} r_0}.

Also, from the induction hypothesis, A^k r_0 ∈ Span(d_0, ..., d_k), so

A^{k+1} r_0 ∈ Span(A d_0, ..., A d_k).

From (★), A d_i ∈ Span(r_i, r_{i+1}), so A^{k+1} r_0 ∈ Span(r_0, r_1, ..., r_{k+1}), and we have:

Span(r_0, A r_0, ..., A^{k+1} r_0) ⊆ Span(r_0, r_1, ..., r_{k+1}). Thus:

Span(r_0, A r_0, ..., A^{k+1} r_0) = Span(r_0, r_1, ..., r_{k+1}).

Now, from (c): d_{k+1} = r_{k+1} + β_k d_k.

It is then clear that: Span(r_0, ..., r_{k+1}) = Span(d_0, d_1, ..., d_{k+1}).

By M.I., the lemma is true!
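Lemma 1 is easy to test numerically: collecting the r_k and d_k from a few CG steps, the three spans should have the same dimension. A sketch (the random SPD test matrix is an arbitrary illustrative choice):

```python
# Sketch: check Span(d_0..d_m) = Span(r_0..r_m) = Span(r_0, A r_0, ..., A^m r_0)
# by comparing matrix ranks of the stacked vectors.
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 3
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)              # SPD test matrix
b = rng.standard_normal(n)

x = np.zeros(n); r = b - A @ x; d = r.copy()
Rs, Ds = [r.copy()], [d.copy()]
for k in range(m):                       # run m CG steps, storing r_k and d_k
    Ad = A @ d
    alpha = (r @ d) / (d @ Ad)
    x, r = x + alpha * d, r - alpha * Ad
    beta = -(r @ Ad) / (d @ Ad)
    d = r + beta * d
    Rs.append(r.copy()); Ds.append(d.copy())

K = np.column_stack([np.linalg.matrix_power(A, j) @ Rs[0] for j in range(m + 1)])
R, D = np.column_stack(Rs), np.column_stack(Ds)
rank = np.linalg.matrix_rank
# equal spans <=> stacking two of the bases does not increase the rank
print(rank(R), rank(np.column_stack([R, D])), rank(np.column_stack([R, K])))  # all m+1
```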

Lemma 2. The search directions d_i are pairwise conjugate. That is:

⟨d_i, d_j⟩ = 0 for i ≠ j. (†)

Also, the r_i are pairwise orthogonal. That is:

r_i · r_j = 0 for i ≠ j. (‡)



Proof. Suppose the statement is true for i, j ≤ k. By Lemma 1, Span(d_0, ..., d_j) = Span(r_0, ..., r_j). From the induction hypothesis,

r_k · d_j = 0 for j = 0, 1, 2, ..., k − 1   (d_j ∈ Span(r_0, r_1, ..., r_j)).

Since r_{k+1} = r_k − α_k A d_k, we have

r_{k+1} · d_j = r_k · d_j − α_k ⟨d_k, d_j⟩ = 0 for j = 0, 1, 2, ..., k − 1   (induction hypothesis).

Also, α_k is optimal, and so

r_{k+1} · d_k = −∇f(x_k + α_k d_k) · d_k = −(d/dα) f(x_k + α d_k) |_{α = α_k} = 0.

Hence, r_{k+1} ⊥ d_j for j = 0, 1, 2, ..., k. By Lemma 1, r_{k+1} · r_j = 0 for j = 0, 1, 2, ..., k   (r_j ∈ Span(d_0, ..., d_k)), which proves (‡) for i, j ≤ k + 1.

Now, A d_j ∈ Span(r_0, r_1, ..., r_{j+1}), since r_{j+1} = r_j − α_j A d_j. Therefore

⟨r_{k+1}, d_j⟩ = r_{k+1} · A d_j = 0 for j = 0, 1, 2, ..., k − 1   (A d_j ∈ Span(r_0, r_1, ..., r_{j+1})).

Now, d_{k+1} = r_{k+1} + β_k d_k (from (c)). Also, from the induction hypothesis, ⟨d_k, d_j⟩ = 0 for j = 0, 1, 2, ..., k − 1. Hence

⟨d_{k+1}, d_j⟩ = ⟨r_{k+1}, d_j⟩ + β_k ⟨d_k, d_j⟩ = 0 for j = 0, 1, 2, ..., k − 1.

Also, by construction, ⟨d_{k+1}, d_k⟩ = 0. Therefore

⟨d_i, d_j⟩ = 0 for all i, j ≤ k + 1 with i ≠ j,

which proves (†) for i, j ≤ k + 1. By M.I. the Lemma is true!
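These two properties can be observed numerically: stacking the d_k and r_k from a CG run as columns of matrices D and R, the products D^T A D and R^T R should be (numerically) diagonal. A sketch, again on an arbitrary random SPD test matrix:

```python
# Sketch: run CG on a small SPD system and check pairwise A-conjugacy of the d_k
# and pairwise orthogonality of the r_k.
import numpy as np

rng = np.random.default_rng(0)
n = 6
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)              # SPD test matrix
b = rng.standard_normal(n)

x = np.zeros(n); r = b - A @ x; d = r.copy()
Rs, Ds = [r.copy()], [d.copy()]
for k in range(n - 1):
    Ad = A @ d
    alpha = (r @ d) / (d @ Ad)
    x, r = x + alpha * d, r - alpha * Ad
    beta = -(r @ Ad) / (d @ Ad)
    d = r + beta * d
    Rs.append(r.copy()); Ds.append(d.copy())

R, D = np.column_stack(Rs), np.column_stack(Ds)
off = lambda M: np.max(np.abs(M - np.diag(np.diag(M))))
print("max off-diagonal of D^T A D:", off(D.T @ A @ D))   # round-off level
print("max off-diagonal of R^T R  :", off(R.T @ R))       # round-off level
```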

Theorem. For some m ≤ M, we have A x_m = b.

Proof. Note that r_i · r_j = 0 for i ≠ j. Since in R^M there are at most M pairwise orthogonal non-zero vectors, it follows that r_m = b − A x_m = 0 for some m ≤ M.

Remark:

- From the theorem, we see that the conjugate gradient method must converge in at most M iterations.

- The gradient descent method might not converge to the exact solution in finitely many iterations.

- The conjugate gradient method converges to the exact solution in at most M iterations.

- In fact, the convergence rate of both the steepest descent method and the conjugate gradient method depends on κ(A).

How about if κ(A) is large?? Preconditioning.

Pre-conditioning

Recall that the gradient method is equivalent to minimizing:

min_{x ∈ R^M} f(x) = min_{x ∈ R^M} [ (1/2) x · Ax − b · x ].

Let E be a non-singular M × M matrix. Define: y = Ex. Then x = E^{-1}y. Define:

f̃(y) = f(x) = f(E^{-1}y) = (1/2) (E^{-1}y) · A(E^{-1}y) − b · (E^{-1}y)
             = (1/2) y · (E^{-T} A E^{-1}) y − (E^{-T} b) · y
             = (1/2) y · Ãy − b̃ · y,

where à = E^{-T} A E^{-1} and b̃ = E^{-T} b. (Here E^{-T} = (E^{-1})^T.) If κ(Ã) is small, the method applied to the transformed problem Ãy = b̃ converges quickly.
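As a sketch of how such a change of variables can reduce the condition number, assume the simple diagonal (Jacobi-type) choice E = D^{1/2} with D = diag(A); this particular E is an illustrative assumption, not the only possible choice:

```python
# Sketch: compare kappa(A) with kappa(A~) for A~ = E^{-T} A E^{-1}, E = D^{1/2}.
import numpy as np

rng = np.random.default_rng(0)
n = 8
S = rng.standard_normal((n, n))
A = S @ S.T + np.eye(n)                    # SPD base matrix
scale = np.diag(10.0 ** np.arange(n))
A = scale @ A @ scale                      # badly scaled rows/columns: kappa(A) is huge

E = np.diag(np.sqrt(np.diag(A)))           # E = D^{1/2}
Einv = np.linalg.inv(E)
A_tilde = Einv.T @ A @ Einv                # A~ = E^{-T} A E^{-1}

print("kappa(A)  =", np.linalg.cond(A))
print("kappa(A~) =", np.linalg.cond(A_tilde))   # much smaller for this example
```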