
    Chapter 3 Supplementary

Another condition for convergence

Let A = N − P (N is invertible). Consider the iterative scheme:

N x_{k+1} = P x_k + f,  or  x_{k+1} = N^{-1} P x_k + N^{-1} f.

Theorem (Householder-John). Suppose that A and N + N^* − A are self-adjoint positive definite. (Recall that N^* denotes the conjugate transpose of N.) Then the iterative scheme x_{k+1} = N^{-1} P x_k + N^{-1} f converges.

Proof. We have M = N^{-1} P = N^{-1}(N − A) = I − N^{-1}A. It suffices to show that |λ| < 1 for every eigenvalue λ of M. Let x be a corresponding eigenvector. (λ and x could be complex.) We have:

Mx = λx  ⟺  (I − N^{-1}A)x = λx  ⟺  (N − A)x = λNx  ⟺  (1 − λ)Nx = Ax.

Note that λ ≠ 1; otherwise Ax = 0, contradicting that A is PD. Multiplying both sides by x^*:

(1 − λ) x^*Nx = x^*Ax,  i.e.  x^*Nx = 1/(1 − λ) · x^*Ax.  (*)

Taking the conjugate transpose of both sides:

(1 − λ̄) x^*N^*x = x^*A^*x = x^*Ax,  i.e.  x^*N^*x = 1/(1 − λ̄) · x^*Ax.  (**)

Adding (*) and (**) and subtracting x^*Ax from both sides:

x^*(N + N^* − A)x = ( 1/(1 − λ) + 1/(1 − λ̄) − 1 ) x^*Ax = (1 − |λ|^2)/|1 − λ|^2 · x^*Ax.

Since A and N + N^* − A are PD, we have x^*Ax > 0 and x^*(N + N^* − A)x > 0. Hence 1 − |λ|^2 > 0 and |λ| < 1.

So, ρ(N^{-1}P) < 1 and the iterative scheme converges.
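As a minimal numerical sketch of the theorem (assuming, for illustration only, the Gauss-Seidel splitting N = D + L, which is not specified here), the following checks that the hypotheses hold and that ρ(N^{-1}P) < 1 for a small symmetric positive definite A:

```python
# Sketch: check the Householder-John criterion on a small SPD example.
# The splitting N = D + L (Gauss-Seidel) is an assumption made for illustration.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B @ B.T + 5 * np.eye(5)      # self-adjoint (symmetric) positive definite

N = np.tril(A)                   # N = D + L
P = N - A                        # so that A = N - P

H = N + N.T - A                  # N^* = N^T here, since N is real
assert np.all(np.linalg.eigvalsh(A) > 0)   # A is PD
assert np.all(np.linalg.eigvalsh(H) > 0)   # N + N^* - A is PD

M = np.linalg.solve(N, P)        # iteration matrix M = N^{-1} P
print("rho(M) =", max(abs(np.linalg.eigvals(M))))   # prints a value < 1
```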

Example:

Suppose A is self-adjoint positive definite. Using the Householder-John Theorem, prove that the SOR method converges if and only if 0 < ω < 2.
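A quick numerical sanity check of this statement (a sketch, assuming the usual SOR splitting N = D/ω + L for A = D + L + U) is to compute the spectral radius of the SOR iteration matrix for several values of ω:

```python
# Sketch: rho of the SOR iteration matrix M_w = (D + w L)^{-1} ((1-w) D - w U)
# for an SPD tridiagonal test matrix; expect rho < 1 exactly for 0 < w < 2.
import numpy as np

n = 6
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # SPD example, A = D + L + U
D = np.diag(np.diag(A))
L = np.tril(A, -1)
U = np.triu(A, 1)

def rho_sor(w):
    M = np.linalg.solve(D + w * L, (1 - w) * D - w * U)
    return max(abs(np.linalg.eigvals(M)))

for w in [0.5, 1.0, 1.5, 1.9, 2.0, 2.5]:
    print(f"w = {w:3.1f}:  rho = {rho_sor(w):.4f}")
```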


It can be proved that SSOR is associated to the following splitting:

A = N_SSOR − P_SSOR,

where:

N_SSOR(ω) = 1/(ω(2 − ω)) · (D + ωL) D^{-1} (D + ωU), and

P_SSOR(ω) = 1/(ω(2 − ω)) · [(1 − ω)D − ωL] D^{-1} [(1 − ω)D − ωU].

(Verify that: A = N_SSOR − P_SSOR and N_SSOR^{-1} P_SSOR = M_SSOR !)

Theorem. If A is self-adjoint and positive definite, then SSOR converges if and only if 0 < ω < 2.
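The splitting identity and the convergence claim can be checked numerically; here is a minimal sketch (the tridiagonal test matrix and the value ω = 1.5 are arbitrary illustrative choices):

```python
# Sketch: verify A = N_SSOR - P_SSOR and that rho(N_SSOR^{-1} P_SSOR) < 1
# for an SPD test matrix and 0 < w < 2.
import numpy as np

n = 6
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # SPD example, A = D + L + U
D = np.diag(np.diag(A))
L = np.tril(A, -1)
U = np.triu(A, 1)
Dinv = np.linalg.inv(D)

w = 1.5
c = 1.0 / (w * (2 - w))
N = c * (D + w * L) @ Dinv @ (D + w * U)
P = c * ((1 - w) * D - w * L) @ Dinv @ ((1 - w) * D - w * U)

print("splitting error:", np.max(np.abs(A - (N - P))))        # ~ 1e-15
M = np.linalg.solve(N, P)                                      # M_SSOR
print("rho(M_SSOR)    :", max(abs(np.linalg.eigvals(M))))      # < 1
```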


So, we have:

∇f(x_{k+1}) · d_k = 0  ⟺  (A x_{k+1} − b) · d_k = 0  ⟺  (A(x_k + α_k d_k) − b) · d_k = 0  ⟺  (A x_k − b) · d_k + α_k d_k · A d_k = 0.

The optimal α_k is:

α_k = −((A x_k − b) · d_k)/(d_k · A d_k).
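In code, the descent iteration with this optimal step looks roughly as follows (a sketch, using the choice d_k = −(A x_k − b), i.e. the negative gradient):

```python
# Sketch: steepest descent for f(x) = 1/2 x.Ax - b.x with the optimal step size.
import numpy as np

def gradient_descent(A, b, x0, tol=1e-10, max_iter=1000):
    x = x0.copy()
    for k in range(max_iter):
        r = b - A @ x                       # r_k = -(A x_k - b) = -grad f(x_k)
        if np.linalg.norm(r) < tol:
            break
        d = r                               # steepest-descent direction
        alpha = (r @ d) / (d @ (A @ d))     # alpha_k = -((A x_k - b).d_k)/(d_k.A d_k)
        x = x + alpha * d
    return x, k

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x, its = gradient_descent(A, b, np.zeros(2))
print(x, np.linalg.solve(A, b), its)        # x approaches A^{-1} b
```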

Convergence of the gradient method

We consider the gradient method with a constant time step τ:

(*) x_{k+1} = x_k + τ d_k, where τ is chosen to be small enough, and

(**) d_k = −∇f(x_k) = −(A x_k − b).

Let x̄ be the solution. Then: x̄ = x̄ − τ(A x̄ − b). (***)

(*) − (***): e_{k+1} = (I − τA) e_k, where e_k = x_k − x̄ is the error vector. Similar to what we have discussed before, e_{k+1} = (I − τA)^{k+1} e_0, so we need ρ(I − τA) < 1 !

Let λ_1, λ_2, ..., λ_M be the eigenvalues of A. Then 1 − τλ_1, 1 − τλ_2, ..., 1 − τλ_M are the eigenvalues of I − τA, and

ρ(I − τA) < 1  ⟺  |1 − τλ_j| < 1 for all j  ⟺  −1 < 1 − τλ_j < 1 for all j.

The right inequality, 1 − τλ_j < 1, is always true since λ_j > 0 (and τ > 0); the left inequality gives τ < 2/λ_j. Hence the method converges if and only if 0 < τ < 2/λ_max(A).
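A small experiment illustrates the threshold (a sketch; the test matrix and the two step sizes just below and above 2/λ_max are arbitrary illustrative choices):

```python
# Sketch: the constant-step iteration x_{k+1} = x_k - tau (A x_k - b)
# converges iff 0 < tau < 2 / lambda_max(A).
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_bar = np.linalg.solve(A, b)
lam_max = max(np.linalg.eigvalsh(A))

for tau in [0.9 * 2 / lam_max, 1.1 * 2 / lam_max]:   # just below / above the threshold
    x = np.zeros(2)
    for k in range(200):
        x = x - tau * (A @ x - b)
    print(f"tau = {tau:.3f}:  error = {np.linalg.norm(x - x_bar):.2e}")
# First run: error ~ 0 (converges).  Second run: error blows up (diverges).
```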


The conjugate gradient method, with r_k = b − A x_k and ⟨u, v⟩ := u · Av (a sketch implementation follows the listing):

(a) x_{k+1} = x_k + α_k d_k

(b) α_k = (r_k · d_k)/⟨d_k, d_k⟩

(c) d_{k+1} = r_{k+1} + β_k d_k

(d) β_k = −⟨r_{k+1}, d_k⟩/⟨d_k, d_k⟩
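Here is the promised sketch of (a)-(d) in code; the initialization d_0 = r_0 is the standard choice and is assumed here:

```python
# Sketch: conjugate gradient following steps (a)-(d), with <u, v> = u . A v.
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-12):
    x = x0.copy()
    r = b - A @ x                      # r_0 = b - A x_0
    d = r.copy()                       # d_0 = r_0 (standard initialization)
    for k in range(len(b)):
        Ad = A @ d
        alpha = (r @ d) / (d @ Ad)     # (b)  alpha_k = (r_k . d_k) / <d_k, d_k>
        x = x + alpha * d              # (a)  x_{k+1} = x_k + alpha_k d_k
        r = r - alpha * Ad             # r_{k+1} = r_k - alpha_k A d_k
        if np.linalg.norm(r) < tol:
            break
        beta = -(r @ Ad) / (d @ Ad)    # (d)  beta_k = -<r_{k+1}, d_k> / <d_k, d_k>
        d = r + beta * d               # (c)  d_{k+1} = r_{k+1} + beta_k d_k
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b, np.zeros(2)), np.linalg.solve(A, b))
```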

Analysis of each step:

Recall that for the gradient descent method, the optimal time step α_k is:

α_k = −((A x_k − b) · d_k)/(d_k · A d_k) = (r_k · d_k)/(d_k · A d_k).

So (b) means that α_k is optimal. Now, (c) means ⟨d_{k+1}, d_k⟩ = ⟨r_{k+1} + β_k d_k, d_k⟩.

Requiring ⟨d_{k+1}, d_k⟩ = 0 implies: β_k = −⟨r_{k+1}, d_k⟩/⟨d_k, d_k⟩, which is (d).

But: is ⟨d_i, d_j⟩ = 0 for all i ≠ j ?? Yes!

Lemma 1. For m = 0, 1, 2, ..., we have:

Span(d_0, d_1, ..., d_m) = Span(r_0, r_1, ..., r_m) = Span(r_0, A r_0, ..., A^m r_0).

Recall: Span(v_0, v_1, ..., v_m) = { v ∈ R^M : v = a_0 v_0 + ⋯ + a_m v_m, a_j ∈ R }.

Proof. We use mathematical induction. When m = 0, it is obviously true. Suppose now that the equality holds for m = k. Multiplying (a) by A, we get A x_{k+1} − b = (A x_k − b) + α_k A d_k, which gives:

r_{k+1} = r_k − α_k A d_k. (★)

By the induction hypothesis: d_k ∈ Span{r_0, A r_0, ..., A^k r_0},

and so: A d_k ∈ Span{r_0, A r_0, ..., A^{k+1} r_0}.

From (★), we see that

Span{r_0, ..., r_{k+1}} ⊆ Span{r_0, A r_0, ..., A^{k+1} r_0}.

Also, from the induction hypothesis, A^k r_0 ∈ Span(d_0, ..., d_k), so

A^{k+1} r_0 ∈ Span(A d_0, ..., A d_k).

From (★), A d_i ∈ Span(r_i, r_{i+1}), so A^{k+1} r_0 ∈ Span(r_0, r_1, ..., r_{k+1}), and we have:

Span(r_0, A r_0, ..., A^{k+1} r_0) ⊆ Span(r_0, r_1, ..., r_{k+1}). Thus:

Span(r_0, A r_0, ..., A^{k+1} r_0) = Span(r_0, r_1, ..., r_{k+1}).

Now, from (c): d_{k+1} = r_{k+1} + β_k d_k.

It is then clear that: Span(r_0, ..., r_{k+1}) = Span(d_0, d_1, ..., d_{k+1}).

By M.I., the lemma is true!
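Lemma 1 is easy to test numerically: collecting the r_k and d_k from a few CG steps, the three spans should have the same dimension. A sketch (the random SPD test matrix is an arbitrary illustrative choice):

```python
# Sketch: check Span(d_0..d_m) = Span(r_0..r_m) = Span(r_0, A r_0, ..., A^m r_0)
# by comparing matrix ranks of the stacked vectors.
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 3
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)              # SPD test matrix
b = rng.standard_normal(n)

x = np.zeros(n); r = b - A @ x; d = r.copy()
Rs, Ds = [r.copy()], [d.copy()]
for k in range(m):                       # run m CG steps, storing r_k and d_k
    Ad = A @ d
    alpha = (r @ d) / (d @ Ad)
    x, r = x + alpha * d, r - alpha * Ad
    beta = -(r @ Ad) / (d @ Ad)
    d = r + beta * d
    Rs.append(r.copy()); Ds.append(d.copy())

K = np.column_stack([np.linalg.matrix_power(A, j) @ Rs[0] for j in range(m + 1)])
R, D = np.column_stack(Rs), np.column_stack(Ds)
rank = np.linalg.matrix_rank
# equal spans <=> stacking two of the bases does not increase the rank
print(rank(R), rank(np.column_stack([R, D])), rank(np.column_stack([R, K])))  # all m+1
```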

Lemma 2. The search directions d_i are pairwise conjugate. That is:

⟨d_i, d_j⟩ = 0 for i ≠ j. (†)

Also, the r_i are pairwise orthogonal. That is:

r_i · r_j = 0 for i ≠ j. (‡)



Proof. Suppose the statement is true for i, j ≤ k. By Lemma 1, Span(d_0, ..., d_j) = Span(r_0, ..., r_j). From the induction hypothesis,

r_k · d_j = 0 for j = 0, 1, 2, ..., k − 1   (d_j ∈ Span(r_0, r_1, ..., r_j)).

Since r_{k+1} = r_k − α_k A d_k, we have

r_{k+1} · d_j = r_k · d_j − α_k ⟨d_k, d_j⟩ = 0 for j = 0, 1, 2, ..., k − 1   (induction hypothesis).

Also, α_k is optimal, and so

r_{k+1} · d_k = −∇f(x_k + α_k d_k) · d_k = −(d/dα) f(x_k + α d_k) |_{α = α_k} = 0.

Hence, r_{k+1} ⊥ d_j for j = 0, 1, 2, ..., k. By Lemma 1, r_{k+1} · r_j = 0 for j = 0, 1, 2, ..., k   (r_j ∈ Span(d_0, ..., d_k)), which proves (‡) for i, j ≤ k + 1.

Now, A d_j ∈ Span(r_0, r_1, ..., r_{j+1}), since r_{j+1} = r_j − α_j A d_j. Therefore

⟨r_{k+1}, d_j⟩ = r_{k+1} · A d_j = 0 for j = 0, 1, 2, ..., k − 1   (A d_j ∈ Span(r_0, r_1, ..., r_{j+1})).

Now, d_{k+1} = r_{k+1} + β_k d_k (from (c)). Also, from the induction hypothesis, ⟨d_k, d_j⟩ = 0 for j = 0, 1, 2, ..., k − 1. Hence

⟨d_{k+1}, d_j⟩ = ⟨r_{k+1}, d_j⟩ + β_k ⟨d_k, d_j⟩ = 0 for j = 0, 1, 2, ..., k − 1.

Also, by construction, ⟨d_{k+1}, d_k⟩ = 0. Therefore

⟨d_i, d_j⟩ = 0 for all i, j ≤ k + 1 with i ≠ j,

which proves (†) for i, j ≤ k + 1. By M.I. the Lemma is true!
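These two properties can be observed numerically: stacking the d_k and r_k from a CG run as columns of matrices D and R, the products D^T A D and R^T R should be (numerically) diagonal. A sketch, again on an arbitrary random SPD test matrix:

```python
# Sketch: run CG on a small SPD system and check pairwise A-conjugacy of the d_k
# and pairwise orthogonality of the r_k.
import numpy as np

rng = np.random.default_rng(0)
n = 6
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)              # SPD test matrix
b = rng.standard_normal(n)

x = np.zeros(n); r = b - A @ x; d = r.copy()
Rs, Ds = [r.copy()], [d.copy()]
for k in range(n - 1):
    Ad = A @ d
    alpha = (r @ d) / (d @ Ad)
    x, r = x + alpha * d, r - alpha * Ad
    beta = -(r @ Ad) / (d @ Ad)
    d = r + beta * d
    Rs.append(r.copy()); Ds.append(d.copy())

R, D = np.column_stack(Rs), np.column_stack(Ds)
off = lambda M: np.max(np.abs(M - np.diag(np.diag(M))))
print("max off-diagonal of D^T A D:", off(D.T @ A @ D))   # round-off level
print("max off-diagonal of R^T R  :", off(R.T @ R))       # round-off level
```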

Theorem. For some m ≤ M, we have A x_m = b.

Proof. Note that r_i · r_j = 0 for i ≠ j. Since in R^M there are at most M pairwise orthogonal non-zero vectors, it follows that r_m = b − A x_m = 0 for some m ≤ M.

Remark:

- From the theorem, we see that the conjugate gradient method must converge in at most M iterations.

- The gradient descent method might not converge to the exact solution in finitely many iterations.

- The conjugate gradient method converges to the exact solution in at most M iterations.

- In fact, the convergence rate of both the steepest descent method and the conjugate gradient method depends on κ(A).

How about if κ(A) is large?? Preconditioning.

Pre-conditioning

Recall that the gradient method is equivalent to minimizing:

min_{x ∈ R^M} f(x) = min_{x ∈ R^M} [ (1/2) x · Ax − b · x ].

Let E be a non-singular M × M matrix. Define: y = Ex. Then x = E^{-1}y. Define:

f̃(y) = f(x) = f(E^{-1}y) = (1/2) (E^{-1}y) · A(E^{-1}y) − b · (E^{-1}y)
             = (1/2) y · (E^{-T} A E^{-1}) y − (E^{-T} b) · y
             = (1/2) y · Ãy − b̃ · y,

where à = E^{-T} A E^{-1} and b̃ = E^{-T} b. (Here E^{-T} = (E^{-1})^T.) If κ(Ã) is small, the method applied to the transformed problem Ãy = b̃ converges quickly.
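As a sketch of how such a change of variables can reduce the condition number, assume the simple diagonal (Jacobi-type) choice E = D^{1/2} with D = diag(A); this particular E is an illustrative assumption, not the only possible choice:

```python
# Sketch: compare kappa(A) with kappa(A~) for A~ = E^{-T} A E^{-1}, E = D^{1/2}.
import numpy as np

rng = np.random.default_rng(0)
n = 8
S = rng.standard_normal((n, n))
A = S @ S.T + np.eye(n)                    # SPD base matrix
scale = np.diag(10.0 ** np.arange(n))
A = scale @ A @ scale                      # badly scaled rows/columns: kappa(A) is huge

E = np.diag(np.sqrt(np.diag(A)))           # E = D^{1/2}
Einv = np.linalg.inv(E)
A_tilde = Einv.T @ A @ Einv                # A~ = E^{-T} A E^{-1}

print("kappa(A)  =", np.linalg.cond(A))
print("kappa(A~) =", np.linalg.cond(A_tilde))   # much smaller for this example
```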