
Chapter 3: Iterative Methods for Linear Systems

Why do we need to solve big linear systems?

E.g. 1 (Spectral method): Express the solution as $u(x) = \sum_{i=1}^n a_i \phi_i(x)$. To find the $a_i$'s, we need to solve a system of linear equations.

E.g. 2 (Finite difference method): Consider $u''(x) = f(x)$, where $0 < x < 1$ and $u(0) = a_0$, $u(1) = a_1$.

In calculus, we know
\[ g(x+h) \approx g(x) + h g'(x) + \frac{h^2}{2} g''(x), \]
\[ g(x-h) \approx g(x) - h g'(x) + \frac{h^2}{2} g''(x). \]
Adding these,
\[ g(x+h) + g(x-h) \approx 2g(x) + h^2 g''(x) \quad\Rightarrow\quad g''(x) \approx \frac{g(x+h) - 2g(x) + g(x-h)}{h^2}. \]

Now, partition $[0, 1]$ by $x_i = ih$, where $h = \frac{1}{n+1}$.

Then the differential equation can be approximated by
\[ \frac{u_{i+1} - 2u_i + u_{i-1}}{h^2} = f(x_i), \qquad i = 1, 2, \ldots, n. \]
Moving the known boundary values $u_0$ and $u_{n+1}$ to the right-hand side and multiplying through by $-1$, this is the system of linear equations
\[ A \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix} = \begin{pmatrix} -f(x_1) + u_0/h^2 \\ -f(x_2) \\ \vdots \\ -f(x_n) + u_{n+1}/h^2 \end{pmatrix}, \qquad \text{where } A = \frac{1}{h^2} \begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 2 \end{pmatrix}. \]
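As a concrete illustration (not part of the original notes), the following NumPy sketch assembles exactly this tridiagonal system and solves it with a direct solver for reference. The choice $f(x) = -\pi^2\sin(\pi x)$, whose exact solution is $u(x) = \sin(\pi x)$, and all variable names are assumptions made for the demonstration:

```python
import numpy as np

# Model problem: u''(x) = f(x) on (0,1), u(0) = a0, u(1) = a1.
# Illustrative choice: f(x) = -pi^2 sin(pi x), exact solution u(x) = sin(pi x).
n = 99                        # number of interior points
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)  # interior grid points x_i = i*h
f = -np.pi**2 * np.sin(np.pi * x)
a0, a1 = 0.0, 0.0             # boundary values u_0 and u_{n+1}

# A = (1/h^2) * tridiag(-1, 2, -1), as above
A = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

# Right-hand side: -f(x_i), plus boundary contributions in the first/last rows
b = -f.copy()
b[0] += a0 / h**2
b[-1] += a1 / h**2

u = np.linalg.solve(A, b)                      # direct solve for reference
print(np.max(np.abs(u - np.sin(np.pi * x))))   # discretization error, O(h^2)
```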

How do we solve a big linear system?

From linear algebra, we learnt Gaussian elimination:
\[ \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} \]
is reduced by elementary row operations to an upper triangular system
\[ \begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ & c_{22} & \cdots & c_{2n} \\ & & \ddots & \vdots \\ & & & c_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b'_1 \\ b'_2 \\ \vdots \\ b'_n \end{pmatrix}, \]
which is solved by backward substitution.

Computational cost: $O(n^3)$. [Check it if interested]

From linear algebra, we also learnt LU factorization. Decompose a matrix $A$ into $A = LU$ ($L$ lower triangular, $U$ upper triangular). Then solve the equation by:
\[ A\vec{x} = \vec{b} \iff L(U\vec{x}) = \vec{b}. \]
Let $\vec{y} = U\vec{x}$. Solve $L\vec{y} = \vec{b}$ first (forward substitution), then solve $U\vec{x} = \vec{y}$ (backward substitution).

If $A$ is symmetric positive definite ($\vec{x}^T A \vec{x} > 0$ for all $\vec{x} \neq \vec{0}$), then $A = LL^T$. The decomposition can be done by the Cholesky decomposition (see Numerical Analysis).

Computational cost: $O(n^3)$.

Goal: Develop iterative methods. Find a sequence $\vec{x}_0, \vec{x}_1, \vec{x}_2, \ldots$ such that $\vec{x}_k \to \vec{x} =$ the solution as $k \to \infty$ (stop when the error is small enough).

Splitting method for general linear systems

Consider the system $A\vec{x} = \vec{f}$, where $A \in M_{n\times n}(\mathbb{R})$ ($n$ is big).

We can split $A$ as follows:
\[ A = M + (A - M) = M - (M - A) = N - P. \]
Solving $A\vec{x} = \vec{f}$ is equivalent to solving:
\[ (N - P)\vec{x} = \vec{f} \iff N\vec{x} = P\vec{x} + \vec{f}. \]

We can develop an iterative scheme as follows:
\[ N\vec{x}_{n+1} = P\vec{x}_n + \vec{f} \]
to get a sequence $\{\vec{x}_n\}_{n=1}^{\infty}$.

It can be shown that if $\{\vec{x}_n\}$ converges, it converges to the solution of $A\vec{x} = \vec{f}$: taking limits on both sides gives $N\vec{x}^* = P\vec{x}^* + \vec{f}$, i.e. $A\vec{x}^* = \vec{f}$.

There are many different choices of the splitting!

Goal: $N$ should be simple to invert (such as a diagonal matrix).

Splitting choice 1: Jacobi Method

Split $A$ as $A = D + (A - D)$, where $D$ contains the diagonal entries of $A$ only. Then $A\vec{x} = \vec{f}$ becomes:
\[ D\vec{x}_{k+1} + (A - D)\vec{x}_k = \vec{f} \iff D\vec{x}_{k+1} = (D - A)\vec{x}_k + \vec{f} \iff \vec{x}_{k+1} = D^{-1}(D - A)\vec{x}_k + D^{-1}\vec{f}. \]

This is equivalent to solving:
\[ \begin{cases} a_{11}x_1^{k+1} + a_{12}x_2^{k} + \cdots + a_{1n}x_n^{k} = f_1 & \text{for } x_1^{k+1} \\ a_{21}x_1^{k} + a_{22}x_2^{k+1} + \cdots + a_{2n}x_n^{k} = f_2 & \text{for } x_2^{k+1} \\ \qquad \vdots \\ a_{n1}x_1^{k} + a_{n2}x_2^{k} + \cdots + a_{nn}x_n^{k+1} = f_n & \text{for } x_n^{k+1} \end{cases} \]

Example: Consider
\[ \begin{pmatrix} 5 & -2 & 3 \\ -3 & 9 & 1 \\ 2 & -1 & -7 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -1 \\ 2 \\ 3 \end{pmatrix}. \]
Then:
\[ \vec{x}_{k+1} = \begin{pmatrix} 5 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & -7 \end{pmatrix}^{-1} \begin{pmatrix} 0 & 2 & -3 \\ 3 & 0 & -1 \\ -2 & 1 & 0 \end{pmatrix} \vec{x}_k + \begin{pmatrix} 5 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & -7 \end{pmatrix}^{-1} \begin{pmatrix} -1 \\ 2 \\ 3 \end{pmatrix}. \]
Start with $\vec{x}_0 = (0, 0, 0)^T$. The sequence converges in 7 iterations to $\vec{x}_7 = (0.186, 0.331, -0.423)^T$.
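A minimal NumPy sketch of this Jacobi iteration (the tolerance and stopping rule are my own illustrative choices, not from the notes):

```python
import numpy as np

A = np.array([[5., -2., 3.],
              [-3., 9., 1.],
              [2., -1., -7.]])
f = np.array([-1., 2., 3.])

D = np.diag(np.diag(A))        # diagonal part of A
M = np.linalg.solve(D, D - A)  # Jacobi iteration matrix D^{-1}(D - A)
c = np.linalg.solve(D, f)      # D^{-1} f

x = np.zeros(3)
for k in range(1, 100):
    x_new = M @ x + c
    if np.max(np.abs(x_new - x)) < 1e-3:  # stop when the update is small
        break
    x = x_new
print(k, x_new)  # approaches (0.186, 0.331, -0.423)
```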

Splitting choice 2: Gauss-Seidel Method

Split $A$ as $A = L + D + U$, where $L$ is the strictly lower triangular part, $D$ the diagonal part and $U$ the strictly upper triangular part of $A$. Develop an iterative scheme as:
\[ L\vec{x}_{k+1} + D\vec{x}_{k+1} + U\vec{x}_k = \vec{f}. \]

This is equivalent to:
\[ \begin{cases} a_{11}x_1^{k+1} + a_{12}x_2^{k} + \cdots + a_{1n}x_n^{k} = f_1 & \text{for } x_1^{k+1} \\ a_{21}x_1^{k+1} + a_{22}x_2^{k+1} + \cdots + a_{2n}x_n^{k} = f_2 & \text{for } x_2^{k+1} \\ \qquad \vdots \\ a_{n1}x_1^{k+1} + a_{n2}x_2^{k+1} + \cdots + a_{nn}x_n^{k+1} = f_n & \text{for } x_n^{k+1} \end{cases} \]

Gauss-Seidel is equivalent to
\[ \vec{x}_{k+1} = -(L+D)^{-1}U\vec{x}_k + (D+L)^{-1}\vec{f}. \]

Example: Continue with the last example:
\[ \vec{x}_{k+1} = -\begin{pmatrix} 5 & 0 & 0 \\ -3 & 9 & 0 \\ 2 & -1 & -7 \end{pmatrix}^{-1} \begin{pmatrix} 0 & -2 & 3 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \vec{x}_k + \begin{pmatrix} 5 & 0 & 0 \\ -3 & 9 & 0 \\ 2 & -1 & -7 \end{pmatrix}^{-1} \begin{pmatrix} -1 \\ 2 \\ 3 \end{pmatrix}. \]
Start with $\vec{x}_0 = (0, 0, 0)^T$. The sequence converges in 7 iterations to $\vec{x}_7 = (0.186, 0.331, -0.423)^T$.
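A corresponding Gauss-Seidel sketch, overwriting each component in place so that the new values $x_j^{k+1}$ ($j < i$) are used immediately (again an illustrative implementation, not code from the notes):

```python
import numpy as np

A = np.array([[5., -2., 3.],
              [-3., 9., 1.],
              [2., -1., -7.]])
f = np.array([-1., 2., 3.])

x = np.zeros(3)
for k in range(1, 100):
    x_old = x.copy()
    for i in range(3):
        # x_i^{k+1} uses x_j^{k+1} for j < i (already updated in x)
        # and x_j^k for j > i (still stored in x_old)
        s = A[i, :i] @ x[:i] + A[i, i+1:] @ x_old[i+1:]
        x[i] = (f[i] - s) / A[i, i]
    if np.max(np.abs(x - x_old)) < 1e-3:
        break
print(k, x)  # approaches (0.186, 0.331, -0.423)
```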

Do the Jacobi / Gauss-Seidel methods always converge?

Example: Consider
\[ \begin{pmatrix} 1 & -5 \\ 7 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} -4 \\ 6 \end{pmatrix}. \]

Then the Jacobi method gives:
\[ \vec{x}_{k+1} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}^{-1} \begin{pmatrix} 0 & 5 \\ -7 & 0 \end{pmatrix} \vec{x}_k + \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}^{-1} \begin{pmatrix} -4 \\ 6 \end{pmatrix}. \]
Start with $\vec{x}_0 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$. Then
\[ \vec{x}_1 = \begin{pmatrix} -4 \\ -6 \end{pmatrix}, \quad \vec{x}_2 = \begin{pmatrix} -34 \\ -34 \end{pmatrix}, \quad \vec{x}_3 = \begin{pmatrix} -174 \\ -244 \end{pmatrix}, \quad \ldots, \quad \vec{x}_7 = \begin{pmatrix} -214374 \\ -300124 \end{pmatrix}, \]
which doesn't converge.

The real solution should be $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$!

How about Gauss-Seidel? It also doesn't converge!

Our next goal is to check when the Jacobi method and the Gauss-Seidel method converge.

Answer: the matrix $A$ must satisfy a certain property: strict diagonal dominance (SDD).

Analysis of Convergence

Let $A = N - P$.

Goal: Solve $A\vec{x} = \vec{f} \iff (N - P)\vec{x} = \vec{f}$. We have $N\vec{x} = P\vec{x} + \vec{f}$, giving the iterative scheme:
\[ N\vec{x}_{m+1} = P\vec{x}_m + \vec{f}, \qquad m = 0, 1, 2, \ldots \]

Let $\vec{x}^*$ be the solution of $A\vec{x} = \vec{f}$, so $A\vec{x}^* = \vec{f}$. Define the error $\vec{e}_m := \vec{x}_m - \vec{x}^*$, $m = 0, 1, 2, \ldots$ Now:

\[ N\vec{x}_{m+1} = P\vec{x}_m + \vec{f} \quad (1) \]
\[ N\vec{x}^* = P\vec{x}^* + \vec{f} \quad (2) \]
$(1) - (2)$:
\[ N(\vec{x}_{m+1} - \vec{x}^*) = P(\vec{x}_m - \vec{x}^*) \iff N\vec{e}_{m+1} = P\vec{e}_m \iff \vec{e}_{m+1} = N^{-1}P\,\vec{e}_m. \]
So, letting $M = N^{-1}P$, we have
\[ \vec{e}_m = M^m \vec{e}_0. \]

Assume $\{\vec{u}_1, \ldots, \vec{u}_n\}$ is a set of linearly independent eigenvectors of $M$ (the $\vec{u}_i$ can be complex-valued vectors). Let $\vec{e}_0 = \sum_{i=1}^n a_i \vec{u}_i$. Then:
\[ \vec{e}_m = M^m \vec{e}_0 = \sum_{i=1}^n a_i M^m \vec{u}_i = \sum_{i=1}^n a_i \lambda_i^m \vec{u}_i, \]

where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the corresponding eigenvalues (they can be complex). Suppose we can order the eigenvalues:
\[ |\lambda_1| \geq |\lambda_2| \geq |\lambda_3| \geq \cdots \geq |\lambda_n|. \]

Then
\[ \vec{e}_m = \lambda_1^m \left\{ a_1\vec{u}_1 + \sum_{i=2}^{n} a_i \left( \frac{\lambda_i}{\lambda_1} \right)^m \vec{u}_i \right\}. \]
Assume $|\lambda_1| < 1$. Then $\vec{e}_m \to 0$ as $m \to \infty$. In order to reduce the error by a factor of $10^{-m}$, we need $k$ iterations with $|\lambda_1|^k \leq 10^{-m}$. That is,

\[ k \geq \frac{m}{-\log_{10}(\rho(M))} =: \frac{m}{R}. \]

We call $\rho(M)$ the asymptotic convergence factor. We call $R$ the asymptotic convergence rate.

In other words, the spectral radius of $M$,
\[ \rho(M) = \max_k \{ |\lambda_k| : \lambda_k \text{ an eigenvalue of } M \}, \]
is a good indicator of the rate of convergence. But finding $\rho(M)$ analytically is difficult! Solution: compute it numerically (next topic).
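For instance, a small NumPy sketch (illustrative, not part of the notes) that lets an eigenvalue routine estimate $\rho(M)$ for the two Jacobi examples above:

```python
import numpy as np

def jacobi_matrix(A):
    """Jacobi iteration matrix M = D^{-1}(D - A)."""
    D = np.diag(np.diag(A))
    return np.linalg.solve(D, D - A)

def spectral_radius(M):
    return np.max(np.abs(np.linalg.eigvals(M)))

A1 = np.array([[5., -2., 3.], [-3., 9., 1.], [2., -1., -7.]])  # convergent example
A2 = np.array([[1., -5.], [7., -1.]])                          # divergent example

print(spectral_radius(jacobi_matrix(A1)))  # approx 0.28 < 1: Jacobi converges
print(spectral_radius(jacobi_matrix(A2)))  # sqrt(35) approx 5.92 > 1: diverges
```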

Useful Theorem: Gerschgorin's Theorem

Consider $\vec{e} = (e_1, e_2, \ldots, e_n)^T$, an eigenvector of $A = (a_{ij})$ with eigenvalue $\lambda$. Then:
\[ A\vec{e} = \lambda\vec{e}. \]

Hence for each $i$ ($1 \leq i \leq n$),
\[ \sum_{j=1}^{n} a_{ij} e_j = \lambda e_i \iff a_{ii} e_i + \sum_{j=1, j\neq i}^{n} a_{ij} e_j = \lambda e_i \iff e_i (a_{ii} - \lambda) = -\sum_{j=1, j\neq i}^{n} a_{ij} e_j \]
\[ \Rightarrow |e_i|\, |a_{ii} - \lambda| \leq \sum_{j=1, j\neq i}^{n} |a_{ij}|\, |e_j|. \]

Suppose the component of the largest absolute value is $|e_l| \neq 0$ (so $|e_l| \geq |e_j|$ for all $j$). Then:
\[ |e_l|\, |a_{ll} - \lambda| \leq \sum_{j=1, j\neq l}^{n} |a_{lj}|\, |e_j| \leq \sum_{j=1, j\neq l}^{n} |a_{lj}|\, |e_l| \quad\Rightarrow\quad |a_{ll} - \lambda| \leq \sum_{j=1, j\neq l}^{n} |a_{lj}|. \]

So we have
\[ \lambda \in \mathrm{Ball}\Big( a_{ll},\ \sum_{j=1, j\neq l}^{n} |a_{lj}| \Big), \]
that is, the ball with centre $a_{ll}$ and radius $\sum_{j=1, j\neq l}^{n} |a_{lj}|$.

Note: we don't know $l$ unless we know $\lambda$ and $\vec{e}$. But we can conclude
\[ \lambda \in \bigcup_{l=1}^{n} \mathrm{Ball}\Big( a_{ll},\ \sum_{j=1, j\neq l}^{n} |a_{lj}| \Big). \]

Example: Determine upper bounds on the eigenvalues of the matrix
\[ A = \begin{pmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 2 \end{pmatrix}. \]
All eigenvalues lie within $\bigcup_{l=1}^{4} \mathrm{Ball}\big( a_{ll}, \sum_{j\neq l} |a_{lj}| \big)$.

For $l = 1$ and $4$, the ball is $\{\lambda : |\lambda - 2| \leq 1\}$. For $l = 2$ and $3$, the ball is $\{\lambda : |\lambda - 2| \leq 2\}$. Therefore the union of the balls is the disc with radius 2 and centre at $(2, 0)$. Since $A$ is symmetric, all eigenvalues are real. Thus $0 \leq \lambda \leq 4$.

In fact, the eigenvalues of $A$ are: $\lambda_1 = 3.618$, $\lambda_2 = 2.618$, $\lambda_3 = 1.382$, $\lambda_4 = 0.382$. That is, $\rho(A) = \lambda_1 \leq 4$.
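A short sketch (illustrative, not from the notes) that computes the Gerschgorin discs of this $A$ and compares them with the true eigenvalues:

```python
import numpy as np

A = np.array([[2., -1., 0., 0.],
              [-1., 2., -1., 0.],
              [0., -1., 2., -1.],
              [0., 0., -1., 2.]])

centres = np.diag(A)
radii = np.sum(np.abs(A), axis=1) - np.abs(centres)  # off-diagonal row sums
for c, r in zip(centres, radii):
    print(f"disc: centre {c}, radius {r}")

# A is symmetric, so eigvalsh applies; all eigenvalues lie inside the discs
print(np.linalg.eigvalsh(A))  # approx 0.382, 1.382, 2.618, 3.618
```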

To prove the convergence of the Jacobi method and the Gauss-Seidel method, let us introduce a definition.

Definition: A matrix $A = (a_{ij})$ is called strictly diagonally dominant (SDD) if:
\[ |a_{ii}| > \sum_{j=1, j\neq i}^{n} |a_{ij}|, \qquad i = 1, 2, \ldots, n. \]
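This condition is easy to test programmatically; a small sketch (the helper is my own, not from the notes):

```python
import numpy as np

def is_sdd(A):
    """True iff |a_ii| > sum_{j != i} |a_ij| for every row i."""
    A = np.asarray(A, dtype=float)
    diag = np.abs(np.diag(A))
    off = np.sum(np.abs(A), axis=1) - diag
    return bool(np.all(diag > off))

print(is_sdd([[10, 1], [1, 10]]))  # True
print(is_sdd([[1, -5], [7, -1]]))  # False: the divergent example above
```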

Theorem 1: If a matrix $A$ is SDD, then $A$ must be non-singular.

Proof: Recall that every eigenvalue satisfies $\lambda \in \bigcup_{l=1}^{n} \mathrm{Ball}\big( a_{ll}, \sum_{j\neq l} |a_{lj}| \big)$. Now $A$ is SDD iff:
\[ |a_{ll}| > \sum_{j=1, j\neq l}^{n} |a_{lj}|, \qquad l = 1, 2, \ldots, n. \]
Therefore none of the balls $\mathrm{Ball}\big( a_{ll}, \sum_{j\neq l} |a_{lj}| \big)$ contains $0$. Hence no eigenvalue can be $0$. If $A$ were singular, then there would exist $\vec{v} \neq \vec{0}$ such that $A\vec{v} = \vec{0} = 0\vec{v}$, which would make $\lambda = 0$ an eigenvalue. Contradiction. So $A$ is non-singular.

Now, we prove the convergence of the Jacobi method. Recall that the Jacobi method can be written as:
\[ \vec{x}_{m+1} = D^{-1}(D - A)\vec{x}_m + D^{-1}\vec{f}. \]

Theorem: The Jacobi method converges to the solution of $A\vec{x} = \vec{f}$ if $A$ is strictly diagonally dominant.

Proof: Note that componentwise,
\[ x_i^{m+1} = -\frac{1}{a_{ii}} \sum_{j=1, j\neq i}^{n} a_{ij} x_j^m + \frac{f_i}{a_{ii}}, \qquad i = 1, 2, \ldots, n. \quad (1) \]

Let $\vec{x}^*$ be the solution. Then we also have
\[ x_i^* = -\frac{1}{a_{ii}} \sum_{j=1, j\neq i}^{n} a_{ij} x_j^* + \frac{f_i}{a_{ii}}. \quad (2) \]

$(1) - (2)$:
\[ e_i^{m+1} = -\frac{1}{a_{ii}} \sum_{j=1, j\neq i}^{n} a_{ij} e_j^m. \]

Therefore
\[ |e_i^{m+1}| \leq \sum_{j=1, j\neq i}^{n} \left| \frac{a_{ij}}{a_{ii}} \right| |e_j^m| \leq \left( \sum_{j=1, j\neq i}^{n} \left| \frac{a_{ij}}{a_{ii}} \right| \right) \|\vec{e}_m\|_\infty \leq r \|\vec{e}_m\|_\infty, \qquad \big( \|\vec{e}_m\|_\infty = \max_j |e_j^m| \big), \]
where
\[ r = \max_i \sum_{j=1, j\neq i}^{n} \left| \frac{a_{ij}}{a_{ii}} \right| < 1 \quad (\text{since } A \text{ is SDD}). \]
Hence
\[ \|\vec{e}_{m+1}\|_\infty \leq r \|\vec{e}_m\|_\infty, \]
and inductively,
\[ \|\vec{e}_m\|_\infty \leq r^m \|\vec{e}_0\|_\infty. \]
Therefore $\|\vec{e}_m\|_\infty \to 0$ as $m \to \infty$.

Theorem: The Gauss-Seidel method converges to the solution of $A\vec{x} = \vec{f}$ if $A$ is strictly diagonally dominant.

Proof: The Gauss-Seidel method can be written as:
\[ x_i^{m+1} = -\sum_{j=1}^{i-1} \frac{a_{ij}}{a_{ii}} x_j^{m+1} - \sum_{j=i+1}^{n} \frac{a_{ij}}{a_{ii}} x_j^m + \frac{f_i}{a_{ii}}. \quad (1) \]

Let $\vec{x}^*$ be the solution of $A\vec{x} = \vec{f}$. Then:
\[ x_i^* = -\sum_{j=1}^{i-1} \frac{a_{ij}}{a_{ii}} x_j^* - \sum_{j=i+1}^{n} \frac{a_{ij}}{a_{ii}} x_j^* + \frac{f_i}{a_{ii}}. \quad (2) \]

$(1) - (2)$:
\[ e_i^{m+1} = -\sum_{j=1}^{i-1} \frac{a_{ij}}{a_{ii}} e_j^{m+1} - \sum_{j=i+1}^{n} \frac{a_{ij}}{a_{ii}} e_j^m. \]

Let $\vec{e}_m = (e_1^m, \ldots, e_n^m)^T$ and $r = \max_i \sum_{j=1, j\neq i}^{n} \left| \frac{a_{ij}}{a_{ii}} \right| < 1$. Again, we will prove
\[ \|\vec{e}_{m+1}\|_\infty \leq r \|\vec{e}_m\|_\infty, \qquad m = 0, 1, 2, \ldots \]
by showing $|e_i^{m+1}| \leq r \|\vec{e}_m\|_\infty$ for all $i$.

We induct on $i$. When $i = 1$,
\[ |e_1^{m+1}| \leq \sum_{j=2}^{n} \left| \frac{a_{1j}}{a_{11}} \right| |e_j^m| \leq \|\vec{e}_m\|_\infty \sum_{j=2}^{n} \left| \frac{a_{1j}}{a_{11}} \right| \leq r \|\vec{e}_m\|_\infty. \]

Assume $|e_k^{m+1}| \leq r \|\vec{e}_m\|_\infty$ for $k = 1, 2, \ldots, i-1$. Then:
\[ |e_i^{m+1}| \leq \sum_{j=1}^{i-1} \left| \frac{a_{ij}}{a_{ii}} \right| |e_j^{m+1}| + \sum_{j=i+1}^{n} \left| \frac{a_{ij}}{a_{ii}} \right| |e_j^m| \leq r \|\vec{e}_m\|_\infty \sum_{j=1}^{i-1} \left| \frac{a_{ij}}{a_{ii}} \right| + \|\vec{e}_m\|_\infty \sum_{j=i+1}^{n} \left| \frac{a_{ij}}{a_{ii}} \right| \]
\[ \leq \|\vec{e}_m\|_\infty \sum_{j=1, j\neq i}^{n} \left| \frac{a_{ij}}{a_{ii}} \right| \leq r \|\vec{e}_m\|_\infty. \]

By induction, $|e_i^{m+1}| \leq r \|\vec{e}_m\|_\infty$ for all $i$. Hence
\[ \|\vec{e}_{m+1}\|_\infty \leq r \|\vec{e}_m\|_\infty. \]
Therefore
\[ \|\vec{e}_m\|_\infty \leq r^m \|\vec{e}_0\|_\infty \to 0 \ \text{ as } m \to \infty, \ \text{ since } r < 1. \]

Example: Consider
\[ A\vec{x} = \begin{pmatrix} 10 & 1 \\ 1 & 10 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 12 \\ 21 \end{pmatrix} = \vec{b}. \]

$A$ is SDD, so both the Jacobi method and the Gauss-Seidel method converge. Let us compare the convergence rates of the two methods.

Solution: For the Jacobi method, $D = \begin{pmatrix} 10 & 0 \\ 0 & 10 \end{pmatrix}$ and
\[ \vec{x}_{k+1} = D^{-1}(D - A)\vec{x}_k + D^{-1}\begin{pmatrix} 12 \\ 21 \end{pmatrix} = \begin{pmatrix} 10 & 0 \\ 0 & 10 \end{pmatrix}^{-1} \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \vec{x}_k + D^{-1}\begin{pmatrix} 12 \\ 21 \end{pmatrix}. \]
Let
\[ M = \begin{pmatrix} 10 & 0 \\ 0 & 10 \end{pmatrix}^{-1} \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -\frac{1}{10} \\ -\frac{1}{10} & 0 \end{pmatrix}. \]
We need to check the spectral radius of $M$: the eigenvalues $\lambda$ of $M$ satisfy $\lambda^2 - \frac{1}{100} = 0$, so $\lambda = \frac{1}{10}$ or $\lambda = -\frac{1}{10}$. Therefore $M$ is diagonalizable and $\rho(M) = \frac{1}{10}$.

Recall: $\vec{e}_m = \text{error} = \vec{x}_m - \vec{x}^* \approx \rho(M)^m \vec{K}_J = \left( \frac{1}{10} \right)^m \vec{K}_J$.

Now, consider the Gauss-Seidel method. Take
\[ L + D = \begin{pmatrix} 10 & 0 \\ 1 & 10 \end{pmatrix}, \quad U = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad \vec{x}_{k+1} = -\begin{pmatrix} 10 & 0 \\ 1 & 10 \end{pmatrix}^{-1} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \vec{x}_k + (L+D)^{-1}\vec{b}. \]
We need to check:
\[ \rho(M) = \rho\left( \begin{pmatrix} 0 & -\frac{1}{10} \\ 0 & \frac{1}{100} \end{pmatrix} \right). \]
The eigenvalues of $M$ satisfy $\left( \lambda - \frac{1}{100} \right)\lambda = 0$, so $\lambda = \frac{1}{100}$ or $\lambda = 0$. Therefore
\[ \rho(M) = \frac{1}{100} \quad\Rightarrow\quad \vec{e}_m \approx \left( \frac{1}{100} \right)^m \vec{K}_{GS}. \]

So Gauss-Seidel converges faster. In fact, recall that in order to reduce the error by a factor of $10^{-m}$, we need $k$ iterations such that:
\[ |\lambda_1|^k \leq 10^{-m} \iff k \geq \frac{m}{-\log_{10}(\rho(M))}. \]
For Jacobi, $k \geq \frac{m}{-\log_{10}(1/10)} = m$. For Gauss-Seidel, $k \geq \frac{m}{-\log_{10}(1/100)} = \frac{m}{2}$.

Therefore Gauss-Seidel converges twice as fast as Jacobi.
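A sketch checking this factor of two experimentally (the tolerance and iteration cap are arbitrary illustrative choices):

```python
import numpy as np

A = np.array([[10., 1.], [1., 10.]])
b = np.array([12., 21.])
x_star = np.linalg.solve(A, b)  # exact solution (1, 2)

def count_iterations(M, c, tol=1e-10, maxit=1000):
    x = np.zeros_like(c)
    for k in range(1, maxit + 1):
        x = M @ x + c
        if np.max(np.abs(x - x_star)) < tol:
            return k
    return maxit

D = np.diag(np.diag(A))
L = np.tril(A, -1)
U = np.triu(A, 1)

k_jac = count_iterations(np.linalg.solve(D, D - A), np.linalg.solve(D, b))
k_gs = count_iterations(-np.linalg.solve(L + D, U), np.linalg.solve(L + D, b))
print(k_jac, k_gs)  # Jacobi needs roughly twice as many iterations as G-S
```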

What if $M = N^{-1}P$ is not diagonalizable?

Theorem: Let $A \in M_{n\times n}(\mathbb{C})$ be a complex-valued matrix with spectral radius
\[ \rho(A) = \max_i \{ |\lambda_i| \}, \qquad \lambda_1, \lambda_2, \ldots, \lambda_n = \text{eigenvalues of } A. \]
Then
\[ \lim_{k\to\infty} A^k = 0 \iff \rho(A) < 1. \]

Proof: ($\Rightarrow$) Let $\lambda$ be an eigenvalue with eigenvector $\vec{v}$. Then $A^k\vec{v} = \lambda^k\vec{v}$. Thus,
\[ \lim_{k\to\infty} A^k\vec{v} = \lim_{k\to\infty} \lambda^k\vec{v} \quad\Rightarrow\quad \vec{0} = \vec{v} \lim_{k\to\infty} \lambda^k. \]
$\therefore \lim_{k\to\infty} \lambda^k = 0 \Rightarrow |\lambda| < 1$. Thus $\rho(A) < 1$.

($\Leftarrow$) Let $\lambda_1, \lambda_2, \ldots, \lambda_s$ be the eigenvalues of $A$. From linear algebra, there exists an invertible $Q \in M_{n\times n}(\mathbb{C})$ such that
\[ A = QJQ^{-1}, \]
where $J$ is the Jordan canonical form of $A$:
\[ J = \begin{pmatrix} J_{m_1}(\lambda_1) & & & \\ & J_{m_2}(\lambda_2) & & \\ & & \ddots & \\ & & & J_{m_s}(\lambda_s) \end{pmatrix}, \qquad J_{m_i}(\lambda_i) = \begin{pmatrix} \lambda_i & 1 & & \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_i \end{pmatrix} \in M_{m_i\times m_i}(\mathbb{C}), \quad 1 \leq i \leq s. \]

Now, $A^k = QJ^kQ^{-1}$ and
\[ J^k = \begin{pmatrix} J_{m_1}^k(\lambda_1) & & & \\ & J_{m_2}^k(\lambda_2) & & \\ & & \ddots & \\ & & & J_{m_s}^k(\lambda_s) \end{pmatrix}. \]
Also, for $k \geq m_i - 1$,
\[ J_{m_i}^k(\lambda_i) = \begin{pmatrix} \lambda_i^k & \binom{k}{1}\lambda_i^{k-1} & \binom{k}{2}\lambda_i^{k-2} & \cdots & \binom{k}{m_i-1}\lambda_i^{k-m_i+1} \\ 0 & \lambda_i^k & \binom{k}{1}\lambda_i^{k-1} & \cdots & \binom{k}{m_i-2}\lambda_i^{k-m_i+2} \\ & & \ddots & & \vdots \\ & & & \lambda_i^k & \binom{k}{1}\lambda_i^{k-1} \\ & & & 0 & \lambda_i^k \end{pmatrix}. \]

Since $\rho(A) < 1$, we have $|\lambda_i| < 1$ for all $i$. Each entry of $J_{m_i}^k(\lambda_i)$ has the form $\binom{k}{j}\lambda_i^{k-j}$, and the polynomial growth of $\binom{k}{j}$ in $k$ is beaten by the geometric decay of $|\lambda_i|^{k-j}$, so $\lim_{k\to\infty} J_{m_i}^k = 0$ for all $i$ and $J^k \to 0$ as $k \to \infty$. Thus,
\[ \lim_{k\to\infty} A^k = \lim_{k\to\infty} QJ^kQ^{-1} = 0. \]

Remark: Following the same idea, $\rho(A) > 1$ implies $\|A^k\|_\infty \to \infty$ as $k \to \infty$, where $\|A\|_\infty = \max_{i,j} |a_{ij}|$.

Corollary: The iteration scheme $\vec{x}_{k+1} = M\vec{x}_k + \vec{b}$ converges iff $\rho(M) < 1$.

Proof: Consider
\[ \vec{x}_{k+1} = M\vec{x}_k + \vec{b} \quad (1), \qquad \vec{x} = M\vec{x} + \vec{b} \quad (2), \qquad \vec{x} = \text{solution}. \]
$(1) - (2)$ gives
\[ \vec{e}_{k+1} = M(\vec{x}_k - \vec{x}) \;\Rightarrow\; \vec{e}_{k+1} = M\vec{e}_k \;\Rightarrow\; \vec{e}_k = M^k\vec{e}_0. \]
If $\rho(M) < 1$, then $M^k \to 0$ as $k \to \infty$, so $\vec{e}_k \to \vec{0}$.

Splitting choice 3: Successive Over-Relaxation Method (SOR)

Again split $A = L + D + U$. Consider the iterative scheme (introducing an auxiliary sequence $\bar{\vec{x}}_k$ alongside $\vec{x}_k$):
\[ L\vec{x}_{k+1} + D\bar{\vec{x}}_{k+1} + U\vec{x}_k = \vec{b} \quad (*) \]
\[ \vec{x}_{k+1} = \vec{x}_k + \omega(\bar{\vec{x}}_{k+1} - \vec{x}_k) \quad (**) \qquad \left( \iff \bar{\vec{x}}_{k+1} = \frac{1}{\omega}\big( \vec{x}_{k+1} + (\omega - 1)\vec{x}_k \big) \right). \]
Putting $(**)$ into $(*)$, we have:
\[ \left( L + \frac{1}{\omega}D \right)\vec{x}_{k+1} + \frac{1}{\omega}\big( \omega U + (\omega - 1)D \big)\vec{x}_k = \vec{b}, \]
or
\[ \left( L + \frac{1}{\omega}D \right)\vec{x}_{k+1} = \left( \frac{1}{\omega}D - (D + U) \right)\vec{x}_k + \vec{b}. \quad \text{(SOR)} \]

Clearly, SOR is equivalent to splitting $A$ as:
\[ A = N - P = \left( L + \frac{1}{\omega}D \right) - \left( \frac{1}{\omega}D - (D + U) \right). \]
In fact, SOR is equivalent to solving:
\[ \begin{cases} a_{11}\bar{x}_1^{k+1} + a_{12}x_2^{k} + \cdots + a_{1n}x_n^{k} = b_1 \\ a_{21}x_1^{k+1} + a_{22}\bar{x}_2^{k+1} + \cdots + a_{2n}x_n^{k} = b_2 \\ \qquad \vdots \\ a_{n1}x_1^{k+1} + a_{n2}x_2^{k+1} + \cdots + a_{nn}\bar{x}_n^{k+1} = b_n \end{cases} \]
and then setting
\[ x_i^{k+1} = x_i^k + \omega\big( \bar{x}_i^{k+1} - x_i^k \big), \qquad i = 1, \ldots, n. \]

Remark: SOR = Gauss-Seidel if $\omega = 1$.
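A componentwise SOR sketch along these lines (an illustrative implementation with an interface of my own choosing, not code from the notes):

```python
import numpy as np

def sor(A, b, omega, tol=1e-10, maxit=10000):
    """SOR: compute the Gauss-Seidel value xbar_i, then relax
    x_i <- x_i + omega * (xbar_i - x_i), sweeping i = 1..n."""
    n = len(b)
    x = np.zeros(n)
    for k in range(1, maxit + 1):
        x_old = x.copy()
        for i in range(n):
            # Gauss-Seidel value using already-relaxed x_j (j < i)
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x_old[i+1:]
            xbar = (b[i] - s) / A[i, i]
            x[i] = x_old[i] + omega * (xbar - x_old[i])  # over-relaxed update
        if np.max(np.abs(x - x_old)) < tol:
            return x, k
    return x, maxit

A = np.array([[10., 1.], [1., 10.]])
b = np.array([12., 21.])
print(sor(A, b, omega=1.0))  # omega = 1 reproduces Gauss-Seidel
```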

Condition for the convergence of SOR

Theorem: A necessary (but not sufficient) condition for SOR to converge is $0 < \omega < 2$.

Proof: Consider:
\[ \det(N^{-1}P) = \det\left( \left( L + \frac{1}{\omega}D \right)^{-1} \left( \frac{1}{\omega}D - (D + U) \right) \right). \]
Since $L + \frac{1}{\omega}D$ is lower triangular and $\frac{1}{\omega}D - (D + U)$ is upper triangular, the determinants depend only on the diagonal entries:
\[ \det(N^{-1}P) = \det\left( \left( \frac{1}{\omega}D \right)^{-1} \right) \det\left( \frac{1}{\omega}D - D \right) = \big[ \omega^n \det(D^{-1}) \big] \left[ \frac{(1-\omega)^n}{\omega^n} \det(D) \right] = (1 - \omega)^n. \]

Since $\det(N^{-1}P) = \prod_i \lambda_i$, where the $\lambda_i$ are the eigenvalues of $N^{-1}P$, we have
\[ |1 - \omega|^n = \big| \det(N^{-1}P) \big| = \prod_i |\lambda_i| \leq \max_i |\lambda_i|^n = \rho(N^{-1}P)^n \quad\Rightarrow\quad \rho(N^{-1}P) \geq |\omega - 1|. \]
Now, the SOR method converges iff $\rho(N^{-1}P) < 1$. So
\[ |\omega - 1| \leq \rho(N^{-1}P) < 1 \quad\Rightarrow\quad 0 < \omega < 2. \]

Remark: In general, SOR converges if and only if
\[ \rho(N^{-1}P) = \rho\left( \left( L + \frac{1}{\omega}D \right)^{-1} \left( \frac{1}{\omega}D - (D + U) \right) \right) < 1. \]
Therefore, to find a sufficient condition for the SOR method to converge, we need to check the eigenvalues of the matrix
\[ \left( L + \frac{1}{\omega}D \right)^{-1} \left( \frac{1}{\omega}D - (D + U) \right). \]

Example: Let us go back to $A\vec{x} = \vec{b}$, where $A = \begin{pmatrix} 10 & 1 \\ 1 & 10 \end{pmatrix}$.

Recall $\rho(M_{Jacobi}) = \frac{1}{10}$ and $\rho(M_{GS}) = \frac{1}{100}$: G-S converges faster!

Now, consider the convergence rate of the SOR method. Recall the SOR method reads:
\[ \vec{x}_{k+1} = \left( L + \frac{1}{\omega}D \right)^{-1} \left( \frac{1}{\omega}D - (D + U) \right)\vec{x}_k + \left( L + \frac{1}{\omega}D \right)^{-1}\vec{b}. \]
So,
\[ M = \left( L + \frac{1}{\omega}D \right)^{-1} \left( \frac{1}{\omega}D - (D + U) \right) = \left[ \frac{1}{\omega}(D + \omega L) \right]^{-1} \left[ \frac{1}{\omega}\big( (1-\omega)D - \omega U \big) \right] = (D + \omega L)^{-1}\big( (1-\omega)D - \omega U \big). \]

We examine $\rho(M)$:
\[ (1-\omega)D - \omega U = (1-\omega)\begin{pmatrix} 10 & 0 \\ 0 & 10 \end{pmatrix} - \omega\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 10(1-\omega) & -\omega \\ 0 & 10(1-\omega) \end{pmatrix} \]
and
\[ D + \omega L = \begin{pmatrix} 10 & 0 \\ \omega & 10 \end{pmatrix} \quad\Rightarrow\quad (D + \omega L)^{-1} = \begin{pmatrix} \frac{1}{10} & 0 \\ -\frac{\omega}{100} & \frac{1}{10} \end{pmatrix}. \]
\[ \therefore\ M_{SOR} = (D + \omega L)^{-1}\big( (1-\omega)D - \omega U \big) = \begin{pmatrix} 1-\omega & -\frac{\omega}{10} \\ -\frac{\omega(1-\omega)}{10} & \frac{\omega^2}{100} + 1 - \omega \end{pmatrix}. \]

The characteristic polynomial of $M_{SOR}$ is:
\[ \big[ (1-\omega) - \lambda \big] \left[ \frac{\omega^2}{100} + 1 - \omega - \lambda \right] - \frac{\omega^2(1-\omega)}{100} = 0. \]

Simplifying:
\[ \lambda^2 - \lambda\left[ 2(1-\omega) + \frac{\omega^2}{100} \right] + (1-\omega)^2 = 0. \]
Then:
\[ \lambda = (1-\omega) + \frac{\omega^2}{200} \pm \frac{\omega}{20}\sqrt{4(1-\omega) + \frac{\omega^2}{100}}. \]

When $\omega = 1$ (Gauss-Seidel method), $\lambda = 0$ or $\lambda = \frac{1}{100}$.

Changing ω changes λ.

Choice of $\omega$? Let us choose $\omega$ such that $4(1-\omega) + \frac{\omega^2}{100} = 0$, so that the quadratic has equal roots:
\[ \lambda = (1-\omega) + \frac{\omega^2}{200} = (1-\omega) - 2(1-\omega) = \omega - 1 \qquad \left( \because\ 4(1-\omega) + \frac{\omega^2}{100} = 0 \right). \]
The smallest value of $\omega$ ($0 < \omega < 2$) such that $4(1-\omega) + \frac{\omega^2}{100} = 0$ is:
\[ \omega = 1.002512579\ldots \]

(which is very close to Gauss-Seidel). But $\rho(M_{SOR}) = 0.002512579$ (compare to $\rho(M_{GS}) = 0.01$ and $\rho(M_J) = 0.1$), so SOR converges much faster than G-S!

Remark:

• $\rho(M_{SOR})$ is very sensitive to $\omega$. If you can hit the right value of $\omega$, you can improve the speed of convergence significantly!

• One major task in computational math is to find the right parameter!

How can we choose the optimal $\omega$, at least for simple cases?

• In general, it is difficult to choose $\omega$.

• $\omega$ is usually chosen with $0 < \omega < 2$.

• But for some special matrices, the optimal $\omega$ can be easily found.

Definition: Consider the system $A\vec{x} = \vec{b}$ and let $A = D + L + U$. If the eigenvalues of
\[ \alpha D^{-1}L + \frac{1}{\alpha} D^{-1}U, \qquad \alpha \neq 0, \]
are independent of $\alpha$, then the matrix is said to be consistently ordered.

Theorem: If $A$ is consistently ordered, then the optimal $\omega$ for the SOR method is:
\[ \omega = \frac{2}{1 + \sqrt{1 - \rho(M_J)^2}}, \]
where $M_J$ is the iteration matrix of the Jacobi method, $M_J = -D^{-1}(L + U)$.

Proof: Consistently ordered means the eigenvalues of
\[ \alpha D^{-1}L + \frac{1}{\alpha} D^{-1}U \]
are the same as those of
\[ D^{-1}L + D^{-1}U \]
(the Jacobi matrix; put $\alpha = 1$; taking $\alpha = -1$ shows they also agree with the eigenvalues of $-(D^{-1}L + D^{-1}U) = M_J$).

Now consider the eigenvalues of $M_{SOR}$. The characteristic polynomial is
\[ \det(M_{SOR} - \lambda I) = 0, \]
i.e.
\[ \det\Big( (D + \omega L)^{-1}\big( (1-\omega)D - \omega U \big) - \lambda I \Big) = 0, \]
i.e.
\[ \det\big( (D + \omega L)^{-1} \big) \det\big[ (1-\omega)D - \omega U - \lambda(D + \omega L) \big] = 0. \]
So $\lambda$ satisfies:
\[ \det\big[ (1 - \omega - \lambda)D - \omega U - \lambda\omega L \big] = 0. \]

Since $\omega \neq 0$, any non-zero eigenvalue $\lambda$ must satisfy (dividing by $\det(\omega\sqrt{\lambda}\, D) \neq 0$):
\[ \det\left[ \sqrt{\lambda}\, D^{-1}L + \frac{1}{\sqrt{\lambda}}\, D^{-1}U - \frac{\lambda + \omega - 1}{\omega\sqrt{\lambda}}\, I \right] = 0. \]

Since $A$ is consistently ordered, the eigenvalues of
\[ \sqrt{\lambda}\, D^{-1}L + \frac{1}{\sqrt{\lambda}}\, D^{-1}U \]
are the same as those of $M_J = -D^{-1}(L + U)$.

Let the eigenvalues of $M_J$ be $\mu$. Then the non-zero eigenvalues of $M_{SOR}$ satisfy:
\[ \mu = \frac{\lambda + \omega - 1}{\omega\sqrt{\lambda}} \quad \text{for some } \lambda. \quad (*) \]

For $\omega \neq 0$, we can solve $(*)$ for $\lambda$ and get
\[ \lambda = (1 - \omega) + \frac{\mu^2\omega^2}{2} \pm \mu\omega\sqrt{(1 - \omega) + \frac{\mu^2\omega^2}{4}}. \quad (**) \]
Each $\mu$ gives one or two eigenvalues $\lambda$.

$\lambda$ depends on $\omega$. We want to find $\omega$ such that $\rho(M_{SOR})$ is as small as possible. We can show that this happens when the roots in $(**)$ are equal for the $\mu$ of maximum modulus, that is,
\[ \mu\omega\sqrt{(1 - \omega) + \frac{\mu^2\omega^2}{4}} = 0 \quad\Rightarrow\quad \omega = \frac{2}{1 \mp \sqrt{1 - \mu^2}}. \]

We look for the smallest such value of $\omega$ ($0 < \omega < 2$), and so
\[ \omega = \frac{2}{1 + \sqrt{1 - \mu^2}} = \frac{2}{1 + \sqrt{1 - \rho(M_J)^2}}. \]

Example: Consider $A = \begin{pmatrix} 10 & 1 \\ 1 & 10 \end{pmatrix}$. Then:
\[ \alpha D^{-1}L + \frac{1}{\alpha} D^{-1}U = \begin{pmatrix} 0 & \frac{1}{10\alpha} \\ \frac{\alpha}{10} & 0 \end{pmatrix} \quad\Rightarrow\quad \lambda^2 - \frac{1}{10\alpha}\cdot\frac{\alpha}{10} = 0 \quad\Rightarrow\quad \lambda^2 = \frac{1}{100}, \]
independent of $\alpha$. $\therefore$ $A$ is consistently ordered. $\therefore$ The optimal $\omega$ is:
\[ \omega = \frac{2}{1 + \sqrt{1 - \rho(M_J)^2}} = \frac{2}{1 + \sqrt{1 - \frac{1}{100}}} = 1.0025126\ldots \quad (\text{same as in the earlier example}). \]
$\therefore$ The fastest convergence rate is $\rho(M_{SOR}) = 0.0025126$.
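This can be confirmed numerically (an illustrative sketch, with an arbitrary grid resolution): scan $\omega$ and compute $\rho(M_{SOR}(\omega))$ with an eigenvalue routine.

```python
import numpy as np

A = np.array([[10., 1.], [1., 10.]])
D = np.diag(np.diag(A))
L = np.tril(A, -1)
U = np.triu(A, 1)

def rho_sor(omega):
    # M_SOR = (D + omega*L)^{-1} ((1 - omega) D - omega U)
    M = np.linalg.solve(D + omega * L, (1 - omega) * D - omega * U)
    return np.max(np.abs(np.linalg.eigvals(M)))

omegas = np.linspace(0.9, 1.1, 2001)
rhos = [rho_sor(w) for w in omegas]
i = int(np.argmin(rhos))
print(omegas[i], rhos[i])  # approx 1.0025 and 0.0025, matching the formula
```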

SOR method: Take $N = \frac{1}{\omega}D + L$ and $P = \left( \frac{1}{\omega} - 1 \right)D - U$. Then $A = N - P$, and the iterative scheme is:
\[ \vec{x}_{k+1} = \left( \frac{1}{\omega}D + L \right)^{-1} \left( \left( \frac{1}{\omega} - 1 \right)D - U \right)\vec{x}_k + \left( \frac{1}{\omega}D + L \right)^{-1}\vec{b} = (D + \omega L)^{-1}\big[ (1-\omega)D - \omega U \big]\vec{x}_k + \left( \frac{1}{\omega}D + L \right)^{-1}\vec{b}. \]

Recall:

• SOR converges $\Rightarrow |\omega - 1| < 1$, i.e. $0 < \omega < 2$.

• In general, SOR converges $\iff \rho\big( (D + \omega L)^{-1}\big[ (1-\omega)D - \omega U \big] \big) < 1$.

• $\omega = 1$ gives the G-S method.

Remark: In particular, a tridiagonal matrix is consistently ordered:
\[ A = \begin{pmatrix} \lambda_1 & * & & & \\ * & \lambda_2 & * & & \\ & \ddots & \ddots & \ddots & \\ & & * & \lambda_{n-1} & * \\ & & & * & \lambda_n \end{pmatrix}. \]

Example: Solve $-u'' = f$, $u(0) = a$, $u(1) = b$. Partition $[0, 1]$ by $x_0 = 0 < x_1 = h < x_2 = 2h < \cdots < x_n = 1$. Approximating $u''$ by $\frac{u(x+h) - 2u(x) + u(x-h)}{h^2}$, the equation $-u'' = f$ can be approximated by
\[ A \begin{pmatrix} u(x_1) \\ \vdots \\ u(x_{n-1}) \end{pmatrix} = \vec{b}, \qquad \text{where } A = \begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 2 \end{pmatrix}, \]
which is tridiagonal and hence consistently ordered.

Examples of consistently ordered matrices

Example 1: Consider a block tridiagonal matrix of the form
\[ A = \begin{pmatrix} D_1 & T_{12} & & \\ T_{21} & D_2 & T_{23} & \\ & \ddots & \ddots & \ddots \\ & & T_{p,p-1} & D_p \end{pmatrix}, \]
where the $D_i$ are diagonal matrices. Then $A$ is consistently ordered. To see this, it suffices to see that $D^{-1}L + D^{-1}U$ and $zD^{-1}L + \frac{1}{z}D^{-1}U$ are similar for all $z \neq 0$. Note,
\[ zD^{-1}L + \frac{1}{z}D^{-1}U = X\big( D^{-1}L + D^{-1}U \big)X^{-1}, \qquad \text{where } X = \begin{pmatrix} I & & & \\ & zI & & \\ & & \ddots & \\ & & & z^{p-1}I \end{pmatrix}. \]

Example 2: Another type of consistently ordered matrix:
\[ A = \begin{pmatrix} T_1 & D_{12} & & \\ D_{21} & T_2 & D_{23} & \\ & \ddots & \ddots & \ddots \\ & & D_{p,p-1} & T_p \end{pmatrix}, \]
where the $T_i \in M_{n\times n}$ are tridiagonal matrices and the $D_{ij}$ are diagonal. Proof: complicated!

Theorem [D. Young]: Assume:

1. $\omega \in (0, 2)$;

2. $M_{JAC}$ has only real eigenvalues;

3. $\beta = \rho(M_{JAC}) < 1$;

4. $A$ is consistently ordered.

Then $\rho(M_{SOR,\omega}) < 1$. In fact, as evaluated in the sketch below,
\[ \rho(M_{SOR,\omega}) = \begin{cases} 1 - \omega + \frac{1}{2}\omega^2\beta^2 + \omega\beta\sqrt{1 - \omega + \frac{\omega^2\beta^2}{4}}, & 0 < \omega \leq \omega_{opt}, \\ \omega - 1, & \omega_{opt} \leq \omega < 2, \end{cases} \]
where
\[ \omega_{opt} = \frac{2}{1 + \sqrt{1 - \beta^2}}. \]
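The sharp drop of $\rho(M_{SOR,\omega})$ near $\omega_{opt}$ is easy to see by evaluating Young's formula; a sketch for $\beta = 0.1$ (the $2\times 2$ example above):

```python
import numpy as np

beta = 0.1  # rho(M_JAC) for the 2x2 example above
w_opt = 2 / (1 + np.sqrt(1 - beta**2))

def rho_sor(w):
    """Young's formula for rho(M_SOR,omega), A consistently ordered."""
    if w <= w_opt:
        disc = max(0.0, 1 - w + w**2 * beta**2 / 4)  # guard tiny negatives at w_opt
        return 1 - w + 0.5 * w**2 * beta**2 + w * beta * np.sqrt(disc)
    return w - 1

for w in (0.5, 1.0, w_opt, 1.5):
    print(f"omega = {w:.6f}:  rho = {rho_sor(w):.6f}")
# omega = 1 gives rho = 0.01 (Gauss-Seidel);
# omega = w_opt gives rho = w_opt - 1, approx 0.0025 (the fastest rate)
```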

Convergence conditions for SOR

Theorem: If $A$ is symmetric positive definite, then the SOR method converges for all $0 < \omega < 2$.

Theorem: If $A$ is strictly diagonally dominant, then SOR converges for $0 < \omega \leq 1$.

Proof: If $A$ is SDD, then $a_{ii} \neq 0$ and $A$ is invertible. The SOR method reads:
\[ \vec{x}_{k+1} = M_{SOR}\vec{x}_k + \vec{c}, \qquad \text{where } M_{SOR} = (D + \omega L)^{-1}\big( (1-\omega)D - \omega U \big), \quad (A = L + D + U). \]

We need to show that if $0 < \omega \leq 1$, then $\rho(M_{SOR}) < 1$. We will prove this by contradiction.

Suppose there exists an eigenvalue $\lambda$ of $M_{SOR}$ with $|\lambda| \geq 1$. Then
\[ \det(\lambda I - M_{SOR}) = 0. \]
Also,
\[ \det\left( \lambda(D + \omega L)^{-1} \left[ (D + \omega L) - \frac{1}{\lambda}\big( (1-\omega)D - \omega U \big) \right] \right) = 0 \]
\[ \Rightarrow \det\big( (D + \omega L)^{-1} \big) \det\left( \left( 1 - \frac{1}{\lambda}(1-\omega) \right)D + \omega L + \frac{\omega}{\lambda}U \right) = 0 \]
\[ \Rightarrow \det(C) = 0, \qquad \text{where } C = \left( 1 - \frac{1}{\lambda}(1-\omega) \right)D + \omega L + \frac{\omega}{\lambda}U, \]
since $a_{ii} \neq 0$ implies $\det\big( (D + \omega L)^{-1} \big) \neq 0$.

Then, for $0 < \omega \leq 1$ and $|\lambda| \geq 1$,
\[ |C_{ii}| = \left| 1 - \frac{1}{\lambda}(1-\omega) \right| |a_{ii}| \geq \left[ 1 - \frac{1}{|\lambda|}(1-\omega) \right] |a_{ii}| \geq \omega |a_{ii}| \]
\[ > \omega \sum_{j=1, j\neq i}^{n} |a_{ij}| \geq \omega \sum_{j=1}^{i-1} |a_{ij}| + \frac{\omega}{|\lambda|} \sum_{j=i+1}^{n} |a_{ij}| = \sum_{j=1, j\neq i}^{n} |C_{ij}|. \]

So $C$ is also SDD, and thus $\det(C) \neq 0$ by Theorem 1. Contradiction. $\therefore$ All eigenvalues of $M_{SOR}$ satisfy $|\lambda| < 1$. Thus $\rho(M_{SOR}) < 1$, and hence SOR converges.
