Part VIIIb: Eigenvalue Conditioning
MA 580; Numerical Analysis I
C. T. Kelley, NC State University
Version of November 14, 2016
NCSU, Fall 2016
© C. T. Kelley, I. C. F. Ipsen, 2016
References
This part of the notes comes from
Applied Numerical Linear Algebra, Demmel, SIAM 1997
Matrix Computations, Golub and Van Loan, Johns Hopkins, 2013
Numerical Linear Algebra, Trefethen and Bau, SIAM 1997
Introduction to Matrix Computations, Stewart, Academic Press, 1973
Eigenvalue Conditioning
Here’s some bad news.
A = \begin{pmatrix}
0 & 1 &        &        &   \\
  & 0 & 1      &        &   \\
  &   & \ddots & \ddots &   \\
  &   &        & 0      & 1 \\
  &   &        &        & 0
\end{pmatrix},
\quad
B = \begin{pmatrix}
0 & 1 &        &        &   \\
  & 0 & 1      &        &   \\
  &   & \ddots & \ddots &   \\
  &   &        & 0      & 1 \\
ε &   &        &        & 0
\end{pmatrix}.

Both are N × N; B differs from A only in the (N, 1) entry ε.
Eigenvalues
A is a Jordan block. So
σ(A) = {0}.
Algebraic multiplicity: N.
Geometric multiplicity: 1.
As for B . . .
Spectrum of B
Suppose Bx = λx and x_1 = 1. Then

\begin{pmatrix}
0 & 1 &        &        &   \\
  & 0 & 1      &        &   \\
  &   & \ddots & \ddots &   \\
  &   &        & 0      & 1 \\
ε &   &        &        & 0
\end{pmatrix}
\begin{pmatrix} 1 \\ x_2 \\ \vdots \\ x_{N-1} \\ x_N \end{pmatrix}
=
\begin{pmatrix} x_2 \\ x_3 \\ \vdots \\ x_N \\ ε \end{pmatrix}
= λ
\begin{pmatrix} 1 \\ x_2 \\ \vdots \\ x_{N-1} \\ x_N \end{pmatrix}
and so . . .
λ = x_2, x_3 = λx_2 = λ^2, . . . , x_N = λx_{N-1} = λ^{N-1}, ε = λx_N = λ^N.

So ε = λ^N. We have N solutions

λ_k = ε^{1/N} e^{2πik/N}, 1 ≤ k ≤ N,

which are evenly spaced on the circle of radius ε^{1/N} in the complex plane.
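The code in these notes is MATLAB; here is a quick sanity check of this formula in plain Python instead (a self-contained sketch; the values N = 5 and ε = 10⁻¹⁰ are arbitrary illustrative choices):

```python
import cmath

# Hypothetical choices for illustration: a 5x5 perturbed Jordan block
# with corner perturbation epsilon = 1e-10.
N = 5
eps = 1e-10

# The N eigenvalues of B: evenly spaced on the circle |z| = eps**(1/N).
lams = [eps ** (1.0 / N) * cmath.exp(2j * cmath.pi * k / N)
        for k in range(1, N + 1)]

for lam in lams:
    # Each lambda solves lambda**N = eps ...
    assert abs(lam ** N - eps) < 1e-12
    # ... and lies on the circle of radius eps**(1/N).
    assert abs(abs(lam) - eps ** (1.0 / N)) < 1e-12
```

Each root has modulus ε^{1/N} = 0.01 here, which is enormous compared to ε = 10⁻¹⁰ itself.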
Condition number
Any sensible definition of condition number is the ratio of

the (relative) size of the change in the output, which is ε^{1/N},

to the size of the change in the input, which is O(ε).

So

κ = ε^{1/N}/ε = ε^{-(N-1)/N} → ∞

as ε → 0. So, Jordan blocks are bad things. Any hope for nicer problems?
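To put numbers on this, a minimal pure-Python sketch (N = 16 and ε = 10⁻¹⁶ are arbitrary choices; only the ratio matters):

```python
# Output change: the zero eigenvalue of the Jordan block moves to the
# circle of radius eps**(1/N).  Input change: a single entry of size eps.
N = 16
eps = 1e-16

out_change = eps ** (1.0 / N)
kappa = out_change / eps

# A perturbation at roughly double-precision roundoff level (1e-16)
# moves the eigenvalues a distance 0.1: a condition number of 1e15.
assert abs(out_change - 0.1) < 1e-12
assert kappa > 1e14

# And kappa = eps**(-(N-1)/N) grows without bound as eps shrinks.
assert (1e-20) ** (1.0 / N) / 1e-20 > kappa
```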
The characteristic polynomial
Recall that the characteristic polynomial of A is

p(z, A) = det(zI − A)

and its roots are the eigenvalues of A. The roots of a polynomial are continuous functions of the coefficients, so

σ(A + δA) → σ(A) as ‖δA‖ → 0.

As the Jordan block example shows, eigenvalues need not be differentiable functions of the coefficients.
Simple eigenvalues, because I'm tired of Jordan blocks
Suppose λ is a simple eigenvalue of A.
Is a nearby simple eigenvalue λ+ δλ of A + δA out there?
Is there a useful definition of condition number?
Left and Right eigenvectors
Ax = λx says "x is a right eigenvector."

σ(A) = σ(A^T), so there's also a left eigenvector:

A^T y = λy, or y^T A = λy^T.

From now on, x will be a right eigenvector, y a left eigenvector, and

‖x‖ = ‖y‖ = 1.
Perturbation theory for simple eigenvalues
Theorem: Assume
λ is a simple eigenvalue of A,
x (y) are normalized right (left) eigenvectors,
λ+ δλ is the eigenvalue of A + δA nearest to λ.
Let θ(y, x) be the acute angle between y and x. Note that sec(θ(y, x)) = 1/|y^T x|. Then . . .
Perturbation estimates
δλ = (y^T δA x)/(y^T x) + O(‖δA‖^2)

and

|δλ| ≤ |y^T δA x|/|y^T x| + O(‖δA‖^2) ≤ sec(θ(y, x)) ‖δA‖ + O(‖δA‖^2).

So sec(θ(y, x)) = 1/|y^T x| is the condition number of the simple eigenvalue λ.
Proof: I
We’ve done things like this for equations.
(A + δA)(x + δx) − Ax = (λ + δλ)(x + δx) − λx,

so

Aδx + δAx + δAδx = λδx + δλx + δλδx.

Ignore for now any term with two δ's in it and multiply by y^T:

y^T Aδx + y^T δAx ≈ λy^T δx + δλ y^T x.

Note that y^T Aδx = λy^T δx because y^T A = λy^T. So . . .
Proof: II
y^T δAx ≈ δλ y^T x,

so

δλ ≈ (y^T δAx)/(y^T x).

The terms we ignored, after multiplying by y^T, are

y^T δAδx and δλ y^T δx.

If we now put them back we get . . .
Proof: III
δλ = (y^T δAx)/(y^T x) + (y^T (δAδx − δλδx))/(y^T x).

The terms we neglected are smaller than the main term if δA is sufficiently small, so

|δλ| = O(‖δA‖/|y^T x|).

We now assume that δA is small enough that we can ignore factors of 1/y^T x in the higher order terms. This means that

(y^T (δAδx − δλδx))/(y^T x) = O(‖δA‖ ‖δx‖).
Proof: IV
The power method says that

‖δx‖ = O(|δλ|)

if δA is sufficiently small. That's it.
Observations
If A = A^T, then |y^T x| = 1, and the conditioning is perfect.

For the Jordan block example,

x = (1, 0, . . . , 0)^T and y = (0, . . . , 0, 1)^T,

so y^T x = 0 and the condition number is infinite.
Gershgorin Theorem
Let B be a square matrix. The eigenvalues of B lie in the union of the disks

G_i = { z : |z − b_ii| ≤ ∑_{j≠i} |b_ij| }, 1 ≤ i ≤ N.
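A small pure-Python check (the 3×3 tridiagonal matrix below is a hypothetical example whose eigenvalues, 2 − √2, 2, and 2 + √2, are known in closed form):

```python
import math

# Hypothetical 3x3 example: symmetric tridiagonal with known spectrum.
B = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
N = 3

# Gershgorin disks: center b_ii, radius sum_{j != i} |b_ij|.
disks = [(B[i][i], sum(abs(B[i][j]) for j in range(N) if j != i))
         for i in range(N)]

eigs = [2.0 - math.sqrt(2.0), 2.0, 2.0 + math.sqrt(2.0)]

# Every eigenvalue lies in the union of the disks.
for lam in eigs:
    assert any(abs(lam - c) <= r + 1e-12 for c, r in disks)
```

Here the disks are centered at 2 with radii 1, 2, 1, so their union is the interval [0, 4], which indeed contains all three eigenvalues.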
Proof of Gershgorin Theorem: I
Let λ ∈ σ(A) and let x be a corresponding eigenvector. Let i besuch that
|xi | = ‖x‖∞.
Since Ax = λx,

(λ − a_ii) x_i = ∑_{j≠i} a_ij x_j.
Proof of Gershgorin Theorem: II
We picked i so that |x_j|/|x_i| ≤ 1 for j ≠ i, so

|λ − a_ii| ≤ ∑_{j≠i} |a_ij| |x_j|/|x_i| ≤ ∑_{j≠i} |a_ij|,

as asserted.
Diagonalizable Matrices and the Bauer-Fike Theorem
Theorem: Suppose

A is diagonalizable with only simple eigenvalues {λ_i},

x_i (y_i) are the normalized right (left) eigenvectors corresponding to λ_i.

Then the eigenvalues of A + δA lie in disks B_i where

B_i = { z : |z − λ_i| ≤ N‖δA‖/|y_i^T x_i| }.
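A minimal numeric check in plain Python, assuming the easiest possible case: A = diag(1, 2) is normal, so S = I and y_i^T x_i = 1, and the eigenvalues of the perturbed 2×2 matrix come from the quadratic formula (the perturbation size e = 0.01 is an arbitrary choice):

```python
import math

# Hypothetical check: A = diag(1, 2), perturbed by the symmetric
# delta_A = [[0, e], [e, 0]], whose 2-norm is e.
e = 0.01
lam1, lam2 = 1.0, 2.0
N = 2

# Eigenvalues of A + delta_A = [[1, e], [e, 2]] via the quadratic formula.
tr, det = lam1 + lam2, lam1 * lam2 - e * e
disc = math.sqrt(tr * tr - 4.0 * det)
mus = [(tr - disc) / 2.0, (tr + disc) / 2.0]

# Each perturbed eigenvalue lies in some disk of radius N*||delta_A||.
for mu in mus:
    assert min(abs(mu - lam1), abs(mu - lam2)) <= N * e
```

The perturbed eigenvalues actually move only O(e²) here, well inside the disks of radius Ne, since a normal matrix is as well conditioned as the theorem allows.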
Proof: preliminaries
Lemma (Diagonalization): Let S be the matrix with the right eigenvectors as columns. Then

S^{-1} = ( y_1/(y_1^T x_1), y_2/(y_2^T x_2), . . . , y_N/(y_N^T x_N) )^T.

Proof: plug in.
Proof: more preliminaries
Lemma: Suppose the columns of S are normalized (‖s_i‖ = 1). Then ‖S‖ ≤ √N.

Proof: Let x be the unit vector so that ‖Sx‖ = ‖S‖. Use Cauchy-Schwarz,

∑_{i=1}^N |a_i| |b_i| ≤ ‖a‖_2 ‖b‖_2:

‖S‖ = ‖Sx‖ = ‖∑_{i=1}^N s_i x_i‖ ≤ ∑_{i=1}^N ‖s_i‖ |x_i| ≤ √(∑_{i=1}^N ‖s_i‖^2) √(∑_{i=1}^N x_i^2) = √N.

(The last step uses ‖s_i‖ = 1 and ‖x‖_2 = 1.)
Proof of Bauer-Fike: I
Note that S is the diagonalizing transformation for A, so

S^{-1}AS = Λ.

Apply Gershgorin to

B = S^{-1}(A + δA)S = Λ + F,

where F = S^{-1}δAS. Gershgorin says that the eigenvalues of B lie in the disks

G_i = { λ : |λ − (λ_i + f_ii)| ≤ ∑_{j≠i} |f_ij| }.
Proof of Bauer-Fike: II
Since

G_i = { λ : |λ − (λ_i + f_ii)| ≤ ∑_{j≠i} |f_ij| },

any λ ∈ G_i satisfies

|λ − λ_i| − |f_ii| ≤ ∑_{j≠i} |f_ij|, which implies that

|λ − λ_i| ≤ ∑_{j=1}^N |f_ij| ≤ √N √(∑_{j=1}^N |f_ij|^2) = √N ‖F(i, :)‖.
Proof of Bauer-Fike: III
So we need a bound on the ith row of F. Note that if B = B_1 B_2, then

‖B(i, :)‖ ≤ ‖B_1(i, :)‖ ‖B_2‖,

as you can see from the rules for matrix-matrix multiply. So, since F = S^{-1}δAS,

‖F(i, :)‖ ≤ ‖S^{-1}(i, :)‖ ‖δA‖ ‖S‖,

and we have formulae to estimate all this stuff . . .
Proof of Bauer-Fike: IV
Since the columns of S are the normalized eigenvectors,

‖S‖ ≤ √N

by one of the lemmas.

Use the other lemma and ‖y_i‖ = 1 to get

‖S^{-1}(i, :)‖ ≤ 1/|y_i^T x_i|,

and glue everything together to get . . .
Proof of Bauer-Fike: V
‖F(i, :)‖ ≤ (√N/|y_i^T x_i|) ‖δA‖.

Plug into

|λ − λ_i| ≤ ∑_{j=1}^N |f_ij| ≤ √N √(∑_{j=1}^N |f_ij|^2) = √N ‖F(i, :)‖

and we're done.
ℓ_p estimates: Bauer-Fike revisited
Theorem: Suppose

A is diagonalizable with eigenvalues {λ_i},

x_i are the normalized right eigenvectors corresponding to λ_i,

μ ∈ σ(A + δA).

Then

min_{λ∈σ(A)} |μ − λ| ≤ κ_p(S) ‖δA‖_p,

where S is the matrix whose columns are the eigenvectors of A.
Proof: I
If μ ∈ σ(A), then the left side of the estimate is 0. Here we let ‖ · ‖ be any ℓ_p norm.

Otherwise, the matrix Λ − μI is not singular, but

S^{-1}(A + δA − μI)S = S^{-1}(A − μI)S + S^{-1}δAS = (Λ − μI) + S^{-1}δAS

is singular.
Proof: II
Multiply the singular matrix by (Λ − μI)^{-1} to see that

I + (Λ − μI)^{-1}(S^{-1}δAS)

is also singular. Hence

1 ≤ ‖(Λ − μI)^{-1}(S^{-1}δAS)‖ ≤ ‖(Λ − μI)^{-1}‖ ‖S^{-1}‖ ‖δA‖ ‖S‖ = max_{λ∈σ(A)} (1/|λ − μ|) κ_p(S) ‖δA‖.

That's it, since max(1/x) = 1/(min x).
The QR algorithm
Consider this iteration:
A_0 = A
for k = 0, 1, . . . do
    Factor A_k = QR
    A_{k+1} = RQ
end for
What does this have to do with eigenvalues?
Note that

A_{k+1} = RQ = Q^T (QR) Q = Q^T A_k Q

is similar to A_k, so it has the same eigenvalues. Let's give it a shot.
A=[1 2 3; 4 5 6; 7 8 9];
for i=1:10
[q,r]=qr(A); A=r*q;
end
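The same experiment can be done without MATLAB. Below is a plain-Python sketch for a 2×2 symmetric matrix, where the QR factorization is a single Givens rotation (the matrix [[2, 1], [1, 2]], with eigenvalues 3 and 1, is a hypothetical example):

```python
import math

# Hypothetical 2x2 symmetric example with eigenvalues 3 and 1.
A = [[2.0, 1.0], [1.0, 2.0]]

def qr2(M):
    # QR factorization of a 2x2 matrix with one Givens rotation chosen
    # to zero out the (2,1) entry: M = Q R, Q orthogonal, R upper triangular.
    r = math.hypot(M[0][0], M[1][0])
    c, s = M[0][0] / r, M[1][0] / r
    Q = [[c, -s], [s, c]]
    # R = Q^T M; the (2,1) entry is zero by construction.
    R = [[c * M[0][0] + s * M[1][0], c * M[0][1] + s * M[1][1]],
         [0.0, -s * M[0][1] + c * M[1][1]]]
    return Q, R

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

for _ in range(30):
    Q, R = qr2(A)
    A = matmul(R, Q)  # A_{k+1} = RQ is similar to A_k

# The iterates converge to diag(3, 1): the eigenvalues appear on the
# diagonal and the off-diagonal entries go to zero.
assert abs(A[0][0] - 3.0) < 1e-8
assert abs(A[1][1] - 1.0) < 1e-8
assert abs(A[1][0]) < 1e-8 and abs(A[0][1]) < 1e-8
```

After 30 unshifted iterations the iterate is numerically diag(3, 1); the off-diagonal entry shrinks by roughly the eigenvalue ratio 1/3 per step.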
Results
The eigenvalues are
>> eig(A)
ans =
1.6117e+01
-1.1168e+00
-1.3037e-15
and when the loop’s done
A =
1.6117e+01 4.8990e+00 -6.9295e-16
-8.0448e-11 -1.1168e+00 1.6506e-15
0 0 0
What?
For diagonalizable A with distinct real eigenvalues
The iteration converges to an upper triangular matrix,
which is similar to A,
and therefore has the same eigenvalues.
You can understand this via the power method. This is the core of MATLAB's eig code.
What a real code must do
Reduce A to a form with a cheap QR factorization (upper Hessenberg),
deal with multiple eigenvalues,
deal with complex conjugate pairs of eigenvalues,
build in shifts, . . .
A feel-good theorem
Suppose:
A is symmetric.
A is nonsingular.
The QR iterates A_n, Q_n, R_n converge to Ā, Q̄, R̄.

Then Ā is diagonal with the eigenvalues of A along the diagonal.
Feel-good proof: I
Convergence implies that

Ā = Q̄R̄ = R̄Q̄.

Then symmetry implies that

Ā^T = Q̄^T R̄^T = R̄^T Q̄^T = Ā = Q̄R̄ = R̄Q̄.

So

R̄^T R̄ = R̄^T Q̄^T Q̄ R̄ = Ā^T Ā = Ā^2 = R̄ Q̄ Q̄^T R̄^T = R̄ R̄^T.
Feel-good proof: II
Since R̄ is upper triangular and

R̄^T R̄ = R̄ R̄^T,

R̄ is diagonal. Let's prove this.

Lemma: Suppose U is upper triangular and UU^T = U^T U. Then U is diagonal.
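Before the induction proof, a tiny pure-Python sanity check of the lemma on hypothetical 2×2 instances:

```python
# Sanity check of the lemma: an upper triangular U satisfies
# U U^T = U^T U only when it is diagonal.
def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

U = [[1.0, 0.5], [0.0, 2.0]]  # upper triangular, not diagonal
assert mul(U, transpose(U)) != mul(transpose(U), U)

D = [[1.0, 0.0], [0.0, 2.0]]  # diagonal
assert mul(D, transpose(D)) == mul(transpose(D), D)
```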
Proof of Lemma: I
The proof is via induction. It's clear for N = 1. Assume that the theorem holds for dimensions up to N − 1. Let U be N × N upper triangular and decompose it as

U = \begin{pmatrix} U_1 & x \\ 0 & α \end{pmatrix},

where U_1 is (N − 1) × (N − 1) upper triangular, x ∈ R^{N−1}, and α is real. Assume that UU^T = U^T U; then . . .
Proof of Lemma: II
UU^T = \begin{pmatrix} U_1 & x \\ 0 & α \end{pmatrix} \begin{pmatrix} U_1^T & 0 \\ x^T & α \end{pmatrix} = \begin{pmatrix} U_1 U_1^T + xx^T & αx \\ αx^T & α^2 \end{pmatrix}

U^T U = \begin{pmatrix} U_1^T & 0 \\ x^T & α \end{pmatrix} \begin{pmatrix} U_1 & x \\ 0 & α \end{pmatrix} = \begin{pmatrix} U_1^T U_1 & U_1^T x \\ x^T U_1 & α^2 + x^T x \end{pmatrix}

Equating the (2,2) blocks gives α^2 = α^2 + x^T x, so x = 0 and . . .
Proof of Lemma: III
U_1^T U_1 = U_1 U_1^T, so

U_1 is diagonal by the induction hypothesis.

x = 0 implies that U is diagonal.
We are almost done.
Feel-good proof: III
Now that R̄ is diagonal, we can use

Ā = Q̄R̄ = R̄Q̄ and Ā = Ā^T

to conclude that

Q̄R̄ = Q̄^T R̄^T = Q̄^T R̄

(using R̄^T = R̄ for diagonal R̄); since A is nonsingular, so is R̄, and we must have

Q̄ = Q̄^T = Q̄^{-1}.
Feel-good proof: IV
So R̄ is diagonal and Q̄ is symmetric. This means that

Ā^2 = Q̄R̄R̄Q̄ = Q̄R̄^2 Q̄

is a spectral decomposition of Ā^2, so

the columns of Q̄ are eigenvectors of Ā^2,

and hence they are eigenvectors of Ā (symmetry).

So I can order the eigenvalues of A so that

Ā = Q̄ΛQ̄

is a spectral decomposition of Ā.
Feel-good proof: V
We're done because

Q̄Λ = ĀQ̄ = R̄Q̄Q̄ = R̄,

which means that

Λ = Q̄R̄ = Ā

is diagonal, and the eigenvalues of A (which are the eigenvalues of Ā) are its diagonal entries.
Convergence Theory for Happy Matrices
Assume that A has real distinct eigenvalues and
|λ_1| < |λ_2| < . . . < |λ_N|.

Then A_n → R where R has the eigenvalues of A on the diagonal. If A is symmetric, then A_n → Λ. Moreover

‖R − A_n‖ = O([max_i |λ_i|/|λ_{i+1}|]^n),

which sure does smell like the power method.