Numerical Methods and Software for General and Structured Eigenvalue Problems

Daniel Kreßner

Institut für Mathematik, TU Berlin

14.05.2004

Outline

Eigenvalue, invariant subspace – basic concepts

Perturbation theory, condition number

QR-like algorithms for small- to medium-sized matrices

Arnoldi algorithms for large and sparse matrices

Conclusions

Daniel Kreßner, Institut fur Mathematik, TU Berlin Numerical Methods and Software for General and Structured Eigenvalue Problems – p.1/21

What is an Eigenvalue?

The word “eigenvalue” is a hybrid translation of the German word “Eigenwert” (like “liverwurst”), coined by Hilbert around 1900.

In realistic applications, eigenvalue problems often arise only after a long process of simplifications, discretizations, and linearizations.

Sometimes, eigenvalues have an intrinsic meaning for the original problem.

Sometimes, however, they are just meaningless intermediate values of a computational method.


Example: Millennium Footbridge

On its opening day, the Millennium Footbridge started to wobble under the weight of thousands of people, who in turn had difficulty keeping their balance. The bridge had to be closed and was re-opened only after the installation of several viscous dampers, 18 months and £5 million later.

What happened? Some of the natural frequencies were similar to the sideways component of pedestrian footsteps, causing vibration amplification. Computing these natural frequencies amounts to solving an eigenvalue problem. (Tisseur/Meerbergen’01)
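As a rough illustration of how natural frequencies lead to an eigenvalue problem, the sketch below treats an undamped toy model M x'' + K x = 0, whose natural frequencies ω satisfy K x = ω² M x. The three-mass chain and the function name `natural_frequencies` are assumptions made for illustration only; a real bridge model is far larger and includes damping, which leads to a quadratic eigenvalue problem instead.

```python
import numpy as np
from scipy.linalg import eigh

def natural_frequencies(K, M):
    """Return angular natural frequencies omega of M x'' + K x = 0.

    Solves the symmetric-definite generalized eigenvalue problem
    K x = omega^2 M x; eigh returns omega^2 in ascending order.
    """
    omega_squared, modes = eigh(K, M)
    return np.sqrt(omega_squared), modes

# Hypothetical toy model: three unit masses in a chain, fixed at both ends,
# connected by springs of unit stiffness.
K = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
M = np.eye(3)

omega, modes = natural_frequencies(K, M)
print("natural frequencies:", omega)
```

Resonance becomes a concern when an excitation frequency comes close to one of the values in `omega`.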


Mathematical Definitions

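Eigenvalues of $A \in \mathbb{R}^{n\times n}$ = roots of $\det(\lambda I - A)$.

$x \neq 0$ is an eigenvector of $A$ if $Ax = \lambda x$ for some eigenvalue $\lambda$. ⇒ $\operatorname{span}(Ax) = \operatorname{span}(\lambda x) \subseteq \operatorname{span}(x)$.

Generalizes to higher dimensions: $A\mathcal{X} \subseteq \mathcal{X}$ ⇒ $\mathcal{X} \subset \mathbb{C}^{n}$ is an invariant subspace.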

Schur decomposition: Orthogonal matrix $Q \in \mathbb{R}^{n\times n}$ s.t. $Q^T A Q$ is block upper triangular, with

1 × 1 diagonal blocks: real eigenvalue;

2 × 2 diagonal blocks: complex conjugate eigenvalue pair.

Many algorithms compute eigenvalues and invariant subspaces via (partial) Schur decompositions.
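The real Schur form can be inspected numerically. The sketch below uses `scipy.linalg.schur` on a small example matrix chosen for illustration (it has one real eigenvalue and one complex conjugate pair) and checks the defining properties.

```python
import numpy as np
from scipy.linalg import schur

# Illustrative matrix: the leading 2 x 2 block has eigenvalues +/- i*sqrt(2),
# the last diagonal entry 5 is a real eigenvalue.
A = np.array([[0.0, -2.0, 1.0],
              [1.0,  0.0, 3.0],
              [0.0,  0.0, 5.0]])

# Real Schur form: A = Q T Q^T with Q orthogonal and T block upper triangular.
T, Q = schur(A, output="real")

residual = np.linalg.norm(Q @ T @ Q.T - A)
orthogonality = np.linalg.norm(Q.T @ Q - np.eye(3))
below_subdiagonal = np.tril(T, k=-2)  # must vanish for a quasi-triangular T

print("T =\n", T)
print("residual:", residual, "orthogonality:", orthogonality)
```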


Perturbation Theory: Motivation

Any numerical method for computing eigenvalues is affected by roundoff errors.

Good (i.e., backward stable) methods compute the exact eigenvalues of a perturbed matrix $A + E$, where

$$\|E\|_2 \le O(10^{-16}) \cdot \|A\|_2.$$

$10^{-16}$ sounds tiny?




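Example: eigenvalues of the 8 × 8 Jordan block, without and with a perturbation of size $10^{-16}$ in the lower left corner:

$$\begin{bmatrix} 0 & 1 & & \\ & 0 & \ddots & \\ & & \ddots & 1 \\ 10^{-16} & & & 0 \end{bmatrix} \in \mathbb{R}^{8\times 8}$$

[Figure: eigenvalues in the complex plane; both axes range from −0.01 to 0.01.]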
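The effect can be reproduced with a few lines of NumPy. This is a sketch under the assumption that the perturbation sits in the lower left corner of the Jordan block; in exact arithmetic the perturbed eigenvalues then satisfy λ⁸ = 10⁻¹⁶, so they lie on a circle of radius 0.01.

```python
import numpy as np

n = 8
eps = 1e-16

# 8 x 8 Jordan block with eigenvalue 0: ones on the superdiagonal.
J = np.diag(np.ones(n - 1), k=1)

# Perturb a single entry in the lower left corner.
J_perturbed = J.copy()
J_perturbed[n - 1, 0] = eps

eig_exact = np.linalg.eigvals(J)
eig_perturbed = np.linalg.eigvals(J_perturbed)

# Predicted radius from lambda^8 = eps.
predicted_radius = eps ** (1.0 / n)
radii = np.abs(eig_perturbed)

print("predicted radius:", predicted_radius)
print("computed radii:  ", radii)
```

A perturbation of relative size 10⁻¹⁶ therefore moves the eigenvalues by roughly 10⁻², which matches the axis range of the plot.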


Perturbation Theory: Condition Numbers

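$\lambda$ simple eigenvalue of $A \in \mathbb{R}^{n\times n}$ with normalized right {left} eigenvectors $x$ {$y$} ⇒ eigenvalue $\hat\lambda$ of the perturbed matrix $A + E$ satisfies

$$\hat\lambda = \lambda + \frac{1}{y^H x}\, y^H E x + O(\|E\|^2).$$

Condition number = worst-case, first-order influence of $E$ on $\lambda$:

$$c(\lambda) = \lim_{\varepsilon \to 0} \sup_{\substack{\|E\|_F \le \varepsilon \\ E \in \mathbb{C}^{n\times n}}} \frac{|\hat\lambda - \lambda|}{\varepsilon} = \frac{1}{|y^H x|}.$$

Similarly for the condition number of an invariant subspace $\mathcal{X}$:

$$c(\mathcal{X}) = \lim_{\varepsilon \to 0} \sup_{\substack{\|E\|_F \le \varepsilon \\ E \in \mathbb{C}^{n\times n}}} \frac{\|\Theta(\mathcal{X}, \hat{\mathcal{X}})\|_F}{\varepsilon} = \|\mathbf{T}^{-1}\|,$$

where $\Theta(\mathcal{X}, \hat{\mathcal{X}})$ is the matrix of canonical angles and $\mathbf{T}$ is the associated Sylvester operator.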
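The eigenvalue condition number 1/|yᴴx| is easy to evaluate once left and right eigenvectors are available. The sketch below uses `scipy.linalg.eig` with `left=True`; the helper name `eigenvalue_condition_numbers` and the two test matrices are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import eig

def eigenvalue_condition_numbers(A):
    """Return the eigenvalues of A and the condition numbers 1/|y^H x|."""
    w, vl, vr = eig(A, left=True, right=True)
    # Normalize columns to unit 2-norm (SciPy already does; kept explicit).
    vr = vr / np.linalg.norm(vr, axis=0)
    vl = vl / np.linalg.norm(vl, axis=0)
    # |y_i^H x_i| for each matching pair of left/right eigenvectors.
    overlap = np.abs(np.sum(vl.conj() * vr, axis=0))
    return w, 1.0 / overlap

# Highly non-normal matrix: eigenvalues 1 and 2 are sensitive.
A_nonnormal = np.array([[1.0, 1000.0],
                        [0.0, 2.0]])
# Symmetric matrix: left and right eigenvectors coincide, so c = 1.
A_symmetric = np.array([[2.0, 1.0],
                        [1.0, 3.0]])

w1, c1 = eigenvalue_condition_numbers(A_nonnormal)
w2, c2 = eigenvalue_condition_numbers(A_symmetric)
print("non-normal:", w1, c1)
print("symmetric: ", w2, c2)
```

For the non-normal example both condition numbers equal √(1 + 10⁶), about 1000, so a backward error of size 10⁻¹⁶ can cost about three digits in the computed eigenvalues.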


Structured Condition Numbers

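If $E$ is restricted to a subset of $\mathbb{C}^{n\times n}$, condition numbers may overestimate the actual worst-case effect.

Example: If $A$ is real, most computational methods preserve realness ⇒ backward error $E$ is real.

$$c(\lambda) = \lim_{\varepsilon \to 0} \sup_{\substack{\|E\|_F \le \varepsilon \\ E \in \mathbb{C}^{n\times n}}} \frac{|\hat\lambda - \lambda|}{\varepsilon}$$

$$c_{\mathbb{R}}(\lambda) = \lim_{\varepsilon \to 0} \sup_{\substack{\|E\|_F \le \varepsilon \\ E \in \mathbb{R}^{n\times n}}} \frac{|\hat\lambda - \lambda|}{\varepsilon}$$

Obviously $c_{\mathbb{R}}(\lambda) \le c(\lambda)$, but is $c_{\mathbb{R}}(\lambda) \ll c(\lambda)$ possible?

No! $c_{\mathbb{R}}(\lambda) \ge c(\lambda)/\sqrt{2}$.

A similar statement holds for $c(\mathcal{X})$.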
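The bound can be checked numerically from the first-order term yᴴEx/(yᴴx). The sketch below is one possible way to evaluate both suprema, not necessarily the derivation used in the thesis: writing yᴴEx as an inner product of g = vec(conj(y) xᵀ) with vec(E), the complex supremum is ‖g‖ = 1 and the real supremum is the largest singular value of [Re g, Im g]. The function name and the random test matrix are assumptions.

```python
import numpy as np
from scipy.linalg import eig

def complex_and_real_condition(A, i):
    """First-order condition numbers of the i-th eigenvalue of a real matrix A
    under complex and under real perturbations E with ||E||_F <= eps."""
    _, vl, vr = eig(A, left=True, right=True)
    x = vr[:, i] / np.linalg.norm(vr[:, i])   # right eigenvector
    y = vl[:, i] / np.linalg.norm(vl[:, i])   # left eigenvector
    s = abs(np.vdot(y, x))                    # |y^H x|
    # y^H E x = sum_ij conj(y_i) E_ij x_j = <g, vec(E)> with g = vec(conj(y) x^T).
    g = np.outer(y.conj(), x).ravel()
    c_complex = np.linalg.norm(g) / s         # equals 1 / |y^H x|
    # For real E: |<g, e>|^2 = (Re g . e)^2 + (Im g . e)^2, which is maximized
    # over real unit vectors e by the largest singular value of [Re g, Im g].
    G = np.column_stack([g.real, g.imag])
    c_real = np.linalg.svd(G, compute_uv=False)[0] / s
    return c_complex, c_real

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
ratios = [c_r / c for c, c_r in
          (complex_and_real_condition(A, i) for i in range(6))]
print("c_R / c for each eigenvalue:", ratios)
```

Since the two squared singular values of [Re g, Im g] sum to ‖g‖² = 1, the largest one is at least 1/2, which gives the factor 1/√2.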


Structured Condition Numbers, ctd.

Hamiltonian matrix:

$$H = \begin{bmatrix} A & G \\ Q & -A^T \end{bmatrix}, \quad G = G^T, \quad Q = Q^T.$$

$$(2\sqrt{2} - 2) \cdot c(\lambda) \le c_{\text{Hamiltonian}}(\lambda) \le c(\lambda), \qquad c(\mathcal{X}) = c_{\text{Hamiltonian}}(\mathcal{X}),$$

for stable invariant subspace $\mathcal{X}$.

Skew-Hamiltonian matrix:

$$W = \begin{bmatrix} A & G \\ Q & A^T \end{bmatrix}, \quad G = -G^T, \quad Q = -Q^T,$$

$$c(\mathcal{X}) = \infty, \qquad c_{\text{skew-Hamiltonian}}(\mathcal{X}) < \infty.$$

Product eigenvalue problem:

$$P = A^{(p)} \cdot A^{(p-1)} \cdots A^{(1)}.$$

Equivalent to computing eigenvalues/invariant subspaces of the block cyclic matrix

$$\mathcal{A} = \begin{bmatrix} 0 & & & A^{(p)} \\ A^{(1)} & \ddots & & \\ & \ddots & \ddots & \\ & & A^{(p-1)} & 0 \end{bmatrix}.$$

$c_{\text{block cyclic}}(\mathcal{X}) \ll c(\mathcal{X})$ is possible.
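The link between the product and the block cyclic matrix can be checked directly: the p-th power of the block cyclic matrix is block diagonal with P as its leading block, so every eigenvalue μ of the block cyclic matrix satisfies that μᵖ is an eigenvalue of P. The sketch below verifies this for random factors; the helper name `block_cyclic` and the test sizes are assumptions.

```python
import numpy as np

def block_cyclic(factors):
    """Embed A(1), ..., A(p) (all n x n) into the pn x pn block cyclic matrix."""
    p = len(factors)
    n = factors[0].shape[0]
    C = np.zeros((p * n, p * n))
    # A(p) sits in the top right block.
    C[:n, (p - 1) * n:] = factors[-1]
    # A(1), ..., A(p-1) sit on the block subdiagonal.
    for k in range(p - 1):
        C[(k + 1) * n:(k + 2) * n, k * n:(k + 1) * n] = factors[k]
    return C

rng = np.random.default_rng(0)
n, p = 3, 3
factors = [rng.standard_normal((n, n)) for _ in range(p)]  # A(1), A(2), A(3)

# P = A(p) * A(p-1) * ... * A(1)
P = np.linalg.multi_dot(factors[::-1])
cyclic = block_cyclic(factors)

eig_product = np.linalg.eigvals(P)
eig_cyclic = np.linalg.eigvals(cyclic)

# Distance of mu**p to the nearest eigenvalue of P, for each eigenvalue mu.
distances = [np.min(np.abs(mu ** p - eig_product)) for mu in eig_cyclic]
print("max distance:", max(distances))
```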


The Basic QR Algorithm

Francis’61: The QR algorithm generates a sequence of orthogonally similar matrices:

$$A_0 := A,\ A_1,\ A_2,\ A_3,\ \dots$$

Under suitable conditions (Watkins/Elsner’91):

$A_i \longrightarrow$ (real) Schur form of $A$.

Three ingredients make the implicit QR algorithm work:

initial reduction to Hessenberg form;

deflation;

QR iterations = bulge chasing.
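The idea of generating orthogonally similar matrices can be seen in the simplest, unshifted variant of the iteration. Note that this sketch is not the implicit shifted algorithm described above (no Hessenberg reduction, no deflation, no bulge chasing); it only illustrates the basic step A_k = Q_k R_k, A_{k+1} = R_k Q_k on a small symmetric example chosen for illustration.

```python
import numpy as np

def qr_iteration(A, steps):
    """Unshifted QR iteration: factor A_k = Q_k R_k, then set A_{k+1} = R_k Q_k."""
    Ak = np.array(A, dtype=float)
    for _ in range(steps):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q  # equals Q^T A_k Q, so the spectrum is unchanged
    return Ak

# Illustrative symmetric matrix with eigenvalues of distinct magnitude.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])

A_limit = qr_iteration(A, steps=200)
print("diagonal of the limit:", np.diag(A_limit))
print("reference eigenvalues:", np.linalg.eigvalsh(A))
```

Convergence of the unshifted iteration is only linear, with rates given by ratios of neighboring eigenvalue magnitudes, which is one reason practical implementations use shifts and deflation.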


QR Algorithm

State-of-the-art implementation:

chasing tightly coupled chains of 3 × 3 bulges (Braman/Byers/Mathias’02, Lang’97);

aggressive early deflation (Braman/Byers/Mathias’02).

[Figure: bar chart of execution times, split into Hessenberg reduction and QR; axis from 0 to 12.] Schur decomposition of a 2000 × 2000 matrix arising from a linear-quadratic optimal control problem:

LAPACK              9’10”
State-of-the-art    3’23”
New                 2’42”

New algorithm is based on 5 × 5 instead of 3 × 3 bulges.


Eigenvalue Reordering

Orthonormal bases for invariant subspaces can be computed by reordering selected eigenvalues in the Schur form.

[Figure: bar chart of execution times; axis from 0 to 1.]

LAPACK    37”
New       8”
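As an illustration of reordering, `scipy.linalg.schur` accepts a `sort` argument that moves selected eigenvalues to the leading block of the Schur form; the leading columns of the orthogonal factor then form an orthonormal basis of the corresponding invariant subspace. The test matrix below is a hypothetical example with a known spectrum, and this sketch uses the standard LAPACK-based routine, not the new block algorithm timed above.

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
# Hypothetical test matrix with known spectrum {-3, -2, -1, 2, 4, 5}.
D = np.diag([-3.0, 2.0, -1.0, 4.0, -2.0, 5.0])
S = np.eye(6) + 0.1 * rng.standard_normal((6, 6))
A = S @ D @ np.linalg.inv(S)

# sort="lhp" moves eigenvalues with negative real part to the top left of T.
T, Q, sdim = schur(A, output="real", sort="lhp")

# The first sdim columns of Q span the stable invariant subspace.
X = Q[:, :sdim]
T11 = T[:sdim, :sdim]
residual = np.linalg.norm(A @ X - X @ T11)
print("dimension:", sdim, "residual:", residual)
```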


QZ Algorithm

Block algorithms and aggressive early deflation can be extended to the QZ algorithm for computing generalized eigenvalues of matrix pencils.

[Figure: bar chart of execution times, split into Hessenberg-triangular reduction, QZ, and reordering; axis from 0 to 60.] Generalized Schur decomposition of a 2000 × 2000 matrix pencil:

LAPACK                   42’
Dackland/Kågström’99     15’
New                      7’
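For small pencils, a generalized Schur decomposition can be computed with `scipy.linalg.qz`, which wraps the standard LAPACK QZ routine (not the new block variant timed above). The sketch below uses a random 5 × 5 pencil as an assumed example and reads the generalized eigenvalues off the diagonals.

```python
import numpy as np
from scipy.linalg import qz, eigvals

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
B = rng.standard_normal((5, 5))

# Generalized Schur form: A = Q S Z^H, B = Q T Z^H with S, T upper triangular.
S, T, Q, Z = qz(A, B, output="complex")

# Generalized eigenvalues are ratios of the diagonal entries.
lam_qz = np.diag(S) / np.diag(T)
lam_ref = eigvals(A, B)

print("eigenvalues from QZ:", lam_qz)
```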


Periodic QR Algorithm

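QR algorithm applied to block cyclic matrix

$$\mathcal{A} = \begin{bmatrix} 0 & & & A^{(p)} \\ A^{(1)} & \ddots & & \\ & \ddots & \ddots & \\ & & A^{(p-1)} & 0 \end{bmatrix}$$

does not preserve zero structure of $\mathcal{A}$. Example:

$$\mathcal{A} = \begin{bmatrix} 0 & 0 & C \\ A & 0 & 0 \\ 0 & B & 0 \end{bmatrix}$$

Nonzero pattern for 4 × 4 blocks (. = structural zero):

    . . . . . . . . c c c c
    . . . . . . . . c c c c
    . . . . . . . . c c c c
    . . . . . . . . c c c c
    a a a a . . . . . . . .
    a a a a . . . . . . . .
    a a a a . . . . . . . .
    a a a a . . . . . . . .
    . . . . b b b b . . . .
    . . . . b b b b . . . .
    . . . . b b b b . . . .
    . . . . b b b b . . . .

Applying a perfect shuffle permutation..

    . . c . . c . . c . . c
    a . . a . . a . . a . .
    . b . . b . . b . . b .
    . . c . . c . . c . . c
    a . . a . . a . . a . .
    . b . . b . . b . . b .
    . . c . . c . . c . . c
    a . . a . . a . . a . .
    . b . . b . . b . . b .
    . . c . . c . . c . . c
    a . . a . . a . . a . .
    . b . . b . . b . . b .

Hessenberg reduction preserves structure of this cyclic block matrix (0 = annihilated entry)..

    . . c . . c . . c . . c
    a . . a . . a . . a . .
    . b . . b . . b . . b .
    . . c . . c . . c . . c
    0 . . a . . a . . a . .
    . 0 . . b . . b . . b .
    . . 0 . . c . . c . . c
    0 . . 0 . . a . . a . .
    . 0 . . 0 . . b . . b .
    . . 0 . . 0 . . c . . c
    0 . . 0 . . 0 . . a . .
    . 0 . . 0 . . 0 . . b .

..and so do QR iterations with three Francis shifts.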
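The perfect shuffle step can be sketched in a few lines: it interleaves the rows and columns of the three blocks, so that the shuffled matrix consists of 3 × 3 cyclic blocks, one for each entry position (i, j) of A, B, C. The random blocks and the helper `block` are assumptions made for illustration.

```python
import numpy as np

n, p = 4, 3
rng = np.random.default_rng(3)
A, B, C = (rng.standard_normal((n, n)) for _ in range(p))

# Block cyclic matrix [[0, 0, C], [A, 0, 0], [0, B, 0]].
Z = np.zeros((n, n))
cyclic = np.block([[Z, Z, C],
                   [A, Z, Z],
                   [Z, B, Z]])

# Perfect shuffle: new order is (block 0, row 0), (block 1, row 0),
# (block 2, row 0), (block 0, row 1), ... applied to rows and columns.
perm = np.arange(p * n).reshape(p, n).T.ravel()
shuffled = cyclic[np.ix_(perm, perm)]

def block(i, j):
    """Return the 3 x 3 block (i, j) of the shuffled matrix."""
    return shuffled[p * i:p * (i + 1), p * j:p * (j + 1)]

print(block(0, 0))
```

Each 3 × 3 block has the cyclic form [[0, 0, c], [a, 0, 0], [0, b, 0]] with a = A[i, j], b = B[i, j], c = C[i, j]. Consequently the shuffled matrix is upper Hessenberg exactly when A and B are upper triangular and C is upper Hessenberg, which is the pattern shown after the reduction.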



Hessenberg reduction + QR iterations preserve the structure of cyclic block matrices.

⇒ The QR algorithm can be written entirely in terms of the coefficient matrices A, B, C, which yields a strongly backward stable algorithm for computing eigenvalues of matrix products.

Completely equivalent to the periodic QR algorithm (Van Loan’75, Bojanczyk/Golub/Van Dooren’92, Hench/Laub’94).

This derivation, however, is new and may lead to more insight.

Example: Eigenvalue reordering in products of matrices can be derived from existing algorithms.


HAPACK

= HAmiltonian matrix PACKage

= LAPACK-like Fortran 77/MATLAB software library for solving (skew-)Hamiltonian eigenvalue problems

Features:

based on (strongly) backward stable algorithms (Benner/Mehrmann/Xu’97–98);

uses new & efficient block algorithms;

includes balancing, invariant subspace/eigenvector computation and error estimates.


Large and Structured Eigenvalue Problems

Based on QR-like algorithms, new structure-preserving variants of the Krylov-Schur algorithm (Stewart’02) can be developed for:

block cyclic matrices (or, equivalently, matrix products);

(skew-)Hamiltonian matrices (Mehrmann/Watkins’02).

Example:

A = B = C = diag(1, 0.1, 0.01, 0.001, . . . )

Number of correct digits of computed eigenvalues with random starting vector:

Eigenvalue      1     10^-6   10^-12   10^-18
Krylov-Schur    15    10      4        0
New             15    14      13       11
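For orientation, the sketch below shows the plain Arnoldi process that Krylov subspace methods of this kind build on, applied to a matrix product through one matrix-vector multiplication per factor. It is neither the Krylov-Schur algorithm with restarts nor the structure-preserving variant summarized in the table; the random factors and the function name `arnoldi` are assumptions.

```python
import numpy as np

def arnoldi(matvec, v0, k):
    """Run k steps of the Arnoldi process with modified Gram-Schmidt.

    Returns V (n x (k+1), orthonormal columns) and H ((k+1) x k, upper
    Hessenberg) such that matvec applied to V[:, :k] equals V @ H.
    """
    n = v0.shape[0]
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    V[:, 0] = v0 / np.linalg.norm(v0)
    for j in range(k):
        w = matvec(V[:, j])
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] == 0.0:
            raise ValueError("exact breakdown: Krylov subspace is invariant")
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

# Product of three factors, applied one factor at a time.
rng = np.random.default_rng(4)
n, k = 30, 8
A, B, C = (rng.standard_normal((n, n)) / np.sqrt(n) for _ in range(3))

def matvec(x):
    return A @ (B @ (C @ x))

V, H = arnoldi(matvec, rng.standard_normal(n), k)
ritz_values = np.linalg.eigvals(H[:k, :k])  # approximations to eigenvalues of ABC
print("Ritz values:", ritz_values)
```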


Conclusions

This thesis is concerned with numerical methods for solving general and structured eigenvalue problems.

It also comes with software implementing these methods.

Contributions have been made to various aspects of

the QR algorithm,

the QZ algorithm,

the periodic QR algorithm,

structure-preserving methods for (skew-)Hamiltonian matrices,

and the Krylov-Schur algorithm.

This is partly joint work with Björn Adlerborn, Peter Benner, Ralph Byers, Bo Kågström, Volker Mehrmann.

Thanks!


Finally..

..thanks for your attention!

