
Linear Algebra: Foundations to Frontiers

A Collection of Notes

on Numerical Linear Algebra

Robert A. van de Geijn

Release Date December 12, 2014

Kindly do not share this PDF

Point others to http://www.ulaff.net instead

This is a work in progress


Copyright © 2014 by Robert A. van de Geijn.

10 9 8 7 6 5 4 3 2 1

All rights reserved. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, contact any of the authors.

No warranties, express or implied, are made by the publisher, authors, and their employers that the programs contained in this volume are free of error. They should not be relied on as the sole basis to solve a problem whose incorrect solution could result in injury to person or property. If the programs are employed in such a manner, it is at the user's own risk and the publisher, authors, and their employers disclaim all liability for such misuse.

Trademarked names may be used in this book without the inclusion of a trademark symbol. These names are used in an editorial context only; no infringement of trademark is intended.

Library of Congress Cataloging-in-Publication Data not yet available.

Draft Edition, November 2014

This "Draft Edition" allows this material to be used while we sort out the mechanism through which we will publish the book.


Contents

Preface

1. Notes on Simple Vector and Matrix Operations
   Video
   Outline
   1.1. Notation
   1.2. (Hermitian) Transposition
        1.2.1. Conjugating a complex scalar
        1.2.2. Conjugate of a vector
        1.2.3. Conjugate of a matrix
        1.2.4. Transpose of a vector
        1.2.5. Hermitian transpose of a vector
        1.2.6. Transpose of a matrix
        1.2.7. Hermitian transpose (adjoint) of a matrix
        1.2.8. Exercises
   1.3. Vector-vector Operations
        1.3.1. Scaling a vector (scal)
        1.3.2. Scaled vector addition (axpy)
        1.3.3. Dot (inner) product (dot)
   1.4. Matrix-vector Operations
        1.4.1. Matrix-vector multiplication (product)
        1.4.2. Rank-1 update
   1.5. Matrix-matrix multiplication (product)
        1.5.1. Element-by-element computation
        1.5.2. Via matrix-vector multiplications
        1.5.3. Via row-vector times matrix multiplications
        1.5.4. Via rank-1 updates
        1.5.5. Cost
   1.6. Summarizing All

2. Notes on Vector and Matrix Norms
   Video
   Outline
   2.1. Absolute Value
   2.2. Vector Norms
        2.2.1. Vector 2-norm (length)
        2.2.2. Vector 1-norm
        2.2.3. Vector ∞-norm (infinity norm)
        2.2.4. Vector p-norm
   2.3. Matrix Norms
        2.3.1. Frobenius norm
        2.3.2. Induced matrix norms
        2.3.3. Special cases used in practice
        2.3.4. Discussion
        2.3.5. Submultiplicative norms
   2.4. An Application to Conditioning of Linear Systems
   2.5. Equivalence of Norms

3. Notes on Orthogonality and the Singular Value Decomposition
   Video
   Outline
   3.1. Orthogonality and Unitary Matrices
   3.2. Toward the SVD
   3.3. The Theorem
   3.4. Geometric Interpretation
   3.5. Consequences of the SVD Theorem
   3.6. Projection onto the Column Space
   3.7. Low-rank Approximation of a Matrix
   3.8. An Application
   3.9. SVD and the Condition Number of a Matrix
   3.10. An Algorithm for Computing the SVD?

4. Notes on Gram-Schmidt QR Factorization
   Video
   Outline
   4.1. Classical Gram-Schmidt (CGS) Process
   4.2. Modified Gram-Schmidt (MGS) Process
   4.3. In Practice, MGS is More Accurate
   4.4. Cost
        4.4.1. Cost of CGS
        4.4.2. Cost of MGS

5. Notes on the FLAME APIs
   Video
   Outline
   5.1. Motivation
   5.2. Install FLAME@lab
   5.3. An Example: Gram-Schmidt Orthogonalization
        5.3.1. The Spark Webpage
        5.3.2. Implementing CGS with FLAME@lab
        5.3.3. Editing the code skeleton
        5.3.4. Testing
   5.4. Implementing the Other Algorithms

6. Notes on Householder QR Factorization
   Video
   Outline
   6.1. Motivation
   6.2. Householder Transformations (Reflectors)
        6.2.1. The general case
        6.2.2. As implemented for the Householder QR factorization (real case)
        6.2.3. The complex case (optional)
        6.2.4. A routine for computing the Householder vector
   6.3. Householder QR Factorization
   6.4. Forming Q
   6.5. Applying Q^H
   6.6. Blocked Householder QR Factorization
        6.6.1. The UT transform: Accumulating Householder transformations
        6.6.2. A blocked algorithm
        6.6.3. Variations on a theme

7. Notes on Solving Linear Least-Squares Problems
   Video
   Outline
   7.1. The Linear Least-Squares Problem
   7.2. Method of Normal Equations
   7.3. Solving the LLS Problem Via the QR Factorization
        7.3.1. Simple derivation of the solution
        7.3.2. Alternative derivation of the solution
   7.4. Via Householder QR Factorization
   7.5. Via the Singular Value Decomposition
        7.5.1. Simple derivation of the solution
        7.5.2. Alternative derivation of the solution
   7.6. What If A Does Not Have Linearly Independent Columns?
   7.7. Exercise: Using the LQ factorization to solve underdetermined systems

8. Notes on the Condition of a Problem
   Video
   Outline
   8.1. Notation
   8.2. The Prototypical Example: Solving a Linear System
   8.3. Condition Number of a Rectangular Matrix
   8.4. Why Using the Method of Normal Equations Could be Bad
   8.5. Why Multiplication with Unitary Matrices is a Good Thing

9. Notes on the Stability of an Algorithm
   Video
   Outline
   9.1. Motivation
   9.2. Floating Point Numbers
   9.3. Notation
   9.4. Floating Point Computation
        9.4.1. Model of floating point computation
        9.4.2. Stability of a numerical algorithm
        9.4.3. Absolute value of vectors and matrices
   9.5. Stability of the Dot Product Operation
        9.5.1. An algorithm for computing dot
        9.5.2. A simple start
        9.5.3. Preparation
        9.5.4. Target result
        9.5.5. A proof in traditional format
        9.5.6. A weapon of math induction for the war on error (optional)
        9.5.7. Results
   9.6. Stability of a Matrix-Vector Multiplication Algorithm
        9.6.1. An algorithm for computing gemv
        9.6.2. Analysis
   9.7. Stability of a Matrix-Matrix Multiplication Algorithm
        9.7.1. An algorithm for computing gemm
        9.7.2. Analysis
        9.7.3. An application

10. Notes on Performance

11. Notes on Gaussian Elimination and LU Factorization
    Video
    Outline
    11.1. Definition and Existence
    11.2. LU Factorization
         11.2.1. First derivation
         11.2.2. Gauss transforms
         11.2.3. Cost of LU factorization
    11.3. LU Factorization with Partial Pivoting
         11.3.1. Permutation matrices
         11.3.2. The algorithm
    11.4. Proof of Theorem 11.3
    11.5. LU with Complete Pivoting
    11.6. Solving Ax = y Via the LU Factorization with Pivoting
    11.7. Solving Triangular Systems of Equations
         11.7.1. Lz = y
         11.7.2. Ux = z
    11.8. Other LU Factorization Algorithms
         11.8.1. Variant 1: Bordered algorithm
         11.8.2. Variant 2: Left-looking algorithm
         11.8.3. Variant 3: Up-looking variant
         11.8.4. Variant 4: Crout variant
         11.8.5. Variant 5: Classical LU factorization
         11.8.6. All algorithms
         11.8.7. Formal derivation of algorithms
    11.9. Numerical Stability Results
    11.10. Is LU with Partial Pivoting Stable?
    11.11. Blocked Algorithms
         11.11.1. Blocked classical LU factorization (Variant 5)
         11.11.2. Blocked classical LU factorization with pivoting (Variant 5)
    11.12. Variations on a Triple-Nested Loop

12. Notes on Cholesky Factorization
    Video
    Outline
    12.1. Definition and Existence
    12.2. Application
    12.3. An Algorithm
    12.4. Proof of the Cholesky Factorization Theorem
    12.5. Blocked Algorithm
    12.6. Alternative Representation
    12.7. Cost
    12.8. Solving the Linear Least-Squares Problem via the Cholesky Factorization
    12.9. Other Cholesky Factorization Algorithms
    12.10. Implementing the Cholesky Factorization with the (Traditional) BLAS
         12.10.1. What are the BLAS?
         12.10.2. A simple implementation in Fortran
         12.10.3. Implementation with calls to level-1 BLAS
         12.10.4. Matrix-vector operations (level-2 BLAS)
         12.10.5. Matrix-matrix operations (level-3 BLAS)
         12.10.6. Impact on performance
    12.11. Alternatives to the BLAS
         12.11.1. The FLAME/C API
         12.11.2. BLIS

13. Notes on Eigenvalues and Eigenvectors
    Video
    Outline
    13.1. Definition
    13.2. The Schur and Spectral Factorizations
    13.3. Relation Between the SVD and the Spectral Decomposition

14. Notes on the Power Method and Related Methods
    Video
    Outline
    14.1. The Power Method
         14.1.1. First attempt
         14.1.2. Second attempt
         14.1.3. Convergence
         14.1.4. Practical Power Method
         14.1.5. The Rayleigh quotient
         14.1.6. What if |λ0| ≥ |λ1|?
    14.2. The Inverse Power Method
    14.3. Rayleigh-quotient Iteration

15. Notes on the QR Algorithm and other Dense Eigensolvers
    Video
    Outline
    15.1. Preliminaries
    15.2. Subspace Iteration
    15.3. The QR Algorithm
         15.3.1. A basic (unshifted) QR algorithm
         15.3.2. A basic shifted QR algorithm
    15.4. Reduction to Tridiagonal Form
         15.4.1. Householder transformations (reflectors)
         15.4.2. Algorithm
    15.5. The QR algorithm with a Tridiagonal Matrix
         15.5.1. Givens' rotations
    15.6. QR Factorization of a Tridiagonal Matrix
    15.7. The Implicitly Shifted QR Algorithm
         15.7.1. Upper Hessenberg and tridiagonal matrices
         15.7.2. The Implicit Q Theorem
         15.7.3. The Francis QR Step
         15.7.4. A complete algorithm
    15.8. Further Reading
         15.8.1. More on reduction to tridiagonal form
         15.8.2. Optimizing the tridiagonal QR algorithm
    15.9. Other Algorithms
         15.9.1. Jacobi's method for the symmetric eigenvalue problem
         15.9.2. Cuppen's Algorithm
         15.9.3. The Method of Multiple Relatively Robust Representations (MRRR)
    15.10. The Nonsymmetric QR Algorithm
         15.10.1. A variant of the Schur decomposition
         15.10.2. Reduction to upper Hessenberg form
         15.10.3. The implicitly double-shifted QR algorithm

16. Notes on the Method of Relatively Robust Representations (MRRR)
    Outline
    16.1. MRRR, from 35,000 Feet
    16.2. Cholesky Factorization, Again
    16.3. The LDL^T Factorization
    16.4. The UDU^T Factorization
    16.5. The UDU^T Factorization
    16.6. The Twisted Factorization
    16.7. Computing an Eigenvector from the Twisted Factorization

17. Notes on Computing the SVD of a Matrix
    Outline
    17.1. Background
    17.2. Reduction to Bidiagonal Form
    17.3. The QR Algorithm with a Bidiagonal Matrix
    17.4. Putting it all together

18. Notes on Splitting Methods
    Video
    Outline
    18.1. A Simple Example: One-Dimensional Boundary Value Problem
    18.2. A Two-dimensional Example
         18.2.1. Discretization
    18.3. Direct solution
    18.4. Iterative solution: Jacobi iteration
         18.4.1. Motivating example
         18.4.2. More generally
    18.5. The general case
    18.6. Theory
         18.6.1. Some useful facts

19. Notes on Descent Methods and the Conjugate Gradient Method
    Outline
    19.1. Basics
    19.2. Descent Methods
    19.3. Relation to Splitting Methods
    19.4. Method of Steepest Descent
    19.5. Preconditioning
    19.6. Methods of A-conjugate Directions
    19.7. Conjugate Gradient Method

20. Notes on Lanczos Methods
    Outline
    20.1. Krylov Subspaces

Answers
    1. Notes on Simple Vector and Matrix Operations
    2. Notes on Vector and Matrix Norms
    3. Notes on Orthogonality and the SVD
    4. Notes on Gram-Schmidt QR Factorization
    6. Notes on Householder QR Factorization
    7. Notes on Solving Linear Least-Squares Problems
    8. Notes on the Condition of a Problem
    9. Notes on the Stability of an Algorithm
    10. Notes on Performance
    11. Notes on Gaussian Elimination and LU Factorization
    12. Notes on Cholesky Factorization
    13. Notes on Eigenvalues and Eigenvectors
    14. Notes on the Power Method and Related Methods
    15. Notes on the Symmetric QR Algorithm
    16. Notes on the Method of Relatively Robust Representations
    17. Notes on Computing the SVD
    18. Notes on Splitting Methods

A. How to Download

Bibliography

Index


Preface

This document was created over the course of many years, as I periodically taught an introductory course titled "Numerical Analysis: Linear Algebra," cross-listed in the departments of Computer Science, Math, and Statistics and Data Sciences (SDS), as well as the Computational Science, Engineering, and Mathematics (CSEM) graduate program.

Over the years, my colleagues and I have used many different books for this course: Matrix Computations by Golub and Van Loan [21], Fundamentals of Matrix Computation by Watkins [45], Numerical Linear Algebra by Trefethen and Bau [36], and Applied Numerical Linear Algebra by Demmel [11]. All are books with tremendous strengths and depth. Nonetheless, I found myself writing materials for my students that add insights that are often drawn from our own research experiences in the field. These became a series of notes that are meant to supplement rather than replace any of the mentioned books.

Fundamental to our exposition is the FLAME notation [24, 38], which we use to present algorithms hand-in-hand with theory. For example, in Figure 1 (top), we present a commonly encountered LU factorization algorithm, which we will see performs exactly the same computations as does Gaussian elimination. By abstracting away from the detailed indexing that is required to implement an algorithm in, for example, Matlab's M-script language, the reader can focus on the mathematics that justifies the algorithm rather than the indices used to express the algorithm. The algorithms can be easily translated into code with the help of a FLAME Application Programming Interface (API). Such interfaces have been created for C, M-script, and Python, to name a few [24, 5, 29]. In Figure 1 (bottom), we show the LU factorization algorithm implemented with the FLAME@lab API for M-script. The C API is used extensively in our implementation of the libflame dense linear algebra library [39, 40] for sequential and shared-memory architectures, and the notation also inspired the API used to implement the Elemental dense linear algebra library [30] that targets distributed memory architectures. Thus, the reader is exposed to the abstractions that experts use to translate algorithms to high-performance software.

These notes may at times appear to be a vanity project, since I often point the reader to our research papers for a glimpse at the cutting edge of the field. The fact is that over the last two decades, we have helped further the understanding of many of the topics discussed in a typical introductory course on numerical linear algebra. Since we use the FLAME notation in many of our papers, they should be relatively easy to read once one familiarizes oneself with these notes. Let's be blunt: these notes do not do the field justice when it comes to also giving a historic perspective. For that, we recommend any of the above-mentioned texts, or the wonderful books by G.W. Stewart [34, 35]. This is yet another reason why they should be used to supplement other texts.



Algorithm: A := L\U = LU(A)

    Partition A → ( ATL ATR ; ABL ABR )
      where ATL is 0 × 0
    while n(ATL) < n(A) do
        Repartition
            ( ATL ATR ; ABL ABR ) → ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
        a21 := a21 / α11
        A22 := A22 − a21 a12^T
        Continue with
            ( ATL ATR ; ABL ABR ) ← ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 )
    endwhile

(In this plain-text rendering of the FLAME notation, ";" separates the block rows of a partitioned matrix.)

    function [ A_out ] = LU_unb_var4( A )
      [ ATL, ATR, ...
        ABL, ABR ] = FLA_Part_2x2( A, ...
                                   0, 0, 'FLA_TL' );
      while ( size( ATL, 1 ) < size( A, 1 ) )
        [ A00,  a01,     A02,  ...
          a10t, alpha11, a12t, ...
          A20,  a21,     A22 ] = ...
            FLA_Repart_2x2_to_3x3( ...
              ATL, ATR, ...
              ABL, ABR, 1, 1, 'FLA_BR' );
        %------------------------------------------%
        a21 = a21 / alpha11;
        A22 = A22 - a21 * a12t;
        %------------------------------------------%
        [ ATL, ATR, ...
          ABL, ABR ] = ...
            FLA_Cont_with_3x3_to_2x2( ...
              A00,  a01,     A02,  ...
              a10t, alpha11, a12t, ...
              A20,  a21,     A22, 'FLA_TL' );
      end
      A_out = [ ATL, ATR
                ABL, ABR ];
    return

Figure 1: LU factorization represented with the FLAME notation and the FLAME@lab API.

The order of the chapters should not be taken too seriously. They were initially written as separate, relatively self-contained notes. The sequence on which I settled roughly follows the order in which the topics are encountered in Numerical Linear Algebra by Trefethen and Bau [36]. The reason is the same as the one they give: it introduces orthogonality and the Singular Value Decomposition early on, leading with important material that many students will not yet have seen in the undergraduate linear algebra classes they took. However, one could just as easily rearrange the chapters so that one starts with a more traditional topic: solving dense linear systems.

The notes frequently refer the reader to another resource of ours titled Linear Algebra: Foundations to Frontiers - Notes to LAFF With (LAFF Notes) [29]. This is a 900+ page document with more than 270 videos that was created for the Massive Open Online Course (MOOC) Linear Algebra: Foundations to Frontiers (LAFF), offered on the edX platform. That course provides an appropriate undergraduate background for these notes.

I (tried to) videotape my lectures during Fall 2014. Unlike the many short videos that we created for the Massive Open Online Course (MOOC) titled "Linear Algebra: Foundations to Frontiers" that are now part of the notes for that course, I simply set up a camera, taped the entire lecture, spent minimal time editing, and uploaded the result for the world to see. Worse, I did not prepare particularly well for the lectures, other than feverishly writing these notes in the days prior to the presentation. Sometimes, I forgot to turn on the microphone and/or the camera. Sometimes the memory of the camcorder was full. Sometimes I forgot to shave. Often I forgot to comb my hair. You are welcome to watch, but don't expect too much!

One should consider this a living document. As time goes on, I will be adding more material. Ideally, people with more expertise than I have on, for example, solving sparse linear systems will contribute notes of their own.


Acknowledgments

These notes use notation and tools developed by the FLAME project at The University of Texas at Austin (USA), Universidad Jaume I (Spain), and RWTH Aachen University (Germany). This project involves a large and ever expanding number of very talented people, to whom I am indebted. Over the years, it has been supported by a number of grants from the National Science Foundation and industry. The most recent and most relevant funding came from NSF Award ACI-1148125 titled "SI2-SSI: A Linear Algebra Software Infrastructure for Sustained Innovation in Computational Chemistry and other Sciences". [1]

In Texas, behind every successful man there is a woman who really pulls the strings. For more than thirty years, my research, teaching, and other pedagogical activities have been greatly influenced by my wife, Dr. Maggie Myers. For parts of these notes that are particularly successful, the credit goes to her. Where they fall short, the blame is all mine!

[1] Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).



Chapter 1
Notes on Simple Vector and Matrix Operations

We assume that the reader is quite familiar with vectors, linear transformations, and matrices. If not, we suggest the reader review the first five weeks of

Linear Algebra: Foundations to Frontiers - Notes to LAFF With [29].

Since undergraduate courses tend to focus on real valued matrices and vectors, we mostly focus on the case where they are complex valued as we review.

Video

Read the disclaimer regarding the videos in the preface! The first video below is actually for the first lecture of the semester.

* YouTube
* Download from UT Box
* View After Local Download

(For help on viewing, see Appendix A.) (In the downloadable version, the lecture starts about 27 seconds into the video. I was still learning how to do the editing...) The video for this particular note didn't turn out, so you will want to read instead. As I pointed out in the preface: these videos aren't refined by any measure!




Outline

Video
Outline
1.1. Notation
1.2. (Hermitian) Transposition
     1.2.1. Conjugating a complex scalar
     1.2.2. Conjugate of a vector
     1.2.3. Conjugate of a matrix
     1.2.4. Transpose of a vector
     1.2.5. Hermitian transpose of a vector
     1.2.6. Transpose of a matrix
     1.2.7. Hermitian transpose (adjoint) of a matrix
     1.2.8. Exercises
1.3. Vector-vector Operations
     1.3.1. Scaling a vector (scal)
     1.3.2. Scaled vector addition (axpy)
     1.3.3. Dot (inner) product (dot)
1.4. Matrix-vector Operations
     1.4.1. Matrix-vector multiplication (product)
     1.4.2. Rank-1 update
1.5. Matrix-matrix multiplication (product)
     1.5.1. Element-by-element computation
     1.5.2. Via matrix-vector multiplications
     1.5.3. Via row-vector times matrix multiplications
     1.5.4. Via rank-1 updates
     1.5.5. Cost
1.6. Summarizing All



1.1 Notation

Throughout our notes we will adopt notation popularized by Alston Householder. As a rule, we will use lower case Greek letters (α, β, etc.) to denote scalars. For vectors, we will use lower case Roman letters (a, b, etc.). Matrices are denoted by upper case Roman letters (A, B, etc.).

If $x \in \mathbb{C}^n$ and $A \in \mathbb{C}^{m \times n}$, then we expose the elements of $x$ and $A$ as

$$x = \begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix}
\quad\text{and}\quad
A = \begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & \cdots & \alpha_{0,n-1} \\
\alpha_{1,0} & \alpha_{1,1} & \cdots & \alpha_{1,n-1} \\
\vdots & \vdots & & \vdots \\
\alpha_{m-1,0} & \alpha_{m-1,1} & \cdots & \alpha_{m-1,n-1}
\end{pmatrix}.$$

If vector $x$ is partitioned into $N$ subvectors, we may denote this by

$$x = \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{N-1} \end{pmatrix}.$$

It is possible for a subvector to be of size zero (no elements) or one (a scalar). If we want to emphasize that a specific subvector is a scalar, then we may choose to use a lower case Greek letter for that scalar, as in

$$x = \begin{pmatrix} x_0 \\ \chi_1 \\ x_2 \end{pmatrix}.$$

We will see frequent examples where a matrix, $A \in \mathbb{C}^{m \times n}$, is partitioned into columns or rows:

$$A = \begin{pmatrix} a_0 & a_1 & \cdots & a_{n-1} \end{pmatrix}
= \begin{pmatrix} \widetilde a_0^T \\ \widetilde a_1^T \\ \vdots \\ \widetilde a_{m-1}^T \end{pmatrix}.$$

Here we add the $\widetilde{\ }$ (tilde) to be able to distinguish between the identifiers for the columns and rows. When from context it is obvious whether we refer to a row or column, we may choose to skip the $\widetilde{\ }$. Sometimes, we partition matrices into submatrices:

$$A = \begin{pmatrix} A_0 & A_1 & \cdots & A_{N-1} \end{pmatrix}
= \begin{pmatrix} \widetilde A_0 \\ \widetilde A_1 \\ \vdots \\ \widetilde A_{M-1} \end{pmatrix}
= \begin{pmatrix}
A_{0,0} & A_{0,1} & \cdots & A_{0,N-1} \\
A_{1,0} & A_{1,1} & \cdots & A_{1,N-1} \\
\vdots & \vdots & & \vdots \\
A_{M-1,0} & A_{M-1,1} & \cdots & A_{M-1,N-1}
\end{pmatrix}.$$



1.2 (Hermitian) Transposition

1.2.1 Conjugating a complex scalar

Recall that if $\alpha = \alpha_r + i\alpha_c$, then its (complex) conjugate is given by

$$\bar\alpha = \alpha_r - i\alpha_c$$

and its length (absolute value) by

$$|\alpha| = |\alpha_r + i\alpha_c| = \sqrt{\alpha_r^2 + \alpha_c^2} = \sqrt{(\alpha_r + i\alpha_c)(\alpha_r - i\alpha_c)} = \sqrt{\alpha\bar\alpha} = \sqrt{\bar\alpha\alpha} = |\bar\alpha|.$$
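These identities are easy to check numerically in M-script (MATLAB/Octave). The following snippet is a sketch for illustration only; the value chosen for α is arbitrary:

    alpha = 3 + 4i;               % an arbitrary complex scalar
    conj( alpha )                 % the conjugate: 3 - 4i
    abs( alpha )                  % the length: sqrt( 3^2 + 4^2 ) = 5
    sqrt( alpha * conj( alpha ) ) % also 5 (up to roundoff)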

1.2.2 Conjugate of a vector

The (complex) conjugate of $x$ is given by

$$\bar x = \begin{pmatrix} \bar\chi_0 \\ \bar\chi_1 \\ \vdots \\ \bar\chi_{n-1} \end{pmatrix}.$$

1.2.3 Conjugate of a matrix

The (complex) conjugate of $A$ is given by

$$\bar A = \begin{pmatrix}
\bar\alpha_{0,0} & \bar\alpha_{0,1} & \cdots & \bar\alpha_{0,n-1} \\
\bar\alpha_{1,0} & \bar\alpha_{1,1} & \cdots & \bar\alpha_{1,n-1} \\
\vdots & \vdots & & \vdots \\
\bar\alpha_{m-1,0} & \bar\alpha_{m-1,1} & \cdots & \bar\alpha_{m-1,n-1}
\end{pmatrix}.$$

1.2.4 Transpose of a vector

The transpose of $x$ is given by

$$x^T = \begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix}^{T}
= \begin{pmatrix} \chi_0 & \chi_1 & \cdots & \chi_{n-1} \end{pmatrix}.$$

Notice that transposing a (column) vector rearranges its elements to make a row vector.



1.2.5 Hermitian transpose of a vector

The Hermitian transpose of $x$ is given by

$$x^H (= x^c) = \bar x^T
= \overline{\begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix}}^{\,T}
= \begin{pmatrix} \bar\chi_0 \\ \bar\chi_1 \\ \vdots \\ \bar\chi_{n-1} \end{pmatrix}^{T}
= \begin{pmatrix} \bar\chi_0 & \bar\chi_1 & \cdots & \bar\chi_{n-1} \end{pmatrix}.$$

1.2.6 Transpose of a matrix

The transpose of $A$ is given by

$$A^T = \begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & \cdots & \alpha_{0,n-1} \\
\alpha_{1,0} & \alpha_{1,1} & \cdots & \alpha_{1,n-1} \\
\vdots & \vdots & & \vdots \\
\alpha_{m-1,0} & \alpha_{m-1,1} & \cdots & \alpha_{m-1,n-1}
\end{pmatrix}^{T}
=
\begin{pmatrix}
\alpha_{0,0} & \alpha_{1,0} & \cdots & \alpha_{m-1,0} \\
\alpha_{0,1} & \alpha_{1,1} & \cdots & \alpha_{m-1,1} \\
\vdots & \vdots & & \vdots \\
\alpha_{0,n-1} & \alpha_{1,n-1} & \cdots & \alpha_{m-1,n-1}
\end{pmatrix}.$$

1.2.7 Hermitian transpose (adjoint) of a matrix

The Hermitian transpose of $A$ is given by

$$A^H (= A^c) = \bar A^T
= \overline{\begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & \cdots & \alpha_{0,n-1} \\
\alpha_{1,0} & \alpha_{1,1} & \cdots & \alpha_{1,n-1} \\
\vdots & \vdots & & \vdots \\
\alpha_{m-1,0} & \alpha_{m-1,1} & \cdots & \alpha_{m-1,n-1}
\end{pmatrix}}^{\,T}
=
\begin{pmatrix}
\bar\alpha_{0,0} & \bar\alpha_{1,0} & \cdots & \bar\alpha_{m-1,0} \\
\bar\alpha_{0,1} & \bar\alpha_{1,1} & \cdots & \bar\alpha_{m-1,1} \\
\vdots & \vdots & & \vdots \\
\bar\alpha_{0,n-1} & \bar\alpha_{1,n-1} & \cdots & \bar\alpha_{m-1,n-1}
\end{pmatrix}.$$

1.2.8 Exercises

Homework 1.1 Partition $A$

$$A = \begin{pmatrix} a_0 & a_1 & \cdots & a_{n-1} \end{pmatrix}
= \begin{pmatrix} \widetilde a_0^T \\ \widetilde a_1^T \\ \vdots \\ \widetilde a_{m-1}^T \end{pmatrix}.$$

Convince yourself that the following hold:

• $\begin{pmatrix} a_0 & a_1 & \cdots & a_{n-1} \end{pmatrix}^{T} = \begin{pmatrix} a_0^T \\ a_1^T \\ \vdots \\ a_{n-1}^T \end{pmatrix}$.

• $\begin{pmatrix} \widetilde a_0^T \\ \widetilde a_1^T \\ \vdots \\ \widetilde a_{m-1}^T \end{pmatrix}^{T} = \begin{pmatrix} \widetilde a_0 & \widetilde a_1 & \cdots & \widetilde a_{m-1} \end{pmatrix}$.

• $\begin{pmatrix} a_0 & a_1 & \cdots & a_{n-1} \end{pmatrix}^{H} = \begin{pmatrix} a_0^H \\ a_1^H \\ \vdots \\ a_{n-1}^H \end{pmatrix}$.

• $\begin{pmatrix} \widetilde a_0^T \\ \widetilde a_1^T \\ \vdots \\ \widetilde a_{m-1}^T \end{pmatrix}^{H} = \begin{pmatrix} \bar{\widetilde a}_0 & \bar{\widetilde a}_1 & \cdots & \bar{\widetilde a}_{m-1} \end{pmatrix}$.

* SEE ANSWER

Homework 1.2 Partition $x$ into subvectors:

$$x = \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{N-1} \end{pmatrix}.$$

Convince yourself that the following hold:

• $\bar x = \begin{pmatrix} \bar x_0 \\ \bar x_1 \\ \vdots \\ \bar x_{N-1} \end{pmatrix}$.

• $x^T = \begin{pmatrix} x_0^T & x_1^T & \cdots & x_{N-1}^T \end{pmatrix}$.

• $x^H = \begin{pmatrix} x_0^H & x_1^H & \cdots & x_{N-1}^H \end{pmatrix}$.

* SEE ANSWER



Homework 1.3 Partition $A$

$$A = \begin{pmatrix}
A_{0,0} & A_{0,1} & \cdots & A_{0,N-1} \\
A_{1,0} & A_{1,1} & \cdots & A_{1,N-1} \\
\vdots & \vdots & & \vdots \\
A_{M-1,0} & A_{M-1,1} & \cdots & A_{M-1,N-1}
\end{pmatrix},$$

where $A_{i,j} \in \mathbb{C}^{m_i \times n_j}$. Here $\sum_{i=0}^{M-1} m_i = m$ and $\sum_{j=0}^{N-1} n_j = n$.

Convince yourself that the following hold:

• $\begin{pmatrix}
A_{0,0} & A_{0,1} & \cdots & A_{0,N-1} \\
A_{1,0} & A_{1,1} & \cdots & A_{1,N-1} \\
\vdots & \vdots & & \vdots \\
A_{M-1,0} & A_{M-1,1} & \cdots & A_{M-1,N-1}
\end{pmatrix}^{T}
=
\begin{pmatrix}
A_{0,0}^T & A_{1,0}^T & \cdots & A_{M-1,0}^T \\
A_{0,1}^T & A_{1,1}^T & \cdots & A_{M-1,1}^T \\
\vdots & \vdots & & \vdots \\
A_{0,N-1}^T & A_{1,N-1}^T & \cdots & A_{M-1,N-1}^T
\end{pmatrix}$.

• $\begin{pmatrix}
A_{0,0} & A_{0,1} & \cdots & A_{0,N-1} \\
A_{1,0} & A_{1,1} & \cdots & A_{1,N-1} \\
\vdots & \vdots & & \vdots \\
A_{M-1,0} & A_{M-1,1} & \cdots & A_{M-1,N-1}
\end{pmatrix}^{H}
=
\begin{pmatrix}
A_{0,0}^H & A_{1,0}^H & \cdots & A_{M-1,0}^H \\
A_{0,1}^H & A_{1,1}^H & \cdots & A_{M-1,1}^H \\
\vdots & \vdots & & \vdots \\
A_{0,N-1}^H & A_{1,N-1}^H & \cdots & A_{M-1,N-1}^H
\end{pmatrix}$.

* SEE ANSWER

1.3 Vector-vector Operations

1.3.1 Scaling a vector (scal)

Let $x \in \mathbb{C}^n$ and $\alpha \in \mathbb{C}$, with

$$x = \begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix}.$$

Then $\alpha x$ equals the vector $x$ "stretched" by a factor $\alpha$:

$$\alpha x = \alpha \begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix}
= \begin{pmatrix} \alpha\chi_0 \\ \alpha\chi_1 \\ \vdots \\ \alpha\chi_{n-1} \end{pmatrix}.$$



If $y := \alpha x$ with

$$y = \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{n-1} \end{pmatrix},$$

then the following loop computes $y$:

    for i := 0, ..., n−1
        ψ_i := α χ_i
    endfor
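A direct M-script translation of this loop might look as follows. This is a sketch for exposition: the function name scal is ours, and in practice one would simply write y = alpha * x.

    function y = scal( alpha, x )
    % y := alpha * x, computed one element at a time as in the loop above.
    % M-script indexes from 1 where the notes index from 0.
      n = length( x );
      y = zeros( n, 1 );
      for i = 1 : n
        y( i ) = alpha * x( i );
      end
    end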

Homework 1.4 Convince yourself of the following:

• $\alpha x^T = \begin{pmatrix} \alpha\chi_0 & \alpha\chi_1 & \cdots & \alpha\chi_{n-1} \end{pmatrix}$.

• $(\alpha x)^T = \alpha x^T$.

• $(\alpha x)^H = \bar\alpha\, x^H$.

• $\alpha \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{N-1} \end{pmatrix} = \begin{pmatrix} \alpha x_0 \\ \alpha x_1 \\ \vdots \\ \alpha x_{N-1} \end{pmatrix}$.

* SEE ANSWER

Cost

Scaling a vector of size n requires, approximately, n multiplications. Each of these becomes a floating point operation (flop) when the computation is performed as part of an algorithm executed on a computer that performs floating point computation. We will thus say that scaling a vector costs n flops.

It should be noted that arithmetic with complex numbers is roughly 4 times as expensive as arithmetic with real numbers. In the chapter "Notes on Performance" we also discuss that the cost of moving data impacts the cost of a flop. Thus, not all flops are created equal!

1.3.2 Scaled vector addition (axpy)

Let $x, y \in \mathbb{C}^n$ and $\alpha \in \mathbb{C}$, with

$$x = \begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix}
\quad\text{and}\quad
y = \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{n-1} \end{pmatrix}.$$

Then $\alpha x + y$ equals the vector

$$\alpha x + y = \alpha \begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix} + \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{n-1} \end{pmatrix}
= \begin{pmatrix} \alpha\chi_0 + \psi_0 \\ \alpha\chi_1 + \psi_1 \\ \vdots \\ \alpha\chi_{n-1} + \psi_{n-1} \end{pmatrix}.$$

This operation is known as the axpy operation: scalar $\alpha$ times $x$ plus $y$. Typically, the vector $y$ is overwritten with the result:

$$y := \alpha x + y = \alpha \begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix} + \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{n-1} \end{pmatrix}
= \begin{pmatrix} \alpha\chi_0 + \psi_0 \\ \alpha\chi_1 + \psi_1 \\ \vdots \\ \alpha\chi_{n-1} + \psi_{n-1} \end{pmatrix}$$

so that the following loop updates $y$:

    for i := 0, ..., n−1
        ψ_i := α χ_i + ψ_i
    endfor
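The corresponding M-script sketch (the function name axpy is ours; the vectorized expression y = alpha * x + y is equivalent):

    function y = axpy( alpha, x, y )
    % y := alpha * x + y, updating y element by element.
      for i = 1 : length( x )
        y( i ) = alpha * x( i ) + y( i );
      end
    end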

Homework 1.6 Convince yourself of the following:

• $\alpha \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{N-1} \end{pmatrix} + \begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{N-1} \end{pmatrix} = \begin{pmatrix} \alpha x_0 + y_0 \\ \alpha x_1 + y_1 \\ \vdots \\ \alpha x_{N-1} + y_{N-1} \end{pmatrix}$. (Provided $x_i, y_i \in \mathbb{C}^{n_i}$ and $\sum_{i=0}^{N-1} n_i = n$.)

* SEE ANSWER

Cost

The axpy with two vectors of size n requires, approximately, n multiplies and n additions or 2n flops.

1.3.3 Dot (inner) product (dot)

Let $x, y \in \mathbb{C}^n$ with

$$x = \begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix}
\quad\text{and}\quad
y = \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{n-1} \end{pmatrix}.$$



Then the dot product of $x$ and $y$ is defined by

$$x^H y = \begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix}^{H} \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{n-1} \end{pmatrix}
= \begin{pmatrix} \bar\chi_0 & \bar\chi_1 & \cdots & \bar\chi_{n-1} \end{pmatrix} \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{n-1} \end{pmatrix}
= \bar\chi_0\psi_0 + \bar\chi_1\psi_1 + \cdots + \bar\chi_{n-1}\psi_{n-1}
= \sum_{i=0}^{n-1} \bar\chi_i \psi_i.$$

The following loop computes $\alpha := x^H y$:

    α := 0
    for i := 0, ..., n−1
        α := χ̄_i ψ_i + α
    endfor
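In M-script, note the conjugation of the elements of x. A sketch (dot_product is our name; for column vectors the built-in expression x' * y computes the same value):

    function alpha = dot_product( x, y )
    % alpha := x^H y; conj() supplies the complex conjugate of each chi_i.
      alpha = 0;
      for i = 1 : length( x )
        alpha = conj( x( i ) ) * y( i ) + alpha;
      end
    end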

Homework 1.8 Convince yourself of the following:

• $\begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{N-1} \end{pmatrix}^{H} \begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{N-1} \end{pmatrix} = \sum_{i=0}^{N-1} x_i^H y_i$. (Provided $x_i, y_i \in \mathbb{C}^{n_i}$ and $\sum_{i=0}^{N-1} n_i = n$.)

* SEE ANSWER

Homework 1.10 Prove that $x^H y = \overline{y^H x}$.

* SEE ANSWER

As we discuss matrix-vector multiplication and matrix-matrix multiplication, the closely related operation $x^T y$ is also useful:

$$x^T y = \begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix}^{T} \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{n-1} \end{pmatrix}
= \begin{pmatrix} \chi_0 & \chi_1 & \cdots & \chi_{n-1} \end{pmatrix} \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{n-1} \end{pmatrix}
= \chi_0\psi_0 + \chi_1\psi_1 + \cdots + \chi_{n-1}\psi_{n-1}
= \sum_{i=0}^{n-1} \chi_i \psi_i.$$

We will sometimes refer to this operation as a "dot" product since it is like the dot product $x^H y$, but without conjugation. In the case of real valued vectors, it is the dot product.
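In M-script, the two products correspond to the two transpose operators. A quick illustrative check with arbitrary vectors:

    x = [ 1+1i ; 2 ];
    y = [ 3 ; 4i ];
    x' * y     % x^H y: conjugates the elements of x, giving 3 + 5i
    x.' * y    % x^T y: no conjugation, giving 3 + 11i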



Cost

The dot product of two vectors of size n requires, approximately, n multiplies and n additions. Thus, a dot product costs, approximately, 2n flops.

1.4 Matrix-vector Operations

1.4.1 Matrix-vector multiplication (product)

Be sure to understand the relation between linear transformations and matrix-vector multiplication by reading Week 2 of

Linear Algebra: Foundations to Frontiers - Notes to LAFF With [29].

Let $y \in \mathbb{C}^m$, $A \in \mathbb{C}^{m \times n}$, and $x \in \mathbb{C}^n$ with

$$y = \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{m-1} \end{pmatrix},\quad
A = \begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & \cdots & \alpha_{0,n-1} \\
\alpha_{1,0} & \alpha_{1,1} & \cdots & \alpha_{1,n-1} \\
\vdots & \vdots & & \vdots \\
\alpha_{m-1,0} & \alpha_{m-1,1} & \cdots & \alpha_{m-1,n-1}
\end{pmatrix},
\quad\text{and}\quad
x = \begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix}.$$

Then $y = Ax$ means that

$$\begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{m-1} \end{pmatrix}
= \begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & \cdots & \alpha_{0,n-1} \\
\alpha_{1,0} & \alpha_{1,1} & \cdots & \alpha_{1,n-1} \\
\vdots & \vdots & & \vdots \\
\alpha_{m-1,0} & \alpha_{m-1,1} & \cdots & \alpha_{m-1,n-1}
\end{pmatrix}
\begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix}
= \begin{pmatrix}
\alpha_{0,0}\chi_0 + \alpha_{0,1}\chi_1 + \cdots + \alpha_{0,n-1}\chi_{n-1} \\
\alpha_{1,0}\chi_0 + \alpha_{1,1}\chi_1 + \cdots + \alpha_{1,n-1}\chi_{n-1} \\
\vdots \\
\alpha_{m-1,0}\chi_0 + \alpha_{m-1,1}\chi_1 + \cdots + \alpha_{m-1,n-1}\chi_{n-1}
\end{pmatrix}.$$

This is the definition of the matrix-vector product $Ax$, sometimes referred to as a general matrix-vector multiplication (gemv) when no special structure or properties of $A$ are assumed.

Now, partition

$$A = \begin{pmatrix} a_0 & a_1 & \cdots & a_{n-1} \end{pmatrix}
= \begin{pmatrix} \widetilde a_0^T \\ \widetilde a_1^T \\ \vdots \\ \widetilde a_{m-1}^T \end{pmatrix}.$$



Focusing on how $A$ can be partitioned by columns, we find that

$$y = Ax = \begin{pmatrix} a_0 & a_1 & \cdots & a_{n-1} \end{pmatrix} \begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix}
= a_0\chi_0 + a_1\chi_1 + \cdots + a_{n-1}\chi_{n-1}
= \chi_0 a_0 + \chi_1 a_1 + \cdots + \chi_{n-1} a_{n-1}
= \chi_{n-1}a_{n-1} + (\cdots + (\chi_1 a_1 + (\chi_0 a_0 + 0)) \cdots),$$

where $0$ denotes the zero vector of size $m$. This suggests the following loop for computing $y := Ax$:

    y := 0
    for j := 0, ..., n−1
        y := χ_j a_j + y        (axpy)
    endfor

In Figure 1.1 (top), we present this algorithm using the FLAME notation (LAFF Notes Week 3 [29]). Focusing on how $A$ can be partitioned by rows, we find that

$$y = \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{m-1} \end{pmatrix} = Ax
= \begin{pmatrix} \widetilde a_0^T \\ \widetilde a_1^T \\ \vdots \\ \widetilde a_{m-1}^T \end{pmatrix} x
= \begin{pmatrix} \widetilde a_0^T x \\ \widetilde a_1^T x \\ \vdots \\ \widetilde a_{m-1}^T x \end{pmatrix}.$$

This suggests the following loop for computing $y := Ax$:

    for i := 0, ..., m−1
        ψ_i := ã_i^T x + ψ_i        ("dot")
    endfor

Here we use the term "dot" because for complex valued matrices it is not really a dot product. In Figure 1.1 (bottom), we present this algorithm using the FLAME notation (LAFF Notes Week 3 [29]).

It is important to notice that this first "matrix-vector" operation (matrix-vector multiplication) can be "layered" upon vector-vector operations (axpy or "dot").
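As a sketch of how the row-oriented loop translates to M-script (gemv_unb is our name; this computes y := A x + y, so y is assumed to have been initialized, e.g., to zero; swapping the two loops yields the column-oriented, axpy-based variant):

    function y = gemv_unb( A, x, y )
    % y := A * x + y via "dot" products with the rows of A.
      [ m, n ] = size( A );
      for i = 1 : m
        for j = 1 : n
          y( i ) = A( i, j ) * x( j ) + y( i );
        end
      end
    end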

Cost

Matrix-vector multiplication with an m × n matrix costs, approximately, 2mn flops. This can be easily argued in three different ways:

• The computation requires a multiply and an add with each element of the matrix. There are mn such elements.

• The operation can be computed via n axpy operations, each of which requires 2m flops.

• The operation can be computed via m dot operations, each of which requires 2n flops.


Algorithm: [y] := MVMULT UNB VAR1(A, x, y)
Partition A → (AL AR), x → (xT; xB), where AL has 0 columns and xT has 0 elements
while n(AL) < n(A) do
   Repartition (AL AR) → (A0 a1 A2) and (xT; xB) → (x0; χ1; x2)
      y := χ1 a1 + y
   Continue with (AL AR) ← (A0 a1 A2) and (xT; xB) ← (x0; χ1; x2)
endwhile

Algorithm: [y] := MVMULT UNB VAR2(A, x, y)
Partition A → (AT; AB), y → (yT; yB), where AT has 0 rows and yT has 0 elements
while m(AT) < m(A) do
   Repartition (AT; AB) → (A0; aT1; A2) and (yT; yB) → (y0; ψ1; y2)
      ψ1 := aT1 x + ψ1
   Continue with (AT; AB) ← (A0; aT1; A2) and (yT; yB) ← (y0; ψ1; y2)
endwhile

Figure 1.1: Matrix-vector multiplication algorithms for computing y := Ax + y. Left: via axpy operations (by columns). Right: via “dot products” (by rows).

1.4.2 Rank-1 update

Let y ∈ Cm, A ∈ Cm×n, and x ∈ Cn with

y = (ψ0; ψ1; … ; ψm−1), A the m×n matrix with (i, j) entry αi,j, and x = (χ0; χ1; … ; χn−1), as before.

The outer product of y and x is given by

yxT = (ψ0; ψ1; … ; ψm−1)(χ0; χ1; … ; χn−1)T = (ψ0; ψ1; … ; ψm−1)(χ0 χ1 · · · χn−1),

the m×n matrix whose (i, j) entry equals ψiχj:

yxT = (ψ0χ0 ψ0χ1 · · · ψ0χn−1; ψ1χ0 ψ1χ1 · · · ψ1χn−1; … ; ψm−1χ0 ψm−1χ1 · · · ψm−1χn−1).

Also,

yxT = y(χ0 χ1 · · · χn−1) = (χ0y χ1y · · · χn−1y),

which shows that every column is a multiple of the vector y. Finally,

yxT = (ψ0; ψ1; … ; ψm−1) xT = (ψ0xT; ψ1xT; … ; ψm−1xT),

which shows that every row is a multiple of the row vector xT. This motivates the observation that the matrix yxT has rank at most equal to one (LAFF Notes Week 10 [29]).

The operation A := yxT + A is called a rank-1 update to matrix A and is often referred to as a general rank-1 update (ger):

each element of A is updated as αi,j := ψiχj + αi,j, for i = 0, . . . , m−1 and j = 0, . . . , n−1.

Now, partition

A = (a0 a1 · · · an−1) = (aT0; aT1; … ; aTm−1).

Focusing on how A can be partitioned by columns, we find that

yxT + A = (χ0y χ1y · · · χn−1y) + (a0 a1 · · · an−1) = (χ0y + a0 χ1y + a1 · · · χn−1y + an−1).


Algorithm: [A] := RANK1 UNB VAR1(y, x, A)
Partition x → (xT; xB), A → (AL AR), where xT has 0 elements and AL has 0 columns
while m(xT) < m(x) do
   Repartition (xT; xB) → (x0; χ1; x2) and (AL AR) → (A0 a1 A2)
      a1 := χ1 y + a1
   Continue with (xT; xB) ← (x0; χ1; x2) and (AL AR) ← (A0 a1 A2)
endwhile

Algorithm: [A] := RANK1 UNB VAR2(y, x, A)
Partition y → (yT; yB), A → (AT; AB), where yT has 0 elements and AT has 0 rows
while m(yT) < m(y) do
   Repartition (yT; yB) → (y0; ψ1; y2) and (AT; AB) → (A0; aT1; A2)
      aT1 := ψ1 xT + aT1
   Continue with (yT; yB) ← (y0; ψ1; y2) and (AT; AB) ← (A0; aT1; A2)
endwhile

Figure 1.2: Rank-1 update algorithms for computing A := yxT + A. Left: by columns. Right: by rows.

Notice that each column is updated with an axpy operation. This suggests the following loop for computing A := yxT + A:

for j := 0, . . . , n−1
   aj := χj y + aj (axpy)
endfor

In Figure 1.2 (left), we present this algorithm using the FLAME notation (LAFF Notes Week 3 [29]).

Focusing on how A can be partitioned by rows, we find that

yxT + A = (ψ0xT; ψ1xT; … ; ψm−1xT) + (aT0; aT1; … ; aTm−1) = (ψ0xT + aT0; ψ1xT + aT1; … ; ψm−1xT + aTm−1).

Notice that each row is updated with an axpy operation. This suggests the following loop for computing A := yxT + A:

for i := 0, . . . , m−1
   aTi := ψi xT + aTi (axpy)
endfor

In Figure 1.2 (right), we present this algorithm using the FLAME notation (LAFF Notes Week 3 [29]).

Again, it is important to notice that this “matrix-vector” operation (rank-1 update) can be “layered” upon the axpy vector-vector operation.

Cost

A rank-1 update of an m×n matrix costs, approximately, 2mn flops. This can be easily argued in three different ways:

• The computation requires a multiply and an add with each element of the matrix. There are mn such elements.

• The operation can be computed one column at a time via n axpy operations, each of which requires 2m flops.

• The operation can be computed one row at a time via m axpy operations, each of which requires 2n flops.

1.5 Matrix-matrix multiplication (product)

Be sure to understand the relation between linear transformations and matrix-matrix multiplication (LAFF Notes Weeks 3 and 4 [29]).

We will now discuss the computation of C := AB + C, where C ∈ Cm×n, A ∈ Cm×k, and B ∈ Ck×n. (If one wishes to compute C := AB, one can always start by setting C := 0, the zero matrix.) This is the definition of the matrix-matrix product AB, sometimes referred to as a general matrix-matrix multiplication (gemm) when no special structure or properties of A and B are assumed.


1.5.1 Element-by-element computation

Let

C = the m×n matrix with (i, j) entry γi,j, A = the m×k matrix with (i, p) entry αi,p, and B = the k×n matrix with (p, j) entry βp,j.

Then

C := AB + C

means that each element of C is updated as

γi,j := ∑_{p=0}^{k−1} αi,pβp,j + γi,j, for i = 0, . . . , m−1 and j = 0, . . . , n−1.

This can be more elegantly stated by partitioning

A = (aT0; aT1; … ; aTm−1) and B = (b0 b1 · · · bn−1).

Then

C := AB + C = (aT0; aT1; … ; aTm−1)(b0 b1 · · · bn−1) + C,

so that the (i, j) entry of C is updated as γi,j := aTi bj + γi,j.


1.5.2 Via matrix-vector multiplications

Partition C = (c0 c1 · · · cn−1) and B = (b0 b1 · · · bn−1).

Then

C := AB + C = A(b0 b1 · · · bn−1) + (c0 c1 · · · cn−1) = (Ab0 + c0 Ab1 + c1 · · · Abn−1 + cn−1),

which shows that each column of C is updated with a matrix-vector multiplication, cj := Abj + cj:

for j := 0, . . . , n−1
   cj := Abj + cj (matrix-vector multiplication)
endfor

1.5.3 Via row-vector times matrix multiplications

Partition

C = (cT0; cT1; … ; cTm−1) and A = (aT0; aT1; … ; aTm−1).

Then

C := AB + C = (aT0; aT1; … ; aTm−1) B + (cT0; cT1; … ; cTm−1) = (aT0 B + cT0; aT1 B + cT1; … ; aTm−1 B + cTm−1),

which shows that each row of C is updated with a row-vector times matrix multiplication, cTi := aTi B + cTi:

for i := 0, . . . , m−1
   cTi := aTi B + cTi (row-vector times matrix multiplication)
endfor

1.5.4 Via rank-1 updates

Partition

A = (a0 a1 · · · ak−1) and B = (bT0; bT1; … ; bTk−1).


Then

C := AB + C = (a0 a1 · · · ak−1)(bT0; bT1; … ; bTk−1) + C
  = a0bT0 + a1bT1 + · · · + ak−1bTk−1 + C
  = ak−1bTk−1 + (· · · (a1bT1 + (a0bT0 + C)) · · ·),

which shows that C can be updated with a sequence of rank-1 updates, suggesting the loop

for p := 0, . . . , k−1
   C := ap bTp + C (rank-1 update)
endfor

1.5.5 Cost

A matrix-matrix multiplication C := AB, where C is m×n, A is m×k, and B is k×n, costs approximately 2mnk flops. This can be easily argued in four different ways:

• The computation requires a dot product for each element in C, at a cost of 2k flops per dot product. There are mn such elements.

• The operation can be computed one column at a time via n gemv operations, each of which requires 2mk flops.

• The operation can be computed one row at a time via m gemv operations (row-vector times matrix), each of which requires 2nk flops.

• The operation can be computed via k rank-1 updates, each of which requires 2mn flops.

1.6 Summarizing All

It is important to realize that almost all operations that we discussed in this note are special cases of matrix-matrix multiplication. This is summarized in Figure 1.3. A few notes:

• Row-vector times matrix is matrix-vector multiplication in disguise: if yT := xTA then y := ATx.

• For a similar reason yT := αxT + yT is an axpy in disguise: y := αx + y.

• The operation y := xα + y is the same as y := αx + y since multiplication of a vector by a scalar commutes.

• The operation γ := αβ + γ is known as a Multiply-ACcumulate (MAC) operation. Often floating-point hardware implements this as an integrated operation.


m      n      k      Label

large  large  large  gemm
large  large  1      ger
large  1      large  gemv
1      large  large  gemv
1      large  1      axpy
large  1      1      axpy
1      1      large  dot
1      1      1      MAC

Figure 1.3: Special shapes of gemm C := AB + C, where C, A, and B are m×n, m×k, and k×n matrices, respectively. (The illustrations of the shapes are omitted here.)

Observations made about operations with partitioned matrices and vectors can be summarized by partitioning matrices into blocks: Let

C = (C0,0 C0,1 · · · C0,N−1; C1,0 C1,1 · · · C1,N−1; … ; CM−1,0 CM−1,1 · · · CM−1,N−1),

A = (A0,0 A0,1 · · · A0,K−1; A1,0 A1,1 · · · A1,K−1; … ; AM−1,0 AM−1,1 · · · AM−1,K−1), and

B = (B0,0 B0,1 · · · B0,N−1; B1,0 B1,1 · · · B1,N−1; … ; BK−1,0 BK−1,1 · · · BK−1,N−1).

Then

C := AB + C, where each block of C is updated as

Ci,j := ∑_{p=0}^{K−1} Ai,pBp,j + Ci,j, for i = 0, . . . , M−1 and j = 0, . . . , N−1.

(Provided the partitionings of C, A, and B are “conformal”.) Notice that multiplication with partitioned matrices is exactly like regular matrix-matrix multiplication with scalar elements, except that multiplication of two blocks does not necessarily commute.
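A minimal Octave/Matlab sketch of such a blocked multiplication, assuming for simplicity that the (made-up) block size b evenly divides m, n, and k:

b = 2; m = 6; n = 6; k = 6;      % b divides m, n, and k
A = rand( m, k ); B = rand( k, n ); C = rand( m, n );
Cref = A * B + C;
for i = 1:b:m
  for j = 1:b:n
    for p = 1:b:k                % Ci,j := Ai,p Bp,j + Ci,j
      C( i:i+b-1, j:j+b-1 ) = A( i:i+b-1, p:p+b-1 ) * B( p:p+b-1, j:j+b-1 ) + C( i:i+b-1, j:j+b-1 );
    end
  end
end
norm( C - Cref )                 % ~0: blocked multiplication agrees with A * B + C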


Chapter 2. Notes on Vector and Matrix Norms

Video

Read disclaimer regarding the videos in the preface!

* YouTube
* Download from UT Box
* View After Local Download

(For help on viewing, see Appendix A.)


Outline

Video
Outline
2.1. Absolute Value
2.2. Vector Norms
  2.2.1. Vector 2-norm (length)
  2.2.2. Vector 1-norm
  2.2.3. Vector ∞-norm (infinity norm)
  2.2.4. Vector p-norm
2.3. Matrix Norms
  2.3.1. Frobenius norm
  2.3.2. Induced matrix norms
  2.3.3. Special cases used in practice
  2.3.4. Discussion
  2.3.5. Submultiplicative norms
2.4. An Application to Conditioning of Linear Systems
2.5. Equivalence of Norms


2.1 Absolute Value

Recall that if α ∈ C, then |α| equals its absolute value. In other words, if α = αr + iαc, then |α| = √(αr² + αc²) = √(ᾱα).

This absolute value function has the following properties:

• α ≠ 0 ⇒ |α| > 0 (| · | is positive definite),

• |αβ| = |α||β| (| · | is homogeneous), and

• |α + β| ≤ |α| + |β| (| · | obeys the triangle inequality).

2.2 Vector Norms

A (vector) norm extends the notion of an absolute value (length or size) to vectors:

Definition 2.1 Let ν : Cn → R. Then ν is a (vector) norm if for all x, y ∈ Cn and all α ∈ C

• x ≠ 0 ⇒ ν(x) > 0 (ν is positive definite),

• ν(αx) = |α|ν(x) (ν is homogeneous), and

• ν(x + y) ≤ ν(x) + ν(y) (ν obeys the triangle inequality).

Homework 2.2 Prove that if ν : Cn → R is a norm, then ν(0) = 0 (where the first 0 denotes the zero vector in Cn).

* SEE ANSWER

Note: often we will use ‖ · ‖ to denote a vector norm.

2.2.1 Vector 2-norm (length)

Definition 2.3 The vector 2-norm ‖ · ‖2 : Cn→ R is defined for x ∈ Cn by

‖x‖2 = √(xHx) = √(χ̄0χ0 + · · · + χ̄n−1χn−1) = √(|χ0|² + · · · + |χn−1|²).

To show that the vector 2-norm is a norm, we will need the following theorem:

Theorem 2.4 (Cauchy-Schwarz inequality) Let x, y ∈ Cn. Then |xHy| ≤ ‖x‖2‖y‖2.

Proof: Assume that x ≠ 0 and y ≠ 0, since otherwise the inequality is trivially true. We can then choose x̂ = x/‖x‖2 and ŷ = y/‖y‖2. This leaves us to prove that |x̂Hŷ| ≤ 1, with ‖x̂‖2 = ‖ŷ‖2 = 1.

Pick α ∈ C with |α| = 1 so that αx̂Hŷ is real and nonnegative. Note that since it is real, αx̂Hŷ = (αx̂Hŷ)‾ = ᾱŷHx̂.


Now,

0 ≤ ‖x̂ − αŷ‖2²
  = (x̂ − αŷ)H(x̂ − αŷ) (‖z‖2² = zHz)
  = x̂Hx̂ − ᾱŷHx̂ − αx̂Hŷ + ᾱαŷHŷ (multiplying out)
  = 1 − 2αx̂Hŷ + |α|² (‖x̂‖2 = ‖ŷ‖2 = 1 and αx̂Hŷ = (αx̂Hŷ)‾ = ᾱŷHx̂)
  = 2 − 2αx̂Hŷ (|α| = 1).

Thus 1 ≥ αx̂Hŷ and, taking the absolute value of both sides,

1 ≥ |αx̂Hŷ| = |α||x̂Hŷ| = |x̂Hŷ|,

which is the desired result. QED

Theorem 2.5 The vector 2-norm is a norm.

Proof: To prove this, we merely check whether the three conditions are met. Let x, y ∈ Cn and α ∈ C be arbitrarily chosen. Then

• x ≠ 0 ⇒ ‖x‖2 > 0 (‖ · ‖2 is positive definite):

Notice that x ≠ 0 means that at least one of its components is nonzero. Let’s assume that χj ≠ 0. Then

‖x‖2 = √(|χ0|² + · · · + |χn−1|²) ≥ √(|χj|²) = |χj| > 0.

• ‖αx‖2 = |α|‖x‖2 (‖ · ‖2 is homogeneous):

‖αx‖2 = √(|αχ0|² + · · · + |αχn−1|²) = √(|α|²|χ0|² + · · · + |α|²|χn−1|²) = √(|α|²(|χ0|² + · · · + |χn−1|²)) = |α|√(|χ0|² + · · · + |χn−1|²) = |α|‖x‖2.

• ‖x + y‖2 ≤ ‖x‖2 + ‖y‖2 (‖ · ‖2 obeys the triangle inequality):

‖x + y‖2² = (x + y)H(x + y) = xHx + yHx + xHy + yHy ≤ ‖x‖2² + 2‖x‖2‖y‖2 + ‖y‖2² = (‖x‖2 + ‖y‖2)²,

where the inequality follows from the Cauchy-Schwarz inequality. Taking the square root of both sides yields the desired result.

QED


2.2.2 Vector 1-norm

Definition 2.6 The vector 1-norm ‖ · ‖1 : Cn→ R is defined for x ∈ Cn by

‖x‖1 = |χ0|+ |χ1|+ · · ·+ |χn−1|.

Homework 2.7 The vector 1-norm is a norm.
* SEE ANSWER

The vector 1-norm is sometimes referred to as the “taxi-cab norm”. It is the distance that a taxi travels along the streets of a city that has square blocks.

2.2.3 Vector ∞-norm (infinity norm)

Definition 2.8 The vector ∞-norm ‖ · ‖∞ : Cn→ R is defined for x ∈ Cn by ‖x‖∞ = maxi |χi|.

Homework 2.9 The vector ∞-norm is a norm.
* SEE ANSWER

2.2.4 Vector p-norm

Definition 2.10 The vector p-norm ‖ · ‖p : Cn→ R is defined for x ∈ Cn by

‖x‖p = (|χ0|^p + |χ1|^p + · · · + |χn−1|^p)^(1/p).

Proving that the p-norm is a norm is a little tricky and not particularly relevant to this course. To prove the triangle inequality requires the following classical result:

Theorem 2.11 (Hölder inequality) Let x, y ∈ Cn and 1/p + 1/q = 1 with 1 ≤ p, q ≤ ∞. Then |xHy| ≤ ‖x‖p‖y‖q.

Clearly, the 1-norm and the 2-norm are special cases of the p-norm. Also, ‖x‖∞ = lim_{p→∞} ‖x‖p.
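All of these vector norms are computed by the built-in norm function in Octave or Matlab. A minimal sketch with a made-up complex vector:

x = [ 3; -4i; 1+1i ];
norm( x, 1 )                     % |chi_0| + ... + |chi_{n-1}|
norm( x, 2 )                     % sqrt( x' * x )
norm( x, inf )                   % max_i |chi_i|
norm( x, 3 )                     % the p-norm with p = 3
norm( x, 100 ) - norm( x, inf )  % nearly zero: ||x||_p -> ||x||_inf as p -> inf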

2.3 Matrix Norms

It is not hard to see that vector norms are all measures of how “big” the vectors are. Similarly, we want to have measures for how “big” matrices are. We will start with one that is somewhat artificial and then move on to the important class of induced matrix norms.

2.3.1 Frobenius norm

Definition 2.12 The Frobenius norm ‖ · ‖F : Cm×n→ R is defined for A ∈ Cm×n by

‖A‖F = √( ∑_{i=0}^{m−1} ∑_{j=0}^{n−1} |αi,j|² ).

Notice that one can think of the Frobenius norm as taking the columns of the matrix, stacking them on top of each other to create a vector of size mn, and then taking the vector 2-norm of the result.


Homework 2.13 Show that the Frobenius norm is a norm.
* SEE ANSWER

Similarly, other matrix norms can be created from vector norms by viewing the matrix as a vector. It turns out that, other than the Frobenius norm, these aren’t particularly interesting in practice.

2.3.2 Induced matrix norms

Definition 2.14 Let ‖ · ‖µ : Cm→ R and ‖ · ‖ν : Cn→ R be vector norms. Define ‖ · ‖µ,ν : Cm×n→ R by

‖A‖µ,ν = sup_{x ∈ Cn, x ≠ 0} ‖Ax‖µ/‖x‖ν.

Let us start by interpreting this. How “big” A is, as measured by ‖A‖µ,ν, is defined as the most that A magnifies the length of nonzero vectors, where the length of the vector x is measured with norm ‖ · ‖ν and the length of the transformed vector Ax is measured with norm ‖ · ‖µ.

Two comments are in order. First,

sup_{x ∈ Cn, x ≠ 0} ‖Ax‖µ/‖x‖ν = sup_{‖x‖ν=1} ‖Ax‖µ.

This follows immediately from the following sequence of equivalences:

sup_{x ≠ 0} ‖Ax‖µ/‖x‖ν = sup_{x ≠ 0} ‖Ax/‖x‖ν‖µ = sup_{x ≠ 0} ‖A(x/‖x‖ν)‖µ = sup_{y = x/‖x‖ν, x ≠ 0} ‖Ay‖µ = sup_{‖y‖ν=1} ‖Ay‖µ = sup_{‖x‖ν=1} ‖Ax‖µ.

Also, the “sup” (which stands for supremum) is used because we can’t claim yet that there is a vector x with ‖x‖ν = 1 for which

‖A‖µ,ν = ‖Ax‖µ.

The fact is that there is always such a vector x. The proof depends on a result from real analysis (sometimes called “advanced calculus”) that states that sup_{x∈S} f(x) is attained for some vector x ∈ S as long as f is continuous and S is a compact set. Since real analysis is not a prerequisite for this course, the reader may have to take this on faith!

We conclude that the following two definitions are equivalent definitions to the one we already gave:

Definition 2.15 Let ‖ · ‖µ : Cm→ R and ‖ · ‖ν : Cn→ R be vector norms. Define ‖ · ‖µ,ν : Cm×n→ R by

‖A‖µ,ν = max_{x ∈ Cn, x ≠ 0} ‖Ax‖µ/‖x‖ν.

and


Definition 2.16 Let ‖ · ‖µ : Cm→ R and ‖ · ‖ν : Cn→ R be vector norms. Define ‖ · ‖µ,ν : Cm×n→ R by

‖A‖µ,ν = max_{‖x‖ν=1} ‖Ax‖µ.

Theorem 2.17 ‖ · ‖µ,ν : Cm×n→ R is a norm.

Proof: To prove this, we merely check whether the three conditions are met. Let A, B ∈ Cm×n and α ∈ C be arbitrarily chosen. Then

• A ≠ 0 ⇒ ‖A‖µ,ν > 0 (‖ · ‖µ,ν is positive definite):

Notice that A ≠ 0 means that at least one of its columns is not a zero vector (since at least one element is nonzero). Let us assume it is the jth column, aj, that is nonzero. Then

‖A‖µ,ν = max_{x ≠ 0} ‖Ax‖µ/‖x‖ν ≥ ‖Aej‖µ/‖ej‖ν = ‖aj‖µ/‖ej‖ν > 0.

• ‖αA‖µ,ν = |α|‖A‖µ,ν (‖ · ‖µ,ν is homogeneous):

‖αA‖µ,ν = max_{x ≠ 0} ‖αAx‖µ/‖x‖ν = max_{x ≠ 0} |α|‖Ax‖µ/‖x‖ν = |α| max_{x ≠ 0} ‖Ax‖µ/‖x‖ν = |α|‖A‖µ,ν.

• ‖A + B‖µ,ν ≤ ‖A‖µ,ν + ‖B‖µ,ν (‖ · ‖µ,ν obeys the triangle inequality):

‖A + B‖µ,ν = max_{x ≠ 0} ‖(A + B)x‖µ/‖x‖ν = max_{x ≠ 0} ‖Ax + Bx‖µ/‖x‖ν
  ≤ max_{x ≠ 0} (‖Ax‖µ + ‖Bx‖µ)/‖x‖ν
  ≤ max_{x ≠ 0} (‖Ax‖µ/‖x‖ν + ‖Bx‖µ/‖x‖ν)
  ≤ max_{x ≠ 0} ‖Ax‖µ/‖x‖ν + max_{x ≠ 0} ‖Bx‖µ/‖x‖ν = ‖A‖µ,ν + ‖B‖µ,ν.

QED

2.3.3 Special cases used in practice

The most important case of ‖ · ‖µ,ν : Cm×n → R uses the same norm for ‖ · ‖µ and ‖ · ‖ν (except that m may not equal n).

Definition 2.18 Define ‖ · ‖p : Cm×n→ R by

‖A‖p = max_{x ∈ Cn, x ≠ 0} ‖Ax‖p/‖x‖p = max_{‖x‖p=1} ‖Ax‖p.


Theorem 2.19 Let A ∈ Cm×n and partition A = (a0 a1 · · · an−1). Then

‖A‖1 = max_{0≤j<n} ‖aj‖1.

Proof: Let J be chosen so that max_{0≤j<n} ‖aj‖1 = ‖aJ‖1. Then

max_{‖x‖1=1} ‖Ax‖1 = max_{‖x‖1=1} ‖(a0 a1 · · · an−1)(χ0; χ1; … ; χn−1)‖1
  = max_{‖x‖1=1} ‖χ0a0 + χ1a1 + · · · + χn−1an−1‖1
  ≤ max_{‖x‖1=1} (‖χ0a0‖1 + ‖χ1a1‖1 + · · · + ‖χn−1an−1‖1)
  = max_{‖x‖1=1} (|χ0|‖a0‖1 + |χ1|‖a1‖1 + · · · + |χn−1|‖an−1‖1)
  ≤ max_{‖x‖1=1} (|χ0|‖aJ‖1 + |χ1|‖aJ‖1 + · · · + |χn−1|‖aJ‖1)
  = max_{‖x‖1=1} (|χ0| + |χ1| + · · · + |χn−1|)‖aJ‖1
  = ‖aJ‖1.

Also,

‖aJ‖1 = ‖AeJ‖1 ≤ max_{‖x‖1=1} ‖Ax‖1.

Hence

‖aJ‖1 ≤ max_{‖x‖1=1} ‖Ax‖1 ≤ ‖aJ‖1,

which implies that

max_{‖x‖1=1} ‖Ax‖1 = ‖aJ‖1 = max_{0≤j<n} ‖aj‖1.

QED

Homework 2.20 Let A ∈ Cm×n and partition A = (aT0; aT1; … ; aTm−1). Show that

‖A‖∞ = max_{0≤i<m} ‖ai‖1 = max_{0≤i<m} (|αi,0| + |αi,1| + · · · + |αi,n−1|).

* SEE ANSWER

Notice that in the above exercise ai is really (aTi)T, since aTi is the label for the ith row of matrix A.

Homework 2.21 Let y ∈ Cm and x ∈ Cn. Show that ‖yxH‖2 = ‖y‖2‖x‖2.
* SEE ANSWER


2.3.4 Discussion

While ‖ · ‖2 is a very important matrix norm, it is in practice often difficult to compute. The matrix norms ‖ · ‖F, ‖ · ‖1, and ‖ · ‖∞ are more easily computed and hence more practical in many instances.
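A minimal Octave/Matlab sketch (with a made-up matrix) confirming that the 1-norm and ∞-norm are the maximal absolute column sum and row sum, and that the 2-norm, by contrast, requires the singular values that we encounter in the next chapter:

A = rand( 4, 3 ) + 1i * rand( 4, 3 );
norm( A, 1 ) - max( sum( abs( A ), 1 ) )            % ~0: max column sum
norm( A, inf ) - max( sum( abs( A ), 2 ) )          % ~0: max row sum
norm( A, 'fro' ) - sqrt( sum( abs( A( : ) ).^2 ) )  % ~0: Frobenius norm
norm( A, 2 ) - max( svd( A ) )                      % ~0: largest singular value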

2.3.5 Submultiplicative norms

Definition 2.22 A matrix norm ‖ · ‖ν : Cm×n → R is said to be submultiplicative (consistent) if it also satisfies

‖AB‖ν ≤ ‖A‖ν‖B‖ν.

Theorem 2.23 Let ‖ · ‖ν : Cn → R be a vector norm and, given any matrix C ∈ Cm×n, define the corresponding induced matrix norm as

‖C‖ν = max_{x ≠ 0} ‖Cx‖ν/‖x‖ν = max_{‖x‖ν=1} ‖Cx‖ν.

Then for any A ∈ Cm×k and B ∈ Ck×n the inequality ‖AB‖ν ≤ ‖A‖ν‖B‖ν holds.

In other words, induced matrix norms are submultiplicative. To prove the above, it helps to first prove a simpler result:

Lemma 2.24 Let ‖ · ‖ν : Cn → R be a vector norm and, given any matrix C ∈ Cm×n, define the induced matrix norm as

‖C‖ν = max_{x ≠ 0} ‖Cx‖ν/‖x‖ν = max_{‖x‖ν=1} ‖Cx‖ν.

Then for any A ∈ Cm×n and y ∈ Cn the inequality ‖Ay‖ν ≤ ‖A‖ν‖y‖ν holds.

Proof: If y = 0, the result obviously holds since then ‖Ay‖ν = 0 and ‖y‖ν = 0. Let y ≠ 0. Then

‖A‖ν = max_{x ≠ 0} ‖Ax‖ν/‖x‖ν ≥ ‖Ay‖ν/‖y‖ν.

Rearranging this yields ‖Ay‖ν ≤ ‖A‖ν‖y‖ν. QED

We can now prove the theorem:

Proof:

‖AB‖ν = max_{‖x‖ν=1} ‖ABx‖ν = max_{‖x‖ν=1} ‖A(Bx)‖ν ≤ max_{‖x‖ν=1} ‖A‖ν‖Bx‖ν ≤ max_{‖x‖ν=1} ‖A‖ν‖B‖ν‖x‖ν = ‖A‖ν‖B‖ν.

QED

Homework 2.25 Show that ‖Ax‖µ ≤ ‖A‖µ,ν‖x‖ν.
* SEE ANSWER

Homework 2.26 Show that ‖AB‖µ ≤ ‖A‖µ,ν‖B‖ν.
* SEE ANSWER

Homework 2.27 Show that the Frobenius norm, ‖ · ‖F, is submultiplicative.
* SEE ANSWER


2.4 An Application to Conditioning of Linear Systems

A question we will run into later in the course asks how accurate we can expect the solution of a linear system to be if the right-hand side of the system has error in it.

Formally, this can be stated as follows: We wish to solve Ax = b, where A ∈ Cm×m, but the right-hand side has been perturbed by a small vector so that it becomes b + δb. (Notice how the δ touches the b. This is meant to convey that this is a symbol that represents a vector rather than the vector b multiplied by a scalar δ.) The question now is how a relative error in b propagates into a potential error in the solution x.

This is summarized as follows:

Ax = b Exact equation

A(x+δx) = b+δb Perturbed equation

We would like to determine a formula, κ(A, b, δb), that tells us how much a relative error in b is potentially amplified into a relative error in the solution x:

‖δx‖/‖x‖ ≤ κ(A, b, δb) ‖δb‖/‖b‖.

We will assume that A has an inverse. To find an expression for κ(A, b, δb), we notice that subtracting Ax = b from A(x + δx) = b + δb yields Aδx = δb, and from this

Ax = b and δx = A−1δb.

If we now use a vector norm ‖ · ‖ and its induced matrix norm ‖ · ‖, then

‖b‖ = ‖Ax‖ ≤ ‖A‖‖x‖ and ‖δx‖ = ‖A−1δb‖ ≤ ‖A−1‖‖δb‖.

From this we conclude that

1/‖x‖ ≤ ‖A‖ (1/‖b‖) and ‖δx‖ ≤ ‖A−1‖‖δb‖,

so that

‖δx‖/‖x‖ ≤ ‖A‖‖A−1‖ ‖δb‖/‖b‖.

Thus, the desired expression κ(A, b, δb) doesn’t depend on anything but the matrix A:

‖δx‖/‖x‖ ≤ κ(A) ‖δb‖/‖b‖, where κ(A) = ‖A‖‖A−1‖.


κ(A) = ‖A‖‖A−1‖ is called the condition number of matrix A.

A question becomes whether this is a pessimistic result or whether there are examples of b and δb for which the relative error in b is amplified by exactly κ(A). The answer is, unfortunately, “yes!”, as we will show next.

Notice that

• There is an x̂ for which

‖A‖ = max_{‖x‖=1} ‖Ax‖ = ‖Ax̂‖,

namely the x for which the maximum is attained. Pick b = Ax̂.

• There is a δb for which

‖A−1‖ = max_{x ≠ 0} ‖A−1x‖/‖x‖ = ‖A−1δb‖/‖δb‖,

again, the x for which the maximum is attained.

It is when solving the perturbed system

A(x+δx) = b+ δb

that the maximal magnification by κ(A) is attained.

Homework 2.28 Let ‖ · ‖ be a matrix norm induced by a vector norm ‖ · ‖. Show that κ(A) = ‖A‖‖A−1‖ ≥ 1.
* SEE ANSWER

This last exercise shows that there will always be choices for b and δb for which the relative error is at best directly translated into an equal relative error in the solution (if κ(A) = 1).
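A minimal Octave/Matlab sketch of this worst case (the matrix and the perturbation direction are made up so that the bound is attained):

A = [ 1 0; 0 0.001 ];                  % kappa(A) = 1000 in the 2-norm
b = [ 1; 0 ]; x = A \ b;
deltab = [ 0; 1e-6 ];                  % perturbation in the most sensitive direction
deltax = A \ ( b + deltab ) - x;
kappa = norm( A ) * norm( inv( A ) )   % = 1000
( norm( deltax ) / norm( x ) ) / ( norm( deltab ) / norm( b ) )  % also 1000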

2.5 Equivalence of Norms

Many results we encounter show that the norm of a particular vector or matrix is small. Obviously, it would be unfortunate if a vector or matrix is large in one norm and small in another norm. The following result shows that, modulo a constant, all norms are equivalent. Thus, if the vector is small in one norm, it is small in other norms as well.

Theorem 2.29 Let ‖ · ‖µ : Cn → R and ‖ · ‖ν : Cn → R be vector norms. Then there exist positive constants αµ,ν and βµ,ν such that for all x ∈ Cn

αµ,ν‖x‖µ ≤ ‖x‖ν ≤ βµ,ν‖x‖µ.

A similar result holds for matrix norms:

Theorem 2.30 Let ‖ · ‖µ : Cm×n → R and ‖ · ‖ν : Cm×n → R be matrix norms. Then there exist positive constants αµ,ν and βµ,ν such that for all A ∈ Cm×n

αµ,ν‖A‖µ ≤ ‖A‖ν ≤ βµ,ν‖A‖µ.


Chapter 3. Notes on Orthogonality and the Singular Value Decomposition

The reader may wish to review Weeks 9-11 of

Linear Algebra: Foundations to Frontiers - Notes to LAFF With [29].

Video

Read disclaimer regarding the videos in the preface!

* YouTube Part 1
* YouTube Part 2
* Download Part 1 from UT Box
* Download Part 2 from UT Box
* View Part 1 After Local Download
* View Part 2 After Local Download

(For help on viewing, see Appendix A.)


Outline

Video
Outline
3.1. Orthogonality and Unitary Matrices
3.2. Toward the SVD
3.3. The Theorem
3.4. Geometric Interpretation
3.5. Consequences of the SVD Theorem
3.6. Projection onto the Column Space
3.7. Low-rank Approximation of a Matrix
3.8. An Application
3.9. SVD and the Condition Number of a Matrix
3.10. An Algorithm for Computing the SVD?


3.1 Orthogonality and Unitary Matrices

Definition 3.1 Let u,v ∈ Cm. These vectors are orthogonal (perpendicular) if uHv = 0.

Definition 3.2 Let q0, q1, . . . , qn−1 ∈ Cm. These vectors are said to be mutually orthonormal if for all 0 ≤ i, j < n

qHi qj = 1 if i = j, and 0 otherwise.

Notice that for n vectors of length m to be mutually orthonormal, n must be less than or equal to m. This is because n mutually orthonormal vectors are linearly independent and there can be at most m linearly independent vectors of length m.

Definition 3.3 Let Q ∈ Cm×n (with n ≤ m). Then Q is said to be an orthonormal matrix if QHQ = I (the identity).

Homework 3.4 Let Q ∈ Cm×n (with n ≤ m). Partition Q = (q0 q1 · · · qn−1). Show that Q is an orthonormal matrix if and only if q0, q1, . . . , qn−1 are mutually orthonormal.
* SEE ANSWER

Definition 3.5 Let Q ∈ Cm×m. Then Q is said to be a unitary matrix if QHQ = I (the identity).

Notice that unitary matrices are always square and only square matrices can be unitary. Sometimes the term orthogonal matrix is used instead of unitary matrix, especially if the matrix is real valued.

Homework 3.6 Let Q ∈ Cm×m. Show that if Q is unitary then Q−1 = QH and QQH = I.
* SEE ANSWER

Homework 3.7 Let Q0, Q1 ∈ Cm×m both be unitary. Show that their product, Q0Q1, is unitary.
* SEE ANSWER

Homework 3.8 Let Q0, Q1, . . . , Qk−1 ∈ Cm×m all be unitary. Show that their product, Q0Q1 · · · Qk−1, is unitary.

* SEE ANSWER

The following is a very important observation: Let Q be a unitary matrix with

Q = (q0 q1 · · · qm−1).

Let x ∈ Cm. Then

x = QQHx = (q0 q1 · · · qm−1)(q0 q1 · · · qm−1)Hx
  = (q0 q1 · · · qm−1)(qH0 x; qH1 x; … ; qHm−1 x)
  = (qH0 x)q0 + (qH1 x)q1 + · · · + (qHm−1 x)qm−1.

What does this mean?

• The vector x = (χ0; χ1; … ; χm−1) gives the coefficients when the vector x is written as a linear combination of the unit basis vectors:

x = χ0e0 + χ1e1 + · · · + χm−1em−1.

• The vector

QHx = (qH0 x; qH1 x; … ; qHm−1 x)

gives the coefficients when the vector x is written as a linear combination of the orthonormal vectors q0, q1, . . . , qm−1:

x = (qH0 x)q0 + (qH1 x)q1 + · · · + (qHm−1 x)qm−1.

• The vector (qHi x)qi equals the component of x in the direction of vector qi.

Another way of looking at this is that if q0, q1, . . . , qm−1 is an orthonormal basis for Cm, then any x ∈ Cm can be written as a linear combination of these vectors:

x = α0q0 + α1q1 + · · · + αm−1qm−1.

Now,

qHi x = qHi (α0q0 + α1q1 + · · · + αi−1qi−1 + αiqi + αi+1qi+1 + · · · + αm−1qm−1)
  = α0 qHi q0 + α1 qHi q1 + · · · + αi−1 qHi qi−1 + αi qHi qi + αi+1 qHi qi+1 + · · · + αm−1 qHi qm−1
  = αi,

since qHi qj = 0 for j ≠ i and qHi qi = 1.

Thus qHi x = αi, the coefficient that multiplies qi.
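A minimal Octave/Matlab sketch (a unitary Q is obtained here from the QR factorization of a made-up matrix):

m = 4;
[ Q, R ] = qr( rand( m, m ) + 1i * rand( m, m ) );  % Q is unitary
x = rand( m, 1 );
alpha = Q' * x;        % coefficients of x in the basis q0, ..., q_{m-1}
norm( x - Q * alpha )  % ~0: x = alpha_0 q0 + ... + alpha_{m-1} q_{m-1}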


Homework 3.9 Let U ∈ Cm×m be unitary and x ∈ Cm. Show that ‖Ux‖2 = ‖x‖2.
* SEE ANSWER

Homework 3.10 Let U ∈ Cm×m and V ∈ Cn×n be unitary matrices and A ∈ Cm×n. Show that

‖UA‖2 = ‖AV‖2 = ‖A‖2.

Homework 3.11 Let U ∈ Cm×m and V ∈ Cn×n be unitary matrices and A ∈ Cm×n. Show that ‖UA‖F = ‖AV‖F = ‖A‖F.

* SEE ANSWER

3.2 Toward the SVD

Lemma 3.12 Given A ∈ Cm×n there exist a unitary U ∈ Cm×m, a unitary V ∈ Cn×n, and D ∈ Rm×n such that A = UDVH, where D = (DTL 0; 0 0) with DTL = diag(δ0, · · · , δr−1) and δi > 0 for 0 ≤ i < r.

Proof: First, let us observe that if A = 0 (the zero matrix) then the theorem trivially holds: A = UDVH where U = Im×m, V = In×n, and D = 0, so that DTL is 0×0. Thus, w.l.o.g. assume that A ≠ 0.

We will prove this for m ≥ n, leaving the case where m ≤ n as an exercise, employing a proof by induction on n.

• Base case: n = 1. In this case A = (a0), where a0 ∈ Cm is its only column. By assumption, a0 ≠ 0. Then

A = (a0) = (u0)(‖a0‖2)(1)H,

where u0 = a0/‖a0‖2. Choose U1 ∈ Cm×(m−1) so that U = (u0 U1) is unitary. Then

A = (a0) = (u0)(‖a0‖2)(1)H = (u0 U1)(‖a0‖2; 0)(1)H = UDVH,

where DTL = (δ0) = (‖a0‖2) and V = (1).

• Inductive step: Assume the result is true for all matrices with 1 ≤ k < n columns. Show that it is true for matrices with n columns.

Let A ∈ Cm×n with n ≥ 2. W.l.o.g., A ≠ 0 so that ‖A‖2 ≠ 0. Let δ0 and v0 ∈ Cn have the property that ‖v0‖2 = 1 and δ0 = ‖Av0‖2 = ‖A‖2. (In other words, v0 is the vector that maximizes max_{‖x‖2=1} ‖Ax‖2.) Let u0 = Av0/δ0. Note that ‖u0‖2 = 1. Choose U1 ∈ Cm×(m−1) and V1 ∈ Cn×(n−1) so that U = (u0 U1) and V = (v0 V1) are unitary. Then

UHAV = (u0 U1)H A (v0 V1) = (uH0 Av0 uH0 AV1; UH1 Av0 UH1 AV1) = (δ0uH0 u0 uH0 AV1; δ0UH1 u0 UH1 AV1) = (δ0 wH; 0 B),

where w = VH1 AHu0 and B = UH1 AV1. Now, we will argue that w = 0, the zero vector of appropriate size:

δ0² = ‖A‖2² = ‖UHAV‖2² = max_{x ≠ 0} ‖UHAV x‖2²/‖x‖2²
  ≥ ‖(δ0 wH; 0 B)(δ0; w)‖2² / ‖(δ0; w)‖2²
  = ‖(δ0² + wHw; Bw)‖2² / (δ0² + wHw)
  ≥ (δ0² + wHw)² / (δ0² + wHw)
  = δ0² + wHw.

Thus δ0² ≥ δ0² + wHw, which means that w = 0 and UHAV = (δ0 0; 0 B).

By the induction hypothesis, there exist unitary Ǔ ∈ C(m−1)×(m−1), unitary V̌ ∈ C(n−1)×(n−1), and Ď ∈ R(m−1)×(n−1) such that B = ǓĎV̌H, where Ď = (ĎTL 0; 0 0) with ĎTL = diag(δ1, · · · , δr−1).

Now, let

U := U (1 0; 0 Ǔ), V := V (1 0; 0 V̌), and D := (δ0 0; 0 Ď).

(Note the “check” accents on Ǔ, V̌, and Ď in these definitions; they are easy to miss.) Then A = UDVH, where U, V, and D have the desired properties.

• By the Principle of Mathematical Induction the result holds for all matrices A ∈ Cm×n with m ≥ n.

Homework 3.13 Let D = diag(δ0, . . . , δn−1). Show that ‖D‖2 = max_{0≤i<n} |δi|.

* SEE ANSWER


Homework 3.14 Let A = (AT; 0), the matrix AT stacked on top of a zero matrix. Use the SVD of AT to show that ‖A‖2 = ‖AT‖2.
* SEE ANSWER

Homework 3.15 Let U ∈ Cm×m and V ∈ Cn×n be unitary matrices, and let A, B ∈ Cm×n with B = UAVH. Show that the singular values of A equal the singular values of B.

* SEE ANSWER

Homework 3.16 Let A ∈ Cm×n with A = (σ0 0; 0 B) and assume that ‖A‖2 = σ0. Show that ‖B‖2 ≤ ‖A‖2. (Hint: Use the SVD of B.)
* SEE ANSWER

Homework 3.17 Prove Lemma 3.12 for m ≤ n.
* SEE ANSWER

3.3 The Theorem

Theorem 3.18 (Singular Value Decomposition) Given A ∈ Cm×n there exist unitary U ∈ Cm×m, unitary V ∈ Cn×n, and Σ ∈ Rm×n such that A = UΣVH, where Σ = (ΣTL 0; 0 0) with ΣTL = diag(σ0, · · · , σr−1) and σ0 ≥ σ1 ≥ · · · ≥ σr−1 > 0. The σ0, . . . , σr−1 are known as the singular values of A.

Proof: Notice that the proof of the above theorem is identical to that of Lemma 3.12. However, thanks to the above exercises, we can conclude that ‖B‖2 ≤ σ0 in the proof, which then can be used to show that the singular values are found in order.

Proof: (Alternative) An alternative proof uses Lemma 3.12 to conclude that A = UDVH. If the entries on the diagonal of D are not ordered from largest to smallest, then this can be fixed by permuting the rows and columns of D, and correspondingly permuting the columns of U and V.

3.4 Geometric Interpretation

We will now quickly illustrate what the SVD Theorem tells us about matrix-vector multiplication (linear transformations) by examining the case where A ∈ R2×2. Let A = UΣVT be its SVD. (Notice that all matrices are now real valued, and hence VH = VT.) Partition

A = (u0 u1)(σ0 0; 0 σ1)(v0 v1)T.

Since U and V are unitary matrices, u0, u1 and v0, v1 form orthonormal bases for the range and domain of A, respectively:


R2: Domain of A. R2: Range (codomain) of A.

Let us manipulate the decomposition a little:

A = (u0 u1)(σ0 0; 0 σ1)(v0 v1)T = (σ0u0 σ1u1)(v0 v1)T.

Now let us look at how A transforms v0 and v1:

Av0 = (σ0u0 σ1u1)(v0 v1)T v0 = (σ0u0 σ1u1)(1; 0) = σ0u0,

and similarly Av1 = σ1u1. This motivates the pictures

R2: Domain of A. R2: Range (codomain) of A.


Now let us look at how A transforms any vector with (Euclidean) unit length. Notice that x = (χ0; χ1) means that

x = χ0e0 + χ1e1,

where e0 and e1 are the unit basis vectors. Thus, χ0 and χ1 are the coefficients when x is expressed using e0 and e1 as basis. However, we can also express x in the basis given by v0 and v1:

x = VVTx = (v0 v1)(v0 v1)Tx = (v0 v1)(vT0 x; vT1 x) = (vT0 x)v0 + (vT1 x)v1 = α0v0 + α1v1 = (v0 v1)(α0; α1),

where α0 = vT0 x and α1 = vT1 x.

Thus, in the basis formed by v0 and v1, its coefficients are α0 and α1. Now,

Ax = (σ0u0 σ1u1)(v0 v1)T x = (σ0u0 σ1u1)(v0 v1)T (v0 v1)(α0; α1) = (σ0u0 σ1u1)(α0; α1) = α0σ0u0 + α1σ1u1.

This is illustrated by the following picture, which also captures the fact that the unit ball is mapped to an “ellipse”¹ with major axis equal to σ0 = ‖A‖2 and minor axis equal to σ1:

R2: Domain of A. R2: Range (codomain) of A.

¹ It is not clear that it is actually an ellipse and this is not important to our observations.


Finally, we show the same insights for general vector x (not necessarily of unit length).

R2: Domain of A. R2: Range (codomain) of A.

Another observation is that if one picks the right basis for the domain and codomain, then the computation Ax simplifies to a matrix multiplication with a diagonal matrix. Let us again illustrate this for nonsingular A ∈ R2×2 with

A = (u0 u1)(σ0 0; 0 σ1)(v0 v1)T = UΣVT.

Now, if we choose to express y using u0 and u1 as the basis and express x using v0 and v1 as the basis, then

y = UUTy = (uT0 y)u0 + (uT1 y)u1, with coefficient vector ŷ = UTy = (ψ̂0; ψ̂1), and
x = VVTx = (vT0 x)v0 + (vT1 x)v1, with coefficient vector x̂ = VTx = (χ̂0; χ̂1).

If y = Ax then

Uŷ = UUTy = UΣVTx = UΣx̂,

so that ŷ = Σx̂ and (ψ̂0; ψ̂1) = (σ0χ̂0; σ1χ̂1).

These observations generalize to A ∈ Cm×m.
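A minimal Octave/Matlab sketch of this geometry for a made-up 2×2 matrix:

A = rand( 2, 2 );
[ U, Sigma, V ] = svd( A );
norm( A * V( :, 1 ) - Sigma( 1, 1 ) * U( :, 1 ) )  % ~0: A v0 = sigma0 u0
norm( A * V( :, 2 ) - Sigma( 2, 2 ) * U( :, 2 ) )  % ~0: A v1 = sigma1 u1
norm( A ) - Sigma( 1, 1 )                          % ~0: ||A||_2 = sigma0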


3.5 Consequences of the SVD Theorem

Throughout this section we will assume that

• A =UΣV H is the SVD of A ∈ Cm×n, with U and V unitary and Σ diagonal.

• Σ = (ΣTL 0; 0 0), where ΣTL = diag(σ0, . . . , σr−1) with σ0 ≥ σ1 ≥ . . . ≥ σr−1 > 0.

• U = (UL UR) with UL ∈ Cm×r.

• V = (VL VR) with VL ∈ Cn×r.

We first generalize the observations we made for A ∈ R2×2. Let us track what the effect of Ax = UΣVHx is on vector x. We assume that m ≥ n.

• Let U = (u0 · · · um−1) and V = (v0 · · · vn−1).

• Let

x = VVHx = (v0 · · · vn−1)(vH0 x; … ; vHn−1 x) = (vH0 x)v0 + · · · + (vHn−1 x)vn−1.

This can be interpreted as follows: vector x can be written in terms of the usual basis of Cn as χ0e0 + · · · + χn−1en−1, or in the orthonormal basis formed by the columns of V as (vH0 x)v0 + · · · + (vHn−1 x)vn−1.

• Notice that Ax = A((vH0 x)v0 + · · · + (vHn−1 x)vn−1) = (vH0 x)Av0 + · · · + (vHn−1 x)Avn−1, so that we next look at how A transforms each vi: Avi = UΣVHvi = UΣei = σiUei = σiui.

• Thus, another way of looking at Ax is

Ax = (vH0 x)Av0 + · · · + (vHn−1 x)Avn−1
  = (vH0 x)σ0u0 + · · · + (vHn−1 x)σn−1un−1
  = σ0u0vH0 x + · · · + σn−1un−1vHn−1 x
  = (σ0u0vH0 + · · · + σn−1un−1vHn−1) x.

Corollary 3.19 A = ULΣTLVHL. This is called the reduced SVD of A.

Proof:

A = UΣVH = (UL UR)(ΣTL 0; 0 0)(VL VR)H = ULΣTLVHL.


Corollary 3.20 Let A = ULΣTLVHL be the reduced SVD with

UL = (u0 · · · ur−1), ΣTL = diag(σ0, . . . , σr−1), and VL = (v0 · · · vr−1).

Then

A = σ0u0vH0 + σ1u1vH1 + · · · + σr−1ur−1vHr−1.

(Each term is nonzero and an outer product, and hence a rank-1 matrix.)

Proof: We leave the proof as an exercise.
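A minimal Octave/Matlab sketch of Corollaries 3.19 and 3.20 (the matrix is made up so that its rank is smaller than its dimensions):

A = rand( 5, 3 ) * rand( 3, 4 );  % a 5x4 matrix of rank (at most) 3
[ U, Sigma, V ] = svd( A );
r = rank( A );
UL = U( :, 1:r ); SigmaTL = Sigma( 1:r, 1:r ); VL = V( :, 1:r );
norm( A - UL * SigmaTL * VL' )    % ~0: the reduced SVD reproduces A
B = zeros( size( A ) );
for i = 1:r                       % accumulate the rank-1 terms sigma_i u_i v_i^H
  B = B + Sigma( i, i ) * U( :, i ) * V( :, i )';
end
norm( A - B )                     % ~0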

Corollary 3.21 C (A) = C (UL).

Proof:

• Let y ∈ C(A). Then there exists x ∈ Cn such that y = Ax (by the definition of C(A)). But then

y = Ax = UL(ΣTLVHL x) = ULz,

i.e., there exists z ∈ Cr such that y = ULz. This means y ∈ C(UL).

• Let y ∈ C(UL). Then there exists z ∈ Cr such that y = ULz. But then

y = ULz = UL(ΣTLΣ−1TL)z = ULΣTL(VHL VL)Σ−1TL z = A(VLΣ−1TL z) = Ax,

so that there exists x ∈ Cn such that y = Ax (namely x = VLΣ−1TL z), i.e., y ∈ C(A).

Corollary 3.22 Let A = ULΣTLVHL be the reduced SVD of A, where UL and VL have r columns. Then the rank of A is r.

Proof: The rank of A equals the dimension of C (A) = C (UL). But the dimension of C (UL) is clearly r.

Corollary 3.23 N (A) = C (VR).

Proof:

• Let x ∈ N(A). Then

x = VVHx = (VL VR)(VL VR)Hx = (VL VR)(VHL x; VHR x) = VLVHL x + VRVHR x.

If we can show that VHL x = 0, then x = VRz where z = VHR x. Assume that VHL x ≠ 0. Then ΣTL(VHL x) ≠ 0 (since ΣTL is nonsingular) and UL(ΣTL(VHL x)) ≠ 0 (since UL has linearly independent columns). But that contradicts the fact that Ax = ULΣTLVHL x = 0.


• Let x ∈ C(VR). Then x = VRz for some z ∈ Cn−r, and Ax = ULΣTL(VHL VR)z = 0 since VHL VR = 0.

Corollary 3.24 For all x ∈ Cn there exists z ∈ C (VL) such that Ax = Az.

Proof:

Ax = A(VVHx) = A(VLVHL x + VRVHR x) = AVLVHL x + AVRVHR x = AVLVHL x + ULΣTL(VHL VR)VHR x = A(VLVHL x) = Az, where z = VLVHL x.

Alternative proof (which uses the last corollary):

Ax = A(VLVHL x + VRVHR x) = AVLVHL x + A(VRVHR x) = A(VLVHL x) = Az,

since VRVHR x ∈ N(A).

The proof of the last corollary also shows that

Corollary 3.25 Any vector x ∈ Cn can be written as x = z + xn, where z ∈ C(VL) and xn ∈ N(A) = C(VR).

Corollary 3.26 AH = VLΣTLUHL, so that C(AH) = C(VL) and N(AH) = C(UR).

The above corollaries are summarized in Figure 3.1.

Theorem 3.27 Let A ∈ Cn×n be nonsingular. Let A =UΣV H be its SVD. Then

1. The SVD is the reduced SVD.

2. σn−1 ≠ 0.

3. If

U = (u0 · · · un−1), Σ = diag(σ0, . . . , σn−1), and V = (v0 · · · vn−1),

then

A−1 = (VPT)(PΣ−1PT)(UPT)H = (vn−1 · · · v0) diag(1/σn−1, . . . , 1/σ0)(un−1 · · · u0)H,

where P is the permutation matrix with ones on its antidiagonal, so that Px reverses the order of the entries in x. (Note: for this permutation matrix, PT = P. In general, this is not the case. What is the case for all permutation matrices P is that PTP = PPT = I.)


Figure 3.1: A pictorial description of how x = z + xn is transformed by A ∈ Cm×n into y = Ax = A(z + xn). In the domain Cn, C(AH) = C(VL) has dimension r and N(A) = C(VR) has dimension n − r; in the codomain Cm, C(A) = C(UL) has dimension r and N(AH) = C(UR) has dimension m − r. We see that C(VL) and C(VR) are orthogonal complements of each other within Cn. Similarly, C(UL) and C(UR) are orthogonal complements of each other within Cm. Any vector x can be written as the sum of a vector z ∈ C(VL) and xn ∈ C(VR) = N(A). (The diagram itself is omitted here.)

4. ‖A−1‖2 = 1/σn−1.

Proof: The only item that is less than totally obvious is (3). Clearly A−1 = VΣ−1UH. The problem is that in Σ−1 the diagonal entries are not ordered from largest to smallest. The permutation fixes this.

Corollary 3.28 If A ∈ Cm×n has linearly independent columns then AHA is invertible (nonsingular) and (AHA)−1 = VL(Σ²TL)−1VHL.

Proof: Since A has linearly independent columns, A = ULΣTLVHL is the reduced SVD, where UL has n columns and VL is unitary. Hence

AHA = (ULΣTLVHL)H(ULΣTLVHL) = VLΣHTL(UHL UL)ΣTLVHL = VLΣTLΣTLVHL = VLΣ²TLVHL.

Since VL is unitary and ΣTL is diagonal with nonzero diagonal entries, they are both nonsingular. Thus

(VLΣ²TLVHL)(VL(Σ²TL)−1VHL) = I.

This means AHA is invertible and (AHA)−1 is as given.


3.6 Projection onto the Column Space

Definition 3.29 Let UL ∈ Cm×k have orthonormal columns. The projection of a vector y ∈ Cm onto C(UL) is the vector ULx that minimizes ‖y − ULx‖2, where x ∈ Ck. We will also call this vector the component of y in C(UL).

Theorem 3.30 Let UL ∈ Cm×k have orthonormal columns. The projection of y onto C(UL) is given by ULUHL y.

Proof: The vector ULx that we want must satisfy

‖ULx − y‖2 = min_{w ∈ Ck} ‖ULw − y‖2.

Now, the 2-norm is invariant under multiplication by the unitary matrix UH = (UL UR)H:

‖ULx − y‖2² = min_{w ∈ Ck} ‖ULw − y‖2²
  = min_{w ∈ Ck} ‖UH(ULw − y)‖2² (since the 2-norm is preserved)
  = min_{w ∈ Ck} ‖(UL UR)H(ULw − y)‖2²
  = min_{w ∈ Ck} ‖(UHL ULw; UHR ULw) − (UHL y; UHR y)‖2²
  = min_{w ∈ Ck} ‖(w; 0) − (UHL y; UHR y)‖2²
  = min_{w ∈ Ck} ‖(w − UHL y; −UHR y)‖2²
  = min_{w ∈ Ck} (‖w − UHL y‖2² + ‖UHR y‖2²) (since ‖(u; v)‖2² = ‖u‖2² + ‖v‖2²)
  = (min_{w ∈ Ck} ‖w − UHL y‖2²) + ‖UHR y‖2².

This is minimized when w = UHL y. Thus, the vector in C(UL) that is closest to y is given by ULx = ULUHL y.


Corollary 3.31 Let A ∈ Cm×n and let A = ULΣTLVHL be its reduced SVD. Then the projection of y ∈ Cm onto C(A) is given by ULUHL y.

Proof: This follows immediately from the fact that C(A) = C(UL).

Corollary 3.32 Let A ∈ Cm×n have linearly independent columns. Then the projection of y ∈ Cm onto C(A) is given by A(AHA)−1AHy.

Proof: From Corollary 3.28, we know that AHA is nonsingular and that (AHA)−1 = VL(Σ²TL)−1VHL. Now,

A(AHA)−1AHy = (ULΣTLVHL)(VL(Σ²TL)−1VHL)(ULΣTLVHL)H y
  = ULΣTL(VHL VL)Σ−1TLΣ−1TL(VHL VL)ΣTLUHL y = ULUHL y.

Hence the projection of y onto C(A) is given by A(AHA)−1AHy.

Definition 3.33 Let A have linearly independent columns. Then (AHA)−1AH is called the pseudo-inverse or Moore-Penrose generalized inverse of matrix A.
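A minimal Octave/Matlab sketch comparing the two expressions for the projection (the matrix is made up; with probability 1 its columns are linearly independent):

A = rand( 5, 3 );
y = rand( 5, 1 );
[ U, Sigma, V ] = svd( A );
UL = U( :, 1:3 );                      % an orthonormal basis for C(A)
p1 = UL * ( UL' * y );                 % projection via Theorem 3.30
p2 = A * ( ( A' * A ) \ ( A' * y ) );  % projection via the pseudo-inverse
norm( p1 - p2 )                        % ~0
norm( A' * ( y - p1 ) )                % ~0: the residual is orthogonal to C(A)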

3.7 Low-rank Approximation of a Matrix

Theorem 3.34 Let A ∈ Cm×n have SVD A =UΣV H and assume A has rank r. Partition

U = (UL UR), V = (VL VR), and Σ = (ΣTL 0; 0 ΣBR),

where UL ∈ Cm×k, VL ∈ Cn×k, and ΣTL ∈ Rk×k with k ≤ r. Then B = ULΣTLVHL is the matrix in Cm×n closest to A in the following sense:

‖A − B‖2 = min_{C ∈ Cm×n, rank(C) ≤ k} ‖A − C‖2 = σk.

Proof: First, if B is as defined, then clearly ‖A − B‖2 = σk:

‖A − B‖2 = ‖UH(A − B)V‖2 = ‖UHAV − UHBV‖2
  = ‖Σ − (ΣTL 0; 0 0)‖2 = ‖(0 0; 0 ΣBR)‖2 = ‖ΣBR‖2 = σk.

Next, assume that C has rank t ≤ k and ‖A − C‖2 < ‖A − B‖2. We will show that this leads to a contradiction.


• The null space of C has dimension at least n− k since dim(N (C))+ rank(C) = n.

• If x ∈ N(C) then

‖Ax‖2 = ‖(A − C)x‖2 ≤ ‖A − C‖2‖x‖2 < σk‖x‖2.

• Partition U = (u0 · · · um−1) and V = (v0 · · · vn−1). Then ‖Avj‖2 = ‖σjuj‖2 = σj ≥ σk for j = 0, . . . , k. Now, let x be any linear combination of v0, . . . , vk: x = α0v0 + · · · + αkvk. Notice that

‖x‖2² = ‖α0v0 + · · · + αkvk‖2² = |α0|² + · · · + |αk|².

Then

‖Ax‖2² = ‖A(α0v0 + · · · + αkvk)‖2² = ‖α0Av0 + · · · + αkAvk‖2²
  = ‖α0σ0u0 + · · · + αkσkuk‖2² = ‖α0σ0u0‖2² + · · · + ‖αkσkuk‖2²
  = |α0|²σ0² + · · · + |αk|²σk² ≥ (|α0|² + · · · + |αk|²)σk²,

so that ‖Ax‖2 ≥ σk‖x‖2. In other words, vectors in the subspace of all linear combinations of v0, . . . , vk satisfy ‖Ax‖2 ≥ σk‖x‖2. The dimension of this subspace is k + 1 (since v0, · · · , vk form an orthonormal basis).

• Both these subspaces are subspaces of Cn. Since their dimensions add up to more than n, there must be at least one nonzero vector z that satisfies both ‖Az‖2 < σk‖z‖2 and ‖Az‖2 ≥ σk‖z‖2, which is a contradiction.

The above theorem tells us how to pick the best approximation with given rank to a given matrix.

3.8 An Application

Let Y ∈ Rm×n be a matrix that, for example, stores a picture. In this case, the (i, j) entry in Y is, for example, a number that represents the grayscale value of pixel (i, j). The following instructions, executed in octave or matlab, generate the picture of Mexican artist Frida Kahlo in Figure 3.2 (top-left). The file FridaPNG.png can be found at http://www.cs.utexas.edu/users/flame/Notes/FridaPNG.png.

octave> IMG = imread( ’FridaPNG.png’ ); % this reads the image
octave> Y = IMG( :,:,1 );
octave> imshow( Y ) % this displays the image

Although the picture is black and white, it was read as if it is a color image, which means an m×n×3 array of pixel information is stored. Setting Y = IMG( :,:,1 ) extracts a single matrix of pixel information. (If you start with a color picture, you will want to approximate IMG( :,:,1 ), IMG( :,:,2 ), and IMG( :,:,3 ) separately.)

Now, let Y =UΣV T be the SVD of matrix Y . Partition, conformally,

U = (UL UR), V = (VL VR), and Σ = (ΣTL 0; 0 ΣBR),


Figure 3.2: The original picture together with the approximations for k = 1, 2, 5, 10, and 25, as generated by the code. (Images omitted here.)


Figure 3.3: Distribution of singular values for the picture.

where UL and VL have k columns and ΣTL is k×k, so that

Y = (UL UR)(ΣTL 0; 0 ΣBR)(VL VR)T = (UL UR)(ΣTLVTL; ΣBRVTR) = ULΣTLVTL + URΣBRVTR.

Recall that then ULΣT LV TL is the best rank-k approximation to Y .

Let us approximate the matrix that stores the picture with ULΣT LV TL :

>> IMG = imread( ’FridaPNG.png’ ); % read the picture>> Y = IMG( :,:,1 );>> imshow( Y ); % this dispays the image>> k = 1;>> [ U, Sigma, V ] = svd( Y );

Page 70: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 3. Notes on Orthogonality and the Singular Value Decomposition 54

>> UL = U( :, 1:k ); % first k columns>> VL = V( :, 1:k ); % first k columns>> SigmaTL = Sigma( 1:k, 1:k ); % TL submatrix of Sigma>> Yapprox = uint8( UL * SigmaTL * VL’ );>> imshow( Yapprox );

As one increases k, the approximation gets better, as illustrated in Figure 3.2. The graph in Figure 3.3 helps explain. The original matrix Y is 387×469, with 181,503 entries. When k = 10, matrices U, V, and Σ are 387×10, 469×10 and 10×10, respectively, requiring only 8,660 entries to be stored.
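To make the experiment concrete, the following M-script sketch generates the rank-k approximations of Figure 3.2 for several values of k and reports the storage required. It assumes FridaPNG.png is in the current directory; the loop bounds and variable names are our own choices, and we convert to double since svd requires a floating-point matrix.

% Sketch: rank-k approximations of the picture for several k.
IMG = imread( 'FridaPNG.png' );
Y = double( IMG( :,:,1 ) );          % convert to double for svd
[ U, Sigma, V ] = svd( Y );
[ m, n ] = size( Y );
for k = [ 1 2 5 10 25 ]
  Yapprox = U( :, 1:k ) * Sigma( 1:k, 1:k ) * V( :, 1:k )';
  entries = m*k + n*k + k*k;         % entries in UL, VL, and SigmaTL
  fprintf( 'k = %2d: %6d entries instead of %d\n', k, entries, m*n );
  figure; imshow( uint8( Yapprox ) );
end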

3.9 SVD and the Condition Number of a Matrix

In “Notes on Norms” we saw that if Ax = b and A(x+δx) = b+δb, then

\[
\frac{\|\delta x\|_2}{\|x\|_2} \le \kappa_2(A) \frac{\|\delta b\|_2}{\|b\|_2},
\]

where κ2(A) = ‖A‖2‖A−1‖2 is the condition number of A, using the 2-norm.

Homework 3.35 Show that if A ∈ Cm×m is nonsingular, then

• ‖A‖2 = σ0, the largest singular value;

• ‖A−1‖2 = 1/σm−1, the inverse of the smallest singular value; and

• κ2(A) = σ0/σm−1.

* SEE ANSWER
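For a quick numerical sanity check of this characterization, the following M-script sketch (the test matrix is an arbitrary example of our own choosing) computes κ2(A) from the singular values and compares it with the built-in cond:

% Sketch: the 2-norm condition number from the singular values.
A = rand( 4, 4 );                     % any nonsingular matrix
sigma = svd( A );                     % singular values, largest first
kappa = sigma( 1 ) / sigma( end );    % sigma_0 / sigma_{m-1}
disp( [ kappa cond( A, 2 ) ] )        % the two values should agree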

If we go back to the example of A ∈ R2×2, recall the following pictures that show how A transforms the unit circle:

(Figure: the unit circle in R2, the domain of A, and its image, an "ellipse", in R2, the range (codomain) of A.)

In this case, the ratio σ0/σn−1 represents the ratio between the major and minor axes of the "ellipse" on the right.


3.10 An Algorithm for Computing the SVD?

It would seem that the proof of the existence of the SVD is constructive in the sense that it provides an algorithm for computing the SVD of a given matrix A ∈ Cm×m. Not so fast! Observe that

• Computing ‖A‖2 is nontrivial.

• Computing the vector that maximizes max‖x‖2=1 ‖Ax‖2 is nontrivial.

• Given a vector q0, computing the vectors q1, . . . ,qm−1 is expensive (as we will see when we discuss the QR factorization).

Towards the end of the course we will discuss algorithms for computing the eigenvalues and eigenvectorsof a matrix, and related algorithms for computing the SVD.


Chapter 4. Notes on Gram-Schmidt QR Factorization

A classic problem in linear algebra is the computation of an orthonormal basis for the space spanned by a given set of linearly independent vectors: Given a linearly independent set of vectors {a0, . . . ,an−1} ⊂ Cm,
we would like to find a set of mutually orthonormal vectors {q0, . . . ,qn−1} ⊂ Cm so that

Span(a0, . . . ,an−1) = Span(q0, . . . ,qn−1).

This problem is equivalent to the problem of, given a matrix A = ( a0 · · · an−1 ), computing a matrix Q = ( q0 · · · qn−1 ) with Q^H Q = I so that C(A) = C(Q), where C(A) denotes the column space of A.

A review at the undergraduate level of this topic (with animated illustrations) can be found in Week 11 of

Linear Algebra: Foundations to Frontiers - Notes to LAFF With [29].

Video

Read disclaimer regarding the videos in the preface!

* YouTube* Download from UT Box* View After Local Download

(For help on viewing, see Appendix A.)


Outline

Video . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

4.1. Classical Gram-Schmidt (CGS) Process . . . . . . . . . . . . . . . . . . . . . . . . . 59

4.2. Modified Gram-Schmidt (MGS) Process . . . . . . . . . . . . . . . . . . . . . . . . 64

4.3. In Practice, MGS is More Accurate . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

4.4. Cost . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

4.4.1. Cost of CGS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

4.4.2. Cost of MGS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72


4.1 Classical Gram-Schmidt (CGS) Process

Given a set of linearly independent vectors {a0, . . . ,an−1} ⊂ Cm, the Gram-Schmidt process computes an orthonormal basis {q0, . . . ,qn−1} that spans the same subspace, i.e.

Span(a0, . . . ,an−1) = Span(q0, . . . ,qn−1).

The process proceeds as described in Figure 4.1 and in the algorithms in Figure 4.2.

Homework 4.1
• What happens in the Gram-Schmidt algorithm if the columns of A are NOT linearly independent?
• How might one fix this?
• How can the Gram-Schmidt algorithm be used to identify which columns of A are linearly independent?

* SEE ANSWER

Homework 4.3 Convince yourself that the relation between the vectors a_j and q_j in the algorithms in Figure 4.2 is given by
\[
\begin{pmatrix} a_0 & a_1 & \cdots & a_{n-1} \end{pmatrix}
=
\begin{pmatrix} q_0 & q_1 & \cdots & q_{n-1} \end{pmatrix}
\begin{pmatrix}
\rho_{0,0} & \rho_{0,1} & \cdots & \rho_{0,n-1} \\
0 & \rho_{1,1} & \cdots & \rho_{1,n-1} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \rho_{n-1,n-1}
\end{pmatrix},
\]
where
\[
q_i^H q_j = \begin{cases} 1 & \text{for } i = j \\ 0 & \text{otherwise} \end{cases}
\quad \text{and} \quad
\rho_{i,j} = \begin{cases} q_i^H a_j & \text{for } i < j \\ \|a_j - \sum_{k=0}^{j-1} \rho_{k,j} q_k\|_2 & \text{for } i = j \\ 0 & \text{otherwise.} \end{cases}
\]

* SEE ANSWER

Thus, this relationship between the linearly independent vectors a_j and the orthonormal vectors q_j can be concisely stated as
\[
A = QR,
\]
where A and Q are m×n matrices (m ≥ n), Q^H Q = I, and R is an n×n upper triangular matrix.

Theorem 4.4 Let A have linearly independent columns, A = QR where A, Q ∈ Cm×n with n ≤ m, R ∈ Cn×n, Q^H Q = I, and R is an upper triangular matrix with nonzero diagonal entries. Then, for 0 < k < n, the first k columns of A span the same space as the first k columns of Q.

Proof: Partition
\[
A \to \begin{pmatrix} A_L & A_R \end{pmatrix}, \quad
Q \to \begin{pmatrix} Q_L & Q_R \end{pmatrix}, \quad \text{and} \quad
R \to \begin{pmatrix} R_{TL} & R_{TR} \\ 0 & R_{BR} \end{pmatrix},
\]
where A_L, Q_L ∈ Cm×k and R_TL ∈ Ck×k. Then R_TL is nonsingular (since it is upper triangular and has no zero on its diagonal), Q_L^H Q_L = I, and A_L = Q_L R_TL. We want to show that C(A_L) = C(Q_L):


Steps, with comments:

• ρ0,0 := ‖a0‖2; q0 := a0/ρ0,0.
  Compute the length of vector a0, ρ0,0 := ‖a0‖2. Set q0 := a0/ρ0,0, creating a unit vector in the direction of a0. Clearly, Span(a0) = Span(q0). (Why?)

• ρ0,1 := q0^H a1; a1⊥ := a1 − ρ0,1 q0; ρ1,1 := ‖a1⊥‖2; q1 := a1⊥/ρ1,1.
  Compute a1⊥, the component of vector a1 orthogonal to q0. Compute ρ1,1, the length of a1⊥. Set q1 := a1⊥/ρ1,1, creating a unit vector in the direction of a1⊥. Now, q0 and q1 are mutually orthonormal and Span(a0,a1) = Span(q0,q1). (Why?)

• ρ0,2 := q0^H a2; ρ1,2 := q1^H a2; a2⊥ := a2 − ρ0,2 q0 − ρ1,2 q1; ρ2,2 := ‖a2⊥‖2; q2 := a2⊥/ρ2,2.
  Compute a2⊥, the component of vector a2 orthogonal to q0 and q1. Compute ρ2,2, the length of a2⊥. Set q2 := a2⊥/ρ2,2, creating a unit vector in the direction of a2⊥. Now, q0, q1, q2 is an orthonormal basis and Span(a0,a1,a2) = Span(q0,q1,q2). (Why?)

• And so forth.

Figure 4.1: Gram-Schmidt orthogonalization.


First algorithm:

for j = 0, . . . ,n−1
  aj⊥ := aj
  for k = 0, . . . , j−1
    ρk,j := qk^H aj
    aj⊥ := aj⊥ − ρk,j qk
  end
  ρj,j := ‖aj⊥‖2
  qj := aj⊥/ρj,j
end

Second algorithm:

for j = 0, . . . ,n−1
  for k = 0, . . . , j−1
    ρk,j := qk^H aj
  end
  aj⊥ := aj
  for k = 0, . . . , j−1
    aj⊥ := aj⊥ − ρk,j qk
  end
  ρj,j := ‖aj⊥‖2
  qj := aj⊥/ρj,j
end

Third algorithm:

for j = 0, . . . ,n−1
  (ρ0,j, . . . , ρj−1,j)^T := (q0 · · · qj−1)^H aj
  aj⊥ := aj − (q0 · · · qj−1)(ρ0,j, . . . , ρj−1,j)^T
  ρj,j := ‖aj⊥‖2
  qj := aj⊥/ρj,j
end

Figure 4.2: Three equivalent (Classical) Gram-Schmidt algorithms.
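As a concrete companion to Figure 4.2, here is a minimal M-script sketch of the first (left-most) algorithm; the function name is our own, and the indices are shifted to M-script's 1-based convention:

function [ Q, R ] = CGS( A )
% Sketch of Classical Gram-Schmidt (first algorithm in Figure 4.2).
  [ m, n ] = size( A );
  Q = zeros( m, n ); R = zeros( n, n );
  for j = 1:n
    aperp = A( :, j );
    for k = 1:j-1
      R( k, j ) = Q( :, k )' * A( :, j );     % rho_{k,j} := q_k^H a_j
      aperp = aperp - R( k, j ) * Q( :, k );  % subtract component in q_k
    end
    R( j, j ) = norm( aperp );
    Q( :, j ) = aperp / R( j, j );
  end
end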

• We first show that C(A_L) ⊆ C(Q_L). Let y ∈ C(A_L). Then there exists x ∈ Ck such that y = A_L x. But then y = Q_L z, where z = R_TL x ≠ 0, which means that y ∈ C(Q_L). Hence C(A_L) ⊆ C(Q_L).

• We next show that C(Q_L) ⊆ C(A_L). Let y ∈ C(Q_L). Then there exists z ∈ Ck such that y = Q_L z. But then y = A_L x, where x = R_TL^{-1} z, from which we conclude that y ∈ C(A_L). Hence C(Q_L) ⊆ C(A_L).

Since C(A_L) ⊆ C(Q_L) and C(Q_L) ⊆ C(A_L), we conclude that C(Q_L) = C(A_L).

Theorem 4.5 Let A ∈ Cm×n have linearly independent columns. Then there exist Q ∈ Cm×n with Q^H Q = I and upper triangular R with no zeroes on the diagonal such that A = QR. This is known as the QR factorization. If the diagonal elements of R are chosen to be real and positive, the QR factorization is unique.

Proof: (By induction). Note that n≤ m since A has linearly independent columns.

• Base case: n = 1. In this case A = ( a0 ), where a0 is its only column. Since A has linearly independent columns, a0 ≠ 0. Then
\[
A = \begin{pmatrix} a_0 \end{pmatrix} = (q_0)(\rho_{00}),
\]
where ρ00 = ‖a0‖2 and q0 = a0/ρ00, so that Q = (q0) and R = (ρ00).

• Inductive step: Assume that the result is true for all A with n−1 linearly independent columns. We will show it is true for A ∈ Cm×n with linearly independent columns.

Let A ∈ Cm×n. Partition A → ( A0 a1 ). By the induction hypothesis, there exist Q0 and R00 such that Q0^H Q0 = I, R00 is upper triangular with nonzero diagonal entries, and A0 = Q0 R00. Now,


Algorithm: [Q,R] := QR(A)
Partition A → ( AL | AR ), Q → ( QL | QR ), R → ( RTL RTR ; 0 RBR )
  where AL and QL have 0 columns and RTL is 0×0
while n(AL) ≠ n(A) do
  Repartition
    ( AL | AR ) → ( A0 | a1 | A2 ), ( QL | QR ) → ( Q0 | q1 | Q2 ),
    ( RTL RTR ; 0 RBR ) → ( R00 r01 R02 ; 0 ρ11 r12^T ; 0 0 R22 )

  r01 := Q0^H a1
  a1⊥ := a1 − Q0 r01
  ρ11 := ‖a1⊥‖2
  q1 := a1⊥/ρ11

  Continue with
    ( AL | AR ) ← ( A0 | a1 | A2 ), ( QL | QR ) ← ( Q0 | q1 | Q2 ),
    ( RTL RTR ; 0 RBR ) ← ( R00 r01 R02 ; 0 ρ11 r12^T ; 0 0 R22 )
endwhile

Equivalent indexed loop (the parenthetical annotations relate the indexed quantities to the FLAME names):

for j = 0, . . . ,n−1
  (ρ0,j, . . . , ρj−1,j)^T := (q0 · · · qj−1)^H aj     (r01 := Q0^H a1)
  aj⊥ := aj − (q0 · · · qj−1)(ρ0,j, . . . , ρj−1,j)^T   (a1⊥ := a1 − Q0 r01)
  ρj,j := ‖aj⊥‖2                                       (ρ11 := ‖a1⊥‖2)
  qj := aj⊥/ρj,j                                       (q1 := a1⊥/ρ11)
end

Figure 4.3: (Classical) Gram-Schmidt algorithm for computing the QR factorization of a matrix A.

compute r01 = Q0^H a1 and a1⊥ = a1 − Q0 r01, the component of a1 orthogonal to C(Q0). Because the columns of A are linearly independent, a1⊥ ≠ 0. Let ρ11 = ‖a1⊥‖2 and q1 = a1⊥/ρ11. Then

\[
\begin{pmatrix} Q_0 & q_1 \end{pmatrix}
\begin{pmatrix} R_{00} & r_{01} \\ 0 & \rho_{11} \end{pmatrix}
= \begin{pmatrix} Q_0 R_{00} & Q_0 r_{01} + q_1 \rho_{11} \end{pmatrix}
= \begin{pmatrix} A_0 & Q_0 r_{01} + a_1^\perp \end{pmatrix}
= \begin{pmatrix} A_0 & a_1 \end{pmatrix} = A.
\]


Hence
\[
Q = \begin{pmatrix} Q_0 & q_1 \end{pmatrix} \quad \text{and} \quad
R = \begin{pmatrix} R_{00} & r_{01} \\ 0 & \rho_{11} \end{pmatrix}.
\]

• By the Principle of Mathematical Induction the result holds for all matrices A∈Cm×n with m≥ n.

The proof motivates the algorithm in Figure 4.3 (left) in FLAME notation.¹

An alternative for motivating that algorithm is as follows: Consider A = QR. Partition A, Q, and R to yield
\[
\begin{pmatrix} A_0 & a_1 & A_2 \end{pmatrix}
= \begin{pmatrix} Q_0 & q_1 & Q_2 \end{pmatrix}
\begin{pmatrix}
R_{00} & r_{01} & R_{02} \\
0 & \rho_{11} & r_{12}^T \\
0 & 0 & R_{22}
\end{pmatrix}.
\]

Assume that Q0 and R00 have already been computed. Since corresponding columns of both sides must be equal, we find that
\[
a_1 = Q_0 r_{01} + q_1 \rho_{11}. \tag{4.1}
\]
Also, Q0^H Q0 = I and Q0^H q1 = 0, since the columns of Q are mutually orthonormal. Hence Q0^H a1 = Q0^H Q0 r01 + Q0^H q1 ρ11 = r01. This shows how r01 can be computed from Q0 and a1, which are already known. Next, a1⊥ = a1 − Q0 r01 is computed from (4.1). This is the component of a1 that is perpendicular to the columns of Q0. We know it is nonzero since the columns of A are linearly independent. Since ρ11 q1 = a1⊥ and we know that q1 has unit length, we now compute ρ11 = ‖a1⊥‖2 and q1 = a1⊥/ρ11, which completes a derivation of the algorithm in Figure 4.3.

Homework 4.7 Let A have linearly independent columns and let A = QR be a QR factorization of A. Partition
\[
A \to \begin{pmatrix} A_L & A_R \end{pmatrix}, \quad
Q \to \begin{pmatrix} Q_L & Q_R \end{pmatrix}, \quad \text{and} \quad
R \to \begin{pmatrix} R_{TL} & R_{TR} \\ 0 & R_{BR} \end{pmatrix},
\]
where A_L and Q_L have k columns and R_TL is k×k. Show that

1. A_L = Q_L R_TL: Q_L R_TL equals the QR factorization of A_L,

2. C(A_L) = C(Q_L): the first k columns of Q form an orthonormal basis for the space spanned by the first k columns of A,

3. R_TR = Q_L^H A_R,

4. (A_R − Q_L R_TR)^H Q_L = 0,

5. A_R − Q_L R_TR = Q_R R_BR, and

6. C(A_R − Q_L R_TR) = C(Q_R).

* SEE ANSWER

¹ The FLAME notation should be intuitively obvious. If it is not, you may want to review the earlier weeks in Linear Algebra: Foundations to Frontiers - Notes to LAFF With.


[y⊥, r] = Proj_orthog_to_Q_CGS(Q, y)   (used by classical Gram-Schmidt)
  y⊥ = y
  for i = 0, . . . ,k−1
    ρi := qi^H y
    y⊥ := y⊥ − ρi qi
  endfor

[y⊥, r] = Proj_orthog_to_Q_MGS(Q, y)   (used by modified Gram-Schmidt)
  y⊥ = y
  for i = 0, . . . ,k−1
    ρi := qi^H y⊥
    y⊥ := y⊥ − ρi qi
  endfor

Figure 4.4: Two different ways of computing y⊥ = (I − QQ^H)y, the component of y orthogonal to C(Q), where Q has k orthonormal columns.

4.2 Modified Gram-Schmidt (MGS) Process

We start by considering the following problem: Given y ∈ Cm and Q ∈ Cm×k with orthonormal columns, compute y⊥, the component of y orthogonal to the columns of Q. This is a key step in the Gram-Schmidt process in Figure 4.3.

Recall that if A has linearly independent columns, then A(A^H A)^{-1} A^H y equals the projection of y onto the column space of A (i.e., the component of y in C(A)) and y − A(A^H A)^{-1} A^H y = (I − A(A^H A)^{-1} A^H)y equals the component of y orthogonal to C(A). If Q has orthonormal columns, then Q^H Q = I and hence QQ^H y equals the projection of y onto the column space of Q (i.e., the component of y in C(Q)) and y − QQ^H y = (I − QQ^H)y equals the component of y orthogonal to C(Q).

Thus, mathematically, the solution to the stated problem is given by
\[
\begin{aligned}
y^\perp &= (I - QQ^H)y = y - QQ^H y \\
&= y - \begin{pmatrix} q_0 & \cdots & q_{k-1} \end{pmatrix}\begin{pmatrix} q_0 & \cdots & q_{k-1} \end{pmatrix}^H y \\
&= y - \begin{pmatrix} q_0 & \cdots & q_{k-1} \end{pmatrix}\begin{pmatrix} q_0^H y \\ \vdots \\ q_{k-1}^H y \end{pmatrix} \\
&= y - \left[(q_0^H y)q_0 + \cdots + (q_{k-1}^H y)q_{k-1}\right] \\
&= y - (q_0^H y)q_0 - \cdots - (q_{k-1}^H y)q_{k-1}.
\end{aligned}
\]

This can be computed by the algorithm in Figure 4.4 (left) and is used by what is often called the Classical Gram-Schmidt (CGS) algorithm given in Figure 4.3.

An alternative algorithm for computing y⊥ is given in Figure 4.4 (right) and is used by the Modified Gram-Schmidt (MGS) algorithm also given in Figure 4.5. This approach is mathematically equivalent to


Algorithm: [A,R] := Gram-Schmidt(A)  (overwrites A with Q)
Partition A → ( AL | AR ), R → ( RTL RTR ; 0 RBR )
  where AL has 0 columns and RTL is 0×0
while n(AL) ≠ n(A) do
  Repartition
    ( AL | AR ) → ( A0 | a1 | A2 ),
    ( RTL RTR ; 0 RBR ) → ( R00 r01 R02 ; 0 ρ11 r12^T ; 0 0 R22 )
    where a1 and q1 are columns and ρ11 is a scalar

  Update (three variants):
  • CGS:
      r01 := A0^H a1
      a1 := a1 − A0 r01
      ρ11 := ‖a1‖2
      a1 := a1/ρ11
  • MGS:
      [a1, r01] = Proj_orthog_to_Q_MGS(A0, a1)
      ρ11 := ‖a1‖2
      a1 := a1/ρ11
  • MGS (alternative):
      ρ11 := ‖a1‖2
      a1 := a1/ρ11
      r12^T := a1^H A2
      A2 := A2 − a1 r12^T

  Continue with
    ( AL | AR ) ← ( A0 | a1 | A2 ),
    ( RTL RTR ; 0 RBR ) ← ( R00 r01 R02 ; 0 ρ11 r12^T ; 0 0 R22 )
endwhile

Figure 4.5: Left: Classical Gram-Schmidt algorithm. Middle: Modified Gram-Schmidt algorithm. Right: Modified Gram-Schmidt algorithm where every time a new column of Q, q1, is computed, the component of all future columns in the direction of this new vector is subtracted out.

the algorithm to its left for the following reason: The algorithm on the left in Figure 4.4 computes
\[
y^\perp := y - (q_0^H y)q_0 - \cdots - (q_{k-1}^H y)q_{k-1}
\]
by, in the i-th step, computing the component of y in the direction of qi, (qi^H y)qi, and then subtracting this


(a) MGS algorithm that computes Q and R from A:

for j = 0, . . . ,n−1
  aj⊥ := aj
  for k = 0, . . . , j−1
    ρk,j := qk^H aj⊥
    aj⊥ := aj⊥ − ρk,j qk
  end
  ρj,j := ‖aj⊥‖2
  qj := aj⊥/ρj,j
end

(b) MGS algorithm that computes Q and R from A, overwriting A with Q:

for j = 0, . . . ,n−1
  for k = 0, . . . , j−1
    ρk,j := ak^H aj
    aj := aj − ρk,j ak
  end
  ρj,j := ‖aj‖2
  aj := aj/ρj,j
end

(c) MGS algorithm that normalizes the j-th column to have unit length to compute qj (overwriting aj with the result) and then subtracts the component in the direction of qj off the rest of the columns (aj+1, . . . ,an−1):

for j = 0, . . . ,n−1
  ρj,j := ‖aj‖2
  aj := aj/ρj,j
  for k = j+1, . . . ,n−1
    ρj,k := aj^H ak
    ak := ak − ρj,k aj
  end
end

(d) Slight modification of the algorithm in (c) that computes ρj,k in a separate loop:

for j = 0, . . . ,n−1
  ρj,j := ‖aj‖2
  aj := aj/ρj,j
  for k = j+1, . . . ,n−1
    ρj,k := aj^H ak
  end
  for k = j+1, . . . ,n−1
    ak := ak − ρj,k aj
  end
end

(e) Algorithm in (d) rewritten without inner loops:

for j = 0, . . . ,n−1
  ρj,j := ‖aj‖2
  aj := aj/ρj,j
  ( ρj,j+1 · · · ρj,n−1 ) := ( aj^H aj+1 · · · aj^H an−1 )
  ( aj+1 · · · an−1 ) := ( aj+1 − ρj,j+1 aj · · · an−1 − ρj,n−1 aj )
end

(f) Algorithm in (e) rewritten to expose the row-vector-times-matrix multiplication aj^H ( aj+1 · · · an−1 ) and the rank-1 update ( aj+1 · · · an−1 ) − aj ( ρj,j+1 · · · ρj,n−1 ):

for j = 0, . . . ,n−1
  ρj,j := ‖aj‖2
  aj := aj/ρj,j
  ( ρj,j+1 · · · ρj,n−1 ) := aj^H ( aj+1 · · · an−1 )
  ( aj+1 · · · an−1 ) := ( aj+1 · · · an−1 ) − aj ( ρj,j+1 · · · ρj,n−1 )
end

Figure 4.6: Various equivalent MGS algorithms.
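The following M-script sketch (the function name is our own; indices are 1-based) implements variant (f), overwriting A with Q and exposing the row-vector-times-matrix multiplication and rank-1 update:

function [ A, R ] = MGS( A )
% Sketch of variant (f) in Figure 4.6: A is overwritten with Q.
  [ m, n ] = size( A );
  R = zeros( n, n );
  for j = 1:n
    R( j, j ) = norm( A( :, j ) );
    A( :, j ) = A( :, j ) / R( j, j );
    R( j, j+1:n ) = A( :, j )' * A( :, j+1:n );                % row vector times matrix
    A( :, j+1:n ) = A( :, j+1:n ) - A( :, j ) * R( j, j+1:n ); % rank-1 update
  end
end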


Algorithm: [A,R] := QR(A)
Partition A → ( AL | AR ), R → ( RTL RTR ; 0 RBR )
  where AL has 0 columns and RTL is 0×0
while n(AL) ≠ n(A) do
  Repartition
    ( AL | AR ) → ( A0 | a1 | A2 ),
    ( RTL RTR ; 0 RBR ) → ( R00 r01 R02 ; 0 ρ11 r12^T ; 0 0 R22 )

  ρ11 := ‖a1‖2
  a1 := a1/ρ11
  r12^T := a1^H A2
  A2 := A2 − a1 r12^T

  Continue with
    ( AL | AR ) ← ( A0 | a1 | A2 ),
    ( RTL RTR ; 0 RBR ) ← ( R00 r01 R02 ; 0 ρ11 r12^T ; 0 0 R22 )
endwhile

Equivalent indexed loop (the parenthetical annotations relate the indexed quantities to the FLAME names):

for j = 0, . . . ,n−1
  ρj,j := ‖aj‖2                                             (ρ11 := ‖a1‖2)
  aj := aj/ρj,j                                             (a1 := a1/ρ11)
  ( ρj,j+1 · · · ρj,n−1 ) := aj^H ( aj+1 · · · an−1 )        (r12^T := a1^H A2)
  ( aj+1 · · · an−1 ) := ( aj+1 · · · an−1 ) − aj ( ρj,j+1 · · · ρj,n−1 )   (A2 := A2 − a1 r12^T)
end

Figure 4.7: Modified Gram-Schmidt algorithm for computing the QR factorization of a matrix A.

off the vector y⊥ that already contains
\[
y^\perp = y - (q_0^H y)q_0 - \cdots - (q_{i-1}^H y)q_{i-1},
\]
leaving us with
\[
y^\perp = y - (q_0^H y)q_0 - \cdots - (q_{i-1}^H y)q_{i-1} - (q_i^H y)q_i.
\]
Now, notice that
\[
q_i^H\left[y - (q_0^H y)q_0 - \cdots - (q_{i-1}^H y)q_{i-1}\right]
= q_i^H y - (q_0^H y)\underbrace{q_i^H q_0}_{0} - \cdots - (q_{i-1}^H y)\underbrace{q_i^H q_{i-1}}_{0}
= q_i^H y.
\]


What this means is that we can use y⊥ in our computation of ρi instead:
\[
\rho_i := q_i^H y^\perp = q_i^H y,
\]
an insight that justifies the equivalent algorithm in Figure 4.4 (right). Next, we massage the MGS algorithm into the third (right-most) algorithm given in Figure 4.5. For this, consider the equivalent algorithms in Figures 4.6 and 4.7.

4.3 In Practice, MGS is More Accurate

In theory, all Gram-Schmidt algorithms discussed in the previous sections are equivalent: they compute the exact same QR factorizations. In practice, in the presence of round-off error, MGS is more accurate than CGS. We will (hopefully) get into detail about this later, but for now we will illustrate it with a classic example.

When storing real (or complex, for that matter) valued numbers in a computer, only limited accuracy can be maintained, leading to round-off error when a number is stored and/or when computations with numbers are performed. The machine epsilon or unit roundoff error is defined as the largest positive number εmach such that the stored value of 1 + εmach is rounded to 1. Now, let us consider a computer where the only error that is ever incurred is when 1 + εmach is computed and rounded to 1. Let ε = √εmach and consider the matrix
\[
A = \begin{pmatrix}
1 & 1 & 1 \\
\epsilon & 0 & 0 \\
0 & \epsilon & 0 \\
0 & 0 & \epsilon
\end{pmatrix}
= \begin{pmatrix} a_0 & a_1 & a_2 \end{pmatrix}. \tag{4.2}
\]

In Figure 4.8 (left) we execute the CGS algorithm. It yields the approximate matrix

\[
Q \approx \begin{pmatrix}
1 & 0 & 0 \\
\epsilon & -\frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\
0 & \frac{\sqrt{2}}{2} & 0 \\
0 & 0 & \frac{\sqrt{2}}{2}
\end{pmatrix}.
\]
If we now ask the question "Are the columns of Q orthonormal?" we can check this by computing Q^H Q, which should equal I, the identity. But

\[
Q^H Q = \begin{pmatrix}
1+\epsilon_{\text{mach}} & -\frac{\sqrt{2}}{2}\epsilon & -\frac{\sqrt{2}}{2}\epsilon \\
-\frac{\sqrt{2}}{2}\epsilon & 1 & \frac{1}{2} \\
-\frac{\sqrt{2}}{2}\epsilon & \frac{1}{2} & 1
\end{pmatrix}.
\]

Clearly, the computed columns of Q are not mutually orthogonal.


Execution of CGS (left):

First iteration:
  ρ0,0 = ‖a0‖2 = √(1+ε²) = √(1+εmach), which is rounded to 1.
  q0 = a0/ρ0,0 = (1, ε, 0, 0)^T / 1 = (1, ε, 0, 0)^T.
Second iteration:
  ρ0,1 = q0^H a1 = 1.
  a1⊥ = a1 − ρ0,1 q0 = (0, −ε, ε, 0)^T.
  ρ1,1 = ‖a1⊥‖2 = √(2ε²) = √2 ε.
  q1 = a1⊥/ρ1,1 = (0, −ε, ε, 0)^T/(√2 ε) = (0, −√2/2, √2/2, 0)^T.
Third iteration:
  ρ0,2 = q0^H a2 = 1 and ρ1,2 = q1^H a2 = 0.
  a2⊥ = a2 − ρ0,2 q0 − ρ1,2 q1 = (0, −ε, 0, ε)^T.
  ρ2,2 = ‖a2⊥‖2 = √(2ε²) = √2 ε.
  q2 = a2⊥/ρ2,2 = (0, −ε, 0, ε)^T/(√2 ε) = (0, −√2/2, 0, √2/2)^T.

Execution of MGS (right):

First iteration: as for CGS.
Second iteration: as for CGS.
Third iteration:
  ρ0,2 = q0^H a2 = 1.
  a2⊥ = a2 − ρ0,2 q0 = (0, −ε, 0, ε)^T.
  ρ1,2 = q1^H a2⊥ = (√2/2)ε.
  a2⊥ = a2⊥ − ρ1,2 q1 = (0, −ε/2, −ε/2, ε)^T.
  ρ2,2 = ‖a2⊥‖2 = √((6/4)ε²) = (√6/2)ε.
  q2 = a2⊥/ρ2,2 = (0, −ε/2, −ε/2, ε)^T/((√6/2)ε) = (0, −√6/6, −√6/6, 2√6/6)^T.

Figure 4.8: Execution of the CGS algorithm (left) and MGS algorithm (right) on the example in Eqn. (4.2).


Similarly, in Figure 4.8 (right) we execute the MGS algorithm. It yields the approximate matrix

\[
Q \approx \begin{pmatrix}
1 & 0 & 0 \\
\epsilon & -\frac{\sqrt{2}}{2} & -\frac{\sqrt{6}}{6} \\
0 & \frac{\sqrt{2}}{2} & -\frac{\sqrt{6}}{6} \\
0 & 0 & \frac{2\sqrt{6}}{6}
\end{pmatrix}.
\]

If we now ask the question “Are the columns of Q orthonormal?” we can check if QHQ = I. The answer:

\[
Q^H Q = \begin{pmatrix}
1+\epsilon_{\text{mach}} & -\frac{\sqrt{2}}{2}\epsilon & -\frac{\sqrt{6}}{6}\epsilon \\
-\frac{\sqrt{2}}{2}\epsilon & 1 & 0 \\
-\frac{\sqrt{6}}{6}\epsilon & 0 & 1
\end{pmatrix},
\]

which shows that for this example MGS yields better orthogonality than does CGS. What is going on? The answer lies with how a2⊥ is computed in the last step of each of the algorithms.

• In the CGS algorithm, we find that
\[
a_2^\perp := a_2 - (q_0^H a_2)q_0 - (q_1^H a_2)q_1.
\]
Now, q0 has a relatively small error in it and hence (q0^H a2)q0 has a relatively small error in it. It is likely that a part of that error is in the direction of q1. Relative to (q0^H a2)q0, that error in the direction of q1 is small, but relative to a2 − (q0^H a2)q0 it is not. The point is that then a2 − (q0^H a2)q0 has a relatively large error in it in the direction of q1. Subtracting (q1^H a2)q1 does not fix this and since in the end a2⊥ is small, it has a relatively large error in the direction of q1. This error is amplified when q2 is computed by normalizing a2⊥.

• In the MGS algorithm, we find that
\[
a_2^\perp := a_2 - (q_0^H a_2)q_0,
\]
after which
\[
a_2^\perp := a_2^\perp - (q_1^H a_2^\perp)q_1 = \left[a_2 - (q_0^H a_2)q_0\right] - \left(q_1^H \left[a_2 - (q_0^H a_2)q_0\right]\right)q_1.
\]
This time, if a2 − (q0^H a2)q0 has an error in the direction of q1, this error is subtracted out when (q1^H a2⊥)q1 is subtracted from a2⊥. This explains the better orthogonality between the computed vectors q1 and q2.

Obviously, we have argued via an example that MGS is more accurate than CGS. A more thorough analysis is needed to explain why this is generally so. This is beyond the scope of this note.
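The effect is easy to reproduce numerically. The following M-script sketch uses the CGS and MGS sketches given earlier in this chapter on the matrix in Eqn. (4.2); the expected magnitudes in the comments are rough orders of magnitude, not exact values:

% Sketch: loss of orthogonality of CGS versus MGS on the matrix in (4.2).
epsilon = sqrt( eps );                % eps is the machine epsilon
A = [ 1       1       1
      epsilon 0       0
      0       epsilon 0
      0       0       epsilon ];
[ Qcgs, Rcgs ] = CGS( A );
[ Qmgs, Rmgs ] = MGS( A );
disp( norm( Qcgs' * Qcgs - eye( 3 ) ) )   % O(1): orthogonality is lost
disp( norm( Qmgs' * Qmgs - eye( 3 ) ) )   % O(sqrt(eps)): much better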

4.4 Cost

Let us examine the cost of computing the QR factorization of an m×n matrix A. We will count multiplies and adds each as one floating point operation.

We start by reviewing the cost, in floating point operations (flops), of various vector-vector and matrix-vector operations:


Vector-vector operations (x, y ∈ Cn, α ∈ C):

  Dot:   α := x^H y      — 2n flops
  Axpy:  y := αx + y     — 2n flops
  Scal:  x := αx         — n flops
  Nrm2:  α := ‖x‖2       — 2n flops

Matrix-vector operations (A ∈ Cm×n, α, β ∈ C, with x and y vectors of appropriate size):

  Matrix-vector multiplication (Gemv):  y := αAx + βy     — 2mn flops
                                        y := αA^H x + βy  — 2mn flops
  Rank-1 update (Ger):                  A := αyx^H + A    — 2mn flops

Now, consider the algorithms in Figure 4.5. Notice that the columns of A are of size m. During the k-th iteration (0 ≤ k < n), A0 has k columns and A2 has n−k−1 columns.

4.4.1 Cost of CGS

  Operation               Approximate cost (in flops)
  r01 := A0^H a1          2mk
  a1 := a1 − A0 r01       2mk
  ρ11 := ‖a1‖2            2m
  a1 := a1/ρ11            m

Thus, the total cost is (approximately)
\[
\sum_{k=0}^{n-1}\left[2mk + 2mk + 2m + m\right]
= \sum_{k=0}^{n-1}\left[3m + 4mk\right]
= 3mn + 4m\sum_{k=0}^{n-1}k
\approx 3mn + 4m\frac{n^2}{2}
= 3mn + 2mn^2 \approx 2mn^2,
\]
since ∑_{k=0}^{n−1} k = n(n−1)/2 ≈ n²/2 and 3mn is of lower order.


4.4.2 Cost of MGS

  Operation               Approximate cost (in flops)
  ρ11 := ‖a1‖2            2m
  a1 := a1/ρ11            m
  r12^T := a1^H A2        2m(n−k−1)
  A2 := A2 − a1 r12^T     2m(n−k−1)

Thus, the total cost is (approximately)
\[
\sum_{k=0}^{n-1}\left[2m + m + 2m(n-k-1) + 2m(n-k-1)\right]
= \sum_{k=0}^{n-1}\left[3m + 4m(n-k-1)\right]
= 3mn + 4m\sum_{k=0}^{n-1}(n-k-1)
= 3mn + 4m\sum_{i=0}^{n-1}i
\approx 3mn + 2mn^2 \approx 2mn^2,
\]
where the change of variable i = n−k−1 was used, ∑_{i=0}^{n−1} i = n(n−1)/2 ≈ n²/2, and 3mn is of lower order.


Chapter 5. Notes on the FLAME APIs

Video

Read disclaimer regarding the videos in the preface!

No video.


Outline

Video . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

5.1. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

5.2. Install FLAME@lab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

5.3. An Example: Gram-Schmidt Orthogonalization . . . . . . . . . . . . . . . . . . . . 75

5.3.1. The Spark Webpage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

5.3.2. Implementing CGS with FLAME@lab . . . . . . . . . . . . . . . . . . . . . . 76

5.3.3. Editing the code skeleton . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

5.3.4. Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

5.4. Implementing the Other Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . 80


[y⊥, r] = Proj_orthog_to_Q_CGS(Q, y)   (used by classical Gram-Schmidt)
  y⊥ = y
  for i = 0, . . . ,k−1
    ρi := qi^H y
    y⊥ := y⊥ − ρi qi
  endfor

[y⊥, r] = Proj_orthog_to_Q_MGS(Q, y)   (used by modified Gram-Schmidt)
  y⊥ = y
  for i = 0, . . . ,k−1
    ρi := qi^H y⊥
    y⊥ := y⊥ − ρi qi
  endfor

Figure 5.1: Two different ways of computing y⊥ = (I − QQ^H)y, the component of y orthogonal to C(Q), where Q has k orthonormal columns.

5.1 Motivation

In the course so far, we have frequently used the "FLAME Notation" to express linear algebra algorithms. In this note we show how to translate such algorithms into code, using various QR factorization algorithms as examples.

5.2 Install FLAME@lab

We refer to the API we will use as the "FLAME@lab" API, which is an API that targets the M-script language used by Matlab and Octave (an Open Source Matlab implementation). This API is very intuitive, and hence we will spend (almost) no time explaining it.

Download all files from http://www.cs.utexas.edu/users/flame/Notes/FLAMEatlab/ and place them in the same directory as the remaining files that you will create as part of the exercises in this document. (Unless you know how to set up paths in Matlab/Octave, in which case you can put them wherever you please, and set the path.)

5.3 An Example: Gram-Schmidt Orthogonalization

Let us start by considering the various Gram-Schmidt based QR factorization algorithms from "Notes on Gram-Schmidt QR Factorization", typeset using the FLAME Notation in Figure 5.2.

5.3.1 The Spark Webpage

We wish to typeset the code so that it closely resembles the algorithms in Figure 5.2. The FLAME notation itself uses "white space" to better convey the algorithms. We want to do the same for the codes that implement the algorithms. However, typesetting that code is somewhat bothersome because of the careful spacing that is required. For this reason, we created a webpage that creates a "code skeleton." We call this page the "Spark" page:

http://www.cs.utexas.edu/users/flame/Spark/.


Figure 5.2: Left: Classical Gram-Schmidt algorithm. Middle: Modified Gram-Schmidt algorithm. Right: Modified Gram-Schmidt algorithm where every time a new column of Q, q1, is computed, the component of all future columns in the direction of this new vector is subtracted out. (This figure repeats the algorithm [A,R] := Gram-Schmidt(A) of Figure 4.5, with the same three update variants.)

When you open the link, you will get a page that looks something like the picture in Figure 5.3.

5.3.2 Implementing CGS with FLAME@lab


Figure 5.3: The Spark webpage.

We will focus on the Classical Gram-Schmidt algorithm on the left, which we show by itself in Figure 5.4 (left). To its right, we show how the menu on the left side of the Spark webpage needs to be filled out.

Some comments:

Name: Choose a name that describes the algorithm/operation being implemented.

Type of function: Later you will learn about "blocked" algorithms. For now, we implement "unblocked" algorithms.

Variant number: Notice that there are a number of algorithmic variants for implementing the Gram-Schmidt algorithm. We choose to call the first one "Variant 1".

Number of operands: This routine requires two operands: one each for matrices A and R. (A will be overwritten by the matrix Q.)

Operand 1: We indicate that A is a matrix through which we "march" from left to right (L->R) and it is both input and output.

Operand 2: We indicate that R is a matrix through which we "march" from top-left to bottom-right (TL->BR) and it is both input and output. Our API requires you to pass in the array in which to put an output, so an appropriately sized R must be passed in.

Pick an output language: A number of different representations are supported, including APIs for M-script (FLAME@lab), C (FLAMEC), LaTeX (FLaTeX), and Python (FlamePy). Pick FLAME@lab.

To the left of the menu, you now find what we call a code skeleton for the implementation, as shown in Figure 5.5. In Figure 5.6 we show the algorithm and generated code skeleton side-by-side.


Algorithm: [A,R] := Gram-Schmidt(A)  (overwrites A with Q)
Partition A → ( AL | AR ), R → ( RTL RTR ; 0 RBR )
  where AL has 0 columns and RTL is 0×0
while n(AL) ≠ n(A) do
  Repartition
    ( AL | AR ) → ( A0 | a1 | A2 ),
    ( RTL RTR ; 0 RBR ) → ( R00 r01 R02 ; 0 ρ11 r12^T ; 0 0 R22 )

  r01 := A0^H a1
  a1 := a1 − A0 r01
  ρ11 := ‖a1‖2
  a1 := a1/ρ11

  Continue with
    ( AL | AR ) ← ( A0 | a1 | A2 ),
    ( RTL RTR ; 0 RBR ) ← ( R00 r01 R02 ; 0 ρ11 r12^T ; 0 0 R22 )
endwhile

Figure 5.4: Left: Classical Gram-Schmidt algorithm. Right: Generated code-skeleton for CGS.

5.3.3 Editing the code skeleton

At this point, one should copy the code skeleton into one's favorite text editor. (We highly recommend emacs for the serious programmer.) Once this is done, there are two things left to do:

Fix the code skeleton: The Spark webpage "guesses" the code skeleton. One detail that it sometimes gets wrong is the "stopping criteria". In this case, the algorithm should stay in the loop as long as n(AL) ≠ n(A) (the width of AL is not yet the width of A). In our example, the Spark webpage guessed that the column size of matrix A is to be used for the stopping criteria:

while ( size( AL, 2 ) < size( A, 2 ) )

which happens to be correct. (When you implement the Householder QR factorization, you may not be so lucky...)

The "update" statements: The Spark webpage can't guess what the actual updates to the various parts of matrices A and R should be. It fills in


Figure 5.5: The Spark webpage filled out for CGS Variant 1.

% update line 1 %
%       :       %
% update line n %

Thus, one has to manually translate

r01 := A0^H a1
a1 := a1 − A0 r01
ρ11 := ‖a1‖2
a1 := a1/ρ11

into appropriate M-script code:

r01 = A0' * a1;
a1 = a1 - A0 * r01;
rho11 = norm( a1 );
a1 = a1 / rho11;

(Notice: if one forgets the ";", the results of the assignment will be printed by Matlab/Octave when the code is executed.)

At this point, one saves the resulting code in the file CGS_unb_var1.m. The ".m" ending is important since the name of the file is used to find the routine when using Matlab/Octave.

5.3.4 Testing

To now test the routine, one starts octave and, for example, executes the commands


Figure 5.6: Left: Classical Gram-Schmidt algorithm (identical to the algorithm in Figure 5.4). Right: Generated code-skeleton for CGS.

> A = rand( 5, 4 )
> R = zeros( 4, 4 )
> [ Q, R ] = CGS_unb_var1( A, R )
> A - Q * triu( R )

The result should be (approximately) a 5×4 zero matrix. (The first time you execute the above, you may get a bunch of warnings from Octave. Just ignore those.)

5.4 Implementing the Other Algorithms

Next, we leave it to the reader to implement

• the Modified Gram-Schmidt algorithm (MGS_unb_var1, corresponding to the right-most algorithm in Figure 5.2).


• The Householder QR factorization algorithm and the algorithm to form Q from "Notes on Householder QR Factorization".

The routine for computing a Householder transformation (similar to Figure 5.1) can be found at

http://www.cs.utexas.edu/users/flame/Notes/FLAMEatlab/Housev.m

That routine implements the algorithm on the left in Figure 5.1. Try and see what happens if you replace it with the algorithm to its right.

Note: For the Householder QR factorization and "form Q" algorithms, how to start the algorithm when the matrix is not square is a bit tricky. Thus, you may assume that the matrix is square.


Chapter 6. Notes on Householder QR Factorization

Video

Read disclaimer regarding the videos in the preface!

* YouTube* Download from UT Box* View After Local Download

(For help on viewing, see Appendix A.)


Outline

Video . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

6.1. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

6.2. Householder Transformations (Reflectors) . . . . . . . . . . . . . . . . . . . . . . . 85

6.2.1. The general case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

6.2.2. As implemented for the Householder QR factorization (real case) . . . . . . . . 87

6.2.3. The complex case (optional) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

6.2.4. A routine for computing the Householder vector . . . . . . . . . . . . . . . . . 88

6.3. Householder QR Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

6.4. Forming Q . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

6.5. Applying QH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

6.6. Blocked Householder QR Factorization . . . . . . . . . . . . . . . . . . . . . . . . . 99

6.6.1. The UT transform: Accumulating Householder transformations . . . . . . . . . 99

6.6.2. A blocked algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

6.6.3. Variations on a theme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104


6.1 Motivation

A fundamental problem to avoid in numerical codes is the situation where one starts with large values and one ends up with small values with large relative errors in them. This is known as catastrophic cancellation. The Gram-Schmidt algorithms can inherently fall victim to this: column a_j is successively reduced in length as components in the directions of q0, . . . ,q_{j−1} are subtracted, leaving a small vector if a_j was almost in the span of the first j columns of A. Application of a unitary transformation to a matrix or vector inherently preserves length. Thus, it would be beneficial if the QR factorization could be implemented as the successive application of unitary transformations. The Householder QR factorization accomplishes this.

The first fundamental insight is that the product of unitary matrices is itself unitary. If, given A ∈ Cm×n (with m ≥ n), one could find a sequence of unitary matrices, H0, . . . ,Hn−1, such that
\[
H_{n-1} \cdots H_0 A = \begin{pmatrix} R \\ 0 \end{pmatrix},
\]
where R ∈ Cn×n is upper triangular, then
\[
A = \underbrace{H_0 \cdots H_{n-1}}_{Q} \begin{pmatrix} R \\ 0 \end{pmatrix}
= Q \begin{pmatrix} R \\ 0 \end{pmatrix}
= \begin{pmatrix} Q_L & Q_R \end{pmatrix} \begin{pmatrix} R \\ 0 \end{pmatrix}
= Q_L R,
\]
where Q_L equals the first n columns of Q. Then A = Q_L R is the QR factorization of A. The second fundamental insight will be that the desired unitary transformations H0, . . . ,Hn−1 can be computed and applied cheaply.

6.2 Householder Transformations (Reflectors)

6.2.1 The general case

In this section we discuss Householder transformations, also referred to as reflectors.

Definition 6.1 Let u ∈ Cn be a vector of unit length (‖u‖2 = 1). Then H = I − 2uu^H is said to be a reflector or Householder transformation.

We observe:

• Any vector z that is perpendicular to u is left unchanged:

(I−2uuH)z = z−2u(uHz) = z.

• Any vector x can be written as x = z + (u^H x)u, where z is perpendicular to u and (u^H x)u is the component of x in the direction of u. Then
\[
(I - 2uu^H)x = (I - 2uu^H)(z + (u^H x)u)
= z + (u^H x)u - 2u\underbrace{u^H z}_{0} - 2(u^H x)\underbrace{u^H u}_{1}\,u
= z - (u^H x)u.
\]


Figure 6.1: Left: Illustration that shows how, given vector x and unit length vector u, the subspace orthogonal to u becomes a mirror for reflecting x, represented by the transformation (I − 2uu^H). Right: Illustration that shows how to compute u given vectors x and y with ‖x‖2 = ‖y‖2 (via v = x − y).

This can be interpreted as follows: The space perpendicular to u acts as a "mirror": any vector in that space (along the mirror) is not reflected, while any other vector has the component that is orthogonal to the space (the component outside, orthogonal to, the mirror) reversed in direction, as illustrated in Figure 6.1. Notice that a reflection preserves the length of the vector.

Homework 6.2 Show that if H is a reflector, then

• HH = I (reflecting the reflection of a vector results in the original vector),

• H = H^H, and

• H^H H = I (a reflection is a unitary matrix and thus preserves the norm).

* SEE ANSWER

Next, let us ask the question of how to reflect a given x ∈ Cn into another vector y ∈ Cn with ‖x‖2 = ‖y‖2. In other words, how do we compute vector u so that (I − 2uu^H)x = y? From our discussion above, we need to find a vector u that is perpendicular to the space with respect to which we will reflect. From Figure 6.1 (right) we notice that the vector from y to x, v = x − y, is perpendicular to the desired space. Thus, u must equal a unit vector in the direction v: u = v/‖v‖2.

Remark 6.3 In subsequent discussion we will prefer to give Householder transformations as I − uu^H/τ, where τ = u^H u/2, so that u no longer needs to be a unit vector, just a direction. The reason for this will become obvious later.

In the next subsection, we will need to find a Householder transformation H that maps a vector x to a multiple of the first unit basis vector (e0).

Let us first discuss how to find H in the case where x ∈ Rn. We seek v so that
\[
\left( I - \frac{2}{v^T v} v v^T \right) x = \pm\|x\|_2 e_0.
\]
Since the resulting vector that we want is y = ±‖x‖2 e0, we must choose v = x − y = x ∓ ‖x‖2 e0.


Homework 6.4 Show that if x ∈ Rn, v = x ∓ ‖x‖2 e0, and τ = v^T v/2, then (I − (1/τ) v v^T) x = ±‖x‖2 e0.

* SEE ANSWER

In practice, we choose v = x + sign(χ1)‖x‖2 e0, where χ1 denotes the first element of x. The reason is as follows: the first element of v, ν1, will be ν1 = χ1 ∓ ‖x‖2. If χ1 is positive and ‖x‖2 is almost equal to χ1, then χ1 − ‖x‖2 is a small number and if there is error in χ1 and/or ‖x‖2, this error becomes large relative to the result χ1 − ‖x‖2, due to catastrophic cancellation. Regardless of whether χ1 is positive or negative, we can avoid this by choosing v = x + sign(χ1)‖x‖2 e0.

6.2.2 As implemented for the Householder QR factorization (real case)

Next, we discuss a slight variant on the above discussion that is used in practice. To do so, we view x as a vector that consists of its first element, χ1, and the rest of the vector, x2: More precisely, partition

\[
x = \begin{pmatrix} \chi_1 \\ x_2 \end{pmatrix},
\]
where χ1 equals the first element of x and x2 is the rest of x. Then we will wish to find a Householder vector
\[
u = \begin{pmatrix} 1 \\ u_2 \end{pmatrix}
\]
so that
\[
\left( I - \frac{1}{\tau} \begin{pmatrix} 1 \\ u_2 \end{pmatrix} \begin{pmatrix} 1 \\ u_2 \end{pmatrix}^T \right)
\begin{pmatrix} \chi_1 \\ x_2 \end{pmatrix}
= \begin{pmatrix} \pm\|x\|_2 \\ 0 \end{pmatrix}.
\]

Notice that y in the previous discussion equals the vector (±‖x‖2, 0)^T, so the direction of u is given by
\[
v = \begin{pmatrix} \chi_1 \mp \|x\|_2 \\ x_2 \end{pmatrix}.
\]

We now wish to normalize this vector so its first entry equals “1”:

\[
u = \frac{v}{\nu_1} = \frac{1}{\chi_1 \mp \|x\|_2} \begin{pmatrix} \chi_1 \mp \|x\|_2 \\ x_2 \end{pmatrix}
= \begin{pmatrix} 1 \\ x_2/\nu_1 \end{pmatrix},
\]

where ν1 = χ1∓‖x‖2 equals the first element of v. (Note that if ν1 = 0 then u2 can be set to 0.)

6.2.3 The complex case (optional)

Next, let us work out the complex case, dealing explicitly with x as a vector that consists of its first element, χ1, and the rest of the vector, x2: More precisely, partition
\[
x = \begin{pmatrix} \chi_1 \\ x_2 \end{pmatrix},
\]


where χ1 equals the first element of x and x2 is the rest of x. Then we will wish to find a Householder vector
\[
u = \begin{pmatrix} 1 \\ u_2 \end{pmatrix}
\]
so that
\[
\left( I - \frac{1}{\tau} \begin{pmatrix} 1 \\ u_2 \end{pmatrix} \begin{pmatrix} 1 \\ u_2 \end{pmatrix}^H \right)
\begin{pmatrix} \chi_1 \\ x_2 \end{pmatrix}
= \begin{pmatrix} ©\pm\|x\|_2 \\ 0 \end{pmatrix}.
\]

Here ©± denotes a complex scalar on the complex unit circle. By the same argument as before,
\[
v = \begin{pmatrix} \chi_1 - ©\pm\|x\|_2 \\ x_2 \end{pmatrix}.
\]

We now wish to normalize this vector so its first entry equals “1”:

\[
u = \frac{v}{\nu_1} = \frac{1}{\chi_1 - ©\pm\|x\|_2} \begin{pmatrix} \chi_1 - ©\pm\|x\|_2 \\ x_2 \end{pmatrix}
= \begin{pmatrix} 1 \\ x_2/\nu_1 \end{pmatrix},
\]

where ν1 = χ1−©±‖x‖2. (If ν1 = 0 then we set u2 to 0.)

Homework 6.5 Verify that
\[
\left( I - \frac{1}{\tau} \begin{pmatrix} 1 \\ u_2 \end{pmatrix} \begin{pmatrix} 1 \\ u_2 \end{pmatrix}^H \right)
\begin{pmatrix} \chi_1 \\ x_2 \end{pmatrix}
= \begin{pmatrix} \rho \\ 0 \end{pmatrix},
\]
where τ = u^H u/2 = (1 + u_2^H u_2)/2 and ρ = ©±‖x‖2.

Hint: ρ̄ρ = |ρ|² = ‖x‖2², since H preserves the norm. Also, ‖x‖2² = |χ1|² + ‖x2‖2² and z/√(z̄z) = z/|z|.

* SEE ANSWER

Again, the choice of ©± is important. For the complex case we choose ©± = −sign(χ1) = −χ1/|χ1|.

6.2.4 A routine for computing the Householder vector

We will refer to the vector
\[
\begin{pmatrix} 1 \\ u_2 \end{pmatrix}
\]
as the Householder vector that reflects x into ©±‖x‖2 e0 and introduce the notation
\[
\left[ \begin{pmatrix} \rho \\ u_2 \end{pmatrix}, \tau \right] := \text{Housev}\begin{pmatrix} \chi_1 \\ x_2 \end{pmatrix}
\]
for the computation of the above mentioned vector u2, and scalars ρ and τ, from vector x. We will use the notation H(x) for the transformation I − (1/τ)uu^H where u and τ are computed by Housev(x).


Algorithm: [ (ρ; u2), τ ] = Housev( (χ1; x2) )

Simple formulation:
  ρ = −sign(χ1) ‖x‖2
  ν1 = χ1 + sign(χ1) ‖x‖2
  u2 = x2/ν1
  τ = (1 + u2^H u2)/2

Efficient computation:
  χ2 := ‖x2‖2
  α := ‖(χ1; χ2)‖2   (= ‖x‖2)
  ρ := −sign(χ1) α
  ν1 := χ1 − ρ
  u2 := x2/ν1
  χ2 := χ2/|ν1|      (= ‖u2‖2)
  τ := (1 + χ2²)/2

Figure 6.2: Computing the Householder transformation. Left: simple formulation. Right: efficient computation. Note: I have not completely double-checked these formulas for the complex case. They work for the real case.

6.3 Householder QR Factorization

Let A be an m×n matrix with m ≥ n. We will now show how to compute A → QR, the QR factorization, as a sequence of Householder transformations applied to A, which eventually zeroes out all elements of that matrix below the diagonal. The process is illustrated in Figure 6.3.

In the first iteration, we partition

\[
A \to \begin{pmatrix} \alpha_{11} & a_{12}^T \\ a_{21} & A_{22} \end{pmatrix}.
\]
Let
\[
\left[ \begin{pmatrix} \rho_{11} \\ u_{21} \end{pmatrix}, \tau_1 \right] = \text{Housev}\begin{pmatrix} \alpha_{11} \\ a_{21} \end{pmatrix}
\]
be the Householder transform computed from the first column of A. Then applying this Householder transform to A yields
\[
\begin{pmatrix} \alpha_{11} & a_{12}^T \\ a_{21} & A_{22} \end{pmatrix} :=
\left( I - \frac{1}{\tau_1} \begin{pmatrix} 1 \\ u_{21} \end{pmatrix} \begin{pmatrix} 1 \\ u_{21} \end{pmatrix}^H \right)
\begin{pmatrix} \alpha_{11} & a_{12}^T \\ a_{21} & A_{22} \end{pmatrix}
= \begin{pmatrix} \rho_{11} & a_{12}^T - w_{12}^T \\ 0 & A_{22} - u_{21} w_{12}^T \end{pmatrix},
\]
where w_{12}^T = (a_{12}^T + u_{21}^H A_{22})/τ1. Computation of a full QR factorization of A will now proceed with the updated matrix A22.


Figure 6.3: Illustration of Householder QR factorization. In each iteration, [ (ρ11; u21), τ1 ] = Housev( (α11; a21) ) is computed from the current column, after which the update ( α11 a12^T ; a21 A22 ) := ( ρ11 a12^T − w12^T ; 0 A22 − u21 w12^T ) annihilates the entries below the diagonal of that column, and the computation "moves forward" to the next column; after n iterations all entries below the diagonal have been zeroed.

Page 107: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

6.3. Householder QR Factorization 91

Now let us assume that after k iterations of the algorithm matrix A contains
\[
A \to \begin{pmatrix} R_{TL} & R_{TR} \\ 0 & A_{BR} \end{pmatrix}
= \begin{pmatrix} R_{00} & r_{01} & R_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix},
\]
where R_TL and R_00 are k×k upper triangular matrices. Let
\[
\left[ \begin{pmatrix} \rho_{11} \\ u_{21} \end{pmatrix}, \tau_1 \right] = \text{Housev}\begin{pmatrix} \alpha_{11} \\ a_{21} \end{pmatrix}
\]
and update

\[
A :=
\begin{pmatrix} I & 0 \\ 0 & I - \frac{1}{\tau_1}\begin{pmatrix} 1 \\ u_{21} \end{pmatrix}\begin{pmatrix} 1 \\ u_{21} \end{pmatrix}^H \end{pmatrix}
\begin{pmatrix} R_{00} & r_{01} & R_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix}
= \left( I - \frac{1}{\tau_1}\begin{pmatrix} 0 \\ 1 \\ u_{21} \end{pmatrix}\begin{pmatrix} 0 \\ 1 \\ u_{21} \end{pmatrix}^H \right)
\begin{pmatrix} R_{00} & r_{01} & R_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix}
\]
\[
= \begin{pmatrix} R_{00} & r_{01} & R_{02} \\ 0 & \rho_{11} & a_{12}^T - w_{12}^T \\ 0 & 0 & A_{22} - u_{21} w_{12}^T \end{pmatrix},
\]
where again w_{12}^T = (a_{12}^T + u_{21}^H A_{22})/τ1.

Let
\[
H_k = I - \frac{1}{\tau_1}\begin{pmatrix} 0_k \\ 1 \\ u_{21} \end{pmatrix}\begin{pmatrix} 0_k \\ 1 \\ u_{21} \end{pmatrix}^H
\]
be the Householder transform so computed during the (k+1)-st iteration. Then upon completion matrix A contains
\[
R = \begin{pmatrix} R_{TL} \\ 0 \end{pmatrix} = H_{n-1} \cdots H_1 H_0 A,
\]
where A denotes the original contents of A and R_TL is an upper triangular matrix. Rearranging this we find that
\[
A = H_0 H_1 \cdots H_{n-1} R,
\]
which shows that if Q = H_0 H_1 \cdots H_{n-1}, then A = QR.


Homework 6.6 Show that
\[
\begin{pmatrix} I & 0 \\ 0 & I - \frac{1}{\tau_1}\begin{pmatrix} 1 \\ u_2 \end{pmatrix}\begin{pmatrix} 1 \\ u_2 \end{pmatrix}^H \end{pmatrix}
= I - \frac{1}{\tau_1}\begin{pmatrix} 0 \\ 1 \\ u_2 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \\ u_2 \end{pmatrix}^H.
\]

* SEE ANSWER

Typically, the algorithm overwrites the original matrix A with the upper triangular matrix, and at each step u21 is stored over the elements that become zero, thus overwriting a21. (It is for this reason that the first element of u was normalized to equal "1".) In this case Q is usually not explicitly formed as it can be stored as the separate Householder vectors below the diagonal of the overwritten matrix. The algorithm that overwrites A in this manner is given in Fig. 6.4.

We will let
[U\R, t] = HouseholderQR(A)
denote the operation that computes the QR factorization of an m×n matrix A, with m ≥ n, via Householder transformations. It returns the Householder vectors and matrix R in the first argument and the vector of scalars "τi" that are computed as part of the Householder transformations in t.

Theorem 6.7 Given A ∈ Cm×n, the cost of the algorithm in Figure 6.4 is given by
\[
C_{HQR}(m,n) \approx 2mn^2 - \frac{2}{3}n^3 \text{ flops}.
\]

Proof: The bulk of the computation is in w_{12}^T = (a_{12}^T + u_{21}^H A_{22})/τ1 and A_{22} − u_{21} w_{12}^T. During the k-th iteration (when R_TL is k×k), this means a matrix-vector multiplication (u_{21}^H A_{22}) and a rank-1 update with matrix A_{22}, which is of size approximately (m−k)×(n−k), for a cost of 4(m−k)(n−k) flops. Thus the total cost is approximately
\[
\sum_{k=0}^{n-1} 4(m-k)(n-k) = 4\sum_{j=0}^{n-1}(m-n+j)j = 4(m-n)\sum_{j=0}^{n-1}j + 4\sum_{j=0}^{n-1}j^2
= 2(m-n)n(n-1) + 4\sum_{j=0}^{n-1}j^2
\]
\[
\approx 2(m-n)n^2 + 4\int_0^n x^2\,dx = 2mn^2 - 2n^3 + \frac{4}{3}n^3 = 2mn^2 - \frac{2}{3}n^3.
\]

6.4 Forming Q

Given A ∈ Cm×n, let [A, t] = HouseholderQR(A) yield the matrix A with the Householder vectors stored below the diagonal, R stored on and above the diagonal, and the τi stored in vector t. We now discuss how to form the first n columns of Q = H0 H1 · · · Hn−1. The computation is illustrated in Figure 6.5.


Algorithm: [A, t] = HOUSEQR_UNB_VAR1(A)
Partition A → ( ATL ATR ; ABL ABR ) and t → ( tT ; tB )
  where ATL is 0×0 and tT has 0 elements
while n(ABR) ≠ 0 do
  Repartition
    ( ATL ATR ; ABL ABR ) → ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 ),
    ( tT ; tB ) → ( t0 ; τ1 ; t2 )
    where α11 and τ1 are scalars

  [ (α11; a21), τ1 ] := [ (ρ11; u21), τ1 ] = Housev( (α11; a21) )
  Update ( a12^T ; A22 ) := ( I − (1/τ1) (1; u21)(1 u21^H) ) ( a12^T ; A22 ) via the steps
    • w12^T := (a12^T + a21^H A22)/τ1
    • ( a12^T ; A22 ) := ( a12^T − w12^T ; A22 − a21 w12^T )

  Continue with
    ( ATL ATR ; ABL ABR ) ← ( A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 ),
    ( tT ; tB ) ← ( t0 ; τ1 ; t2 )
endwhile

Figure 6.4: Unblocked Householder transformation based QR factorization.
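A minimal M-script sketch of the algorithm in Figure 6.4 with explicit (1-based) indexing instead of the FLAME notation (our own translation, real case; it uses the Housev sketch from Section 6.2.4):

function [ A, t ] = HouseQR( A )
% Sketch: Householder QR. On return, R is on and above the diagonal of A,
% the Householder vectors are below the diagonal, and t holds the tau's.
  [ m, n ] = size( A );
  t = zeros( n, 1 );
  for j = 1:n
    [ rho, u2, tau ] = Housev( A( j, j ), A( j+1:m, j ) );
    A( j, j ) = rho;
    A( j+1:m, j ) = u2;                 % store u21 over the zeroed a21
    t( j ) = tau;
    w12 = ( A( j, j+1:n ) + u2' * A( j+1:m, j+1:n ) ) / tau;
    A( j, j+1:n ) = A( j, j+1:n ) - w12;               % a12^T := a12^T - w12^T
    A( j+1:m, j+1:n ) = A( j+1:m, j+1:n ) - u2 * w12;  % A22 := A22 - u21 w12^T
  end
end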


Figure 6.5: Illustration of the computation of Q. Starting with the identity in the last columns, each step applies the update ( α11 a12^T ; a21 A22 ) := ( 1 − 1/τ1, −(u21^H A22)/τ1 ; −u21/τ1, A22 + u21 a12^T ), filling in one more column of Q, moving from the last Householder vector toward the first.

Notice that to pick out the first n columns we must form
\[
Q \begin{pmatrix} I_{n\times n} \\ 0 \end{pmatrix}
= H_0 \cdots H_{n-1} \begin{pmatrix} I_{n\times n} \\ 0 \end{pmatrix}
= H_0 \cdots H_{k-1} \underbrace{H_k \cdots H_{n-1} \begin{pmatrix} I_{n\times n} \\ 0 \end{pmatrix}}_{B_k},
\]


where Bk is defined as indicated.

Lemma 6.8 B_k has the form
\[
B_k = H_k \cdots H_{n-1} \begin{pmatrix} I_{n\times n} \\ 0 \end{pmatrix}
= \begin{pmatrix} I_{k\times k} & 0 \\ 0 & \tilde{B}_k \end{pmatrix}.
\]

Proof: The proof of this is by induction on k:

• Base case: k = n. Then
\[
B_n = \begin{pmatrix} I_{n\times n} \\ 0 \end{pmatrix},
\]
which has the desired form.

• Inductive step: Assume the result is true for B_k. We show it is true for B_{k−1}:
\[
B_{k-1} = H_{k-1} H_k \cdots H_{n-1} \begin{pmatrix} I_{n\times n} \\ 0 \end{pmatrix}
= H_{k-1} B_k
= H_{k-1} \begin{pmatrix} I_{k\times k} & 0 \\ 0 & \tilde{B}_k \end{pmatrix}
\]
\[
= \begin{pmatrix} I_{(k-1)\times(k-1)} & 0 \\ 0 & I - \frac{1}{\tau_k}\begin{pmatrix} 1 \\ u_k \end{pmatrix}\begin{pmatrix} 1 & u_k^H \end{pmatrix} \end{pmatrix}
\begin{pmatrix} I_{(k-1)\times(k-1)} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \tilde{B}_k \end{pmatrix}
= \begin{pmatrix} I_{(k-1)\times(k-1)} & 0 \\ 0 & \left( I - \frac{1}{\tau_k}\begin{pmatrix} 1 \\ u_k \end{pmatrix}\begin{pmatrix} 1 & u_k^H \end{pmatrix} \right)\begin{pmatrix} 1 & 0 \\ 0 & \tilde{B}_k \end{pmatrix} \end{pmatrix}
\]
\[
= \begin{pmatrix} I_{(k-1)\times(k-1)} & 0 \\ 0 & \begin{pmatrix} 1 & 0 \\ 0 & \tilde{B}_k \end{pmatrix} - \begin{pmatrix} 1 \\ u_k \end{pmatrix}\begin{pmatrix} 1/\tau_k & y_k^T \end{pmatrix} \end{pmatrix}
\quad \text{where } y_k^T = u_k^H \tilde{B}_k / \tau_k
\]
\[
= \begin{pmatrix} I_{(k-1)\times(k-1)} & 0 & 0 \\ 0 & 1 - 1/\tau_k & -y_k^T \\ 0 & -u_k/\tau_k & \tilde{B}_k - u_k y_k^T \end{pmatrix}
= \begin{pmatrix} I_{(k-1)\times(k-1)} & 0 \\ 0 & \tilde{B}_{k-1} \end{pmatrix}.
\]

• By the Principle of Mathematical Induction the result holds for B_0, . . . ,B_n.

Theorem 6.9 Given [A, t] = HouseholderQR(A) from Figure 6.4, the algorithm in Figure 6.6 overwrites A with the first n = n(A) columns of Q, as defined by the Householder transformations stored below the diagonal of A and in the vector t.


Algorithm: [A] = FORMQ(A, t)

  Partition A → [ A_TL  A_TR ; A_BL  A_BR ] and t → [ t_T ; t_B ]
    where A_TL is n(A)×n(A) and t_T has n(A) elements
  while n(A_TL) ≠ 0 do
    Repartition (moving the thick lines backward)
      [ A_TL  A_TR ; A_BL  A_BR ] → [ A00  a01  A02 ; a10^T  α11  a12^T ; A20  a21  A22 ]
      and [ t_T ; t_B ] → [ t0 ; τ1 ; t2 ]
      where α11 and τ1 are scalars
    Update [ α11  a12^T ; a21  A22 ] := ( I − (1/τ1) [ 1 ; u21 ] [ 1  u21^H ] ) [ 1  0 ; 0  A22 ] via the steps
      • α11 := 1 − 1/τ1
      • a12^T := −(a21^H A22)/τ1
      • A22 := A22 + a21 a12^T
      • a21 := −a21/τ1
    (here u21 is the Householder vector stored in a21)
    Continue with the exposed submatrices absorbed back into the quadrants
  endwhile

Figure 6.6: Algorithm for overwriting A with Q from the Householder transformations stored as Householder vectors below the diagonal of A (as produced by [A, t] = HouseholderQR(A)).


Proof: The algorithm is justified by the proof of Lemma 6.8.

Theorem 6.10 Given A ∈ C^{m×n}, the cost of the algorithm in Figure 6.6 is given by

  C_FormQ(m, n) ≈ 2mn² − (2/3)n³ flops.

Proof: The proof for Theorem 6.7 can be easily modified to establish this result.
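A minimal numpy sketch of Figure 6.6, marching from the bottom-right corner back to the top-left; form_q is our name, real arithmetic is assumed, and A, t are taken to be the outputs of the houseqr_unb sketch given earlier:

    def form_q(A, t):
        # Overwrite the factored A with the first n columns of Q.
        A = A.copy()
        n = A.shape[1]
        for k in range(n - 1, -1, -1):
            u2, tau = A[k+1:, k].copy(), t[k]
            A[k, k+1:] = -(u2 @ A[k+1:, k+1:]) / tau   # a12 := -(u21^T A22)/tau
            A[k+1:, k+1:] += np.outer(u2, A[k, k+1:])  # A22 := A22 + u21 a12
            A[k, k] = 1 - 1 / tau                      # alpha11 := 1 - 1/tau
            A[k+1:, k] = -u2 / tau                     # a21 := -u21/tau
        return A

A quick sanity check of the sketch is that Q = form_q(*houseqr_unb(A)) should satisfy Q^T Q ≈ I and Q R ≈ A.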

Homework 6.11 If m = n then Q could be accumulated by the sequence

  Q = (··· ((I H0) H1) ··· Hn−1).

Give a high-level reason why this would be (much) more expensive than the algorithm in Figure 6.6.
* SEE ANSWER

6.5 Applying Q^H

In a future note, we will see that the QR factorization is used to solve the linear least-squares problem. To do so, we need to be able to compute y := Q^H y where Q^H = Hn−1 ··· H0.

Let us start by computing H0 y:

  ( I − (1/τ1) [ 1 ; u2 ] [ 1 ; u2 ]^H ) [ ψ1 ; y2 ] = [ ψ1 ; y2 ] − ω1 [ 1 ; u2 ] = [ ψ1 − ω1 ; y2 − ω1 u2 ],

where ω1 = ( [ 1 ; u2 ]^H [ ψ1 ; y2 ] )/τ1 = (ψ1 + u2^H y2)/τ1. More generally, let us compute Hk y:

  ( I − (1/τ1) [ 0 ; 1 ; u2 ] [ 0 ; 1 ; u2 ]^H ) [ y0 ; ψ1 ; y2 ] = [ y0 ; ψ1 − ω1 ; y2 − ω1 u2 ],

where ω1 = (ψ1 + u2^H y2)/τ1. This motivates the algorithm in Figure 6.7 for computing y := Hn−1 ··· H0 y given the output matrix A and vector t from routine HouseholderQR.

The cost of this algorithm can be analyzed as follows: when y_T is of length k, the bulk of the computation is in an inner product with vectors of length m − k (to compute ω1) and an axpy operation with vectors of length m − k to subsequently update ψ1 and y2. Thus, the cost is approximately given by

  ∑_{k=0}^{n−1} 4(m − k) ≈ 4mn − 2n².

Notice that this is much cheaper than forming Q and then multiplying.


Algorithm: [y] = APPLYQT(A, t, y)

  Partition A → [ A_TL  A_TR ; A_BL  A_BR ], t → [ t_T ; t_B ], and y → [ y_T ; y_B ]
    where A_TL is 0×0 and t_T, y_T have 0 elements
  while n(A_BR) ≠ 0 do
    Repartition
      [ A_TL  A_TR ; A_BL  A_BR ] → [ A00  a01  A02 ; a10^T  α11  a12^T ; A20  a21  A22 ],
      [ t_T ; t_B ] → [ t0 ; τ1 ; t2 ], and [ y_T ; y_B ] → [ y0 ; ψ1 ; y2 ]
      where α11, τ1, and ψ1 are scalars
    Update [ ψ1 ; y2 ] := ( I − (1/τ1) [ 1 ; u21 ] [ 1  u21^H ] ) [ ψ1 ; y2 ] via the steps
      • ω1 := (ψ1 + a21^H y2)/τ1
      • [ ψ1 ; y2 ] := [ ψ1 − ω1 ; y2 − ω1 a21 ]
    (here u21 is the Householder vector stored in a21)
    Continue with the exposed submatrices absorbed back into the quadrants
  endwhile

Figure 6.7: Algorithm for computing y := Hn−1 ··· H0 y given the output from routine HouseholderQR.
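The corresponding numpy sketch of Figure 6.7 (our naming, real arithmetic, and A, t from the houseqr_unb sketch given earlier):

    def apply_qt(A, t, y):
        # Compute y := H_{n-1} ... H_0 y from the stored Householder vectors.
        y = y.astype(float)
        n = A.shape[1]
        for k in range(n):
            u2 = A[k+1:, k]
            omega = (y[k] + u2 @ y[k+1:]) / t[k]  # omega1 := (psi1 + u21^T y2)/tau1
            y[k] -= omega                         # psi1 := psi1 - omega1
            y[k+1:] -= omega * u2                 # y2 := y2 - omega1 u21
        return y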


6.6 Blocked Householder QR Factorization

6.6.1 The UT transform: Accumulating Householder transformations

Through a series of exercises, we will show how the application of a sequence of k Householder transformations can be accumulated. What we mean by this exactly will become clear.

Homework 6.12 Assuming all inverses exist, show that

  [ T00  t01 ; 0  τ1 ]^{−1} = [ T00^{−1}  −T00^{−1} t01/τ1 ; 0  1/τ1 ].

* SEE ANSWER

An algorithm that computes the inverse of an upper triangular matrix T based on the above exercise is given in Figure 6.8.

Homework 6.14 Consider u1 ∈ C^m with u1 ≠ 0 (the zero vector), U0 ∈ C^{m×k}, and nonsingular T00 ∈ C^{k×k}. Define τ1 = (u1^H u1)/2, so that

  H1 = I − (1/τ1) u1 u1^H

equals a Householder transformation, and let

  Q0 = I − U0 T00^{−1} U0^H.

Show that

  Q0 H1 = ( I − U0 T00^{−1} U0^H )( I − (1/τ1) u1 u1^H ) = I − ( U0  u1 ) [ T00  t01 ; 0  τ1 ]^{−1} ( U0  u1 )^H,

where t01 = U0^H u1.
* SEE ANSWER

Homework 6.16 Consider ui ∈ C^m with ui ≠ 0 (the zero vector). Define τi = (ui^H ui)/2, so that

  Hi = I − (1/τi) ui ui^H

equals a Householder transformation, and let

  U = ( u0  u1  ···  uk−1 ).

Show that

  H0 H1 ··· Hk−1 = I − U T^{−1} U^H,

where T is an upper triangular matrix.
* SEE ANSWER


Algorithm: [T] := UTRINV_UNB_VAR1(T)

  Partition T → [ T_TL  T_TR ; T_BL  T_BR ]
    where T_TL is 0×0
  while m(T_TL) < m(T) do
    Repartition
      [ T_TL  T_TR ; T_BL  T_BR ] → [ T00  t01  T02 ; t10^T  τ11  t12^T ; T20  t21  T22 ]
      where τ11 is 1×1
    t01 := −T00 t01/τ11
    τ11 := 1/τ11
    Continue with the exposed submatrices absorbed back into the quadrants
  endwhile

Figure 6.8: Unblocked algorithm for inverting an upper triangular matrix. The algorithm assumes that T00 has already been inverted, and computes the next column of the inverse in the current iteration.
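A minimal numpy sketch of Figure 6.8 (utrinv_unb is our name, real arithmetic). The in-place invariant is that T[:k, :k] already holds its own inverse when column k is processed:

    def utrinv_unb(T):
        # Invert an upper triangular matrix one column at a time.
        T = T.astype(float)
        for k in range(T.shape[0]):
            T[:k, k] = -T[:k, :k] @ T[:k, k] / T[k, k]  # t01 := -inv(T00) t01 / tau11
            T[k, k] = 1 / T[k, k]                       # tau11 := 1/tau11
        return T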


Algorithm: [T] := FORMT_UNB_VAR1(U, t, T)

  Partition U → [ U_TL  U_TR ; U_BL  U_BR ], t → [ t_T ; t_B ], T → [ T_TL  T_TR ; T_BL  T_BR ]
    where U_TL is 0×0, t_T has 0 rows, T_TL is 0×0
  while m(U_TL) < m(U) do
    Repartition
      [ U_TL  U_TR ; U_BL  U_BR ] → [ U00  u01  U02 ; u10^T  υ11  u12^T ; U20  u21  U22 ],
      [ t_T ; t_B ] → [ t0 ; τ1 ; t2 ], and [ T_TL  T_TR ; T_BL  T_BR ] → [ T00  t01  T02 ; t10^T  τ11  t12^T ; T20  t21  T22 ]
      where υ11 is 1×1, τ1 has 1 row, τ11 is 1×1
    t01 := U0^H u1   (U0 denotes the already-processed columns of U and u1 the current column)
    τ11 := τ1
    Continue with the exposed submatrices absorbed back into the quadrants
  endwhile

Figure 6.9: Algorithm that computes T from U and the vector t so that I − U T^{−1} U^H equals the UT transform. Here U is assumed to be the output of, for example, an unblocked Householder based QR algorithm. This means that it is a lower trapezoidal matrix, with ones on the diagonal.


The above exercises can be summarized in the algorithm for computing T from U in Figure 6.9. In [27] we call the transformation I − U T^{−1} U^H that equals the accumulated Householder transformations the UT transform, and prove that T can instead be computed as

  T = triu(U^H U)

(the upper triangular part of U^H U) followed by either dividing the diagonal elements by two or setting them to τ0, . . . , τk−1 (in order). In that paper, we point out similar published results [9, 33, 44, 31].
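The result quoted from [27] gives a one-line way to form T; a minimal numpy sketch (our naming), where U is the unit lower trapezoidal matrix of Householder vectors and t holds τ0, . . . , τk−1:

    def form_t(U, t):
        # T = triu(U^T U) with the diagonal set to the tau values
        # (equivalently, with the diagonal of triu(U^T U) divided by two).
        T = np.triu(U.T @ U)
        np.fill_diagonal(T, t)
        return T

The two diagonal choices agree because, with a unit first element, ui^H ui = 2τi.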

6.6.2 A blocked algorithm

A QR factorization that exploits the insights that resulted in the UT transform can now be described:

• Partition

    A → [ A11  A12 ; A21  A22 ]

  where A11 is b×b.

• We can use the unblocked algorithm in Figure 6.4 to factor the panel [ A11 ; A21 ]:

    [ [ A11 ; A21 ], t1 ] := HOUSEQR_UNB_VAR1( [ A11 ; A21 ] ),

  overwriting the entries below the diagonal with the Householder vectors [ U11 ; U21 ] (with the ones on the diagonal implicitly stored) and the upper triangular part with R11.

• Form T11 from the Householder vectors using the procedure described in Section 6.6.1:

    T11 := FORMT( [ A11 ; A21 ], t1 ).

• Now we need to also apply the Householder transformations to the rest of the columns:

    [ A12 ; A22 ] := ( I − [ U11 ; U21 ] T11^{−1} [ U11 ; U21 ]^H )^H [ A12 ; A22 ]
                   = [ A12 ; A22 ] − [ U11 ; U21 ] W21^H = [ A12 − U11 W21^H ; A22 − U21 W21^H ],

  where W21^H = T11^{−H} ( U11^H A12 + U21^H A22 ).

This motivates the blocked algorithm in Figure 6.10.


Algorithm: [A, t] := HOUSEQR_BLK_VAR1(A, t)

  Partition A → [ A_TL  A_TR ; A_BL  A_BR ], t → [ t_T ; t_B ]
    where A_TL is 0×0, t_T has 0 rows
  while m(A_TL) < m(A) do
    Determine block size b
    Repartition
      [ A_TL  A_TR ; A_BL  A_BR ] → [ A00  A01  A02 ; A10  A11  A12 ; A20  A21  A22 ]
      and [ t_T ; t_B ] → [ t0 ; t1 ; t2 ]
      where A11 is b×b, t1 has b rows
    [ [ A11 ; A21 ], t1 ] := HOUSEQR_UNB_VAR1( [ A11 ; A21 ] )
    T11 := FORMT( [ A11 ; A21 ], t1 )
    W21^H := T11^{−H} ( U11^H A12 + U21^H A22 )
    [ A12 ; A22 ] := [ A12 − U11 W21^H ; A22 − U21 W21^H ]
    Continue with the exposed submatrices absorbed back into the quadrants
  endwhile

Figure 6.10: Blocked Householder transformation based QR factorization.


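Putting the pieces together, a minimal numpy sketch of the blocked factorization in Figure 6.10, reusing the houseqr_unb and form_t sketches given earlier (real arithmetic, m ≥ n; houseqr_blk and the default block size are our choices):

    def houseqr_blk(A, b=32):
        A = A.astype(float)
        m, n = A.shape
        t = np.zeros(n)
        for k in range(0, n, b):
            kb = min(b, n - k)
            # Factor the current panel with the unblocked algorithm.
            A[k:, k:k+kb], t[k:k+kb] = houseqr_unb(A[k:, k:k+kb])
            U = np.tril(A[k:, k:k+kb], -1)
            np.fill_diagonal(U, 1.0)          # the unit diagonal is implicit in storage
            T11 = form_t(U, t[k:k+kb])
            # (A12; A22) -= U W21 with W21 = inv(T11)^T U^T (A12; A22).
            W21 = np.linalg.solve(T11.T, U.T @ A[k:, k+kb:])
            A[k:, k+kb:] -= U @ W21
        return A, t

The point of the blocking is that the trailing update is now cast in terms of matrix-matrix multiplications, which achieve much higher performance than the vector operations of the unblocked algorithm.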

6.6.3 Variations on a theme

Merging the unblocked Householder QR factorization and the formation of T

There are many possible algorithms for computing the QR factorization. For example, the unblocked algorithm from Figure 6.4 can be merged with the unblocked algorithm for forming T in Figure 6.9 to yield the algorithm in Figure 6.11.

An alternative unblocked merged algorithm

Let us now again compute the QR factorization of A simultaneously with the forming of T, but now taking advantage of the fact that T is partially computed to change the algorithm into what some would consider a “left-looking” algorithm.

Partition

  A → [ A00  a01  A02 ; a10^T  α11  a12^T ; A20  a21  A22 ] and T → [ T00  t01  T02 ; 0  τ11  t12^T ; 0  0  T22 ].

Assume that [ A00 ; a10^T ; A20 ] has been factored and overwritten with [ U00 ; u10^T ; U20 ] and R00, while also computing T00. In the next step, we need to apply the previous Householder transformations to the next column of A and then update that column with the next column of R and U. In addition, the next column of T must be computed. This means:

• Update

    [ a01 ; α11 ; a21 ] := ( I − [ U00 ; u10^T ; U20 ] T00^{−1} [ U00 ; u10^T ; U20 ]^H )^H [ a01 ; α11 ; a21 ].

• Compute the next Householder transform:

    [ [ α11 ; a21 ], τ11 ] := Housev( [ α11 ; a21 ] ).

• Compute the rest of the next column of T:

    t01 := A20^H a21.

This yields the algorithm in Figure 6.12.


Algorithm: [A, T] := HOUSEQR_AND_FORMT_UNB_VAR1(A, T)

  Partition A → [ A_TL  A_TR ; A_BL  A_BR ], T → [ T_TL  T_TR ; T_BL  T_BR ]
    where A_TL is 0×0, T_TL is 0×0
  while m(A_TL) < m(A) do
    Repartition
      [ A_TL  A_TR ; A_BL  A_BR ] → [ A00  a01  A02 ; a10^T  α11  a12^T ; A20  a21  A22 ]
      and [ T_TL  T_TR ; T_BL  T_BR ] → [ T00  t01  T02 ; t10^T  τ11  t12^T ; T20  t21  T22 ]
      where α11 is 1×1, τ11 is 1×1
    [ [ α11 ; a21 ], τ11 ] := [ [ ρ11 ; u21 ], τ11 ] = Housev( [ α11 ; a21 ] )
    Update [ a12^T ; A22 ] := ( I − (1/τ11) [ 1 ; u21 ] [ 1  u21^H ] ) [ a12^T ; A22 ] via the steps
      • w12^T := (a12^T + a21^H A22)/τ11
      • [ a12^T ; A22 ] := [ a12^T − w12^T ; A22 − a21 w12^T ]
    t01 := A20^H a21
    Continue with the exposed submatrices absorbed back into the quadrants
  endwhile

Figure 6.11: Unblocked Householder transformation based QR factorization merged with the computation of T for the UT transform.


Algorithm: [A, T] := HOUSEQR_AND_FORMT_UNB_VAR2(A, T)

  Partition A → [ A_TL  A_TR ; A_BL  A_BR ], T → [ T_TL  T_TR ; T_BL  T_BR ]
    where A_TL is 0×0, T_TL is 0×0
  while m(A_TL) < m(A) do
    Repartition as in Figure 6.11, where α11 is 1×1, τ11 is 1×1
    [ a01 ; α11 ; a21 ] := ( I − [ U00 ; u10^T ; U20 ] T00^{−1} [ U00 ; u10^T ; U20 ]^H )^H [ a01 ; α11 ; a21 ]
      via the steps
      • w01 := T00^{−H} ( U00^H a01 + α11 (u10^T)^H + U20^H a21 )
      • [ a01 ; α11 ; a21 ] := [ a01 ; α11 ; a21 ] − [ U00 ; u10^T ; U20 ] w01
    [ [ α11 ; a21 ], τ11 ] := [ [ ρ11 ; u21 ], τ11 ] = Housev( [ α11 ; a21 ] )
    t01 := A20^H a21
    Continue with the exposed submatrices absorbed back into the quadrants
  endwhile

Figure 6.12: Alternative unblocked Householder transformation based QR factorization merged with the computation of T for the UT transform.


Alternative blocked algorithm (Variant 2)

An alternative blocked variant, which uses either of the unblocked factorization routines that merge the formation of T, is given in Figure 6.13.


Algorithm: [A, t] := HOUSEQR_BLK_VAR2(A, t)

  Partition A → [ A_TL  A_TR ; A_BL  A_BR ], t → [ t_T ; t_B ]
    where A_TL is 0×0, t_T has 0 rows
  while m(A_TL) < m(A) do
    Determine block size b
    Repartition
      [ A_TL  A_TR ; A_BL  A_BR ] → [ A00  A01  A02 ; A10  A11  A12 ; A20  A21  A22 ]
      and [ t_T ; t_B ] → [ t0 ; t1 ; t2 ]
      where A11 is b×b, t1 has b rows
    [ [ A11 ; A21 ], T11 ] := HOUSEQR_FORMT_UNB_VARX( [ A11 ; A21 ] )
    W21^H := T11^{−H} ( U11^H A12 + U21^H A22 )
    [ A12 ; A22 ] := [ A12 − U11 W21^H ; A22 − U21 W21^H ]
    Continue with the exposed submatrices absorbed back into the quadrants
  endwhile

Figure 6.13: Alternative blocked Householder transformation based QR factorization.


Chapter 7
Notes on Solving Linear Least-Squares Problems

For a motivation of the linear least-squares problem, read Week 10 (Sections 10.3-10.5) of

  Linear Algebra: Foundations to Frontiers - Notes to LAFF With [29].

Video

Read disclaimer regarding the videos in the preface!

No video... Camera ran out of memory...



Outline

Video
Outline
7.1. The Linear Least-Squares Problem
7.2. Method of Normal Equations
7.3. Solving the LLS Problem Via the QR Factorization
7.3.1. Simple derivation of the solution
7.3.2. Alternative derivation of the solution
7.4. Via Householder QR Factorization
7.5. Via the Singular Value Decomposition
7.5.1. Simple derivation of the solution
7.5.2. Alternative derivation of the solution
7.6. What If A Does Not Have Linearly Independent Columns?
7.7. Exercise: Using the LQ factorization to solve underdetermined systems


7.1 The Linear Least-Squares Problem

Let A ∈ C^{m×n} and y ∈ C^m. Then the linear least-squares problem (LLS) is given by

  Find x s.t. ‖Ax − y‖_2 = min_{z∈C^n} ‖Az − y‖_2.

In other words, x is the vector that minimizes the expression ‖Ax − y‖_2. Equivalently, we can solve

  Find x s.t. ‖Ax − y‖_2² = min_{z∈C^n} ‖Az − y‖_2².

If x solves the linear least-squares problem, then Ax is the vector in C(A) (the column space of A) closest to the vector y.

7.2 Method of Normal Equations

Let A ∈ R^{m×n} have linearly independent columns (which implies m ≥ n). Let f : R^n → R be defined by

  f(x) = ‖Ax − y‖_2² = (Ax − y)^T (Ax − y) = x^T A^T A x − x^T A^T y − y^T A x + y^T y = x^T A^T A x − 2 x^T A^T y + y^T y.

This function is minimized when the gradient is zero: ∇f(x) = 0. Now,

  ∇f(x) = 2 A^T A x − 2 A^T y.

If A has linearly independent columns then A^T A is nonsingular. Hence, the x that minimizes ‖Ax − y‖_2 solves A^T A x = A^T y. This is known as the method of normal equations. Notice that then

  x = (A^T A)^{−1} A^T y = A† y,

where A† = (A^T A)^{−1} A^T is known as the pseudo inverse or Moore-Penrose pseudo inverse.

In practice, one performs the following steps:

• Form B = A^T A, a symmetric positive-definite (SPD) matrix. Cost: approximately mn² floating point operations (flops), if one takes advantage of symmetry.

• Compute the Cholesky factor L, a lower triangular matrix, so that B = L L^T. This factorization, discussed in Week 8 (Section 8.4.2) of Linear Algebra: Foundations to Frontiers - Notes to LAFF With and to be revisited later in this course, exists since B is SPD. Cost: approximately (1/3)n³ flops.

• Compute ŷ = A^T y. Cost: 2mn flops.

• Solve Lz = ŷ and L^T x = z. Cost: n² flops each.


Thus, the total cost of solving the LLS problem via normal equations is approximately mn² + (1/3)n³ flops.
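The steps above translate directly into code; a minimal sketch using scipy's Cholesky helpers (real A with linearly independent columns assumed; the function name is ours):

    import numpy as np
    from scipy.linalg import cho_factor, cho_solve

    def lls_normal_equations(A, y):
        B = A.T @ A                    # ~ m n^2 flops, exploiting symmetry
        L = cho_factor(B, lower=True)  # B = L L^T, ~ n^3/3 flops
        return cho_solve(L, A.T @ y)   # solve L z = A^T y, then L^T x = z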

Remark 7.1 We will later discuss that if A is not well-conditioned (its columns are nearly linearly dependent), the Method of Normal Equations is numerically unstable because A^T A is ill-conditioned.

The above discussion can be generalized to the case where A ∈ C^{m×n}. In that case, x must solve A^H A x = A^H y.

A geometric explanation of the method of normal equations (for the case where A is real valued) can be found in Week 10 (Sections 10.3-10.5) of Linear Algebra: Foundations to Frontiers - Notes to LAFF With.

7.3 Solving the LLS Problem Via the QR Factorization

Assume A ∈ C^{m×n} has linearly independent columns and let A = Q_L R_TL be its QR factorization. We wish to compute the solution to the LLS problem: Find x ∈ C^n such that

  ‖Ax − y‖_2² = min_{z∈C^n} ‖Az − y‖_2².

7.3.1 Simple derivation of the solution

Notice that we know that, if A has linearly independent columns, the solution is given by x = (A^H A)^{−1} A^H y (the solution to the normal equations). Now,

  x = [A^H A]^{−1} A^H y                              (solution to the normal equations)
    = [(Q_L R_TL)^H (Q_L R_TL)]^{−1} (Q_L R_TL)^H y   (A = Q_L R_TL)
    = [R_TL^H Q_L^H Q_L R_TL]^{−1} R_TL^H Q_L^H y     ((BC)^H = C^H B^H)
    = [R_TL^H R_TL]^{−1} R_TL^H Q_L^H y               (Q_L^H Q_L = I)
    = R_TL^{−1} R_TL^{−H} R_TL^H Q_L^H y              ((BC)^{−1} = C^{−1} B^{−1})
    = R_TL^{−1} Q_L^H y                               (R_TL^{−H} R_TL^H = I).

Thus, the x that solves R_TL x = Q_L^H y solves the LLS problem.


7.3.2 Alternative derivation of the solution

We know that there exists a matrix Q_R such that Q = ( Q_L  Q_R ) is unitary. Now,

  min_{z∈C^n} ‖Az − y‖_2²
    = min_{z∈C^n} ‖Q_L R_TL z − y‖_2²                                    (substitute A = Q_L R_TL)
    = min_{z∈C^n} ‖Q^H (Q_L R_TL z − y)‖_2²                              (two-norm is preserved since Q^H is unitary)
    = min_{z∈C^n} ‖ [ Q_L^H ; Q_R^H ] Q_L R_TL z − [ Q_L^H ; Q_R^H ] y ‖_2²   (partitioning, distributing)
    = min_{z∈C^n} ‖ [ R_TL z ; 0 ] − [ Q_L^H y ; Q_R^H y ] ‖_2²          (partitioned matrix-matrix multiplication)
    = min_{z∈C^n} ‖ [ R_TL z − Q_L^H y ; −Q_R^H y ] ‖_2²                 (partitioned matrix addition)
    = min_{z∈C^n} ( ‖R_TL z − Q_L^H y‖_2² + ‖Q_R^H y‖_2² )               (property of the 2-norm: ‖[x ; y]‖_2² = ‖x‖_2² + ‖y‖_2²)
    = ( min_{z∈C^n} ‖R_TL z − Q_L^H y‖_2² ) + ‖Q_R^H y‖_2²               (Q_R^H y is independent of z)
    = ‖Q_R^H y‖_2²                                                       (minimized by the x that satisfies R_TL x = Q_L^H y).

Thus, the desired x that minimizes the linear least-squares problem solves R_TL x = Q_L^H y. The solution is unique because R_TL is nonsingular (because A has linearly independent columns).

In practice, one performs the following steps:

• Compute the QR factorization A = Q_L R_TL. If Gram-Schmidt or Modified Gram-Schmidt are used, this costs 2mn² flops.

• Form ŷ = Q_L^H y. Cost: 2mn flops.

• Solve R_TL x = ŷ (triangular solve). Cost: n² flops.

Thus, the total cost of solving the LLS problem via (Modified) Gram-Schmidt QR factorization is approximately 2mn² flops.

Notice that the solution computed by the Method of Normal Equations (generalized to the complex case) is given by

  (A^H A)^{−1} A^H y = ((Q_L R_TL)^H (Q_L R_TL))^{−1} (Q_L R_TL)^H y = (R_TL^H Q_L^H Q_L R_TL)^{−1} R_TL^H Q_L^H y
                     = (R_TL^H R_TL)^{−1} R_TL^H Q_L^H y = R_TL^{−1} R_TL^{−H} R_TL^H Q_L^H y = R_TL^{−1} Q_L^H y = R_TL^{−1} ŷ = x,

where R_TL x = ŷ. This shows that the two approaches compute the same solution, generalizes the Method of Normal Equations to complex valued problems, and shows that the Method of Normal Equations computes the desired result without requiring multivariate calculus.


7.4 Via Householder QR Factorization

Given A ∈ C^{m×n} with linearly independent columns, the Householder QR factorization yields n Householder transformations, H0, . . . , Hn−1, so that

  Hn−1 ··· H0 A = [ R_TL ; 0 ],   where Hn−1 ··· H0 = Q^H = ( Q_L  Q_R )^H.

We wish to solve R_TL x = Q_L^H y = ŷ. But

  ŷ = Q_L^H y = ( I  0 ) [ Q_L^H ; Q_R^H ] y = ( I  0 ) ( Q_L  Q_R )^H y = ( I  0 ) Q^H y
    = ( I  0 ) (Hn−1 ··· H0) y = ( I  0 ) w = w_T,   where w = Hn−1 ··· H0 y = [ w_T ; w_B ].

This suggests the following approach:

• Compute H0, . . . , Hn−1 so that Hn−1 ··· H0 A = [ R_TL ; 0 ], storing the Householder vectors that define H0, . . . , Hn−1 over the elements in A that they zero out (see “Notes on Householder QR Factorization”). Cost: 2mn² − (2/3)n³ flops.

• Form w = Hn−1(···(H0 y)···) (see “Notes on Householder QR Factorization”). Partition w = [ w_T ; w_B ] where w_T ∈ C^n. Then ŷ = w_T. Cost: 4mn − 2n² flops. (See “Notes on Householder QR Factorization” regarding this.)

• Solve R_TL x = ŷ. Cost: n² flops.

Thus, the total cost of solving the LLS problem via Householder QR factorization is approximately 2mn² − (2/3)n³ flops. This is cheaper than using (Modified) Gram-Schmidt QR factorization, and hence preferred (because it is also numerically more stable, as we will discuss later in the course).
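Combining the sketches from Chapter 6 gives a complete solver; a minimal version reusing the hypothetical houseqr_unb and apply_qt sketches (real arithmetic, scipy's triangular solver):

    from scipy.linalg import solve_triangular

    def lls_via_householder_qr(A, y):
        n = A.shape[1]
        F, t = houseqr_unb(A)     # R_TL sits on and above the diagonal of F
        w = apply_qt(F, t, y)     # w := H_{n-1} ... H_0 y
        # Solve R_TL x = w_T; solve_triangular only reads the upper triangle,
        # so the Householder vectors stored below the diagonal are ignored.
        return solve_triangular(F[:n, :n], w[:n])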


7.5 Via the Singular Value Decomposition

Given A ∈ C^{m×n} with linearly independent columns, let A = U Σ V^H be its SVD. Partition

  U = ( U_L  U_R ) and Σ = [ Σ_TL ; 0 ],

where U_L ∈ C^{m×n} and Σ_TL ∈ R^{n×n}, so that

  A = ( U_L  U_R ) [ Σ_TL ; 0 ] V^H = U_L Σ_TL V^H.

We wish to compute the solution to the LLS problem: Find x ∈ C^n such that

  ‖Ax − y‖_2² = min_{z∈C^n} ‖Az − y‖_2².

7.5.1 Simple derivation of the solution

Notice that we know that, if A has linearly independent columns, the solution is given by x = (A^H A)^{−1} A^H y (the solution to the normal equations). Now,

  x = [A^H A]^{−1} A^H y                                         (solution to the normal equations)
    = [(U_L Σ_TL V^H)^H (U_L Σ_TL V^H)]^{−1} (U_L Σ_TL V^H)^H y  (A = U_L Σ_TL V^H)
    = [(V Σ_TL U_L^H)(U_L Σ_TL V^H)]^{−1} (V Σ_TL U_L^H) y       ((BCD)^H = D^H C^H B^H and Σ_TL^H = Σ_TL)
    = [V Σ_TL Σ_TL V^H]^{−1} V Σ_TL U_L^H y                      (U_L^H U_L = I)
    = V Σ_TL^{−1} Σ_TL^{−1} V^H V Σ_TL U_L^H y                   (V^{−1} = V^H and (BCD)^{−1} = D^{−1} C^{−1} B^{−1})
    = V Σ_TL^{−1} U_L^H y                                        (V^H V = I and Σ_TL^{−1} Σ_TL = I).


7.5.2 Alternative derivation of the solution

We now discuss a derivation of the result that does not depend on the Normal Equations, in preparation for the more general case discussed in the next section.

  min_{z∈C^n} ‖Az − y‖_2²
    = min_{z∈C^n} ‖U Σ V^H z − y‖_2²                        (substitute A = U Σ V^H)
    = min_{z∈C^n} ‖U (Σ V^H z − U^H y)‖_2²                  (substitute U U^H = I and factor out U)
    = min_{z∈C^n} ‖Σ V^H z − U^H y‖_2²                      (multiplication by a unitary matrix preserves the two-norm)
    = min_{z∈C^n} ‖ [ Σ_TL ; 0 ] V^H z − [ U_L^H y ; U_R^H y ] ‖_2²   (partition, partitioned matrix-matrix multiplication)
    = min_{z∈C^n} ‖ [ Σ_TL V^H z − U_L^H y ; −U_R^H y ] ‖_2²  (partitioned matrix-matrix multiplication and addition)
    = min_{z∈C^n} ‖Σ_TL V^H z − U_L^H y‖_2² + ‖U_R^H y‖_2²    (‖[v_T ; v_B]‖_2² = ‖v_T‖_2² + ‖v_B‖_2²).

The x that solves Σ_TL V^H x = U_L^H y minimizes the expression. That x is given by x = V Σ_TL^{−1} U_L^H y.

This suggests the following approach:

• Compute the reduced SVD: A = U_L Σ_TL V^H. Cost: greater than computing the QR factorization! We will discuss this in a future note.

• Form ŷ = Σ_TL^{−1} U_L^H y. Cost: approximately 2mn flops.

• Compute x = V ŷ. Cost: approximately 2n² flops.
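A minimal numpy sketch of these three steps (real A with full column rank assumed; the reduced SVD comes from full_matrices=False, and the function name is ours):

    def lls_via_svd(A, y):
        UL, sigma, VH = np.linalg.svd(A, full_matrices=False)  # A = U_L Sigma_TL V^H
        yhat = (UL.T @ y) / sigma    # yhat := inv(Sigma_TL) U_L^T y
        return VH.T @ yhat           # x := V yhat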

7.6 What If A Does Not Have Linearly Independent Columns?

In the above discussions we assumed that A has linearly independent columns. Things get a bit trickier if A does not have linearly independent columns. There is a variant of the QR factorization known as the QR factorization with column pivoting that can be used to find the solution. We instead focus on using the SVD.

Given A ∈ C^{m×n} with rank(A) = r < n, let A = U Σ V^H be its SVD. Partition

  U = ( U_L  U_R ),  V = ( V_L  V_R ) and Σ = [ Σ_TL  0 ; 0  0 ],


where U_L ∈ C^{m×r}, V_L ∈ C^{n×r} and Σ_TL ∈ R^{r×r}, so that

  A = ( U_L  U_R ) [ Σ_TL  0 ; 0  0 ] ( V_L  V_R )^H = U_L Σ_TL V_L^H.

Now,

  min_{z∈C^n} ‖Az − y‖_2²
    = min_{z∈C^n} ‖U Σ V^H z − y‖_2²                    (substitute A = U Σ V^H)
    = min_{z∈C^n} ‖U Σ V^H z − U U^H y‖_2²              (U U^H = I)
    = min_{w∈C^n, z=Vw} ‖U Σ V^H V w − U U^H y‖_2²      (minimizing over w ∈ C^n with z = V w is the same as minimizing over z ∈ C^n)
    = min_{w∈C^n, z=Vw} ‖U (Σ w − U^H y)‖_2²            (factor out U and V^H V = I)
    = min_{w∈C^n, z=Vw} ‖Σ w − U^H y‖_2²                (‖Uv‖_2 = ‖v‖_2)
    = min_{w∈C^n, z=Vw} ‖ [ Σ_TL  0 ; 0  0 ] [ w_T ; w_B ] − [ U_L^H ; U_R^H ] y ‖_2²   (partition Σ, w, and U)
    = min_{w∈C^n, z=Vw} ‖ [ Σ_TL w_T − U_L^H y ; −U_R^H y ] ‖_2²   (partitioned matrix-matrix multiplication)
    = min_{w∈C^n, z=Vw} ‖Σ_TL w_T − U_L^H y‖_2² + ‖U_R^H y‖_2²     (‖[v_T ; v_B]‖_2² = ‖v_T‖_2² + ‖v_B‖_2²).

Since Σ_TL is diagonal with no zeroes on its diagonal, we know that Σ_TL^{−1} exists. Choosing w_T = Σ_TL^{−1} U_L^H y means that

  min_{w∈C^n, z=Vw} ‖Σ_TL w_T − U_L^H y‖_2² = 0,

which obviously minimizes the entire expression. We conclude that

  x = V w = ( V_L  V_R ) [ Σ_TL^{−1} U_L^H y ; w_B ] = V_L Σ_TL^{−1} U_L^H y + V_R w_B

characterizes all solutions to the linear least-squares problem, where w_B can be chosen to be any vector of size n − r. By choosing w_B = 0, and hence x = V_L Σ_TL^{−1} U_L^H y, we choose the vector x that itself has minimal 2-norm.


The sequence of pictures on the following pages reasons through the insights that we gained so far (in “Notes on the Singular Value Decomposition” and this note). These pictures can be downloaded as a PowerPoint presentation from

http://www.cs.utexas.edu/users/flame/Notes/Spaces.pptx


[Figure: the four fundamental spaces. In the domain, the row space C(V_L) (dim = r) and the null space C(V_R) (dim = n − r); in the codomain, the column space C(U_L) (dim = r) and the left null space C(U_R) (dim = m − r). A maps x = x_r + x_n to y = Ax.]

If A ∈ C^{m×n} and

  A = ( U_L  U_R ) [ Σ_TL  0 ; 0  0 ] ( V_L  V_R )^H = U_L Σ_TL V_L^H

equals the SVD, where U_L ∈ C^{m×r}, V_L ∈ C^{n×r}, and Σ_TL ∈ C^{r×r}, then

• The row space of A equals C(V_L), the column space of V_L;
• The null space of A equals C(V_R), the column space of V_R;
• The column space of A equals C(U_L); and
• The left null space of A equals C(U_R).

Also, given a vector x ∈ C^n, the matrix A maps x to y = Ax, which must be in C(A) = C(U_L).


[Figure: the same four fundamental spaces, now with a vector y that lies outside of C(U_L).]

Given an arbitrary y ∈ C^m not in C(A) = C(U_L), there cannot exist a vector x ∈ C^n such that Ax = y.


[Figure: the four fundamental spaces, with ŷ = U_L U_L^H y the projection of y onto C(U_L); x_r = V_L Σ_TL^{−1} U_L^H ŷ in the row space satisfies A x_r = U_L Σ_TL V_L^H x_r = ŷ, while x_n = V_R w_B in the null space satisfies A x_n = 0.]

The solution to the Linear Least-Squares problem, x, equals any vector that is mapped by A to the projection of y onto the column space of A: ŷ = U_L U_L^H y. This solution can be written as the sum of a (unique) vector in the row space of A and any vector in the null space of A. The vector in the row space of A is given by

  x_r = V_L Σ_TL^{−1} U_L^H ŷ = V_L Σ_TL^{−1} U_L^H y.


The sequence of pictures, and their explanations, suggests a much simpler path towards the formula for solving the LLS problem.

• We know that we are looking for the solution x to the equation

    A x = U_L U_L^H y.

• We know that there must be a solution x_r in the row space of A. It suffices to find w_T such that x_r = V_L w_T.

• Hence we search for the w_T that satisfies

    A V_L w_T = U_L U_L^H y.

• Since there is a one-to-one mapping by A from the row space of A to the column space of A, we know that w_T is unique. Thus, if we find a solution to the above, we have found the solution.

• Multiplying both sides of the equation by U_L^H yields

    U_L^H A V_L w_T = U_L^H y.

• Since A = U_L Σ_TL V_L^H, we can rewrite the above equation as

    Σ_TL w_T = U_L^H y

  so that w_T = Σ_TL^{−1} U_L^H y.

• Hence x_r = V_L Σ_TL^{−1} U_L^H y.

• Adding any vector in the null space of A to x_r also yields a solution. Hence all solutions to the LLS problem can be characterized by

    x = V_L Σ_TL^{−1} U_L^H y + V_R w_R.

Here is yet another important way of looking at the problem:

• We start by considering the LLS problem: Find x ∈ C^n such that

    ‖Ax − y‖_2² = min_{z∈C^n} ‖Az − y‖_2².

• We changed this into the problem of finding the w_L that satisfies

    Σ_TL w_L = v_T,

  where x = V_L w_L and ŷ = U_L U_L^H y = U_L v_T.

• Thus, by expressing x in the right basis (the columns of V_L) and the projection of y in the right basis (the columns of U_L), the problem became trivial, since the matrix that relates the solution to the right-hand side is diagonal.


7.7 Exercise: Using the LQ factorization to solve underdetermined systems

We next discuss another special case of the LLS problem: Let A ∈ C^{m×n} where m < n and A has linearly independent rows. A series of exercises will lead you to a practical algorithm for describing all solutions to the LLS problem

  ‖Ax − y‖_2 = min_z ‖Az − y‖_2.

You may want to review “Notes on the QR Factorization” as you do this exercise.

Homework 7.2 Let A ∈ C^{m×n} with m < n have linearly independent rows. Show that there exist a lower triangular matrix L_L ∈ C^{m×m} and a matrix Q_T ∈ C^{m×n} with orthonormal rows such that A = L_L Q_T, noting that L_L does not have any zeroes on the diagonal. Letting L = ( L_L  0 ) ∈ C^{m×n} and unitary Q = [ Q_T ; Q_B ], reason that A = LQ.

Don't overthink the problem: use results you have seen before.
* SEE ANSWER

Homework 7.3 Let A ∈ C^{m×n} with m < n have linearly independent rows. Consider

  ‖Ax − y‖_2 = min_z ‖Az − y‖_2.

Use the fact that A = L_L Q_T, where L_L ∈ C^{m×m} is lower triangular and Q_T has orthonormal rows, to argue that any vector of the form Q_T^H L_L^{−1} y + Q_B^H w_B (where w_B is any vector in C^{n−m}) is a solution to the LLS problem. Here Q = [ Q_T ; Q_B ].
* SEE ANSWER

Homework 7.4 Continuing Exercise 7.2, use Figure 7.1 to give a Classical Gram-Schmidt inspired algorithm for computing L_L and Q_T. (The best way to check you got the algorithm right is to implement it!)
* SEE ANSWER

Homework 7.5 Continuing Exercise 7.2, use Figure 7.2 to give a Householder QR factorization inspired algorithm for computing L and Q, leaving L in the lower triangular part of A and Q stored as Householder vectors above the diagonal of A. (The best way to check you got the algorithm right is to implement it!)
* SEE ANSWER


Algorithm: [L, Q] := LQ_CGS_UNB(A, L, Q)

  Partition A → [ A_T ; A_B ], L → [ L_TL  L_TR ; L_BL  L_BR ], Q → [ Q_T ; Q_B ]
    where A_T has 0 rows, L_TL is 0×0, Q_T has 0 rows
  while m(A_T) < m(A) do
    Repartition
      [ A_T ; A_B ] → [ A0 ; a1^T ; A2 ], [ L_TL  L_TR ; L_BL  L_BR ] → [ L00  l01  L02 ; l10^T  λ11  l12^T ; L20  l21  L22 ],
      and [ Q_T ; Q_B ] → [ Q0 ; q1^T ; Q2 ]
      where a1^T has 1 row, λ11 is 1×1, q1^T has 1 row
    (updates to be filled in as part of the exercise)
    Continue with the exposed submatrices absorbed back into the quadrants
  endwhile

Figure 7.1: Algorithm skeleton for CGS-like LQ factorization.


Algorithm: [A, t] := HLQ_UNB(A, t)

  Partition A → [ A_TL  A_TR ; A_BL  A_BR ], t → [ t_T ; t_B ]
    where A_TL is 0×0, t_T has 0 rows
  while m(A_TL) < m(A) do
    Repartition
      [ A_TL  A_TR ; A_BL  A_BR ] → [ A00  a01  A02 ; a10^T  α11  a12^T ; A20  a21  A22 ]
      and [ t_T ; t_B ] → [ t0 ; τ1 ; t2 ]
      where α11 is 1×1, τ1 has 1 row
    (updates to be filled in as part of the exercise)
    Continue with the exposed submatrices absorbed back into the quadrants
  endwhile

Figure 7.2: Algorithm skeleton for Householder QR factorization inspired LQ factorization.


Chapter 8
Notes on the Condition of a Problem

Correctness in the presence of error (e.g., when floating point computations are performed) takes on a different meaning. For many problems for which computers are used, there is one correct answer, and we expect that answer to be computed by our program. The problem is that, as we will see later, most real numbers cannot be stored exactly in a computer memory. They are stored as approximations, floating point numbers, instead. Hence storing them and/or computing with them inherently incurs error.

Naively, we would like to be able to define a program that computes with floating point numbers as being “correct” if it computes an answer that is close to the exact answer. Unfortunately, some problems that are computed this way have the property that a small change in the input yields a large change in the output. Surely we can't blame the program for not computing an answer close to the exact answer in this case. The mere act of storing the input data as a floating point number may cause a completely different output, even if all computation is exact. We will later define stability to be a property of a program; it is what takes the place of correctness. In this note, we instead focus on when a problem is a “good” problem, meaning that in exact arithmetic a “small” change in the input will always cause at most a “small” change in the output, or a “bad” problem, for which a “small” change may yield a “large” change. A good problem will be called well-conditioned. A bad problem will be called ill-conditioned.

Notice that “small” and “large” are vague. To some degree, norms help us measure size. To some degree, “small” and “large” will be in the eyes of the beholder (in other words, situation dependent).

Video

Read disclaimer regarding the videos in the preface!

Video did not turn out...



Outline

Video
Outline
8.1. Notation
8.2. The Prototypical Example: Solving a Linear System
8.3. Condition Number of a Rectangular Matrix
8.4. Why Using the Method of Normal Equations Could be Bad
8.5. Why Multiplication with Unitary Matrices is a Good Thing


8.1 Notation

Throughout this note, we will talk about small changes (perturbations) to scalars, vectors, and matrices. To denote these, we attach a “delta” to the symbol for a scalar, vector, or matrix.

• A small change to scalar α ∈ C will be denoted by δα ∈ C;

• A small change to vector x ∈ Cn will be denoted by δx ∈ Cn; and

• A small change to matrix A ∈ Cm×n will be denoted by ∆A ∈ Cm×n.

Notice that the “delta” touches the α, x, and A, so that, for example, δx is not mistaken for δ · x.

8.2 The Prototypical Example: Solving a Linear System

Assume that A ∈ R^{n×n} is nonsingular and x, y ∈ R^n with Ax = y. The problem here is the function that computes x from y and A. Let us assume that no error is introduced in the matrix A when it is stored, but that in the process of storing y a small error is introduced: δy ∈ R^n, so that now y + δy is stored. The question becomes by how much the solution x changes as a function of δy. In particular, we would like to quantify how a relative change in the right-hand side y (‖δy‖/‖y‖ in some norm) translates to a relative change in the solution x (‖δx‖/‖x‖). It turns out that we will need to compute norms of matrices, using the norm induced by the vector norm that we use.

Since Ax = y, if we use a consistent (induced) matrix norm,

  ‖y‖ = ‖Ax‖ ≤ ‖A‖ ‖x‖  or, equivalently,  1/‖x‖ ≤ ‖A‖ (1/‖y‖).   (8.1)

Also,

  A(x + δx) = y + δy
  Ax        = y

implies that A δx = δy, so that δx = A^{−1} δy. Hence

  ‖δx‖ = ‖A^{−1} δy‖ ≤ ‖A^{−1}‖ ‖δy‖.   (8.2)

Combining (8.1) and (8.2) we conclude that

  ‖δx‖/‖x‖ ≤ ‖A‖ ‖A^{−1}‖ (‖δy‖/‖y‖).

What does this mean? It means that the relative error in the solution is at worst the relative error in the right-hand side, amplified by ‖A‖ ‖A^{−1}‖. So, if that quantity is “small” and the relative error in the right-hand side is “small” and exact arithmetic is used, then one is guaranteed a solution with a relatively “small” error.

The quantity κ_{‖·‖}(A) = ‖A‖ ‖A^{−1}‖ is called the condition number of the nonsingular matrix A (associated with the norm ‖·‖).


Are we overestimating by how much the relative error can be amplified? The answer to this is no. For every nonsingular matrix A, there exist a right-hand side y and a perturbation δy such that, if A(x + δx) = y + δy,

  ‖δx‖/‖x‖ = ‖A‖ ‖A^{−1}‖ (‖δy‖/‖y‖).

In order for this equality to hold, we need to find y and δy such that

  ‖y‖ = ‖Ax‖ = ‖A‖ ‖x‖  or, equivalently,  ‖A‖ = ‖Ax‖/‖x‖

and

  ‖δx‖ = ‖A^{−1} δy‖ = ‖A^{−1}‖ ‖δy‖  or, equivalently,  ‖A^{−1}‖ = ‖A^{−1} δy‖/‖δy‖.

In other words, x can be chosen as a vector that maximizes ‖Ax‖/‖x‖ and δy should maximize ‖A^{−1} δy‖/‖δy‖. The vector y is then chosen as y = Ax.

What if we use the 2-norm? For this norm, κ_2(A) = ‖A‖_2 ‖A^{−1}‖_2 = σ_0/σ_{n−1}. So, the ratio between the largest and smallest singular values determines whether a matrix is well-conditioned or ill-conditioned.

To show for what vectors the maximal magnification is attained, consider the SVD

  A = U Σ V^H = ( u_0  u_1  ···  u_{n−1} ) diag(σ_0, σ_1, . . . , σ_{n−1}) ( v_0  v_1  ···  v_{n−1} )^H.

Recall that

• ‖A‖_2 = σ_0, v_0 is the vector that maximizes max_{‖z‖_2=1} ‖Az‖_2, and A v_0 = σ_0 u_0;

• ‖A^{−1}‖_2 = 1/σ_{n−1}, u_{n−1} is the vector that maximizes max_{‖z‖_2=1} ‖A^{−1}z‖_2, and A v_{n−1} = σ_{n−1} u_{n−1}.

Now, take y = σ_0 u_0. Then Ax = y is solved by x = v_0. Take δy = β σ_1 u_1. Then A δx = δy is solved by δx = β v_1. Now,

  ‖δy‖_2/‖y‖_2 = |β| σ_1/σ_0  and  ‖δx‖_2/‖x‖_2 = |β|.

Hence

  ‖δx‖_2/‖x‖_2 = (σ_0/σ_{n−1}) (‖δy‖_2/‖y‖_2).

This is depicted in Figure 8.1 for n = 2.

The SVD can be used to show that A maps the unit ball to an ellipsoid. The singular values are the lengths of the various axes of the ellipsoid. The condition number thus captures the eccentricity of the ellipsoid: the ratio between the lengths of the largest and smallest axes. This is also illustrated in Figure 8.1.


[Figure: two panels, “R²: Domain of A” (left) and “R²: Codomain of A” (right).]

Figure 8.1: Illustration of choices of vectors y and δy that result in ‖δx‖_2/‖x‖_2 = (σ_0/σ_{n−1}) (‖δy‖_2/‖y‖_2). Because of the eccentricity of the ellipse, the relatively small change δy relative to y is amplified into a relatively large change δx relative to x. On the right, we see that ‖δy‖_2/‖y‖_2 = β σ_1/σ_0 (since ‖u_0‖_2 = ‖u_1‖_2 = 1). On the left, we see that ‖δx‖_2/‖x‖_2 = β (since ‖v_0‖_2 = ‖v_1‖_2 = 1).

Number of accurate digits. Notice that for scalars δψ and ψ, log_10(ψ/δψ) = log_10(ψ) − log_10(δψ) roughly equals the number of leading decimal digits of ψ + δψ that are accurate, relative to ψ. For example, if ψ = 32.512 and δψ = 0.02, then ψ + δψ = 32.532, which has three accurate digits. Now, log_10(32.512) − log_10(0.02) ≈ 1.5 − (−1.7) = 3.2.

Now, if

  ‖δx‖/‖x‖ = κ(A) (‖δy‖/‖y‖),

then

  log_10(‖δx‖) − log_10(‖x‖) = log_10(κ(A)) + log_10(‖δy‖) − log_10(‖y‖)

so that

  log_10(‖x‖) − log_10(‖δx‖) = [log_10(‖y‖) − log_10(‖δy‖)] − log_10(κ(A)).

In other words, if there were k digits of accuracy in the right-hand side, then it is possible that (due only to the condition number of A) there are only k − log_10(κ(A)) digits of accuracy in the solution. If we start with only 8 digits of accuracy and κ(A) = 10⁵, we may only get 3 digits of accuracy. If κ(A) ≥ 10⁸, we may not get any digits of accuracy...

Homework 8.1 Show that, if A is a nonsingular matrix, then for a consistent matrix norm, κ(A) ≥ 1.
* SEE ANSWER

We conclude from this that we can generally only expect as much relative accuracy in the solution as we had in the right-hand side.


Alternative exposition. Note: the text below links the conditioning of matrices to the relative condition number of a more general function. For a more thorough treatment, you may want to read Lecture 12 of “Trefethen and Bau.” That book discusses the subject in much more generality than is needed for our discussion of linear algebra. Thus, if this alternative exposition baffles you, just skip it!

Let f : R^n → R^m be a continuous function such that f(y) = x. Let ‖·‖ be a vector norm. Consider for y ≠ 0

  κ_f(y) = lim_{δ→0} sup_{‖δy‖≤δ} ( ‖f(y + δy) − f(y)‖ / ‖f(y)‖ ) / ( ‖δy‖ / ‖y‖ ).

Letting f(y + δy) = x + δx, we find that

  κ_f(y) = lim_{δ→0} sup_{‖δy‖≤δ} ( ‖δx‖ / ‖x‖ ) / ( ‖δy‖ / ‖y‖ ).

(Obviously, if δy = 0 or y = 0 or f(y) = 0, things get a bit hairy, so let's not allow that.)

Roughly speaking, κ_f(y) equals the maximum by which a(n infinitesimally) small relative error in y is magnified into a relative error in f(y). This can be considered the relative condition number of the function f. A large relative condition number means a small relative error in the input (y) can be magnified into a large relative error in the output (x = f(y)). This is bad, since small errors will invariably occur.

Now, if f(y) = x is the function that returns the x for which Ax = y, where A ∈ C^{n×n} is nonsingular, then via an argument similar to what we did earlier in this section we find that κ_f(y) ≤ κ(A) = ‖A‖ ‖A^{−1}‖, the condition number of the matrix A:

  lim_{δ→0} sup_{‖δy‖≤δ} ( ‖f(y + δy) − f(y)‖ / ‖f(y)‖ ) / ( ‖δy‖ / ‖y‖ )
    = lim_{δ→0} sup_{‖δy‖≤δ} ( ‖A^{−1}(y + δy) − A^{−1} y‖ / ‖A^{−1} y‖ ) / ( ‖δy‖ / ‖y‖ )
    = lim_{δ→0} max_{‖z‖=1, δy=δ·z} ( ‖A^{−1} δy‖ / ‖δy‖ ) / ( ‖A^{−1} y‖ / ‖y‖ )
    = lim_{δ→0} max_{‖z‖=1, δy=δ·z} ( ‖A^{−1} δy‖ / ‖δy‖ ) ( ‖y‖ / ‖A^{−1} y‖ )
    = lim_{δ→0} max_{‖z‖=1, δy=δ·z} ( ‖A^{−1}(δ·z)‖ / ‖δ·z‖ ) ( ‖y‖ / ‖A^{−1} y‖ )
    = max_{‖z‖=1} ( ‖A^{−1} z‖ / ‖z‖ ) ( ‖y‖ / ‖A^{−1} y‖ )
    = max_{‖z‖=1} ( ‖A^{−1} z‖ / ‖z‖ ) ( ‖Ax‖ / ‖x‖ )
    ≤ max_{‖z‖=1} ( ‖A^{−1} z‖ / ‖z‖ ) max_{x≠0} ( ‖Ax‖ / ‖x‖ ) = ‖A‖ ‖A^{−1}‖,

where ‖·‖ is the matrix norm induced by the vector norm ‖·‖.

8.3 Condition Number of a Rectangular Matrix

Given A ∈ C^{m×n} with linearly independent columns and y ∈ C^m, consider the linear least-squares (LLS) problem

  ‖Ax − y‖_2 = min_w ‖Aw − y‖_2   (8.3)

and the perturbed problem

  ‖A(x + δx) − (y + δy)‖_2 = min_w ‖Aw − (y + δy)‖_2.   (8.4)

We will again bound by how much the relative error in y is amplified. Notice that the solutions to (8.3) and (8.4) respectively satisfy

  A^H A x = A^H y
  A^H A (x + δx) = A^H (y + δy),

so that A^H A δx = A^H δy (subtracting the first equation from the second) and hence

  ‖δx‖_2 = ‖(A^H A)^{−1} A^H δy‖_2 ≤ ‖(A^H A)^{−1} A^H‖_2 ‖δy‖_2.

Now, let z = A(A^H A)^{−1} A^H y be the projection of y onto C(A) and let θ be the angle between z and y. Let us assume that y is not orthogonal to C(A), so that z ≠ 0. Then cos θ = ‖z‖_2/‖y‖_2, so that

  cos θ ‖y‖_2 = ‖z‖_2 = ‖Ax‖_2 ≤ ‖A‖_2 ‖x‖_2


Figure 8.2: Linear least-squares problem ‖Ax − y‖_2 = min_v ‖Av − y‖_2. Vector z is the projection of y onto C(A).

and hence

  1/‖x‖_2 ≤ ‖A‖_2 / (cos θ ‖y‖_2).

We conclude that

  ‖δx‖_2/‖x‖_2 ≤ ( ‖A‖_2 ‖(A^H A)^{−1} A^H‖_2 / cos θ ) (‖δy‖_2/‖y‖_2) = (1/cos θ) (σ_0/σ_{n−1}) (‖δy‖_2/‖y‖_2),

where σ_0 and σ_{n−1} are (respectively) the largest and smallest singular values of A, because of the following result:

Homework 8.2 If A has linearly independent columns, show that ‖(A^H A)^{−1} A^H‖_2 = 1/σ_{n−1}, where σ_{n−1} equals the smallest singular value of A. Hint: use the SVD of A.
* SEE ANSWER

The condition number of A ∈ C^{m×n} with linearly independent columns is κ_2(A) = σ_0/σ_{n−1}.

Notice the effect of the cos θ. When y is almost perpendicular to C(A), its projection z is small and cos θ is small. Hence a small relative change in y can be greatly amplified. This makes sense: if y is almost perpendicular to C(A), then x ≈ 0, and any small δy ∈ C(A) can yield a relatively large change δx.

8.4 Why Using the Method of Normal Equations Could be Bad

Homework 8.3 Let A have linearly independent columns. Show that κ_2(A^H A) = κ_2(A)².
* SEE ANSWER

Homework 8.4 Let A ∈ C^{n×n} have linearly independent columns.

• Show that Ax = y if and only if A^H A x = A^H y.

• Reason that using the method of normal equations to solve Ax = y has a condition number of κ_2(A)².
* SEE ANSWER



Let A ∈ C^{m×n} have linearly independent columns. If one uses the Method of Normal Equations to solve the linear least-squares problem min_x ‖Ax − y‖_2, one ends up solving the square linear system A^H A x = A^H y. Now, κ_2(A^H A) = κ_2(A)². Hence, using this method squares the condition number of the matrix being used.
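This squaring is easy to observe; a small numpy illustration on a hypothetical, moderately ill-conditioned Vandermonde matrix:

    import numpy as np

    A = np.vander(np.linspace(0.0, 1.0, 50), 6)
    print(np.linalg.cond(A))        # kappa_2(A)
    print(np.linalg.cond(A.T @ A))  # ~ kappa_2(A)^2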

8.5 Why Multiplication with Unitary Matrices is a Good Thing

Next, consider the computation C = AB, where A ∈ C^{m×m} is nonsingular and B, ∆B, C, ∆C ∈ C^{m×n}. Then

  C + ∆C = A(B + ∆B)
  C      = AB

so that ∆C = A ∆B.

Thus,

  ‖∆C‖_2 = ‖A ∆B‖_2 ≤ ‖A‖_2 ‖∆B‖_2.

Also, B = A^{−1} C, so that

  ‖B‖_2 = ‖A^{−1} C‖_2 ≤ ‖A^{−1}‖_2 ‖C‖_2

and hence

  1/‖C‖_2 ≤ ‖A^{−1}‖_2 (1/‖B‖_2).

Thus,

  ‖∆C‖_2/‖C‖_2 ≤ ‖A‖_2 ‖A^{−1}‖_2 (‖∆B‖_2/‖B‖_2) = κ_2(A) (‖∆B‖_2/‖B‖_2).

This means that the relative error in the matrix C = AB is at most κ_2(A) times greater than the relative error in B. The following exercise gives us a hint as to why algorithms that cast computation in terms of multiplication by unitary matrices avoid the buildup of error:

Homework 8.5 Let U ∈ C^{n×n} be unitary. Show that κ_2(U) = 1.
* SEE ANSWER

This means that the relative error in the matrix C = UB is no greater than the relative error in B when U is unitary.

Homework 8.6 Characterize the set of all square matrices A with κ_2(A) = 1.
* SEE ANSWER


Chapter 9
Notes on the Stability of an Algorithm

Based on “Goal-Oriented and Modular Stability Analysis” [6, 7] by Paolo Bientinesi and Robert van de Geijn.

Video

Read disclaimer regarding the videos in the preface!

* Lecture on the Stability of an Algorithm

* YouTube
* Download from UT Box
* View After Local Download

(For help on viewing, see Appendix A.)



Outline

Video
Outline
9.1. Motivation
9.2. Floating Point Numbers
9.3. Notation
9.4. Floating Point Computation
9.4.1. Model of floating point computation
9.4.2. Stability of a numerical algorithm
9.4.3. Absolute value of vectors and matrices
9.5. Stability of the Dot Product Operation
9.5.1. An algorithm for computing DOT
9.5.2. A simple start
9.5.3. Preparation
9.5.4. Target result
9.5.5. A proof in traditional format
9.5.6. A weapon of math induction for the war on error (optional)
9.5.7. Results
9.6. Stability of a Matrix-Vector Multiplication Algorithm
9.6.1. An algorithm for computing GEMV
9.6.2. Analysis
9.7. Stability of a Matrix-Matrix Multiplication Algorithm
9.7.1. An algorithm for computing GEMM
9.7.2. Analysis
9.7.3. An application


Figure 9.1: In this illustration, f : D → R is a function to be evaluated. The function f̌ represents the implementation of the function that uses floating point arithmetic, thus incurring errors. The fact that for a nearby value x̌ the computed value equals the exact function applied to the slightly perturbed x, i.e. f̌(x) = f(x̌), means that the error in the computation can be attributed to a small change in the input. If this is true, then f̌ is said to be a (numerically) stable implementation of f for input x.

9.1 Motivation

Correctness in the presence of error (e.g., when floating point computations are performed) takes on adifferent meaning. For many problems for which computers are used, there is one correct answer and weexpect that answer to be computed by our program. The problem is that most real numbers cannot be storedexactly in a computer memory. They are stored as approximations, floating point numbers, instead. Hencestoring them and/or computing with them inherently incurs error. The question thus becomes “When is aprogram correct in the presense of such errors?”

Let us assume that we wish to evaluate the mapping f : D → R, where D ⊂ R^n is the domain and R ⊂ R^m is the range (codomain). Now, we will let f̌ : D → R denote a computer implementation of this function. Generally, for x ∈ D it is the case that f̌(x) ≠ f(x). Thus, the computed value is not "correct". From the Notes on Conditioning, we know that it may not be the case that f̌(x) is "close to" f(x). After all, even if f̌ is an exact implementation of f, the mere act of storing x may introduce a small error δx, and f(x + δx) may be far from f(x) if f is ill-conditioned.

The following defines a property that captures correctness in the presence of the kinds of errors that are introduced by computer arithmetic:

Definition 9.1 Given the mapping f : D → R, where D ⊂ R^n is the domain and R ⊂ R^m is the range (codomain), let f̌ : D → R be a computer implementation of this function. We will call f̌ a (numerically) stable implementation of f on domain D if for all x ∈ D there exists an x̌ "close" to x such that f̌(x) = f(x̌).


In other words, f̌ is a stable implementation if the error that is introduced is similar to that introduced when f is evaluated with a slightly changed input. This is illustrated in Figure 9.1 for a specific input x. If an implementation is not stable, it is numerically unstable.

9.2 Floating Point Numbers

Only a finite number of (binary) digits can be used to store a real number. For so-called single-precision and double-precision floating point numbers, 32 bits and 64 bits are typically employed, respectively. Let us focus on double precision numbers.

Recall that any real number can be written as μ × β^e, where β is the base (an integer greater than one), μ ∈ (−1, 1) is the mantissa, and e is the exponent (an integer). For our discussion, we will define F as the set of all numbers χ = μ β^e such that β = 2, μ = ±.δ_0 δ_1 ⋯ δ_{t−1} has only t (binary) digits (δ_j ∈ {0, 1}), δ_0 = 0 iff μ = 0 (the mantissa is normalized), and −L ≤ e ≤ U. Elements in F can be stored with a finite number of (binary) digits.

• There is a largest number (in absolute value) that can be stored. Any number that is larger "overflows". Typically, this causes a special value such as "Inf" or NaN (Not-a-Number) to be stored.

• There is a smallest number (in absolute value) that can be stored. Any number that is smaller "underflows". Typically, this causes a zero to be stored.

Example 9.2 For x ∈ R^n, consider computing

$$ \|x\|_2 = \sqrt{\sum_{i=0}^{n-1} \chi_i^2}. \qquad (9.1) $$

Notice that

$$ \|x\|_2 \le \sqrt{n}\, \max_{i=0}^{n-1} |\chi_i| $$

and hence unless some χ_i is close to overflowing, the result will not overflow. The problem is that if some element χ_i has the property that χ_i^2 overflows, intermediate results in the computation in (9.1) will overflow. The solution is to determine k such that

$$ |\chi_k| = \max_{i=0}^{n-1} |\chi_i| $$

and to then instead compute

$$ \|x\|_2 = |\chi_k| \sqrt{\sum_{i=0}^{n-1} \left( \frac{\chi_i}{\chi_k} \right)^2 }. $$

It can be argued that the same approach also avoids underflow if underflow can be avoided.
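The scaled computation is straightforward to express in code. The following is a minimal sketch in Python/NumPy; the helper name scaled_norm2 is ours, but the reference BLAS routine dnrm2 guards against overflow in this same spirit:

    import numpy as np

    def scaled_norm2(x):
        # Factor out the entry of largest magnitude so that no intermediate
        # square can overflow (the approach of Example 9.2).
        k = np.argmax(np.abs(x))
        if x[k] == 0.0:
            return 0.0                   # x is the zero vector
        scale = np.abs(x[k])
        return scale * np.sqrt(np.sum((x / scale) ** 2))

    x = np.array([1e200, 1e200])
    print(scaled_norm2(x))               # approx 1.414e200
    print(np.sqrt(np.sum(x ** 2)))       # inf: the naive evaluation overflows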

In our further discussions, we will ignore overflow and underflow issues. What is important is that any time a real number is stored in our computer, it is stored as the nearest floating point number (element in F). We first assume that it is truncated (which makes our explanation slightly simpler).


Let positive χ be represented by

$$ \chi = .\delta_0\delta_1\cdots \times 2^e, $$

where δ_0 = 1 (the mantissa is normalized). If t digits are stored by our floating point system, then χ̌ = .δ_0δ_1⋯δ_{t−1} × 2^e is stored. Let δχ = χ − χ̌. Then

$$ \delta\chi = \underbrace{.\delta_0\delta_1\cdots\delta_{t-1}\delta_t\cdots \times 2^e}_{\chi} - \underbrace{.\delta_0\delta_1\cdots\delta_{t-1} \times 2^e}_{\check\chi} = .\underbrace{0\cdots 0}_{t}\delta_t\cdots \times 2^e < .\underbrace{0\cdots 01}_{t} \times 2^e = 2^{e-t}. $$

Also, since χ is positive,

$$ \chi = .\delta_0\delta_1\cdots \times 2^e \ge .1 \times 2^e = 2^{e-1}. $$

Thus,

$$ \frac{\delta\chi}{\chi} \le \frac{2^{e-t}}{2^{e-1}} = 2^{1-t}, $$

which can also be written as

$$ \delta\chi \le 2^{1-t}\chi. $$

A careful analysis of what happens when χ might equal zero or be negative yields

$$ |\delta\chi| \le 2^{1-t}|\chi|. $$

Now, in practice any base β can be used, and floating point computation uses rounding rather than truncating. A similar analysis can be used to show that then

$$ |\delta\chi| \le u|\chi|, $$

where u = (1/2) β^{1−t} is known as the machine epsilon or unit roundoff. When using single precision or double precision real arithmetic, u ≈ 10^{−8} or 10^{−16}, respectively. The quantity u is machine dependent; it is a function of the parameters characterizing the machine arithmetic. The unit roundoff is often alternatively defined as the maximum positive floating point number which can be added to the number stored as 1 without changing the number stored as 1. In the notation introduced below, fl(1 + u) = 1.
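This characterization of u is easy to verify empirically. A small sketch in Python (double precision; under round-to-nearest we expect u = 2^{−53} ≈ 1.1 × 10^{−16}):

    u = 1.0
    while 1.0 + u > 1.0:      # shrink u until fl(1 + u) == 1
        u /= 2.0
    print(u)                  # 1.1102230246251565e-16, i.e., 2**(-53)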

Homework 9.3 Assume a floating point number system with β = 2 and a mantissa with t digits, so that a typical positive number is written as .d_0 d_1 … d_{t−1} × 2^e, with d_i ∈ {0, 1}.

• Write the number 1 as a floating point number.

• What is the largest positive real number u (represented as a binary fraction) such that the floating point representation of 1 + u equals the floating point representation of 1? (Assume rounded arithmetic.)

• Show that u = (1/2) · 2^{1−t}.

* SEE ANSWER


9.3 Notation

When discussing error analyses, we will distinguish between exact and computed quantities. The function fl(expression) returns the result of the evaluation of expression, where every operation is executed in floating point arithmetic. For example, assuming that the expressions are evaluated from left to right, fl(χ + ψ + ζ/ω) is equivalent to fl(fl(fl(χ) + fl(ψ)) + fl(fl(ζ)/fl(ω))). Equality between the quantities lhs and rhs is denoted by lhs = rhs. Assignment of rhs to lhs is denoted by lhs := rhs (lhs becomes rhs). In the context of a program, the statements lhs := rhs and lhs := fl(rhs) are equivalent. Given an assignment κ := expression, we use the notation κ̌ (pronounced "check kappa") to denote the quantity resulting from fl(expression), which is what is actually stored in the variable κ.

9.4 Floating Point Computation

We introduce definitions and results regarding floating point arithmetic. In this note, we focus on real-valued arithmetic only. Extensions to complex arithmetic are straightforward.

9.4.1 Model of floating point computation

The Standard Computational Model (SCM) assumes that, for any two floating point numbers χ and ψ, the basic arithmetic operations satisfy the equality

$$ \mathrm{fl}(\chi \;\mathrm{op}\; \psi) = (\chi \;\mathrm{op}\; \psi)(1+\varepsilon), \quad |\varepsilon| \le u, \quad \mathrm{op} \in \{+, -, *, /\}. $$

The quantity ε is a function of χ, ψ and op. Sometimes we add a subscript (ε_+, ε_∗, …) to indicate what operation generated the (1 + ε) error factor. We always assume that all the input variables to an operation are floating point numbers. We can interpret the SCM as follows: these operations are performed exactly, and it is only in storing the result that a roundoff error occurs (equal to that introduced when a real number is stored as a floating point number).
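The SCM can be observed directly, since floating point numbers are exact rational values. A sketch in Python (the inputs are floating point numbers; Fraction arithmetic is exact):

    from fractions import Fraction

    chi, psi = 0.1, 0.2                      # already floating point numbers
    computed = Fraction(chi + psi)           # fl(chi + psi), converted exactly
    exact = Fraction(chi) + Fraction(psi)    # chi + psi in exact arithmetic
    eps = (computed - exact) / exact         # the (1 + eps) factor of the SCM
    print(abs(eps) <= Fraction(1, 2**53))    # True: |eps| <= u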

Remark 9.4 Given χ, ψ ∈ F, performing any operation op ∈ {+, −, ∗, /} with χ and ψ in floating point arithmetic, [χ op ψ], is a stable operation: Let ζ = χ op ψ and ζ̌ = ζ + δζ = [χ op ψ]. Then |δζ| ≤ u|ζ| and hence ζ̌ is close to ζ (it agrees with ζ in roughly its first t binary digits).

For certain problems it is convenient to use the Alternative Computational Model (ACM) [25], which also assumes for the basic arithmetic operations that

$$ \mathrm{fl}(\chi \;\mathrm{op}\; \psi) = \frac{\chi \;\mathrm{op}\; \psi}{1+\varepsilon}, \quad |\varepsilon| \le u, \quad \mathrm{op} \in \{+, -, *, /\}. $$

As for the standard computational model, the quantity ε is a function of χ, ψ and op. Note that the ε's produced using the standard and alternative models are generally not equal.

Remark 9.5 Notice that the Taylor series expansion of 1/(1 + ε) is given by

$$ \frac{1}{1+\varepsilon} = 1 + (-\varepsilon) + O(\varepsilon^2), $$

which explains how the SCM and ACM are related.

Remark 9.6 Sometimes it is more convenient to use the SCM and sometimes the ACM. Trial and error,and eventually experience, will determine which one to use.


9.4.2 Stability of a numerical algorithm

In the presence of round-off error, an algorithm involving numerical computations cannot be expected to yield the exact result. Thus, the notion of "correctness" applies only to the execution of algorithms in exact arithmetic. Here we briefly introduce the notion of "stability" of algorithms.

Let f : D → R be a mapping from the domain D to the range R and let f̌ : D → R represent the mapping that captures the execution in floating point arithmetic of a given algorithm which computes f.

The algorithm is said to be backward stable if for all x ∈ D there exists a perturbed input x̌ ∈ D, close to x, such that f̌(x) = f(x̌). In other words, the computed result equals the result obtained when the exact function is applied to slightly perturbed data. The difference between x̌ and x, δx = x̌ − x, is the perturbation to the original input x.

The reasoning behind backward stability is as follows. The input to a function typically has some errors associated with it. Uncertainty may be due to measurement errors when obtaining the input and/or may be the result of converting real numbers to floating point numbers when storing the input on a computer. If it can be shown that an implementation is backward stable, then it has been proved that the result could have been obtained through exact computations performed on slightly corrupted input. Thus, one can think of the error introduced by the implementation as being comparable to the error introduced when obtaining the input data in the first place.

When discussing error analyses, δx, the difference between x̌ and x, is the backward error and the difference f̌(x) − f(x) is the forward error. Throughout the remainder of this note we will be concerned with bounding the backward and/or forward errors introduced by the algorithms executed with floating point arithmetic.

The algorithm is said to be forward stable on domain D if for all x ∈ D it is the case that f̌(x) ≈ f(x). In other words, the computed result equals a slight perturbation of the exact result.

9.4.3 Absolute value of vectors and matrices

In the above discussion of error, the vague notions of "near" and "slightly perturbed" are used. Making these notions exact usually requires the introduction of measures of size for vectors and matrices, i.e., norms. Instead, for the operations analyzed in this note, all bounds are given in terms of the absolute values of the individual elements of the vectors and/or matrices. While it is easy to convert such bounds to bounds involving norms, the converse is not true.

Definition 9.7 Given x ∈ R^n and A ∈ R^{m×n},

$$ |x| = \begin{pmatrix} |\chi_0| \\ |\chi_1| \\ \vdots \\ |\chi_{n-1}| \end{pmatrix} \quad\text{and}\quad |A| = \begin{pmatrix} |\alpha_{0,0}| & |\alpha_{0,1}| & \cdots & |\alpha_{0,n-1}| \\ |\alpha_{1,0}| & |\alpha_{1,1}| & \cdots & |\alpha_{1,n-1}| \\ \vdots & \vdots & & \vdots \\ |\alpha_{m-1,0}| & |\alpha_{m-1,1}| & \cdots & |\alpha_{m-1,n-1}| \end{pmatrix}. $$

Definition 9.8 Let M ∈ {<, ≤, =, ≥, >} and x, y ∈ R^n. Then

$$ |x| \; M \; |y| \quad\text{iff}\quad |\chi_i| \; M \; |\psi_i|, \quad i = 0, \ldots, n-1. $$

Similarly, given A, B ∈ R^{m×n},

$$ |A| \; M \; |B| \quad\text{iff}\quad |\alpha_{ij}| \; M \; |\beta_{ij}|, \quad i = 0, \ldots, m-1, \; j = 0, \ldots, n-1. $$


The next Lemma is exploited in later sections:

Lemma 9.9 Let A ∈ Rm×k and B ∈ Rk×n. Then |AB| ≤ |A||B|.

Homework 9.10 Prove Lemma 9.9.
* SEE ANSWER

The fact that the bounds that we establish can be easily converted into bounds involving norms is a consequence of the following theorem, where ‖·‖_F indicates the Frobenius matrix norm.

Theorem 9.11 Let A,B ∈ Rm×n. If |A| ≤ |B| then ‖A‖1 ≤ ‖B‖1, ‖A‖∞ ≤ ‖B‖∞, and ‖A‖F ≤ ‖B‖F .

Homework 9.12 Prove Theorem 9.11.
* SEE ANSWER

9.5 Stability of the Dot Product Operation

The matrix-vector multiplication algorithm discussed in the next section requires the computation of the dot (inner) product (DOT) of vectors x, y ∈ R^n: κ := x^T y. In this section, we give an algorithm for this operation and the related error results.

9.5.1 An algorithm for computing DOT

We will consider the algorithm given in Figure 9.2. It uses the FLAME notation [24, 4] to express the computation

$$ \kappa := \Big( \big( (\chi_0\psi_0 + \chi_1\psi_1) + \cdots \big) + \chi_{n-2}\psi_{n-2} \Big) + \chi_{n-1}\psi_{n-1} \qquad (9.2) $$

in the indicated order.

9.5.2 A simple start

Before giving a general result, let us focus on the case where n = 2:

κ := χ0ψ0 +χ1ψ1.

Then, under the computational model given in Section 9.4, if κ := χ_0ψ_0 + χ_1ψ_1 is executed, the computed result, κ̌, satisfies

$$ \check\kappa = \left[ \begin{pmatrix} \chi_0 \\ \chi_1 \end{pmatrix}^T \begin{pmatrix} \psi_0 \\ \psi_1 \end{pmatrix} \right] = [\chi_0\psi_0 + \chi_1\psi_1] = [[\chi_0\psi_0] + [\chi_1\psi_1]] = [\chi_0\psi_0(1+\varepsilon_*^{(0)}) + \chi_1\psi_1(1+\varepsilon_*^{(1)})] $$
$$ = \left( \chi_0\psi_0(1+\varepsilon_*^{(0)}) + \chi_1\psi_1(1+\varepsilon_*^{(1)}) \right)(1+\varepsilon_+^{(1)}) = \chi_0\psi_0(1+\varepsilon_*^{(0)})(1+\varepsilon_+^{(1)}) + \chi_1\psi_1(1+\varepsilon_*^{(1)})(1+\varepsilon_+^{(1)}) $$


Algorithm: κ := DOT(x, y)

    κ := 0
    Partition x → ( x_T ; x_B ), y → ( y_T ; y_B ), where x_T and y_T are empty
    while m(x_T) < m(x) do
        Repartition
            ( x_T ; x_B ) → ( x_0 ; χ_1 ; x_2 ),  ( y_T ; y_B ) → ( y_0 ; ψ_1 ; y_2 )
        κ := κ + χ_1 ψ_1
        Continue with
            ( x_T ; x_B ) ← ( x_0 ; χ_1 ; x_2 ),  ( y_T ; y_B ) ← ( y_0 ; ψ_1 ; y_2 )
    endwhile

Figure 9.2: Algorithm for computing κ := x^T y.

Continuing,

$$ \check\kappa = \begin{pmatrix} \chi_0 \\ \chi_1 \end{pmatrix}^T \begin{pmatrix} \psi_0(1+\varepsilon_*^{(0)})(1+\varepsilon_+^{(1)}) \\ \psi_1(1+\varepsilon_*^{(1)})(1+\varepsilon_+^{(1)}) \end{pmatrix} = \begin{pmatrix} \chi_0 \\ \chi_1 \end{pmatrix}^T \begin{pmatrix} (1+\varepsilon_*^{(0)})(1+\varepsilon_+^{(1)}) & 0 \\ 0 & (1+\varepsilon_*^{(1)})(1+\varepsilon_+^{(1)}) \end{pmatrix} \begin{pmatrix} \psi_0 \\ \psi_1 \end{pmatrix} $$
$$ = \begin{pmatrix} \chi_0(1+\varepsilon_*^{(0)})(1+\varepsilon_+^{(1)}) \\ \chi_1(1+\varepsilon_*^{(1)})(1+\varepsilon_+^{(1)}) \end{pmatrix}^T \begin{pmatrix} \psi_0 \\ \psi_1 \end{pmatrix}, $$

where |ε_*^(0)|, |ε_*^(1)|, |ε_+^(1)| ≤ u.

Homework 9.13 Repeat the above steps for the computation

κ := ((χ0ψ0 +χ1ψ1)+χ2ψ2),

computing in the indicated order.
* SEE ANSWER
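In code, the algorithm in Figure 9.2 is simply a left-to-right accumulation. A sketch in Python (NumPy's dot may sum in a different order, so the two results can differ in their last bits):

    import numpy as np

    def dot(x, y):
        # kappa := (((chi_0 psi_0 + chi_1 psi_1) + ...) + chi_{n-1} psi_{n-1}),
        # accumulated strictly left to right, the order assumed by the analysis.
        kappa = 0.0
        for chi, psi in zip(x, y):
            kappa += chi * psi
        return kappa

    rng = np.random.default_rng(0)
    x, y = rng.random(1000), rng.random(1000)
    print(dot(x, y) - np.dot(x, y))   # tiny (roundoff-level) difference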


9.5.3 Preparation

Under the computational model given in Section 9.4, the computed result of (9.2), κ̌, satisfies

$$ \check\kappa = \left( \left( \left( \chi_0\psi_0(1+\varepsilon_*^{(0)}) + \chi_1\psi_1(1+\varepsilon_*^{(1)}) \right)(1+\varepsilon_+^{(1)}) + \cdots \right)(1+\varepsilon_+^{(n-2)}) + \chi_{n-1}\psi_{n-1}(1+\varepsilon_*^{(n-1)}) \right)(1+\varepsilon_+^{(n-1)}) $$
$$ = \sum_{i=0}^{n-1} \left( \chi_i\psi_i(1+\varepsilon_*^{(i)}) \prod_{j=i}^{n-1} (1+\varepsilon_+^{(j)}) \right), \qquad (9.3) $$

where ε_+^(0) = 0 and |ε_*^(0)|, |ε_*^(j)|, |ε_+^(j)| ≤ u for j = 1, …, n−1.

Clearly, a notation to keep expressions from becoming unreadable is desirable. For this reason we introduce the symbol θ_j:

Lemma 9.14 Let ε_i ∈ R, 0 ≤ i ≤ n−1, nu < 1, and |ε_i| ≤ u. Then there exists a θ_n ∈ R such that

$$ \prod_{i=0}^{n-1} (1+\varepsilon_i)^{\pm 1} = 1 + \theta_n, \quad\text{with } |\theta_n| \le nu/(1-nu). $$

Proof: By Mathematical Induction.

Base case. n = 1. Trivial.

Inductive Step. The Inductive Hypothesis (I.H.) tells us that for all ε_i ∈ R, 0 ≤ i ≤ n−1, nu < 1, and |ε_i| ≤ u, there exists a θ_n ∈ R such that

$$ \prod_{i=0}^{n-1} (1+\varepsilon_i)^{\pm 1} = 1+\theta_n, \quad\text{with } |\theta_n| \le nu/(1-nu). $$

We will show that if ε_i ∈ R, 0 ≤ i ≤ n, (n+1)u < 1, and |ε_i| ≤ u, then there exists a θ_{n+1} ∈ R such that

$$ \prod_{i=0}^{n} (1+\varepsilon_i)^{\pm 1} = 1+\theta_{n+1}, \quad\text{with } |\theta_{n+1}| \le (n+1)u/(1-(n+1)u). $$

Case 1: $\prod_{i=0}^{n}(1+\varepsilon_i)^{\pm 1} = \prod_{i=0}^{n-1}(1+\varepsilon_i)^{\pm 1}(1+\varepsilon_n)$. See Exercise 9.15.

Case 2: $\prod_{i=0}^{n}(1+\varepsilon_i)^{\pm 1} = \left(\prod_{i=0}^{n-1}(1+\varepsilon_i)^{\pm 1}\right)/(1+\varepsilon_n)$. By the I.H. there exists a θ_n such that $1+\theta_n = \prod_{i=0}^{n-1}(1+\varepsilon_i)^{\pm 1}$ and |θ_n| ≤ nu/(1−nu). Then

$$ \frac{\prod_{i=0}^{n-1}(1+\varepsilon_i)^{\pm 1}}{1+\varepsilon_n} = \frac{1+\theta_n}{1+\varepsilon_n} = 1 + \underbrace{\frac{\theta_n - \varepsilon_n}{1+\varepsilon_n}}_{\theta_{n+1}}, $$

which tells us how to pick θ_{n+1}. Now

$$ |\theta_{n+1}| = \left| \frac{\theta_n - \varepsilon_n}{1+\varepsilon_n} \right| \le \frac{|\theta_n| + u}{1-u} \le \frac{\frac{nu}{1-nu} + u}{1-u} = \frac{nu + (1-nu)u}{(1-nu)(1-u)} = \frac{(n+1)u - nu^2}{1-(n+1)u + nu^2} \le \frac{(n+1)u}{1-(n+1)u}. $$


By the Principle of Mathematical Induction, the result holds.

Homework 9.15 Complete the proof of Lemma 9.14.* SEE ANSWER

The quantity θ_n will be used throughout this note. It is not intended to be a specific number. Instead, it is an order of magnitude identified by the subscript n, which indicates the number of error factors of the form (1 + ε_i) and/or (1 + ε_i)^{−1} that are grouped together to form (1 + θ_n). Since the bound on |θ_n| occurs often, we assign it a symbol as follows:

Definition 9.16 For all n ≥ 1 and nu < 1, define γ_n := nu/(1 − nu).

With this notation, (9.3) simplifies to

$$ \check\kappa = \chi_0\psi_0(1+\theta_n) + \chi_1\psi_1(1+\theta_n) + \cdots + \chi_{n-1}\psi_{n-1}(1+\theta_2) \qquad (9.4) $$

$$ = \begin{pmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \\ \vdots \\ \chi_{n-1} \end{pmatrix}^T \begin{pmatrix} (1+\theta_n) & 0 & 0 & \cdots & 0 \\ 0 & (1+\theta_n) & 0 & \cdots & 0 \\ 0 & 0 & (1+\theta_{n-1}) & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & (1+\theta_2) \end{pmatrix} \begin{pmatrix} \psi_0 \\ \psi_1 \\ \psi_2 \\ \vdots \\ \psi_{n-1} \end{pmatrix} \qquad (9.5) $$

$$ = \begin{pmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \\ \vdots \\ \chi_{n-1} \end{pmatrix}^T \left( I + \begin{pmatrix} \theta_n & 0 & 0 & \cdots & 0 \\ 0 & \theta_n & 0 & \cdots & 0 \\ 0 & 0 & \theta_{n-1} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \theta_2 \end{pmatrix} \right) \begin{pmatrix} \psi_0 \\ \psi_1 \\ \psi_2 \\ \vdots \\ \psi_{n-1} \end{pmatrix}, $$

where |θ_j| ≤ γ_j, j = 2, …, n.

Two instances of the symbol θ_n, appearing even in the same expression, typically do not represent the same number. For example, in (9.4) a (1 + θ_n) multiplies each of the terms χ_0ψ_0 and χ_1ψ_1, but these two instances of θ_n, as a rule, do not denote the same quantity. In particular, one should be careful when factoring out such quantities.

As part of the analyses the following bounds will be useful to bound error that accumulates:

Lemma 9.17 If n,b≥ 1 then γn ≤ γn+b and γn + γb + γnγb ≤ γn+b.

Homework 9.18 Prove Lemma 9.17.
* SEE ANSWER


9.5.4 Target result

It is of interest to accumulate the roundoff error encountered during computation as a perturbation of input and/or output parameters:

• κ̌ = (x + δx)^T y; (κ̌ is the exact output for a slightly perturbed x)

• κ̌ = x^T (y + δy); (κ̌ is the exact output for a slightly perturbed y)

• κ̌ = x^T y + δκ. (κ̌ equals the exact result plus an error)

The first two are backward error results (error is accumulated onto input parameters, showing that the algorithm is numerically stable since it yields the exact output for a slightly perturbed input) while the last one is a forward error result (error is accumulated onto the answer). We will see that in different situations, a different error result may be needed by analyses of operations that require a dot product.

Let us focus on the second result. Ideally one would show that each of the entries of y is slightly perturbed relative to that entry:

$$ \delta y = \begin{pmatrix} \sigma_0\psi_0 \\ \vdots \\ \sigma_{n-1}\psi_{n-1} \end{pmatrix} = \begin{pmatrix} \sigma_0 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_{n-1} \end{pmatrix} \begin{pmatrix} \psi_0 \\ \vdots \\ \psi_{n-1} \end{pmatrix} = \Sigma y, $$

where each σ_i is "small" and Σ = diag(σ_0, …, σ_{n−1}). The following special structure of Σ, inspired by (9.5), will be used in the remainder of this note:

$$ \Sigma^{(n)} = \begin{cases} 0\times 0 \text{ matrix} & \text{if } n = 0 \\ \theta_1 & \text{if } n = 1 \\ \operatorname{diag}(\theta_n, \theta_n, \theta_{n-1}, \ldots, \theta_2) & \text{otherwise.} \end{cases} \qquad (9.6) $$

Recall that θ_j is an order of magnitude variable with |θ_j| ≤ γ_j.

Homework 9.19 Let k ≥ 0 and assume that |ε_1|, |ε_2| ≤ u, with ε_1 = 0 if k = 0. Show that

$$ \begin{pmatrix} I + \Sigma^{(k)} & 0 \\ 0 & (1+\varepsilon_1) \end{pmatrix} (1+\varepsilon_2) = I + \Sigma^{(k+1)}. $$

Hint: reason the case where k = 0 separately from the case where k > 0.
* SEE ANSWER

We state a theorem that captures how error is accumulated by the algorithm.

Theorem 9.20 Let x, y ∈ R^n and let κ := x^T y be computed by executing the algorithm in Figure 9.2. Then

$$ \check\kappa = [x^T y] = x^T (I + \Sigma^{(n)}) y. $$


9.5.5 A proof in traditional format

In the proof below, we pick symbols to denote vectors so that the proof can be easily related to the alternative framework to be presented in Section 9.5.6.

Proof: By Mathematical Induction on n, the length of vectors x and y.

Base case. m(x) = m(y) = 0. Trivial.

Inductive Step. I.H.: Assume that if x_T, y_T ∈ R^k, k > 0, then

$$ \mathrm{fl}(x_T^T y_T) = x_T^T (I + \Sigma_T) y_T, \quad\text{where } \Sigma_T = \Sigma^{(k)}. $$

We will show that when x_T, y_T ∈ R^{k+1}, the equality fl(x_T^T y_T) = x_T^T (I + Σ_T) y_T holds true again. Assume that x_T, y_T ∈ R^{k+1}, and partition

$$ x_T \to \begin{pmatrix} x_0 \\ \chi_1 \end{pmatrix} \quad\text{and}\quad y_T \to \begin{pmatrix} y_0 \\ \psi_1 \end{pmatrix}. $$

Then

$$ \begin{aligned} \mathrm{fl}\left( \begin{pmatrix} x_0 \\ \chi_1 \end{pmatrix}^T \begin{pmatrix} y_0 \\ \psi_1 \end{pmatrix} \right) &= \mathrm{fl}(\mathrm{fl}(x_0^T y_0) + \mathrm{fl}(\chi_1\psi_1)) && \text{(definition)} \\ &= \mathrm{fl}(x_0^T (I+\Sigma_0) y_0 + \mathrm{fl}(\chi_1\psi_1)) && \text{(I.H. with } x_T = x_0,\ y_T = y_0,\ \Sigma_0 = \Sigma^{(k)}) \\ &= \left( x_0^T (I+\Sigma_0) y_0 + \chi_1\psi_1(1+\varepsilon_*) \right)(1+\varepsilon_+) && \text{(SCM, twice)} \\ &= \begin{pmatrix} x_0 \\ \chi_1 \end{pmatrix}^T \begin{pmatrix} I+\Sigma_0 & 0 \\ 0 & 1+\varepsilon_* \end{pmatrix} (1+\varepsilon_+) \begin{pmatrix} y_0 \\ \psi_1 \end{pmatrix} && \text{(rearrangement)} \\ &= x_T^T (I + \Sigma_T) y_T && \text{(renaming)}, \end{aligned} $$

where |ε_*|, |ε_+| ≤ u, ε_+ = 0 if k = 0, and

$$ I + \Sigma_T = \begin{pmatrix} I+\Sigma_0 & 0 \\ 0 & 1+\varepsilon_* \end{pmatrix} (1+\varepsilon_+), \quad\text{so that } \Sigma_T = \Sigma^{(k+1)}. $$

By the Principle of Mathematical Induction, the result holds.

9.5.6 A weapon of math induction for the war on error (optional)

We focus the reader's attention on Figure 9.3, in which we present a framework, which we will call the error worksheet, for presenting the inductive proof of Theorem 9.20 side-by-side with the algorithm for DOT. This framework, in a slightly different form, was first introduced in [3]. The expressions enclosed in the grey boxes are predicates describing the state of the variables used in the algorithms and in their analysis. In the worksheet, we use superscripts to indicate the iteration number; thus, the symbols v^i and v^{i+1} do not denote two different variables, but two different states of variable v.


Step 1a:  κ := 0                                        { Σ = 0 }
Step 4:   Partition x → ( x_T ; x_B ), y → ( y_T ; y_B ), Σ → diag(Σ_T, Σ_B),
          where x_T and y_T are empty, and Σ_T is 0×0
Step 2a:  { κ̌ = x_T^T (I + Σ_T) y_T  ∧  Σ_T = Σ^(k)  ∧  m(x_T) = k }
Step 3:   while m(x_T) < m(x) do
Step 2b:      { κ̌ = x_T^T (I + Σ_T) y_T  ∧  Σ_T = Σ^(k)  ∧  m(x_T) = k }
Step 5a:      Repartition ( x_T ; x_B ) → ( x_0 ; χ_1 ; x_2 ), ( y_T ; y_B ) → ( y_0 ; ψ_1 ; y_2 ),
              diag(Σ_T, Σ_B) → diag(Σ_0^i, σ_1^i, Σ_2), where χ_1, ψ_1, and σ_1^i are scalars
Step 6:       { κ̌^i = x_0^T (I + Σ_0^i) y_0  ∧  Σ_0^i = Σ^(k)  ∧  m(x_0) = k }
Step 8:       κ := κ + χ_1 ψ_1        Error side:
                  κ̌^{i+1} = ( κ̌^i + χ_1ψ_1(1+ε_*) )(1+ε_+)            [SCM, twice (ε_+ = 0 if k = 0)]
                          = ( x_0^T (I + Σ^(k)) y_0 + χ_1ψ_1(1+ε_*) )(1+ε_+)   [Step 6: I.H.]
                          = ( x_0 ; χ_1 )^T diag(I + Σ^(k), 1+ε_*) (1+ε_+) ( y_0 ; ψ_1 )   [rearrange]
                          = ( x_0 ; χ_1 )^T ( I + Σ^(k+1) ) ( y_0 ; ψ_1 )      [Exercise 9.19]
Step 7:       { κ̌^{i+1} = ( x_0 ; χ_1 )^T ( I + diag(Σ_0^{i+1}, σ_1^{i+1}) ) ( y_0 ; ψ_1 )
                ∧  diag(Σ_0^{i+1}, σ_1^{i+1}) = Σ^(k+1)  ∧  m( ( x_0 ; χ_1 ) ) = k + 1 }
Step 5b:      Continue with ( x_T ; x_B ) ← ( x_0 ; χ_1 ; x_2 ), ( y_T ; y_B ) ← ( y_0 ; ψ_1 ; y_2 ),
              diag(Σ_T, Σ_B) ← diag(Σ_0^{i+1}, σ_1^{i+1}, Σ_2)
Step 2c:      { κ̌ = x_T^T (I + Σ_T) y_T  ∧  Σ_T = Σ^(k)  ∧  m(x_T) = k }
          endwhile
Step 2d:  { κ̌ = x_T^T (I + Σ_T) y_T  ∧  Σ_T = Σ^(k)  ∧  m(x_T) = k  ∧  m(x_T) = m(x) }
Step 1b:  { κ̌ = x^T (I + Σ^(n)) y  ∧  m(x) = n }

Figure 9.3: Error worksheet completed to establish the backward error result for the given algorithm that computes the DOT operation.


The proof presented in Figure 9.3 goes hand in hand with the algorithm, as it shows that before and after each iteration of the loop that computes κ := x^T y, the variables κ̌, x_T, y_T, Σ_T are such that the predicate

$$ \check\kappa = x_T^T (I + \Sigma_T) y_T \;\wedge\; k = m(x_T) \;\wedge\; \Sigma_T = \Sigma^{(k)} \qquad (9.7) $$

holds true. This relation is satisfied at each iteration of the loop, so it is also satisfied when the loop completes. Upon completion, the loop guard is m(x_T) = m(x) = n, which implies that κ̌ = x^T (I + Σ^(n)) y, i.e., the thesis of the theorem, is satisfied too.

In detail, the inductive proof of Theorem 9.20 is captured by the error worksheet as follows:

Base case. In Step 2a, i.e., before the execution of the loop, predicate (9.7) is satisfied, as k = m(x_T) = 0.

Inductive step. Assume that the predicate (9.7) holds true at Step 2b, i.e., at the top of the loop. Then Steps 6, 7, and 8 in Figure 9.3 prove that the predicate is satisfied again at Step 2c, i.e., the bottom of the loop. Specifically,

• Step 6 holds by virtue of the equalities x_0 = x_T, y_0 = y_T, and Σ_0^i = Σ_T.

• The update in Step 8-left introduces the error indicated in Step 8-right (SCM, twice), yielding the results for Σ_0^{i+1} and σ_1^{i+1}, leaving the variables in the state indicated in Step 7.

• Finally, the redefinition of Σ_T in Step 5b transforms the predicate in Step 7 into that of Step 2c, completing the inductive step.

By the Principle of Mathematical Induction, the predicate (9.7) holds for all iterations. In particular, when the loop terminates, the predicate becomes

$$ \check\kappa = x^T (I + \Sigma^{(n)}) y \;\wedge\; n = m(x_T). $$

This completes the discussion of the proof as captured by Figure 9.3.

In the derivation of algorithms, the concept of loop-invariant plays a central role. Let L be a loop and P a predicate. If P is true before the execution of L, at the beginning and at the end of each iteration of L, and after the completion of L, then predicate P is a loop-invariant with respect to L. Similarly, we give the definition of error-invariant.

Definition 9.21 We call the predicate involving the operands and error operands in Steps 2a–d the error-invariant for the analysis. This predicate is true before and after each iteration of the loop.

For any algorithm, the loop-invariant and the error-invariant are related in that the former describes thestatus of the computation at the beginning and the end of each iteration, while the latter captures an errorresult for the computation indicated by the loop-invariant.

The reader will likely think that the error worksheet is overkill when proving the error result for the dot product. We agree. However, it links a proof by induction to the execution of a loop, which we believe is useful. Elsewhere, as more complex operations are analyzed, the benefits of the structure that the error worksheet provides will become more obvious. (We will analyze more complex algorithms as the course proceeds.)


9.5.7 Results

A number of useful consequences of Theorem 9.20 follow. These will be used later as an inventory (library) of error results from which to draw when analyzing operations and algorithms that utilize DOT.

Corollary 9.22 Under the assumptions of Theorem 9.20 the following relations hold:

R1-B: (Backward analysis) κ̌ = (x + δx)^T y, where |δx| ≤ γ_n|x|, and κ̌ = x^T (y + δy), where |δy| ≤ γ_n|y|;

R1-F: (Forward analysis) κ̌ = x^T y + δκ, where |δκ| ≤ γ_n |x|^T |y|.

Proof: We leave the proof of R1-B as an exercise. For R1-F, let δκ = x^T Σ^(n) y, where Σ^(n) is as in Theorem 9.20. Then

$$ |\delta\kappa| = |x^T \Sigma^{(n)} y| \le |\chi_0||\theta_n||\psi_0| + |\chi_1||\theta_n||\psi_1| + \cdots + |\chi_{n-1}||\theta_2||\psi_{n-1}| $$
$$ \le \gamma_n|\chi_0||\psi_0| + \gamma_n|\chi_1||\psi_1| + \cdots + \gamma_2|\chi_{n-1}||\psi_{n-1}| \le \gamma_n |x|^T |y|. $$

Homework 9.23 Prove R1-B.
* SEE ANSWER
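The forward error bound R1-F can be checked numerically. A sketch in Python, using exact rational arithmetic for the reference value (the only assumption is u = 2^{−53} for double precision):

    from fractions import Fraction
    import numpy as np

    rng = np.random.default_rng(1)
    n = 1000
    x, y = rng.standard_normal(n), rng.standard_normal(n)

    kappa_check = 0.0
    for chi, psi in zip(x, y):               # left-to-right order of Figure 9.2
        kappa_check += chi * psi

    exact = sum(Fraction(chi) * Fraction(psi) for chi, psi in zip(x, y))
    u = Fraction(1, 2**53)                   # unit roundoff, double precision
    gamma_n = n * u / (1 - n * u)
    bound = gamma_n * sum(abs(Fraction(chi)) * abs(Fraction(psi)) for chi, psi in zip(x, y))
    print(abs(Fraction(kappa_check) - exact) <= bound)   # True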

9.6 Stability of a Matrix-Vector Multiplication Algorithm

In this section, we discuss the numerical stability of the specific matrix-vector multiplication algorithm that computes y := Ax via dot products. This allows us to show how results for the dot product can be used in the setting of a more complicated algorithm.

9.6.1 An algorithm for computing GEMV

We will consider the algorithm given in Figure 9.4 for computing y := Ax, which computes y via dot products.

9.6.2 Analysis

Assume A ∈ R^{m×n} and partition

$$ A = \begin{pmatrix} a_0^T \\ a_1^T \\ \vdots \\ a_{m-1}^T \end{pmatrix} \quad\text{and}\quad y = \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{m-1} \end{pmatrix}. $$


Algorithm: y := GEMV(A, x, y)

    Partition A → ( A_T ; A_B ), y → ( y_T ; y_B ), where A_T has 0 rows and y_T is empty
    while m(A_T) < m(A) do
        Repartition
            ( A_T ; A_B ) → ( A_0 ; a_1^T ; A_2 ),  ( y_T ; y_B ) → ( y_0 ; ψ_1 ; y_2 )
        ψ_1 := a_1^T x
        Continue with
            ( A_T ; A_B ) ← ( A_0 ; a_1^T ; A_2 ),  ( y_T ; y_B ) ← ( y_0 ; ψ_1 ; y_2 )
    endwhile

Figure 9.4: Algorithm for computing y := Ax.

Then

$$ \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_{m-1} \end{pmatrix} := \begin{pmatrix} a_0^T x \\ a_1^T x \\ \vdots \\ a_{m-1}^T x \end{pmatrix}. $$

From Corollary 9.22 R1-B regarding the dot product we know that

$$ \check y = \begin{pmatrix} \check\psi_0 \\ \check\psi_1 \\ \vdots \\ \check\psi_{m-1} \end{pmatrix} = \begin{pmatrix} (a_0 + \delta a_0)^T x \\ (a_1 + \delta a_1)^T x \\ \vdots \\ (a_{m-1} + \delta a_{m-1})^T x \end{pmatrix} = \left( \begin{pmatrix} a_0^T \\ a_1^T \\ \vdots \\ a_{m-1}^T \end{pmatrix} + \begin{pmatrix} \delta a_0^T \\ \delta a_1^T \\ \vdots \\ \delta a_{m-1}^T \end{pmatrix} \right) x = (A + \Delta A)x, $$

where |δa_i| ≤ γ_n|a_i|, i = 0, …, m−1, and hence |ΔA| ≤ γ_n|A|.


Also, from Corollary 9.22 R1-F regarding the dot product we know that

$$ \check y = \begin{pmatrix} \check\psi_0 \\ \check\psi_1 \\ \vdots \\ \check\psi_{m-1} \end{pmatrix} = \begin{pmatrix} a_0^T x + \delta\psi_0 \\ a_1^T x + \delta\psi_1 \\ \vdots \\ a_{m-1}^T x + \delta\psi_{m-1} \end{pmatrix} = \begin{pmatrix} a_0^T \\ a_1^T \\ \vdots \\ a_{m-1}^T \end{pmatrix} x + \begin{pmatrix} \delta\psi_0 \\ \delta\psi_1 \\ \vdots \\ \delta\psi_{m-1} \end{pmatrix} = Ax + \delta y, $$

where |δψ_i| ≤ γ_n |a_i|^T |x| and hence |δy| ≤ γ_n |A||x|.

The above observations can be summarized in the following theorem:

Theorem 9.24 (Error results for matrix-vector multiplication) Let A ∈ R^{m×n}, x ∈ R^n, y ∈ R^m and consider the assignment y := Ax implemented via the algorithm in Figure 9.4. Then these equalities hold:

R1-B: y̌ = (A + ΔA)x, where |ΔA| ≤ γ_n|A|.

R2-F: y̌ = Ax + δy, where |δy| ≤ γ_n|A||x|.

Homework 9.25 In the above theorem, could one instead prove the result

y = A(x+δx),

where δx is “small”?* SEE ANSWER
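The componentwise bound R2-F is likewise easy to test. A sketch in Python (the higher-precision reference uses np.longdouble, whose width is platform dependent, so this is only an approximate check):

    import numpy as np

    rng = np.random.default_rng(2)
    m, n = 20, 500
    A, x = rng.standard_normal((m, n)), rng.standard_normal(n)

    y_check = np.array([np.dot(A[i, :], x) for i in range(m)])  # Figure 9.4: one DOT per row

    u = 2.0 ** -53
    gamma_n = n * u / (1 - n * u)
    y_ref = (A.astype(np.longdouble) @ x.astype(np.longdouble)).astype(np.float64)
    print(np.all(np.abs(y_check - y_ref) <= gamma_n * (np.abs(A) @ np.abs(x))))  # True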

9.7 Stability of a Matrix-Matrix Multiplication Algorithm

In this section, we discuss the numerical stability of the specific matrix-matrix multiplication algorithm that computes C := AB via the matrix-vector multiplication algorithm from the last section, where C ∈ R^{m×n}, A ∈ R^{m×k}, and B ∈ R^{k×n}.

9.7.1 An algorithm for computing GEMM

We will consider the algorithm given in Figure 9.5 for computing C := AB, which computes one column at a time so that the matrix-vector multiplication algorithm from the last section can be used.

9.7.2 Analysis

Partition

$$ C = \begin{pmatrix} c_0 & c_1 & \cdots & c_{n-1} \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} b_0 & b_1 & \cdots & b_{n-1} \end{pmatrix}. $$

Then

$$ \begin{pmatrix} c_0 & c_1 & \cdots & c_{n-1} \end{pmatrix} := \begin{pmatrix} Ab_0 & Ab_1 & \cdots & Ab_{n-1} \end{pmatrix}. $$

From Theorem 9.24 R2-F regarding matrix-vector multiplication we know that

$$ \check C = \begin{pmatrix} \check c_0 & \check c_1 & \cdots & \check c_{n-1} \end{pmatrix} = \begin{pmatrix} Ab_0 + \delta c_0 & Ab_1 + \delta c_1 & \cdots & Ab_{n-1} + \delta c_{n-1} \end{pmatrix} $$
$$ = \begin{pmatrix} Ab_0 & Ab_1 & \cdots & Ab_{n-1} \end{pmatrix} + \begin{pmatrix} \delta c_0 & \delta c_1 & \cdots & \delta c_{n-1} \end{pmatrix} = AB + \Delta C, $$


Algorithm: C := GEMM(A, B, C)

    Partition C → ( C_L | C_R ), B → ( B_L | B_R ), where C_L and B_L have 0 columns
    while n(C_L) < n(C) do
        Repartition
            ( C_L | C_R ) → ( C_0 | c_1 | C_2 ),  ( B_L | B_R ) → ( B_0 | b_1 | B_2 )
        c_1 := A b_1
        Continue with
            ( C_L | C_R ) ← ( C_0 | c_1 | C_2 ),  ( B_L | B_R ) ← ( B_0 | b_1 | B_2 )
    endwhile

Figure 9.5: Algorithm for computing C := AB one column at a time.

where |δc_j| ≤ γ_k|A||b_j|, j = 0, …, n−1, and hence |ΔC| ≤ γ_k|A||B|.

The above observations can be summarized in the following theorem:

Theorem 9.26 ((Forward) error results for matrix-matrix multiplication) Let C ∈ R^{m×n}, A ∈ R^{m×k}, and B ∈ R^{k×n} and consider the assignment C := AB implemented via the algorithm in Figure 9.5. Then the following equality holds:

R1-F: Č = AB + ΔC, where |ΔC| ≤ γ_k|A||B|.

Homework 9.27 In the above theorem, could one instead prove the result

C = (A+∆A)(B+∆B),

where ΔA and ΔB are "small"?
* SEE ANSWER

9.7.3 An application

A collaborator of ours recently implemented a matrix-matrix multiplication algorithm and wanted to check if it gave the correct answer. To do so, he followed these steps:

• He created random matrices A ∈ R^{m×k} and B ∈ R^{k×n}, with positive entries in the range (0, 1).

• He computed Ĉ = AB with an implementation that was known to be "correct" and assumed it yields the exact solution. (Of course, it has error in it as well. We discuss how he compensated for that below.)

• He computed Č = AB with his new implementation.

• He computed ΔC = Č − Ĉ and checked that each of its entries satisfied |δγ_{i,j}| ≤ 2ku γ̂_{i,j}, where γ̂_{i,j} denotes the (i, j) entry of Ĉ.


• In the above, he took advantage of the fact that A and B had positive entries so that |A||B| = AB = Ĉ (in exact arithmetic). He also approximated γ_k = ku/(1 − ku) with ku, and introduced the factor 2 to compensate for the fact that Ĉ itself was inexactly computed. A code sketch of this check follows.
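The following is a sketch of the collaborator's check in Python (new_matmul is a hypothetical stand-in for the implementation under test; here it is faked by reassociating the product):

    import numpy as np

    def new_matmul(A, B):
        return (B.T @ A.T).T             # hypothetical "new" implementation

    rng = np.random.default_rng(3)
    m, k, n = 100, 200, 50
    A, B = rng.random((m, k)), rng.random((k, n))   # positive entries in (0, 1)

    C_ref = A @ B                        # trusted implementation, taken as "exact"
    C_new = new_matmul(A, B)
    u = 2.0 ** -53                       # unit roundoff, double precision

    # A, B positive implies |A||B| = AB = C_ref; gamma_k is approximated by k*u,
    # and the factor 2 allows for the roundoff in C_ref itself.
    print(np.all(np.abs(C_new - C_ref) <= 2 * k * u * C_ref))   # expect True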


Chapter 10. Notes on Performance

How to attain high performance on modern architectures is of importance: linear algebra is fundamental to scientific computing. Scientific computing often involves very large problems that require the fastest computers in the world to be employed. One wants to use such computers efficiently.

For now, we suggest the reader become familiar with the following resources:

• Week 5 of Linear Algebra: Foundations to Frontiers - Notes to LAFF With [29]. Focus on Section 5.4 Enrichment.

• Kazushige Goto and Robert van de Geijn. Anatomy of high-performance matrix multiplication [22]. ACM Transactions on Mathematical Software, 34 (3), 2008.

• Field G. Van Zee and Robert van de Geijn. BLIS: A Framework for Rapid Instantiation of BLAS Functionality [41]. ACM Transactions on Mathematical Software, to appear.

• Robert van de Geijn. How to Optimize Gemm. wiki.cs.utexas.edu/rvdg/HowToOptimizeGemm. (An exercise on how to write a high-performance matrix-matrix multiplication in C.)

A similar exercise: Michael Lehn. GEMM: From Pure C to SSE Optimized Micro Kernels. http://apfel.mathematik.uni-ulm.de/~lehn/sghpc/gemm/index.html.



Chapter 11. Notes on Gaussian Elimination and LU Factorization

The LU factorization is also known as the LU decomposition, and the operations it performs are equivalent to those performed by Gaussian elimination. For details, we recommend that the reader consult Weeks 6 and 7 of "Linear Algebra: Foundations to Frontiers - Notes to LAFF With" [29].

Video

Read disclaimer regarding the videos in the preface!

Lecture on Gaussian Elimination and LU factorization:
* YouTube
* Download from UT Box
* View After Local Download

Lecture on deriving dense linear algebra algorithms:
* YouTube
* Download from UT Box
* View After Local Download
* Slides

(For help on viewing, see Appendix A.)


Outline

Video

Outline

11.1. Definition and Existence
11.2. LU Factorization
11.2.1. First derivation
11.2.2. Gauss transforms
11.2.3. Cost of LU factorization
11.3. LU Factorization with Partial Pivoting
11.3.1. Permutation matrices
11.3.2. The algorithm
11.4. Proof of Theorem 11.3
11.5. LU with Complete Pivoting
11.6. Solving Ax = y Via the LU Factorization with Pivoting
11.7. Solving Triangular Systems of Equations
11.7.1. Lz = y
11.7.2. Ux = z
11.8. Other LU Factorization Algorithms
11.8.1. Variant 1: Bordered algorithm
11.8.2. Variant 2: Left-looking algorithm
11.8.3. Variant 3: Up-looking variant
11.8.4. Variant 4: Crout variant
11.8.5. Variant 5: Classical LU factorization
11.8.6. All algorithms
11.8.7. Formal derivation of algorithms
11.9. Numerical Stability Results
11.10. Is LU with Partial Pivoting Stable?
11.11. Blocked Algorithms
11.11.1. Blocked classical LU factorization (Variant 5)
11.11.2. Blocked classical LU factorization with pivoting (Variant 5)
11.12. Variations on a Triple-Nested Loop


11.1 Definition and Existence

Definition 11.1 (LU factorization (decomposition)) Given a matrix A ∈ C^{m×n} with m ≥ n, its LU factorization is given by A = LU, where L ∈ C^{m×n} is unit lower trapezoidal and U ∈ C^{n×n} is upper triangular.

The first question we will ask is when the LU factorization exists. For this, we need a definition.

Definition 11.2 The k×k principal leading submatrix of a matrix A is defined to be the square matrix A_TL ∈ C^{k×k} such that

$$ A = \begin{pmatrix} A_{TL} & A_{TR} \\ A_{BL} & A_{BR} \end{pmatrix}. $$

This definition allows us to indicate when a matrix has an LU factorization:

Theorem 11.3 (Existence) Let A ∈ C^{m×n} with m ≥ n have linearly independent columns. Then A has a unique LU factorization if and only if all its principal leading submatrices are nonsingular.

The proof of this theorem is a bit involved and can be found in Section 11.4.

11.2 LU Factorization

We are going to present two different ways of deriving the most commonly known algorithm. The first is a straightforward derivation. The second presents the operation as the application of a sequence of Gauss transforms.

11.2.1 First derivation

Partition A, L, and U as follows:

$$ A \to \begin{pmatrix} \alpha_{11} & a_{12}^T \\ a_{21} & A_{22} \end{pmatrix}, \quad L \to \begin{pmatrix} 1 & 0 \\ l_{21} & L_{22} \end{pmatrix}, \quad\text{and}\quad U \to \begin{pmatrix} \upsilon_{11} & u_{12}^T \\ 0 & U_{22} \end{pmatrix}. $$

Then A = LU means that

$$ \begin{pmatrix} \alpha_{11} & a_{12}^T \\ a_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ l_{21} & L_{22} \end{pmatrix} \begin{pmatrix} \upsilon_{11} & u_{12}^T \\ 0 & U_{22} \end{pmatrix} = \begin{pmatrix} \upsilon_{11} & u_{12}^T \\ l_{21}\upsilon_{11} & l_{21}u_{12}^T + L_{22}U_{22} \end{pmatrix}. $$

This means that

$$ \alpha_{11} = \upsilon_{11}, \quad a_{12}^T = u_{12}^T, \quad a_{21} = \upsilon_{11}l_{21}, \quad A_{22} = l_{21}u_{12}^T + L_{22}U_{22}, $$

or, equivalently,

$$ \alpha_{11} = \upsilon_{11}, \quad a_{12}^T = u_{12}^T, \quad a_{21} = \upsilon_{11}l_{21}, \quad A_{22} - l_{21}u_{12}^T = L_{22}U_{22}. $$

If we let U overwrite the original matrix A, this suggests the algorithm:

Page 178: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 11. Notes on Gaussian Elimination and LU Factorization 162

• l_21 := a_21 / α_11.

• a_21 := 0.

• A_22 := A_22 − l_21 a_12^T.

• Continue by overwriting the updated A_22 with its LU factorization.

This is captured in the algorithm in Figure 11.1.
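A sketch of this algorithm in Python/NumPy (overwriting A with L and U, as in Figure 11.1; no pivoting, so it assumes the α_11's it divides by are nonzero):

    import numpy as np

    def lu(A):
        A = A.copy()                               # will hold L (strictly lower) and U
        for j in range(A.shape[1]):
            A[j+1:, j] /= A[j, j]                  # l21 := a21 / alpha11
            A[j+1:, j+1:] -= np.outer(A[j+1:, j], A[j, j+1:])   # A22 := A22 - l21 a12^T
        return A

    A = np.array([[4., 3.], [6., 3.]])
    LU = lu(A)
    L = np.tril(LU, -1) + np.eye(2)
    U = np.triu(LU)
    print(np.allclose(L @ U, A))                   # True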

11.2.2 Gauss transforms

Definition 11.4 A matrix L̃_k of the form

$$ \tilde L_k = \begin{pmatrix} I_k & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -l_{21} & I \end{pmatrix}, $$

where I_k is k×k, is called a Gauss transform.

Example 11.5 Gauss transforms can be used to take multiples of a row and subtract these multiples from other rows:

$$ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & -\lambda_{21} & 1 & 0 \\ 0 & -\lambda_{31} & 0 & 1 \end{pmatrix} \begin{pmatrix} a_0^T \\ a_1^T \\ a_2^T \\ a_3^T \end{pmatrix} = \begin{pmatrix} a_0^T \\ a_1^T \\ \begin{pmatrix} a_2^T \\ a_3^T \end{pmatrix} - \begin{pmatrix} \lambda_{21} \\ \lambda_{31} \end{pmatrix} a_1^T \end{pmatrix} = \begin{pmatrix} a_0^T \\ a_1^T \\ a_2^T - \lambda_{21}a_1^T \\ a_3^T - \lambda_{31}a_1^T \end{pmatrix}. $$

Notice the similarity with what one does in Gaussian elimination: take multiples of one row and subtract these from other rows.

Now assume that the LU factorization in the previous subsection has proceeded to the point where A contains

$$ \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix}, $$

where A_00 is upper triangular (recall: it is being overwritten by U!). What we would like to do is eliminate the elements in a_21 by taking multiples of the "current row" (α_11  a_12^T) and subtracting these from the rest of the rows: (a_21  A_22). The vehicle is a Gauss transform: we must determine l_21 so that

$$ \begin{pmatrix} I & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -l_{21} & I \end{pmatrix} \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & 0 & A_{22} - l_{21}a_{12}^T \end{pmatrix}. $$


Algorithm: Compute LU factorization of A, overwriting L with factor L and A with factor U

    Partition A → ( A_TL  A_TR ; A_BL  A_BR ), L → ( L_TL  0 ; L_BL  L_BR ),
        where A_TL and L_TL are 0×0
    while n(A_TL) < n(A) do
        Repartition
            ( A_TL  A_TR ; A_BL  A_BR ) → ( A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 ),
            ( L_TL  0 ; L_BL  L_BR ) → ( L_00  0  0 ; l_10^T  λ_11  0 ; L_20  l_21  L_22 ),
            where α_11 and λ_11 are 1×1
        l_21 := a_21 / α_11
        A_22 := A_22 − l_21 a_12^T
        (a_21 := 0)

        or, alternatively,

        l_21 := a_21 / α_11
        ( A_00  a_01  A_02 ; 0  α_11  a_12^T ; 0  a_21  A_22 )
            := ( I  0  0 ; 0  1  0 ; 0  −l_21  I ) ( A_00  a_01  A_02 ; 0  α_11  a_12^T ; 0  a_21  A_22 )
            = ( A_00  a_01  A_02 ; 0  α_11  a_12^T ; 0  0  A_22 − l_21 a_12^T )

        Continue with
            ( A_TL  A_TR ; A_BL  A_BR ) ← ( A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 ),
            ( L_TL  0 ; L_BL  L_BR ) ← ( L_00  0  0 ; l_10^T  λ_11  0 ; L_20  l_21  L_22 )
    endwhile

Figure 11.1: Most commonly known algorithm for overwriting a matrix with its LU factorization.


This means we must pick l_21 = a_21/α_11, since

$$ \begin{pmatrix} I & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -l_{21} & I \end{pmatrix} \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} - \alpha_{11}l_{21} & A_{22} - l_{21}a_{12}^T \end{pmatrix}. $$

The resulting algorithm is summarized in Figure 11.1 under "or, alternatively". Notice that this algorithm is identical to the algorithm for computing the LU factorization discussed before!

How can this be? The following set of exercises explains it.

Homework 11.6 Show that

$$ \begin{pmatrix} I_k & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -l_{21} & I \end{pmatrix}^{-1} = \begin{pmatrix} I_k & 0 & 0 \\ 0 & 1 & 0 \\ 0 & l_{21} & I \end{pmatrix}. $$

* SEE ANSWER

Now, clearly, what the algorithm does is compute a sequence of n Gauss transforms L̃_0, …, L̃_{n−1} such that L̃_{n−1} L̃_{n−2} ⋯ L̃_1 L̃_0 A = U, or, equivalently, A = L_0 L_1 ⋯ L_{n−2} L_{n−1} U, where L_k = L̃_k^{−1}. What we will show next is that L = L_0 L_1 ⋯ L_{n−2} L_{n−1} is the unit lower triangular matrix computed by the LU factorization.

Homework 11.7 Let L̄_k = L_0 L_1 ⋯ L_k. Assume that L̄_{k−1} has the form

$$ \bar L_{k-1} = \begin{pmatrix} L_{00} & 0 & 0 \\ l_{10}^T & 1 & 0 \\ L_{20} & 0 & I \end{pmatrix}, $$

where L_00 is k×k. Show that L̄_k is given by

$$ \bar L_k = \begin{pmatrix} L_{00} & 0 & 0 \\ l_{10}^T & 1 & 0 \\ L_{20} & l_{21} & I \end{pmatrix}. $$

(Recall: L_k = L̃_k^{−1} = ( I  0  0 ; 0  1  0 ; 0  l_21  I ).)
* SEE ANSWER

What this exercise shows is that L = L0L1 · · ·Ln−2Ln−1 is the triangular matrix that is created by simplyplacing the computed vectors l21 below the diagonal of a unit lower triangular matrix.

11.2.3 Cost of LU factorization

The cost of the LU factorization algorithm given in Figure 11.1 can be analyzed as follows:

• Assume A is n×n.

• During the kth iteration, A_TL is initially k×k.


• Computing l_21 := a_21/α_11 is typically implemented as β := 1/α_11 and then the scaling l_21 := β a_21. The reason is that divisions are expensive relative to multiplications. We will ignore the cost of the division (which will be insignificant if n is large). Thus, we count this as n − k − 1 multiplies.

• The rank-1 update of A_22 requires (n − k − 1)^2 multiplications and (n − k − 1)^2 additions.

• Thus, the total cost (in flops) can be approximated by

$$ \sum_{k=0}^{n-1} \left[ (n-k-1) + 2(n-k-1)^2 \right] = \sum_{j=0}^{n-1} \left[ j + 2j^2 \right] \quad (\text{change of variable } j = n-k-1) $$
$$ = \sum_{j=0}^{n-1} j + 2\sum_{j=0}^{n-1} j^2 \approx \frac{n(n-1)}{2} + 2\int_0^n x^2\,dx = \frac{n(n-1)}{2} + \frac{2}{3}n^3 \approx \frac{2}{3}n^3. $$

Notice that this involves roughly half the number of floating point operations as are required for a Householder transformation based QR factorization.

11.3 LU Factorization with Partial Pivoting

It is well-known that the LU factorization is numerically unstable under general circumstances. In particular, a backward stability analysis, given for example in [3, 8, 6] and summarized in Section 11.9, shows that the computed matrices Ľ and Ǔ satisfy

$$ (A + \Delta A) = \check L \check U \quad\text{where } |\Delta A| \le \gamma_n |\check L||\check U|. $$

(This is the backward error result for the Crout variant for LU factorization, discussed later in this note. Some of the other variants have an error result of (A + ΔA) = ĽǓ where |ΔA| ≤ γ_n(|A| + |Ľ||Ǔ|).) Now, if α_11 is small in magnitude compared to the entries of a_21, then not only will l_21 have large entries, but the update A_22 − l_21 a_12^T will potentially introduce large entries in the updated A_22 (in other words, the part of matrix A from which the future matrix U will be computed), a phenomenon referred to as element growth. To overcome this, we will swap rows in A as the factorization proceeds, resulting in an algorithm known as LU factorization with partial pivoting.

11.3.1 Permutation matrices

Definition 11.8 An n×n matrix P is said to be a permutation matrix, or permutation, if, when applied to a vector x = (χ_0, χ_1, …, χ_{n−1})^T, it merely rearranges the order of the elements in that vector. Such a permutation can be represented by the vector of integers (π_0, π_1, …, π_{n−1})^T, where {π_0, π_1, …, π_{n−1}} is a permutation of the integers {0, 1, …, n−1}, and the permuted vector Px is given by (χ_{π_0}, χ_{π_1}, …, χ_{π_{n−1}})^T.


Algorithm: Compute LU factorization with partial pivoting of A, overwriting L with factor L and A with factor U. The pivot vector is returned in p.

    Partition A → ( A_TL  A_TR ; A_BL  A_BR ), L → ( L_TL  0 ; L_BL  L_BR ), p → ( p_T ; p_B ),
        where A_TL and L_TL are 0×0 and p_T is 0×1
    while n(A_TL) < n(A) do
        Repartition
            ( A_TL  A_TR ; A_BL  A_BR ) → ( A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 ),
            ( L_TL  0 ; L_BL  L_BR ) → ( L_00  0  0 ; l_10^T  λ_11  0 ; L_20  l_21  L_22 ),
            ( p_T ; p_B ) → ( p_0 ; π_1 ; p_2 ),
            where α_11, λ_11, π_1 are 1×1
        π_1 = maxi( ( α_11 ; a_21 ) )
        ( α_11  a_12^T ; a_21  A_22 ) := P(π_1) ( α_11  a_12^T ; a_21  A_22 )
        l_21 := a_21 / α_11
        A_22 := A_22 − l_21 a_12^T
        (a_21 := 0)
        Continue with
            ( A_TL  A_TR ; A_BL  A_BR ) ← ( A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 ),
            ( L_TL  0 ; L_BL  L_BR ) ← ( L_00  0  0 ; l_10^T  λ_11  0 ; L_20  l_21  L_22 ),
            ( p_T ; p_B ) ← ( p_0 ; π_1 ; p_2 )
    endwhile

Figure 11.2: LU factorization with partial pivoting.


If P is a permutation matrix, then PA rearranges the rows of A exactly as the elements of x are rearranged by Px.

We will see that when discussing the LU factorization with partial pivoting, a permutation matrix that swaps the first element of a vector with the π-th element of that vector is a fundamental tool. We will denote that matrix by

$$ P(\pi) = \begin{cases} I_n & \text{if } \pi = 0 \\ \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & I_{\pi-1} & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & I_{n-\pi-1} \end{pmatrix} & \text{otherwise,} \end{cases} $$

where n is the dimension of the permutation matrix. In the following we will use the notation P_n to indicate that the matrix P is of size n. Let p be a vector of integers satisfying the conditions

$$ p = (\pi_0, \ldots, \pi_{k-1})^T, \quad\text{where } 1 \le k \le n \text{ and } 0 \le \pi_i < n - i; \qquad (11.1) $$

then P_n(p) will denote the permutation:

$$ P_n(p) = \begin{pmatrix} I_{k-1} & 0 \\ 0 & P_{n-k+1}(\pi_{k-1}) \end{pmatrix} \begin{pmatrix} I_{k-2} & 0 \\ 0 & P_{n-k+2}(\pi_{k-2}) \end{pmatrix} \cdots \begin{pmatrix} 1 & 0 \\ 0 & P_{n-1}(\pi_1) \end{pmatrix} P_n(\pi_0). $$

Remark 11.9 In the algorithms, the subscript that indicates the matrix dimensions is omitted.

Example 11.10 Let a_0^T, a_1^T, …, a_{n−1}^T be the rows of a matrix A. The application of P(p) to A yields a matrix that results from swapping row a_0^T with a_{π_0}^T, then swapping a_1^T with a_{π_1+1}^T, a_2^T with a_{π_2+2}^T, until finally a_{k−1}^T is swapped with a_{π_{k−1}+k−1}^T.

Remark 11.11 For those familiar with how pivot information is stored in LINPACK and LAPACK, noticethat those packages store the vector of pivot information (π0 +1,π1 +2, . . . ,πk−1 + k)T.
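As a sketch, applying P(p) to the rows of a matrix (with p stored as offsets, as in (11.1) and Example 11.10) looks as follows in Python:

    import numpy as np

    def apply_pivots(p, A):
        # Row i is swapped with row i + p[i], for i = 0, 1, ... (Example 11.10).
        A = A.copy()
        for i, pi in enumerate(p):
            A[[i, i + pi], :] = A[[i + pi, i], :]
        return A

    A = np.arange(16, dtype=float).reshape(4, 4)
    print(apply_pivots([2, 0, 1], A))    # swaps rows 0 and 2, then rows 2 and 3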

11.3.2 The algorithm

Having introduced our notation for permutation matrices, we can now define the LU factorization with partial pivoting: Given an n×n matrix A, we wish to compute a) a vector p of n integers which satisfies the conditions (11.1), b) a unit lower trapezoidal matrix L, and c) an upper triangular matrix U so that P(p)A = LU. An algorithm for computing this operation is typically represented by

    [A, p] := LUpiv(A),

where upon completion A has been overwritten by L\U (the strictly lower triangle of A holds L and the upper triangle holds U).

Let us start with revisiting the first derivation of the LU factorization. The first step is to find a first permutation matrix P(π_1) such that the element on the diagonal in the first column is maximal in value. For this, we will introduce the function maxi(x) which, given a vector x, returns the index of the element in x with maximal magnitude (absolute value). The algorithm then proceeds as follows:


• Partition A, L as follows:

$$ A \to \begin{pmatrix} \alpha_{11} & a_{12}^T \\ a_{21} & A_{22} \end{pmatrix} \quad\text{and}\quad L \to \begin{pmatrix} 1 & 0 \\ l_{21} & L_{22} \end{pmatrix}. $$

• Compute π_1 = maxi( ( α_11 ; a_21 ) ).

• Permute the rows: ( α_11  a_12^T ; a_21  A_22 ) := P(π_1) ( α_11  a_12^T ; a_21  A_22 ).

• Compute l_21 := a_21 / α_11.

• Update A_22 := A_22 − l_21 a_12^T.

Now, in general, assume that the computation has proceeded to the point where matrix A has been overwritten by

$$ \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix}, $$

where A_00 is upper triangular. If no pivoting were added, one would compute l_21 := a_21/α_11 followed by the update

$$ \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix} := \begin{pmatrix} I & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -l_{21} & I \end{pmatrix} \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & 0 & A_{22} - l_{21}a_{12}^T \end{pmatrix}. $$

Now, instead, one performs the steps:

• Compute π_1 = maxi( ( α_11 ; a_21 ) ).

• Permute the rows:

$$ \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix} := \begin{pmatrix} I & 0 \\ 0 & P(\pi_1) \end{pmatrix} \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix} $$

• Compute l_21 := a_21 / α_11.


• Update

$$ \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix} := \begin{pmatrix} I & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -l_{21} & I \end{pmatrix} \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & 0 & A_{22} - l_{21}a_{12}^T \end{pmatrix}. $$

This algorithm is summarized in Figure 11.2.

Now, what this algorithm computes is a sequence of Gauss transforms L̃_0, …, L̃_{n−1} and permutations P_0, …, P_{n−1} such that

$$ \tilde L_{n-1} P_{n-1} \cdots \tilde L_0 P_0 A = U $$

or, equivalently,

$$ A = P_0^T L_0 \cdots P_{n-1}^T L_{n-1} U, $$

where L_k = L̃_k^{−1}. What we will finally show is that there are Gauss transforms L̄_0, …, L̄_{n−1} (here the "bar" does NOT mean conjugation; it is just a symbol) such that

$$ A = P_0^T \cdots P_{n-1}^T \underbrace{\bar L_0 \cdots \bar L_{n-1}}_{L} U $$

or, equivalently,

$$ P(p)A = P_{n-1} \cdots P_0 A = \underbrace{\bar L_0 \cdots \bar L_{n-1}}_{L}\, U, $$

which is what we set out to compute.

Here is the insight. Assume that after k steps of LU factorization we have computed p_T, L_TL, L_BL, etc., so that

$$ P(p_T)A = \begin{pmatrix} L_{TL} & 0 \\ L_{BL} & I \end{pmatrix} \begin{pmatrix} A_{TL} & A_{TR} \\ 0 & A_{BR} \end{pmatrix}, $$

where A_TL is upper triangular and k×k.

Now compute the next step of LU factorization with partial pivoting with ( A_TL  A_TR ; 0  A_BR ):

• Partition

$$ \begin{pmatrix} A_{TL} & A_{TR} \\ 0 & A_{BR} \end{pmatrix} \to \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix} $$

• Compute π_1 = maxi( ( α_11 ; a_21 ) ).


Algorithm: Compute LU factorization with partial pivoting of A, overwriting L with factor L and A with factor U. The pivot vector is returned in p.

    Partition A → ( A_TL  A_TR ; A_BL  A_BR ), L → ( L_TL  0 ; L_BL  L_BR ), p → ( p_T ; p_B ),
        where A_TL and L_TL are 0×0 and p_T is 0×1
    while n(A_TL) < n(A) do
        Repartition
            ( A_TL  A_TR ; A_BL  A_BR ) → ( A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 ),
            ( L_TL  0 ; L_BL  L_BR ) → ( L_00  0  0 ; l_10^T  λ_11  0 ; L_20  l_21  L_22 ),
            ( p_T ; p_B ) → ( p_0 ; π_1 ; p_2 ),
            where α_11, λ_11, π_1 are 1×1
        π_1 = maxi( ( α_11 ; a_21 ) )
        ( A_00  a_01  A_02 ; l_10^T  α_11  a_12^T ; L_20  a_21  A_22 )
            := ( I  0 ; 0  P(π_1) ) ( A_00  a_01  A_02 ; l_10^T  α_11  a_12^T ; L_20  a_21  A_22 )
        l_21 := a_21 / α_11
        ( A_00  a_01  A_02 ; 0  α_11  a_12^T ; 0  a_21  A_22 )
            := ( I  0  0 ; 0  1  0 ; 0  −l_21  I ) ( A_00  a_01  A_02 ; 0  α_11  a_12^T ; 0  a_21  A_22 )
            = ( A_00  a_01  A_02 ; 0  α_11  a_12^T ; 0  0  A_22 − l_21 a_12^T )
        Continue with
            ( A_TL  A_TR ; A_BL  A_BR ) ← ( A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 ),
            ( L_TL  0 ; L_BL  L_BR ) ← ( L_00  0  0 ; l_10^T  λ_11  0 ; L_20  l_21  L_22 ),
            ( p_T ; p_B ) ← ( p_0 ; π_1 ; p_2 )
    endwhile

Figure 11.3: LU factorization with partial pivoting.


Algorithm: Compute LU factorization with partial pivoting of A, overwriting A with factors L and U. The pivot vector is returned in p.

    Partition A → ( A_TL  A_TR ; A_BL  A_BR ), p → ( p_T ; p_B ),
        where A_TL is 0×0 and p_T is 0×1
    while n(A_TL) < n(A) do
        Repartition
            ( A_TL  A_TR ; A_BL  A_BR ) → ( A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 ),
            ( p_T ; p_B ) → ( p_0 ; π_1 ; p_2 ),
            where α_11 and π_1 are 1×1
        π_1 = maxi( ( α_11 ; a_21 ) )
        ( a_10^T  α_11  a_12^T ; A_20  a_21  A_22 ) := P(π_1) ( a_10^T  α_11  a_12^T ; A_20  a_21  A_22 )
        a_21 := a_21 / α_11
        A_22 := A_22 − a_21 a_12^T
        Continue with
            ( A_TL  A_TR ; A_BL  A_BR ) ← ( A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 ),
            ( p_T ; p_B ) ← ( p_0 ; π_1 ; p_2 )
    endwhile

Figure 11.4: LU factorization with partial pivoting, overwriting A with the factors.

• Permute the rows:

$$ \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix} := \begin{pmatrix} I & 0 \\ 0 & P(\pi_1) \end{pmatrix} \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix} $$

• Compute l_21 := a_21 / α_11.


• Update

$$ \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{pmatrix} := \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & 0 & A_{22} - l_{21}a_{12}^T \end{pmatrix} $$

After this,

$$ P(p_T)A = \begin{pmatrix} L_{TL} & 0 \\ L_{BL} & I \end{pmatrix} \begin{pmatrix} I & 0 \\ 0 & P(\pi_1) \end{pmatrix} \begin{pmatrix} I & 0 & 0 \\ 0 & 1 & 0 \\ 0 & l_{21} & I \end{pmatrix} \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & 0 & A_{22} - l_{21}a_{12}^T \end{pmatrix}. \qquad (11.2) $$

But

$$ \begin{pmatrix} L_{TL} & 0 \\ L_{BL} & I \end{pmatrix} \begin{pmatrix} I & 0 \\ 0 & P(\pi_1) \end{pmatrix} \begin{pmatrix} I & 0 & 0 \\ 0 & 1 & 0 \\ 0 & l_{21} & I \end{pmatrix} = \begin{pmatrix} I & 0 \\ 0 & P(\pi_1) \end{pmatrix} \begin{pmatrix} L_{TL} & 0 \\ P(\pi_1)L_{BL} & I \end{pmatrix} \begin{pmatrix} I & 0 & 0 \\ 0 & 1 & 0 \\ 0 & l_{21} & I \end{pmatrix} = \begin{pmatrix} I & 0 \\ 0 & P(\pi_1) \end{pmatrix} \begin{pmatrix} L_{TL} & 0 \\ \bar L_{BL} & I \end{pmatrix} \begin{pmatrix} I & 0 & 0 \\ 0 & 1 & 0 \\ 0 & l_{21} & I \end{pmatrix}, $$

where L̄_BL = P(π_1)L_BL (note the bar on this L). Here we use the fact that P(π_1) = P(π_1)^T because of its very special structure.

Bringing the permutation to the left of (11.2) and "repartitioning" we get

$$ \underbrace{\begin{pmatrix} I & 0 \\ 0 & P(\pi_1) \end{pmatrix} P(p_0)}_{P\left(\begin{smallmatrix} p_0 \\ \pi_1 \end{smallmatrix}\right)} A = \underbrace{\begin{pmatrix} L_{00} & 0 & 0 \\ \bar l_{10}^T & 1 & 0 \\ \bar L_{20} & 0 & I \end{pmatrix} \begin{pmatrix} I & 0 & 0 \\ 0 & 1 & 0 \\ 0 & l_{21} & I \end{pmatrix}}_{\begin{pmatrix} L_{00} & 0 & 0 \\ \bar l_{10}^T & 1 & 0 \\ \bar L_{20} & l_{21} & I \end{pmatrix}} \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ 0 & \alpha_{11} & a_{12}^T \\ 0 & 0 & A_{22} - l_{21}a_{12}^T \end{pmatrix}. $$

This explains how the algorithm in Figure 11.3 computes p, L, and U (overwriting A with U) so that P(p)A = LU.

Finally, we recognize that L can overwrite the entries of A below its diagonal, yielding the algorithm in Figure 11.4.
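A sketch of the algorithm in Figure 11.4 in Python/NumPy (pivot offsets as in (11.1)):

    import numpy as np

    def lupiv(A):
        A = A.copy()                                   # will hold L\U
        n = A.shape[0]
        p = np.zeros(n, dtype=int)
        for j in range(n):
            p[j] = np.argmax(np.abs(A[j:, j]))         # pi_1 := maxi(alpha11; a21)
            A[[j, j + p[j]], :] = A[[j + p[j], j], :]  # swap the entire rows
            A[j+1:, j] /= A[j, j]                      # a21 := a21 / alpha11
            A[j+1:, j+1:] -= np.outer(A[j+1:, j], A[j, j+1:])   # A22 -= l21 a12^T
        return A, p

    A = np.array([[2., 1., 1.], [4., 3., 3.], [8., 7., 9.]])
    LU, p = lupiv(A)
    L, U = np.tril(LU, -1) + np.eye(3), np.triu(LU)
    PA = A.copy()
    for i, pi in enumerate(p):                         # form P(p) A
        PA[[i, i + pi], :] = PA[[i + pi, i], :]
    print(np.allclose(L @ U, PA))                      # True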


11.4 Proof of Theorem 11.3

Proof: (⇒) Let the nonsingular matrix A have a (unique) LU factorization. We will show that its principal leading submatrices are nonsingular. Let

  A = [ ATL ATR ; ABL ABR ] = [ LTL 0 ; LBL LBR ] [ UTL UTR ; 0 UBR ] = LU

be the LU factorization of A, where ATL, LTL, and UTL are k x k. Notice that U cannot have a zero on its diagonal, since then A would not have linearly independent columns. Now, the k x k principal leading submatrix ATL equals ATL = LTL UTL, which is nonsingular since LTL has a unit diagonal and UTL has no zeroes on its diagonal. Since k was chosen arbitrarily, this means that all principal leading submatrices are nonsingular.

(⇐) We will do a proof by induction on n.

Base Case: n = 1. Then A has the form A = [ α11 ; a21 ] where α11 is a scalar. Since the principal leading submatrices are nonsingular, α11 ≠ 0. Hence

  A = [ 1 ; a21/α11 ] α11,
      (this is L)     (this is U)

is the LU factorization of A. This LU factorization is unique because the first element of L must be 1.

Inductive Step: Assume the result is true for all matrices with n = k. Show that it is true for matrices with n = k+1.

Let A of size n = k+1 have nonsingular principal leading submatrices. Now, if an LU factorization of A exists, A = LU, then it would have to have the form

  [ A00 a01 ; a10^T α11 ; A20 a21 ] = [ L00 0 ; l10^T 1 ; L20 l21 ] [ U00 u01 ; 0 υ11 ].   (11.3)

If we can show that the different parts of L and U exist and are unique, we are done. Equation (11.3) can be rewritten as

  [ A00 ; a10^T ; A20 ] = [ L00 ; l10^T ; L20 ] U00   and   [ a01 ; α11 ; a21 ] = [ L00 u01 ; l10^T u01 + υ11 ; L20 u01 + l21 υ11 ].


Now, by the Induction Hypothesis L00, l10^T, L20, and U00 exist and are unique. So the question is whether u01, υ11, and l21 exist and are unique:

• u01 exists and is unique. Since L00 is nonsingular (it has ones on its diagonal), L00 u01 = a01 has a unique solution.

• υ11 exists, is unique, and is nonzero. Since l10^T and u01 exist and are unique, υ11 = α11 - l10^T u01 exists and is unique. It is also nonzero, since the principal leading submatrix of A given by

  [ A00 a01 ; a10^T α11 ] = [ L00 0 ; l10^T 1 ] [ U00 u01 ; 0 υ11 ]

is nonsingular by assumption, and therefore υ11 must be nonzero.

• l21 exists and is unique. Since υ11 exists and is nonzero, l21 = a21/υ11 exists and is uniquely determined.

Thus the m x (k+1) matrix A has a unique LU factorization.

By the Principle of Mathematical Induction the result holds.

Homework 11.12 Implement LU factorization with partial pivoting with the FLAME@lab API, in M-script.

* SEE ANSWER

11.5 LU with Complete Pivoting

LU factorization with partial pivoting builds on the insight that pivoting (rearranging) rows in a linear system does not change the solution: if Ax = b then P(p)Ax = P(p)b, where p is a pivot vector. Now, if r is another pivot vector, then notice that P(r)^T P(r) = I (a simple property of pivot matrices) and that AP(r)^T permutes the columns of A in exactly the same order as P(r)A permutes the rows of A.

What this means is that if Ax = b then P(p)AP(r)^T [P(r)x] = P(p)b. This supports the idea that one might want to permute not only rows of A, as in partial pivoting, but also columns of A. This is done in a variation on LU factorization that is known as LU factorization with complete pivoting.

The idea is as follows: Given matrix A, partition

  A = [ α11 a12^T ; a21 A22 ].

Now, instead of finding the largest element in magnitude in the first column, find the largest element in magnitude in the entire matrix. Let's say it is element (π0, ρ0). Then one permutes

  [ α11 a12^T ; a21 A22 ] := P(π0) [ α11 a12^T ; a21 A22 ] P(ρ0)^T,

making α11 the largest element in magnitude. This reduces the magnitude of the multipliers and element growth.
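A minimal M-script fragment (ours; the variable names pi0 and rho0 are our own) that locates such an element:

  % Locate the entry of maximal magnitude in A:
  [ colmax, rowidx ] = max( abs( A ) );  % column-wise maxima and their row indices
  [ ~, rho0 ] = max( colmax );           % column index of the overall maximum
  pi0 = rowidx( rho0 );                  % row index of the overall maximum
  % Row pi0 and column rho0 would then be swapped with row 1 and column 1.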


It can be shown that employing complete pivoting indeed reduces the worst-case element growth. The problem is that it requires O(n²) comparisons per iteration. Worse, it completely destroys the ability to utilize blocked algorithms, which attain much greater performance.

In practice LU with complete pivoting is not used.

11.6 Solving Ax = y Via the LU Factorization with Pivoting

Given a nonsingular matrix A ∈ C^{m x m}, the above discussions have yielded an algorithm for computing a permutation matrix P, a unit lower triangular matrix L, and an upper triangular matrix U such that PA = LU. We now discuss how these can be used to solve the system of linear equations Ax = y.

Starting with

  Ax = y

we multiply both sides of the equation by the permutation matrix P,

  PAx = Py   (call the permuted right-hand side y~ = Py),

and substitute LU for PA:

  L (Ux) = y~, with z = Ux.

We now notice that we can solve the lower triangular system

  Lz = y~

after which x can be computed by solving the upper triangular system

  Ux = z.
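Assuming the factorization sketch given after Figure 11.4 (LU_piv_unb, our own naming), the whole solve can be sketched in M-script as:

  function x = solve_via_LU_piv( A, y )
  % Solve A x = y (A square) via P A = L U with partial pivoting.
    m = size( A, 1 );
    [ A, p ] = LU_piv_unb( A );              % sketched earlier
    for j = 1:m                              % apply the same row swaps to y
      k = p( j );
      y( [ j, j+k-1 ] ) = y( [ j+k-1, j ] );
    end
    L = tril( A, -1 ) + eye( m );            % unit lower triangular factor
    U = triu( A );                           % upper triangular factor
    z = L \ y;                               % solve L z = y~
    x = U \ z;                               % solve U x = z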

11.7 Solving Triangular Systems of Equations

11.7.1 Lz = y

First, we discuss solving Lz = y where L is a unit lower triangular matrix.

Variant 1

Consider Lz = y where L is unit lower triangular. Partition

  L -> [ 1 0 ; l21 L22 ], z -> [ ζ1 ; z2 ], and y -> [ ψ1 ; y2 ].


Algorithm: Solve Lz = y, overwriting y (Variant 1)

Partition L -> [ LTL LTR ; LBL LBR ], y -> [ yT ; yB ]
where LTL is 0 x 0 and yT has 0 rows

while m(LTL) < m(L) do

  Repartition
    [ LTL LTR ; LBL LBR ] -> [ L00 l01 L02 ; l10^T λ11 l12^T ; L20 l21 L22 ],
    [ yT ; yB ] -> [ y0 ; ψ1 ; y2 ]

  y2 := y2 - ψ1 l21

  Continue with
    [ LTL LTR ; LBL LBR ] <- [ L00 l01 L02 ; l10^T λ11 l12^T ; L20 l21 L22 ],
    [ yT ; yB ] <- [ y0 ; ψ1 ; y2 ]

endwhile

Algorithm: Solve Lz = y, overwriting y (Variant 2)

Partition L -> [ LTL LTR ; LBL LBR ], y -> [ yT ; yB ]
where LTL is 0 x 0 and yT has 0 rows

while m(LTL) < m(L) do

  Repartition
    [ LTL LTR ; LBL LBR ] -> [ L00 l01 L02 ; l10^T λ11 l12^T ; L20 l21 L22 ],
    [ yT ; yB ] -> [ y0 ; ψ1 ; y2 ]

  ψ1 := ψ1 - l10^T y0

  Continue with
    [ LTL LTR ; LBL LBR ] <- [ L00 l01 L02 ; l10^T λ11 l12^T ; L20 l21 L22 ],
    [ yT ; yB ] <- [ y0 ; ψ1 ; y2 ]

endwhile

Figure 11.5: Algorithms for the solution of a unit lower triangular system Lz = y that overwrite y with z.


Then

  [ 1 0 ; l21 L22 ] [ ζ1 ; z2 ] = [ ψ1 ; y2 ]
        (L)             (z)          (y)

Multiplying out the left-hand side yields

  [ ζ1 ; ζ1 l21 + L22 z2 ] = [ ψ1 ; y2 ]

and the equalities

  ζ1 = ψ1
  ζ1 l21 + L22 z2 = y2,

which can be rearranged as

  ζ1 = ψ1
  L22 z2 = y2 - ζ1 l21.

These insights justify the algorithm in Figure 11.5 (left), which overwrites y with the solution to Lz = y.

Variant 2

An alternative algorithm can be derived as follows: Partition

  L -> [ L00 0 ; l10^T 1 ], z -> [ z0 ; ζ1 ], and y -> [ y0 ; ψ1 ].

Then

  [ L00 0 ; l10^T 1 ] [ z0 ; ζ1 ] = [ y0 ; ψ1 ]
          (L)             (z)          (y)

Multiplying out the left-hand side yields

  [ L00 z0 ; l10^T z0 + ζ1 ] = [ y0 ; ψ1 ]

and the equalities

  L00 z0 = y0
  l10^T z0 + ζ1 = ψ1.

The idea now is as follows: Assume that the elements of z0 were computed in previous iterations of the algorithm in Figure 11.5 (right), overwriting y0. Then in the current iteration we can compute ζ1 := ψ1 - l10^T z0, overwriting ψ1.


Discussion

Notice that Variant 1 casts the computation in terms of AXPY operations while Variant 2 casts it in terms of DOT products.
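As a sketch (ours; each function would live in its own M-file), the two variants in M-script, overwriting y with the solution z:

  function y = trsv_lower_var1( L, y )   % AXPY-based (Variant 1)
    n = length( y );
    for j = 1:n
      % y2 := y2 - psi1 * l21
      y( j+1:n ) = y( j+1:n ) - y( j ) * L( j+1:n, j );
    end

  function y = trsv_lower_var2( L, y )   % DOT-based (Variant 2)
    n = length( y );
    for j = 1:n
      % psi1 := psi1 - l10^T * y0
      y( j ) = y( j ) - L( j, 1:j-1 ) * y( 1:j-1 );
    end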

11.7.2 Ux = z

Next, we discuss solving Ux = y where U is an upper triangular matrix (with no assumptions about its diagonal entries).

Homework 11.13 Derive an algorithm for solving Ux = y, overwriting y with the solution, that casts most computation in terms of DOT products. Hint: Partition

  U -> [ υ11 u12^T ; 0 U22 ].

Call this Variant 1 and use Figure 11.6 to state the algorithm.
* SEE ANSWER

Homework 11.14 Derive an algorithm for solving Ux = y, overwriting y with the solution, that casts most computation in terms of AXPY operations. Call this Variant 2 and use Figure 11.6 to state the algorithm.

* SEE ANSWER

11.8 Other LU Factorization Algorithms

There are actually five different (unblocked) algorithms for computing the LU factorization; they were discovered over the course of centuries [1]. The LU factorization in Figure 11.1 is sometimes called the classical LU factorization or the right-looking algorithm. We now briefly describe how to derive the other algorithms.

Finding the algorithms starts with the following observations.

• Our algorithms will overwrite the matrix A, and hence we introduce Â to denote the original contents of A. We will say that the precondition for the algorithm is that

  A = Â

(A starts by containing the original contents of A.)

• We wish to overwrite A with L and U. Thus, the postcondition for the algorithm (the state in which we wish to exit the algorithm) is that

  A = L\U ∧ LU = Â

(A is overwritten by L below the diagonal and U on and above the diagonal, where multiplying L and U yields the original matrix Â.)

[1] For a thorough discussion of the different LU factorization algorithms that also gives a historical perspective, we recommend "Matrix Algorithms Volume 1" by G.W. Stewart [34].


Algorithm: Solve Ux = y, overwriting y (Variant 1)

Partition U -> [ UTL UTR ; UBL UBR ], y -> [ yT ; yB ]
where UBR is 0 x 0 and yB has 0 rows

while m(UBR) < m(U) do

  Repartition
    [ UTL UTR ; UBL UBR ] -> [ U00 u01 U02 ; u10^T υ11 u12^T ; U20 u21 U22 ],
    [ yT ; yB ] -> [ y0 ; ψ1 ; y2 ]

  [ updates to be filled in as part of Homework 11.13 ]

  Continue with
    [ UTL UTR ; UBL UBR ] <- [ U00 u01 U02 ; u10^T υ11 u12^T ; U20 u21 U22 ],
    [ yT ; yB ] <- [ y0 ; ψ1 ; y2 ]

endwhile

Algorithm: Solve Ux = y, overwriting y (Variant 2)

Partition U -> [ UTL UTR ; UBL UBR ], y -> [ yT ; yB ]
where UBR is 0 x 0 and yB has 0 rows

while m(UBR) < m(U) do

  Repartition
    [ UTL UTR ; UBL UBR ] -> [ U00 u01 U02 ; u10^T υ11 u12^T ; U20 u21 U22 ],
    [ yT ; yB ] -> [ y0 ; ψ1 ; y2 ]

  [ updates to be filled in as part of Homework 11.14 ]

  Continue with
    [ UTL UTR ; UBL UBR ] <- [ U00 u01 U02 ; u10^T υ11 u12^T ; U20 u21 U22 ],
    [ yT ; yB ] <- [ y0 ; ψ1 ; y2 ]

endwhile

Figure 11.6: Algorithms for the solution of an upper triangular system Ux = y that overwrite y with x.


• All the algorithms will march through the matrices from the top-left to the bottom-right. Thus, at a representative point in the algorithm, the matrices are viewed as quadrants:

  A -> [ ATL ATR ; ABL ABR ], L -> [ LTL 0 ; LBL LBR ], and U -> [ UTL UTR ; 0 UBR ],

where ATL, LTL, and UTL are all square and equally sized.

• In terms of these exposed quadrants, in the end we wish for matrix A to contain

  [ ATL ATR ; ABL ABR ] = [ L\UTL UTR ; LBL L\UBR ]

where

  [ LTL 0 ; LBL LBR ] [ UTL UTR ; 0 UBR ] = [ ÂTL ÂTR ; ÂBL ÂBR ].

• Manipulating this yields what we call the Partitioned Matrix Expression (PME), which can be viewed as a recursive definition of the LU factorization:

  [ ATL ATR ; ABL ABR ] = [ L\UTL UTR ; LBL L\UBR ]
  LTL UTL = ÂTL        LTL UTR = ÂTR
  LBL UTL = ÂBL        LBL UTR + LBR UBR = ÂBR

Now, consider the code skeleton for the LU factorization in Figure 11.7. At the top of the loop (right after the while), we want to maintain certain contents in matrix A. Since we are in a loop, we haven't yet overwritten A with the final result. Instead, some progress toward this final result has been made. The way we can find the state of A that we would like to maintain is to take the PME and delete subexpressions. For example, consider the following condition on the contents of A:

  [ ATL ATR ; ABL ABR ] = [ L\UTL UTR ; LBL ÂBR - LBL UTR ]
  LTL UTL = ÂTL        LTL UTR = ÂTR
  LBL UTL = ÂBL        (the constraint LBL UTR + LBR UBR = ÂBR is struck out)

What we are saying is that ATL, ATR, and ABL have been completely updated with the corresponding parts of L and U, and ABR has been partially updated. This is exactly the state that the algorithm that we discussed previously in this document maintains! What is left is to factor ABR, since it contains ÂBR - LBL UTR, and ÂBR - LBL UTR = LBR UBR.

• By carefully analyzing the order in which computation must occur (in compiler lingo: by performing a dependence analysis), we can identify five states that can be maintained at the top of the loop by deleting subexpressions from the PME. These are called loop invariants and are listed in Figure 11.8.

• Key to figuring out what updates must occur in the loop for each of the variants is to look at how the matrices are repartitioned at the top and bottom of the loop body.


Algorithm: A := LU(A)

Partition A -> [ ATL ATR ; ABL ABR ], L -> [ LTL LTR ; LBL LBR ], U -> [ UTL UTR ; UBL UBR ]
where ATL, LTL, and UTL are 0 x 0

while m(ATL) < m(A) do

  Repartition
    [ ATL ATR ; ABL ABR ] -> [ A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 ],
    [ LTL LTR ; LBL LBR ] -> [ L00 l01 L02 ; l10^T λ11 l12^T ; L20 l21 L22 ],
    [ UTL UTR ; UBL UBR ] -> [ U00 u01 U02 ; u10^T υ11 u12^T ; U20 u21 U22 ]
  where α11, λ11, and υ11 are 1 x 1

  Continue with
    [ ATL ATR ; ABL ABR ] <- [ A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 ],
    [ LTL LTR ; LBL LBR ] <- [ L00 l01 L02 ; l10^T λ11 l12^T ; L20 l21 L22 ],
    [ UTL UTR ; UBL UBR ] <- [ U00 u01 U02 ; u10^T υ11 u12^T ; U20 u21 U22 ]

endwhile

Figure 11.7: Code skeleton for LU factorization.


Variant | Algorithm     | State (loop invariant)

1 | Bordered      | [ ATL ATR ; ABL ABR ] = [ L\UTL ÂTR ; ÂBL ÂBR ], maintaining LTL UTL = ÂTL
2 | Left-looking  | [ ATL ATR ; ABL ABR ] = [ L\UTL ÂTR ; LBL ÂBR ], maintaining LTL UTL = ÂTL and LBL UTL = ÂBL
3 | Up-looking    | [ ATL ATR ; ABL ABR ] = [ L\UTL UTR ; ÂBL ÂBR ], maintaining LTL UTL = ÂTL and LTL UTR = ÂTR
4 | Crout variant | [ ATL ATR ; ABL ABR ] = [ L\UTL UTR ; LBL ÂBR ], maintaining LTL UTL = ÂTL, LTL UTR = ÂTR, and LBL UTL = ÂBL
5 | Classical LU  | [ ATL ATR ; ABL ABR ] = [ L\UTL UTR ; LBL ÂBR - LBL UTR ], maintaining LTL UTL = ÂTL, LTL UTR = ÂTR, and LBL UTL = ÂBL

(In every variant the PME constraint LBL UTR + LBR UBR = ÂBR is struck out, i.e., not yet enforced; in Variants 1-3 the struck-out constraints listed above are likewise not yet enforced.)

Figure 11.8: Loop invariants for various LU factorization algorithms.


11.8.1 Variant 1: Bordered algorithm

Consider the loop invariant:

  [ ATL ATR ; ABL ABR ] = [ L\UTL ÂTR ; ÂBL ÂBR ], maintaining LTL UTL = ÂTL

(the other three PME constraints are struck out). At the top of the loop, after repartitioning, A contains

  [ L\U00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 ]

while at the bottom it must contain

  [ L\U00 u01 A02 ; l10^T υ11 a12^T ; A20 a21 A22 ]

where u01, l10^T, and υ11 (shown in blue in the typeset original) are to be computed. Now, considering LU = Â we notice that

  L00 U00 = A00        L00 u01 = a01              L00 U02 = A02
  l10^T U00 = a10^T    l10^T u01 + υ11 = α11      l10^T U02 + u12^T = a12^T
  L20 U00 = A20        L20 u01 + υ11 l21 = a21    L20 U02 + l21 u12^T + L22 U22 = A22

where the right-hand sides denote the original contents of A and the parts already computed (shown in red in the original) are known. The equalities involving the exposed parts (highlighted in yellow in the original) can be used to compute the desired parts of L and U:

• Solve L00 u01 = a01 for u01, overwriting a01 with the result.

• Solve l10^T U00 = a10^T (or, equivalently, U00^T (l10^T)^T = (a10^T)^T for l10^T), overwriting a10^T with the result.

• Compute υ11 = α11 - l10^T u01, overwriting α11 with the result.

Homework 11.15 If A is an n x n matrix, show that the cost of Variant 1 is approximately (2/3)n³ flops.

* SEE ANSWER

11.8.2 Variant 2: Left-looking algorithm

Consider the loop invariant:

  [ ATL ATR ; ABL ABR ] = [ L\UTL ÂTR ; LBL ÂBR ], maintaining LTL UTL = ÂTL and LBL UTL = ÂBL

(the other PME constraints are struck out). At the top of the loop, after repartitioning, A contains

  [ L\U00 a01 A02 ; l10^T α11 a12^T ; L20 a21 A22 ]

while at the bottom it must contain

  [ L\U00 u01 A02 ; l10^T υ11 a12^T ; L20 l21 A22 ]

where u01, υ11, and l21 are to be computed. Now, considering LU = Â we again obtain the table of equalities given in Section 11.8.1. The ones needed here can be used to compute the desired parts of L and U:

• Solve L00 u01 = a01 for u01, overwriting a01 with the result.

• Compute υ11 = α11 - l10^T u01, overwriting α11 with the result.

• Compute l21 := (a21 - L20 u01)/υ11, overwriting a21 with the result.

11.8.3 Variant 3: Up-looking variant

Homework 11.16 Derive the up-looking variant for computing the LU factorization.
* SEE ANSWER

11.8.4 Variant 4: Crout variant

Consider the loop invariant:

  [ ATL ATR ; ABL ABR ] = [ L\UTL UTR ; LBL ÂBR ], maintaining LTL UTL = ÂTL, LTL UTR = ÂTR, and LBL UTL = ÂBL

(the last PME constraint is struck out). At the top of the loop, after repartitioning, A contains

  [ L\U00 u01 U02 ; l10^T α11 a12^T ; L20 a21 A22 ]

while at the bottom it must contain

  [ L\U00 u01 U02 ; l10^T υ11 u12^T ; L20 l21 A22 ]

where υ11, l21, and u12^T are to be computed. Now, considering LU = Â we again obtain the table of equalities given in Section 11.8.1. The ones needed here can be used to compute the desired parts of L and U:

• Compute υ11 = α11 - l10^T u01, overwriting α11 with the result.

• Compute l21 := (a21 - L20 u01)/υ11, overwriting a21 with the result.

• Compute u12^T := a12^T - l10^T U02, overwriting a12^T with the result.
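A minimal M-script sketch (ours) of this Crout variant, without pivoting:

  function A = LU_crout_unb( A )
  % Overwrite the n x n matrix A with L (strictly lower part)
  % and U (upper part). No pivoting is performed.
    n = size( A, 1 );
    for j = 1:n
      % upsilon11 := alpha11 - l10^T * u01
      A( j, j ) = A( j, j ) - A( j, 1:j-1 ) * A( 1:j-1, j );
      % l21 := ( a21 - L20 * u01 ) / upsilon11
      A( j+1:n, j ) = ( A( j+1:n, j ) ...
                      - A( j+1:n, 1:j-1 ) * A( 1:j-1, j ) ) / A( j, j );
      % u12^T := a12^T - l10^T * U02
      A( j, j+1:n ) = A( j, j+1:n ) - A( j, 1:j-1 ) * A( 1:j-1, j+1:n );
    end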

11.8.5 Variant 5: Classical LU factorization

We have already derived this algorithm. You may want to try rederiving it using the techniques discussed in this section.

11.8.6 All algorithms

All five algorithms for LU factorization are summarized in Figure 11.9.

Homework 11.17 Implement all five LU factorization algorithms with the FLAME@lab API, in M-script.

* SEE ANSWER

Homework 11.18 Which of the five variants can be modified to incorporate partial pivoting?
* SEE ANSWER

11.8.7 Formal derivation of algorithms

The described approach to deriving algorithms, linking the process to the a priori identification of loop invariants, was first proposed in [24]. It was refined into what we call the "worksheet" for deriving algorithms hand-in-hand with their proofs of correctness in [4]. A book that describes the process at a level also appropriate for the novice is "The Science of Programming Matrix Computations" [38].


Algorithm: A := L\U = LU(A)

Partition A -> [ ATL ATR ; ABL ABR ]
where ATL is 0 x 0

while n(ATL) < n(A) do

  Repartition
    [ ATL ATR ; ABL ABR ] -> [ A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 ]
  where α11 is 1 x 1

  (A00 contains L00 and U00 in its strictly lower and upper triangular part, respectively.)

  Variant 1 (Bordered):      a01 := L00^{-1} a01;  a10^T := a10^T U00^{-1};  α11 := α11 - a10^T a01
  Variant 2 (Left-looking):  a01 := L00^{-1} a01;  α11 := α11 - a10^T a01;  a21 := (a21 - A20 a01)/α11
  Variant 3 (Up-looking):    exercise in Homework 11.16
  Variant 4 (Crout variant): α11 := α11 - a10^T a01;  a21 := (a21 - A20 a01)/α11;  a12^T := a12^T - a10^T A02
  Variant 5 (Classical LU):  a21 := a21/α11;  A22 := A22 - a21 a12^T

  Continue with
    [ ATL ATR ; ABL ABR ] <- [ A00 a01 A02 ; a10^T α11 a12^T ; A20 a21 A22 ]

endwhile

Figure 11.9: All five LU factorization algorithms.


11.9 Numerical Stability Results

The numerical stability of the various LU factorization algorithms, as well as of the triangular solve algorithms, can be found in standard graduate level numerical linear algebra texts and references [21, 25, 34]. Of particular interest may be the analysis of the Crout variant (Variant 4) in [8], since it uses our notation as well as the results in "Notes on Numerical Stability". (We recommend the technical report version [6] of the paper, since it has more details as well as exercises to help the reader understand.) In that paper, a systematic approach towards the derivation of backward error results is given that mirrors the systematic approach to deriving the algorithms given in [?, 4, 38].

Here are pertinent results from that paper, assuming floating point arithmetic obeys the model of computation given in "Notes on Numerical Stability" (as well as [8, 6, 25]). It is assumed that the reader is familiar with those notes.

Theorem 11.19 Let A ∈ R^{n x n} and let the LU factorization of A be computed via the Crout variant (Variant 4), yielding approximate factors L and U. Then

  (A + ∆A) = LU with |∆A| ≤ γn |L| |U|.

Theorem 11.20 Let L ∈ R^{n x n} be lower triangular and y, z ∈ R^n with Lz = y. Let z be the approximate solution that is computed. Then

  (L + ∆L) z = y with |∆L| ≤ γn |L|.

Theorem 11.21 Let U ∈ R^{n x n} be upper triangular and x, z ∈ R^n with Ux = z. Let x be the approximate solution that is computed. Then

  (U + ∆U) x = z with |∆U| ≤ γn |U|.

Theorem 11.22 Let A ∈ R^{n x n} and x, y ∈ R^n with Ax = y. Let x be the approximate solution computed via the following steps:

• Compute the LU factorization, yielding approximate factors L and U.
• Solve Lz = y, yielding approximate solution z.
• Solve Ux = z, yielding approximate solution x.

Then (A + ∆A) x = y with |∆A| ≤ (3γn + γn²) |L| |U|.

The analysis of LU factorization without partial pivoting is related to that of LU factorization with partial pivoting. We have shown that LU with partial pivoting is equivalent to LU factorization without partial pivoting on a pre-permuted matrix: PA = LU, where P is a permutation matrix. The permutation doesn't involve any floating point operations and therefore does not generate error. It can therefore be argued that, as a result, the error that is accumulated is equivalent with or without partial pivoting.


11.10 Is LU with Partial Pivoting Stable?

Homework 11.23 Apply LU with partial pivoting to

  A = [  1   0   0  ...   0   1
        -1   1   0  ...   0   1
        -1  -1   1  ...   0   1
         :   :   :   .    :   :
        -1  -1  ...       1   1
        -1  -1  ...      -1   1 ].

Pivot only when necessary.
* SEE ANSWER

From this exercise we conclude that even LU factorization with partial pivoting can yield large (exponential) element growth in U. You may enjoy the collection of problems for which Gaussian elimination with partial pivoting is unstable by Stephen Wright [46].

In practice, such growth does not seem to happen, and LU factorization with partial pivoting is considered to be stable.

11.11 Blocked Algorithms

It is well-known that matrix-matrix multiplication can achieve high performance on most computer architectures [2, 22, 19]. As a result, many dense matrix algorithms are reformulated to be rich in matrix-matrix multiplication operations. An interface to a library of such operations is known as the level-3 Basic Linear Algebra Subprograms (BLAS) [17]. In this section, we show how LU factorization can be rearranged so that most computation is in matrix-matrix multiplications.

11.11.1 Blocked classical LU factorization (Variant 5)

Partition A, L, and U as follows:

  A -> [ A11 A12 ; A21 A22 ], L -> [ L11 0 ; L21 L22 ], and U -> [ U11 U12 ; 0 U22 ],

where A11, L11, and U11 are b x b. Then A = LU means that

  [ A11 A12 ; A21 A22 ] = [ L11 0 ; L21 L22 ] [ U11 U12 ; 0 U22 ] = [ L11 U11  L11 U12 ; L21 U11  L21 U12 + L22 U22 ].

This means that

  A11 = L11 U11    A12 = L11 U12
  A21 = L21 U11    A22 = L21 U12 + L22 U22


Algorithm: A := LU(A) (blocked)

Partition A -> [ ATL ATR ; ABL ABR ]
where ATL is 0 x 0

while m(ATL) < m(A) do

  Determine block size b
  Repartition
    [ ATL ATR ; ABL ABR ] -> [ A00 A01 A02 ; A10 A11 A12 ; A20 A21 A22 ]
  where A11 is b x b

  (A00 contains L00 and U00 in its strictly lower and upper triangular part, respectively.)

  Variant 1: A01 := L00^{-1} A01;  A10 := A10 U00^{-1};  A11 := A11 - A10 A01;  A11 := LU(A11)
  Variant 2: A01 := L00^{-1} A01;  A11 := A11 - A10 A01;  A11 := LU(A11);  A21 := (A21 - A20 A01) U11^{-1}
  Variant 3: A10 := A10 U00^{-1};  A11 := A11 - A10 A01;  A11 := LU(A11);  A12 := L11^{-1}(A12 - A10 A02)
  Variant 4: A11 := A11 - A10 A01;  A11 := LU(A11);  A21 := (A21 - A20 A01) U11^{-1};  A12 := L11^{-1}(A12 - A10 A02)
  Variant 5: A11 := LU(A11);  A21 := A21 U11^{-1};  A12 := L11^{-1} A12;  A22 := A22 - A21 A12

  Continue with
    [ ATL ATR ; ABL ABR ] <- [ A00 A01 A02 ; A10 A11 A12 ; A20 A21 A22 ]

endwhile

Figure 11.10: Blocked algorithms for computing the LU factorization.


Algorithm: [A, p] := LUPIV_BLK(A, p)

Partition A -> [ ATL ATR ; ABL ABR ], p -> [ pT ; pB ]
where ATL is 0 x 0 and pT has 0 elements

while n(ATL) < n(A) do

  Determine block size b
  Repartition
    [ ATL ATR ; ABL ABR ] -> [ A00 A01 A02 ; A10 A11 A12 ; A20 A21 A22 ],
    [ pT ; pB ] -> [ p0 ; p1 ; p2 ]
  where A11 is b x b and p1 has b elements

  Variant 2:
    A01 := L00^{-1} A01
    A11 := A11 - A10 A01
    A21 := A21 - A20 A01
    [ [ A11 ; A21 ], p1 ] := LUpiv( [ A11 ; A21 ], p1 )
    [ A10 A12 ; A20 A22 ] := P(p1) [ A10 A12 ; A20 A22 ]

  Variant 4:
    A11 := A11 - A10 A01
    A21 := A21 - A20 A01
    [ [ A11 ; A21 ], p1 ] := LUpiv( [ A11 ; A21 ], p1 )
    [ A10 A12 ; A20 A22 ] := P(p1) [ A10 A12 ; A20 A22 ]
    A12 := A12 - A10 A02
    A12 := L11^{-1} A12

  Variant 5:
    [ [ A11 ; A21 ], p1 ] := LUpiv( [ A11 ; A21 ], p1 )
    [ A10 A12 ; A20 A22 ] := P(p1) [ A10 A12 ; A20 A22 ]
    A12 := L11^{-1} A12
    A22 := A22 - A21 A12

  Continue with
    [ ATL ATR ; ABL ABR ] <- [ A00 A01 A02 ; A10 A11 A12 ; A20 A21 A22 ],
    [ pT ; pB ] <- [ p0 ; p1 ; p2 ]

endwhile

Figure 11.11: Blocked algorithms for computing the LU factorization with partial pivoting.


or, equivalently,

  A11 = L11 U11    A12 = L11 U12
  A21 = L21 U11    A22 - L21 U12 = L22 U22.

If we let L and U overwrite the original matrix A, this suggests the algorithm

• Compute the LU factorization A11 = L11 U11, overwriting A11 with L\U11. Notice that any of the "unblocked" algorithms previously discussed in this note can be used for this factorization.

• Solve L11 U12 = A12, overwriting A12 with U12. (This can also be expressed as A12 := L11^{-1} A12.)

• Solve L21 U11 = A21, overwriting A21 with L21. (This can also be expressed as A21 := A21 U11^{-1}.)

• Update A22 := A22 - A21 A12.

• Continue by overwriting the updated A22 with its LU factorization.

If b is small relative to n, then most computation is in the last step, which is a matrix-matrix multiplication. Similarly, blocked algorithms for the other variants can be derived. All are given in Figure 11.10.
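A minimal M-script sketch (ours) of this blocked right-looking algorithm; LU_unb is a placeholder name for any of the unblocked factorizations discussed earlier:

  function A = LU_blk_var5( A, b )
  % Blocked classical (right-looking) LU factorization without pivoting.
    n = size( A, 1 );
    for j = 1:b:n
      jb   = min( b, n-j+1 );                        % current block size
      cur  = j:j+jb-1;
      rest = j+jb:n;
      A( cur, cur ) = LU_unb( A( cur, cur ) );       % A11 := { L\U }11
      L11 = tril( A( cur, cur ), -1 ) + eye( jb );
      U11 = triu( A( cur, cur ) );
      A( rest, cur ) = A( rest, cur ) / U11;         % A21 := A21 * inv( U11 )
      A( cur, rest ) = L11 \ A( cur, rest );         % A12 := inv( L11 ) * A12
      A( rest, rest ) = A( rest, rest ) ...
                      - A( rest, cur ) * A( cur, rest );  % A22 := A22 - A21 * A12
    end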

11.11.2 Blocked classical LU factorization with pivoting (Variant 5)

Pivoting can be added to some of the blocked algorithms. Let us focus once again on Variant 5. Partition A, L, and U as follows:

  A -> [ A00 A01 A02 ; A10 A11 A12 ; A20 A21 A22 ], L -> [ L00 0 0 ; L10 L11 0 ; L20 L21 L22 ], and U -> [ U00 U01 U02 ; 0 U11 U12 ; 0 0 U22 ],

where A00, L00, and U00 are k x k, and A11, L11, and U11 are b x b. Assume that the computation has proceeded to the point where A contains

  [ A00 A01 A02 ; A10 A11 A12 ; A20 A21 A22 ] = [ L\U00 U01 U02 ; L10  Â11 - L10 U01  Â12 - L10 U02 ; L20  Â21 - L20 U01  Â22 - L20 U02 ],

where, as before, Â denotes the original contents of A and

  P(p0) [ Â00 Â01 Â02 ; Â10 Â11 Â12 ; Â20 Â21 Â22 ] = [ L00 0 0 ; L10 I 0 ; L20 0 I ] [ U00 U01 U02 ; 0 A11 A12 ; 0 A21 A22 ].

In the current blocked step, we now perform the following computations:


• Compute the LU factorization with pivoting of the "current panel" [ A11 ; A21 ]:

    P(p1) [ A11 ; A21 ] = [ L11 ; L21 ] U11,

  overwriting A11 with L\U11 and A21 with L21.

• Correspondingly, swap the rows in the remainder of the matrix:

    [ A10 A12 ; A20 A22 ] := P(p1) [ A10 A12 ; A20 A22 ].

• Solve L11 U12 = A12, overwriting A12 with U12. (This can also be more concisely written as A12 := L11^{-1} A12.)

• Update A22 := A22 - A21 A12.

Careful consideration shows that this puts the matrix A in the state

  [ A00 A01 A02 ; A10 A11 A12 ; A20 A21 A22 ] = [ L\U00 U01 U02 ; L10 L\U11 U12 ; L20 L21  Â22 - L20 U02 - L21 U12 ],

where

  P( [ p0 ; p1 ] ) [ Â00 Â01 Â02 ; Â10 Â11 Â12 ; Â20 Â21 Â22 ] = [ L00 0 0 ; L10 L11 0 ; L20 L21 I ] [ U00 U01 U02 ; 0 U11 U12 ; 0 0 A22 ].

Similarly, blocked algorithms with pivoting for some of the other variants can be derived. All are given in Figure 11.11.

11.12 Variations on a Triple-Nested Loop

All LU factorization algorithms presented in this note perform exactly the same floating point operations (with some rearrangement of data thrown in for the algorithms that perform pivoting) as does the triple-nested loop that implements Gaussian elimination:


for j = 0, ..., n-1                      (zero the elements below the (j, j) element)
  for i = j+1, ..., n-1
    αi,j := αi,j/αj,j                    (compute multiplier λi,j, overwriting αi,j)
    for k = j+1, ..., n-1                (subtract λi,j times the jth row from the ith row)
      αi,k := αi,k - αi,j αj,k
    endfor
  endfor
endfor
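In M-script (ours), shifted to 1-based indexing, the same triple-nested loop reads:

  n = size( A, 1 );
  for j = 1:n                      % zero the elements below the (j,j) element
    for i = j+1:n
      A( i, j ) = A( i, j ) / A( j, j );     % multiplier, overwriting A(i,j)
      for k = j+1:n                % subtract it times row j from row i
        A( i, k ) = A( i, k ) - A( i, j ) * A( j, k );
      end
    end
  end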


Chapter 12
Notes on Cholesky Factorization

Video

Read disclaimer regarding the videos in the preface!

* YouTube
* Download from UT Box
* View After Local Download

(For help on viewing, see Appendix A.)


Outline

Video
Outline
12.1. Definition and Existence
12.2. Application
12.3. An Algorithm
12.4. Proof of the Cholesky Factorization Theorem
12.5. Blocked Algorithm
12.6. Alternative Representation
12.7. Cost
12.8. Solving the Linear Least-Squares Problem via the Cholesky Factorization
12.9. Other Cholesky Factorization Algorithms
12.10. Implementing the Cholesky Factorization with the (Traditional) BLAS
  12.10.1. What are the BLAS?
  12.10.2. A simple implementation in Fortran
  12.10.3. Implementation with calls to level-1 BLAS
  12.10.4. Matrix-vector operations (level-2 BLAS)
  12.10.5. Matrix-matrix operations (level-3 BLAS)
  12.10.6. Impact on performance
12.11. Alternatives to the BLAS
  12.11.1. The FLAME/C API
  12.11.2. BLIS


12.1 Definition and Existence

This operation is only defined for Hermitian positive definite matrices:

Definition 12.1 A matrix A ∈ C^{m x m} is Hermitian positive definite (HPD) if and only if it is Hermitian (A^H = A) and for all nonzero vectors x ∈ C^m it is the case that x^H A x > 0. If in addition A ∈ R^{m x m} then A is said to be symmetric positive definite (SPD).

(If you feel uncomfortable with complex arithmetic, just replace the word "Hermitian" with "symmetric" in this document and the Hermitian transpose operation, ^H, with the transpose operation, ^T.)

Example 12.2 Consider the case where m = 1 so that A is a real scalar, α. Notice that then A is SPD if and only if α > 0. This is because then for all nonzero χ ∈ R it is the case that αχ² > 0.

First some exercises:

Homework 12.3 Let B ∈ C^{m x n} have linearly independent columns. Prove that A = B^H B is HPD.
* SEE ANSWER

Homework 12.4 Let A ∈ C^{m x m} be HPD. Show that its diagonal elements are real and positive.
* SEE ANSWER

We will prove the following theorem in Section 12.4:

Theorem 12.5 (Cholesky Factorization Theorem) Given an HPD matrix A there exists a lower triangular matrix L such that A = LL^H.

Obviously, there similarly exists an upper triangular matrix U such that A = U^H U, since we can choose U^H = L.

The lower triangular matrix L is known as the Cholesky factor and LL^H is known as the Cholesky factorization of A. It is unique if the diagonal elements of L are restricted to be positive.

The operation that overwrites the lower triangular part of matrix A with its Cholesky factor will be denoted by A := Chol(A), which should be read as "A becomes its Cholesky factor." Typically, only the lower (or upper) triangular part of A is stored, and it is that part that is then overwritten with the result. In this discussion, we will assume that the lower triangular part of A is stored and overwritten.

12.2 Application

The Cholesky factorization is used to solve the linear system Ax = y when A is HPD: Substituting the factors into the equation yields LL^H x = y. Letting z = L^H x,

  Ax = L (L^H x) = Lz = y.

Thus, z can be computed by solving the triangular system of equations Lz = y, and subsequently the desired solution x can be computed by solving the triangular linear system L^H x = z.
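In M-script, given the Cholesky factor L, this is a two-line sketch (ours):

  z = L  \ y;    % solve the lower triangular system L z = y
  x = L' \ z;    % solve the upper triangular system L^H x = z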


12.3 An Algorithm

The most common algorithm for computing A := Chol(A) can be derived as follows: Consider A = LL^H. Partition

  A = [ α11 ? ; a21 A22 ] and L = [ λ11 0 ; l21 L22 ].   (12.1)

Remark 12.6 We adopt the commonly used notation where Greek lower case letters refer to scalars, lower case letters refer to (column) vectors, and upper case letters refer to matrices. (This convention is often attributed to Alston Householder.) The ? refers to a part of A that is neither stored nor updated.

By substituting these partitioned matrices into A = LL^H we find that

  [ α11 ? ; a21 A22 ] = [ λ11 0 ; l21 L22 ] [ λ11 0 ; l21 L22 ]^H = [ λ11 0 ; l21 L22 ] [ λ11 l21^H ; 0 L22^H ]
                      = [ |λ11|² ? ; λ11 l21  l21 l21^H + L22 L22^H ],

from which we conclude that

  |λ11| = √α11        ?
  l21 = a21/λ11       L22 = Chol(A22 - l21 l21^H).

These equalities motivate the algorithm

1. Partition A -> [ α11 ? ; a21 A22 ].

2. Overwrite α11 := λ11 = √α11. (Picking λ11 = √α11 makes it positive and real, and ensures uniqueness.)

3. Overwrite a21 := l21 = a21/λ11.

4. Overwrite A22 := A22 - l21 l21^H (updating only the lower triangular part of A22). This operation is called a symmetric rank-1 update.

5. Continue with A = A22. (Back to Step 1.)

An M-script sketch of the resulting loop is given after Remark 12.7.

Remark 12.7 Similar to the tril function in Matlab, we use tril(B) to denote the lower triangular part of matrix B.
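A minimal M-script sketch (ours) of the five steps as a loop; for simplicity it updates all of A22 rather than only its lower triangular part, so only the lower triangular part of the result is meaningful if the strictly upper part of A was not stored:

  function A = Chol_unb( A )
  % Overwrite A (assumed HPD) with its Cholesky factor L in the
  % lower triangular part.
    m = size( A, 1 );
    for j = 1:m
      A( j, j ) = sqrt( A( j, j ) );                 % alpha11 := lambda11
      A( j+1:m, j ) = A( j+1:m, j ) / A( j, j );     % a21 := l21
      A( j+1:m, j+1:m ) = A( j+1:m, j+1:m ) ...
                        - A( j+1:m, j ) * A( j+1:m, j )';  % rank-1 update
    end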


12.4 Proof of the Cholesky Factorization Theorem

In this section, we partition A as in (12.1):

  A -> [ α11 a21^H ; a21 A22 ].

The following lemmas are key to the proof:

Lemma 12.8 Let A ∈ C^{m x m} be HPD. Then α11 is real and positive.

Proof: This is a special case of Exercise 12.4.

Lemma 12.9 Let A ∈ C^{m x m} be HPD and l21 = a21/√α11. Then A22 - l21 l21^H is HPD.

Proof: Since A is Hermitian, so are A22 and A22 - l21 l21^H.

Let x2 ∈ C^{m-1} be an arbitrary nonzero vector. Define x = [ χ1 ; x2 ] where χ1 = -a21^H x2/α11. Then, since x ≠ 0,

  0 < x^H A x
    = [ χ1 ; x2 ]^H [ α11 a21^H ; a21 A22 ] [ χ1 ; x2 ]
    = [ χ1 ; x2 ]^H [ α11 χ1 + a21^H x2 ; a21 χ1 + A22 x2 ]
    = α11 |χ1|² + conj(χ1) a21^H x2 + x2^H a21 χ1 + x2^H A22 x2
    = (x2^H a21)(a21^H x2)/α11 - (x2^H a21)(a21^H x2)/α11 - (x2^H a21)(a21^H x2)/α11 + x2^H A22 x2
    = x2^H ( A22 - a21 a21^H/α11 ) x2    (since x2^H a21 a21^H x2 is real and hence equals a21^H x2 x2^H a21)
    = x2^H ( A22 - l21 l21^H ) x2.

We conclude that A22 - l21 l21^H is HPD.

Proof: (of the Cholesky Factorization Theorem) Proof by induction.

Base case: n = 1. Clearly the result is true for a 1 x 1 matrix A = α11: In this case, the fact that A is HPD means that α11 is real and positive, and a Cholesky factor is then given by λ11 = √α11, with uniqueness if we insist that λ11 is positive.

Inductive step: Assume the result is true for HPD matrices A ∈ C^{(n-1) x (n-1)}. We will show that it holds for A ∈ C^{n x n}. Let A ∈ C^{n x n} be HPD. Partition A and L as in (12.1) and let λ11 = √α11 (which is well-defined by Lemma 12.8), l21 = a21/λ11, and L22 = Chol(A22 - l21 l21^H) (which exists as a consequence of the Inductive Hypothesis and Lemma 12.9). Then L is the desired Cholesky factor of A.

By the principle of mathematical induction, the theorem holds.


12.5 Blocked Algorithm

In order to attain high performance, the computation is cast in terms of matrix-matrix multiplication by so-called blocked algorithms. For the Cholesky factorization a blocked version of the algorithm can be derived by partitioning

  A -> [ A11 ? ; A21 A22 ] and L -> [ L11 0 ; L21 L22 ],

where A11 and L11 are b x b. By substituting these partitioned matrices into A = LL^H we find that

  [ A11 ? ; A21 A22 ] = [ L11 0 ; L21 L22 ] [ L11 0 ; L21 L22 ]^H = [ L11 L11^H ? ; L21 L11^H  L21 L21^H + L22 L22^H ].

From this we conclude that

  L11 = Chol(A11)       ?
  L21 = A21 L11^{-H}    L22 = Chol(A22 - L21 L21^H).

An algorithm is then described by the steps

1. Partition A -> [ A11 ? ; A21 A22 ], where A11 is b x b.

2. Overwrite A11 := L11 = Chol(A11).

3. Overwrite A21 := L21 = A21 L11^{-H}.

4. Overwrite A22 := A22 - L21 L21^H (updating only the lower triangular part).

5. Continue with A = A22. (Back to Step 1.)

(An M-script sketch appears after Remark 12.11.)

Remark 12.10 The Cholesky factorization A11 := L11 = Chol(A11) can be computed with the unblocked algorithm or by calling the blocked Cholesky factorization algorithm recursively.

Remark 12.11 Operations like L21 = A21 L11^{-H} are computed by solving the equivalent linear system with multiple right-hand sides L11 L21^H = A21^H.
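A minimal M-script sketch (ours) of the blocked algorithm, calling the unblocked sketch Chol_unb given earlier for the diagonal block:

  function A = Chol_blk( A, b )
  % Blocked right-looking Cholesky factorization with block size b.
    m = size( A, 1 );
    for j = 1:b:m
      jb   = min( b, m-j+1 );
      cur  = j:j+jb-1;
      rest = j+jb:m;
      A( cur, cur ) = Chol_unb( A( cur, cur ) );     % A11 := Chol( A11 )
      L11 = tril( A( cur, cur ) );
      A( rest, cur ) = A( rest, cur ) / L11';        % A21 := A21 * inv( L11 )^H
      A( rest, rest ) = A( rest, rest ) ...
                      - A( rest, cur ) * A( rest, cur )';  % A22 := A22 - L21 * L21^H
    end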

12.6 Alternative Representation

When explaining the above algorithm in a classroom setting, invariably it is accompanied by a picture sequence like the one in Figure 12.1 (left) [1] and the (verbal) explanation:

[1] Picture modified from a similar one in [24].


[Figure 12.1 shows a sequence of quadrant pictures: at the beginning of an iteration the quadrants ATL and ABL are "done", the strictly upper part ("?") is not touched, and ABR is "partially updated"; the repartition step exposes α11, a21, and A22; the update step computes √α11, a21/α11, and A22 - a21 a21^T; at the end of the iteration the thick lines have moved.]

Figure 12.1: Left: Progression of pictures that explain the Cholesky factorization algorithm. Right: Same pictures, annotated with labels and updates.


Algorithm: A := CHOL_UNB(A)

Partition A -> [ ATL ? ; ABL ABR ]
where ATL is 0 x 0

while m(ATL) < m(A) do

  Repartition
    [ ATL ? ; ABL ABR ] -> [ A00 ? ? ; a10^T α11 ? ; A20 a21 A22 ]

  α11 := √α11
  a21 := a21/α11
  A22 := A22 - tril(a21 a21^H)

  Continue with
    [ ATL ? ; ABL ABR ] <- [ A00 ? ? ; a10^T α11 ? ; A20 a21 A22 ]

endwhile

Algorithm: A := CHOL_BLK(A)

Partition A -> [ ATL ? ; ABL ABR ]
where ATL is 0 x 0

while m(ATL) < m(A) do

  Repartition
    [ ATL ? ; ABL ABR ] -> [ A00 ? ? ; A10 A11 ? ; A20 A21 A22 ]

  A11 := Chol(A11)
  A21 := A21 tril(A11)^{-H}
  A22 := A22 - tril(A21 A21^H)

  Continue with
    [ ATL ? ; ABL ABR ] <- [ A00 ? ? ; A10 A11 ? ; A20 A21 A22 ]

endwhile

Figure 12.2: Unblocked and blocked algorithms for computing the Cholesky factorization in FLAME notation.


Beginning of iteration: At some stage of the algorithm (at the top of the loop), the computation has moved through the matrix to the point indicated by the thick lines. Notice that we have finished with the parts of the matrix that are in the top-left, top-right (which is not to be touched), and bottom-left quadrants. The bottom-right quadrant has been updated to the point where we only need to perform a Cholesky factorization of it.

Repartition: We now repartition the bottom-right submatrix to expose α11, a21, and A22.

Update: α11, a21, and A22 are updated as discussed before.

End of iteration: The thick lines are moved, since we have now completed more of the computation, and only a factorization of A22 (which becomes the new bottom-right quadrant) remains to be performed.

Continue: The above steps are repeated until the submatrix ABR is empty.

To motivate our notation, we annotate this progression of pictures as in Figure 12.1 (right). In those pictures, "T", "B", "L", and "R" stand for "Top", "Bottom", "Left", and "Right", respectively. This then motivates the format of the algorithm in Figure 12.2 (left). It uses what we call the FLAME notation for representing algorithms [24, 23, 38]. A similar explanation can be given for the blocked algorithm, which is given in Figure 12.2 (right). In the algorithms, m(A) indicates the number of rows of matrix A.

Remark 12.12 The indices in our more stylized presentation of the algorithms are subscripts rather than indices in the conventional sense.

Remark 12.13 The notation in Figures 12.1 and 12.2 allows the contents of matrix A at the beginning of the iteration to be formally stated:

  A = [ ATL ? ; ABL ABR ] = [ LTL ? ; LBL  ÂBR - tril(LBL LBL^H) ],

where LTL = Chol(ÂTL), LBL = ÂBL LTL^{-H}, and ÂTL, ÂBL, and ÂBR denote the original contents of the quadrants ATL, ABL, and ABR, respectively.

Homework 12.14 Implement the Cholesky factorization with M-script.* SEE ANSWER

12.7 Cost

The cost of the Cholesky factorization of A ∈ C^{m x m} can be analyzed as follows: In Figure 12.2 (left), during the kth iteration (starting k at zero) A00 is k x k. Thus, the operations in that iteration cost

• α11 := √α11: negligible when k is large.

• a21 := a21/α11: approximately (m - k - 1) flops.

• A22 := A22 - tril(a21 a21^H): approximately (m - k - 1)² flops. (A rank-1 update of all of A22 would have cost 2(m - k - 1)² flops. Approximately half the entries of A22 are updated.)


Thus, the total cost in flops is given by

  C_Chol(m) ≈ Σ_{k=0}^{m-1} (m - k - 1)²   (due to the update of A22)
            + Σ_{k=0}^{m-1} (m - k - 1)    (due to the update of a21)
            = Σ_{j=0}^{m-1} j² + Σ_{j=0}^{m-1} j ≈ (1/3)m³ + (1/2)m² ≈ (1/3)m³,

which allows us to state that (obviously) most computation is in the update of A22. It can be shown that the blocked Cholesky factorization algorithm performs exactly the same number of floating point operations.

Comparing the cost of the Cholesky factorization to that of the LU factorization from "Notes on LU Factorization", we see that taking advantage of symmetry cuts the cost approximately in half.

12.8 Solving the Linear Least-Squares Problem via the Cholesky Factorization

Recall that if B ∈ C^{m x n} has linearly independent columns, then A = B^H B is HPD. Also, recall from "Notes on Linear Least-Squares" that the solution x ∈ C^n to the linear least-squares (LLS) problem

  ‖Bx - y‖₂ = min_{z ∈ C^n} ‖Bz - y‖₂

equals the solution to the normal equations

  B^H B x = B^H y.

This makes it obvious how the Cholesky factorization can be (and often is) used to solve the LLS problem.
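A sketch (ours) in M-script; chol is the built-in Cholesky routine, and any of the factorization sketches in this chapter could be substituted:

  A    = B' * B;               % form B^H B
  yhat = B' * y;               % form B^H y
  L    = chol( A, 'lower' );   % A = L * L^H
  z    = L  \ yhat;            % solve L z = yhat
  x    = L' \ z;               % solve L^H x = z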

Homework 12.15 Consider B ∈ C^{m x n} with linearly independent columns. Recall that B has a QR factorization, B = QR, where Q has orthonormal columns and R is an upper triangular matrix with positive diagonal elements. How are the Cholesky factorization of B^H B and the QR factorization of B related?

* SEE ANSWER

12.9 Other Cholesky Factorization Algorithms

There are actually three different unblocked and three different blocked algorithms for computing the Cholesky factorization. The algorithms we discussed so far in this note are sometimes called right-looking algorithms. Systematic derivation of all these algorithms, as well as their blocked counterparts, is given in Chapter 6 of [38]. In this section, a sequence of exercises leads to what is often referred to as the bordered Cholesky factorization algorithm.


Algorithm: A := CHOL_UNB(A) (bordered algorithm)

Partition A -> [ ATL ? ; ABL ABR ]
where ATL is 0 x 0

while m(ATL) < m(A) do

  Repartition
    [ ATL ? ; ABL ABR ] -> [ A00 ? ? ; a10^T α11 ? ; A20 a21 A22 ]
  where α11 is 1 x 1

  [ updates to be filled in as part of Homework 12.17 ]

  Continue with
    [ ATL ? ; ABL ABR ] <- [ A00 ? ? ; a10^T α11 ? ; A20 a21 A22 ]

endwhile

Figure 12.3: Unblocked Cholesky factorization, bordered variant, for Homework 12.17.


Homework 12.16 Let A be SPD and partition

  A -> [ A00 a10 ; a10^T α11 ].

(Hint: For this exercise, use techniques similar to those in Section 12.4.)

1. Show that A00 is SPD.

2. Assuming that A00 = L00 L00^T, where L00 is lower triangular and nonsingular, argue that the assignment l10^T := a10^T L00^{-T} is well-defined.

3. Assuming that A00 is SPD, A00 = L00 L00^T where L00 is lower triangular and nonsingular, and l10^T = a10^T L00^{-T}, show that α11 - l10^T l10 > 0, so that λ11 := √(α11 - l10^T l10) is well-defined.

4. Show that

  [ A00 a10 ; a10^T α11 ] = [ L00 0 ; l10^T λ11 ] [ L00 0 ; l10^T λ11 ]^T.

* SEE ANSWER

Homework 12.17 Use the results in the last exercise to give an alternative proof by induction of the Cholesky Factorization Theorem and to give an algorithm for computing it by filling in Figure 12.3. This algorithm is often referred to as the bordered Cholesky factorization algorithm.
* SEE ANSWER

Homework 12.18 Show that the cost of the bordered algorithm is, approximately, (1/3)n³ flops.
* SEE ANSWER

12.10 Implementing the Cholesky Factorization with the (Traditional) BLAS

The Basic Linear Algebra Subprograms (BLAS) are an interface to commonly used fundamental linear algebra operations. In this section, we illustrate how the unblocked and blocked Cholesky factorization algorithms can be implemented in terms of the BLAS. The explanation draws from the entry we wrote for the BLAS in the Encyclopedia of Parallel Computing [37].

12.10.1 What are the BLAS?

The BLAS interface [28, 18, 17] was proposed to support portable high-performance implementation of applications that are matrix and/or vector computation intensive. The idea is that one casts computation in terms of the BLAS interface, leaving the architecture-specific optimization of that interface to an expert.


Simple (algorithm and corresponding Fortran code):

  for j = 1:n
    α(j,j) := √α(j,j)
    for i = j+1:n
      α(i,j) := α(i,j)/α(j,j)
    endfor
    for k = j+1:n
      for i = k:n
        α(i,k) := α(i,k) - α(i,j) α(k,j)
      endfor
    endfor
  endfor

  do j=1, n
    A(j,j) = sqrt(A(j,j))
    do i=j+1,n
      A(i,j) = A(i,j) / A(j,j)
    enddo
    do k=j+1,n
      do i=k,n
        A(i,k) = A(i,k) - A(i,j) * A(k,j)
      enddo
    enddo
  enddo

Vector-vector (level-1 BLAS based algorithm and code):

  for j = 1:n
    α(j,j) := √α(j,j)
    α(j+1:n,j) := α(j+1:n,j)/α(j,j)
    for k = j+1:n
      α(k:n,k) := -α(k,j) α(k:n,j) + α(k:n,k)
    endfor
  endfor

  do j=1, n
    A( j,j ) = sqrt( A( j,j ) )
    call dscal( n-j, 1.0d00 / A(j,j), A(j+1,j), 1 )
    do k=j+1,n
      call daxpy( n-k+1, -A(k,j), A(k,j), 1, A(k,k), 1 )
    enddo
  enddo

Figure 12.4: Simple and vector-vector (level-1 BLAS) based representations of the right-looking algorithm.


Matrix-vector (level-2 BLAS based algorithm and code):

  for j = 1:n
    α(j,j) := √α(j,j)
    α(j+1:n,j) := α(j+1:n,j)/α(j,j)
    α(j+1:n,j+1:n) := -tril( α(j+1:n,j) α(j+1:n,j)^T ) + α(j+1:n,j+1:n)
  endfor

  do j=1, n
    A(j,j) = sqrt(A(j,j))
    call dscal( n-j, 1.0d00 / A(j,j), A(j+1,j), 1 )
    call dsyr( 'lower triangular', n-j, -1.0, A(j+1,j), 1, A(j+1,j+1), lda )
  enddo

The same algorithm in FLAME notation (Figure 12.2, left) and the corresponding FLAME/C code:

  int Chol_unb_var3( FLA_Obj A )
  {
    FLA_Obj ATL, ATR,  A00,  a01,     A02,
            ABL, ABR,  a10t, alpha11, a12t,
                       A20,  a21,     A22;

    FLA_Part_2x2( A, &ATL, &ATR,
                     &ABL, &ABR, 0, 0, FLA_TL );

    while ( FLA_Obj_length( ATL ) < FLA_Obj_length( A ) ) {
      FLA_Repart_2x2_to_3x3(
        ATL, /**/ ATR,      &A00,  /**/ &a01,     &A02,
        /* ************ */  /* ************************** */
                            &a10t, /**/ &alpha11, &a12t,
        ABL, /**/ ABR,      &A20,  /**/ &a21,     &A22,
        1, 1, FLA_BR );
      /*-------------------------------------------------*/
      FLA_Sqrt( alpha11 );
      FLA_Invscal( alpha11, a21 );
      FLA_Syr( FLA_LOWER_TRIANGULAR, FLA_MINUS_ONE, a21, A22 );
      /*-------------------------------------------------*/
      FLA_Cont_with_3x3_to_2x2(
        &ATL, /**/ &ATR,    A00,  a01,     /**/ A02,
                            a10t, alpha11, /**/ a12t,
        /* ************* */ /* *********************** */
        &ABL, /**/ &ABR,    A20,  a21,     /**/ A22,
        FLA_TL );
    }
    return FLA_SUCCESS;
  }

Figure 12.5: Matrix-vector (level-2 BLAS) based representations of the right-looking algorithm.


12.10.2 A simple implementation in Fortran

We start with a simple implementation in Fortran. A simple algorithm that does not use the BLAS and the corresponding code are given in the row labeled "Simple" in Figure 12.4. This sets the stage for our explanation of how the algorithm and code can be represented with vector-vector, matrix-vector, and matrix-matrix operations, and the corresponding calls to BLAS routines.

12.10.3 Implementation with calls to level-1 BLAS

The first BLAS interface [28] was proposed in the 1970s when vector supercomputers like the early Cray architectures reigned. On such computers, it sufficed to cast computation in terms of vector operations. As long as memory was accessed mostly contiguously, near-peak performance could be achieved. This interface is now referred to as the level-1 BLAS. It was used for the implementation of the first widely used dense linear algebra package, LINPACK [16].

Let x and y be vectors of appropriate length and α be a scalar. In this and other notes we have vector-vector operations such as scaling of a vector (x := αx), inner (dot) product (α := x^T y), and scaled vector addition (y := αx + y). This last operation is known as an axpy, which stands for alpha times x plus y.

Our Cholesky factorization algorithm expressed in terms of such vector-vector operations and the corresponding code are given in Figure 12.4 in the row labeled "Vector-vector". If the operations supported by dscal and daxpy achieve high performance on a target architecture (as they did in the days of vector supercomputers) then so will the implementation of the Cholesky factorization, since it casts most computation in terms of those operations. Unfortunately, vector-vector operations perform O(n) computation on O(n) data, meaning that these days the bandwidth to memory typically limits performance, since retrieving a data item from memory is often more than an order of magnitude more costly than a floating point operation with that data item.

We summarize information about level-1 BLAS in Figure 12.6.

12.10.4 Matrix-vector operations (level-2 BLAS)

The next level of BLAS supports operations with matrices and vectors. The simplest example of such an operation is the matrix-vector product: y := Ax, where x and y are vectors and A is a matrix. Another example is the computation A22 := −a21 a21^T + A22 (a symmetric rank-1 update) in the Cholesky factorization. Here only the lower (or upper) triangular part of the matrix is updated, taking advantage of symmetry.

The use of the symmetric rank-1 update is illustrated in Figure 12.5, in the row labeled "Matrix-vector". There dsyr is the routine that implements a double precision symmetric rank-1 update. Readability of the code is improved by casting computation in terms of routines that implement the operations that appear in the algorithm: dscal for a21 := a21/α11 and dsyr for A22 := −a21 a21^T + A22.

If the operation supported by dsyr achieves high performance on a target architecture (as was the case in the days of vector supercomputers) then so will this implementation of the Cholesky factorization, since it casts most computation in terms of that operation. Unfortunately, matrix-vector operations perform O(n²) computation on O(n²) data, meaning that these days the bandwidth to memory typically limits performance.

We summarize information about level-2 BLAS in Figure 12.7.


A prototypical calling sequence for a level-1 BLAS routine is

    ?axpy( n, alpha, x, incx, y, incy ),

which implements the scaled vector addition operation y := αx + y. Here

• The "?" indicates the data type. The choices for this first letter are

    s  single precision
    d  double precision
    c  single precision complex
    z  double precision complex

• The operation is identified as axpy: alpha times x plus y.

• n indicates the number of elements in the vectors x and y.

• alpha is the scalar α.

• x and y indicate the memory locations where the first elements of x and y are stored, respectively.

• incx and incy equal the increment by which one has to stride through memory for the elements of vectors x and y, respectively.
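As a quick illustration, the double precision variant daxpy can be exercised from Python via SciPy's low-level BLAS wrappers (our sketch; it assumes SciPy is installed, and the notes themselves call the BLAS from Fortran):

    import numpy as np
    from scipy.linalg.blas import daxpy   # double precision: y := alpha*x + y

    x = np.array([1.0, 2.0, 3.0])
    y = np.array([4.0, 5.0, 6.0])
    y = daxpy(x, y, a=2.0)                # alpha = 2; unit strides assumed
    print(y)                              # [ 6.  9. 12.]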

The following are the most frequently used level-1 BLAS:

    routine/function    operation
    swap                x ↔ y
    scal                x := αx
    copy                y := x
    axpy                y := αx + y
    dot                 x^T y
    nrm2                ‖x‖₂
    asum                ‖re(x)‖₁ + ‖im(x)‖₁
    imax                min(k) : |re(x_k)| + |im(x_k)| = max_i(|re(x_i)| + |im(x_i)|)

Figure 12.6: Summary of the most commonly used level-1 BLAS.


The naming convention for level-2 BLAS routines is given by

    ?XXYY,

where

• "?" can take on the values s, d, c, z.

• XX indicates the shape of the matrix:

    XX   matrix shape
    ge   general (rectangular)
    sy   symmetric
    he   Hermitian
    tr   triangular

• YY indicates the operation to be performed:

    YY   operation
    mv   matrix-vector multiplication
    sv   solve vector
    r    rank-1 update
    r2   rank-2 update

• In addition, operations with banded matrices are supported, which we do not discuss here.

A representative call to a level-2 BLAS operation is given by

    dsyr( uplo, n, alpha, x, incx, A, lda )

which implements the operation A := αxx^T + A, updating the lower or upper triangular part of A by choosing uplo as 'Lower triangular' or 'Upper triangular', respectively. The parameter lda (the leading dimension of matrix A) indicates the increment by which memory has to be traversed in order to address successive elements in a row of matrix A. The following table gives the most commonly used level-2 BLAS operations:

    routine/function    operation
    gemv                general matrix-vector multiplication
    symv                symmetric matrix-vector multiplication
    trmv                triangular matrix-vector multiplication
    trsv                triangular solve vector
    ger                 general rank-1 update
    syr                 symmetric rank-1 update
    syr2                symmetric rank-2 update

Figure 12.7: Summary of the most commonly used level-2 BLAS.
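To make the semantics of dsyr concrete, here is a small NumPy model of the update (our sketch, not the actual BLAS implementation; the name syr_lower is hypothetical), which references and updates only the lower triangle:

    import numpy as np

    def syr_lower(alpha, x, A):
        # Model of dsyr with uplo = 'Lower triangular':
        # A := alpha * x * x^T + A, lower triangular part only.
        n = x.shape[0]
        for j in range(n):
            A[j:, j] += alpha * x[j:] * x[j]
        return A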


Algorithm (Matrix-matrix) and corresponding code:

    for j = 1 : n in steps of nb
       b := min(n − j + 1, nb)
       A_{j:j+b−1, j:j+b−1} := Chol(A_{j:j+b−1, j:j+b−1})
       A_{j+b:n, j:j+b−1} := A_{j+b:n, j:j+b−1} A_{j:j+b−1, j:j+b−1}^{−H}
       A_{j+b:n, j+b:n} := A_{j+b:n, j+b:n} − tril(A_{j+b:n, j:j+b−1} A_{j+b:n, j:j+b−1}^H)
    endfor

    do j=1, n, nb
       jb = min( nb, n-j+1 )
       call chol( jb, A( j, j ), lda )
       call dtrsm( 'Right', 'Lower triangular', 'Transpose', 'Nonunit diag',
                   n-j-jb+1, jb, 1.0d00, A( j, j ), lda,
                   A( j+jb, j ), lda )
       call dsyrk( 'Lower triangular', 'No transpose',
                   n-j-jb+1, jb, -1.0d00, A( j+jb, j ), lda,
                   1.0d00, A( j+jb, j+jb ), lda )
    enddo

FLAME notation:

    Partition A → ( ATL  ⋆   )
                  ( ABL  ABR )
      where ATL is 0 × 0
    while m(ATL) < m(A) do
       Determine block size b
       Repartition
          ( ATL  ⋆   )    ( A00  ⋆    ⋆   )
          ( ABL  ABR ) →  ( A10  A11  ⋆   )
                          ( A20  A21  A22 )
          where A11 is b × b

       A11 := Chol(A11)
       A21 := A21 tril(A11)^{−H}
       A22 := A22 − tril(A21 A21^H)

       Continue with
          ( ATL  ⋆   )    ( A00  ⋆    ⋆   )
          ( ABL  ABR ) ←  ( A10  A11  ⋆   )
                          ( A20  A21  A22 )
    endwhile


Figure 12.8: Blocked algorithm and implementation with level-3 BLAS.


The naming convention for level-3 BLAS routines is similar to that for the level-2 BLAS. A representative call to a level-3 BLAS operation is given by

    dsyrk( uplo, trans, n, k, alpha, A, lda, beta, C, ldc )

which implements the operation C := αAA^T + βC or C := αA^T A + βC depending on whether trans is chosen as 'No transpose' or 'Transpose', respectively. It updates the lower or upper triangular part of C depending on whether uplo equals 'Lower triangular' or 'Upper triangular', respectively. The parameters lda and ldc are the leading dimensions of arrays A and C, respectively. The following table gives the most commonly used level-3 BLAS operations:

    routine/function    operation
    gemm                general matrix-matrix multiplication
    symm                symmetric matrix-matrix multiplication
    trmm                triangular matrix-matrix multiplication
    trsm                triangular solve with multiple right-hand sides
    syrk                symmetric rank-k update
    syr2k               symmetric rank-2k update

Figure 12.9: Summary of the most commonly used level-3 BLAS.

12.10.5 Matrix-matrix operations (level-3 BLAS)

Finally, we turn to the implementation of the blocked Cholesky factorization algorithm from Section 12.5. The algorithm is expressed with FLAME notation and Matlab-like notation in Figure 12.8.

The routines dtrsm and dsyrk are level-3 BLAS routines:

• The call to dtrsm implements A21 := L21, where L21 L11^T = A21.

• The call to dsyrk implements A22 := −L21 L21^T + A22.

The bulk of the computation is now cast in terms of matrix-matrix operations, which can achieve high performance.

We summarize information about level-3 BLAS in Figure 12.9.
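The following Python/NumPy sketch (ours, with the hypothetical name chol_blocked; it assumes SciPy for the triangular solve) mirrors the blocked right-looking algorithm, with comments indicating which BLAS routine each step maps to:

    import numpy as np
    from scipy.linalg import solve_triangular

    def chol_blocked(A, nb=64):
        # Blocked right-looking Cholesky sketch: overwrites the lower
        # triangle of A with L such that A = L L^T.
        n = A.shape[0]
        for j in range(0, n, nb):
            b = min(nb, n - j)
            # Factor the diagonal block (an unblocked/level-2 subproblem).
            A[j:j+b, j:j+b] = np.linalg.cholesky(A[j:j+b, j:j+b])
            if j + b < n:
                # A21 := A21 * L11^{-T}  (maps to dtrsm)
                A[j+b:, j:j+b] = solve_triangular(
                    A[j:j+b, j:j+b], A[j+b:, j:j+b].T, lower=True).T
                # A22 := A22 - A21 * A21^T  (maps to dsyrk)
                A[j+b:, j+b:] -= A[j+b:, j:j+b] @ A[j+b:, j:j+b].T
        return A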

12.10.6 Impact on performance

Figure 12.10 illustrates the performance benefits that come from using the different levels of BLAS on atypical architecture.


[Figure 12.10 is a performance plot: GFlops (y-axis) versus matrix size n (x-axis, 0–2000), one thread, with curves for "Hand optimized", "BLAS3", "BLAS2", "BLAS1", and "triple loops".]

Figure 12.10: Performance of the different implementations of Cholesky factorization that use different levels of BLAS. The target processor has a peak of 11.2 Gflops (billions of floating point operations per second). BLAS1, BLAS2, and BLAS3 indicate that the bulk of computation was cast in terms of level-1, -2, or -3 BLAS, respectively.

12.11 Alternatives to the BLAS

12.11.1 The FLAME/C API

In a number of places in these notes we presented algorithms in FLAME notation. Clearly, there is a disconnect between this notation and how the algorithms are then encoded with the BLAS interface. In Figures 12.4, 12.5, and 12.8 we also show how the FLAME API for the C programming language [5] allows the algorithms to be more naturally translated into code. While the traditional BLAS interface underlies the implementation of Cholesky factorization and other algorithms in the widely used LAPACK library [1], the FLAME/C API is used in our libflame library [24, 39, 40].

12.11.2 BLIS

The implementations that call the BLAS in these notes are coded in Fortran. More recently, the languages of choice for scientific computing have become C and C++. While there is a C interface to the traditional BLAS called the CBLAS [10], we believe a more elegant such interface is the BLAS-like Library Instantiation Software (BLIS) interface [41]. BLIS is not only a framework for rapid implementation of the traditional BLAS, but also presents an alternative interface for C and C++ users.


Chapter 13

Notes on Eigenvalues and Eigenvectors

If you have forgotten how to find the eigenvalues and eigenvectors of 2×2 and 3×3 matrices, you may want to review

Linear Algebra: Foundations to Frontiers - Notes to LAFF With [29].

Video

Read disclaimer regarding the videos in the preface!

* YouTube
* Download from UT Box
* View After Local Download

(For help on viewing, see Appendix A.)


Outline

Video
Outline
13.1. Definition
13.2. The Schur and Spectral Factorizations
13.3. Relation Between the SVD and the Spectral Decomposition


13.1 Definition

Definition 13.1 Let A ∈ C^{m×m}. Then λ ∈ C and nonzero x ∈ C^m are said to be an eigenvalue and corresponding eigenvector if Ax = λx. The tuple (λ, x) is said to be an eigenpair. The set of all eigenvalues of A is denoted by Λ(A) and is called the spectrum of A. The spectral radius of A, ρ(A), equals the magnitude of the eigenvalue that is largest in magnitude:

    ρ(A) = max_{λ∈Λ(A)} |λ|.

The action of A on an eigenvector x is as if it were multiplied by a scalar: the direction does not change; only its length is scaled: Ax = λx.

Theorem 13.2 Scalar λ is an eigenvalue of A if and only if any (and hence all) of the following equivalent conditions hold:

• (λI − A) is singular,
• (λI − A) has a nontrivial null-space,
• (λI − A) has linearly dependent columns,
• det(λI − A) = 0,
• (λI − A)x = 0 has a nontrivial solution,
• etc.

The following exercises expose some other basic properties of eigenvalues and eigenvectors:

Homework 13.3 Eigenvectors are not unique.
* SEE ANSWER

Homework 13.4 Let λ be an eigenvalue of A and let Eλ(A) = {x ∈ C^m | Ax = λx} denote the set of all eigenvectors of A associated with λ (including the zero vector, which is not really considered an eigenvector). Show that this set is a (nontrivial) subspace of C^m.

* SEE ANSWER

Definition 13.5 Given A ∈ C^{m×m}, the function pm(λ) = det(λI − A) is a polynomial of degree at most m. This polynomial is called the characteristic polynomial of A.

The definition of pm(λ) and the fact that it is a polynomial of degree at most m are consequences of the definition of the determinant of an arbitrary square matrix. This definition is not particularly enlightening other than that it allows one to succinctly relate eigenvalues to the roots of the characteristic polynomial.

Remark 13.6 The relation between eigenvalues and the roots of the characteristic polynomial yields a disconcerting insight: a general formula for the eigenvalues of an m×m matrix with m > 4 does not exist.


The reason is that there is no general formula for the roots of a polynomial of degree m > 4. Given any polynomial pm(χ) of degree m, an m×m matrix can be constructed such that its characteristic polynomial is pm(λ). If

    pm(χ) = α0 + α1χ + ··· + α_{m−1}χ^{m−1} + χ^m

and

    A = ( −α_{m−1}  −α_{m−2}  −α_{m−3}  ···  −α1  −α0 )
        (  1          0          0      ···   0    0  )
        (  0          1          0      ···   0    0  )
        (  0          0          1      ···   0    0  )
        (  ⋮          ⋮          ⋮      ⋱     ⋮    ⋮  )
        (  0          0          0      ···   1    0  )

then

    pm(λ) = det(λI − A).

Hence, we conclude that no general formula can be found for the eigenvalues of m×m matrices when m > 4. What we will see in future "Notes on ..." is that we will instead create algorithms that converge to the eigenvalues and/or eigenvectors of matrices.
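A quick numerical sanity check of this construction (our sketch in Python/NumPy, not part of the notes): build the matrix above for p(χ) = (χ−1)(χ−2)(χ−4) = −8 + 14χ − 7χ² + χ³ and confirm that its eigenvalues are the roots of p.

    import numpy as np

    coeffs = np.array([-8.0, 14.0, -7.0])     # [alpha0, alpha1, alpha2]
    m = len(coeffs)
    A = np.zeros((m, m))
    A[0, :] = -coeffs[::-1]                   # first row: -alpha_{m-1},...,-alpha0
    A[1:, :-1] = np.eye(m - 1)                # ones on the subdiagonal
    print(np.sort(np.linalg.eigvals(A).real)) # approximately [1. 2. 4.]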

Theorem 13.7 Let A ∈ C^{m×m} and pm(λ) be its characteristic polynomial. Then λ ∈ Λ(A) if and only if pm(λ) = 0.

Proof: This is an immediate consequence of Theorem 13.2.

In other words, λ is an eigenvalue of A if and only if it is a root of pm(λ). This has the immediate consequence that A has at most m eigenvalues and, if one counts multiple roots by their multiplicity, it has exactly m eigenvalues. (One says "Matrix A ∈ C^{m×m} has m eigenvalues, multiplicity counted.")

Homework 13.8 The eigenvalues of a diagonal matrix equal the values on its diagonal. The eigenvalues of a triangular matrix equal the values on its diagonal.

* SEE ANSWER

Corollary 13.9 If A ∈ R^{m×m} is real valued then some or all of its eigenvalues may be complex valued. In this case, if λ ∈ Λ(A) then so is its conjugate, λ̄.

Proof: It can be shown that if A is real valued, then the coefficients of its characteristic polynomial are all real valued. Complex roots of a polynomial with real coefficients come in conjugate pairs.

It is not hard to see that an eigenvalue that is a root of multiplicity k has at most k linearly independent eigenvectors. It is, however, not necessarily the case that an eigenvalue that is a root of multiplicity k also has k linearly independent eigenvectors. In other words, the null space of λI − A may have dimension less than the algebraic multiplicity of λ. The prototypical counterexample is the k×k matrix

    J(µ) = ( µ  1  0  ···  0  0 )
           ( 0  µ  1  ···  0  0 )
           ( ⋮     ⋱  ⋱    ⋮  ⋮ )
           ( 0  0  0  ···  µ  1 )
           ( 0  0  0  ···  0  µ )


where k > 1. Observe that λI − J(µ) is singular if and only if λ = µ. Since µI − J(µ) has k−1 linearly independent columns, its null-space has dimension one: all eigenvectors are scalar multiples of each other. This matrix is known as a Jordan block.

Definition 13.10 A matrix A ∈ C^{m×m} that has fewer than m linearly independent eigenvectors is said to be defective. A matrix that does have m linearly independent eigenvectors is said to be nondefective.

Theorem 13.11 Let A ∈ C^{m×m}. There exist a nonsingular matrix X and a diagonal matrix Λ such that A = XΛX^{−1} if and only if A is nondefective.

Proof:
(⇒). Assume there exist a nonsingular matrix X and a diagonal matrix Λ so that A = XΛX^{−1}. Then, equivalently, AX = XΛ. Partition X by columns so that

    A ( x0  x1  ···  x_{m−1} ) = ( x0  x1  ···  x_{m−1} ) diag(λ0, λ1, ..., λ_{m−1})
                               = ( λ0 x0  λ1 x1  ···  λ_{m−1} x_{m−1} ).

Then, clearly, A x_j = λ_j x_j, so that A has m linearly independent eigenvectors and is thus nondefective.

(⇐). Assume that A is nondefective. Let x0, ..., x_{m−1} equal m linearly independent eigenvectors corresponding to eigenvalues λ0, ..., λ_{m−1}. If X = ( x0  x1  ···  x_{m−1} ) then AX = XΛ, where Λ = diag(λ0, ..., λ_{m−1}). Hence A = XΛX^{−1}.

Definition 13.12 Let µ ∈ Λ(A) and pm(λ) be the characteristic polynomial of A. Then the algebraic multiplicity of µ is defined as the multiplicity of µ as a root of pm(λ).

Definition 13.13 Let µ ∈ Λ(A). Then the geometric multiplicity of µ is defined to be the dimension ofEµ(A). In other words, the geometric multiplicity of µ equals the number of linearly independent eigen-vectors that are associated with µ.

Theorem 13.14 Let A ∈ C^{m×m}. Let the eigenvalues of A be given by λ0, λ1, ..., λ_{k−1}, where an eigenvalue is listed exactly n times if it has geometric multiplicity n. There exists a nonsingular matrix X such that

    A = X ( J(λ0)   0      ···   0          )
          ( 0       J(λ1)  ···   0          ) X^{−1}.
          ( ⋮       ⋮      ⋱     ⋮          )
          ( 0       0      ···   J(λ_{k−1}) )

For our discussion, the sizes of the Jordan blocks J(λi) are not particularly important. Indeed, this decomposition, known as the Jordan Canonical Form of matrix A, is not particularly interesting in practice. For this reason, we don't discuss it further and do not give its proof.


13.2 The Schur and Spectral Factorizations

Theorem 13.15 Let A,Y,B ∈ Cm×m, assume Y is nonsingular, and let B = Y−1AY . Then Λ(A) = Λ(B).

Proof: Let λ ∈ Λ(A) and let x be an associated eigenvector. Then Ax = λx if and only if Y^{−1}AY Y^{−1}x = Y^{−1}λx if and only if B(Y^{−1}x) = λ(Y^{−1}x).

Definition 13.16 Matrices A and B are said to be similar if there exists a nonsingular matrix Y such that B = Y^{−1}AY.

Given a nonsingular matrix Y, the transformation Y^{−1}AY is called a similarity transformation of A. It is not hard to expand the last proof to show that if A is similar to B and λ ∈ Λ(A) has algebraic/geometric multiplicity k, then λ ∈ Λ(B) has algebraic/geometric multiplicity k.

The following is the fundamental theorem for the algebraic eigenvalue problem:

Theorem 13.17 Schur Decomposition Theorem Let A ∈ C^{m×m}. Then there exist a unitary matrix Q and an upper triangular matrix U such that A = QUQ^H. This decomposition is called the Schur decomposition of matrix A.

In the above theorem, Λ(A) = Λ(U) and hence the eigenvalues of A can be found on the diagonal of U.

Proof: We will outline how to construct Q so that Q^H A Q = U, an upper triangular matrix.

Since a polynomial of degree m has at least one root, matrix A has at least one eigenvalue, λ1, and a corresponding eigenvector q1, where we normalize this eigenvector to have length one. Thus Aq1 = λ1q1. Choose Q2 so that Q = ( q1  Q2 ) is unitary. Then

    Q^H A Q = ( q1  Q2 )^H A ( q1  Q2 )
            = ( q1^H A q1    q1^H A Q2 )
              ( Q2^H A q1    Q2^H A Q2 )
            = ( λ1            q1^H A Q2 )
              ( λ1 Q2^H q1    Q2^H A Q2 )
            = ( λ1   w^T )
              ( 0    B   ),

where w^T = q1^H A Q2 and B = Q2^H A Q2 (we used Aq1 = λ1q1 and Q2^H q1 = 0). This insight can be used to construct an inductive proof.

One should not mistake the above theorem and its proof for a constructive way to compute the Schur decomposition: finding an eigenvalue and/or the eigenvector associated with it is difficult.

Lemma 13.18 Let A ∈ C^{m×m} be of the form

    A = ( ATL  ATR )
        ( 0    ABR ).

Assume that QTL and QBR are unitary "of appropriate size". Then

    A = ( QTL  0   )^H ( QTL ATL QTL^H    QTL ATR QBR^H ) ( QTL  0   )
        ( 0    QBR )   ( 0                QBR ABR QBR^H ) ( 0    QBR ).


Homework 13.19 Prove Lemma 13.18. Then generalize it to a result for block upper triangular matrices:

    A = ( A_{0,0}  A_{0,1}  ···  A_{0,N−1}   )
        ( 0        A_{1,1}  ···  A_{1,N−1}   )
        ( ⋮        ⋮        ⋱    ⋮           )
        ( 0        0        ···  A_{N−1,N−1} ).

* SEE ANSWER

Corollary 13.20 Let A ∈ C^{m×m} be of the form

    A = ( ATL  ATR )
        ( 0    ABR ).

Then Λ(A) = Λ(ATL) ∪ Λ(ABR).

Homework 13.21 Prove Corollary 13.20. Then generalize it to a result for block upper triangular matrices.

* SEE ANSWER

A theorem that will later allow the eigenvalues and eigenvectors of a real matrix to be computed (mostly) without requiring complex arithmetic is given by

Theorem 13.22 Let A ∈ R^{m×m}. Then there exist a unitary matrix Q ∈ R^{m×m} and a quasi upper triangular matrix U ∈ R^{m×m} such that A = QUQ^T.

A quasi upper triangular matrix is a block upper triangular matrix where the blocks on the diagonal are 1×1 or 2×2. Complex eigenvalues of A are found as the complex eigenvalues of those 2×2 blocks on the diagonal.

Theorem 13.23 Spectral Decomposition Theorem Let A ∈ C^{m×m} be Hermitian. Then there exist a unitary matrix Q and a diagonal matrix Λ ∈ R^{m×m} such that A = QΛQ^H. This decomposition is called the Spectral decomposition of matrix A.

Proof: From the Schur Decomposition Theorem we know that there exist a unitary matrix Q and an upper triangular matrix U such that A = QUQ^H. Since A = A^H we know that QUQ^H = QU^H Q^H, and hence U = U^H. But a Hermitian triangular matrix is diagonal with real valued diagonal entries.

What we conclude is that a Hermitian matrix is nondefective and its eigenvectors can be chosen to form an orthogonal basis.

Homework 13.24 Let A be Hermitian and let λ and µ be distinct eigenvalues with eigenvectors xλ and xµ, respectively. Then xλ^H xµ = 0. (In other words, the eigenvectors of a Hermitian matrix corresponding to distinct eigenvalues are orthogonal.)
* SEE ANSWER


13.3 Relation Between the SVD and the Spectral Decomposition

Homework 13.25 Let A ∈ C^{m×m} be a Hermitian matrix, A = QΛQ^H its Spectral Decomposition, and A = UΣV^H its SVD. Relate Q, U, V, Λ, and Σ.

* SEE ANSWER

Homework 13.26 Let A ∈ C^{m×m} and A = UΣV^H its SVD. Relate the Spectral decompositions of A^H A and AA^H to U, V, and Σ.

* SEE ANSWER


Chapter 14

Notes on the Power Method and Related Methods

You may want to review Chapter 12 of

Linear Algebra: Foundations to Frontiers - Notes to LAFF With [29]

in which the Power Method and Inverse Power Methods are discussed at a more rudimentary level.

Video

Read disclaimer regarding the videos in the preface!

Tragically, I forgot to turn on the camera... This was a great lecture!


Outline

Video
Outline
14.1. The Power Method
  14.1.1. First attempt
  14.1.2. Second attempt
  14.1.3. Convergence
  14.1.4. Practical Power Method
  14.1.5. The Rayleigh quotient
  14.1.6. What if |λ0| ≥ |λ1|?
14.2. The Inverse Power Method
14.3. Rayleigh-quotient Iteration


14.1 The Power Method

The Power Method is a simple method that under mild conditions yields a vector corresponding to the eigenvalue that is largest in magnitude.

Throughout this section we will assume that a given matrix A ∈ C^{m×m} is nondefective: there exist a nonsingular matrix X and a diagonal matrix Λ such that A = XΛX^{−1}. (Sometimes such a matrix is called diagonalizable, since there exists a matrix X so that X^{−1}AX = Λ or, equivalently, A = XΛX^{−1}.)

From "Notes on Eigenvalues and Eigenvectors" we know then that the columns of X equal eigenvectors of A and the elements on the diagonal of Λ equal the eigenvalues:

    X = ( x0  x1  ···  x_{m−1} )   and   Λ = diag(λ0, λ1, ..., λ_{m−1}),

so that

    A x_i = λ_i x_i   for i = 0, ..., m−1.

For most of this section we will assume that

|λ0|> |λ1| ≥ · · · ≥ |λm−1|.

In particular, λ0 is the eigenvalue with maximal absolute value.

14.1.1 First attempt

Now, let v^{(0)} ∈ C^m be an "initial guess". Our (first attempt at the) Power Method iterates as follows:

    for k = 0, . . .
       v^{(k+1)} = A v^{(k)}
    endfor

Clearly v^{(k)} = A^k v^{(0)}. Let

    v^{(0)} = X y = ψ0 x0 + ψ1 x1 + ··· + ψ_{m−1} x_{m−1}.

What does this mean? We view the columns of X as forming a basis for C^m, and then the elements in vector y = X^{−1} v^{(0)} equal the coefficients for describing v^{(0)} in that basis. Then

    v^{(1)} = A v^{(0)}   = ψ0 λ0 x0 + ψ1 λ1 x1 + ··· + ψ_{m−1} λ_{m−1} x_{m−1},
    v^{(2)} = A v^{(1)}   = ψ0 λ0² x0 + ψ1 λ1² x1 + ··· + ψ_{m−1} λ_{m−1}² x_{m−1},
    ⋮
    v^{(k)} = A v^{(k−1)} = ψ0 λ0^k x0 + ψ1 λ1^k x1 + ··· + ψ_{m−1} λ_{m−1}^k x_{m−1}.

Page 242: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 14. Notes on the Power Method and Related Methods 226

Now, as long as ψ0 ≠ 0, clearly ψ0 λ0^k x0 will eventually dominate, which means that v^{(k)} will start pointing in the direction of x0. In other words, it will start pointing in the direction of an eigenvector corresponding to λ0. The problem is that it will become infinitely long if |λ0| > 1 or infinitesimally short if |λ0| < 1. All is good if |λ0| = 1.

14.1.2 Second attempt

Again, let v^{(0)} ∈ C^m be an "initial guess". The second attempt at the Power Method iterates as follows:

    for k = 0, . . .
       v^{(k+1)} = A v^{(k)}/λ0
    endfor

It is not hard to see that then

    v^{(k)} = A v^{(k−1)}/λ0 = A^k v^{(0)}/λ0^k
            = ψ0 (λ0/λ0)^k x0 + ψ1 (λ1/λ0)^k x1 + ··· + ψ_{m−1} (λ_{m−1}/λ0)^k x_{m−1}
            = ψ0 x0 + ψ1 (λ1/λ0)^k x1 + ··· + ψ_{m−1} (λ_{m−1}/λ0)^k x_{m−1}.

Clearly lim_{k→∞} v^{(k)} = ψ0 x0, as long as ψ0 ≠ 0, since |λ_j/λ0| < 1 for j > 0.

Another way of stating this is to notice that

    A^k = (A A ··· A)   [k times]
        = (XΛX^{−1})(XΛX^{−1})···(XΛX^{−1})
        = X Λ^k X^{−1},

so that

    v^{(k)} = A^k v^{(0)}/λ0^k = A^k X y/λ0^k = X Λ^k X^{−1} X y/λ0^k = X Λ^k y/λ0^k
            = X (Λ^k/λ0^k) y = X diag(1, (λ1/λ0)^k, ..., (λ_{m−1}/λ0)^k) y.


Now, since |λ_j/λ0| < 1 for j ≥ 1, we can argue that

    lim_{k→∞} v^{(k)} = lim_{k→∞} X diag(1, (λ1/λ0)^k, ..., (λ_{m−1}/λ0)^k) y
                      = X diag(1, 0, ..., 0) y
                      = X (ψ0 e0) = ψ0 X e0 = ψ0 x0.

Thus, as long as ψ0 ≠ 0 (which means v^{(0)} must have a component in the direction of x0), this method will eventually yield a vector in the direction of x0. However, this time the problem is that we don't know λ0 when we start.

14.1.3 Convergence

Before we make the algorithm practical, let us examine how fast the iteration converges. This requires a few definitions regarding rates of convergence.

Definition 14.1 Let α0, α1, α2, ... ∈ C be an infinite sequence of scalars. Then αk is said to converge to α if

    lim_{k→∞} |αk − α| = 0.

Let x0, x1, x2, ... ∈ C^m be an infinite sequence of vectors. Then xk is said to converge to x in the ‖·‖ norm if

    lim_{k→∞} ‖xk − x‖ = 0.

Notice that because of the equivalence of norms, if the sequence converges in one norm, it converges in all norms.

Definition 14.2 Let α0, α1, α2, ... ∈ C be an infinite sequence of scalars that converges to α. Then

• αk is said to converge linearly to α if for large enough k

    |α_{k+1} − α| ≤ C|αk − α|

  for some constant C < 1.

• αk is said to converge super-linearly to α if

    |α_{k+1} − α| ≤ C_k|αk − α|

  with C_k → 0.

• αk is said to converge quadratically to α if for large enough k

    |α_{k+1} − α| ≤ C|αk − α|²

  for some constant C.

• αk is said to converge super-quadratically to α if

    |α_{k+1} − α| ≤ C_k|αk − α|²

  with C_k → 0.

• αk is said to converge cubically to α if for large enough k

    |α_{k+1} − α| ≤ C|αk − α|³

  for some constant C.

Linear convergence can be slow. Let's say that for k ≥ K we observe that

    |α_{k+1} − α| ≤ C|αk − α|.

Then, clearly, |α_{k+n} − α| ≤ C^n|αk − α|. If C = 0.99, progress may be very, very slow. If |αk − α| = 1, then

    |α_{k+1} − α| ≤ 0.99000
    |α_{k+2} − α| ≤ 0.98010
    |α_{k+3} − α| ≤ 0.97030
    |α_{k+4} − α| ≤ 0.96060
    |α_{k+5} − α| ≤ 0.95099
    |α_{k+6} − α| ≤ 0.94148
    |α_{k+7} − α| ≤ 0.93206
    |α_{k+8} − α| ≤ 0.92274
    |α_{k+9} − α| ≤ 0.91351

Quadratic convergence is fast. Now

    |α_{k+1} − α| ≤ C|αk − α|²
    |α_{k+2} − α| ≤ C|α_{k+1} − α|² ≤ C(C|αk − α|²)² = C³|αk − α|⁴
    |α_{k+3} − α| ≤ C|α_{k+2} − α|² ≤ C(C³|αk − α|⁴)² = C⁷|αk − α|⁸
    ⋮
    |α_{k+n} − α| ≤ C^{2^n − 1}|αk − α|^{2^n}.

Even with C = 0.99 and |αk − α| = 1,

    |α_{k+1} − α|  ≤ 0.99000
    |α_{k+2} − α|  ≤ 0.970299
    |α_{k+3} − α|  ≤ 0.932065
    |α_{k+4} − α|  ≤ 0.860058
    |α_{k+5} − α|  ≤ 0.732303
    |α_{k+6} − α|  ≤ 0.530905
    |α_{k+7} − α|  ≤ 0.279042
    |α_{k+8} − α|  ≤ 0.077085
    |α_{k+9} − α|  ≤ 0.005882
    |α_{k+10} − α| ≤ 0.000034


If we consider α the correct result then, eventually, the number of correct digits roughly doubles in each iteration. This can be explained as follows: if |αk − α| < 1, then the number of correct decimal digits is given by

    −log10 |αk − α|.

Since log10 is a monotonically increasing function,

    log10 |α_{k+1} − α| ≤ log10 (C|αk − α|²) = log10 C + 2 log10 |αk − α| ≤ 2 log10 |αk − α|

(the last step uses log10 C < 0, since C < 1), and hence

    −log10 |α_{k+1} − α|  ≥  2 (−log10 |αk − α|),

where −log10 |αk − α| is the number of correct digits in αk.

Cubic convergence is dizzyingly fast: eventually the number of correct digits triples from one iteration to the next.

We now define a convenient norm.

Lemma 14.3 Let X ∈ C^{m×m} be nonsingular. Define ‖·‖_X : C^m → R by ‖y‖_X = ‖Xy‖ for some given norm ‖·‖ : C^m → R. Then ‖·‖_X is a norm.

Homework 14.4 Prove Lemma 14.3.
* SEE ANSWER

With this new norm, we can do our convergence analysis:

    v^{(k)} − ψ0 x0 = A^k v^{(0)}/λ0^k − ψ0 x0
                    = X diag(1, (λ1/λ0)^k, ..., (λ_{m−1}/λ0)^k) X^{−1} v^{(0)} − ψ0 x0
                    = X diag(1, (λ1/λ0)^k, ..., (λ_{m−1}/λ0)^k) y − ψ0 x0
                    = X diag(0, (λ1/λ0)^k, ..., (λ_{m−1}/λ0)^k) y.

Hence

    X^{−1}(v^{(k)} − ψ0 x0) = diag(0, (λ1/λ0)^k, ..., (λ_{m−1}/λ0)^k) y


and

    X^{−1}(v^{(k+1)} − ψ0 x0) = diag(0, λ1/λ0, ..., λ_{m−1}/λ0) X^{−1}(v^{(k)} − ψ0 x0).

Now, let ‖·‖ be a p-norm¹ and its induced matrix norm, and let ‖·‖_{X^{−1}} be as defined in Lemma 14.3. Then

    ‖v^{(k+1)} − ψ0 x0‖_{X^{−1}} = ‖X^{−1}(v^{(k+1)} − ψ0 x0)‖
                                 = ‖diag(0, λ1/λ0, ..., λ_{m−1}/λ0) X^{−1}(v^{(k)} − ψ0 x0)‖
                                 ≤ |λ1/λ0| ‖X^{−1}(v^{(k)} − ψ0 x0)‖
                                 = |λ1/λ0| ‖v^{(k)} − ψ0 x0‖_{X^{−1}}.

This shows that, in this norm, the convergence of v^{(k)} to ψ0 x0 is linear: the difference between the current approximation, v^{(k)}, and the solution, ψ0 x0, is reduced by at least a constant factor in each iteration.

14.1.4 Practical Power Method

The following algorithm, known as the Power Method, avoids the problem of v^{(k)} growing or shrinking in length, without requiring λ0 to be known, by scaling it to be of unit length at each step:

    for k = 0, . . .
       v^{(k+1)} = A v^{(k)}
       v^{(k+1)} = v^{(k+1)}/‖v^{(k+1)}‖
    endfor
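A minimal Python/NumPy sketch of this practical iteration (ours; the name power_method is hypothetical, and the Rayleigh quotient of Section 14.1.5 below is used to recover the eigenvalue estimate):

    import numpy as np

    def power_method(A, num_iters=1000, seed=0):
        rng = np.random.default_rng(seed)
        v = rng.standard_normal(A.shape[0])   # random start: psi0 != 0 almost surely
        for _ in range(num_iters):
            v = A @ v                         # v^(k+1) := A v^(k)
            v /= np.linalg.norm(v)            # rescale to unit length each step
        return v @ (A @ v), v                 # (Rayleigh quotient, eigenvector approx.)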

14.1.5 The Rayleigh quotient

A question is how to extract an approximation of λ0 given an approximation of x0. The following theorem provides the answer:

Theorem 14.5 If x is an eigenvector of A then λ = x^H A x/(x^H x) is the associated eigenvalue of A. This ratio is known as the Rayleigh quotient.

Proof: Let x be an eigenvector of A and λ the associated eigenvalue. Then Ax = λx. Multiplying on the left by x^H yields x^H A x = λ x^H x, which, since x ≠ 0, means that λ = x^H A x/(x^H x).

Clearly this ratio is a continuous function of x; hence an approximation of x0, when plugged into this formula, yields an approximation of λ0.

1We choose a p-norm to make sure that the norm of a diagonal matrix equals the absolute value of the largest element (inmagnitude) on its diagonal.


14.1.6 What if |λ0| ≥ |λ1|?

Now, what if

    |λ0| = ··· = |λ_{k−1}| > |λk| ≥ ··· ≥ |λ_{m−1}|?

By extending the above analysis one can easily show that v^{(k)} will converge to a vector in the subspace spanned by the eigenvectors associated with λ0, ..., λ_{k−1}.

An important special case is when k = 2: if A is real valued then λ0 may still be complex valued, in which case its conjugate λ̄0 is also an eigenvalue and has the same magnitude as λ0. We deduce that v^{(k)} will always be in the space spanned by the eigenvectors corresponding to λ0 and λ̄0.

14.2 The Inverse Power Method

The Power Method homes in on an eigenvector associated with the largest (in magnitude) eigenvalue. The Inverse Power Method homes in on an eigenvector associated with the smallest (in magnitude) eigenvalue.

Throughout this section we will assume that a given matrix A ∈ C^{m×m} is nondefective and nonsingular, so that there exist a nonsingular matrix X and a diagonal matrix Λ such that A = XΛX^{−1}. We further assume that Λ = diag(λ0, ..., λ_{m−1}) and

    |λ0| ≥ |λ1| ≥ ··· ≥ |λ_{m−2}| > |λ_{m−1}|.

Theorem 14.6 Let A ∈ C^{m×m} be nonsingular. Then λ and x are an eigenvalue and associated eigenvector of A if and only if 1/λ and x are an eigenvalue and associated eigenvector of A^{−1}.

Homework 14.7 Assume that

    |λ0| ≥ |λ1| ≥ ··· ≥ |λ_{m−2}| > |λ_{m−1}| > 0.

Show that

    |1/λ_{m−1}| > |1/λ_{m−2}| ≥ |1/λ_{m−3}| ≥ ··· ≥ |1/λ0|.

* SEE ANSWER

Thus, an eigenvector associated with the smallest (in magnitude) eigenvalue of A is an eigenvector associated with the largest (in magnitude) eigenvalue of A^{−1}. This suggests the following naive iteration:

    for k = 0, . . .
       v^{(k+1)} = A^{−1} v^{(k)}
       v^{(k+1)} = λ_{m−1} v^{(k+1)}
    endfor

Of course, we would want to factor A = LU once and solve L(U v^{(k+1)}) = v^{(k)} rather than multiplying with A^{−1}. From the analysis of the convergence of the "second attempt" for a Power Method algorithm we conclude that now

    ‖v^{(k+1)} − ψ_{m−1} x_{m−1}‖_{X^{−1}} ≤ |λ_{m−1}/λ_{m−2}| ‖v^{(k)} − ψ_{m−1} x_{m−1}‖_{X^{−1}}.

A practical Inverse Power Method algorithm is given by

A practical Inverse Power Method algorithm is given by


    for k = 0, . . .
       v^{(k+1)} = A^{−1} v^{(k)}
       v^{(k+1)} = v^{(k+1)}/‖v^{(k+1)}‖
    endfor
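A sketch in Python (ours; it assumes SciPy, and the name inverse_power_method is hypothetical) that factors A = LU once, as suggested above, so that each iteration costs only two triangular solves:

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    def inverse_power_method(A, num_iters=100, seed=0):
        lu_piv = lu_factor(A)                 # factor A = P L U once
        rng = np.random.default_rng(seed)
        v = rng.standard_normal(A.shape[0])
        for _ in range(num_iters):
            v = lu_solve(lu_piv, v)           # v := A^{-1} v via the LU factors
            v /= np.linalg.norm(v)
        return v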

Often, we would expect the Inverse Power Method to converge faster than the Power Method. For example, take the case where the |λk| are equally spaced between 0 and m: |λk| = m − k. Then

    |λ1/λ0| = (m−1)/m   and   |λ_{m−1}/λ_{m−2}| = 1/2,

which means that the Power Method converges much more slowly than the Inverse Power Method.

14.3 Rayleigh-quotient Iteration

The next observation is captured in the following lemma:

Lemma 14.8 Let A ∈ C^{m×m} and µ ∈ C. Then (λ, x) is an eigenpair of A if and only if (λ − µ, x) is an eigenpair of (A − µI).

Homework 14.9 Prove Lemma 14.8.* SEE ANSWER

The matrix A − µI is referred to as the matrix A that has been "shifted" by µ. What the lemma says is that shifting A by µ shifts the spectrum of A by µ:

Lemma 14.10 Let A ∈ Cm×m, A = XΛX−1 and µ ∈ C. Then A−µI = X(Λ−µI)X−1.

Homework 14.11 Prove Lemma 14.10.* SEE ANSWER

This suggests the following (naive) iteration: pick a value µ close to λ_{m−1} and iterate

    for k = 0, . . .
       v^{(k+1)} = (A − µI)^{−1} v^{(k)}
       v^{(k+1)} = (λ_{m−1} − µ) v^{(k+1)}
    endfor

Of course one would solve (A − µI)v^{(k+1)} = v^{(k)} rather than computing and applying the inverse of A − µI. If we index the eigenvalues so that |λ0 − µ| ≥ ··· ≥ |λ_{m−2} − µ| > |λ_{m−1} − µ|, then

    ‖v^{(k+1)} − ψ_{m−1} x_{m−1}‖_{X^{−1}} ≤ |(λ_{m−1} − µ)/(λ_{m−2} − µ)| ‖v^{(k)} − ψ_{m−1} x_{m−1}‖_{X^{−1}}.

The closer to λm−1 the “shift” (so named because it shifts the spectrum of A) is chosen, the more favorablethe ratio that dictates convergence.

A more practical algorithm is given by


    for k = 0, . . .
       v^{(k+1)} = (A − µI)^{−1} v^{(k)}
       v^{(k+1)} = v^{(k+1)}/‖v^{(k+1)}‖
    endfor

The question now becomes how to choose µ so that it is a good guess for λ_{m−1}. Often an application inherently supplies a reasonable approximation for the smallest eigenvalue or for an eigenvalue of particular interest. Moreover, we know that eventually v^{(k)} becomes a good approximation for x_{m−1}, and therefore the Rayleigh quotient gives us a way to find a good approximation for λ_{m−1}. This suggests the (naive) Rayleigh-quotient iteration:

    for k = 0, . . .
       µk = v^{(k)H} A v^{(k)}/(v^{(k)H} v^{(k)})
       v^{(k+1)} = (A − µk I)^{−1} v^{(k)}
       v^{(k+1)} = (λ_{m−1} − µk) v^{(k+1)}
    endfor

Now²

    ‖v^{(k+1)} − ψ_{m−1} x_{m−1}‖_{X^{−1}} ≤ |(λ_{m−1} − µk)/(λ_{m−2} − µk)| ‖v^{(k)} − ψ_{m−1} x_{m−1}‖_{X^{−1}}

with

    lim_{k→∞} (λ_{m−1} − µk) = 0,

which means superlinear convergence is observed. In fact, it can be shown that once k is large enough

    ‖v^{(k+1)} − ψ_{m−1} x_{m−1}‖_{X^{−1}} ≤ C ‖v^{(k)} − ψ_{m−1} x_{m−1}‖²_{X^{−1}},

which is known as quadratic convergence. Roughly speaking this means that every iteration doubles the number of correct digits in the current approximation. To prove this, one shows that |λ_{m−1} − µk| ≤ C‖v^{(k)} − ψ_{m−1} x_{m−1}‖_{X^{−1}}.

Better yet, it can be shown that if A is Hermitian, then, once k is large enough,

    ‖v^{(k+1)} − ψ_{m−1} x_{m−1}‖_{X^{−1}} ≤ C ‖v^{(k)} − ψ_{m−1} x_{m−1}‖³_{X^{−1}},

which is known as cubic convergence. Roughly speaking this means that every iteration triples the number of correct digits in the current approximation. This is mind-bogglingly fast convergence!

A practical Rayleigh quotient iteration is given by

    v^{(0)} = v^{(0)}/‖v^{(0)}‖₂
    for k = 0, . . .
       µk = v^{(k)H} A v^{(k)}   (now ‖v^{(k)}‖₂ = 1)
       v^{(k+1)} = (A − µk I)^{−1} v^{(k)}
       v^{(k+1)} = v^{(k+1)}/‖v^{(k+1)}‖
    endfor
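A minimal Python/NumPy sketch of this practical iteration (ours; the name rayleigh_quotient_iteration is hypothetical). Note that the shift changes every iteration, so a fresh solve/factorization is needed each step, and near convergence A − µk I becomes nearly singular, which a production code would guard against:

    import numpy as np

    def rayleigh_quotient_iteration(A, v0, num_iters=20):
        n = A.shape[0]
        v = v0 / np.linalg.norm(v0)
        for _ in range(num_iters):
            mu = v @ (A @ v)                            # Rayleigh quotient (||v||_2 = 1)
            v = np.linalg.solve(A - mu * np.eye(n), v)  # solve, don't invert
            v /= np.linalg.norm(v)
        return v @ (A @ v), v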

2 I think... I have not checked this thoroughly. But the general idea holds. λm−1 has to be defined as the eigenvalue to whichthe method eventually converges.


Chapter 15

Notes on the QR Algorithm and other Dense Eigensolvers

Video

Read disclaimer regarding the videos in the preface!

Tragically, the camera ran out of memory for the first lecture... Here is the second lecture, which discusses the implicit QR algorithm:

* YouTube
* Download from UT Box
* View After Local Download

(For help on viewing, see Appendix A.)

In most of this note, we focus on the case where A is symmetric and real valued. The reason for this is that many of the techniques can be more easily understood in that setting.

that many of the techniques can be more easily understood in that setting.


Outline

Video
Outline
15.1. Preliminaries
15.2. Subspace Iteration
15.3. The QR Algorithm
  15.3.1. A basic (unshifted) QR algorithm
  15.3.2. A basic shifted QR algorithm
15.4. Reduction to Tridiagonal Form
  15.4.1. Householder transformations (reflectors)
  15.4.2. Algorithm
15.5. The QR algorithm with a Tridiagonal Matrix
  15.5.1. Givens' rotations
15.6. QR Factorization of a Tridiagonal Matrix
15.7. The Implicitly Shifted QR Algorithm
  15.7.1. Upper Hessenberg and tridiagonal matrices
  15.7.2. The Implicit Q Theorem
  15.7.3. The Francis QR Step
  15.7.4. A complete algorithm
15.8. Further Reading
  15.8.1. More on reduction to tridiagonal form
  15.8.2. Optimizing the tridiagonal QR algorithm
15.9. Other Algorithms
  15.9.1. Jacobi's method for the symmetric eigenvalue problem
  15.9.2. Cuppen's Algorithm
  15.9.3. The Method of Multiple Relatively Robust Representations (MRRR)
15.10. The Nonsymmetric QR Algorithm
  15.10.1. A variant of the Schur decomposition
  15.10.2. Reduction to upper Hessenberg form
  15.10.3. The implicitly double-shifted QR algorithm


15.1 Preliminaries

The QR algorithm is a standard method for computing all eigenvalues and eigenvectors of a matrix. In this note, we focus on the real valued symmetric eigenvalue problem (the case where A ∈ R^{n×n} is symmetric). For this case, recall the Spectral Decomposition Theorem:

Theorem 15.1 If A ∈ R^{n×n} is symmetric then there exist a unitary matrix Q and a diagonal matrix Λ such that A = QΛQ^T.

We will partition Q = ( q0 ··· q_{n−1} ) and assume that Λ = diag(λ0, ..., λ_{n−1}), so that throughout this note qi and λi refer to the ith column of Q and the ith diagonal element of Λ, which means that each tuple (λi, qi) is an eigenpair.

15.2 Subspace Iteration

We start with a matrix V ∈ R^{n×r} with normalized columns and iterate something like

    V^{(0)} = V
    for k = 0, . . . until convergence
       V^{(k+1)} = A V^{(k)}
       Normalize the columns to be of unit length.
    end for

The problem with this approach is that all columns will (likely) converge to an eigenvector associated with the dominant eigenvalue, since the Power Method is being applied to all columns simultaneously. We will now lead the reader through a succession of insights towards a practical algorithm.

Let us examine what V̂ = AV looks like, for the simple case where V = ( v0 v1 v2 ) (three columns). We know that

    v_j = Q (Q^T v_j) = Q y_j.

Hence

    v0 = Σ_{j=0}^{n−1} ψ_{0,j} q_j,
    v1 = Σ_{j=0}^{n−1} ψ_{1,j} q_j,   and
    v2 = Σ_{j=0}^{n−1} ψ_{2,j} q_j,

Page 254: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 15. Notes on the QR Algorithm and other Dense Eigensolvers 238

where ψ_{i,j} equals the jth element of y_i. Then

    AV = A ( v0  v1  v2 )
       = ( Σ_{j=0}^{n−1} ψ_{0,j} A q_j    Σ_{j=0}^{n−1} ψ_{1,j} A q_j    Σ_{j=0}^{n−1} ψ_{2,j} A q_j )
       = ( Σ_{j=0}^{n−1} ψ_{0,j} λ_j q_j  Σ_{j=0}^{n−1} ψ_{1,j} λ_j q_j  Σ_{j=0}^{n−1} ψ_{2,j} λ_j q_j ).

• If we happened to know λ0, λ1, and λ2, then we could divide the columns by these, respectively, and get new vectors

    ( v̂0 v̂1 v̂2 ) = ( Σ_{j=0}^{n−1} ψ_{0,j}(λ_j/λ0) q_j   Σ_{j=0}^{n−1} ψ_{1,j}(λ_j/λ1) q_j   Σ_{j=0}^{n−1} ψ_{2,j}(λ_j/λ2) q_j )

                 = ( ψ_{0,0} q0 + Σ_{j=1}^{n−1} ψ_{0,j}(λ_j/λ0) q_j ,

                     ψ_{1,0}(λ0/λ1) q0 + ψ_{1,1} q1 + Σ_{j=2}^{n−1} ψ_{1,j}(λ_j/λ1) q_j ,

                     ψ_{2,0}(λ0/λ2) q0 + ψ_{2,1}(λ1/λ2) q1 + ψ_{2,2} q2 + Σ_{j=3}^{n−1} ψ_{2,j}(λ_j/λ2) q_j )      (15.1)

• Assume that |λ0| > |λ1| > |λ2| > |λ3| ≥ ··· ≥ |λ_{n−1}|. Then, similar as for the power method,

  – The first column will see components in the direction of q1, ..., q_{n−1} shrink relative to the component in the direction of q0.

  – The second column will see components in the direction of q2, ..., q_{n−1} shrink relative to the component in the direction of q1, but the component in the direction of q0 increases, relatively, since |λ0/λ1| > 1.

  – The third column will see components in the direction of q3, ..., q_{n−1} shrink relative to the component in the direction of q2, but the components in the directions of q0 and q1 increase, relatively, since |λ0/λ2| > 1 and |λ1/λ2| > 1.

How can we make it so that v j converges to a vector in the direction of q j?

• If we happen to know q0, then we can subtract out the component of

    v1 = ψ_{1,0}(λ0/λ1) q0 + ψ_{1,1} q1 + Σ_{j=2}^{n−1} ψ_{1,j}(λ_j/λ1) q_j

  in the direction of q0:

    v1 − q0^T v1 q0 = ψ_{1,1} q1 + Σ_{j=2}^{n−1} ψ_{1,j}(λ_j/λ1) q_j,

  so that we are left with the component in the direction of q1 and components in directions of q2, ..., q_{n−1} that are suppressed every time through the loop.

• Similarly, if we also know q1, the components of v2 in the direction of q0 and q1 can be subtracted from that vector.

• Similarly, if we also know q1, the components of v2 in the direction of q0 and q1 can be subtractedfrom that vector.


• We do not know λ0, λ1, and λ2, but from the discussion about the Power Method we remember that we can just normalize the so updated v0, v1, and v2 to have unit length.

How can we make these insights practical?

• We do not know q0, q1, and q2, but we can informally argue that if we keep iterating,

  – The vector v0, normalized in each step, will eventually point in the direction of q0.

  – S(v0, v1) will eventually equal S(q0, q1).

  – In each iteration, we can subtract the component of v1 in the direction of v0 from v1, and then normalize v1, so that it eventually results in a vector that points in the direction of q1.

  – S(v0, v1, v2) will eventually equal S(q0, q1, q2).

  – In each iteration, we can subtract the component of v2 in the directions of v0 and v1 from v2, and then normalize the result, to make v2 eventually point in the direction of q2.

What we recognize is that normalizing v0, subtracting out the component of v1 in the direction of v0, and then normalizing v1, etc., is exactly what the Gram-Schmidt process does. And thus, we can use any convenient (and stable) QR factorization method. This also shows how the method can be generalized to work with more than three columns and even all columns simultaneously.

The algorithm now becomes:

    V^{(0)} = I_{n×p}   (I_{n×p} represents the first p columns of I)
    for k = 0, . . . until convergence
       A V^{(k)} → V^{(k+1)} R^{(k+1)}   (QR factorization with R^{(k+1)} ∈ R^{p×p})
    end for
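A minimal Python/NumPy sketch of this subspace iteration (ours; the name subspace_iteration is hypothetical), where the reduced QR factorization performs the normalization and Gram-Schmidt-like subtraction in one step:

    import numpy as np

    def subspace_iteration(A, p, num_iters=100):
        n = A.shape[0]
        V = np.eye(n, p)                  # V^(0) = first p columns of I
        for _ in range(num_iters):
            V, R = np.linalg.qr(A @ V)    # A V^(k) -> V^(k+1) R^(k+1)
        return V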

Now consider again (15.1), focusing on the third column:

      ψ_{2,0}(λ0/λ2) q0 + ψ_{2,1}(λ1/λ2) q1 + ψ_{2,2} q2 + Σ_{j=3}^{n−1} ψ_{2,j}(λ_j/λ2) q_j
    = ψ_{2,0}(λ0/λ2) q0 + ψ_{2,1}(λ1/λ2) q1 + ψ_{2,2} q2 + ψ_{2,3}(λ3/λ2) q3 + Σ_{j=4}^{n−1} ψ_{2,j}(λ_j/λ2) q_j.

This shows that, if the components in the direction of q0 and q1 are subtracted out, it is the component in the direction of q3 that is diminished in length most slowly, dictated by the ratio |λ3/λ2|. This, of course, generalizes: the jth column of V^{(k)}, v_j^{(k)}, will have a component in the direction of q_{j+1}, of length |q_{j+1}^T v_j^{(k)}|, that can be expected to shrink most slowly.

We demonstrate this in Figure 15.1, which shows the execution of the algorithm with p = n for a 5×5 matrix, and shows how the |q_{j+1}^T v_j^{(k)}| converge to zero as a function of k.


Figure 15.1: Convergence of the subspace iteration for a 5×5 matrix. (This graph is mislabeled: x should be labeled with v.) The (linear) convergence of v_j to a vector in the direction of q_j is dictated by how quickly the component in the direction of q_{j+1} converges to zero. The line labeled |q_{j+1}^T x_j| plots the length of the component in the direction of q_{j+1} as a function of the iteration number.

Next, we observe that if V ∈ R^{n×n} in the above iteration (which means we are iterating with n vectors at a time), then AV yields a next-to-last column of the form

    Σ_{j=0}^{n−3} ψ_{n−2,j}(λ_j/λ_{n−2}) q_j + ψ_{n−2,n−2} q_{n−2} + ψ_{n−2,n−1}(λ_{n−1}/λ_{n−2}) q_{n−1},

where ψ_{i,j} = q_j^T v_i. Thus, given that the components in the direction of q_j, j = 0, ..., n−2, can be expected in later iterations to be greatly reduced by the QR factorization that subsequently happens with AV, we notice that it is |λ_{n−1}/λ_{n−2}| that dictates how fast the component in the direction of q_{n−1} disappears from v_{n−2}^{(k)}. This is a ratio we also saw in the Inverse Power Method and that we noticed we could accelerate in the Rayleigh Quotient Iteration: at each iteration we should shift the matrix to (A − µk I), where µk ≈ λ_{n−1}. Since the last column of V^{(k)} is supposed to be converging to q_{n−1}, it seems reasonable to use µk = v_{n−1}^{(k)T} A v_{n−1}^{(k)} (recall that v_{n−1}^{(k)} has unit length, so this is the Rayleigh quotient).

The above discussion motivates the iteration


[Figure 15.2 is a semi-log plot: the lengths of the components |q_j^T x_4^{(k)}|, j = 0, ..., 3, on the y-axis (10^{−15} to 10^0) versus the iteration number k (0 to 40).]

Figure 15.2: Convergence of the shifted subspace iteration for a 5×5 matrix. (This graph is mislabeled: x should be labeled with v.) What this graph shows is that the components of v4 in the directions q0 through q3 disappear very quickly. The vector v4 quickly points in the direction of the eigenvector associated with the smallest (in magnitude) eigenvalue. Just like the Rayleigh-quotient iteration is not guaranteed to converge to the eigenvector associated with the smallest (in magnitude) eigenvalue, the shifted subspace iteration may home in on a different eigenvector than the one associated with the smallest (in magnitude) eigenvalue. (Something is wrong in this graph: all curves should quickly drop to (near) zero!)

    V^{(0)} := I   (V^{(0)} ∈ R^{n×n}!)
    for k := 0, . . . until convergence
       µk := v_{n−1}^{(k)T} A v_{n−1}^{(k)}   (Rayleigh quotient)
       (A − µk I) V^{(k)} → V^{(k+1)} R^{(k+1)}   (QR factorization)
    end for

Notice that this does not require one to solve with (A − µk I), unlike in the Rayleigh Quotient Iteration. However, it does require a QR factorization, which requires more computation than an LU factorization (approximately (4/3)n³ flops).

We demonstrate the convergence in Figure 15.2, which shows the execution of the algorithm with a 5×5 matrix and illustrates how the |q_j^T v_{n−1}^{(k)}| converge to zero as a function of k.


Subspace iteration:

    Â^{(0)} := A
    V̂^{(0)} := I
    for k := 0, . . . until convergence
       A V̂^{(k)} → V̂^{(k+1)} R̂^{(k+1)}   (QR factorization)
       Â^{(k+1)} := V̂^{(k+1)T} A V̂^{(k+1)}
    end for

QR algorithm:

    A^{(0)} := A
    V^{(0)} := I
    for k := 0, . . . until convergence
       A^{(k)} → Q^{(k+1)} R^{(k+1)}   (QR factorization)
       A^{(k+1)} := R^{(k+1)} Q^{(k+1)}
       V^{(k+1)} := V^{(k)} Q^{(k+1)}
    end for

Figure 15.3: Basic subspace iteration and basic QR algorithm.

15.3 The QR Algorithm

The QR algorithm is a classic algorithm for computing all eigenvalues and eigenvectors of a matrix. While we explain it for the symmetric eigenvalue problem, it generalizes to the nonsymmetric eigenvalue problem as well.

15.3.1 A basic (unshifted) QR algorithm

We have informally argued that the columns of the orthogonal matrices V^{(k)} ∈ R^{n×n} generated by the (unshifted) subspace iteration converge to eigenvectors of matrix A. (The exact conditions under which this happens have not been fully discussed.) In Figure 15.3 (left), we restate the subspace iteration. In it, we denote the matrices V^{(k)} and R^{(k)} computed by the subspace iteration by V̂^{(k)} and R̂^{(k)}, to distinguish them from the ones computed by the algorithm on the right. The algorithm on the left also computes the matrix Â^{(k)} = V̂^{(k)T} A V̂^{(k)}, a matrix that hopefully converges to Λ, the diagonal matrix with the eigenvalues of A on its diagonal. To the right is the QR algorithm. The claim is that the two algorithms compute the same quantities.

Homework 15.2 Prove that in Figure 15.3, V̂^{(k)} = V^{(k)} and Â^{(k)} = A^{(k)}, k = 0, . . ..
* SEE ANSWER

We conclude that if V^{(k)} converges to the matrix of orthonormal eigenvectors when the subspace iteration is applied to V^{(0)} = I, then A^{(k)} converges to the diagonal matrix with eigenvalues along the diagonal.
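A minimal Python/NumPy sketch of this basic (unshifted) QR algorithm (ours; the name qr_algorithm is hypothetical):

    import numpy as np

    def qr_algorithm(A, num_iters=200):
        # Basic unshifted QR algorithm sketch for symmetric A: under suitable
        # conditions A^(k) converges to a diagonal matrix of eigenvalues, and
        # the accumulated V converges to the matrix of eigenvectors.
        Ak = A.copy()
        V = np.eye(A.shape[0])
        for _ in range(num_iters):
            Q, R = np.linalg.qr(Ak)
            Ak = R @ Q                 # A^(k+1) := R Q = Q^T A^(k) Q
            V = V @ Q                  # V^(k+1) := V^(k) Q
        return Ak, V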

15.3.2 A basic shifted QR algorithm

In Figure 15.4 (left), we restate the subspace iteration with shifting. In it, we denote the matrices V^{(k)} and R^{(k)} computed by the shifted subspace iteration by V̂^{(k)} and R̂^{(k)}, to distinguish them from the ones computed by the algorithm on the right. The algorithm on the left also computes the matrix Â^{(k)} = V̂^{(k)T} A V̂^{(k)}, a matrix that hopefully converges to Λ, the diagonal matrix with the eigenvalues of A on its diagonal. To the right is the shifted QR algorithm. The claim is that the two algorithms compute the same quantities.


Shifted subspace iteration:

    Â^{(0)} := A
    V̂^{(0)} := I
    for k := 0, . . . until convergence
       µk := v̂_{n−1}^{(k)T} A v̂_{n−1}^{(k)}
       (A − µk I) V̂^{(k)} → V̂^{(k+1)} R̂^{(k+1)}   (QR factorization)
       Â^{(k+1)} := V̂^{(k+1)T} A V̂^{(k+1)}
    end for

Shifted QR algorithm:

    A^{(0)} := A
    V^{(0)} := I
    for k := 0, . . . until convergence
       µk := α_{n−1,n−1}^{(k)}
       A^{(k)} − µk I → Q^{(k+1)} R^{(k+1)}   (QR factorization)
       A^{(k+1)} := R^{(k+1)} Q^{(k+1)} + µk I
       V^{(k+1)} := V^{(k)} Q^{(k+1)}
    end for

Figure 15.4: Basic shifted subspace iteration and basic shifted QR algorithm.

Homework 15.3 Prove that in Figure 15.4, V̂^(k) = V^(k) and Â^(k) = A^(k), k = 1, . . ..
* SEE ANSWER

We conclude that if V̂^(k) converges to the matrix of orthonormal eigenvectors when the shifted subspace iteration is applied to V̂^(0) = I, then A^(k) converges to the diagonal matrix with the eigenvalues along the diagonal.

The convergence of the basic shifted QR algorithm is illustrated below. Pay particular attention to the convergence of the last row and column.

A^(0) =
\begin{bmatrix} 2.01131953448 & 0.05992695085 & 0.14820940917 \\ 0.05992695085 & 2.30708673171 & 0.93623515213 \\ 0.14820940917 & 0.93623515213 & 1.68159373379 \end{bmatrix}

A^(1) =
\begin{bmatrix} 2.21466116574 & 0.34213192482 & 0.31816754245 \\ 0.34213192482 & 2.54202325042 & 0.57052186467 \\ 0.31816754245 & 0.57052186467 & 1.24331558383 \end{bmatrix}

A^(2) =
\begin{bmatrix} 2.63492207667 & 0.47798481637 & 0.07654607908 \\ 0.47798481637 & 2.35970859985 & 0.06905042811 \\ 0.07654607908 & 0.06905042811 & 1.00536932347 \end{bmatrix}

A^(3) =
\begin{bmatrix} 2.87588550968 & 0.32971207176 & 0.00024210487 \\ 0.32971207176 & 2.12411444949 & 0.00014361630 \\ 0.00024210487 & 0.00014361630 & 1.00000004082 \end{bmatrix}

A^(4) =
\begin{bmatrix} 2.96578660126 & 0.18177690194 & 0.00000000000 \\ 0.18177690194 & 2.03421339873 & 0.00000000000 \\ 0.00000000000 & 0.00000000000 & 1.00000000000 \end{bmatrix}

A^(5) =
\begin{bmatrix} 2.9912213907 & 0.093282073553 & 0.00000000000 \\ 0.0932820735 & 2.008778609226 & 0.00000000000 \\ 0.0000000000 & 0.000000000000 & 1.00000000000 \end{bmatrix}
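Convergence histories like the one above are easy to reproduce. The sketch below (again not part of the notes; NumPy assumed) implements the basic shifted QR algorithm of Figure 15.4 (right) with the simple shift µ_k = α^(k)_{n−1,n−1} and prints the iterates:

```python
import numpy as np

def shifted_qr_algorithm(A, num_iters=5):
    """Basic shifted QR algorithm as in Figure 15.4 (right)."""
    Ak = A.copy()
    n = Ak.shape[0]
    I = np.eye(n)
    for k in range(num_iters):
        mu = Ak[n - 1, n - 1]             # mu_k := alpha^(k)_{n-1,n-1}
        Q, R = np.linalg.qr(Ak - mu * I)  # A^(k) - mu_k I -> Q^(k+1) R^(k+1)
        Ak = R @ Q + mu * I               # A^(k+1) := R^(k+1) Q^(k+1) + mu_k I
        print(f"A({k + 1}) =\n", np.round(Ak, 11))
    return Ak
```

Running this on a symmetric matrix shows the off-diagonal elements of the last row and column collapsing rapidly, as in the sequence above.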

Once the off-diagonal elements of the last row and column have converged (are sufficiently small), the problem can be deflated by applying the following theorem:

Theorem 15.4 Let

A =
\begin{bmatrix}
A_{0,0} & A_{0,1} & \cdots & A_{0,N-1} \\
0 & A_{1,1} & \cdots & A_{1,N-1} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & A_{N-1,N-1}
\end{bmatrix},

where the A_{k,k} are all square. Then Λ(A) = ∪_{k=0}^{N−1} Λ(A_{k,k}).


Homework 15.5 Prove the above theorem.
* SEE ANSWER

In other words, once the last row and column have converged, the algorithm can continue with the submatrix that consists of the first n−1 rows and columns.

The problem with the QR algorithm, as stated, is that each iteration requires O(n³) operations, which is too expensive given that many iterations are required to find all eigenvalues and eigenvectors.

15.4 Reduction to Tridiagonal Form

In the next section, we will see that if A^(0) is a tridiagonal matrix, then so are all A^(k). This reduces the cost of each iteration from O(n³) to O(n). We first show how unitary similarity transformations can be used to reduce a matrix to tridiagonal form.

15.4.1 Householder transformations (reflectors)

We briefly review the main tool employed to reduce a matrix to tridiagonal form: the Householder transform, also known as a reflector. Full details were given in Chapter 6.

Definition 15.6 Let u ∈ R^n, τ ∈ R. Then H = H(u) = I − u u^T/τ, where τ = u^T u/2, is said to be a reflector or Householder transformation.

We observe:

• Let z be any vector that is perpendicular to u. Applying a Householder transform H(u) to z leaves the vector unchanged: H(u)z = z.

• Let any vector x be written as x = z + (u^T x)u, where z is perpendicular to u and (u^T x)u is the component of x in the direction of u. Then H(u)x = z − (u^T x)u.

This can be interpreted as follows: The space perpendicular to u acts as a "mirror": any vector in that space (along the mirror) is not reflected, while any other vector has the component that is orthogonal to the space (the component outside and orthogonal to the mirror) reversed in direction. Notice that a reflection preserves the length of the vector. Also, it is easy to verify that:

1. HH = I (reflecting the reflection of a vector results in the original vector);

2. H = H^T, and so H^T H = H H^T = I (a reflection is an orthogonal matrix and thus preserves the norm); and

3. if H_0, . . . , H_{k−1} are Householder transformations and Q = H_0 H_1 · · · H_{k−1}, then Q^T Q = Q Q^T = I (an accumulation of reflectors is an orthogonal matrix).

As part of the reduction to condensed form operations, given a vector x we will wish to find a Householder transformation, H(u), such that H(u)x equals a vector with zeroes below the first element: H(u)x = ∓‖x‖₂ e₀, where e₀ equals the first column of the identity matrix. It can be easily checked that choosing u = x ± ‖x‖₂ e₀ yields the desired H(u). Notice that any nonzero scaling of u has the same property, and the convention is to scale u so that the first element equals one. Let us define [u, τ, h] = HouseV(x) to be the function that returns u with first element equal to one, τ = u^T u/2, and h = H(u)x.
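A minimal sketch of HouseV (an assumed NumPy rendition, not the notes' official implementation; it takes the sign choice that avoids cancellation and assumes x ≠ 0):

```python
import numpy as np

def house_v(x):
    """[u, tau, h] := HouseV(x): return u scaled so its first element is one,
    tau = u^T u / 2, and h = H(u) x = -+ ||x||_2 e_0."""
    normx = np.linalg.norm(x)
    rho = -np.sign(x[0]) if x[0] != 0 else -1.0   # sign choice avoids cancellation
    u = x.astype(float).copy()
    u[0] -= rho * normx           # u = x + sign(chi_0) ||x||_2 e_0
    u /= u[0]                     # scale so that the first element equals one
    tau = u @ u / 2
    h = np.zeros(len(x))
    h[0] = rho * normx            # H(u) x has zeroes below its first element
    return u, tau, h
```

For example, for x = (3, 4)^T this returns h = (−5, 0)^T, and one can check that (I − u u^T/τ)x indeed equals h.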


15.4.2 Algorithm

The first step towards computing the eigenvalue decomposition of a symmetric matrix is to reduce the matrix to tridiagonal form.

The basic algorithm for reducing a symmetric matrix to tridiagonal form, overwriting the original matrix with the result, can be explained as follows. We assume that the symmetric matrix A is stored only in the lower triangular part of the matrix and that only the diagonal and subdiagonal of the symmetric tridiagonal matrix are computed, overwriting those parts of A. Finally, the Householder vectors used to zero out parts of A overwrite the entries that they annihilate (set to zero).

• Partition A → \begin{bmatrix} α_{11} & a_{21}^T \\ a_{21} & A_{22} \end{bmatrix}.

• Let [u_{21}, τ, a_{21}] := HouseV(a_{21}).¹

• Update

\begin{bmatrix} α_{11} & a_{21}^T \\ a_{21} & A_{22} \end{bmatrix} :=
\begin{bmatrix} 1 & 0 \\ 0 & H \end{bmatrix}
\begin{bmatrix} α_{11} & a_{21}^T \\ a_{21} & A_{22} \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & H \end{bmatrix} =
\begin{bmatrix} α_{11} & a_{21}^T H \\ H a_{21} & H A_{22} H \end{bmatrix},

where H = H(u_{21}). Note that a_{21} := H a_{21} need not be executed since this update was performed by the instance of HouseV above.² Also, a_{12}^T is not stored nor updated due to symmetry. Finally, only the lower triangular part of H A_{22} H is computed, overwriting A_{22}. The update of A_{22} warrants closer scrutiny:

\begin{align*}
A_{22} & := \Big(I - \tfrac{1}{\tau} u_{21} u_{21}^T\Big) A_{22} \Big(I - \tfrac{1}{\tau} u_{21} u_{21}^T\Big) \\
& = \Big(A_{22} - \tfrac{1}{\tau} u_{21} \underbrace{u_{21}^T A_{22}}_{y_{21}^T}\Big)\Big(I - \tfrac{1}{\tau} u_{21} u_{21}^T\Big) \\
& = A_{22} - \tfrac{1}{\tau} u_{21} y_{21}^T - \tfrac{1}{\tau} \underbrace{A_{22} u_{21}}_{y_{21}} u_{21}^T + \tfrac{1}{\tau^2} u_{21} \underbrace{y_{21}^T u_{21}}_{2\beta} u_{21}^T \\
& = A_{22} - \tfrac{1}{\tau}\Big(u_{21} y_{21}^T - \tfrac{\beta}{\tau} u_{21} u_{21}^T\Big) - \tfrac{1}{\tau}\Big(y_{21} u_{21}^T - \tfrac{\beta}{\tau} u_{21} u_{21}^T\Big) \\
& = A_{22} - u_{21} \underbrace{\tfrac{1}{\tau}\Big(y_{21}^T - \tfrac{\beta}{\tau} u_{21}^T\Big)}_{w_{21}^T} - \underbrace{\tfrac{1}{\tau}\Big(y_{21} - \tfrac{\beta}{\tau} u_{21}\Big)}_{w_{21}} u_{21}^T \\
& = \underbrace{A_{22} - u_{21} w_{21}^T - w_{21} u_{21}^T}_{\text{symmetric rank-2 update}}.
\end{align*}

¹ Note that the semantics here indicate that a_{21} is overwritten by H a_{21}.
² In practice, the zeroes below the first element of H a_{21} are not actually written. Instead, the implementation overwrites these elements with the corresponding elements of the vector u_{21}.


Algorithm: [A, t] := TRIRED_UNB(A)

Partition A → [ A_TL  A_TR ; A_BL  A_BR ], t → [ t_T ; t_B ]
  where A_TL is 0×0 and t_T has 0 elements

while m(A_TL) < m(A) do

  Repartition
    [ A_TL  A_TR ; A_BL  A_BR ] → [ A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 ],
    [ t_T ; t_B ] → [ t_0 ; τ_1 ; t_2 ]
    where α_11 and τ_1 are scalars

  [u_21, τ_1, a_21] := HouseV(a_21)
  y_21 := A_22 u_21
  β := u_21^T y_21 / 2
  w_21 := (y_21 − β u_21/τ_1)/τ_1
  A_22 := A_22 − tril(u_21 w_21^T + w_21 u_21^T)   (symmetric rank-2 update)

  Continue with
    [ A_TL  A_TR ; A_BL  A_BR ] ← [ A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 ],
    [ t_T ; t_B ] ← [ t_0 ; τ_1 ; t_2 ]

endwhile

Figure 15.5: Basic algorithm for reduction of a symmetric matrix to tridiagonal form.


Original matrix      First iteration      Second iteration     Third iteration

× × × × ×            × × 0 0 0            × × 0 0 0            × × 0 0 0
× × × × ×            × × × × ×            × × × 0 0            × × × 0 0
× × × × ×     −→     0 × × × ×     −→     0 × × × ×     −→     0 × × × 0
× × × × ×            0 × × × ×            0 0 × × ×            0 0 × × ×
× × × × ×            0 × × × ×            0 0 × × ×            0 0 0 × ×

Figure 15.6: Illustration of reduction of a symmetric matrix to tridiagonal form. The ×s denote nonzero elements in the matrix. The gray entries above the diagonal are not actually updated.

• Continue this process with the updated A22.

This is captured in the algorithm in Figure 15.5. It is also illustrated in Figure 15.6.

The total cost for reducing A ∈ R^{n×n} is approximately

\sum_{k=0}^{n-1} 4(n-k-1)^2 \text{ flops} \approx \frac{4}{3} n^3 \text{ flops}.

This equals, approximately, the cost of one QR factorization of matrix A.
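For illustration, here is a dense and deliberately unoptimized sketch of the reduction (NumPy assumed; house_v is the sketch from Section 15.4.1). It applies each H explicitly rather than via the rank-2 update of Figure 15.5, so it does not achieve the (4/3)n³ flop count; it is only meant to make the structure of the computation concrete:

```python
import numpy as np

def tridiagonalize(A):
    """Reduce a symmetric matrix to tridiagonal form by Householder
    similarity transformations (explicit-H version, for clarity only)."""
    T = A.astype(float).copy()
    n = T.shape[0]
    for j in range(n - 2):
        u, tau, h = house_v(T[j + 1:, j].copy())
        H = np.eye(n - j - 1) - np.outer(u, u) / tau
        T[j + 1:, j] = h                                # zeroes below the subdiagonal
        T[j, j + 1:] = h                                # by symmetry
        T[j + 1:, j + 1:] = H @ T[j + 1:, j + 1:] @ H   # A22 := H A22 H
    return T
```

Since each step is a similarity transformation, np.linalg.eigvalsh(T) agrees with the eigenvalues of the original A up to roundoff.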

15.5 The QR algorithm with a Tridiagonal Matrix

We are now ready to describe an algorithm for the QR algorithm with a tridiagonal matrix.

15.5.1 Givens’ rotations

First, we introduce another important class of unitary matrices known as Givens' rotations. Given a vector

x = \begin{bmatrix} χ_1 \\ χ_2 \end{bmatrix} ∈ R²,

there exists an orthogonal matrix G such that

G^T x = \begin{bmatrix} ±‖x‖_2 \\ 0 \end{bmatrix}.

The Householder transformation is one example of such a matrix G. An alternative is the Givens' rotation:

G = \begin{bmatrix} γ & −σ \\ σ & γ \end{bmatrix}


where γ² + σ² = 1. (Notice that γ and σ can be thought of as the cosine and sine of an angle.) Then

G^T G =
\begin{bmatrix} γ & −σ \\ σ & γ \end{bmatrix}^T
\begin{bmatrix} γ & −σ \\ σ & γ \end{bmatrix}
=
\begin{bmatrix} γ & σ \\ −σ & γ \end{bmatrix}
\begin{bmatrix} γ & −σ \\ σ & γ \end{bmatrix}
=
\begin{bmatrix} γ²+σ² & −γσ+γσ \\ γσ−γσ & γ²+σ² \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},

which means that a Givens' rotation is a unitary matrix.

Now, if γ = χ_1/‖x‖_2 and σ = χ_2/‖x‖_2, then γ² + σ² = (χ_1² + χ_2²)/‖x‖_2² = 1 and

\begin{bmatrix} γ & −σ \\ σ & γ \end{bmatrix}^T
\begin{bmatrix} χ_1 \\ χ_2 \end{bmatrix}
=
\begin{bmatrix} γ & σ \\ −σ & γ \end{bmatrix}
\begin{bmatrix} χ_1 \\ χ_2 \end{bmatrix}
=
\begin{bmatrix} (χ_1² + χ_2²)/‖x‖_2 \\ (χ_1 χ_2 − χ_1 χ_2)/‖x‖_2 \end{bmatrix}
=
\begin{bmatrix} ‖x‖_2 \\ 0 \end{bmatrix}.
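A sketch of this computation (NumPy assumed; the handling of the zero vector is an arbitrary convention of ours):

```python
import numpy as np

def givens(chi1, chi2):
    """Compute (gamma, sigma) with gamma^2 + sigma^2 = 1 such that
    G^T [chi1, chi2]^T = [||x||_2, 0]^T, where G = [[gamma, -sigma], [sigma, gamma]]."""
    nrm = np.hypot(chi1, chi2)     # ||x||_2, computed without intermediate overflow
    if nrm == 0.0:
        return 1.0, 0.0            # convention: identity rotation for x = 0
    return chi1 / nrm, chi2 / nrm

gamma, sigma = givens(3.0, 4.0)
G = np.array([[gamma, -sigma], [sigma, gamma]])
print(G.T @ np.array([3.0, 4.0]))  # [5. 0.]
```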

15.6 QR Factorization of a Tridiagonal Matrix

Now, consider the 4×4 tridiagonal matrix

\begin{bmatrix} α_{0,0} & α_{0,1} & 0 & 0 \\ α_{1,0} & α_{1,1} & α_{1,2} & 0 \\ 0 & α_{2,1} & α_{2,2} & α_{2,3} \\ 0 & 0 & α_{3,2} & α_{3,3} \end{bmatrix}.

From \begin{bmatrix} α_{0,0} \\ α_{1,0} \end{bmatrix} one can compute γ_{1,0} and σ_{1,0} so that

\begin{bmatrix} γ_{1,0} & −σ_{1,0} \\ σ_{1,0} & γ_{1,0} \end{bmatrix}^T \begin{bmatrix} α_{0,0} \\ α_{1,0} \end{bmatrix} = \begin{bmatrix} \hat α_{0,0} \\ 0 \end{bmatrix}.

Then (entries updated by a rotation are marked with a hat; an entry may be updated more than once)

\begin{bmatrix} \hat α_{0,0} & \hat α_{0,1} & \hat α_{0,2} & 0 \\ 0 & \hat α_{1,1} & \hat α_{1,2} & 0 \\ 0 & α_{2,1} & α_{2,2} & α_{2,3} \\ 0 & 0 & α_{3,2} & α_{3,3} \end{bmatrix}
=
\begin{bmatrix} γ_{1,0} & σ_{1,0} & 0 & 0 \\ −σ_{1,0} & γ_{1,0} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} α_{0,0} & α_{0,1} & 0 & 0 \\ α_{1,0} & α_{1,1} & α_{1,2} & 0 \\ 0 & α_{2,1} & α_{2,2} & α_{2,3} \\ 0 & 0 & α_{3,2} & α_{3,3} \end{bmatrix}.

Next, from \begin{bmatrix} \hat α_{1,1} \\ α_{2,1} \end{bmatrix} one can compute γ_{2,1} and σ_{2,1} so that

\begin{bmatrix} γ_{2,1} & −σ_{2,1} \\ σ_{2,1} & γ_{2,1} \end{bmatrix}^T \begin{bmatrix} \hat α_{1,1} \\ α_{2,1} \end{bmatrix} = \begin{bmatrix} \hat α_{1,1} \\ 0 \end{bmatrix}.


Then

\begin{bmatrix} \hat α_{0,0} & \hat α_{0,1} & \hat α_{0,2} & 0 \\ 0 & \hat α_{1,1} & \hat α_{1,2} & \hat α_{1,3} \\ 0 & 0 & \hat α_{2,2} & \hat α_{2,3} \\ 0 & 0 & α_{3,2} & α_{3,3} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & γ_{2,1} & σ_{2,1} & 0 \\ 0 & −σ_{2,1} & γ_{2,1} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \hat α_{0,0} & \hat α_{0,1} & \hat α_{0,2} & 0 \\ 0 & \hat α_{1,1} & \hat α_{1,2} & 0 \\ 0 & α_{2,1} & α_{2,2} & α_{2,3} \\ 0 & 0 & α_{3,2} & α_{3,3} \end{bmatrix}.

Finally, from \begin{bmatrix} \hat α_{2,2} \\ α_{3,2} \end{bmatrix} one can compute γ_{3,2} and σ_{3,2} so that

\begin{bmatrix} γ_{3,2} & −σ_{3,2} \\ σ_{3,2} & γ_{3,2} \end{bmatrix}^T \begin{bmatrix} \hat α_{2,2} \\ α_{3,2} \end{bmatrix} = \begin{bmatrix} \hat α_{2,2} \\ 0 \end{bmatrix}.

Then

\begin{bmatrix} \hat α_{0,0} & \hat α_{0,1} & \hat α_{0,2} & 0 \\ 0 & \hat α_{1,1} & \hat α_{1,2} & \hat α_{1,3} \\ 0 & 0 & \hat α_{2,2} & \hat α_{2,3} \\ 0 & 0 & 0 & \hat α_{3,3} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & γ_{3,2} & σ_{3,2} \\ 0 & 0 & −σ_{3,2} & γ_{3,2} \end{bmatrix}
\begin{bmatrix} \hat α_{0,0} & \hat α_{0,1} & \hat α_{0,2} & 0 \\ 0 & \hat α_{1,1} & \hat α_{1,2} & \hat α_{1,3} \\ 0 & 0 & \hat α_{2,2} & \hat α_{2,3} \\ 0 & 0 & α_{3,2} & α_{3,3} \end{bmatrix}.

The matrix Q is the orthogonal matrix that results from multiplying the different Givens' rotations together:

Q =
\begin{bmatrix} γ_{1,0} & −σ_{1,0} & 0 & 0 \\ σ_{1,0} & γ_{1,0} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & γ_{2,1} & −σ_{2,1} & 0 \\ 0 & σ_{2,1} & γ_{2,1} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & γ_{3,2} & −σ_{3,2} \\ 0 & 0 & σ_{3,2} & γ_{3,2} \end{bmatrix}.   (15.2)

However, it is typically not explicitly formed.
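The factorization just described is easy to sketch in code. The version below (NumPy assumed; givens is the sketch from the previous section) accumulates Q explicitly only so the factorization can be checked; in practice one would store just the rotations:

```python
import numpy as np

def qr_tridiagonal(T):
    """QR factorization of a tridiagonal (more generally, upper Hessenberg)
    matrix via n-1 Givens' rotations."""
    R = T.astype(float).copy()
    n = R.shape[0]
    Q = np.eye(n)
    for i in range(n - 1):
        gamma, sigma = givens(R[i, i], R[i + 1, i])
        G = np.array([[gamma, -sigma], [sigma, gamma]])
        R[i:i + 2, i:] = G.T @ R[i:i + 2, i:]   # annihilate R[i+1, i]
        Q[:, i:i + 2] = Q[:, i:i + 2] @ G       # accumulate Q as in (15.2)
    return Q, R
```

Afterwards Q @ R reproduces T, and R is upper triangular with only two nonzero superdiagonals, matching the structure derived above.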


The next question is how to compute RQ given the QR factorization of the tridiagonal matrix:

\begin{bmatrix} \hat α_{0,0} & \hat α_{0,1} & \hat α_{0,2} & 0 \\ 0 & \hat α_{1,1} & \hat α_{1,2} & \hat α_{1,3} \\ 0 & 0 & \hat α_{2,2} & \hat α_{2,3} \\ 0 & 0 & 0 & \hat α_{3,3} \end{bmatrix}
\underbrace{
\begin{bmatrix} γ_{1,0} & −σ_{1,0} & 0 & 0 \\ σ_{1,0} & γ_{1,0} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & γ_{2,1} & −σ_{2,1} & 0 \\ 0 & σ_{2,1} & γ_{2,1} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & γ_{3,2} & −σ_{3,2} \\ 0 & 0 & σ_{3,2} & γ_{3,2} \end{bmatrix}}_{Q}.

Applying the rotations to R one at a time, each rotation combines two adjacent columns and fills in one entry below the diagonal, yielding, successively,

\begin{bmatrix} \tilde α_{0,0} & \tilde α_{0,1} & \hat α_{0,2} & 0 \\ \tilde α_{1,0} & \tilde α_{1,1} & \hat α_{1,2} & \hat α_{1,3} \\ 0 & 0 & \hat α_{2,2} & \hat α_{2,3} \\ 0 & 0 & 0 & \hat α_{3,3} \end{bmatrix},
\quad
\begin{bmatrix} \tilde α_{0,0} & \tilde α_{0,1} & \tilde α_{0,2} & 0 \\ \tilde α_{1,0} & \tilde α_{1,1} & \tilde α_{1,2} & \hat α_{1,3} \\ 0 & \tilde α_{2,1} & \tilde α_{2,2} & \hat α_{2,3} \\ 0 & 0 & 0 & \hat α_{3,3} \end{bmatrix},
\quad
\begin{bmatrix} \tilde α_{0,0} & \tilde α_{0,1} & \tilde α_{0,2} & 0 \\ \tilde α_{1,0} & \tilde α_{1,1} & \tilde α_{1,2} & \tilde α_{1,3} \\ 0 & \tilde α_{2,1} & \tilde α_{2,2} & \tilde α_{2,3} \\ 0 & 0 & \tilde α_{3,2} & \tilde α_{3,3} \end{bmatrix},

where entries updated so far are marked with a tilde. A symmetry argument can be used to motivate that \tilde α_{0,2} = \tilde α_{1,3} = 0.
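The symmetry argument is easy to verify numerically: since RQ = Q^T T Q, the product is symmetric, and it is also upper Hessenberg, hence tridiagonal. A quick check (NumPy assumed; the matrix is an arbitrary symmetric tridiagonal example):

```python
import numpy as np

T = (np.diag([2.0, 3.0, 4.0, 5.0])
     + np.diag([1.0, 1.0, 1.0], 1)
     + np.diag([1.0, 1.0, 1.0], -1))
Q, R = np.linalg.qr(T)
print(np.round(R @ Q, 12))   # tridiagonal again (up to roundoff)
```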

15.7 The Implicitly Shifted QR Algorithm

15.7.1 Upper Hessenberg and tridiagonal matrices

Definition 15.7 A matrix is said to be upper Hessenberg if all entries below its first subdiagonal equal zero.

In other words, if matrix A ∈ R^{n×n} is upper Hessenberg, it looks like

A =
\begin{bmatrix}
α_{0,0} & α_{0,1} & α_{0,2} & \cdots & α_{0,n-2} & α_{0,n-1} \\
α_{1,0} & α_{1,1} & α_{1,2} & \cdots & α_{1,n-2} & α_{1,n-1} \\
0 & α_{2,1} & α_{2,2} & \cdots & α_{2,n-2} & α_{2,n-1} \\
\vdots & \ddots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \ddots & α_{n-2,n-2} & α_{n-2,n-1} \\
0 & 0 & 0 & \cdots & α_{n-1,n-2} & α_{n-1,n-1}
\end{bmatrix}.

Obviously, a tridiagonal matrix is a special case of an upper Hessenberg matrix.


15.7.2 The Implicit Q Theorem

The following theorem sets up one of the most remarkable algorithms in numerical linear algebra, which allows us to greatly simplify the implementation of the shifted QR algorithm when A is tridiagonal.

Theorem 15.8 (Implicit Q Theorem) Let A, B ∈ R^{n×n} where B is upper Hessenberg and has only positive elements on its first subdiagonal and assume there exists a unitary matrix Q such that Q^T A Q = B. Then Q and B are uniquely determined by A and the first column of Q.

Proof: Partition

Q = \begin{bmatrix} q_0 & q_1 & q_2 & \cdots & q_{n-2} & q_{n-1} \end{bmatrix}
and
B =
\begin{bmatrix}
β_{0,0} & β_{0,1} & β_{0,2} & \cdots & β_{0,n-2} & β_{0,n-1} \\
β_{1,0} & β_{1,1} & β_{1,2} & \cdots & β_{1,n-2} & β_{1,n-1} \\
0 & β_{2,1} & β_{2,2} & \cdots & β_{2,n-2} & β_{2,n-1} \\
0 & 0 & β_{3,2} & \cdots & β_{3,n-2} & β_{3,n-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & β_{n-1,n-2} & β_{n-1,n-1}
\end{bmatrix}.

Notice that AQ = QB. Equating the first column on the left and right, we notice that

A q_0 = β_{0,0} q_0 + β_{1,0} q_1.

Now, q_0 is given and ‖q_0‖_2 = 1 since Q is unitary. Hence

q_0^T A q_0 = β_{0,0} q_0^T q_0 + β_{1,0} q_0^T q_1 = β_{0,0}.

Next,

β_{1,0} q_1 = A q_0 − β_{0,0} q_0 =: \tilde q_1.

Since ‖q_1‖_2 = 1 (it is a column of a unitary matrix) and β_{1,0} is assumed to be positive, we know that

β_{1,0} = ‖\tilde q_1‖_2.

Finally,

q_1 = \tilde q_1/β_{1,0}.

The point is that the first column of B and the second column of Q are prescribed by the first column of Q and the fact that B has positive elements on the first subdiagonal. In this way, each column of Q and each column of B can be determined, one by one.


Homework 15.9 Give all the details of the above proof.
* SEE ANSWER

Notice the similarity between the above proof and the proof of the existence and uniqueness of the QR factorization!

To take advantage of the special structure of A being symmetric, the theorem can be expanded to

Theorem 15.10 (Implicit Q Theorem) Let A, B ∈ R^{n×n} where B is upper Hessenberg and has only positive elements on its first subdiagonal and assume there exists a unitary matrix Q such that Q^T A Q = B. Then Q and B are uniquely determined by A and the first column of Q. If A is symmetric, then B is also symmetric and hence tridiagonal.

15.7.3 The Francis QR Step

The Francis QR Step combines the steps (A^(k−1) − µ_k I) → Q^(k) R^(k) and A^(k+1) := R^(k) Q^(k) + µ_k I into a single step.

Now, consider the 4×4 tridiagonal matrix

\begin{bmatrix} α_{0,0} & α_{0,1} & 0 & 0 \\ α_{1,0} & α_{1,1} & α_{1,2} & 0 \\ 0 & α_{2,1} & α_{2,2} & α_{2,3} \\ 0 & 0 & α_{3,2} & α_{3,3} \end{bmatrix} − µI.

The first Givens' rotation is computed from \begin{bmatrix} α_{0,0} − µ \\ α_{1,0} \end{bmatrix}, yielding γ_{1,0} and σ_{1,0} so that

\begin{bmatrix} γ_{1,0} & −σ_{1,0} \\ σ_{1,0} & γ_{1,0} \end{bmatrix}^T \begin{bmatrix} α_{0,0} − µ \\ α_{1,0} \end{bmatrix}

has a zero second entry. Now, to preserve eigenvalues, any orthogonal matrix that is applied from the left must also have its transpose applied from the right. Let us compute

\begin{bmatrix} \hat α_{0,0} & \hat α_{1,0} & \hat α_{2,0} & 0 \\ \hat α_{1,0} & \hat α_{1,1} & \hat α_{2,1} & 0 \\ \hat α_{2,0} & \hat α_{2,1} & α_{2,2} & α_{2,3} \\ 0 & 0 & α_{3,2} & α_{3,3} \end{bmatrix}
=
\begin{bmatrix} γ_{1,0} & σ_{1,0} & 0 & 0 \\ −σ_{1,0} & γ_{1,0} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} α_{0,0} & α_{0,1} & 0 & 0 \\ α_{1,0} & α_{1,1} & α_{1,2} & 0 \\ 0 & α_{2,1} & α_{2,2} & α_{2,3} \\ 0 & 0 & α_{3,2} & α_{3,3} \end{bmatrix}
\begin{bmatrix} γ_{1,0} & −σ_{1,0} & 0 & 0 \\ σ_{1,0} & γ_{1,0} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.

The new nonzero \hat α_{2,0} (and, by symmetry, its mirror above the superdiagonal) is the "bulge".

Next, from \begin{bmatrix} \hat α_{1,0} \\ \hat α_{2,0} \end{bmatrix} one can compute γ_{2,0} and σ_{2,0} so that

\begin{bmatrix} γ_{2,0} & −σ_{2,0} \\ σ_{2,0} & γ_{2,0} \end{bmatrix}^T \begin{bmatrix} \hat α_{1,0} \\ \hat α_{2,0} \end{bmatrix} = \begin{bmatrix} \hat α_{1,0} \\ 0 \end{bmatrix}.

Then

\begin{bmatrix} \hat α_{0,0} & \hat α_{1,0} & 0 & 0 \\ \hat α_{1,0} & \hat α_{1,1} & \hat α_{2,1} & \hat α_{3,1} \\ 0 & \hat α_{2,1} & \hat α_{2,2} & \hat α_{2,3} \\ 0 & \hat α_{3,1} & \hat α_{3,2} & α_{3,3} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & γ_{2,0} & σ_{2,0} & 0 \\ 0 & −σ_{2,0} & γ_{2,0} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \hat α_{0,0} & \hat α_{1,0} & \hat α_{2,0} & 0 \\ \hat α_{1,0} & \hat α_{1,1} & \hat α_{2,1} & 0 \\ \hat α_{2,0} & \hat α_{2,1} & α_{2,2} & α_{2,3} \\ 0 & 0 & α_{3,2} & α_{3,3} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & γ_{2,0} & −σ_{2,0} & 0 \\ 0 & σ_{2,0} & γ_{2,0} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

Page 269: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

15.7. The Implicitly Shifted QR Algorithm 253

From: Gene H Golub <[email protected]>
Date: Sun, 19 Aug 2007 13:54:47 -0700 (PDT)
Subject: John Francis, Co-Inventor of QR

Dear Colleagues,

For many years, I have been interested in meeting J G F Francis, one of the co-inventors of the QR algorithm for computing eigenvalues of general matrices. Through a lead provided by the late Erin Brent and with the aid of Google, I finally made contact with him.

John Francis was born in 1934 in London and currently lives in Hove, near Brighton. His residence is about a quarter mile from the sea; he is a widower. In 1954, he worked at the National Research Development Corp (NRDC) and attended some lectures given by Christopher Strachey. In 1955, '56 he was a student at Cambridge but did not complete a degree. He then went back to NRDC as an assistant to Strachey where he got involved in flutter computations and this led to his work on QR.

After leaving NRDC in 1961, he worked at the Ferranti Corp and then at the University of Sussex. Subsequently, he had positions with various industrial organizations and consultancies. He is now retired. His interests were quite general and included Artificial Intelligence, computer languages, systems engineering. He has not returned to numerical computation.

He was surprised to learn there are many references to his work and that the QR method is considered one of the ten most important algorithms of the 20th century. He was unaware of such developments as TeX and Math Lab. Currently he is working on a degree at the Open University.

John Francis did remarkable work and we are all in his debt. Along with the conjugate gradient method, it provided us with one of the basic tools of numerical analysis.

Gene Golub

Figure 15.7: Posting by the late Gene Golub in NA Digest, Sunday, August 19, 2007, Volume 07, Issue 34. An article on the ten most important algorithms of the 20th century, published in SIAM News, can be found at http://www.uta.edu/faculty/rcli/TopTen/topten.pdf.


again preserves eigenvalues. Finally, from \begin{bmatrix} \hat α_{2,1} \\ \hat α_{3,1} \end{bmatrix} one can compute γ_{3,1} and σ_{3,1} so that

\begin{bmatrix} γ_{3,1} & −σ_{3,1} \\ σ_{3,1} & γ_{3,1} \end{bmatrix}^T \begin{bmatrix} \hat α_{2,1} \\ \hat α_{3,1} \end{bmatrix} = \begin{bmatrix} \hat α_{2,1} \\ 0 \end{bmatrix}.

Then

\begin{bmatrix} \hat α_{0,0} & \hat α_{1,0} & 0 & 0 \\ \hat α_{1,0} & \hat α_{1,1} & \hat α_{2,1} & 0 \\ 0 & \hat α_{2,1} & \hat α_{2,2} & \hat α_{2,3} \\ 0 & 0 & \hat α_{3,2} & \hat α_{3,3} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & γ_{3,1} & σ_{3,1} \\ 0 & 0 & −σ_{3,1} & γ_{3,1} \end{bmatrix}
\begin{bmatrix} \hat α_{0,0} & \hat α_{1,0} & 0 & 0 \\ \hat α_{1,0} & \hat α_{1,1} & \hat α_{2,1} & \hat α_{3,1} \\ 0 & \hat α_{2,1} & \hat α_{2,2} & \hat α_{2,3} \\ 0 & \hat α_{3,1} & \hat α_{3,2} & α_{3,3} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & γ_{3,1} & −σ_{3,1} \\ 0 & 0 & σ_{3,1} & γ_{3,1} \end{bmatrix},

and the matrix is again tridiagonal.

The matrix Q is the orthogonal matrix that results from multiplying the different Givens' rotations together:

Q =
\begin{bmatrix} γ_{1,0} & −σ_{1,0} & 0 & 0 \\ σ_{1,0} & γ_{1,0} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & γ_{2,0} & −σ_{2,0} & 0 \\ 0 & σ_{2,0} & γ_{2,0} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & γ_{3,1} & −σ_{3,1} \\ 0 & 0 & σ_{3,1} & γ_{3,1} \end{bmatrix}.

It is important to note that the first column of Q is given by

\begin{bmatrix} γ_{1,0} \\ σ_{1,0} \\ 0 \\ 0 \end{bmatrix},

which is exactly the same first column had Q been computed as in Section 15.6 (Equation 15.2). Thus, by the Implicit Q Theorem, the tridiagonal matrix that results from this approach is equal to the tridiagonal matrix that would be computed by applying the QR factorization from Section 15.6 to A − µI, A − µI → QR, followed by the formation of RQ + µI using the algorithm for computing RQ in Section 15.6.

The successive elimination of elements \hat α_{i+1,i} is often referred to as chasing the bulge, while the entire process that introduces the bulge and then chases it is known as a Francis Implicit QR Step. Obviously, the method generalizes to matrices of arbitrary size, as illustrated in Figure 15.8. An algorithm for the chasing of the bulge is given in Figure 15.9. (Note that in those figures T is used for A, something that needs to be made consistent in these notes, eventually.) In practice, the tridiagonal matrix is not stored as a matrix. Instead, its diagonal and subdiagonal are stored as vectors.

15.7.4 A complete algorithm

This last section shows how one iteration of the QR algorithm can be performed on a tridiagonal matrix by implicitly shifting and then "chasing the bulge". All that is left to complete the algorithm is to note that

• The shift µ_k can be chosen to equal α_{n−1,n−1} (the last element on the diagonal, which tends to converge to the eigenvalue smallest in magnitude). In practice, choosing the shift to be an eigenvalue of the bottom-right 2×2 matrix works better. This is known as the Wilkinson Shift. (A sketch of this shift and of the deflation test appears after this list.)


[Figure omitted in this text-only version: it shows four snapshots of the partitioned tridiagonal matrix T (blocks T_TL, T_MM, T_BR, repartitioned as T_00, τ_11, T_22, τ_33, T_44 with connecting vectors t_10, t_21, t_32, t_43), with the bulge (marked +) moving one row and column toward the bottom-right between the beginning and the end of the iteration, through the repartition and update steps.]

Figure 15.8: One step of "chasing the bulge" in the implicitly shifted symmetric QR algorithm.

• If an element of the subdiagonal (and the corresponding element on the superdiagonal) becomes small enough, it can be considered to be zero and the problem deflates (decouples) into two smaller tridiagonal matrices. Small is often taken to mean that |α_{i+1,i}| ≤ ε(|α_{i,i}| + |α_{i+1,i+1}|), where ε is some quantity close to the machine epsilon (unit roundoff).

• If A = Q T Q^T reduced A to the tridiagonal matrix T before the QR algorithm commenced, then the Givens' rotations encountered as part of the implicitly shifted QR algorithm can be applied from the


Algorithm: T := CHASEBULGE(T)

Partition T → [ T_TL  ?  ? ; T_ML  T_MM  ? ; 0  T_BM  T_BR ]
  where T_TL is 0×0 and T_MM is 3×3

while m(T_BR) > 0 do

  Repartition
    [ T_TL  ?  0 ; T_ML  T_MM  ? ; 0  T_BM  T_BR ] →
    [ T_00  ?  0  0  0 ; t_10^T  τ_11  ?  0  0 ; 0  t_21  T_22  ?  0 ; 0  0  t_32^T  τ_33  ? ; 0  0  0  t_43  T_44 ]
    where τ_11 and τ_33 are scalars (during the final step, τ_33 is 0×0)

  Compute (γ, σ) s.t. G_{γ,σ}^T t_21 = [ τ_21 ; 0 ], and assign t_21 := [ τ_21 ; 0 ]
  T_22 := G_{γ,σ}^T T_22 G_{γ,σ}
  t_32^T := t_32^T G_{γ,σ}   (not performed during the final step)

  Continue with
    [ T_TL  ?  0 ; T_ML  T_MM  ? ; 0  T_BM  T_BR ] ←
    [ T_00  ?  0  0  0 ; t_10^T  τ_11  ?  0  0 ; 0  t_21  T_22  ?  0 ; 0  0  t_32^T  τ_33  ? ; 0  0  0  t_43  T_44 ]

endwhile

Figure 15.9: Chasing the bulge.

right to the appropriate columns of Q so that upon completion Q is overwritten with the eigenvectors of A. Notice that applying a Givens' rotation to a pair of columns of Q requires O(n) computation per Givens' rotation. For each Francis implicit QR step, O(n) Givens' rotations are computed, making the application of Givens' rotations to Q of cost O(n²) per iteration of the implicitly shifted QR algorithm. Typically a few (2-3) iterations are needed per eigenvalue that is uncovered (by deflation), meaning that O(n) iterations are needed. Thus, the QR algorithm is roughly of cost O(n³) if the eigenvectors are accumulated (in addition to the cost of forming Q from the reduction to tridiagonal form, which takes another O(n³) operations).


• If an element on the subdiagonal becomes zero (or very small), and hence the corresponding element of the superdiagonal, then the problem can be deflated: if

T = \begin{bmatrix} T_{00} & 0 \\ 0 & T_{11} \end{bmatrix},

where the zero subdiagonal element splits the tridiagonal matrix into the two tridiagonal blocks T_{00} and T_{11}, then

– The computation can continue separately with T_{00} and T_{11}.

– One can pick the shift from the bottom-right of T_{00} as one continues finding the eigenvalues of T_{00}, thus accelerating the computation.

– One can pick the shift from the bottom-right of T_{11} as one continues finding the eigenvalues of T_{11}, thus accelerating the computation.

– One must continue to accumulate the eigenvectors by applying the rotations to the appropriate columns of Q.

Because of the connection between the QR algorithm and the Inverse Power Method, subdiagonal entries near the bottom-right of T are more likely to converge to a zero, so most deflation will happen there.

• A question becomes when an element on the subdiagonal, τ_{i+1,i}, can be considered to be zero. The answer is when |τ_{i+1,i}| is small relative to |τ_{i,i}| and |τ_{i+1,i+1}|. A typical condition that is used is

|τ_{i+1,i}| ≤ u \sqrt{|τ_{i,i} τ_{i+1,i+1}|},

where u denotes the unit roundoff.

• If A ∈ C^{n×n} is Hermitian, then its Spectral Decomposition is also computed via the following steps, which mirror those for the real symmetric case:

– Reduce A to tridiagonal form. Householder transformation based similarity transformations can again be used for this. This leaves one with a tridiagonal matrix, T, with real values along the diagonal (because the matrix is Hermitian) and values on the subdiagonal and superdiagonal that may be complex valued.

– The matrix Q_T such that A = Q_T T Q_T^H can then be formed from the Householder transformations.

– A simple step can then be used to change this tridiagonal form to have real values even on the subdiagonal and superdiagonal. The matrix Q_T can be updated accordingly.

– The tridiagonal QR algorithm that we described can then be used to diagonalize the matrix, accumulating the eigenvectors by applying the encountered Givens' rotations to Q_T. This is where the real expense is: applying the Givens' rotations to the matrix T requires O(n) computation per sweep, while applying them to Q_T requires O(n²) per sweep.

For details, see some of our papers mentioned in the next section.
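As mentioned in the bullets above, two small ingredients recur: the Wilkinson shift and the deflation test. A hedged sketch of both (NumPy assumed; the helper names are ours, not the notes'):

```python
import numpy as np

def wilkinson_shift(T):
    """Eigenvalue of the bottom-right 2x2 block of T closest to T[-1, -1],
    computed with the standard stable formula."""
    a, b, c = T[-2, -2], T[-1, -2], T[-1, -1]
    d = (a - c) / 2
    s = np.sign(d) if d != 0 else 1.0
    return c - s * b * b / (abs(d) + np.hypot(d, b))

def can_deflate(T, i, u=np.finfo(float).eps):
    """True if the subdiagonal element T[i+1, i] may be considered zero,
    using the criterion |tau_{i+1,i}| <= u * sqrt(|tau_{i,i} tau_{i+1,i+1}|)."""
    return abs(T[i + 1, i]) <= u * np.sqrt(abs(T[i, i] * T[i + 1, i + 1]))
```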


15.8 Further Reading

15.8.1 More on reduction to tridiagonal form

The reduction to tridiagonal form can only be partially cast in terms of matrix-matrix multiplication [20]. This is a severe hindrance to high performance for that first step towards computing all eigenvalues and eigenvectors of a symmetric matrix. Worse, a considerable fraction of the total cost of the computation is in that first step.

For a detailed discussion on the blocked algorithm that uses FLAME notation, we recommend [43]

Field G. Van Zee, Robert A. van de Geijn, Gregorio Quintana-Ortí, G. Joseph Elizondo. Families of Algorithms for Reducing a Matrix to Condensed Form. ACM Transactions on Mathematical Software (TOMS), Vol. 39, No. 1, 2012.

(Reduction to tridiagonal form is one case of what is more generally referred to as “condensed form”.)

15.8.2 Optimizing the tridiagonal QR algorithm

As the Givens' rotations are applied to the tridiagonal matrix, they are also applied to a matrix in which eigenvectors are accumulated. While one Implicit Francis Step requires O(n) computation, this accumulation of the eigenvectors requires O(n²) computation with O(n²) data. We have learned before that this means the cost of accessing data dominates on current architectures.

In a recent paper, we showed how accumulating the Givens' rotations for several Francis Steps allows one to attain performance similar to that attained by a matrix-matrix multiplication. Details can be found in [42]:

Field G. Van Zee, Robert A. van de Geijn, Gregorio Quintana-Ortí. Restructuring the Tridiagonal and Bidiagonal QR Algorithms for Performance. ACM Transactions on Mathematical Software (TOMS), Vol. 40, No. 3, 2014.

15.9 Other Algorithms

15.9.1 Jacobi’s method for the symmetric eigenvalue problem

(Not to be mistaken for the Jacobi iteration for solving linear systems.)

The oldest algorithm for computing the eigenvalues and eigenvectors of a matrix is due to Jacobi and dates back to 1846 [26]. This is a method that keeps resurfacing, since it parallelizes easily. The operation count tends to be higher (by a constant factor) than that of reduction to tridiagonal form followed by the tridiagonal QR algorithm.

The idea is as follows: Given a symmetric 2×2 matrix

A_{31} = \begin{bmatrix} α_{11} & α_{31} \\ α_{31} & α_{33} \end{bmatrix}


Sweep 1: zero (1,0); zero (2,0); zero (3,0); zero (2,1); zero (3,1); zero (3,2).
Sweep 2: zero (1,0); zero (2,0); zero (3,0); zero (2,1); zero (3,1); zero (3,2).

[Figure omitted in this text-only version: for each step it shows the 4×4 sparsity pattern, with 0s marking the off-diagonal pair just annihilated; entries zeroed earlier in the sweep generally fill back in as later rotations are applied.]

Figure 15.10: Column-cyclic Jacobi algorithm.

Page 276: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 15. Notes on the QR Algorithm and other Dense Eigensolvers 260

There exists a rotation (which is of course unitary)

J_{31} = \begin{bmatrix} γ_{11} & −σ_{31} \\ σ_{31} & γ_{33} \end{bmatrix}

such that

J_{31} A_{31} J_{31}^T =
\begin{bmatrix} γ_{11} & −σ_{31} \\ σ_{31} & γ_{33} \end{bmatrix}
\begin{bmatrix} α_{11} & α_{31} \\ α_{31} & α_{33} \end{bmatrix}
\begin{bmatrix} γ_{11} & −σ_{31} \\ σ_{31} & γ_{33} \end{bmatrix}^T
=
\begin{bmatrix} \hat α_{11} & 0 \\ 0 & \hat α_{33} \end{bmatrix}.

We know this exists since the Spectral Decomposition of the 2×2 matrix exists. Such a rotation is called a Jacobi rotation. (Notice that it is different from a Givens' rotation because it diagonalizes a 2×2 matrix when used as a unitary similarity transformation. By contrast, a Givens' rotation zeroes an element when applied from one side of a matrix.)

Homework 15.11 In the above discussion, show that α_{11}² + 2α_{31}² + α_{33}² = \hat α_{11}² + \hat α_{33}².
* SEE ANSWER

Jacobi rotations can be used to selectively zero off-diagonal elements by observing the following:

J A J^T =
\begin{bmatrix}
I & 0 & 0 & 0 & 0 \\
0 & γ_{11} & 0 & −σ_{31} & 0 \\
0 & 0 & I & 0 & 0 \\
0 & σ_{31} & 0 & γ_{33} & 0 \\
0 & 0 & 0 & 0 & I
\end{bmatrix}
\begin{bmatrix}
A_{00} & a_{10} & A_{20}^T & a_{30} & A_{40}^T \\
a_{10}^T & α_{11} & a_{21}^T & α_{31} & a_{41}^T \\
A_{20} & a_{21} & A_{22} & a_{32} & A_{42}^T \\
a_{30}^T & α_{31} & a_{32}^T & α_{33} & a_{43}^T \\
A_{40} & a_{41} & A_{42} & a_{43} & A_{44}
\end{bmatrix}
\begin{bmatrix}
I & 0 & 0 & 0 & 0 \\
0 & γ_{11} & 0 & −σ_{31} & 0 \\
0 & 0 & I & 0 & 0 \\
0 & σ_{31} & 0 & γ_{33} & 0 \\
0 & 0 & 0 & 0 & I
\end{bmatrix}^T
=
\begin{bmatrix}
A_{00} & \hat a_{10} & A_{20}^T & \hat a_{30} & A_{40}^T \\
\hat a_{10}^T & \hat α_{11} & \hat a_{21}^T & 0 & \hat a_{41}^T \\
A_{20} & \hat a_{21} & A_{22} & \hat a_{32} & A_{42}^T \\
\hat a_{30}^T & 0 & \hat a_{32}^T & \hat α_{33} & \hat a_{43}^T \\
A_{40} & \hat a_{41} & A_{42} & \hat a_{43} & A_{44}
\end{bmatrix}
= \hat A,

where

\begin{bmatrix} γ_{11} & −σ_{31} \\ σ_{31} & γ_{33} \end{bmatrix}
\begin{bmatrix} a_{10}^T & a_{21}^T & a_{41}^T \\ a_{30}^T & a_{32}^T & a_{43}^T \end{bmatrix}
=
\begin{bmatrix} \hat a_{10}^T & \hat a_{21}^T & \hat a_{41}^T \\ \hat a_{30}^T & \hat a_{32}^T & \hat a_{43}^T \end{bmatrix}.

Importantly,

\hat a_{10}^T \hat a_{10} + \hat a_{30}^T \hat a_{30} = a_{10}^T a_{10} + a_{30}^T a_{30},
\hat a_{21}^T \hat a_{21} + \hat a_{32}^T \hat a_{32} = a_{21}^T a_{21} + a_{32}^T a_{32},
\hat a_{41}^T \hat a_{41} + \hat a_{43}^T \hat a_{43} = a_{41}^T a_{41} + a_{43}^T a_{43}.


What this means is that if one defines off(A) as the square of the Frobenius norm of the off-diagonal elements of A,

off(A) = ‖A‖_F² − ‖diag(A)‖_F²,

then off(\hat A) = off(A) − 2α_{31}².

• The good news: every time a Jacobi rotation is used to zero an off-diagonal element, off(A) decreases by twice the square of that element.

• The bad news: a previously introduced zero may become nonzero in the process.

The original algorithm developed by Jacobi searched for the largest (in absolute value) off-diagonal element and zeroed it, repeating this process until all off-diagonal elements were small. The algorithm was applied by hand by one of his students, Seidel (of Gauss-Seidel fame). The problem with this is that searching for the largest off-diagonal element requires O(n²) comparisons. Computing and applying one Jacobi rotation as a similarity transformation requires O(n) flops. Thus, for large n, this is not practical. Instead, it can be shown that zeroing the off-diagonal elements by columns (or rows) also converges to a diagonal matrix. This is known as the column-cyclic Jacobi algorithm. We illustrate this in Figure 15.10; a code sketch follows below.
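A sketch of one Jacobi rotation and one column-cyclic sweep (NumPy assumed; the rotation angle uses the classical stable formulas, and the explicit n×n J is for clarity only, not efficiency):

```python
import numpy as np

def jacobi_zero(A, p, q):
    """Return J^T A J, where the Jacobi rotation J zeroes the (p, q) and
    (q, p) entries of the symmetric matrix A (p != q)."""
    if A[p, q] == 0.0:
        return A.copy()
    tau = (A[q, q] - A[p, p]) / (2.0 * A[p, q])
    if tau != 0.0:
        t = np.sign(tau) / (abs(tau) + np.hypot(1.0, tau))  # tan(theta)
    else:
        t = 1.0
    c = 1.0 / np.hypot(1.0, t)      # cosine
    s = t * c                       # sine
    J = np.eye(A.shape[0])
    J[p, p] = J[q, q] = c
    J[p, q], J[q, p] = s, -s
    return J.T @ A @ J

def jacobi_sweep(A):
    """One column-cyclic sweep: zero (1,0), (2,0), ..., then (2,1), (3,1), ..."""
    n = A.shape[0]
    for q in range(n - 1):
        for p in range(q + 1, n):
            A = jacobi_zero(A, p, q)
    return A

def off(A):
    """Square of the Frobenius norm of the off-diagonal elements."""
    return np.linalg.norm(A, 'fro')**2 - np.linalg.norm(np.diag(A))**2
```

Printing off(A) after each sweep shows it decreasing toward zero, consistent with the observation that each rotation removes 2α_{31}² from off(A) even as earlier zeroes fill back in.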

15.9.2 Cuppen’s Algorithm

To be added at a future time.

15.9.3 The Method of Multiple Relatively Robust Representations (MRRR)

Even once the problem has been reduced to tridiagonal form, the computation of the eigenvalues and eigenvectors via the QR algorithm requires O(n³) computations. A method that reduces this to O(n²) time (which can be argued to achieve the lower bound for computation, within a constant, because the n vectors must be at least written) is achieved by the Method of Multiple Relatively Robust Representations (MRRR) by Dhillon and Parlett [13, 12, 14, 15]. The details of that method go beyond the scope of this note.

15.10 The Nonsymmetric QR Algorithm

The QR algorithm that we have described can be modified to compute the Schur decomposition of a nonsymmetric matrix. We briefly describe the high-level ideas.

15.10.1 A variant of the Schur decomposition

Let A ∈ Rn×n be nonsymmetric. Recall:

• There exists a unitary matrix Q ∈ C^{n×n} and an upper triangular matrix R ∈ C^{n×n} such that A = Q R Q^H. Importantly, even if A is real valued, the eigenvalues and eigenvectors may be complex valued.

• The complex eigenvalues will come in conjugate pairs.

A variation of the Schur Decomposition theorem is


Theorem 15.12 Let A ∈ R^{n×n}. Then there exist a unitary matrix Q ∈ R^{n×n} and a quasi upper triangular matrix R ∈ R^{n×n} such that A = Q R Q^T.

Here quasi upper triangular matrix means that the matrix is block upper triangular with blocks on the diagonal that are 1×1 or 2×2.

Remark 15.13 The important thing is that this alternative to the Schur decomposition can be computed using only real arithmetic.

15.10.2 Reduction to upper Hessenberg form

The basic algorithm for reducing a real-valued nonsymmetric matrix to upper Hessenberg form, overwriting the original matrix with the result, can be explained similarly to the explanation of the reduction of a symmetric matrix to tridiagonal form. We assume that the upper Hessenberg matrix overwrites the upper Hessenberg part of A and that the Householder vectors used to zero out parts of A overwrite the entries that they annihilate (set to zero).

• Assume that the process has proceeded to where

A = \begin{bmatrix} A_{00} & a_{01} & A_{02} \\ a_{10}^T & α_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{bmatrix},

with A_{00} being k×k and upper Hessenberg, a_{10}^T being a row vector with only a last nonzero entry, and the remaining submatrices updated according to the application of the previous k Householder transformations.

• Let [u_{21}, τ, a_{21}] := HouseV(a_{21}).³

• Update

\begin{bmatrix} A_{00} & a_{01} & A_{02} \\ a_{10}^T & α_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{bmatrix} :=
\begin{bmatrix} I & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & H \end{bmatrix}
\begin{bmatrix} A_{00} & a_{01} & A_{02} \\ a_{10}^T & α_{11} & a_{12}^T \\ 0 & a_{21} & A_{22} \end{bmatrix}
\begin{bmatrix} I & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & H \end{bmatrix}
=
\begin{bmatrix} A_{00} & a_{01} & A_{02} H \\ a_{10}^T & α_{11} & a_{12}^T H \\ 0 & H a_{21} & H A_{22} H \end{bmatrix},

where H = H(u_{21}).

– Note that a_{21} := H a_{21} need not be executed since this update was performed by the instance of HouseV above.⁴

³ Note that the semantics here indicate that a_{21} is overwritten by H a_{21}.
⁴ In practice, the zeroes below the first element of H a_{21} are not actually written. Instead, the implementation overwrites these elements with the corresponding elements of the vector u_{21}.


Algorithm: [A, t] := HESSRED_UNB(A)

Partition A → [ A_TL  A_TR ; A_BL  A_BR ], t → [ t_T ; t_B ]
  where A_TL is 0×0 and t_T has 0 elements

while m(A_TL) < m(A) do

  Repartition
    [ A_TL  A_TR ; A_BL  A_BR ] → [ A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 ],
    [ t_T ; t_B ] → [ t_0 ; τ_1 ; t_2 ]
    where α_11 is a scalar

  [u_21, τ_1, a_21] := HouseV(a_21)
  y_01 := A_02 u_21
  A_02 := A_02 − (1/τ_1) y_01 u_21^T
  ψ_11 := a_12^T u_21
  a_12^T := a_12^T − (ψ_11/τ_1) u_21^T
  y_21 := A_22 u_21
  β := u_21^T y_21 / 2
  z_21 := (y_21 − β u_21/τ_1)/τ_1
  w_21 := (A_22^T u_21 − β u_21/τ_1)/τ_1
  A_22 := A_22 − (u_21 w_21^T + z_21 u_21^T)   (rank-2 update)

  Continue with
    [ A_TL  A_TR ; A_BL  A_BR ] ← [ A_00  a_01  A_02 ; a_10^T  α_11  a_12^T ; A_20  a_21  A_22 ],
    [ t_T ; t_B ] ← [ t_0 ; τ_1 ; t_2 ]

endwhile

Figure 15.11: Basic algorithm for reduction of a nonsymmetric matrix to upper Hessenberg form.


Original matrix      First iteration      Second iteration     Third iteration

× × × × ×            × × × × ×            × × × × ×            × × × × ×
× × × × ×            × × × × ×            × × × × ×            × × × × ×
× × × × ×     −→     0 × × × ×     −→     0 × × × ×     −→     0 × × × ×
× × × × ×            0 × × × ×            0 0 × × ×            0 0 × × ×
× × × × ×            0 × × × ×            0 0 × × ×            0 0 0 × ×

Figure 15.12: Illustration of reduction of a nonsymmetric matrix to upper Hessenberg form. The ×s denote nonzero elements in the matrix.

– The update A_{02} H requires

\begin{align*}
A_{02} & := A_{02}\Big(I - \tfrac{1}{\tau} u_{21} u_{21}^T\Big) \\
& = A_{02} - \tfrac{1}{\tau} \underbrace{A_{02} u_{21}}_{y_{01}} u_{21}^T \\
& = A_{02} - \tfrac{1}{\tau} y_{01} u_{21}^T \quad \text{(ger)}.
\end{align*}

– The update a_{12}^T H requires

\begin{align*}
a_{12}^T & := a_{12}^T \Big(I - \tfrac{1}{\tau} u_{21} u_{21}^T\Big) \\
& = a_{12}^T - \tfrac{1}{\tau} \underbrace{a_{12}^T u_{21}}_{\psi_{11}} u_{21}^T \\
& = a_{12}^T - \Big(\tfrac{\psi_{11}}{\tau}\Big) u_{21}^T \quad \text{(axpy)}.
\end{align*}

– The update of A_{22} requires

A_{22} := \Big(I - \tfrac{1}{\tau} u_{21} u_{21}^T\Big) A_{22} \Big(I - \tfrac{1}{\tau} u_{21} u_{21}^T\Big)


\begin{align*}
& = \Big(A_{22} - \tfrac{1}{\tau} u_{21} u_{21}^T A_{22}\Big)\Big(I - \tfrac{1}{\tau} u_{21} u_{21}^T\Big) \\
& = A_{22} - \tfrac{1}{\tau} u_{21} u_{21}^T A_{22} - \tfrac{1}{\tau} A_{22} u_{21} u_{21}^T + \tfrac{1}{\tau^2} u_{21} \underbrace{u_{21}^T A_{22} u_{21}}_{2\beta} u_{21}^T \\
& = A_{22} - \tfrac{1}{\tau}\Big(u_{21} u_{21}^T A_{22} - \tfrac{\beta}{\tau} u_{21} u_{21}^T\Big) - \tfrac{1}{\tau}\Big(A_{22} u_{21} u_{21}^T - \tfrac{\beta}{\tau} u_{21} u_{21}^T\Big) \\
& = A_{22} - u_{21} \underbrace{\tfrac{1}{\tau}\Big(u_{21}^T A_{22} - \tfrac{\beta}{\tau} u_{21}^T\Big)}_{w_{21}^T} - \underbrace{\tfrac{1}{\tau}\Big(A_{22} u_{21} - \tfrac{\beta}{\tau} u_{21}\Big)}_{z_{21}} u_{21}^T \\
& = \underbrace{A_{22} - u_{21} w_{21}^T - z_{21} u_{21}^T}_{\text{rank-2 update}}.
\end{align*}

• Continue this process with the updated A.

This is captured in the algorithm in Figure 15.11. It is also illustrated in Figure 15.12.

Homework 15.14 Give the approximate total cost for reducing a nonsymmetric A ∈ R^{n×n} to upper Hessenberg form.

* SEE ANSWER

For a detailed discussion on the blocked algorithm that uses FLAME notation, we recommend [32]

Gregorio Quintana-Ortí and Robert A. van de Geijn. Improving the Performance of Reduction to Hessenberg Form. ACM Transactions on Mathematical Software (TOMS), Vol. 32, No. 2, 2006.

and [43]

Field G. Van Zee, Robert A. van de Geijn, Gregorio Quintana-Ortí, G. Joseph Elizondo. Families of Algorithms for Reducing a Matrix to Condensed Form. ACM Transactions on Mathematical Software (TOMS), Vol. 39, No. 1, 2012.

In those papers, citations to earlier work can be found.

15.10.3 The implicitly double-shifted QR algorithm

To be added at a future date!


Chapter 16

Notes on the Method of Relatively Robust Representations (MRRR)

The purpose of this note is to give some high-level idea of how the Method of Relatively Robust Representations (MRRR) works.


Outline

16.1. MRRR, from 35,000 Feet
16.2. Cholesky Factorization, Again
16.3. The LDLT Factorization
16.4. The UDUT Factorization
16.5. The UDUT Factorization
16.6. The Twisted Factorization
16.7. Computing an Eigenvector from the Twisted Factorization

Page 285: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

16.1. MRRR, from 35,000 Feet 269

16.1 MRRR, from 35,000 Feet

The Method of Relatively Robust Representations (MRRR) is an algorithm that, given a tridiagonal matrix, computes eigenvectors associated with that matrix in O(n) time per eigenvector. This means it computes all eigenvectors in O(n²) time, which is much faster than the tridiagonal QR algorithm (which requires O(n³) computation). Notice that highly accurate eigenvalues of a tridiagonal matrix can themselves be computed in O(n²) time. So, it is legitimate to start by assuming that we have these highly accurate eigenvalues. For our discussion, we only need one.

The MRRR algorithm has at least two benefits over the symmetric QR algorithm for tridiagonal matrices:

• It can compute all eigenvalues and eigenvectors of a tridiagonal matrix in O(n²) time (versus O(n³) time for the symmetric QR algorithm).

• It can efficiently compute eigenvectors corresponding to a subset of eigenvalues.

The benefit when computing all eigenvalues and eigenvectors of a dense matrix is considerably less, because transforming the eigenvectors of the tridiagonal matrix back into the eigenvectors of the dense matrix requires an extra O(n³) computation, as discussed in [42]. In addition, making the method totally robust has been tricky, since the method does not rely exclusively on unitary similarity transformations.

The fundamental idea is that of a twisted factorization of the tridiagonal matrix. This note builds up to what that factorization is, how to compute it, and how to then compute an eigenvector with it. We start by reminding the reader of what the Cholesky factorization of a symmetric positive definite (SPD) matrix is. Then we discuss the Cholesky factorization of a tridiagonal SPD matrix. This then leads to the LDLT factorization of an indefinite matrix and of an indefinite tridiagonal matrix. Next follows a discussion of the UDUT factorization of an indefinite matrix, which then finally yields the twisted factorization. When the matrix is nearly singular, an approximation of the twisted factorization can then be used to compute an approximate eigenvector.

Notice that the devil is in the details of the MRRR algorithm. We will not tackle those details.

16.2 Cholesky Factorization, Again

We have discussed in class the Cholesky factorization, A = LL^T, which requires a matrix to be symmetric positive definite. (We will restrict our discussion to real matrices.) The following computes the Cholesky factorization:

• Partition A → \begin{bmatrix} α_{11} & a_{21}^T \\ a_{21} & A_{22} \end{bmatrix}.

• Update α_{11} := \sqrt{α_{11}}.

• Update a_{21} := a_{21}/α_{11}.

• Update A_{22} := A_{22} − a_{21} a_{21}^T (updating only the lower triangular part).

• Continue to compute the Cholesky factorization of the updated A_{22}.
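These steps translate directly into a minimal sketch (NumPy assumed; it operates on the lower triangular part only, mirroring the storage assumption, and assumes A is SPD):

```python
import numpy as np

def chol_unb(A):
    """Unblocked Cholesky factorization; the lower triangular part of a copy
    of A is overwritten with L."""
    L = np.tril(A.astype(float))
    n = L.shape[0]
    for j in range(n):
        L[j, j] = np.sqrt(L[j, j])               # alpha11 := sqrt(alpha11)
        L[j + 1:, j] /= L[j, j]                  # a21 := a21 / alpha11
        # A22 := A22 - a21 a21^T (lower triangular part only)
        L[j + 1:, j + 1:] -= np.tril(np.outer(L[j + 1:, j], L[j + 1:, j]))
    return L
```

One can check that L @ L.T reproduces a symmetric positive definite input up to roundoff.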


Algorithm: A := CHOL(A)

Partition A → [ A_TL  ? ; A_BL  A_BR ]
  where A_TL is 0×0

while m(A_TL) < m(A) do

  Repartition
    [ A_TL  ? ; A_BL  A_BR ] → [ A_00  ?  ? ; a_10^T  α_11  ? ; A_20  a_21  A_22 ]

  α_11 := \sqrt{α_{11}}
  a_21 := a_21/α_11
  A_22 := A_22 − a_21 a_21^T

  Continue with
    [ A_TL  ? ; A_BL  A_BR ] ← [ A_00  ?  ? ; a_10^T  α_11  ? ; A_20  a_21  A_22 ]

endwhile

Figure 16.1: Unblocked algorithm for computing the Cholesky factorization. Updates to A_22 affect only the lower triangular part.


Algorithm: A := CHOL_TRI(A)

Partition A → [ A_FF  ?  ? ; α_MF e_L^T  α_MM  ? ; 0  α_LM e_F  A_LL ]
  where A_FF is 0×0

while m(A_FF) < m(A) do

  Repartition
    [ A_FF  ?  ? ; α_MF e_L^T  α_MM  ? ; 0  α_LM e_F  A_LL ] →
    [ A_00  ?  ?  ? ; α_10 e_L^T  α_11  ?  ? ; 0  α_21  α_22  ? ; 0  0  α_32 e_F  A_33 ]

  α_11 := \sqrt{α_{11}}
  α_21 := α_21/α_11
  α_22 := α_22 − α_21²

  Continue with
    [ A_FF  ?  ? ; α_MF e_L^T  α_MM  ? ; 0  α_LM e_F  A_LL ] ←
    [ A_00  ?  ?  ? ; α_10 e_L^T  α_11  ?  ? ; 0  α_21  α_22  ? ; 0  0  α_32 e_F  A_33 ]

endwhile

Figure 16.2: Algorithm for computing the Cholesky factorization of a tridiagonal matrix.


The resulting algorithm is given in Figure 16.1.

In the special case where A is tridiagonal and SPD, the algorithm needs to be modified so that it can take advantage of zero elements:

• Partition

A → \begin{bmatrix} α_{11} & ? & ? \\ α_{21} & α_{22} & ? \\ 0 & α_{32} e_F & A_{33} \end{bmatrix},

where ? indicates the symmetric part that is not stored. Here e_F indicates the unit basis vector with a "1" as first element.

• Update α_{11} := \sqrt{α_{11}}.

• Update α_{21} := α_{21}/α_{11}.

• Update α_{22} := α_{22} − α_{21}².

• Continue to compute the Cholesky factorization of

\begin{bmatrix} α_{22} & ? \\ α_{32} e_F & A_{33} \end{bmatrix}.

The resulting algorithm is given in Figure 16.2. In that figure, it helps to interpret F, M, and L as First, Middle, and Last; e_F and e_L are the unit basis vectors with a "1" as first and last element, respectively. Notice that the Cholesky factor of a tridiagonal matrix is a lower bidiagonal matrix that overwrites the lower triangular part of the tridiagonal matrix A.

Naturally, the whole matrix need not be stored. But that detail is not important for our discussion.

16.3 The LDLT Factorization

Now, one can alternatively compute A = LDL^T, where L is unit lower triangular and D is diagonal. We will look at how this is done first and will then note that this factorization can be computed for any indefinite (nonsingular) symmetric matrix. (Notice: the L computed this way is not the same as the L computed by the Cholesky factorization.) Partition

A → \begin{bmatrix} α_{11} & a_{21}^T \\ a_{21} & A_{22} \end{bmatrix},
L → \begin{bmatrix} 1 & 0 \\ l_{21} & L_{22} \end{bmatrix},
and D → \begin{bmatrix} δ_{11} & 0 \\ 0 & D_{22} \end{bmatrix}.

Then

\begin{bmatrix} α_{11} & a_{21}^T \\ a_{21} & A_{22} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ l_{21} & L_{22} \end{bmatrix}
\begin{bmatrix} δ_{11} & 0 \\ 0 & D_{22} \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ l_{21} & L_{22} \end{bmatrix}^T
=
\begin{bmatrix} δ_{11} & 0 \\ δ_{11} l_{21} & L_{22} D_{22} \end{bmatrix}
\begin{bmatrix} 1 & l_{21}^T \\ 0 & L_{22}^T \end{bmatrix}
=
\begin{bmatrix} δ_{11} & ? \\ δ_{11} l_{21} & δ_{11} l_{21} l_{21}^T + L_{22} D_{22} L_{22}^T \end{bmatrix}.


Algorithm: A := LDLT(A)

Partition A → [ A_TL  ? ; A_BL  A_BR ]
  where A_TL is 0×0

while m(A_TL) < m(A) do

  Repartition
    [ A_TL  ? ; A_BL  A_BR ] → [ A_00  ?  ? ; a_10^T  α_11  ? ; A_20  a_21  A_22 ]
    where α_11 is 1×1

  (updates to be filled in as part of Homework 16.1)

  Continue with
    [ A_TL  ? ; A_BL  A_BR ] ← [ A_00  ?  ? ; a_10^T  α_11  ? ; A_20  a_21  A_22 ]

endwhile

Figure 16.3: Unblocked algorithm for computing A → LDL^T, overwriting A.

This suggests the following algorithm for overwriting the strictly lower triangular part of A with the strictly lower triangular part of L and the diagonal of A with D:

• Partition A → \begin{bmatrix} α_{11} & a_{21}^T \\ a_{21} & A_{22} \end{bmatrix}.

• α_{11} := δ_{11} = α_{11} (no-op).

• Compute l_{21} := a_{21}/α_{11}.

• Update A_{22} := A_{22} − l_{21} a_{21}^T (updating only the lower triangular part).

• a_{21} := l_{21}.

• Continue with computing A_{22} → L_{22} D_{22} L_{22}^T.


This algorithm will complete as long as δ_{11} ≠ 0, which happens when A is nonsingular. This is equivalent to A not having zero as an eigenvalue. Such a matrix is also called indefinite.
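The steps above translate directly into a sketch (NumPy assumed; a dense version, although for the tridiagonal case of Homework 16.2 one would store only two vectors):

```python
import numpy as np

def ldlt_unb(A):
    """Unblocked LDL^T factorization of a symmetric nonsingular matrix;
    returns the unit lower triangular L and the diagonal of D."""
    W = np.tril(A.astype(float))
    n = W.shape[0]
    for j in range(n):
        delta = W[j, j]                # delta11 := alpha11 (no-op)
        W[j + 1:, j] /= delta          # l21 := a21 / alpha11
        # A22 := A22 - l21 a21^T, where a21 = delta11 * l21 (lower part only)
        W[j + 1:, j + 1:] -= delta * np.tril(np.outer(W[j + 1:, j], W[j + 1:, j]))
    d = np.diag(W).copy()
    L = np.tril(W, -1) + np.eye(n)
    return L, d
```

Afterwards L @ np.diag(d) @ L.T reproduces a symmetric input, whether or not it is positive definite, as long as no δ_{11} encountered is zero.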

Homework 16.1 Modify the algorithm in Figure 16.1 so that it computes the LDLT factorization. (Fill in Figure 16.3.)
* SEE ANSWER

Homework 16.2 Modify the algorithm in Figure 16.2 so that it computes the LDLT factorization of a tridiagonal matrix. (Fill in Figure 16.4.) What is the approximate cost, in floating point operations, of computing the LDLT factorization of a tridiagonal matrix? Count a divide, multiply, and add/subtract as a floating point operation each. Show how you came up with the algorithm, similar to how we derived the algorithm for the tridiagonal Cholesky factorization.
* SEE ANSWER

Notice that computing the LDLT factorization of an indefinite matrix is not a good idea when A is nearly singular. This would lead to some δ_{11} being nearly zero, meaning that dividing by it leads to very large entries in l_{21} and corresponding element growth in A_{22}. (In other words, strange things will happen "down stream" from the small δ_{11}.) Not a good thing. Unfortunately, we are going to need something like this factorization specifically for the case where A is nearly singular.

16.4 The UDUT Factorization

The fact that the LU, Cholesky, and LDLT factorizations start in the top-left corner of the matrix and work towards the bottom-right is an accident of history. Gaussian elimination works in that direction, hence so does the LU factorization, and the rest kind of follow.

One can imagine, given an SPD matrix A, instead computing A = UU^T, where U is upper triangular. Such a computation starts in the lower-right corner of the matrix and works towards the top-left. Similarly, an algorithm can be created for computing A = UDU^T, where U is unit upper triangular and D is diagonal.

Homework 16.3 Derive an algorithm that, given an indefinite matrix A, computes A = UDU^T. Overwrite only the upper triangular part of A. (Fill in Figure 16.5.) Show how you came up with the algorithm, similar to how we derived the algorithm for LDLT.

* SEE ANSWER

16.5 The UDUT Factorization

Homework 16.4 Derive an algorithm that, given an indefinite tridiagonal matrix A, computes A = UDU^T. Overwrite only the upper triangular part of A. (Fill in Figure 16.6.) Show how you came up with the algorithm, similar to how we derived the algorithm for LDLT.

* SEE ANSWER

Notice that the UDUT factorization has the exact same shortcomings as does LDLT when applied to a singular matrix. If D has a small element on the diagonal, it is again "down stream" that this can cause large elements in the factored matrix. But now "down stream" means towards the top-left, since the algorithm moves in the opposite direction.


Algorithm: A := LDLT_TRI(A)

Partition A → [ A_FF  ?  ? ; α_MF e_L^T  α_MM  ? ; 0  α_LM e_F  A_LL ]
  where A_FF is 0×0

while m(A_FF) < m(A) do

  Repartition
    [ A_FF  ?  ? ; α_MF e_L^T  α_MM  ? ; 0  α_LM e_F  A_LL ] →
    [ A_00  ?  ?  ? ; α_10 e_L^T  α_11  ?  ? ; 0  α_21  α_22  ? ; 0  0  α_32 e_F  A_33 ]
    where α_22 is a scalar

  (updates to be filled in as part of Homework 16.2)

  Continue with
    [ A_FF  ?  ? ; α_MF e_L^T  α_MM  ? ; 0  α_LM e_F  A_LL ] ←
    [ A_00  ?  ?  ? ; α_10 e_L^T  α_11  ?  ? ; 0  α_21  α_22  ? ; 0  0  α_32 e_F  A_33 ]

endwhile

Figure 16.4: Algorithm for computing the LDLT factorization of a tridiagonal matrix.


Algorithm: A := UDUT(A)

Partition A → [ A_TL  A_TR ; ?  A_BR ]
  where A_BR is 0×0

while m(A_BR) < m(A) do

  Repartition
    [ A_TL  A_TR ; ?  A_BR ] → [ A_00  a_01  A_02 ; ?  α_11  a_12^T ; ?  ?  A_22 ]
    where α_11 is 1×1

  (updates to be filled in as part of Homework 16.3)

  Continue with
    [ A_TL  A_TR ; ?  A_BR ] ← [ A_00  a_01  A_02 ; ?  α_11  a_12^T ; ?  ?  A_22 ]

endwhile

Figure 16.5: Unblocked algorithm for computing A = UDU^T. Updates to A_00 affect only the upper triangular part.


Algorithm: A := UDUT_TRI(A)

Partition A → [ A_FF  α_FM e_L  0 ; ?  α_MM  α_ML e_F^T ; ?  ?  A_LL ]
  where A_LL is 0×0

while m(A_LL) < m(A) do

  Repartition
    [ A_FF  α_FM e_L  0 ; ?  α_MM  α_ML e_F^T ; ?  ?  A_LL ] →
    [ A_00  α_01 e_L  0  0 ; ?  α_11  α_12  0 ; ?  ?  α_22  α_23 e_F^T ; ?  ?  ?  A_33 ]
    where α_11 is a scalar

  (updates to be filled in as part of Homework 16.4)

  Continue with
    [ A_FF  α_FM e_L  0 ; ?  α_MM  α_ML e_F^T ; ?  ?  A_LL ] ←
    [ A_00  α_01 e_L  0  0 ; ?  α_11  α_12  0 ; ?  ?  α_22  α_23 e_F^T ; ?  ?  ?  A_33 ]

endwhile

Figure 16.6: Algorithm for computing the UDUT factorization of a tridiagonal matrix.


16.6 The Twisted Factorization

Let us assume that A is a tridiagonal matrix and that we are given one of its eigenvalues (or rather, a very good approximation), λ, and let us assume that this eigenvalue has multiplicity one. Indeed, we are going to assume that it is well-separated, meaning there is no other eigenvalue close to it.

Here is a way to compute an associated eigenvector: Find a nonzero vector in the null space of B = A − λI. Let us think back to how one was taught how to do this:

• Reduce B to row-echelon form. This may require pivoting.

• Find a column that has no pivot in it. This identifies an independent (free) variable.

• Set the element in vector x corresponding to the independent variable to one.

• Solve for the dependent variables.

There are a number of problems with this approach:

• It does not take advantage of symmetry.

• It is inherently unstable, since you are working with a matrix, B = A − λI, that inherently has a bad condition number. (Indeed, it is infinity if λ is an exact eigenvalue.)

• λ is not an exact eigenvalue when it is computed by a computer, and hence B is not exactly singular. So, no independent variables will even be found.

These are only some of the problems. To overcome this, one computes something called a twisted factorization.

Again, let B be a tridiagonal matrix. Assume that λ is an approximate eigenvalue of B and let A = B − λI. We are going to compute an approximate eigenvector of B associated with λ by computing a vector that is in the null space of a singular matrix that is close to A. We will assume that λ is an eigenvalue of B that has multiplicity one and is well-separated from other eigenvalues. Thus, the singular matrix close to A has a null space with dimension one. Thus, we would expect A = LDL^T to have one element on the diagonal of D that is essentially zero. Ditto for A = UEU^T, where E is diagonal and we use this letter to be able to distinguish the two diagonal matrices.

Thus, we have

• λ is an approximate (but not exact) eigenvalue of B.

• A = B − λI is indefinite.

• A = LDL^T, where L is bidiagonal, unit lower triangular, and D is diagonal with one small element on the diagonal.

• A = UEU^T, where U is bidiagonal, unit upper triangular, and E is diagonal with one small element on the diagonal.


Let

A = \begin{bmatrix} A_{00} & α_{10} e_L & 0 \\ α_{10} e_L^T & α_{11} & α_{21} e_F^T \\ 0 & α_{21} e_F & A_{22} \end{bmatrix},
L = \begin{bmatrix} L_{00} & 0 & 0 \\ λ_{10} e_L^T & 1 & 0 \\ 0 & λ_{21} e_F & L_{22} \end{bmatrix},
U = \begin{bmatrix} U_{00} & υ_{01} e_L & 0 \\ 0 & 1 & υ_{21} e_F^T \\ 0 & 0 & U_{22} \end{bmatrix},

D = \begin{bmatrix} D_{00} & 0 & 0 \\ 0 & δ_1 & 0 \\ 0 & 0 & D_{22} \end{bmatrix},
and
E = \begin{bmatrix} E_{00} & 0 & 0 \\ 0 & ε_1 & 0 \\ 0 & 0 & E_{22} \end{bmatrix},

where all the partitioning is “conformal”.

Homework 16.5 Show that, provided φ_1 is chosen appropriately,

  ( L_00        0   0          ) ( D_00  0    0    ) ( L_00        0   0          )^T   ( A_00        α_10 e_L  0
  ( λ_10 e_L^T  1   υ_21 e_F^T ) ( 0     φ_1  0    ) ( λ_10 e_L^T  1   υ_21 e_F^T )   = ( α_10 e_L^T  α_11      α_21 e_F^T
  ( 0           0   U_22       ) ( 0     0    E_22 ) ( 0           0   U_22       )     ( 0           α_21 e_F  A_22      ).

(Hint: multiply out A = LDL^T and A = UEU^T with the partitioned matrices first. Then multiply out the above. Compare and match...) How should φ_1 be chosen? What is the cost of computing the twisted factorization, given that you have already computed the LDL^T and UEU^T factorizations? A "Big O" estimate is sufficient. Be sure to take into account what e_L^T D_00 e_L and e_F^T E_22 e_F equal in your cost estimate.

* SEE ANSWER

16.7 Computing an Eigenvector from the Twisted Factorization

The way the method now works for computing the desired approximate eigenvector is to find the φ_1 that is smallest in magnitude of all possible such values. In other words, one can partition A, L, U, D, and E in many ways, singling out any of the diagonal values of these matrices. The partitioning chosen is the one that makes φ_1 the smallest of all possibilities. It is then set to zero, so that

  ( L_00        0   0          ) ( D_00  0  0    ) ( L_00        0   0          )^T   ( A_00        α_10 e_L  0
  ( λ_10 e_L^T  1   υ_21 e_F^T ) ( 0     0  0    ) ( λ_10 e_L^T  1   υ_21 e_F^T )   ≈ ( α_10 e_L^T  α_11      α_21 e_F^T
  ( 0           0   U_22       ) ( 0     0  E_22 ) ( 0           0   U_22       )     ( 0           α_21 e_F  A_22      ).

Because λ was assumed to have multiplicity one and to be well-separated, the resulting twisted factorization has "nice" properties. We won't get into what that exactly means.
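A sketch of how the twist position can be picked, building on the routines above. It uses the identity φ_k = dd_k + ee_k − α_{k,k} for the middle element of the twisted factorization, which is what multiplying out the factorizations in Homework 16.5 yields (treat it as a claim to verify):

import numpy as np

def twist_position(d, e):
    # All candidate twist values phi_k in O(n), and the index of the smallest.
    dd, l = ldlt_tridiag(d, e)      # from the sketch above
    ee, u = ueut_tridiag(d, e)
    phi = dd + ee - d               # phi_k = dd_k + ee_k - alpha_{k,k}
    k = int(np.argmin(np.abs(phi)))
    return phi, k, (dd, l, ee, u)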


Homework 16.6 Compute x_0, χ_1, and x_2 so that

  ( L_00        0   0          ) ( D_00  0  0    ) ( L_00        0   0          )^T ( x_0 )   ( 0 )
  ( λ_10 e_L^T  1   υ_21 e_F^T ) ( 0     0  0    ) ( λ_10 e_L^T  1   υ_21 e_F^T )   ( χ_1 ) = ( 0 ),
  ( 0           0   U_22       ) ( 0     0  E_22 ) ( 0           0   U_22       )   ( x_2 )   ( 0 )

where x = ( x_0 ; χ_1 ; x_2 ) is not a zero vector. (Hint: look for an x that the transposed twisted factor maps to ( 0 ; 1 ; 0 ), partitioned conformally.) What is the cost of this computation, given that L_00 and U_22 have special structure?

* SEE ANSWER

The vector x that is so computed is the desired approximate eigenvector of B.

Now, if the eigenvalues of B are well-separated, and one follows essentially this procedure to find eigenvectors associated with each eigenvalue, the resulting eigenvectors are quite orthogonal to each other. We know that the eigenvectors of a symmetric matrix with distinct eigenvalues should be orthogonal, so this is a desirable property.
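Putting the pieces together, here is a hedged sketch of the whole computation: factor, pick the twist, set φ_k to zero, and solve with the twisted factor. The two back-substitutions below are one way of solving the system in Homework 16.6, so skip this sketch if you want to work the homework out yourself.

import numpy as np

def twisted_eigvec(d, e):
    # Approximate null vector of the tridiagonal A = B - lambda*I
    # (diagonal d, off-diagonal e), computed in O(n).
    phi, k, (dd, l, ee, u) = twist_position(d, e)
    n = len(d)
    x = np.zeros(n)
    x[k] = 1.0
    for i in range(k - 1, -1, -1):   # solve with L00^T, bottom to top
        x[i] = -l[i] * x[i + 1]
    for i in range(k + 1, n):        # solve with U22^T, top to bottom
        x[i] = -u[i - 1] * x[i - 1]
    return x / np.linalg.norm(x)

For a symmetric tridiagonal B and an accurate, well-separated λ, calling twisted_eigvec with d equal to the diagonal of B minus λ and e equal to its off-diagonal returns a unit vector x with a small residual ‖(B − λI)x‖.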

The approach becomes very tricky when eigenvalues are "clustered", meaning that some of them are very close to each other. A careful scheme that shifts the matrix is then used to make eigenvalues in a cluster relatively well-separated. But that goes beyond this note.


Chapter 17

Notes on Computing the SVD of a Matrix

For simplicity we will focus on the case where A ∈ R^{n×n}. We will assume that the reader has read Chapter 15.


Outline

17.1. Background
17.2. Reduction to Bidiagonal Form
17.3. The QR Algorithm with a Bidiagonal Matrix
17.4. Putting it all together


17.1 Background

Recall:

Theorem 17.1 If A ∈ R^{m×n} then there exist unitary matrices U ∈ R^{m×m} and V ∈ R^{n×n}, and a diagonal matrix Σ ∈ R^{m×n}, such that A = UΣV^T. This is known as the Singular Value Decomposition.

There is a relation between the SVD of a matrix A and the Spectral Decomposition of A^T A:

Homework 17.2 If A = UΣV^T is the SVD of A, then A^T A = V Σ^2 V^T is the Spectral Decomposition of A^T A.

* SEE ANSWER

The above exercise suggests steps for computing the Reduced SVD:

• Form C = A^T A.

• Compute unitary V and diagonal Λ such that C = V ΛV^T, ordering the eigenvalues from largest to smallest on the diagonal of Λ.

• Form W = AV. Then W = UΣ, so that U and Σ can be computed from W by choosing the diagonal elements of Σ to equal the lengths of the corresponding columns of W and normalizing those columns by those lengths.

The problem with this approach is that forming A^T A squares the condition number of the problem. We will show how to avoid this.
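To make the three steps concrete, here is a minimal numpy sketch of this (not recommended) approach, assuming A has full column rank so that the columns of W can be normalized:

import numpy as np

def svd_via_ata(A):
    # SVD through the spectral decomposition of A^T A; squares kappa(A).
    C = A.T @ A
    lam, V = np.linalg.eigh(C)           # eigenvalues in ascending order
    V = V[:, np.argsort(lam)[::-1]]      # reorder from largest to smallest
    W = A @ V                            # W = U Sigma
    sigma = np.linalg.norm(W, axis=0)    # column lengths = singular values
    U = W / sigma                        # normalize the columns of W
    return U, sigma, V

For a matrix with κ(A) ≈ 10^8 in double precision, C = A^T A has condition number ≈ 10^16, and the small singular values computed this way carry essentially no correct digits.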

17.2 Reduction to Bidiagonal Form

In the last chapter, we saw that it is beneficial to start by reducing a matrix to tridiagonal or upper Hessenberg form when computing the Spectral or Schur decompositions. The corresponding step when computing the SVD of a given matrix is to reduce the matrix to bidiagonal form.

We assume that the bidiagonal matrix overwrites matrix A. Householder vectors will again play a role, and will again overwrite elements that are annihilated (set to zero).

• Partition

    A → ( α_11  a_12^T
          a_21  A_22   ).

  In the first iteration, this can be visualized as

    × × × ×
    × × × ×
    × × × ×
    × × × ×
    × × × ×

• We introduce zeroes below the "diagonal" in the first column by computing a Householder transformation and applying it from the left:


Algorithm: [A, t, r] := BIDRED_UNB(A)

Partition
  A → ( A_TL  A_TR
        A_BL  A_BR ),   t → ( t_T
                              t_B ),   r → ( r_T
                                             r_B )
  where A_TL is 0×0 and t_T, r_T have 0 elements
while m(A_TL) < m(A) do
  Repartition
    ( A_TL  A_TR )     ( A_00    a_01  A_02
    ( A_BL  A_BR )  →  ( a_10^T  α_11  a_12^T
                       ( A_20    a_21  A_22   ),
    ( t_T ; t_B ) → ( t_0 ; τ_1 ; t_2 ),   ( r_T ; r_B ) → ( r_0 ; ρ_1 ; r_2 )
    where α_11, τ_1, and ρ_1 are scalars

  [ ( 1 ; u_21 ), τ_1, ( α_11 ; a_21 ) ] := HouseV( α_11 ; a_21 )
  w_12^T := (a_12^T + u_21^T A_22)/τ_1
  a_12^T := a_12^T − w_12^T
  A_22 := A_22 − u_21 w_12^T     (rank-1 update)
  [ v_12, ρ_1, a_12 ] := HouseV( a_12 )
  w_21 := A_22 v_12 / ρ_1
  A_22 := A_22 − w_21 v_12^T     (rank-1 update)

  Continue with
    (the reverse of the repartitioning above)
endwhile

Figure 17.1: Basic algorithm for reduction of a nonsymmetric matrix to bidiagonal form.


  × × × ×      × × 0 0      × × 0 0
  × × × ×      0 × × ×      0 × × 0
  × × × ×  −→  0 × × ×  −→  0 0 × ×  −→
  × × × ×      0 × × ×      0 0 × ×
  × × × ×      0 × × ×      0 0 × ×

  Original       First          Second
  matrix         iteration      iteration

  × × 0 0      × × 0 0
  0 × × 0      0 × × 0
  0 0 × ×  −→  0 0 × ×
  0 0 0 ×      0 0 0 ×
  0 0 0 ×      0 0 0 0

  Third          Fourth
  iteration      iteration

Figure 17.2: Illustration of reduction to bidiagonal form.

– Let [ ( 1 ; u_21 ), τ_1, ( α_11 ; a_21 ) ] := HouseV( α_11 ; a_21 ). This not only computes the Householder vector, but also zeroes the entries in a_21 and updates α_11.

– Update

    ( α_11  a_12^T )      ( I − (1/τ_1) ( 1 ; u_21 )( 1 ; u_21 )^T ) ( α_11  a_12^T )     ( α_11  a_12^T )
    ( a_21  A_22   )  :=                                             ( a_21  A_22   )  =  ( 0     A_22   )

  where

  * α_11 is updated as part of the computation of the Householder vector;
  * w_12^T := (a_12^T + u_21^T A_22)/τ_1;
  * a_12^T := a_12^T − w_12^T; and
  * A_22 := A_22 − u_21 w_12^T.

  This introduces zeroes below the "diagonal" in the first column:

    × × × ×      × × × ×
    × × × ×      0 × × ×
    × × × ×  −→  0 × × ×
    × × × ×      0 × × ×
    × × × ×      0 × × ×


  The zeroes below α_11 are not actually written. Instead, the implementation overwrites these elements with the corresponding elements of the vector u_21.

• Next, we introduce zeroes in the first row to the right of the element on the superdiagonal by computing a Householder transformation from a_12^T:

  – Let [ v_12, ρ_1, a_12 ] := HouseV( a_12 ). This not only computes the Householder vector, but also zeroes all but the first entry in a_12^T, updating that first entry.

  – Update

      ( a_12^T )      ( a_12^T ) ( I − (1/ρ_1) v_12 v_12^T )     ( α_12 e_0^T )
      ( A_22   )  :=  ( A_22   )                              =  ( A_22       )

    where

    * α_12 equals the first element of the updated a_12^T;
    * w_21 := A_22 v_12 / ρ_1;
    * A_22 := A_22 − w_21 v_12^T.

    This introduces zeroes to the right of the "superdiagonal" in the first row:

      × × × ×      × × 0 0
      0 × × ×      0 × × ×
      0 × × ×  −→  0 × × ×
      0 × × ×      0 × × ×
      0 × × ×      0 × × ×

  – The zeros to the right of the first element of a_12^T are not actually written. Instead, the implementation overwrites these elements with the corresponding elements of the vector v_12^T.

• Continue this process with the updated A_22.

The above observations are captured in the algorithm in Figure 17.1. It is also illustrated in Figure 17.2.
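The following numpy sketch mirrors the algorithm (our own helper names, not a library interface). For readability it stores the explicit zeroes rather than overwriting the annihilated entries with the Householder vectors, as Figure 17.1 and a practical implementation would:

import numpy as np

def housev(x):
    # Householder vector u (u[0] = 1) and tau = (u^T u)/2 such that
    # (I - u u^T / tau) x = (alpha, 0, ..., 0)^T.
    u = np.array(x, dtype=float)
    s = np.sign(u[0]) if u[0] != 0 else 1.0
    alpha = -s * np.linalg.norm(u)
    u[0] -= alpha
    if u[0] != 0.0:
        u /= u[0]
    tau = (u @ u) / 2.0 if (u @ u) > 0 else 1.0   # x = 0: reflector is I
    return u, tau, alpha

def bidred_unb(A):
    # Unblocked reduction of A (m >= n) to upper bidiagonal form.
    B = np.array(A, dtype=float)
    m, n = B.shape
    for j in range(n):
        # Householder from the left zeroes column j below the diagonal.
        u, tau, alpha = housev(B[j:, j])
        B[j, j], B[j + 1:, j] = alpha, 0.0
        w = (u @ B[j:, j + 1:]) / tau
        B[j:, j + 1:] -= np.outer(u, w)           # rank-1 update
        if j < n - 2:
            # Householder from the right zeroes row j past the superdiagonal.
            v, rho, beta = housev(B[j, j + 1:])
            B[j, j + 1], B[j, j + 2:] = beta, 0.0
            w = (B[j + 1:, j + 1:] @ v) / rho
            B[j + 1:, j + 1:] -= np.outer(w, v)   # rank-1 update
    return B

Since only orthogonal transformations are applied, a quick sanity check is that the singular values of a random A and of bidred_unb(A) agree to machine precision.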

Homework 17.4 Give the approximate total cost for reducing A ∈ R^{n×n} to bidiagonal form.

* SEE ANSWER

For a detailed discussion of the blocked algorithm that uses FLAME notation, we recommend [43]:

Field G. Van Zee, Robert A. van de Geijn, Gregorio Quintana-Ortí, G. Joseph Elizondo. Families of Algorithms for Reducing a Matrix to Condensed Form. ACM Transactions on Mathematical Software (TOMS), Vol. 39, No. 1, 2012.


17.3 The QR Algorithm with a Bidiagonal Matrix

Let B = U_B^T A V_B be bidiagonal, where U_B and V_B have orthonormal columns.

Lemma 17.5 Let B ∈ R^{m×n} be upper bidiagonal and T ∈ R^{n×n} with T = B^T B. Then T is tridiagonal.

Proof: Partition

  B = ( B_00  β_10 e_l
        0     β_11     ),

where e_l denotes the unit basis vector with a "1" in the last entry. Then

  B^T B = ( B_00  β_10 e_l )^T ( B_00  β_10 e_l )     ( B_00^T       0    ) ( B_00  β_10 e_l )
          ( 0     β_11     )   ( 0     β_11     )  =  ( β_10 e_l^T  β_11 ) ( 0     β_11     )

        = ( B_00^T B_00       β_10 B_00^T e_l
            β_10 e_l^T B_00   β_10^2 + β_11^2 ),

where β_10 B_00^T e_l is (clearly) a vector with only a nonzero in the last entry.

The above proof also shows that if B has no zeroes on its superdiagonal, then neither does B^T B (on its sub- and superdiagonal).

This means that one can apply the QR algorithm to the matrix B^T B. The problem, again, is that this squares the condition number of the problem. The following extension of the Implicit Q Theorem comes to the rescue:

Theorem 17.6 Let C, B ∈ R^{m×n}. If there exist unitary matrices U ∈ R^{m×m} and V ∈ R^{n×n} so that B = U^T CV is upper bidiagonal and has positive values on its superdiagonal, then B and V are uniquely determined by C and the first column of V.

The above theorem supports an algorithm that, starting with an upper bidiagonal matrix B, implicitly performs a shifted QR algorithm with T = B^T B.

Consider the 5×5 bidiagonal matrix

  B = ( β_{0,0}  β_{0,1}  0        0        0
        0        β_{1,1}  β_{1,2}  0        0
        0        0        β_{2,2}  β_{2,3}  0
        0        0        0        β_{3,3}  β_{3,4}
        0        0        0        0        β_{4,4} ).

Then T = B^T B equals

  T = ( β_{0,0}^2        ?  0  0  0
        β_{0,1}β_{0,0}   ?  ?  0  0
        0                ?  ?  ?  0
        0                0  ?  ?  ?
        0                0  0  ?  β_{3,4}^2 + β_{4,4}^2 ),


where the ?s indicate "don't care" entries. Now, if an iteration of the implicitly shifted QR algorithm were executed with this matrix, then the shift would be µ_k = β_{3,4}^2 + β_{4,4}^2 (actually, it is usually computed from the bottom-right 2×2 matrix, a minor detail) and the first Givens' rotation would be computed so that

  ( γ_{0,1}   σ_{0,1} ) ( β_{0,0}^2 − µ_k )
  ( −σ_{0,1}  γ_{0,1} ) ( β_{0,1}β_{0,0}  )

has a zero second entry. Let us call the n×n matrix with this as its top-left submatrix G_0. Then applying this from the left and right of T means forming G_0 T G_0^T = G_0 B^T B G_0^T = (B G_0^T)^T (B G_0^T). What if we only apply it to B? Then

  ( β_{0,0}  β_{0,1}  0        0        0          ( β_{0,0}  β_{0,1}  0        0        0          ( γ_{0,1}  −σ_{0,1}  0  0  0
    β_{1,0}  β_{1,1}  β_{1,2}  0        0            0        β_{1,1}  β_{1,2}  0        0            σ_{0,1}  γ_{0,1}   0  0  0
    0        0        β_{2,2}  β_{2,3}  0        =   0        0        β_{2,2}  β_{2,3}  0            0        0         1  0  0
    0        0        0        β_{3,3}  β_{3,4}      0        0        0        β_{3,3}  β_{3,4}      0        0         0  1  0
    0        0        0        0        β_{4,4} )    0        0        0        0        β_{4,4} )    0        0         0  0  1 ).

The idea now is to apply Givens' rotations from the left and right to "chase the bulge", except that now the rotations applied from the two sides need not be the same. Thus, next, from ( β_{0,0} ; β_{1,0} ) one can compute γ_{1,0} and σ_{1,0} so that

  ( γ_{1,0}   σ_{1,0} ) ( β_{0,0} )   ( β̃_{0,0} )
  ( −σ_{1,0}  γ_{1,0} ) ( β_{1,0} ) = ( 0        ).

Then

  ( β̃_{0,0}  β̃_{0,1}  β_{0,2}  0        0          ( γ_{1,0}   σ_{1,0}  0  0  0      ( β_{0,0}  β_{0,1}  0        0        0
    0        β̃_{1,1}  β_{1,2}  0        0            −σ_{1,0}  γ_{1,0}  0  0  0        β_{1,0}  β_{1,1}  β_{1,2}  0        0
    0        0        β_{2,2}  β_{2,3}  0        =   0         0        1  0  0        0        0        β_{2,2}  β_{2,3}  0
    0        0        0        β_{3,3}  β_{3,4}      0         0        0  1  0        0        0        0        β_{3,3}  β_{3,4}
    0        0        0        0        β_{4,4} )    0         0        0  0  1 )      0        0        0        0        β_{4,4} ).

Continuing, from ( β̃_{0,1} ; β_{0,2} ) one can compute γ_{0,2} and σ_{0,2} so that

  ( γ_{0,2}   σ_{0,2} ) ( β̃_{0,1} )   ( β̃̃_{0,1} )
  ( −σ_{0,2}  γ_{0,2} ) ( β_{0,2} ) = ( 0        ).

Then

  ( β̃_{0,0}  β̃̃_{0,1}  0        0        0          ( β̃_{0,0}  β̃_{0,1}  β_{0,2}  0        0          ( 1  0        0         0  0
    0        β̃̃_{1,1}  β̃_{1,2}  0        0            0        β̃_{1,1}  β_{1,2}  0        0            0  γ_{0,2}  −σ_{0,2}  0  0
    0        β_{2,1}  β_{2,2}  β_{2,3}  0        =   0        0        β_{2,2}  β_{2,3}  0            0  σ_{0,2}  γ_{0,2}   0  0
    0        0        0        β_{3,3}  β_{3,4}      0        0        0        β_{3,3}  β_{3,4}      0  0        0         1  0
    0        0        0        0        β_{4,4} )    0        0        0        0        β_{4,4} )    0  0        0         0  1 ).

And so forth, as illustrated in Figure 17.3.


  × × 0 0 0      × × × 0 0      × × 0 0 0      × × 0 0 0
  × × × 0 0      0 × × 0 0      0 × × 0 0      0 × × × 0
  0 0 × × 0  −→  0 0 × × 0  −→  0 × × × 0  −→  0 0 × × 0  −→
  0 0 0 × ×      0 0 0 × ×      0 0 0 × ×      0 0 0 × ×
  0 0 0 0 ×      0 0 0 0 ×      0 0 0 0 ×      0 0 0 0 ×

  × × 0 0 0      × × 0 0 0      × × 0 0 0      × × 0 0 0
  0 × × 0 0      0 × × 0 0      0 × × 0 0      0 × × 0 0
  0 0 × × 0  −→  0 0 × × ×  −→  0 0 × × 0  −→  0 0 × × 0
  0 0 × × ×      0 0 0 × ×      0 0 0 × ×      0 0 0 × ×
  0 0 0 0 ×      0 0 0 0 ×      0 0 0 × ×      0 0 0 0 ×

Figure 17.3: Illustration of one sweep of an implicit QR algorithm with a bidiagonal matrix.
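The following numpy sketch performs one such sweep on a dense upper bidiagonal float array (modified in place; our own routine names). A production implementation, such as LAPACK's bdsqr, stores only the two diagonals and accumulates the rotations into the singular vector matrices; here the rotations are simply applied to B:

import numpy as np

def givens(a, b):
    # gamma, sigma with [[gamma, sigma], [-sigma, gamma]] @ (a, b) = (r, 0).
    r = np.hypot(a, b)
    return (1.0, 0.0) if r == 0.0 else (a / r, b / r)

def implicit_qr_sweep(B):
    # One implicitly shifted QR sweep for T = B^T B, applied directly to B.
    n = B.shape[0]
    mu = B[n - 2, n - 1] ** 2 + B[n - 1, n - 1] ** 2   # simple shift
    # First rotation, from the first column of T - mu I, applied from the right.
    gamma, sigma = givens(B[0, 0] ** 2 - mu, B[0, 0] * B[0, 1])
    B[:, 0:2] = B[:, 0:2] @ np.array([[gamma, -sigma], [sigma, gamma]])
    for k in range(n - 1):
        # Rotation from the left removes the bulge below the diagonal ...
        gamma, sigma = givens(B[k, k], B[k + 1, k])
        B[k:k + 2, :] = np.array([[gamma, sigma], [-sigma, gamma]]) @ B[k:k + 2, :]
        if k < n - 2:
            # ... creating a bulge above the superdiagonal, which a rotation
            # from the right chases one position further down.
            gamma, sigma = givens(B[k, k + 1], B[k, k + 2])
            B[:, k + 1:k + 3] = B[:, k + 1:k + 3] @ np.array([[gamma, -sigma], [sigma, gamma]])
    return B

Repeated sweeps drive the superdiagonal toward zero; with deflation and a properly chosen shift, this is the bulk of the bidiagonal SVD iteration.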


17.4 Putting it all together


Algorithm: B := CHASEBULGE_BID(B)

Partition
  B → ( B_TL  B_TM  ⋆
        0     B_MM  B_MR
        0     0     B_BR )
  where B_TL is 0×0 and B_MM is 2×2
while m(B_BR) ≥ 0 do
  Repartition
    ( B_TL  B_TM  ⋆    )     ( B_00  β_01 e_l  0     0     0
    ( 0     B_MM  B_MR )  →  ( 0     β_11      β_12  0     0
    ( 0     0     B_BR )     ( 0     β_21      β_22  β_23  0
                             ( 0     0         0     β_33  β_34 e_0^T
                             ( 0     0         0     0     B_44      )
    where β_11 and β_33 are scalars (during the final step, β_33 is 0×0)

  Compute (γ_1^L, σ_1^L) s.t.
    ( γ_1^L   σ_1^L ) ( β_11 )   ( β̃_11 )
    ( −σ_1^L  γ_1^L ) ( β_21 ) = ( 0     ),   overwriting β_11 with β̃_11

  ( β_12  β_13 )      ( γ_1^L   σ_1^L ) ( β_12  0    )
  ( β_22  β_23 )  :=  ( −σ_1^L  γ_1^L ) ( β_22  β_23 )

  if m(B_BR) ≠ 0
    Compute (γ_1^R, σ_1^R) s.t.
      ( γ_1^R   σ_1^R ) ( β_12 )   ( β̃_12 )
      ( −σ_1^R  γ_1^R ) ( β_13 ) = ( 0     ),   overwriting β_12 with β̃_12

    ( β_12  0    )      ( β_12  β_13 )
    ( β_22  β_23 )  :=  ( β_22  β_23 ) ( γ_1^R  −σ_1^R )
    ( β_32  β_33 )      ( 0     β_33 ) ( σ_1^R  γ_1^R  )

  Continue with
    (the reverse of the repartitioning above)
endwhile

Figure 17.4: Chasing the bulge.


Chapter 18

Notes on Splitting Methods

Video

Read disclaimer regarding the videos in the preface!

* YouTube
* Download from UT Box
* View After Local Download

(For help on viewing, see Appendix A.)


Outline

Video
18.1. A Simple Example: One-Dimensional Boundary Value Problem
18.2. A Two-dimensional Example
  18.2.1. Discretization
18.3. Direct solution
18.4. Iterative solution: Jacobi iteration
  18.4.1. Motivating example
  18.4.2. More generally
18.5. The general case
18.6. Theory
  18.6.1. Some useful facts


18.1 A Simple Example: One-Dimensional Boundary Value Problem

A computational science application typically starts with a physical problem. This problem is described by a law that governs the physics. Such a law can be expressed as a (continuous) mathematical equation that must be satisfied. Often it is impossible to find an explicit solution for this equation, and hence it is approximated by discretizing the problem. At the bottom of the food chain, a linear algebra problem often appears.

Let us illustrate this for a very simple, one-dimensional problem. An example of a one-dimensional Poisson equation with Dirichlet boundary condition on the domain [0,1] can be described as

u′′(x) = f (x) where u(0) = u(1) = 0.

The Dirichlet boundary condition is just a fancy way of saying that on the boundary (at x = 0 and x = 1) the function u is restricted to take on the value 0. Think of this as a string where there is some driving force (e.g., a sound wave hitting the string) that is given by a continuous function f(x), meaning that u is twice continuously differentiable on the interior (for 0 < x < 1). The ends of the string are fixed, giving rise to the given boundary condition. The function u(x) indicates how far from rest (which would happen if f(x) = 0) the string is displaced at point x. There are a few details missing here, such as the fact that u and f are probably functions of time as well, but let's ignore that.

Now, we can transform this continuous problem into a discretized problem by partitioning the interval [0,1] into N+1 equal intervals of length h and looking at the points x_0, x_1, . . . , x_{N+1} where x_i = ih, which are the nodes where the intervals meet. We now instead compute υ_i ≈ u(x_i). In our discussion, we will let φ_i = f(x_i) (which is exactly available, since f(x) is given).

Now, we recall that

  u'(x_i) ≈ [u(x_i + h) − u(x_i)]/h = (υ_{i+1} − υ_i)/h.

Also,

  u''(x_i) ≈ [u'(x_i + h) − u'(x_i)]/h ≈ { [u(x_i + h) − u(x_i)]/h − [u(x_i) − u(x_i − h)]/h } / h
           = (υ_{i−1} − 2υ_i + υ_{i+1})/h².

So, one can approximate our problem with

  υ_{i−1} − 2υ_i + υ_{i+1} = h²φ_i,   0 < i ≤ N,
  υ_0 = υ_{N+1} = 0,

or, equivalently,

  −2υ_1 + υ_2 = h²φ_1
  υ_{i−1} − 2υ_i + υ_{i+1} = h²φ_i,   1 < i < N
  υ_{N−1} − 2υ_N = h²φ_N.

The above compactly expresses the system of linear equations

  −2υ_1 +  υ_2                     = h²φ_1
   υ_1  − 2υ_2 +  υ_3              = h²φ_2
           υ_2  − 2υ_3 +  υ_4      = h²φ_3
              ·      ·      ·         ⋮
              υ_{N−1} − 2υ_N       = h²φ_N


or, in matrix notation,

  ( −2   1             )  ( υ_1 )     ( h²φ_1 )
  (  1  −2   1         )  ( υ_2 )     ( h²φ_2 )
  (      1  −2   1     )  ( υ_3 )  =  ( h²φ_3 )
  (        ·   ·   ·   )  (  ⋮  )     (   ⋮   )
  (          1   −2    )  ( υ_N )     ( h²φ_N )

Thus, one can express this discretized problem as

  A u = h² f,

where A is the indicated tridiagonal matrix.

18.2 A Two-dimensional Example

A more typical computational engineering or sciences application starts with a Partial Differential Equation (PDE) that governs the physics of the problem. This is the higher dimensional equivalent of the last example. In this note, we will use one of the simplest, Laplace's equation. Consider

  −∆u = f,

which in two dimensions is

  −∂²u/∂x² − ∂²u/∂y² = f(x,y),

with boundary condition u = 0 on ∂Ω (meaning that u(x,y) = 0 on the boundary of the domain Ω). For example, the domain may be 0 ≤ x,y ≤ 1 and ∂Ω its boundary, and the problem may model a membrane with, again, f being some force from a sound wave.

18.2.1 Discretization

To solve the problem computationally with a computer, the problem is again discretized. Relating back to the problem of the membrane on the unit square in the previous section, this means that the continuous domain is viewed as a mesh instead, as illustrated in Figure 18.1. In that figure, υ_i will equal the displacement from rest at mesh point i.

Now, just like before, we will let φ_i be the value of f(x,y) at mesh point i. One can approximate

  ∂²u(x,y)/∂x² ≈ [u(x−h,y) − 2u(x,y) + u(x+h,y)]/h²

and

  ∂²u(x,y)/∂y² ≈ [u(x,y−h) − 2u(x,y) + u(x,y+h)]/h²

so that

  −∂²u/∂x² − ∂²u/∂y² = f(x,y)


  υ_1   υ_2   υ_3   υ_4

  υ_5   υ_6   υ_7   υ_8

  υ_9   υ_10  υ_11  υ_12

  υ_13  υ_14  υ_15  υ_16

Figure 18.1: Discretized domain, yielding a mesh. The specific ordering of the elements of u, by rows, is known as the "natural ordering". If one ordered differently, for example randomly, this would induce a permutation on the rows and columns of matrix A, since it induces a permutation on the order of the indices.

becomes

  [−u(x−h,y) + 2u(x,y) − u(x+h,y)]/h² + [−u(x,y−h) + 2u(x,y) − u(x,y+h)]/h² = f(x,y)

or, equivalently,

  [−u(x−h,y) − u(x,y−h) + 4u(x,y) − u(x+h,y) − u(x,y+h)]/h² = f(x,y).

If (x,y) corresponds to point i in a mesh where the interior points form an N×N grid, this translates to the system of linear equations

  −υ_{i−N} − υ_{i−1} + 4υ_i − υ_{i+1} − υ_{i+N} = h²φ_i

(where, for mesh points next to the boundary, the terms involving boundary values vanish since u = 0 there).

This can be rewritten as

  υ_i = (h²φ_i + υ_{i−N} + υ_{i−1} + υ_{i+1} + υ_{i+N})/4,

or

   4υ_1 −  υ_2               −  υ_5                     = h²φ_1
  − υ_1 + 4υ_2 −  υ_3               −  υ_6              = h²φ_2
        −  υ_2 + 4υ_3 −  υ_4               −  υ_7       = h²φ_3
               −  υ_3 + 4υ_4                     − υ_8  = h²φ_4
  − υ_1               + 4υ_5 −  υ_6              − υ_9  = h²φ_5
       ·       ·       ·       ·       ·                    ⋮        (18.1)


In matrix notation this becomes

  (  4 −1       −1             )  ( υ_1 )     ( h²φ_1 )
  ( −1  4 −1       −1          )  ( υ_2 )     ( h²φ_2 )
  (    −1  4 −1       −1       )  ( υ_3 )     ( h²φ_3 )
  (       −1  4          −1    )  ( υ_4 )     ( h²φ_4 )
  ( −1           4 −1       ·  )  ( υ_5 )  =  ( h²φ_5 )   (18.2)
  (    −1       −1  4 −1       )  ( υ_6 )     ( h²φ_6 )
  (       −1       −1  4 −1    )  ( υ_7 )     ( h²φ_7 )
  (          −1       −1  4    )  ( υ_8 )     ( h²φ_8 )
  (                −1       4 ·)  ( υ_9 )     ( h²φ_9 )
  (                   ·   ·  · )  (  ⋮  )     (   ⋮   )

This now shows how solving the discretized Laplace's equation boils down to the solution of a linear system Au = h²f (= b), where A has a distinct sparsity pattern (pattern of nonzeroes).
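The matrix of (18.2) can be assembled mechanically from the one-dimensional stencil. A sketch using Kronecker products (dense for illustration; scipy.sparse would be used in practice):

import numpy as np

def laplacian_2d(N):
    # The N^2 x N^2 matrix of (18.2), natural ordering:
    # A = I kron T + T kron I, with T = tridiag(-1, 2, -1).
    T = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
    return np.kron(np.eye(N), T) + np.kron(T, np.eye(N))

laplacian_2d(4) reproduces the 16×16 matrix displayed above: 4 on the diagonal, −1 for east/west neighbors within a row of the mesh, and −1 for north/south neighbors N positions away.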

Remark 18.1 The point is that many problems that arise in computational science require the solution to a system of linear equations Au = b where A is a (very) sparse matrix.

Definition 18.2 Wilkinson defined a sparse matrix as any matrix with enough zeros that it pays to take advantage of them.

18.3 Direct solution

It is tempting to simply use a dense linear solver to compute the solution to Au = b. This would require O(n³) operations, where n equals the size of matrix A. One can take advantage of the fact that beyond the outer band of "−1"s the matrix is entirely zero, reducing the cost to O(nB²), where B equals the width of the band. And there are sparse direct methods that try to further reduce the fill-in of zeroes, to bring down the number of operations required to factor the matrix. Details of these go beyond the scope of this note.

18.4 Iterative solution: Jacobi iteration

It is more common to solve sparse problems like our model problem via iterative methods, which generate a sequence of vectors u^{(k)}, k = 0, . . .. The first vector, u^{(0)}, represents an initial guess of what the solution is. The hope is that u^{(k)} converges to the solution u; in other words, that eventually u^{(k)} becomes arbitrarily close to the solution. More precisely, we would like it to be the case that ‖u^{(k)} − u‖ → 0 for some norm ‖ · ‖, where u is the true solution.


18.4.1 Motivating example

Consider Au = b. If we guessed perfectly, then u^{(0)} would solve Au = b and it would be the case that

  υ_1^{(0)} = (β_1 + υ_2^{(0)} + υ_5^{(0)})/4
  υ_2^{(0)} = (β_2 + υ_1^{(0)} + υ_3^{(0)} + υ_6^{(0)})/4
  υ_3^{(0)} = (β_3 + υ_2^{(0)} + υ_4^{(0)} + υ_7^{(0)})/4
  υ_4^{(0)} = (β_4 + υ_3^{(0)} + υ_8^{(0)})/4
      ⋮

Naturally, that is a bit optimistic. Therefore, a new approximation to the solution is created by computing

  υ_1^{(1)} = (β_1 + υ_2^{(0)} + υ_5^{(0)})/4
  υ_2^{(1)} = (β_2 + υ_1^{(0)} + υ_3^{(0)} + υ_6^{(0)})/4
  υ_3^{(1)} = (β_3 + υ_2^{(0)} + υ_4^{(0)} + υ_7^{(0)})/4
  υ_4^{(1)} = (β_4 + υ_3^{(0)} + υ_8^{(0)})/4
      ⋮

In other words, a new guess for the displacement υ_i at point i is generated from the values at the points around it, plus some contribution from β_i (which itself incorporates a contribution from φ_i in our example).

This process can be used to compute a sequence of vectors u^{(0)}, u^{(1)}, . . . by similarly computing u^{(k+1)} from u^{(k)}:

  υ_1^{(k+1)} = (β_1 + υ_2^{(k)} + υ_5^{(k)})/4
  υ_2^{(k+1)} = (β_2 + υ_1^{(k)} + υ_3^{(k)} + υ_6^{(k)})/4
  υ_3^{(k+1)} = (β_3 + υ_2^{(k)} + υ_4^{(k)} + υ_7^{(k)})/4
  υ_4^{(k+1)} = (β_4 + υ_3^{(k)} + υ_8^{(k)})/4
      ⋮

This can be rewritten as

  4υ_1^{(k+1)} =             υ_2^{(k)}             + υ_5^{(k)}              + β_1
  4υ_2^{(k+1)} = υ_1^{(k)}             + υ_3^{(k)}             + υ_6^{(k)}  + β_2
  4υ_3^{(k+1)} =             υ_2^{(k)} + υ_4^{(k)}             + υ_7^{(k)}  + β_3
  4υ_4^{(k+1)} =                         υ_3^{(k)}             + υ_8^{(k)}  + β_4
  4υ_5^{(k+1)} = υ_1^{(k)}             + υ_6^{(k)}             + υ_9^{(k)}  + β_5
       ⋮


or, in matrix notation,

    ( υ_1^{(k+1)} )   ( 0 1       1             ) ( υ_1^{(k)} )   ( β_1 )
    ( υ_2^{(k+1)} )   ( 1 0 1       1           ) ( υ_2^{(k)} )   ( β_2 )
    ( υ_3^{(k+1)} )   (   1 0 1       1         ) ( υ_3^{(k)} )   ( β_3 )
    ( υ_4^{(k+1)} )   (     1 0         1       ) ( υ_4^{(k)} )   ( β_4 )
  4 ( υ_5^{(k+1)} ) = ( 1       0 1       1     ) ( υ_5^{(k)} ) + ( β_5 ).
    ( υ_6^{(k+1)} )   (   1     1 0 1       ·   ) ( υ_6^{(k)} )   ( β_6 )
    ( υ_7^{(k+1)} )   (     1     1 0 1         ) ( υ_7^{(k)} )   ( β_7 )
    ( υ_8^{(k+1)} )   (       1     1 0         ) ( υ_8^{(k)} )   ( β_8 )
    ( υ_9^{(k+1)} )   (         1       0   ·   ) ( υ_9^{(k)} )   ( β_9 )
    (      ⋮      )   (           ·   ·   ·     ) (     ⋮     )   (  ⋮  )

Typically, the physics of the problem dictates that this iteration will eventually yield a vector u^{(k)} that is arbitrarily close to the solution of Au = b. More on the mathematical reasons for this later. In other words, the vectors u^{(k)} will converge to u: lim_{k→∞} u^{(k)} = u, or lim_{k→∞} ‖u^{(k)} − u‖ = 0.
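In code, one Jacobi step is a diagonal scale applied to b minus the off-diagonal part of A times the current iterate. A dense numpy sketch (one can try it on laplacian_2d from the previous section):

import numpy as np

def jacobi(A, b, u0, niter):
    # u^(k+1) = D^{-1} [ b - (L+U) u^(k) ]
    d = np.diag(A)                  # the diagonal D, stored as a vector
    offdiag = A - np.diag(d)        # L + U
    u = np.array(u0, dtype=float)
    for _ in range(niter):
        u = (b - offdiag @ u) / d
    return u

For the discretized square, d is identically 4 and −offdiag is exactly the 0/1 matrix displayed above.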

18.4.2 More generally

The above discussion can be more concisely described as follows. Split the matrix A = L + D + U, where L and U are the strictly lower and strictly upper triangular parts of A, respectively, and D is its diagonal. Now,

  Au = b

implies that

  (L + D + U)u = b

or

  u = D^{−1}[b − (L + U)u].

This is an example of a fixed-point equation: plug u into D^{−1}[b − (L + U)u] and the result is again u. The iteration is then created by viewing the vector on the left as the next guess, given the guess for u on the right:

  u^{(k+1)} = D^{−1}[b − (L + U)u^{(k)}].

Now let us see how to extract a more detailed algorithm from this. Partition, conformally,

  A = ( A_00    a_01  A_02
        a_10^T  α_11  a_12^T
        A_20    a_21  A_22  ),   D = ( D_00  0     0
                                       0     δ_11  0
                                       0     0     D_22 ),   L = ( L_00    0     0
                                                                   l_10^T  0     0
                                                                   L_20    l_21  L_22 ),   (18.3)

  U = ( U_00  υ_01  U_02
        0     0     υ_12^T
        0     0     U_22  ),   u^{(k)} = ( u_0^{(k)}
                                           υ_1^{(k)}
                                           u_2^{(k)} ),   and   b = ( b_0
                                                                      β_1
                                                                      b_2 ).   (18.4)


Consider

  ( D_00  0     0    ) ( u_0^{(k+1)} )       ( L_00    0     0    )   ( U_00  υ_01  U_02   )   ( u_0^{(k)} )   ( b_0 )
  ( 0     δ_11  0    ) ( υ_1^{(k+1)} ) = − [ ( l_10^T  0     0    ) + ( 0     0     υ_12^T ) ] ( υ_1^{(k)} ) + ( β_1 )
  ( 0     0     D_22 ) ( u_2^{(k+1)} )       ( L_20    l_21  L_22 )   ( 0     0     U_22   )   ( u_2^{(k)} )   ( b_2 ).

Then

  δ_11 υ_1^{(k+1)} = β_1 − l_10^T u_0^{(k)} − υ_12^T u_2^{(k)}.

Now, for the Jacobi iteration δ_11 = α_11, l_10^T = a_10^T, and υ_12^T = a_12^T, so that

  υ_1^{(k+1)} = (β_1 − a_10^T u_0^{(k)} − a_12^T u_2^{(k)})/α_11.

This suggests the algorithm in Figure 18.2. For those who prefer indices (and for these kinds of iterations, indices often clarify rather than obscure), this can be described as

  υ_i^{(k+1)} = ( β_i − Σ_{j ≠ i, α_{i,j} ≠ 0} α_{i,j} υ_j^{(k)} ) / α_{i,i}.

Remark 18.3 It is not hard to see that, for the example of the discretized square, this yields exactly the computations that compute u^{(k+1)} from u^{(k)}, since there

  −(L + U) = ( 0 1       1
               1 0 1       1
                 1 0 1       1
                   1 0         1
               1       0 1       1
                 1     1 0 1       ·
                   1     1 0 1
                     1     1 0
                       1       0   ·
                         ·   ·   ·   )

and D = 4I.

18.5 The general case

The Jacobi iteration is a special case of a family of methods known as splitting methods. One starts with a splitting of a nonsingular matrix A into A = M − N. One next observes that Au = b is equivalent to (M − N)u = b, which is equivalent to Mu = Nu + b, or u = M^{−1}(Nu + b).


Algorithm: [x^{(k+1)}] := SPLITTING_METHOD_ITERATION(A, x^{(k)}, b)

Partition
  x^{(k+1)} → ( x_T^{(k+1)} ; x_B^{(k+1)} ),   A → ( A_TL  A_TR
                                                    A_BL  A_BR ),
  x^{(k)} → ( x_T^{(k)} ; x_B^{(k)} ),   b → ( b_T ; b_B )
  where A_TL is 0×0 and x_T^{(k+1)}, x_T^{(k)}, b_T have 0 elements
while m(x_T^{(k+1)}) < m(x^{(k+1)}) do
  Repartition
    ( x_T^{(k+1)} ; x_B^{(k+1)} ) → ( x_0^{(k+1)} ; χ_1^{(k+1)} ; x_2^{(k+1)} ),
    ( A_TL  A_TR )     ( A_00    a_01  A_02
    ( A_BL  A_BR )  →  ( a_10^T  α_11  a_12^T
                       ( A_20    a_21  A_22  ),
    ( x_T^{(k)} ; x_B^{(k)} ) → ( x_0^{(k)} ; χ_1^{(k)} ; x_2^{(k)} ),   ( b_T ; b_B ) → ( b_0 ; β_1 ; b_2 )
    where χ_1^{(k+1)}, α_11, χ_1^{(k)}, and β_1 are scalars

  Jacobi iteration:        χ_1^{(k+1)} = (β_1 − a_10^T x_0^{(k)}   − a_12^T x_2^{(k)})/α_11
  Gauss-Seidel iteration:  χ_1^{(k+1)} = (β_1 − a_10^T x_0^{(k+1)} − a_12^T x_2^{(k)})/α_11

  Continue with
    (the reverse of the repartitioning above)
endwhile

Figure 18.2: Various splitting methods (one iteration of x^{(k+1)} = M^{−1}(N x^{(k)} + b)).

Notice that M^{−1} is not explicitly formed: one solves with M instead. Now, a vector u is the solution of Au = b if and only if it satisfies u = M^{−1}(Nu + b). The idea is to start with an initial guess (approximation) of u, u^{(0)}, and to then iterate to compute u^{(k+1)} = M^{−1}(Nu^{(k)} + b), which requires a multiplication with N and a solve with M in each iteration.
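The general scheme, as a sketch: the caller supplies the splitting, and the iteration solves with M rather than inverting it. Here np.linalg.solve stands in for whatever specialized solver M's structure allows — a diagonal scale for Jacobi, a triangular solve for Gauss-Seidel:

import numpy as np

def splitting_iteration(A, b, split, u0, niter):
    # u^(k+1) = M^{-1}(N u^(k) + b), where split(A) returns (M, N), A = M - N.
    M, N = split(A)
    u = np.array(u0, dtype=float)
    for _ in range(niter):
        u = np.linalg.solve(M, N @ u + b)
    return u

def jacobi_split(A):
    D = np.diag(np.diag(A))
    return D, D - A                 # M = D, N = D - A = -(L + U)

def gauss_seidel_split(A):
    M = np.tril(A)                  # M = L + D
    return M, M - A                 # N = M - A = -U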


Example 18.4 Let A ∈ C^{n×n} and split this matrix as A = L + D + U, where L and U are the strictly lower and upper triangular parts of A, respectively, and D its diagonal. Then choosing M = D and N = −(L + U) yields the Jacobi iteration.

The convergence of these methods is summarized by the following theorem:

Theorem 18.5 Let A ∈ C^{n×n} be nonsingular and u, b ∈ C^n so that Au = b. Let A = M − N be a splitting of A, u^{(0)} be given (an initial guess), and u^{(k+1)} = M^{−1}(Nu^{(k)} + b). If ‖M^{−1}N‖ < 1 for some matrix norm induced by the ‖ · ‖ vector norm, then u^{(k)} will converge to the solution u.

Proof: Notice that Au = b implies that (M − N)u = b, which implies that Mu = Nu + b and hence u = M^{−1}(Nu + b). Hence

  u^{(k+1)} − u = M^{−1}(Nu^{(k)} + b) − M^{−1}(Nu + b) = M^{−1}N(u^{(k)} − u).

Taking norms of both sides, one gets that

  ‖u^{(k+1)} − u‖ = ‖M^{−1}N(u^{(k)} − u)‖ ≤ ‖M^{−1}N‖ ‖u^{(k)} − u‖.

Since this holds for all k, one can deduce that

  ‖u^{(k)} − u‖ ≤ ‖M^{−1}N‖^k ‖u^{(0)} − u‖.

If ‖M^{−1}N‖ < 1, then the right-hand side will converge to zero, meaning that u^{(k)} converges to u.

Now, often one checks convergence by computing the residual r^{(k)} = b − Au^{(k)}. After all, if r^{(k)} is approximately the zero vector, then u^{(k)} approximately solves Au = b. The following corollary links the convergence of r^{(k)} to the convergence of u^{(k)}.

Corollary 18.6 Let A ∈ C^{n×n} be nonsingular and u, b ∈ C^n so that Au = b. Let u^{(k)} ∈ C^n and let r^{(k)} = b − Au^{(k)}. Then u^{(k)} converges to u if and only if r^{(k)} converges to the zero vector.

Proof: Note that r^{(k)} = b − Au^{(k)} = Au − Au^{(k)} = A(u − u^{(k)}). Since A is nonsingular, clearly r^{(k)} converges to the zero vector if and only if u^{(k)} converges to u.

Corollary 18.7 Let A ∈ C^{n×n} be nonsingular and u, b ∈ C^n so that Au = b. Let A = M − N be a splitting of A, u^{(0)} be given (an initial guess), u^{(k+1)} = M^{−1}(Nu^{(k)} + b), and r^{(k)} = b − Au^{(k)}. If ‖M^{−1}N‖ < 1 for some matrix norm induced by the ‖ · ‖ vector norm, then r^{(k)} will converge to the zero vector.

Now let us see how to extract a more detailed algorithm from this. Partition, conformally,

  A = ( A_00    a_01  A_02
        a_10^T  α_11  a_12^T
        A_20    a_21  A_22  ),   M = ( M_00    m_01  M_02
                                       m_10^T  µ_11  m_12^T
                                       M_20    m_21  M_22  ),   N = ( N_00    n_01  N_02
                                                                      n_10^T  ν_11  n_12^T
                                                                      N_20    n_21  N_22  ),

  u^{(k)} = ( x_0^{(k)}
              υ_1^{(k)}
              x_2^{(k)} ),   and   b = ( b_0
                                         β_1
                                         b_2 ).


Consider

  ( M_00    m_01  M_02   ) ( x_0^{(k+1)} )   ( N_00    n_01  N_02   ) ( x_0^{(k)} )   ( b_0 )
  ( m_10^T  µ_11  m_12^T ) ( υ_1^{(k+1)} ) = ( n_10^T  ν_11  n_12^T ) ( υ_1^{(k)} ) + ( β_1 )
  ( M_20    m_21  M_22   ) ( x_2^{(k+1)} )   ( N_20    n_21  N_22   ) ( x_2^{(k)} )   ( b_2 ).

Then

  m_10^T x_0^{(k+1)} + µ_11 υ_1^{(k+1)} + m_12^T x_2^{(k+1)} = n_10^T x_0^{(k)} + ν_11 υ_1^{(k)} + n_12^T x_2^{(k)} + β_1.

Jacobi iteration

Example 18.8 For the Jacobi iteration, m_10^T = 0, µ_11 = α_11, m_12^T = 0, n_10^T = −a_10^T, ν_11 = 0, and n_12^T = −a_12^T, so that

  υ_1^{(k+1)} = (β_1 − a_10^T x_0^{(k)} − a_12^T x_2^{(k)})/α_11.

This suggests the algorithm in Figure 18.2.

Gauss-Seidel iteration  A modification of the Jacobi iteration is inspired by the observation that in Eqns. (18.1)–(18.2), when computing υ_i one could use the new (updated) values υ_j, j < i:

  υ_1^{(k+1)} = (β_1 + υ_2^{(k)} + υ_5^{(k)})/4
  υ_2^{(k+1)} = (β_2 + υ_1^{(k+1)} + υ_3^{(k)} + υ_6^{(k)})/4
  υ_3^{(k+1)} = (β_3 + υ_2^{(k+1)} + υ_4^{(k)} + υ_7^{(k)})/4
  υ_4^{(k+1)} = (β_4 + υ_3^{(k+1)} + υ_8^{(k)})/4
  υ_5^{(k+1)} = (β_5 + υ_1^{(k+1)} + υ_6^{(k)} + υ_9^{(k)})/4
      ⋮

This can be rewritten as

  4υ_1^{(k+1)} =               υ_2^{(k)} + υ_5^{(k)} + β_1
  4υ_2^{(k+1)} = υ_1^{(k+1)} + υ_3^{(k)} + υ_6^{(k)} + β_2
  4υ_3^{(k+1)} = υ_2^{(k+1)} + υ_4^{(k)} + υ_7^{(k)} + β_3
  4υ_4^{(k+1)} = υ_3^{(k+1)} + υ_8^{(k)} + β_4
  4υ_5^{(k+1)} = υ_1^{(k+1)} + υ_6^{(k)} + υ_9^{(k)} + β_5
      ⋮


or, in matrix notation,

  (  4                        ) ( υ_1^{(k+1)} )   ( 0 1       1             ) ( υ_1^{(k)} )
  ( −1  4                     ) ( υ_2^{(k+1)} )   (   0 1       1           ) ( υ_2^{(k)} )
  (    −1  4                  ) ( υ_3^{(k+1)} )   (     0 1       1         ) ( υ_3^{(k)} )
  (       −1  4               ) ( υ_4^{(k+1)} )   (       0         1       ) ( υ_4^{(k)} )
  ( −1           4            ) ( υ_5^{(k+1)} ) = (           0 1       1   ) ( υ_5^{(k)} ) + b.
  (    −1       −1  4         ) ( υ_6^{(k+1)} )   (             0 1       · ) ( υ_6^{(k)} )
  (       −1       −1  4      ) ( υ_7^{(k+1)} )   (               0 1       ) ( υ_7^{(k)} )
  (          −1       −1  4   ) ( υ_8^{(k+1)} )   (                 0       ) ( υ_8^{(k)} )
  (                −1       4 ) ( υ_9^{(k+1)} )   (                   0   · ) ( υ_9^{(k)} )
  (                   ·  ·  · ) (      ⋮      )   (                     ·   ) (     ⋮     )

Again, split A = L + D + U and partition these matrices as in (18.3) and (18.4). Consider

  ( D_00 + L_00  0     0           ) ( x_0^{(k+1)} )     ( U_00  υ_01  U_02   ) ( x_0^{(k)} )   ( b_0 )
  ( l_10^T       δ_11  0           ) ( υ_1^{(k+1)} ) = − ( 0     0     υ_12^T ) ( υ_1^{(k)} ) + ( β_1 )
  ( L_20         l_21  D_22 + L_22 ) ( x_2^{(k+1)} )     ( 0     0     U_22   ) ( x_2^{(k)} )   ( b_2 ).

Then

  l_10^T x_0^{(k+1)} + δ_11 υ_1^{(k+1)} = β_1 − υ_12^T x_2^{(k)}.

Now, for the Gauss-Seidel iteration δ_11 = α_11, l_10^T = a_10^T, and υ_12^T = a_12^T, so that

  υ_1^{(k+1)} = (β_1 − a_10^T x_0^{(k+1)} − a_12^T x_2^{(k)})/α_11.


This suggests the algorithm in Figure 18.2. For those who prefer indices, this can be described as

  υ_i^{(k+1)} = ( β_i − Σ_{j=0, α_{i,j}≠0}^{i−1} α_{i,j} υ_j^{(k+1)} − Σ_{j=i+1, α_{i,j}≠0}^{n−1} α_{i,j} υ_j^{(k)} ) / α_{i,i}.

Notice that this fits the general framework since now M = L+D and N =−U .
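Written out with loops rather than partitioned notation, one Gauss-Seidel sweep is the indexed formula above applied for i = 0, . . . , n−1, with new values used as soon as they are computed. A dense numpy sketch:

import numpy as np

def gauss_seidel(A, b, u0, niter):
    # In-place sweeps: u[i] always holds the most recent value.
    n = len(b)
    u = np.array(u0, dtype=float)
    for _ in range(niter):
        for i in range(n):
            u[i] = (b[i] - A[i, :i] @ u[:i] - A[i, i + 1:] @ u[i + 1:]) / A[i, i]
    return u

Because u is updated in place, the term A[i, :i] @ u[:i] automatically uses the (k+1)-st values while A[i, i+1:] @ u[i+1:] uses the k-th values.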

Successive Over-Relaxation (SOR)  Consider the update as performed by the Gauss-Seidel iteration:

  υ_1^{GS(k+1)} = (β_1 − a_10^T x_0^{(k+1)} − a_12^T x_2^{(k)})/α_11.

Next, one can think of

  υ_1^{GS(k+1)} − υ_1^{(k)}

as a direction in which the solution is changing. Now, one can ask the question "What if we go a little further in that direction?" In other words, one can think of this as

  υ_1^{(k+1)} = υ_1^{(k)} + α[υ_1^{GS(k+1)} − υ_1^{(k)}] = ω υ_1^{GS(k+1)} + (1 − ω)υ_1^{(k)},

for some relaxation parameter ω = 1 + α. Going further would mean picking ω > 1. Then

  υ_1^{(k+1)} = ω[(β_1 − a_10^T x_0^{(k+1)} − a_12^T x_2^{(k)})/α_11] + (1 − ω)υ_1^{(k)}
              = (ωβ_1 − ω a_10^T x_0^{(k+1)} − ω a_12^T x_2^{(k)})/α_11 + (1 − ω)υ_1^{(k)},

which can be rewritten as

  ω a_10^T x_0^{(k+1)} + α_11 υ_1^{(k+1)} = (1 − ω)α_11 υ_1^{(k)} − ω a_12^T x_2^{(k)} + ωβ_1,

or

  a_10^T x_0^{(k+1)} + (α_11/ω) υ_1^{(k+1)} = ((1 − ω)/ω) α_11 υ_1^{(k)} − a_12^T x_2^{(k)} + β_1.

Now, if one once again partitions A = L + D + U and takes M = (L + (1/ω)D) and N = (((1−ω)/ω)D − U), then it can be easily checked that Mu^{(k+1)} = Nu^{(k)} + b. Thus,

  ‖u^{(k)} − u‖ ≤ ‖(L + (1/ω)D)^{−1}(((1−ω)/ω)D − U)‖^k ‖u^{(0)} − u‖ = ‖(ωL + D)^{−1}((1−ω)D − ωU)‖^k ‖u^{(0)} − u‖.

The idea now is that by choosing ω carefully,

  ‖(ωL + D)^{−1}((1−ω)D − ωU)‖

can be made smaller, meaning that the convergence is faster.
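In code, SOR is a one-line change to the Gauss-Seidel sweep: compute the Gauss-Seidel value, then move a factor ω of the way from the old value to it. A sketch:

import numpy as np

def sor(A, b, u0, omega, niter):
    # omega = 1 recovers Gauss-Seidel; 1 < omega < 2 over-relaxes.
    n = len(b)
    u = np.array(u0, dtype=float)
    for _ in range(niter):
        for i in range(n):
            gs = (b[i] - A[i, :i] @ u[:i] - A[i, i + 1:] @ u[i + 1:]) / A[i, i]
            u[i] = (1.0 - omega) * u[i] + omega * gs
    return u

For the model problem, a classical result gives the optimal relaxation parameter in closed form (ω_opt = 2/(1 + sin(πh)) for the discretized Laplacian); in general it must be estimated.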


Symmetric Successive Over-Relaxation  Gauss-Seidel and SOR update the unknowns in what is called the natural ordering, meaning that one updates υ_0, υ_1, · · · in that prescribed order, using the most recently updated values and, in the case of SOR, extrapolating the update. The iteration can be made symmetric by updating first in the natural ordering and then updating similarly, but in reverse natural ordering.

This is equivalent to choosing M_F = (L + (1/ω)D) and N_F = (((1−ω)/ω)D − U) (the forward splitting) and M_B = (U + (1/ω)D) and N_B = (((1−ω)/ω)D − L) (the backward splitting). Then

  M_F u^{(k+1/2)} = N_F u^{(k)} + b
  M_B u^{(k+1)}   = N_B u^{(k+1/2)} + b,

or

  u^{(k+1)} = M_B^{−1}[ N_B u^{(k+1/2)} + b ]
            = M_B^{−1}[ N_B M_F^{−1}[ N_F u^{(k)} + b ] + b ]
            = (U + (1/ω)D)^{−1} [ (((1−ω)/ω)D − L)(L + (1/ω)D)^{−1} [ (((1−ω)/ω)D − U) u^{(k)} + b ] + b ]
            = (ωU + D)^{−1} [ ((1−ω)D − ωL)(ωL + D)^{−1} [ ((1−ω)D − ωU) u^{(k)} + ωb ] + ωb ].

(One of these days I’ll work out all the details!)

18.6 Theory

18.6.1 Some useful facts

Recall that for A ∈ C^{n×n} its spectral radius is denoted by ρ(A). It equals the magnitude of the eigenvalue of A that is largest in magnitude.

Theorem 18.9 Let ‖·‖ be a matrix norm induced by a vector norm ‖·‖. Then for any A ∈ C^{n×n}, ρ(A) ≤ ‖A‖.

Homework 18.10 Prove the above theorem.* SEE ANSWER


Theorem 18.11 Given a matrix A ∈ C^{n×n} and ε > 0, there exists a consistent matrix norm ‖ · ‖ such that ‖A‖ ≤ ρ(A) + ε.

Proof: Let

  A = U T U^H = ( u_0  U_1 ) ( ρ(A)  t_01^T ) ( u_0  U_1 )^H
                             ( 0     T_11   )

be a Schur decomposition of A, ordered so that an eigenvalue of maximal magnitude appears in the top-left position. Define ‖ · ‖ by

  ‖X‖ = ‖ U^H X U ( 1  0
                    0  (ε/‖A‖_2) I ) ‖_2.

(You can show that this is a matrix norm.) Then

  ‖A‖ = ‖ U^H A U ( 1  0 ; 0  (ε/‖A‖_2)I ) ‖_2
      = ‖ ( ρ(A)  t_01^T ; 0  T_11 ) ( 1  0 ; 0  (ε/‖A‖_2)I ) ‖_2
      = ‖ ( ρ(A)  0 ; 0  0 ) + ( 0  t_01^T ; 0  T_11 ) ( 0  0 ; 0  (ε/‖A‖_2)I ) ‖_2
      ≤ ‖ ( ρ(A)  0 ; 0  0 ) ‖_2 + ε ‖ ( 0  t_01^T ; 0  T_11 ) ‖_2 ‖ ( 0  0 ; 0  (1/‖A‖_2)I ) ‖_2
      ≤ ρ(A) + ε ‖ ( ρ(A)  t_01^T ; 0  T_11 ) ‖_2 / ‖A‖_2
      = ρ(A) + ε ‖A‖_2/‖A‖_2 = ρ(A) + ε.

Theorem 18.12 The iteration

  x^{(k+1)} = M^{−1}(N x^{(k)} + b)

converges for any x^{(0)} if and only if ρ(M^{−1}N) < 1.

Proof: Recall that for any consistent matrix norm ‖ · ‖,

  ‖x^{(k)} − x‖ ≤ ‖M^{−1}N‖^k ‖x^{(0)} − x‖.


(⇐) Assume ρ(M^{−1}N) < 1. Pick ε such that ρ(M^{−1}N) + ε < 1. Finally, pick ‖ · ‖ such that ‖M^{−1}N‖ ≤ ρ(M^{−1}N) + ε. Then ‖M^{−1}N‖ < 1 and hence the sequence converges.

(⇒) We will show that if ρ(M^{−1}N) ≥ 1 then there exists an x^{(0)} such that the iteration does not converge. Let y ≠ 0 have the property that M^{−1}Ny = ρ(M^{−1}N)y. Choose x^{(0)} = y + x, where Ax = b. Then

  ‖x^{(1)} − x‖ = ‖M^{−1}N(x^{(0)} − x)‖ = ‖M^{−1}Ny‖ = ‖ρ(M^{−1}N)y‖ = ρ(M^{−1}N)‖y‖ = ρ(M^{−1}N)‖x^{(0)} − x‖ ≥ ‖x^{(0)} − x‖.

Extending this, one finds that for all k, ‖x^{(k)} − x‖ = ρ(M^{−1}N)^k ‖x^{(0)} − x‖ ≥ ‖x^{(0)} − x‖ > 0. Thus, the sequence does not converge.

Definition 18.13 A matrix A ∈ C^{n×n} is said to be diagonally dominant if for all i, 0 ≤ i < n,

  |α_{i,i}| ≥ Σ_{j≠i} |α_{i,j}|.

It is said to be strictly diagonally dominant if for all i, 0 ≤ i < n,

  |α_{i,i}| > Σ_{j≠i} |α_{i,j}|.

Theorem 18.14 (Gershgorin Disc Theorem) Let A ∈ C^{n×n}. Define

  D_i = { β : |β − α_{i,i}| ≤ Σ_{j≠i} |α_{i,j}| }.

Then λ ∈ Λ(A) implies there exists a k such that λ ∈ D_k.

Proof: Let λ ∈ Λ(A) and u ≠ 0 be such that Au = λu. Then (λI − A)u = 0. Let k be such that |υ_k| = max_{0≤j<n} |υ_j|. In other words, υ_k is the element in u of largest magnitude. Then, looking at the kth row of (λI − A)u, we find that

  (λ − α_{k,k})υ_k − Σ_{j≠k} α_{k,j} υ_j = 0.

Hence

  |λ − α_{k,k}| = | Σ_{j≠k} α_{k,j} υ_j/υ_k | ≤ Σ_{j≠k} |α_{k,j}| |υ_j/υ_k| ≤ Σ_{j≠k} |α_{k,j}|,

which implies that λ ∈ D_k.

Example 18.15 In the case of the example in the previous section, where A is given by (18.2) and the splitting yields the Jacobi iteration, the Gershgorin Disc Theorem implies that ‖M^{−1}N‖_2 ≤ 1. We would like to prove that ‖M^{−1}N‖_2 < 1, which is true, but a little trickier to prove...

Definition 18.16 A matrix A ∈ C^{n×n} is said to be reducible if there exists a permutation matrix P such that

  P A P^T = ( T_TL  T_TR
              0     T_BR ),

where T_TL (and hence T_BR) is square of size b×b with 0 < b < n. A matrix that is not reducible is irreducible.


In other words, a matrix is reducible if and only if there exists a symmetric permutation of its rows and columns that leaves the matrix block upper triangular.

Homework 18.17 A symmetric matrix A ∈ C^{n×n} is reducible if and only if there exists a permutation matrix P such that

  P A P^T = ( T_TL  0
              0     T_BR ),

where T_TL (and hence T_BR) is square of size b×b with 0 < b < n.

In other words, a symmetric matrix is reducible if and only if there exists a symmetric permutation of the rows and columns that leaves the matrix block diagonal.

Theorem 18.18 If A is strictly diagonally dominant, then the Jacobi iteration converges.

Proof: Write A = M − N, where

  M = ( α_{0,0}  0        · · ·  0
        0        α_{1,1}  · · ·  0
        ⋮        ⋮        ⋱      ⋮
        0        0        · · ·  α_{n−1,n−1} )

and

  N = − ( 0          α_{0,1}    · · ·  α_{0,n−1}
          α_{1,0}    0          · · ·  α_{1,n−1}
          ⋮          ⋮          ⋱      ⋮
          α_{n−1,0}  α_{n−1,1}  · · ·  0         ).

Then

  M^{−1}N = − ( 0                       α_{0,1}/α_{0,0}        · · ·  α_{0,n−1}/α_{0,0}
                α_{1,0}/α_{1,1}         0                      · · ·  α_{1,n−1}/α_{1,1}
                ⋮                       ⋮                      ⋱      ⋮
                α_{n−1,0}/α_{n−1,n−1}   α_{n−1,1}/α_{n−1,n−1}  · · ·  0                 ).

Since A is strictly diagonally dominant, we know that, for each i,

  Σ_{j≠i} |α_{i,j}|/|α_{i,i}| < 1.

By the Gershgorin Disc Theorem we therefore know that ρ(M^{−1}N) < 1. Hence we know there exists a norm ‖ · ‖ such that ‖M^{−1}N‖ < 1, and hence the Jacobi iteration converges.
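The theorem is easy to explore numerically: form M^{−1}N for the Jacobi splitting and look at its spectral radius. A sketch:

import numpy as np

def jacobi_spectral_radius(A):
    # rho(M^{-1} N) for M = D, N = D - A; < 1 iff the Jacobi iteration converges.
    d = np.diag(A)
    MinvN = (np.diag(d) - A) / d[:, None]   # scale row i by 1/alpha_{i,i}
    return np.abs(np.linalg.eigvals(MinvN)).max()

For a strictly diagonally dominant A this returns a value below 1. For the discretized Laplacian, which is only weakly diagonally dominant (but irreducible), it also does, consistent with Theorem 18.20 below — though one can check numerically that the value approaches 1 as the mesh is refined.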


Theorem 18.19 If A is strictly diagonally dominant, then the Gauss-Seidel iteration converges.

Proof:

Theorem 18.20 If A is weakly diagonally dominant, irreducible, and has at least one row for which

  |α_{k,k}| > Σ_{j≠k} |α_{k,j}|,

then the Jacobi and Gauss-Seidel iterations converge.

Theorem 18.21 If A is SPD then SOR converges iff 0 < ω < 2.

Proofs of these last few observations will be added to these notes in the future.


Chapter 19

Notes on Descent Methods and the Conjugate Gradient Method

These notes are quite incomplete. More to come in the future!

In this note, we assume that the n×n matrix A is Symmetric Positive Definite (SPD). We are interested in iteratively solving the linear system Ax = b. We will do so by creating a sequence of approximate solutions, x^{(0)}, x^{(1)}, . . . , that, hopefully, converges to the true solution x.


Outline

19.1. Basics
19.2. Descent Methods
19.3. Relation to Splitting Methods
19.4. Method of Steepest Descent
19.5. Preconditioning
19.6. Methods of A-conjugate Directions
19.7. Conjugate Gradient Method


19.1 Basics

The basic idea is to look at the problem of solving Ax = b instead as the minimization problem that finds the vector x that minimizes the function f(x) = (1/2)x^T Ax − x^T b.

Theorem 19.1 Let A be SPD and assume that Ax = b. Then the vector x minimizes the function f(y) = (1/2)y^T Ay − y^T b.

Proof: Let Ax = b. Then

  f(y) = (1/2)y^T Ay − y^T b = (1/2)y^T Ay − y^T Ax
       = (1/2)y^T Ay − y^T Ax + (1/2)x^T Ax − (1/2)x^T Ax
       = (1/2)(y − x)^T A(y − x) − (1/2)x^T Ax.

Since x^T Ax is independent of y, this is clearly minimized when y = x.

An alternative way to look at this is to note that the function is minimized when the gradient is zero and that ∇f(x) = Ax − b.

19.2 Descent Methods

The basic idea behind a descent method is that at the kth iteration one has an approximation to x, x^{(k)}. One would like to create a better approximation, x^{(k+1)}. To do so, one picks a search direction, p^{(k)}, and lets x^{(k+1)} = x^{(k)} + α_k p^{(k)}. In other words, one searches for a minimum along the line defined by the current iterate, x^{(k)}, and the search direction, p^{(k)}. One then picks α_k so that, preferably, f(x^{(k+1)}) ≤ f(x^{(k)}). Typically, one picks α_k to exactly minimize the function along the line (leading to exact descent methods). Now

  f(x^{(k+1)}) = f(x^{(k)} + α_k p^{(k)})
    = (1/2)(x^{(k)} + α_k p^{(k)})^T A(x^{(k)} + α_k p^{(k)}) − (x^{(k)} + α_k p^{(k)})^T b
    = (1/2)x^{(k)T} Ax^{(k)} + α_k p^{(k)T} Ax^{(k)} + (1/2)α_k² p^{(k)T} Ap^{(k)} − x^{(k)T} b − α_k p^{(k)T} b
    = f(x^{(k)}) + α_k p^{(k)T}(Ax^{(k)} − b) + (1/2)α_k² p^{(k)T} Ap^{(k)}
    = f(x^{(k)}) + (1/2)p^{(k)T} Ap^{(k)} α_k² − p^{(k)T} r^{(k)} α_k,

where r^{(k)} = b − Ax^{(k)}, the residual. This is a quadratic equation in α_k (since this is the only free variable). Thus, minimizing this expression exactly requires the derivative with respect to α_k to be zero:

  0 = d f(x^{(k)} + α_k p^{(k)})/dα_k = p^{(k)T} Ap^{(k)} α_k − p^{(k)T} r^{(k)},


Left:                                  Middle:                                 Right:
  Given: A, b, x^{(0)}, p^{(0)}          Given: A, b, x^{(0)}, p^{(0)}           Given: A, b, x, p
  r^{(0)} = b − Ax^{(0)}                 r^{(0)} = b − Ax^{(0)}                  r = b − Ax
  for k = 0, 1, · · ·                    for k = 0, 1, · · ·                     for k = 0, 1, · · ·
                                                                                   q = Ap
    α_k = p^{(k)T}r^{(k)}/p^{(k)T}Ap^{(k)}  α_k = p^{(k)T}r^{(k)}/p^{(k)T}Ap^{(k)}   α = p^T r / p^T q
    x^{(k+1)} = x^{(k)} + α_k p^{(k)}       x^{(k+1)} = x^{(k)} + α_k p^{(k)}        x = x + αp
    r^{(k+1)} = b − Ax^{(k+1)}              r^{(k+1)} = r^{(k)} − α_k Ap^{(k)}       r = r − αq
    p^{(k+1)} = next direction              p^{(k+1)} = next direction              p = next direction
  endfor                                 endfor                                  endfor

Figure 19.1: Three basic descent methods. The formulation in the middle follows from the one on the left by virtue of b − Ax^{(k+1)} = b − A(x^{(k)} + α_k p^{(k)}) = r^{(k)} − α_k Ap^{(k)}. It is preferred because it requires only one matrix-vector multiplication per iteration (to form Ap^{(k)}). The one on the right overwrites the various vectors to reduce storage.

and hence, for exact descent methods,

  α_k = p^{(k)T} r^{(k)} / (p^{(k)T} Ap^{(k)})   and   x^{(k+1)} = x^{(k)} + α_k p^{(k)}.

A question now becomes how to pick the search directions p^{(0)}, p^{(1)}, . . ..

Basic descent methods based on these ideas are given in Figure 19.1. What is missing in those algorithms is a stopping criterion. Notice that α_k = 0 if p^{(k)T} r^{(k)} = 0. But this does not necessarily mean that we have converged, since it merely means that p^{(k)} is orthogonal to r^{(k)}. What we will see is that commonly used methods pick p^{(k)} not to be orthogonal to r^{(k)}. Then p^{(k)T} r^{(k)} = 0 implies r^{(k)} = 0, and this condition, cheap to compute, can be used as a stopping criterion.

19.3 Relation to Splitting Methods

Let us do something really simple:

• Pick p^{(k)} = e_{k mod n}.

• p^{(0)} = e_0.

• p^{(0)T} Ap^{(0)} = e_0^T Ae_0 = α_{0,0} (the (0,0) element in A).

• r^{(0)} = b − Ax^{(0)}.

• p^{(0)T} r^{(0)} = e_0^T(b − Ax^{(0)}) = e_0^T b − e_0^T Ax^{(0)} = β_0 − a_0^T x^{(0)}, where a_i^T denotes the ith row of A.


• x^{(1)} = x^{(0)} + α_0 p^{(0)} = x^{(0)} + [p^{(0)T} r^{(0)} / (p^{(0)T} Ap^{(0)})] e_0 = x^{(0)} + [(β_0 − a_0^T x^{(0)})/α_{0,0}] e_0. This means that only the first element of x^{(0)} changes, and it changes to

  χ_0^{(1)} = χ_0^{(0)} + (1/α_{0,0}) ( β_0 − Σ_{j=0}^{n−1} α_{0,j} χ_j^{(0)} ) = (1/α_{0,0}) ( β_0 − Σ_{j=1}^{n−1} α_{0,j} χ_j^{(0)} ).

Careful contemplation then reveals that this is exactly how the first element in vector x is changed in the Gauss-Seidel method!

We leave it to the reader to similarly deduce that x^{(n)} is exactly the vector one gets by applying one sweep of Gauss-Seidel (updating all elements of x once):

  x^{(n)} = (D + L)^{−1}(b − L^T x^{(0)}),

where D, L, and L^T equal the diagonal, strictly lower triangular, and strictly upper triangular parts of the symmetric matrix A, so that A = M − N with M = D + L and N = −L^T.

Given that choosing the search directions cyclically as e_0, e_1, . . . yields the Gauss-Seidel iteration, one can imagine picking x^{(k+1)} = x^{(k)} + ωα_k p^{(k)}, with some relaxation factor ω. This would yield SOR if we continue to use, cyclically, the search directions e_0, e_1, . . ..

When the next iterate is chosen to equal the exact minimum along the line defined by the search direction, the method is called an exact descent method and it is, clearly, guaranteed that f(x^{(k+1)}) ≤ f(x^{(k)}). When a relaxation parameter ω ≠ 1 is used, this is no longer the case, and the method is called inexact.

19.4 Method of Steepest Descent

For a function f : R^n → R that we are trying to minimize, for a given x, the direction in which the function most rapidly increases in value at x is given by the gradient, ∇f(x). Thus, the direction in which it decreases most rapidly is −∇f(x). For our function

  f(x) = (1/2)x^T Ax − x^T b,

this direction of steepest descent is given by

  −∇f(x) = −(Ax − b) = b − Ax.

Thus, recalling that r^{(k)} = b − Ax^{(k)}, the direction of steepest descent at x^{(k)} is given by p^{(k)} = r^{(k)} = b − Ax^{(k)}, the residual, so that the iteration becomes

  x^{(k+1)} = x^{(k)} + α_k r^{(k)},   with   α_k = p^{(k)T} r^{(k)} / (p^{(k)T} Ap^{(k)}) = p^{(k)T} r^{(k)} / (p^{(k)T} q^{(k)}),

where q^{(k)} = Ap^{(k)}. We again notice that

  r^{(k+1)} = b − Ax^{(k+1)} = b − A(x^{(k)} + α_k p^{(k)}) = b − Ax^{(k)} − α_k Ap^{(k)} = r^{(k)} − α_k q^{(k)}.

These insights explain the algorithm in Figure 19.2 (left).
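A direct transcription of the left algorithm of Figure 19.2 into numpy, with ‖r‖ used as the cheap stopping criterion discussed earlier:

import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, maxit=10000):
    x = np.array(x0, dtype=float)
    r = b - A @ x                      # r^(0)
    for _ in range(maxit):
        if np.linalg.norm(r) <= tol:
            break
        q = A @ r                      # q^(k) = A p^(k), with p^(k) = r^(k)
        alpha = (r @ r) / (r @ q)      # alpha_k
        x += alpha * r                 # x^(k+1) = x^(k) + alpha_k r^(k)
        r -= alpha * q                 # r^(k+1) = r^(k) - alpha_k q^(k)
    return x

Convergence is linear, with a rate that degrades roughly like (κ(A) − 1)/(κ(A) + 1); that ill-conditioning problem is what the next two sections address.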


19.5 Preconditioning

For general nonlinear f(x), using the direction of steepest descent as the search direction is often effective. Not necessarily so for our problem, if A is relatively ill-conditioned.

A picture clarifies this (on the blackboard). The key insight is that if κ(A) = λ_0/λ_{n−1} (the ratio between the largest and smallest eigenvalues or, equivalently, the ratio between the largest and smallest singular values) is large, then convergence can take many iterations.

What would happen if λ_0 = · · · = λ_{n−1}? Then A = QΛQ^T is the spectral decomposition of A, and A = Q(λ_0 I)Q^T = λ_0 I, so that

  x^{(1)} = x^{(0)} + [r^{(0)T} r^{(0)} / (r^{(0)T} λ_0 I r^{(0)})] r^{(0)} = x^{(0)} + (1/λ_0) r^{(0)} = x^{(0)} + (1/λ_0)(b − λ_0 x^{(0)}) = x^{(0)} − x^{(0)} + (1/λ_0) b = (1/λ_0) b,

which is the solution to λ_0 I x = b. Thus, the iteration converges in one step.

The point we are trying to make is that if A is well-conditioned, then the method of steepest descent converges faster. Now, Ax = b is equivalent to M^{−1}Ax = M^{−1}b. Hence, one can define a new problem with the same solution and, hopefully, a better condition number by letting Ã = M^{−1}A and b̃ = M^{−1}b. The problem is that our methods are defined for SPD matrices, and, generally speaking, M^{−1}A will not be SPD.

If one chooses M to be SPD with Cholesky factorization M = LL^T, then one can instead transform the problem into L^{−1}AL^{−T} L^T x = L^{−1}b. One then solves Ãx̃ = b̃, where Ã = L^{−1}AL^{−T}, x̃ = L^T x, and b̃ = L^{−1}b, and eventually transforms the solution of this back to the solution of the original problem by solving L^T x = x̃. If M is chosen carefully, κ(L^{−1}AL^{−T}) can be greatly improved. The best choice would be M = A, of course, but that is not realistic. The matrix M is called the preconditioner. Some careful rearrangement takes the method of steepest descent on the transformed problem to the much simpler preconditioned algorithm in Figure 19.2 (right).

19.6 Methods of A-conjugate Directions

We now borrow heavily from the explanation in Golub and Van Loan [21].

If we start with x^{(0)} = 0, then

  x^{(k+1)} = α_0 p^{(0)} + · · · + α_k p^{(k)}.

Thus, x^{(k+1)} ∈ S(p^{(0)}, . . . , p^{(k)}), the subspace spanned by the search directions so far. It would be nice if each x^{(k+1)} satisfied

  f(x^{(k+1)}) = min_{x ∈ S(p^{(0)},...,p^{(k)})} f(x)   (19.1)

and the search directions were linearly independent. The primary reason is that then the descent method is guaranteed to complete in at most n iterations, since then

  S(p^{(0)}, . . . , p^{(n−1)}) = R^n,

so that

  f(x^{(n)}) = min_{x ∈ S(p^{(0)},...,p^{(n−1)})} f(x) = min_{x ∈ R^n} f(x),

so that Ax^{(n)} = b.


Left:                                    Middle:                                                 Right:
  Given: A, b, x^{(0)}                     Given: A, b, x^{(0)}                                    Given: A, b, x^{(0)}
                                           Ã = L^{−1}AL^{−T}, b̃ = L^{−1}b, x̃^{(0)} = L^T x^{(0)}
  r^{(0)} = b − Ax^{(0)}, p^{(0)} = r^{(0)}  r̃^{(0)} = b̃ − Ãx̃^{(0)}, p̃^{(0)} = r̃^{(0)}             r^{(0)} = b − Ax^{(0)}, p^{(0)} = M^{−1}r^{(0)}
  for k = 0, 1, · · ·                      for k = 0, 1, · · ·                                     for k = 0, 1, · · ·
    q^{(k)} = Ap^{(k)}                       q̃^{(k)} = Ãp̃^{(k)}                                     q^{(k)} = Ap^{(k)}
    α_k = p^{(k)T}r^{(k)}/p^{(k)T}q^{(k)}    α̃_k = p̃^{(k)T}r̃^{(k)}/p̃^{(k)T}q̃^{(k)}                 α_k = p^{(k)T}r^{(k)}/p^{(k)T}q^{(k)}
    x^{(k+1)} = x^{(k)} + α_k p^{(k)}        x̃^{(k+1)} = x̃^{(k)} + α̃_k p̃^{(k)}                     x^{(k+1)} = x^{(k)} + α_k p^{(k)}
    r^{(k+1)} = r^{(k)} − α_k q^{(k)}        r̃^{(k+1)} = r̃^{(k)} − α̃_k q̃^{(k)}                     r^{(k+1)} = r^{(k)} − α_k q^{(k)}
    p^{(k+1)} = r^{(k+1)}                    p̃^{(k+1)} = r̃^{(k+1)}                                  p^{(k+1)} = M^{−1}r^{(k+1)}
                                             x^{(k+1)} = L^{−T} x̃^{(k+1)}
  endfor                                   endfor                                                 endfor

Figure 19.2: Left: method of steepest descent. Middle: method of steepest descent on the transformed problem. Right: preconditioned method of steepest descent. It can be checked that the x^{(k)} computed by the middle algorithm is exactly the x^{(k)} computed by the one on the right. Of course, the computation x^{(k+1)} = L^{−T} x̃^{(k+1)} needs only be done once, after convergence, in the algorithm in the middle.

Unfortunately, the method of steepest descent does not have this property. The iterate x(k+1) minimizesf (x) where x is constraint to be on the line x(k)+αp(k). Because in each step f (x(k+1))≤ f (x(k)), a slightlystronger result holds: It also minimizes f (x) where x is constraint to be on the union of lines x( j)+αp( j),j = 0, . . . ,k. However, that is not the same as it minimizing over all vectors in S(p(0), . . . , p(k)).

We can write write (19.1) more concisely: Let P(k) =(

p(0) p(1) · · · p(k))

. Then

minx∈S(p(0),...,p(k))

f (x) = miny

f (P(k)y) = miny

f ((

P(k−1) p(k)) y0

ψ1

)

= miny

12

( P(k−1) p(k)) y0

ψ1

T

A(

P(k−1) p(k)) y0

ψ1

( P(k−1) p(k)) y0

ψ1

T

b

.= min

y

[12

[yT

0 P(k−1)T +ψ1 p(k)T]

A[P(k−1)y0 +ψ1 p(k)

]−[yT

0 P(k−1)T +ψ1 p(k)T]

b]

= miny

[12

yT0 P(k−1)T AP(k−1)y0 +ψ1yT

0 P(k−1)T Ap(k)+12

ψ21 p(k)T Ap(k)− yT

0 P(k−1)T b−ψ1 p(k)T b]

Page 336: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 19. Notes on Descent Methods and the Conjugate Gradient Method 320

= miny

[12

yT0 P(k−1)T AP(k−1)y0− yT

0 P(k−1)T b+ψ1yT0 P(k−1)T Ap(k)+

12

ψ21 p(k)T Ap(k)−ψ1 p(k)T b

].

where y =

y0

ψ1

. Now, if

P(k−1)T Ap(k) = 0

then

minx∈S(p(0),...,p(k))

f (x) = miny

f (P(k)y)

= miny

[12

yT0 P(k−1)T AP(k−1)y0− yT

0 P(k−1)T b+12

ψ21 p(k)T Ap(k)−ψ1 p(k)T b

].

= miny0

[12

yT0 P(k−1)T AP(k−1)y0− yT

0 P(k−1)T b]+min

ψ1

[12

ψ21 p(k)T Ap(k)−ψ1 p(k)T b

]= min

y0f (P(k−1)y0)+0.

where the minimizing ψ1 is given by

ψ1 =p(k)T b

p(k)T Ap(k)

so thatx(k+1) = P(k−1)y0 +ψ1 p(k) = x(k)+ψ1 p(k).

SinceP(k−1)T Ap(k)

is equivalent to p( j)T Ap(k) = 0, j = 0, . . . ,k− 1, we are looking for a sequence of directions that areA-conjugate:

Definition 19.2 Let p(0), . . . , p(k−1) ∈ Rn and A be SPD. Then this sequence of vectors is said to be A-conjugate if p( j)T Ap(k) = 0 if and only if j 6= k.

19.7 Conjugate Gradient Method

The Conjugate Gradient method (CG) is a descent method that picks the search directions very carefully.Specifically, it picks them to be ”A-conjugate”, meaning that pT

i Ap j = 0 if i 6= j. It also starts with x(0) = 0so that p(0) = r(0) = b.

Notice that, if x(0) = 0 then, by nature, descent methods yield

x(k+1) = α0 p(0)+ · · ·+αk p(k).

Also, the residual

r(k+1) = b−Ax(k+1) = b−A(α0 p(0)+ · · ·+αk−1 p(k)) = b−α0Ap(0)−·· ·−αk−1kAp(k).

Page 337: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

19.7. Conjugate Gradient Method 321

Given: A, b, x(0) = 0 Given: A, b, x(0) = 0 Given: A, b, x(0) = 0

r(0) = b, p(0) = r(0)

for k = 0,1, · · ·q(k) = Ap(k)

αk =p(k)T r(k)

p(k)T q(k)

x(k+1) = x(k)+αk p(k)

r(k+1) = r(k+1)−αkq(k)

γk =− r(k+1)T Ap(k)

p(k)T Ap(k)

p(k+1) = r(k+1)+ γk p(k)

endfor

Figure 19.3: Left: Basic Conjugate Gradient Method.

Now, if r(k+1) = 0, then we know that x(k+1) solves Ax = b, and we are done. So, we can assume thatr(k+1) 6= 0. How can we construct a new p(k+1) such that p(k+1)T Ap(k) = 0. For the Conjugate Gradientmethod we pick

p(k+1) = r(k+1)+ γk p(k)

such thatp(k+1)T

Ap(k) = 0.

Now,0 = p(k+1)T Ap(k) = (r(k+1)+ γk p(k))T Ap(k) = r(k+1)T Ap(k)+ γk p(k)T Ap(k)

so that

γk =−r(k+1)T Ap(k)

p(k)T Ap(k).

Now, it can be shown that not only p(k+1)T Ap(k), but also p(k+1)T Ap( j), for j = 0,1, . . . ,k. Thus, theConjugate Gradient Method is an A-conjugate method.

Working these insights into the basic descent method algorithm yields the most basic form of theConjugate Gradient method, summarized in Figure 19.3 (left).

Page 338: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 19. Notes on Descent Methods and the Conjugate Gradient Method 322

Page 339: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 20Notes on Lanczos Methods

These notes are quite incomplete. More to come in the future!

In this note, we give a brief overview of the ideas behind Lanczos Methods. These methods are usefulwhen computing (some) eigenvalues and eigenvectors of a sparse matrix.

323

Page 340: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 20. Notes on Lanczos Methods 324

Outline

Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324

20.1. Krylov Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325

Page 341: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

20.1. Krylov Subspaces 325

20.1 Krylov Subspaces

Consider matrix A, some initial unit vector q0, and the sequence x0,x1, . . . defined by

xk+1 := Axk, k = 0,1, . . . ,

where x0 = q0. We already saw this sequence when we discussed the power method.

Definition 20.1 Given square matrix A and initial vector x0, the Krylov subspace K (A,x0,k) is definedby

K (A,x0,k) = S(x0,Ax0, . . . ,Ak−1x0) = S(x0,x1, . . . ,xk−1),

where x j+1 = Ax j, for j = 0,1, . . . ,. The Krylov matrix is defined by

K(A,x0,k) =(

x0 Ax0 . . . Ak−1x0

)=(

x0 x1 . . . xk−1

)From our discussion of the Power Method, we notice that it is convenient to instead work with mutually

orthonormal vectors. Consider the QR factorization of K(A,q0,n):

K(A,q0,n) =(

q0 x1 . . . xn−1

)=(

q0 q1 . . . qn−1

)

1 ρ0,1 . . . ρ0,n−1

0 ρ1,1 . . . ρ1,n−1...

... . . . ...

0 0 · · · ρn−1,n−1

= QR

Notice thatAK(A,q0,n) = A

(q0 x1 . . . xn−1

)=(

x1 . . . xn−1 Axn−1

).

Partition

K(A,q0,n) =(

q0 X1

),Q =

(q0 Q1

), and R =

1 rT01

0 R11

.

ThenK(A,q0,n) = A

(q0 X1

)=(

X1 Axn−1

)and hence

K(A,q0,n) = A(

q0 Q1

) 1 rT01

0 R11

=(

Q1R11−q0rT01 Axn−1

)Now, notice that

QT AQR =(

q0 Q1

)TA(

q0 Q1

)R

=(

q0 Q1

)T (Q1R11−q0rT

01 Axn−1

)=

( (q0 Q1

)T(Q1R11−q0rT

01)(

q0 Q1

)TAxn−1

)=

rT01 qT

0 Axn−1

R11 QT1 Axn−1

.

Page 342: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 20. Notes on Lanczos Methods 326

Finally, we see that (if R is invertible)

QT AQ =

rT01 qT

0 Axn−1

R11 QT1 Axn−1

︸ ︷︷ ︸

upperHessenberg!

R−1

︸ ︷︷ ︸upperHessenberg!

= T.

We emphasize that

• T = QT AQ is upperHessenberg (and if A is symmetric, it is tridiagonal).

• Q is generated by the initial vector x0 = q0.

From the Implicit Q Theorem, we thus know that T and Q are uniquely determined! Finally, QT AQ is aunitary similarity transformations so that the eigenvalues can be computed from T and then the eigenvec-tors of T can be transformed back to eigenvectors of Q.

One way for computing the tridiagonal matrix is, of course, to use Householder transformations. How-ever, that would quickly fill zeroes that exist in matrix A. Instead, we compute it more direction. We willdo so under the assumption that A is symmetric and that therefore T is tridiagonal.

Consider AQ = QT and partition

Q =(

QL qM QR

)and T =

TT L ? 0

τMLeTL τMM ?

0 τBMe0 TBR

The idea is that QL, qM have already been computed, as have TT L and τML. Also, we assume that atemporary vector u holds AqM−τMLQLeL. Initially, QL has zero rows and qM equals the q1 with which westarted our discussion. Also, initially, TT L is 0×0 so that τML doesn’t really exist.

Now, we repartition (QL qM QR

)→(

Q0 q1 q2 Q3

)and

TT L ? 0

τMLeTL τMM ?

0 τBMe0 TBR

T00 ? 0 0

τ10eTL τ11 ? 0

0 τ21 τ22 ?

0 0 τ32e0 T33

,

which one sould visualize as

× × 0 0 0 0

× × τML 0 0 0

0 τML τMM τBM 0 0

0 0 τBM × × 0

0 0 0 × × ×0 0 0 0 × ×

× × 0 0 0 0

× × τ10 0 0 0

0 τ10 τ11 τ21 0 0

0 0 τ21 τ22 τ32 0

0 0 0 τ32 × ×0 0 0 0 × ×

,

Page 343: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

20.1. Krylov Subspaces 327

so that

A(

Q0 q1 q2 Q3

)=(

Q0 q1 q2 Q3

)

T00 ? 0 0

τ10eTL τ11 ? 0

0 τ21 τ22 ?

0 0 τ32e0 T33

Notice that then

Aq1 = τ10Q0eL + τ11q1 + τ21q2.

We assumed that u already contained u := AqM− τMLQLeL which means that u = Aq1− τ10Q0eL. (Noticethat Q0eL is merely the last column of Q0.) Next,

qT1 u = qT

1 (Aq1− τ10Q0el) = τ11qT1 q1 + τ21qT

1 q2 = τ11,

which means that τ11 := qT1 u. This allows us to update u := u− τ11q1 = Aq1− τ10Q0e1− τ11q1. We

notice that now τ21q2 = u and hence we can choose τ21 := ‖q2‖2, since q2 must have length one, and thenq2 := u/τ21. We then create a new vector u := Aq2− τ21q1. This allows us to move where we are in thematrices forward: (

QL qM QR

)←(

Q0 q1 q2 Q3

)and

TT L ? 0

τMLeTL τMM ?

0 τBMe0 TBR

T00 ? 0 0

τ10eTL τ11 ? 0

0 τ21 τ22 ?

0 0 τ32e0 T33

,

which one sould visualize as

× × 0 0 0 0

× × τML 0 0 0

0 τML τMM τBM 0 0

0 0 τBM × × 0

0 0 0 × × ×0 0 0 0 × ×

× × 0 0 0 0

× × τ10 0 0 0

0 τ10 τ11 τ21 0 0

0 0 τ21 τ22 τ32 0

0 0 0 τ32 × ×0 0 0 0 × ×

.

These observations are captured in the algorithm in Figure 20.1.Consider

A(

Q0 q1 q2 Q3

)=(

Q0 q1 q2 Q3

)

T00 ? 0 0

τ10eTL τ11 ? 0

0 τ21 τ22 ?

0 0 τ32e0 T33

Page 344: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 20. Notes on Lanczos Methods 328

Algorithm: [Q,T ] := LANCZOS UNB VAR1(A,q0,Q,T )

Partition Q→(

QL qM QR

), T →

TT L ? 0

τMLeTL τMM ?

0 τBMe0 TBR

whereQL has 0 columns, TT L is 0×0

qM := q0; u = Aq0

while n(QL)< n(Q) do

Repartition(QL qM QR

)→(

Q0 q1 q2 Q3

),

TT L ? 0

τMLeTL τMM ?

0 τBMe0 TBR

T00 ? 0 0

τ10eTL τ11 ? ?

0 τ21 τ22 ?

0 0 τ32e0 T33

whereq2 has 1 column, τ22 is 1×1

τ11 := qT1 u

u := u− τ11q1

τ21 := ‖q2‖2

q2 := u/τ21

u := Aq2− τ21q1

Continue with(QL qM QR

)→(

Q0 q1 q2 Q3

),

TT L ? 0

τMLeTL τMM ?

0 τBMe0 TBR

T00 ? 0 0

τ10eTL τ11 ? ?

0 τ21 τ22 ?

0 0 τ32e0 T33

endwhile

Figure 20.1: Lanscos algorithm for symmetric Q. It computes Q and T such that AQ = QT . It lacks astopping criteria.

Page 345: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

20.1. Krylov Subspaces 329

• If q2 represents the kth column of Q, then it is in the direction of (Ak−1q0)⊥, the component of

Ak−1q0 that is orthogonal to the columns of(

Q0 q1

).

• As k gets large, Ak−1q0 is essentially in the space spanned by(

Q0 q1

).

• q2 is computed fromτ21q2 = Aq1− τ10Q0eL− τ11q1.

• The length of Aq1−τ10Q0eL−τ11q1 must be very small as k gets large since Ak−1q0 eventually liesapproximately in the direction of Ak−2q0.

• This means that τ21 = ‖Aq1− τ10Q0eL− τ11q1‖2 will be small.

• Hence

A(

Q0 q1 q2 Q3

)≈(

Q0 q1 q2 Q3

)

T00 ? 0 0

τ10eTL τ11 0 0

0 0 ? ?

0 0 ? ?

or, equivalently,

A(

Q0 q1

)≈(

Q0 q1

) T00 ?

τ10eTL τ11

= S,

where S is a (k−1)× (k−1) tridiagonal matrix.

Here the space spanned by(

Q0 q1

), C (

(Q0 q1

)), is said to be an invariant subspace of

matrix A.

• If we compute the Spectral Decomposition of S:

S = QSΛSQTS

thenA[(

Q0 q1

)QS

]≈[(

Q0 q1

)QS

]ΛS.

• We conclude that, approximately,

– the diagonal elements of ΛS equal eigenvalues of A and

– the columns of[(

Q0 q1

)QS

]equal the corresponding eigenvectors.

The algorithm now, with stopping criteria, is given in Figure 20.2.

Page 346: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 20. Notes on Lanczos Methods 330

Algorithm: [QL,TT L] := LANCZOS UNB VAR1(A,q0,Q,T )

Partition Q→(

QL qM QR

), T →

TT L ? 0

τMLeTL τMM ?

0 τBMe0 TBR

whereQL has 0 columns, TT L is 0×0

qM := q0; u = Aq0

while |τML|> tolerance do

Repartition(QL qM QR

)→(

Q0 q1 q2 Q3

),

TT L ? 0

τMLeTL τMM ?

0 τBMe0 TBR

T00 ? 0 0

τ10eTL τ11 ? ?

0 τ21 τ22 ?

0 0 τ32e0 T33

whereq2 has 1 column, τ22 is 1×1

τ11 := qT1 u

u := u− τ11q1

τ21 := ‖q2‖2

if |τ21|> tolerance q2 := u/τ21

u := Aq2− τ21q1

Continue with(QL qM QR

)→(

Q0 q1 q2 Q3

),

TT L ? 0

τMLeTL τMM ?

0 τBMe0 TBR

T00 ? 0 0

τ10eTL τ11 ? ?

0 τ21 τ22 ?

0 0 τ32e0 T33

endwhile

Figure 20.2: Lanscos algorithm for symmetric Q. It computes QL and TT L such that AQL ≈ QLTT L.

Page 347: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

More Chapters to be Added in the Future!

331

Page 348: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 20. Notes on Lanczos Methods 332

Page 349: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Answers

Chapter 1. Notes on Simple Vector and Matrix Operations (Answers)

Homework 1.1 Partition A

A =(

a0 a1 · · · an−1

)=

aT

0

aT1...

aTm−1

.

Convince yourself that the following hold:

•(

a0 a1 · · · an−1

)T=

aT

0

aT1...

aTm−1

.

aT

0

aT1...

aTm−1

T

=(

a0 a1 · · · an−1

).

•(

a0 a1 · · · an−1

)H=

aH

0

aH1...

aHm−1

.

333

Page 350: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

aT

0

aT1...

aTm−1

H

=(

a0 a1 · · · an−1

).

* BACK TO TEXT

Homework 1.2 Partition x into subvectors:

x =

x0

x1...

xN−1

.

Convince yourself that the following hold:

• x =

x0

x1...

xN−1

.

• xT =(

xT0 xT

1 · · · xTN−1

).

• xH =(

xH0 xH

1 · · · xHN−1

).

* BACK TO TEXT

Homework 1.3 Partition A

A =

A0,0 A0,1 · · · A0,N−1

A1,0 A1,1 · · · A1,N−1...

... · · · ...

AM−1,0 AM−1,1 · · · AM−1,N−1

,

where Ai, j ∈ Cmi×ni . Here ∑M−1i=0 mi = m and ∑

N−1j=0 ni = n.

Convince yourself that the following hold:

334

Page 351: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

A0,0 A0,1 · · · A0,N−1

A1,0 A1,1 · · · A1,N−1...

... · · · ...

AM−1,0 AM−1,1 · · · AM−1,N−1

T

=

AT

0,0 AT1,0 · · · AT

M−1

AT0,1 AT

1,1 · · · ATM−1,1

...... · · · ...

AT0,N−1 AT

1,N−1 · · · ATM−1,N−1

.

A0,0 A0,1 · · · A0,N−1

A1,0 A1,1 · · · A1,N−1...

... · · · ...

AM−1,0 AM−1,1 · · · AM−1,N−1

H

=

AH

0,0 AH1,0 · · · AH

M−1

AH0,1 AH

1,1 · · · AHM−1,1

...... · · · ...

AH0,N−1 AH

1,N−1 · · · AHM−1,N−1

.

* BACK TO TEXT

Homework 1.4 Homework 1.5 Convince yourself of the following:

• αxT =(

αχ0 αχ1 · · · αχn−1

).

• (αx)T = αxT .

• (αx)H = αxH .

• α

x0

x1...

xN−1

=

αx0

αx1...

αxN−1

* BACK TO TEXT

Homework 1.5 Homework 1.6 Convince yourself of the following:

• α

x0

x1...

xN−1

+

y0

y1...

yN−1

=

αx0 + y0

αx1 + y1...

αxN−1 + yN−1

. (Provided xi,yi ∈ Cni and ∑N−1i=0 ni = n.)

* BACK TO TEXT

Homework 1.6 Homework 1.7 Convince yourself of the following:

335

Page 352: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

x0

x1...

xN−1

Hy0

y1...

yN−1

= ∑N−1i=0 xH

i yi. (Provided xi,yi ∈ Cni and ∑N−1i=0 ni = n.)

* BACK TO TEXT

Homework 1.7 Homework 1.8 Prove that xHy = yHx.* BACK TO TEXT

336

Page 353: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 2. Notes on Vector and Matrix Norms (Answers)

Homework 2.2 Prove that if ν :Cn→R is a norm, then ν(0) = 0 (where the first 0 denotes the zero vectorin Cn).

Answer: Let x ∈ Cn and~0 the zero vector of size n and 0 the scalar zero. Then

ν(~0) = ν(0 · x) 0 · x =~0

= |0|ν(x) ν(·) is homogeneous

= 0 algebra

* BACK TO TEXT

Homework 2.7 The vector 1-norm is a norm.

Answer: We show that the three conditions are met:Let x,y ∈ Cn and α ∈ C be arbitrarily chosen. Then

• x 6= 0⇒‖x‖1 > 0 (‖ · ‖1 is positive definite):

Notice that x 6= 0 means that at least one of its components is nonzero. Let’s assume thatχ j 6= 0. Then

‖x‖1 = |χ0|+ · · ·+ |χn−1| ≥ |χ j|> 0.

• ‖αx‖1 = |α|‖x‖1 (‖ · ‖1 is homogeneous):

‖αx‖1 = |αχ0|+ · · ·+ |αχn−1| = |α||χ0|+ · · ·+ |α||χn−1| = |α|(|χ0|+ · · ·+ |χn−1|) =|α|(|χ0|+ · · ·+ |χn−1|) = |α|‖x‖1.

• ‖x+ y‖1 ≤ ‖x‖1 +‖y‖1 (‖ · ‖1 obeys the triangle inequality).

‖x+ y‖1 = |χ0 +ψ0|+ |χ1 +ψ1|+ · · ·+ |χn−1 +ψn−1|≤ |χ0|+ |ψ0|+ |χ1|+ |ψ1|+ · · ·+ |χn−1|+ |ψn−1|= |χ0|+ |χ1|+ · · ·+ |χn−1|+ |ψ0|+ |ψ1|+ · · ·+ |ψn−1|= ‖x‖1 +‖y‖1.

* BACK TO TEXT

Homework 2.9 The vector ∞-norm is a norm.

Answer: We show that the three conditions are met:Let x,y ∈ Cn and α ∈ C be arbitrarily chosen. Then

337

Page 354: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

• x 6= 0⇒‖x‖∞ > 0 (‖ · ‖∞ is positive definite):

Notice that x 6= 0 means that at least one of its components is nonzero. Let’s assume thatχ j 6= 0. Then

‖x‖∞ = maxi|χi| ≥ |χ j|> 0.

• ‖αx‖∞ = |α|‖x‖∞ (‖ · ‖∞ is homogeneous):

‖αx‖∞ = maxi |αχi|= maxi |α||χi|= |α|maxi |χi|= |α|‖x‖∞.

• ‖x+ y‖∞ ≤ ‖x‖∞ +‖y‖∞ (‖ · ‖∞ obeys the triangle inequality).

‖x+ y‖∞ = maxi|χi +ψi|

≤ maxi(|χi|+ |ψi|)

≤ maxi(|χi|+max

j|ψ j|)

= maxi|χi|+max

j|ψ j|= ‖x‖∞ +‖y‖∞.

* BACK TO TEXT

Homework 2.13 Show that the Frobenius norm is a norm.

Answer: The answer is to realize that if A =(

a0 a1 · · · an−1

)then

‖A‖F =

√√√√m−1

∑i=0

n−1

∑j=0|αi, j|2 =

√√√√n−1

∑j=0

m−1

∑i=0|αi, j|2 =

√√√√n−1

∑j=0‖a j‖2

2 =

√√√√√√√√√√√

∥∥∥∥∥∥∥∥∥∥∥

a0

a1...

an−1

∥∥∥∥∥∥∥∥∥∥∥

2

2

.

In other words, it equals the vector 2-norm of the vector that is created by stacking the columns of A ontop of each other. The fact that the Frobenius norm is a norm then comes from realizing this connectionand exploiting it.

Alternatively, just grind through the three conditions!* BACK TO TEXT

Homework 2.20 Let A ∈ Cm×n and partition A =

aT

0

aT1...

aTm−1

. Show that

‖A‖∞ = max0≤i<m

‖ai‖1 = max0≤i<m

(|αi,0|+ |αi,1|+ · · ·+ |αi,n−1|)

338

Page 355: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Answer: Partition A =

aT

0

aT1...

aTm−1

. Then

‖A‖∞ = max‖x‖∞=1

‖Ax‖∞ = max‖x‖∞=1

∥∥∥∥∥∥∥∥∥∥∥

aT

0

aT1...

aTm−1

x

∥∥∥∥∥∥∥∥∥∥∥∞

= max‖x‖∞=1

∥∥∥∥∥∥∥∥∥∥∥

aT

0 x

aT1 x...

aTm−1x

∥∥∥∥∥∥∥∥∥∥∥∞

= max‖x‖∞=1

(max

i|aT

i x|)= max‖x‖∞=1

maxi|

n−1

∑p=0

αi,pχp| ≤ max‖x‖∞=1

maxi

n−1

∑p=0|αi,pχp|

= max‖x‖∞=1

maxi

n−1

∑p=0

(|αi,p||χp|)≤ max‖x‖∞=1

maxi

n−1

∑p=0

(|αi,p|(maxk|χk|))≤ max

‖x‖∞=1max

i

n−1

∑p=0

(|αi,p|‖x‖∞)

= maxi

n−1

∑p=0

(|αi,p|= ‖ai‖1

so that ‖A‖∞ ≤maxi ‖ai‖1.We also want to show that ‖A‖∞ ≥ maxi ‖ai‖1. Let k be such that maxi ‖ai‖1 = ‖ak‖1 and pick y =ψ0

ψ1...

ψn−1

so that aTk y = |αk,0|+ |αk,1|+ · · ·+ |αk,n−1| = ‖ak‖1. (This is a matter of picking ψi so that

|ψi|= 1 and ψiαk,i = |αk,i|.) Then

‖A‖∞ = max‖x‖1=1

‖Ax‖∞ = max‖x‖1=1

∥∥∥∥∥∥∥∥∥∥∥

aT

0

aT1...

aTm−1

x

∥∥∥∥∥∥∥∥∥∥∥∞

∥∥∥∥∥∥∥∥∥∥∥

aT

0

aT1...

aTm−1

y

∥∥∥∥∥∥∥∥∥∥∥∞

=

∥∥∥∥∥∥∥∥∥∥∥

aT

0 y

aT1 y...

aTm−1y

∥∥∥∥∥∥∥∥∥∥∥∞

≥ |aTk y|= aT

k y = ‖ak‖1.= maxi‖ai‖1

* BACK TO TEXT

339

Page 356: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Homework 2.21 Let y ∈ Cm and x ∈ Cn. Show that ‖yxH‖2 = ‖y‖2‖x‖2.

Answer: Answer needs to be filled in.* BACK TO TEXT

Homework 2.25 Show that ‖Ax‖µ ≤ ‖A‖µ,ν‖x‖ν.

Answer: W.l.o.g. let x 6= 0.

‖A‖µ,ν = maxy6=0

‖Ay‖µ

‖y‖ν

≥‖Ax‖µ

‖x‖ν

.

Rearranging this establishes the result.* BACK TO TEXT

Homework 2.26 Show that ‖AB‖µ ≤ ‖A‖µ,ν‖B‖ν.

Answer:

‖AB‖µ,ν = max‖x‖ν=1

‖ABx‖µ ≤ max‖x‖ν=1

‖A‖µ,ν‖Bx‖ν = ‖A‖µ,ν max‖x‖ν=1

‖Bx‖ν = ‖A‖µ,ν‖B‖ν

* BACK TO TEXT

Homework 2.27 Show that the Frobenius norm, ‖ · ‖F , is submultiplicative.

Answer:

‖AB‖2F =

∥∥∥∥∥∥∥∥∥∥∥

aH

0

aH1...

aHm−1

(

b0 b1 · · · bn−1

)∥∥∥∥∥∥∥∥∥∥∥

2

F

=

∥∥∥∥∥∥∥∥∥∥∥

aH

0 b0 aH0 b1 · · · aH

0 bn−1

aH0 b0 aH

0 b1 · · · aH0 bn−1

......

...

aHm−1b0 aH

m−1b1 · · · aHm−1bn−1

∥∥∥∥∥∥∥∥∥∥∥

2

F

= ∑i

∑j|aH

i b j|2

≤ ∑i

∑j‖aH

i ‖22‖b j‖2 (Cauchy-Schwartz)

=

(∑

i‖ai‖2

2

)(∑

j‖b j‖2

)=

(∑

iaH

i ai

)(∑

jbH

j b j

)

(∑

i∑

j|aH

i a j|

)(∑

i∑

j|bH

i b j|

)= ‖A‖2

F‖B‖2F .

so that ‖AB‖2F ≤ ‖A‖2

2‖B‖22. Taking the square-root of both sides established the desired result.

* BACK TO TEXT

340

Page 357: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Homework 2.28 Let ‖ · ‖ be a matrix norm induced by the ‖· · ·‖ vector norm. Show that κ(A) =‖A‖‖A−1‖ ≥ 1.

Answer:‖I‖= ‖AA−1‖ ≤ ‖A‖‖A−1‖.

But‖I‖= max

‖x‖=1‖Ix‖= max

‖x‖‖x‖= 1.

Hence 1≤ ‖A‖‖A−1‖.* BACK TO TEXT

341

Page 358: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 3. Notes on Orthogonality and the SVD (Answers)

Homework 3.4 Let Q ∈ Cm×n (with n ≤ m). Partition Q =(

q0 q1 · · · qn−1

). Show that Q is an

orthonormal matrix if and only if q0,q1, . . . ,qn−1 are mutually orthonormal.

Answer: Answer needs to be filled in.* BACK TO TEXT

Homework 3.6 Let Q ∈ Cm×m. Show that if Q is unitary then Q−1 = QH and QQH = I.

Answer: If Q is unitary, then QHQ = I. If A,B ∈ Cm×m, the matrix B such that BA = I is the inverse ofA. Hence Q−1 = QH . Also, if BA = I then AB = I and hence QQH = I.

* BACK TO TEXT

Homework 3.7 Let Q0,Q1 ∈ Cm×m both be unitary. Show that their product, Q0Q1, is unitary.

Answer: Obviously, Q0Q1 is a square matrix.

(Q0Q1)H(Q0Q1) = QH

1 QH0 Q0︸ ︷︷ ︸I

Q1 = QH1 Q1︸ ︷︷ ︸I

= I.

Hence Q0Q1 is unitary.* BACK TO TEXT

Homework 3.8 Let Q0,Q1, . . . ,Qk−1 ∈ Cm×m all be unitary. Show that their product, Q0Q1 · · ·Qk−1, isunitary.

Answer: Strictly speaking, we should do a proof by induction. But instead we will make the moreinformal argument that

(Q0Q1 · · ·Qk−1)HQ0Q1 · · ·Qk−1 = QH

k−1 · · ·QH1 QH

0 Q0Q1 · · ·Qk−1

= QHk−1 · · · QH

1 QH0 Q0︸ ︷︷ ︸I

Q1

︸ ︷︷ ︸I

· · ·

︸ ︷︷ ︸I

Qk−1

︸ ︷︷ ︸I

= I.

* BACK TO TEXT

342

Page 359: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Homework 3.9 Let U ∈ Cm×m be unitary and x ∈ Cm, then ‖Ux‖2 = ‖x‖2.

Answer: ‖Ux‖22 = (Ux)H(Ux) = xH UHU︸ ︷︷ ︸

I

x = xHx = ‖x‖22. Hence ‖Ux‖2 = ‖x‖2.

* BACK TO TEXT

Homework 3.10 Let U ∈ Cm×m and V ∈ Cn×n be unitary matrices and A ∈ Cm×n. Then

‖UA‖2 = ‖AV‖2 = ‖A‖2.

Answer:

• ‖UA‖2 = max‖x‖2=1 ‖UAx‖2 = max‖x‖2=1 ‖Ax‖2 = ‖A‖2.

• ‖AV‖2 = max‖x‖2=1 ‖AV x‖2 = max‖V x‖2=1 ‖A(V x)‖2 = max‖y‖2=1 ‖Ay‖2 = ‖A‖2.

* BACK TO TEXT

Homework 3.11 Let U ∈ Cm×m and V ∈ Cn×n be unitary matrices and A ∈ Cm×n. Then ‖UA‖F =‖AV‖F = ‖A‖F .

Answer:

• Partition A =(

a0 a1 · · · an−1

). Then it is easy to show that ‖A‖2

F = ∑n−1j=0 ‖a j‖2

2. Thus

‖UA‖2F = ‖U

(a0 a1 · · · an−1

)‖2

F = ‖(

Ua0 Ua1 · · · Uan−1

)‖2

F

=n−1

∑j=0‖Ua j‖2

2 =n−1

∑j=0‖a j‖2

2 = ‖A‖2F .

Hence ‖UA‖F = ‖A‖F .

• Partition A =

aT

0

aT1...

aTm−1

. Then it is easy to show that ‖A‖2F = ∑

m−1i=0 ‖(aT

i )H‖2

2. Thus

‖AV‖2F = ‖

aT

0

aT1...

aTm−1

V‖2F = ‖

aT

0 V

aT1 V...

aTm−1V

‖2F =

m−1

∑i=0‖(aT

i V )H‖22

=m−1

∑i=0‖V H(aT

i )H‖2

2 =m−1

∑i=0‖(aT

i )H‖2

2 = ‖A‖2F .

Hence ‖AV‖F = ‖A‖F .

343

Page 360: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

* BACK TO TEXT

Homework 3.13 Let D = diag(δ0, . . . ,δn−1). Show that ‖D‖2 = maxn−1i=0 |δi|.

Answer:

‖D‖22 = max

‖x‖2=1‖Dx‖2

2 = max‖x‖2=1

∥∥∥∥∥∥∥∥∥∥∥

δ0 0 · · · 0

0 δ1 · · · 0...

... . . . ...

0 0 · · · δn−1

χ0

χ1...

χn−1

∥∥∥∥∥∥∥∥∥∥∥

2

2

= max‖x‖2=1

∥∥∥∥∥∥∥∥∥∥∥

δ0χ0

δ1χ1...

δn−1χn−1

∥∥∥∥∥∥∥∥∥∥∥

2

2

= max‖x‖2=1

[n−1

∑j=0|δ jχ j|2

]= max‖x‖2=1

[n−1

∑j=0

[|δ j|2|χ j|2

]]

≤ max‖x‖2=1

[n−1

∑j=0

[n−1maxi=0|δi|2|χ j|2

]]= max‖x‖2=1

[n−1maxi=0|δi|2

n−1

∑j=0|χ j|2

]=

(n−1maxi=0|δi|)2

max‖x‖2=1

‖x‖22

=

(n−1maxi=0|δi|)2

.

so that ‖D‖2 ≤maxn−1i=0 |δi|.

Also, choose j so that |δ j|= maxn−1i=0 |δi|. Then

‖D‖2 = max‖x‖2=1

‖Dx‖2 ≥ ‖De j‖2 = ‖δ je j‖2 = |δ j|‖e j‖2 = |δ j|=n−1maxi=0|δi|.

so that maxn−1i=0 |δi| ≤ ‖D‖2 ≤maxn−1

i=0 |δi|, which implies that ‖D‖2 = maxn−1i=0 |δi|.

* BACK TO TEXT

Homework 3.14 Let A =

AT

0

. Use the SVD of A to show that ‖A‖2 = ‖AT‖2.

Answer: Let AT =UT ΣTV HT be the SVD of A. Then

A =

AT

0

=

UT ΣTV HT

0

=

UT

0

ΣTV HT ,

which is the SVD of A. As a result, cleary the largest singular value of AT equals the largest singular valueof A and hence ‖A‖2 = ‖AT‖2.

* BACK TO TEXT

344

Page 361: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Homework 3.15 Assume that U ∈ Cm×m and V ∈ Cn×n be unitary matrices. Let A,B ∈ Cm×n with B =UAV H . Show that the singular values of A equal the singular values of B.

Answer: Let A = UAΣAV HA be the SVD of A. Then B = UUAΣAV H

A V H = (UUA)ΣA(VVA)H where both

UUA and VVA are unitary. This gives us the SVD for B and it shows that the singular values of B equal thesingular values of A.

* BACK TO TEXT

Homework 3.16 Let A ∈ Cm×n with A =

σ0 0

0 B

and assume that ‖A‖2 = σ0. Show that ‖B‖2 ≤

‖A‖2. (Hint: Use the SVD of B.)

Answer: Let B =UBΣBV HB be the SVD of B. Then

A =

σ0 0

0 B

=

σ0 0

0 UBΣBV HB

=

1 0

0 UB

σ0 0

0 ΣB

1 0

0 VB

H

which shows the relationship between the SVD of A and B. Since ‖A‖2 = σ0, it must be the case that thediagonal entries of ΣB are less than or equal to σ0 in magnitude, which means that ‖B‖2 ≤ ‖A‖2.

* BACK TO TEXT

Homework 3.17 Prove Lemma 3.12 for m≤ n.

Answer: You can use the following as an outline for your proof:Proof: First, let us observe that if A = 0 (the zero matrix) then the theorem trivially holds: A = UDV H

where U = Im×m, V = In×n, and D =

0

, so that DT L is 0×0. Thus, w.l.o.g. assume that A 6= 0.

We will employ a proof by induction on m.

• Base case: m = 1. In this case A =(

aT0

)where aT

0 ∈ R1×n is its only row. By assumption,

aT0 6= 0. Then

A =(

aT0

)=(

1)(‖aT

0 ‖2)(

v0

)H

where v0 = (aT0 )

H/‖aT0 ‖2. Choose V1 ∈ Cn×(n−1) so that V =

(v0 V1

)is unitary. Then

A =(

aT0

)=(

1)(‖aT

0 ‖2) 0)(

v0 V1

)H=UDV H

where DT L =(

δ0

)=(‖aT

0 ‖2

)and U =

(1)

.

• Inductive step: Similarly modify the inductive step of the proof of the theorem.

345

Page 362: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

• By the Principle of Mathematical Induction the result holds for all matrices A∈Cm×n with m≥ n.

* BACK TO TEXT

Homework 3.35 Show that if A ∈ Cm×m is nonsingular, then

• ‖A‖2 = σ0, the largest singular value;

• ‖A−1‖2 = 1/σm−1, the inverse of the smallest singular value; and

• κ2(A) = σ0/σm−1.

Answer: Answer needs to be filled in* BACK TO TEXT

346

Page 363: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 4. Notes on Gram-Schmidt QR Factorization (Answers)

Homework 4.1 • What happens in the Gram-Schmidt algorithm if the columns of A are NOT linearlyindependent?

Answer: If a j is the first column such that a0, . . . ,a j are linearly dependent, then a⊥j will equalthe zero vector and the process breaks down.

• How might one fix this?

Answer: When a vector with a⊥j is encountered, the columns can be rearranged so that that column(or those columns) come last.

• How can the Gram-Schmidt algorithm be used to identify which columns of A are linearly indepen-dent?

Answer: Again, if a⊥j = 0 for some j, then the columns are linearly dependent.* BACK TO TEXT

Homework 4.2 Homework 4.3 Convince yourself that the relation between the vectors a j and q jin the algorithms in Figure 4.2 is given by

(a0 a1 · · · an−1

)=(

q0 q1 · · · qn−1

)

ρ0,0 ρ0,1 · · · ρ0,n−1

0 ρ1,1 · · · ρ1,n−1...

... . . . ...

0 0 · · · ρn−1,n−1

,

where

qHi q j =

1 for i = j

0 otherwiseand ρi, j =

qH

i a j for i < j

‖a j−∑j−1i=0 ρi, jqi‖2 for i = j

0 otherwise.

Answer: Just watch the video for this lecture!* BACK TO TEXT

Homework 4.5 Homework 4.6 Let A have linearly independent columns and let A = QR be a QR fac-torization of A. Partition

A→(

AL AR

), Q→

(QL QR

), and R→

RT L RT R

0 RBR

,

where AL and QL have k columns and RT L is k× k. Show that

347

Page 364: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

1. AL = QLRT L: QLRT L equals the QR factorization of AL,

2. C (AL) = C (QL): the first k columns of Q form an orthonormal basis for the space spanned by thefirst k columns of A.

3. RT R = QHL AR,

4. (AR−QLRT R)HQL = 0,

5. AR−QLRT R = QRRBR, and

6. C (AR−QLRT R) = C (QR).

Answer: Consider the fact that A = QR. Then

(AL AR

)=(

QL QR

) RT L RT R

0 RBR

=(

QLRT L QLRT R +QRRBR

).

HenceAL = QLRT L and AR = QLRT R +QRRBR.

• The left equation answers 1.

• Rearranging the right equation yields AR−QLRT R = QRRBR., which answers 5.

• C (AL) = C (QL) can be shown by showing that C (AL)⊂ C (QL) and C (QL)⊂ C (AL):

C (AL)⊂ C (QL): Let y ∈ C (AL). Then there exists x such that ALx = y. But then QLRT Lx = y andhence QL(RT Lx) = y which means that y ∈ C (QL).

C (QL)⊂ C (AL): Let y ∈ C (QL). Then there exists x such that QLx = y. But then ALR−1T Lx = y and

hence AL(R−1T Lx) = y which means that y ∈ C (AL). (Notice that RT L is nonsingular because it

is a triangular matrix that has only nonzeroes on its diagonal.)

This answer 2.

• Take AR−QLRT R = QRRBR. and multiply both side by QHL :

QHL (AR−QLRT R) = QH

L QRRBR

is equivalent toQH

L AR− QHL QL︸ ︷︷ ︸I

RT R = QHL QR︸ ︷︷ ︸0

RBR = 0.

Rearranging yields 3.

• Since AR−QLRT R = QRRBR we find that (AR−QLRT R)HQL = (QRRBR)

HQL and

(AR−QLRT R)HQL = RH

BRQHR QL = 0.

• The proof of 6. follows similar to the proof of 2.

* BACK TO TEXT

348

Page 365: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 6. Notes on Householder QR Factorization (Answers)

Homework 6.2 Show that if H is a reflector, then

• HH = I (reflecting the reflection of a vector results in the original vector),

• H = HH , and

• HHH = I (a reflection is a unitary matrix and thus preserves the norm).

Answer:* BACK TO TEXT

Homework 6.4 Show that if x ∈ Rn, v = x∓‖x‖2e0, and τ = vT v/2 then (I− 1τvvT )x =±‖x‖2e0.

* BACK TO TEXT

Homework 6.5 Verify thatI− 1τ

1

u2

1

u2

H χ1

x2

=

ρ

0

where τ = uHu/2 = (1+uH

2 u2)/2 and ρ =©±‖x‖2.

Hint: ρρ = |ρ|2 = ‖x‖22 since H preserves the norm. Also, ‖x‖2

2 = |χ1|2 +‖x2‖22 and

√zz =

z|z| .

Answer:* BACK TO TEXT

Homework 6.6 Show thatI 0

0

I− 1τ1

1

u2

1

u2

H=

I− 1τ1

0

1

u2

0

1

u2

H .

Answer:* BACK TO TEXT

349

Page 366: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Homework 6.11 If m = n then Q could be accumulated by the sequence

Q = (· · ·((IH0)H1) · · ·Hn−1).

Give a high-level reason why this would be (much) more expensive than the algorithm in Figure 6.6.

Answer:* BACK TO TEXT

Homework 6.12 Assuming all inverses exist, show that T00 t01

0 τ1

−1

=

T−100 −T−1

00 t01/τ1

0 1/τ1

.

Answer: T00 t01

0 τ1

T−100 −T−1

00 t01/τ1

0 1/τ1

=

T00T−100 −T00T−1

00 t01/τ1 + τ1/τ1

0 τ1/τ1

=

I 0

0 1

.

* BACK TO TEXT

Homework 6.13 Homework 6.14 Consider u1 ∈Cm with u1 6= 0 (the zero vector), U0 ∈Cm×k, and non-singular T00 ∈ Ck×k. Define τ1 = (uH

1 u1)/2, so that

H1 = I− 1τ1

u1uH1

equals a Householder transformation, and let

Q0 = I−U0T−100 UH

0 .

Show that

Q0H1 = (I−U0T−100 UH

0 )(I− 1τ1

u1uH1 ) = I−

(U0 u1

) T00 t01

0 τ1

−1(U0 u1

)H,

where t01 = QH0 u1.

Answer:

Q0H1 = (I−U0T−100 UH

0 )(I− 1τ1

u1uH1 )

= I−U0T−100 UH

0 −1τ1

u1uH1 +U0T−1

00 UH0

1τ1

u1uH1

= I−U0T−100 UH

0 −1τ1

u1uH1 +

1τ1

U0T−100 t01uH

1

350

Page 367: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Also

I−(

U0 u1

) T00 t01

0 τ1

−1(U0 u1

)H

= I−(

U0 u1

) T−100 −T−1

00 t01/τ1

0 1/τ1

( U0 u1

)H

= I−(

U0 u1

) T−100 UH

0 −T−100 t01uH

1 /τ1

1/τ1uH1

= I−U0(T−1

00 UH0 −T−1

00 t01uH1 /τ1)−1/τ1u1uH

1

= I−U0T−100 UH

0 −1τ1

u1uH1 +

1τ1

U0T−100 t01uH

1

* BACK TO TEXT

Homework 6.14 Homework 6.15 Consider ui ∈Cm with ui 6= 0 (the zero vector). Define τi = (uHi ui)/2,

so thatHi = I− 1

τiuiuH

i

equals a Householder transformation, and let

U =(

u0 u1 · · · uk−1

).

Show thatH0H1 · · ·Hk−1 = I−UT−1UH ,

where T is an upper triangular matrix.

Answer: The results follows from Homework 6.14 via a proof by induction.* BACK TO TEXT

351

Page 368: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 7. Notes on Solving Linear Least-squares Problems (An-swers)

Homework 7.2 Let A ∈ Cm×n with m < n have linearly independent rows. Show that there exist a lowertriangular matrix LL ∈Cm×m and a matrix QT ∈Cm×n with orthonormal rows such that A = LLQT , notingthat LL does not have any zeroes on the diagonal. Letting L =

(LL 0

)be Cm×n and unitary Q = QT

QB

, reason that A = LQ.

Don’t overthink the problem: use results you have seen before.

Answer: We know that AH ∈ Cn×m with linearly independent columns (and n ≤ m). Hence there exists

a unitary matrix Q = FlaOneByTwoQLQR and R =

RT

0

with upper triangular matrix RT that has no

zeroes on its diagonal such that AH = QLRT . It is easy to see that the desired QT equals QT = QHL and the

desired LL equals RHT .

* BACK TO TEXT

Homework 7.3 Let A ∈ Cm×n with m < n have linearly independent rows. Consider

‖Ax− y‖2 = minz‖Az− y‖2.

Use the fact that A = LLQT , where LL ∈ Cm×m is lower triangular and QT has orthonormal rows, to arguethat any vector of the form QH

T L−1L y+QH

B wB (where wB is any vector in Cn−m) is a solution to the LLS

problem. Here Q =

QT

QB

.

Answer:

minz‖Az− y‖2 = min

z‖LQz− y‖2

= minw

z = QH w

‖Lw− y‖2

= minw

z = QH w

∥∥∥∥∥∥(

LL 0) wT

wB

− y

∥∥∥∥∥∥2

= minw

z = QH w

‖LLwT − y‖2 .

Hence wT = L−1L y minimizes. But then

z = QHw =(

QHT QH

B

) wT

wB

= QHT wT +QBwB.

352

Page 369: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

describes all solutions to the LLS problem.* BACK TO TEXT

Homework 7.4 Continuing Exercise 7.2, use Figure 7.1 to give a Classical Gram-Schmidt inspired al-gorithm for computing LL and QT . (The best way to check you got the algorithm right is to implementit!)

Answer:* BACK TO TEXT

Homework 7.5 Continuing Exercise 7.2, use Figure 7.2 to give a Householder QR factorization inspiredalgorithm for computing L and Q, leaving L in the lower triangular part of A and Q stored as Householdervectors above the diagonal of A. (The best way to check you got the algorithm right is to implement it!)

Answer:* BACK TO TEXT

353

Page 370: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 8. Notes on the Condition of a Problem (Answers)

Homework 8.1 Show that, if A is a nonsingular matrix, for a consistent matrix norm, κ(A)≥ 1.

Answer: Hmmm, I need to go and check what the exact definition of a consistent matrix norm ishere... What I mean is a matrix norm induced by a vector norm. The reason is that then ‖I‖= 1.

Let ‖ · ‖ be the norm that is used to define κ(A) = ‖AA−1‖. Then

1 = ‖I‖= ‖AA−1‖ ≤ ‖A‖‖A−1‖= κ(A).

* BACK TO TEXT

Homework 8.2 If A has linearly independent columns, show that ‖(AHA)−1AH‖2 = 1/σn−1, where σn−1equals the smallest singular value of A. Hint: Use the SVD of A.

Answer: Let A =UΣV H be the reduced SVD of A. Then

‖(AHA)−1AH‖2 = ‖((UΣV H)HUΣV H)−1(UΣV H)H‖2

= ‖(V ΣUHUΣV H)−1V ΣUH‖2

= ‖(V Σ−1

Σ−1V H)V ΣUH‖2

= ‖V Σ−1UH‖2

= ‖Σ−1‖2

= 1/σn−1

(since the two norm of a diagonal matrix equals its largest diagonal element in absolute value).* BACK TO TEXT

Homework 8.3 Let A have linearly independent columns. Show that κ2(AHA) = κ2(A)2.

Answer: Let A =UΣV H be the reduced SVD of A. Then

κ2(AHA) = ‖AHA‖2‖(AHA)−1‖2

= ‖(UΣV H)HUΣV H‖2‖((UΣV H)HUΣV H)−1‖2

= ‖V Σ2V H‖2‖V (Σ−1)2V H‖2

= ‖Σ2‖2‖(Σ−1)2‖2

=σ2

0

σ2n−1

=

(σ0

σn−1

)2

= κ2(A)2.

* BACK TO TEXT

Homework 8.4 Let A ∈ Cn×n have linearly independent columns.

354

Page 371: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

• Show that Ax = y if and only if AHAx = AHy.

Answer: Since A has linearly independent columns and is square, we know that A−1 and A−H exist.If Ax = y, then multiplying both sides by AH yields AHAx = AHy. If AHAx = AHy then multiplyingboth sides by A−H yields Ax = y.

• Reason that using the method of normal equations to solve Ax = y has a condition number of κ2(A)2.

* BACK TO TEXT

Homework 8.5 Let U ∈ Cn×n be unitary. Show that κ2(U) = 1.

Answer: The SVD of U is given by U = UΣV H where Σ = I and V = I. Hence σ0 = σn−1 = 1 andκ2(U) = σ0/σn−1 = 1.

* BACK TO TEXT

Homework 8.6 Characterize the set of all square matrices A with κ2(A) = 1.

Answer: If κ2(A) = 1 then σ0/σn−1 = 1 which means that σ0 = σn−1 and hence σ0 = σ1 = · · ·= σn−1.Hence the singular value decomposition of A must equal A =UΣV H = σ0UV H . But UV H is unitary sinceU and V H are, and hence we conclude that A must be a nonzero multiple of a unitary matrix.

* BACK TO TEXT

355

Page 372: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 9. Notes on the Stability of an Algorithm (Answers)

Homework 9.3 Assume a floating point number system with β = 2 and a mantissa with t digits so that atypical positive number is written as .d0d1 . . .dt−1×2e, with di ∈ 0,1.

• Write the number 1 as a floating point number.

Answer: . 10 · · ·0︸ ︷︷ ︸t

digits

×21.

• What is the largest positive real number u (represented as a binary fraction) such that the float-ing point representation of 1+ u equals the floating point representation of 1? (Assume roundedarithmetic.)

Answer: . 10 · · ·0︸ ︷︷ ︸t

digits

×21 + . 00 · · ·0︸ ︷︷ ︸t

digits

1×21 = . 10 · · ·0︸ ︷︷ ︸t

digits

1×21 which rounds to . 10 · · ·0︸ ︷︷ ︸t

digits

×21 = 1.

It is not hard to see that any larger number rounds to . 10 · · ·01︸ ︷︷ ︸t

digits

×21 > 1.

• Show that u = 1221−t .

Answer: . 0 · · ·00︸ ︷︷ ︸t

digits

1×21 = .1×21−t = 12 ×21−t .

* BACK TO TEXT

Homework 9.10 Prove Lemma 9.9.

Answer: Let C = AB. Then the (i, j) entry in |C| is given by

|γi, j|=

∣∣∣∣∣ k

∑p=0

αi,pβp, j

∣∣∣∣∣≤ k

∑p=0|αi,pβp, j|=

k

∑p=0|αi,p||βp, j|

which equals the (i, j) entry of |A||B|. Thus |AB| ≤ |A||B|.* BACK TO TEXT

356

Page 373: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Homework 9.12 Prove Theorem 9.11.

Answer:Show that if |A| ≤ |B| then ‖A‖1 ≤ ‖B‖1:

LetA =

(a0 · · · an−1

)and B =

(b0 · · · bn−1

).

Then

‖A‖1 = max0≤ j<n

‖a j‖1 = max0≤ j<n

(m−1

∑i=0|αi, j|

)

=

(m−1

∑i=0|αi,k|

)where k is the index that maximizes

(m−1

∑i=0|βi,k|

)since |A| ≤ |B|

≤ max0≤ j<n

(m−1

∑i=0|βi, j|

)= max

0≤ j<n‖b j‖1 = ‖B‖1.

Show that if |A| ≤ |B| then ‖A‖∞ ≤ ‖B‖∞:Note: ‖A‖∞ = ‖AT‖1 and ‖B‖∞ = ‖BT‖1. Also, if |A| ≤ |B| then, clearly, |AT | ≤ |BT |. Hence ‖A‖∞ =

‖AT‖1 ≤ ‖BT‖1 = ‖B‖∞.Show that if |A| ≤ |B| then ‖A‖F ≤ ‖B‖F :

‖A‖2F =

m−1

∑i=0

n−1

∑j=0

α2i, j ≤

m−1

∑i=0

n−1

∑j=0

β2i, j = ‖B‖2

F .

Hence ‖A‖F ≤ ‖B‖F .* BACK TO TEXT

Homework 9.13 Repeat the above steps for the computation

κ := ((χ0ψ0 +χ1ψ1)+χ2ψ2),

computing in the indicated order.

Answer:* BACK TO TEXT

Homework 9.15 Complete the proof of Lemma 9.14.

Answer: We merely need to fill in the details for Case 1 in the proof:

357

Page 374: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Case 1: ∏ni=0(1+ εi)

±1 = (∏n−1i=0 (1+ εi)

±1)(1+ εn). By the I.H. there exists a θn such that (1+ θn) =

∏n−1i=0 (1+ εi)

±1 and |θn| ≤ nu/(1−nu). Then(n−1

∏i=0

(1+ εi)±1

)(1+ εn) = (1+θn)(1+ εn) = 1+ θn + εn +θnεn︸ ︷︷ ︸

θn+1

,

which tells us how to pick θn+1. Now

|θn+1| = |θn + εn +θnεn| ≤ |θn|+ |εn|+ |θn||εn| ≤nu

1−nu+u+

nu1−nu

u

=nu+u(1−nu)+nu2

1−nu=

(n+1)+u1−nu

≤ (n+1)+u1− (n+1)u

.

* BACK TO TEXT

Homework 9.18 Prove Lemma 9.17.

Answer:γn =

nu1−nu

≤ (n+b)u1−nu

≤ (n+b)u1− (n+b)u

= γn+b.

γn + γb + γnγb =nu

1−nu+

bu1−bu

+nu

(1−nu)bu

(1−bu)

=nu(1−bu)+(1−nu)bu+bnu2

(1−nu)(1−bu)

=nu−bnu2 +bu−bnu2 +bnu2

1− (n+b)u+bnu2 =(n+b)u−bnu2

1− (n+b)u+bnu2

≤ (n+b)u1− (n+b)u+bnu2 ≤

(n+b)u1− (n+b)u

= γn+b.

* BACK TO TEXT

Homework 9.19 Let k ≥ 0 and assume that |ε1|, |ε2| ≤ u, with ε1 = 0 if k = 0. Show that I +Σ(k) 0

0 (1+ ε1)

(1+ ε2) = (I +Σ(k+1)).

Hint: reason the case where k = 0 separately from the case where k > 0.

Answer:Case: k = 0. Then I +Σ(k) 0

0 (1+ ε1)

(1+ ε2) = (1+0)(1+ ε2) = (1+ ε2) = (1+θ1) = (I +Σ(1)).

358

Page 375: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Case: k = 1. Then I +Σ(k) 0

0 (1+ ε1)

(1+ ε2) =

1+θ1 0

0 (1+ ε1)

(1+ ε2)

=

(1+θ1)(1+ ε2) 0

0 (1+ ε1)(1+ ε2)

=

(1+θ2) 0

0 (1+θ2)

= (I +Σ(2)).

Case: k > 1. Notice that

(I +Σ(k))(1+ ε2) = diag(()(1+θk),(1+θk),(1+θk−1), · · · ,(1+θ2))(1+ ε2)

= diag(()(1+θk+1),(1+θk+1),(1+θk), · · · ,(1+θ3))

Then I +Σ(k) 0

0 (1+ ε1)

(1+ ε2) =

(I +Σ(k))(1+ ε2) 0

0 (1+ ε1)(1+ ε2)

=

(I +Σ(k))(1+ ε2) 0

0 (1+θ2)

= (I +Σ(k+1)).

* BACK TO TEXT

Homework 9.23 Prove R1-B.

Answer: From Theorem 9.20 we know that

κ = xT (I +Σ(n))y = (x+ Σ(n)x

δx

)T y.

Then

|δx| = |Σ(n)x|=

∣∣∣∣∣∣∣∣∣∣∣∣∣∣

θnχ0

θnχ1

θn−1χ2...

θ2χn−1

∣∣∣∣∣∣∣∣∣∣∣∣∣∣=

|θnχ0||θnχ1||θn−1χ2|

...

|θ2χn−1|

=

|θn||χ0||θn||χ1||θn−1||χ2|

...

|θ2||χn−1|

|θn||χ0||θn||χ1||θn||χ2|

...

|θn||χn−1|

= |θn|

|χ0||χ1||χ2|

...

|χn−1|

≤ γn|x|.

359

Page 376: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

(Note: strictly speaking, one should probably treat the case n = 1 separately.)!.* BACK TO TEXT

Homework 9.25 In the above theorem, could one instead prove the result

y = A(x+δx),

where δx is “small”?

Answer: The answer is “no”. The reason is that for each individual element of y

ψi = aTi (x+δx)

which would appear to support thatψ0

ψ1...

ψm−1

=

aT

0 (x+δx)

aT1 (x+δx)

...

aTm−1(x+δx)

.

However, the δx for each entry ψi is different, meaning that we cannot factor out x + δx to find thaty = A(x+δx).

* BACK TO TEXT

Homework 9.27 In the above theorem, could one instead prove the result

C = (A+∆A)(B+∆B),

where ∆A and ∆B are “small”?

Answer: The answer is “no” for reasons similar to why the answer is “no” for Exercise 9.25.* BACK TO TEXT

360

Page 377: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 10. Notes on Performance (Answers)

No exercises to answer yet.

361

Page 378: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 11. Notes on Gaussian Elimination and LU Factorization (An-swers)

Homework 11.6 Show that Ik 0 0

0 1 0

0 −l21 I

−1

=

Ik 0 0

0 1 0

0 l21 I

Answer:

Ik 0 0

0 1 0

0 −l21 I

Ik 0 0

0 1 0

0 l21 I

=

Ik 0 0

0 1 0

0 −l21 + l21 I

=

Ik 0 0

0 1 0

0 0 I

* BACK TO TEXT

Homework 11.7 Let Lk = L0L1 . . .Lk. Assume that Lk has the form Lk−1 =

L00 0 0

lT10 1 0

L20 0 I

, where L00

is k× k. Show that Lk is given by Lk =

L00 0 0

lT10 1 0

L20 l21 I

.. (Recall: Lk =

I 0 0

0 1 0

0 −l21 I

.)

Answer:

Lk = Lk−1Lk = Lk−1L−1k

=

L00 0 0

lT10 1 0

L20 0 I

.

I 0 0

0 1 0

0 −l21 I

−1

=

L00 0 0

lT10 1 0

L20 0 I

.

I 0 0

0 1 0

0 l21 I

=

L00 0 0

lT10 1 0

L20 l21 I

.

* BACK TO TEXT

362

Page 379: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

function [ A_out ] = LU_unb_var5( A )

[ ATL, ATR, ...ABL, ABR ] = FLA_Part_2x2( A, ...

0, 0, ’FLA_TL’ );

while ( size( ATL, 1 ) < size( A, 1 ) )

[ A00, a01, A02, ...a10t, alpha11, a12t, ...A20, a21, A22 ] = FLA_Repart_2x2_to_3x3( ATL, ATR, ...

ABL, ABR, ...1, 1, ’FLA_BR’ );

%------------------------------------------------------------%

a21 = a21/ alpha11;A22 = A22 - a21 * a12t;

%------------------------------------------------------------%

[ ATL, ATR, ...ABL, ABR ] = FLA_Cont_with_3x3_to_2x2( A00, a01, A02, ...

a10t, alpha11, a12t, ...A20, a21, A22, ...’FLA_TL’ );

end

A_out = [ ATL, ATRABL, ABR ];

return

Figure 11.3: Answer to Exercise 11.12.

Homework 11.12 Implement LU factorization with partial pivoting with the FLAME@lab API, in M-script.

Answer: See Figure 11.3.* BACK TO TEXT

Homework 11.13 Derive an algorithm for solving Ux = y, overwriting y with the solution, that casts mostcomputation in terms of DOT products. Hint: Partition

U →

υ11 uT12

0 U22

.

Call this Variant 1 and use Figure 11.6 to state the algorithm.

363

Page 380: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Algorithm: Solve Uz = y, overwriting y (Variant 1)

Partition U →

UT L UT R

UBL UBR

, y→

yT

yB

where UBR is 0×0, yB has 0 rows

while m(UBR)< m(U) do

Repartition

UT L UT R

UBL UBR

U00 u01 U02

uT10 υ11 uT

12

U20 u21 U22

,

yT

yB

y0

ψ1

y2

ψ1 := ψ1−uT

12x2

ψ1 := ψ1/υ11

Continue withUT L UT R

UBL UBR

U00 u01 U02

uT10 υ11 uT

12

U20 u21 U22

,

yT

yB

y0

ψ1

y2

endwhile

Algorithm: Solve Uz = y, overwriting y (Variant 2)

Partition U →

UT L UT R

UBL UBR

, y→

yT

yB

where UBR is 0×0, yB has 0 rows

while m(UBR)< m(U) do

Repartition

UT L UT R

UBL UBR

U00 u01 U02

uT10 υ11 uT

12

U20 u21 U22

,

yT

yB

y0

ψ1

y2

ψ1 := χ1/υ11

y0 := y0−χ1u01

Continue withUT L UT R

UBL UBR

U00 u01 U02

uT10 υ11 uT

12

U20 u21 U22

,

yT

yB

y0

ψ1

y2

endwhile

Figure 6 (Answer): Algorithms for the solution of upper triangular system Ux = y that overwrite y with x.

364

Page 381: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Answer:Partition υ11 uT

12

0 U22

χ1

x2

=

ψ1

y2

.

Multiplying this out yields υ11χ1 +uT12x2

U22x2

=

ψ1

y2

.

So, if we assume that x2 has already been computed and has overwritten y2, then χ1 can be computed as

χ1 = (ψ1−uT12x2)/υ11

which can then overwrite ψ1. The resulting algorithm is given in Figure 11.6 (Answer) (left).* BACK TO TEXT

Homework 11.14 Derive an algorithm for solving Ux = y, overwriting y with the solution, that casts mostcomputation in terms of AXPY operations. Call this Variant 2 and use Figure 11.6 to state the algorithm.

Answer: Partition U00 u01

0 υ11

x0

χ1

=

y0

ψ1

.

Multiplying this out yields U00x0 +u01χ1

υ11χ1

=

y0

ψ1

.

So, χ1 = ψ1/υ11 after which x0 can be computed by solving U00x0 = y0−χ1u01. The resulting algorithmis given in Figure 11.6 (Answer) (right).

* BACK TO TEXT

Homework 11.15 If A is an n×n matrix, show that the cost of Variant 1 is approximately 23n3 flops.

Answer: During the kth iteration, L00 is k× k, for k = 0, . . . ,n− 1. Then the (approximate) cost of thesteps are given by

• Solve L00u01 = a01 for u01, overwriting a01 with the result. Cost: k2 flops.

• Solve lT10U00 = aT

10 (or, equivalently, UT00(l

T10)

T = (aT10)

T for lT10), overwriting aT

10 with the result.Cost: k2 flops.

• Compute υ11 = α11− lT10u01, overwriting α11 with the result. Cost: 2k flops.

Thus, the total cost is given by

n−1

∑k=0

(k2 + k2 +2k

)≈ 2

n−1

∑k=0

k2 ≈ 213

n3 =23

n3.

* BACK TO TEXT

365

Page 382: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Homework 11.16 Derive the up-looking variant for computing the LU factorization.

Answer: Consider the loop invariant: AT L AT R

ABL ABR

=

L\UT L UT R

ABL ABR

LT LUT L = AT L LT LUT R = AT R

((((((((LBLUT L = ABL (((((((((((((

LBLUT R +LBRUBR = ABR

At the top of the loop, after repartitioning, A contains

L\U00 u01 U02

aT10 α11 aT

12

A20 a21 A22

while at the bottom it must containL\U00 u01 A02

lT10 υ11 uT

12

A20 a21 A22

where the entries in blue are to be computed. Now, considering LU = A we notice that

L00U00 = A00 L00u01 = a01 L00U02 = A02

lT10U00 = aT

10 lT10u01 +υ11 = α11 lT

10U02 +uT12 = aT

12

L20U00 = A20 L20u01 +υ11l21 = a21 L20U02 + l21uT12 +L22U22 = A22

The equalities in yellow can be used to compute the desired parts of L and U :

• Solve lT10U00 = aT

10 for lT10, overwriting aT

10 with the result.

• Compute υ11 = α11− lT10u01, overwriting α11 with the result.

• Compute uT12 := aT

12− lT10U02, overwriting aT

12 with the result.

* BACK TO TEXT

Homework 11.17 Implement all five LU factorization algorithms with the FLAME@lab API, in M-script.

* BACK TO TEXT

366

Page 383: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Homework 11.18 Which of the five variants can be modified to incorporate partial pivoting?

Answer: Variants 2, 4, and 5.

* BACK TO TEXT

Homework 11.23 Apply LU with partial pivoting to

A =

1 0 0 · · · 0 1

−1 1 0 · · · 0 1

−1 −1 1 · · · 0 1...

...... . . . ...

...

−1 −1 · · · 1 1

−1 −1 · · · −1 1

.

Pivot only when necessary.

Answer: Notice that no pivoting is necessary: Eliminating the entries below the diagonal in the firstcolumn yields:

1 0 0 · · · 0 1

0 1 0 · · · 0 2

0 −1 1 · · · 0 2...

...... . . . ...

...

0 −1 · · · 1 2

0 −1 · · · −1 2

.

Eliminating the entries below the diagonal in the second column yields:

1 0 0 · · · 0 1

0 1 0 · · · 0 2

0 0 1 · · · 0 4...

...... . . . ...

...

0 0 · · · 1 4

0 0 · · · −1 4

.

367

Page 384: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Eliminating the entries below the diagonal in the (n−1)st column yields:

1 0 0 · · · 0 1

0 1 0 · · · 0 2

0 0 1 · · · 0 4...

...... . . . ...

...

0 0 · · · 1 2n−2

0 0 · · · 0 2n−1

.

* BACK TO TEXT

368

Page 385: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

Chapter 12. Notes on Cholesky Factorization (Answers)

Homework 12.3 Let B ∈ Cm×n have linearly independent columns. Prove that A = BHB is HPD.

Answer: Let x ∈ Cm be a nonzero vector. Then xHBHBx = (Bx)H(Bx). Since B has linearly independentcolumns we know that Bx 6= 0. Hence (Bx)HBx > 0.

* BACK TO TEXT

Homework 12.4 Let A ∈ Cm×m be HPD. Show that its diagonal elements are real and positive.

Answer: Let e j be the jth unit basis vectors. Then 0 < eHj Ae j = α j, j.

* BACK TO TEXT

Homework 12.14 Implement the Cholesky factorization with M-script.* BACK TO TEXT

Homework 12.15 Consider B ∈ Cm×n with linearly independent columns. Recall that B has a QR fac-torization, B = QR where Q has orthonormal columns and R is an upper triangular matrix with positivediagonal elements. How are the Cholesky factorization of BHB and the QR factorization of B related?

Answer:BHB = (QR)HQR = RH QHQ︸ ︷︷ ︸

I

R = RH︸︷︷︸L

R︸︷︷︸LH

.

* BACK TO TEXT

Homework 12.16 Let A be SPD and partition

A→

A00 a10

aT10 α11

(Hint: For this exercise, use techniques similar to those in Section 12.4.)

1. Show that A00 is SPD.

Answer: Assume that A is n×n so that A00 is (n−1)× (n−1). Let x0 ∈Rn−1 be a nonzero vector.Then

xT0 A00x0 =

x0

0

T A00 a10

aT10 α11

x0

0

> 0

since

x0

0

T

is a nonzero vector and A is SPD.

369

Page 386: Linear Algebra: Foundations to Frontiers A Collection of ... · PDF fileLinear Algebra: Foundations to Frontiers A Collection of Notes on Numerical Linear Algebra Robert A. van de

2. Assuming that A00 = L00LT00, where L00 is lower triangular and nonsingular, argue that the assign-

ment lT10 := aT

10L−T00 is well-defined.

Answer: The compution is well-defined because L−100 exists and hence lT

10 := aT10L−T

00 uniquelycomputes lT

10.

3. Assuming that $A_{00}$ is SPD, $A_{00} = L_{00} L_{00}^T$ where $L_{00}$ is lower triangular and nonsingular, and $l_{10}^T = a_{10}^T L_{00}^{-T}$, show that $\alpha_{11} - l_{10}^T l_{10} > 0$ so that $\lambda_{11} := \sqrt{\alpha_{11} - l_{10}^T l_{10}}$ is well-defined.

Answer: We want to show that $\alpha_{11} - l_{10}^T l_{10} > 0$. To do so, we are going to construct a nonzero vector $x$ so that $x^T A x = \alpha_{11} - l_{10}^T l_{10}$, at which point we can invoke the fact that $A$ is SPD.

Rather than just giving the answer, we go through the steps that give insight into the thought process leading up to it. Consider
$$
\begin{pmatrix} x_0 \\ \chi_1 \end{pmatrix}^T
\begin{pmatrix} A_{00} & a_{10} \\ a_{10}^T & \alpha_{11} \end{pmatrix}
\begin{pmatrix} x_0 \\ \chi_1 \end{pmatrix}
=
\begin{pmatrix} x_0 \\ \chi_1 \end{pmatrix}^T
\begin{pmatrix} A_{00} x_0 + \chi_1 a_{10} \\ a_{10}^T x_0 + \alpha_{11} \chi_1 \end{pmatrix}
= x_0^T A_{00} x_0 + \chi_1 x_0^T a_{10} + \chi_1 a_{10}^T x_0 + \alpha_{11} \chi_1^2 .
$$
Since we are trying to get to $\alpha_{11} - l_{10}^T l_{10}$, perhaps we should pick $\chi_1 = 1$. Then
$$
\begin{pmatrix} x_0 \\ 1 \end{pmatrix}^T
\begin{pmatrix} A_{00} & a_{10} \\ a_{10}^T & \alpha_{11} \end{pmatrix}
\begin{pmatrix} x_0 \\ 1 \end{pmatrix}
= x_0^T A_{00} x_0 + x_0^T a_{10} + a_{10}^T x_0 + \alpha_{11}.
$$
The question now becomes how to pick $x_0$ so that
$$x_0^T A_{00} x_0 + x_0^T a_{10} + a_{10}^T x_0 = - l_{10}^T l_{10}.$$
Let's try $x_0 = - L_{00}^{-T} l_{10}$ and recall that $l_{10}^T = a_{10}^T L_{00}^{-T}$ so that $L_{00} l_{10} = a_{10}$. Then
$$
\begin{aligned}
x_0^T A_{00} x_0 + x_0^T a_{10} + a_{10}^T x_0
&= x_0^T L_{00} L_{00}^T x_0 + x_0^T a_{10} + a_{10}^T x_0 \\
&= (-L_{00}^{-T} l_{10})^T L_{00} L_{00}^T (-L_{00}^{-T} l_{10}) + (-L_{00}^{-T} l_{10})^T (L_{00} l_{10}) + (L_{00} l_{10})^T (-L_{00}^{-T} l_{10}) \\
&= l_{10}^T L_{00}^{-1} L_{00} L_{00}^T L_{00}^{-T} l_{10} - l_{10}^T L_{00}^{-1} L_{00} l_{10} - l_{10}^T L_{00}^T L_{00}^{-T} l_{10} \\
&= l_{10}^T l_{10} - l_{10}^T l_{10} - l_{10}^T l_{10} = - l_{10}^T l_{10}.
\end{aligned}
$$
We now put all these insights together:
$$
0 <
\begin{pmatrix} -L_{00}^{-T} l_{10} \\ 1 \end{pmatrix}^T
\begin{pmatrix} A_{00} & a_{10} \\ a_{10}^T & \alpha_{11} \end{pmatrix}
\begin{pmatrix} -L_{00}^{-T} l_{10} \\ 1 \end{pmatrix}
= \alpha_{11} - l_{10}^T l_{10}.
$$

4. Show that
$$\begin{pmatrix} A_{00} & a_{10} \\ a_{10}^T & \alpha_{11} \end{pmatrix} = \begin{pmatrix} L_{00} & 0 \\ l_{10}^T & \lambda_{11} \end{pmatrix} \begin{pmatrix} L_{00} & 0 \\ l_{10}^T & \lambda_{11} \end{pmatrix}^T .$$

Answer: This is just a matter of multiplying it all out:
$$\begin{pmatrix} L_{00} & 0 \\ l_{10}^T & \lambda_{11} \end{pmatrix} \begin{pmatrix} L_{00}^T & l_{10} \\ 0 & \lambda_{11} \end{pmatrix} = \begin{pmatrix} L_{00} L_{00}^T & L_{00} l_{10} \\ l_{10}^T L_{00}^T & l_{10}^T l_{10} + \lambda_{11}^2 \end{pmatrix} = \begin{pmatrix} A_{00} & a_{10} \\ a_{10}^T & \alpha_{11} \end{pmatrix},$$
since $A_{00} = L_{00} L_{00}^T$, $L_{00} l_{10} = a_{10}$, and $l_{10}^T l_{10} + \lambda_{11}^2 = l_{10}^T l_{10} + ( \alpha_{11} - l_{10}^T l_{10} ) = \alpha_{11}$.

* BACK TO TEXT

Homework 12.17 Use the results in the last exercise to give an alternative proof by induction of the Cholesky Factorization Theorem and to give an algorithm for computing the factorization by filling in Figure 12.3. This algorithm is often referred to as the bordered Cholesky factorization algorithm.

Answer: Proof by induction.

Base case: $n = 1$. Clearly the result is true for a $1 \times 1$ matrix $A = \alpha_{11}$: in this case, the fact that $A$ is SPD means that $\alpha_{11}$ is real and positive, and a Cholesky factor is then given by $\lambda_{11} = \sqrt{\alpha_{11}}$, with uniqueness if we insist that $\lambda_{11}$ is positive.

Inductive step: Assume the result is true for SPD matrices of size $(n-1) \times (n-1)$. We will show that it holds for $A \in \mathbb{C}^{n \times n}$. Let $A \in \mathbb{C}^{n \times n}$ be SPD. Partition $A$ and $L$ like
$$A = \begin{pmatrix} A_{00} & \star \\ a_{10}^T & \alpha_{11} \end{pmatrix} \quad \text{and} \quad L = \begin{pmatrix} L_{00} & 0 \\ l_{10}^T & \lambda_{11} \end{pmatrix}.$$
Let $A_{00} = L_{00} L_{00}^T$ be the Cholesky factorization of $A_{00}$, which we know exists since $A_{00}$ is $(n-1) \times (n-1)$ and SPD. We then know that $L_{00}$ is nonsingular, that $l_{10}^T = a_{10}^T L_{00}^{-T}$ exists, and that $\lambda_{11} = \sqrt{\alpha_{11} - l_{10}^T l_{10}}$ is well-defined. By the last exercise we then know that $A = L L^T$, and hence $L$ is the desired Cholesky factor of $A$.

By the Principle of Mathematical Induction, the theorem holds.

The algorithm is given in Figure 12.4.

Algorithm: $A := \text{Chol\_unb}(A)$ (bordered algorithm)

Partition
$$A \rightarrow \begin{pmatrix} A_{TL} & \star \\ A_{BL} & A_{BR} \end{pmatrix}$$
where $A_{TL}$ is $0 \times 0$.

while $m(A_{TL}) < m(A)$ do

Repartition
$$\begin{pmatrix} A_{TL} & \star \\ A_{BL} & A_{BR} \end{pmatrix} \rightarrow \begin{pmatrix} A_{00} & \star & \star \\ a_{10}^T & \alpha_{11} & \star \\ A_{20} & a_{21} & A_{22} \end{pmatrix}$$
where $\alpha_{11}$ is $1 \times 1$.

($A_{00} = L_{00} L_{00}^T$ has already been computed and $L_{00}$ has overwritten the lower triangular part of $A_{00}$.)

$a_{10}^T := a_{10}^T L_{00}^{-T}$ (Solve $L_{00} l_{10} = a_{10}$, overwriting $a_{10}^T$ with $l_{10}^T$.)
$\alpha_{11} := \alpha_{11} - a_{10}^T a_{10}$
$\alpha_{11} := \sqrt{\alpha_{11}}$

Continue with
$$\begin{pmatrix} A_{TL} & \star \\ A_{BL} & A_{BR} \end{pmatrix} \leftarrow \begin{pmatrix} A_{00} & \star & \star \\ a_{10}^T & \alpha_{11} & \star \\ A_{20} & a_{21} & A_{22} \end{pmatrix}$$

endwhile

Figure 12.4: Answer for Homework 12.17.

* BACK TO TEXT
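For concreteness, here is a minimal M-script sketch of the bordered algorithm in Figure 12.4; the function name, the use of tril to extract the already-computed factor, and the built-in triangular solve are illustrative assumptions:

    function [ A ] = Chol_bordered( A )
    % Sketch: in step k, the leading (k-1) x (k-1) part of A already
    % holds L00; extend the Cholesky factor by one row.
      n = size( A, 1 );
      for k = 1:n
        L00 = tril( A( 1:k-1, 1:k-1 ) );            % already-computed L00
        A( k, 1:k-1 ) = ( L00 \ A( k, 1:k-1 )' )';  % solve L00 * l10 = a10
        A( k, k ) = sqrt( A( k, k ) - A( k, 1:k-1 ) * A( k, 1:k-1 )' );
      end
    end

The cost of each step is dominated by the triangular solve, consistent with the estimate in Homework 12.18 below.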

Homework 12.18 Show that the cost of the bordered algorithm is, approximately, $\frac{1}{3} n^3$ flops.

Answer: During the $k$th iteration, $k = 0, \ldots, n-1$, assume that $A_{00}$ is $k \times k$. Then

• $a_{10}^T := a_{10}^T L_{00}^{-T}$ (implemented as the lower triangular solve $L_{00} l_{10} = a_{10}$) requires, approximately, $k^2$ flops.

• The cost of updating $\alpha_{11}$ can be ignored.

Thus, the total cost equals (approximately)
$$\sum_{k=0}^{n-1} k^2 \approx \int_0^n x^2 \, dx = \frac{1}{3} n^3.$$

* BACK TO TEXT


Chapter 13. Notes on Eigenvalues and Eigenvectors (Answers)

Homework 13.3 Eigenvectors are not unique.

Answer: If $Ax = \lambda x$ for $x \neq 0$, then $A(\alpha x) = \alpha A x = \alpha \lambda x = \lambda (\alpha x)$. Hence any (nonzero) scalar multiple of $x$ is also an eigenvector. This demonstrates that we care about the direction of an eigenvector rather than its length.

* BACK TO TEXT

Homework 13.4 Let $\lambda$ be an eigenvalue of $A$ and let $E_\lambda(A) = \{ x \in \mathbb{C}^m \mid Ax = \lambda x \}$ denote the set of all eigenvectors of $A$ associated with $\lambda$ (including the zero vector, which is not really considered an eigenvector). Show that this set is a (nontrivial) subspace of $\mathbb{C}^m$.

Answer:

• $0 \in E_\lambda(A)$: (since we explicitly include it).

• $x \in E_\lambda(A)$ implies $\alpha x \in E_\lambda(A)$: (by the last exercise).

• $x, y \in E_\lambda(A)$ implies $x + y \in E_\lambda(A)$:
$$A(x + y) = Ax + Ay = \lambda x + \lambda y = \lambda (x + y).$$

* BACK TO TEXT

Homework 13.8 The eigenvalues of a diagonal matrix equal the values on its diagonal. The eigenvalues of a triangular matrix equal the values on its diagonal.

Answer: Since a diagonal matrix is a special case of a triangular matrix, it suffices to prove that the eigenvalues of a triangular matrix are the values on its diagonal.

If $A$ is triangular, so is $A - \lambda I$. By definition, $\lambda$ is an eigenvalue of $A$ if and only if $A - \lambda I$ is singular. But a triangular matrix is singular if and only if it has a zero on its diagonal. The triangular matrix $A - \lambda I$ has a zero on its diagonal if and only if $\lambda$ equals one of the diagonal elements of $A$.

* BACK TO TEXT
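A quick M-script illustration of this result (the $4 \times 4$ size is an arbitrary choice):

    A = triu( rand( 4 ) );      % a random upper triangular matrix
    disp( sort( eig( A ) ) );   % its eigenvalues ...
    disp( sort( diag( A ) ) );  % ... equal its diagonal entries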

Homework 13.19 Prove Lemma 13.18. Then generalize it to a result for block upper triangular matrices:
$$A = \begin{pmatrix} A_{0,0} & A_{0,1} & \cdots & A_{0,N-1} \\ 0 & A_{1,1} & \cdots & A_{1,N-1} \\ 0 & 0 & \ddots & \vdots \\ 0 & 0 & \cdots & A_{N-1,N-1} \end{pmatrix}.$$

Answer:


Lemma 13.18 Let $A \in \mathbb{C}^{m \times m}$ be of form $A = \begin{pmatrix} A_{TL} & A_{TR} \\ 0 & A_{BR} \end{pmatrix}$. Assume that $Q_{TL}$ and $Q_{BR}$ are unitary "of appropriate size". Then
$$A = \begin{pmatrix} Q_{TL} & 0 \\ 0 & Q_{BR} \end{pmatrix}^H \begin{pmatrix} Q_{TL} A_{TL} Q_{TL}^H & Q_{TL} A_{TR} Q_{BR}^H \\ 0 & Q_{BR} A_{BR} Q_{BR}^H \end{pmatrix} \begin{pmatrix} Q_{TL} & 0 \\ 0 & Q_{BR} \end{pmatrix}.$$

Proof:
$$
\begin{aligned}
&\begin{pmatrix} Q_{TL} & 0 \\ 0 & Q_{BR} \end{pmatrix}^H \begin{pmatrix} Q_{TL} A_{TL} Q_{TL}^H & Q_{TL} A_{TR} Q_{BR}^H \\ 0 & Q_{BR} A_{BR} Q_{BR}^H \end{pmatrix} \begin{pmatrix} Q_{TL} & 0 \\ 0 & Q_{BR} \end{pmatrix} \\
&\quad = \begin{pmatrix} Q_{TL}^H Q_{TL} A_{TL} Q_{TL}^H Q_{TL} & Q_{TL}^H Q_{TL} A_{TR} Q_{BR}^H Q_{BR} \\ 0 & Q_{BR}^H Q_{BR} A_{BR} Q_{BR}^H Q_{BR} \end{pmatrix} = \begin{pmatrix} A_{TL} & A_{TR} \\ 0 & A_{BR} \end{pmatrix} = A.
\end{aligned}
$$

By simple extension, $A = Q^H B Q$ where
$$Q = \begin{pmatrix} Q_{0,0} & 0 & \cdots & 0 \\ 0 & Q_{1,1} & \cdots & 0 \\ 0 & 0 & \ddots & \vdots \\ 0 & 0 & \cdots & Q_{N-1,N-1} \end{pmatrix}$$
and
$$B = \begin{pmatrix} Q_{0,0} A_{0,0} Q_{0,0}^H & Q_{0,0} A_{0,1} Q_{1,1}^H & \cdots & Q_{0,0} A_{0,N-1} Q_{N-1,N-1}^H \\ 0 & Q_{1,1} A_{1,1} Q_{1,1}^H & \cdots & Q_{1,1} A_{1,N-1} Q_{N-1,N-1}^H \\ 0 & 0 & \ddots & \vdots \\ 0 & 0 & \cdots & Q_{N-1,N-1} A_{N-1,N-1} Q_{N-1,N-1}^H \end{pmatrix}.$$

* BACK TO TEXT

Homework 13.21 Prove Corollary 13.20. Then generalize it to a result for block upper triangular matrices.

Answer:

Corollary 13.20 Let $A \in \mathbb{C}^{m \times m}$ be of form $A = \begin{pmatrix} A_{TL} & A_{TR} \\ 0 & A_{BR} \end{pmatrix}$. Then $\Lambda(A) = \Lambda(A_{TL}) \cup \Lambda(A_{BR})$.


Proof: This follows immediately from Lemma 13.18. Let $A_{TL} = Q_{TL}^H T_{TL} Q_{TL}$ and $A_{BR} = Q_{BR}^H T_{BR} Q_{BR}$ be Schur decompositions of the diagonal blocks, so that $Q_{TL}$ and $Q_{BR}$ are unitary and $T_{TL}$ and $T_{BR}$ are upper triangular. Then $\Lambda(A_{TL})$ equals the set of diagonal elements of $T_{TL}$ and $\Lambda(A_{BR})$ equals the set of diagonal elements of $T_{BR}$. Now, by Lemma 13.18, the Schur decomposition of $A$ is given by
$$A = \begin{pmatrix} Q_{TL} & 0 \\ 0 & Q_{BR} \end{pmatrix}^H \underbrace{\begin{pmatrix} T_{TL} & Q_{TL} A_{TR} Q_{BR}^H \\ 0 & T_{BR} \end{pmatrix}}_{T_A} \begin{pmatrix} Q_{TL} & 0 \\ 0 & Q_{BR} \end{pmatrix}.$$
Hence the diagonal elements of $T_A$ (which is upper triangular) equal the elements of $\Lambda(A)$. The set of those elements is clearly $\Lambda(A_{TL}) \cup \Lambda(A_{BR})$.

The generalization is that if
$$A = \begin{pmatrix} A_{0,0} & A_{0,1} & \cdots & A_{0,N-1} \\ 0 & A_{1,1} & \cdots & A_{1,N-1} \\ 0 & 0 & \ddots & \vdots \\ 0 & 0 & \cdots & A_{N-1,N-1} \end{pmatrix},$$
where the blocks on the diagonal are square, then
$$\Lambda(A) = \Lambda(A_{0,0}) \cup \Lambda(A_{1,1}) \cup \cdots \cup \Lambda(A_{N-1,N-1}).$$

* BACK TO TEXT

Homework 13.24 Let $A$ be Hermitian and let $\lambda$ and $\mu$ be distinct eigenvalues with eigenvectors $x_\lambda$ and $x_\mu$, respectively. Then $x_\lambda^H x_\mu = 0$. (In other words, the eigenvectors of a Hermitian matrix corresponding to distinct eigenvalues are orthogonal.)

Answer: Assume that $A x_\lambda = \lambda x_\lambda$ and $A x_\mu = \mu x_\mu$, for nonzero vectors $x_\lambda$ and $x_\mu$ and $\lambda \neq \mu$. Then
$$x_\mu^H A x_\lambda = \lambda x_\mu^H x_\lambda \quad \text{and} \quad x_\lambda^H A x_\mu = \mu x_\lambda^H x_\mu.$$
Because $A$ is Hermitian and $\lambda$ is real,
$$\mu x_\lambda^H x_\mu = x_\lambda^H A x_\mu = ( x_\mu^H A x_\lambda )^H = ( \lambda x_\mu^H x_\lambda )^H = \lambda x_\lambda^H x_\mu.$$
Hence
$$\mu x_\lambda^H x_\mu = \lambda x_\lambda^H x_\mu.$$
If $x_\lambda^H x_\mu \neq 0$ then $\mu = \lambda$, which is a contradiction.

* BACK TO TEXT


Homework 13.25 Let $A \in \mathbb{C}^{m \times m}$ be a Hermitian matrix, $A = Q \Lambda Q^H$ its Spectral Decomposition, and $A = U \Sigma V^H$ its SVD. Relate $Q$, $U$, $V$, $\Lambda$, and $\Sigma$.

Answer: I am going to answer this by showing how to take the Spectral Decomposition of $A$ and turn it into the SVD of $A$. Observations:

• We will assume all eigenvalues are nonzero. It should be pretty obvious how to fix the below if some of them are zero.

• $Q$ is unitary and $\Lambda$ is diagonal. Thus, $A = Q \Lambda Q^H$ is the SVD except that the diagonal elements of $\Lambda$ are not necessarily nonnegative and they are not ordered from largest to smallest in magnitude.

• We can fix the fact that they are not necessarily nonnegative with the following observation. Let
$$S = \begin{pmatrix} \mathrm{sign}(\lambda_0) & 0 & \cdots & 0 \\ 0 & \mathrm{sign}(\lambda_1) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \mathrm{sign}(\lambda_{n-1}) \end{pmatrix}
\quad \text{and} \quad
|\Lambda| = \begin{pmatrix} |\lambda_0| & 0 & \cdots & 0 \\ 0 & |\lambda_1| & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & |\lambda_{n-1}| \end{pmatrix},$$
so that $\Lambda = |\Lambda|\, S$ and $S S = I$. Then
$$A = Q \Lambda Q^H = Q\, |\Lambda|\, S\, Q^H = Q \underbrace{|\Lambda|}_{\widetilde\Lambda} \underbrace{(Q S)^H}_{\widetilde Q^H} = Q\, \widetilde\Lambda\, \widetilde Q^H.$$
Now $\widetilde\Lambda$ has nonnegative diagonal entries and $\widetilde Q = Q S$ is unitary since it is the product of two unitary matrices.

• Next, we fix the fact that the entries of $\widetilde\Lambda$ are not ordered from largest to smallest. We do so by noting that there is a permutation matrix $P$ such that $\Sigma = P \widetilde\Lambda P^H$ equals the diagonal matrix with the values ordered from largest to smallest. Then
$$A = Q\, \widetilde\Lambda\, \widetilde Q^H = Q \underbrace{P^H P}_{I}\, \widetilde\Lambda\, \underbrace{P^H P}_{I}\, \widetilde Q^H = \underbrace{Q P^H}_{U} \underbrace{P \widetilde\Lambda P^H}_{\Sigma} \underbrace{P \widetilde Q^H}_{V^H} = U \Sigma V^H.$$

* BACK TO TEXT
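The construction above is easy to check numerically. Here is a minimal M-script sketch for a real symmetric matrix; all variable names are illustrative and, as in the answer, we assume no eigenvalue is zero:

    n = 5;
    A = randn( n );  A = ( A + A' ) / 2;             % random symmetric matrix
    [ Q, Lambda ] = eig( A );                        % spectral decomposition
    S = diag( sign( diag( Lambda ) ) );              % signs of the eigenvalues
    absLambda = Lambda * S;                          % |Lambda|
    [ ~, p ] = sort( diag( absLambda ), 'descend' );
    P = eye( n );  P = P( p, : );                    % ordering permutation
    U = Q * P';  Sigma = P * absLambda * P';  V = Q * S * P';
    disp( norm( A - U * Sigma * V', 'fro' ) );       % ~ 0: A = U Sigma V'
    disp( norm( svd( A ) - diag( Sigma ) ) );        % matches the built-in SVD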

Homework 13.26 Let $A \in \mathbb{C}^{m \times m}$ and let $A = U \Sigma V^H$ be its SVD. Relate the Spectral Decompositions of $A^H A$ and $A A^H$ to $U$, $V$, and $\Sigma$.

Answer:
$$A^H A = ( U \Sigma V^H )^H U \Sigma V^H = V \Sigma U^H U \Sigma V^H = V \Sigma^2 V^H.$$
Thus, the eigenvalues of $A^H A$ equal the squares of the singular values of $A$, and the columns of $V$ are corresponding eigenvectors.
$$A A^H = U \Sigma V^H ( U \Sigma V^H )^H = U \Sigma V^H V \Sigma U^H = U \Sigma^2 U^H.$$
Thus, the eigenvalues of $A A^H$ also equal the squares of the singular values of $A$, and the columns of $U$ are corresponding eigenvectors.

* BACK TO TEXT
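A quick numeric check in M-script (the $5 \times 3$ size is arbitrary):

    A = randn( 5, 3 );
    s = svd( A );                               % singular values, descending
    lambda = sort( eig( A' * A ), 'descend' );  % eigenvalues of A^H A
    disp( [ s.^2, lambda ] );                   % the two columns agree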


Chapter 14. Notes on the Power Method and Related Methods (Answers)

Homework 14.4 Prove Lemma 14.3.

Answer: We need to show that:

• If $y \neq 0$ then $\| y \|_X > 0$: Let $z = X y$. Then $z \neq 0$ since $X$ is nonsingular. Hence
$$\| y \|_X = \| X y \| = \| z \| > 0.$$

• If $\alpha \in \mathbb{C}$ and $y \in \mathbb{C}^m$ then $\| \alpha y \|_X = |\alpha| \, \| y \|_X$:
$$\| \alpha y \|_X = \| X ( \alpha y ) \| = \| \alpha X y \| = |\alpha| \, \| X y \| = |\alpha| \, \| y \|_X.$$

• If $x, y \in \mathbb{C}^m$ then $\| x + y \|_X \le \| x \|_X + \| y \|_X$:
$$\| x + y \|_X = \| X ( x + y ) \| = \| X x + X y \| \le \| X x \| + \| X y \| = \| x \|_X + \| y \|_X.$$

* BACK TO TEXT

Homework 14.7 Assume that
$$| \lambda_0 | \ge | \lambda_1 | \ge \cdots \ge | \lambda_{m-2} | > | \lambda_{m-1} | > 0.$$
Show that
$$\left| \frac{1}{\lambda_{m-1}} \right| > \left| \frac{1}{\lambda_{m-2}} \right| \ge \left| \frac{1}{\lambda_{m-3}} \right| \ge \cdots \ge \left| \frac{1}{\lambda_0} \right| .$$

Answer: This follows immediately from the fact that if $\alpha > 0$ and $\beta > 0$ then

• $\alpha > \beta$ implies that $1/\beta > 1/\alpha$, and

• $\alpha \ge \beta$ implies that $1/\beta \ge 1/\alpha$.

* BACK TO TEXT

Homework 14.9 Prove Lemma 14.8.

Answer: If $A x = \lambda x$ then $( A - \mu I ) x = A x - \mu x = \lambda x - \mu x = ( \lambda - \mu ) x$.

* BACK TO TEXT

Homework 14.11 Prove Lemma 14.10.

Answer: $A - \mu I = X \Lambda X^{-1} - \mu X X^{-1} = X ( \Lambda - \mu I ) X^{-1}$.

* BACK TO TEXT
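These two lemmas are what make the shifted inverse power method work: iterating with $(A - \mu I)^{-1}$ amplifies the eigenvector whose eigenvalue is closest to the shift $\mu$. A minimal M-script sketch, where the matrix, the shift, and the iteration count are illustrative assumptions:

    n = 6;
    A = randn( n );  A = ( A + A' ) / 2;       % symmetric example matrix
    mu = 1.0;                                  % shift
    [ L, U, P ] = lu( A - mu * eye( n ) );     % factor once, solve repeatedly
    x = randn( n, 1 );  x = x / norm( x );
    for k = 1:100
      x = U \ ( L \ ( P * x ) );               % x := (A - mu I)^{-1} x
      x = x / norm( x );                       % normalize to avoid overflow
    end
    lambda = x' * A * x;                       % Rayleigh quotient: estimate of
                                               % the eigenvalue closest to mu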


Chapter 15. Notes on the QR Algorithm and Other Dense Eigensolvers (Answers)

Homework 15.2 Prove that in Figure 15.3, $\widehat V^{(k)} = V^{(k)}$ and $\widehat A^{(k)} = A^{(k)}$, $k = 0, \ldots$ (Here the hatted quantities are those computed by the algorithm on the left in that figure, and the unhatted ones by the algorithm on the right.)

Answer: This requires a proof by induction.

• Base case: $k = 0$. We need to show that $\widehat V^{(0)} = V^{(0)}$ and $\widehat A^{(0)} = A^{(0)}$. This is trivially true.

• Inductive step: Inductive Hypothesis (IH): $\widehat V^{(k)} = V^{(k)}$ and $\widehat A^{(k)} = A^{(k)}$.

We need to show that $\widehat V^{(k+1)} = V^{(k+1)}$ and $\widehat A^{(k+1)} = A^{(k+1)}$. Notice that
$$A \widehat V^{(k)} = \widehat V^{(k+1)} \widehat R^{(k+1)} \quad \text{(QR factorization)}$$
so that
$$\widehat A^{(k)} = \widehat V^{(k)T} A \widehat V^{(k)} = \widehat V^{(k)T} \widehat V^{(k+1)} \widehat R^{(k+1)}.$$
Since $\widehat V^{(k)T} \widehat V^{(k+1)}$ is unitary, this means that $\widehat A^{(k)} = \left( \widehat V^{(k)T} \widehat V^{(k+1)} \right) \widehat R^{(k+1)}$ is a QR factorization of $\widehat A^{(k)}$.

Now, by the IH,
$$A^{(k)} = \widehat A^{(k)} = \left( \widehat V^{(k)T} \widehat V^{(k+1)} \right) \widehat R^{(k+1)},$$
and in the algorithm on the right
$$A^{(k)} = Q^{(k+1)} R^{(k+1)}.$$
By the uniqueness of the QR factorization, this means that
$$Q^{(k+1)} = \widehat V^{(k)T} \widehat V^{(k+1)} \quad \text{and} \quad R^{(k+1)} = \widehat R^{(k+1)}.$$
But then
$$V^{(k+1)} = V^{(k)} Q^{(k+1)} = \widehat V^{(k)} Q^{(k+1)} = \underbrace{\widehat V^{(k)} \widehat V^{(k)T}}_{I} \widehat V^{(k+1)} = \widehat V^{(k+1)}.$$
Finally,
$$
\begin{aligned}
\widehat A^{(k+1)} &= \widehat V^{(k+1)T} A \widehat V^{(k+1)} = V^{(k+1)T} A V^{(k+1)} = \left( V^{(k)} Q^{(k+1)} \right)^T A \left( V^{(k)} Q^{(k+1)} \right) \\
&= Q^{(k+1)T} V^{(k)T} A V^{(k)} Q^{(k+1)} = \underbrace{Q^{(k+1)T} A^{(k)}}_{R^{(k+1)}} Q^{(k+1)} = R^{(k+1)} Q^{(k+1)} = A^{(k+1)}.
\end{aligned}
$$

• By the Principle of Mathematical Induction the result holds for all $k$.

* BACK TO TEXT
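A minimal M-script sketch of the algorithm on the right in Figure 15.3 (the simple, unshifted QR algorithm); the matrix size and the iteration count are arbitrary illustrative choices:

    n = 5;
    A = randn( n );  A = ( A + A' ) / 2;  % symmetric: real eigenvalues
    V = eye( n );  Ak = A;
    for k = 1:200
      [ Q, R ] = qr( Ak );                % A(k)   = Q(k+1) R(k+1)
      Ak = R * Q;                         % A(k+1) := R(k+1) Q(k+1)
      V = V * Q;                          % V(k+1) := V(k) Q(k+1)
    end
    % Ak approaches a diagonal matrix of eigenvalues and the columns of V
    % approach the corresponding eigenvectors: A * V is approximately V * Ak.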

Homework 15.3 Prove that in Figure 15.4, $\widehat V^{(k)} = V^{(k)}$ and $\widehat A^{(k)} = A^{(k)}$, $k = 0, 1, \ldots$

Answer: This requires a proof by induction.

• Base case: $k = 0$. We need to show that $\widehat V^{(0)} = V^{(0)}$ and $\widehat A^{(0)} = A^{(0)}$. This is trivially true.

• Inductive step: Inductive Hypothesis (IH): $\widehat V^{(k)} = V^{(k)}$ and $\widehat A^{(k)} = A^{(k)}$.

We need to show that $\widehat V^{(k+1)} = V^{(k+1)}$ and $\widehat A^{(k+1)} = A^{(k+1)}$.

The proof of the inductive step is a minor modification of the last proof, except that we also need to show that the shifts agree, $\widehat \mu_k = \mu_k$:
$$\widehat \mu_k = \widehat v_{n-1}^{(k)T} A\, \widehat v_{n-1}^{(k)} = \left( \widehat V^{(k)} e_{n-1} \right)^T A \left( \widehat V^{(k)} e_{n-1} \right) = e_{n-1}^T \widehat V^{(k)T} A \widehat V^{(k)} e_{n-1} = e_{n-1}^T \widehat A^{(k)} e_{n-1} = \underbrace{e_{n-1}^T A^{(k)} e_{n-1}}_{\text{by the IH}} = \alpha_{n-1,n-1}^{(k)} = \mu_k.$$

• By the Principle of Mathematical Induction the result holds for all $k$.

* BACK TO TEXT

Homework 15.5 Prove the above theorem.

Answer: I believe we already proved this in "Notes on Eigenvalues and Eigenvectors".

* BACK TO TEXT

Homework 15.9 Give all the details of the above proof.

Answer: Assume that $q_1, \ldots, q_k$ and the column of $B$ indexed with $k-1$ have been shown to be uniquely determined under the stated assumptions. We now show that $q_{k+1}$ and the column of $B$ indexed with $k$ are then uniquely determined. (This is the inductive step in the proof.) Now,
$$A q_k = \beta_{0,k} q_0 + \beta_{1,k} q_1 + \cdots + \beta_{k,k} q_k + \beta_{k+1,k} q_{k+1}.$$
We can determine $\beta_{0,k}$ through $\beta_{k,k}$ by observing that
$$q_j^T A q_k = \beta_{j,k}$$
for $j = 0, \ldots, k$, since the vectors $q_j$ are mutually orthonormal. Then
$$\widetilde q_{k+1} := A q_k - \left( \beta_{0,k} q_0 + \beta_{1,k} q_1 + \cdots + \beta_{k,k} q_k \right) = \beta_{k+1,k} q_{k+1}.$$
Since it is assumed that $\beta_{k+1,k} > 0$ and $\| q_{k+1} \|_2 = 1$, it can be determined as
$$\beta_{k+1,k} = \| \widetilde q_{k+1} \|_2$$
and then
$$q_{k+1} = \widetilde q_{k+1} / \beta_{k+1,k}.$$
This way, the columns of $Q$ and $B$ can be determined, one by one.

* BACK TO TEXT
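The construction in this proof translates directly into a short M-script sketch; the matrix, its size, and the starting vector are illustrative assumptions, and B comes out upper Hessenberg with positive subdiagonal entries:

    n = 6;
    A = randn( n );
    Q = zeros( n );  Q( :, 1 ) = eye( n, 1 );  % q0: the given first column
    B = zeros( n );
    for k = 1:n-1
      v = A * Q( :, k );
      B( 1:k, k ) = Q( :, 1:k )' * v;          % beta_{j,k} = q_j' A q_k
      v = v - Q( :, 1:k ) * B( 1:k, k );       % qtilde_{k+1}
      B( k+1, k ) = norm( v );                 % beta_{k+1,k} = || qtilde ||_2
      Q( :, k+1 ) = v / B( k+1, k );           % q_{k+1}
    end
    B( :, n ) = Q' * ( A * Q( :, n ) );        % last column of B
    % Q is (numerically) orthonormal and A * Q is approximately Q * B.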


Homework 15.11 In the above discussion, show that $\alpha_{11}^2 + 2 \alpha_{31}^2 + \alpha_{33}^2 = \widehat\alpha_{11}^2 + \widehat\alpha_{33}^2$.

Answer: If $\widehat A_{31} = J_{31} A_{31} J_{31}^T$ then $\| \widehat A_{31} \|_F^2 = \| A_{31} \|_F^2$, because multiplying on the left and/or the right by a unitary matrix preserves the Frobenius norm of a matrix. Hence
$$\left\| \begin{pmatrix} \alpha_{11} & \alpha_{31} \\ \alpha_{31} & \alpha_{33} \end{pmatrix} \right\|_F^2 = \left\| \begin{pmatrix} \widehat\alpha_{11} & 0 \\ 0 & \widehat\alpha_{33} \end{pmatrix} \right\|_F^2 .$$
But
$$\left\| \begin{pmatrix} \alpha_{11} & \alpha_{31} \\ \alpha_{31} & \alpha_{33} \end{pmatrix} \right\|_F^2 = \alpha_{11}^2 + 2 \alpha_{31}^2 + \alpha_{33}^2
\quad \text{and} \quad
\left\| \begin{pmatrix} \widehat\alpha_{11} & 0 \\ 0 & \widehat\alpha_{33} \end{pmatrix} \right\|_F^2 = \widehat\alpha_{11}^2 + \widehat\alpha_{33}^2 ,$$
which proves the result.

* BACK TO TEXT

Homework 15.14 Give the approximate total cost for reducing a nonsymmetric $A \in \mathbb{R}^{n \times n}$ to upper Hessenberg form.

Answer: Assume that in the current iteration of the algorithm $A_{00}$ is $k \times k$. The update in the current iteration is then:

• $[ u_{21}, \tau_1, a_{21} ] := \text{HouseV}( a_{21} )$: lower order cost, ignore.

• $y_{01} := A_{02} u_{21}$: $2k(n-k-1)$ flops, since $A_{02}$ is $k \times (n-k-1)$.

• $A_{02} := A_{02} - \frac{1}{\tau_1} y_{01} u_{21}^T$: $2k(n-k-1)$ flops, since $A_{02}$ is $k \times (n-k-1)$.

• $\psi_{11} := a_{12}^T u_{21}$: lower order cost, ignore.

• $a_{12}^T := a_{12}^T + \frac{\psi_{11}}{\tau_1} u_{21}^T$: lower order cost, ignore.

• $y_{21} := A_{22} u_{21}$: $2(n-k-1)^2$ flops, since $A_{22}$ is $(n-k-1) \times (n-k-1)$.

• $\beta := u_{21}^T y_{21} / 2$: lower order cost, ignore.

• $z_{21} := ( y_{21} - \beta u_{21} / \tau_1 ) / \tau_1$: lower order cost, ignore.

• $w_{21} := ( A_{22}^T u_{21} - \beta u_{21} / \tau_1 ) / \tau_1$: $2(n-k-1)^2$ flops, since the significant cost is the matrix-vector multiply with $A_{22}$.

• $A_{22} := A_{22} - ( u_{21} w_{21}^T + z_{21} u_{21}^T )$: $2 \times 2(n-k-1)^2$ flops, since $A_{22}$ is $(n-k-1) \times (n-k-1)$ and is updated by two rank-1 updates.

Thus, the total cost in flops is, approximately,
$$
\begin{aligned}
\sum_{k=0}^{n-1} & \left[ 2k(n-k-1) + 2k(n-k-1) + 2(n-k-1)^2 + 2(n-k-1)^2 + 2 \times 2(n-k-1)^2 \right] \\
&= \sum_{k=0}^{n-1} \underbrace{\left[ 4k(n-k-1) + 4(n-k-1)^2 \right]}_{= \; 4(n-1)(n-k-1) \; \approx \; 4n(n-k-1)} + \sum_{k=0}^{n-1} 4(n-k-1)^2 \\
&\approx 4n \sum_{j=0}^{n-1} j + 4 \sum_{j=0}^{n-1} j^2 \approx 4n \, \frac{n(n-1)}{2} + 4 \int_0^n x^2 \, dx \approx 2 n^3 + \frac{4}{3} n^3 = \frac{10}{3} n^3 .
\end{aligned}
$$

Thus, the cost is (approximately) $\frac{10}{3} n^3$ flops.

* BACK TO TEXT


Chapter 16. Notes on the Method of Relatively Robust Representations (Answers)

Homework 16.1 Modify the algorithm in Figure 16.1 so that it computes the LDLT factorization. (Fill in Figure 16.3.)

Answer: Three possible answers are given in Figure 16.5.

Algorithm: $A := \text{LDLT}(A)$

Partition
$$A \rightarrow \begin{pmatrix} A_{TL} & \star \\ A_{BL} & A_{BR} \end{pmatrix}$$
where $A_{TL}$ is $0 \times 0$.

while $m(A_{TL}) < m(A)$ do

Repartition
$$\begin{pmatrix} A_{TL} & \star \\ A_{BL} & A_{BR} \end{pmatrix} \rightarrow \begin{pmatrix} A_{00} & \star & \star \\ a_{10}^T & \alpha_{11} & \star \\ A_{20} & a_{21} & A_{22} \end{pmatrix}$$
where $\alpha_{11}$ is $1 \times 1$.

Option 1:
$l_{21} := a_{21} / \alpha_{11}$
$A_{22} := A_{22} - l_{21} a_{21}^T$ (updating lower triangle)
$a_{21} := l_{21}$

Option 2:
$a_{21} := a_{21} / \alpha_{11}$
$A_{22} := A_{22} - \alpha_{11} a_{21} a_{21}^T$ (updating lower triangle)

Option 3:
$A_{22} := A_{22} - \frac{1}{\alpha_{11}} a_{21} a_{21}^T$ (updating lower triangle)
$a_{21} := a_{21} / \alpha_{11}$

Continue with
$$\begin{pmatrix} A_{TL} & \star \\ A_{BL} & A_{BR} \end{pmatrix} \leftarrow \begin{pmatrix} A_{00} & \star & \star \\ a_{10}^T & \alpha_{11} & \star \\ A_{20} & a_{21} & A_{22} \end{pmatrix}$$

endwhile

Figure 16.5: Unblocked algorithm for computing $A \rightarrow L D L^T$, overwriting $A$.

* BACK TO TEXT
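Option 2 of Figure 16.5 translates, for example, into the following minimal M-script sketch (the function name is an illustrative choice):

    function [ A ] = LDLt_unb( A )
    % Sketch: overwrite the diagonal of A with D and the strictly lower
    % triangular part with the corresponding part of the unit lower
    % triangular matrix L.
      n = size( A, 1 );
      for j = 1:n-1
        A( j+1:n, j ) = A( j+1:n, j ) / A( j, j );   % a21 := a21 / alpha11
        A( j+1:n, j+1:n ) = A( j+1:n, j+1:n ) ...    % A22 := A22 - alpha11 a21 a21'
            - A( j, j ) * tril( A( j+1:n, j ) * A( j+1:n, j )' );  % (lower part)
      end
    end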

Homework 16.2 Modify the algorithm in Figure 16.2 so that it computes the LDLT factorization of a tridiagonal matrix. (Fill in Figure 16.4.) What is the approximate cost, in floating point operations, of computing the LDLT factorization of a tridiagonal matrix? Count a divide, multiply, and add/subtract as a floating point operation each. Show how you came up with the algorithm, similar to how we derived the algorithm for the tridiagonal Cholesky factorization.

Answer: Three possible algorithms are given in Figure 16.6.

Algorithm: $A := \text{LDLT\_tri}(A)$

Partition
$$A \rightarrow \begin{pmatrix} A_{FF} & \star & \star \\ \alpha_{MF} e_L^T & \alpha_{MM} & \star \\ 0 & \alpha_{LM} e_F & A_{LL} \end{pmatrix}$$
where $A_{FF}$ is $0 \times 0$.

while $m(A_{FF}) < m(A)$ do

Repartition
$$\begin{pmatrix} A_{FF} & \star & \star \\ \alpha_{MF} e_L^T & \alpha_{MM} & \star \\ 0 & \alpha_{LM} e_F & A_{LL} \end{pmatrix} \rightarrow \begin{pmatrix} A_{00} & \star & \star & \star \\ \alpha_{10} e_L^T & \alpha_{11} & \star & \star \\ 0 & \alpha_{21} & \alpha_{22} & \star \\ 0 & 0 & \alpha_{32} e_F & A_{33} \end{pmatrix}$$
where $\alpha_{22}$ is a scalar.

Option 1:
$\lambda_{21} := \alpha_{21} / \alpha_{11}$
$\alpha_{22} := \alpha_{22} - \lambda_{21} \alpha_{21}$
$\alpha_{21} := \lambda_{21}$

Option 2:
$\alpha_{21} := \alpha_{21} / \alpha_{11}$
$\alpha_{22} := \alpha_{22} - \alpha_{11} \alpha_{21}^2$

Option 3:
$\alpha_{22} := \alpha_{22} - \alpha_{21}^2 / \alpha_{11}$
$\alpha_{21} := \alpha_{21} / \alpha_{11}$

Continue with
$$\begin{pmatrix} A_{FF} & \star & \star \\ \alpha_{MF} e_L^T & \alpha_{MM} & \star \\ 0 & \alpha_{LM} e_F & A_{LL} \end{pmatrix} \leftarrow \begin{pmatrix} A_{00} & \star & \star & \star \\ \alpha_{10} e_L^T & \alpha_{11} & \star & \star \\ 0 & \alpha_{21} & \alpha_{22} & \star \\ 0 & 0 & \alpha_{32} e_F & A_{33} \end{pmatrix}$$

endwhile

Figure 16.6: Algorithm for computing the LDLT factorization of a tridiagonal matrix.

The key insight is to recognize that, relative to the algorithm in Figure 16.5, $a_{21} = \begin{pmatrix} \alpha_{21} \\ 0 \end{pmatrix}$ so that,


for example, $a_{21} := a_{21} / \alpha_{11}$ becomes
$$\begin{pmatrix} \alpha_{21} \\ 0 \end{pmatrix} := \begin{pmatrix} \alpha_{21} \\ 0 \end{pmatrix} / \alpha_{11} = \begin{pmatrix} \alpha_{21} / \alpha_{11} \\ 0 \end{pmatrix}.$$
Then, an update like $A_{22} := A_{22} - \alpha_{11} a_{21} a_{21}^T$ becomes
$$\begin{pmatrix} \alpha_{22} & \star \\ \alpha_{32} e_F & A_{33} \end{pmatrix} := \begin{pmatrix} \alpha_{22} & \star \\ \alpha_{32} e_F & A_{33} \end{pmatrix} - \alpha_{11} \begin{pmatrix} \alpha_{21} \\ 0 \end{pmatrix} \begin{pmatrix} \alpha_{21} \\ 0 \end{pmatrix}^T = \begin{pmatrix} \alpha_{22} - \alpha_{11} \alpha_{21}^2 & \star \\ \alpha_{32} e_F & A_{33} \end{pmatrix}.$$
Each of the $n-1$ iterations requires three or four flops (depending on the option chosen), so the total cost is approximately $4n$ flops.

* BACK TO TEXT
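With the tridiagonal matrix stored as its diagonal d and subdiagonal e (a storage choice assumed here for illustration), Option 2 of Figure 16.6 becomes an O(n) loop in M-script:

    function [ d, e ] = LDLt_tri( d, e )
    % Sketch: on return, d holds D and e holds the subdiagonal of the
    % unit bidiagonal L.  About four flops per iteration, roughly 4n total.
      n = length( d );
      for k = 1:n-1
        e( k ) = e( k ) / d( k );                  % lambda_{k+1,k}
        d( k+1 ) = d( k+1 ) - d( k ) * e( k )^2;   % delta_{k+1}
      end
    end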

Homework 16.3 Derive an algorithm that, given an indefinite matrix $A$, computes $A = U D U^T$. Overwrite only the upper triangular part of $A$. (Fill in Figure 16.5.) Show how you came up with the algorithm, similar to how we derived the algorithm for LDLT.

Answer: Three possible algorithms are given in Figure 16.7.

Partition
$$A \rightarrow \begin{pmatrix} A_{00} & a_{01} \\ a_{01}^T & \alpha_{11} \end{pmatrix}, \qquad U \rightarrow \begin{pmatrix} U_{00} & u_{01} \\ 0 & 1 \end{pmatrix}, \qquad \text{and} \qquad D \rightarrow \begin{pmatrix} D_{00} & 0 \\ 0 & \delta_1 \end{pmatrix}.$$
Then
$$\begin{pmatrix} A_{00} & a_{01} \\ a_{01}^T & \alpha_{11} \end{pmatrix}
= \begin{pmatrix} U_{00} & u_{01} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} D_{00} & 0 \\ 0 & \delta_1 \end{pmatrix} \begin{pmatrix} U_{00} & u_{01} \\ 0 & 1 \end{pmatrix}^T
= \begin{pmatrix} U_{00} D_{00} & u_{01} \delta_1 \\ 0 & \delta_1 \end{pmatrix} \begin{pmatrix} U_{00}^T & 0 \\ u_{01}^T & 1 \end{pmatrix}
= \begin{pmatrix} U_{00} D_{00} U_{00}^T + \delta_1 u_{01} u_{01}^T & u_{01} \delta_1 \\ \star & \delta_1 \end{pmatrix}.$$
This suggests the following algorithm for overwriting the strictly upper triangular part of $A$ with the strictly upper triangular part of $U$ and the diagonal of $A$ with $D$:

• Partition $A \rightarrow \begin{pmatrix} A_{00} & a_{01} \\ a_{01}^T & \alpha_{11} \end{pmatrix}$.

• $\alpha_{11} := \delta_1 = \alpha_{11}$ (no-op).

• Compute $u_{01} := a_{01} / \alpha_{11}$.

• Update $A_{00} := A_{00} - u_{01} a_{01}^T$ (updating only the upper triangular part).

• $a_{01} := u_{01}$.

• Continue by computing $A_{00} \rightarrow U_{00} D_{00} U_{00}^T$.

This algorithm will complete as long as $\delta_1 \neq 0$.

Algorithm: $A := \text{UDUT}(A)$

Partition
$$A \rightarrow \begin{pmatrix} A_{TL} & A_{TR} \\ \star & A_{BR} \end{pmatrix}$$
where $A_{BR}$ is $0 \times 0$.

while $m(A_{BR}) < m(A)$ do

Repartition
$$\begin{pmatrix} A_{TL} & A_{TR} \\ \star & A_{BR} \end{pmatrix} \rightarrow \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ \star & \alpha_{11} & a_{12}^T \\ \star & \star & A_{22} \end{pmatrix}$$
where $\alpha_{11}$ is $1 \times 1$.

Option 1:
$u_{01} := a_{01} / \alpha_{11}$
$A_{00} := A_{00} - u_{01} a_{01}^T$ (updating upper triangle)
$a_{01} := u_{01}$

Option 2:
$a_{01} := a_{01} / \alpha_{11}$
$A_{00} := A_{00} - \alpha_{11} a_{01} a_{01}^T$ (updating upper triangle)

Option 3:
$A_{00} := A_{00} - \frac{1}{\alpha_{11}} a_{01} a_{01}^T$ (updating upper triangle)
$a_{01} := a_{01} / \alpha_{11}$

Continue with
$$\begin{pmatrix} A_{TL} & A_{TR} \\ \star & A_{BR} \end{pmatrix} \leftarrow \begin{pmatrix} A_{00} & a_{01} & A_{02} \\ \star & \alpha_{11} & a_{12}^T \\ \star & \star & A_{22} \end{pmatrix}$$

endwhile

Figure 16.7: Unblocked algorithm for computing $A \rightarrow U D U^T$, overwriting $A$.

* BACK TO TEXT

Homework 16.4 Derive an algorithm that, given an indefinite tridiagonal matrix $A$, computes $A = U D U^T$. Overwrite only the upper triangular part of $A$. (Fill in Figure 16.6.) Show how you came up with the algorithm, similar to how we derived the algorithm for LDLT.

Answer: Three possible algorithms are given in Figure 16.8.

The key insight is to recognize that, relative to the algorithm in Figure 16.7, $\alpha_{11}$ corresponds to $\alpha_{22}$ and $a_{01} = \begin{pmatrix} 0 \\ \alpha_{12} \end{pmatrix}$, so that, for example, $a_{01} := a_{01} / \alpha_{11}$ becomes
$$\begin{pmatrix} 0 \\ \alpha_{12} \end{pmatrix} := \begin{pmatrix} 0 \\ \alpha_{12} \end{pmatrix} / \alpha_{22} = \begin{pmatrix} 0 \\ \alpha_{12} / \alpha_{22} \end{pmatrix}.$$
Then, an update like $A_{00} := A_{00} - \alpha_{11} a_{01} a_{01}^T$ becomes
$$\begin{pmatrix} A_{00} & \alpha_{01} e_L \\ \star & \alpha_{11} \end{pmatrix} := \begin{pmatrix} A_{00} & \alpha_{01} e_L \\ \star & \alpha_{11} \end{pmatrix} - \alpha_{22} \begin{pmatrix} 0 \\ \alpha_{12} \end{pmatrix} \begin{pmatrix} 0 \\ \alpha_{12} \end{pmatrix}^T = \begin{pmatrix} A_{00} & \alpha_{01} e_L \\ \star & \alpha_{11} - \alpha_{22} \alpha_{12}^2 \end{pmatrix}.$$

* BACK TO TEXT

Algorithm: $A := \text{UDUT\_tri}(A)$

Partition
$$A \rightarrow \begin{pmatrix} A_{FF} & \alpha_{FM} e_L & 0 \\ \star & \alpha_{MM} & \alpha_{ML} e_F^T \\ \star & \star & A_{LL} \end{pmatrix}$$
where $A_{LL}$ is $0 \times 0$.

while $m(A_{LL}) < m(A)$ do

Repartition
$$\begin{pmatrix} A_{FF} & \alpha_{FM} e_L & 0 \\ \star & \alpha_{MM} & \alpha_{ML} e_F^T \\ \star & \star & A_{LL} \end{pmatrix} \rightarrow \begin{pmatrix} A_{00} & \alpha_{01} e_L & 0 & 0 \\ \star & \alpha_{11} & \alpha_{12} & 0 \\ \star & \star & \alpha_{22} & \alpha_{23} e_F^T \\ \star & \star & \star & A_{33} \end{pmatrix}$$
where $\alpha_{11}$ is a scalar.

Option 1:
$\upsilon_{12} := \alpha_{12} / \alpha_{22}$
$\alpha_{11} := \alpha_{11} - \upsilon_{12} \alpha_{12}$
$\alpha_{12} := \upsilon_{12}$

Option 2:
$\alpha_{12} := \alpha_{12} / \alpha_{22}$
$\alpha_{11} := \alpha_{11} - \alpha_{22} \alpha_{12}^2$

Option 3:
$\alpha_{11} := \alpha_{11} - \alpha_{12}^2 / \alpha_{22}$
$\alpha_{12} := \alpha_{12} / \alpha_{22}$

Continue with
$$\begin{pmatrix} A_{FF} & \alpha_{FM} e_L & 0 \\ \star & \alpha_{MM} & \alpha_{ML} e_F^T \\ \star & \star & A_{LL} \end{pmatrix} \leftarrow \begin{pmatrix} A_{00} & \alpha_{01} e_L & 0 & 0 \\ \star & \alpha_{11} & \alpha_{12} & 0 \\ \star & \star & \alpha_{22} & \alpha_{23} e_F^T \\ \star & \star & \star & A_{33} \end{pmatrix}$$

endwhile

Figure 16.8: Algorithm for computing the UDUT factorization of a tridiagonal matrix.

Homework 16.5 Show that, provided $\phi_1$ is chosen appropriately,
$$
\begin{pmatrix} L_{00} & 0 & 0 \\ \lambda_{10} e_L^T & 1 & \upsilon_{21} e_F^T \\ 0 & 0 & U_{22} \end{pmatrix}
\begin{pmatrix} D_{00} & 0 & 0 \\ 0 & \phi_1 & 0 \\ 0 & 0 & E_{22} \end{pmatrix}
\begin{pmatrix} L_{00} & 0 & 0 \\ \lambda_{10} e_L^T & 1 & \upsilon_{21} e_F^T \\ 0 & 0 & U_{22} \end{pmatrix}^T
=
\begin{pmatrix} A_{00} & \alpha_{01} e_L & 0 \\ \alpha_{01} e_L^T & \alpha_{11} & \alpha_{21} e_F^T \\ 0 & \alpha_{21} e_F & A_{22} \end{pmatrix}.
$$
(Hint: multiply out $A = L D L^T$ and $A = U E U^T$ with the partitioned matrices first. Then multiply out the above. Compare and match...) How should $\phi_1$ be chosen? What is the cost of computing the twisted factorization, given that you have already computed the LDLT and UDUT factorizations? A "Big O" estimate is sufficient. Be sure to take into account what $e_L^T D_{00} e_L$ and $e_F^T E_{22} e_F$ equal in your cost estimate.

Answer:
$$
\begin{aligned}
&\begin{pmatrix} L_{00} & 0 & 0 \\ \lambda_{10} e_L^T & 1 & 0 \\ 0 & \lambda_{21} e_F & L_{22} \end{pmatrix}
\begin{pmatrix} D_{00} & 0 & 0 \\ 0 & \delta_1 & 0 \\ 0 & 0 & D_{22} \end{pmatrix}
\begin{pmatrix} L_{00} & 0 & 0 \\ \lambda_{10} e_L^T & 1 & 0 \\ 0 & \lambda_{21} e_F & L_{22} \end{pmatrix}^T \\
&\quad=
\begin{pmatrix} L_{00} & 0 & 0 \\ \lambda_{10} e_L^T & 1 & 0 \\ 0 & \lambda_{21} e_F & L_{22} \end{pmatrix}
\begin{pmatrix} D_{00} L_{00}^T & \lambda_{10} D_{00} e_L & 0 \\ 0 & \delta_1 & \lambda_{21} \delta_1 e_F^T \\ 0 & 0 & D_{22} L_{22}^T \end{pmatrix} \\
&\quad=
\begin{pmatrix} L_{00} D_{00} L_{00}^T & \lambda_{10} L_{00} D_{00} e_L & 0 \\ \lambda_{10} e_L^T D_{00} L_{00}^T & \lambda_{10}^2 e_L^T D_{00} e_L + \delta_1 & \lambda_{21} \delta_1 e_F^T \\ 0 & \lambda_{21} \delta_1 e_F & \lambda_{21}^2 \delta_1 e_F e_F^T + L_{22} D_{22} L_{22}^T \end{pmatrix}
=
\begin{pmatrix} A_{00} & \alpha_{01} e_L & 0 \\ \alpha_{01} e_L^T & \alpha_{11} & \alpha_{21} e_F^T \\ 0 & \alpha_{21} e_F & A_{22} \end{pmatrix} \qquad (16.1)
\end{aligned}
$$
and
$$
\begin{aligned}
&\begin{pmatrix} U_{00} & \upsilon_{01} e_L & 0 \\ 0 & 1 & \upsilon_{21} e_F^T \\ 0 & 0 & U_{22} \end{pmatrix}
\begin{pmatrix} E_{00} & 0 & 0 \\ 0 & \varepsilon_1 & 0 \\ 0 & 0 & E_{22} \end{pmatrix}
\begin{pmatrix} U_{00} & \upsilon_{01} e_L & 0 \\ 0 & 1 & \upsilon_{21} e_F^T \\ 0 & 0 & U_{22} \end{pmatrix}^T \\
&\quad=
\begin{pmatrix} U_{00} & \upsilon_{01} e_L & 0 \\ 0 & 1 & \upsilon_{21} e_F^T \\ 0 & 0 & U_{22} \end{pmatrix}
\begin{pmatrix} E_{00} U_{00}^T & 0 & 0 \\ \varepsilon_1 \upsilon_{01} e_L^T & \varepsilon_1 & 0 \\ 0 & \upsilon_{21} E_{22} e_F & E_{22} U_{22}^T \end{pmatrix} \\
&\quad=
\begin{pmatrix} U_{00} E_{00} U_{00}^T + \varepsilon_1 \upsilon_{01}^2 e_L e_L^T & \upsilon_{01} \varepsilon_1 e_L & 0 \\ \upsilon_{01} \varepsilon_1 e_L^T & \varepsilon_1 + \upsilon_{21}^2 e_F^T E_{22} e_F & \upsilon_{21} e_F^T E_{22} U_{22}^T \\ 0 & \upsilon_{21} U_{22} E_{22} e_F & U_{22} E_{22} U_{22}^T \end{pmatrix}
=
\begin{pmatrix} A_{00} & \alpha_{01} e_L & 0 \\ \alpha_{01} e_L^T & \alpha_{11} & \alpha_{21} e_F^T \\ 0 & \alpha_{21} e_F & A_{22} \end{pmatrix}. \qquad (16.2)
\end{aligned}
$$
Finally,
$$
\begin{aligned}
&\begin{pmatrix} L_{00} & 0 & 0 \\ \lambda_{10} e_L^T & 1 & \upsilon_{21} e_F^T \\ 0 & 0 & U_{22} \end{pmatrix}
\begin{pmatrix} D_{00} & 0 & 0 \\ 0 & \phi_1 & 0 \\ 0 & 0 & E_{22} \end{pmatrix}
\begin{pmatrix} L_{00} & 0 & 0 \\ \lambda_{10} e_L^T & 1 & \upsilon_{21} e_F^T \\ 0 & 0 & U_{22} \end{pmatrix}^T \\
&\quad=
\begin{pmatrix} L_{00} & 0 & 0 \\ \lambda_{10} e_L^T & 1 & \upsilon_{21} e_F^T \\ 0 & 0 & U_{22} \end{pmatrix}
\begin{pmatrix} D_{00} L_{00}^T & \lambda_{10} D_{00} e_L & 0 \\ 0 & \phi_1 & 0 \\ 0 & \upsilon_{21} E_{22} e_F & E_{22} U_{22}^T \end{pmatrix} \\
&\quad=
\begin{pmatrix} L_{00} D_{00} L_{00}^T & \lambda_{10} L_{00} D_{00} e_L & 0 \\ \lambda_{10} e_L^T D_{00} L_{00}^T & \lambda_{10}^2 e_L^T D_{00} e_L + \phi_1 + \upsilon_{21}^2 e_F^T E_{22} e_F & \upsilon_{21} e_F^T E_{22} U_{22}^T \\ 0 & \upsilon_{21} U_{22} E_{22} e_F & U_{22} E_{22} U_{22}^T \end{pmatrix}
=
\begin{pmatrix} A_{00} & \alpha_{01} e_L & 0 \\ \alpha_{01} e_L^T & \alpha_{11} & \alpha_{21} e_F^T \\ 0 & \alpha_{21} e_F & A_{22} \end{pmatrix}. \qquad (16.3)
\end{aligned}
$$

Matching the off-diagonal and corner blocks of (16.1) and (16.2) with the corresponding blocks of (16.3) shows that the first and last block rows and columns of (16.3) automatically equal those of $A$. Matching the middle elements of (16.1), (16.2), and (16.3) with $\alpha_{11}$ tells us that
$$\alpha_{11} = \lambda_{10}^2 e_L^T D_{00} e_L + \delta_1, \qquad \alpha_{11} = \upsilon_{21}^2 e_F^T E_{22} e_F + \varepsilon_1, \qquad \alpha_{11} = \lambda_{10}^2 e_L^T D_{00} e_L + \phi_1 + \upsilon_{21}^2 e_F^T E_{22} e_F,$$
or
$$\alpha_{11} - \delta_1 = \lambda_{10}^2 e_L^T D_{00} e_L \qquad \text{and} \qquad \alpha_{11} - \varepsilon_1 = \upsilon_{21}^2 e_F^T E_{22} e_F,$$
so that
$$\alpha_{11} = ( \alpha_{11} - \delta_1 ) + \phi_1 + ( \alpha_{11} - \varepsilon_1 ).$$
Solving this for $\phi_1$ yields
$$\phi_1 = \delta_1 + \varepsilon_1 - \alpha_{11}.$$
Notice that, given the factorizations $A = L D L^T$ and $A = U E U^T$, the cost of computing the twisted factorization is $O(1)$.

Notice that, given the factorizations A = LDLT and A =UEUT the cost of computing the twisted factor-ization is O(1).

* BACK TO TEXT

Homework 16.6 Compute $x_0$, $\chi_1$, and $x_2$ so that
$$
\begin{pmatrix} L_{00} & 0 & 0 \\ \lambda_{10} e_L^T & 1 & \upsilon_{21} e_F^T \\ 0 & 0 & U_{22} \end{pmatrix}
\begin{pmatrix} D_{00} & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & E_{22} \end{pmatrix}
\begin{pmatrix} L_{00} & 0 & 0 \\ \lambda_{10} e_L^T & 1 & \upsilon_{21} e_F^T \\ 0 & 0 & U_{22} \end{pmatrix}^T
\underbrace{\begin{pmatrix} x_0 \\ \chi_1 \\ x_2 \end{pmatrix}}_{x}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
$$
where $x = \begin{pmatrix} x_0 \\ \chi_1 \\ x_2 \end{pmatrix}$ is not a zero vector. (Hint: consider the vector $\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$.) What is the cost of this computation, given that $L_{00}$ and $U_{22}$ have special structure?

Answer: Choose $x_0$, $\chi_1$, and $x_2$ so that
$$\begin{pmatrix} L_{00} & 0 & 0 \\ \lambda_{10} e_L^T & 1 & \upsilon_{21} e_F^T \\ 0 & 0 & U_{22} \end{pmatrix}^T \begin{pmatrix} x_0 \\ \chi_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$$
or, equivalently,
$$\begin{pmatrix} L_{00}^T & \lambda_{10} e_L & 0 \\ 0 & 1 & 0 \\ 0 & \upsilon_{21} e_F & U_{22}^T \end{pmatrix} \begin{pmatrix} x_0 \\ \chi_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}.$$
We conclude that $\chi_1 = 1$ (so that $x$ is not a zero vector) and
$$L_{00}^T x_0 = - \lambda_{10} e_L, \qquad U_{22}^T x_2 = - \upsilon_{21} e_F,$$
so that $x_0$ and $x_2$ can be computed via solves with a bidiagonal upper triangular matrix ($L_{00}^T$) and a bidiagonal lower triangular matrix ($U_{22}^T$), respectively.

Then
$$
\begin{pmatrix} L_{00} & 0 & 0 \\ \lambda_{10} e_L^T & 1 & \upsilon_{21} e_F^T \\ 0 & 0 & U_{22} \end{pmatrix}
\begin{pmatrix} D_{00} & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & E_{22} \end{pmatrix}
\begin{pmatrix} L_{00} & 0 & 0 \\ \lambda_{10} e_L^T & 1 & \upsilon_{21} e_F^T \\ 0 & 0 & U_{22} \end{pmatrix}^T
\begin{pmatrix} x_0 \\ \chi_1 \\ x_2 \end{pmatrix}
=
\begin{pmatrix} L_{00} & 0 & 0 \\ \lambda_{10} e_L^T & 1 & \upsilon_{21} e_F^T \\ 0 & 0 & U_{22} \end{pmatrix}
\begin{pmatrix} D_{00} & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & E_{22} \end{pmatrix}
\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}
=
\begin{pmatrix} L_{00} & 0 & 0 \\ \lambda_{10} e_L^T & 1 & \upsilon_{21} e_F^T \\ 0 & 0 & U_{22} \end{pmatrix}
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
$$

Now, consider solving $L_{00}^T x_0 = - \lambda_{10} e_L$:
$$\begin{pmatrix}
\times & \times & 0 & 0 & 0 & 0 \\
0 & \times & \times & 0 & 0 & 0 \\
0 & 0 & \times & \times & 0 & 0 \\
0 & 0 & 0 & \times & \times & 0 \\
0 & 0 & 0 & 0 & \times & \times \\
0 & 0 & 0 & 0 & 0 & \times
\end{pmatrix} x_0 =
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ -\lambda_{10} \end{pmatrix}.$$
A moment of reflection shows that if $L_{00}$ is $k \times k$, then solving $L_{00}^T x_0 = - \lambda_{10} e_L$ requires $O(k)$ flops. Similarly, since $U_{22}$ is then $(n-k-1) \times (n-k-1)$, solving $U_{22}^T x_2 = - \upsilon_{21} e_F$ requires $O(n-k-1)$ flops. Thus, computing $x$ requires $O(n)$ flops.

Thus:

• Given a tridiagonal $A$, computing all eigenvalues requires $O(n^2)$ computation. (We have not discussed this in detail...)

• Then, for each eigenvalue $\lambda$:

  – Computing $A := A - \lambda I$ requires $O(n)$ flops.

  – Factoring $A = L D L^T$ requires $O(n)$ flops.

  – Factoring $A = U E U^T$ requires $O(n)$ flops.

  – Computing all $\phi_1$, so that the smallest can be chosen, requires $O(n)$ flops.

  – Computing the eigenvector of $A$ associated with $\lambda$ from the twisted factorization requires $O(n)$ computation.

Thus, computing all eigenvalues and eigenvectors of a tridiagonal matrix via this method requires $O(n^2)$ computation. This is much better than computing them via the tridiagonal QR algorithm.

* BACK TO TEXT


Chapter 16. Notes on Computing the SVD (Answers)

Homework 16.2 If $A = U \Sigma V^T$ is the SVD of $A$ then $A^T A = V \Sigma^2 V^T$ is the Spectral Decomposition of $A^T A$.

Answer: $A^T A = ( U \Sigma V^T )^T U \Sigma V^T = V \Sigma U^T U \Sigma V^T = V \Sigma^2 V^T$. Since $V$ is unitary and $\Sigma^2$ is diagonal, this is a Spectral Decomposition of $A^T A$.

* BACK TO TEXT

Homework 16.3

Homework 16.4 Give the approximate total cost for reducing $A \in \mathbb{R}^{n \times n}$ to bidiagonal form.

* BACK TO TEXT


Chapter 17. Notes on Splitting Methods (Answers)

Homework 17.10 Prove the above theorem.

Answer: Let $\lambda$ be an eigenvalue of $A$ such that $|\lambda| = \rho(A)$. Then
$$\| A \| = \max_{\| y \| = 1} \| A y \| \;\ge\; \max_{\substack{\| x \| = 1 \\ A x = \lambda x}} \| A x \| = \max_{\substack{\| x \| = 1 \\ A x = \lambda x}} \| \lambda x \| = \max_{\substack{\| x \| = 1 \\ A x = \lambda x}} |\lambda| \, \| x \| = |\lambda| = \rho(A).$$

* BACK TO TEXT


Appendix A. How to Download

Videos associated with these notes can be viewed in one of three ways:

• * YouTube links to the video uploaded to YouTube.

• * Download from UT Box links to the video uploaded to UT-Austin's "UT Box" file sharing service.

• * View After Local Download links to the video downloaded to your own computer, in the directory Video within the same directory in which you stored this document. You can download videos by first clicking on * Download from UT Box. Alternatively, visit the UT Box directory in which the videos are stored and download some or all.



Index

absolute value, 4
axpy, 8
    cost, 9, 11
blocked matrix-matrix multiplication, 20
bordered Cholesky factorization algorithm, 204
CGS, 59
Cholesky factorization
    bordered algorithm, 204
    other algorithm, 204
Classical Gram-Schmidt, 59
code skeleton, 77
complete pivoting
    LU factorization, 174
complex conjugate, 4
complex scalar
    absolute value, 4
condition number, 54
conjugate, 4
    complex, 4
    matrix, 4
    scalar, 4
    vector, 4
dot, 9
dot product, 9
"dot" product, 10
eigenpair
    definition, 217
eigenvalue, 215-221
    definition, 217
eigenvector, 215-221
    definition, 217
FLAME
    API, 73-81
    notation, 75
FLAME API, 73-81
FLAME notation, 12
floating point operations, 8
flop, 8
Gauss transform, 162-164
Gaussian elimination, 159-193
gemm, 16
    cost, 19
gemv, 11
    cost, 12
ger, 14
    cost, 16
Gershgorin Disc Theorem, 309
Gram-Schmidt
    Classical, 59
    cost, 70
    Modified, 64
Gram-Schmidt orthogonalization
    implementation, 75
Gram-Schmidt QR factorization, 57-72
Hermitian transpose
    vector, 5
Householder transformation, 85
inner product, 9
invariant subspace, 329
inverse
    Moore-Penrose generalized, 50
    pseudo, 50
LAFF Notes, iv
linear system solve, 175
    triangular, 175-178
low-rank approximation, 50
lower triangular solve, 175-178
LU decomposition, 161
LU factorization, 159-193
    complete pivoting, 174
    cost, 164-165
    derivation, 161-162
    existence, 161
    existence proof, 173-174
    partial pivoting, 165-172
        algorithm, 167
MAC, 19
Matlab, 75
matrix
    condition number, 54
    conjugate, 4
    low-rank approximation, 50
    orthogonal, 35-55
    orthonormal, 37
    permutation, 165
    spectrum, 217
    transpose, 4-7
    unitary, 37
matrix-matrix multiplication, 16
    blocked, 20
    cost, 19
    element-by-element, 17
    via matrix-vector multiplications, 18
    via rank-1 updates, 18
    via row-vector times matrix multiplications, 18
matrix-matrix operation
    gemm, 16
    matrix-matrix multiplication, 16
matrix-matrix product, 16
matrix-vector multiplication, 11
    algorithm, 13
    by columns, 13
    by rows, 13
    cost, 12
    via axpy, 13
    via dot, 13
matrix-vector operation, 11
    gemv, 11
    ger, 14
    matrix-vector multiplication, 11
    rank-1 update, 13
matrix-vector product, 11
MGS, 64
Modified Gram-Schmidt, 64
Moore-Penrose generalized inverse, 50
multiplication
    matrix-matrix, 16
        blocked, 20
        cost, 19
        element-by-element, 17
        via matrix-vector multiplications, 18
        via rank-1 updates, 18
        via row-vector times matrix multiplications, 18
    matrix-vector, 11
        cost, 12
multiply-accumulate, 19
notation
    FLAME, 12
Octave, 75
orthonormal basis, 57
outer product, 13
partial pivoting
    LU factorization, 165-172
permutation
    matrix, 165
preface, ii
product
    dot, 9
    inner, 9
    matrix-matrix, 16
    matrix-vector, 11
    outer, 13
projection
    onto column space, 49
pseudo inverse, 50
QR factorization, 61
    Gram-Schmidt, 57-72
rank-1 update, 13
    algorithm, 15
    by columns, 15
    by rows, 15
    cost, 16
    via axpy, 15
    via dot, 15
reflector, 85
residual, 303
scal, 7
    cost, 8
scaled vector addition, 8
    cost, 9, 11
scaling
    vector
        cost, 8
singular value, 41
Singular Value Decomposition, 35-55
    geometric interpretation, 41
    reduced, 45
    theorem, 41
solve
    triangular, 175-178
Spark webpage, 75
spectral radius
    definition, 217
spectrum
    definition, 217
transpose, 4-7
    matrix, 5
    vector, 4
triangular solve, 175
    lower, 175-178
    upper, 178
triangular system solve, 178
upper triangular solve, 178
vector
    complex conjugate, 4
    conjugate, 4
    Hermitian transpose, 5
    orthogonal, 37
    orthonormal, 37
    perpendicular, 37
    scaling
        cost, 8
    transpose, 4
vector addition
    scaled, 8
vector-vector operation, 7
    axpy, 8
    dot, 9
    scal, 7