Functions of a Matrix: Theory and Computation Nick Higham School of Mathematics The University of Manchester [email protected] http://www.ma.man.ac.uk/~higham/ Landscapes in Mathematical Science, University of Bath, November 24 2006


Functions of a Matrix:

Theory and Computation

Nick Higham

School of Mathematics

The University of Manchester

[email protected]

http://www.ma.man.ac.uk/~higham/

Landscapes in Mathematical Science, University of Bath, November 24 2006


Outline

1 Definition of f (A)

2 Motivation and MATLAB

3 $e^A$ and its Fréchet derivative

4 $A^{1/2}$: Modified Newton Methods

MIMS Nick Higham Functions of a Matrix 2 / 45


Function of a Matrix

$f : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ for an underlying scalar function $f$.

These are not matrix functions:

trace(A), det(A).

The adjugate (or adjoint) matrix.

Transfer function $f(t) = B(tI - A)^{-1}C$.

$\sin A = (\sin a_{ij})$.

These are matrix functions:

$e^A = I + A + \dfrac{A^2}{2!} + \cdots$.

$\log(I + A) = A - \dfrac{A^2}{2} + \dfrac{A^3}{3} - \cdots$, $\rho(A) < 1$.

$A^{-1}$, $A^{1/2}$.
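
To make the distinction concrete, the sketch below (Python with NumPy, which the talk itself does not use) contrasts the elementwise exponential with the matrix exponential for a nilpotent matrix. The helper name `expm_taylor` and the fixed number of series terms are illustrative assumptions; summing the series directly is only reasonable for small, well-scaled matrices.

```python
import numpy as np

def expm_taylor(A, terms=30):
    """Sum the series I + A + A^2/2! + ... (illustration only)."""
    A = np.asarray(A, dtype=float)
    X = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k      # A^k / k!
        X = X + term
    return X

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])       # A^2 = 0, so e^A = I + A exactly

elementwise = np.exp(A)          # applies exp to each entry: (exp a_ij)
matrix_fun = expm_taylor(A)      # the matrix function e^A
```

Here `matrix_fun` is `I + A`, while `elementwise` has `exp(0) = 1` in every zero position, so the two results differ.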


Multiplicity of Definitions

There have been proposed in the literature since 1880

eight distinct definitions of a matric function,

by Weyr, Sylvester and Buchheim,

Giorgi, Cartan, Fantappiè, Cipolla,

Schwerdtfeger and Richter.

— R. F. Rinehart, The Equivalence of Definitions of a Matric Function,

Amer. Math. Monthly (1955)


Jordan Canonical Form

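$$Z^{-1}AZ = J = \operatorname{diag}(J_1, \dots, J_p), \qquad J_k = \begin{bmatrix} \lambda_k & 1 & & \\ & \lambda_k & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_k \end{bmatrix} \in \mathbb{C}^{m_k\times m_k}.$$

Definition

$$f(A) = Z f(J) Z^{-1} = Z \operatorname{diag}(f(J_k)) Z^{-1},$$

$$f(J_k) = \begin{bmatrix} f(\lambda_k) & f'(\lambda_k) & \cdots & \dfrac{f^{(m_k-1)}(\lambda_k)}{(m_k-1)!} \\ & f(\lambda_k) & \ddots & \vdots \\ & & \ddots & f'(\lambda_k) \\ & & & f(\lambda_k) \end{bmatrix}.$$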


The Formula for f (Jk)

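Write $J_k = \lambda_k I + E_k \in \mathbb{C}^{m_k\times m_k}$. For $m_k = 3$ we have

$$E_k = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \quad E_k^2 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad E_k^3 = 0.$$

Assume $f$ has Taylor expansion

$$f(t) = f(\lambda_k) + f'(\lambda_k)(t - \lambda_k) + \cdots + \frac{f^{(j)}(\lambda_k)(t - \lambda_k)^j}{j!} + \cdots.$$

Then

$$f(J_k) = f(\lambda_k) I + f'(\lambda_k) E_k + \cdots + \frac{f^{(m_k-1)}(\lambda_k) E_k^{m_k-1}}{(m_k-1)!}.$$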
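
As a numerical check of this formula, the following sketch builds $f(J_k)$ for $f = \exp$ on a $3\times 3$ Jordan block and compares it with a directly summed power series. It is written in Python with NumPy; the function name `f_jordan_block` and the choice $\lambda_k = 0.5$ are assumptions made for illustration.

```python
import math
import numpy as np

def f_jordan_block(derivs, m):
    """Build f(J_k) from derivs[j] = f^(j)(lambda_k), j = 0..m-1."""
    F = np.zeros((m, m))
    for j in range(m):
        # E_k^j has ones on the j-th superdiagonal
        F += derivs[j] / math.factorial(j) * np.eye(m, k=j)
    return F

lam, m = 0.5, 3
J = lam * np.eye(m) + np.eye(m, k=1)          # J_k = lambda_k I + E_k
F = f_jordan_block([math.exp(lam)] * m, m)    # every derivative of exp is exp

# Cross-check against the power series for e^J, summed directly
S, term = np.eye(m), np.eye(m)
for k in range(1, 40):
    term = term @ J / k
    S += term
```

The two matrices `F` and `S` should agree to rounding error, since the Taylor series of $f$ about $\lambda_k$ terminates when applied to the nilpotent $E_k$.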


Interpolation

Definition (Sylvester, 1883; Buchheim, 1886)

Distinct eigenvalues $\lambda_1, \dots, \lambda_s$, $n_i$ = max size of Jordan blocks for $\lambda_i$. Then $f(A) = r(A)$, where $r$ is the unique Hermite interpolating polynomial of degree $< \sum_{i=1}^{s} n_i$ satisfying

$$r^{(j)}(\lambda_i) = f^{(j)}(\lambda_i), \quad j = 0\colon n_i - 1, \quad i = 1\colon s.$$


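Example. Let $f(t) = t^{1/2}$, $A = \begin{bmatrix} 2 & 2 \\ 1 & 3 \end{bmatrix}$, $\lambda(A) = \{1, 4\}$.

Taking positive square roots,

$$r(t) = f(1)\,\frac{t - 4}{1 - 4} + f(4)\,\frac{t - 1}{4 - 1} = \frac{1}{3}(t + 2)$$

$$\Rightarrow\ A^{1/2} = r(A) = \frac{1}{3}(A + 2I) = \frac{1}{3}\begin{bmatrix} 4 & 2 \\ 1 & 5 \end{bmatrix}.$$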
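
The example is easy to confirm numerically. This short check, in Python with NumPy (not part of the talk), evaluates the interpolating polynomial at $A$ and verifies that the result squares back to $A$ and has positive eigenvalues.

```python
import numpy as np

A = np.array([[2.0, 2.0],
              [1.0, 3.0]])
X = (A + 2 * np.eye(2)) / 3      # r(A) with r(t) = (t + 2)/3

residual = np.linalg.norm(X @ X - A)             # should be at rounding level
eigs = np.sort(np.linalg.eigvals(X).real)        # square roots of 1 and 4
```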


Cauchy Integral Theorem

Definition

$$f(A) = \frac{1}{2\pi i} \int_\Gamma f(z)(zI - A)^{-1}\, dz,$$

where $f$ is analytic on and inside a closed contour $\Gamma$ that encloses $\lambda(A)$.
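
One way to see the Cauchy definition in action is to discretize the contour integral. The sketch below assumes a circular contour and applies the trapezoidal rule in the angle; the function name `cauchy_funm`, the circle parameters and the number of nodes are illustrative choices, not something the talk prescribes.

```python
import numpy as np

def cauchy_funm(f, A, center, radius, n_points=64):
    """Approximate (1/(2 pi i)) * contour integral of f(z) (zI - A)^{-1} dz
    over a circle, using the trapezoidal rule in the angle."""
    A = np.asarray(A, dtype=complex)
    I = np.eye(A.shape[0])
    total = np.zeros_like(A)
    for k in range(n_points):
        w = radius * np.exp(2j * np.pi * k / n_points)   # w = z - center
        z = center + w
        # dz = i * w * dtheta, so the 1/(2 pi i) factor leaves a plain average
        total += f(z) * w * np.linalg.inv(z * I - A)
    return total / n_points

A = np.array([[2.0, 2.0],
              [1.0, 3.0]])                  # eigenvalues 1 and 4
F = cauchy_funm(np.exp, A, center=2.5, radius=3.0)

# Reference from the interpolating polynomial through (1, e) and (4, e^4)
I2 = np.eye(2)
F_ref = (np.e * (4 * I2 - A) + np.e**4 * (A - I2)) / 3
```

The circle of radius 3 about 2.5 encloses both eigenvalues, and the real part of `F` should match `F_ref` closely while the imaginary part stays at rounding level.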


Equivalence of Definitions

Theorem

The three definitions are equivalent, modulo analyticity

assumption for Cauchy.

Interpolation: for basic properties.

JCF: for solving matrix equations (e.g., $X^2 = A$, $e^X = A$). For evaluation (normal $A$).

Cauchy: various uses.


For computation:

Use the definitions (with care).

Schur decomposition for general f .

Methods specific to particular f .


Toolbox of Matrix Functions

Want to have techniques for evaluating interesting f at

matrix args as well as scalar args.

Example:

$$\frac{d^2y}{dt^2} + Ay = 0, \quad y(0) = y_0, \quad y'(0) = y'_0$$

has solution

$$y(t) = \cos(\sqrt{A}\,t)\,y_0 + (\sqrt{A})^{-1}\sin(\sqrt{A}\,t)\,y'_0,$$

where $\sqrt{A}$ is any square root of $A$.

MATLAB has expm, logm, sqrtm, funm.


Classic MATLAB

< M A T L A B >

Version of 01/10/84

HELP is available

<>

help

Type HELP followed by

INTRO (To get started)

NEWS (recent revisions)

ABS ANS ATAN BASE CHAR CHOL CHOP CLEA COND CONJ COS

DET DIAG DIAR DISP EDIT EIG ELSE END EPS EXEC EXIT

EXP EYE FILE FLOP FLPS FOR FUN HESS HILB IF IMAG

INV KRON LINE LOAD LOG LONG LU MACR MAGI NORM ONES

ORTH PINV PLOT POLY PRIN PROD QR RAND RANK RCON RAT

REAL RETU RREF ROOT ROUN SAVE SCHU SHOR SEMI SIN SIZE

SQRT STOP SUM SVD TRIL TRIU USER WHAT WHIL WHO WHY

< > ( ) = . , ; \ / ’ + - * :


Classic MATLAB

<>

help fun

FUN For matrix arguments X , the functions SIN, COS, ATAN,

SQRT, LOG, EXP and X**p are computed using eigenvalues D

and eigenvectors V . If <V,D> = EIG(X) then f(X) =

V*f(D)/V . This method may give inaccurate results if V

is badly conditioned. Some idea of the accuracy can be

obtained by comparing X**1 with X .

For vector arguments, the function is applied to each

component.

The availability of [FUN] in early versions of MATLAB

quite possibly contributed to

the system’s technical and commercial success.

— Cleve Moler (2003)
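
The eigenvalue-based recipe in the FUN help text, f(X) = V*f(D)/V, can be sketched in a few lines. The version below is in Python with NumPy rather than classic MATLAB; the name `fun_eig` and the two test matrices are assumptions for illustration. It also mimics the suggested accuracy check of comparing X**1 with X.

```python
import numpy as np

def fun_eig(f, X):
    """f(X) = V f(D) / V from an eigendecomposition X = V D V^{-1}."""
    d, V = np.linalg.eig(X)
    # (V * f(d)) scales column j of V by f(d_j); solving with V.T applies "/ V"
    return np.linalg.solve(V.T, (V * f(d)).T).T

identity = lambda d: d

# Well-conditioned eigenvectors: X**1 reproduces X to rounding error
X_good = np.array([[2.0, 2.0],
                   [1.0, 3.0]])
err_good = np.linalg.norm(fun_eig(identity, X_good) - X_good)

# Nearly defective matrix: V is badly conditioned, so X**1 may lose accuracy
X_bad = np.array([[1.0, 1.0],
                  [0.0, 1.0 + 1e-10]])
err_bad = np.linalg.norm(fun_eig(identity, X_bad) - X_bad)
cond_V = np.linalg.cond(np.linalg.eig(X_bad)[1])
```

Comparing `err_bad` with `err_good`, alongside `cond_V`, gives a feel for the warning in the help text that the method may be inaccurate when V is badly conditioned.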


Matrix Exponential

Large literature.

Early exposition in Frazer, Duncan & Collar.

Elementary Matrices and Some Applications to Dynamics and Differential Equations. CUP, 1938.

Moler & Van Loan.

Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later, SIAM Rev., 45 (2003).

Over 500 citations on ISI Citation Index.


Scaling and Squaring Method

◮ $B \leftarrow A/2^s$ so $\|B\|_\infty \approx 1$

◮ $r_m(B)$ = $[m/m]$ Padé approximant to $e^B$

◮ $X = r_m(B)^{2^s} \approx e^A$

Used by expm in MATLAB.

Originates with Lawson (1967).

Ward (1977): algorithm, with rounding error analysis

and a posteriori error bound.

Moler & Van Loan (1978): give backward error

analysis covering truncation error in Padé

approximants, allowing choice of s and m.


Padé Approximations $r_m$ to $e^x$

$r_m(x) = p_m(x)/q_m(x)$ known explicitly:

$$p_m(x) = \sum_{j=0}^{m} \frac{(2m - j)!\, m!}{(2m)!\, (m - j)!}\, \frac{x^j}{j!}$$

and $q_m(x) = p_m(-x)$. Error satisfies

$$e^x - r_m(x) = (-1)^m \frac{(m!)^2}{(2m)!\,(2m + 1)!}\, x^{2m+1} + O(x^{2m+2}).$$
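
The coefficient formula and the leading error term can be checked in exact rational arithmetic. The sketch below uses only the Python standard library; the function names are illustrative assumptions. It relies on the fact that, because $q_m(0) = 1$, the leading coefficient of $q_m(x)e^x - p_m(x)$ equals that of $e^x - r_m(x)$.

```python
from fractions import Fraction
from math import factorial

def pade_exp_coeffs(m):
    """Coefficients of p_m; q_m(x) = p_m(-x) has the same ones with alternating sign."""
    return [Fraction(factorial(2*m - j) * factorial(m),
                     factorial(2*m) * factorial(m - j) * factorial(j))
            for j in range(m + 1)]

def residual_coeff(m, k):
    """Coefficient of x^k in q_m(x) e^x - p_m(x), in exact arithmetic."""
    p = pade_exp_coeffs(m)
    total = Fraction(0)
    for j in range(min(m, k) + 1):
        total += (-1)**j * p[j] * Fraction(1, factorial(k - j))
    if k <= m:
        total -= p[k]
    return total

m = 3
low_order = [residual_coeff(m, k) for k in range(2*m + 1)]   # orders 0..2m
leading = residual_coeff(m, 2*m + 1)                         # order 2m+1
predicted = Fraction((-1)**m * factorial(m)**2,
                     factorial(2*m) * factorial(2*m + 1))
```

All entries of `low_order` vanish and `leading` coincides with `predicted`, which is the statement that $r_m$ matches $e^x$ through order $2m$.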


Analysis

Let

$$e^{-A} r_m(A) = I + G = e^H$$

and assume $\|G\| < 1$. Then

$$\|H\| = \|\log(I + G)\| \le \sum_{j=1}^{\infty} \|G\|^j / j = -\log(1 - \|G\|).$$

Hence

$$r_m(A) = e^A e^H = e^{A+H}.$$

Rewrite as

$$r_m(A/2^s)^{2^s} = e^{A+E},$$

where $E = 2^s H$ satisfies

$$\|E\| \le -2^s \log(1 - \|G\|).$$


Result

Theorem

Let

$$e^{-2^{-s}A}\, r_m(2^{-s}A) = I + G,$$

where $\|G\| < 1$. Then the Padé approximant $r_m$ satisfies

$$r_m(2^{-s}A)^{2^s} = e^{A+E},$$

where

$$\frac{\|E\|}{\|A\|} \le \frac{-\log(1 - \|G\|)}{\|2^{-s}A\|}.$$

◮ Need to bound $\|G\|$ given $m$ and $\|2^{-s}A\|$.

◮ Then $\|E\|/\|A\|$ bounded in terms of $m$, $s$ and $\|A\|$.

◮ Now select “best” $(s, m)$ pair for given $\|A\|$.


Bounding ‖G‖

$$\rho(x) := e^{-x} r_m(x) - 1 = \sum_{i=2m+1}^{\infty} c_i x^i$$

converges absolutely for $|x| < \min\{\, |t| : q_m(t) = 0 \,\} =: \nu_m$.

Hence, with $\theta := \|2^{-s}A\| < \nu_m$,

$$\|G\| = \|\rho(2^{-s}A)\| \le \sum_{i=2m+1}^{\infty} |c_i|\,\theta^i =: f(\theta). \qquad (*)$$

Thus $\|E\|/\|A\| \le -\log(1 - f(\theta))/\theta$.

◮ If only $\|A\|$ known, $(*)$ is optimal bound on $\|G\|$.

◮ Moler & Van Loan (1978) bound less sharp; Dieci & Papini (2000) bound a different error.


Summary of Computations

Solve “bound = unit roundoff” by summing 150 terms of

series in 250 digit arithmetic.

Work out cost of evaluating rm(B) and of the squaring

phase.

Minimize cost subject to retaining numerical stability in

evaluation of rm(B).

Precision      m     $\theta_m$

IEEE single    7     3.9

IEEE double    13    5.4

IEEE quad      17    3.3


Scaling and Squaring Algorithm for $e^A$

Algorithm 1 (H, 2005; MATLAB 7.2, Mathematica 5.1)

1 for m = [3 5 7 9]

2     if $\|A\|_1 \le \theta_m$

3         $X = r_m(A)$.

4         quit

5     end

6 end

7 $A \leftarrow A/2^s$ with $s \ge 0$ minimal s.t. $\|A/2^s\|_1 \le \theta_{13} = 5.4$

8 $A_2 = A^2$, $A_4 = A_2^2$, $A_6 = A_2 A_4$

9 $U = A\,[A_6(b_{13}A_6 + b_{11}A_4 + b_9A_2) + b_7A_6 + b_5A_4 + b_3A_2 + b_1I]$

10 $V = A_6(b_{12}A_6 + b_{10}A_4 + b_8A_2) + b_6A_6 + b_4A_4 + b_2A_2 + b_0I$

11 Solve $(-U + V)\,r_{13} = U + V$ for $r_{13}$.

12 $X = r_{13}^{2^s}$ by repeated squaring.
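
Steps 7 to 12 of Algorithm 1 can be sketched as follows in Python with NumPy. This is a simplified illustration under stated assumptions, not the MATLAB implementation: it always uses the degree-13 approximant and omits the early exit for m = 3, 5, 7, 9 in steps 1 to 6. The coefficients $b_j$ are taken to be those of $p_{13}$ from the explicit Padé formula, and the names `pade13_coeffs` and `expm_ss13` are hypothetical.

```python
import math
import numpy as np

THETA_13 = 5.4  # threshold quoted in the talk for IEEE double precision

def pade13_coeffs():
    """b_j = (2m-j)! m! / ((2m)! (m-j)! j!) for m = 13."""
    m = 13
    return [math.factorial(2*m - j) * math.factorial(m) /
            (math.factorial(2*m) * math.factorial(m - j) * math.factorial(j))
            for j in range(m + 1)]

def expm_ss13(A):
    """Scaling and squaring with the [13/13] Pade approximant (sketch)."""
    A = np.asarray(A, dtype=float)
    I = np.eye(A.shape[0])
    norm1 = np.linalg.norm(A, 1)
    s = 0
    if norm1 > THETA_13:
        s = int(math.ceil(math.log2(norm1 / THETA_13)))  # minimal s >= 0
    B = A / 2**s
    b = pade13_coeffs()
    B2 = B @ B
    B4 = B2 @ B2
    B6 = B2 @ B4
    # Odd-degree terms go into U, even-degree terms into V
    U = B @ (B6 @ (b[13]*B6 + b[11]*B4 + b[9]*B2)
             + b[7]*B6 + b[5]*B4 + b[3]*B2 + b[1]*I)
    V = (B6 @ (b[12]*B6 + b[10]*B4 + b[8]*B2)
         + b[6]*B6 + b[4]*B4 + b[2]*B2 + b[0]*I)
    X = np.linalg.solve(V - U, V + U)   # r_13(B) = q_13(B)^{-1} p_13(B)
    for _ in range(s):
        X = X @ X                       # undo the scaling by repeated squaring
    return X

X = expm_ss13(np.array([[0.0, 1.0],
                        [0.0, 0.0]]))   # exact answer is I + A
```

Splitting the polynomial into odd and even parts means the numerator $U + V$ and denominator $-U + V$ share the same matrix products, which is why only $A_2$, $A_4$ and $A_6$ need to be formed.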


Comparison with Existing Algorithms

Method             m     max $\|2^{-s}A\|$

Alg 1              13    5.4

Ward (1977)        8     1.0   [$\theta_8 = 1.5$]

Old MATLAB expm    6     0.5   [$\theta_6 = 0.54$]

Sidje (1998)       6     0.5

◮ $\|A\|_1 > 1$: Alg 1 requires 1–2 fewer mat mults than Ward, 2–3 fewer than expm.

◮ $\|A\|_1 \in (2, 2.1)$:

         Alg 1   Ward   expm   Sidje

mults    5       7      8      10


Squaring Phase

◮ The bound

$$\|A^2 - fl(A^2)\| \le \gamma_n \|A\|^2, \qquad \gamma_n = \frac{nu}{1 - nu}$$

shows the dangers in matrix squaring.

◮ Open question: are errors in squaring phase

consistent with conditioning of the problem?

◮ Our choice of parameters uses 1–5 fewer matrix

squarings than previous algorithms. Hence has

potential accuracy advantages.


Numerical Experiment

◮ 70 test matrices, dimension 2–10.

◮ Evaluated 1-norm relative error $\|X - \widehat{X}\|_1 / \|X\|_1$.

◮ Notation:

◮ expm: Alg 1 (MATLAB 7.2).

◮ old_expm: MATLAB 7.1.

◮ funm: MATLAB 7.

◮ padm: Sidje.

◮ $\operatorname{cond}(A) = \lim_{\epsilon \to 0}\ \max_{\|E\|_2 \le \epsilon \|A\|_2} \dfrac{\|e^{A+E} - e^A\|_2}{\epsilon\,\|e^A\|_2}$.


Different S&S Codes and funm

[Figure: errors of old_expm, padm, funm and expm (Alg 1) over the 70 test matrices, on a logarithmic axis from $10^{-18}$ to $10^{-6}$, together with cond*u.]


Performance Profiles

Dolan & Moré (2002) propose a new type of performance

profile.

For the given set of solvers and test problems, plot

x-axis: $\alpha$

y-axis: probability that solver has error within factor $\alpha$ of smallest error over all solvers on the test set.
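
A performance profile of this kind is straightforward to compute from a table of errors. The sketch below, in Python with NumPy, uses made-up error values for two hypothetical solvers; the function name `performance_profile` and the data are assumptions for illustration only and are not results from the talk.

```python
import numpy as np

def performance_profile(errors, alphas):
    """errors[i, j] = error of solver j on problem i (all positive).
    Returns p[a, j] = fraction of problems on which solver j is within
    a factor alphas[a] of the smallest error on that problem."""
    errors = np.asarray(errors, dtype=float)
    ratios = errors / errors.min(axis=1, keepdims=True)
    return np.array([(ratios <= a).mean(axis=0) for a in alphas])

# Hypothetical errors for 4 problems and 2 solvers (made-up numbers)
errors = np.array([[1e-16, 3e-16],
                   [2e-15, 1e-15],
                   [1e-14, 1e-14],
                   [5e-16, 4e-15]])
profile = performance_profile(errors, alphas=[1.0, 2.5, 5.0, 10.0])
```

Each column of `profile` is one solver's curve: the value at $\alpha = 1$ is the fraction of problems on which that solver was best (or tied), and the curve rises to 1 as $\alpha$ grows.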


Performance Profile

[Figure: performance profile, $p$ against $\alpha$ for $\alpha$ from 1 to 5; curves for expm (Alg 1), padm, funm and old_expm.]


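Fréchet Derivative

Fréchet derivative of $f : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ at $X \in \mathbb{C}^{n\times n}$:

A linear mapping $L : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ s.t. for all $E \in \mathbb{C}^{n\times n}$

$$f(X + E) - f(X) - L(X, E) = o(\|E\|).$$

Example. For $f(X) = X^2$ we have

$$f(X + E) - f(X) = XE + EX + E^2,$$

so $L(X, E) = XE + EX$.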
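
The defining property can be observed numerically for $f(X) = X^2$. The sketch below, in Python with NumPy, uses random test matrices (an arbitrary choice) and measures the remainder $f(X + tE) - f(X) - L(X, tE)$ for two values of $t$.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4))
E = rng.standard_normal((4, 4))

L = X @ E + E @ X                      # Fréchet derivative of X^2 in direction E

def remainder(t):
    """Norm of f(X + tE) - f(X) - L(X, tE) for f(X) = X^2."""
    Xt = X + t * E
    return np.linalg.norm(Xt @ Xt - X @ X - t * L)

# The remainder is t^2 E^2, so shrinking t by 10 shrinks it by about 100
r1, r2 = remainder(1e-2), remainder(1e-3)
```

The ratio `r1 / r2` should be close to 100, confirming that the remainder is second order in $E$ and hence $o(\|E\|)$.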


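Fréchet Derivative of $e^A$

$$L(A, E) = \int_0^1 e^{A(1-s)} E e^{As}\, ds.$$

$\|L(A)\| := \max_{E \ne 0} \|L(A, E)\| / \|E\|$.

Condition no. of the exponential: $\kappa_{\exp}(A) = \dfrac{\|L(A)\|\,\|A\|}{\|e^A\|}$.

$\|L(A)\| \ge \|L(A, I)\| = \|e^A\| \ \Rightarrow\ \kappa_{\exp}(A) \ge \|A\|$.

Theorem

If $A \in \mathbb{C}^{n\times n}$ is normal then in the 2-norm, $\kappa_{\exp}(A) = \|A\|_2$.

If $A \in \mathbb{R}^{n\times n}$ is a nonnegative scalar multiple of a stochastic matrix then in the $\infty$-norm, $\kappa_{\exp}(A) = \|A\|_\infty$.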


Quadrature

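Repeated trap rule:

$$\int_0^1 f(t)\, dt \approx \frac{1}{m}\Bigl(\tfrac{1}{2} f_0 + f_1 + f_2 + \cdots + f_{m-1} + \tfrac{1}{2} f_m\Bigr), \qquad f_i := f(i/m)$$

gives

$$L(A, E) \approx \frac{1}{m}\Bigl(\tfrac{1}{2} e^A E + \sum_{i=1}^{m-1} e^{A(1 - i/m)} E e^{Ai/m} + \tfrac{1}{2} E e^A\Bigr).$$

Requires $e^{A/m}, e^{2A/m}, \dots, e^{(m-1)A/m}, e^A$.

Lemma

Consider $R_1(A, E) = \sum_{i=1}^{p} w_i\, e^{A(1 - t_i)} E e^{A t_i}$ and denote its $m$-times repeated form by $R_m(A, E)$. If

$$Q_s = R_1(2^{-s}A, 2^{-s}E),$$

$$Q_{i-1} = e^{2^{-i}A} Q_i + Q_i e^{2^{-i}A}, \quad i = s\colon -1\colon 1,$$

then $Q_0 = R_{2^s}(A, E)$.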


Scaling and Squaring

Recall $L_{x^2}(A, E) = AE + EA$.

Applying chain rule to $e^A = (e^{A/2})^2$ gives

$$L_{\exp}(A, E) = L_{x^2}\bigl(e^{A/2}, L_{\exp}(A/2, E/2)\bigr) = e^{A/2} L_{\exp}(A/2, E/2) + L_{\exp}(A/2, E/2)\, e^{A/2}.$$

Recurrence for $L_0 = L_{\exp}(A, E)$:

$$L_s = L_{\exp}(2^{-s}A, 2^{-s}E),$$

$$L_{i-1} = e^{2^{-i}A} L_i + L_i e^{2^{-i}A}, \quad i = s\colon -1\colon 1.$$


Quadrature Algorithm

Algorithm (Kenney & Laub, 1998; Mathias, 1993)

Approximates $L = L_{\exp}(A, E)$ via repeated Simpson rule; order of magnitude estimate.

1 $B = A/2^s$ with $s \ge 0$ minimal s.t. $\|A/2^s\|_1 \le 1/2$

2 $X = e^B$

3 $\widetilde{X} = e^{B/2}$

4 $Q_s = 2^{-s}(XE + 4\widetilde{X}E\widetilde{X} + EX)/6$

5 for $i = s\colon -1\colon 1$

6     if $i < s$, $X = e^{2^{-i}A}$, end

7     $Q_{i-1} = XQ_i + Q_iX$

8 end

9 $L = Q_0$
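
The quadrature algorithm can be sketched in Python with NumPy as below. Several details are assumptions made to keep the example self-contained: the exponentials of the scaled matrices are computed with a truncated power series (`expm_taylor`, adequate only because the scaled norm is small), the midpoint exponential is named `X_half`, and the exponentials needed inside the loop are obtained by squaring the previous one, which the algorithm statement leaves unspecified.

```python
import math
import numpy as np

def expm_taylor(A, terms=30):
    """Truncated power series for e^A; adequate here because ||A|| is small."""
    X = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        X = X + term
    return X

def frechet_exp_simpson(A, E):
    """Order-of-magnitude estimate of L_exp(A, E): Simpson's rule on the
    scaled problem, followed by the squaring recurrence."""
    A = np.asarray(A, dtype=float)
    E = np.asarray(E, dtype=float)
    norm1 = np.linalg.norm(A, 1)
    s = 0
    if norm1 > 0.5:
        s = int(math.ceil(math.log2(norm1 / 0.5)))   # minimal s with ||A/2^s|| <= 1/2
    B = A / 2**s
    X = expm_taylor(B)              # e^B
    X_half = expm_taylor(B / 2)     # e^{B/2}
    Q = 2.0**(-s) * (X @ E + 4 * X_half @ E @ X_half + E @ X) / 6
    for i in range(s, 0, -1):
        if i < s:
            X = X @ X               # e^{2^{-i} A} from e^{2^{-(i+1)} A} (assumed choice)
        Q = X @ Q + Q @ X
    return Q

A = np.array([[0.5, 1.0],
              [-1.0, 0.2]])
E = np.array([[1.0, 0.5],
              [0.0, -1.0]])
L = frechet_exp_simpson(A, E)
```

A central finite difference of the exponential, `(expm_taylor(A + h*E) - expm_taylor(A - h*E)) / (2*h)` for a small `h`, gives an independent estimate against which `L` can be compared.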


Kronecker Formula

$$L(A, E) = \int_0^1 e^{A(1-s)} E e^{As}\, ds.$$


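$A \otimes B = (a_{ij}B) \in \mathbb{C}^{n^2\times n^2}$.

$A \oplus B = A \otimes I_n + I_m \otimes B$.

$e^{A \oplus B} = e^A \otimes e^B$.

$\operatorname{vec}(AXB) = (B^T \otimes A)\operatorname{vec}(X)$.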
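
These identities are easy to try out with NumPy's `np.kron`. The sketch below checks the vec identity on random $3\times 3$ matrices (an arbitrary choice) and forms a Kronecker sum; the helper `vec` stacks columns, matching the usual convention.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
A, B, X = (rng.standard_normal((n, n)) for _ in range(3))

def vec(M):
    """Stack the columns of M into one long vector."""
    return M.reshape(-1, order="F")

lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)        # (B^T kron A) vec(X)

# Kronecker sum, with both identity blocks n-by-n here
ksum = np.kron(A, np.eye(n)) + np.kron(np.eye(n), B)
```

Column-major stacking (`order="F"`) matters: with row-major flattening the roles of `A` and `B.T` in the Kronecker product would be swapped.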


Theorem

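$\operatorname{vec}(L(A, E)) = K(A)\operatorname{vec}(E)$, where

$$K(A) = \tfrac{1}{2}\bigl(e^{A^T} \oplus e^A\bigr)\, \tau\bigl(\tfrac{1}{2}[A^T \oplus (-A)]\bigr) \in \mathbb{C}^{n^2\times n^2},$$

$\tau(x) = \tanh(x)/x$ and $\tfrac{1}{2}\|A^T \oplus (-A)\| < \pi/2$ assumed.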


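Approximating the Fréchet Derivative

$$\bigl[A^T \oplus (-A) - \theta I\bigr]\operatorname{vec}(E) = \bigl[A^T \otimes I - I \otimes A - \theta I\bigr]\operatorname{vec}(E)$$

$$= \operatorname{vec}(EA - AE - \theta E)$$

$$= \operatorname{vec}\bigl(E(A - \theta/2 \cdot I) - (A + \theta/2 \cdot I)E\bigr).$$

Obtain Padé approximant

$$\tau(x) = \tanh(x)/x \approx r_m(x) = \prod_{i=1}^{m} (x/\beta_i - 1)^{-1}(x/\alpha_i - 1)$$

by truncating

$$\tau(x) = \cfrac{1}{1 + \cfrac{x^2/(1 \cdot 3)}{1 + \cfrac{x^2/(3 \cdot 5)}{1 + \cdots + \cfrac{x^2/((2k - 1) \cdot (2k + 1))}{1 + \cdots}}}}.$$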


Algorithm (Fréchet derivative; Kenney & Laub, 1998)

Evaluates $L = L_{\exp}(A, E)$ using scaling and squaring and [8/8] Padé approximant to $\tau$.

1 $B = A/2^s$ with $s \ge 0$ minimal s.t. $\|A/2^s\|_1 \le 1$

2 $G_0 = 2^{-s}E$

3 for $i = 1\colon 8$

4     Solve $(I + B/\beta_i)\,G_i + G_i\,(I - B/\beta_i) = (I + B/\alpha_i)\,G_{i-1} + G_{i-1}\,(I - B/\alpha_i)$.

5 end

6 $X = e^B$

7 $L_s = (G_8X + XG_8)/2$

8 for $i = s\colon -1\colon 1$

9     if $i < s$, $X = e^{2^{-i}A}$, end

10     $L_{i-1} = XL_i + L_iX$

11 end

12 $L = L_0$


Matrix Square Root

$X$ is a square root of $A \in \mathbb{C}^{n\times n}$ $\iff$ $X^2 = A$.

Number of square roots may be zero, finite or infinite.

Definition

For $A$ with no eigenvalues on $\mathbb{R}^- = \{x \in \mathbb{R} : x \le 0\}$ the principal square root $A^{1/2}$ is the unique square root $X$ with spectrum in the open right half-plane.


Newton’s Method for Square Root

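Apply Newton to $F(X) = X^2 - A = 0$: $X_0$ given,

$$\left.\begin{aligned} &\text{Solve } X_kE_k + E_kX_k = A - X_k^2 \\ &X_{k+1} = X_k + E_k \end{aligned}\right\} \quad k = 0, 1, 2, \dots$$

Modified Newton iteration: freeze Fréchet derivative at $X_0$:

$$\left.\begin{aligned} &\text{Solve } X_0E_k + E_kX_0 = A - X_k^2 \\ &X_{k+1} = X_k + E_k \end{aligned}\right\} \quad k = 0, 1, 2, \dots,$$

$X_0$ diagonal $\Rightarrow$ cheap to solve for $E_k$.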


Visser Iteration

Set $X_0 = (2\alpha)^{-1}I$ in modified Newton:

Visser iteration (1937)

$$X_{k+1} = X_k + \alpha(A - X_k^2), \qquad X_0 = (2\alpha)^{-1}I.$$

Stationary iteration.

Richardson iteration.

Linear convergence.

Choice of α?
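
A minimal sketch of the Visser iteration in Python with NumPy follows. The stopping test on the relative residual, the iteration cap and the function name `visser_sqrt` are assumptions added for the example; the test matrix is the $2\times 2$ matrix with eigenvalues 1 and 4 used earlier in the talk, and $\alpha = 0.4$ is simply one value that works for it.

```python
import numpy as np

def visser_sqrt(A, alpha, tol=1e-12, max_iter=10_000):
    """Visser iteration X_{k+1} = X_k + alpha (A - X_k^2), X_0 = I / (2 alpha).
    Returns the final iterate and the number of iterations used."""
    A = np.asarray(A, dtype=float)
    X = np.eye(A.shape[0]) / (2 * alpha)
    for k in range(max_iter):
        R = A - X @ X                                  # residual of X^2 = A
        if np.linalg.norm(R) <= tol * np.linalg.norm(A):
            return X, k
        X = X + alpha * R
    return X, max_iter

A = np.array([[2.0, 2.0],
              [1.0, 3.0]])            # eigenvalues 1 and 4
X, iters = visser_sqrt(A, alpha=0.4)
```

Each step costs one matrix multiplication and no linear solve, which is the attraction of the method; the price is linear convergence whose speed depends strongly on $\alpha$.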


Visser History

$$X_{k+1} = X_k + \alpha(A - X_k^2), \qquad X_0 = (2\alpha)^{-1}I.$$

Used with $\alpha = 1/2$ by Visser (1937) to show positive operator on Hilbert space has a positive square root.

Likewise in functional analysis texts, e.g. Riesz & Sz.-Nagy (1956).

Enables proof of existence of $A^{1/2}$ without using spectral theorem.

Used computationally by Liebl (1965), Babuška, Práger & Vitásek (1966), Späth (1966), Duke (1969), Elsner (1970).

Elsner proves convergence for $A \in \mathbb{C}^{n\times n}$ with real, positive eigenvalues if $0 < \alpha \le \rho(A)^{-1/2}$.


Visser Transformations

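Let $X_k = \theta Y_k$, $\beta = \theta\alpha$ and $\widetilde{A} = \theta^{-2}A$.

$$Y_{k+1} = Y_k + \beta(\widetilde{A} - Y_k^2), \qquad Y_0 = \frac{1}{2\beta} I.$$

Set $\beta = 1/2$:

$$Y_{k+1} = Y_k + \frac{1}{2}(\widetilde{A} - Y_k^2), \qquad Y_0 = I.$$

With $\widetilde{A} \equiv I - C$ and $Y_k = I - P_k$:

$$P_{k+1} = \frac{1}{2}(C + P_k^2), \qquad P_0 = 0.$$

$Q_k = P_k/2$:

$$Q_{k+1} = Q_k^2 + \frac{C}{4}, \qquad Q_0 = 0.$$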


Visser Convergence

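$$X_{k+1} = X_k + \alpha(A - X_k^2), \qquad X_0 = (2\alpha)^{-1}I.$$

Theorem (H, 2006)

Let $A \in \mathbb{C}^{n\times n}$ and $\alpha > 0$. If $\Lambda(I - 4\alpha^2 A)$ lies in the cardioid

$$\mathcal{D} = \{\, 2z - z^2 : z \in \mathbb{C},\ |z| < 1 \,\}$$

then $A^{1/2}$ exists and $X_k \to A^{1/2}$ linearly.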

[Figure: the cardioid $\mathcal{D}$ in the complex plane; real axis from −4 to 2, imaginary axis from −2 to 2.]


Example

$A \in \mathbb{R}^{16\times 16}$ spd with $a_{ii} = i^2$, $a_{ij} = 0.1$, $i \ne j$.

Aim for rel residual $< nu$ in IEEE DP arithmetic.

Pulay iteration $D = \operatorname{diag}(A)$: $\theta = 0.191$, 9 iters.

Visser iteration $\alpha = 0.058$ (hand optimized), 245 iters.

[Figure: plot in the complex plane; real axis from −4 to 2, imaginary axis from −2.5 to 2.5.]


In Conclusion

Many applications of f (A), e.g. control theory,

computer graphics, theoretical physics.

Need better understanding of conditioning of

f (A).

Can we exploit structure, e.g. A ∈ matrix

automorphism group or Jordan or Lie

algebra?

Krylov methods needed for large, sparse A.

How to use Cauchy integral computationally

(H & Trefethen)?


References I

D. A. Bini, N. J. Higham, and B. Meini.

Algorithms for the matrix pth root.

Numer. Algorithms, 39(4):349–378, 2005.

C.-H. Guo and N. J. Higham.

A Schur–Newton method for the matrix pth root and its

inverse.

SIAM J. Matrix Anal. Appl., 28(3):788–804, 2006.

N. J. Higham.

Functions of a Matrix: Theory and Computation.

Book in preparation.


References II

N. J. Higham.

The scaling and squaring method for the matrix

exponential revisited.

SIAM J. Matrix Anal. Appl., 26(4):1179–1193, 2005.

C. S. Kenney and A. J. Laub.

A Schur–Fréchet algorithm for computing the logarithm

and exponential of a matrix.

SIAM J. Matrix Anal. Appl., 19(3):640–663, 1998.

R. Mathias.

Evaluating the Frechet derivative of the matrix

exponential.

Numer. Math., 63(2):213–226, 1992.
