ma 580; numerical analysis i - nc state universityma 580; numerical analysis i c. t. kelley nc state...

Part II: Notation, Background, Errors

MA 580; Numerical Analysis I

C. T. KelleyNC State University

tim [email protected]

Version of September 5, 2016

NCSU, Fall 2016Part II: Notation, Background, Errors

c©C. T. Kelley, I. C. F. Ipsen, 2016 Part II: Notation, Background, Errors MA 580, Fall 2016 1 / 60


Contents

1 Vectors and Matrices

2 Norms

3 Eigenvalues and Eigenvectors

4 Special Matrices

5 Errors and Floating Point Arithmetic

6 Stability and Conditioning



Vectors and Matrices

Notation: Vectors

RN is the space of N dimensional vectors.

RM×N is the space of M × N matrices.

Vectors x are column vectors

x =

x1x2...

xN

= (x1, . . . , xN)T .




Notation: Matrices

Matrices A are M × N. The ijth component is

aij or, if A is a complex expression Aij

If aj ∈ RM is the jth column of A, we can write A as

A = (a1, . . . , aN).




Matrix Products

If A is M × L and B is L× N, the matrix product C = AB isM × N and

cij =L∑

l=1

ailblj

Example: scalar product

vTu =N∑i=1

viui




More Matrix Notation

The transpose of A ∈ RM×N is AT ∈ RN×M and ATij = Aji .

So, the transpose of a column vector is a row vector.

The identity matrix I ∈ RN×N : Ix = x for all x.

(I)ij = δij =

{1 if i = j0 otherwise

Define the inverse of A ∈ RN×N , A−1: A−1A = AA−1 = I




Canonical vectors

We call the columns of I

I =

1. . .

1

=(e1 e2 . . . en

)the canonical vectors.Warning: sometimes e is also an error. You’ll figure it out from thecontext.




Canonical Vectors

e1 =

10...0

, e2 =

01...0

, · · · , eN =

00...1

.




Sequences of Vectors

If {xk} is a sequence of vectors, the ith component of xk is

xik

as it would be if you regarded the sequence as a large matrix

X = (x1, x2, . . . )

and used i as the row index and the iteration counter as thecolumn index.



Norms

Vector Norms

Review the definition of norm if you don’t remember it.The three important norms are `1, `2, `∞.

‖x‖1 =N∑i=1

|xi |, ‖x‖2 =

(N∑i=1

|xi |2)1/2

, and ‖x‖∞ = maxi|xi |.

More generally, the `p norm:

‖x‖p =

(N∑i=1

|xi |p)1/p

for 1 ≤ p <∞.Generally the specific norm will not be so important. If I don’t saywhat the norm is then it is any vector norm.



Norms

Matrix Norms

We will consider matrix norms which are induced by vector norms.This means

‖A‖ = max‖x‖=1

‖Ax‖.

For A ∈ RN×N , induced norms satisfy

‖AB‖ ≤ ‖A‖‖B‖

and‖Ax‖ ≤ ‖A‖‖x‖

The norm of the product is less than or equal to the product of thenorms.



Norms

Computing the `∞ matrix norm: I

Do the math (absolute value of sum ≤ sum of absolute values) toget

|(Ax)i | = |N∑j=1

aijxj | ≤N∑j=1

|aij ||xj | ≤ ‖x‖∞maxi

N∑j=1

|aij |

for all x and all i . So,

‖A‖∞ ≤ maxi

N∑j=1

|aij | = maximum absolute row sum

We’ll show that the norm is the maximum absolute row sum.



Norms

Computing the `∞ matrix norm: II

To show that the ≤ is really =. You’ll need to find a vector thatattains the bound. Here it is.Let i∗ be the row of A with max norm

N∑j=1

|ai∗j | = maxi

N∑j=1

|aij |

and let xj = sign(ai∗j). Then ‖x‖ = 1 and

(Ax)i∗ =N∑j=1

|ai∗j |

SHAZAM!



Norms

`1 matrix norm?

‖A‖1 = maxj

N∑i=1

|aij |

the max absolute column sum.You should be able to prove this.We’ll do `2 later.



Eigenvalues and Eigenvectors

λ is an eigenvalue of A with corresponding eigenvector x if

Ax = λx

Any eigenvalue is a root of the characteristic polynomial of A

pC (z) = det(zI− A)

pC has N roots by the fundamental theorem of algebra. So thereare N eigenvalues, counted with multiplicity.



Eigenvalues and Eigenvectors

Multiplicity

The algebraic multiplicity of an eigenvalue λ is the number oftimes it appears in the factorization of pC into linear terms

pC (z) =N∏i=1

(z − λi )

The geometric multiplicity of λ is the dimension of the null spaceof λI− A.The two notions of multiplicity are not the same. Take

A =

(0 10 0

),

please.



Special Matrices

Diagonal Matrices

A matrix D is diagonal if dij = 0 unless i = j .

D =

d11

. . .

dnn

.

In this case we sometimes abbreviate dii as di and write

D = diag(d1, . . . , dN)

where di is the ith entry in the diagonal.Example: I = diag(1, 1, . . . , 1)



Special Matrices

Diagonalizable Matrices

A matrix is diagonalizable if it has a basis of eigenvectors {vi}Ni=1.In this case the matrix V = (v1, . . . , vN) is nonsingular and

A = VΛV−1

where Λ = diag(λ1, . . . , λN).



Special Matrices

Triangular Matrices

A ∈ RN×N is upper triangular if aij = 0 for i > j . That is,

A =

a11 . . . a1n. . .

...ann

.

A is lower triangular if AT is upper triangular.



Special Matrices

Orthogonal Matrices

An N × N matrix U is orthogonal if

U−1 = UT

This means that the columns of U form an orthonormal basis.Orthogonal matrices are a very big deal in this course.



Special Matrices

Symmetric Matrices

A is symmetric is A = AT .A few facts:

The eigenvalues {λi}Ni=1 of A are real.

There’s an orthonormal basis {ui}Ni=1 of eigenvectors of A.

U = (u1, . . . ,uN) is an orthogonal matrix.

A = UΛUT (spectral theorem for symmetric matrices)

Prove some of this stuff.



Special Matrices

Symmetric Positive (semi)Definite Matrices

A symmetric matrix A is symmetric positive definite (spd) if

xTAx > 0 for all x 6= 0

Alternatively, A is spd if all its eigenvalues are positive.Positive semidefinite: all eigenvalues nonnegative or

xTAx ≥ 0 for all x



Special Matrices

Rank-one or Outer Product Matrices

Recall the inner product of v and u

uTv = (1× N)(N × 1) = (1× 1) =N∑i=1

uivi .

The outer product is

uvT = (N × 1)(1× N) = N × N,

and (uvT )ij = uivj



Special Matrices

Rank One Matrices

A is rank-one matrix if the range R(A) of A has dimensionone.

Fact: A is rank-one if and only if A = uvT for vectors u andv. You get to prove this in the homework.



Special Matrices

Eigen-decomposition of symmetric matrices

If A = AT then

A = UΛUT =N∑i=1

λiuiuTi .

From this you can prove, for symmetric A,

‖A‖2 = maxi|λi |

You’ll see things like this on exams.



Special Matrices

Eigenvalues/vectors of Rank-One matrices

A = xyT

Any vector in E = {z | yTz = 0} is a null-vector of A (Ax = 0).If yTx 6= 0 (x 6∈ E ) then

Ax = (yTx)x

so λ = yTx is an eigenvalue with eigenvector x.Algebraic multiplicity = geometric multiplicity!What if yTx = 0?



Special Matrices

`2 norm of symmetric matrix A = UΛUT

Since U is orthogonal then every x ∈ RN has the expansion

x =∑i

(uTi x)ui ,

which implies that

Ax =∑i

λi (uTi x)ui .

So,

‖Ax‖22 =∑i

λ2i (uTi x)2 ≤ λ2N

∑i

(uTi x)2 = λ2N‖x‖2,

with equality if x = uN . That’s it.



Special Matrices

`2 Matrix Norm

Start with‖Ax‖22 = (Ax)T (Ax) = xT (ATA)x.

ATA is symmetric positive semidefinite, so

ATA = UΛUT where 0 ≤ λ1 ≤ λ2 ≤ · · · ≤ λN .

This means (you should fill in the details) that

‖Ax‖22 ≤ λN‖x‖2 with equality if x = uN .

So ‖A‖2 =√λN where λN is the largest eigenvalue of ATA.



Special Matrices

The Sherman-Morrison Formula: Rank-one changes

Assume A ∈ RN×N ; v,u ∈ RN and

A is nonsingular,

1 + vTA−1u 6= 0.

Then A + uvT is nonsingular and

(A + uvT )−1 =

(I +

(A−1u)vT

1 + vTA−1u

)A−1.

How do you prove this? It’s your problem in the homework.



Special Matrices

Things to think about

In what sense is the spectral decomposition for a symmetricmatrix A = UΛUT unique?What about the 1× 1 case? What about the 2× 2 identity?

What are the eigenvalues of an orthogonal matrix?

Is an orthogonal matrix diagonalizable?

What are the orthogonal spd matrices?



Errors and Floating Point Arithmetic

Relative and Absolute Errors

Let x be a vector, matrix, scalar, . . .Suppose you approximate x by x̃. The absolute error is

‖x− x̃‖ sensitive to units

The relative error (for x 6= 0) is

‖x− x̃‖‖x‖

not sensitive to units

It’s ok to think of a relative error of 10−k as meaning that x and x̃agree to k decimal digits.




Floating Point Arithmetic

A real number is an abstraction. It does not physically exist.

There are uncountably many real numbers.

There are uncountably many real numbers between any tworeal numbers.

You cannot store all the real numbers in a physical device.

On the other hand . . .




A floating point number is like a dog.

It is not an abstraction. It is realer that a real number.

It consumes energy.

It radiates heat.

It occupies space.

It makes noise.




Geography of Floating Point Numbers: Radix

Floating point numbers are expressed in three fields inradix (or base) P arithmetic.

Modern computers use radix 2 (binary) arithmetic.The standard is IEEE-754 (1985).

We will use radix 10 (decimal) for simple examples in this partof the course.

Historical systems have used 16 (hexadecimal) and 8 (octal).




Geography of Floating Point Numbers: The Fields

± Exponent Mantissa

Sign bit.

Exponent field

Mantissa, significand, or fraction.

Conventions for uniqueness: normalized numbers

The point is before the leading digit.The leading digit of a non-zero number is non-zero.

You can think of this as scientific notation with limited range.




How big is an IEEE float?

Width (bits) Sign Exponent Manitssa

Double 64 1 11 52

Single 32 1 8 23

Matlab default is 64 bit IEEE double precision.

Special objects:

INF 1/0, 101000, . . .NaN 0/0, INF ∗ 0, missing data in statistics

If you get these, you have a bug!




I have a base 10 brain. What does this mean to me?

In double precision:

Exponent Range:

Largest > 0 floating point number: ≈ 1.8× 10308

Smallest normalized > 0 floating point number:≈ 2.22× 10−308 =2.22e-302

denormalized: 4.9407e-324, eps(0)

Unit roundoff: εu ≈ 1.11× 10−16

relative error bound for conversion from real to floatand elementary operations.

Difference between 1 and floating point number > 1 nearestto 1: 2.2204e-16

eps or eps(1)




Why would anyone use single precision?

Ideas?




What you need to remember

There are finitely many floating point numbers.

Floating point numers are not equally spaced.

There’s a role (but not in MA580) for single (and lower)precision.help single

If computation arrives at

something larger than the largest float: overflow, get INFsomething 0 < x < eps(0): underflowgradual or zero

Overflow is a problem. Underflow is usually not.




Rounding and Floating Point Operations

Real number: x, Floating point representationfl(x) = x(1 + ε), |ε| ≤ εuIEEE 754 default: round to nearest.

Operations: Given x , y ∈ R and elementary operation ◦

fl(x ◦ y) = fl(fl(x) ◦ fl(y)) = (x ◦ y)(1 + εu)

Given something to evaluate f (x) you hope that

‖fl(f (x))− f (x)‖ = εf ‖f (x)‖, where εf ≈ sizeof (computation)εu




A Toy Floating Point Number System

Here’s a radix 10 system.

Two digit mantissa

Exponent range: -2:2

Largest float = .99 ∗ 102 = 99

eps(0) = .10 ∗ 10−2

eps(1) = 1.1− 1 = .1

Round to nearest.




Overflow in inner product

Given x = (10, 10)T compute ‖x‖2.Answer =

√100 + 100 =

√200 which rounds to .14× 102, a legal

floating point number.Wrong way: Add x1y1 (Overflow) to x2y2 (Overflow) and take thesquare root.Right way:

Normalize x to get w = x/‖x‖∞ = (1, 1)T

Compute ‖w‖2 =√

2 rounds to .14× 101.

Multiply ‖w‖ by ‖x‖∞ = 10 to get ‖x‖2 = .14× 102




Order of summation

Let’s compute

101∑i=1

ai , where a1 = 90, a2 = a3 = . . . a101 = .01

If do we this in order

fl(a1 + a2) = fl(90.01) = 90, . . .

and the summation stagnates with a result of 90 (wrong).Do it backwards and

fl(2∑

i=101

.01) = fl(1) = 1, and fl(1 + 90) = 91.

Moral: sum the small things first.c©C. T. Kelley, I. C. F. Ipsen, 2016 Part II: Notation, Background, Errors MA 580, Fall 2016 43 / 60



The Quadratic Formula

From your early childhood, the solutions of

ax2 + bx + c = 0

are

r± =−b ±

√b2 − 4ac

2a

Let’s break it.




Catastrophic Cancellation or Loss of Signifcance

Let’s find the roots ofx2 + 4x + .1

with the quadratic formula. The answer (computed in IEEEdouble) is

r± =−4±

√16− .4

2≈

−.02515 +

−3.975 -

In the toy floating point system, these round to

−.25× 10−1 and − .40× 101




Apply the Formula in Toy Floats

Sincefl(16− .4) = fl(15.6) = 16

we may be in trouble. When we compute r− we get

r− = (−4− 4)/2 = −4 right! .

But for r+ we are lost

r+ = (−4 + 4)/2 = 0 wrong!

The subtraction 16− .4 lost all information.




The Fix

Before doing anything else, rewrite the formla for r+ to avoid thesubtraction.

r+ = r+−b −

√b2 − 4ac

−b −√

b2 − 4ac= − 4ac

2a(b +√

b2 − 4ac)

For our problem this is

− .4

2(8)= − .1

4

which rounds to −.25× 10−1. SHAZAM!



Stability and Conditioning

Stability

An algorithm is unstable when applied to a problem if the size ofthe output grows exponentially as some measure of problemcomplexity increases.This is not a precise definition, butYou’ll know it when you see it.




An unstable method.

Here’s a way to compute eigenvalue-eigenvector pairs.Let A be symmetric with eigenvalues|λ1| ≤ |λ2| · · · ≤ |λN−1| < |λN |.

Pick a large n and x ∈ RN

Compute w = Anx

Set un = w/‖w‖; λn = uTn AuN

Theorem: un → u and Au = λNu.This algorithm is unstable even if N = 1! Consider A = 3.




But why do you care?

Doesn’t the huge size of w go away when you divide by ‖w‖?

Yes, it does in exact arithmetic,

but not in floating point.You get an overflow from an intermediate step.

But you can fix this one: Pick x ∈ RN

While not happy

w = Axx = w/‖w‖

Same results, but normalizing each time.Fix for instability: change algorithm.




Conditioning

A computation is poorly conditioned if a small relative changein the input produces a large relative change in the output.

Define small!!!This is yet another “I’ll know it when I see it.” kind of thing.

You’ll get inaccurate results for a sufficiently poorlyconditioned computation no matter what you do.

Fix for poor conditioning: reformulate.

A problem is well conditioned if small relative changes in the inputlead to small relative changes in the output.There is, of course, a large and fuzzy middle ground.




Condition number

We can quantify it this time

κ = “ condition number ” =norm of relative change in output

norm of relative change in out put.

If x is the input, S(x) is the output, and δx is the change in x.Then

κ =‖S(x + δx)− S(x)‖/‖S(x)‖

‖δx‖/‖x‖.

Note: κ depends on what you’re doing Sand where you’re doing it x.




Example: subtraction

Here

x =

(xy

), δx =

(δxδy

), and S(x) = x − y

So|S(x + δx)− S(x)| = |δx − δy |,

Use the `1 norm and

‖δx‖ = |δx |+ |δy |, and ‖x‖ = |x |+ |y |.




Conditioning of Subtraction

Plug it all into the formula and . . .

κ =|δx − δy |/|x − y ||δx |+ |δy |/(|x |+ |y |)

.

There is no reason to expect δx and δy to cancel, so the mostsensible estimate is

κ =|x |+ |y ||x − y |

Wow! Subtraction of nearly equation numbers is a bad ideaagain!!!




Conditioning of Matrix-Vector Product

Data: A fixed, x, δx.

κ =‖A(x + δx)− Ax‖/‖Ax‖

‖δx‖/‖x‖.

We’ll use the estimate

‖A−1‖−1‖x‖ ≤ ‖Ax‖ ≤ ‖A‖‖x‖

We know the one on the right because norm of product ≤ productof norms. The one on the left follows from

‖x‖ = ‖A−1Ax‖ ≤ ‖A−1‖‖Ax‖.

Now to estimate this mess . . .




Start with this

κ =‖Aδx‖/‖Ax‖‖δx‖/‖x‖

use1/‖Ax‖ ≥ 1/(‖A−1‖‖x‖) = ‖A−1‖/‖x‖

to get

κ ≤ ‖A‖‖A−1‖‖δx‖/‖x‖‖δx‖/‖x‖

= ‖A‖‖A−1‖.

This leads to the most profound definition in the course . . .




Condition number of a matrix A

The condition number of an N × N nonsingular matrix A relativeto the vector norm ‖ · ‖ is

κ(A) = ‖A‖‖A−1‖ ≥ 1.

Can you see why κ(A) ≥ 1?When the norm matters, we indicate that with κ. For example

κ2(A) = ‖A‖2‖A−1‖2.

A matrix is well-conditioned if κ(A) is small, ill-conditioned if κ(A)is HUGE.Once again, ill-defined with lots of middle ground.




(over) Generalized view of κ

Return to

κ =‖S(x + δx)− S(x)‖/‖S(x)‖

‖δx‖/‖x‖.

and think of

‖S(x + δx)− S(x)‖‖δx‖

≈ ‖Sx(x)u‖.

where u = δx/‖δx‖ is a unit vector in the direction δxThen, since ‖u‖ = 1,

κ ≈ ‖Sx(x)‖‖x‖‖S(x)‖

.

You’ll hear more about this when we get to nonlinear equations.




Bottomline on stability and conditioning

Stability is a property of the algorithm.Fix instability with a change in the algorithm.

Conditioning is a property of the problem.Fix ill conditioning with a reformulation of the problem.




O and o Notation

an = O(bn) as n→∞ means that there is K

‖an‖ ≤ K‖bn‖ for all sufficiently large n.

an = o(bn) as n→∞ means that

limn→∞

‖an‖/‖bn‖ = 0.

Similarly f (h) = O(hp) and f (h) = o(hp) as h→ 0.


ma 580; numerical analysis i - nc state universityma 580; numerical analysis i c. t. kelley nc state...

Documents