Linear Systems & Naive Gaussian Elimination
– Part B –
Prof. Dr. Florian Rupp
German University of Technology in Oman (GUtech)
Introduction to Numerical Methods for ENG & CS
(Mathematics IV)
Spring Term 2017
Today, we will discuss further numerical aspects of naive Gaussian elimination
Today’s topics:
■ Continuing the Computer Lab
■ Computing errors & accuracy of the naive Gaussian algorithm
■ Residual vectors and their meaning
■ Testing the algorithm
■ Ill-conditioning and the Vandermonde matrix
■ Motivation of pivoting strategies
Corresponding textbook chapters: 2.1 and 2.2
Computer Lab – Part 2: Error & Residual Vector
Recap: The naive Gaussian elimination algorithm runs with O(n³) long operations
Phase 1 (forward elimination):
for k = 1 to n − 1
    for i = k + 1 to n
        xmult ← a_{i,k} / a_{k,k}
        a_{i,k} ← xmult
        for j = k + 1 to n
            a_{i,j} ← a_{i,j} − xmult · a_{k,j}
        end for
        b_i ← b_i − xmult · b_k
    end for
end for

Three nested loops of length n with one multiplication in the innermost loop. We assume O(n³) long operations.
Phase 2 (back substitution):
x_n ← b_n / a_{n,n}
for i = n − 1 down to 1
    sum ← b_i
    for j = i + 1 to n
        sum ← sum − a_{i,j} · x_j
    end for
    x_i ← sum / a_{i,i}
end for

Two nested loops of length n with one multiplication in the innermost loop. We assume O(n²) long operations.
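Putting the two phases together, a minimal MATLAB sketch could look as follows; it mirrors the pseudocode above, while the function name and interface are chosen here only for illustration and may differ from our course file NaiveGauss.

function x = naive_gauss_sketch(A, b)
% Minimal sketch of naive Gaussian elimination (no pivoting).
% A: n-by-n matrix, b: n-by-1 right-hand side; returns the solution x.
n = length(b);
% Phase 1: forward elimination (the multipliers overwrite the lower part of A)
for k = 1:n-1
    for i = k+1:n
        xmult = A(i,k) / A(k,k);        % breaks down if the pivot A(k,k) is zero
        A(i,k) = xmult;
        for j = k+1:n
            A(i,j) = A(i,j) - xmult * A(k,j);
        end
        b(i) = b(i) - xmult * b(k);
    end
end
% Phase 2: back substitution
x = zeros(n,1);
x(n) = b(n) / A(n,n);
for i = n-1:-1:1
    s = b(i);
    for j = i+1:n
        s = s - A(i,j) * x(j);
    end
    x(i) = s / A(i,i);
end
end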
Reviewing the highlights from last time (1/3)
Page 375, exercise 1 a (easier version)
Use naive Gaussian elimination to LU-factorize the following matrix into a unit lower triangular matrix L and an upper triangular matrix U:

A := (  1   3   0 )
     (  0  −1   3 )
     ( −3   0   3 ) .

We have

L = (  1   0   0 )              ( 1   3   0 )
    (  0   1   0 ) ,  and  U =  ( 0  −1   3 )
    ( −3  −9   1 )              ( 0   0  30 ) .
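A quick MATLAB check of this factorization (a small sketch using the matrices above): the strictly lower part of L holds exactly the multipliers of the forward elimination, and the product L·U must reproduce A.

A = [1 3 0; 0 -1 3; -3 0 3];
L = [1 0 0; 0 1 0; -3 -9 1];    % multipliers -3 and -9 below the unit diagonal
U = [1 3 0; 0 -1 3; 0 0 30];
norm(L*U - A)                   % returns 0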
Reviewing the highlights from last time (2/3)
Computer Exercise
The upward velocity of a rocket is given at three different times as follows: v1 = 106.8 [m/s] at time t1 = 5 [s], v2 = 177.2 [m/s] at time t2 = 8 [s], and v3 = 279.2 [m/s] at time t3 = 12 [s]. Applying the method of least squares approximation we obtain the linear system

(  5²   5  1 ) ( x1 )   ( 106.8 )
(  8²   8  1 ) ( x2 ) = ( 177.2 )
( 12²  12  1 ) ( x3 )   ( 279.2 ) .
Solve this system with the help of our NaiveGauss-function.
We have:
x1 = 0.29047…, x2 = 19.69047…, and x3 = 1.08571…
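As a quick cross-check in MATLAB (a sketch; our NaiveGauss function should return the same digits, only its calling syntax may differ):

A = [5^2 5 1; 8^2 8 1; 12^2 12 1];   % rows [t_i^2, t_i, 1]
b = [106.8; 177.2; 279.2];           % measured velocities
x = A \ b                            % approx. [0.2905; 19.6905; 1.0857]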
A remark on the computer exercise
The computed solution elements x1, x2, x3 are the coefficients of a quadratic least-squares approximation

p(t) = x1·t² + x2·t + x3 .

Think of it as interpolating the speed of the rocket such that the quadratic error between the approximation and the given data points is minimal. For t ∈ [0, 13] we see that this quadratic interpolation gives very nice results:
[Figure: plot of the quadratic p(t) over roughly t ∈ [4, 13] s, with values between 50 and 350 m/s.]
Back to our computer code
There are still some open tasks:
■ Measure the quality of our computation (see next slide: error & residual vector)
■ Test the performance of the algorithm
■ Include special cases … see “Reviewing the highlights from last time (3/3)”
Residual & error vectors as measurements for the quality of a computation (1/2)
Next, it is important to get a feeling for how good our computed results are.
For a linear system Ax = b, let x be the true solution and x̃ be the computed solution. Then, we define the
■ error vector: e := x̃ − x
■ residual vector: r := Ax̃ − b
Of course, we would accept a solution if e and/or r are small (in some vector norm).
The exact solution x is often not known, so it is common to take the residual vector r as an error measure.
However, due to a possible sensitivity to roundoff errors – i.e., if the problem is ill-conditioned (think of the π-example) – a small (or smaller) residual vector does not mean that there is no better solution of the given linear system.
Residual & error vectors as measurements for the quality of a computation (2/2)
The question of whether a computed solution to a linear system is a good solution is extremely difficult to answer and beyond the scope of this lecture. Though, we will have a short glimpse at the fundamental ideas that lead to satisfactory answers to this question.
An important relationship between the error vector and the residual vector is
Ae = r
because Ae = A(x̃ − x) = Ax̃ − Ax = Ax̃ − b = r .
Please, keep this relationship in mind. We will use it later on when discussing
condition and stability of algorithms.
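A small numerical illustration of this relationship (a sketch; the system is the rocket example from above and the perturbation of the computed solution is artificial):

A = [25 5 1; 64 8 1; 144 12 1];
b = [106.8; 177.2; 279.2];
x      = A \ b;                   % take this as the exact solution x
xtilde = x + 1e-6*[1; -1; 1];     % an artificially perturbed "computed" solution
e = xtilde - x;                   % error vector
r = A*xtilde - b;                 % residual vector
norm(A*e - r)                     % zero up to roundoff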
The famous MATLAB backslash operator
In MATLAB the system of equations Ax = b is solved with the backslash command: x = A\b.
When A is a square n × n matrix, MATLAB automatically examines A to solve the system with the method that gives the least roundoff error and fewest operations.
■ If A is a permutation of a triangular system, then the appropriate triangular solver is used.
■ If A appears to be symmetric and positive definite, then a Cholesky factorization and two triangular solves are attempted.
■ If the Cholesky factorization fails or if A does not appear to be symmetric, an LU factorization and two triangular solves are attempted.
Let us assume that with x = A\b we come as close to the exact solution of a linear system as possible in MATLAB. (Be careful, the problem itself may be ill-conditioned; more on that later.)
Let’s apply this to our computer code
Extending the code:
■ Add the calculation of the error and residual vector for the computed results of the circuit example.
■ How accurate is the result in your opinion?
Computer Lab – Part 3: Testing the Code
Constructing a test case for the naive Gaussian algorithm (1/2)
One good way to test a procedure is to set up an artificial problem whose solution is known beforehand. Sometimes the test problem includes a parameter that can be changed to vary the difficulty. The next example illustrates this:
In general, the Vandermonde matrix is given as

V := ( 1    2       4        8      …    2^(n−1)    )
     ( 1    3       9       27      …    3^(n−1)    )
     ( ⋮                                             )
     ( 1   n+1   (n+1)²   (n+1)³    …   (n+1)^(n−1) ) ,

i.e., V = (v_{i,j})_{i,j=1,…,n} with v_{i,j} = (1 + i)^(j−1)

(you know this matrix already from your computer exercise – the rocket’s speed).
Constructing a test case for the naive Gaussian algorithm (2/2)
If we define a column vector

b := (b_i)_{i=1,…,n} with b_i = (1/i)·((1 + i)^n − 1) ,

we get a nice linear system V x = b, the analytic solution of which is

x = (1, 1, 1, 1, …, 1)^T ∈ R^n .
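In MATLAB this test system can be set up in a few lines (a sketch; the course file VandermondeMat.m may build V in a slightly different way):

n = 8;
i = (1:n)';
V = (1 + i) .^ (0:n-1);        % v_{i,j} = (1+i)^(j-1)
b = ((1 + i).^n - 1) ./ i;     % b_i = ((1+i)^n - 1)/i
x = V \ b;                     % replace the backslash by our NaiveGauss to test it
max(abs(x - 1))                % error against the exact solution of all ones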
Deriving the analytic solution of V x = b (1/2)
The following procedure is similar to recovering the rocket’s speed in your homework. Assume the polynomial

p(t) = 1 + t + t² + t³ + t⁴ + ⋯ + t^(n−1) = Σ_{j=1}^n t^(j−1) = Σ_{j=1}^n x_j·t^(j−1)

is given, where all coefficients x_j are by definition equal to one.
Next, we simply forget that we know the values of the coefficients and want to recover them from the known evaluations of the polynomial p(t) at the integers t = 2, 3, …, n + 1.
Deriving the analytic solution of V x = b (2/2)
This gives the following system of n equations for the n unknowns (one equation for each i = 1, …, n):

Σ_{j=1}^n x_j·(1 + i)^(j−1) = p(1 + i) = Σ_{j=1}^n (1 + i)^(j−1) = ((1 + i)^n − 1) / ((1 + i) − 1) = (1/i)·((1 + i)^n − 1) ,

which is equivalent to V x = b. Here, we used the formula for the sum of a geometric series.
This is actually the ansatz for polynomial interpolation as we will see in a
month or so.
Let’s apply this to our computer code
You will find a file VandermondeMat.m on our Blackboard course page; download it to the same folder as our MATLAB naive Gaussian elimination implementation.
First, for increasing size n = 5, 6, 7, … of the Vandermonde matrix measure
■ the time required by naive Gaussian elimination (tic - toc), and
■ the number of long operations performed.
Next, regarding the accuracy of our computations:
■ modify the code of your program such that a one is subtracted from each component of the output vector x of NaiveGauss for the Vandermonde system, i.e., compute the absolute error (which equals the relative error in this case).
■ What do you recognize for n = 10, 11, 12? (A possible skeleton for these measurements is sketched below.)
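A possible skeleton (the calling syntaxes V = VandermondeMat(n) and x = NaiveGauss(V, b) are assumptions here; adapt them to the actual interfaces of our files):

for n = 5:12
    i = (1:n)';
    V = (1 + i) .^ (0:n-1);             % or: V = VandermondeMat(n);
    b = ((1 + i).^n - 1) ./ i;
    tic;
    x = V \ b;                          % or: x = NaiveGauss(V, b);
    t = toc;
    err = max(abs(x - 1));              % absolute (= relative) error
    fprintf('n = %2d   time = %.2e s   max error = %.2e\n', n, t, err);
end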
Interpretation of the Vandermonde-system test case
When increasing n, we suddenly obtain a huge relative error for one of the components, so that the result of our computation is actually worthless.
At that stage the roundoff error that is present in computing x_i is propagated and magnified throughout the backward substitution phase.
We say that the Vandermonde matrix is ill-conditioned because its solution is so prone to errors (i.e., to small changes in the input data). Note that being ill-conditioned is a property of the problem, not of the algorithm.
A first remark on conditioning
Learn by heart:
Conditioning is a property of the problem, stability is a property of the algorithm, and both have effects on the accuracy of the solution.
Thus, if the answer is not right, the algorithm should not be blamed automatically; the condition of the problem may be bad (ill-conditioned problem).
If the problem is ill-conditioned, no matter what algorithm is used, accuracy cannot be gained.
Reviewing the highlights from last time (3/3)
Page 80, exercise 3 c
Apply naive Gaussian elimination to the following system and account for the failures. Solve the system by other means if possible:
0·x1 + 2·x2 = 4
   x1 −  x2 = 5
Naive Gaussian elimination does not work, because the pivot element a_{1,1} = 0 vanishes.
We have to permute the rows of this system of linear equations first, to get a non-vanishing pivot element. If we do this, we see that the system is already in upper echelon form; backward substitution gives the result.
Naive Gaussian Elimination Can Fail: Reasons & Remedies
Stability of the naive Gaussian elimination
As said, Gaussian elimination can fail. Today, we “pimp up” our naive Gaussian elimination algorithm such that it avoids most reasons for not performing well.
Therefore, we first discuss examples where failure occurs in
■ the forward elimination and/or
■ the backward substitution.
Note:
■ Naive Gaussian elimination is a very efficient algorithm, but it is a highly unstable one. It does not avoid magnifying small errors (we will illustrate this by replacing a_{1,1} = 0 with a_{1,1} = ε ≪ 1 in the next example).
■ Even the most efficient and stable algorithm fails if the problem is ill-conditioned. I.e., there may be problems that you simply cannot solve without transforming them into a well-conditioned equivalent one.
Failure 1: a pivot is close to zero (1/3)
We already know that naive Gaussian elimination fails for the system
( 0  1 ) ( x1 )   ( 1 )
( 1  1 ) ( x2 ) = ( 2 ) .
If a numerical procedure actually fails for some values of the data, then it is likely to fail for data near the failing values, too. To “test” this dictum, we consider the system

( ε  1 ) ( x1 )   ( 1 )
( 1  1 ) ( x2 ) = ( 2 ) ,

in which 0 < ε ≪ 1 is a small positive number different from zero.¹

¹ Negative values of ε with a small absolute value would work analogously.
Failure 1: a pivot is close to zero (2/3)
In this ε-perturbed case the naive Gaussian algorithm works and its forward elimination results in

( ε  1 | 1 )        ( ε     1      |    1     )
( 1  1 | 2 )   −→   ( 0  1 − ε⁻¹  | 2 − ε⁻¹  )
and, analytically, its backward substitution gives for small values of ε:
x2 = (2 − ε⁻¹) / (1 − ε⁻¹) ≈ 1 ,

and

x1 = ε⁻¹·(1 − x2) = ε⁻¹·( 1 − (2 − ε⁻¹)/(1 − ε⁻¹) ) = 1/(1 − ε) ≈ 1 .

If ε is very small, ε⁻¹ is huge. So if the calculation is performed by a computer with finite word length, we will get another picture …
Failure 1: a pivot is close to zero (3/3)
In this ε-perturbed case the naive Gaussian algorithm works and its forward elimination results on a computer with finite word length in

( ε  1 | 1 )        ( ε     1      |    1     )       ( ε    1    |   1    )
( 1  1 | 2 )   −→   ( 0  1 − ε⁻¹  | 2 − ε⁻¹  )   =    ( 0  −ε⁻¹  | −ε⁻¹  )

as 1 − ε⁻¹ ≈ −ε⁻¹ and 2 − ε⁻¹ ≈ −ε⁻¹ if ε⁻¹ ≫ 1. Backward substitution gives

x2 = (−ε⁻¹)/(−ε⁻¹) = 1 ,

and

x1 = ε⁻¹·(1 − x2) = ε⁻¹·0 = 0 .
The relative error for the computed solution of x1 is thus 100 %.
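This failure is easy to reproduce in MATLAB’s double precision (a sketch; ε = 10⁻²⁰ is just an illustrative value):

eps0 = 1e-20;
A = [eps0 1; 1 1];   b = [1; 2];
% forward elimination without pivoting
xmult = A(2,1)/A(1,1);              % huge multiplier 1/eps0 = 1e20
a22 = A(2,2) - xmult*A(1,2);        % 1 - 1e20 is rounded to -1e20
b2  = b(2)   - xmult*b(1);          % 2 - 1e20 is rounded to -1e20
x2  = b2/a22                        % = 1
x1  = (b(1) - A(1,2)*x2)/A(1,1)     % = 0, although the true x1 is about 1
% interchanging the two rows first gives x1 = x2 = 1 up to roundoff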
Side remark: adding/subtracting two numbers of different magnitude
Example
Given an 8-digit decimal machine with a 16-digit accumulator (i.e., the “calculator”) and ε = 10⁻⁹. What is then the result of 2 − ε⁻¹?
To subtract, the computer must interpret the numbers as

ε⁻¹ = 10⁹ = 0.10000000 · 10¹⁰ = 0.1000000000000000 · 10¹⁰
  2       = 0.20000000 · 10¹  = 0.0000000002000000 · 10¹⁰
ε⁻¹ − 2                       = 0.0999999998000000 · 10¹⁰

and finally rounding to 8 decimal digits gives 0.10000000 · 10¹⁰ = ε⁻¹ again, i.e., the 2 is completely absorbed.
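The same absorption effect can be seen in MATLAB’s double precision, which carries roughly 16 significant decimal digits (a small sketch):

(1e17 - 2) == 1e17        % true: the 2 is completely absorbed
1e17 - 2 - 1e17           % returns 0 instead of -2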
Remedy (?): permute the rows, i.e. find a non-zero pivot row
If we permute the equations, the naive Gaussian algorithm works perfectly fine for

( 0  1 | 1 )   I↔II   ( 1  1 | 2 )
( 1  1 | 2 )   −→     ( 0  1 | 1 )      =⇒   x2 = 1 ,  x1 = 1
as well as for
( ε  1 | 1 )   I↔II   ( 1  1 | 2 )        ( 1    1    |    2     )
( 1  1 | 2 )   −→     ( ε  1 | 1 )   −→   ( 0  1 − ε  | 1 − 2ε  )

which gives

x2 = (1 − 2ε)/(1 − ε) ≈ 1 ,   x1 = 2 − x2 ≈ 1 .

Hypothesis: The difficulty of obtaining correct results is not simply due to ε being small, but rather to its being small relative to the other coefficients in the same row.
Failure 2: elements in the same row are of extremely different magnitude (1/2)
Let us “check” this alarming hypothesis first with an example, where all elements in a row are of the same magnitude O(ε). Here, naive Gaussian elimination works without problems:

( ε  2ε | 3ε )                                 ( ε  2ε |  3ε )        ( ε  2ε | 3ε )
( 4   5 |  0 )   (plus −4/ε times the 1st row)  −→  ( 0  −3 | −12 )   −→   ( 0   1 |  4 )   =⇒   x1 = −5 ,  x2 = 4
Next, we consider the worst possible case, where there is actually a non-zero pivot element that is extremely small compared to all other elements in its row. Here, we will face the same problems as in the example where one of the pivot elements was actually very small.
Failure 2: elements in the same row are of extremely different magnitude (2/2)
Let the pivot element be one and the other element in its row be of O(ε⁻¹), then

( 1  ε⁻¹ | ε⁻¹ )        ( 1     ε⁻¹    |    ε⁻¹    )
( 1   1  |  2  )   −→   ( 0  1 − ε⁻¹  |  2 − ε⁻¹  )

and we face exactly the same issue as discussed.
This situation can be resolved again by interchanging the two rows (though there would be no need for that, as the pivot element is clearly different from zero). This gives:

( 1  ε⁻¹ | ε⁻¹ )   I↔II   ( 1   1  |  2  )        ( 1      1     |     2     )
( 1   1  |  2  )   −→     ( 1  ε⁻¹ | ε⁻¹ )   −→   ( 0  ε⁻¹ − 1  |  ε⁻¹ − 2  )

and hence the correct solution

x2 = (ε⁻¹ − 2)/(ε⁻¹ − 1) ≈ 1 ,   x1 = 2 − x2 ≈ 1 .
Thus, an intelligent row pivoting strategy is required
In total we have
naive Gaussian elimination
+ row switching if the pivot element is vanishing relative to the remaining entries in its row
= stable algorithm
This improvement strategy is called scaled pivoting.
We will formalize how a computer can decide whether a “pivot element is vanishing relative to the remaining entries in its row” or not.
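As a preview, a rough sketch of the selection rule behind scaled pivoting (the scale factors and the ratio test below are the textbook's standard choice, assumed here; details next time): among the remaining rows, pick as pivot row the one whose entry in the pivot column is largest relative to the largest entry of its own row.

A = [1 1e9; 1 1];      % the pivot 1 is "vanishing" relative to 1e9 in its row
k = 1;                 % current elimination step
n = size(A,1);
s = max(abs(A), [], 2);                   % s(i) = largest entry of row i
[~, p] = max(abs(A(k:n,k)) ./ s(k:n));    % row with the largest scaled pivot
pivotRow = k - 1 + p                      % = 2, so rows 1 and 2 would be swapped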
Summary & Outlook
Naive Gaussian elimination opens the door to a plethora of further tasks
Naive Gaussian Elimination
■ Pivoting Strategies
■ Computing Costs
■ Matrix Factorization (LU, Cholesky, etc.)
■ Error Estimation (i.e., condition & stability)
■ What happens if the matrix has a special structure?
■ What happens if the matrix is really large?
■ Eigenvalue Computations
Major concepts covered today (1/2): residual & error vectors
■ When solving the linear system Ax = b, if the true or exact solution is x and the approximate or computed solution is x̃, then important quantities are
◆ error vector: e = x̃ − x
◆ residual vector: r = Ax̃ − b
■ For an n × n system of linear equations Ax = b the forward elimination phase of the naive Gaussian elimination involves approximately O(n³) long operations (multiplications or divisions), whereas the back substitution requires only O(n²) long operations.
■ A problem is ill-conditioned if small changes in its input data have a huge impact on its result.
Major concepts covered today (2/2): condition & stability
■ Conditioning is a property of the problem, stability is a property of the algorithm, and both have effects on the accuracy of the solution. Thus, if the answer is not right, the algorithm should not be blamed automatically; the condition of the problem may be bad (ill-conditioned problem). If the problem is ill-conditioned, no matter what algorithm is used, accuracy cannot be gained.
■ The naive Gaussian algorithm is a highly efficient algorithm, but it is not stable, i.e., it does not avoid magnifying small errors. We have seen this in our pivoting example, where we replaced a_{1,1} = 0 with a_{1,1} = ε.
■ The naive Gaussian algorithm is stabilized when we apply a scaled pivoting strategy (row switching) if the pivot element is vanishing relative to the remaining entries in its row.
Preparation for the next lecture (1/ 2)
Please, prepare these short exercises for the next lecture:
1. Page 80, exercise 2
For what values of α ∈ R does naive Gaussian elimination produce erroneous answers for the system

( 1  1 ) ( x1 )   (   2   )
( α  1 ) ( x2 ) = ( 2 + α )
Explain what happens in the computer.
Preparation for the next lecture (2/ 2)
Please, prepare these short exercises for the next lecture:
2. Page 376, exercise 9 (easier version)
Give the LU-factorization of

A := ( 2  −1  2 )
     ( 2  −3  3 )
     ( 6  −1  8 ) .
3. Page 82, computer exercise 9
Carry out the test for our function NaiveGauss based on the Vandermonde matrix but reverse the order of the equations, i.e., in the code replace i by n − i + 1 in the appropriate places.