
    5. Solution of Discretized Equations

    In the last two chapters, we have seen the principles underlying the discretization of the

    governing equations using schemes of the desired accuracy and stability behaviour. The result

    of the discretization of an equation is a system of coupled, often non-linear, algebraic equations.

    In this chapter, we will look at methods available to solve these discretized equations. Before

    discussing the details of the algebraic equation solvers, let us first examine the nature of the

    discretized equations by taking the example of the generic scalar transport equation:

    $\frac{\partial(\rho\phi)}{\partial t} + \nabla\cdot(\rho\mathbf{u}\phi) = \nabla\cdot(\Gamma\,\nabla\phi) + S_\phi$     (1)

    where $\phi$ is the scalar and the other terms and variables have the usual significance. In
    principle, the following discussion can be extended to the simultaneous solution of the mass,
    momentum and energy conservation equations. However, several additional issues have to be
    considered before this is possible, and a full discussion of this, i.e., the simultaneous
    solution of all the governing equations, is postponed to the next chapter. Consideration of
    the generic scalar equation (1) is sufficient for the present purpose of studying the techniques for

    the solution of the discretized equations. Without much loss of generality, we restrict our

    attention to a two-dimensional case in a Cartesian coordinate system with constant properties and
    no source terms. For this case, equation (1) can be written as

    $\frac{\partial(\rho\phi)}{\partial t} + \frac{\partial}{\partial x}(\rho u\phi) + \frac{\partial}{\partial y}(\rho v\phi) = \Gamma\left(\frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2}\right)$     (2)

    Considering a rectangular grid (Figure 1) with uniform spacing of $\Delta x$ and $\Delta y$ in the x- and y-

    directions and using a first order upwind scheme (assuming u and v to be positive) for the

    advection term, central scheme for the diffusion term and a first order implicit scheme for the

    time derivative, the discretized equation corresponding to equation (2) can be written as

    $\frac{1}{\Delta t}\left(\phi_{i,j}^{n+1} - \phi_{i,j}^{n}\right) + \frac{u}{\Delta x}\left(\phi_{i,j}^{n+1} - \phi_{i-1,j}^{n+1}\right) + \frac{v}{\Delta y}\left(\phi_{i,j}^{n+1} - \phi_{i,j-1}^{n+1}\right)
    = \Gamma\left[\frac{\phi_{i+1,j}^{n+1} - 2\phi_{i,j}^{n+1} + \phi_{i-1,j}^{n+1}}{\Delta x^2} + \frac{\phi_{i,j+1}^{n+1} - 2\phi_{i,j}^{n+1} + \phi_{i,j-1}^{n+1}}{\Delta y^2}\right]$     (3)

    The above equation can be rearranged to cast it in the following form:

    $a_{i,j}\phi_{i,j}^{n+1} + a_{i-1,j}\phi_{i-1,j}^{n+1} + a_{i+1,j}\phi_{i+1,j}^{n+1} + a_{i,j-1}\phi_{i,j-1}^{n+1} + a_{i,j+1}\phi_{i,j+1}^{n+1} = b_{i,j}$     (4)

    where

    $a_{i-1,j} = -\left(\frac{u}{\Delta x} + \frac{\Gamma}{\Delta x^2}\right)$

    $a_{i+1,j} = -\frac{\Gamma}{\Delta x^2}$

    $a_{i,j-1} = -\left(\frac{v}{\Delta y} + \frac{\Gamma}{\Delta y^2}\right)$     (5)

    $a_{i,j+1} = -\frac{\Gamma}{\Delta y^2}$

    $a_{i,j} = \frac{1}{\Delta t} + \frac{u}{\Delta x} + \frac{v}{\Delta y} + \frac{2\Gamma}{\Delta x^2} + \frac{2\Gamma}{\Delta y^2}$

    $b_{i,j} = \frac{\phi_{i,j}^{n}}{\Delta t}$

    Considering a 5 x 5 grid with Dirichlet boundary conditions on the Cartesian grid shown in Figure
    2, for each time step we have the following nine unknowns: $\phi_{22}$, $\phi_{32}$, $\phi_{42}$, $\phi_{23}$, $\phi_{33}$,
    $\phi_{43}$, $\phi_{24}$, $\phi_{34}$ and $\phi_{44}$. Correspondingly, we will have nine algebraic equations of the
    form of (4). Listing these in the above lexicographic order, i.e., cycling through the indices (i, j, k) by
    going through all possible values of i while keeping j and k fixed, then incrementing j (and then k), and continuing
    this way until all index combinations are covered, results in the following set of algebraic equations:

    (6)

    which can be put in matrix form as

    [A] [phi] = [b] (7)

    where the A, phi and b are given by

    (8)
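    As an illustration of how such a system can be set up in practice, the following short Python
    sketch assembles the 9 x 9 matrix and right hand side of equation (7) for the 5 x 5 grid, using the
    coefficients of equation (5), and solves the system with a library routine. The values assumed for
    u, v, the diffusivity, the grid spacing, the time step and the boundary value are purely illustrative.

        # Illustrative sketch: assemble and solve the 9x9 system for the 5x5 grid
        # of Figure 2 using the coefficients of equation (5).  All parameter values
        # below are assumptions chosen only for demonstration.
        import numpy as np

        nx = ny = 3                      # interior nodes in each direction (5x5 grid)
        u, v, gamma = 1.0, 1.0, 0.1      # assumed positive velocities and diffusivity
        dx = dy = dt = 0.1
        phi_old = np.zeros((ny + 2, nx + 2))   # previous time level, zero everywhere
        phi_bnd = 1.0                          # assumed Dirichlet value on all boundaries
        phi_old[0, :] = phi_old[-1, :] = phi_old[:, 0] = phi_old[:, -1] = phi_bnd

        # coefficients of equation (5): first-order upwind + central + implicit Euler
        aP = 1/dt + u/dx + v/dy + 2*gamma/dx**2 + 2*gamma/dy**2
        aW = -(u/dx + gamma/dx**2)
        aE = -gamma/dx**2
        aS = -(v/dy + gamma/dy**2)
        aN = -gamma/dy**2

        A = np.zeros((nx*ny, nx*ny))
        b = np.zeros(nx*ny)
        for j in range(ny):              # lexicographic ordering: i varies fastest
            for i in range(nx):
                k = j*nx + i
                A[k, k] = aP
                b[k] = phi_old[j+1, i+1]/dt
                # each neighbour is either another unknown or a known boundary value
                for a, di, dj in ((aW, -1, 0), (aE, 1, 0), (aS, 0, -1), (aN, 0, 1)):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < nx and 0 <= jj < ny:
                        A[k, jj*nx + ii] = a
                    else:
                        b[k] -= a*phi_bnd
        print(np.linalg.solve(A, b).reshape(ny, nx))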

    Equations of the form (7) have to be solved at each time step in order to obtain the values of

    phiij. Before we address the question of how to solve these equations, a number of remarks are

    in order. In general, the flow may be characterized by a number of variables, e.g., laminar,

    newtonian, incompressible, non-reacting, isothermal two-dimensional flow requires the

    specification of u, v and p to completely characterize the flow field. A number of additional

    variables are required for the description of turbulent reacting flows as will be seen in Chapter 8.

    Each such variable will have a set of equations of the form (7). Often, these equations are

    coupled and the coefficients aij are non-linear and unknown (as they may involve variables

    which have not yet been solved for). In the general case, these are linearized and solved

    iteratively using methods to be discussed in Chapter 6. The process of linearization results in a

    set of equations of the form of equation (7) for each variable with constant (estimated)

    coefficients (which are regularly updated in the iterative scheme). These have to be repeatedly

    solved in order to arrive at the desired solution. Therefore, an efficient method for the solution

    of the linear algebraic equations is necessary in order to keep the computational requirements to
    within reasonable limits. In the present chapter, we discuss various options for doing so.

    It is pertinent, at this stage, to consider the general features of the set of equations we

    wish to solve. The number of equations in the set is equal to the number of grid points at which

    the variable is to be evaluated. For a typical CFD problem, the number of grid points is very

    large and can be of the order of 10000 to 1000000 or even higher for complicated three-


    dimensional flows. Hence, the set of equations we are dealing with is very large, and the speed
    of solution and the memory requirements are important considerations in choosing an efficient

    method. Another general feature of the equations is the sparseness of the coefficient matrix A.

    Since the value of a variable phiij is typically expressed in terms of its immediate neighbours,

    each equation will have only a few non-zero coefficients. For example, in the example

    considered, phiij is expressed in terms of its four neighbours and the computational molecule,

    see Figure 3, involves five nodal points and correspondingly, the algebraic equations contain at

    most five non-zero coefficients. If schemes of higher order accuracy are used, or if a three-
    dimensional case is considered in a non-orthogonal coordinate system, a larger number of nodes
    may be involved. Schemes with a 19-node computational molecule have been proposed. This
    implies that each of the equations of (7) may contain up to 19 non-zero coefficients. For a
    problem with 10000 grid points, the number of non-zero coefficients could therefore be 190000, which is
    considerably less than the general case in which all the 10000 x 10000 = $10^8$ coefficients would
    be non-zero. Thus, the coefficient matrix A is generally very sparse. Also, in the case of structured
    grids, the non-zero coefficients may lie along certain diagonals, as is evident from
    equation (8). The number of these diagonals depends on the terms present in the governing

    partial differential equation and the discretization and linearization schemes used to obtain the

    linearized algebraic equations. This diagonal structure of the coefficient matrix is not in general

    present in an unstructured mesh. Finally, the coefficient matrix usually exhibits the diagonal
    dominance feature, i.e., the magnitude of the diagonal element is greater than the sum of the
    magnitudes of the off-diagonal elements. While this is not a requirement for the general case,
    this is certainly a desirable feature and special care is taken in the discretization and linearization
    process to ensure this in its weak form, i.e., that strict diagonal dominance is present for at least one of

    the set of equations while for the others the magnitude of the diagonal element is at least equal to

    the sum of the magnitudes of the off-diagonal elements.

    These are the general features of the set of linear algebraic equations which we need to

    solve. Two large classes of methods are available for the resolution of these linear algebraic

    equations: direct and iterative methods. Direct methods are based on a finite number of

    arithmetic operations leading to the exact solution of a linear algebraic system except for round-

    off errors that are inevitable in a computed solution. Iterative methods, on the other hand, are

    based on producing a succession of approximate solutions which leads to the exact solution after

    an infinite number of steps. In practice, due to round-off and other considerations (such as the

    need to update the coefficient matrix A to account for non-linearity), a solution of finite accuracy
    is sought. When this is the case, the number of arithmetic operations required to solve the
    equation using a direct method can be very high, and it is often much larger than that required by
    an iterative method to achieve a given level of accuracy. Also, iterative methods take
    account of the sparseness of the coefficient matrix while direct methods, except in special cases,

    do not do so. For all these reasons, iterative methods are invariably used to solve the linearized

    algebraic equations in CFD problems. However, many techniques used for accelerating the


    convergence rate of iterative methods are based on approximations derived from direct methods.

    Also, when multigrid methods or coupled solution methods are used, both iterative and direct

    methods may be used in the overall scheme of solution. In view of this, we discuss both direct

    and iterative methods. The basic methods of each class are discussed in Sections 5.1 and 5.2,

    respectively. These are followed in Section 5.3 by more advanced iterative methods and in

    Section 5.4 by multigrid methods.

    5.1 Direct Methods

    5.1.1 Cramer's rule

    This is one of the most elementary methods and is often taught in school-level algebra

    courses. For a system of equations described as

    $\sum_{j=1}^{n} a_{ij}\,\phi_j = b_i$     (9)

    the solution for $\phi_i$ can be obtained as

    $\phi_i = \frac{|A_i|}{|A|}$     (10)

    where the matrix $A_i$ is obtained by replacing the ith column of A by the column vector b. Thus, for

    the system of three equations given by

    $2\phi_1 + 3\phi_2 + 4\phi_3 = 5$

    $6\phi_1 + 7\phi_2 + 8\phi_3 = 9$

    $10\phi_1 + 13\phi_2 + 14\phi_3 = 12$

    phi1, phi2 and phi3 are given by

    $\phi_1 = \begin{vmatrix} 5 & 3 & 4 \\ 9 & 7 & 8 \\ 12 & 13 & 14 \end{vmatrix} \bigg/ \begin{vmatrix} 2 & 3 & 4 \\ 6 & 7 & 8 \\ 10 & 13 & 14 \end{vmatrix} = \frac{12}{8} = 1.5$

    $\phi_2 = \begin{vmatrix} 2 & 5 & 4 \\ 6 & 9 & 8 \\ 10 & 12 & 14 \end{vmatrix} \bigg/ \begin{vmatrix} 2 & 3 & 4 \\ 6 & 7 & 8 \\ 10 & 13 & 14 \end{vmatrix} = \frac{-32}{8} = -4$

    $\phi_3 = \begin{vmatrix} 2 & 3 & 5 \\ 6 & 7 & 9 \\ 10 & 13 & 12 \end{vmatrix} \bigg/ \begin{vmatrix} 2 & 3 & 4 \\ 6 & 7 & 8 \\ 10 & 13 & 14 \end{vmatrix} = \frac{28}{8} = 3.5$
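    The example above can be checked numerically. The following Python sketch (an illustration,
    not part of the original text) evaluates the determinant ratios of equation (10) with numpy and
    compares the result with a direct library solve.

        # Numerical check of the Cramer's-rule example using numpy determinants.
        import numpy as np

        A = np.array([[2., 3., 4.], [6., 7., 8.], [10., 13., 14.]])
        b = np.array([5., 9., 12.])

        phi = []
        for i in range(3):
            Ai = A.copy()
            Ai[:, i] = b                      # replace the i-th column by b
            phi.append(np.linalg.det(Ai) / np.linalg.det(A))
        print(phi)                            # [1.5, -4.0, 3.5]
        print(np.linalg.solve(A, b))          # same result from a direct solver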

    While this method is elementary and yields a solution for any non-singular matrix A, it is very

    inefficient when the size of the matrix is large. The number of arithmetic operations required to

    obtain the solution varies as (n+1)!, where n is the number of unknowns (or equations), for large

    n. Thus, the number of arithmetic operations for 10 equations is of the order of $4 \times 10^7$, and the
    solution can be obtained in a fraction of a second using a Gigaflop personal computer. For the solution of 20
    equations, the number of operations required is of the order of $5 \times 10^{19}$, which increases to $8 \times 10^{33}$ for a
    system of only 30 equations. Even the fastest computer on earth would require billions of
    years to get a solution in this case involving the equivalent of 30 grid points. Hence extreme
    caution should be exercised in using Cramer's rule in CFD computations.

    5.1.2 Gaussian elimination

    In contrast to Cramer's rule, the Gaussian elimination method is a very useful and
    efficient way of solving a general system of algebraic equations, i.e., one which does not have
    any structural simplifications such as bandedness, symmetry or sparseness. The method

    consists of two parts. In the first part, the system of n equations and n variables is systematically

    and successively reduced to a smaller system containing a smaller number of variables by a

    process known as forward elimination. This ultimately results in a coefficient matrix in the form

    of an upper triangular matrix which is readily solved in the second part using a process known as

    back substitution. These steps are explained below.

    Consider the set of equations given by

    $a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n = c_1$

    $a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n = c_2$

    $\qquad\vdots$     (11)

    $a_{n1}u_1 + a_{n2}u_2 + \cdots + a_{nn}u_n = c_n$

    The objective of the forward elimination is to transform the coefficient matrix $\{a_{ij}\}$ into an upper
    triangular array by eliminating unknowns from some of the equations by algebraic
    operations. This can be initiated by choosing the first equation as the "pivot" equation and using
    it to eliminate the $u_1$ term from each of the other equations. This is done by multiplying the first equation by
    $a_{21}/a_{11}$ and subtracting it from the second equation. Multiplying the pivot equation by $a_{31}/a_{11}$
    and subtracting it from the third equation eliminates $u_1$ from the third equation. This procedure

    is continued until u1 is eliminated from all the equations except the first one. This results in a


    system of equations in which the first equation remains unchanged and the subsequent (n-1)

    equations form a subset with modified coefficients (as compared to the original aij) in which u1

    does not appear:

    $a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n = c_1$

    $a'_{22}u_2 + a'_{23}u_3 + \cdots + a'_{2n}u_n = c'_2$

    $\qquad\vdots$     (12)

    $a'_{n2}u_2 + a'_{n3}u_3 + \cdots + a'_{nn}u_n = c'_n$

    Now, the first equation of this subset is used as pivot to eliminate u2 from all the equations

    below it. The third equation in the altered system is then used as the next pivot equation and the

    process is continued until only an upper triangular form remains:

    $a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n = c_1$

    $a'_{22}u_2 + a'_{23}u_3 + \cdots + a'_{2n}u_n = c'_2$

    $a''_{33}u_3 + a''_{34}u_4 + \cdots + a''_{3n}u_n = c''_3$     (13)

    $\qquad\vdots$

    $a'_{n-1,n-1}u_{n-1} + a'_{n-1,n}u_n = c'_{n-1}$

    $a'_{nn}u_n = c'_n$

    This completes the forward elimination process. The back substitution process consists of
    solving the set of equations given by (13) by successive substitution starting from the bottom-
    most equation. Since this equation contains only one variable, namely $u_n$, it can be readily
    calculated as $u_n = c'_n/a'_{nn}$. Knowing $u_n$, the equation immediately above it can be solved by
    substituting the value of $u_n$ into it. This process is repeated until all the variables are obtained.
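    The two steps can be summarized in a short code sketch. The following Python routine (a
    minimal illustration, without pivoting) performs forward elimination followed by back
    substitution for a small full system; it assumes that no zero pivots are encountered.

        # Minimal Gaussian elimination sketch: no pivoting, assumes non-zero pivots.
        import numpy as np

        def gauss_solve(A, c):
            A = A.astype(float).copy()
            c = c.astype(float).copy()
            n = len(c)
            # forward elimination: reduce A to upper triangular form
            for k in range(n - 1):            # current pivot row
                for i in range(k + 1, n):
                    m = A[i, k] / A[k, k]
                    A[i, k:] -= m * A[k, k:]
                    c[i] -= m * c[k]
            # back substitution, starting from the bottom-most equation
            u = np.zeros(n)
            for i in range(n - 1, -1, -1):
                u[i] = (c[i] - A[i, i+1:] @ u[i+1:]) / A[i, i]
            return u

        A = np.array([[2., 3., 4.], [6., 7., 8.], [10., 13., 14.]])
        c = np.array([5., 9., 12.])
        print(gauss_solve(A, c))              # [1.5, -4.0, 3.5]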

    Of the two steps, the forward elimination step is the most time-consuming and requires about
    $n^3/3$ arithmetic operations for large n. The back substitution process requires only about $n^2/2$
    arithmetic operations. Thus, for large n, the total number of arithmetic operations required to
    solve a linear system of n equations by Gaussian elimination varies as $n^3$, which is significantly
    less than the (n+1)! operations required by Cramer's rule. While Gaussian elimination is the
    most efficient method for full matrices without any specific structure, it does not take advantage
    of the sparseness of the matrix. Also, unlike in the case of iterative methods, there is no
    possibility of obtaining an approximate solution using a smaller number of arithmetic operations.
    Thus, the full solution, and only the full solution, is available at the end of the back substitution process.
    This lack of an intermediate, approximate solution is shared by all direct methods
    and is a disadvantage when solving a set of non-linear algebraic equations, as the coefficient
    matrix needs to be updated repeatedly in an overall iterative scheme such as the Newton-Raphson


    method. Finally, for large systems that are not sparse, Gaussian elimination, when done using

    finite-precision arithmetic, is susceptible to accumulation of round-off errors and a proper

    pivoting strategy is required to reduce this. A pivoting strategy is also required to eliminate the
    possibility of a zero diagonal element, as this would lead to a division by zero. A number of

    pivoting strategies have been discussed in Marion (19??) although accumulation of round-off

    error may not pose a major problem in CFD-related situations as the coefficient matrix is very

    sparse.

    5.1.3 Gauss-Jordan elimination

    A variant of the Gaussian elimination method is the Gauss-Jordan elimination method.

    Here, variables are eliminated from rows both above and below the pivot equation. The

    resulting coefficient matrix is a diagonal matrix containing non-zero elements only along the

    diagonal. This eliminates the back substitution process. However, no computational advantage

    is gained as the number of arithmetic operations required for the elimination process is about

    three times higher than that required for the Gaussian elimination method. Gauss-Jordan
    elimination can be used to find the inverse of the coefficient matrix efficiently. It is also
    particularly attractive when solving for a number of right hand side vectors [b] in equation (7).
    Thus, the Gauss-Jordan method can be used to simultaneously solve the following sets of equations:

    $[A]\,[\,x_1 \mid x_2 \mid x_3 \mid Y\,] = [\,b_1 \mid b_2 \mid b_3 \mid I\,]$     (14)

    which is a compact notation for the following system of equations:

    $A x_1 = b_1$,   $A x_2 = b_2$,   $A x_3 = b_3$   and   $A Y = I$

    where I is the identity matrix and Y is obviously the inverse of A, i.e., $A^{-1}$. The computed $A^{-1}$
    can be used later, i.e., after equation (14) has been solved, to solve for an additional right
    hand side vector $b_4$, i.e., to solve $A x_4 = b_4$ as $x_4 = A^{-1}b_4$ at very little additional cost. Since

    computations are usually done with finite-precision arithmetic, the computed solution, x4, may

    be affected by round-off error, which however may not be a hazard in CFD-related problems as

    the coefficient matrix A is usually sparse and the resulting round-off error is therefore smaller

    than in solving full matrices.

    5.1.4 LU decomposition

    LU decomposition is one of several factorization techniques used to decompose the

    coefficient matrix A into a product of two matrices so that the resulting equation is easier to

    solve. In LU decomposition, the matrix A is written as the product of a lower (L) and an upper

    (U) triangular matrix, i.e., $[A] = [L][U]$ or $a_{ij} = \sum_k l_{ik}u_{kj}$. Such a decomposition is


    possible for any non-singular square matrix. Before we discuss the algorithm for this

    decomposition, i.e., the method of finding $l_{ij}$ and $u_{ij}$ for a given $a_{ij}$, let us examine the

    simplification of the solution resulting from this factorization.

    With the LU decomposition, the matrix equation

    [A] [phi] = [b] (7)

    can be written as

    [L][U][phi] = [b] (14)

    which can be solved in two steps as

    [L] [y] = [b] (15a)

    [U][phi] = [y] (15b)

    Solution of equations (15a) and (15b) can be obtained easily by forward and backward

    substitution, respectively. Thus, the LU decomposition renders the solution of (7) easy. The key

    to the overall computational efficiency lies in the effort required to find the elements of [L] and

    [U].

    For a given non-singular matrix A, the LU decomposition is not unique. This can be seen

    readily by writing the decomposition in terms of the elements as

    $\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1N} \\ a_{21} & a_{22} & \cdots & a_{2N} \\ \vdots & & & \vdots \\ a_{N1} & a_{N2} & \cdots & a_{NN} \end{bmatrix} = \begin{bmatrix} l_{11} & 0 & \cdots & 0 \\ l_{21} & l_{22} & \cdots & 0 \\ \vdots & & & \vdots \\ l_{N1} & l_{N2} & \cdots & l_{NN} \end{bmatrix} \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1N} \\ 0 & u_{22} & \cdots & u_{2N} \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & u_{NN} \end{bmatrix}$     (16)

    Since $a_{ij} = \sum_k l_{ik}u_{kj}$, equation (16) gives $N^2$ equations, where N is the number of rows (or
    columns) in matrix A, while the number of unknowns, i.e., the elements of the L and U matrices, is
    $N^2 + N$. An efficient algorithm, known as Crout's decomposition, results if the N diagonal
    elements of L, i.e., $l_{ii}$, are set to unity. This makes the number of remaining unknowns equal to
    the number of available equations, which can then be solved rather trivially by rearranging the
    equations in a certain order. Crout's algorithm for finding the $l_{ij}$ and $u_{ij}$ can be
    summarized as follows:

    Set $l_{ii} = 1$ for $i = 1, 2, \ldots, N$. For each $j = 1, 2, \ldots, N$, solve for     (17)

    $u_{ij} = a_{ij} - \sum_{k=1}^{i-1} l_{ik}u_{kj}$   for   $i = 1, 2, \ldots, j$

    $l_{ij} = \frac{1}{u_{jj}}\left(a_{ij} - \sum_{k=1}^{j-1} l_{ik}u_{kj}\right)$   for   $i = j+1, j+2, \ldots, N$
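    A direct transcription of equation (17) into code is straightforward. The following Python
    sketch (an illustration assuming a non-singular matrix and no pivoting) computes the Crout
    factors L and U column by column and verifies that their product recovers A.

        # Crout decomposition sketch following equation (17): unit diagonal in L.
        # Assumes a non-singular matrix and no pivoting; the test matrix is arbitrary.
        import numpy as np

        def crout_lu(A):
            A = np.asarray(A, dtype=float)
            N = A.shape[0]
            L = np.eye(N)                     # l_ii = 1
            U = np.zeros((N, N))
            for j in range(N):
                for i in range(j + 1):        # entries of U in column j
                    U[i, j] = A[i, j] - L[i, :i] @ U[:i, j]
                for i in range(j + 1, N):     # entries of L in column j
                    L[i, j] = (A[i, j] - L[i, :j] @ U[:j, j]) / U[j, j]
            return L, U

        A = np.array([[4., 3., 2.], [2., 4., 1.], [1., 2., 5.]])
        L, U = crout_lu(A)
        print(np.allclose(L @ U, A))          # True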


    Pivoting is necessary to avoid division by zero (which can be achieved by basic pivoting) as well
    as to reduce round-off error (which can be achieved by partial pivoting involving row-wise
    permutations of the matrix A). A variation of Crout's decomposition is the Doolittle
    decomposition, where the diagonal elements of the upper triangular matrix are all set to unity and
    the rest of the $u_{ij}$ ($i \neq j$) and the $l_{ij}$ are found by an algorithm similar to that in equation (17). It can be
    shown that this procedure is equivalent to the forward elimination step of the Gaussian
    elimination method described in section 5.1.2 above. Thus, the number of arithmetic operations
    needed to perform the LU decomposition is $N^3/3$. However, the final solution is obtained by a
    forward substitution step followed by a back substitution step, each of which takes about
    $N^2/2$ arithmetic operations. Thus, for large N, Gaussian elimination and LU
    decomposition are nearly equivalent in terms of the number of arithmetic
    operations required to obtain the solution. The principal advantage of the LU decomposition
    method lies in the fact that the decomposition step does not require manipulation of the right
    hand side vector, b, of equation (7). While LU decomposition is rarely used on its own in large
    CFD problems, an approximate or incomplete form of it is used to accelerate the convergence of

    some iterative methods, as will be discussed below.

    5.1.5 Cholesky Decomposition

    This is a special form of the LU decomposition for matrices which are symmetric and

    positive definite. If matrix A is symmetric, then

    $a_{ij} = a_{ji}$     (18)

    and if it is positive definite, then for all non-zero vectors v,

    $\mathbf{v}^T A\,\mathbf{v} > 0$     (19)

    which is equivalent to saying that all the eigenvalues of A are real and positive. A sufficient
    condition for positive definiteness is the Scarborough condition of general diagonal dominance,
    namely,

    $|a_{ii}| \geq \sum_{j=1,\, j\neq i}^{N} |a_{ij}|$   for all i, with

    $|a_{ii}| > \sum_{j=1,\, j\neq i}^{N} |a_{ij}|$   for at least one i     (20)

    For matrices which satisfy conditions (18) and (19), the LU decomposition can be performed as

    $A = LL^T$     (21)

    requiring half the number of arithmetic operations and without any pivoting. Noting that $U = L^T$,
    i.e., $u_{ij} = l_{ji}$, the algorithm for the Cholesky decomposition can be written as

    $l_{ii} = \left(a_{ii} - \sum_{k=1}^{i-1} l_{ik}^2\right)^{1/2}$

    and

    $l_{ji} = \frac{1}{l_{ii}}\left(a_{ji} - \sum_{k=1}^{i-1} l_{ik}l_{jk}\right)$   for   $j = i+1, i+2, \ldots, N$     (22)
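    Equation (22) likewise translates directly into a short routine. The following Python sketch
    assumes a symmetric, positive definite matrix and compares the result against the library
    routine np.linalg.cholesky; the test matrix is an arbitrary illustrative choice.

        # Cholesky decomposition sketch following equation (22).
        # Assumes A is symmetric and positive definite.
        import numpy as np

        def cholesky(A):
            A = np.asarray(A, dtype=float)
            N = A.shape[0]
            L = np.zeros((N, N))
            for i in range(N):
                L[i, i] = np.sqrt(A[i, i] - L[i, :i] @ L[i, :i])
                for j in range(i + 1, N):
                    L[j, i] = (A[j, i] - L[i, :i] @ L[j, :i]) / L[i, i]
            return L

        A = np.array([[4., 2., 1.], [2., 5., 2.], [1., 2., 6.]])
        L = cholesky(A)
        print(np.allclose(L @ L.T, A))               # True
        print(np.allclose(L, np.linalg.cholesky(A))) # matches the library result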


    Since the conditions of symmetry and positive definiteness of A are not satisfied in many cases

    dealing with fluid flow problems, the Cholesky decomposition is not used itself as a solution

    scheme but an incomplete version of it is used in conjugate gradient methods to be described

    later in this chapter.

    5.1.6 Direct Methods for Banded Matrices

    Banded coefficient matrices often arise in CFD problems based on structured
    mesh algorithms. In addition to allowing for more efficient storage of the coefficients (wherein
    the zero values of the coefficient matrix are usually not stored), banded matrices often

    permit simplification of the general elimination/ decomposition techniques described above to

    such an extent that very efficient solution methods may be found for simple, but not uncommon,

    banded matrices. Here, we examine the specialized algorithms for three such banded matrices:

    tridiagonal, pentadiagonal and block tridiagonal systems.

    Consider the set of algebraic equations represented by

    Tx = s (23)

    where T is an n x n tridiagonal matrix with elements $T_{i,i-1} = c_i$, $T_{i,i} = a_i$ and $T_{i,i+1} = b_i$. The
    standard LU decomposition of T then produces L and U matrices which are bidiagonal. This
    process can be simplified to produce an equivalent upper bidiagonal system of the form

    $Ux = y$     (24)

    where the diagonal elements of U are all unity, i.e., $U_{i,i} = 1$ for all i. Denoting the superdiagonal
    elements of U, namely $U_{i,i+1}$, by $d_i$, equation (24) implies that the ith equation of the
    tridiagonal system of equation (23), namely,

    $c_i x_{i-1} + a_i x_i + b_i x_{i+1} = s_i$     (25)

    is to be expressed as

    $x_i + d_i x_{i+1} = y_i$     (26)

    where the coefficients $d_i$ and $y_i$ are yet to be determined. This is done as follows. Solving (26)
    for $x_i$, we have $x_i = y_i - d_i x_{i+1}$, and thus $x_{i-1} = y_{i-1} - d_{i-1}x_i$.
    Substituting the above expression for $x_{i-1}$ into (25) and rearranging, we have

    $x_i + \frac{b_i}{a_i - c_i d_{i-1}}\,x_{i+1} = \frac{s_i - c_i y_{i-1}}{a_i - c_i d_{i-1}}$     (27)


    Comparing equations (26) and (27), we get the following recurrence relations to determine di

    and yi:

    $d_i = \frac{b_i}{a_i - c_i d_{i-1}}$,   $y_i = \frac{s_i - c_i y_{i-1}}{a_i - c_i d_{i-1}}$     (28)

    Also, for the first row, i.e., i = 1, we have $d_1 = b_1/a_1$ and $y_1 = s_1/a_1$. Using equation (28), the

    tridiagonal matrix can be converted into an upper bidiagonal matrix as shown schematically in

    Figure 5.XXX. It can be seen that the resulting matrix equation, Ux=y, can be solved readily by

    back-substitution.

    The above procedure, involving a simplification of the Gaussian elimination procedure,

    to solve the tridiagonal system given by equation (23) is known as the Thomas algorithm or the

    tridiagonal matrix algorithm (TDMA). It can be summarized as follows:

    Step I: Determine the coefficients $d_i$ and $y_i$ of the bidiagonal system (24) using the following
    formulae:

    $d_1 = b_1/a_1$   and   $y_1 = s_1/a_1$

    $d_{i+1} = \frac{b_{i+1}}{a_{i+1} - c_{i+1}d_i}$   and   $y_{i+1} = \frac{s_{i+1} - c_{i+1}y_i}{a_{i+1} - c_{i+1}d_i}$   for   $i = 1$ to $N-1$     (29)

    Step II: Solve the bidiagonal system (24) by back-substitution:

    $x_N = y_N$   and   $x_i = y_i - d_i x_{i+1}$   for   $i = N-1, N-2, \ldots, 2, 1$     (30)

    The Thomas algorithm, encapsulated in equations (29) and (30), is very efficient and the number
    of arithmetic operations required for solution of equation (23) varies as N (compared to $\sim N^3$ for
    the Gaussian elimination and LU decomposition techniques). No pivoting strategy is incorporated in
    the Thomas algorithm and it may therefore fail if a division by zero is encountered. It can be
    shown that this possibility does not arise if the matrix T is diagonally dominant, which is usually
    the case in many applications. In such cases, the Thomas algorithm provides a very efficient
    method for solving the set of linear equations.
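    The Thomas algorithm of equations (29) and (30) can be coded in a few lines. In the Python
    sketch below (an illustration only), the sub-, main and super-diagonals are supplied as the
    vectors c, a and b, and the test system is an arbitrary diagonally dominant tridiagonal matrix.

        # Thomas algorithm (TDMA) sketch following equations (29) and (30).
        import numpy as np

        def tdma(c, a, b, s):
            n = len(a)
            d = np.zeros(n)
            y = np.zeros(n)
            d[0] = b[0] / a[0]
            y[0] = s[0] / a[0]
            for i in range(n - 1):                      # Step I: forward sweep
                denom = a[i+1] - c[i+1] * d[i]
                d[i+1] = b[i+1] / denom
                y[i+1] = (s[i+1] - c[i+1] * y[i]) / denom
            x = np.zeros(n)
            x[-1] = y[-1]
            for i in range(n - 2, -1, -1):              # Step II: back substitution
                x[i] = y[i] - d[i] * x[i+1]
            return x

        # diagonally dominant tridiagonal test system (illustrative values)
        a = np.full(5,  2.0)                            # main diagonal
        b = np.full(5, -1.0)                            # super-diagonal
        c = np.full(5, -1.0)                            # sub-diagonal
        s = np.array([1., 0., 0., 0., 1.])
        print(tdma(c, a, b, s))                         # [1. 1. 1. 1. 1.]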

    In a number of cases, the elements $a_i$, $b_i$ and $c_i$ of the matrix T in equation (23) may not
    be scalars but may themselves be matrices (Anderson et al., 1984), and equation (23) may take
    the following form:

    $\begin{bmatrix} [B_1] & [C_1] & & & \\ [A_2] & [B_2] & [C_2] & & \\ & [A_3] & [B_3] & [C_3] & \\ & & \ddots & \ddots & \ddots \\ & & & [A_N] & [B_N] \end{bmatrix} \begin{bmatrix} [X_1] \\ [X_2] \\ [X_3] \\ \vdots \\ [X_N] \end{bmatrix} = \begin{bmatrix} [Y_1] \\ [Y_2] \\ [Y_3] \\ \vdots \\ [Y_N] \end{bmatrix}$     (31)

    where $[A_i]$, $[B_i]$ and $[C_i]$ are square sub-matrices and $[X_i]$ and $[Y_i]$ are the corresponding sub-vectors. Matrices of the form of the coefficient
    matrix in equation (31) are called block-tridiagonal matrices. The Thomas algorithm for scalar

    coefficients can be used to solve block-tridiagonal matrices also. The only modification required

    is that the division by the scalar coefficient should be replaced by multiplication with the inverse

    of the corresponding matrix. Thus, the Thomas algorithm for block-tridiagonal systems can be

    written as

    $d_1 = a_1^{-1}b_1$     (32a)

    $y_1 = a_1^{-1}s_1$     (32b)

    $d_{i+1} = (a_{i+1} - c_{i+1}d_i)^{-1}\,b_{i+1}$   for   $i = 1, \ldots, N-1$     (32c)

    $y_{i+1} = (a_{i+1} - c_{i+1}d_i)^{-1}(s_{i+1} - c_{i+1}y_i)$     (32d)

    and the back substitution step is given by

    $x_N = y_N$     (33a)

    $x_i = y_i - d_i x_{i+1}$     (33b)

    If the sub-matrices [A], [B], etc. are large, the matrix inversion steps in equation (32) may be
    performed without explicitly evaluating the inverse, for example, by solving $a_1 d_1 = b_1$ for $d_1$ instead of
    forming the inverse as indicated in equation (32a).

    The Thomas algorithm for tridiagonal matrices can be extended to the solution of

    pentadiagonal systems. Consider the set of equations

    Px = s (34)

    where P is a pentadiagonal matrix, i.e., it has non-zero elements only along five adjacent

    diagonals. Let these non-zero elements be represented by five row vectors as

    $P_{i,i-2} = e_i$;   $P_{i,i-1} = d_i$;   $P_{i,i} = a_i$;   $P_{i,i+1} = b_i$;   $P_{i,i+2} = c_i$     (35)

    The pentadiagonal system given by equation (34) can be converted first to a quadra-diagonal
    system

    $Qx = s'$     (36)

    where Q contains non-zero elements in (i,i-1), (i,i), (i,i+1) and (i,i+2), and then to an upper
    tridiagonal system

    $Tx = s''$     (37)


    where T contains non-zero elements in (i,i), (i,i+1) and (i,i+2), which can be solved by back-
    substitution. Denote the elements of the matrices Q and T in a manner analogous to that given
    for P in equation (35) as

    $Q_{i,i-1} = d'_i$;   $Q_{i,i} = a'_i$;   $Q_{i,i+1} = b'_i$;   $Q_{i,i+2} = c'_i$     (38)

    $T_{i,i} = 1$;   $T_{i,i+1} = b''_i$;   $T_{i,i+2} = c''_i$     (39)

    The procedure adopted for the tridiagonal system can be extended to obtain recurrence relations
    to determine the elements of Q, T, s' and s''. These details are not given here, but the resulting
    pentadiagonal matrix algorithm (PDMA) for the solution of the pentadiagonal system of
    equation (34), involving three steps, is given below:

    Step I: Find the elements of Q and s' using the following formulae:

    $a'_1 = a_1$,  $b'_1 = b_1$,  $c'_1 = c_1$,  $s'_1 = s_1$;   $d'_2 = d_2$,  $a'_2 = a_2$,  $b'_2 = b_2$,  $s'_2 = s_2$

    and, for $i = 2$ to $N-1$,   $c'_i = c_i$

    $a'_{i+1} = a_{i+1} - b'_i\,(e_{i+1}/d'_i)$

    $b'_{i+1} = b_{i+1} - c'_i\,(e_{i+1}/d'_i)$     (40)

    $d'_{i+1} = d_{i+1} - a'_i\,(e_{i+1}/d'_i)$

    $s'_{i+1} = s_{i+1} - s'_i\,(e_{i+1}/d'_i)$

    Step II: Find the elements of T and s'' using the following formulae:

    $b''_1 = b'_1/a'_1$,   $c''_1 = c'_1/a'_1$,   $s''_1 = s'_1/a'_1$

    and, for $i = 1$ to $N-1$,   $b''_{i+1} = \frac{b'_{i+1} - d'_{i+1}c''_i}{a'_{i+1} - d'_{i+1}b''_i}$

    $c''_{i+1} = \frac{c'_{i+1}}{a'_{i+1} - d'_{i+1}b''_i}$     (41)

    $s''_{i+1} = \frac{s'_{i+1} - d'_{i+1}s''_i}{a'_{i+1} - d'_{i+1}b''_i}$

    Step III: Find $x_i$ by modified backward substitution using the following formulae:

    $x_N = s''_N$

    $x_{N-1} = s''_{N-1} - b''_{N-1}x_N$

    and, for $i = N-2, N-3, \ldots, 1$,   $x_i = s''_i - b''_i x_{i+1} - c''_i x_{i+2}$     (42)

    Similar to the Thomas algorithm, the pentadiagonal scheme given above is a special version of

    Gaussian elimination procedure in which arithmetic operations involving the zero elements of

    the coefficient matrix are pruned out leaving a compact and efficient scheme. Even more

    efficient methods such as cyclic reduction methods are available for specialized matrices


    (Pozrikidis, 1998) but these are not discussed here.

    It should be noted that the TDMA and PDMA schemes given above are valid only when

    the three or five diagonals, respectively, are adjacent to each other. If the non-zero diagonals are
    separated by zero diagonals, as illustrated in Figure 5.Xxx, then the above schemes may not work.
    Specifically, the discretization of the transient heat conduction equation in two dimensions
    results in a penta-diagonal system, but one in which the diagonals are not adjacent to each other.

    The PDMA scheme cannot be used to solve this system, and thus has limited application in

    CFD-related problems. For such cases, the alternating direction implicit (ADI) method can be

    used which makes use of the TDMA scheme to solve equations implicitly in one direction at a

    time. This method is discussed later.

    5.2 Basic Iterative Methods

    Iterative methods adopt a completely different approach to the solution of the set of linear

    algebraic equations. Instead of solving the original equation

    Ax = b (7)

    they solve x = Px + q (43)

    where the matrix P and the vector q are constructed from A and b in equation (7). Equation (43)

    is solved by the method of successive approximations, also known as Picard's method. Starting
    with an arbitrary initial vector $x^0$, a sequence of vectors $x^k$, $k \geq 0$, is produced from the formula

    $x^{k+1} = Px^k + q$,   $k \geq 0$     (44)

    The iterative method is said to be convergent if

    $\lim_{k \to \infty} x^k = x$   for every initial vector $x^0$     (45)

    Whether or not an iterative method converges depends on the choice of the matrix P, known as
    the iteration matrix, in equation (43). Even for convergent methods, the rate of convergence is not
    necessarily the same. (For non-linear algebraic equations additional considerations arise and the
    choice of the initial vector, $x^0$, is also important; for linear problems, however, this is not the
    case.) Thus, an iterative method for the linear algebraic equation (7) is characterized by the

    construction of P and q; by the conditions for convergence of the sequence (44) and by the rate

    of convergence. Once these are determined, the implementation of an iterative scheme is rather

    simple compared to the direct methods so much so that Gauss, in 1823, was supposed to have


    written in reference to the iterative method (see Axelsson, 1994): "I recommend this modus
    operandi. You will hardly eliminate any more, at least not when you have more than two
    unknowns. The indirect method can be pursued while half asleep or while thinking about other
    things." This advantage of simplicity is negated by the theoretical limit of an infinite number of
    iterations needed to get the exact solution. However, there is a practical limit (the machine
    accuracy or round-off error) to the accuracy that can be attained when solving equations using
    modern computers, and it is sufficient to undertake only a finite number of iterations to achieve
    this for any convergent iterative method. Finally, when the algebraic equations are solved in a
    CFD context involving non-linear equations, it is not necessary to solve the equations even to
    machine accuracy. Some of these and other characteristics, both advantageous and
    disadvantageous in comparison with direct methods, are discussed below for three classical

    iterative methods, namely, the Jacobi method, the Gauss-Seidel method and the relaxation or

    more specifically, the successive overrelaxation (SOR) method. We first discuss how the

    iteration matrix P is constructed from the coefficient matrix A in each case and follow this up by

    an analysis of its convergence behaviour.

    All the three methods involve splitting of the matrix A into two matrices as

    A = M - N (46)

    where M is an easily invertible matrix, i.e., of diagonal, triangular, block diagonal or block
    triangular structure. It is noted that diagonal systems, i.e., systems in which the coefficient
    matrix is diagonal, can be solved readily (inverted, although the inverse of the matrix may not be
    explicitly computed in practice), while triangular systems can be solved efficiently by forward
    or back-substitution. Substituting the above splitting into equation (7), we obtain

    $(M - N)x = b$

    or   $Mx = Nx + b$

    or   $x = M^{-1}Nx + M^{-1}b$     (47)

    which is in the form of equation (43) with $P = M^{-1}N$ and $q = M^{-1}b$. In practice, $M^{-1}$ is not
    computed and the iterative scheme derived from equation (47) is written as

    $Mx^{k+1} = Nx^k + b$     (48)

    We can now discuss the above-mentioned three classical schemes in this framework.

    5.2.1 Jacobi Method

    In the Jacobi method, the set of equations comprising equation (7) are reordered, if

    necessary, in such a way that the diagonal elements of the coefficient matrix A are not zero and


    M is taken as a diagonal matrix containing all the diagonal elements of A. Thus, if the matrix A

    is split into three matrices, namely, D, E, and F such that

    $D_{ij} = a_{ij}\,\delta_{ij}$

    $E_{ij} = -a_{ij}$ if $i > j$ and 0 otherwise     (49)

    and $F_{ij} = -a_{ij}$ if $i < j$ and 0 otherwise

    then A can be written as

    A = D - E - F (50)

    In the Jacobi method,

    $M = D$,   $N = E + F$     (51)

    resulting in the iteration scheme

    $Dx = (E + F)x + b$

    or   $x = D^{-1}(E+F)x + D^{-1}b = D^{-1}(D-A)x + D^{-1}b$

    or   $x = (I - D^{-1}A)x + D^{-1}b$     (52)

    The implementation of the Jacobi scheme follows the iterative formula

    $Dx^{k+1} = (E+F)x^k + b$     (53)

    and, due to the diagonal form of D, is quite simple to solve. Denoting the iteration index k by a

    superscript, one iteration of the Jacobi scheme for the linear system given by equation (7) takes

    the following form:

    $a_{11}x_1^{k+1} = -a_{12}x_2^k - a_{13}x_3^k - \cdots - a_{1,n-1}x_{n-1}^k - a_{1n}x_n^k + b_1$

    $a_{22}x_2^{k+1} = -a_{21}x_1^k - a_{23}x_3^k - \cdots - a_{2,n-1}x_{n-1}^k - a_{2n}x_n^k + b_2$

    $\qquad\vdots$     (54)

    $a_{n-1,n-1}x_{n-1}^{k+1} = -a_{n-1,1}x_1^k - a_{n-1,2}x_2^k - a_{n-1,3}x_3^k - \cdots - a_{n-1,n}x_n^k + b_{n-1}$

    $a_{nn}x_n^{k+1} = -a_{n1}x_1^k - a_{n2}x_2^k - a_{n3}x_3^k - \cdots - a_{n,n-1}x_{n-1}^k + b_n$

    Its convergence behaviour will be analyzed later in the section.
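    A compact implementation of the point-Jacobi iteration is sketched below in Python. The
    stopping tolerance, the iteration limit and the small diagonally dominant test system are
    illustrative assumptions, not values taken from the text.

        # Point-Jacobi iteration sketch following equation (54).
        import numpy as np

        def jacobi(A, b, x0=None, tol=1e-10, max_iter=500):
            A = np.asarray(A, dtype=float)
            b = np.asarray(b, dtype=float)
            x = np.zeros_like(b) if x0 is None else np.asarray(x0, dtype=float)
            D = np.diag(A)                       # diagonal of A
            R = A - np.diagflat(D)               # off-diagonal part of A
            for k in range(max_iter):
                x_new = (b - R @ x) / D          # simultaneous update of all unknowns
                if np.linalg.norm(x_new - x, np.inf) < tol:
                    return x_new, k + 1
                x = x_new
            return x, max_iter

        A = np.array([[4., -1., 0.], [-1., 4., -1.], [0., -1., 4.]])
        b = np.array([2., 4., 10.])
        print(jacobi(A, b))                      # converges to [1, 2, 3]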

    5.2.2 Gauss-Seidel Method

    In the Gauss-Seidel method, the matrices M and N of equation (46) are taken as


    $M = D - E$,   $N = F$     (55)

    where D, E and F are given by equation (49) as above. The resulting iteration scheme is

    $(D - E)x = Fx + b$

    or   $x = (D-E)^{-1}Fx + (D-E)^{-1}b$     (56)

    The implementation of the Gauss-Seidel scheme follows the iterative formula

    $(D-E)x^{k+1} = Fx^k + b$     (57)

    Although equation (57) appears to be more complicated than the corresponding Jacobi formula
    given by equation (53), it can be rearranged to give the following equally simple formula:

    $Dx^{k+1} = Ex^{k+1} + Fx^k + b$     (58)

    which allows sequential evaluation of the $x_i$ in a form very similar to that of the Jacobi method.
    Denoting the iteration index k by a superscript as before, one iteration of the Gauss-Seidel
    scheme for the linear system given by equation (7) takes the following form:

    $a_{11}x_1^{k+1} = -a_{12}x_2^k - a_{13}x_3^k - \cdots - a_{1,n-1}x_{n-1}^k - a_{1n}x_n^k + b_1$

    $a_{22}x_2^{k+1} = -a_{21}x_1^{k+1} - a_{23}x_3^k - \cdots - a_{2,n-1}x_{n-1}^k - a_{2n}x_n^k + b_2$

    $\qquad\vdots$     (59)

    $a_{n-1,n-1}x_{n-1}^{k+1} = -a_{n-1,1}x_1^{k+1} - a_{n-1,2}x_2^{k+1} - a_{n-1,3}x_3^{k+1} - \cdots - a_{n-1,n}x_n^k + b_{n-1}$

    $a_{nn}x_n^{k+1} = -a_{n1}x_1^{k+1} - a_{n2}x_2^{k+1} - a_{n3}x_3^{k+1} - \cdots - a_{n,n-1}x_{n-1}^{k+1} + b_n$
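    The corresponding Gauss-Seidel sweep differs only in that each unknown is updated with the
    most recent values of the unknowns preceding it. A minimal Python sketch, with the same
    illustrative test system and tolerances as the Jacobi sketch above, is given below.

        # Point Gauss-Seidel iteration sketch following equation (59).
        import numpy as np

        def gauss_seidel(A, b, x0=None, tol=1e-10, max_iter=500):
            A = np.asarray(A, dtype=float)
            b = np.asarray(b, dtype=float)
            n = len(b)
            x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
            for k in range(max_iter):
                x_old = x.copy()
                for i in range(n):               # sequential sweep, i = 1 ... n
                    sigma = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
                    x[i] = (b[i] - sigma) / A[i, i]
                if np.linalg.norm(x - x_old, np.inf) < tol:
                    return x, k + 1
            return x, max_iter

        A = np.array([[4., -1., 0.], [-1., 4., -1.], [0., -1., 4.]])
        b = np.array([2., 4., 10.])
        print(gauss_seidel(A, b))                # roughly half the Jacobi iteration count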

    5.2.3 Successive Over-relaxation (SOR) Method

    The rate of convergence of either the Jacobi method or the Gauss-Seidel method can be

    changed by a simple technique known as relaxation, in which the value of the variable at the (k+1)th
    iteration is taken as

    $x^{k+1} = x^k + \omega\,\Delta x^k$     (60)

    where $\Delta x^k$ is the estimated improvement in $x^k$, i.e. $(x^{k+1}_* - x^k)$, where $x^{k+1}_*$ is the value estimated
    by the Jacobi or Gauss-Seidel method using equation (54) or (59). If $\omega < 1$, the update corresponds to
    under-relaxation; if $\omega > 1$, to over-relaxation. It can be shown (Ciarlet, 1989) that
    convergence is possible only for $0 < \omega < 2$, and that too under certain other conditions. Under-

    relaxation is often used to solve non-linear algebraic equations where divergence is often a real

    possibility. In the context of solution of linear algebraic equations, over-relaxation is often used

    to improve the convergence rate resulting in the method known as successive over-relaxation

    (SOR). When applied to the Jacobi method, the splitting of the matrix A corresponds to


    $M = \frac{D}{\omega}$,   $N = \frac{1-\omega}{\omega}D + E + F$     (61)

    resulting in the iteration scheme

    $\frac{D}{\omega}x = \left[\frac{1-\omega}{\omega}D + E + F\right]x + b$

    or   $x = \left(\frac{D}{\omega}\right)^{-1}\left[\frac{1-\omega}{\omega}D + E + F\right]x + \left(\frac{D}{\omega}\right)^{-1}b$     (62)

    Denoting the iteration index k by a superscript, one iteration of the Jacobi scheme with SOR for

    the linear system given by equation (7) takes the following form:

    $a_{11}x_1^{k+1} = a_{11}x_1^k - \omega\{a_{11}x_1^k + a_{12}x_2^k + a_{13}x_3^k + \cdots + a_{1n}x_n^k - b_1\}$

    $a_{22}x_2^{k+1} = a_{22}x_2^k - \omega\{a_{21}x_1^k + a_{22}x_2^k + a_{23}x_3^k + \cdots + a_{2n}x_n^k - b_2\}$

    $\qquad\vdots$     (63)

    $a_{nn}x_n^{k+1} = a_{nn}x_n^k - \omega\{a_{n1}x_1^k + a_{n2}x_2^k + a_{n3}x_3^k + \cdots + a_{nn}x_n^k - b_n\}$

    When SOR is applied to the Gauss-Seidel method, the splitting of the matrix A corresponds to

    $M = \frac{D}{\omega} - E$,   $N = \frac{1-\omega}{\omega}D + F$     (64)

    resulting in the iteration scheme

    $\left(\frac{D}{\omega} - E\right)x = \left[\frac{1-\omega}{\omega}D + F\right]x + b$

    or   $x = \left(\frac{D}{\omega} - E\right)^{-1}\left[\frac{1-\omega}{\omega}D + F\right]x + \left(\frac{D}{\omega} - E\right)^{-1}b$     (65)

    Denoting the iteration index k by a superscript, one iteration of the Gauss-Seidel scheme with

    SOR for the linear system given by equation (7) takes the following form:

    $a_{11}x_1^{k+1} = a_{11}x_1^k - \omega\{a_{11}x_1^k + a_{12}x_2^k + a_{13}x_3^k + \cdots + a_{1n}x_n^k - b_1\}$

    $a_{22}x_2^{k+1} = a_{22}x_2^k - \omega\{a_{21}x_1^{k+1} + a_{22}x_2^k + a_{23}x_3^k + \cdots + a_{2n}x_n^k - b_2\}$

    $\qquad\vdots$     (66)

    $a_{nn}x_n^{k+1} = a_{nn}x_n^k - \omega\{a_{n1}x_1^{k+1} + a_{n2}x_2^{k+1} + a_{n3}x_3^{k+1} + \cdots + a_{nn}x_n^k - b_n\}$

    It can be seen that the implementation of the successive over-relaxation technique requires little
    extra overhead in per-iteration work, while a significant speed-up of the convergence rate (up to an
    order of magnitude) can be obtained with an optimum value of the relaxation parameter, $\omega_{opt}$.
    However, as will be shown later, the optimum value is not known a priori in many cases and
    may have to be estimated on a trial and error basis in the initial stages of the computation.
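    A minimal Python sketch of the Gauss-Seidel iteration with over-relaxation, equation (66), is
    given below; the relaxation factors tried and the small test system are illustrative assumptions,
    and the printed iteration counts simply show how the count changes with $\omega$.

        # Gauss-Seidel with SOR, following equation (66): each unknown is moved by
        # w times the residual of its own equation, using the latest available values.
        import numpy as np

        def sor(A, b, w=1.25, x0=None, tol=1e-10, max_iter=500):
            A = np.asarray(A, dtype=float)
            b = np.asarray(b, dtype=float)
            n = len(b)
            x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
            for k in range(max_iter):
                x_old = x.copy()
                for i in range(n):
                    residual = A[i, :] @ x - b[i]     # residual of equation i
                    x[i] -= w * residual / A[i, i]
                if np.linalg.norm(x - x_old, np.inf) < tol:
                    return x, k + 1
            return x, max_iter

        A = np.array([[4., -1., 0.], [-1., 4., -1.], [0., -1., 4.]])
        b = np.array([2., 4., 10.])
        for w in (1.0, 1.1, 1.2):
            print(w, sor(A, b, w=w)[1])               # iteration count versus w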

    An essential difference between the Jacobi and the Gauss-Seidel methods lies in the order

    in which the values of xi are found using equations (54) and (59), respectively. In the Jacobi

    method, the order of the solution of equations (54) does not matter, and the updates may even be computed
    in parallel. Hence the Jacobi method is also known as a simultaneous iteration method. In the
    Gauss-Seidel method, the evaluation of successive $x_i$ has to be done in a prescribed order due to
    the substitution of the latest available values, and thus it is termed a successive iteration
    method. Thus, the term successive over-relaxation (SOR) method is usually applied in the
    context of a Gauss-Seidel SOR scheme (for example, see Ciarlet, 1989; Axelsson, 1994).

    5.2.4 Block Iterative Methods

    The above iterative methods have been applied to the evaluation of $x_i$, i = 1 to N, each of

    which is a scalar, corresponding to the value of a variable at a grid point. In such a case, the

    iterative schemes are called point methods, for example, point Jacobi method and point Gauss-

    Seidel method with SOR. These methods can be readily extended to the case when the

    coefficient matrix has a block structure such as the block tridiagonal structure given by equation

    (31). In this case, the matrix A can still be split into D, E and F each of which may consist of

    further sub-matrices. An example of such a decomposition is shown in Figure 5.xx (Fig. on p. 169
    of Ciarlet). The Jacobi or Gauss-Seidel iterative schemes are then applicable according to this
    block decomposition. The corresponding iterative scheme is then called a block-Jacobi or block-

    Gauss-Seidel scheme.

    Consider the block-decomposition of the matrix A shown in Figure 5.xx where the

    original n x n system of linear equations is decomposed into a 4 x 4 system of equations

    consisting of elements Apq. If this is split into matrices D, E, F, then D consists of the sub-

    matrices {$A_{11}$, $A_{22}$, $A_{33}$, $A_{44}$}, E consists of {$A_{21}$, $A_{31}$, $A_{32}$, $A_{41}$, $A_{42}$, $A_{43}$} and F of {$A_{12}$,
    $A_{13}$, $A_{14}$, $A_{23}$, $A_{24}$, $A_{34}$}. A block-Gauss-Seidel method with SOR for this 4 x 4 system can be
    written readily as

    $A_{11}x_1^{k+1} = A_{11}x_1^k - \omega\{A_{11}x_1^k + A_{12}x_2^k + A_{13}x_3^k + A_{14}x_4^k - b_1\}$

    $A_{22}x_2^{k+1} = A_{22}x_2^k - \omega\{A_{21}x_1^{k+1} + A_{22}x_2^k + A_{23}x_3^k + A_{24}x_4^k - b_2\}$

    $A_{33}x_3^{k+1} = A_{33}x_3^k - \omega\{A_{31}x_1^{k+1} + A_{32}x_2^{k+1} + A_{33}x_3^k + A_{34}x_4^k - b_3\}$     (67)

    $A_{44}x_4^{k+1} = A_{44}x_4^k - \omega\{A_{41}x_1^{k+1} + A_{42}x_2^{k+1} + A_{43}x_3^{k+1} + A_{44}x_4^k - b_4\}$

    where the $x_i$ and $b_i$ are themselves vectors. Thus, each of the above four equations itself
    represents a further set of linear equations to be solved either by direct or by iterative methods.
    This kind of situation often arises in the coupled solution of the Navier-Stokes equations, as will be
    discussed in the next chapter.

    5.2.5 Convergence Analysis of the Classical Iterative Schemes

    It is often thought that the convergence of Jacobi and Gauss-Seidel schemes goes

    together, i.e., one converges if the other does and that the latter method converges twice as fast.


    We dispel this notion by considering the following examples suggested by Ciarlet (1989).

    Consider, firstly, the set of three simultaneous linear algebraic equations given by

    $x_1 + 2x_2 - 2x_3 = -1$

    $x_1 + x_2 + x_3 = 6$     (68)

    $2x_1 + 2x_2 + x_3 = 9$

    It can be shown, using any direct method, that the correct solution is xi = {1,2,3}. The solution

    obtained using the Jacobi and Gauss-Seidel schemes with an initial guess of $x_i$ = {0,0,0} is shown in

    Table 5.1. It can be seen that the Jacobi method converges quickly while the Gauss-Seidel

    method diverges rapidly. Similar results are obtained for other initial guesses. The convergence

    behaviour of the SOR scheme is summarized in Table 5.2 where the calculated values are shown

    for Jacobi with SOR and Gauss-Seidel with SOR for the SOR parameter values of w = 1.25, 1.5,

    1.75 and 1.9. We can see that while the Gauss-Seidel method diverges rapidly in all the cases,

    the Jacobi method converges eventually in all cases although it goes through large negative and/or positive values before converging in some cases. Also, increasing the value of the relaxation

    parameter appears to worsen the convergence behaviour.

    Consider now a second example:

    $2x_1 - x_2 + x_3 = 3$

    $x_1 + x_2 + x_3 = 6$     (69)

    $x_1 + x_2 - 2x_3 = -3$

    Again, it can be shown that the correct solution is xi = {1,2,3}. The solution obtained using the

    Jacobi and Gauss-Seidel schemes for this case is shown in Table 5.3 for an initial guess of xi =

    {0,0,0}. For this case, the Jacobi scheme diverges but the Gauss-Seidel scheme converges. The

    Jacobi with SOR is also found to diverge for this example. The Gauss-Seidel scheme with SOR

    exhibits a more complicated behaviour and the iteration values are summarized in Table 5.4 for

    relaxation parameter values of 0.75, 1.05, 1.10 and 1.14. The first value, corresponding to

    under-relaxation, appears to make the convergence faster while for the last value of 1.14, the

    scheme diverges.
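    The behaviour described above can be reproduced by forming the iteration matrices explicitly.
    The following Python sketch (an illustration, not part of the original text) computes the
    spectral radii of the Jacobi and Gauss-Seidel iteration matrices for systems (68) and (69) from
    the splitting A = D - E - F.

        # Spectral radii of the Jacobi and Gauss-Seidel iteration matrices for the
        # two example systems; the printed values come from the run, not the text.
        import numpy as np

        def spectral_radii(A):
            D = np.diag(np.diag(A))
            E = -np.tril(A, -1)                  # strictly lower part
            F = -np.triu(A,  1)                  # strictly upper part
            P_jac = np.linalg.inv(D) @ (E + F)
            P_gs  = np.linalg.inv(D - E) @ F
            rho = lambda P: max(abs(np.linalg.eigvals(P)))
            return rho(P_jac), rho(P_gs)

        A1 = np.array([[1., 2., -2.], [1., 1., 1.], [2., 2., 1.]])   # system (68)
        A2 = np.array([[2., -1., 1.], [1., 1., 1.], [1., 1., -2.]])  # system (69)
        print(spectral_radii(A1))   # Jacobi radius < 1 (converges), GS radius > 1
        print(spectral_radii(A2))   # Jacobi radius > 1 (diverges), GS radius < 1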

    Thus, the convergence behaviour of these classical schemes can be more complicated
    than what appears to be the case. A necessary and sufficient condition (Axelsson, 1994) for
    convergence of an iterative scheme of the form given by equation (43)

    $x^{k+1} = Px^k + q$,   $k \geq 0$     (43)

    for the non-singular algebraic equations given by equation (7)

    Ax = b (7)

    is that the spectral radius of the iteration matrix, P, is less than unity, i.e.,


    $\rho(P) < 1$   for convergence     (70)

    The spectral radius of a matrix P is the non-negative number defined by

    $\rho(P) = \max\{|\lambda_i|,\ 1 \leq i \leq n\}$

    where the $\lambda_i$ are the eigenvalues of P. Writing the error at iteration m as $e^m = x^m - x$, the iteration (43)
    gives $e^m = P^m e^0$, and $\rho(P) < 1$ ensures that $e^m \to 0$ as $m \to \infty$, leading to a converged solution. The rate at which the error decreases is governed, for large m,
    by the largest eigenvalue of P, i.e., by its spectral radius. Thus, the average convergence factor
    (per step for m steps) depends on the spectral radius; the smaller the spectral radius, the
    faster the rate of convergence. The asymptotic rate of
    convergence is proportional to $-\ln[\rho(P)]$, i.e., asymptotically as $m \to \infty$, the error decreases by a
    factor of e every $1/(-\ln \rho(P))$ iterations, or by one order of magnitude (a factor of ten) for every
    $2.3/(-\ln \rho(P))$ iterations.

    Consider the simple case of the Laplace equation in the square [0,a] x [0,a] in the x- and

    y-direction. If this square is divided into a uniform mesh of M x M subrectangles, then for a

    five-point Jacobi scheme, the eigenvalues of the iteration matrix are given by (see, for example,

    Hirsch, 1988)

    $\lambda_{Jac} = 1 - \left[\sin^2\left(\frac{l\pi}{2M}\right) + \sin^2\left(\frac{m\pi}{2M}\right)\right] = \frac{1}{2}\left[\cos\left(\frac{l\pi}{M}\right) + \cos\left(\frac{m\pi}{M}\right)\right]$     (78)

    where l and m correspond to the grid indices in the x- and y-directions, respectively, and take

    values from 1 to (M-1). The spectral radius of the Jacobi iteration matrix is given by the highest

    eigenvalue, that is, for l = m = 1 in equation (78) and is therefore

    $\rho(P_J) = \cos(\pi/M)$     (79)

    For large M, the spectral radius of the Jacobi iteration matrix can be approximated as

    $\rho(P_J) \approx 1 - \frac{\pi^2}{2M^2}$     (80)

    The asymptotic convergence rate for the Jacobi scheme for the Laplace equation with Dirichlet
    boundary conditions thus corresponds to $2M^2/\pi^2$ iterations to reduce the error by a factor of e, or
    $0.466N$ iterations to reduce the error by one order of magnitude, where
    N is the number of equations (M x M) being solved simultaneously. Each equation of the Jacobi
    iteration requires 5 arithmetic operations, and thus the total number of arithmetic operations
    required to reduce the error by four orders of magnitude for the Jacobi scheme is about $9N^2$. This
    compares favourably with the $N^3/3$ variation for the Gaussian elimination scheme discussed
    earlier for large values of N.

    For the Gauss-Seidel method, it can be shown that the corresponding eigenvalues of the

    amplification matrix are equal to the square of the eigenvalues of the Jacobi method; thus,

    $\lambda_{GS} = \frac{1}{4}\left[\cos\left(\frac{l\pi}{M}\right) + \cos\left(\frac{m\pi}{M}\right)\right]^2$     (81)

    Hence the spectral radius of the iteration matrix for the Gauss-Seidel method for the Laplace

    equation is given by

    $\rho(P_{GS}) = \cos^2(\pi/M)$     (82)

    which can be approximated for large M as

    $\rho(P_{GS}) \approx 1 - \frac{\pi^2}{M^2}$     (83)


    Thus, the Gauss-Seidel method converges twice as fast as the Jacobi method for this problem.

    For example, the total number of arithmetic operations required to reduce the error by four

    orders of magnitude will be $4.5N^2$.
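    The estimates above are easy to evaluate numerically. The following Python sketch computes
    the spectral radii of equations (79) and (82) and the corresponding number of iterations needed
    to reduce the error by one order of magnitude, for a few illustrative values of M.

        # Spectral radii and iteration estimates for the Laplace model problem.
        import numpy as np

        for M in (10, 50, 100):
            rho_jac = np.cos(np.pi / M)                 # equation (79)
            rho_gs  = np.cos(np.pi / M) ** 2            # equation (82)
            # iterations to reduce the error by one order: 2.3 / (-ln rho)
            n_jac = 2.3 / -np.log(rho_jac)
            n_gs  = 2.3 / -np.log(rho_gs)
            print(M, round(n_jac), round(n_gs), round(n_jac / n_gs, 2))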

    The above results of the relative convergence of the Jacobi and Gauss-Seidel methods for

    the Laplace (and Poisson) equation can be generalized to the case where the coefficient matrix A

    is symmetric and positive definite. It can be shown (Hirsch, 1988) that for such matrices, the

    SOR method would converge provided the overrelaxation factor satisfies the relation

    $0 < \omega < \frac{2}{1 + \rho(P)}$     (84)

    where $\rho(P)$ is the spectral radius of the iteration matrix of the pure Jacobi or Gauss-Seidel
    method (i.e., without overrelaxation). The convergence rate for these methods depends
    strongly on the overrelaxation factor and is greatly improved near the optimal relaxation factor,
    $\omega_{opt}$, which corresponds to the minimum spectral radius of the iteration matrix. For the Jacobi

    method with SOR, the eigenvalues of the iteration matrix are related by

    $\lambda(P_{JSOR}) = (1-\omega) + \omega\,\lambda(P_J)$     (85)

    where lambda(PJ) are the eigenvalues of the pure Jacobi iteration matrix. The optimum value of

    the overrelaxation parameter, w, for the JSOR method is given by

    $\omega_{opt,JSOR} = \frac{2}{2 - (\lambda_{min} + \lambda_{max})}$     (86)

    where $\lambda_{min}$ and $\lambda_{max}$ are the minimum and the maximum eigenvalues of the iteration
    matrix corresponding to the pure Jacobi method. For the specific case of the Laplace equation

    with Dirichlet boundary conditions, the minimum and maximum eigenvalues are given by

    $\lambda_{max} = -\lambda_{min} = \cos(\pi/M)$

    and the optimal relaxation occurs for w = 1 corresponding to the Jacobi method without SOR

    itself!

    For the point Gauss-Seidel method with SOR, the eigenvalues of the corresponding

    iteration matrix are related by

    $\lambda_{GSSOR} = 1 - \omega + \omega\,\lambda(P_{Jac})\,\lambda_{GSSOR}^{1/2}$     (87)

    The optimal relaxation factor, when A is symmetric and positive definite, is given by


    $\omega_{opt,GSSOR} = \frac{2}{1 + \left[1 - \rho^2(P_{Jac})\right]^{1/2}}$     (88)

    The spectral radius for the optimal relaxation factor is given by

    $\rho(P_{GSSOR}) = \omega_{opt} - 1$     (89)

    For the case of the Laplace equation with Dirichlet conditions, the spectral radius of the Jacobi

    iteration matrix is cos (pi/M) and hence wopt for the GSSOR method is given by

    $\omega_{opt} = \frac{2}{1 + \sin(\pi/M)} \approx 2\left(1 - \frac{\pi}{M}\right)$

    Hence the spectral radius at optimum relaxation is given by

    $\rho(P_{GSSOR}) = \omega_{opt} - 1 \approx 2\left(1 - \frac{\pi}{M}\right) - 1 = 1 - \frac{2\pi}{M}$     (90)

    Comparing this with the corresponding expression (equation (83)) without SOR, we find that, for

    large M and in the asymptotic limit, the number of iterations required to reduce error by an order

    of magnitude varies as N^0.5 and that the total number of arithmetic operations required to

    reduce error by a decade varies as N^1.5, compared to N^2 variation for the pure GS method.

    This represents a considerably enhanced convergence rate for large values of N.

    It is interesting to note from equations (79), (82) and (90) that as the number of

    subdivisions (M) increases, the spectral radius approaches closer to unity and the convergence

    rate therefore decreases. Thus, for diffusion-dominated flows, convergence would be slower on

    finer grids and a larger number of iterations would be required to reduce the error by a given

    factor.

    The above discussion is applicable in the asymptotic limit of large number of iterations

    and for large number of subdivisions (large M). In the initial stages of computation, the residual

    reduction may typically follow one of the two idealized curves shown in Figure 5.xxx (Fig.

    12.1.3 of Hirsch) depending on the spectrum of eigenvalues present in the iteration matrix and

    their damping characteristics. A residual history of the type of curve a may be expected if the

    iterative scheme damps the high frequency errors rapidly but damps the low frequency errors poorly.
    A response of the type of curve b may be obtained if the low frequency errors are

    damped rapidly. A detailed discussion of this eigenvalue analysis of an iterative method is

    presented in Hirsch (1988). Such detailed analysis of the properties of iterative schemes is

    possible for symmetric matrices. For unsymmetric matrices, the convergence behaviour is more

    complicated and a non-monotone convergence may be expected, and the interested reader is

    referred to Axelsson (1994) for more details.


    The above results on the conditions for convergence and the rate of convergence

    expressed in terms of the spectral radius of the iteration matrix are not very useful in practice
    because of the large size of the matrices involved, typically of the order of $10^4 \times 10^4$ or more. The

    evaluation of the eigenvalues and the spectral radius for such cases may not be practically

    feasible. The following more restrictive condition, but one which is readily implementable, is

    often used as a guide for the convergence behaviour of iterative schemes:

    A sufficient condition for convergence of an iterative scheme is that the matrix A (in
    equation 7) is irreducible and has general diagonal dominance, i.e.,

    $|a_{ii}| \geq \sum_{j \neq i} |a_{ij}|$   for all i

    $|a_{ii}| > \sum_{j \neq i} |a_{ij}|$   for at least one i     (91)

    Both these conditions, namely irreducibility and diagonal dominance, require some
    clarification. A reducible matrix does not require the simultaneous solution of all the equations;
    simultaneous solution of a reduced set of equations is possible. When this is not the case, the
    matrix A is said to be irreducible.

    Diagonal dominance of a matrix refers to the condition that

    $|a_{ii}| > \sum_{j \neq i} |a_{ij}|$   for all i     (94)

    A matrix which satisfies equation (94) is said to be strictly diagonally dominant. A matrix which
    satisfies the strict inequality (94) for at least one row and satisfies the other rows with $\geq$ in place
    of >, is said to be generally diagonally dominant. A matrix satisfying the irreducibility
    and the general diagonal dominance conditions is called an irreducibly diagonally dominant matrix. It can be
    shown (Ciarlet, 1989) that under such conditions, the spectral radius of the iteration matrices
    corresponding to the Jacobi and Gauss-Seidel methods is less than unity and that they converge

    for all initial vectors, x0. Also, the SOR method would converge for 0 < w < 2.
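    The general diagonal dominance condition (91) is simple to test in code. The following Python
    sketch (an illustration; irreducibility is not checked) examines a matrix row by row; the two
    test matrices are the diagonally dominant example used in the earlier sketches and the matrix of
    system (68).

        # Row-by-row test of general diagonal dominance, condition (91).
        import numpy as np

        def is_generally_diagonally_dominant(A):
            A = np.abs(np.asarray(A, dtype=float))
            diag = np.diag(A)
            off  = A.sum(axis=1) - diag          # sum of off-diagonal magnitudes per row
            return bool(np.all(diag >= off) and np.any(diag > off))

        A = np.array([[4., -1., 0.], [-1., 4., -1.], [0., -1., 4.]])
        B = np.array([[1., 2., -2.], [1., 1., 1.], [2., 2., 1.]])     # system (68)
        print(is_generally_diagonally_dominant(A))   # True
        print(is_generally_diagonally_dominant(B))   # False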

    Let us summarize the results that we have obtained in this section. For the basic iterative

    methods, namely, the Jacobi method and the Gauss-Seidel method, the number of arithmetic
    operations required to reduce the residual by k orders of magnitude varies as $\sim kN^2$, where N is
    the number of equations being solved simultaneously. For the optimal SOR method, the number
    of arithmetic operations may vary as $\sim kN^{1.5}$, constituting a significant improvement for large

    N. However, the convergence rate decreases rather sharply for non-optimal (especially sub-

    optimal) values of the relaxation factor and, in a practical implementation of the iterative

    scheme, a search for an optimal value of w should be incorporated for large N while taking

    account of the disparity of variation of the residual in the initial stages of the computation


    (Figure 5.xxx).

    The above results, which are valid for diffusion-dominated flows, compare very

    favourably with the best general purpose direct method for the solution of the same equation,

    namely, the Gaussian elimination method, which takes $N^3/3$ arithmetic operations

    for the solution (to machine accuracy). The basic iterative methods are much simpler, easier to

    program, require less storage and take advantage of the structure of the matrix. However, these

    advantages are gained at the cost of a significant curtailment of applicability. While the

    Gaussian elimination method (with pivoting) would work for any non-singular matrix, the basic

    iterative methods are limited to diagonally dominant matrices. Also, they compare significantly
    less favourably with the best direct method for a diagonally dominant

    tridiagonal matrix, namely, the Thomas algorithm, for which the number of operations varies as

    ~ N. While SOR remains an effective means of enhancing the convergence rate of the basic

    methods, its efficacy is reduced for non-optimal values and may not work at all in some cases, as

    shown in Table 5.x (GS with $\omega$ = 1.14). A number of methods and strategies have been
    developed to improve the convergence behaviour of the iterative methods. Some of

    these are discussed below.

    5.3 Advanced Iterative Methods

    A number of iterative methods have been devised to improve the convergence behaviour

    of the basic iterative schemes. These take the form of (i) reducing the sensitivity of the SOR
    method to the choice of $\omega$, e.g., Chebyshev iterative methods; (ii) taking advantage of the
    efficient Thomas algorithm for tridiagonal matrices, e.g., ADI methods; (iii) providing a better-
    conditioned splitting of the matrix A into M and N, e.g., strongly implicit methods; and (iv)
    taking advantage of the error-smoothing properties of some iterative methods, e.g., multigrid

    methods.