fast direct solution of a large linear system

CHAPTER 10

Fast Direct Solution of a Large Linear System

Exercise 10.5: Fourier matrix

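The Fourier matrix $F_N$ has entries

\[
(F_N)_{j,k} = \omega_N^{(j-1)(k-1)}, \qquad \omega_N := e^{-\frac{2\pi}{N} i} = \cos\left(\frac{2\pi}{N}\right) - i \sin\left(\frac{2\pi}{N}\right).
\]

In particular for $N = 4$, this implies that $\omega_4 = -i$ and

\[
F_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -i & -1 & i \\ 1 & -1 & 1 & -1 \\ 1 & i & -1 & -i \end{bmatrix}.
\]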

Computing the transpose and Hermitian transpose gives

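\[
F_4^T = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -i & -1 & i \\ 1 & -1 & 1 & -1 \\ 1 & i & -1 & -i \end{bmatrix} = F_4, \qquad
F_4^H = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1 & -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i \end{bmatrix} \neq F_4,
\]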

which is what needed to be shown.
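The same check can be run numerically. The Matlab sketch below builds the Fourier matrix directly from its definition and compares it with its transpose and its Hermitian transpose. The variable names are our own and not taken from the text. Note that in Matlab `.'` is the plain transpose while `'` is the Hermitian transpose.

```matlab
N = 4;
omega = exp(-2*pi*1i/N);           % omega_N = e^{-2 pi i / N}
[J, K] = ndgrid(0:N-1, 0:N-1);     % exponents (j-1) and (k-1)
F = omega.^(J.*K);                 % (F_N)_{j,k} = omega_N^{(j-1)(k-1)}
F = round(F);                      % valid for N = 4 only: entries lie in {1, -1, i, -i}
isSymmetric = isequal(F, F.');     % tests F_4^T == F_4
isHermitian = isequal(F, F');      % tests F_4^H == F_4
disp([isSymmetric, isHermitian])
```

The rounding step is an assumption that only makes sense for $N = 4$, where every entry has integer real and imaginary parts; for general $N$ one would compare up to a tolerance instead.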

Exercise 10.6: Sine transform as Fourier transform

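According to Lemma 10.2, the Discrete Sine Transform can be computed from the Discrete Fourier Transform by $(S_m \mathbf{x})_k = \frac{i}{2} (F_{2m+2} \mathbf{z})_{k+1}$, where

\[
\mathbf{z} = [0, x_1, \ldots, x_m, 0, -x_m, \ldots, -x_1]^T.
\]

For $m = 1$ this means that

\[
\mathbf{z} = [0, x_1, 0, -x_1]^T \qquad \text{and} \qquad (S_1 \mathbf{x})_1 = \frac{i}{2} (F_4 \mathbf{z})_2.
\]

Since $h = \frac{1}{m+1} = \frac{1}{2}$ for $m = 1$, computing the DST directly gives

\[
(S_1 \mathbf{x})_1 = \sin(\pi h)\, x_1 = \sin\left(\frac{\pi}{2}\right) x_1 = x_1,
\]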

while computing the Fourier transform gives

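\[
F_4 \mathbf{z} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -i & -1 & i \\ 1 & -1 & 1 & -1 \\ 1 & i & -1 & -i \end{bmatrix}
\begin{bmatrix} 0 \\ x_1 \\ 0 \\ -x_1 \end{bmatrix}
= \begin{bmatrix} 0 \\ -2i x_1 \\ 0 \\ 2i x_1 \end{bmatrix}
= -2i \begin{bmatrix} 0 \\ x_1 \\ 0 \\ -x_1 \end{bmatrix} = -2i \mathbf{z}.
\]

Multiplying the Fourier transform by $\frac{i}{2}$, one finds $\frac{i}{2} F_4 \mathbf{z} = \mathbf{z}$, so that $\frac{i}{2} (F_4 \mathbf{z})_2 = x_1 = (S_1 \mathbf{x})_1$, which is what we needed to show.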
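The relation of Lemma 10.2 can also be checked numerically for larger $m$. The following Matlab sketch is our own illustration, with an arbitrary test size and test vector. It relies on the fact that Matlab's `fft` uses the same sign convention $e^{-2\pi i/N}$ as the Fourier matrix above, so that `fft(z)` equals $F_{2m+2}\mathbf{z}$.

```matlab
m = 5;                               % test size (our choice), any m >= 1 works
h = 1/(m+1);
x = (1:m)';                          % arbitrary test vector
S = sin(pi*h*(1:m)'*(1:m));          % sine matrix with entries sin(j*k*pi*h)
z = [0; x; 0; -x(end:-1:1)];         % z = [0, x_1..x_m, 0, -x_m..-x_1]^T
y = fft(z);                          % equals F_{2m+2}*z
dstViaFft = real(1i/2*y(2:m+1));     % (S_m x)_k = (i/2)(F_{2m+2} z)_{k+1}; real() drops rounding noise
err = max(abs(S*x - dstViaFft));     % should be at rounding level
disp(err)
```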


Exercise 10.7: Explicit solution of the discrete Poisson equation

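For any integer $m \ge 1$, let $h = 1/(m+1)$. For $j = 1, \ldots, m$, let $\lambda_j = 4 \sin^2(j\pi h/2)$, $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_m)$, and $S = (s_{jk})_{jk} = \big(\sin(jk\pi h)\big)_{jk}$. By Section 10.2, the solution to the discrete Poisson equation is $V = SXS$, where $X$ is found by solving $DX + XD = 4h^4 SFS$. Since $D$ is diagonal, one has $(DX + XD)_{pr} = (\lambda_p + \lambda_r) x_{pr}$, so that

\[
x_{pr} = 4h^4 \frac{(SFS)_{pr}}{\lambda_p + \lambda_r} = 4h^4 \sum_{k=1}^m \sum_{l=1}^m \frac{s_{pk} f_{kl} s_{lr}}{\lambda_p + \lambda_r}
\]

so that

\[
\begin{aligned}
v_{ij} &= \sum_{p=1}^m \sum_{r=1}^m s_{ip} x_{pr} s_{rj} = 4h^4 \sum_{p=1}^m \sum_{r=1}^m \sum_{k=1}^m \sum_{l=1}^m \frac{s_{ip} s_{pk} s_{lr} s_{rj}}{\lambda_p + \lambda_r} f_{kl} \\
&= h^4 \sum_{p=1}^m \sum_{r=1}^m \sum_{k=1}^m \sum_{l=1}^m \frac{\sin\left(\frac{ip\pi}{m+1}\right) \sin\left(\frac{pk\pi}{m+1}\right) \sin\left(\frac{lr\pi}{m+1}\right) \sin\left(\frac{rj\pi}{m+1}\right)}{\sin^2\left(\frac{p\pi}{2(m+1)}\right) + \sin^2\left(\frac{r\pi}{2(m+1)}\right)} f_{kl},
\end{aligned}
\]

which is what needed to be shown.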
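As a sanity check, the fourfold sum can be evaluated literally for a small grid and compared with a direct solve of $TV + VT = h^2 F$. The sketch below is our own. The test data and the Kronecker-product reference solve are assumptions made for illustration, not part of the exercise.

```matlab
m = 4;  h = 1/(m+1);
F = reshape(1:m^2, m, m);            % arbitrary test right-hand side
T = 2*eye(m) - diag(ones(m-1,1),1) - diag(ones(m-1,1),-1);
% Reference: T*V + V*T = h^2*F in Kronecker form, (I (x) T + T (x) I) vec(V) = h^2 vec(F)
Vref = reshape((kron(eye(m),T) + kron(T,eye(m))) \ (h^2*F(:)), m, m);
s  = @(a,b) sin(a*b*pi/(m+1));       % s_{ab} = sin(a*b*pi/(m+1))
sg = @(p) sin(p*pi/(2*(m+1)))^2;     % sin^2(p*pi/(2(m+1)))
V = zeros(m);
for i = 1:m
  for j = 1:m
    for p = 1:m
      for r = 1:m
        for k = 1:m
          for l = 1:m
            V(i,j) = V(i,j) + h^4*s(i,p)*s(p,k)*s(l,r)*s(r,j)/(sg(p)+sg(r))*F(k,l);
          end
        end
      end
    end
  end
end
err = max(abs(V(:) - Vref(:)));      % should be at rounding level
disp(err)
```

The six nested loops cost $O(m^6)$ operations, so this is only useful as a check of the formula, not as a solver.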

Exercise 10.8: Improved version of Algorithm 10.1

We are given that

\[
(\star) \qquad TV + VT = h^2 F.
\]

Let $T = SDS^{-1}$ be the orthogonal diagonalization of $T$ from Equation (10.4), and write $X = VS$ and $C = h^2 FS$.

(a) Multiplying Equation $(\star)$ from the right by $S$, one obtains

\[
TX + XD = TVS + VSD = TVS + VTS = h^2 FS = C.
\]

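(b) Writing $C = [\mathbf{c}_1, \ldots, \mathbf{c}_m]$, $X = [\mathbf{x}_1, \ldots, \mathbf{x}_m]$ and applying the rules of block multiplication, we find

\[
\begin{aligned}
[\mathbf{c}_1, \ldots, \mathbf{c}_m] &= C \\
&= TX + XD \\
&= T[\mathbf{x}_1, \ldots, \mathbf{x}_m] + X[\lambda_1 \mathbf{e}_1, \ldots, \lambda_m \mathbf{e}_m] \\
&= [T\mathbf{x}_1 + \lambda_1 X\mathbf{e}_1, \ldots, T\mathbf{x}_m + \lambda_m X\mathbf{e}_m] \\
&= [T\mathbf{x}_1 + \lambda_1 \mathbf{x}_1, \ldots, T\mathbf{x}_m + \lambda_m \mathbf{x}_m] \\
&= [(T + \lambda_1 I)\mathbf{x}_1, \ldots, (T + \lambda_m I)\mathbf{x}_m],
\end{aligned}
\]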

which is equivalent to System (10.9). To find $X$, we therefore need to solve the $m$ tridiagonal linear systems of (10.9). Since the eigenvalues $\lambda_1, \ldots, \lambda_m$ are positive, each matrix $T + \lambda_j I$ is diagonally dominant. By Theorem 1.24, every such matrix is nonsingular and has a unique LU factorization. Algorithms 1.8 and 1.9 then solve the corresponding system $(T + \lambda_j I)\mathbf{x}_j = \mathbf{c}_j$ in $O(\delta m)$ operations for some constant $\delta$. Doing this for all $m$ columns $\mathbf{x}_1, \ldots, \mathbf{x}_m$, one finds the matrix $X$ in $O(\delta m^2)$ operations.

(c) To find $V$, we first find $C = h^2 FS$ by performing $O(2m^3)$ operations. Next we find $X$ as in step (b) by performing $O(\delta m^2)$ operations. Finally we compute $V = 2hXS$ by performing $O(2m^3)$ operations. In total, this amounts to $O(4m^3)$ operations.


(d) As explained in Section 10.3, multiplying by the matrix $S$ can be done in $O(2m^2 \log_2 m)$ operations by using the Fourier transform. With $n = m^2$, the two matrix multiplications in (c) can therefore be carried out in

\[
O(4\delta m^2 \log_2 m) = O(4\delta n \log_2 n^{1/2}) = O(2\delta n \log_2 n)
\]

operations.
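The three steps of part (c) can be sketched in Matlab as follows. This is our own illustration under stated assumptions: the test data are arbitrary, and the dense backslash operator merely stands in for Algorithms 1.8 and 1.9, so the sketch does not achieve the $O(\delta m)$ cost per column that a dedicated tridiagonal solver would.

```matlab
m = 6;  h = 1/(m+1);
F = ones(m);                                  % arbitrary test right-hand side
T = 2*eye(m) - diag(ones(m-1,1),1) - diag(ones(m-1,1),-1);
S = sin(pi*h*(1:m)'*(1:m));                   % sine matrix
lambda = 4*sin((1:m)*pi*h/2).^2;              % eigenvalues of T
C = h^2*F*S;                                  % step 1: C = h^2*F*S
X = zeros(m);
for j = 1:m                                   % step 2: m tridiagonal systems
  X(:,j) = (T + lambda(j)*eye(m)) \ C(:,j);   % backslash replaces Algorithms 1.8/1.9 here
end
V = 2*h*X*S;                                  % step 3: V = X*inv(S) = 2h*X*S
res = max(max(abs(T*V + V*T - h^2*F)));       % residual of the original equation
disp(res)
```

The last step uses $S^{-1} = 2hS$, which follows from $S^2 = I/(2h)$.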

Exercise 10.9: Fast solution of 9 point scheme

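Analogously to Section 10.2, we use the relations between the matrices $T$, $S$, $X$, $D$ to rewrite Equation (9.18).

\[
\begin{aligned}
& TV + VT - \tfrac{1}{6} TVT = h^2 \mu F \\
\iff{} & TSXS + SXST - \tfrac{1}{6} TSXST = h^2 \mu F \\
\iff{} & STSXS^2 + S^2XSTS - \tfrac{1}{6} STSXSTS = h^2 \mu SFS \\
\iff{} & S^2DXS^2 + S^2XS^2D - \tfrac{1}{6} S^2DXS^2D = h^2 \mu SFS \\
\iff{} & DX + XD - \tfrac{1}{6} DXD = 4h^4 \mu SFS = 4h^4 G
\end{aligned}
\]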

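Writing $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_m)$, the $(j,k)$-th entry of $DX + XD - \frac{1}{6} DXD$ is equal to $\lambda_j x_{jk} + x_{jk} \lambda_k - \frac{1}{6} \lambda_j x_{jk} \lambda_k$. Isolating $x_{jk}$ and writing $\lambda_j = 4\sigma_j = 4\sin^2(j\pi h/2)$ then yields

\[
x_{jk} = \frac{4h^4 g_{jk}}{\lambda_j + \lambda_k - \frac{1}{6} \lambda_j \lambda_k} = \frac{h^4 g_{jk}}{\sigma_j + \sigma_k - \frac{2}{3} \sigma_j \sigma_k}, \qquad \sigma_j = \sin^2\left(\frac{j\pi h}{2}\right).
\]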

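Defining $\alpha := j\pi h/2$ and $\beta := k\pi h/2$, one has $0 < \alpha, \beta < \pi/2$. Note that

\[
\begin{aligned}
\sigma_j + \sigma_k - \tfrac{2}{3} \sigma_j \sigma_k &> \sigma_j + \sigma_k - \sigma_j \sigma_k \\
&= 2 - \cos^2\alpha - \cos^2\beta - (1 - \cos^2\alpha)(1 - \cos^2\beta) \\
&= 1 - \cos^2\alpha \cos^2\beta \\
&\ge 1 - \cos^2\beta \\
&\ge 0.
\end{aligned}
\]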

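Let $A = T \otimes I + I \otimes T - \frac{1}{6} T \otimes T$ be as in Exercise 9.13.(b) and $\mathbf{s}_i$ as in Section 10.2. Applying the mixed-product rule, one obtains

\[
\begin{aligned}
A(\mathbf{s}_i \otimes \mathbf{s}_j) &= (T \otimes I + I \otimes T)(\mathbf{s}_i \otimes \mathbf{s}_j) - \tfrac{1}{6} (T \otimes T)(\mathbf{s}_i \otimes \mathbf{s}_j) \\
&= (\lambda_i + \lambda_j)(\mathbf{s}_i \otimes \mathbf{s}_j) - \tfrac{1}{6} \lambda_i \lambda_j (\mathbf{s}_i \otimes \mathbf{s}_j) = (\lambda_i + \lambda_j - \tfrac{1}{6} \lambda_i \lambda_j)(\mathbf{s}_i \otimes \mathbf{s}_j).
\end{aligned}
\]

The matrix $A$ therefore has eigenvectors $\mathbf{s}_i \otimes \mathbf{s}_j$, and counting them shows that these must be all of them. As shown above, the corresponding eigenvalues $\lambda_i + \lambda_j - \frac{1}{6} \lambda_i \lambda_j$ are positive, implying that the matrix $A$ is positive definite. It follows that the System (9.17) always has a (unique) solution.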
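The eigenvalue formula can be confirmed numerically for a small grid. The Matlab sketch below is our own check, with an arbitrary test size; it forms $A$ through Kronecker products and compares its spectrum with the predicted values $\lambda_i + \lambda_j - \frac{1}{6}\lambda_i\lambda_j$.

```matlab
m = 5;  h = 1/(m+1);
T = 2*eye(m) - diag(ones(m-1,1),1) - diag(ones(m-1,1),-1);
I = eye(m);
A = kron(T,I) + kron(I,T) - kron(T,T)/6;
lambda = 4*sin((1:m)'*pi*h/2).^2;              % eigenvalues of T
L = lambda*ones(1,m) + ones(m,1)*lambda' - (lambda*lambda')/6;
predicted = sort(L(:));                        % lambda_i + lambda_j - lambda_i*lambda_j/6
computed  = sort(eig((A + A')/2));             % symmetrized so that eig returns real values
err = max(abs(predicted - computed));
minEig = min(computed);                        % positive if A is positive definite
disp([err, minEig])
```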


Exercise 10.10: Algorithm for fast solution of 9 point scheme

The following describes an algorithm for solving System (9.17).

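Algorithm 1 A method for solving the discrete Poisson problem (9.17)

Require: An integer $m$ denoting the grid size, a matrix $\mu F \in \mathbb{R}^{m,m}$ of function values.
Ensure: The solution $V$ to the discrete Poisson problem (9.17).
1: $h \leftarrow \frac{1}{m+1}$
2: $S \leftarrow \big(\sin(jk\pi h)\big)_{j,k=1}^m$
3: $\sigma \leftarrow \big(\sin^2(j\pi h/2)\big)_{j=1}^m$
4: $G \leftarrow S\,\mu F\,S$
5: $X \leftarrow \left(\dfrac{h^4 g_{j,k}}{\sigma_j + \sigma_k - \frac{2}{3} \sigma_j \sigma_k}\right)_{j,k=1}^m$
6: $V \leftarrow SXS$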

For the individual steps in this algorithm, the time complexities are shown in the following table.

step         1        2          3        4          5          6
complexity   $O(1)$   $O(m^2)$   $O(m)$   $O(m^3)$   $O(m^2)$   $O(m^3)$

Hence the overall complexity is determined by the four matrix multiplications and given by $O(m^3)$.
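A direct Matlab transcription of the six steps might look as follows. This is a sketch under stated assumptions: the matrix `muF` is filled with placeholder values, and the residual check against Equation (9.18) at the end is our own addition.

```matlab
m = 5;
muF = ones(m);                               % placeholder for the matrix muF of function values
h = 1/(m+1);                                 % step 1
S = sin(pi*h*(1:m)'*(1:m));                  % step 2
sigma = sin((1:m)'*pi*h/2).^2;               % step 3
G = S*muF*S;                                 % step 4
Sj = sigma*ones(1,m);                        % sigma_j replicated along rows
Sk = ones(m,1)*sigma';                       % sigma_k replicated along columns
X = h^4*G./(Sj + Sk - (2/3)*Sj.*Sk);         % step 5
V = S*X*S;                                   % step 6
% Check against Equation (9.18): T*V + V*T - (1/6)*T*V*T = h^2*muF
T = 2*eye(m) - diag(ones(m-1,1),1) - diag(ones(m-1,1),-1);
res = max(max(abs(T*V + V*T - T*V*T/6 - h^2*muF)));
disp(res)
```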

Exercise 10.11: Fast solution of biharmonic equation

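From Exercise 9.14 we know that $T \in \mathbb{R}^{m \times m}$ is the second derivative matrix. According to Lemma 1.31, the eigenpairs $(\lambda_j, \mathbf{s}_j)$, with $j = 1, \ldots, m$, of $T$ are given by

\[
\begin{aligned}
\mathbf{s}_j &= [\sin(j\pi h), \sin(2j\pi h), \ldots, \sin(mj\pi h)]^T, \\
\lambda_j &= 2 - 2\cos(j\pi h) = 4\sin^2(j\pi h/2),
\end{aligned}
\]

and satisfy $\mathbf{s}_j^T \mathbf{s}_k = \delta_{j,k}/(2h)$ for all $j, k$, where $h := 1/(m+1)$. Using, in order, that $U = SXS$, $TS = SD$, and $S^2 = I/(2h)$, one finds that

\[
\begin{aligned}
& h^4 F = T^2U + 2TUT + UT^2 \\
\iff{} & h^4 F = T^2SXS + 2TSXST + SXST^2 \\
\iff{} & h^4 SFS = ST^2SXS^2 + 2STSXSTS + S^2XST^2S \\
\iff{} & h^4 SFS = S^2D^2XS^2 + 2S^2DXS^2D + S^2XS^2D^2 \\
\iff{} & h^4 SFS = ID^2XI/(4h^2) + 2IDXID/(4h^2) + IXID^2/(4h^2) \\
\iff{} & 4h^6 G = D^2X + 2DXD + XD^2,
\end{aligned}
\]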

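where $G := SFS$. The $(j,k)$-th entry of the latter matrix equation is

\[
4h^6 g_{jk} = \lambda_j^2 x_{jk} + 2\lambda_j x_{jk} \lambda_k + x_{jk} \lambda_k^2 = x_{jk} (\lambda_j + \lambda_k)^2.
\]

Writing $\sigma_j := \sin^2(j\pi h/2) = \lambda_j/4$, one obtains

\[
x_{jk} = \frac{4h^6 g_{jk}}{(\lambda_j + \lambda_k)^2} = \frac{4h^6 g_{jk}}{\big(4\sin^2(j\pi h/2) + 4\sin^2(k\pi h/2)\big)^2} = \frac{h^6 g_{jk}}{4(\sigma_j + \sigma_k)^2}.
\]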
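The smallest case $m = 1$ gives a quick check of this formula by hand. The worked example below is our own and uses only the definitions above; here all matrices are scalars, and $U = SXS$ reduces to $u_{11} = x_{11}$.

```latex
\[
\begin{aligned}
m = 1:\quad & h = \tfrac{1}{2}, \quad S = \sin(\pi h) = 1, \quad \sigma_1 = \sin^2(\pi/4) = \tfrac{1}{2}, \quad g_{11} = f_{11}, \\
\text{formula:}\quad & u_{11} = x_{11} = \frac{h^6 g_{11}}{4(\sigma_1 + \sigma_1)^2} = \frac{f_{11}/64}{4} = \frac{f_{11}}{256}, \\
\text{direct:}\quad & T = 2 \;\Longrightarrow\; (T^2 + 2T \cdot T + T^2)\, u_{11} = 16\, u_{11} = h^4 f_{11} = \frac{f_{11}}{16} \;\Longrightarrow\; u_{11} = \frac{f_{11}}{256}.
\end{aligned}
\]
```

Both routes give the same value, as expected.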


Exercise 10.12: Algorithm for fast solution of biharmonic equation

In order to derive an algorithm that computes $U$ in Problem 9.14, we can adjust Algorithm 10.1 by replacing the computation of the matrix $X$ by the formula from Exercise 10.11. This adjustment does not change the complexity of Algorithm 10.1, which therefore remains $O(\delta n^{3/2})$. The new algorithm can be implemented in Matlab as in Listing 10.1.

function U = simplefastbiharmonic(F)
    % Solves the discrete biharmonic problem with the formula of Exercise 10.11.
    m = length(F);
    h = 1/(m+1);
    hv = pi*h*(1:m)';
    sigma = sin(hv/2).^2;                 % sigma_j = sin^2(j*pi*h/2)
    S = sin(hv*(1:m));                    % sine matrix
    G = S*F*S;
    X = (h^6)*G./(4*(sigma*ones(1,m)+ones(m,1)*sigma').^2);
    U = zeros(m+2,m+2);                   % zero boundary values
    U(2:m+1,2:m+1) = S*X*S;
end

Listing 10.1. A simple fast solution to the biharmonic equation

Exercise 10.13: Check algorithm for fast solution of biharmonic equation

The Matlab function from Listing 10.2 directly solves the standard form $A\mathbf{x} = \mathbf{b}$ of Equation (9.21), making sure to return a matrix of the same dimension as the implementation from Listing 10.1.

function V = standardbiharmonic(F)
    % Solves the discrete biharmonic problem through its standard form A*x = b.
    m = length(F);
    h = 1/(m+1);
    T = gallery('tridiag', m, -1, 2, -1);
    A = kron(T^2, eye(m)) + 2*kron(T,T) + kron(eye(m),T^2);
    b = h.^4*F(:);
    x = A\b;
    V = zeros(m+2, m+2);                  % zero boundary values
    V(2:m+1,2:m+1) = reshape(x,m,m);
end

Listing 10.2. A direct solution to the biharmonic equation

After specifying $m = 4$ by issuing the command F = ones(4,4), the commands simplefastbiharmonic(F) and standardbiharmonic(F) both return the matrix

\[
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.0015 & 0.0024 & 0.0024 & 0.0015 & 0 \\
0 & 0.0024 & 0.0037 & 0.0037 & 0.0024 & 0 \\
0 & 0.0024 & 0.0037 & 0.0037 & 0.0024 & 0 \\
0 & 0.0015 & 0.0024 & 0.0024 & 0.0015 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}.
\]


For large $m$, it is more insightful to plot the data returned by our Matlab functions. For $m = 50$, we solve and plot our system with the commands in Listing 10.3.

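F = ones(50, 50);
U = simplefastbiharmonic(F);
V = standardbiharmonic(F);
figure; surf(U); title('simplefastbiharmonic');   % separate figures, so that the
figure; surf(V); title('standardbiharmonic');     % second plot does not replace the first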

Listing 10.3. Solving the biharmonic equation and plotting the result

[Figure: two surface plots, titled "simplefastbiharmonic" and "standardbiharmonic", of the solutions for $m = 50$. In both panels the horizontal axes run from 0 to 60 and the vertical axis from 0 to $5 \times 10^{-3}$.]

On the face of it, these plots seem to be virtually identical. But exactly how close are they? We investigate this by plotting the difference with the command surf(U-V), which gives

[Figure: surface plot titled "simplefastbiharmonic minus standardbiharmonic". The horizontal axes run from 0 to 60 and the vertical axis from 0 to $2 \times 10^{-14}$.]

We conclude that their maximal difference is of the order of $10^{-14}$, which makes them indeed very similar.
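Instead of reading the size of the difference off a plot, one can compute it as a single number. The script below is a sketch of that idea. To stay runnable on its own, it inlines the steps of Listings 10.1 and 10.2 and builds $T$ by hand rather than through `gallery`; apart from that, the variable names and the relative measure `relDiff` are our own additions.

```matlab
m = 50;  h = 1/(m+1);  F = ones(m);
% Fast solution: same steps as Listing 10.1, inlined
hv = pi*h*(1:m)';
sigma = sin(hv/2).^2;
S = sin(hv*(1:m));
X = h^6*(S*F*S)./(4*(sigma*ones(1,m) + ones(m,1)*sigma').^2);
U = S*X*S;
% Direct solution: same steps as Listing 10.2, with T built by hand
T = 2*eye(m) - diag(ones(m-1,1),1) - diag(ones(m-1,1),-1);
A = kron(T^2, eye(m)) + 2*kron(T,T) + kron(eye(m),T^2);
V = reshape(A\(h^4*F(:)), m, m);
maxDiff = max(abs(U(:) - V(:)));         % largest absolute difference
relDiff = maxDiff/max(abs(V(:)));        % relative to the size of the solution
fprintf('max abs difference: %.2e, relative: %.2e\n', maxDiff, relDiff);
```

The exact value printed depends on the machine and the Matlab version, so only its order of magnitude is meaningful.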
