lecture 8: fast linear solvers (part 3)zxu2/acms60212-40212/lec-09-3.pdfcholesky factorization...
Post on 22-Jan-2021
7 Views
Preview:
TRANSCRIPT
Lecture 8: Fast Linear Solvers (Part 3)
1
Cholesky Factorization
âą Matrix đŽ is symmetric if đŽ = đŽđ.
âą Matrix đŽ is positive definite if for all đ â đ, đđđšđ > 0.
âą A symmetric positive definite matrix đŽ has Cholesky factorization đŽ = đżđżđ, where đż is a lower triangular matrix with positive diagonal entries.
âą Example.
â đŽ = đŽđ with đđđ > 0 and đđđ > |đđđ|đâ đ is positive definite.
2
đŽ =
đ11 đ12 ⊠đ1đđ12 đ22 ⊠đ2đâź
đ1đ
âźđ2đ
â±âŠ
âźđđđ
=
đ11 0 ⊠0đ21 đ22 ⊠0âźđđ1
âźđđ2
â±âŠ
âźđđđ
đ11 đ21 ⊠đđ10 đ22 ⊠đđ2âź0
âź0
â±âŠ
âźđđđ
3
In 2 Ă 2 matrix size case đ11 đ21đ21 đ22
=đ11 0đ21 đ22
đ11 đ210 đ22
đ11 = đ11; đ21 = đ21/đ11; đ22 = đ22 â đ212
4
Submatrix Cholesky Factorization Algorithm
5
for đ = 1 to đ đđđ = đđđ for đ = đ + 1 to đ đđđ = đđđ/đđđ end for đ = đ + 1 to đ // reduce submatrix for đ = đ to đ đđđ = đđđ â đđđđđđ
end end end
Equivalent to đđđđđđ due
to symmetry
6
Remark:
1. This is a variation of Gaussian Elimination algorithm.
2. Storage of matrix đŽ is used to hold matrix đż .
3. Only lower triangle of đŽ is used (See đđđ = đđđ â đđđđđđ).
4. Pivoting is not needed for stability.
5. About đ3/6 multiplications and about đ3/6 additions are required.
Data Access Pattern
Read only
Read and write
Column Cholesky Factorization Algorithm
7
for đ = 1 to đ for đ = 1 to đ â 1 for đ = đ to đ đđđ = đđđ â đđđđđđ
end end đđđ = đđđ
for đ = đ + 1 to đ đđđ = đđđ/đđđ
end end
Data Access Pattern
8
Read only
Read and write
Parallel Algorithm
âą Parallel algorithms are similar to those for LU factorization.
9
References âą X. S. Li and J. W. Demmel, SuperLU_Dist: A scalable
distributed-memory sparse direct solver for unsymmetric linear systems, ACM Trans. Math. Software 29:110-140, 2003
âą P. HĂ©non, P. Ramet, and J. Roman, PaStiX: A high-performance parallel direct solver for sparse symmetric positive definite systems, Parallel Computing 28:301-321, 2002
QR Factorization and HouseHolder Transformation
Theorem Suppose that matrix đŽ is an đ Ă đ matrix with linearly independent columns, then đŽ can be factored as đŽ = đđ
where đ is an đ Ă đ matrix with orthonormal columns and đ is an invertible đ Ă đ upper triangular matrix.
10
QR Factorization by Gram-Schmidt Process
Consider matrix đŽ = [đ1 đ2 ⊠|đđ]
Then,
đ1 = đ1, đ1 =đ1
||đ1||
đ2 = đ2 â (đ2 â đ1)đ1, đ2 =đ2
||đ2||
âŠ
đđ = đđ â đđ â đ1 đ1 ââŻâ đđ â đđâ1 đđâ1, đđ =đđ
||đđ||
đŽ = đ1 đ2 ⊠đđ
= đ1 đ2 ⊠đđ
đ1 â đ1 đ2 â đ1 ⊠đđ â đ1
0 đ2 â đ2 ⊠đđ â đ2
âź âź â± âź0 0 ⊠đđ â đđ
= đđ
11
Classical/Modified Gram-Schmidt Process
for j=1 to n
đđ = đđ
for i = 1 to j-1
đđđ = đđ
âđđ (đ¶đđđ đ đđđđ)
đđđ = đđâđđ (đđđđđđđđ)
đđ = đđ â đđđđđ
endfor
đđđ = ||đđ||2
đđ = đđ/đđđ
endfor 12
Householder Transformation
Let đŁ â đ đ be a nonzero vector, the đ Ă đ matrix
đ» = đŒ â 2đđđ
đđđ
is called a Householder transformation (or reflector).
âą Alternatively, let đ = đ/||đ||, đ» can be rewritten as
đ» = đŒ â 2đđđ.
Theorem. A Householder transformation đ» is symmetric and orthogonal, so đ» = đ»đ = đ»â1.
13
1. Let vector đ be perpendicular to đ.
đŒ â 2đđđ
đđđđ = đ â 2
đ(đđđ)
đđđ= đ
2. Let đ =đ
| đ |. Any vector đ can be written as
đ = đ + đđđ đ. đŒ â 2đđđ đ = đŒ â 2đđđ đ + đđđ đ = đ â đđđ đ
14
âą Householder transformation is used to selectively zero out blocks of entries in vectors or columns of matrices.
âą Householder is stable with respect to round-off error.
15
For a given a vector đ, find a Householder
transformation, đ», such that đ»đ = đ
10âź0
= đđ1
â Previous vector reflection (case 2) implies that vector đ is in parallel with đ â đ»đ.
â Let đ = đ â đ»đ = đ â đđ1, where đ = ±| đ | (when the arithmetic is exact, sign does not matter).
â đ = đ/|| đ||
â In the presence of round-off error, use
đ = đ + đ đđđ đ„1 | đ |đ1
to avoid catastrophic cancellation. Here đ„1 is the first entry in the vector đ.
Thus in practice, đ = âđ đđđ đ„1 | đ | 16
Algorithm for Computing the Householder Transformation
17
đ„đ = max { đ„1 , đ„2 , ⊠, đ„đ } for đ = 1 to đ đŁđ = đ„đ/đ„đ end
đŒ = đ đđđ(đŁ1)[đŁ12 + đŁ2
2 +âŻ+ đŁđ2]1/2
đŁ1 = đŁ1 + đŒ đŒ = âđŒđ„đ đ = đ/| đ |
Remark: 1. The computational complexity is đ(đ). 2. đ»đ is an đ đ operation compared to đ(đ2) for a general matrix-vector product. đ»đ = đ â đđ(đđ»đ).
QR Factorization by HouseHolder Transformation
Key idea:
Apply Householder transformation successively to columns of matrix to zero out sub-diagonal entries of each column.
18
âą Stage 1 â Consider the first column of matrix đŽ and determine a
HouseHolder matrix đ»1 so that đ»1
đ11đ21âź
đđ1
=
đŒ10âź0
â Overwrite đŽ by đŽ1, where
đŽ1 =
đŒ1 đ12â ⯠đ1đ
â
0 đ22â âź đ2đ
â
âź âź âź âź0 đđ2
â ⯠đđđâ
= đ»1đŽ
Denote vector which defines đ»1 vector đ1. đ1 = đ1/| đ1 |
Remark: Column vectors
đ12â
đ22â
âźđđ2â
âŠ
đ1đâ
đ2đâ
âźđđđâ
should be computed by
đ»đ = đ â đđ(đđ»đ). 19
âą Stage 2 â Consider the second column of the updated matrix
đŽ ⥠đŽ1 = đ»1đŽ and take the part below the diagonal to determine a Householder matrix đ»2
â so that
đ»2â
đ22â
đ32â
âźđđ2â
=
đŒ2
0âź0
Remark: size of đ»2â is (đ â 1) Ă (đ â 1).
Denote vector which defines đ»2â vector đ2. đ2 = đ2/| đ2 |
â Inflate đ»2â to đ»2 where
đ»2 =1 00 đ»2
â
Overwrite đŽ by đŽ2 ⥠đ»2đŽ1
20
âą Stage k
â Consider the đđĄâ column of the updated matrix đŽ and take the part below the diagonal to determine a Householder matrix đ»đ
â so that
đ»đâ
đđđâ
đđ+1,đâ
âźđđ,đâ
=
đŒđ
0âź0
Remark: size of đ»đâ is (đ â đ + 1) Ă (đ â đ + 1).
Denote vector which defines đ»đâ vector đđ. đđ = đđ/| đđ |
â Inflate đ»đâ to đ»đ where
đ»đ =đŒđâ1 0
0 đ»đâ , đŒđâ1 is (đ â 1) Ă (đ â 1) identity matrix.
Overwrite đŽ by đŽđ ⥠đ»đđŽđâ1
21
âą After (đ â 1) stages, matrix đŽ is transformed into an upper triangular matrix đ .
âą đ = đŽđâ1 = đ»đâ1đŽđâ2 = ⯠=đ»đâ1đ»đâ2âŠđ»1đŽ
âą Set đđ = đ»đâ1đ»đâ2âŠđ»1. Then đâ1 = đđ. Thus đ = đ»1
đđ»2đ âŠđ»đâ1
đ
âą đŽ = đđ
22
Example. đŽ =1 11 21 3
.
Find đ»1 such that đ»1
111
=đŒ100
.
By đ = âđ đđđ đ„1 | đ |,
đŒ1 = â 3 = â1.721.
đą1 = 0.8881, đą2 = đą3 = 0.3250.
By đ»đ = đ â đđ đđ»đ , đ»1
123
=â3.46410.36611.3661
đ»1đŽ =â1.721 â3.4641
0 0.36610 1.3661
23
Find đ»2â and đ»2, where
đ»2â 0.36611.3661
=đŒ2
0
So đŒ2 = â1.4143, đ2 =0.79220.6096
So đ = đ»2đ»1đŽ =â1.7321 â3.4641
0 â1.41430 0
đ = (đ»2đ»1)đ= đ»1
đđ»2đ
=â0.5774 0.7071 â0.4082â0.5774 0 0.8165â0.5774 â0.7071 â0.4082
24
đŽđ = đ â đđ đ = đ
â đđđđ đ = đđđ â đ đ = đđđ
Remark: Q rarely needs explicit construction
25
Parallel Householder QR
âą Householder QR factorization is similar to Gaussian elimination for LU factorization.
â Additions + multiplications: đ4
3đ3 for QR versus
đ2
3đ3 for LU
âą Forming Householder vector đđis similar to computing multipliers in Gaussian elimination.
âą Subsequent updating of remaining unreduced portion of matrix is similar to Gaussian elimination.
âą Parallel Householder QR is similar to parallel LU. Householder vectors need to broadcast among columns of matrix.
26
References
âą M. Cosnard, J. M. Muller, and Y. Robert. Parallel QR decomposition of a rectangular matrix, Numer. Math. 48:239-249, 1986
âą A. Pothen and P. Raghavan. Distributed orthogonal factorization: Givens and Householder algorithms, SIAM J. Sci. Stat. Comput. 10:1113-1134, 1989
27
top related