class19_pde-i-handout.pdf
TRANSCRIPT
-
7/29/2019 class19_pde-I-handout.pdf
1/78
Partial Differential Equations in HPC, I
M. D. Jones, Ph.D.
Center for Computational ResearchUniversity at Buffalo
State University of New York
High Performance Computing I, 2012
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 1 / 84
-
7/29/2019 class19_pde-I-handout.pdf
2/78
Introduction Types
Types of PDEs
Recapitulate - types of partial differential equations (2nd order, 2D):
Axx + 2Bxy + Cyy + Dx + Ey + F = 0
Three major classes:Hyperbolic (often time dependent, e.g., wave equation)
Elliptic (steady state, e.g., Poissons Equation)
Parabolic (often time dependent, e.g. diffusion equation)
(nonlinear when coefficients depend on )
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 3 / 84
-
7/29/2019 class19_pde-I-handout.pdf
3/78
Introduction Hyperbolic
Hyperbolic Problems
Hyperbolic problems:
det(Z) = A B
B C < 0
Examples: Wave equation
Explicit time steps
Solution at point depends on neighboring points at prior time step
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 4 / 84
-
7/29/2019 class19_pde-I-handout.pdf
4/78
Introduction Elliptic
Elliptic Problems
Elliptic Problems:
Z =
A B
B C
= positive definite
Examples: Laplace equation, Poisson Equation
Steady-state problems
Solution at points depends on all other points
Locality is tougher than hyperbolic problems
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 5 / 84
-
7/29/2019 class19_pde-I-handout.pdf
5/78
Introduction Parabolic
Parabolic Problems
Parabolic Problems:
det(Z) = A B
B C
= 0
Examples: Heat equation, Diffusion equation
Needs an elliptic solve at each time step
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 6 / 84
I d i P b li
-
7/29/2019 class19_pde-I-handout.pdf
6/78
Introduction Parabolic
Elements, Differences, Discretizations ...
There are many forms of discretization that can be used on these
systems:
Finite Elements
Finite Differences
Adaptive combinations of the above
for our purposes, though, we can illustrate many of the basic ideas
using a simple one-dimensional example on a regular mesh.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 7 / 84
B d V l P bl Si l Elli ti E l i 1D
-
7/29/2019 class19_pde-I-handout.pdf
7/78
Boundary Value Problems Simple Elliptic Example in 1D
Simple Elliptic Example in 1D
For starters we consider the following boundary value problem:
d2
dx2= f(x), x [0, 1],
with initial conditions:
d/dx(0) = 0, (1) = 0.
The plus side of this example is that we have an integral general
solution:
(x) =
1
x
dt
t
0
f(s)ds
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 9 / 84
Boundary Value Problems Simple Elliptic Example in 1D
-
7/29/2019 class19_pde-I-handout.pdf
8/78
Boundary Value Problems Simple Elliptic Example in 1D
Mesh Approach
Now we apply a regular mesh, i.e. a uniform one defined by N points
uniformly distributed over the domain:
xi = (i 1)x, x = h= 1/N.
Using a mid-point formula for the derivative:
2hd/dx(x) = (x + h/2) (x h/2),
Applying this central difference formula twice yields
(x h) + 2(x) (x + h) = h2d2/dx2(x),
or in another form
n1 + 2n n+1 = h2
n,
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 10 / 84
Boundary Value Problems Simple Elliptic Example in 1D
-
7/29/2019 class19_pde-I-handout.pdf
9/78
Boundary Value Problems Simple Elliptic Example in 1D
So now our original boundary value problem can be expressed in finite
differences as
n1 + 2n n+1 = h2fn
and this is now easily reduced to a matrix problem, of course. Butbefore we go there, we can generalize the above to use an arbitrary
mesh, rather than uniform ...
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 11 / 84
Boundary Value Problems Simple Elliptic Example in 1D
-
7/29/2019 class19_pde-I-handout.pdf
10/78
Boundary Value Problems Simple Elliptic Example in 1D
Finite Differences on an Arbitrary Mesh
Now consider a (possibly) non-uniform mesh, discretized according to0 = x0 < x1 < ... < xN+1 = 1. The discretized boundary value problemthen becomes
2
xn+1 xn1
n+1 n
xn+1 xn
n n1
xn xn1 = f(xn)
Let hn = xn xn1, and we can scale by (hn+1 hn)/2:
hn+1 hn2
f(xn) = n+1 n
hn+1+
n n1hn
,
=
1
hn+1+
1
hn
n
1
hn+1n+1
1
hnn1. (1)
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 12 / 84
Boundary Value Problems Simple Elliptic Example in 1D
-
7/29/2019 class19_pde-I-handout.pdf
11/78
Boundary Value Problems Simple Elliptic Example in 1D
Eq. 1 is a symmetric tridiagonal matrix, which can be shown to bepositive definite. The i-th row (1 < i < N) has entries:
ai,i1 = 1
hiai,i =
1
hi+
1
hi+1ai,i+1 =
1
hi+1
and the resulting matrix equation looks like:
A = F
and we can subject this matrix equation to our full arsenal of matrix
solution methods.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 13 / 84
Boundary Value Problems Simple Elliptic Example in 1D
-
7/29/2019 class19_pde-I-handout.pdf
12/78
Boundary Value Problems Simple Elliptic Example in 1D
Finite Elements for the Boundary Value Problem
Using a mesh-based discretization is not the only way, of course. Thefinite element method uses piecewise-linear hat functions, ui which
are unity at the i-th node, xi, and zero elsewhere. The mathematical
basis is the variational formulation:
10
(x)u
i(x)dx =1
0f(x)ui(x)dx,
and an approximation scheme for the general integral, e.g. the
trapezoidal rule:
xi+1
xi
f(x)dxx
2(f(xi) + f(xi+1)) .
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 14 / 84
Boundary Value Problems Matrix Factorization
-
7/29/2019 class19_pde-I-handout.pdf
13/78
y
Matrix Factorization
This formulation reduces to the same form as our finite difference one,
and can also be recast for an arbitrary mesh.
Regardless of whether we used finite elements or finite differences,
our simple 1D example reduced to the simple tridiagonal form:
A = F,
where ...
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 15 / 84
Boundary Value Problems Matrix Factorization
-
7/29/2019 class19_pde-I-handout.pdf
14/78
y
the tridiagonal matrix A is given by:
A =
1 1 0 0 . . . 01 2 1 0 . . . 0
0 1 2 1 . . . 0.
..
.
..
.
..
.
..
.
..
.
..0 . . . 1 2 1 00 . . . 0 1 2 10 . . . 0 0 1 2
and F1 = h2f(x1)/2, Fi = h2f(xi).
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 16 / 84
Boundary Value Problems Matrix Factorization
-
7/29/2019 class19_pde-I-handout.pdf
15/78
Direct Solution
The tridiagonal matrix A can be factored very simply into a product of
lower (L) and upper (LT) bidiagonal matrices:
L =
1 0 0 0 . . . 01 1 0 0 . . . 0
0 1 1 0 . . . 0...
......
......
...
0 . . . 1 1 0 00 . . . 0 1 1 0
0 . . . 0 0 1 1
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 17 / 84
Boundary Value Problems Matrix Factorization
-
7/29/2019 class19_pde-I-handout.pdf
16/78
LT =
1 1 0 0 . . . 00 1 1 0 . . . 00 0 1 1 . . . 0...
......
......
...
0 . . . 0 1 1 00 . . . 0 0 1 10 . . . 0 0 0 1
note that LT is just the transpose of L.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 18 / 84
Boundary Value Problems Matrix Factorization
-
7/29/2019 class19_pde-I-handout.pdf
17/78
So this starts to look quite familiar to our old friend, LU decomposition.
We introduce the auxiliary vector Y which solves by forward
substitution:
LY = F,
and then the solution is given by backward solution
LT = Y,
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 19 / 84
Boundary Value Problems Matrix Factorization
-
7/29/2019 class19_pde-I-handout.pdf
18/78
The pseudo-code for this forward and backward solution is given by
the following:
y ( 1 ) = f ( 1 )do i =2,N
y ( i ) = y ( i 1) + f ( i )
end dop h i ( N) = y (N )do i =N1,1,1
p h i ( i ) = p h i ( i 1) + y ( i )end do
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 20 / 84
Boundary Value Problems Parallel Direct Triangular Solver
-
7/29/2019 class19_pde-I-handout.pdf
19/78
Parallel Prefix for Tridiagonal Systems
Let us consider the solution of a general tridiagonal system given by:
A =
a1 b1 0 0 . . . 0c1 a2 b2 0 . . . 0
0 c2 a3 b3 . . . 0...
......
.... . .
...
which we factor:
A = LU
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 21 / 84
Boundary Value Problems Parallel Direct Triangular Solver
-
7/29/2019 class19_pde-I-handout.pdf
20/78
and the factors L and U are lower and upper bidiagonal matrices,
respectively,
L =
1 0 0 0 . . . 0l1 1 0 0 . . . 00 l2 1 0 . . . 0...
..
.
..
.
..
.
. ..
..
.
U =
d1 e1 0 0 . . . 00 d2 e2 0 . . . 00 0 d3 e3 . . . 0...
.
.....
.
.... .
.
..
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 22 / 84
Boundary Value Problems Parallel Direct Triangular Solver
-
7/29/2019 class19_pde-I-handout.pdf
21/78
Matching up terms in Aij = (LU)ij yields the following three equations:
ci1 = li1di1 (i, i 1) (2)
ai = li1ei1 + di (i, i) (3)
bi = ei (i 1, i) (4)
From Eq. 2 we obtain li1 = ci1/di1 and substituting into Eq. 3yields a recurrence relation for di:
di = aici1ei1
di1= ai +
fidi1
,
where fi = ci1ei1.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 23 / 84
Boundary Value Problems Parallel Direct Triangular Solver
-
7/29/2019 class19_pde-I-handout.pdf
22/78
The parallel prefix method is then used for solving the recurrence
relation for the di by defining di = pi/qi, and expressing in matrix form:
piqi
= ai + fiqi1/pi1 =
ai fi1 0
pi1qi1
= Mi
pi1qi1
= MiMi1 . . . M1
p0q0
(5)
where f1 = 0, p0 = 1, and q0 = 0. In other words, we can write therecurrence relation as the direct product of a sequence of 2x2 matrix
multiplications.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 24 / 84
Boundary Value Problems Parallel Direct Triangular Solver
P ll l P fi
-
7/29/2019 class19_pde-I-handout.pdf
23/78
Parallel Prefix
The parallel prefix method is then used (in parallel) to quickly evaluatethe product in Eq. 5 by using a binary tree structure to quickly evaluate
the individual products. The following figure illustrates this process
(albeit for only eight terms): 7M1M 2M 3M 4M 5M 6M M 8
4M3M x 5M 6Mx 7M 8Mx1M 2Mx
1M 2M 3M 4Mxx x 5M 6M 7M 8Mxx x
1M 2M 3M 4Mxx x 5M 6M 7M 8Mxx xx
Note that this procedure is quite generally valid for any associative
operation (i.e. (A B)C = A (BC)), and can be done in onlyO(log2N) steps.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 25 / 84
Boundary Value Problems Parallel Direct Triangular Solver
Parallel Efficiency
-
7/29/2019 class19_pde-I-handout.pdf
24/78
Parallel Efficiency
The time taken for this parallel method is then given by:
(P) 2c1n
P+ 2c2 log2 P,
where c1 is the time taken for 1 Flop, and c2 that for one step of
parallel prefix using 2x2 matrices. The parallel efficiency is then just
1
E(P) 2 +
2Plog2 P
n,
where = c2/c1.
E(P) 1
2 + 2,
with n Plog2 P.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 26 / 84
Iteration, Iteration, Iteration, ... Back to Jacobi
Recapitulate
-
7/29/2019 class19_pde-I-handout.pdf
25/78
Recapitulate
Recall that in solving the matrix equation Ax = b Jacobis iterativemethod consisted of repeatedly updating the values of x based on their
value in the previous iteration:
x(k)i = 1Ai,i
bij=i
Ai,jx(k1)j ,
we will come back to convergence properties of this relation a bit later.
Now, however, we have a matrix A that has been determined by our
choice of a simple stencil for the derivatives in our PDE, which is verysparse, making for a large reduction in the workload.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 28 / 84
Iteration, Iteration, Iteration, ... Back to Jacobi
Iterative Techniques
-
7/29/2019 class19_pde-I-handout.pdf
26/78
Iterative Techniques
Now we consider, for our simple 1D boundary value problem used inthe last section, iterative methods for solution. The simplest scheme is
Jacobi iteration. For the general mesh in 1D,
k+1n =hn
hn+1 + hn
kn+1 +hn+1
hn+1 + hn
kn1 +hnhn+1
2
f(xn),
whereas for our even simpler uniform mesh:
k+11 = k2 + (h
2/2)f(x1),
k+1n =
kn+1 + kn1 + h2f(xn)
/2. n [2, N]
Recall that xN+1 = 1, and (1) = 0.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 29 / 84
Iteration, Iteration, Iteration, ... Back to Jacobi
Parallelization Strategies for Jacobi
-
7/29/2019 class19_pde-I-handout.pdf
27/78
Parallelization Strategies for Jacobi
First, in the classical Jacobi, each iteration involves a matrix
multiplication, plus a vector addition. As long as the solution vectors k
and k+1 are kept distinct, there is a high degree of data parallelism
(i.e. well suited to OpenMP or similar shared memory API).Each iteration takes O(N) work, but typically O(N2 log N) iterations arerequired to obtain the same accuracy as the original discrete equations
(in fact we can solve such a tridiagonal system in O(N) work in theserial case).
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 30 / 84
Iteration, Iteration, Iteration, ... Gauss-Seidel
Gauss Seidel
-
7/29/2019 class19_pde-I-handout.pdf
28/78
Gauss-Seidel
The Gauss-Seidel method is a variation on Jacobi that makes use of
the updated solution values as they become available, i.e.
k+11 = k2 + (h
2/2)f(x1),
k+1n =
kn+1 + k+1n1 + h
2f(xn)
/2. n [2, N]
(where n is increasing as we proceed). Gauss-Seidel has advantages
- it converges faster than Jacobi, but we have introduced some rather
severe data dependencies at the same time.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 31 / 84
Iteration, Iteration, Iteration, ... Gauss-Seidel
Parallel Gauss-Seidel
-
7/29/2019 class19_pde-I-handout.pdf
29/78
Parallel Gauss-Seidel
Gauss-Seidel is made more difficult to parallelize by the datadependencies introduced by the use of the updated solution vectordata on prior grid points. We can break the dependencies byintroducing a tile, N = rP,
k+1
1= k
2+ (h2/2)f(x
1),
k+1ri+1 =1
2
kri+2 +
kri + h
2f(xri+1)
i [1, P 1]
k+1ri+n
=1
2
kri+n+1 +
k+1ri+n1
+ h2f(xri+n)
n [2, r] i [0, P 1]
In this way we get P parallel tasks worth of work. Pseudo-code forMPI-based version of this approach follows ...
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 32 / 84
Iteration, Iteration, Iteration, ... Gauss-Seidel
-
7/29/2019 class19_pde-I-handout.pdf
30/78
do i t e r = 1 , N _ i t e r a t i on si f ( myID == 0 ) p h i ( 0 ) = p h i ( 1 )i f ( myiD == N_procs1) p h i ( 1 +N / N _p ro cs ) = 0 ! u pp er BC
do i =1,N/Pu ( i ) = 0 .5( u ( i + 1 )+ u ( i 1) ) + f ( i ) ! n ot e h=1
end doSENDRECV( u ( 0 ) to / from myID1 )SENDRECV( u(1 +N/P ) to / from myID+1 )
end do
in practice this method is never used, though (well, ok, as a smoother
in multigrid, sometimes).
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 33 / 84
Iteration, Iteration, Iteration, ... Reordering Gauss-Seidel
Reordering
-
7/29/2019 class19_pde-I-handout.pdf
31/78
Reordering
So what is done, in practice, for Gauss-Seidel, is to reorder theequations. For example using odd/even combinations:
k+11 = k2 + (h
2/2)f(x1),
k+1
2n1=
1
2k
2n2+ k
2n+ h2f(x
2n1) n [2, N/2]
k+12n =1
2
k+12n1 +
k+12n+1 + h
2f(x2n1)
n [1, N/2]
and this looks very much (it is) like a pair of Jacobi updates. This is
referred to as coloring the index sets, and the above is an example ofthe red-black coloring scheme (looks like a chess board). Generally
you can use more than two colors, albeit with increasing complexity.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 34 / 84
Iteration, Iteration, Iteration, ... Reordering Gauss-Seidel
Red-Black Ordering in Gauss-Seidel
-
7/29/2019 class19_pde-I-handout.pdf
32/78
Red Black Ordering in Gauss Seidel
The red-black scheme is illustrated in the following examples (this is in2D, where it looks a bit more pretty). Gray squares have been updated.
Left figure is simple Gauss-Seidel, while right is a classic red-black.
Note the 5-point stencil illustrating the data dependence at a particular
grid point.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 35 / 84
Iteration, Iteration, Iteration, ... Reordering Gauss-Seidel
Parallel Example
-
7/29/2019 class19_pde-I-handout.pdf
33/78
Parallel Example
Updating our example, a two-color scheme is used, but in a slightly
unorthodox way, in which the values phi(k) are one color, and
remaining values phi(1) ...phi(k-1) the other color. This is doneto facilitate the use of message-passing parallelism, and to better
overlap computation with communication.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 36 / 84
Iteration, Iteration, Iteration, ... Reordering Gauss-Seidel
-
7/29/2019 class19_pde-I-handout.pdf
34/78
k = N /P
t a g _ l ef t = 1i f (myID < Nprocs1) SEND ( p h i ( k ) t o myID +1 , t a g _ l e f t )
do i t e r =1 , N i t e r si f (myID > 0 ) RECV( p h i (0 ) f ro m myID1, t a g _ l ef t )i f ( m yID == 0 ) p h i ( 0 ) = p h i ( 1 )p h i ( 1 ) = 0 . 5 ( p hi ( 0 ) + p hi ( 2 ) ) + f ( 1 )t a g _ ri g h t = 2 i t e ri f (myID > 0 ) SEND( p h i (1 ) t o myID1, t a g _ ri g h t )
do i = 2 , k
1p h i ( i ) = 0 . 5( p h i ( i 1) + p h i ( i + 1) ) + f ( i )end doi f (myID < Nprocs1) RECV ( p h i ( k + 1 ) f r om myID + 1 , t a g _ r i g h t )i f (myID == Nprocs1) p h i ( k + 1) = 0p hi ( k ) = 0 .5 ( p h i ( k + 1 ) + p h i ( k1) ) + f ( k )t a g _ l e f t = 2 i t e r + 1i f (myID < Nprocs1) SEND( p h i ( k ) t o myID +1 , t a g _ l e f t )
end do
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 37 / 84
Multigrid Methods Multigrid Basics
Multigrid Basics
-
7/29/2019 class19_pde-I-handout.pdf
35/78
u t g d as cs
Multigrid is easy to conceptualize, but considerably more tedious in its
details. In a nutshell:
Jacobi or Gauss-Seidel forms the basic building block
Iterations are halted while problem is transferred to a coarser
mesh
Recursively applied
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 39 / 84
Multigrid Methods Multigrid Basics
Multigrid, in a Nutshell
-
7/29/2019 class19_pde-I-handout.pdf
36/78
g ,
For simplicity, consider once again, our linear elliptic problem,
L = f
and if we discretize using a uniform length, h,
Lhh = fh
To this point we have addressed several methods of solving for the
exact solution to the discretized equations (directly solving the matrix
equations as well as relaxation methods like Jacobi/Gauss-Seidel).
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 40 / 84
Multigrid Methods Multigrid Basics
-
7/29/2019 class19_pde-I-handout.pdf
37/78
Now, let h be an approximate solution to the discretized equation, andvh = h h. Then we can define a defect or residual as
dh = Lhh fh,
using which our error can be written as
Lhvh = Lhh Lhh = dh
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 41 / 84
Multigrid Methods Multigrid Basics
-
7/29/2019 class19_pde-I-handout.pdf
38/78
We have been finding vh by approximating Lh, Lh:
Jacobi: Lh = diagonal part of LhGauss-Seidel: Lh = lower triangular part of Lh:
i =
N
j=i=1
Lijj fi
/Lii
In multigrid, what we are instead going to do is coarsen,
Lh = LH
(and typically H = 2h).
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 42 / 84
Multigrid Methods Multigrid Basics
-
7/29/2019 class19_pde-I-handout.pdf
39/78
Using this two-grid approximation,
LHvH = dH
and we need to handle both the fine->coarse (sometimes called
restriction or injection) transition:
dH = Rdh,
and the coarse->fine (prolongationor interpolation)
vh = PvH
and our new solution is just newh = h + vh.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 43 / 84
Multigrid Methods Multigrid Basics
-
7/29/2019 class19_pde-I-handout.pdf
40/78
In summary, our two-grid cycle looks like:
Compute defect on fine grid: dh = Lhh fh
Restrict defect: dH = Rdh
Solve on coarse grid for correction: LHvH = dHInterpolate correction to fine grid: vh = PvH
Compute next approximation, newh = h + vh
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 44 / 84
Multigrid Methods Multigrid Basics
-
7/29/2019 class19_pde-I-handout.pdf
41/78
The next level of refinement is that we smooth out higher-frequency
errors, for which relaxation schemes (Jacobi/Gauss-Seidel) are well
suited. In the context of our two-grid scheme, we pre-smooth on the
initial fine grid, and post-smooth the corrected solution.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 45 / 84
Multigrid Methods Multigrid Basics
-
7/29/2019 class19_pde-I-handout.pdf
42/78
Putting the pieces together, the next idea is to develop a full multigrid
by applying the scheme recursively down to some trivially simple
(either by direct or relaxation methods) coarse grid. This is known as a
cycle, but let us first develop a bit more mathematical formalism.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 46 / 84
Multigrid Methods Multigrid Basics
Multigrid Notation
-
7/29/2019 class19_pde-I-handout.pdf
43/78
Some definitions are in order:Let K > 1 define the multigrid levels
Level-zero Grid is our starting coarse grid:
xi = (i 1)x, x = 1/N, 1 i N + 1
Next level, the level-one grid, comes from subdividing level-zero:
xN+2i = xi i [1, N + 1]
xN+2i+1 =1
2 (xi + xi+1) i [1, N]
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 47 / 84
Multigrid Methods Multigrid Basics
Multigrid Notation (contd)
-
7/29/2019 class19_pde-I-handout.pdf
44/78
Level-k grid (k is the number of points in level k, 0 = N + 1, andNk is the number of points in all previous levels N0 0,N1 = N + 1):
xNk+2i1 = xNk1+i i [1, k1]
xNk+2i =12
(xNk1+i + xNk1+i+1) i [1, k1 1]
and induction leads to
k = 2k1 1Nk+1 = Nk + k
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 48 / 84
Multigrid Methods Multigrid Basics
-
7/29/2019 class19_pde-I-handout.pdf
45/78
As an example, consider the following figure:
5=33
4=17
3=9
2=5
1=3
0=2
which shows 5 levels of a one-dimensional multigrid.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 49 / 84
Multigrid Methods Multigrid Basics
Multigrid Solution
-
7/29/2019 class19_pde-I-handout.pdf
46/78
Let Nk1+1, ..., Nk denote the solution values at level k 1. Thesevalues define a piecewise linear function on the mesh for the k 1-thlevel,
k1(x) =
ki=1
Nk1+iuk1i (x),
where the uk1i (x) basis functions are defined as in the finite elementcase - unity only for the i-th node at level k 1.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 50 / 84
Multigrid Methods Multigrid Basics
-
7/29/2019 class19_pde-I-handout.pdf
47/78
Since the grids are nested, k1(x) is also piecewise linear at level k,for which we can interpolate:
Nk+2i1 = Nk1+i i [1, k1]
Nk+2i =1
2
Nk1+i + Nk1+i+1
i [1, k1 1]
If UNk = 0, then UNk+1 = 0.
It is also necessary to coarsen, or transfer a solution to a coarser grid.
Inverting the coarse-fine refinement is a reasonable choice:
Nk1+i =1
2 Nk+2i1 +1
4
nk+2i2 + Nk+2i
i [1, k1]
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 51 / 84
Multigrid Methods V-Cycle
Applying Jacobi/Gauss-Seidel
-
7/29/2019 class19_pde-I-handout.pdf
48/78
The Jacobi or Gauss-Seidel method can be applied on any level ofmultigrid. Previously we had (for Gauss-Seidel):
1 2 +1
2h2f(x1)
n1
2
n+1 + n1 + h
2fn
n [2, N]
and now for multigrid:
Nk+1 Nk+2 + FNk+1
Nk+n1
2
Nk+n+1 +Nk+n1 + FNk+n
n [1, k]
where FNk+1 = h2kf(xNk+1)/2, FNk+i = h
2kf(xNk+i).
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 52 / 84
Multigrid Methods V-Cycle
Applying Jacobi/Gauss-Seidel (contd)
-
7/29/2019 class19_pde-I-handout.pdf
49/78
The Gauss-Seidel/Jacobi iterations are not used for the sake of
convergence, but rather for smoothing the solution, in the sense of
smoothing out error between the approximate and full solution.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 53 / 84
Multigrid Methods V-Cycle
-
7/29/2019 class19_pde-I-handout.pdf
50/78
For the k-th level iteration we can write the solution using the
shorthand notation (note this is the converged solution):
Akk = Fk,
after doing, say, mk smoothing steps we write the residual error as
Rk
= Fk
Ak
k
,
where
RNk+1 FNk+1 Nk+1 + Nk+2
RNk+n FNk+n 2Nk+n + Nk+n+1 + Nk+n1 n [2, k]
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 54 / 84
Multigrid Methods V-Cycle
-
7/29/2019 class19_pde-I-handout.pdf
51/78
The error E = , where is the exact solution of A = F,satisfies the residual equation,
AkEk = Rk.
The important result is
k = k + Ek.
The recursive part comes in solving the residual equation, which is
done on a coarser mesh,
RNk1+i =1
2
RNk+2i1 +1
4RNk+2i2 + RNk+2i i [1, k1]
and so on back to level-zero.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 55 / 84
Multigrid Methods V-Cycle
-
7/29/2019 class19_pde-I-handout.pdf
52/78
Now, see why it is called the V-cycle? You go back up the grids
(fine-to-coarse) then back again (the residuals are added back to
using fine-to-coarse steps), resulting in a classic V if you plot thelevels.
1
2
3
4
5
execution time
level
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 56 / 84
Multigrid Methods Full Multigrid
Full Multigrid
-
7/29/2019 class19_pde-I-handout.pdf
53/78
Repeating the V-cycle on a given level will converge to O(x2) inO(| log x|) iterations
Now that we have all of the pieces, we can outline the full multigrid
V-cycle
Solve Akk = Fk at level 0, 0 = (1, ..., N+1)
1 initially approximated by coarse-to-fine transfer of 0
Level-one proceeds (smoothing, solve level-zero residual
equation, coarse-to-fine interpolation again, more smoothing) r
times
Level-two proceeds as in level-one.
By doing r repetitions at each level, we get the Full Multigrid V-cycle.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 57 / 84
Multigrid Methods Full Multigrid
Full Multigrid (contd)
-
7/29/2019 class19_pde-I-handout.pdf
54/78
A picture is worth a few hundred words, at least:
E
S
E E
S S
S
S
S
S
E
S
S
S
S
S
E
S
S
S
S
S
S
E
S
S
S
E
execution time
level
where S denotes smoothing, and E the exact solution on the coarsest
grid (illustrated is 4-level, 2-cycle, i.e. K=4, r=2).
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 58 / 84
Multigrid Methods Full Multigrid
Full Multigrid (contd)
-
7/29/2019 class19_pde-I-handout.pdf
55/78
Without proof, it can be shown that full multigrid converges to O(x2)in O(1) iterations. For large enough r, each k-th level iteration willsolve to an accuracy O(x2k). In practice, r = 10 is typically sufficient.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 59 / 84
Multigrid Methods Parallel Multigrid
Multigrid in Parallel
-
7/29/2019 class19_pde-I-handout.pdf
56/78
The good news is that multigrid easily parallelizes in basically the
same way as Jacobi/Gauss-Seidel. We have basically two parts:
1 Smoothing, which is indeed Jacobi/Gauss-Seidel, and can be
parallelized in exactly the same fashion, with little modification
2 Inter-grid transfers (refinement & un-refinement is one way of
looking at these interpolations) - these are relatively trivial to deal
with, as only elements at the ends of the arrays need to be
transferred
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 60 / 84
Multigrid Methods Parallel Multigrid
Multigrid Scaling
-
7/29/2019 class19_pde-I-handout.pdf
57/78
Now lets examine the scaling behavior of multigrid.
Basic V-cycle, we estimate the time as
V K1k=0
c1N2k
P,
where c1 is the time spent doing basic floating-point operations
during smoothing.
Communication is involved just at exchanging data at the ends of
each interval at each step in the V-cycle,
comm latK,
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 61 / 84
Multigrid Methods Parallel Multigrid
Multigrid Scaling (contd)
-
7/29/2019 class19_pde-I-handout.pdf
58/78
Solving on the coarse grid at level K, for which there are N2K
unknowns. We need to collect the right-hand sides at each
processor, proportional to N2K
. Calling the constant ofproportionality c2,
K c2N2K
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 62 / 84
Multigrid Methods Parallel Multigrid
Multigrid Scaling, contd
-
7/29/2019 class19_pde-I-handout.pdf
59/78
Putting the pieces together, the overall time is
(P) c1N2P
+ latK + c2N2K
If we equate the first and last terms, P = 2K+1/, where = c2/c1. Inthat case,
(P) 4c1NP
+ lat log2 (P/2) ,
and the efficiency
E1(P) 4 +Plog2 (P/2)
c1N
E(P) 1
4
Plog2 (P/2)
c1N,
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 63 / 84
Multigrid Methods Parallel Multigrid
-
7/29/2019 class19_pde-I-handout.pdf
60/78
which implies (finally!) that the full multigrid is scalable if
Plog2 P N, and
E(P)
1
4 + /c1
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 64 / 84
Multigrid Methods Parallel Multigrid
MG Illustrated
-
7/29/2019 class19_pde-I-handout.pdf
61/78
Flow through square duct with 90
bend, LR = Line Relaxation (Gauss-Seidel), PR = PointRelaxation (Gauss), DADI = (diagonalized) Alternating Direction Implicit, (S)MG =
(Single)Multi-Grid. Work units normalzied to DADI-SG iterations.
L. Yuan, "Comparison of Implicit Multigrid Schemes for Three-Dimensional Incompressible Flows," J. Comp. Phys. 177, 134-155
(2002).
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 65 / 84
Multidimensional Problems Moving to Multiple Dimensions
Moving to Multiple Dimensions
-
7/29/2019 class19_pde-I-handout.pdf
62/78
Yes, in fact there are problems requiring more than one dimension ...
However, what we have done thus far in treating a one dimensional
elliptic problem with mesh methods is easily transferable to 2D and3D.
Lets look at our elliptic problem in 2D ...
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 67 / 84
Multidimensional Problems 2D Example
2D Example
-
7/29/2019 class19_pde-I-handout.pdf
63/78
We start with
2
x2
2
y2= f(x, y), x, y [0, 1]
and for simplicity we take strictly Dirichlet boundary conditions, = 0on the boundary. Again define a regular mesh, xi = (i 1)x,yj = (j 1)y. Again for simplicity we will take N mesh points in eachdirection. Using finite differences or finite elements (and we could
easily extend to an arbitrary mesh) results in a simple system of linear
equations ...
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 68 / 84
Multidimensional Problems 2D Example
-
7/29/2019 class19_pde-I-handout.pdf
64/78
For a regular tensor product (tensor product of one-dimensional
bases) mesh,
A =
1
h2
BN IN 0 0 . . . 0IN BN IN 0 . . . 0
0 IN BN IN . . . 0...
.. .
.. .
.. .
.. .
.
..0 . . . IN BN IN 00 . . . 0 IN BN IN0 . . . 0 0 IN BN
where IN is just the NN identity matrix, and ...
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 69 / 84
Multidimensional Problems 2D Example
-
7/29/2019 class19_pde-I-handout.pdf
65/78
BN =
4 1 0 0 . . . 01 4 1 0 . . . 0
0 1 4 1 . . . 0...
. . .. . .
. . .. . .
...
0 . . . 1 4 1 00 . . . 0 1 4 10 . . . 0 0 1 4
and note that the numerical entries in BN are for 2D, using two-point
central differences.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 70 / 84
-
7/29/2019 class19_pde-I-handout.pdf
66/78
Initial Value Problems Basics
Forward to the Future
-
7/29/2019 class19_pde-I-handout.pdf
67/78
Initial value problems can be a bit more difficult to handle with scalableparallel algorithms. In a similar vein to our simple example to illustrate
boundary value problems, we can illustrate initial value problems using
a simple physical problem - the massless pendulum, which obeys:
d2u
dt2= f(u),
with initial conditions u(0) = a0, u
(0) = a1. In this context u is theangle that the pendulum makes with respect to the force of gravity, and
f(u) = mgsin(u).
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 73 / 84
Initial Value Problems Basics
Finite Differences Redux
-
7/29/2019 class19_pde-I-handout.pdf
68/78
Applying our usual central difference scheme using a two-point formula
yields:
un+1 = 2un un1 f(un),
where = t2
. we can make this a linear problem by taking the smallangle approximation
sin(u) u+ . . . u
-
7/29/2019 class19_pde-I-handout.pdf
69/78
In the linear problem (small oscillations in the case of our pendulum):
d2u
dt2= mgu
and the difference equations become a simple recursion relation:
un+1 = (2 mg)un un1
with starting values
u0 = a,
u1 = a0 a1t,
which will allow for the solution for all n 0.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 75 / 84
Initial Value Problems Basics
Matrix/Linear Form
-
7/29/2019 class19_pde-I-handout.pdf
70/78
We can write the linear form using a very simple matrix-vectorequation:
LU = b
or
1 0 0 0 0 . . .T 1 0 0 0 . . .1 T 1 0 0 . . .0 1 T 1 0 . . .0 0 1 T 1 . . ....
.
.
....
.
.
....
. . .
u1u2u3u4u5...
=
(1 mg)a0 + a1ta0
000
.
.
.
where T = (mg 2). For this set of linear equations we can bringour usual bag of tricks in terms of scalable parallel algorithms.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 76 / 84
Initial Value Problems Basics
Nonlinearity
-
7/29/2019 class19_pde-I-handout.pdf
71/78
If we do not make the linear approximation in our simple example, we
do not have such an easy (or at least well traveled) path to follow.
Using a similar matrix-vector notation,
F(U) = L0U + mg(U) b = 0 (6)
where L0 is of similar form as before, but with the terms removed (i.e.diagonal elements are 2). The (U) term can also be written as amatrix,
(U)ij = i1,j sin(ui1),
where ij is the Kronecker delta-function, and this term contains all ofour nonlinearities.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 77 / 84
Initial Value Problems Newton-Raphson Method
Newton-Raphson Method
-
7/29/2019 class19_pde-I-handout.pdf
72/78
The Newton-Raphson method is one of the most powerful in terms of
solving nonlinear problems. Basically we want to solve Eq. 6,
F(U) = 0
using a root-finding method, i.e. when F(U) crosses the U axis. So for
a given guess Un we drop a tangent line to F and find where that linecrosses the axis:
0 F(Un)
Un+1 Un= F
(Un),
or
Un+1 = Un F(Un
)F
(Un).
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 78 / 84
Initial Value Problems Newton-Raphson Method
Convergence of Newton-Raphson
-
7/29/2019 class19_pde-I-handout.pdf
73/78
Recall that Newton-Raphson has some very nice convergence
properties, provided that you start close to a solution:
|Un+1 Un| C|Un Un1|2
namely the number of correct figures doubles with each iteration.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 79 / 84
Initial Value Problems Newton-Raphson Method
Multidimensional Newton-Raphson
Extending the previous relations to multiple dimensions is
-
7/29/2019 class19_pde-I-handout.pdf
74/78
Extending the previous relations to multiple dimensions is
straightforward: we replace F
(Un) by the Jacobian matrix of all partial
derivatives of F with respect to vector U:
JF(U)ij =Fi(U)
uj,
and then solve:
JF(Un)V = F(Un),
Un+1 = Un + V.
and repeat until convergence. In our simple pendulum case, we have
J(U)ij = i1,j cos(ui1).
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 80 / 84
Initial Value Problems Starting Newton-Raphson
Starting Newton-Raphson
So if Newton Raphson converges quickly given a good starting
-
7/29/2019 class19_pde-I-handout.pdf
75/78
So if Newton-Raphson converges quickly given a good starting
approximation, how do we ensure that we have a good starting point?
Generally there is not a good answer to that question, but oneapproach is to use a simplified approximation, f to the nonlinear
function f, and then solve
d2u
dt2 = f(u),
along with the original initial conditions. In our pendulum example, we
could have used f(u) = u instead of f(u) = sin(u). This starting pointgives us a residual,
R(u) =d2u
dt2 f(u) = f(u) f(u).
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 81 / 84
Initial Value Problems Starting Newton-Raphson
and if we view v as a correction to u we can write:
-
7/29/2019 class19_pde-I-handout.pdf
76/78
and if we view v as a correction to u, we can write:
d2
vdt2
= vf(u) R(u)
using a simple Taylor expansion of f(u+ v) f(u) + vf
(u) + O(v2).Or in other terms:
d2v
dt2= vf
(u)
f(u) f(u)
.
The size of v is zero at the start (thanks to the initial conditions), so its
size can be tracked as the calculation of u proceeds, and provides a
way of monitoring the calculation error.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 82 / 84
Initial Value Problems Other Methods
Other Methods
-
7/29/2019 class19_pde-I-handout.pdf
77/78
It is worth mentioning other techniques for initial value problems as
well:
Multigrid in the time variable - start coarse, and them improve.
See, for example:
L. Baffico et. al., Parallel-in-time Molecular-Dynamics
Simulations, Phys. Rev. E 66, 057701 (2002)Wave-form relaxation for a system of ODEs, using domain
decomposition rather than parallelize over the time domain. See,
for example:
J. A. McCammon et. al., Ordinary Differential Equations of
Molecular Dynamics, Comput. Math. Appl. 28, 319 (1994).
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 83 / 84
Initial Value Problems Other Methods
And Then?
-
7/29/2019 class19_pde-I-handout.pdf
78/78
Of course, this has been a mere sampling of parallel methods for
partial differential equations. Hopefully it has given you a feel for at
least the general principles involved.
In the next discussion of PDEs, we will focus more on available HPC
solution libraries and specialized methods.
M. D. Jones, Ph.D. (CCR/UB) Partial Differential Equations in HPC, I HPC-I Fall 2012 84 / 84