conjugate gradient method
CONJUGATE GRADIENT METHOD
Monica Garika, Chandana Guduru
METHODS TO SOLVE LINEAR SYSTEMS
Direct methods:
- Gaussian elimination
- LU factorization
- Simplex method of linear programming
Iterative methods:
- Jacobi method
- Gauss-Seidel method
- Multi-grid method
- Conjugate gradient method
Conjugate Gradient method
The CG method is an algorithm for the numerical solution of a particular kind of system of linear equations
Ax = b, where A is symmetric, i.e., A = A^T, and positive
definite, i.e., x^T A x > 0 for all nonzero vectors x.
If A is symmetric and positive definite, then solving Ax = b is equivalent to minimizing the quadratic function
Q(x) = ½ x^T A x − x^T b + c
Conjugate gradient method
The conjugate gradient method builds up a solution x* ∈ R^n in at most n steps in the absence of round-off errors.
Considering round-off errors, more than n steps may be needed to get a good approximation of the exact solution x*.
For sparse matrices, a good approximation of the exact solution can be achieved in fewer than n steps, even with round-off errors.
Practical Example
In oil reservoir simulation, the number of linear equations corresponds to the number of grid cells of the reservoir.
The unknown vector x is the oil pressure of the reservoir: each element of x is the oil pressure of a specific grid cell.
Linear System
Ax = b, where A is a square matrix, x is the unknown vector (what we want to find), and b is the known vector.
Positive Definite Matrix
A is positive definite when x^T A x = [ x1 x2 … xn ] A x > 0 for every nonzero vector x.
Procedure
- Find an initial guess x0 for the solution
- Generate successive approximations to the solution
- Generate residuals
- Generate search directions

x0 = 0, r0 = b, p0 = r0
for k = 1, 2, 3, ...
    αk = (r_{k-1}^T r_{k-1}) / (p_{k-1}^T A p_{k-1})   step length
    xk = x_{k-1} + αk p_{k-1}                          approximate solution
    rk = r_{k-1} − αk A p_{k-1}                        residual
    βk = (r_k^T r_k) / (r_{k-1}^T r_{k-1})             improvement
    pk = rk + βk p_{k-1}                               search direction
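The iteration above translates almost line for line into code. A minimal pure-Python sketch (the function and variable names are my own, not from the slides):

```python
# Sketch of the CG iteration: x0 = 0, r0 = b, p0 = r0, then repeat
# the alpha/x/r/beta/p updates until the residual is small enough.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    n = len(b)
    max_iter = max_iter or n
    x = [0.0] * n          # x0 = 0
    r = b[:]               # r0 = b (since x0 = 0)
    p = r[:]               # p0 = r0
    rs_old = dot(r, r)
    for _ in range(max_iter):
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                        # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]      # approximate solution
        r = [ri - alpha * api for ri, api in zip(r, Ap)]   # residual
        rs_new = dot(r, r)
        if rs_new < tol:                                   # converged
            break
        beta = rs_new / rs_old                             # improvement
        p = [ri + beta * pi for ri, pi in zip(r, p)]       # search direction
        rs_old = rs_new
    return x

# Small SPD test system: exact solution is x = [1/11, 7/11].
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b)
```

For this 2×2 symmetric positive definite system the method reaches the exact solution in at most n = 2 iterations, as the slides state.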
Conjugate gradient iteration
An iteration of the conjugate gradient method has the form
x(t) = x(t-1) + s(t) d(t)
where x(t) is the new value of the vector x, s(t) is a scalar step size, and d(t) is the direction vector. Before the first iteration, the values of x(0), d(0), and g(0) must be set.

Steps of the conjugate gradient method
Every iteration t calculates x(t) in four steps:
Step 1: Compute the gradient            g(t) = A x(t-1) − b
Step 2: Compute the direction vector    d(t) = −g(t) + [g(t)^T g(t) / g(t-1)^T g(t-1)] d(t-1)
Step 3: Compute the step size           s(t) = −[d(t)^T g(t)] / [d(t)^T A d(t)]
Step 4: Compute the new approximation   x(t) = x(t-1) + s(t) d(t)
Sequential Algorithm
1) x0 := 0
2) r0 := b − A·x0
3) p0 := r0
4) k := 0
5) kmax := maximum number of iterations to be done
6) if k < kmax then perform 8 to 16
7) if k = kmax then exit
8) calculate v := A·pk
9) αk := (rk^T rk) / (pk^T v)
10) x_{k+1} := xk + αk·pk
11) r_{k+1} := rk − αk·v
12) if r_{k+1} is sufficiently small then go to 16
13) βk := (r_{k+1}^T r_{k+1}) / (rk^T rk)
14) p_{k+1} := r_{k+1} + βk·pk
15) k := k + 1, go to 6
16) result := x_{k+1}
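The sequential algorithm's stopping test in step 12 ("r_{k+1} is sufficiently small") can be made concrete by tracking the residual norm at every iteration. A pure-Python sketch (names such as cg_with_history are illustrative, not from the slides):

```python
# Sequential CG with an explicit residual-norm stopping test: iterate
# at most kmax times, stopping early once ||r|| drops below tol.
import math

def cg_with_history(A, b, tol=1e-10, kmax=None):
    n = len(b)
    kmax = kmax or n
    x = [0.0] * n                      # 1) x0 := 0
    r = b[:]                           # 2) r0 := b - A x0 (x0 = 0)
    p = r[:]                           # 3) p0 := r0
    history = [math.sqrt(sum(ri * ri for ri in r))]
    for _ in range(kmax):              # 5)-7) at most kmax iterations
        v = [sum(a * pi for a, pi in zip(row, p)) for row in A]   # 8) v = A pk
        rr = sum(ri * ri for ri in r)
        alpha = rr / sum(pi * vi for pi, vi in zip(p, v))         # 9) step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]             # 10) update x
        r = [ri - alpha * vi for ri, vi in zip(r, v)]             # 11) update r
        norm = math.sqrt(sum(ri * ri for ri in r))
        history.append(norm)
        if norm < tol:                                            # 12) stop test
            break
        beta = sum(ri * ri for ri in r) / rr                      # 13) improvement
        p = [ri + beta * pi for ri, pi in zip(r, p)]              # 14) new direction
    return x, history

# 3x3 SPD system constructed so the exact solution is [1, 2, 3].
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [6.0, 10.0, 8.0]
x, history = cg_with_history(A, b)
```

In exact arithmetic the loop needs at most n = 3 iterations for this system, so the recorded history contains at most 4 residual norms.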
Complexity analysis
- Identify data dependencies
- Identify eventual communications
The method requires a large number of operations; as the number of equations increases, the complexity also increases.
Why parallelize?
Parallelizing the conjugate gradient method is a way to increase its performance.
It saves memory, because each processor stores only the portions of the rows of the matrix A that contain non-zero elements.
It executes faster, because the matrix is divided into portions processed in parallel.
How to parallelize?
For example, choose a row-wise block-striped decomposition of A and replicate all vectors.
The multiplication of A by a vector may then be performed without any communication,
but an all-gather communication is needed to replicate the result vector.
The overall time complexity of the parallel algorithm is
Θ(n^2·w / p + n·log p)
Row wise Block Striped Decomposition of a Symmetrically Banded Matrix
Dependency Graph in CG
Algorithm of a Parallel CG on each Computing Worker (cw)
1) Receive cw_r0, cw_A, cw_x0
2) cw_p0 := cw_r0
3) k := 0
4) kmax := maximum number of iterations to be done
5) if k < kmax then perform 7 to 22
6) if k = kmax then exit
7) v := cw_A · cw_pk
8) compute the local numerator of αk (cw_Nαk)
9) compute the local denominator of αk (cw_Dαk)
10) Send Nαk, Dαk
11) Receive αk
12) x_{k+1} := xk + αk·pk
13) compute the partial result of r_{k+1}: r_{k+1} := rk − αk·v
14) Send x_{k+1}, r_{k+1}
15) Receive Signal
16) if Signal = SolutionReached then go to 23
17) compute the local numerator of βk (cw_Nβk)
18) compute the local denominator of βk (cw_Dβk)
19) Send cw_Nβk, cw_Dβk
20) Receive βk
21) cw_p_{k+1} := cw_r_{k+1} + βk · cw_pk
22) k := k + 1
23) Result reached
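The two communication patterns in the worker algorithm, the row-block matrix-vector product (rebuilt by an all-gather) and the partial numerators/denominators of αk and βk (combined by a global reduction), can be simulated sequentially. A pure-Python sketch, with workers represented by contiguous row-block slices (all names are illustrative, not from the slides):

```python
# Row-wise block-striped decomposition: each "worker" owns a block of
# rows of A and a replicated copy of the vectors. The all-gather is
# simulated by concatenation, the reduction by summing partial sums.

def block_ranges(n_rows, n_workers):
    """Split n_rows into n_workers contiguous row blocks."""
    base, extra = divmod(n_rows, n_workers)
    start = 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        yield (start, start + size)
        start += size

def parallel_matvec(A, x, n_workers=2):
    """Each worker multiplies its own rows of A by the replicated x;
    concatenating the partial results simulates the all-gather."""
    partial = [
        [sum(a * xi for a, xi in zip(row, x)) for row in A[lo:hi]]
        for lo, hi in block_ranges(len(A), n_workers)
    ]
    return [v for block in partial for v in block]

def parallel_dot(u, v, n_workers=2):
    """Each worker computes a partial inner product over its block
    (like cw_N-alpha / cw_D-alpha); summing simulates the reduction."""
    return sum(
        sum(a * b for a, b in zip(u[lo:hi], v[lo:hi]))
        for lo, hi in block_ranges(len(u), n_workers)
    )

A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
x = [1.0, 2.0, 3.0]
y = parallel_matvec(A, x, n_workers=2)   # same result as a serial A·x
```

The point of the sketch is that the matvec itself needs no communication (every worker already has the full vector), while rebuilding y and combining the partial dot products are exactly the all-gather and reduction steps named above.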
Speedup of Parallel CG on Grid Versus Sequential CG on Intel
Communication and Waiting Time of the Parallel CG on Grid
We consider the difference between f at the solution x and any other vector p:

f(p) = ½ p^T A p − b^T p + c
f(x) = ½ x^T A x − b^T x + c

f(p) − f(x) = (½ p^T A p − b^T p) − (½ x^T A x − b^T x)

(replace b = Ax)
            = ½ p^T A p − x^T A p − ½ x^T A x + x^T A x
            = ½ p^T A p − x^T A p + ½ x^T A x
(simplify)
            = ½ (p − x)^T A (p − x)

(positive-definiteness)
½ (p − x)^T A (p − x) > 0 for all p ≠ x

so f(p) > f(x) for every p ≠ x: the solution x of Ax = b is the unique minimizer of f.
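The identity f(p) − f(x) = ½ (p − x)^T A (p − x) derived above can be checked numerically. A small pure-Python sketch (the matrix, vectors, and constant are arbitrary illustrative choices):

```python
# Numerical check of: f(p) - f(x) = 1/2 (p - x)^T A (p - x)
# for f(v) = 1/2 v^T A v - b^T v + c, with A SPD and x solving A x = b.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def f(A, b, c, v):
    # f(v) = 1/2 v^T A v - b^T v + c
    return 0.5 * dot(v, matvec(A, v)) - dot(b, v) + c

A = [[4.0, 1.0], [1.0, 3.0]]        # symmetric positive definite
x = [1.0 / 11.0, 7.0 / 11.0]        # chosen solution
b = matvec(A, x)                    # b = A x, so x minimizes f
c = 2.0                             # arbitrary constant
p = [0.7, -1.2]                     # any other vector

lhs = f(A, b, c, p) - f(A, b, c, x)
e = [pi - xi for pi, xi in zip(p, x)]
rhs = 0.5 * dot(e, matvec(A, e))
```

Since A is positive definite and p ≠ x, the difference lhs is strictly positive, confirming that x is the minimizer.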
Parallel Computation Design
Iterations of the conjugate gradient method can be executed only in sequence, so the most advisable approach is to parallelize the computations carried out at each iteration.
The most time-consuming computation is the multiplication of the matrix A by the vectors x and d.
Additional operations of lower computational complexity order are the various vector-processing procedures (inner product, addition and subtraction, multiplication by a scalar).
When implementing the parallel conjugate gradient method, parallel algorithms for matrix-vector multiplication can be used.
"Pure" Conjugate Gradient Method (Quadratic Case)
0 - Starting at any x0, define d0 = −g0 = b − Q x0, where gk is the gradient of the objective function at the point xk
1 - Using dk, calculate the new point x_{k+1} = xk + αk dk, where
    αk = −(gk^T dk) / (dk^T Q dk)
2 - Calculate the new conjugate gradient direction d_{k+1}, according to:
    d_{k+1} = −g_{k+1} + βk dk, where
    βk = (g_{k+1}^T Q dk) / (dk^T Q dk)
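One explicit step of this "pure" iteration shows where the method's name comes from: the formula for βk makes the new direction d1 Q-conjugate to d0 (d1^T Q d0 = 0), and on an n-variable quadratic the exact minimizer is reached in n steps. A pure-Python sketch on a 2-variable example (names are illustrative):

```python
# One explicit step of the "pure" CG iteration on the quadratic
# 1/2 x^T Q x - b^T x, verifying Q-conjugacy of d1 and d0 and
# two-step convergence to the minimizer x* = Q^{-1} b.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(Q, v):
    return [dot(row, v) for row in Q]

Q = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x0 = [0.0, 0.0]

# step 0: d0 = -g0 = b - Q x0
g0 = [gi - bi for gi, bi in zip(matvec(Q, x0), b)]
d0 = [-g for g in g0]

# step 1: x1 = x0 + alpha0 d0
Qd0 = matvec(Q, d0)
alpha0 = -dot(g0, d0) / dot(d0, Qd0)
x1 = [xi + alpha0 * di for xi, di in zip(x0, d0)]

# step 2: d1 = -g1 + beta0 d0
g1 = [gi - bi for gi, bi in zip(matvec(Q, x1), b)]
beta0 = dot(g1, Qd0) / dot(d0, Qd0)
d1 = [-gi + beta0 * di for gi, di in zip(g1, d0)]

conj = dot(d1, Qd0)     # Q-conjugacy: should be 0

# second (final) step reaches the exact minimizer
Qd1 = matvec(Q, d1)
alpha1 = -dot(g1, d1) / dot(d1, Qd1)
x2 = [xi + alpha1 * di for xi, di in zip(x1, d1)]
```

For this system the minimizer is x* = [1/11, 7/11], reached exactly (up to rounding) after the second step.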
ADVANTAGES
1) The gradient is always nonzero and linearly independent of all previous direction vectors.
2) Simple formula to determine the new direction, only slightly more complicated than steepest descent.
3) The process makes good progress because it is based on gradients.
ADVANTAGES
The simple formulae for updating the direction vector are attractive.
The method is slightly more complicated than steepest descent, but converges faster.
The conjugate gradient method is an indirect (iterative) solver.
It is used to solve large systems.
It requires a small amount of memory.
It does not require explicit matrix factorizations.
Conclusion
The conjugate gradient method is a linear solver used in a wide range of engineering and science applications.
However, conjugate gradient has a complexity drawback due to the high number of arithmetic operations involved in matrix-vector and vector-vector multiplications.
Our implementation reveals that despite the communication cost involved in a parallel CG, a performance improvement compared to a sequential algorithm is still possible.
References
- Dimitri P. Bertsekas and John N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods.
- Thomas Rauber and Gudula Rünger, Parallel Programming for Multicore and Cluster Systems.
- Gene Golub and James M. Ortega, Scientific Computing: An Introduction with Parallel Computing.
- Michael J. Quinn, Parallel Programming in C with MPI and OpenMP.
- Jonathan Richard Shewchuk, "An Introduction to the Conjugate Gradient Method Without the Agonizing Pain", Edition 1¼, School of Computer Science, Carnegie Mellon University.