ParCFD 2008
Parallel computation of pollutant dispersion in industrial sites
Julien Montagnier, Marc Buffat, David Guibert
Motivation
Numerical simulation of pollutant dispersion in industrial sites
Better risk evaluation than with 1D dispersion models
An efficient Navier-Stokes solver is needed to run parametric studies
Goal: development of a parallel 3D Navier-Stokes solver on unstructured meshes.
[Figure: comparison of the 3D simulation, observations, and the 1D model]
Numerical Methods
Properties:
Finite volume on unstructured finite-element meshes
Incompressible segregated solver with projection method
Extension to variable-density flows via a projection on the energy equation
(Nerinckx et al., Mach-uniformity through the coupled pressure and temperature
correction algorithm, 2005)
Algorithm
Fixed-point nonlinear iteration at each time step (a minimal sketch follows):
    A (W^{k+1} - W^k) = F(W^k)
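One possible realization of this loop with PETSc, as a sketch only: the assemble_system routine, the tolerance, the iteration cap, and the object names are illustrative assumptions, not the authors' code.

    #include <petscksp.h>

    /* Hypothetical user routine: evaluates fluxes on the local subdomain
       and fills the Jacobian A and residual F at the current state W. */
    extern PetscErrorCode assemble_system(Mat A, Vec F, Vec W);

    /* One time step of the fixed-point iteration A (W^{k+1} - W^k) = F(W^k),
       assuming the KSP and the vectors were created elsewhere. */
    PetscErrorCode time_step(KSP ksp, Mat A, Vec F, Vec W, Vec dW)
    {
      PetscReal res;
      for (PetscInt k = 0; k < 50; k++) {     /* illustrative iteration cap */
        assemble_system(A, F, W);             /* fluxes -> A and F(W^k) */
        KSPSetOperators(ksp, A, A);           /* recent PETSc signature */
        KSPSolve(ksp, F, dW);                 /* dW = W^{k+1} - W^k */
        VecAXPY(W, 1.0, dW);                  /* W <- W + dW */
        VecNorm(dW, NORM_2, &res);
        if (res < 1e-8) break;                /* fixed point reached */
      }
      return 0;
    }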
Parallelization:
Flux evaluation and assembly (matrix + RHS) are parallelized using domain decomposition (sketched below)
Implicit upwind schemes call for efficient solvers for large unstructured sparse linear systems of several million dofs.
[Figure: parallel assembly of the matrix and RHS]
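A minimal sketch of this assembly step with PETSc: the face loop and compute_face_flux are hypothetical stand-ins for the finite-volume flux routine, and PETSc forwards off-process interface contributions during the assembly calls.

    #include <petscksp.h>

    /* Hypothetical flux routine: returns the two dof indices sharing face f,
       the 2x2 Jacobian block, and the residual contributions. */
    extern void compute_face_flux(PetscInt f, PetscInt rows[2],
                                  PetscScalar jac[4], PetscScalar rhs[2]);

    void assemble(Mat A, Vec F, PetscInt n_local_faces)
    {
      for (PetscInt f = 0; f < n_local_faces; f++) {
        PetscInt    rows[2];
        PetscScalar jac[4], rhs[2];
        compute_face_flux(f, rows, jac, rhs);
        MatSetValues(A, 2, rows, 2, rows, jac, ADD_VALUES);
        VecSetValues(F, 2, rows, rhs, ADD_VALUES);
      }
      /* Contributions destined for other ranks at subdomain interfaces
         are exchanged during the assembly phase. */
      MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
      MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
      VecAssemblyBegin(F);
      VecAssemblyEnd(F);
    }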
Parallel Linear Solvers
Use of PETSc Krylov subspace iterative methods
Convergence accelerated with different preconditioners (Hypre library):
parallel ILU / AMG (algebraic multigrid)
Many ways of tuning AMG methods (see the sketch after this list):
Coarsening schemes
Falgout
PMIS (Parallel Modified Independent Set)
HMIS
Interpolation operators
Classical interpolation
FF, FF1 (De Sterck, Yang: Copper Mountain 2005; De Sterck 2006)
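As an illustration, this is how such a combination can be selected through PETSc's hypre interface; a sketch for a recent PETSc, with option names taken from the PETSc/BoomerAMG bindings, and CG assumed since the Poisson system is SPD.

    #include <petscksp.h>

    /* Build a CG solver preconditioned by hypre BoomerAMG with
       PMIS coarsening and FF1 interpolation. */
    PetscErrorCode make_poisson_solver(Mat A, KSP *ksp)
    {
      PC pc;
      KSPCreate(PETSC_COMM_WORLD, ksp);
      KSPSetOperators(*ksp, A, A);
      KSPSetType(*ksp, KSPCG);                /* SPD Poisson system */
      KSPGetPC(*ksp, &pc);
      PCSetType(pc, PCHYPRE);
      PCHYPRESetType(pc, "boomeramg");        /* hypre AMG preconditioner */
      /* low-complexity coarsening + FF1 interpolation */
      PetscOptionsSetValue(NULL, "-pc_hypre_boomeramg_coarsen_type", "PMIS");
      PetscOptionsSetValue(NULL, "-pc_hypre_boomeramg_interp_type",  "FF1");
      KSPSetFromOptions(ksp ? *ksp : NULL);
      return 0;
    }

The same choice can be made purely at run time with -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_coarsen_type PMIS -pc_hypre_boomeramg_interp_type FF1.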
3D Poisson Equation: Tetrahedral Mesh
Scale-up: 1 to 64 processors, 12,500 to 400,000 dofs / proc
Speed-up: 1,000,000 dofs
P2CHPD IBM cluster, with dual quad-core Intel processor nodes and InfiniBand
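For reference, the two metrics in their usual definitions (an assumption, since the talk does not spell them out): speed-up measures strong scaling at a fixed total problem size, while scale-up is weak scaling at a fixed number of dofs per processor,

    S(p) = \frac{T(1)}{T(p)}, \qquad E(p) = \frac{S(p)}{p}
    \quad \text{(strong scaling, here at $10^6$ dofs)},

with weak scaling growing the total size as p times the per-processor dofs, ideal behavior being constant run time.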
Scale-up results
Three groups of preconditioning methods stand out:
1) ILU
2) AMG with high-complexity coarsening schemes
3) AMG with low-complexity coarsening schemes
AMG scales up better with low-complexity coarsening schemes
Krylov with AMG preconditioning + FF1 interpolation gives the best scale-up
(500x faster than ILU)
On the IBM cluster, scalability is good from 200,000 dofs / proc upward
With fewer dofs per processor, communication overhead causes a loss of scalability
Beware the problem size!
Speed-up on 1,000,000 dofs
PMIS-FF1 gives the best results
On 32 processors: 10% faster than PMIS-FF, 270% faster than Falgout-classical, 500% faster than ILU
Efficiency collapses beyond 16 processors (62,500 dofs / proc)
[Figure: speed-up vs. number of processors]
Real case study
PMIS-FF1 preconditioning
Real geometry
Meshes of 5M cells, 30M dofs
Scalar transport equation for the pollutant (standard form recalled below)
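The talk does not write the equation out; a standard advection-diffusion form for the transported scalar, with the exact source and diffusion terms used here being an assumption:

    \frac{\partial (\rho c)}{\partial t} + \nabla \cdot (\rho \mathbf{u} c)
      = \nabla \cdot (\rho D \, \nabla c) + S_c

where c is the pollutant concentration, u the computed velocity field, D a diffusivity, and S_c a source term.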
Parallel efficiency on the Navier-Stokes problem
Assembly time: 30% of total time
Parallelization of matrix and RHS assembly performs well
Parallelization of the linear solver performs well but depends on problem size
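One way to read these numbers, as a back-of-the-envelope decomposition not given in the talk: with the assembly part (about 30% of the serial time) scaling nearly ideally,

    T(p) \approx \frac{0.3\,T(1)}{p} + T_{\mathrm{solve}}(p),

the solver term, which contains the communication, eventually dominates, so speed-up saturates once T_solve(p) stops shrinking with p.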
Conclusion
Objective: build a new, efficient parallel Navier-Stokes solver
Laplacian equation: the low-complexity PMIS coarsening with FF1 interpolation gives the best results (speed-up, scale-up, simulation times)
500 times faster than ILU preconditioning
Navier-Stokes problem on a 5M-cell mesh runs in 6 hours on 64 processors
Good speed-up on the 5M-cell mesh up to 64 processors
Communication in the linear solver limits the speed-up