Computational Engineering at NACAD
Alvaro L.G.A. Coutinho
NACAD - Center for Parallel Computing
COPPE/Federal University of Rio de Janeiro, Brazil
October, 2003
©Alvaro LGA Coutinho 2/48
Contents:
Introduction: Who we are and what we do
Field Equations for Grid-based Applications
Finite Element Discretization
Computational Ingredients
Grid-based Demonstration Problems
Concluding Remarks
Introduction
Who are we?
NACAD Center for Parallel Computing, COPPE/Federal University of Rio de Janeiro, Brazil
Associated Laboratories
LAMCE, NTT, LAB2M, CEMON: Computer Methods in Engineering Lab, Data Mining Lab, Basin Simulation Lab, Environment Monitoring Lab (Civil Engineering Department)
LASPOT: Power Systems Laboratory (Electrical Engineering Department)
Parallel Computing Laboratory (Computer Science Department)
Introduction (cont'd)
What do we do?
High Performance Computing: research and development
– Parallel, vector, and cluster computing
– Scientific visualization
– Applications to: Petroleum Engineering, Power Systems, Aerospace Engineering, Environment, Data Mining, Governance, Financial Engineering, Meteorology
Cray SV1
InfoServer Itautec
Field Equations for Grid-based Applications
General Form of PDEs for Engineering Systems
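Governing Equations in Eulerian Framework
Navier-Stokes Equations
$$\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} - \nu\,\Delta\mathbf{u} + \frac{1}{\rho}\nabla p = \mathbf{f}(T, c) + \mathbf{q} \quad \text{in } \Omega$$
$$\nabla\cdot\mathbf{u} = 0 \quad \text{in } \Omega$$
Energy Transport Equation
$$\rho c_p \frac{\partial T}{\partial t} + \rho c_p\,\mathbf{u}\cdot\nabla T - \nabla\cdot(k\nabla T) = h_1(T, c) \quad \text{in } \Omega$$
Mass Transfer Equations
$$\frac{\partial c}{\partial t} + \mathbf{u}\cdot\nabla c - \nabla\cdot(\mathbf{K}\nabla c) = h_2(T, c) \quad \text{in } \Omega$$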
Eulerian Governing Equations
Multi-phase Darcy-flow in Porous Media:
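$$\mathbf{u}_\pi = -\frac{K_{ij}}{\mu_\pi}\,\frac{\partial \Phi_\pi}{\partial x_j}$$
$$\Phi_\pi = p_\pi - \rho_\pi\, g\, \mathbf{z}\cdot\boldsymbol{\kappa}$$
$$\frac{\partial\left(\phi\,\rho_\pi S_\pi\right)}{\partial t} = \nabla\cdot\left(\rho_\pi \frac{K_{ij}}{\mu_\pi}\,\frac{\partial \Phi_\pi}{\partial x_j}\right) + \rho_\pi q_\pi$$
$$\pi = 1, 2, \ldots, n_{phases}$$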
From Mendonça, 2003
Governing Equations in Lagrangian Framework
Equation of Motion for Solids and Structures:
Lagrangian Governing Equations
Remarks:
From Quaranta&Alves, 2002
Arbitrary Lagrangian-Eulerian (ALE) Governing Equations
Incompressible N-S equations in ALE frame moving with velocity w:
Velocity w is conveniently adjusted from Eulerian (w = 0) far from the moving object to Lagrangian (w = u) on the fluid-structure interface.
Fluid is considered attached to the body.
An extra field equation must be solved to define the mesh movement: our choice is to solve the Laplacian.
From Felippa, Park and Farhat (CMAME, 2001)
FEM Discretization
Good mathematical background and ability to handle complex geometries by using unstructured grids
FEM Discretization
Variational Formulation
FEM Computational Ingredients
Space-Time Adaptation
– Adaptive step size
– Mesh refinement/unrefinement
Non-linear Solution Methods, Iterative Solvers
Data Structures: Memory complexity O(mesh parameters)
Partitioned Time-Marching Schemes
High Performance Computing Issues
Adaptive Step size Control for Time Step Selection
CFL
Valli, Coutinho, Carey, CNME, 2002
Adaptive Mesh Refinement
Fundamental for high accuracy computations
We prefer adaptive remeshing with Delaunay triangulation with a coarse background mesh
ZZ viscous stress error indicator to guide adaptation
In ALE, we need to move both the background and current meshes
Sampaio&Coutinho, IJNMF, 1999
Nonlinear Solution Method: Inexact Newton Method
Given utol, rtol (relative unknown and residual tolerances) and RHS vector b
do i while not converged
    Compute residual vector, $r_i = b - J_{i-1} u_{i-1}$
    Update Jacobian matrix, $J_i$
    Compute tolerance for iterative driver, $\eta_i$
    Solve $J_i \Delta u = r_i$ for tolerance $\eta_i$
    Update solution, $u \leftarrow u + \Delta u$
    If $\|\Delta u\| / \|u\| \le utol$ and $\|r_i\| / \|b\| \le rtol$ then convergence
End while
Backtracking is sometimes useful!
Coutinho et al, IJNME, 2001
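The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the residual is written in the general form b - F(u), the inner solver is a plain (unpreconditioned) conjugate gradient, and the rule used to pick the forcing tolerance eta is a hypothetical choice, since the slides do not say how it is computed. The names `cg`, `inexact_newton` and the test problem are not from the presentation.

```python
import numpy as np

def cg(J, r, eta, max_iter=200):
    """Conjugate gradients for J du = r, stopped at ||r - J du|| <= eta * ||r||."""
    du = np.zeros_like(r)
    res = r.copy()
    p = res.copy()
    rs = res @ res
    target = eta * np.sqrt(rs)
    for _ in range(max_iter):
        if np.sqrt(rs) <= target:
            break
        Jp = J @ p
        alpha = rs / (p @ Jp)
        du += alpha * p
        res -= alpha * Jp
        rs_new = res @ res
        p = res + (rs_new / rs) * p
        rs = rs_new
    return du

def inexact_newton(residual, jacobian, b, u0, utol=1e-8, rtol=1e-8, max_newton=50):
    """Inexact Newton iteration; returns the solution and the iteration count."""
    u = np.array(u0, dtype=float)
    r_prev = None
    for i in range(1, max_newton + 1):
        r = residual(u)                      # residual vector r_i
        J = jacobian(u)                      # updated Jacobian J_i
        rnorm = np.linalg.norm(r)
        # Forcing tolerance eta_i (assumed rule): loose early, tight near the solution.
        eta = 0.1 if r_prev is None else min(0.1, 0.9 * (rnorm / r_prev) ** 2)
        du = cg(J, r, eta)                   # solve J_i du = r_i to tolerance eta_i
        u += du                              # update solution
        r_prev = rnorm
        if (np.linalg.norm(du) <= utol * np.linalg.norm(u)
                and rnorm <= rtol * np.linalg.norm(b)):
            return u, i
    raise RuntimeError("inexact Newton did not converge")

# Hypothetical test problem: A u + u^3 = b with a symmetric positive definite A.
n = 10
A = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
u_true = np.linspace(0.1, 1.0, n)
b = A @ u_true + u_true ** 3
u, its = inexact_newton(lambda v: b - (A @ v + v ** 3),
                        lambda v: A + np.diag(3.0 * v ** 2),
                        b, np.zeros(n))
```

A backtracking line search, which the slide mentions as sometimes useful, would scale `du` before the update; it is left out here to keep the sketch short.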
Iterative Solution Methods
Symmetric systems: PCG
Non-symmetric systems: GMRES
Matrix-vector products
– Element-by-element
$$\mathbf{K}\mathbf{p} = \sum_{e=1}^{nel} \mathbf{K}_e \mathbf{p}_e$$
– Matrix-free
$$\mathbf{K}\mathbf{p} = \sum_{e=1}^{nel} \left( L(\mathbf{w}, \mathbf{p}) \right)_e$$
Preconditioning keeping same data structures
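As a rough illustration of the element-by-element product, the sketch below computes y = K p without ever assembling K: nodal values are gathered per element, multiplied by the element matrices, and scatter-added back. It assumes one unknown per node and NumPy arrays; the function name `ebe_matvec` and the array layout are assumptions made for this example, not the data structures used by the authors.

```python
import numpy as np

def ebe_matvec(conn, Ke, p):
    """Element-by-element product y = K p.

    conn : (nel, nen) integer array of element connectivities
    Ke   : (nel, nen, nen) array of element matrices
    p    : (nnodes,) vector
    """
    pe = p[conn]                            # gather nodal values per element
    ye = np.einsum("eij,ej->ei", Ke, pe)    # local products K_e p_e
    y = np.zeros_like(p)
    np.add.at(y, conn, ye)                  # scatter-add into the global vector
    return y

# Hypothetical 1D mesh of 5 two-node elements with unit stiffness.
conn = np.array([[i, i + 1] for i in range(5)])
Ke = np.tile(np.array([[1.0, -1.0], [-1.0, 1.0]]), (5, 1, 1))
p = np.arange(6.0) ** 2
y = ebe_matvec(conn, Ke, p)
```

The gather and scatter-add steps are the indirect addressing operations that the later slides try to minimize.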
Edge-based Solution
FE mesh Graph representation
Sparse matrix
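Edge-based FE Scheme
Disassembling of Element Matrix
$$\underbrace{\begin{bmatrix} \bullet & \bullet & \bullet \\ \bullet & \bullet & \bullet \\ \bullet & \bullet & \bullet \end{bmatrix}}_{\text{Element}} = \underbrace{\begin{bmatrix} \times & \times & 0 \\ \times & \times & 0 \\ 0 & 0 & 0 \end{bmatrix}}_{\text{Edge 1}} + \underbrace{\begin{bmatrix} \times & 0 & \times \\ 0 & 0 & 0 \\ \times & 0 & \times \end{bmatrix}}_{\text{Edge 2}} + \underbrace{\begin{bmatrix} 0 & 0 & 0 \\ 0 & \times & \times \\ 0 & \times & \times \end{bmatrix}}_{\text{Edge 3}}$$
Assembling of Edge Matrix
[Figure: edge IJ shared by elements E1 and E2; nodes I, J, K, L]
$$\underbrace{\begin{bmatrix} \;\; & \;\; \\ & \end{bmatrix}}_{\text{Edge}_{IJ}} = \underbrace{\begin{bmatrix} \times & \times \\ \times & \times \end{bmatrix}}_{\text{Elem 1}} + \underbrace{\begin{bmatrix} \times & \times \\ \times & \times \end{bmatrix}}_{\text{Elem 2}}$$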
Edge Matrices and Matvec
Element matrices disassembling
$$\mathbf{K}^e = \sum_{s=1}^{m} \mathbf{K}_s^e$$
m is the number of element edges, which is 6 for tetrahedra or 28 for hexahedra.
Edge Matrix
$$\mathbf{K}_s = \sum_{e \in E} \mathbf{K}_s^e$$
E is the set of all elements sharing a given edge s
Edge-by-edge matrix-vector product
$$\mathbf{K}\mathbf{p} = \sum_{s=1}^{nedges} \mathbf{K}_s \mathbf{p}_s$$
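The disassembly, edge assembly and edge-by-edge product can be sketched together as follows. The example assumes one unknown per node and splits each diagonal entry of an element matrix equally among the element edges touching that node; the slides do not say how the diagonal is distributed, so this splitting, like the names `edge_matrices` and `edge_matvec`, is an assumption made for illustration.

```python
import numpy as np
from itertools import combinations

def edge_matrices(conn, Ke):
    """Disassemble element matrices into 2x2 edge blocks and sum them per edge."""
    nen = conn.shape[1]
    share = 1.0 / (nen - 1)          # each local node touches nen-1 element edges
    acc = {}
    for nodes, K in zip(conn, Ke):
        for a, b in combinations(range(nen), 2):
            i, j = int(nodes[a]), int(nodes[b])
            blk = np.array([[share * K[a, a], K[a, b]],
                            [K[b, a], share * K[b, b]]])
            if i > j:                # store every edge with i < j
                i, j = j, i
                blk = blk[::-1, ::-1]
            acc[(i, j)] = acc.get((i, j), 0.0) + blk
    edges = np.array(sorted(acc))
    Ks = np.array([acc[(int(i), int(j))] for i, j in edges])
    return edges, Ks

def edge_matvec(edges, Ks, p):
    """Edge-by-edge product y = sum over edges of K_s p_s."""
    ps = p[edges]                           # gather the two end-node values
    ys = np.einsum("sij,sj->si", Ks, ps)    # 2x2 edge products
    y = np.zeros_like(p)
    np.add.at(y, edges, ys)                 # scatter-add
    return y

# Two triangles sharing one edge, with arbitrary symmetric element matrices.
rng = np.random.default_rng(0)
conn = np.array([[0, 1, 2], [1, 3, 2]])
Ke = rng.random((2, 3, 3))
Ke = Ke + Ke.transpose(0, 2, 1)
edges, Ks = edge_matrices(conn, Ke)
p = rng.random(4)
y = edge_matvec(edges, Ks, p)
```

For the two-triangle mesh the edge list has five entries, and the shared edge accumulates contributions from both elements, as in the assembly picture.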
Computational costs for symmetric sparse matrix-vector products in tetrahedral meshes
Data Structure   Memory          flop              i/a
EBE              429 × nnodes    1,386 × nnodes    198 × nnodes
Edges            63 × nnodes     252 × nnodes      126 × nnodes
nel ≈ 5.5×nnodes, nedges ≈ 7×nnodes
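The table entries can be traced back to per-element and per-edge counts by dividing by the mesh ratios quoted with it. The short check below does that arithmetic. Reading the resulting 78 words per element as the upper triangle of a symmetric 12×12 matrix (4 nodes × 3 unknowns) and the 9 words per edge as one 3×3 block is an interpretation, not something the slide states.

```python
from fractions import Fraction

nel_per_node = Fraction(11, 2)     # nel ≈ 5.5 × nnodes
nedges_per_node = Fraction(7)      # nedges ≈ 7 × nnodes

# (memory, flop, i/a) coefficients multiplying nnodes, as given in the table.
table = {"EBE": (429, 1386, 198), "Edges": (63, 252, 126)}

per_element = tuple(Fraction(v) / nel_per_node for v in table["EBE"])
per_edge = tuple(Fraction(v) / nedges_per_node for v in table["Edges"])

# Assumed interpretation of the memory counts (3 unknowns per node).
upper_triangle_12x12 = 12 * 13 // 2   # symmetric tetrahedron matrix
block_3x3 = 3 * 3                     # one off-diagonal block per edge

# Edge-based cost relative to EBE for each column.
ratios = tuple(Fraction(e, b) for e, b in zip(table["Edges"], table["EBE"]))

print("per element:", [int(v) for v in per_element])   # memory, flop, i/a
print("per edge:   ", [int(v) for v in per_edge])
print("edges/EBE:  ", [float(r) for r in ratios])
```

Per element this gives 78 words, 252 flops and 36 indirect addressing operations; per edge it gives 9, 36 and 18.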
Superedges
Idea introduced by Löhner (94) and implemented in CSM and CFD by Martins et al (97, 98, 02) and Coutinho et al (01) for tetrahedra and hexahedra
Designed to improve i/a ratio and flop balance
Once data have been gathered from memory to processor (registers), reuse them as much as possible
Formed by edge list reordering
Different groupings are possible, at the cost of increased code complexity
Nodes reordered in increasing order as they appear in the superedge list (Löhner, 93)
2D triangle, 3D tetrahedra
Superedges in blue
Guanabara Bay
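One simple way to form such groups, shown here only as a hypothetical greedy sketch, is to sweep the tetrahedra twice: first collect all six edges of any tetrahedron whose edges are still ungrouped (an s6 superedge), then collect the three edges of any still-ungrouped face (an s3 superedge), and leave the rest as simple edges. The function name and the sweep order are assumptions; the algorithms in the cited papers may differ.

```python
from itertools import combinations

def group_superedges(tets):
    """Greedy grouping of mesh edges into s6 (tetrahedron) and s3 (triangle) superedges."""
    def edges_of(nodes):
        return [tuple(sorted(e)) for e in combinations(nodes, 2)]

    used = set()
    s6, s3 = [], []
    # Pass 1: a tetrahedron with all six edges free becomes one s6 superedge.
    for tet in tets:
        es = edges_of(tet)
        if not used.intersection(es):
            s6.append(es)
            used.update(es)
    # Pass 2: a face with all three edges free becomes one s3 superedge.
    for tet in tets:
        for face in combinations(tet, 3):
            es = edges_of(face)
            if not used.intersection(es):
                s3.append(es)
                used.update(es)
    # Whatever is left stays as simple edges.
    all_edges = {e for tet in tets for e in edges_of(tet)}
    simple = sorted(all_edges - used)
    return s6, s3, simple

# Two tetrahedra sharing only the edge (0, 1).
s6, s3, simple = group_superedges([(0, 1, 2, 3), (0, 1, 4, 5)])
```

In this small example the first tetrahedron yields one s6 group, the face (0, 4, 5) of the second yields one s3 group, and the edges (1, 4) and (1, 5) remain simple.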
Partitioned Time Marching Scheme
Mesh partitioning algorithms for time-marching: I/E, E/E, Iterative/Direct, etc
Partition can evolve in time
Implicit Edges in RED
High Performance Computing Issues
FEM is an unstructured grid method characterized by:
Discontinuous data – no i-j-k addressing
Gather-scatter operations
Random memory access patterns
Data dependence
Minimizing indirect addressing is a must
Parallel Solution Strategies
Shared Memory: Mesh Coloring
Distributed Memory: Mesh Partitioning
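For the shared-memory strategy, the idea of mesh coloring is that elements of the same color share no nodes, so their scatter-add updates cannot collide and can run in parallel or be vectorized. A minimal greedy sketch, assuming a plain list of element connectivities and a hypothetical function name, could look like this:

```python
def color_elements(conn):
    """Greedy coloring: elements with the same color never share a node."""
    node_colors = {}   # node -> colors already taken by elements touching it
    colors = []
    for nodes in conn:
        forbidden = set().union(*(node_colors.get(n, set()) for n in nodes))
        c = 0
        while c in forbidden:      # smallest color not used around these nodes
            c += 1
        colors.append(c)
        for n in nodes:
            node_colors.setdefault(n, set()).add(c)
    return colors

# Chain of four two-node elements: neighbours must alternate colors.
conn = [(0, 1), (1, 2), (2, 3), (3, 4)]
colors = color_elements(conn)
```

Processing the mesh color by color then keeps each batch free of write conflicts; for distributed memory the slides rely on mesh partitioning instead.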
Grid-based Demonstration Problems
Fluid Flow in Deformable Porous Media - Well Stability: What you can do in a PC
Reservoir Engineering: Effects of Memory Speed
Hydrodynamic computations in Araruama Lagoon: Example of Cluster Computing
Fluid-Structure Interaction in Rio-Niteroi Bridge
Stress analysis of sedimentary basins
Fluid Flow in Deformable Porous Media - Well Stability: What you can do in a PC
Quasi-static deformation of plastic porous media coupled with 1-phase flow
Strain depends on pore pressure
Porosity is a function of volumetric change
Staggered coupling
Coupled 1-phase (water) and Solid in 3D: Vertical and Horizontal Wells
Mesh Data
36,105 nodes,
191,163 elements,
236,090 edges (23% simple, 18% s3 and 59% s6)
Solid Material Data
Internal radius 0.11 m; External Radius 20.0 m
Formation Pressure: 32.4 MPa
Insitu stresses (V/H): 32.1 MPa; 9.0 MPa
Young’s modulus: 1.2 GPa, Poisson: 0.2
Internal angle: 45; Cohesion: 8.5 MPa; Biot: 1.00
Coupled 1-phase (water) and Solid in 3D: Vertical and Horizontal Wells
Vertical Well
Stiffness Updating   PCG total   PCG average   NR    BCT   Time (s)
Secant               931         93.1          10    0     579.1
Tangent              931         93.1          10    0     551.4
Horizontal Well
Stiffness Updating   PCG total   PCG average   NR    BCT   Time (s)
Secant               6,049       40.1          148   0     4,709.2
Tangent              1,291       92.2          14    4     981.4
3 pore-pressure load steps: 34.088, 14.9 and 4.9 MPa; Non-linear Solver: Edge-based Inexact Newton; PIII 1GHz
Numerical Results for Horizontal Well
Plastic fringes around well Total displacements around well
Reservoir Engineering: Effects of Memory Speed
True heterogeneous reservoir: SPE 10th comparative project: http://www.streamsim.com/pages/spe10.html
Reservoir dimensions: 1200x2200x170 ft
Unstructured grid generated from 60x220x85 cells
– 5,610,000 tetrahedra
– 1,159,366 points
– 6,843,365 edges
Effects of Memory Speed
From Jack Dongarra, 2002
Preprocessing and Matvec performance on the CRAY SV1
[Bar chart: preprocessing and MATVEC time (s), axis from 0 to 1000, for the SuperE, Edge and G&L data structures. Bar labels: 755, 933, 777 and 224.23, 204.66, 982.85.]
G&L: Galle and Lohner, 2002
Reordering effect
Superedge/edge = 0.81
G&L/edge = 0.83
Hydrodynamic computations in Araruama Lagoon: Example of Cluster Computing
From http://data.ecology.su.se/mnode/South%20America/araruama/araruama1/Araruamabud.htm
Geometrical Data
39,300 m
12,900 m
Open boundary
Small Mesh, Dual method, METIS, 4 procs
Topological Data and Computer
Mesh     Nodes     Elements   Edges     Equations
Small    19,732    36,300     56,035    52,859
Medium   75,767    145,200    220,970   214,628
Big      296,737   580,800    877,540   864,866
Computer: InfoServer Itautec, 16 nodes / 32 processors PIII-1GHz
– Memory: 8 Gbytes RAM (distributed)
– Disk: 250 Gbytes
– Fast Ethernet, Gigabit
[Plot, medium mesh: simulation time (s) versus actual time (s); curves: Reference, Total, Solver.]
Performance Results
[Plots: speed-up versus number of processors for dt = 1, 10, 20, 30, 40 and 50 s. Panels: Medium Mesh (up to 16 processors) and Big Mesh (up to 24 processors).]
Simulation Results
Fluid-Structure Interaction in Rio-Niteroi Bridge
[Figure: bridge elevation, Rio side. Dimension labels: 300 m, 200, 200, 30, 44, 30, 44; steel structure; 60 m.]
Solution Characteristics
Space-time adaptive solution for the incompressible N-S equations in ALE frame
Field reduction for bridge structure: 1 vertical mode
LES for fluid via numerically implicit SGS model of Sampaio et al, IJNMF, 2004
Cray SV1, parallel efficiency > 0.88 up to 8 CPUs
[Figure labels: Eulerian domain, ALE domain]
Numerical Simulations
Comparison with Experimental Results
Stress analysis of sedimentary basins
Solving large scale CSM problems undergoing plastic deformations is very important in many engineering applications
CSM is nowadays used in Oil & Gas to understand the formation mechanisms of sedimentary basins and to evaluate new prospects
Complex geometries, due to the presence of faults, rheological and mechanical factors
Need to improve current spatial resolution capabilities
Unstructured grid methods combined with efficient data structures and nonlinear solution methods
Reordering vertices and nodes is very important
Extensional Behavior of a Sedimentary Basin
3D Model of a sedimentary cover (4 km) over a basement (2 km) with length of 15 km and thickness of 6 km
Model presents an ancient inclined fault with 500 m length and 60° of slope
Finite element mesh: 2,611,036 tetrahedra, 3,916,554 edges and 445,752 nodes
Parallel run on a CRAY J932 at Eagan, MN, USA
Fault detail
Numerical Results
Memory requirements to hold the tangent stiffness matrix (Mw)
PCG-EBE: 204.0 (1.00)
PCG-Edges: 35.3 (0.17)
Inexact Tangent Stiffness Solution
ITS [10^-1, 10^-6]
Nonlinear Iters: 36
PCG Iters: 9,429
Elapsed Time (min): 15
12 load increments, nonlinear tolerances 10^-3. CRAY J932se, 16 processors
Real-life Basin in NE Brazil
Final Remarks
Computational Engineering and Science appear in many important engineering problems and mechanical systems
There is no general approach
Highly sophisticated techniques are needed to attain desired computational efficiency
Need of more computer power to tackle challenging 3D problems
Research in grid-computing, multiphysics, graph-reordering strategies, fast multipole methods, etc
Acknowledgements
Collaborators: J. Alves, L. Landau, M. Pfeil, R. Battista, J. Telles, F. Ribeiro (COPPE), P. Sampaio (IEN), U. Mello (IBM), G. Carey (UT-Austin), T. Tezduyar (Rice)
Students (and ex): M. Martins, M. Cunha, R. Sydenstricker, L. Catabriga, C. Dias, A. Valli, P. Hallak, I. Slobodcicov, P. Antunes, D. Souza, P. Sesini, A. Silva, R. Elias, A. Mendonça, W. Ney
Funding: CNPq, CAPES, FINEP/CTPetro, ANP, Petrobras
Computational Resources: NACAD, Cray, SGI