improvement of parallel samcef mecano using mumps...

23
This document is the property of SAMTECH S.A. Page 1 Improvement of parallel SAMCEF Mecano using MUMPS solver Jean-Pierre Delsemme, Masha Sosonkina

Upload: others

Post on 26-Mar-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 1

Improvement of parallel SAMCEF Mecano

using MUMPS solver

Jean-Pierre Delsemme, Masha Sosonkina

Page 2: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 2

Overview

� SAMTECH, who we are

� The past: History of MUMPS in SAMCEF

� The present: "Gigadof" challenge in Maaximus

� The future: use of PETSC and MUMPS for large

eigen-value problems

� Conclusions

Page 3: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 3

Who we are

� European Leader in CAE

� Development and engineering services

� Strong link with aerospace & aeronautic

� Active in automotive, wind turbine and

electric industry

Page 4: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 4

Our numbers

� Our turnover

• 18 millions € in 2006

• 23 millions € in 2008

� Our employees

• 250 persons

• 3 activities (SAMTECH, OPEN ENGINEERING, GDTECH)

• 12 branches in six European countries• 2 Asian branches (China, Japan)

� Our history

• First line of the FE code « SAMCEF » in 1967• First customer in aeronautic in 1977

• 9th subsidiary in 2009

Page 5: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 5

Standard Analyses in Structures Simulation

� Stress Analysis (linear and non-linear)

� Heat Transfer (conduction, convection,

radiation)

� Linear Dynamic analysis (free vibration,

harmonic, transient, random, fluid-structure

interaction)

� Transient Analysis (non linear simulation in

time domain)

� Rotor dynamics

� Optimization

On Metallic and composite structures

Page 6: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 6

� Thermo-elasto-visco-plastic analysis

� Simulation of thermal ablation process

� Wound composite structures

� Mixing 2D-axisymetrical and 3D models

� Mixing finite elements and mechanisms

� Openness and rapid adaptation

� User material

� Link to proprietary codes

Advanced Analyses in Structure Simulation

TEA Mecano

Page 7: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 7

Our references

无法找到该页无法找到该页无法找到该页无法找到该页

Page 8: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 8

SAMCEF Mecano

� Flow chart Input

Preprocessing

Problem assemblyK=Σkelement(q) G=Σgelement(q)

Problem solve: Kq=G

Convergence y/n

Results archive

Output

Page 9: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 9

SAMCEF v11 (2005)

Problem assemblyK=Σkelement(q) G=Σgelement(q)

Problem solve: Kq=GWrite K and G on file

csh –c mpirun MUMPS

Read q from file

Convergence y/n

Mumps …Mumps 3

Mumps 2Mumps 1

Page 10: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 10

SAMCEF v11 (2005)

� Airbus NL benchmarks, 3 large scale models

130 k elements, 260 k and 560k ( 3.624 Mdof)

� Use of external parallel solver MUMPS

� 8 h 0' elapsed time on HP

� "Despite significant efforts by HP (hundreds of CPU

hours, hardware replaced etc.), it has ‘so far’ only

been possible to run the 560k shell model using

Mecano" Airbus

Lsm560k = 10 x

Page 11: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 11

Preprocessing

Problem assembly

Problem solve: Kq=G

Preprocessing

Problem assembly

Problem solve: Kq=G

SAMCEF v12 (2007)

Input

Preprocessing

Problem assembly

Problem solve: Kq=G

Convergence y/n

Results archive

Output

Matrix assembled by SAMCEF and

provided as I,J,Val to MUMPS distributed by

processors.

Page 12: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 12

SAMCEF v12 (2007)

� In-depth software modification

� Integration of MUMPS solver

� Parallel treatment of elements (material law

integration, matrix and force update…)

0:00

1:12

2:24

3:36

4:48

6:00

7:12

8:24

9:36

1 proc /1 node

2 proc /2 nodes

3 proc /3 nodes

6 proc /3 nodes

Ela

pse

d t

ime

(h:m

m:s

s)V12V11

Composite skin stringer

separation

Page 13: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 13

Results archive

Output

Preprocessing

Problem assembly

Problem solve: Kq=G

Preprocessing

Problem assembly

Problem solve: Kq=G

SAMCEF v13 (2009)

Input

Preprocessing

Problem assembly

Problem solve: Kq=G

Convergence y/n

Results archive

Output

Results archive

Output

Page 14: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 14

SAMCEF v13 (2009)

� Parallel treatment of results

� Bottleneck for large model

� Indirect benefit du to the split of data-base

� Post processing adapted

• 4 750 000 DOFs

• 1 073 380 Elements

• Cluster of 6 PC Linux (Xeon)

• V12.1 � 53 H 39 Min

• V13.1 � 22 H 32 Min

Page 15: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 15

SAMCEF v14 (2011)

� Maaximus, the "Gigadof" challenge

� Calculate a fuselage

made of,

as much as possible,

identical single barrels

� Identification of new

limitations

� "standardisation" of

long integer version

Page 16: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 16

SAMCEF v14 (2011)

B1 Benchmark

• 3 sections of fuselage

• 13,956,620 dof's

• 1,891,176 shell elem.

• 2322011 elem. In total

• 7 h 24' on a cluster of 8 nodes,

up to 100% of prescribed load

• 5 time steps, 1 rejected,

19 iterations

• Intel(R) Core(TM) 2.67 GHz

• 12 Gb per node

Page 17: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 17

SAMCEF v14 (2011)

B1 Benchmark

• 4 sections of fuselage

• 18,580,217 dof's

• 2,521,568 shell elem.

• 3,092,810 elem. In total

• 28 h 30' on a cluster

of 10 nodes.

• 7 time steps, 5 rejected,

57 iterations up to 91% of the load

• Intel(R) Core(TM) 2.67 GHz

• 12 Gb per node

Page 18: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 18

Large eigen-value problem

� Find many vibration modes of large-scale problems.

� Typically need 1000 modes.

� Number of degrees of freedom (DOF) may be more

than 10 million.

� Algorithms:

• Generalized eigenvalue problem.

• Shift-and-invert in parallel.

• Direct sparse linear system solver.

� Development objective: Scalable code both in CPU and

memory.

Page 19: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 19

Parallel Software Architecture

� PETSc [Argone, USA] – linear algebra solver interface.

� SLEPc [Univ of Valencia, Spain] - eigensolver.

� MUMPS [INRIA, France] – linear system solver.

• Options: Pastix [INRIA, France], SuperLU [LBNL, USA].

� Matrix is always distributed.

� Various partitioners may be used:

• (Par)Metis, block, (pt)Scotch

Page 20: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 20

Gains due to Parallel Implementation

� Multi-etage problem with cyclic symmetry.

• 5 MLN DOF, 402 MLN of entries.

• 456 modes requested.

� Gains compared with original (sequential) DYNAM.

• nonsymmetric MUMPS solver.

• 8 OpenMP threads in Intel MKL.Speedup MUMPS, %

20 12.0 4330 16.7 4540 18.8 5180 26.5 58

MPI procs

Page 21: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 21

Issues and Future Work

� Solve large-scale eigenvalue problems: 20+ MLN DOF.

• Symmetric MUMPS solver has been used.

• Maximum sparsity is extracted from the mass matrix.

• Sequential ordering runs out of space -> parallel ordering has been used.

Page 22: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 22

Conclusions

� Thanks to MUMPS's team.

� Several improvements of MUMPS since 2005 are used

in SAMCEF

• Out of core

• Possibility to provide user's allocated memory to MUMPS

• …

� Wish list

• Static condensation

� Extension to Mass matrix

• Intensive use of Lagrange's multiplyers

� Possibility to take it into account in the ordering ?

� Mode of elimination "deterministic"

cr1

ccrcrr*rr KKKKK −−=

cr1

cccc1

ccrc

cr1

ccrccr1

ccrcrr*rr

KKMKK

MKKKKMMM−−

−− +−−=

Page 23: Improvement of parallel SAMCEF Mecano using MUMPS solvermumps.enseeiht.fr/doc/ud_2010/Delsemme_talk.pdf · This document isthe propertyof SAMTECHS.A. Page 1 Improvement of parallel

This document is the property of SAMTECH S.A. Page 23

Thank you for your time.

Any questions ?