TRANSCRIPT
ANSYS HPC: High Performance Computing Leadership
Barbara [email protected]
Why HPC for ANSYS? Insight you can't get any other way.

HPC enables high fidelity:
• Solve the un-solvable
• Be sure your design is "right"
• Innovate with confidence

HPC delivers throughput:
• Consider multiple design ideas
• Optimize your design
• Ensure performance across a range of conditions
HPC has become a software issue.
• Clock speed – leveling off
• Core counts – growing, even exploding (GPUs)
• Future performance depends on highly scalable parallel software
• ANSYS goal: lead the way into this future
Source: http://www.lanl.gov/news/index.php/fuseaction/1663.article/d/20085/id/13277
HPC Deployment – Trends / Challenges

Local computing infrastructure for simulation is being replaced by centralized HPC resources, shared by a globally distributed workforce.
• Remote access, scheduling, and visualization tools
• Data sharing, IP protection, central data archiving

"Mega" simulations for high-fidelity understanding; design exploration for improved product integrity
• (Intermittent) need for 1000s of processors, terabytes of RAM
• Storage / data management

"Private cloud" strategies for infrastructure
• Remote or on-site, 3P-managed
• Elastic capacity, rapid deployment, service-level agreements, CAPEX vs. OPEX
Agenda
• Performance / Software Milestones – scalability, GPUs
• Current HPC Practice – customer case studies
• Deployment Trends – ANSYS vision
ANSYS FLUENT Scaling Achievement

[Charts: solver rating vs. number of cores – 2008 hardware (Intel Harpertown, DDR IB) and 2010 hardware (Intel Westmere, QDR IB)]

Systems keep improving: faster processors, more cores
• Ideal rating (speed) doubled in two years!

Memory bandwidth per core and network latency/bandwidth stress scalability
• The 2008 release (12.0) re-architected MPI – a huge scaling improvement, for a while…
• The 2010 release (13.0) introduces hybrid parallelism – and scaling continues!
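For context, the "rating" on these charts is FLUENT's standard benchmark throughput metric; as conventionally defined (worth verifying against ANSYS benchmark documentation), it is the number of benchmark jobs a system could complete per day:

$$ \text{Rating} = \frac{86400 \ \text{s/day}}{T_{\text{wall}} \ \text{(s per job)}} $$

So a rating that doubles means the same job finishes in half the wall-clock time.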
Extreme CFD Scaling – 1000s of Cores

Enabled by ongoing software innovation. Hybrid parallel: fast shared-memory communication (OpenMP) within a machine to speed up overall solver performance; distributed-memory communication (MPI) between machines. A minimal sketch of this pattern follows.
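To make the hybrid idea concrete, here is a generic MPI+OpenMP sketch in C (illustrative only, not ANSYS FLUENT source): OpenMP threads share memory within a node, while MPI handles communication across nodes.

```c
/* Hybrid MPI+OpenMP sketch: threads within a machine, ranks between machines. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided, rank, nranks;
    /* Request thread support so OpenMP threads can coexist with MPI. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    double local_sum = 0.0;
    /* Shared-memory parallel loop within one machine (OpenMP). */
    #pragma omp parallel for reduction(+:local_sum)
    for (int i = 0; i < 1000000; i++)
        local_sum += 1.0 / (i + 1.0 + rank);

    /* Distributed-memory reduction between machines (MPI). */
    double global_sum = 0.0;
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("ranks=%d threads/rank=%d sum=%f\n",
               nranks, omp_get_max_threads(), global_sum);
    MPI_Finalize();
    return 0;
}
```

The payoff of this design is fewer MPI ranks per node, which cuts inter-process message traffic while keeping all cores busy.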
Parallel Scaling – ANSYS Mechanical

[Charts: solution rating vs. number of cores – Sparse Solver (parallel re-ordering, up to 256 cores) and PCG Solver (pre-conditioner scaling, up to 64 cores)]
Focus on bottlenecks in the distributed-memory solvers (DANSYS):
• Sparse Solver – parallelized equation ordering; 40% faster with updated Intel MKL
• Preconditioned Conjugate Gradient (PCG) Solver – parallelized preconditioning step (see the sketch below)
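The preconditioning step is per-iteration work that repeats hundreds of times, so parallelizing it matters. Below is a hedged, generic sketch in C – a textbook PCG loop on a 1D Laplacian test matrix with an OpenMP-parallel Jacobi preconditioner – showing where that step sits in the iteration; it is not the ANSYS/DANSYS implementation.

```c
/* Textbook PCG with an OpenMP-parallel Jacobi (diagonal) preconditioner. */
#include <stdio.h>
#define N 1000

/* y = A*x for a 1D Laplacian (tridiagonal SPD test matrix, diag = 2). */
static void matvec(const double *x, double *y) {
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        double v = 2.0 * x[i];
        if (i > 0)     v -= x[i - 1];
        if (i < N - 1) v -= x[i + 1];
        y[i] = v;
    }
}

static double dot(const double *a, const double *b) {
    double s = 0.0;
    #pragma omp parallel for reduction(+:s)
    for (int i = 0; i < N; i++) s += a[i] * b[i];
    return s;
}

int main(void) {
    static double x[N], r[N], z[N], p[N], Ap[N];
    for (int i = 0; i < N; i++) { x[i] = 0.0; r[i] = 1.0; } /* b = 1, x0 = 0 */
    /* Preconditioning step z = M^{-1} r (Jacobi: divide by diag(A) = 2). */
    #pragma omp parallel for
    for (int i = 0; i < N; i++) z[i] = r[i] / 2.0;
    for (int i = 0; i < N; i++) p[i] = z[i];
    double rz = dot(r, z);
    for (int it = 0; it < 200 && rz > 1e-20; it++) {
        matvec(p, Ap);
        double alpha = rz / dot(p, Ap);
        #pragma omp parallel for
        for (int i = 0; i < N; i++) { x[i] += alpha * p[i]; r[i] -= alpha * Ap[i]; }
        /* The parallelized preconditioning step, applied every iteration. */
        #pragma omp parallel for
        for (int i = 0; i < N; i++) z[i] = r[i] / 2.0;
        double rz_new = dot(r, z);
        double beta = rz_new / rz;
        rz = rz_new;
        #pragma omp parallel for
        for (int i = 0; i < N; i++) p[i] = z[i] + beta * p[i];
    }
    printf("final r.z = %e\n", rz);
    return 0;
}
```

In a production solver the preconditioner is far heavier than a diagonal scale, which is exactly why leaving it serial would cap scaling.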
Architecture-Aware Partitioning

Original partitions are remapped to the cluster considering the network topology and latencies.
• Minimizes inter-machine traffic, reducing load on network switches
• Improves performance, particularly on slow interconnects and/or large clusters

[Diagram: partition graph for 3 machines, 8 cores each; colors indicate machines – original mapping vs. new mapping]

The objective is sketched in code below.
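As a hedged illustration of the objective (not ANSYS's actual remapping algorithm), this C sketch scores a partition-to-machine assignment by the traffic that must cross machine boundaries; a remapping step would search for an assignment with a lower score.

```c
/* Score a partition-to-machine mapping by implied inter-machine traffic. */
#include <stdio.h>

#define NPART 24   /* 24 partitions */
#define NMACH 3    /* 3 machines, 8 cores each */

/* traffic[i][j]: data volume exchanged between partitions i and j
   (hypothetical numbers; a real solver derives these from the mesh). */
double traffic[NPART][NPART];

/* Total traffic that crosses machine boundaries for a given mapping. */
double inter_machine_cost(const int map[NPART]) {
    double cost = 0.0;
    for (int i = 0; i < NPART; i++)
        for (int j = i + 1; j < NPART; j++)
            if (map[i] != map[j])     /* different machines: over the network */
                cost += traffic[i][j];
    return cost;
}

int main(void) {
    /* Toy traffic pattern: each partition talks mostly to its neighbor,
       as in a 1D mesh decomposition. */
    for (int i = 0; i < NPART - 1; i++) traffic[i][i + 1] = 1.0;

    int original[NPART], remapped[NPART];
    /* Original: round-robin over machines (ignores topology). */
    for (int i = 0; i < NPART; i++) original[i] = i % NMACH;
    /* Remapped: contiguous blocks of 8 partitions per machine, so that
       neighboring partitions tend to share a machine. */
    for (int i = 0; i < NPART; i++) remapped[i] = i / (NPART / NMACH);

    printf("original cost %.1f vs remapped %.1f\n",
           inter_machine_cost(original), inter_machine_cost(remapped));
    return 0;
}
```

With this toy pattern the round-robin mapping sends 23 units over the network while the block mapping sends 2, which is the effect the slide's "new mapping" is after.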
File I/O Performance

Case file I/O
• Both read and write significantly faster in R13
• A combination of serial-I/O optimizations as well as parallel-I/O techniques, where available (see the generic sketch below)

[Chart: truck_14m, case read]

Parallel I/O (.pdat)
• Significant speedup of parallel I/O, particularly for cases with a large number of zones
• Support added for Lustre, EMC/MPFS, and AIX/GPFS file systems

Data file I/O (.dat)
• Performance in R12 was already highly optimized; further incremental improvements in R13

Parallel data write, R12 vs. R13:
• BMW: −68%
• FL5L2 4M: −63%
• Circuit: −97%
• Truck 14M: −64%
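For readers unfamiliar with parallel I/O, here is a minimal, generic MPI-IO sketch in C (not FLUENT's .pdat implementation): every rank writes its own block of a shared file collectively, instead of funneling all data through one process.

```c
/* Generic MPI-IO collective write: each rank writes its slice in parallel. */
#include <mpi.h>

#define LOCAL_N 100000  /* doubles owned by each rank (illustrative size) */

int main(int argc, char **argv) {
    int rank;
    static double block[LOCAL_N];   /* this rank's slice of the solution */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int i = 0; i < LOCAL_N; i++) block[i] = (double)rank;

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "solution.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    /* Collective write: every rank writes at its own offset simultaneously. */
    MPI_Offset offset = (MPI_Offset)rank * LOCAL_N * sizeof(double);
    MPI_File_write_at_all(fh, offset, block, LOCAL_N, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);
    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```

On parallel file systems such as Lustre or GPFS this pattern lets the aggregate bandwidth of many storage servers be used at once, which is where the large .pdat speedups come from.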
GPU Computing!

CPUs and GPUs work in a collaborative fashion.

Multi-core processors (CPU)
• Typically 4-6 cores
• Powerful, general purpose

Many-core processors (GPU)
• Typically hundreds of cores
• Great for highly parallel code, within memory constraints

[Diagram: CPU and GPU connected by a PCI Express channel]
ANSYS Mechanical SMP – GPU Speedup

[Chart: solver kernel speedups and overall speedups, Tesla C2050 and Intel Xeon 5560]

From NAFEMS World Congress, May 2011, Boston, MA, USA: "Accelerate FEA Simulations with a GPU" by Jeff Beisheim, ANSYS
New: GPU Acceleration for DANSYS

[Chart: R14 Distributed ANSYS total simulation speedups for the R13 benchmark set]

• Windows workstation: two Intel Xeon 5560 processors (2.8 GHz, 8 cores total), 32 GB RAM, NVIDIA Tesla C2070, Windows 7, TCC driver mode
ANSYS Mechanical – Multi-Node GPU

[Chart: R14 Distributed ANSYS total speedup with/without GPU]
[Image: solder joint model – mold, PCB, solder balls]

• Solder joint benchmark (4 MDOF, creep strain analysis)
• Linux cluster: each node contains 12 Intel Xeon 5600-series cores, 96 GB RAM, NVIDIA Tesla M2070, InfiniBand

Results courtesy of MicroConsult Engineering, GmbH
GPU Acceleration for CFD

First capability for physics outside the core linear solver, with modest memory requirement: view factors, ray tracing, reaction rates, etc. (radiation view-factor calculation in ANSYS FLUENT 14).

R&D focus on linear solvers and smoothers – but potential limited by Amdahl's Law.
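For reference, Amdahl's Law bounds the overall speedup when only a fraction p of the work is accelerated by a factor s:

$$ S(p, s) = \frac{1}{(1 - p) + p/s} $$

So if, say, half the solver time is GPU-accelerable (p = 0.5), even an infinitely fast GPU (s → ∞) yields at most a 2x overall speedup; the accelerated fraction matters as much as the accelerator itself.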
Agenda
• Performance / Software Milestones – scalability, GPUs
• Current HPC Practice – customer case studies
• Deployment Trends – ANSYS vision
HPC for Turbocharger Design

• 8M to 12M element models (ANSYS CFX)
• Previous practice (8-node HPC):
– Full-stage compressor runs: 36-48 hours
– Turbine simulations: up to 72 hours
• Current practice (160 nodes, 32 nodes per simulation):
– Full-stage compressor: 4 hours
– Turbine simulations: 5-6 hours
– Simultaneous consideration of 5 ideas
– Ability to address design uncertainty (clearance tolerance)

"ANSYS HPC technology is enabling Cummins to use larger models with greater geometric details and more-realistic treatment of physical phenomena."
http://www.ansys.com/About+ANSYS/ANSYS+Advantage+Magazine/Current+Issue
HPC for High Fidelity at EuroCFD

• Model sizes up to 100M cells (ANSYS FLUENT)
• 2011 cluster of 700 cores (expansions pending); 64-256 cores per simulation

Model size timeline:
• 2005: 3 million cells (6 days)
• 2007: 10 million cells (5 days)
• 2009: 25 million cells (4 days)
• 2011: 50 million cells (2 days)

Increase of:
• Spatial-temporal accuracy
• Complexity of physical phenomena: supersonic, multiphase, radiation, compressibility, conduction/convection, transient, optimisation / DOE, dynamic mesh, LES combustion, aeroacoustics, fluid-structure interaction
HPC for High Fidelity at Microconsult GmbH

Solder joint failure analysis
• Thermal stress: 7.8 MDOF
• Creep strain: 5.5 MDOF

Simulation time reduced from 2 weeks to 1 day
• From 8-26 cores (past) to 128 cores (present)

"HPC is an important competitive advantage for companies looking to optimize the performance of their products and reduce time to market."
"HPC" on the Desktop

• Cognity Limited – steerable conductors for oil recovery
• ANSYS Mechanical simulations to determine load-carrying capacity
• 750K elements, many contacts
• 12-core workstations / 24 GB RAM
• 6X speedup / results in 1 hour or less
• 5-10 design iterations per day

"Parallel processing makes it possible to evaluate five to 10 design iterations per day, enabling Cognity to rapidly improve their design."
http://www.ansys.com/About+ANSYS/ANSYS+Advantage+Magazine/Current+Issue
Desktop HPC at NVIDIA

Case study on the value of hardware refresh and software best practice: deflection and bending of 3D glasses
• ANSYS Mechanical – 1M DOF models

Optimization of:
• Solver selection (direct vs. iterative)
• Machine memory (in-core execution)
• Multicore (8-way) parallel with GPU acceleration

Before/after: 77x speedup – from 60 hours per simulation to 47 minutes.

Most importantly: HPC tuning added scope for design exploration and optimization.
Agenda
• Performance / Software Milestones – scalability, GPUs
• Current HPC Practice – customer case studies
• Deployment Trends – ANSYS vision
Remote Access to Simulation / Cloud

Cloud computing puts renewed focus on an ongoing challenge: how can we optimize remote access to HPC for ANSYS users?

A critical emerging issue for:
• Public cloud (outsourcing of HPC)
• Private cloud (internal remote access / elastic provisioning)
Levels of Remote Access

Level 0: Local Computing
• Pre / solve / post on local desktop system
• Files stored locally under individual control
• Inherent capacity limitations; also limits collaboration and data management

Level 1: Remote Batch
• Pre / post on local desktop system
• Batch solve conducted on 'remote' HPC resource
• Bottlenecks related to file transfer and limitations of local hardware (e.g., can't post-process large files)
• Remote access and job management also challenging
Level 2: Remote Interactive Workflow

[Diagram: a user doing "full remote simulation" – web-browser access and remote 3D visualization (including mobile) to a remote HPC resource with data storage; pre / solve / post / EKM all run remotely]

• Schedule / provision hardware resources
• Launch application
• Launch remote visualization tool for interactive use
• Access data (EKM)
• Monitor job progress

• Full simulation process conducted via remote access (pre/solve/post)
• Files reside at the HPC resource – for efficiency, enhanced data management, collaboration
• Feasible, and implemented today by the best-in-class
Simulation Data Management – ANSYS EKM

[Architecture diagram: web browsers (IE, Firefox, etc.) and desktop applications (ANSYS Workbench, VB, etc.) connect through a firewall to an application server (JBoss, etc.), backed by a file server (http, ftp), a relational database (MySQL, Oracle, DB2, etc.) that stores metadata, a content-management repository (Jackrabbit, etc.) holding all archived files and applications, and a compute cluster that executes simulations and extracts data using a batch system such as RSM, LSF, or SGE]

Access results over the full product lifecycle. Re-use previous simulations. Long-term archival storage.
Interface to PLM / PDM

Pull functional requirements or geometry from PLM to ANSYS EKM
• Check update status from EKM

Return simulation results to PLM
• Associated to the master model

[Diagram: EKM Datalink connects ANSYS EKM to PLM systems (Windchill, Teamcenter, MatrixOne, SmarTeam) – functional spec, geometry, and attributes flow in; simulation reports flow back]
Public Cloud

Cloud computing could provide cost-effective access to scaled-out elastic infrastructure
• Scale up – extreme problem size, data storage and backup
• Surge capacity for intermittent workloads or users
• Cost-effective balance of CAPEX vs. OPEX

But challenges remain for:
• IP protection, export compliance, and data sharing
• Optimized use of on-premise vs. "remote" systems
• Remote workflow (visualization, file transfer)

ANSYS focus:
• Improved remote simulation workflow (private cloud)
• Enable seamless use of 3P-hosted cloud – "Bring Your Own License" model
Agenda
• Performance / Software Milestones – scalability, GPUs
• Current HPC Practice – customer case studies
• Deployment Trends – ANSYS vision
• Summary / Next Steps
Summary / "Take Home" Points

High-performance computing can add great value to your use of ANSYS.
• What could you learn from a 10M (or 100M) cell / DOF model?

ANSYS continues to focus on software development for HPC.
• Mission critical – in order to maintain scaling as the hardware 'revolution' continues

We believe that this focus will enable ANSYS to maximize your overall return on investment.

Optimizing IT deployment is a critical challenge.
• Getting started or scaling out – desktop / remote access, data management and archiving, job submission and resource management, etc.
Next Steps
ANSYS is committed to understanding your IT environment and deployment challenges
ANSYS (and our partners) can provide solutions today – and you can help drive our product strategy.
Questions/Comments: [email protected]
Thank You