Innovation Unleashes the Potential of High-Performance Computing
Lin Jun: Chief Architect, Server Domain, Huawei
Market Trends
Requirement for Compute
1972: 0.004 MIPS
1989: 20 MIPS
Mobility
Cloud
Big Data
Security
2014: 124,000 MIPS
2020: Millions of MIPS
Opportunity for Innovation
Internet of Things
Industry 4.0
Intelligent City
Traditional Architecture
Computing Innovation Slows Down
Low Utilization, High Power, Fast Growth, Not Secure
Past
The doubling of transistors is slowing down
Single-core performance increase is slowing down
Multi-core performance is limited by Amdahl's Law
Uneven subsystem development
Now
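The Amdahl's Law limit on multi-core performance mentioned on this slide can be illustrated with a minimal Python sketch. The function name and the 95% parallel fraction are illustrative assumptions, not figures from the presentation.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when only `parallel_fraction`
    of the work can run in parallel (Amdahl's Law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload: 95% of the runtime parallelizes perfectly.
    for cores in (1, 8, 64, 1024):
        print(f"{cores:>5} cores: {amdahl_speedup(0.95, cores):.2f}x")
```

With 5% serial work, the speedup stays below 1 / 0.05 = 20x no matter how many cores are added, which is why adding cores alone gives diminishing returns.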
The End of Moore’s Law
Tick-Tock
Process, Architecture, Optimization
10 µm – 1971
6 µm – 1974
3 µm – 1977
1.5 µm – 1982
1 µm – 1985
800 nm – 1989
600 nm – 1994
350 nm – 1995
250 nm – 1997
180 nm – 1999
130 nm – 2001
90 nm – 2004
65 nm – 2006
45 nm – 2008
32 nm – 2010
22 nm – 2012
14 nm – 2014
10 nm – 2016
7 nm – 2018
5 nm – 2020
Covalent radius of Silicon Atom is 111 pm (0.111 nm)
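To put the node sizes next to the atomic scale quoted above, the following Python sketch estimates how many silicon atoms would span a given length. It assumes each atom can be treated as a sphere whose diameter is twice the covalent radius, and it takes the node name as a literal length; both are simplifications, since node names are labels rather than measured feature sizes.

```python
SILICON_COVALENT_RADIUS_NM = 0.111  # 111 pm, as stated on the slide


def atoms_across(length_nm: float) -> float:
    """Approximate count of silicon atoms spanning `length_nm`,
    assuming an atom diameter of twice the covalent radius."""
    if length_nm <= 0:
        raise ValueError("length_nm must be positive")
    return length_nm / (2 * SILICON_COVALENT_RADIUS_NM)


if __name__ == "__main__":
    for node_nm in (14, 10, 7, 5):
        print(f"{node_nm:>2} nm is roughly {atoms_across(node_nm):.0f} atoms across")
```

Under these assumptions a 5 nm length is only about 22 to 23 atoms wide, which gives a sense of how little room remains for further shrinking.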
Changes to CPU Power Consumption
Increased Usage of Accelerators
Adoption has accelerated since 2010. NVIDIA still dominates
Total performance share plateaued in the past year, mainly due to life cycle
Accelerator-based systems are projected to dominate for the next decade
Heterogeneous Architecture
Processors are moving toward specialization
Performance per Watt is becoming more important
• Heterogeneous CPUs can be more flexible, with better cost-performance and better power efficiency
• First used by storage systems
• Internet servers began small-scale deployment in 2013~2014
• Enterprise server applications still lag behind by 3~5 years
Solution
Huawei HPC Technology
World Class HPC Solutions TODAY
170+ Countries
2015 Revenue: $63B
% of Revenue in R&D: 14.2%
16 R&D Centers
36 Joint Innovation Centers
79,000 R&D Engineers
Standalone Compute Node 1P-32P
Modular HPC Systems
NVMe SSD
HPC storage
Big Data storage
Network Fabric
Modular & Container Data center
Huawei FusionServer
OceanStor
CloudEngine
Reduce Complexity
More Performance / $
Design for Growth
HPC Private Cloud
Petascale System
Direct Liquid Cooling
Workload Optimization
Ecosystem Partnership
Simplify HPC Systems TODAY
ACCELERATE WORKLOAD
MAXIMIZE PERFORMANCE FOR INDIVIDUAL WORKLOAD
Flexible, modular architecture
Multiple innovative form factors
Deep optimization with hardware acceleration
Super Fat nodes
CONVERGED HPC & BIG DATA
MAXIMIZE HARDWARE ROI
Single HPC cluster and storage system for both traditional HPC MPI workload and Hadoop
Innovative big data analytic appliance with deep hardware and software optimization for maximum cost effectiveness
MAXIMIZE EFFICIENCY
MORE COMPUTE LESS SPACE
LOWER POWER CONSUMPTION
End-to-end energy efficient design
HVDC
High Ambient Temperature ~40 °C
Direct Liquid Cooling ~84% coverage
Tight integration with Huawei data center infrastructure
SDS
Big Data
SDI
HPC / IT Solutions for Tomorrow
Deep Learning
Enabling HPC Cloud
FusionInsight: Big Data
FusionSphere: Cloud OS
ManageOne: Management software
FusionStorage: Software Defined Storage Pool
Big Data Acceleration
New Heterogeneous CPU: Next Gen CPU, DDR4, GPU/FPGA Accelerator, HBM/HMC
New Memory Hierarchy: DDR4 DRAM as SCM Cache, DDR4-SCM-DIMM, SRAM Cache, SCM-SSD, NVMe-SSD, HDD, HBA/RAID
X86 CPU, HBM, GPU, MEM, FPGA
New Technology Enablement
Leverage Cross Disciplinary Assets
Creative Converged Solutions
HPC Cloud Framework
- Open architecture
- Rapid deployment
- Efficient operation
- Demand based
- Maximize utilization
- Multi-tenant VPC isolation
- On-premise to cloud, end-to-end secured
Simulation Graphic Visualization
Technical Computing
Head Node, Resource Pool, Job Scheduler, User Login, DB
Agile
Elastic
Secure
HPC Compute HPC Distributed Storage
Low Latency Networking
Security Isolation
Compute Node: GPGPU, CPU Intense, Graphic Virtualized, Memory Intense
Storage Node: Object Store, HPC NAS, Distributed File System, HPC Block
Converged HPC & Big Data
Building a Leading Computing Platform with NVIDIA
Copyright©2016 Huawei Technologies Co., Ltd. All Rights Reserved.
The information in this document may contain predictive statements including, without limitation, statements regarding the
future financial and operating results, future product portfolio, new technology, etc. There are a number of factors that
could cause actual results and developments to differ materially from those expressed or implied in the predictive
statements. Therefore, such information is provided for reference purpose only and constitutes neither an offer nor an
acceptance. Huawei may change the information at any time without notice.
THANK YOU
HPC Solutions
HUAWEI HPC Momentum
Manufacturing CAE/CFD
Education/Research/Supercomputing
Chip Design & Manufacturing
Oil & Gas Exploration
Energy Production & Distribution
Digital Media
Industrial CAE Simulation
Vibration and noise
Crash & safety
Indoor acoustics
Static strength
NVH, Electromagnetics, CFD
Physical component, Computing model, Result obtaining, Verification analysis
Processing before modeling, Processing after analysis, Computing resolution
Size model, Computing analysis
Design, Sample, Verification, Design, Sample, Verification, Product…Planning
Computational fluid dynamics (CFD)
Structural mechanics
Electromagnetic simulation
System engineering
General development process disadvantages
Long development period
High design cost
Weak process control
Customers' main requirements
Short development cycle
Low cost
Intuitive analysis and controllable process
Industrial HPC cluster highlights
Application integration and optimization
Operation cost reduction
Large-scale cluster management
Huawei CAE Simulation Solution
Applications
Industrial simulation application scenarios
Hardware platforms
Cluster capabilities
Application optimization
centralized management
LS-DYNA
PAM-CRASH
Computing: X6800 & E9000
Network: IB EDR
High parallelism
100 Gbit/s
Fluid mechanics analysis
FLUENT
STAR-CCM+
ABAQUS
NASTRAN
Computing: 8100 V3 & KunLun
Storage: OceanStor V3
24 TB large memory capacity
400 GB/s high storage bandwidth
Bright Computing, PARATERA, IBM Platform, Altair
Application optimization
STAR-CCM+ test, performance up 30%
PAM-CRASH test, performance up 10%
Energy saving
Cabinet- and board-level liquid cooling, PUE ≤ 1.1
45 °C warm water cooling, lower power consumption for heat exchange
Converged management
Software license and job scheduling, unified management
Hardware platform fault diagnostics, high reliability
Crash simulation test
Fluid mechanics analysis
Structure analysis simulation
Open/Cooperative HPC Ecosystem
Hardware
Application
Local partners
Software
CAE Customer Success Stories
Volkswagen Builds Immersive Car Crash Simulation Test Platform with Huawei HPC
Saves 50% of design costs and shortens the product development cycle from 3 months to 1 week.
HiSilicon Builds Chip Simulation Cloud Platform with Huawei HPC
Increases computing capability from 1 million grids to 10 million grids, improving computing efficiency by 5x.
Global Foundries Builds Chip Simulation Platform with Huawei HPC
Shortens chip design computing simulation time from 1 day to 1 hour.
Daimler Mercedes-Benz Builds Core Vehicle R&D Capabilities with Huawei HPC
Improves simulation efficiency by 50% and reduces power consumption by 10%.
HPC Computing Drives R&D
149 TFLOPS/cabinet, 64 CPUs per cabinet
100G network, proprietary EDR switching technology
100 TFLOPS-level CPU computing capability
21.1 TFLOPS/chassis, 8 GPUs per chassis
50% density increase, 1U 4-socket ultra-high computing density
10 TFLOPS-level heterogeneous computing capability
24 TB in-memory capacity per node
2084 GB/s memory data bandwidth per node
Fat node in-memory computing
Animation rendering and production: 3DMax, Maya, Softimage
Weather forecasting, environment monitoring, aviation simulation: WRF, MM5, CMAQ, CAMs
Gene sequencing and molecular motion simulation: BLAST, FASTA, Gromacs, NAMD
Hardware platform, Application optimization, Benefits
X6800
E9000
KunLun
Fast Massive Data Transmission
Compute node cluster
Storage node cluster
OceanStor 9000
IB/10GE/GE
400 GB/s massive data bandwidth
100 GB file system, biggest in the industry
Huawei massive data storage solution
Smart teaching management system
Electronic reading library storage system
Periodical and paper storage system
200 GB/s bandwidth
50 GB storage capacity per node
144 nodes per cluster
Common solution
3 to 288 nodes linear expansion
Hardware platform Application optimization Benefits
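The "3 to 288 nodes linear expansion" claim on this slide can be read as aggregate bandwidth growing in proportion to node count. The sketch below works that through in Python. It assumes, and the slide does not state this, that the 400 GB/s figure is reached at the maximum of 288 nodes; the function and constant names are illustrative.

```python
MIN_NODES = 3
MAX_NODES = 288
# Assumption: the slide's 400 GB/s applies to the full 288-node cluster.
MAX_BANDWIDTH_GB_S = 400.0


def aggregate_bandwidth_gb_s(nodes: int) -> float:
    """Aggregate bandwidth under strictly linear scaling with node count."""
    if not MIN_NODES <= nodes <= MAX_NODES:
        raise ValueError(f"nodes must be between {MIN_NODES} and {MAX_NODES}")
    return MAX_BANDWIDTH_GB_S * nodes / MAX_NODES


if __name__ == "__main__":
    for nodes in (3, 72, 144, 288):
        print(f"{nodes:>3} nodes: {aggregate_bandwidth_gb_s(nodes):.1f} GB/s")
```

Under this assumption each node contributes roughly 1.39 GB/s, and a 144-node cluster would deliver 200 GB/s, the same bandwidth the slide lists for the common solution at 144 nodes.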
University of Toronto
Up to 12 TB memory capacity per node, 5x the ecosystem modeling computing requirement, enough for long-term expansion
Shortens the 4D simulation computing result time for the integrated watershed-receiving waterbody model from 6 days to 4 hours
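The reduction from 6 days to 4 hours can be expressed as a speedup factor. The small Python helper below is an illustrative sketch; only the two durations come from the slide.

```python
HOURS_PER_DAY = 24


def speedup(before_hours: float, after_hours: float) -> float:
    """Ratio of the old runtime to the new runtime."""
    if before_hours <= 0 or after_hours <= 0:
        raise ValueError("durations must be positive")
    return before_hours / after_hours


if __name__ == "__main__":
    # 6 days before, 4 hours after, as quoted on the slide.
    factor = speedup(6 * HOURS_PER_DAY, 4)
    print(f"Speedup: {factor:.0f}x")
```

The same helper applies to the other before/after durations quoted in the deck, provided both values are first converted to the same unit.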
Bibliotheca Alexandrina in Egypt
Deployment density improved 33%, and overall system energy efficiency improved 10%
NAMD application performance delivered in the cluster deployment test is 10% higher than that required by the customer
Huawei Warm Water Cooling Solution
Cooling system (including the primary loop)
CDU system and cooling media
Secondary loop between the CDU system and cooling cabinets
Huawei FusionServer liquid cooling cabinets
Air conditioning system
Cabinet- and board-level warm water cooling integrated delivery
Huawei Warm Water Cooling Solution
Integrated cooling loop component, low leakage risk
Physical isolation of water flows from circuits, no short-circuit risks
217 system verification test items, high reliability
Warm water cooling reduces TCO by 30% compared with air cooling.
Cooling PUE ≤ 1.1
Up to 45 °C inlet water
80% warm water cooling
Air cooling TCO
Liquid cooling TCO
TCO ratio
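For readers unfamiliar with the PUE figure quoted on this slide: power usage effectiveness is conventionally total facility energy divided by IT equipment energy. The Python sketch below shows the ratio and the overhead budget a given PUE limit implies. The 1,000 kWh IT load is a hypothetical input, and reading "Cooling PUE" as counting only cooling energy as overhead is an assumption.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("it_equipment_kwh must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total energy cannot be less than IT energy")
    return total_facility_kwh / it_equipment_kwh


def max_overhead_kwh(it_equipment_kwh: float, pue_limit: float) -> float:
    """Largest non-IT energy (for example cooling) that keeps PUE at or below the limit."""
    if pue_limit < 1.0:
        raise ValueError("pue_limit cannot be below 1.0")
    return it_equipment_kwh * (pue_limit - 1.0)


if __name__ == "__main__":
    it_load_kwh = 1000.0  # hypothetical IT load, not from the slides
    print(f"Overhead budget at PUE 1.1: {max_overhead_kwh(it_load_kwh, 1.1):.0f} kWh")
    print(f"PUE for 1100 kWh total: {pue(1100.0, it_load_kwh):.2f}")
```

A PUE of 1.1 therefore means that at most about 10% extra energy is spent on top of what the IT equipment itself consumes.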
Poland PCSS
“Huawei’s liquid cooling HPC cluster helps PCSS significantly reduce hardware investments and TCO.
This year, PCSS and Huawei will further their cooperation by building a joint innovation center to
develop solutions covering computing, storage, and cluster architectures. The cooperation with Huawei
has enabled PCSS to become one of the most competitive HPC service providers in Europe.”
— Norbert Meyer,
Manager of HPC & Data Department, PCSS
1.37 PFLOPS, PUE < 1.2, top 100 supercomputing center worldwide
Warm water cooling reduces electricity consumption by 3.26 million kWh per equipment room every year,
lowering consumption by 40%+
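The two savings figures above imply a baseline consumption if they describe the same quantity, which is an assumption here. The Python sketch below back-calculates it; the function name is illustrative, and only the 3.26 million kWh and 40% values come from the slide.

```python
def implied_baseline_kwh(saved_kwh: float, saving_fraction: float) -> float:
    """Baseline consumption implied by an absolute saving and its share of the baseline."""
    if saved_kwh < 0:
        raise ValueError("saved_kwh cannot be negative")
    if not 0.0 < saving_fraction <= 1.0:
        raise ValueError("saving_fraction must be in (0, 1]")
    return saved_kwh / saving_fraction


if __name__ == "__main__":
    saved = 3.26e6  # kWh per equipment room per year, from the slide
    baseline = implied_baseline_kwh(saved, 0.40)
    print(f"Implied baseline: {baseline / 1e6:.2f} million kWh/year")
    print(f"Remaining use:    {(baseline - saved) / 1e6:.2f} million kWh/year")
```

Because the slide says "40%+", the actual baseline would be at most the value computed here.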