
HPC Trends: Opportunities and Challenges
François Robin, GENCI and CEA, PRACE/WP7 leader

PRACE Industrial Seminar, Amsterdam, September 3, 2008

2

Outline

• Introduction

• HPC [hardware] trends

• Challenges ahead

• PRACE & HPC trends

• Conclusion

3

Introduction

• Computer simulation is essential:
– For scientific discovery and for addressing societal challenges
– For competitiveness of industry


4

Introduction

• Computer simulation is essential:

– For scientific discovery and for addressing societal challenges

– For competitiveness of industry

[Images: Cerfacs, CEA/DSV, Dassault Aviation]

• Progress will, in most cases, depend on being able to perform larger simulations (or faster simulations)
– General trend: multiscale and multiphysics simulations

• This can be achieved with:
– Larger supercomputers (more compute power, more memory) + storage, visualisation, … (the topic of this presentation)
– Simulation codes and tools able to take advantage of such systems
– Experts in HPC, numerical analysis, algorithms, applications, and in the phenomenon to be simulated, working together

5

A long history of continuous increase in supercomputer performance

Warning: these are Linpack performances (close to peak performance), not the performance of real applications:
- the performance of a real application is usually much lower than the Linpack performance
- algorithmic improvements often contribute a lot to the improvement over time of the performance of real applications

Milestones shown on the chart:
- CRAY XMP / CRAY 2: 1 Gflops, 4 processors
- ASCI Red: 1 Tflops, 4 500 processors
- LANL RoadRunner: 1 Pflops, ≈ 13 000 cores + 100 000 SPE

6

[Chart: some CEA production systems — peak performance in Gflops (log scale) versus year, 1975-2010]

7

Complementary and interconnected centres

Tier-0

Tier-1

Tier-2


8

Outline

• Introduction

• HPC [hardware] trends

• Challenges ahead

• PRACE & HPC trends

• Conclusion

9

Main HPC trends - Findings from PRACE meetings with Top 50 system manufacturers (Feb. 08)

• Several factors have an important impact on computer architecture:
– Performance is never enough, while processor speed is reaching a limit
– Power consumption has become a primary concern
– Network complexity/latency is a main hindrance
– There is still the memory wall

• Different architectures are scalable to Petaflop/s in 2009/2010
– MPP, cluster of SMP (thin nodes / fat nodes), hybrid systems (coarse or fine grain)
– None of them is likely to be optimal for all applications
– For a specific architecture, no configuration (number of nodes, memory size, topology of the interconnect, …) is likely to be optimal for all applications

• The evolution of technology (constrained by acceptable power consumption) will lead to the need for a very high level of parallelism to reach 1 Petaflop/s

• The memory hierarchy gets more complex, with widening gaps


10

Architecture: A generic view

[Diagram: compute nodes, each containing several cores sharing memory plus an optional accelerator; IO nodes and service nodes; everything connected by one or more interconnection networks]

11

Ingredients for a Petaflop/s system in 2009/2010

• Processors:
– Low power µP (e.g. IBM BG/P): ≈ 3.5 Gflops/core
– Commodity µP (e.g. Intel Xeon): ≈ 10 Gflops/core
– High performance µP (e.g. IBM Power6): ≈ 20 Gflops/core
– Vector processors (e.g. NEC SX9): ≈ 100 Gflops/core

• Accelerators (GPU, Cell, FPGA, …):
– AMD FireStream 9250: ≈ 200 Gflops DP, ≈ 150 W
– IBM PowerXCell 8i: ≈ 100 Gflops DP, ≈ 90 W
– ClearSpeed CSX700: ≈ 100 Gflops DP, ≈ 25 W

• Nodes:
– Commodity thin nodes: ≈ 2-4 sockets/node
– High performance fat nodes: more than 2-4 sockets/node

• Storage:
– SSD storage: ≈ tens of GB
– SAS disks: ≈ 1 TB; SATA disks: ≈ 2 TB
– Software RAID / intelligent storage

• Interconnect:
– Commodity interconnect (e.g. InfiniBand DDR or QDR)
– Specific interconnect (e.g. SGI NUMAlink, IBM BG/x, Cray XTx)

12

Evolution of processors

[Chart: programmability versus performance, from single-core to multi-core to many-core CPUs and GPUs. Based on an Intel presentation at SIGGRAPH 2008]

CPU
• Evolving toward multi-core
• Motivated by energy-efficient performance and by the limitations of ILP
• Trend: 2x #cores every 18 months (a small projection sketch follows below)

Intel roadmap (Richard Dracott, Intel, ISC 2008):
– 2005/2006: 2 cores, 2.5/3.7 GHz
– 2006/2007: 2 cores, 1.8/3 GHz
– 2007/2008: 4 cores, 2/3.2 GHz
– 2008/2009: 2-8 cores

AMD roadmap (Randy Allen, AMD, May 2008):
– 2005: 2 cores
– 2008: 4 cores
– 2009: 6 cores
– 2010: 12 cores
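As a rough illustration of the "2x #cores every 18 months" trend quoted above, the small C sketch below projects core counts per socket from the 2-core parts of 2005; the starting point and doubling period come from this slide, everything else is arithmetic. Taken at face value the projection grows a little faster than the AMD roadmap above (about 20 cores versus 12 in 2010), a reminder that it is a trend, not a law.

/* Illustrative projection of the "2x cores every 18 months" trend.
 * Assumes 2 cores per socket in 2005 (from the roadmaps above).
 * Build: cc core_trend.c -o core_trend -lm                        */
#include <stdio.h>
#include <math.h>

int main(void)
{
    const double base_year  = 2005.0;  /* starting year                        */
    const double base_cores = 2.0;     /* cores per socket in 2005             */
    const double doubling   = 1.5;     /* doubling period in years (18 months) */

    for (int year = 2005; year <= 2012; year++) {
        double cores = base_cores * pow(2.0, (year - base_year) / doubling);
        printf("%d: ~%.0f cores/socket\n", year, cores);
    }
    return 0;
}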

13

Evolution of Accelerators

[Chart: programmability versus performance for CPUs, many-core chips, GPUs and FPGAs, from fully programmable to partially programmable to fixed function. Based on an Intel presentation at SIGGRAPH 2008]

Outstanding performance/price and performance/electricity ratios for well-suited and well-programmed applications

GPU
• Evolving toward general-purpose computing
• Addressing the HPC market (data-parallel programming; a small illustration follows below)

FPGA
• Less flexible, but best performance/watt
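To illustrate the "data-parallel programming" style that GPUs address, here is a minimal sketch in plain C with an OpenMP loop: an axpy kernel whose iterations are all independent, which is exactly the property accelerators exploit. It is only an illustration of the programming pattern (host-side code, not actual GPU code).

/* Data-parallel axpy kernel: y[i] += a * x[i].
 * Every iteration is independent, so it maps naturally onto many-core
 * CPUs and, with a suitable language (CUDA, OpenCL, ...), onto accelerators.
 * Build: cc -fopenmp axpy.c -o axpy (the pragma is ignored without OpenMP). */
#include <stdio.h>
#include <stdlib.h>

static void axpy(size_t n, float a, const float *x, float *y)
{
    #pragma omp parallel for          /* data-parallel loop */
    for (long i = 0; i < (long)n; i++)
        y[i] += a * x[i];
}

int main(void)
{
    const size_t n = 1 << 20;
    float *x = malloc(n * sizeof *x);
    float *y = malloc(n * sizeof *y);
    for (size_t i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    axpy(n, 3.0f, x, y);
    printf("y[0] = %.1f (expected 5.0)\n", y[0]);

    free(x);
    free(y);
    return 0;
}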

14

Integration of accelerators into compute nodes

[Chart: programmability versus performance for CPUs, GPUs and FPGAs. Based on an Intel presentation at SIGGRAPH 2008]

Goal: reduce overhead by speeding up / limiting data transfers (a rough cost model follows below)

AMD
• Torrenza initiative: use the HyperTransport connection

Intel
• Plug accelerators into processor sockets
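A rough way to see why limiting data transfers matters: offloading a kernel only pays off if the time saved on computation exceeds the time spent moving data over the host-accelerator link. The sketch below uses a simple cost model with hypothetical throughput and bandwidth figures; they are assumptions for illustration, not numbers from this presentation.

/* Rough model of accelerator offload: is it worth moving the data?
 *   t_host = flops / host_gflops
 *   t_acc  = flops / acc_gflops + bytes / link_bandwidth
 * All figures below are hypothetical placeholders.                 */
#include <stdio.h>

int main(void)
{
    const double gflop    = 100.0;  /* work to do, in Gflop            */
    const double bytes_gb = 4.0;    /* data to move, in GB (both ways) */
    const double host_gfs = 10.0;   /* host core, Gflop/s   (assumed)  */
    const double acc_gfs  = 100.0;  /* accelerator, Gflop/s (assumed)  */
    const double link_gbs = 4.0;    /* host<->accelerator link, GB/s   */

    double t_host = gflop / host_gfs;
    double t_acc  = gflop / acc_gfs + bytes_gb / link_gbs;

    printf("host only : %.2f s\n", t_host);
    printf("offloaded : %.2f s (%.1fx %s)\n", t_acc, t_host / t_acc,
           t_acc < t_host ? "faster" : "slower -> keep it on the host");
    return 0;
}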

15

Future trends (1/2)

• Processors and accelerators
– Hybrid multicore
• Large/small cores
• Cores and accelerators, …
– Many-core
• Large number of small cores on a chip
• Possibly with specific hardware (graphic operations)

• Memory:
– Possible future contenders
• Magnetic RAM (MRAM): fast, permanent
• Z-RAM: fast, very dense
– Closer integration between processor and memory
• PIM (processing in memory)
• 3D stacking or flex cables

[Diagram: 3D stacking — CPU and DRAM integrated in the same package under the heat sink]

16

Future trends (2/2)

• Increased use of optics in interconnects
– Optical interconnect
– Silicon photonics: integration of photonics on the chip
– Products from Luxtera and Lightfleet

• Increasing importance of data and IO
– Towards data-centric computing centres
– Distributed and parallel file systems
– Data integrity (ZFS)

• Heterogeneous configurations
– Nodes
– Interconnection network

17

Outline

• Introduction

• HPC [hardware] trends

• Challenges ahead

• PRACE & HPC trends

• Conclusion

18

Typical configuration of a Petaflop/s system in 2009/2010

• 10 Gflops/core
• 6-8 cores/socket
• 2-4 sockets/node
• 100 000 cores (see the sizing sketch after this list)
• 1000's of nodes
• Possibly accelerators
– Coarse grain (vector-scalar)
– Fine grain (GPU, Cell, FPGA, …)
• Possibly heterogeneous
– Node configuration
– Network architecture
– …
• 80 to 300 m2
• 2 to 10 MW
• Mostly water cooling

• Floor space doesn't include:
– Space for storage systems (disks/tapes)
– Space needed for installation (unpacking/testing/…) and during the installation of a new system
– Space for electrical and mechanical rooms

• Electricity doesn't include:
– Storage and other peripheral systems
– Cooling, losses in the power supply chain (UPS, …)
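The headline figures on this slide are consistent with simple arithmetic: roughly 100 000 cores at about 10 Gflops/core gives about 1 Pflop/s peak, and with 6-8 cores per socket and 2-4 sockets per node that corresponds to a few thousand nodes. The sketch below just reproduces this arithmetic using the values from the slide (taking the upper end of the per-socket and per-node ranges).

/* Back-of-the-envelope sizing of a 2009/2010 Petaflop/s system,
 * using the per-core, per-socket and per-node figures from the slide. */
#include <stdio.h>

int main(void)
{
    const double target_pflops    = 1.0;   /* target peak performance */
    const double gflops_per_core  = 10.0;  /* ~10 Gflops/core         */
    const int    cores_per_socket = 8;     /* 6-8 cores/socket        */
    const int    sockets_per_node = 4;     /* 2-4 sockets/node        */

    double cores = target_pflops * 1.0e6 / gflops_per_core;  /* 1 Pflop/s = 1e6 Gflop/s */
    double nodes = cores / (cores_per_socket * sockets_per_node);

    printf("cores needed : %.0f\n", cores);  /* ~100 000 cores */
    printf("nodes needed : %.0f\n", nodes);  /* a few thousand */
    return 0;
}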

19

Challenges

• Major challenges
– Performance (computation and IO)
• Single processor performance
– TCO (total cost of ownership)
• Electricity
– Programmability
– Scalability
– Reliability

• Some ways to address these challenges
– Programming Petaflop/s systems
– Cost of electricity
– Reliability
– The PRACE approach
– Areas of interest of the PRACE prototypes
• Accelerators
• Many-cores
• Languages for programming accelerators
• Parallel languages
• Advanced IO
• Low power systems

20

Programming Petaflop/s systems (1/2)

• Accelerators:
– Accelerators are potentially faster than processors because they give the user complete control over:
• Scheduling: multiple data-parallel units
• Transfer of data between memories and caches
– Specific languages: CUDA for NVIDIA, ClearSpeed SDK, …
– Some languages are trying to target several accelerators and multicore chips: CAPS/HMPP, RapidMind, OpenCL (Apple + AMD), …

• Parallel programming languages:
– OpenMP and MPI (a minimal hybrid sketch follows below)
– PGAS languages gaining acceptance and performance with hardware support:
• Simpler distributed memory programming
• CAF (Co-Array Fortran, included in the Fortran 2008 standard) and UPC (Unified Parallel C)
– tab(i)[j]: element i of tab on processor j
– Longer term: DARPA/HPCS languages: Chapel (Cray), X10 (IBM), [Fortress (Sun)]
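As a concrete illustration of the "OpenMP and MPI" combination, here is a minimal hybrid sketch: MPI distributes the work across nodes while OpenMP threads share it inside each node. It is a toy distributed sum, not production code, and it assumes an MPI library and an OpenMP-capable C compiler are available.

/* Minimal hybrid MPI + OpenMP example: each MPI rank computes a
 * partial sum with OpenMP threads, then the ranks reduce the result.
 * Build (typically): mpicc -fopenmp hybrid_sum.c -o hybrid_sum      */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long n = 1000000;          /* elements handled per rank */
    double local = 0.0;

    /* OpenMP: shared-memory parallelism inside the node */
    #pragma omp parallel for reduction(+:local)
    for (long i = 0; i < n; i++)
        local += 1.0;                /* stand-in for real work */

    /* MPI: distributed-memory reduction across nodes */
    double global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("%d ranks, total = %.0f\n", size, global);

    MPI_Finalize();
    return 0;
}

The same decomposition idea (one MPI process per node, threads inside it) is the usual starting point for scaling applications towards the 100 000-way parallelism discussed on the next slide.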

21

Programming Petaflop/s systems (2/2)

• Tools:
– Debuggers: TotalView, DDT, …
– Profilers: OPT, OpenSpeedShop, …

• Libraries
– Application: mathematical, …
– Run-time (threads, …)

• ISV applications: strong trend towards parallelism
– PRACE is willing to cooperate with ISVs to foster this trend

• Challenges:
– Parallelism: how to scale to 100 000 ways?
– Dealing with a complex memory hierarchy
– Complexity and heterogeneity
– Portability and durability of applications
– Training and education

22

Cost of electricity

Cost of 1 MW-year in 1997 (source: NUS Consulting Group, Study on the Cost of Electricity in Europe in 1997, May 1997):
– Germany: 860 000 €
– Netherlands: 830 000 €
– UK: 730 000 €
– Spain: 680 000 €
– France (deregulated market): 560 000 €
– France (regulated market): 460 000 €

[Chart: cost of 1 MW-year at the CEA computing centre, 2005-2012, on the same 0-900 000 € scale]

23

Energy efficiency

• IT equipment
– "Green" (power-efficient) components: processors, disks, power supplies, fans, …
– Power chain efficiency: avoid multiple voltage conversions
– Water cooling: cooling doors or direct cooling of electronic components

• Computer centre (PUE = Total_Facility_Power / IT_Electrical_Power; see the sketch after this list)
– Power supply
• Power-efficient UPS (5-10% power loss)
• UPS only for critical elements
– Cooling system
• Power-efficient chillers
• Increased operating margins (temperature/hygrometry)
• "Free" cooling using outside air
• Use of the heat produced to heat offices
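A tiny numeric illustration of the PUE definition above: facility power is IT power plus cooling and power-chain losses, and PUE is their ratio. The megawatt figures in the sketch are hypothetical, chosen only to show the mechanics of the calculation.

/* Illustration of PUE = Total_Facility_Power / IT_Electrical_Power.
 * The numbers are hypothetical examples, not measurements.          */
#include <stdio.h>

int main(void)
{
    const double it_power_mw    = 2.0;  /* power drawn by the IT equipment     */
    const double cooling_mw     = 0.8;  /* chillers, fans, pumps (assumed)     */
    const double power_chain_mw = 0.2;  /* UPS and conversion losses (assumed) */

    double facility_mw = it_power_mw + cooling_mw + power_chain_mw;
    double pue         = facility_mw / it_power_mw;

    printf("facility power : %.1f MW\n", facility_mw);
    printf("PUE            : %.2f (closer to 1.0 is better)\n", pue);
    return 0;
}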

24

[Chart: power efficiency of the systems at the top of the Top500 list, per site and vendor, from the Cell-based LANL system and the IBM BG/P systems down to the Earth Simulator (NEC). Legend: IBM Cell, IBM BG/P, IBM/Xeon, IBM BG/L, SGI/Xeon]

Warning: electricity consumption depends greatly on the memory and disk configuration, while the performance of some systems (accelerators) may not be usable by all applications.

Recommendation: in the context of a procurement, use TCO as the selection criterion.

Examples of electricity cost (5-year lifetime, 0.7 M€/MW-year):
– LANL (Cell): 4.2 M€
– Earth Simulator: 22 M€
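The two example figures above follow from cost = power × price per MW-year × lifetime. The sketch below reproduces them with the 0.7 M€/MW-year price and 5-year lifetime quoted on the slide; the megawatt values are simply the ones implied by the quoted totals (about 1.2 MW and 6.3 MW), not measured consumption figures.

/* Electricity cost over the system lifetime:
 *   cost = power (MW) x price (EUR per MW-year) x lifetime (years)
 * Price and lifetime are the figures quoted on the slide; the MW values
 * are those implied by the quoted totals, not measured figures.        */
#include <stdio.h>

static double electricity_cost_meur(double power_mw)
{
    const double price_meur_per_mw_year = 0.7;  /* 0.7 MEUR per MW-year */
    const double lifetime_years         = 5.0;  /* 5-year lifetime      */
    return power_mw * price_meur_per_mw_year * lifetime_years;
}

int main(void)
{
    printf("~1.2 MW system : %.1f MEUR over 5 years\n", electricity_cost_meur(1.2)); /* ~4.2 */
    printf("~6.3 MW system : %.1f MEUR over 5 years\n", electricity_cost_meur(6.3)); /* ~22  */
    return 0;
}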

25

Reliability

• What is important is the availability of the resources from the user's point of view (computing and data)
– IT equipment
– Infrastructure

• Reaching a good availability ratio
– Avoid failures
• Reliable components
• Monitoring / preventive maintenance: repair failures before they are visible to the users
– Make the failure of a component transparent
• Redundancy / fail-over
– Limit the impact of a failure to a subset of the system
– Use checkpoint-restart to reduce the impact of a job crash (see the sketch after this list)
• Application checkpoint-restart
• System checkpoint-restart with migration (not there yet)

• There are always faulty components in a large supercomputer!

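One standard way to reason about the checkpoint-restart trade-off mentioned above is Young's approximation for the optimal checkpoint interval, roughly sqrt(2 × checkpoint cost × system MTBF). This rule of thumb is not part of the slide, and the MTBF and checkpoint-cost figures in the sketch are hypothetical; the point is that the system-level MTBF shrinks as the number of components grows, so checkpoints become frequent.

/* Young's approximation for the optimal checkpoint interval:
 *   T_opt ~ sqrt(2 * checkpoint_cost * system_MTBF)
 * A standard rule of thumb (not from the slide); all inputs are hypothetical.
 * Build: cc checkpoint.c -o checkpoint -lm                                   */
#include <stdio.h>
#include <math.h>

int main(void)
{
    const double node_mtbf_h       = 50000.0;  /* MTBF of one node, hours (assumed)      */
    const double nodes             = 3000.0;   /* nodes in the system (assumed)          */
    const double checkpoint_cost_h = 0.25;     /* time to write one checkpoint (assumed) */

    double system_mtbf_h = node_mtbf_h / nodes;  /* ~17 h: failures become routine */
    double t_opt_h       = sqrt(2.0 * checkpoint_cost_h * system_mtbf_h);

    printf("system MTBF      : %.1f hours\n", system_mtbf_h);
    printf("checkpoint every : %.1f hours\n", t_opt_h);
    return 0;
}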

26

Outline

• Introduction

• HPC [hardware] trends

• Challenges ahead

• PRACE & HPC trends

• Conclusion

27

PRACE and HPC trends

• For PRACE, understanding the major HPC trends is of critical importance for:
– taking advantage of future promising technologies and architectures
– playing an active role in the European HPC ecosystem and its relationship with international high-end initiatives
– interacting with vendors and fostering their presence in Europe

• Main actions:
– Meetings with vendors (WP7+WP8): system manufacturers (Feb 08), processors and accelerators (Sep 08), network and IO (Oct 08), …
– Prototypes: 2009/2010 target (WP7/WP5), beyond 2010 target (WP8)
– AHTP (WP8): Advanced HPC Technology Platform
– Survey of HPC centres and installation requirements for Petaflop/s systems
– Exchanges between PRACE partners

28

PRACE Prototypes: Essential tools for preparing the design and the deployment of the future HPC infrastructure

• Initial deployment of the PRACE infrastructure (2009/2010)
– Match current and future application requirements with vendor roadmaps
– Ensure an easy integration of the future supercomputers:
• in the PRACE computing centres
• in the PRACE infrastructure
– Six prototypes have been selected and will be deployed in the coming months

• Preparation of the evolution of the PRACE infrastructure after 2010
– Early evaluation of technologies and architectures
– Interaction with vendors on the basis of a common set of European requirements
– Opportunities for higher performance with evolutions of the applications:
• Programming models and languages
• Models and numerical kernels
– Selection is in progress

Coordinated selection and experimentation of the prototypes, and feedback towards vendors

29

Outline

• Introduction

• HPC [hardware] trends

• Challenges ahead

• PRACE & HPC trends

• Conclusion

30

HPC trends: opportunities and challenges

• Future leading-edge supercomputers will provide greatly increased performance and will be fantastic tools for research and industry

• PRACE is working on the challenges that must be addressed in order to influence, and make the best use of, future architectures and technologies

• PRACE gathers the best competencies of the European HPC centres. The collaboration between computing centres, the sharing of experience and joint work are key to the success of PRACE

• PRACE and the European HPC ecosystem would benefit from extending this collaboration to HPC industry and industrial HPC users

• This will contribute to strengthening this HPC ecosystem which is of strategic importance to the competitiveness of Europe.

31

Credits

Aad van der Steen, Stéphane Requena, Alain Lichnewsky, Jean-Philippe Nominé, Jean Marie Normand, Thomas Eickermann, Hervé Lozach, Jacques-Charles Lafoucrière, Jean-Pierre Delpouve, Jacques David, ..

32

About PRACE

http://www.prace-project.eu/

The PRACE project receives funding from the EU's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° RI-211528.