The Swiss Initiative for High-Performance Computing and Networking, Neil Stringfellow, Associate Director CSCS


TRANSCRIPT

Page 1:

The Swiss Initiative for High-Performance Computing and Networking

Neil Stringfellow, Associate Director CSCS

Page 2:

Centro Svizzero di Calcolo Scientifico (CSCS)
Swiss National Supercomputing Center

Established in 1991 by the Swiss Government as an autonomous unit of ETH Zurich
Located in Manno, near Lugano
Highly qualified, internationally recognized staff (41 FTE)
Develops, promotes, and provides leading-edge high-performance computing services to the Swiss research community
400 users working on 50 projects (status 2009)
Hosts and operates, on behalf of MeteoSwiss, the supercomputer for operational weather forecasts (8 simulations per day; first country in Europe to run high-resolution weather forecasts)

Page 3:

Distribution of compute time in 2008 by application areas

[Pie chart: user statistics by research field. Legend, in the order extracted: Earth and Environmental Sciences, Chemistry, Physics, Materials Science, Biosciences, Astronomy, Fluid dynamics, Nanoscience. Slice values, in the order extracted: 26%, 20%, 16%, 10%, 9%, 8%, 8%, 3%]

Page 4:

Distribution of compute time in 2008 by institutions

[Pie chart: user statistics by institution. Legend, in the order extracted: ETHZ, PSI, UNI-ZH, MCH, EPFL, UNI-BA, UNI-GE, EMPA, UNI-BE. Slice values, in the order extracted: 59%, 13%, 9%, 6%, 5%, 4%, 2%, 1%, 1%]

Page 5:

The national HPCN strategy

Issues
  HPC is a key requirement for leadership science as well as for a knowledge-based society and industry
  International competition in HPC is accelerating (USA, Japan, Germany, France, UK, Spain, China and India)
  Economy of scale is a basis of HPC

Answers
  Installation of a Petaflop/s computer by 2011/2012
  Construction of a new CSCS building
  Creation of a Swiss competence network to connect existing application areas and reach out to new ones

Page 6:

Stimulus Package

Swiss Stimulus Package: 700 Million CHF

2% of all stimulus money went to CSCS!

3.5 Million for building planning
10 Million for new machine
3 Million for HPC education

Page 7:

Cray XT5 – Monte Rosa

14,752 processors
  1844 eight-way nodes
  2 AMD 2.4 GHz “Shanghai” Opterons per node
  Upgrade underway to 2.4 GHz “Istanbul”
Peak performance 141 Tflop/s
  Linpack 117 Tflop/s
  Peak will be 212 Tflop/s after upgrade
29 Terabytes of memory
  16 Gigabytes per node
  2 Gbytes per processor core
287 Terabytes of scratch file system
  Capable of ~12 GB/s sustained write bandwidth
23rd on Top500 list in June 2009
  4th most powerful system in Europe
Already at 90% utilisation
  ~30% of jobs require > 50% of machine
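The headline figures on this slide can be cross-checked with simple arithmetic. The following is a minimal sketch, not CSCS's own calculation. It assumes 4 double-precision floating-point operations per core per clock cycle (not stated on the slide), that "eight-way" means 8 cores per node, and that the "Istanbul" upgrade brings each node to 12 cores (consistent with the 22’128 cores quoted for the hex-core upgrade later in the deck). The class and variable names are illustrative only.

```python
from dataclasses import dataclass

# Assumption (not on the slide): 4 double-precision flops per core per cycle.
FLOPS_PER_CYCLE = 4


@dataclass(frozen=True)
class Machine:
    nodes: int
    cores_per_node: int
    clock_ghz: float
    mem_per_node_gb: int

    @property
    def cores(self) -> int:
        return self.nodes * self.cores_per_node

    @property
    def peak_tflops(self) -> float:
        # cores * GHz * flops/cycle is in Gflop/s; divide by 1000 for Tflop/s.
        return self.cores * self.clock_ghz * FLOPS_PER_CYCLE / 1000.0

    @property
    def memory_tb(self) -> float:
        # Decimal terabytes: 1 TB = 1000 GB.
        return self.nodes * self.mem_per_node_gb / 1000.0


# Node count, clock rate and memory per node are taken from the slide.
shanghai = Machine(nodes=1844, cores_per_node=8, clock_ghz=2.4, mem_per_node_gb=16)
# Assumption: 12 cores per node after the "Istanbul" upgrade.
istanbul = Machine(nodes=1844, cores_per_node=12, clock_ghz=2.4, mem_per_node_gb=16)

LINPACK_TFLOPS = 117.0  # from the slide
linpack_efficiency = LINPACK_TFLOPS / shanghai.peak_tflops

print(f"cores before upgrade: {shanghai.cores}")
print(f"peak before upgrade:  {shanghai.peak_tflops:.1f} Tflop/s")
print(f"peak after upgrade:   {istanbul.peak_tflops:.1f} Tflop/s")
print(f"total memory:         {shanghai.memory_tb:.1f} TB")
print(f"Linpack efficiency:   {linpack_efficiency:.0%}")
```

Under these assumptions the computed values agree with the slide's rounded figures (141 and 212 Tflop/s peak, about 29 TB of memory), and the Linpack result corresponds to roughly 83% of peak.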

Page 8:

Pillars of the 21st-century scientific method

Theory (since antiquity)

combined with experiment (since Galilei & Newton)

and simulation (since Metropolis, Teller, von Neumann, Fermi, ... 1940s)

Excellence in Science requires leadership in all three areas: theory, experiment, and simulations

Page 9:

Invest in algorithms or computer hardware?

[Figure: relative performance (logarithmic scale, 1 to 1E10) versus year (1970 to 2000); the only recoverable curve label is "computer speed"]

(source: David Landau, UGA)

Page 10:

Simulations are necessary for scientific investigations to cope effectively with complex systems

Science is about discovery and understanding - those who come first get the credit

Simulations that use high-performance computing (HPC) have the competitive edge

Leadership in science requires leadership in simulation and leadership in HPC in particular

Page 11:

Role of science in Switzerland: why we are well positioned to make leading contributions to HPC

Switzerland puts a high value on scientific research and education and on maintaining international leadership in science and engineering

The density of internationally recognized computational scientists in Switzerland is very high, even when compared to the USA

Stable funding and flat hierarchies in Switzerland and particularly at ETH allow for a pragmatic, solution-oriented, and nimble response to new challenges and opportunities

Page 12:

Top 200 in Shanghai List

Computational Science in Switzerland

The density of internationally recognized computational scientists in Switzerland is very high

ETH Zurich, EPF Lausanne

University of Zurich, University of Basel, University of Bern, University of Geneva

EMPA, Paul Scherrer Institute

CSCS User Community

Page 13:

INSTITUTE FOR ATMOSPHERIC AND CLIMATE SCIENCE, ETH ZURICH, Switzerland

Predicting the frequency of severe weather events in a changing climate: high-resolution simulations are crucial

Potential vorticity streamers are intrusions of stratospheric air into the troposphere. They affect various atmospheric processes, such as heavy precipitation over the Alps.

High-resolution ECHAM-HAM simulations reliably capture the frequency at which potential vorticity streamers occur. Low-resolution simulations underestimate their occurrence.

(Master's thesis, A. Béguin, ETH Zurich, 2009)

[Figure: absolute streamer occurrence on 330 K during winter (DJF). Panels: 2.8° x 2.8° (T42L19); 1.9° x 1.9° (T63L31); 1.1° x 1.1° (T106L31); reference data (ERA40, 1° x 1°)]

CSCS, Swiss National Supercomputing Center

Page 14:

INSTITUTE FOR ATMOSPHERIC AND CLIMATE SCIENCE, ETH ZURICH, Switzerland

Europe in ECHAM-HAM

[Figure: land/sea distribution and terrain height (colour scale: sea, 0 m, 500 m, 1000 m, 1500 m, 2000 m). Panels: 2.8° x 2.8° (T42); 1.9° x 1.9° (T63); 1.1° x 1.1° (T106)]

High resolution is required to:
1) provide boundary conditions for a nested regional model
2) compare the model with regional-scale observational data

For example, Italy, the Alps, or Denmark are missing at low resolution (2.8° x 2.8°)

Page 15:

Why resolution is such an issue for Switzerland

[Figure: panels at grid spacings of 70 km, 35 km, 8.8 km, 2.2 km, and 0.55 km, with the labels 1X, 100X, 10,000X, and 1,000,000X]
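The transcript does not say what the labels 1X to 1,000,000X measure; one plausible reading is relative computational cost as the grid is refined. The following is a minimal sketch of a common rule of thumb, not the model used for the slide. It assumes cost grows with the cube of the refinement factor (two horizontal dimensions plus a proportionally shorter time step, with the vertical grid held fixed). The function name and the exponent are illustrative assumptions.

```python
def relative_cost(dx_coarse_km: float, dx_fine_km: float, exponent: float = 3.0) -> float:
    """Cost multiplier for refining the grid spacing from dx_coarse_km to dx_fine_km.

    Assumes cost ~ (1 / dx) ** exponent. The default exponent of 3 is a rule of
    thumb (two horizontal dimensions plus the time step), not a value from the slide.
    """
    if dx_coarse_km <= 0 or dx_fine_km <= 0:
        raise ValueError("grid spacings must be positive")
    return (dx_coarse_km / dx_fine_km) ** exponent


# Grid spacings as listed on the slide.
spacings_km = [70.0, 35.0, 8.8, 2.2, 0.55]

for dx in spacings_km:
    factor = relative_cost(spacings_km[0], dx)
    print(f"{dx:6.2f} km: about {factor:,.0f}x the cost of the 70 km grid")
```

With a cubic exponent, each fourfold refinement (for example 2.2 km to 0.55 km) costs about 64 times more, somewhat less than the factor of 100 per step shown on the slide; refining the vertical grid as well would raise the exponent and the cost.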

Page 16:

INSTITUTE FOR ATMOSPHERIC AND CLIMATE SCIENCE, ETH ZURICH, Switzerland

High-resolution cloud-resolving regional climate simulations: towards improved simulations of the water cycle in a changing climate

The Alpine area is very vulnerable to changes in the water cycle such as droughts, heat waves, and floods. Current projections of future changes in summer precipitation are highly uncertain.

Advantages of cloud-resolving climate models: (1) better representation of the land surface, (2) explicit representation of heavy precipitation (e.g. thunderstorms).

Better representation of the daily cycle of precipitation in summer periods (Hohenegger et al. 2008, MZ).

[Figure panels: cloud resolving @ 2.2 km; state of the art @ 25 km]

Page 17:

INSTITUTE FOR ATMOSPHERIC AND CLIMATE SCIENCE, ETH ZURICH, Switzerland

Terrain height in the regional climate model at different resolutions

Page 18:

Importance of HPC for modelling other Natural Hazards in Switzerland

Climate and Weather

Avalanches

Energy

Astrophysics

Engineering

Earthquakes

• In 1356 Basel was destroyed by an earthquake.
• We now know that large earthquakes are more frequent than previously thought.
• Earthquake modelling is important for planning nuclear power plant safety.

Page 19:

Selected application areas for simulation based science and engineering in Switzerland

Climate and Weather

Materials science

Chemistry/Pharmaceutical

Biomedical

Energy

and many others

Astrophysics

Engineering

Page 20:

Simulations require a high-performance computing ecosystem

Local/institutional capacity computing
Capability computing at regional/national centers
Leadership

1. Prior to 2004: VASP code developed on workstations and clusters; runs on about 100 processors
2. Scale-out 2005: algorithm and implementation adapted for leadership systems
3. Leadership runs 2006-2007: production runs on leadership Cray XT3/4 system (~5000 processors)
4. Large simulations since 2008: continue large simulations on capability systems

Page 21:

Strategic goals

In order to sustain a leading position in science, Switzerland has to develop leadership in HPC to support simulations, one of the three pillars of modern science

Sustainable implementation of the HPC ecosystem in Switzerland, which includes the national supercomputing center, institutional computing facilities, as well as effective mapping of models and methods onto modern HPC hardware

Establish strong relationships with leadership computing facilities around the world

Develop key components of HPC in Switzerland:
  Method and algorithm development
  Programming models, languages, and architectures for HPC
  Sustained operations of national and institutional HPC systems

Page 22:

The ecosystem in numbers (peak performance)

2009 (today)
  Leadership: 1.5 PFlop/s (Jaguar @ ORNL/LCF, 200 XT5 cabinets)
    5x
  Capability computing at regional/national centers: 300 TFlop/s (Rosa @ CSCS: only 20 XT5 cabinets => 210 TFlop/s, infrastructure limited)
    5x
  Local/institutional capacity computing: 60 TFlop/s (EPFL: ~60 TFlop/s, UZH: ~60 TFlop/s, ETHZ: ~70 TFlop/s)

2011 (planned)
  Leadership: 20 PFlop/s (Sequoia (BG/Q) @ LLNL; LCF3: Argonne or Oak Ridge)
    5x
  Capability computing at regional/national centers: 4 PFlop/s (will require new building infrastructure at CSCS)
    5x
  Local/institutional capacity computing: 800 TFlop/s (think about this now!)
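The "5x" labels on this slide can be checked against the quoted peak figures. The sketch below is a minimal check that assumes the flattened numbers belong to a three-tier ecosystem (leadership, national, institutional) in descending order of performance; that assignment is a reading of the extracted text, and the dictionary keys and function name are illustrative only.

```python
# All values in TFlop/s; 1 PFlop/s = 1000 TFlop/s.
PFLOPS = 1000.0

# Assumption: assignment of the slide's figures to tiers, largest first.
tiers = {
    "2009 (today)": {
        "leadership": 1.5 * PFLOPS,
        "national": 300.0,
        "institutional": 60.0,
    },
    "2011 (planned)": {
        "leadership": 20.0 * PFLOPS,
        "national": 4.0 * PFLOPS,
        "institutional": 800.0,
    },
}


def tier_ratios(tier: dict) -> tuple:
    """Return (leadership / national, national / institutional)."""
    return (
        tier["leadership"] / tier["national"],
        tier["national"] / tier["institutional"],
    )


for year, tier in tiers.items():
    upper, lower = tier_ratios(tier)
    print(f"{year}: leadership/national = {upper:.1f}x, national/institutional = {lower:.1f}x")

# Rosa's infrastructure-limited peak (from the slide) relative to the 2009 national-tier figure.
rosa_share = 210.0 / tiers["2009 (today)"]["national"]
print(f"Rosa reaches {rosa_share:.0%} of the 300 TFlop/s national-tier figure")
```

Under this reading both years show a factor of 5 between adjacent tiers, and the planned 4 PFlop/s national tier is 20% of the 20 PFlop/s leadership figure, the lower end of the 20-25% goal stated for Hardware Phase II.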

Page 23:

Elements of the Swiss HPCN Initiative

Swiss Platform for HP2C (2009-12):
  Simulation capabilities that make effective use of next-generation supercomputers
  Establish HPC in CSE programs at Swiss universities

Hardware Phase I (2009-11):
  Upgrade Cray XT system at CSCS to the maximum possible within current infrastructure
  Develop new building infrastructure by 2012: state-of-the-art infrastructure that supports a machine footprint about a factor of 10 larger than today

Hardware Phase II (2012-15):
  Goal for CSCS is to host systems with 20-25% of the performance of the largest leadership system in the world

Page 24:

Experiences with upgrade in 2009

Implemented in record time!
  March: financing, decision & placement of order
  February through April: site preparations
  May: installation
  June: early users & acceptance
  July: part of CSCS user program

CSCS at maximum of current building capacity
  Current power usage 1.9 MW (99% of capacity)
  Running at maximum cooling capacity (frequent system shut-downs in summer)
  Abandon memory upgrade in fall 2009
  No room to further grow computer systems in the future

Page 25:

New building planned in Lugano

Area (1500 m^2)
Power & cooling ~10 MW
Proximity to academic institution
Facilitate seamless changes in computer hardware
Extensible

Page 26:

Learning from the Oak Ridge experience: covering all aspects of the simulation system

[Diagram labels: Simulations; Models, Methods, & Implementation; Map to Hardware; System operation; System design]

[Diagram labels: Physics (chemistry, ...); Application software; Comp. mathematics; Computer Science; Computer Center; Hardware vendor]

Example based on ORNL’s early science teams that run on the first petaflop/s systems

Distributing the tasks in Switzerland: Users; CSE at universities; Applied research (CSCS & USI); CSCS’s (HPC centers’) traditional role; Systems research (CS Dept. & vendors); vendors

Page 27:

The Swiss platform for High-Performance and High-Productivity Computing (HP2C)

Develop simulation capabilities that will make effective use of supercomputing platforms in 2012-14

Implement the “networking” part of the HPCN strategy

Core program in computational mathematics and problem-oriented computer science (jointly between CSCS & University of Lugano)

About 10-15 domain science sub-projects at Swiss universities with ~3 “embedded” HPC developers per project

Explore future hardware architectures with industry (Cray, IBM, others) and leading laboratories (ORNL, NERSC, others)

Develop HPC components of computational science and engineering curricula at Swiss universities
  Already established: CSE at ETH, U. Basel
  Currently under development: CSE @ USI, EPFL, UZH
  Reach out to other universities

Page 28:

Projects have to face “brutal facts of HPC”

Massive concurrency: applications will have to cope with millions (billions) of threads

Less and (relatively) slower memory per thread: memory considerations should be an integral part of complexity analysis

Only slow improvements in inter-processor and inter-thread communication; remember that the speed of light is constant!

Stagnant I/O subsystems: you don’t want the rate of progress in long-term storage technologies to limit progress in simulation capabilities

Resilience and fault tolerance: resilience towards failure of individual components; the (energy) cost of error detection and correction is non-negligible
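The remark about the speed of light can be made concrete with a lower bound on communication latency. The following is a minimal sketch under stated assumptions: the 30 m distance is a hypothetical machine-room cable run, not a figure from the talk, and the signal is assumed to travel at the vacuum speed of light, which real cables and switches do not reach. The 2.4 GHz clock is the Opteron clock rate quoted on the Monte Rosa slide. Function names are illustrative.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact by SI definition


def min_one_way_latency_ns(distance_m: float, fraction_of_c: float = 1.0) -> float:
    """Lower bound on one-way signal latency in nanoseconds.

    fraction_of_c is the signal speed relative to light in vacuum
    (1.0 is the physical limit; real interconnects are slower).
    """
    if distance_m < 0 or not (0 < fraction_of_c <= 1.0):
        raise ValueError("distance must be >= 0 and 0 < fraction_of_c <= 1")
    return distance_m / (SPEED_OF_LIGHT_M_PER_S * fraction_of_c) * 1e9


def cycles_elapsed(latency_ns: float, clock_ghz: float) -> float:
    """Clock cycles that pass while a signal is in flight (GHz = cycles per ns)."""
    return latency_ns * clock_ghz


# Hypothetical example: a 30 m cable run between two cabinets.
latency = min_one_way_latency_ns(30.0)
print(f"at least {latency:.0f} ns one way, "
      f"about {cycles_elapsed(latency, 2.4):.0f} cycles at 2.4 GHz")
```

Even at this physical limit a core waits on the order of a few hundred clock cycles for a message to cross the room, so faster processors alone cannot hide communication cost; algorithms have to reduce or overlap communication.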

Page 29:

Expected research priorities of projects

Significant problems that require orders of magnitude more computer power than what is available today

Significant re-engineering of algorithms and refactoring of codes - scientific progress cannot be limited by legacy software

Consider emerging parallel programming models - multiple levels of parallelism, PGAS, DARPA HPCS languages, heterogeneous nodes (consider CPU + accelerator)

Revisit workflows, in particular to avoid I/O

Letters of intent were due August 15, 2009
Project proposals were due September 30, 2009
Review and decision-making process in October/November 2009
Tier 1 projects start in Dec. 2009/Jan. 2010
Tier 2 projects start ca. spring/summer 2010

Page 30:

CSCS service portfolio

Business Services
• Administration
• Human resources
• Finance
• Building Infrastructure
• IT Infrastructure

National Supercomputing Service
• HPC Systems
• System programming
• Resource allocation
• User support
• User education & training
• Short- to medium-term application support

Scientific Computing
• Long-term application development support
• Data analysis & visualisation
• Experimental HPC systems

Research Computing Collocation Service
• MeteoSwiss
• CHIPP
• Other hosting mandates

Internal support services

Technology transfer

Core business: Academic HPC service and HPC research

CONFIDENTIAL

Page 31:

High-risk & high-impact projects of the HP2C platform (www.hp2c.ch)

[Timeline with year axis 2005 to 2013. Event labels, in the order recoverable from the extraction:]
New procurement: Cray XT3, 1’100 processors
Upgrade: Cray XT3, 1’664 proc.
Dual-core upgrade: Cray XT3, 3’328 cores
Upgrade: Cray XT5, 14’752 cores
Hex-core upgrade: 22’128 cores
Begin construction of new building
“Final” upgrade Cray XT5
Procurement of next-generation supercomputer
New building

HPCN initiative