
Angèle Simard

Canadian Meteorological Center

Meteorological Service of Canada

MSC Computing Update

2

Canadian Meteorological Centre

National Meteorological Centre for Canada

Provides services to National Defence and emergency organizations

International WMO commitments (emergency response, telecommunications, data, etc.)

Supercomputer used for NWP, climate, air quality, etc.; operations and R&D

3

Outline

Current Status

Supercomputer Procurement

Conversion Experience

Scientific Direction

Current Status

5

MSC's Supercomputing History

[Chart: peak performance in MFLOPS (log scale, 1 to 1,000,000) versus year, 1974 to 2002, tracing the CDC 7600, CDC 176, Cray 1S, Cray XMP 28, Cray XMP 416, NEC SX-3/44, NEC SX-3/44R, NEC SX-4/80M3, NEC SX-5/32M2, and NEC SX-6/80M10.]

6

CMC Supercomputer Infrastructure

[Diagram: supercomputer, central file server, tape robot, and front ends on a common high-speed network]

Supercomputer: NEC SX-6/80M10; 3.6 TB of RAID disk on a 128-port switched Fibre Channel fabric

High-speed network: 1000 Mb/s switched, 128 ports; each host has links to the operational and development networks

Central file server: SGI Origin 300 (8 PEs), 1 TB over 4x FC, LSI RAID

Tape robot: ADIC AML-E, 145 TB, 4 DST drives at 20 MB/s each

Front ends: SGI Origin 3000, Linux cluster of 12 DL380s, Netapp F840 (0.5 TB)

7

Supercomputers: NEC

NEC

– 10 nodes

– 80 PE

– 640 GB Memory

– 2 TB disks

– 640 Gflops (peak)

8

Archive Statistics

[Chart: terabytes (0 to 200) by year, 1996 to 2002, for total archive size, new data written, and data read.]

9

Automated Archiving System

22,000 tape mounts/month

11 Terabytes growth/month

maximum of 214 TB (see the headroom sketch below)
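As a rough illustration of what these growth figures imply for the 214 TB ceiling, here is a minimal sketch; the capacity and growth numbers come from the bullets above, while the current-size value in the example is purely hypothetical:

```python
# Rough headroom estimate from the archive figures above (illustrative only).
capacity_tb = 214          # stated maximum, TB
growth_tb_per_month = 11   # stated archive growth, TB/month

def months_until_full(current_tb: float) -> float:
    """Months of headroom left at the stated growth rate; current_tb is hypothetical."""
    return max(capacity_tb - current_tb, 0) / growth_tb_per_month

print(f"{months_until_full(150):.1f} months of headroom")  # ~5.8 months at 150 TB
```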

Supercomputer Procurement

11

RFP/Contract Process

Competitive process

Formal RFP released in mid-2002

Bidders needed to meet all mandatory requirements to advance to the next phase

Bids above the funding envelope were rejected

Bidders were rated on performance (90%) and price (10%)

IBM came first

Live preliminary benchmark at IBM facility

Contract awarded to IBM in November 2002

12

IBM Committed Sustained Performance in MSC TFlops

[Chart: committed sustained performance of 1.042, 2.48, and 5.89 TFlops for the initial, upgrade, and optional upgrade stages respectively.]

13

On-going Support

The systems must achieve a minimum of 99% availability (see the downtime arithmetic below).

Remedial maintenance 24/7: 30-minute response time between 8 a.m. and 5 p.m. on weekdays; one-hour response outside those periods.

Preventive maintenance or engineering changes: maximum of eight (8) hours a month, in blocks of time not exceeding two (2) or four (4) hours per day (subject to certain conditions).

Software support: Provision of emergency and non-emergency assistance.
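For scale, the 99% availability target allows only about seven hours of downtime per month; a minimal sketch of that arithmetic (the 30-day month is an assumption):

```python
# Downtime allowance implied by the 99% availability target (illustrative arithmetic).
availability = 0.99
hours_per_month = 30 * 24                       # assuming a 30-day month
allowed_downtime = (1 - availability) * hours_per_month
print(f"Allowed downtime: {allowed_downtime:.1f} hours/month")  # ~7.2 hours
```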

14

CMC Supercomputer Infrastructure

[Diagram: same layout as slide 6, with the IBM SP added as the supercomputer]

Supercomputer: IBM SP, 832 PEs and 1.6 TB

High-speed network: 1000 Mb/s switched, 128 ports; each host has links to the operational and development networks

Central file server: SGI Origin 300 (8 PEs), 1 TB over 4x FC, LSI RAID

Tape robot: ADIC AML-E, 145 TB, 4 DST drives at 20 MB/s each

Front ends: SGI Origin 3000s (24 PEs, MIPS R12K), Linux cluster of 12 DL380s, Netapp F840 (0.5 TB)

15

Supercomputers: Now and Then…

NEC

– 10 nodes

– 80 PE

– 640 GB Memory

– 2 TB disks

– 640 Gflops (peak)

IBM

– 112 nodes

– 928 PE

– 2.19 TB Memory

– 15 TB Disks

– 4.825 Tflops (peak)

Conversion Experience

17

Transition to IBM

Prior to conversion:

Developed and tested code on multiple platforms (MPP & vector)

Where possible, avoided proprietary tools

When required, hid proprietary routines under a local API to centralize changes (see the sketch after this list)

Following contract award, obtained early access to an IBM system (thanks to NCAR)
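A minimal sketch of the "local API" idea described above, assuming a hypothetical vendor library; the names here are illustrative only (the actual MSC code is largely Fortran and is not shown):

```python
# Illustrative sketch of hiding a proprietary routine behind a local API.
# "vendor_fft" is a hypothetical proprietary library; the real MSC code is not shown here.
import numpy as np

def fft_forward(field):
    """Single local entry point; model code never calls the vendor library directly."""
    try:
        import vendor_fft                 # hypothetical platform-specific library
        return vendor_fft.forward(field)
    except ImportError:
        return np.fft.fftn(field)         # portable fallback on any other platform

# Porting to a new machine then means changing only this one module,
# not every model routine that needs an FFT.
```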

18

Transition to IBM

Application Conversion Plan:

The plan was made to comply with a tight schedule

The plan's key element was a rapid implementation of an operational "core" (to test the mechanics)

Phased approach for optimized versions of the models

End-to-end testing scheduled to begin mid-September

19

Scalability

Rule of Thumb

To obtain required wall-clock timings:

Roughly 4 times more IBM processors are required to obtain the same performance as the NEC SX-6 (see the sketch below)
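A minimal sketch of applying that rule of thumb; the 4x factor is the figure quoted above, and the helper function is purely illustrative:

```python
# Illustrative use of the ~4x rule of thumb quoted above.
NEC_TO_IBM_FACTOR = 4  # IBM processors per NEC SX-6 processor for similar wall-clock time

def ibm_processors_needed(nec_pes: int) -> int:
    """Estimate the IBM processor count matching a given NEC SX-6 job size."""
    return NEC_TO_IBM_FACTOR * nec_pes

# Example: the 8-PE NEC regional run maps to roughly 32 IBM processors,
# consistent with the 8 MPI x 4 OpenMP configuration in the table that follows.
print(ibm_processors_needed(8))  # 32
```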

20

Preliminary Results

Model (run length)            NEC time   IBM time   NEC config   IBM config
GEMDM regional (48 hour)      38 min     34 min     8 MPI        8 MPI x 4 OpenMP (32 total)
GEMDM global (240 hour)       35 min     36 min     4 MPI        4 MPI x 4 OpenMP (16 total)
3D-Var R1 (without AMSU-B)    11 min     9 min      4 CPUs       6 CPUs (OpenMP)
3D-Var G2 (with AMSU-B)       16 min     12 min     4 CPUs       6 CPUs (OpenMP)

Scientific Directions

22

Global System

Now: Uniform resolution of 100 km (400 x 200 x 28); 3D-Var at T108 on model levels, 6-hr cycle; use of raw radiances from AMSU-A, AMSU-B and GOES

2004: Resolution to 35 km (800 x 600 x 80; see the grid-size sketch below); top at 0.1 hPa (instead of 10 hPa) with additional AMSU-A and AMSU-B channels; 4D-Var assimilation with a 6-hr time window, 3 outer loops at full model resolution and inner loops at T108 (CPU equivalent of a 5-day forecast of the full-resolution model); new datasets: profilers, MODIS winds, QuikSCAT

2005+: Additional datasets (AIRS, MSG, MTSAT, IASI, GIFTS, COSMIC); improved data assimilation
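To give a sense of the computational jump between the current and the 2004 global configurations, a small illustrative calculation using the grid dimensions quoted above (the ratio ignores the shorter time step that the finer grid would also require):

```python
# Grid-point counts for the current and planned global configurations (dimensions from the slide).
now_points = 400 * 200 * 28    # 100 km grid: about 2.2 million points
plan_points = 800 * 600 * 80   # 35 km grid: about 38.4 million points
print(f"Growth factor: {plan_points / now_points:.1f}x")  # ~17x more grid points
```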

23

Regional System

Now: Variable resolution, uniform region at 24 km (354 x 415 x 28); 3D-Var assimilation on model levels at T108, 12-hr spin-up

2004: Resolution to 15 km in the uniform region (576 x 641 x 58); inclusion of AMSU-B and GOES data in the assimilation cycle; four model runs a day (instead of two); new datasets: profilers, MODIS winds, QuikSCAT

2006: Limited-area model at 10 km resolution (800 x 800 x 60); LAM 4D-Var data assimilation; assimilation of radar data

24

Ensemble Prediction System

Now: 16-member global system (300 x 150 x 28); forecasts up to 10 days, once a day at 00Z; Optimal Interpolation assimilation system, 6-hr cycle, use of derived radiance data (SATEMs)

2004: Ensemble Kalman Filter assimilation system (sketched below), 6-hr cycle, use of raw radiances from AMSU-A, AMSU-B and GOES; two forecast runs per day (12Z run added); forecasts extended to 15 days (instead of 10)
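Since the 2004 plan replaces Optimal Interpolation with an Ensemble Kalman Filter, the following is a minimal, generic sketch of a stochastic EnKF analysis step with perturbed observations; it is a textbook illustration only, not the CMC implementation (which adds localization, radiance operators, and much more):

```python
# Minimal stochastic EnKF analysis step (generic textbook form, not the CMC code).
import numpy as np

def enkf_analysis(X, y, H, R):
    """Update an ensemble of states X (n x m) with observations y (p,) using
    observation operator H (p x n) and observation-error covariance R (p x p)."""
    n, m = X.shape
    Xm = X.mean(axis=1, keepdims=True)
    A = X - Xm                                        # ensemble anomalies
    Pf = A @ A.T / (m - 1)                            # sample forecast-error covariance
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)    # Kalman gain
    rng = np.random.default_rng(0)
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=m).T  # perturbed obs
    return X + K @ (Y - H @ X)                        # analysis ensemble

# Toy example: 3-variable state, 10 members, one observation of the first variable.
X = np.random.default_rng(1).normal(size=(3, 10))
H = np.array([[1.0, 0.0, 0.0]])
y = np.array([0.5])
R = np.array([[0.1]])
Xa = enkf_analysis(X, y, H, R)
```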

25

...Ensemble Prediction System

2005: Increased resolution to 100 km (400 x 200 x 58); members increased to 32; additional datasets as in the global deterministic system

2007: Prototype regional ensemble, 10-member LAM (500 x 500 x 58); no distinct data assimilation; initial and boundary conditions from the global EPS

26

Mesoscale System

Now: Variable resolution, uniform region at 10 km (290 x 371 x 35); two windows; no data assimilation

2004: Prototype limited-area model at 2.5 km (500 x 500 x 45) over one area

2005: Five limited-area model windows at 2.5 km (500 x 500 x 45)

2008: 4D data assimilation

27

Coupled Models

Today

In R&D: coupled atmosphere, ocean, ice, wave, hydrology, biology & chemistry

In Production: storm surge, wave

Future

Global coupled model for both prediction & data assimilation

28

Estuary & Ice Circulation

29

Merchant Ship Routing

[Chart: power ("Puissance") versus time ("Temps", about 2 days) comparing the southern route ("Route sud") and the northern route ("Route nord").]

30

Challenges

IT Security

Accelerated Storage Needs

Replacement of Front-end & Storage Infrastructure