CCSM4 - A Flexible New Infrastructure for Earth System Modeling

Mariana Vertenstein, NCAR CCSM Software Engineering Group


Page 1: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

04/20/23 1

CCSM4 - A Flexible New Infrastructure for Earth System Modeling

Mariana Vertenstein NCAR

CCSM Software Engineering Group

Page 2: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

Major Infrastructure Changes since CCSM3

CCSM4/CPL7 development could not have occurred without the following collaborators:
– DOE/SciDAC: Oak Ridge National Laboratory (ORNL), Argonne National Laboratory (ANL), Los Alamos National Laboratory (LANL), Lawrence Livermore National Laboratory (LLNL)
– NCAR/CISL
– ESMF


Page 3: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

Outline

– What are the software requirements of a community earth system model?
– Overview of the current CCSM4
– How does CCSM4 address the requirements? Flexibility permits greater efficiency, throughput, and ease of porting and model development
– How is CCSM4 being used in new ways? Interactive ensembles (extending the traditional definition of a component); extending CCSM to ultra-high resolutions
– What are CCSM4 scalability and performance?
– Upcoming releases and new CCSM4 scripts


Page 4: CCSM4 - A Flexible New Infrastructure for Earth System Modeling


CESM General Software Requirements

Page 5: CCSM4 - A Flexible New Infrastructure for Earth System Modeling


Specific High Resolution Requirements

– Capability to use both MPI and OpenMP effectively to address requirements of new multi-core architectures
– Scalable and flexible coupling infrastructure
– Parallel I/O throughout the model system (for both scalable memory and performance)
– Scalable memory (minimum global arrays) for each component

Page 6: CCSM4 - A Flexible New Infrastructure for Earth System Modeling


CCSM4 Overview

– Consists of a set of 4 (5 for CESM) geophysical component models on potentially different grids that exchange boundary data with each other only via communication with a coupler (hub-and-spoke architecture)
– New science is resulting in a sharply increasing number of fields being communicated between components
– Large code base: >1M lines; mostly Fortran 90; developed over 20+ years; 200-300K lines are critically important --> no computational kernels, need good compilers
– Collaborations are critical: DOE/SciDAC, University Community, NSF (PetaApps), ESMF

Page 7: CCSM4 - A Flexible New Infrastructure for Earth System Modeling


What are the CCSM Components?

– Atmosphere component: CAM, DATM, (WRF). CAM modes: multiple dycores, multiple chemistry options, WACCM, single column. Data-ATM: multiple forcing/physics modes.
– Land component: CLM, DLND, (VIC). CLM modes: no BGC, BGC, Dynamic Vegetation, BGC-DV, Prescribed-Veg, Urban. Data-LND: multiple forcing/physics modes.
– Ice component: CICE, DICE. CICE modes: fully prognostic, prescribed. Data-ICE: multiple forcing/physics modes.
– Ocean component: POP, DOCN (SOM/DOM), (ROMS). POP modes: ecosystem, fully coupled, ocean-only, multiple physics options. Data-OCN: multiple forcing/physics modes (SOM/DOM).
– New land ice component
– Coupler: regridding, merging, calculation of ATM/OCN fluxes, conservation diagnostics

Page 8: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

CCSM Component Grids

– Ocean and sea ice must run on the same grid (displaced pole, tripole)
– Atmosphere and land can now run on different grids; these are in general different from the ocean/ice grid (lat/lon, but also the new cubed sphere for CAM)
– Globally, grids span low resolution (3 degree) to ultra-high: 0.25° ATM/LND [1152 x 768], 0.50° ATM/LND [576 x 384], 0.1° OCN/ICE [3600 x 2400]
– Regridding is done in parallel at runtime using mapping files that are generated offline using SCRIP
– In the past, grids have been global and logically rectangular, but now can have single point, regional, cubed sphere, …
– Regridding issues are rapidly becoming a higher priority
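Conceptually, applying an offline-generated (SCRIP-style) map at runtime is a sparse matrix-vector product: each link in the mapping file contributes a weighted source cell to a destination cell. A minimal sketch of that idea (the function name, the toy field, and the weights below are illustrative, not CCSM code or the SCRIP file format):

```python
# Sketch of applying precomputed regridding weights, link by link:
#   dst[j] += w * src[i]   for every (i, j, w) link in the mapping file.

def apply_map(src, src_addr, dst_addr, weights, n_dst):
    """Apply a sparse regridding map given one (src, dst, weight) triple per link."""
    dst = [0.0] * n_dst
    for i, j, w in zip(src_addr, dst_addr, weights):
        dst[j] += w * src[i]
    return dst

# Toy example: map a 4-cell source field onto a 2-cell destination grid,
# each destination cell an equal-weight average of two source cells.
src = [1.0, 2.0, 3.0, 4.0]
src_addr = [0, 1, 2, 3]
dst_addr = [0, 0, 1, 1]
weights = [0.5, 0.5, 0.5, 0.5]
print(apply_map(src, src_addr, dst_addr, weights, 2))  # [1.5, 3.5]
```

Because the weights are precomputed offline, the runtime step is cheap and embarrassingly parallel over destination cells, which is what makes parallel regridding in the coupler practical.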


Page 9: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

CCSM Component Parallelism

– MPI/OpenMP: CAM, CLM, CICE, and POP have MPI/OpenMP hybrid capability; the coupler has only MPI capability; the data models have only MPI capability
– Parallel I/O (use of the PIO library): CAM, CICE, POP, CPL, and the data models all have PIO capability


Page 10: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

New CCSM4 Architecture

[Diagram: the new single-executable CCSM4 architecture (CPL7). A driver controls time evolution; CAM, CLM, CICE, and POP exchange data through the coupler (CPL: regridding, merging). Components can run in a fully sequential layout on a shared processor set, or in hybrid sequential/concurrent layouts on partially disjoint processor sets. Contrasted with the original multiple-executable CCSM3 architecture (CPL6), in which CAM, CLM, CICE, POP, and CPL each ran concurrently on their own processors.]

Page 11: CCSM4 - A Flexible New Infrastructure for Earth System Modeling


Advantages of CPL7 Design

– New flexible coupling strategy
– The design targets a wide range of architectures: massively parallel peta-scale hardware, smaller linux clusters, and even single laptop computers
– Provides efficient support of varying levels of parallelism via simple run-time configuration of the processor layout
– New CCSM4 scripts provide one simple xml file to specify the processor layout of the entire system, plus automated timing information to simplify load balancing
– Scientific unification: ALL model development is done with one code base, eliminating the separate stand-alone component code bases (CAM, CLM)
– Code reuse and maintainability: lowers the cost of support/maintenance
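To make the "one simple xml file" concrete: in the CCSM4 scripts the processor layout lives in an xml file of per-component entries (task count, thread count, and root processor). The excerpt below is a sketch only; the values are invented for illustration, and the exact entry names (NTASKS_*, NTHRDS_*, ROOTPE_*) and file name (env_mach_pes.xml) should be checked against the released scripts.

```xml
<!-- Hypothetical excerpt from an env_mach_pes.xml-style layout file.
     Values are illustrative: CAM on 1664 MPI tasks starting at processor 0,
     POP on 1364 tasks starting where CAM ends (a concurrent layout). -->
<entry id="NTASKS_ATM" value="1664" />
<entry id="NTHRDS_ATM" value="1" />
<entry id="ROOTPE_ATM" value="0" />
<entry id="NTASKS_OCN" value="1364" />
<entry id="NTHRDS_OCN" value="1" />
<entry id="ROOTPE_OCN" value="1664" />
</entry-sketch-omitted-for-other-components>
```

Overlapping the ROOTPE/NTASKS ranges gives a sequential layout; disjoint ranges (as above for the ocean) give a concurrent one, with no source-code change.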

Page 12: CCSM4 - A Flexible New Infrastructure for Earth System Modeling


More CPL7 Advantages…

– Simplicity: easier to debug, and much easier to understand the time flow
– Easier to port: ported to IBM p6 (NCAR), Cray XT4/XT5 (NICS, ORNL, NERSC), BGP (Argonne), BGL (LLNL), and Linux clusters (NCAR, NERSC, CCSM4-alpha users)
– Easier to run: new xml-based scripts permit a user-friendly capability to create "out-of-box" experiments
– Performance (throughput and efficiency): much greater flexibility to achieve an optimal load balance for different choices of resolution, component combinations, and component physics
– Automatically generated timing tables provide users with immediate feedback on both performance and efficiency

Page 13: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

CCSM4 Provides a Seamless End-to-End Cycle of Model Development, Integration and Prediction with One Unified Model Code Base


Page 14: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

New Frontiers for CCSM

– Using the coupling infrastructure in novel ways: implementation of interactive ensembles
– Pushing the limits of high resolution: the capability to really exercise the scalability and performance of the system


Page 15: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

CCSM4 and PetaApps

– CCSM4/CPL7 is an integral piece of an NSF PetaApps award
– Funded 3-year effort aimed at advancing climate science capability for petascale systems
– NCAR, COLA, NERSC, U. Miami
– Interactive ensembles using CCSM4/CPL7 involve both computational and scientific challenges: used to understand how oceanic, sea-ice, and atmospheric noise impacts climate variability; can also scale out to tens of thousands of processors
– Also examines the use of PGAS languages in CCSM


Page 16: CCSM4 - A Flexible New Infrastructure for Earth System Modeling


Interactive Ensembles and CPL7

– All ensemble members run concurrently on non-overlapping processor sets
– Communication with the coupler takes place serially over the ensemble members
– Setting a new number of ensemble members requires editing 1 line of an xml file
– 35M CPU hours on TeraGrid [2nd largest allocation]
– Currently being used to perform ocean data assimilation (using DART) for POP2

[Diagram: two example layouts - multiple CAM instances running side by side on disjoint processor sets, coupled through the driver and CPL to single CLM, CICE, and POP instances; and the mirror case, with multiple POP instances coupled to single CAM, CLM, and CICE instances.]
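In the interactive-ensemble technique, the non-ensemble components typically see an ensemble average of the exchanged fields; combined with the serial communication over members described above, the coupler-side step can be sketched as a gather-then-average loop. Everything below is a toy illustration (ToyAtm, export_flux, and the field values are invented), not the CCSM4 coupler API:

```python
# Toy sketch: the coupler visits concurrently-running atmosphere ensemble
# members one at a time (serial communication), collects each member's
# surface flux field, and hands the ensemble mean to the single ocean.

def gather_fluxes(members):
    """Serially collect the surface flux field from each ensemble member."""
    return [m.export_flux() for m in members]

def ensemble_mean(fields):
    """Pointwise mean over ensemble members' fields."""
    n = len(fields)
    return [sum(vals) / n for vals in zip(*fields)]

class ToyAtm:
    def __init__(self, flux):
        self._flux = flux
    def export_flux(self):
        return self._flux

# Two members, two grid points each; the ocean sees the member mean.
members = [ToyAtm([10.0, 20.0]), ToyAtm([30.0, 40.0])]
print(ensemble_mean(gather_fluxes(members)))  # [20.0, 30.0]
```

Growing the ensemble only adds members to this loop, which is why changing the ensemble size reduces to editing one line of an xml file rather than restructuring the coupling code.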

Page 17: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

CCSM4 and Ultra High Resolution

– DOE/LLNL Grand Challenge Simulation: 0.25° atmosphere/land and 0.1° ocean/ice
– Multi-institutional collaboration (ANL, LANL, LLNL, NCAR, ORNL)
– First ever U.S. multi-decadal global climate simulation with an eddy-resolving ocean and a high resolution atmosphere
– 0.42 SYPD on 4048 cpus (Atlas LLNL cluster); 20 years completed; 100 GB per simulated month


Page 18: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

Ultra High Resolution (cont)

– NSF/PetaApps Control Simulation (IE baseline), carried out by John Dennis (CISL)
– 0.5° atmosphere/land and 0.1° ocean/ice
– Control run in production @ NICS (TeraGrid): 1.9 SYPD on 5848 quad-core XT5 cpus (4-5 months of continuous simulation); 155 years completed; 100 TB of data generated (0.5-1 TB per wall clock day); 18M CPU hours used
– Output transferred from NICS to NCAR (100-180 MB/sec sustained) and archived on HPSS
– Data analysis using 55 TB of project space at NCAR


Page 19: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

Next Steps at High Resolution

Future work:
– Use the OpenMP capability in all components effectively to take advantage of multi-core architectures (Cray XT5 hex-core and BG/P)
– Improve disk I/O performance [currently 10-25% of run time]
– Improve memory footprint scalability

Future simulations:
– 0.25° atm / 0.1° ocean
– T341 atm / 0.1° ocean (effect of the Eulerian dycore)
– 1/8° atm (HOMME) / 0.25° land / 0.1° ocean


Page 20: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

CCSM4 Scalability and Performance


Page 21: CCSM4 - A Flexible New Infrastructure for Earth System Modeling


New Parallel I/O Library (PIO)

– Interface between the model and the I/O library; supports binary, NetCDF3 (serial netcdf), Parallel NetCDF (pnetcdf, MPI-IO), and NetCDF4
– The user has enormous flexibility to choose what works best for their needs; can read one format and write another
– Rearranges data from the model decomposition to an I/O-friendly decomposition (the rearranger is framework independent); model tasks and I/O tasks can be independent
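The rearranger idea can be illustrated in miniature: field data scattered across many model tasks in an arbitrary decomposition is reassembled into contiguous chunks, one per I/O task, before writing. This toy sketch (dict-based, single-process) is only an illustration of the concept, not PIO's actual MPI implementation or API:

```python
# Toy sketch of a PIO-style rearrangement: from a scattered model
# decomposition to contiguous, I/O-friendly chunks on fewer I/O tasks.

def rearrange(model_data, global_size, n_io_tasks):
    """model_data: {task_id: {global_index: value}} in the model decomposition.
    Returns one contiguous slice of the global array per I/O task."""
    # Assemble the global view from the scattered per-task pieces.
    global_array = [None] * global_size
    for owned in model_data.values():
        for gidx, val in owned.items():
            global_array[gidx] = val
    # Hand each I/O task one contiguous chunk (ceil-divided).
    chunk = (global_size + n_io_tasks - 1) // n_io_tasks
    return [global_array[t * chunk:(t + 1) * chunk] for t in range(n_io_tasks)]

# 4 model tasks own interleaved indices; 2 I/O tasks get contiguous halves.
model_data = {0: {0: 'a', 4: 'e'}, 1: {1: 'b', 5: 'f'},
              2: {2: 'c', 6: 'g'}, 3: {3: 'd', 7: 'h'}}
print(rearrange(model_data, 8, 2))  # [['a','b','c','d'], ['e','f','g','h']]
```

Only the I/O tasks touch the file, so both the memory for I/O buffers and the file-system pressure scale with the (small) number of I/O tasks rather than the full model task count.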

Page 22: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

PIO in CCSM

– PIO is implemented in CAM, CICE, and POP
– Its use is critical for high resolution, high processor count simulations
– Serial I/O is one of the largest sources of global memory in CCSM; it will eventually always run out of memory
– Serial I/O results in a serious performance penalty at higher processor counts
– A performance benefit is noticed even with serial netcdf (model output decomposed onto the output I/O tasks)


Page 23: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

CPL Scalability

– Scales much better than the previous version, both in memory and in throughput
– Inherently involves a lot of communication relative to flops
– The new coupler has not been a bottleneck in any configuration we have tested so far; other issues, such as load balance and the scaling of other processes, have dominated
– Minor impact at 1800 cores (Kraken PetaApps control)


Page 24: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

CCSM4 Cray XT Scalability (courtesy of John Dennis)

[Chart: processor layout - POP on 4028 cores, CAM on 1664, CICE on 1800, CPL on 1800; 1.9 SYPD with I/O on 5844 cores of the quad-core Cray XT5 (Kraken).]

Page 25: CCSM4 - A Flexible New Infrastructure for Earth System Modeling


CAM/HOMME Dycore

– The cubed-sphere grid overcomes the dynamical core scalability problems inherent with a lat/lon grid
– Work of Mark Taylor (SciDAC), Jim Edwards (IBM), Brian Eaton (CSEG)
– The PIO library is used for all I/O (this work COULD NOT have been done without PIO)
– BGP (4 cores/node): excellent scalability down to 1 element per processor (86,200 processors at 0.25 degree resolution)
– JaguarPF (12 cores/node): 2-3x faster per core than BGP, but scaling is not as good; the 1/8 degree run loses scalability at 4 elements per processor

Page 26: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

CAM/HOMME Real Planet: 1/8° Simulations

– CCSM4 with the CAM4 physics configuration and cyclical year-2000 ocean forcing data sets: CAM-HOMME 1/8° on 86,400 cores; CLM2 on a lat/lon 1/4° grid, 512 cores; data ocean/ice, 1°, 512 cores; coupler, 8640 cores
– JaguarPF simulation with excellent scalability: 1/8 degree running at 3 SYPD on Jaguar
– Large-scale features agree well with the Eulerian and FV dycores
– The runs confirm that the scalability of the dynamical core is preserved by CAM, and the scalability of CAM is preserved by the CCSM real-planet configuration

Page 27: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

How Will CCSM4 Be Released?

– Leverages the Subversion revision control system
– Source code and input data are obtained from Subversion servers (not tar files)
– Output data of control runs comes from ESG
– Advantages: easier for CSEG to produce frequent updates; a flexible way for users to obtain new updates of the source code (and bug fixes); users can leverage Subversion to merge new updates into their "sandbox" with their own modifications


Page 28: CCSM4 - A Flexible New Infrastructure for Earth System Modeling


Obtaining the Code and Updates

[Workflow: from the public Subversion source code repository (https://svn-ccsm-release.cgd.ucar.edu), svn co obtains the ccsm4.0 code; you make your own modifications in your sandbox; svn merge obtains new code updates and bug fixes, which Subversion merges with your own changes.]

Page 29: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

Creating an Experimental Case

New CCSM4 scripts simplify:
– Porting CCSM4 to your machine
– Creating your experiment and obtaining the necessary input data for it
– Load balancing your experiment
– Debugging your experiment: if something goes wrong during the simulation (never happens, of course), it is simpler to determine what it is


Page 30: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

Porting to Your Machine

– The CCSM4 scripts contain a set of supported machines; users can run out of the box
– The CCSM4 scripts also support a set of "generic" machines (e.g. linux clusters with a variety of compilers); the user still needs to determine which generic machine most closely resembles their machine, and customize the Makefile macros for it
– User feedback will be leveraged to continuously upgrade the generic machine capability post-release


Page 31: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

Obtaining Input Data

– Input data is now in a Subversion repository
– The entire input data set is about 900 GB and growing
– The CCSM4 scripts permit the user to automatically obtain only the input data needed for a given experimental configuration


Page 32: CCSM4 - A Flexible New Infrastructure for Earth System Modeling


Accessing Input Data for Your Experiment

[Workflow: set up the experiment with create_newcase (component set, resolution, machine); determine the local root directory where all input data will go (DIN_LOC_ROOT); use check_input_data to see if the required datasets are present in DIN_LOC_ROOT; use check_input_data -export to automatically obtain ONLY the required datasets for the experiment from the public Subversion input data repository (https://svn-ccsm-inputdata.cgd.ucar.edu) into DIN_LOC_ROOT; load balance your experimental configuration (use the timing files); run the experiment.]

Page 33: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

Load Balancing Your Experiment

– A load balancing exercise must be done before starting an experiment
– Repeat short experiments (20 days) without I/O, adjusting the processor layout to optimize throughput and minimize idle time (maximize efficiency)
– Detailed timing results are produced with each run
– This makes the load balancing exercise much simpler than in CCSM3
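The arithmetic behind the exercise can be sketched with a simple cost model: in a layout where the ocean runs concurrently with the other components, the wall-clock cost per simulated day is roughly the maximum of the sequential side (atm + lnd + ice + cpl) and the ocean, and idle time appears on whichever side finishes first. The numbers and the cost model below are illustrative, not CCSM timing output:

```python
# Toy load-balancing arithmetic: estimate throughput (simulated years per
# wall-clock day, SYPD) from per-component seconds per simulated day.

def step_seconds(times, concurrent=('OCN',)):
    """Cost per simulated day when `concurrent` components overlap the rest."""
    seq = sum(t for c, t in times.items() if c not in concurrent)
    con = max((times[c] for c in concurrent), default=0.0)
    return max(seq, con)

def sypd(seconds_per_simulated_day):
    """Simulated years per wall-clock day (365-day years)."""
    return 86400.0 / (seconds_per_simulated_day * 365.0)

# Illustrative timings (seconds per simulated day) for one candidate layout.
times = {'ATM': 30.0, 'LND': 5.0, 'ICE': 10.0, 'CPL': 5.0, 'OCN': 40.0}
s = step_seconds(times)       # 50.0: the sequential side dominates,
                              # so the ocean's processors idle 10 s/day
print(round(sypd(s), 2))      # 4.73
```

Shifting cores between the two sides until their costs match is exactly the "minimize idle time" adjustment the automatically generated timing tables support.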


Page 34: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

Load Balancing CCSM Example

[Diagram: in an initial 3136-core layout (CAM on 1664 cores), CAM, CLM, and CICE run concurrently with POP, leaving idle time on part of the machine and giving 1.53 SYPD; increasing the core count for POP (4028 cores total) reduces the idle time and raises throughput to 2.23 SYPD.]

Page 35: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

CCSM4 Releases and Timelines

– January 15, 2010: CCSM4.0 alpha release, to a subset of users and vendors, with minimal documentation (except for the scripts User's Guide)
– April 1, 2010: CCSM4.0 release, with full documentation, including a User's Guide, Model Reference Documents, and experimental data
– June 1, 2010: CESM1.0 release - ocean ecosystem, CAM-AP, interactive chemistry, WACCM
– A new CCSM output data web design is underway (including comprehensive diagnostics)


Page 36: CCSM4 - A Flexible New Infrastructure for Earth System Modeling


CCSM4.0 Alpha Release

– An extensive CCSM4 User's Guide is already in place
– Apply for alpha user access at www.ccsm.ucar.edu/models/ccsm4.0

Page 37: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

Upcoming Challenges

This year:
– Carry out IPCC simulations
– Release CCSM4 and CESM1, plus updates
– Resolve performance and memory issues with the ultra-high resolution configuration on Cray XT5 and BG/P
– Create a user-friendly validation process for porting to new machines

On the horizon:
– Support regional grids
– Nested regional modeling in CPL7
– Migration to, and optimization for, GPUs


Page 38: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

6/23/09

Big Interdisciplinary Team!

Contributors: D. Bader (ORNL), D. Bailey (NCAR), C. Bitz (U Washington), F. Bryan (NCAR), T. Craig (NCAR), A. St. Cyr (NCAR), J. Dennis (NCAR), B. Eaton (NCAR), J. Edwards (IBM), B. Fox-Kemper (MIT, CU), N. Hearn (NCAR), E. Hunke (LANL), B. Kauffman (NCAR), E. Kluzek (NCAR), B. Kadlec (CU), D. Ivanova (LLNL), E. Jedlicka (ANL), E. Jessup (CU), R. Jacob (ANL), P. Jones (LANL), J. Kinter (COLA), A. Mai (NCAR)

Funding:
– DOE-BER CCPP Program Grants DE-FC03-97ER62402, DE-PS02-07ER07-06, DE-FC02-07ER64340, B&R KP1206000
– DOE-ASCR B&R KJ0101030
– NSF Cooperative Grant NSF01
– NSF PetaApps Award

Computer Time:
– Blue Gene/L time: NSF MRI Grant, NCAR, University of Colorado, IBM (SUR) program, BGW Consortium Days, IBM Research (Watson), LLNL, Stony Brook & BNL
– Cray XT time: NICS/ORNL, NERSC, Sandia

S. Mishra (NCAR), S. Peacock (NCAR), K. Lindsay (NCAR), W. Lipscomb (LANL), R. Loft (NCAR), R. Loy (ANL), J. Michalakes (NCAR), A. Mirin (LLNL), M. Maltrud (LANL), J. McClean (LLNL), R. Nair (NCAR), M. Norman (NCSU), N. Norton (NCAR), T. Qian (NCAR), M. Rothstein (NCAR), C. Stan (COLA), M. Taylor (SNL), H. Tufo (NCAR), M. Vertenstein (NCAR), J. Wolfe (NCAR), P. Worley (ORNL), M. Zhang (SUNYSB)


Page 39: CCSM4 - A Flexible New Infrastructure for Earth System Modeling

Thanks! Questions?

CCSM4.0 alpha release page at www.ccsm.ucar.edu/models/ccsm4.0
