Parallel IO in the Community Earth System Model
Parallel IO in CESM Jim [email protected]
NCAR, P.O. Box 3000, Boulder CO, 80307-3000 USA
Workshop on Scalable IO in Climate Models 27/02/2012
Parallel IO in the Community Earth System Model
Jim Edwards (NCAR), John Dennis (NCAR), Ray Loy (ANL), Pat Worley (ORNL)
• Some CESM 1.1 capabilities:
  – Ensemble configurations with multiple instances of each component
  – Highly scalable capability, proven at 100K+ tasks
  – Regionally refined grids
  – Data assimilation with DART
Prior to PIO
• Each model component was independent, with its own IO interface
• Mix of file formats:
  – NetCDF
  – Binary (POSIX)
  – Binary (Fortran)
• Gather-scatter method to interface with serial IO
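The gather-scatter pattern can be sketched as follows. This is an illustrative serial simulation in Python (not CESM code, and no real MPI); it shows why the root task's memory footprint grows with the global problem size, which is the main limitation PIO set out to remove.

```python
# Illustrative sketch of the pre-PIO gather-to-root pattern: each "task"
# owns a slice of a distributed array; the root gathers the whole array
# into one buffer before doing serial IO, so the root's memory footprint
# scales with the *global* size, not the local size.

def gather_to_root(local_arrays):
    """Simulate MPI_Gather: concatenate every task's local data on task 0."""
    global_buffer = []
    for chunk in local_arrays:       # in MPI this is a collective call
        global_buffer.extend(chunk)
    return global_buffer             # only the root holds this in real code

# 4 tasks, each owning 3 values of a 12-element global array
local = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
gathered = gather_to_root(local)
assert gathered == list(range(12))
assert len(gathered) == sum(len(c) for c in local)  # root memory = global size
```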
Steps toward PIO
• Converge on a single file format: NetCDF selected
  – Self-describing
  – Lossless, with lossy capability (NetCDF4 only)
  – Works with the current postprocessing tool chain
• Extension to parallel IO
• Reduce the single-task memory profile
• Maintain a single, decomposition-independent file format
• Performance (a secondary issue)
• Parallel IO from all compute tasks is not the best strategy:
  – Data rearrangement is complicated, leading to numerous small and inefficient IO operations
  – MPI-IO aggregation alone cannot overcome this problem
Parallel I/O library (PIO)
• Goals:
  – Reduce per-MPI-task memory usage
  – Easy to use
  – Improve performance
• Write/read a single file from a parallel application
• Multiple backend libraries: MPI-IO, NetCDF3, NetCDF4, pNetCDF, NetCDF+VDC
• Meta-IO library: potential interface to other general libraries
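The "meta-IO" idea can be sketched as below. This is an illustrative Python sketch, not the actual PIO interface (which is Fortran); the backend names follow the slide, but the write functions here are hypothetical stubs.

```python
# Illustrative sketch of a meta-IO layer: one generic write call
# dispatched to whichever backend the file was opened with.
# The backend functions are stubs, not real library calls.

def write_netcdf3(path, name, data):
    return f"netcdf3: wrote {name} ({len(data)} values) to {path}"

def write_pnetcdf(path, name, data):
    return f"pnetcdf: wrote {name} ({len(data)} values) to {path}"

BACKENDS = {"netcdf3": write_netcdf3, "pnetcdf": write_pnetcdf}

def write_darray(backend, path, name, data):
    """Generic entry point: the caller says *what* to write,
    the chosen backend decides *how*."""
    return BACKENDS[backend](path, name, data)

msg = write_darray("pnetcdf", "ocean.nc", "T", [1.0, 2.0, 3.0])
assert msg == "pnetcdf: wrote T (3 values) to ocean.nc"
```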
(Architecture diagram: all model components perform IO through PIO, which sits on multiple backends.)
CPL7 COUPLER
CAM ATMOSPHERIC MODEL
CLM LAND MODEL
POP2 OCEAN MODEL
CICE SEA ICE MODEL
CISM LAND ICE MODEL
  -> PIO ->
netcdf3, pnetcdf, netcdf4 (HDF5), VDC, MPI-IO
PIO design principles
• Separation of concerns
• Separate computational and I/O decompositions
• Flexible user-level rearrangement
• Encapsulate expert knowledge
Separation of concerns
• What versus how
  – Concern of the user: what to write/read to/from disk? e.g., "I want to write T, V, PS."
  – Concern of the library developer: how to efficiently access the disk? e.g., "How do I construct I/O operations so that write bandwidth is maximized?"
• Improves ease of use
• Improves robustness
• Enables better reuse
Separate computational and I/O decompositions
• Computational decomposition
• I/O decomposition
• Rearrangement between the computational and I/O decompositions
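The rearrangement step can be sketched as follows. This is an illustrative Python sketch with a hypothetical layout, not PIO's actual rearranger: the computational decomposition here assigns elements round-robin across tasks, while the I/O decomposition wants contiguous slices so each IO op is one large write; the rearrangement is the mapping between the two.

```python
# Illustrative sketch: map a cyclic computational decomposition onto a
# contiguous I/O decomposition, and derive the communication plan
# (which task sends which element to which IO task).

def compute_decomp(nglobal, ntasks):
    """Round-robin (cyclic) ownership, as a task -> indices map."""
    return {t: [i for i in range(nglobal) if i % ntasks == t]
            for t in range(ntasks)}

def io_decomp(nglobal, ntasks):
    """Contiguous block ownership, so each task writes one big slice."""
    block = nglobal // ntasks
    return {t: list(range(t * block, (t + 1) * block)) for t in range(ntasks)}

def rearrange(comp, io):
    """For each IO task, list the (owner_task, index) pairs it must receive."""
    owner = {i: t for t, idxs in comp.items() for i in idxs}
    return {t: [(owner[i], i) for i in idxs] for t, idxs in io.items()}

comp = compute_decomp(8, 4)
io = io_decomp(8, 4)
plan = rearrange(comp, io)
assert comp[0] == [0, 4]            # cyclic: task 0 owns scattered indices
assert io[0] == [0, 1]              # contiguous: IO task 0 owns one slice
assert plan[0] == [(0, 0), (1, 1)]  # received from compute tasks 0 and 1
```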
Flexible user-level rearrangement
• A single technical solution is not suitable for the entire user community:
  – User A: Linux cluster, 32-core job, 200 MB files, NFS file system
  – User B: Cray XE6, 115,000-core job, 100 GB files, Lustre file system
• Different compute environments require different technical solutions!
Writing distributed data (I)
(Diagram: computational decomposition -> rearrangement -> I/O decomposition on a single task)
+ Maximize size of individual IO ops to disk
- Non-scalable user-space buffering
- Very large fan-in -> large MPI buffer allocations
Correct solution for User A
Writing distributed data (II)
(Diagram: computational decomposition -> rearrangement -> I/O decomposition)
+ Scalable user-space memory
+ Relatively large individual IO ops to disk
- Very large fan-in -> large MPI buffer allocations
Writing distributed data (III)
(Diagram: computational decomposition -> rearrangement -> I/O decomposition over a subset of tasks)
+ Scalable user-space memory
+ Smaller fan-in -> modest MPI buffer allocations
- Smaller individual IO ops to disk
Correct solution for User B
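The trade-off between strategies I-III is essentially the fan-in each IO task must absorb. The sketch below is illustrative, with hypothetical task counts (a 1024-task job is assumed for the example, not taken from the slides):

```python
# Illustrative fan-in comparison for the three writing strategies:
#   (I)   rearrange everything to 1 task        -> maximal fan-in
#   (II)  all tasks issue IO, MPI-IO aggregates -> fan-in pushed into MPI-IO
#   (III) rearrange to a modest subset of tasks -> modest fan-in per IO task

def fan_in(ncompute, nio):
    """Number of senders per IO task when ncompute tasks funnel into nio."""
    return ncompute // nio

ntasks = 1024                         # hypothetical job size
assert fan_in(ntasks, 1) == 1024      # strategy I: single writer
assert fan_in(ntasks, 16) == 64       # strategy III: 16 IO tasks
assert fan_in(ntasks, ntasks) == 1    # strategy II: every task writes
```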
Encapsulate expert knowledge
• Flow-control algorithm
• Match the size of I/O operations to the stripe size
  – Cray XT5/XE6 + Lustre file system
  – Minimize message-passing traffic at the MPI-IO layer
• Load-balance disk traffic over all I/O nodes
  – IBM Blue Gene/{L,P} + GPFS file system
  – Utilizes Blue Gene-specific topology information
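Matching I/O operation size to the stripe size can be sketched as below. This is an illustrative Python sketch, not PIO code; the 1 MiB stripe size is a made-up example value, not one taken from the slides.

```python
# Illustrative sketch of "match the size of I/O operations to the stripe
# size": split a write into stripe-aligned chunks so that each operation
# is confined to a single stripe (and hence a single Lustre OST).

STRIPE = 1 << 20  # 1 MiB, hypothetical stripe size

def stripe_aligned_ops(offset, nbytes, stripe=STRIPE):
    """Yield (offset, length) ops, each staying inside one stripe."""
    ops = []
    while nbytes > 0:
        in_stripe = stripe - (offset % stripe)  # room left in this stripe
        length = min(nbytes, in_stripe)
        ops.append((offset, length))
        offset += length
        nbytes -= length
    return ops

ops = stripe_aligned_ops(offset=500_000, nbytes=3_000_000)
assert sum(length for _, length in ops) == 3_000_000
# every op stays inside a single stripe boundary
assert all(off // STRIPE == (off + length - 1) // STRIPE
           for off, length in ops)
```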
Experimental setup
• Did we achieve our design goals?
• Impact of PIO features:
  – Flow control
  – Varying the number of IO tasks
  – Different general I/O backends
• Read/write a 3D POP-sized variable [3600x2400x40]
• 10 files, 10 variables per file [max bandwidth]
• Using Kraken (Cray XT5) + Lustre file system
  – Used 16 of 336 OSTs
3D POP arrays [3600x2400x40] (bandwidth result charts; figures not included in transcript)
PIOVDC: parallel output to a VAPOR Data Collection (VDC)
• VDC:
  – A wavelet-based, gridded data format supporting both progressive access and efficient data subsetting
• Data may be progressively accessed (read back) at different levels of detail, permitting the application to trade off speed and accuracy
  – Think Google Earth: less detail when the viewer is far away, progressively more detail as the viewer zooms in
  – Enables rapid (interactive) exploration and hypothesis testing that can subsequently be validated with full-fidelity data as needed
• Subsetting
  – Arrays are decomposed into smaller blocks that significantly improve extraction of arbitrarily oriented sub-arrays
• Wavelet transform
  – Similar to Fourier transforms
  – Computationally efficient: O(n)
  – Basis for many multimedia compression technologies (e.g., MPEG-4, JPEG 2000)
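The wavelet ideas above (an O(n) transform, coefficient prioritization for compression, and coarse progressive reconstruction) can be sketched with a one-level Haar transform. This is an illustrative Python sketch, not VDC's actual codec, and the data values are made up:

```python
# Illustrative one-level Haar wavelet: a single O(n) pass turns the
# signal into pairwise averages + details. Dropping small detail
# coefficients ("coefficient prioritization") compresses; keeping only
# the averages gives a coarse, progressive-access approximation.

def haar_forward(x):
    """One O(n) pass: pairwise averages and differences."""
    avg = [(a + b) / 2 for a, b in zip(x[0::2], x[1::2])]
    det = [(a - b) / 2 for a, b in zip(x[0::2], x[1::2])]
    return avg, det

def haar_inverse(avg, det):
    """Exact inverse when all coefficients are kept."""
    out = []
    for a, d in zip(avg, det):
        out.extend([a + d, a - d])
    return out

x = [4.0, 4.0, 8.0, 10.0, 2.0, 2.0, 6.0, 6.0]
avg, det = haar_forward(x)
assert haar_inverse(avg, det) == x    # lossless when every coeff is kept
# prioritize: zero out detail coefficients below a threshold
pruned = [d if abs(d) >= 2.0 else 0.0 for d in det]
approx = haar_inverse(avg, pruned)
assert approx != x and len(approx) == len(x)  # lossy, full-size reconstruction
```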
Other PIO users
• Earth System Modeling Framework (ESMF)
• Model for Prediction Across Scales (MPAS)
• Geophysical High Order Suite for Turbulence (GHOST)
• Data Assimilation Research Testbed (DART)
Write performance on BG/L (Penn State University, April 26, 2010; chart not included in transcript)
Read performance on BG/L (Penn State University, April 26, 2010; chart not included in transcript)
100:1 compression with coefficient prioritization
1024^3 Taylor-Green turbulence (enstrophy field) [P. Mininni, 2006]
Left: no compression. Right: coefficient prioritization (VDC2)
4096^3 homogeneous turbulence simulation: volume rendering of the original enstrophy field and the 800:1 compressed field
Data provided by P.K. Yeung (Georgia Tech) and Diego Donzis (Texas A&M)
Original: 275 GB/field; 800:1 compressed: 0.34 GB/field
F90 code generation
The genf90.pl template:

interface PIO_write_darray
! TYPE real,int
! DIMS 1,2,3
  module procedure write_darray_{DIMS}d_{TYPE}
end interface
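The expansion genf90.pl performs can be sketched in a few lines. This is an illustrative re-implementation in Python (the real tool is a Perl script): it expands the {DIMS} and {TYPE} placeholders over the values declared in the ! TYPE / ! DIMS directive comments.

```python
# Illustrative sketch of genf90-style template expansion: one template
# line becomes a specific module procedure per (type, dims) combination.

TEMPLATE = "module procedure write_darray_{DIMS}d_{TYPE}"
TYPES = ["real", "int"]   # from the "! TYPE real,int" directive
DIMS = [1, 2, 3]          # from the "! DIMS 1,2,3" directive

def expand(template, types, dims):
    """Produce one specific procedure line per (type, dims) combination."""
    return [template.replace("{DIMS}", str(d)).replace("{TYPE}", t)
            for t in types for d in dims]

lines = expand(TEMPLATE, TYPES, DIMS)
assert len(lines) == 6
assert lines[0] == "module procedure write_darray_1d_real"
assert "module procedure write_darray_3d_int" in lines
```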
# 1 "tmp.F90.in"
interface PIO_write_darray
  module procedure dosomething_1d_real
  module procedure dosomething_2d_real
  module procedure dosomething_3d_real
  module procedure dosomething_1d_int
  module procedure dosomething_2d_int
  module procedure dosomething_3d_int
end interface
• PIO is open source: http://code.google.com/p/parallelio/
• Documentation using doxygen: http://web.ncar.teragrid.org/~dennis/pio_doc/html/
• Thank you
Existing I/O libraries
• netCDF3
  – Serial
  – Easy to implement
  – Limited flexibility
• HDF5
  – Serial and parallel
  – Very flexible
  – Difficult to implement
  – Difficult to achieve good performance
• netCDF4
  – Serial and parallel
  – Based on HDF5
  – Easy to implement
  – Limited flexibility
  – Difficult to achieve good performance
Existing I/O libraries (cont'd)
• Parallel-netCDF
  – Parallel
  – Easy to implement
  – Limited flexibility
  – Difficult to achieve good performance
• MPI-IO
  – Parallel
  – Very difficult to implement
  – Very flexible
  – Difficult to achieve good performance
• ADIOS
  – Serial and parallel
  – Easy to implement
  – BP file format: easy to achieve good performance
  – All other file formats: difficult to achieve good performance