
NetCDF for High Performance Computing

Introduction to NetCDF: What is netCDF?

NetCDF Data Models: How we think of data.

NetCDF Software Libraries: Using netCDF APIs.

NetCDF and Performance: Getting the most out of netCDF.

Secrets of NetCDF: Stuff known only to the netCDF insiders!

Introduction to NetCDF

• NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.

• NetCDF now supports four binary formats and APIs in many programming languages.

• There is a large body of existing netCDF software, many netCDF programmers, and lots and lots of netCDF data.


Data Models

[Diagram: variables, each with attached attributes, organized into nested groups.]

A netCDF-4 file can organize variables, dimensions, and attributes in groups, which can be nested.

Commitment to Backward Compatibility

Because preserving access to archived data for future generations is sacrosanct:

• NetCDF-4 provides both read and write access to all earlier forms of netCDF data.

• Existing C, Fortran, and Java netCDF programs will continue to work after recompiling and relinking.

• Future versions of netCDF will continue to support both data access compatibility and API compatibility.

Conventions

• The NetCDF User's Guide recommends some conventions (ex. "units" and "Conventions" attributes).

• Conventions are published agreements about how data of a particular type should be represented, to foster interoperability.

• Most conventions use attributes.

• Use of an existing convention is highly recommended. Use the CF Conventions, if applicable.

• A netCDF file should use the global "Conventions" attribute to identify which conventions it uses (see the sketch below).
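As an illustration, the "Conventions" attribute is written with an ordinary attribute call. This is a minimal C sketch, not from the slides; the file name and the "CF-1.3" string are placeholders chosen to match the CF version mentioned later in this talk.

#include <string.h>
#include <netcdf.h>

/* Sketch: create a file and tag it with a global "Conventions" attribute. */
int main(void)
{
    int ncid;
    const char *conv = "CF-1.3";

    if (nc_create("conventions_example.nc", NC_CLOBBER, &ncid)) return 1;
    if (nc_put_att_text(ncid, NC_GLOBAL, "Conventions", strlen(conv), conv))
        return 1;
    if (nc_close(ncid)) return 1;
    return 0;
}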


The Climate and Forecast Conventions

• The CF Conventions are becoming a widely used standard for atmospheric, ocean, and climate data.

• The NetCDF Climate and Forecast (CF) Metadata Conventions, Version 1.3, describes consensus representations for climate and forecast data using the netCDF-3 data model.

• http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.3/cf-conventions.html


HDF5/NetCDF-4 Interoperability

• NetCDF-4 can interoperate with HDF5 with a SUBSET of HDF5 features.

• Will not work with HDF5 files that have looping groups, references, and types not found in netCDF-4.

• The HDF5 file must use the new dimension scale API to store shared dimension info.

• If an HDF5 file follows the Common Data Model, netCDF-4 can interoperate on the same files.

• HDF5 files created with netCDF-4 look like normal HDF5 files, and use dimension scales.


Tools for NetCDF: ncdump and ncgen

• These two tools come with the netCDF distribution, and are supported by the netCDF programming team.

• ncdump converts a netCDF data file to human-readable text form.

• ncgen takes the text form (CDL) and creates a netCDF data file.

• Both support netCDF-4 classic model files; ncgen does not yet support the enhanced model (support is under development).


Additional NetCDF Tools

• Many graphics tools support netCDF, including GrADS, IDV, and NCL.

• Command-line tools include the netCDF Operators (NCO).

• Tools can quite easily upgrade to handle netCDF-4/HDF5 classic model files.

• Tools have to be substantially enhanced to handle the enhanced model.

• A partial list of tools can be found on the netCDF web page.


Documentation for NetCDF

• Extensive documentation can be found on-line:
  • NetCDF Users Guide
  • NetCDF Installation and Porting Guide
  • NetCDF Tutorial
  • API References for Java, C, F77, F90, and CXX
  • FAQ
  • One or Two Day NetCDF Workshop

• Example programs in C, F77, F90, C++, Java, Python, MATLAB, Perl, IDL

• Example datasets illustrating CF conventions

Documentation


Support for NetCDF

• There is no dedicated support team for netCDF. All netCDF support is handled by netCDF programmers (there are three of us for C/Fortran).

• There are many support resources on-line, including a database of all support questions and responses.

• Most support questions are related to building netCDF. Before sending a support request, see the “Build Troubleshooter” section of the home page.


Email Support

Send bug reports to: [email protected]

Your support email will enter a support tracking system which will ensure that it does not get lost.

But it may take us a while to solve your problem...


NetCDF Data Models

• The Classic Data Model
  • UML and Text Descriptions
  • Introducing the Common Data Language (CDL)
  • Dimensions
  • Variables
  • Coordinate Variables
  • Attributes

• NetCDF-4 Features That Don't Use the Enhanced Model

• The Enhanced Data Model
  • UML and Text Descriptions
  • HDF5 Features Now Available in NetCDF

The Classic NetCDF Data Model


UML Classic NetCDF Data Model


Common Data Language of Simple Classic Model File

netcdf example {   // example of CDL notation
dimensions:
    x = 3 ;
    y = 8 ;
variables:
    float rh(x, y) ;
        rh:units = "percent" ;
        rh:long_name = "relative humidity" ;

// global attributes
    :title = "simple example, lacks some conventions" ;
data:
    rh = 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37,
         41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89 ;
}

CDL Details

• This example has only one variable, but multiple variables may be included in a netCDF file.

• You can use the ncdump utility to get the CDL form of a binary netCDF file (more on this later).

• You can use the ncgen utility to generate a binary netCDF file from CDL (more on this later).

• This simple example neglects recommended best practices for netCDF data.

• NcML is an XML-based notation similar to CDL for netCDF data.

Dimensions

• Dimensions may be shared among variables, indicating a common grid.

• Dimensions may be associated with coordinate variables to identify coordinate axes.

• In the classic netCDF data model, at most one dimension can have the unlimited length, which means variables can grow along that dimension (see the sketch below).

• In the enhanced data model, multiple dimensions can have the unlimited length.
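A record dimension is declared by giving it the length NC_UNLIMITED. This is a minimal C sketch; the file and dimension names are illustrative, not from the slides.

#include <netcdf.h>

/* Sketch: one fixed dimension and one unlimited (record) dimension.
   In the classic model, only one dimension per file may be unlimited. */
int main(void)
{
    int ncid, lat_dimid, time_dimid;

    if (nc_create("dims_example.nc", NC_CLOBBER, &ncid)) return 1;
    if (nc_def_dim(ncid, "lat", 180, &lat_dimid)) return 1;
    if (nc_def_dim(ncid, "time", NC_UNLIMITED, &time_dimid)) return 1;
    if (nc_close(ncid)) return 1;
    return 0;
}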


Variables in the Classic Model

• In the classic data model, the type of a variable is the external type of its data as represented on disk, one of: char, byte, short, int, float, double.

• The shape of a non-scalar variable is specified with a list of dimensions.

• A variable may have attributes to specify properties such as units.

• A variable is equivalent to an HDF5 dataset.

A Strong Convention: Coordinate Variables

• A variable with the same name as a dimension is called a coordinate variable.

• Examples: lat, lon, level, and time.

• The notion of coordinate variables has been generalized to multidimensional coordinate axes in the netCDF Java library and the Common Data Model it supports.

• Closest HDF5 equivalent: dimension scales.

Coordinate Variable CDL Example

netcdf elev1 {
dimensions:
    lat = 180 ;
    lon = 360 ;
variables:
    float lat(lat) ;
        lat:standard_name = "latitude" ;
        lat:units = "degrees_north" ;
    float lon(lon) ;
        lon:standard_name = "longitude" ;
        lon:units = "degrees_east" ;
    short elev(lat, lon) ;
        elev:standard_name = "height" ;
        elev:missing_value = 32767s ;
        elev:units = "meter" ;
}

Attributes in the Classic Model

• Like variables, the type of an attribute may be one of char, byte, short, int, float, or double.

• Attributes are scalar or 1D.

• Global attributes apply to a whole file. Variable attributes apply to a specific variable.

• NetCDF conventions are defined primarily in terms of attributes.

• Attributes cannot have attributes.

More Realistic Example

netcdf co2 {
dimensions:
    T = 456 ;
variables:
    float T(T) ;
        T:units = "months since 1960-01-01" ;
    float co2(T) ;
        co2:long_name = "CO2 concentration by volume" ;
        co2:units = "1.0e-6" ;
        co2:_FillValue = -99.99f ;

// global attributes
    :references = "Keeling_etal1996 Keeling_etal1989 Keeling_etal1995" ;
}

Advantages of the Classic Model

• Files that follow the classic model will be compatible with existing netCDF software.

• The classic model is simple but powerful.

• We advise that you use the classic model wherever possible, for maximum interoperability.

Not All NetCDF-4 Features Require the Enhanced Model!

• Many features of the HDF5 layer can be used without using the enhanced data model (see the sketch below):
  • parallel I/O
  • zlib compression/decompression
  • endianness control
  • chunking
  • expanded size limits
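For example, a file created with both NC_NETCDF4 and NC_CLASSIC_MODEL stays within the classic data model while still allowing HDF5-layer features such as endianness control. A hedged C sketch follows; the file, dimension, and variable names are invented for illustration.

#include <netcdf.h>

/* Sketch: netCDF-4/HDF5 storage, classic data model, explicit endianness. */
int main(void)
{
    int ncid, dimid, varid;

    if (nc_create("classic4.nc", NC_NETCDF4 | NC_CLASSIC_MODEL | NC_CLOBBER,
                  &ncid))
        return 1;
    if (nc_def_dim(ncid, "x", 100, &dimid)) return 1;
    if (nc_def_var(ncid, "data", NC_INT, 1, &dimid, &varid)) return 1;

    /* HDF5-layer feature: force little-endian storage for this variable. */
    if (nc_def_var_endian(ncid, varid, NC_ENDIAN_LITTLE)) return 1;

    if (nc_close(ncid)) return 1;
    return 0;
}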


NetCDF-4 Enhanced Model

• The classic model is expanded to include groups, new types (including user-defined types), multiple unlimited dimensions.

• Using the enhanced model requires rewriting the reading programs.


NetCDF-4 Features from HDF5

• The enhanced netCDF-4 data model includes new types (see the sketch below):
  • String type
  • Unsigned ints and 64-bit ints
  • Compound type
  • VLEN type
  • Enum type
  • Opaque type
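Under the enhanced model the new atomic types are used just like the classic ones. A minimal C sketch using the 64-bit unsigned and string types; all names here are invented for illustration.

#include <netcdf.h>

/* Sketch: enhanced-model atomic types in a netCDF-4 file. */
int main(void)
{
    int ncid, dimid, count_varid, name_varid;

    if (nc_create("types_example.nc", NC_NETCDF4 | NC_CLOBBER, &ncid)) return 1;
    if (nc_def_dim(ncid, "station", 10, &dimid)) return 1;

    /* 64-bit unsigned integers and variable-length strings are new types. */
    if (nc_def_var(ncid, "obs_count", NC_UINT64, 1, &dimid, &count_varid))
        return 1;
    if (nc_def_var(ncid, "station_name", NC_STRING, 1, &dimid, &name_varid))
        return 1;

    if (nc_close(ncid)) return 1;
    return 0;
}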


More NetCDF-4 Enhanced Model Features

• Multiple unlimited dimensions are allowed (and in any order).

• Groups are supported.

• Dimensions are visible to any sub-groups. (See the sketch below.)
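A short C sketch of these two features, with invented names; dimensions defined in the root group are visible inside the sub-group.

#include <netcdf.h>

/* Sketch: two unlimited dimensions plus a nested group (enhanced model only). */
int main(void)
{
    int ncid, grpid, time_dimid, obs_dimid, varid;
    int dimids[2];

    if (nc_create("groups_example.nc", NC_NETCDF4 | NC_CLOBBER, &ncid)) return 1;

    /* More than one unlimited dimension is allowed in netCDF-4. */
    if (nc_def_dim(ncid, "time", NC_UNLIMITED, &time_dimid)) return 1;
    if (nc_def_dim(ncid, "obs", NC_UNLIMITED, &obs_dimid)) return 1;

    /* Groups hold their own variables; parent dimensions are visible here. */
    if (nc_def_grp(ncid, "model_output", &grpid)) return 1;
    dimids[0] = time_dimid;
    dimids[1] = obs_dimid;
    if (nc_def_var(grpid, "temperature", NC_FLOAT, 2, dimids, &varid)) return 1;

    if (nc_close(ncid)) return 1;
    return 0;
}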


No enddefs and redefs Needed

• In the classic netCDF model, enddef is needed to end define mode, and redef to re-enter define mode.

• With netCDF-4 files, this is not necessary. These functions will be called automatically, as needed.

• If you use the NC_CLASSIC_MODEL flag when creating the file, you must explicitly call enddef and redef.


NetCDF Software Libraries

• The netCDF core library is written in C and Java.

• Fortran 77 is "faked" when netCDF is built: the Fortran 77 API actually calls C functions.

• The C++ API also calls the C API; a new C++ API is under development to support netCDF-4 more fully.

NetCDF C/Fortran/CXX Architecture

[Diagram: user applications call the F77, F90, and CXX APIs, which call the C library. The netCDF-4 C layer sits alongside the netCDF-3 C layer and writes netCDF classic, 64-bit offset, and netCDF-4/HDF5 formats, using HDF5 1.8.0 and zlib underneath.]

C API: Simple Example

/* Define a 2D integer variable and write it (error checking omitted). */
nc_create(FILE_NAME, NC_CLOBBER, &ncid);
nc_def_dim(ncid, "x", NX, &x_dimid);
nc_def_dim(ncid, "y", NY, &y_dimid);
dimids[0] = x_dimid;
dimids[1] = y_dimid;
nc_def_var(ncid, "data", NC_INT, NDIMS, dimids, &varid);
nc_enddef(ncid);
nc_put_var_int(ncid, varid, &data_out[0][0]);
nc_close(ncid);

F90 API: Simple Example

call check( nf90_create(FILE_NAME, NF90_CLOBBER, ncid) )
call check( nf90_def_dim(ncid, "x", NX, x_dimid) )
call check( nf90_def_dim(ncid, "y", NY, y_dimid) )

dimids = (/ y_dimid, x_dimid /)

call check( nf90_def_var(ncid, "data", NF90_INT, dimids, varid) )
call check( nf90_enddef(ncid) )
call check( nf90_put_var(ncid, varid, data_out) )
call check( nf90_close(ncid) )

C++ and NetCDF-4

• Existing C++ API works with netCDF-4 classic model files.

• The existing API was written before many features of C++ became standard, and thus needed updating.

• A new C++ API has been partially developed by Shanna-Shaye Forbes, a Unidata student employee.

• You can build the new API (which is not complete!) with --enable-cxx4.


NetCDF and Performance

• Parallel I/O with parallel-netcdf
• Parallel I/O with netCDF-4.0
• Chunking
• Cache parameters

Parallel I/O Introduction

• Parallel I/O allows many processes to read/write netCDF data at the same time.

• Used properly, parallel I/O allows users to overcome I/O bottlenecks in high performance computing environments.

• Users must choose between parallel-netcdf and netCDF-4 for parallel access. The APIs are not compatible.

Parallel I/O Choices for NetCDF Users

• Read-only data access with the sequential netCDF library: "fseek parallelism." Is this used? Is it safe?

• The parallel-netcdf package from Argonne/Northwestern.

• netCDF-4 with netCDF-4/HDF5 files.

• HDF5 with netCDF-4/HDF5 files.

parallel-netcdf from Argonne/Northwestern

• parallel-netcdf (a.k.a. pnetcdf) is a package based on the netCDF C library, with parallel I/O modifications from programmers at Argonne National Laboratory, Northwestern University, and the community.

• Reads and writes classic and 64-bit offset format files.

• Different (but similar) API, with "ncmpi_" instead of "nc_" in function names (ex. ncmpi_create() to create a file).

• Provides a subset of the netCDF API.

parallel-netcdf and Data Formats

• The parallel-netcdf package reads and writes data of classic format and 64-bit offset format.

• The output files are compatible with the netCDF library.

• parallel-netcdf does not read/write files of netCDF-4/HDF5 format.

parallel-netcdf Features

• Provides two layers of API: a high-level one that matches the netCDF API, and a lower-level one that allows better use of MPI.

• Support for non-blocking operations.

• Collective and independent reads and writes.

• Support for MPI data types.

C Example

The following C example is drawn from a talk Robert Latham gave at the 2007 NetCDF Workshop.

The example creates a very simple netCDF file.


Some parallel-netcdf C Code

#include <mpi.h>
#include <pnetcdf.h>

int main(int argc, char **argv)
{
    int ncfile, ret, count;
    char buf[13] = "Hello World\n";

    MPI_Init(&argc, &argv);

    ret = ncmpi_create(MPI_COMM_WORLD, "myfile.nc", NC_CLOBBER,
                       MPI_INFO_NULL, &ncfile);
    if (ret != NC_NOERR) return 1;

parallel-netcdf C Code 2

    ret = ncmpi_put_att_text(ncfile, NC_GLOBAL, "string", 13, buf);
    if (ret != NC_NOERR) return 1;

    ncmpi_enddef(ncfile);
    /* entered data mode – nothing to do. */

    ncmpi_close(ncfile);
    MPI_Finalize();
    return 0;
}

More About parallel-netcdf

• See their website: www.mcs.anl.gov/parallel-netcdf

• Robert Latham's presentation to the 2007 netCDF workshop ("Parallel NetCDF"): http://www.unidata.ucar.edu/software/netcdf/workshops/2007/pnetcdf/pnetcdf-ucar.pdf

Parallel I/O with NetCDF-4

• NetCDF-4 provides access to HDF5 parallel I/O features for netCDF-4/HDF5 files.

• NetCDF classic and 64-bit offset format files may not be opened or created for use with parallel I/O.

• A few functions have been added to the netCDF API to handle parallel I/O.

• These functions are also available in the Fortran 90 and Fortran 77 APIs.

Building NetCDF with Parallel I/O

• You must build netCDF-4 properly to take advantage of parallel features.

• HDF5 1.8.2 must be built with --enable-parallel.

• Typically the CC environment variable is set to mpicc, and FC to mpifc (or some local variant). You must build HDF5 and netCDF-4 with the same compiler and compiler options.

• The netCDF configure script will detect the parallel capability of HDF5 and build the netCDF-4 parallel I/O features automatically. No extra options to the netCDF configure are required.

Building NetCDF with Parallel I/O (2)

• For non-parallel builds, configure modifies the netcdf.h file to define the types MPI_Comm and MPI_Info as ints.

• Parallel I/O is tested in nc_test4/tst_parallel.c, nc_test4/tst_parallel2.c, and nc_test4/tst_parallel3.c, if the --enable-parallel-tests option to configure is used.

Opening/Creating netCDF-4/HDF5 File for Parallel I/O in C

• Special nc_create_par and nc_open_par functions are used to create/open a netCDF file.

• The parallel access associated with these functions is not a characteristic of the data file, but of the way it was opened.


The C Open/Create Functions

int nc_create_par(const char *path, int cmode, MPI_Comm comm, MPI_Info info, int *ncidp);

int nc_open_par(const char *path, int mode, MPI_Comm comm, MPI_Info info, int *ncidp);


Opening/Creating netCDF-4/HDF5 File for Parallel I/O in F90

In the upcoming 4.0.1 release, the nf90_open and nf90_create calls have been modified to permit parallel I/O files to be opened/created using optional parameters.

In the 4.0 release nf90_open_par and nf90_create_par functions were provided. These are still supported, but the use of optional parameters to nf90_create and nf90_open is recommended.


NetCDF-4 Parallel I/O Details

All netCDF metadata writing operations are collective: that is, all creation of groups, types, variables, dimensions, and attributes.

Data reads and writes (ex. calls to nc_put_vara_int and nc_get_vara_int) may be independent (the default) or collective. To make writes to a variable collective, call the nc_var_par_access function.

/* Use these with nc_var_par_access(). */

#define NC_INDEPENDENT 0

#define NC_COLLECTIVE 1

EXTERNL int

nc_var_par_access(int ncid, int varid, int par_access);


NetCDF-4 C Parallel I/O Example

/* Create the file for parallel access and define metadata (collective). */
nc_create_par(FILE, NC_NETCDF4|NC_MPIIO, comm, info, &ncid);
nc_def_dim(ncid, "d1", DIMSIZE, dimids);
nc_def_dim(ncid, "d2", DIMSIZE, &dimids[1]);
nc_def_var(ncid, "v1", NC_INT, NDIMS, dimids, &v1id);

/* Set up slab for this process. */
start[0] = mpi_rank * DIMSIZE/mpi_size;
start[1] = 0;
count[0] = DIMSIZE/mpi_size;
count[1] = DIMSIZE;

/* Each process writes its own slab independently. */
nc_var_par_access(ncid, v1id, NC_INDEPENDENT);
nc_put_vara_int(ncid, v1id, start, count, &data[mpi_rank*QTR_DATA]);

NetCDF-4 Parallel I/O F90 Example

The netCDF release contains an examples directory with subdirectories for C, F77, F90, C++, and CDL.

Examples are compiled and run as tests, and also included in the netCDF tutorial document.

For the 4.0.1 release a new set of examples is available in F90: simple_xy_par_wr and simple_xy_par_rd.

For a future release, C and (perhaps) F77 versions will be provided.


simple_xy_par_wr.f90: Defining the File

call check( nf90_create(FILE_NAME, NF90_NETCDF4, &
     ncid, comm = MPI_COMM_WORLD, &
     info = MPI_INFO_NULL) )

call check( nf90_def_dim(ncid, "x", p, x_dimid) )
call check( nf90_def_dim(ncid, "y", p, y_dimid) )

dimids = (/ y_dimid, x_dimid /)

call check( nf90_def_var(ncid, "data", NF90_INT, &
     dimids, varid) )
call check( nf90_enddef(ncid) )

simple_xy_par_wr.f90: Writing the Data

start = (/ 1, my_rank + 1 /)
count = (/ p, 1 /)
call check( nf90_put_var(ncid, varid, data_out, &
     start = start, count = count) )

Demonstrating Collective and Independent Operations

How do we know that collective/independent modes are working properly?

Using MPE and Jumpshot this feature can be explored and verified.

I used nc_test4/tst_parallel.c for this demonstration. It has been instrumented with optional MPE commands. When recompiled with -DUSE_MPE, and relinked with -llmpe -lmpe, tst_parallel.c will produce CLOG2 output.

This is now available in the snapshot distribution.


Independent Access


Collective Access


Choosing Between parallel-netcdf and NetCDF-4 Parallel I/O

• Since the APIs are different, the user must choose one or the other (or both).

• NetCDF-4 cannot perform parallel I/O on netCDF classic/64-bit offset data.

• parallel-netcdf cannot access netCDF-4/HDF5 files.

NetCDF-4 vs. parallel-netcdf for ROMS Model (Yang)


I Have a Dream

• A conversion layer to make it possible to compile parallel-netcdf code with netCDF-4, and use netCDF-4 parallel I/O.

• A layer in the netCDF-4 library to allow parallel-netcdf for parallel I/O on classic/64-bit offset files, and netCDF-4 parallel I/O on netCDF-4/HDF5 files.

Chunking

• Chunking can have a significant performance impact.

• If you don't specify chunking parameters with nc_def_var_chunking, you will get default chunking. These defaults have been changed for the upcoming 4.0.1 release. (See the sketch below.)

• Russ Rew has been running tests with various chunking options. You can run these tests yourself with nc_test4/tst_chunks.c.
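Chunk sizes are set per variable, before the first enddef. A hedged C sketch follows; the 36 x 36 x 36 chunking mirrors the benchmarks below, and everything else (file, dimension, and variable names) is illustrative.

#include <netcdf.h>

/* Sketch: request explicit 36 x 36 x 36 chunks for a 3D variable. */
int main(void)
{
    int ncid, dimids[3], varid;
    size_t chunks[3] = {36, 36, 36};

    if (nc_create("chunk_example.nc", NC_NETCDF4 | NC_CLOBBER, &ncid)) return 1;
    if (nc_def_dim(ncid, "t", 432, &dimids[0])) return 1;
    if (nc_def_dim(ncid, "y", 432, &dimids[1])) return 1;
    if (nc_def_dim(ncid, "x", 432, &dimids[2])) return 1;
    if (nc_def_var(ncid, "data", NC_FLOAT, 3, dimids, &varid)) return 1;

    /* Without this call the library picks default chunk sizes. */
    if (nc_def_var_chunking(ncid, varid, NC_CHUNKED, chunks)) return 1;

    if (nc_close(ncid)) return 1;
    return 0;
}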


Writing Chunks

Access        Write shape      Chunk sizes     Time        vs. contiguous
contiguous    1 x 432 x 432    --              0.559 sec   --
chunked       1 x 432 x 432    36 x 36 x 36    1.97 sec    3.5x slower
compressed    1 x 432 x 432    36 x 36 x 36    2.08 sec    3.7x slower
contiguous    432 x 1 x 432    --              18.1 sec    --
chunked       432 x 1 x 432    36 x 36 x 36    1.5 sec     12x faster
compressed    432 x 1 x 432    36 x 36 x 36    1.5 sec     12x faster
contiguous    432 x 432 x 1    --              223 sec     --
chunked       432 x 432 x 1    36 x 36 x 36    9.55 sec    23x faster
compressed    432 x 432 x 1    36 x 36 x 36    9.74 sec    23x faster

Reading Chunks

Access        Read shape       Chunk sizes     Time        vs. contiguous
contiguous    1 x 432 x 432    --              0.353 sec   --
chunked       1 x 432 x 432    36 x 36 x 36    1.06 sec    3x slower
compressed    1 x 432 x 432    36 x 36 x 36    1.1 sec     3.1x slower
contiguous    432 x 1 x 432    --              6.22 sec    --
chunked       432 x 1 x 432    36 x 36 x 36    1.45 sec    4.3x faster
compressed    432 x 1 x 432    36 x 36 x 36    1.45 sec    4.3x faster
contiguous    432 x 432 x 1    --              77.1 sec    --
chunked       432 x 432 x 1    36 x 36 x 36    7.68 sec    10x faster
compressed    432 x 432 x 1    36 x 36 x 36    7.83 sec    9.9x faster

Chunk Time Conclusions

• Chunked access is slower than contiguous access in the best case (access of data in the same order as on disk).

• Chunked access is much faster for "cross-grained" access.

• Deflation does not seem to affect times (on this platform).

Chunk Cache Control

• With the upcoming 4.0.1 release, control of the HDF5 chunk cache is possible.

• Even without user intervention, the cache size is raised from the HDF5 default (1 MB) to 32 MB (configurable when netCDF is built).

• A new function, nc_set_chunk_cache, allows user tuning of HDF5 cache parameters (see the sketch below).

• The cache cannot be changed for an open file.
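A minimal C sketch of tuning the cache before any file is opened or created; the cache size, slot count, and preemption value here are illustrative, not recommendations.

#include <netcdf.h>

/* Sketch: raise the HDF5 chunk cache before opening/creating netCDF-4 files. */
int main(void)
{
    int ncid;

    /* 64 MB cache, 1009 chunk slots, preemption policy 0.75.
       Must be called before the file is opened or created. */
    if (nc_set_chunk_cache(64 * 1024 * 1024, 1009, 0.75)) return 1;

    if (nc_create("cache_example.nc", NC_NETCDF4 | NC_CLOBBER, &ncid)) return 1;
    /* ... chunked variables defined here would use the larger cache ... */
    if (nc_close(ncid)) return 1;
    return 0;
}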


Secrets of NetCDF

• Converting to netCDF-4
• Using Compression
• NetCDF Development Process
• NetCDF Testing
• Future Plans

Converting Existing NetCDF Code to NetCDF-4

• NetCDF-4.0 is a drop-in replacement for netCDF-3.x. By default, all created files are in classic format.

• To create netCDF-4/HDF5 files, change the mode parameter in nc_create() calls (see the sketch below).

• Existing write code will work the same way, but a netCDF-4/HDF5 file will be produced.

• NetCDF reading software must be upgraded (i.e. relinked) with the netCDF-4.0 release.

• No modifications are needed in reading code.
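The change to writing code is just the mode flag. A hedged before/after C sketch, with placeholder file names:

#include <netcdf.h>

int main(void)
{
    int ncid;

    /* Unchanged call: produces a classic format file, as with netCDF-3. */
    if (nc_create("classic.nc", NC_CLOBBER, &ncid)) return 1;
    if (nc_close(ncid)) return 1;

    /* One-flag change: the same program now writes a netCDF-4/HDF5 file. */
    if (nc_create("netcdf4.nc", NC_NETCDF4 | NC_CLOBBER, &ncid)) return 1;
    if (nc_close(ncid)) return 1;

    return 0;
}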


More on Converting to NetCDF-4

• When using the classic model, use the NC_CLASSIC_MODEL flag when creating the file.

• To use the enhanced model, both writing and reading software must be modified.

• Some advanced structures are problematic in Fortran. We don't know how to handle compound types in a machine-independent way, for example.

• Makefiles must be changed to include: -lnetcdf -lhdf5_hl -lhdf5 -lz

Using Compression

• NetCDF-4 supports zlib deflation.

• szlib data may be read through netCDF-4, but not written. Is there interest in this?

• You can't deflate while writing with parallel I/O.

• Set deflation with nc_def_var_deflate after the variable is defined, but before the first enddef (see the sketch below).
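A minimal C sketch of turning on zlib deflation for one variable; the deflate level, shuffle setting, and all names are illustrative choices, not from the slides.

#include <netcdf.h>

/* Sketch: zlib deflation (level 2, with shuffle) set right after the
   variable is defined and before enddef. */
int main(void)
{
    int ncid, dimid, varid;

    if (nc_create("deflate_example.nc", NC_NETCDF4 | NC_CLOBBER, &ncid)) return 1;
    if (nc_def_dim(ncid, "x", 10000, &dimid)) return 1;
    if (nc_def_var(ncid, "data", NC_FLOAT, 1, &dimid, &varid)) return 1;

    /* shuffle = 1, deflate = 1, deflate_level = 2 (levels run 0-9). */
    if (nc_def_var_deflate(ncid, varid, 1, 1, 2)) return 1;

    if (nc_enddef(ncid)) return 1;
    if (nc_close(ncid)) return 1;
    return 0;
}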


Deflate of 2D Radar Data

Testing of C/F77/F90/C++ Libraries

• Cross-platform testing takes place nightly at Unidata.

• The daily snapshot release is a complete netCDF release, version “netcdf-4.0-snapshot2008101402”

• Snapshot passes “make distcheck” on Linux platform before its release.

• Get the daily snapshot for the latest updates to the code and documentation.

• Snapshot documentation is on-line: http://www.unidata.ucar.edu/software/netcdf/docs_snapshot/


Test Platforms

• Each build includes a full run of the default netCDF tests.

• The automatic testing of the default build includes Linux, AIX, IRIX, Mac, SunOS, Cygwin, HP-UX (netcdf-3 only) and Visual Studio builds.

• Various C compilers are tested: gcc (several versions), Intel, PG, AIX, Sun, HPUX, and IRIX.

• Various Fortran compilers are tested: g95, gfortran (pre and post 4.2), ifort (9.1 and 10.1), PGI, AIX, IRIX, HPUX, and even g77.


More on Testing

• Additional, optional tests are also run on some platforms, including MPI I/O builds, and tests for very large files.

• Full output of configure and make, as well as the config.log, are available from the hyperlink on the test page.

• In case of problems, find your platform and take a look.


NetCDF C/Fortran/C++ Development Process

• Agile programming, with aggressive refactoring, and heavy reliance on automatic testing.

• Daily snapshot allows bug fixes to be released immediately.

• Our highest priority is fixing bugs so that we do not have a bug-list to maintain and prioritize.


Russ Rew – core netcdf-3, ncdump, ncgen, C++


Dennis Heimbigner – OpenDAP client, ncgen enhanced


Ed Hartnett – core netcdf-4, tests, Fortran APIs and tests


Future Plans

• Upcoming C/F77/F90/C++ 4.0.1 release currently in beta.

• 4.1 release of the C libraries, including a built-in OpenDAP client and ncgen support for the netCDF-4 enhanced model.

• NetCDF 4.2 and beyond: development of the DAP client to fully support the netCDF-4 enhanced model.

• Development of the libcf C/Fortran library to assist with CF conventions.


Questions About Using NetCDF-4?


Contact Information

NetCDF website: www.unidata.ucar.edu/software/netcdf

Ed Hartnett - [email protected]