Page 1: Parallel HDF5 Introductory Tutorial

Parallel HDF5 Introductory Tutorial

May 19, 2008
Kent Yang
The HDF Group
[email protected]

5/19/2008 1 SCICOMP 14 Tutorial

Page 2: Parallel HDF5 Introductory Tutorial

Outline

• Overview of Basic HDF5 concept
• Overview of Parallel HDF5 design
• MPI-IO vs. Parallel HDF5
• Overview of Parallel HDF5 programming model
• The benefits of using Parallel HDF5
• Situations where parallel HDF5 may not work well

Page 3: Parallel HDF5 Introductory Tutorial

Overview of Basic HDF5 Concept


Page 4: Parallel HDF5 Introductory Tutorial

What is HDF5?

• File format for managing any kind of data
• Software (library and tools) for accessing data in that format
• Especially suited for large and/or complex data collections
• Platform independent
• C, F90, C++, Java APIs

Page 5: Parallel HDF5 Introductory Tutorial

Example HDF5 file

[Figure: an example HDF5 file with a root group “/” and a group “/foo”. The objects shown are raster images, a palette, a 3-D array, a 2-D array, and the table below.]

lat | lon | temp
----|-----|-----
 12 |  23 |  3.1
 15 |  24 |  4.2
 17 |  21 |  3.6

Page 6: Parallel HDF5 Introductory Tutorial

Special Storage Options

• chunked: better subsetting access time; extendable
• compressed: improves storage efficiency, transmission speed
• extendable: arrays can be extended in any direction
• split file: metadata in one file, raw data in another

[Figure: dataset “Fred” stored as a split file across File A and File B, with the metadata for Fred kept separate from the data for Fred.]

Page 7: Parallel HDF5 Introductory Tutorial

Virtual File I/O Layer

Allows HDF5 format address space to map to disk, the network, memory, or a user-defined device

[Figure: virtual file I/O drivers (Stdio, File Family, MPI I/O, Memory, Network) mapping to “storage”: file, file family, memory, network.]

Page 8: Parallel HDF5 Introductory Tutorial

Overview of Parallel HDF5 Design


Page 9: Parallel HDF5 Introductory Tutorial

PHDF5 Requirements

• MPI programming
• PHDF5 files compatible with serial HDF5 files
  • Shareable between different serial or parallel platforms
• Single file image to all processes
  • One file per process design is undesirable
    • Expensive post processing
    • Not useable by different number of processes
• Standard parallel I/O interface
  • Must be portable to different platforms

Page 10: Parallel HDF5 Introductory Tutorial

PHDF5 Implementation Layers

[Figure: PHDF5 implementation layers, top to bottom: application; parallel computing system (IBM AIX) with its compute nodes; I/O library (HDF5); parallel I/O library (MPI-I/O); parallel file system (GPFS); switch network/I/O servers; disk architecture & layout of data on disk.]

PHDF5 built on top of standard MPI-IO API

Page 11: Parallel HDF5 Introductory Tutorial

Parallel Environment Requirements

• MPI with MPI-IO. E.g.,
  • MPICH2 ROMIO
  • Vendor’s MPI-IO: IBM, SGI, etc.
• Parallel file system. E.g.,
  • GPFS
  • Lustre

Page 12: Parallel HDF5 Introductory Tutorial

MPI-IO vs. HDF5

• MPI-IO is an input/output API.
• It treats the data file as a “linear byte stream”; each MPI application needs to provide its own file view and data representations to interpret those bytes.
• All stored data are machine dependent except in the “external32” representation.
  • External32 is defined as big-endian.
  • Little-endian machines have to convert the data in both read and write operations.
  • 64-bit sized data types may lose information.
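The byte-order point above can be sketched in plain C. This is only an illustration of the conversion a little-endian host has to perform when values are kept in a big-endian representation such as external32. The helper names (`host_is_little_endian`, `store_big_endian32`, `load_big_endian32`) are hypothetical and are not part of MPI-IO or HDF5.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);   /* lowest-addressed byte of the probe */
    return first_byte == 1;
}

/* Hypothetical helper: store a 32-bit value with the most significant byte
 * first (big-endian), whatever the byte order of the host is. */
static void store_big_endian32(uint32_t value, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)(value >> 24);
    out[1] = (unsigned char)(value >> 16);
    out[2] = (unsigned char)(value >> 8);
    out[3] = (unsigned char)(value);
}

/* Hypothetical helper: read back a value stored by store_big_endian32. */
static uint32_t load_big_endian32(const unsigned char in[4])
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

On a big-endian host the stored bytes match the in-memory layout, so no reordering is needed; on a little-endian host every value is reordered on both write and read, which is the cost the slide refers to.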


Page 13: Parallel HDF5 Introductory Tutorial

MPI-IO vs. HDF5 Cont.

• HDF5 is self-describing data management software.
• It stores the data and metadata according to the HDF5 data format definition.
• Each machine can store the data in its own native representation for efficient I/O.
• Any necessary data representation conversion is done automatically by the HDF5 library.
• 64-bit sized data types do not lose information.

Page 14: Parallel HDF5 Introductory Tutorial

Programming Restrictions

• Most PHDF5 APIs are collective
• PHDF5 opens a parallel file with a communicator
  • Returns a file-handle
  • Future access to the file via the file-handle
  • All processes must participate in collective PHDF5 APIs
  • Different files can be opened via different communicators

Page 15: Parallel HDF5 Introductory Tutorial

Examples of PHDF5 API

• Examples of PHDF5 collective API
  • File operations: H5Fcreate, H5Fopen, H5Fclose
  • Objects creation: H5Dcreate, H5Dopen, H5Dclose
  • Objects structure: H5Dextend (increase dimension sizes)
• Array data transfer can be collective or independent
  • Dataset operations: H5Dwrite, H5Dread

Page 16: Parallel HDF5 Introductory Tutorial

PHDF5 API Languages

• C and F90 language interfaces
• Platforms supported:
  • Most platforms with MPI-IO supported
  • IBM SP, Linux clusters, Cray XT3, SGI Altix

Page 17: Parallel HDF5 Introductory Tutorial

How to Compile PHDF5 Applications

• h5pcc – HDF5 C compiler command
  • Similar to mpicc
• h5pfc – HDF5 F90 compiler command
  • Similar to mpif90
• To compile:
  % h5pcc h5prog.c
  % h5pfc h5prog.f90

Page 18: Parallel HDF5 Introductory Tutorial

Overview of Parallel HDF5 Programming Model


Page 19: Parallel HDF5 Introductory Tutorial

Creating and Accessing a File: Programming Model

• HDF5 uses an access template object (property list) to control the file access mechanism
• General model to access an HDF5 file in parallel:
  • Set up the MPI-IO access template (access property list)
  • Open the file
  • Access data
  • Close the file

Page 20: Parallel HDF5 Introductory Tutorial

Parallel File Create

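/* Set up the file access property list with MPI-IO access. */
plist_id = H5Pcreate(H5P_FILE_ACCESS);
H5Pset_fapl_mpio(plist_id, comm, info);

/* Create the file collectively, using the MPI-IO access property list. */
file_id = H5Fcreate(H5FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT, plist_id);

/* Release the property list once the file has been created. */
H5Pclose(plist_id);

/*
 * Close the file.
 */
H5Fclose(file_id);

MPI_Finalize();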

Page 21: Parallel HDF5 Introductory Tutorial

Writing and Reading Hyperslabs: Programming Model

• Distributed memory model: data is split among processes
• PHDF5 uses hyperslab model
• Each process defines memory and file hyperslabs
• Each process executes partial write/read call
  • Collective calls
  • Independent calls
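As one possible illustration of "data is split among processes", the sketch below divides the rows of a dataset into contiguous ranges, one per MPI rank. The slides do not prescribe this decomposition; the type `row_range` and the function `rows_for_rank` are hypothetical names used only for this example. The resulting start and count would typically feed the offset and count of each process's file hyperslab.

```c
#include <assert.h>

/* Hypothetical type: contiguous range of rows owned by one process. */
typedef struct {
    long start;  /* first row owned by this rank */
    long count;  /* number of rows owned by this rank */
} row_range;

/* Hypothetical helper: split total_rows as evenly as possible among nprocs.
 * The first (total_rows % nprocs) ranks receive one extra row each. */
static row_range rows_for_rank(long total_rows, int nprocs, int rank)
{
    assert(total_rows >= 0 && nprocs > 0 && rank >= 0 && rank < nprocs);

    long base  = total_rows / nprocs;   /* rows every rank gets */
    long extra = total_rows % nprocs;   /* leftover rows to hand out */

    row_range r;
    r.count = base + (rank < extra ? 1 : 0);
    r.start = (long)rank * base + (rank < extra ? rank : extra);
    return r;
}
```

With 10 rows and 3 processes, for instance, rank 0 owns rows 0 to 3, rank 1 owns rows 4 to 6, and rank 2 owns rows 7 to 9, so every row is owned by exactly one process.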


Page 22: Parallel HDF5 Introductory Tutorial

Hyperslab Example: Writing dataset by columns

[Figure: processes P0 and P1 each write columns of one dataset in the file.]

Page 23: Parallel HDF5 Introductory Tutorial

Writing Dataset by Column

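[Figure: memory and file layouts for P0 and P1. Each process's memory buffer has dimensions dimsm[0] by dimsm[1]. In the file, P0 and P1 each start at their own offset[1], select blocks of block[0] by block[1] elements, and step by stride[1] between blocks.]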

Page 24: Parallel HDF5 Introductory Tutorial

Writing Dataset by Column

/*
 * Each process defines a hyperslab in the file.
 */
count[0]  = 1;
count[1]  = dimsm[1];
offset[0] = 0;
offset[1] = mpi_rank;
stride[0] = 1;
stride[1] = 2;
block[0]  = dimsm[0];
block[1]  = 1;

/*
 * Each process selects its hyperslab.
 */
filespace = H5Dget_space(dset_id);
H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, stride, count, block);
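To see which file columns the selection above picks for each process, the arithmetic can be sketched without HDF5. The function `hyperslab_indices_1d` is a hypothetical helper, not an HDF5 call. It follows the hyperslab convention of `count` blocks of `block` elements whose starting positions are `stride` apart, beginning at `offset`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: list the indices selected along one dimension by
 * (offset, stride, count, block). Returns the number of indices written
 * to out, which must have room for count * block entries.
 */
static size_t hyperslab_indices_1d(size_t offset, size_t stride,
                                   size_t count, size_t block,
                                   size_t *out, size_t out_capacity)
{
    assert(out != NULL);
    assert(stride >= block);                 /* blocks must not overlap */
    assert(count * block <= out_capacity);

    size_t n = 0;
    for (size_t c = 0; c < count; c++) {     /* each block ...            */
        for (size_t b = 0; b < block; b++) { /* ... and each element in it */
            out[n++] = offset + c * stride + b;
        }
    }
    return n;
}
```

Assuming, for illustration only, two processes and dimsm[1] equal to 3, the settings offset[1] = mpi_rank, stride[1] = 2, count[1] = 3, block[1] = 1 give columns 0, 2, 4 to rank 0 and columns 1, 3, 5 to rank 1, so the two processes write interleaved columns.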


Page 25: Parallel HDF5 Introductory Tutorial

Writing Dataset by Column

[Figure: the same memory and file hyperslab diagram for P0 and P1 as on page 23.]

Page 26: Parallel HDF5 Introductory Tutorial

Dataset Collective Write

/* Create property list for collective dataset write. */
plist_id = H5Pcreate(H5P_DATASET_XFER);
H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE);

status = H5Dwrite(dset_id, H5T_NATIVE_INT, memspace, filespace, plist_id, data);

H5Dclose(dset_id);

Page 27: Parallel HDF5 Introductory Tutorial

My PHDF5 Application I/O is slow

• If my application I/O performance is slow, what can I do?
  • Use larger I/O data sizes
  • Independent vs. Collective I/O
  • Specific I/O system hints
  • Increase I/O bandwidth

Page 28: Parallel HDF5 Introductory Tutorial

Independent Vs Collective Access

• A user reported that independent data transfer mode was much slower than collective data transfer mode
• The data array was tall and thin: 230,000 rows by 6 columns

[Figure: a tall, thin array of 230,000 rows.]

Page 29: Parallel HDF5 Introductory Tutorial

Independent vs. Collective write

6 processes, IBM p-690, AIX, GPFS

# of Rows | Data Size (MB) | Independent (Sec.) | Collective (Sec.)
----------|----------------|--------------------|------------------
    16384 |           0.25 |               8.26 |              1.72
    32768 |           0.50 |              65.12 |              1.80
    65536 |           1.00 |             108.20 |              2.68
   122918 |           1.88 |             276.57 |              3.11
   150000 |           2.29 |             528.15 |              3.63
   180300 |           2.75 |             881.39 |              4.12
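The gap in the timings above is easier to judge as a ratio or as effective throughput. The sketch below only does that arithmetic; the function names `collective_speedup` and `throughput_mb_per_s` are hypothetical, and the inputs are the values reported in the table.

```c
#include <assert.h>

/* Hypothetical helper: how many times faster the collective write was. */
static double collective_speedup(double independent_s, double collective_s)
{
    assert(collective_s > 0.0);
    return independent_s / collective_s;
}

/* Hypothetical helper: effective throughput in MB per second. */
static double throughput_mb_per_s(double size_mb, double seconds)
{
    assert(seconds > 0.0);
    return size_mb / seconds;
}
```

For the 16384-row case, `collective_speedup(8.26, 1.72)` is roughly 4.8; for the 180300-row case, `collective_speedup(881.39, 4.12)` is roughly 214. The advantage of collective mode therefore grows with the data size in this test.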


Page 30: Parallel HDF5 Introductory Tutorial

Independent vs. Collective write (cont.)

[Figure: "Performance (non-contiguous)". Time (s), from 0 to 1000, plotted against data space size (MB), from 0.00 to 3.00, for independent and collective writes.]

Page 31: Parallel HDF5 Introductory Tutorial

Parallel Tools

• ph5diff
  • Parallel version of the h5diff tool
• h5perf
  • Performance measuring tool showing I/O performance for different I/O APIs

Page 32: Parallel HDF5 Introductory Tutorial

ph5diff

• A parallel version of the h5diff tool
• Supports all features of h5diff
• An MPI parallel tool
• Manager process (proc 0)
  • coordinates the remaining processes (workers) to “diff” one dataset at a time;
  • collects any output from each worker and prints it out.
• Works best if there are many datasets in the files with few differences.
• Available in v1.8.

Page 33: Parallel HDF5 Introductory Tutorial

h5perf

• An I/O performance measurement tool
• Tests 3 file I/O APIs
  • POSIX I/O (open/write/read/close…)
  • MPIO (MPI_File_{open,write,read,close})
  • PHDF5
    • H5Pset_fapl_mpio (using MPI-IO)
    • H5Pset_fapl_mpiposix (using POSIX I/O)

Page 34: Parallel HDF5 Introductory Tutorial

APIs that applications can use to achieve better performance

• H5Pset_dxpl_mpio_chunk_opt
• H5Pset_dxpl_mpio_chunk_opt_num
• H5Pset_dxpl_mpio_chunk_opt_ratio
• H5Pset_dxpl_mpio_collective_opt

Page 35: Parallel HDF5 Introductory Tutorial

The benefits of Using HDF5

• Self-describing
  • Allows tools to access the file without knowledge of the applications that produced it
• Flexible design
  • Move between system architectures and between serial and parallel applications
• Flexible application control
  • Applications can choose how to store data in HDF5 to achieve better performance
• Advanced features
  • Support for complex selections
  • More user control for performance tuning

Page 36: Parallel HDF5 Introductory Tutorial

Situations where parallel HDF5 may not work well

• We don’t support BlueGene
• Misusing HDF5 can cause bad performance
  • Chunking storage: http://www.hdfgroup.uiuc.edu/papers/papers/ParallelIO/HDF5-CollectiveChunkIO.pdf
  • Collective IO

Page 37: Parallel HDF5 Introductory Tutorial

Questions?


Page 38: Parallel HDF5 Introductory Tutorial

Questions for the Audience

• Any suggestions on general improvement or new features for parallel HDF5 support?

• Any suggestions on other tools for parallel HDF5?

Page 39: Parallel HDF5 Introductory Tutorial

Useful Parallel HDF Links

• Parallel HDF information site
  • http://hdfgroup.org/HDF5/PHDF5/
• Parallel HDF5 tutorial available at
  • http://hdfgroup.org/HDF5/Tutor/
• HDF Help email address
  • [email protected]