research on high performance computing systems...performance of npb bt class b omni openmp:...

1
Center for Computational Sciences, University of Tsukuba http://www.ccs.tsukuba.ac.jp/ OmniRPC: A Grid RPC System for Parallel Computing OmniRPC is a Grid RPC system for parallel programming and is designed as Thread-saf e RPC in order to allow client programs to use in OpenMP . Support of master-workers programming model for parametric search grid applications. Integrating computing resources on various execution platform: XtremWeb, Condor, CyberGRIP, Grid Engine. Clusters with private IP address and NATs. Applications: CONFLEX-G: Grid-enabled Molecular Conformational Space Sarch Program HMCS-G: Grid-enabled Hybrid Computing System for Computational Astrophysics. Master-Worker parallel program for large eigen value program (Prof. Sakurai, U. Tsukuba) PAML-G: Phylogenetic Analysis by Maximum Likelihood on the grid. Used as a backend for YML (Prof. Petiton, LIFL, France) Webpage: http://www.omni.hpcc.jp/OmniRPC/ 0 2 4 6 8 10 0 500 1000 1500 2000 2500 3000 3500 MPI version SCASH-MPI(prefetch) SCASH-MPI Performance [Mop/s] Number of Nodes Performance of NPB BT Class B Omni OpenMP: Cluster-enabled OpenMP by SCASH-MPI FFTE: A High-Performance FFT Library Performance of Parallel 3D FFTs 32 16 8 4 2 1 0 1000 2000 3000 4000 5000 6000 FFTW 2.1.5 (dual) FFTW 2.1.5 (single) FFTE 3.2 (dual) FFTE 3.2 (single) Performance [MFLOPS] Number of CPUs Client PC PC Cluster OmniRPC Agent OmniRPC Agent Volunteer PCs OmniRPC Agent OmniRPC Agent CyberGRIP GRM CyberGRIP SRMD XtremWeb Dispatcher OmniRPC Client Program OmniRPC Stub Program C ondorP ool The Omni OpenMP compiler enables an OpenMP program to run on a PC-cluster with MPI enabled. The compiler can generate parallel codes for a software DSM system SCASH-MPI is a software distributed shared memory system (DSM) using MPI as a communication layer of SCASH DSM. The original SCASH is a DSM in the SCore cluster software. A thread is created to handle remote memory access request by MPI. By using MPI as a communication layer, one can exploit a high-performance network of clusters with high portability. History-based prefetching for optimizing communication. Performance Benchmark : NPB2.3 BT (parallelized with OpenMP) Environment : Dual Opteron 1.8GHz * 8nodes + Gigabit Ethernet FFTE is a Fortran subroutine library for computing the Fast Fourier Transform (FFT) in one or more dimensions. The API of FFTE is similar to sequential SGI SCSL or Intel MKL routines. Features: High speed. (Supports SSE2/SSE3) Complex and mixed-radix transforms. Paralle l transforms : Shared / Distribute d memory parallel computers (OpenMP, MPI and OpenMP+MPI) High portability: Fortran77 + OpenMP + MPI FFTE’s 1-D parallel FFT routine has been incorporated into the HPC Challenge (HPCC) benchmark. Performance Data: N1xN2xN3 = 2^24xP Machines: Dual Xeon 2.8GHz + Myrinet2000 Research on High Performance Computing Systems

Upload: others

Post on 23-May-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Research on High Performance Computing Systems...Performance of NPB BT Class B Omni OpenMP: Cluster-enabled OpenMP by SCASH-MPI FFTE: A High-Performance FFT Library Performance of

Center for Computational Sciences, University of Tsukuba

http://www.ccs.tsukuba.ac.jp/

OmniRPC: A Grid RPC System for Parallel ComputingOmniRPC is a Grid RPC system for parallel programming and is designed as Thread-saf e RPC in order to allow client programs to use in OpenMP.Support of master-workers programming model for parametric search grid applications. Integrating computing resources on various execution platform:

XtremWeb, Condor, CyberGRIP, Grid Engine.Clusters with private IP address and NATs.

Applications: CONFLEX-G: Grid-enabled Molecular Conformational Space Sarch Program HMCS-G: Grid-enabled Hybrid Computing System for Computational Astrophysics. Master-Worker parallel program for large eigen value program (Prof. Sakurai, U. Tsukuba) PAML-G: Phylogenetic Analysis by Maximum Likelihood on the grid. Used as a backend for YML (Prof. Petiton, LIFL, France)Webpage: http://www.omni.hpcc.jp/OmniRPC/

0 2 4 6 8 100

500

1000

1500

2000

2500

3000

3500

MPI version SCASH-MPI(prefetch)SCASH-MPI

Per

form

ance

[Mop

/s]

Number of Nodes

Performance of NPB BT Class B

Omni OpenMP: Cluster-enabled OpenMP by SCASH-MPI

FFTE: A High-Performance FFT Library

Performance of Parallel 3D FFTs

321684210

1000

2000

3000

4000

5000

6000FFTW 2.1.5 (dual)FFTW 2.1.5 (single)FFTE 3.2 (dual)FFTE 3.2 (single)

Per

form

ance

[MF

LOP

S]

Number of CPUs

Client PC

PC Cluster

OmniRPCAgent

OmniRPCAgent

Volunteer PCsOmniRPCAgent

OmniRPCAgent

CyberGRIPGRM

CyberGRIPSRMD

XtremWebDispatcher

OmniRPCClient Program

OmniRPCStub Program

C ondor Pool

The Omni OpenMP compiler enables an OpenMP program to run on a PC-cluster with MPI enabled.

The compiler can generate parallel codes for a software DSM system

SCASH-MPI is a software distributed shared memory system (DSM) using MPI as a communication layer of SCASH DSM.

The original SCASH is a DSM in the SCore cluster software.A thread is created to handle remote memory access request by MPI.By using MPI as a communication layer, one can exploit a high-performance network of clusters with high portability.History-based prefetching for optimizing communication.

PerformanceBenchmark : NPB2.3 BT (parallelized with OpenMP)Environment : Dual Opteron 1.8GHz * 8nodes + Gigabit Ethernet

FFTE is a Fortran subroutine library for computing the Fast Fourier Transform (FFT) in one or more dimensions.The API of FFTE is similar to sequential SGI SCSL or Intel MKL routines.Features:

High speed. (Supports SSE2/SSE3)Complex and mixed-radix transforms.Paralle l transforms : Shared / Distribute d memory parallel computers (OpenMP, MPI and OpenMP+MPI)High portability: Fortran77 + OpenMP + MPIFFTE’s 1-D parallel FFT routine has been incorporated into the HPC Challenge (HPCC) benchmark.

Performance Data: N1xN2xN3 = 2^24xP Machines: Dual Xeon 2.8GHz + Myrinet2000

Research on High Performance Computing Systems