The F-MPJ Challenge: Solving Complex Problems on Hierarchical Architectures with Java

Sabela Ramos Garea, Roberto Rey Expósito

Group of Computer Architecture, University of A Coruña

ComplexHPC Challenge 2011

Outline

1 Introduction
  Java for HPC

2 Message Passing in Java with F-MPJ
  Message Passing in Java
  F-MPJ: Fast MPJ

3 Complex Application in Bioinformatics: ProtTest
  ProtTest
  Parallel Strategies

4 Performance Evaluation
  Experimental Configuration
  F-MPJ Performance
  ProtTest 3 Performance

5 Conclusions

Java for High Performance Computing (HPC)

Features:

Network communication support.

Multithreading support.

Portable, platform independent.

Object Oriented.

Safe, robust, simple, and easy to maintain.

Performance similar to native languages (C, Fortran).

Parallel/distributed programming in Java:

Concurrency Framework.

Java Sockets.

Java RMI.

Message-Passing in Java (MPJ).

Message Passing in Java

Message-passing is the main HPC programming model.

Implementation approaches:

RMI.

Wrapping a native library via JNI (e.g., MPI libraries: OpenMPI, MPICH).

Sockets.

APIs implemented:

PVM-like.

mpiJava.

MPJ.

Others.

Table columns: Pure Java Impl. | Socket impl. (Java IO, Java NIO) | High-speed network support (Myrinet, InfiniBand, SCI) | API (mpiJava 1.2, JGF MPJ, Other APIs)

MPJava X X X

Jcluster X X X

Parallel Java X X X

mpiJava X X X X

P2P-MPI X X X X

MPJ Express X X X X

MPJ/Ibis X X X X

JMPI X X X

F-MPJ X X X X X X X

F-MPJ Communication Devices

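Figure: F-MPJ layered design. MPJ applications run on top of the F-MPJ library, whose device layer offers four communication paths inside the JVM:

smpdev: Java Threads, over shared memory.

ibvdev: JNI to IBV (native comms), over InfiniBand.

omxdev: JNI to Open-MX (native comms), over Myrinet/Ethernet.

niodev/iodev: Java Sockets over TCP/IP, on Ethernet.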

F-MPJ Communication Devices for Heterogeneity

Different sorts of devices:

Distributed memory.
  Native communication layers: ibvdev, omxdev.
  Java sockets: iodev, niodev.

Shared memory.
  Java threads: smpdev.

Hybrid shared-distributed memory.
  In development. It combines a distributed memory device with smpdev.

Optimizing performance:

No buffering layer for primitive types.

Multi-core aware collective operations library.

Configurable algorithms depending on the message size and the number of processors.

Multi-core aware algorithms for collective operations:

Operation       Algorithms
Barrier         BT, Gather+Bcast, BTe, Gather+Bcast Optimized
Bcast           MST, NBFT, BFT
Scatter/v       MST, NBFT
Gather/v        MST, NBFT, NB1FT, BFT
Allgather/v     NBFT, NBBDE, BBKT, NBBKT, BTe, Gather+Bcast
Alltoall/v      NBFT, NB1FT, NB2FT, BFT
Reduce          MST, NBFT, BFT
Allreduce       NBFT, BBDE, NBBDE, BTe, Reduce+Bcast
Reduce-scatter  BBDE, NBBDE, BBKT, NBBKT, Reduce+Scatter
Scan            NBFT, OneToOne
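
To make one tree-based broadcast option concrete, the following is a minimal Java sketch of a binomial-tree broadcast schedule rooted at rank 0. Reading the MST entry as a minimum-spanning-tree (binomial) broadcast is an assumption, since the slide does not expand the acronyms. The class and method names are hypothetical and are not part of F-MPJ. The sketch only simulates which rank would send to which in each round; it performs no communication.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: simulates a binomial-tree broadcast schedule rooted at
// rank 0. Names are hypothetical and not taken from F-MPJ.
class BinomialBroadcastSketch {

    // Returns one list per round; each entry is {sender, receiver}.
    // Expects processes >= 1.
    static List<List<int[]>> schedule(int processes) {
        List<List<int[]>> rounds = new ArrayList<>();
        // In the round with distance d, every rank below d already holds the
        // message and forwards it to rank + d, if that rank exists.
        for (int distance = 1; distance < processes; distance *= 2) {
            List<int[]> round = new ArrayList<>();
            for (int sender = 0; sender < distance; sender++) {
                int receiver = sender + distance;
                if (receiver < processes) {
                    round.add(new int[] {sender, receiver});
                }
            }
            rounds.add(round);
        }
        return rounds;
    }

    // Replays the schedule and checks that no rank sends before it has
    // received and that every rank ends up holding the message.
    static boolean reachesAll(int processes) {
        boolean[] hasMessage = new boolean[processes];
        hasMessage[0] = true; // the root starts with the data
        for (List<int[]> round : schedule(processes)) {
            for (int[] pair : round) {
                if (!hasMessage[pair[0]]) {
                    return false;
                }
                hasMessage[pair[1]] = true;
            }
        }
        for (boolean reached : hasMessage) {
            if (!reached) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int processes = 8;
        List<List<int[]>> rounds = schedule(processes);
        for (int r = 0; r < rounds.size(); r++) {
            StringBuilder line = new StringBuilder("round " + r + ":");
            for (int[] pair : rounds.get(r)) {
                line.append(' ').append(pair[0]).append("->").append(pair[1]);
            }
            System.out.println(line);
        }
        System.out.println("all ranks reached: " + reachesAll(processes));
    }
}
```

With 8 processes the schedule completes in 3 rounds, whereas a root that sends to every other rank itself would issue 7 sends one after another. That difference is one reason an algorithm choice that depends on the number of processes can pay off.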

ProtTest

One of the most popular tools for selecting models of protein evolution.

Almost 4,000 registered users.

Over 700 citations.

Written in Java.

Computationally intensive.

ProtTest 3 is designed to take advantage of parallel processing.

Shared Memory Implementation

Java concurrency API

Implementation of a thread pool.

Dynamic task distribution over the pool.
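
As a rough illustration of a thread pool with dynamic task distribution, the sketch below uses the standard `java.util.concurrent` executor framework. It is not ProtTest code: the class name, the `evaluate` stand-in workload, and the task counts are all hypothetical, and the lambda syntax assumes Java 8 or later rather than the JDK 1.6 used in the experiments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: a fixed thread pool that hands out tasks dynamically.
// Names and workload are hypothetical, not taken from ProtTest.
class ThreadPoolSketch {

    // Stand-in for one expensive, independent task; cost grows with taskId.
    static long evaluate(int taskId) {
        long sum = 0;
        for (int i = 0; i <= taskId * 1000; i++) {
            sum += i;
        }
        return sum;
    }

    // Submits every task to the pool's shared queue. Each idle worker thread
    // takes the next pending task, so threads that finish early process more.
    static List<Long> runAll(int threads, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (int id = 0; id < tasks; id++) {
                final int taskId = id;
                Callable<Long> job = () -> evaluate(taskId);
                pending.add(pool.submit(job));
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> future : pending) {
                try {
                    results.add(future.get()); // collected in submission order
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Long> results = runAll(4, 16);
        System.out.println("completed tasks: " + results.size());
    }
}
```

Because the tasks wait in one shared queue, no assignment is fixed in advance, which suits workloads where task costs differ widely.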

Distributed Memory Implementation

Message Passing in Java

Allows both task distributions (static and dynamic).

Includes a distributor process with a negligible workload.
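
The difference between the two distributions can be sketched with a small simulation of the time at which the slowest worker finishes. This is an assumption-laden sketch rather than the ProtTest implementation: the slides do not say how the static distribution assigns tasks, so round-robin is assumed, and the task costs are invented. The distributor is modelled implicitly and adds no cost, in line with its negligible workload.

```java
import java.util.PriorityQueue;

// Illustrative only: compares static and dynamic task distribution by
// simulating the finishing time of the slowest worker (the makespan).
// Names, the round-robin rule, and the task costs are hypothetical.
class DistributionSketch {

    // Static: task i is bound to worker i % workers before execution starts.
    static long staticMakespan(long[] costs, int workers) {
        long[] load = new long[workers];
        for (int i = 0; i < costs.length; i++) {
            load[i % workers] += costs[i];
        }
        long max = 0;
        for (long l : load) {
            max = Math.max(max, l);
        }
        return max;
    }

    // Dynamic: the distributor hands the next task to whichever worker
    // becomes idle first.
    static long dynamicMakespan(long[] costs, int workers) {
        PriorityQueue<Long> idleAt = new PriorityQueue<>();
        for (int w = 0; w < workers; w++) {
            idleAt.add(0L);
        }
        long max = 0;
        for (long cost : costs) {
            long finish = idleAt.poll() + cost; // earliest idle worker runs it
            idleAt.add(finish);
            max = Math.max(max, finish);
        }
        return max;
    }

    public static void main(String[] args) {
        long[] costs = {9, 1, 1, 1, 9, 1, 1, 1}; // hypothetical task costs
        System.out.println("static:  " + staticMakespan(costs, 4));
        System.out.println("dynamic: " + dynamicMakespan(costs, 4));
    }
}
```

For these invented costs and 4 workers, the static assignment places both expensive tasks on the same worker and finishes at time 18, while the dynamic one finishes at time 10.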

Hybrid Shared/Distributed Memory Implementation

MPJ + OpenMP

Scalability is limited by the task-based, high-level parallelization.

Solution:

Two-level parallelism.

Combination of message passing with multithreaded computation of the likelihood.

Implementation of a parallel version of PhyML using OpenMP.
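
The two-level idea can be sketched in plain Java. This is only an analogue under stated assumptions: in the work described here the outer level is message passing between processes and the inner level is OpenMP inside PhyML, whereas the sketch stands in a thread pool for the outer level and a parallel stream for the inner level. The per-site formula and all names are hypothetical, and the code assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.IntStream;

// Illustrative only: two-level parallelism. The outer level distributes whole
// models; the inner level splits one model's likelihood across sites.
class TwoLevelSketch {

    // Hypothetical per-site log-likelihood; a real one depends on the model.
    static double siteLogLikelihood(int model, int site) {
        return -Math.log(2.0 + model + (site % 7));
    }

    // Inner level: the per-site terms of one model are summed in parallel.
    static double modelLogLikelihood(int model, int sites) {
        return IntStream.range(0, sites)
                .parallel()
                .mapToDouble(site -> siteLogLikelihood(model, site))
                .sum();
    }

    // Outer level: each model is an independent task on a small pool, which
    // stands in for the message-passing processes.
    static List<Double> evaluateModels(int models, int sites, int outerThreads) {
        ExecutorService outer = Executors.newFixedThreadPool(outerThreads);
        try {
            List<Future<Double>> pending = new ArrayList<>();
            for (int m = 0; m < models; m++) {
                final int model = m;
                Callable<Double> job = () -> modelLogLikelihood(model, sites);
                pending.add(outer.submit(job));
            }
            List<Double> results = new ArrayList<>();
            for (Future<Double> future : pending) {
                try {
                    results.add(future.get());
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
            return results;
        } finally {
            outer.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Double> scores = evaluateModels(4, 10_000, 2);
        for (int m = 0; m < scores.size(); m++) {
            System.out.println("model " + m + ": " + scores.get(m));
        }
    }
}
```

When there are fewer tasks than cores, the outer level alone leaves cores idle; the inner level gives each task extra threads to use, which is the scalability limit the slide describes.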

Experimental Configuration:

Pluton: Departmental cluster (16 nodes)

2xIntel Xeon 5520 Quad-core CPU (8 cores with hyper-threading per node)

8 GB RAM

InfiniBand Network 16 Gbps (QLogic QLE7240 DDR)

Linux, Sun JDK 1.6, F-MPJ, MPJ Express, OpenMPI, MVAPICH

DAS-4 VU cluster (74 nodes)

2xIntel Xeon 5620 Quad-core CPU (8 cores with hyper-threading per node)

24 GB RAM

InfiniBand Network 32 Gbps (Mellanox MT26428 QDR)

Linux, OpenJDK 1.6, F-MPJ, MPJ Express, IntelMPI

Special shared memory node (node075): 4xAMD Opteron 6172 12-core (48 cores) and 128 GB RAM

Point-to-Point Performance

[Figure: Point-to-point Performance on InfiniBand (Pluton). Latency (µs) for message sizes from 4 bytes to 1 KB and bandwidth (Gbps) for message sizes from 1 KB to 4 MB, plotted against message size (bytes). Series: F-MPJ (ibvdev) - IBV; MPJE (niodev) - IPoIB; MVAPICH v1.2.0 - IBV; OpenMPI v1.3.3 - IBV.]

[Figure: Point-to-point Performance on InfiniBand (DAS-4). Latency (µs) for message sizes from 4 bytes to 1 KB and bandwidth (Gbps) for message sizes from 1 KB to 4 MB, plotted against message size (bytes). Series: F-MPJ (ibvdev) - IBV; MPJE (niodev) - IPoIB; IntelMPI 4 - IBV.]

[Figure: Point-to-point Performance on Shared Memory (Pluton). Latency (µs) for message sizes from 4 bytes to 1 KB and bandwidth (Gbps) for message sizes from 1 KB to 4 MB, plotted against message size (bytes). Series: F-MPJ (ibvdev); F-MPJ (smpdev); MPJE (smpdev); MVAPICH v1.2.0; OpenMPI v1.3.3.]

[Figure: Point-to-point Performance on Shared Memory (DAS-4). Latency (µs) for message sizes from 4 bytes to 1 KB and bandwidth (Gbps) for message sizes from 1 KB to 4 MB, plotted against message size (bytes). Series: F-MPJ (ibvdev); F-MPJ (smpdev); MPJE (smpdev); IntelMPI 4.]

Collective Operations Performance

[Figure: Broadcast Performance (128 Processes). Aggregated bandwidth (Gbps) against message size (bytes), from 1 KB to 4 MB. Series: F-MPJ (ibvdev) - IBV; MPJE (niodev) - IPoIB; MVAPICH - IBV; OpenMPI - IBV.]

[Figure: Broadcast Performance on DAS-4 (512 Processes). Aggregated bandwidth (Gbps) against message size (bytes), from 1 KB to 4 MB. Series: F-MPJ (ibvdev) - IBV; Intel MPI 4 - IBV.]

[Figure: Broadcast Performance (8 Threads). Aggregated bandwidth (Gbps) against message size (bytes), from 1 KB to 4 MB. Series: F-MPJ (smpdev); MPJE (smpdev); MVAPICH; OpenMPI.]

[Figure: Broadcast Performance on DAS-4 (48 Threads). Aggregated bandwidth (Gbps) against message size (bytes), from 1 KB to 4 MB. Series: F-MPJ (smpdev); MPJE (smpdev); IntelMPI 4.]

ProtTest 3 Performance

[Figure: ProtTest 3 Distributed Memory Scalability on Pluton. Speedup against number of processes (1 to 128). Series: F-MPJ (ibvdev); F-MPJ (ibvdev) + OpenMP.]

[Figure: ProtTest 3 Distributed Memory Scalability on DAS-4. Speedup against number of processes (1 to 128). Series: F-MPJ (ibvdev) - IBV; F-MPJ (ibvdev) + OpenMP - IBV.]

[Figure: ProtTest 3 Shared Memory Scalability on Pluton. Speedup against number of threads (1 to 16). Series: F-MPJ (smpdev); ProtTest 3 (threads).]

[Figure: ProtTest 3 Shared Memory Scalability on DAS-4. Speedup against number of threads (1 to 48). Series: F-MPJ (smpdev); ProtTest 3 (threads); F-MPJ (smpdev) + OpenMP.]

Summary

This work presents our current research efforts for HPC in Java with F-MPJ.

F-MPJ has been applied to a real, complex problem with large-scale needs for computational resources.

It takes advantage of hierarchical architectures: shared memory, distributed memory, and hybrid shared/distributed memory.

Other applications that benefit from the use of F-MPJ: ESA Gaia project, jGadget, financial applications, petro-seismic JavaSeis, ...
