Old Dominion University | ACCOUNTS 1
USER GUIDE FOR ZORKA
Table of Contents ACCOUNTS .................................................................................................................................................... 2
Zorka System Information............................................................................................................................. 2
High Performance Linpack Benchmark: ................................................................................................ 3
LOGIN AND INTERACTIVE ENVIRONMENT .................................................................................................... 5
Access Zorka through DNS load balanced login: ................................................................................... 5
Accessing from Windows Environment: ............................................................................................... 5
Zorka Naming Convention: ................................................................................................................... 5
Login and Password: ODU MIDAS ID..................................................................................................... 5
Interactive Jobs on Zorka ...................................................................................................................... 6
APPLICATION ENVIRONMENT ....................................................................................................................... 6
Applications available on Zorka ............................................................................................................ 6
Setting the environment for jobs .......................................................................................................... 6
Using Modules to define the job environment ..................................................................................... 7
FILE SYSTEM .................................................................................................................................................. 8
File systems available on Zorka: ............................................................................................................ 8
Directory structure on Zorka:................................................................................................................ 8
Attributes of available scratch spaces: ................................................................................................. 9
Access to Mass Storage for archiving ................................................................................................... 9
How to move files from/to Zorka? ........................................................................................................ 9
COMPILING AND RUNNING JOBS................................................................................................................ 10
Compilers available on Zorka: ............................................................................................................. 10
Notes on Intel Compiler Options: ....................................................................................................... 10
Message Passing Interface (MPI) ........................................................................................................ 11
Submitting Jobs using Sun Grid Engine (SGE) ............................................................................................. 12
DEBUGGING AND PROFILING ...................................................................................................................... 16
MATH LIBRARIES ......................................................................................................................................... 17
Appendix A: Research Computing Use Statement ...................................................................................... 18
Appendix B: Modules User Guide .............................................................................................................. 19
Appendix C: Summary of Intel MPI Benchmarks (IMB) .............................................................................. 22
Appendix D: Quick List of SGE Commands .................................................................................................. 23
ACCOUNTS
Who can use Zorka?
o Any valid Old Dominion University student, faculty, or staff member conducting research.
o Research collaborators sponsored by ODU faculty/staff.
How to apply?
o Prerequisites:
A valid MIDAS username and password ( http://midas.odu.edu)
Activate LIONS account
(http://www.lions2.odu.edu:8080/lions/documentation/hpcdocs/documentation/account_request)
o Use FootPrints to request an HPC cluster account (https://fp.odu.edu)
What are the conditions of use?
o Users are expected to review and adhere to the research computing acceptable use
statement (Appendix A).
HELP!
o Your requests for assistance from the Research Computing group are documented and
tracked using FootPrints: https://fp.odu.edu.
o Email: {msachon, rigloria, mhalappa, ahkumar} @odu.edu
o Phone: {msachon, rigloria, mhalappa, ahkumar} 757-683-{4856, 4842, 3073, 3092}
o Location: Engineering & Computational Sciences Bldg, Suite 4300, 4700 Elkhorn Avenue,
Norfolk, VA 23529
Zorka System Information
Type         #   Processor                                       Cores/Node  Memory/Node  Infiniband  GigE  Total Cores   Total Memory (GB)
Login        3   Two quad-core 2.493 GHz Intel Xeon (x86-64)     8           8 GB         No          Yes   3 x 8 = 24    3 x 8 = 24
Compute      40  Two dual-core 2.992 GHz Intel Xeon (x86-64)     4           8 GB         Yes         Yes   40 x 4 = 160  40 x 8 = 320
SMP          4   Four quad-core 2.393 GHz Intel Xeon (x86-64)    16          32 GB        Yes         Yes   4 x 16 = 64   4 x 32 = 128
SMP          7   Eight quad-core 2.693 GHz AMD Opteron (x86-64)  32          64 GB        Yes         Yes   7 x 32 = 224  7 x 64 = 448
Master Node  1   Two quad-core 2.493 GHz Intel Xeon (x86-64)     8           8 GB         No          Yes   1 x 8 = 8     1 x 8 = 8
I/O Nodes    4   Two dual-core 2.992 GHz Intel Xeon (x86-64)     4           8 GB         Yes         Yes   4 x 4 = 16    4 x 8 = 32
TOTAL        59                                                                                             496           960 GB
Theoretical Peak Performance:
Rpeak = (# of nodes) x (# of CPU cores per node) x (clock speed in GHz) x (flops/cycle) GFLOPS
Single compute node: Rpeak = 1 x 4 x 2.992 x 2 = 23.936 GFLOPS
Single "SMP - Intel Xeon Processor" node: Rpeak = 1 x 16 x 2.393 x 4 = 153.152 GFLOPS
Single "SMP - AMD Opteron Processor" node: Rpeak = 1 x 32 x 2.693 x 2 = 172.352 GFLOPS
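The formula above can be checked with a quick shell calculation (awk is used here purely as a calculator; the node counts and clock speeds are taken from the tables in this guide):

```shell
# Rpeak = (# nodes) x (cores per node) x (GHz) x (flops/cycle), in GFLOPS
awk 'BEGIN { printf "single compute node: %.3f GFLOPS\n", 1 * 4 * 2.992 * 2 }'
awk 'BEGIN { printf "all 40 compute nodes: %.2f GFLOPS\n", 40 * 4 * 2.992 * 2 }'
```

The second line reproduces the 957.44 GFLOPS figure for the compute partition shown in the table below.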
Node Type   # of nodes  Rpeak (GFLOPS)  Estimated (Infiniband)*  Estimated (GigE)*
Compute     40          957.44          778 (81.29% of peak)     682 (71.23% of peak)
SMP Intel   4           612.608         441 (72.03% of peak)     387 (63.11% of peak)
SMP AMD     7           1206.464        803 (66.83% of peak)     N/A
Total       44          1,570.048       1,219                    1,069
*Estimated Performance:
              Processor    RAM (2 GB per core)  Infiniband (4X)  GigE
Compute       79% of peak  98% of peak          105% of peak     92% of peak
SMP - Intel   70% of peak  98% of peak          105% of peak     92% of peak
Compute node: Estimated Rmax = 19 GFLOPS (actual performance) = 81.29% of peak
SMP Intel node: Estimated Rmax = 110 GFLOPS (actual performance) = 72.03% of peak
High Performance Linpack Benchmark:
The HPL benchmark suite, compiled with GCC compilers and using ATLAS CBLAS libraries, yielded a maximum of 869 GFLOPS on the 40 compute nodes. A complete set of results is provided in a separate document referenced in the Table of Contents.
LOGIN AND INTERACTIVE ENVIRONMENT
Access Zorka through DNS load balanced login:
ssh -X go-zorka.hpc.odu.edu
Go-zorka assigns users to one of the three login nodes using a round-robin methodology. The login
node you are assigned to is cached by your local SSH client, so you may occasionally end up on the
same node when connecting to go-zorka before the cache is flushed. You can also log directly into a
specific login node if necessary, as follows:
ssh -X zorka0.hpc.odu.edu
ssh -X zorka1.hpc.odu.edu
ssh -X zorka2.hpc.odu.edu
Note: While directories such as /home and /scratch (nfs and pvfs) are accessible from any of these machines, each login node has its own local /tmp directory. If your work on a login node writes output to /tmp that you want available from other login nodes, first copy that output to a shared directory.
Accessing from Windows Environment:
You will need an SSH client such as PuTTY or Secure Shell Client. Using your MIDAS username and password, you can download the X-Win32 SSH client from http://occs.odu.edu/hardwaresoftware/downloads/index.shtml
Zorka Naming Convention:
Node Type    Start                                  End
Compute      zorka-0-0.local                        zorka-0-39.local
SMP          zorka-44-0.local                       zorka-44-3.local
             zorka-84-0.local                       zorka-84-6.local
Login        zorka-24-0.local (zorka0.hpc.odu.edu)  zorka-24-2.local (zorka2.hpc.odu.edu)
Login and Password: ODU MIDAS ID
Please use your ODU MIDAS login and password to access the login nodes. Once you have logged
into any of the login nodes, you should be able to perform a password-less login into all the other nodes.
To enable this, select a null SSH passphrase when you are prompted the very first time you log in. If
you have set a non-null passphrase, you can delete the ~/.ssh directory, log out, and log back in. If you
prefer not to use a null passphrase, you will have to start an SSH agent each time you log in:
ssh-agent /bin/tcsh; ssh-add. Some applications, such as MPI-based jobs, need password-less login
capability to successfully run jobs across nodes.
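The key setup described above can be sketched as follows. This is a minimal illustration only: a temporary directory stands in for ~/.ssh, and on Zorka the ssh-keygen prompt at first login normally handles this for you.

```shell
# Sketch: create an RSA key pair with a null passphrase (-N "") and
# authorize it, as required for password-less logins between nodes.
# A temporary directory stands in for ~/.ssh in this demonstration.
keydir=$(mktemp -d)
ssh-keygen -q -t rsa -N "" -f "$keydir/id_rsa"
cat "$keydir/id_rsa.pub" >> "$keydir/authorized_keys"
chmod 600 "$keydir/authorized_keys"
ls "$keydir"
```

On the cluster the equivalent files would live in ~/.ssh, which is shared across nodes via your NFS home directory.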
Interactive Jobs on Zorka
Note: The "-X" switch during initial login (and every subsequent login) is important; it forwards all
the X-related streams. An alternative is to set the DISPLAY variable manually: setenv
DISPLAY <my-ip-address>:0 (tcsh) or export DISPLAY=<my-ip-address>:0 (bash).
APPLICATION ENVIRONMENT
Applications available on Zorka
Default location: /opt or /share/opt
Naming Convention: Base-Name/Version/Compiler/App-specific/
Name Type Default Location & Remarks
abaqus Parallel /opt/abaqus
atlas Serial /opt/atlas
Bio Serial/Parallel /opt/Bio ‘Also includes gromacs’
charmm Serial/Parallel /share/opt
fftw Serial/Parallel /opt/fftw ‘2.1.5 is parallel and 3.1.2 is serial’
fluent Serial/parallel /share/opt/fluent
g03 Serial/Parallel /opt/g03
smxgauss Serial /share/opt/smxgauss
google-perftools Serial /opt/google-perftools
gotoblas Serial /opt/gotoblas
gromacs Serial/Parallel /opt/gromacs
IMB_2.3 Parallel /opt/IMB_2.3
lapack Serial /opt/lapack
matlab Serial/Parallel /share/opt/matlab
metis Serial /opt/metis
mumps Parallel (MPI) /opt/mumps
parmetis Parallel (MPI) /opt/parmetis
petsc Serial/Parallel (MPI) /opt/petsc
superlu-dist Parallel (MPI) /opt/superlu-dist
valgrind Serial /opt/valgrind
zoltan Parallel (MPI) /opt/zoltan
Setting the environment for jobs
There are two ways in which a user can set up the environment for an application: A) using Modules, or B) using the .tcshrc (.bashrc) file.
Using Modules to define the job environment
Modules is a package that enables easy dynamic modification of a user's environment variables via
modulefiles.
Each modulefile contains the information needed to configure the shell for an application. Once the
Modules package is initialized, the environment can be modified on a per-module basis using the
module command, which interprets modulefiles. Typically, modulefiles instruct the module command to
alter or set shell environment variables such as PATH, MANPATH, etc. Modulefiles may be shared by
many users on a system, and users may have their own modules loaded and unloaded dynamically and
atomically, in a clean fashion.
All popular shells are supported, including bash, ksh, zsh, sh, csh, and tcsh, as well as some scripting
languages such as Perl. Users can also maintain their own modulefile collection to supplement or replace the shared modulefiles.
>> Example 1: To load the Intel compilers into your environment:
# module load intel/v10.1/compiler
Now you can readily access Intel compiler binaries and libraries from any directory without worrying
about providing path information.
>> Example 2: In order to compile your Intel-compiler-based MPI programs, you have to set up your
environment variables to point to the right MPI library based on the choice of compiler.
This can be achieved using modules with the following steps:
Step 1) Check for available MPI modulefiles: Command to use: “module which”
Step 2) Load the respective module: Command to use: “module load module-name”
#] module which
3.2.6                      : Changes the MODULE_VERSION environment variable
dot                        : adds `.' to your PATH environment variable
gcc/v4.1.2/compiler        : Loads the GCC v4.1.2 Compiler specific environment variables.
intel/v10.1/compiler       : Loads the Intel v10.1 Compiler specific environment variables.
modules                    : loads the modules environment
mvapich2-v1.0.3/gcc/gen2   : Adds GCC Compiled MVAPICH2 v1.0.3 specific environment variables to the current environment.
mvapich2-v1.0.3/intel/gen2 : Adds Intel Compiled MVAPICH2 v1.0.3 specific environment variables to the current environment.
# module load mvapich2-v1.0.3/intel/gen2
After the above command is typed, you can compile your MPI programs with the MPI compiler of your choice without even knowing its install location. Compare this method with manually editing the .tcshrc file in the following section.
For more information on using Modules, please refer to Appendix B.
B) Using .tcshrc (.bashrc) files
To make application executables readily available from the command prompt, users have to add
the complete path to the respective application binaries to the .tcshrc file. It is also necessary to
update LD_LIBRARY_PATH so the application functions correctly at run time. All of this
requires a user to edit his/her .tcshrc file often.
For example, to load the Intel-based MPI compiler into a user's environment settings, the user has to
add the following lines to his/her .tcshrc file:
setenv MPI_HOME /opt/mvapich2/1.0.3/intel-10.1/gen2
set path = ( $MPI_HOME/bin $path )
set path = ( $MPI_HOME/sbin $path )
setenv LD_LIBRARY_PATH $MPI_HOME/lib:$LD_LIBRARY_PATH
With these lines added, the user must then source the .tcshrc file:
# source ~/.tcshrc
After the above command is typed, you can compile your MPI programs with the MPI
compiler of your choice. Compare this method with using Modules as described in the previous
section.
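For bash users, a sketch of the equivalent of the tcsh lines above would go in ~/.bashrc (the MVAPICH2 path is the one quoted earlier in this section; treat it as an example install location):

```shell
# bash equivalent of the tcsh MPI environment setup above
export MPI_HOME=/opt/mvapich2/1.0.3/intel-10.1/gen2
export PATH="$MPI_HOME/bin:$MPI_HOME/sbin:$PATH"
export LD_LIBRARY_PATH="$MPI_HOME/lib:${LD_LIBRARY_PATH:-}"
# Confirm the MPI bin directory now leads the search path
echo "$PATH" | grep -q "^$MPI_HOME/bin" && echo "MPI path set"
```

After sourcing ~/.bashrc, `which mpicc` should resolve to the MVAPICH2 directory if it is installed at that path.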
FILE SYSTEM
File systems available on Zorka:
Local file system: Specific to each node, for example: /opt, /tmp, etc.
Network file system: Can be accessed from any node in the cluster, for example: /home,
/share/apps, /share/opt, /scratch/nfs, /ms etc.
Parallel file system (PVFS2): High performance file system accessible from all compute and SMP
nodes, for example: /scratch/pvfs2
Directory structure on Zorka:
Directory       Type           Limit      Remarks
/home           NFS            2 GB       Not high performance; do not run file-intensive jobs from here.
/ms             NFS (mounted)  Unlimited  Archival storage. Accesses the Research Mass Storage archival disk/tape system. /ms is only accessible on login nodes; compute nodes cannot access mass storage.
/scratch/pvfs2  PVFS           9.1 TB     High performance parallel scratch accessed over Infiniband.
/scratch/nfs    NFS            3.3 TB     Non-parallel scratch space over Gigabit Ethernet.
/scratch/smp0   Local ext3 FS  512 GB     Remote-mounted on login nodes for staging and moving results. Use ONLY when running jobs on the respective SMP node.
/scratch/smp1   Local ext3 FS  512 GB
/scratch/smp2   Local ext3 FS  512 GB
/scratch/smp3   Local ext3 FS  512 GB
Attributes of available scratch spaces:
a) /scratch/pvfs2: This is a high performance parallel scratch, well suited for large parallel I/O
jobs. If your application is I/O intensive, using this file system is important for good
performance. If your application uses MPI + ROMIO, using PVFS2 will improve performance
significantly. For further information on PVFS2, please refer to
http://www.pvfs.org/documentation/
b) /scratch/nfs: This is a regular non-parallel NFS scratch. Use of this scratch is encouraged for
non-I/O-intensive applications, parallel or serial.
c) /scratch/smpX: Where X is a placeholder for an SMP node's local scratch 0, 1, 2, or 3. This
scratch space is intended only for jobs that are contained within an individual SMP node. These
directories are made accessible outside of the SMP nodes only for convenience. To contain your
job within a particular SMP node, a specific queue request has to be made to SGE. This is
discussed in detail in the section titled "Submitting Jobs using SGE".
Access to Mass Storage for archiving
Mass Storage is a hierarchical archival storage system. Use this storage only to back up your research
data. You are not permitted to run jobs directly against this storage space.
Two ways to access your Mass Storage using Windows:
a) WinSCP: Follow the link for more on how to access Mass Storage using WinSCP:
http://www.lions.odu.edu:8080/hpcdocs/masstor/winscp
b) WebDrive: WebDrive allows you to access external file systems on your Windows computer as a
mounted drive (e.g., an L: drive). Contact your college's desktop TSP to have WebDrive installed on
your Windows machine, and the research computing group to configure WebDrive.
How to move files from/to Zorka?
o Unix/Linux to Unix/Linux: use the scp command
o Windows to Linux: use tools like WinSCP (http://winscp.net)
## Using scp to move files:
# Single file:
Zorka:] scp myFile.dat [email protected]:/home/my-id/
# Entire directory:
Zorka:] scp -r myDirectory [email protected]:/home/my-id/
COMPILING AND RUNNING JOBS
Compilers available on Zorka:
GNU Compilers: Version: 4.1.2
Language    File Suffix  Serial    MPI
Fortran 77  .f           gfortran  mpif77
Fortran 90  .f90, .F     gfortran  mpif90
C           .c           gcc       mpicc
C++         .cc, .cpp    g++       mpicxx
Use the "-m64" switch for building 64-bit ("-m32" for 32-bit) libraries and executables.
Intel Compilers: Version: 10.1
Language    File Suffix  Serial  MPI     OpenMP
Fortran 77  .f           ifort   mpif77  -openmp
Fortran 90  .f90, .F     ifort   mpif90  -openmp
C           .c           icc     mpicc   -openmp
C++         .cc, .cpp    icpc    mpicxx  -openmp
Setting up environment for compilers of your choice: You can either use Modules to link to the
compilers of your choice or set up the environment using .tcshrc file.
GNU compilers (4.1.2) are available by default. In order to access Intel compilers, please add these lines
to your .tcshrc files:
#Intel Compilers:
source /opt/intel/cce/10.1.015/bin/iccvars.csh
source /opt/intel/fce/10.1.015/bin/ifortvars.csh
#Linking with MKL Libraries:
setenv LD_LIBRARY_PATH /usr/X11R6/lib:$LD_LIBRARY_PATH
setenv LD_LIBRARY_PATH /opt/intel/mkl/10.0.1.014/lib/em64t:$LD_LIBRARY_PATH
Notes on Intel Compiler Options:
Optimization Options “Specific” for Intel Platforms:
Dual-Core Intel Xeon 5300/5100/3000 series : -axT
Quad-Core Intel Xeon                       : -axT
Quad-Core AMD Opteron                      : -xW
Other switches:
Automatic parallelization : -parallel
OpenMP                    : -openmp
Optimization flags:
Linux Comment
-O0 No optimization
-O1 Optimize for size
-O2 Optimize for speed and enable some optimizations
-O3 Enable all -O2 optimizations plus more intensive loop optimizations
-fast Shorthand for "-O3 -ipo -static -xT -no-prec-div". Note that the processor specific optimizations (-xT) will change with new compiler versions. Useful if you want to compile your program for release.
-prof_gen Compile the program and instrument it for a profile generating run.
-prof_use May only be used after running a program that was previously compiled using -prof_gen. Uses the profile information during each step of the compilation process.
Message Passing Interface (MPI)
There are three main parameters in choosing a particular flavor of MPI:
Interconnect: Gigabit Ethernet or Infiniband (always prefer Infiniband over GigE)
o Infiniband: MVAPICH2; Ethernet: OpenMPI
Base Compilers: GNU vs. Intel (prefer Intel over GCC)
MPI Standard: 1 or 2 (by default only MPI-2 is available on Zorka)
                  GCC                                 Intel
Infiniband        /opt/mvapich2/1.0.3/gcc-4.1.2/gen2  /opt/mvapich2/1.0.3/intel-10.1/gen2
Gigabit Ethernet  /opt/openmpi/1.2.7/gcc-4.1.2/64     /opt/openmpi/1.2.7/intel-10.1/64
Again there are two ways of setting up your environment for a particular flavor of MPI:
Modules, or
.tcshrc
Please refer to the section on Modules for setting up MPI using Modules. For more information on
Modules, please refer to Appendix B.
To set up your environment using .tcshrc:
Add the following lines to your .tcshrc file to link to your favorite MPI flavor:
#MPI Settings:
setenv MPI_HOME /opt/mvapich2/1.0.3/intel-10.1/gen2
set path = ( $MPI_HOME/bin $path )
set path = ( $MPI_HOME/sbin $path )
setenv LD_LIBRARY_PATH $MPI_HOME/lib:$LD_LIBRARY_PATH
Submitting Jobs using Sun Grid Engine (SGE)
Sun Grid Engine is the workload management system used on our clusters. SGE allows users to share
important system resources.
Because all users are required to use SGE (see policies), managing jobs under SGE is an important skill.
The minimum required tasks for managing a job under SGE are staging the job, submitting the job to
SGE, and managing the job's results and output (i.e., clean-up). Additional, optional tasks are
covered at the end of this section.
The staging of a job and the management of its results and output require only basic knowledge of
operating system (OS) commands. In particular, expect to use the OS commands “ls”, “cp”, and “mv” at
the command line, or their various equivalents in different environments (e.g., a GUI). Efficient practice
of these tasks benefits from a good understanding of the system architecture as discussed elsewhere.
The last step of managing your job’s results and output (or clean-up) is needed so that you can comply
with usage policies, for example, with the use of scratch spaces.
Therefore, job submission is the key task that you need to understand. The SGE job submission
commands are “qsub” and “qrsh”. The “qsub” command is used for batch submissions, while the
"qrsh" command is for interactive use. When you are logged in, use the manpages for these
commands frequently as a reference.
The minimum required argument for the SGE “qsub” command is the job script. The syntax for this
basic usage is shown below (the “$ > ” sign is your OS command prompt).
$ > qsub job.script
Your job 170 ("job.script") has been submitted
In every case, you will receive a response from SGE (similar to the one shown) after you press the
<Enter> key. The response shows the job ID (“170”, in this case), and it echoes the filename of your job
script (assuming you did not provide other options that change the job name).
Sun Grid Engine allows us to define parallel programming and runtime environments, known as
Parallel Environments (PE), for the execution of shared memory or distributed memory parallelized
applications. We at OCCS have defined a set of PE's for the Zorka cluster to fit the needs of
commonly used applications and their run-time environments. You can identify the parallel
environment based on the application you want to run, and request that PE in your job script using
the switch "-pe pe_name number_of_slots". A detailed example of a job script showing how to
request a PE is discussed in the section "Job Script".
If you choose not to specify a PE, or you need to run a sequential job, simply omit the switch
"-pe pe_name number_of_slots" from the command line parameters or the job script.
Following are the tables showing the association of PE’s for both Batch
and Interactive Jobs.
Resource Description                          Parallel Environments (PE)                                     Queue  Cores
2 dual-core 2.99 GHz Intel Xeon processors    charmm, impi, linda, mvapich2, mpi-charmm, mpi-fluent          q2x2   160
4 quad-core 2.393 GHz Intel Xeon processors   impi-smp, linda, mvapich2-smp, mpi-charmm-smp, mpi-fluent-smp  q4x4   64
8 quad-core 2.693 GHz AMD Opteron processors  mvapich2-smp8x4, linda                                         q8x4   224
Table: Batch job PE's
Resource Description                          Parallel Environments (PE)  Queue  Cores
2 dual-core 2.99 GHz Intel Xeon processors    gui                         q2x2   160
4 quad-core 2.393 GHz Intel Xeon processors   gui-smp                     q4x4   64
8 quad-core 2.693 GHz AMD Opteron processors  gui-smp8x4                  q8x4   224
Table: Interactive job PE's
If you are overwhelmed by the above range of PE's and would like to identify the appropriate PE for
your job, send us a brief email describing what kind of experiment you are running and we will guide
you through it.
Job Script:
A job script is an ASCII file, which means you can read its contents using the OS "cat" command. The
file "job.script" must not be a binary executable file. Usually, a "#" at the beginning of a line indicates
a comment. The special "#!" on the first line indicates the shell (please refer to your shell's
documentation for details). SGE uses "#$" to indicate "qsub" arguments stored within the job script
file, which saves you time and the need to remember the correct "tried-and-true" options for your job.
The following is a sample job script for a sequential job:
#!/bin/tcsh
#
# Replace "a.out" with your application's binary executable filename.
# If your program accepts or requires arguments, they are listed after the name
# of the program -- please refer to your application's documentation for details.
#
# For example:
# a.out output.file
#
a.out
To use the job script template shown, change the “a.out” file as necessary.
You can add many typical OS shell commands (“tcsh” in this case) to this script to automate many of the
tasks you need to do, including some of the staging and clean-up tasks. For example, for staging, you can
compile some programs you want. As another example, after “a.out”, you can delete files that you know
won’t be needed later.
The following is a sample job script for an MVAPICH2 parallel job:
#!/bin/tcsh
#
# Check out the SGE options listed here
#
#$ -cwd
#$ -o MyOutPutFile.txt -j y
#$ -S /bin/tcsh
#$ -q q2x2
#
# Modify the 6 at the end of this option to request a different number of
# parallel slots
#
#$ -pe mvapich2 6
#
# Tell SGE the version of MVAPICH2 to use
#
#$ -v mvapich2="mvapich2-v1.0.3/intel/gen2"
#
# Load the respective module file
#
module load mvapich2-v1.0.3/intel/gen2
#
# Run the binary executable file hello.exe
# Refer to MVAPICH2 application development on how to get hello.exe
#
mpiexec -machinefile $TMPDIR/machines -n $NSLOTS ./hello.exe
Note the use of “#$” to specify various SGE “qsub” options within the job script.
For other types of job scripts, we ask that you consult us and the specific application documentation.
There are other SGE commands (refer to Appendix D) that are useful, though not absolutely necessary,
for using SGE. Some of these commands are listed in the two tables below, along with their functions.
Some options are shown; try them out, or check the manpage for each function.
DEBUGGING AND PROFILING
Tools for debugging:
Valgrind: /opt/valgrind/3.3.1
GDB
Tools for Profiling:
Google Perftools: /opt/google-perftools/0.98/gcc-4.1.2
Mpdtrace: -mpe=mpdtrace
Jumpshot: -mpe=mpilog
Google Perftools details:
#Google Performance Tools
setenv GooglePerf_HOME /opt/google-perftools/0.98/gcc-4.1.2
set path = ( $GooglePerf_HOME/bin $path )
setenv LD_LIBRARY_PATH $GooglePerf_HOME/lib:$LD_LIBRARY_PATH
setenv PROFILEFREQUENCY 1000000
#CPUPROFILE=/tmp/profile /usr/local/netscape
JumpShot Details:
#Compilation:
mpicc -mpe=mpilog myMpiCode.c
#Execution:
mpiexec -np x myMpiCode.exe
#This creates a myMpiCode.exe.clog2 file
#Conversion:
clog2ToSlog myFile.clog2
#This creates a Slog2 file; to visualize it:
jumpshot myFile.slog2
Sample JumpShot picture:
Figure 2: A sample JumpShot picture
MATH LIBRARIES
Math libraries currently available on Zorka:
Intel MKL: /opt/intel/mkl/10.0.1.014
Atlas: /opt/atlas/3.8.1/gcc-4.1.2
Goto BLAS: /opt/gotoblas/1.26
LAPACK: /opt/lapack/3.1.1
FFTW: /opt/fftw (versions 2.1.5 and 3.1.2)
Solvers that are currently available are:
SuperLU-Dist: /opt/superlu-dist/2.2/intel-10.1/mpich2-1.0.7-default
MUMPS: /opt/mumps/4.7.3/intel-10.1/mpich2-1.0.7-default
PETSc: /opt/petsc/2.3.3/intel-10.1/mpich2-1.0.7-default
MATLAB: /opt/matlab
Appendix A: Research Computing Use Statement
The purpose of the following guidelines is to promote awareness of computer security issues and to
ensure that ODU's computing systems are used in an efficient, ethical, and lawful manner. Users
must adhere to the computing Acceptable Usage Policy (available at http://midas.odu.edu on the
security setting link). In addition, the following guidelines are established for use of the research
facilities.
1. While no hard limits are set for a maximum time limit or the number of nodes, jobs requiring extraordinary resources should be discussed with the research support group prior to execution.
2. ODU cluster accounts are to be used only for university research activity. Use is not allowed for non-research or commercial activities. Unauthorized use may constitute grounds for account termination and/or legal action.
3. The research computing systems are non-classified systems. Classified information may not be processed, entered, or stored.
4. Users are responsible for protecting and archiving their programs/data/results.
5. Users are required to report any computer security incident to the research computing group. Users shall not download, install, or run security programs or utilities to identify weaknesses in the security of a system. For example, ODU users shall not run password cracking programs.
6. Users shall not attempt to access any data or programs contained on systems for which they do not have authorization or explicit consent of the owner of the data.
7. Users must have proper authorization and licenses to install or use copyrighted programs or data. Unauthorized use can lead to account suspension, account termination and may be subject to legal action.
8. You must use your ODU login name when using research facilities and may not access another's account. Users shall not provide information to others for unauthorized access.
9. Users shall not intentionally engage in activities to: harass other users; degrade the performance of systems; deprive an authorized user access to a resource; obtain extra resources beyond those allocated; circumvent computer security measures or gain access to a system for which proper authorization has not been given; misuse batch queues or other resources in ways not authorized or intended.
Appendix B: Modules User Guide
a) To list the help menu for modules, type:
# module -H
Modules Release 3.2.6 2007-02-14 (Copyright GNU GPL v2 1991):
Usage: module [ switches ] [ subcommand ] [subcommand-args ]
Switches:
-H|--help this usage info
-V|--version modules version & configuration options
-f|--force force active dependency resolution
-t|--terse terse format avail and list format
-l|--long long format avail and list format
-h|--human readable format avail and list format
-v|--verbose enable verbose messages
-s|--silent disable verbose messages
-c|--create create caches for avail and apropos
-i|--icase case insensitive
-u|--userlvl <lvl> set user level to
(nov[ice],exp[ert],adv[anced])
Available SubCommands and Args:
+ add|load modulefile [modulefile ...]
+ rm|unload modulefile [modulefile ...]
+ switch|swap [modulefile1] modulefile2
+ display|show modulefile [modulefile ...]
+ avail [modulefile [modulefile ...]]
+ use [-a|--append] dir [dir ...]
+ unuse dir [dir ...]
+ update
+ refresh
+ purge
+ list
+ clear
+ help [modulefile [modulefile ...]]
+ whatis [modulefile [modulefile ...]]
+ apropos|keyword string
+ initadd modulefile [modulefile ...]
+ initprepend modulefile [modulefile ...]
+ initrm modulefile [modulefile ...]
+ initswitch modulefile1 modulefile2
+ initlist
+ initclear
b) To list currently loaded modules:
# module list
c) To list the modules currently available to load, with a one-line description of each:
# module whatis
3.2.6 : Changes the MODULE_VERSION environment variable
:
dot : adds `.' to your PATH environment variable
mvapich2-v1.0.3/gcc/gen2:
Adds GCC Compiled MVAPICH2 v1.0.3 specific environment
variables to the current environment.
mvapich2-v1.0.3/intel/gen2:
Adds Intel Compiled MVAPICH2 v1.0.3 specific environment
variables to the current environment.
null : does absolutely nothing
d) How to load a module:
>> First check what your current PATH contains:
# echo $PATH
/opt/gridengine/bin/lx26-amd64:/usr/java/jdk1.5.0_10/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
>> Now Load the Module
# module load mvapich2-v1.0.3/gcc/gen2
>> Now that the module is loaded, check the path:
# echo $PATH
/opt/mvapich2/1.0.3/gcc-4.1.2/gen2/bin:/opt/gridengine/bin/lx26-amd64:/usr/java/jdk1.5.0_10/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
Note that the environment settings for MVAPICH2 are added ahead of all existing path entries, so the MVAPICH2 directories will be searched first.
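This first-match behavior can be demonstrated with plain shell. The sketch below uses throwaway directories and a stand-in command name (nothing here refers to real files on Zorka):

```shell
#!/bin/sh
# Two throwaway directories, each holding an executable with the same name.
# The directory prepended to PATH most recently is searched first.
base=$(mktemp -d)
mkdir -p "$base/old/bin" "$base/new/bin"
printf '#!/bin/sh\necho old\n' > "$base/old/bin/mycmd"
printf '#!/bin/sh\necho new\n' > "$base/new/bin/mycmd"
chmod +x "$base/old/bin/mycmd" "$base/new/bin/mycmd"

PATH="$base/old/bin:$PATH"   # like loading a first module
PATH="$base/new/bin:$PATH"   # like loading a second module on top
export PATH
mycmd                        # prints "new": the newest prepend wins
```

This is exactly why loading a module makes its tools the default without removing anything already on your PATH.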
>> Important: to load another version of MVAPICH2, you must first unload the previously loaded version. The following example shows what happens when you try to load a module that conflicts with an already loaded one:
# module load mvapich2-v1.0.3/intel/gen2
mvapich2-v1.0.3/intel/gen2(24):ERROR:150: Module 'mvapich2-v1.0.3/intel/gen2' conflicts with the currently loaded module(s) 'mvapich2-v1.0.3/gcc/gen2'
mvapich2-v1.0.3/intel/gen2(24):ERROR:102: Tcl command execution failed: conflict mvapich2-v1.0.3
>> To avoid the above error, unload the previously loaded conflicting module and then load the new module:
# module unload mvapich2-v1.0.3/gcc/gen2
# module load mvapich2-v1.0.3/intel/gen2
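The switch|swap subcommand listed earlier performs this unload-and-load as a single step (# module switch mvapich2-v1.0.3/gcc/gen2 mvapich2-v1.0.3/intel/gen2). Under the hood, unloading simply reverses the module's path changes; a rough sketch of that reversal in plain shell (the helper name is ours, not part of the module command, which does this internally in Tcl):

```shell
# path_remove: drop one directory from a colon-separated list, the core of
# what unloading a module does to PATH-style variables.
path_remove() {
    # $1 = directory to remove, $2 = colon-separated list; prints the cleaned list
    printf '%s' "$2" | tr ':' '\n' | grep -Fxv "$1" | paste -sd: -
}

p="/opt/mvapich2/1.0.3/gcc-4.1.2/gen2/bin:/usr/bin:/bin"
p=$(path_remove /opt/mvapich2/1.0.3/gcc-4.1.2/gen2/bin "$p")   # "unload" the gcc build
p="/opt/mvapich2/1.0.3/intel-10.1/gen2/bin:$p"                 # "load" the intel build
echo "$p"   # /opt/mvapich2/1.0.3/intel-10.1/gen2/bin:/usr/bin:/bin
```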
e) Help on a particular modulefile:
# module help mvapich2-v1.0.3/gcc/gen2
----------- Module Specific Help for 'mvapich2-v1.0.3/gcc/gen2' ---------------
mvapich2-v1.0.3/gcc/gen2 - Load this module ...
1) To add MVAPICH2 v1.0.3 specific PATH & LD_LIBRARY_PATH
variables to the current environment
2) To compile your programs written using MPI.
3) In your job submit script to run MPI jobs in parallel.
f) To show the environment changes made by a specific module:
# module show mvapich2-v1.0.3/intel/gen2
-------------------------------------------------------------------
/usr/local/Modules/3.2.6/modulefiles/mvapich2-v1.0.3/intel/gen2:
module-whatis
Adds Intel Compiled MVAPICH2 v1.0.3 specific
environment variables to the current environment.
conflict mvapich2-v1.0.3
setenv MPI_HOME /opt/mvapich2/1.0.3/intel-10.1/gen2
prepend-path PATH /opt/mvapich2/1.0.3/intel-10.1/gen2/bin
prepend-path MANPATH /opt/mvapich2/1.0.3/intel-10.1/gen2/man
prepend-path LD_LIBRARY_PATH /opt/mvapich2/1.0.3/intel-10.1/gen2/lib
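The setenv and prepend-path lines above translate directly into ordinary shell exports. Loading this modulefile is roughly equivalent to the following (a sketch only: the real module command also records these changes so it can undo them on unload):

```shell
#!/bin/sh
# Hand-written equivalent of "module load mvapich2-v1.0.3/intel/gen2".
export MPI_HOME=/opt/mvapich2/1.0.3/intel-10.1/gen2          # setenv MPI_HOME
export PATH="$MPI_HOME/bin:$PATH"                            # prepend-path PATH
export MANPATH="$MPI_HOME/man:$MANPATH"                      # prepend-path MANPATH
export LD_LIBRARY_PATH="$MPI_HOME/lib:$LD_LIBRARY_PATH"      # prepend-path LD_LIBRARY_PATH
```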
Appendix C: Summary of Intel MPI Benchmarks (IMB)
MPI Test        Input Parameters                  Units          Infiniband (MVAPICH)   Ethernet (OpenMPI)
PingPong        #bytes = 4194304                  Mbytes/sec     1235.01                1179.18
PingPing        #bytes = 4194304                  Mbytes/sec     1080.96                996.64
Sendrecv        #bytes = 4194304                  Mbytes/sec     2148.80                721.18
Exchange        #processes = 64, #bytes = 65536   Mbytes/sec     2158.23                685.52
Allreduce       #processes = 64, #bytes = 65536   t_avg in usec  595.47                 1465.82
Reduce          #processes = 64, #bytes = 65536   t_avg in usec  412.41                 3376.40
Reduce_scatter  #processes = 64, #bytes = 65536   t_avg in usec  329.40                 1135.95
Allgather       #processes = 64, #bytes = 65536   t_avg in usec  10278.69               10934.29
Allgatherv      #processes = 64, #bytes = 65536   t_avg in usec  10268.31               217895.26
Alltoall        #processes = 64, #bytes = 65536   t_avg in usec  14364.32               15284.80
Bcast           #processes = 64, #bytes = 65536   t_avg in usec  432.08                 454.29
Barrier         #processes = 64                   t_avg in usec  170.65                 178.21
Appendix D: Quick List of SGE Commands
SGE Command   Some Options (not all)             Function or Purpose
qstat         -help, -g c, -f, -q <queue name>   Find out the general status of the SGE environment, including what jobs are running.
qjobs         (none)                             Not a real SGE command, but a script built on the new "qstat". Experienced users will find its output similar to "qstat" on our older clusters; the new "qstat" works a little differently.
qdel          -help, <job-id>                    Delete a submitted job, whether or not it is running. You can only delete your own jobs.
qalter        -help, <job-id>                    Change a job's options.
qhost         -q                                 Find information about the compute nodes.
qacct         -help                              Find out usage for one, several, or all jobs submitted through SGE.
qmon          (none)                             Use SGE in a GUI. Requires X.
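Most of these commands operate on jobs submitted with qsub. For reference, a minimal SGE submit script might look like the following sketch; the job name, parallel environment name, slot count, and program are all illustrative, so check qconf -spl and qhost -q on Zorka for the real names:

```shell
#!/bin/sh
# Minimal SGE submit script (sketch; all names below are illustrative).
#$ -N my_mpi_job      # job name, as shown by qstat
#$ -cwd               # run the job from the directory it was submitted from
#$ -j y               # merge stderr into stdout
#$ -pe mpi 64         # parallel environment and slot count (PE name assumed)

module load mvapich2-v1.0.3/gcc/gen2   # set up the MPI environment (see Appendix B)
mpirun -np $NSLOTS ./my_program        # SGE sets NSLOTS to the granted slot count
```

Submit the script with "# qsub myscript.sh" and monitor it with qstat (or qjobs) as described above.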