MPI support in gLite
Enol FernándezCSIC
[Title slide graphic: CREAM/WMS · MPI-Start]
MPI on the Grid
• Submission/Allocation
  – Definition of job characteristics
  – Search and select adequate resources
  – Allocate (or co-allocate) resources for the job
• Execution
  – File distribution
  – Batch system interaction
  – MPI implementation details
Allocation / Submission
Type = "Job";
CPUNumber = 23;
Executable = "my_app";
Arguments = "-n 356 -p 4";
StdOutput = "std.out";
StdError = "std.err";
InputSandBox = {"my_app"};
OutputSandBox = {"std.out", "std.err"};
Requirements = Member("OPENMPI", other.GlueHostApplicationSoftwareRunTimeEnvironment);
• Process count specified with the CPUNumber attribute
MPI-Start
• Specify a unique interface to the upper layer to run an MPI job
• Allow the support of new MPI implementations without modifications to the Grid middleware
• Support "simple" file distribution
• Provide some support for the user to help manage their data

[Diagram: Grid Middleware → MPI-START → MPI / Resources]
MPI-Start Design Goals
• Portable
  – The program must be able to run under any supported operating system
• Modular and extensible architecture
  – Plugin/component architecture
• Relocatable
  – Must be independent of absolute paths, to adapt to different site configurations
  – Remote "injection" of mpi-start along with the job
• "Remote" debugging features
MPI-Start Architecture
[Architecture diagram: the MPI-Start CORE with three plugin families:
  – Execution: Open MPI, MPICH2, LAM, PACX
  – Scheduler: PBS/Torque, SGE, LSF
  – Hooks: Local, User, Compiler, File Dist.]
Using MPI-Start (I)
$ cat starter.sh
#!/bin/sh
# This is a script to call mpi-start

# Set environment variables needed
export I2G_MPI_APPLICATION=/bin/hostname
export I2G_MPI_APPLICATION_ARGS=
export I2G_MPI_TYPE=openmpi
export I2G_MPI_PRECOMMAND=time

# Execute mpi-start
$I2G_MPI_START

stdout:
Scientific Linux CERN SLC release 4.5 (Beryllium)
Scientific Linux CERN SLC release 4.5 (Beryllium)
lflip30.lip.pt
lflip31.lip.pt

stderr:
real    0m0.731s
user    0m0.021s
sys     0m0.013s

JobType = "Normal";
CpuNumber = 4;
Executable = "starter.sh";
InputSandbox = {"starter.sh"};
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = {"std.out","std.err"};
Requirements =
  Member("MPI-START", other.GlueHostApplicationSoftwareRunTimeEnvironment)
  && Member("OPENMPI", other.GlueHostApplicationSoftwareRunTimeEnvironment);
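As a usage sketch (assuming the JDL above is saved as starter.jdl on a gLite UI; exact client commands vary with the middleware version), the job would be submitted through the WMS and its output retrieved with:

$ glite-wms-job-submit -a starter.jdl
$ glite-wms-job-status <jobID>
$ glite-wms-job-output <jobID>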
Using MPI-Start (II)

…
CpuNumber = 4;
Executable = "mpi-start-wrapper.sh";
Arguments = "userapp OPENMPI some app args…";
InputSandbox = {"mpi-start-wrapper.sh"};
Environment = {"I2G_MPI_START_VERBOSE=1", …};
...
#!/bin/bash
MY_EXECUTABLE=$1
shift
MPI_FLAVOR=$1
shift
export I2G_MPI_APPLICATION_ARGS=$*

# Convert flavor to lowercase for passing to mpi-start.
MPI_FLAVOR_LOWER=`echo $MPI_FLAVOR | tr '[:upper:]' '[:lower:]'`

# Pull out the correct paths for the requested flavor.
eval MPI_PATH=`printenv MPI_${MPI_FLAVOR}_PATH`

# Ensure the prefix is correctly set. Don't rely on the defaults.
eval I2G_${MPI_FLAVOR}_PREFIX=$MPI_PATH
export I2G_${MPI_FLAVOR}_PREFIX

# Setup for mpi-start.
export I2G_MPI_APPLICATION=$MY_EXECUTABLE
export I2G_MPI_TYPE=$MPI_FLAVOR_LOWER

# Invoke mpi-start.
$I2G_MPI_START
MPI-Start Hooks (I)
• File distribution methods
  – Copy files needed for execution using the most appropriate method (shared filesystem, scp, mpiexec, …)
• Compiler flag checking
  – Checks correctness of compiler flags for 32/64 bits, and changes them accordingly
• User hooks:
  – build applications
  – data staging
MPI-Start Hooks (II)
#!/bin/sh
pre_run_hook () {
    # Compile the program.
    echo "Compiling ${I2G_MPI_APPLICATION}"

    # Actually compile the program.
    cmd="mpicc ${MPI_MPICC_OPTS} -o ${I2G_MPI_APPLICATION} ${I2G_MPI_APPLICATION}.c"
    $cmd
    if [ ! $? -eq 0 ]; then
        echo "Error compiling program. Exiting..."
        exit 1
    fi

    # Everything's OK.
    echo "Successfully compiled ${I2G_MPI_APPLICATION}"

    return 0
}
…
InputSandbox = {…, "myhooks.sh", …};
Environment = {…, "I2G_MPI_PRE_HOOK=myhooks.sh"};
…
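The same hooks file can also define actions that run after the MPI execution, e.g. for the data staging mentioned above. A minimal sketch, assuming a post_run_hook function and an I2G_MPI_POST_HOOK variable analogous to the pre-hook (both names are an assumption here; check the mpi-start documentation for your version):

#!/bin/sh
# Hypothetical post-run hook for data staging (illustrative sketch).
post_run_hook () {
    echo "Archiving results of ${I2G_MPI_APPLICATION}"
    # Collect output files for later retrieval (file pattern assumed).
    tar czf results.tar.gz *.dat
    return 0
}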
MPI-Start: more features
• Remote injection
  – mpi-start can be sent along with the job
  – Just unpack, set environment and go!
• Interactivity
  – A pre-command can be used to "control" the mpirun call (see the sketch after this list):
    $I2G_MPI_PRECOMMAND mpirun …
  – This command can:
    • Redirect I/O
    • Redirect network traffic
    • Perform accounting
• Debugging
  – 3 different debugging levels:
    • VERBOSE: basic information
    • DEBUG: internal flow information
    • TRACE: set -x at the beginning; full trace of the execution
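A minimal sketch of such a pre-command, enabled with export I2G_MPI_PRECOMMAND=./precommand.sh just like the `time` pre-command in the earlier example (the script name and log file names are hypothetical; mpi-start itself only prepends the command to the mpirun invocation):

#!/bin/sh
# precommand.sh - hypothetical pre-command wrapper (illustrative sketch).
# mpi-start invokes: $I2G_MPI_PRECOMMAND mpirun <args>, so "$@" here is
# the complete mpirun command line.

# Perform simple accounting: record who launched what, and when.
echo "$(date): $(whoami) launching: $*" >> mpi-accounting.log

# Redirect the I/O of the parallel run to per-job files and execute it.
exec "$@" > mpi-job.out 2> mpi-job.err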
Future work (I)
• New JDL description for parallel jobs (proposed by the EGEE MPI TF):
  – WholeNodes (True/False):
    • whether or not full nodes should be reserved
  – NodeNumber (default = 1):
    • number of nodes requested
  – SMPGranularity (default = 1):
    • minimum number of cores per node
  – CPUNumber (default = 1):
    • number of job slots (processes/cores) to use
• The CREAM team is working on how to support them
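For illustration, a request for two whole nodes with at least four cores each might be described as follows (hypothetical values, using only the proposed attributes above):

WholeNodes = True;
NodeNumber = 2;
SMPGranularity = 4;
CPUNumber = 8;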
Future work (II)
• Management of non-MPI jobs
  – new execution environments (OpenMP)
  – generic parallel job support
• Support for new schedulers
  – Condor and SLURM support
• Explore support for new architectures:
  – FPGAs, GPUs, …
More Info…
• gLite MPI PT:
  – https://twiki.cern.ch/twiki/bin/view/EMI/GLiteMPI
• MPI-Start trac:
  – http://devel.ifca.es/mpi-start
  – contains user, admin and developer docs
• MPI Wiki @ TCD:
  – http://www.grid.ie/mpi/wiki