
Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 1

Account Setup and MPI Introduction

Parallel Computing & Bioinformatics Lab

Sylvain Pitre ([email protected])

Web: http://cgmlab.carleton.ca

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 2

Overview

• CGM Cluster specs

• Account Creation

• Logging in Remotely (Putty, X-Win32)

• Account Setup for MPI

• Checking Cluster Load

• Listing Your Jobs

• MPI Introduction and Basics

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 3

CGM Lab Cluster

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 4

CGM Lab Cluster (2)

• 8 dual-core workstations (total of 16 processors)
  – Named cgm01, cgm02, …, cgm08.
  – Intel Core 2 Duo 1.6GHz, 4GB DDR2 RAM, 320GB disks.
  – The server (cgm01) has an extra terabyte (1TB) of disk space.

• Connected through a dedicated gigabit switch.
• Running Fedora 8 (64-bit).
• OpenMPI (http://www.open-mpi.org/)
• cgmXX.carleton.ca (SSH, where XX = 01 to 08)
• Putty (terminal): http://www.putty.nl/download.html
• WinSCP (file transfer): http://winscp.net/eng/index.php
• X-Win32 (http://www.starnet.com/)

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 5

CGM Lab Cluster (3)

• Accounts are handled by LDAP (Lightweight Directory Access Protocol) on the server.

• User files are stored on the server and accessed by every workstation using NFS (Network File System).

• The same login and password will work on any workstation.

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 6

CGM Lab Cluster (4)

[Diagram: workstations cgm01 through cgm08 connected by the cluster switch; cgm01 acts as the NFS and LDAP server and links the cluster to the Carleton network.]

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 7

Account Creation

• To get an account, send an email to Sylvain Pitre ([email protected])

• Include in your email:
  – your full name
  – your email address (if different from the one used to send the email)
  – your supervisor's name (or course professor)
  – your preferred login name (8 characters max)

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 8

Logging In Remotely

• You can log in remotely to the cluster via SSH (Secure Shell).

• Users familiar with unix/linux should already know how to do this.

• Windows users can use Putty, a lightweight SSH client (see link on slide 4).

• Windows users can also log in via X-Win32.
• DNS names: cgmXX.carleton.ca (XX = 01 to 08)
• Log in to any node except cgm01 (the server).

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 9

Logging in with Putty

• Under Host Name, enter the cgm machine you want to log into (cgm03 in this case), then click Open.

• A terminal will open and ask for your username, then your password.

• That’s it! You are logged into one of the cgm nodes.

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 10

Login with X-Win32

• You can also log in to the nodes using X-Win32:
• Open the X-Win32 configuration program (X-Config).
• Under the Sessions tab, click on Wizard.
• Enter a name for the session (ex: cgm03) and under Type click on ssh, then click Next.
• As host, enter the name of the node you wish to connect to (ex: cgm03.carleton.ca), then click Next.
• Enter your login name and password and click Next.
• For Command, click on Linux, then click Finish.
• The new session is now added to your Sessions window.

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 11

Login with X-Win32 (2)

• Click on the newly created session, then click on Launch.
• After launching the session, you might be asked to accept a key (click on “yes”).
• You should now be at a terminal.
• You can work in this terminal if you wish (like in Putty), but if you want the visual interface, type:
  gnome-session &
• After a few seconds the visual interface will start up.
• Now you have access to all the menus and windows of the Fedora 8 interface (using Gnome).

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 12

Login with X-Win32 (3)

Demonstration

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 13

Account Setup

• First-time login:
  – Once you have your account (send me an email to get one) and log in, change your password with the passwd command.

• If you are unfamiliar with unix/linux:
  – I strongly recommend reading some tutorials and playing around with commands (but be careful!).
  – I assume you have some basic unix/linux knowledge in the rest of the slides.

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 14

“Password-less” SSH

• In order to run MPI on different nodes transparently, we need to set up SSH so it doesn’t constantly ask us for a password. Type:

ssh-keygen -t rsa
cd .ssh
cp id_rsa.pub authorized_keys2
chmod go-rwx authorized_keys2
ssh-agent $SHELL
ssh-add
cd ..

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 15

“Password-less” SSH (2)

• Now, after your initial login, you should be able to SSH into any other cgmXX machine without a password. SSH to every workstation in order to add each node to your known_hosts. Type:

ssh cgm01 date   (answer “yes” when asked)

ssh cgm02 date

…

ssh cgm08 date

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 16

Ready for MPI!

• After completing the steps above, your account is ready to run MPI jobs.

• Running big jobs on multiple processors:
  – Since there is no job scheduler, jobs are launched manually, so please be considerate. Use nodes that are not in use or that have less load (I’ll show you how to check).
  – If you need all the nodes for a longer period of time, we’ll try to reserve them for you.

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 17

Network Vs. Local Files

• If you need to do a lot of disk I/O, it is preferable to use the local disk’s /tmp directory.
  – Since your account is mounted over NFS, all files written to your home directory are sent to the server (network bottleneck).
  – To reduce network transfers, place your large input/output files in /tmp on your local node.
  – Make the filenames unique, so they don’t clash with other users’ files (see the sketch below).
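
• A minimal sketch of this (the “myjob” prefix and filename layout are illustrative): each process builds a /tmp filename from its MPI rank and process ID, so no two processes collide:

#include <stdio.h>
#include <unistd.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank;
    char path[256];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* rank + PID makes the name unique, even if two of your
       processes share the same node's /tmp */
    snprintf(path, sizeof(path), "/tmp/myjob_r%d_p%d.dat", rank, (int)getpid());

    FILE *fp = fopen(path, "w");   /* heavy I/O stays on the local disk */
    if (fp) { fprintf(fp, "scratch data\n"); fclose(fp); }

    MPI_Finalize();
    return 0;
}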

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 18

Checking Cluster Load

• To check the load on each workstation type the command: load

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 19

Listing Your Jobs

• To check all of your jobs (processes) across the cluster type: listjobs

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 20

MPI Introduction

• Message Passing Interface (MPI)
  – A portable message-passing standard that facilitates the development of parallel applications and libraries.
  – For parallel computers, clusters…
  – Not a language in its own right; it is used as a library with another language, like C or Fortran.
  – Different implementations: OpenMPI, LAM/MPI, MPICH…
  – Portable = not limited to a specific architecture.

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 21

MPI Basics

• Every node (process) executes the same code.
  – Nodes can follow different paths (master/slave model), but don’t abuse this!

• Communication is done by message passing.
• Every node has a unique rank (ID) from 0 to p−1, where p is the number of processes.
• The total number of nodes is known to every node.
• Messages can be synchronous or asynchronous.
• Thread safe.

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 22

Compiling/Running MPI Programs

• Compiler: mpicc

• Command line:

mpirun -n <p> --hostfile <hostfile> <prog> <params>

where <p> is the number of processes you want to use. It can be greater than the number of processors available (used for overloading or simulation).
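
• For example (the program and file names here are hypothetical), launching 8 processes across the nodes listed in myhosts:

mpirun -n 8 --hostfile myhosts ./myprog input.txt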

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 23

Hostfile

• For running a job on more than one node, a hostfile must be used.

• What’s in a hostfile:
  – Node name or IP.
  – How many processors (slots) on each node (1 by default).

• Example:

cgm01 slots=2

cgm02 slots=2

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 24

MPI Startup/Finalize

#include "mpi.h"int main(int argc, char *argv[]) {int rank, wsize;MPI_Init (&argc, &argv);MPI_Comm_rank(MPI_COMM_WORLD, &rank);MPI_Comm_size(MPI_COMM_WORLD, &wsize);

/* CODE */

MPI_Finalize(); return 0;}

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 25

MPI Types

MPI Type             C Type
MPI_CHAR             char
MPI_SHORT            signed short int
MPI_INT              signed int
MPI_LONG             signed long int
MPI_UNSIGNED_CHAR    unsigned char
MPI_UNSIGNED_SHORT   unsigned short int
MPI_UNSIGNED         unsigned int
MPI_UNSIGNED_LONG    unsigned long int
MPI_FLOAT            float
MPI_DOUBLE           double
MPI_LONG_DOUBLE      long double
MPI_BYTE             -
MPI_PACKED           -

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 26

MPI Functions

• Send/receive

• Broadcast

• All to all

• Gather/Scatter

• Reduce

• Barrier

• Other

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 27

MPI Send/Receive (synch)

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 28

MPI Send/Receive (synch)

• Communication between nodes (processors).
• Blocking.

int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)

int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)

*buf       send/receive buffer address
count      number of entries in buffer
datatype   data type of entries
dest       destination process rank (send)
source     source process rank (receive)
tag        message tag
comm       communicator
*status    status after operation (returned; receive only)
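
• A minimal sketch (run with at least 2 processes): rank 0 sends one integer to rank 1, which blocks in MPI_Recv() until the message arrives:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, value;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);           /* dest=1, tag=0 */
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);  /* source=0, tag=0 */
        printf("Rank 1 received %d from rank 0.\n", value);
    }

    MPI_Finalize();
    return 0;
}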

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 29

MPI Send/Receive (asynch)

• A buffer can be used with asynchronous messages.
• Problems occur when the buffer becomes empty or full.

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 30

MPI Send/Receive (asynch)

• Non-blocking (returns immediately; delivery is not guaranteed until the operation completes)

int MPI_Isend(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request *request)

int MPI_Irecv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Request *request)

Parameters are the same as MPI_Send() and MPI_Recv(), plus a *request handle that can later be passed to MPI_Wait() to check for completion.
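
• A minimal sketch of the non-blocking pair (run with at least 2 processes); the buffer must not be reused until MPI_Wait() says the operation has completed:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, value = 0;
    MPI_Request request;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 7;
        MPI_Isend(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &request);
        /* useful work could overlap the send here */
        MPI_Wait(&request, &status);   /* safe to reuse 'value' after this */
    } else if (rank == 1) {
        MPI_Irecv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
        MPI_Wait(&request, &status);   /* 'value' is valid only after this */
        printf("Rank 1 got %d.\n", value);
    }

    MPI_Finalize();
    return 0;
}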

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 31

MPI Broadcast

• One to all (including itself).

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 32

MPI Broadcast (syntax)

int MPI_Bcast(void *buf, int count, MPI_Datatype datatype, int root, MPI_Comm comm)

*buf       buffer address (sent from root, received everywhere else)

count      number of entries in buffer

datatype   data type of entries

root       rank of root

comm       communicator
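
• A minimal sketch: only root 0 sets the value, and every rank has it after the broadcast:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, n = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        n = 100;                                /* only root has the value... */

    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Rank %d has n = %d.\n", rank, n);   /* ...now every rank prints 100 */

    MPI_Finalize();
    return 0;
}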

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 33

MPI All to All

• Every process sends a message to every process (including itself).

int MPI_Alltoall(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, MPI_Comm comm)

*sendbuf    send buffer address
sendcount   number of elements sent to each process
sendtype    data type of send elements
*recvbuf    receive buffer address (loaded)
recvcount   number of elements received from each process
recvtype    data type of receive elements
comm        communicator

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 34

MPI All to All

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 35

MPI All to All (alternative)

• MPI_Alltoallv()
  – Sends data to all processes, with displacement.

int MPI_Alltoallv(void *sendbuf, int *sendcounts, int *sdispls, MPI_Datatype sendtype, void *recvbuf, int *recvcnts, int *rdispls, MPI_Datatype recvtype, MPI_Comm comm)

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 36

MPI Gather (Description)

• MPI_Gather()– Each process in comm sends the contents of send

buf to the process with rank root. The process root concatenates the received data in process rank order in recvbuf That is the data from process is followed by the data from process which is followed by the data from process, etc. The recv arguments are signicant only on the process with rank root. The argument recv count indicates the number of items received from each process not the total number received

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 37

MPI Scatter (Description)

• MPI_Scatter()– The process with rank root distributes the contents of

sendbuf among the processes. The contents of sendbuf are split into p segments each consisting of sendcount items The first segment goes to process 0, the second to process 1, etc. The send arguments are significant only on process root.

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 38

MPI Gather/Scatter

[Diagram: Gather collects one segment from each process into the root’s buffer; Scatter splits the root’s buffer into segments, one per process.]

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 39

MPI Gather/Scatter (syntax)

int MPI_Gather(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)

int MPI_Scatter(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)

*sendbuf    send buffer address
sendcount   number of send buffer elements
sendtype    data type of send elements
*recvbuf    receive buffer address (loaded)
recvcount   number of elements each receives
recvtype    data type of receive elements
root        rank of sending (scatter) or receiving (gather) process
comm        communicator
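
• A minimal sketch (assumes at most 64 processes): root scatters one integer to each process, each process doubles its integer, and root gathers the results back in rank order:

#include <stdio.h>
#include "mpi.h"

#define MAXP 64

int main(int argc, char *argv[]) {
    int rank, p, i, mine;
    int send[MAXP], result[MAXP];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    if (rank == 0)
        for (i = 0; i < p; i++) send[i] = i;   /* one item per process */

    MPI_Scatter(send, 1, MPI_INT, &mine, 1, MPI_INT, 0, MPI_COMM_WORLD);
    mine *= 2;                                 /* each process works on its own item */
    MPI_Gather(&mine, 1, MPI_INT, result, 1, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0)
        for (i = 0; i < p; i++) printf("result[%d] = %d\n", i, result[i]);

    MPI_Finalize();
    return 0;
}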

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 40

MPI Gatherv/Scatterv

• Similar to gather/scatter, but allows varying amounts of data to be sent instead of a fixed amount.

• For example, varying parts of an array can be scattered/gathered in one step.

• See the Parallel Image Processing example to see how they can be used.

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 41

MPI Gatherv/Scatterv (Syntax)

int MPI_Scatterv(void *sendbuf, int *sendcounts, int *displs, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)

int MPI_Gatherv(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int *recvcounts, int *displs, MPI_Datatype recvtype, int root, MPI_Comm comm)

*sendcounts   number of send buffer elements for each process (scatterv)
*recvcounts   number of elements received from each process (gatherv)
*displs       displacement for each process
Other parameters are the same as gather/scatter.

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 42

MPI Reduce

• Gather results and reduce them to one value using an operation (Max, Min, Sum, Product).

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 43

MPI Reduce (syntax)

int MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)

*sendbuf    send buffer address
*recvbuf    receive buffer address
count       number of send buffer elements
datatype    data type of send elements
op          reduce operation:
            - MPI_MAX   maximum
            - MPI_MIN   minimum
            - MPI_SUM   sum
            - MPI_PROD  product
root        root process rank for result
comm        communicator
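
• A minimal sketch: every process contributes its rank and root 0 receives the sum (with 4 processes, 0+1+2+3 = 6):

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, sum = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* combine one int from every process with MPI_SUM; the result lands on root 0 */
    MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Sum of ranks = %d\n", sum);

    MPI_Finalize();
    return 0;
}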

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 44

MPI Barrier

• Blocks until all processes have called it.

int MPI_Barrier(MPI_Comm comm)

comm communicator

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 45

Other MPI Routines

• MPI_Allgather(): Gather values and distribute to all.
• MPI_Allgatherv(): Gather values into specified locations and distribute to all.
• MPI_Reduce_scatter(): Combine values and scatter the results.
• MPI_Wait(): Waits for an MPI send/receive to complete, then returns.

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 46

Parallel Problem Examples

• Embarrassingly Parallel
  – Simple Image Processing (Brightness, Negative…)

• Pipelined computations
  – Sorting

• Synchronous computations
  – Heat Distribution Problem
  – Cellular Automata

• Divide and Conquer
  – N-Body Problem

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 47

MPI Hello World!

#include "mpi.h"

int main(int argc, char *argv[]) { int rank, wsize; MPI_Status status; MPI_Init (&argc, &argv); MPI_Comm_rank(MPI_COMM_WORLD, &rank); MPI_Comm_size(MPI_COMM_WORLD, &wsize);

printf("Hello World!, I am processor %d.\n",rank); MPI_Finalize(); return 0;}

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 48

Parallel Image Processing

• Input: Image of size MxN.

• Output: Negative of the image.

• Each processor should have an equal share of the work, roughly (MxN)/P.

• Master/slave model
  – The master will read in the image and distribute the pixels to the slave nodes. Once done, the slaves will return the results to the master, which will output the negative image.

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 49

Parallel Image Processing (2)

• Workload
  – If we have 32 pixels to process and 4 CPUs, each CPU will process 8 pixels.
  – For P0, the work will start at pixel 0 (displacement) and process 8 pixels (count). A sketch of this computation follows.
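
• A minimal sketch (variable names illustrative) of computing the counts and displs arrays, including the case where the pixels don’t divide evenly:

#include <stdio.h>

int main(void) {
    int total = 32, p = 4;                       /* 32 pixels, 4 CPUs (as above) */
    int counts[64], displs[64];
    int base = total / p, extra = total % p;
    int offset = 0, i;

    for (i = 0; i < p; i++) {
        counts[i] = base + (i < extra ? 1 : 0);  /* first 'extra' CPUs get one more */
        displs[i] = offset;                      /* where this CPU's pixels start */
        offset += counts[i];
    }

    for (i = 0; i < p; i++)
        printf("P%d: displ=%d count=%d\n", i, displs[i], counts[i]);
    return 0;
}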

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 50

Parallel Image Processing (3)

– Find the displacement/count for each processor.
– Master processor scatters the image.
– Execute the negative operation.
– Gather the results on the master processor.
– Displacement (displs) tells you where to start; count (counts) tells you how many to do.

MPI_Scatterv(image, counts, displs, MPI_CHAR, image, counts[myId], MPI_CHAR, 0, MPI_COMM_WORLD);

MPI_Gatherv(image, counts[myId], MPI_CHAR, image, counts, displs, MPI_CHAR, 0, MPI_COMM_WORLD);

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 51

MPI Timing

• Calculate the wall-clock time of some code. Can be executed by the master to find out the total runtime.

double start, total;

start = MPI_Wtime();
/* Do some work! */
total = MPI_Wtime() - start;
printf("Total Runtime: %f\n", total);

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 52

Compiling & Running Your First MPI Program

• Download the MPI_hello.tar.gz example from the cgmlab.carleton.ca website. In the terminal type:
  wget http://cgmlab.carleton.ca/files/MPI_hello.tar.gz

• Uncompress the files by typing:

tar zxvf MPI_hello.tar.gz

• Compile the program by typing:

make

• Run the program on all 16 cores by typing:
  mpirun -np 16 --hostfile hostfile ./hello

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 53

What To Do Next?

• There is also a prefix sums example on the cgmlab.carleton.ca website.

• Try other examples you find on the web.

• Find MPI tutorials online or in books.

• Write your own MPI programs.

• Have fun ;)

Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 54

References

• Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers, Barry Wilkinson and Michael Allen, Prentice Hall, 1999.

• MPI Information/Tutorials:
  – http://www-unix.mcs.anl.gov/mpi/learning.html

• A draft of a Tutorial/User’s Guide for MPI by Peter Pacheco:
  – ftp://math.usfca.edu/pub/MPI/mpi.guide.ps

• OpenMPI (http://www.open-mpi.org/)