Parallel MATLAB: Doing it Right - people.csail.mit.edu/cly/rqe.pdf

Parallel MATLAB: Doing it Right

Ron Choy
CSAIL, MIT

• The case for parallel MATLAB
• Parallel MATLAB Survey
• Star-P

The Rise of Clusters

• Cheap and everyone seems to have one

Some MIT Clusters

• Cheap & everyone seems to have one

Impact of Clusters

• More and more ‘non-experts’ are using clusters in their work/research

• Increased computing power does not necessarily translate to increased productivity

• Someone has to program these parallel machines, and this can be hard

Classes of interesting problems

• Large collection of small problems – easy, usually involves running a large number of serial programs – embarrassingly parallel

• Large problems – might not even fit into the memory of a machine. Much harder since it usually involves communication between processes

Parallel programming (1990 – now)

• Dominant model (for distributed memory): MPI (Message Passing Interface)

• Low level control of communications between processes

• Send, Recv, Scatter, Gather …

MPI

• It is not an easy way to do parallel programming.

• Error prone – it is hard to get complex low level communications right

• Hard to debug – MPI programs tend to die with obscure error messages. There is also no good debugger for MPI. (Totalview? Multiple GDBs?)

The result?

• A lot of researchers in the sciences need the power of parallel computing, but do not have the expertise to tackle the programming

• Time saved by using parallel computers is offset by the additional time needed to program them

• Reduced productivity!

The result? (2)

• Also the parallel programs written might not be efficient! Writing good parallel programs is hard

• A high level tool is needed to hide the complexity away from the users of parallel computers

Let’s go back to the serial world for a moment

What do people want?

• Real applications programmers care less about “p-fold speedups”

• Ease of programming, ease of debugging, and getting the correct answer matter much more

MATLAB

• MATrix LABoratory

• Started out as an interactive frontend to LINPACK and EISPACK

• Has since grown into a powerful computing environment with ‘Toolboxes’ ranging from neural networks to computational finance to image processing.

• Many current users know nothing/little about linear algebra (Toolbox users)

MATLAB ‘language’

• MATLAB has a C-like scripting language, with FORTRAN90-like array indexing

• Interpreted language

• A wide range of linear algebra routines

• Very popular in the scientific community as a prototyping tool – easier to use than C/FORTRAN

MATLAB ‘language’ (2)

• MATLAB makes use of ATLAS (optimized BLAS) for basic linear algebra routines

• Still, performance is not as good as C/FORTRAN, especially for loops

• However MATLAB is catching up with its release 13 – (just-in-time compiling?)

Why do people like MATLAB?

• MATLAB made good choices on numerical libraries that are efficient, and combined them into one interactive, easy to use package.

• People were skeptical at first because they felt MATLAB slowed things down. In the end, convenience, not performance, won.

Wouldn’t it be nice to mix the convenience of MATLAB with parallel computing?

Why there should be a parallel MATLAB

• Cleve Moler, ‘Why there isn’t a parallel MATLAB’
  – Memory Model
  – Granularity
  – Business Situation

Parallel MATLAB Survey

• http://theory.lcs.mit.edu/~cly/survey.html
• 27 parallel MATLAB projects found, using 4 approaches
  – Embarrassingly Parallel
  – Message Passing
  – Backend Support
  – Compiler

Parallel MATLABs

CMTM, DP-Toolbox, MatlabMPI, MATmarks, MPITB/PVMTB, MultiMATLAB, Parallel Toolbox for MATLAB

DistributePP, MATLAB Parallelization Toolkit, MULTI Toolbox, Paralize, Parmatlab, Plab, PMI

Dlab, Paramat, PLAPACK, Matpar, MATLAB*P, Netsolve

CONLAB Compiler, FALCON, MATCH, Menhir, Otter, ParAL, RTExpress

Parallel MATLABs

Cornell; Rostock, Germany; Lincoln Labs; UIUC; Granada, Spain; Cornell; Wake Forest

WUStL; Linkopings, Sweden; Purdue; Chalmers, Sweden; Northeastern; TU Denmark; Lucent

UIUC; Alpha Data UK; UT Austin; JPL; MIT; UTK

Umea, Sweden; UIUC; Accelchip; IRISA, France; OSU; Sydney, Australia; ISI

Embarrassingly Parallel

• Multiple MATLABs needed
• Scatter/Gather at the beginning and end
• No lateral communication
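
The scatter, compute, gather pattern can be sketched serially in plain MATLAB. This is only an illustration: the worker count `nworkers` and the per-chunk function `task` are hypothetical names, and the loops stand in for the independent MATLAB processes that an embarrassingly parallel tool would launch.

```matlab
% Serial emulation of scatter -> independent work -> gather.
% nworkers and task are hypothetical names chosen for this sketch.
nworkers = 4;
inputs   = 1:10;                 % ten small, independent problems
task     = @(v) sum(v.^2);       % work done on one chunk, no communication

% Scatter: deal the inputs round-robin into one chunk per worker.
chunks = cell(1, nworkers);
for w = 1:nworkers
    chunks{w} = inputs(w:nworkers:end);
end

% Compute: each chunk is processed on its own. The iterations never
% exchange data, which is what "no lateral communication" means.
partial = zeros(1, nworkers);
for w = 1:nworkers
    partial(w) = task(chunks{w});
end

% Gather: combine the per-worker results at the end.
total = sum(partial);
```

Because the chunks never interact, the middle loop could run on separate machines without changing the result.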

Message Passing

• Multiple MATLABs required
• MATLABs talk to each other in a general MPI fashion
• Superset of embarrassingly parallel

Backend support

• Requires 1 MATLAB
• MATLAB is used as front end to a parallel computation engine

Compiler

Compiler (2)

• Requires zero to one MATLAB
• Compiles MATLAB scripts into native code and links them with parallel numerical libraries

What do we learn from the survey?

• The projects themselves are ‘existence proofs’ of what users want from a parallel MATLAB

• There are way too many parallel MATLAB projects

Wouldn’t it be nice to have a single project that covers it all, and does it better than the rest?

Star-P

• Goal: To provide a unified parallel MATLAB for user-friendly parallel computing

• Uses MATLAB as a front end to a parallel linear algebra server

• Encompasses the 4 approaches in the survey

Star-P (2)

• UI idea based on MATLAB*P
• Substantially different from MATLAB*P
  – New code base
  – Unified vs. backend support

Star-P (3)

• Server leverages popular parallel numerical libraries:
  – ScaLAPACK
  – FFTW
  – PARPACK
  – …

• As well as custom-made parallel routines
  – Sorting
  – Sparse Matrix routines

Star-P Example

A = rand(4000*p, 4000*p);
x = randn(4000*p, 1);
y = zeros(size(x));
while norm(x-y) / norm(x) > 1e-11
    y = x;
    x = A*x;
    x = x / norm(x);
end;
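
The loop on this slide is a power iteration: it repeatedly multiplies by A and normalizes until the vector stops changing. As a rough serial sketch, the same loop can be run without `*p` on a small matrix whose dominant eigenvalue is known in closed form. The 2-by-2 matrix, the starting vector, the iteration counter `iters` and the final Rayleigh quotient `lambda` are additions made for this illustration and are not part of the slide.

```matlab
% Same loop as the Star-P example, run serially on a tiny matrix.
% A and x are illustrative choices, not the slide's random 4000*p data.
A = [2 1; 1 3];            % symmetric, eigenvalues (5 +/- sqrt(5))/2
x = [1; 1];
y = zeros(size(x));
iters = 0;                 % added only to count the sweeps
while norm(x-y) / norm(x) > 1e-11
    y = x;
    x = A*x;
    x = x / norm(x);
    iters = iters + 1;
end

% Rayleigh quotient of the converged unit vector estimates the
% dominant eigenvalue (an extra step, not on the slide).
lambda = x' * A * x;
```

In Star-P, the only change needed to distribute the data is the `*p` in the size arguments; the loop body itself is left untouched.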

Structure of the System

[Figure: system diagram. Labels: MATLAB, Client, Proxy, Client Manager, Server Manager, Package Manager (PBlas, FFTW, Scalapack, PBase), Matrix Manager (Matrix 1, Matrix 2, Matrix 3), Server #0 to Server #4]

Structure of the system (2)

• Client (C++)
  – Attaches to MATLAB through MEX interface
  – Relays commands to server and processes returns

• Server (C++/MPI)
  – Manipulates/processes data (matrices)
  – Performs the actual parallel computation by calling the appropriate library routines

Functionalities provided

• PBLAS, ScaLAPACK
  – solve, eig, svd, chol, matmul, +, -, …

• FFTW
  – 1D and 2D FFT

• PARPACK
  – eigs, svds

• Sparse Matrix Routines
• ‘MultiMATLAB’ mode

Focus

• Requires minimal learning on user’s part
• Reuses existing scripts
• Mimics MATLAB behaviour
• Data stay on backend until explicitly retrieved by user
• Extendable backend
• Be a system that fits all users

Comparison

Star-P | MATLAB
Frontend to PBLAS, ScaLAPACK, … | Frontend to LINPACK, EISPACK
Focus is on usability not performance | Focus is on usability not performance
Extendable through ‘Packages’ and ‘Toolboxes’ | Extendable through ‘Toolboxes’

What is *p?

• Star-P creates a new MATLAB class called dlayout

• By providing overloaded functions for this new class, parallelism is achieved transparently

• *p is dlayout(1)

Example

• A = randn(1024*p,1024*p);
• E = eig(A);
• e = pp2matlab(E);
• plot(e,'*');

Example 2

• H = hilb(n) % H = hilb(8000*p)
  – J = 1:n;
  – J = J(ones(n,1),:);
  – I = J';
  – E = ones(n,n);
  – H = E./(I+J-1);
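
The body of hilb listed above uses only indexing, transpose, ones and elementwise arithmetic. A quick serial check, with the small size `n = 5` chosen only for illustration, confirms that these five lines reproduce MATLAB's built-in `hilb`; the variable `err` is an addition for this sketch.

```matlab
n = 5;                  % small illustrative size; the slide uses 8000*p
J = 1:n;
J = J(ones(n,1),:);     % replicate the row 1:n into an n-by-n matrix
I = J';                 % I(i,j) = i, J(i,j) = j
E = ones(n,n);
H = E./(I+J-1);         % H(i,j) = 1/(i+j-1), the Hilbert matrix

% Largest elementwise difference from the built-in function.
err = max(max(abs(H - hilb(n))));
```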

Built-in MATLAB functions

• hilb is a MATLAB function that gets parallelized automatically

• Another example of a MATLAB function that works is cgs

• We do not know how many of them work

Where is the data?

• One often-asked question in parallel computing is: where do the data reside?

• Star-P supports 3 data distributions:
  – A = randn(m*p,n) – row distribution
  – B = randn(m,n*p) – column distribution
  – C = randn(m*p,n*p) – 2D block cyclic distribution
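
To make "where the data reside" concrete, the sketch below computes which process would own each row or element under two layouts. It rests on assumptions the slides do not state: the process count `np`, block size `nb` and grid shape `pr`-by-`pc` are hypothetical values, the row distribution is assumed to use contiguous blocks, and the block-cyclic formula follows the usual ScaLAPACK-style convention rather than anything confirmed about Star-P's internals.

```matlab
% Owner of each global row under a contiguous row distribution.
% np, nb, pr, pc are assumed values for illustration only.
np = 4;  m = 16;
rows = 1:m;
row_owner = floor((rows - 1) / ceil(m / np));   % 4 consecutive rows each

% Owner of each element under a 2D block-cyclic layout:
% nb-by-nb blocks are dealt round-robin over a pr-by-pc process grid.
pr = 2;  pc = 2;  nb = 2;  n = 8;
[I, J] = ndgrid(1:n, 1:n);
prow  = mod(floor((I - 1) / nb), pr);   % grid row holding element (i,j)
pcol  = mod(floor((J - 1) / nb), pc);   % grid column holding element (i,j)
owner = prow * pc + pcol;               % row-major rank on the grid (assumed)
```

With these values `owner(1:4,1:4)` tiles the four ranks in 2-by-2 blocks and the pattern repeats every four rows and columns, which is what spreads work evenly when only part of the matrix is active.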

MultiMATLAB mode

• Similar to MultiMATLAB from Cornell

• Starts up a MATLAB process on each node, and runs MATLAB scripts on them in parallel

• Interesting feature is that the data can come from the ‘regular’ mode of Star-P

Example 3

% FFT2 in 4 lines
• A = randn(10000,10000*p);
• B = mm('fft',A);
• C = B.';
• D = mm('fft',C);
• F = D.';
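
The four lines work because a 2D FFT is a 1D FFT along the columns followed by a 1D FFT along the rows. The serial sketch below replaces `mm('fft', ...)` with plain `fft` and compares the result against `fft2`; the small deterministic test matrix and the variable `relerr` are assumptions made for this check, not part of the slide.

```matlab
% Small deterministic test matrix (illustrative; the slide uses randn).
A = reshape(mod((1:48) * 7, 11), 6, 8);

B = fft(A);     % 1D FFT down each column
C = B.';        % non-conjugate transpose: rows become columns
D = fft(C);     % 1D FFT along what were the rows of A
F = D.';        % transpose back to the original orientation

% Relative difference from the built-in 2D transform.
relerr = norm(F - fft2(A), 'fro') / norm(fft2(A), 'fro');
```

In the Star-P version, A is column-distributed, so each column FFT touches data held by a single process, and the transposes are the only steps that move data between processes.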

MPI in MATLAB

• User can use MPI-like commands inside MultiMATLAB mode

function result = simpletest(X)
import mpi.*;
MPI.init;
if MPI.COMM_WORLD.rank == 0
    data = [4 2];
    MPI.COMM_WORLD.send(data, 2, 1, 13);
    result = 9;
elseif MPI.COMM_WORLD.rank == 1
    got = MPI.COMM_WORLD.recv(2, 0, 13);
    result = got(1) + got(2);
end
MPI.fnlz;

Compilation

• MATLAB R14 (prerelease) provides the ability to compile MATLAB m-files that use Star-P functionality into binaries.

• The binaries combined with a running Star-P server enable compiled m-files to be run in parallel.

Types of parallel problems

• Collection of small problems
  – MultiMATLAB mode
  – MPI within MultiMATLAB mode

• Large problems
  – ‘Regular mode’ – ScaLAPACK, FFTW, …

• We even allow a mixture of the two modes to give additional flexibility – FFT2

Visualization package

• A term project done by a group of students in a parallel computing class at MIT

• Provides the equivalent of mesh, surf and spy for distributed matrices

• Visualization of very large matrices is now possible in MATLAB

Why is Star-P interesting?

• First parallel MATLAB system that encompasses the 4 approaches in the survey

• First system that allows seamless (to the user) integration between a large number of parallel numerical libraries

More

• Processor maps – ScaLAPACK magic?
• Dynamic addition of processors - Singapore
• Star-P on a grid – UCSB, Parry
• Pipelining - ??
• Fault Tolerance - UCSB