networks of workstations prabhaker mateti wright state university

65
Networks of Workstations Prabhaker Mateti Wright State University

Upload: junior-hawkins

Post on 19-Dec-2015

224 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Networks of Workstations Prabhaker Mateti Wright State University

Networks of Workstations

Prabhaker MatetiWright State University

Page 2: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 2

Overview

• Parallel computers• Concurrent computation• Parallel Methods• Message Passing• Distributed Shared Memory• Programming Tools• Cluster configurations

Page 3: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 3

Granularity of Parallelism

• Fine-Grained Parallelism • Medium-Grained Parallelism• Coarse-Grained Parallelism • NOWs (Networks of Workstations)

Page 4: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 4

Fine-Grained Machines

• Tens of thousands of Processors • Processors

– Slow (bit serial) – Small (K bits of RAM)– Distributed Memory

• Interconnection Networks – Message Passing

• Single Instruction Multiple Data (SIMD)

Page 5: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 5

Sample Meshes

• Massively Parallel Processor (MPP) • TMC CM-2 (Connection Machine) • MasPar MP-1/2

Page 6: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 6

Medium-Grained Machines

• Typical Configurations – Thousands of processors – Processors have power between

coarse- and fine-grained • Either shared or distributed

memory• Traditionally: Research Machines • Single Code Multiple Data (SCMD)

Page 7: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 7

Medium-Grained Machines

• Ex: Cray T3E • Processors

– DEC Alpha EV5 (600 MFLOPS peak) – Max of 2048

• Peak Performance: 1.2 TFLOPS • 3-D Torus • Memory: 64 MB - 2 GB per CPU

Page 8: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 8

Coarse-Grained Machines

• Typical Configurations – Hundreds of Processors

• Processors – Powerful (fast CPUs) – Large (cache, vectors, multiple fast buses)

• Memory: Shared or Distributed-Shared • Multiple Instruction Multiple Data

(MIMD)

Page 9: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 9

Coarse-Grained Machines

• SGI Origin 2000: – PEs (MIPS R10000): Max of 128 – Peak Performance: 49 Gflops – Memory: 256 GBytes – Crossbar switches for interconnect

• HP/Convex Exemplar: – PEs (HP PA-RISC 8000): Max of 64 – Peak Performance: 46 Gflops – Memory: Max of 64 GBytes – Distributed crossbar switches for interconnect

Page 10: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 10

Networks of Workstations

• Exploit inexpensive Workstations/PCs • Commodity network • The NOW becomes a “distributed memory

multiprocessor”• Workstations send+receive messages • C and Fortran programs with PVM, MPI, etc.

libraries• Programs developed on NOWs are portable to

supercomputers for production runs

Page 11: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 11

“Parallel” Computing

• Concurrent Computing• Distributed Computing• Networked Computing• Parallel Computing

Page 12: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 12

Definition of “Parallel”

• S1 begins at time b1, ends at e1• S2 begins at time b2, ends at e2• S1 || S2

– Begins at min(b1, b2)– Ends at max(e1, e2)– Equiv to S2 || S1

Page 13: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 13

Data Dependency

• x := a + b; y := c + d;• x := a + b || y := c + d;• y := c + d; x := a + b;• X depends on a and b, y depends

on c and d• Assumed a, b, c, d were

independent

Page 14: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 14

Types of Parallelism

• Result• Specialist• Agenda

Page 15: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 15

Perfect Parallelism

• Also called – Embarrassingly Parallel– Result parallel

• Computations that can be subdivided into sets of independent tasks that require little or no communication – Monte Carlo simulations– F(x, y, z)

Page 16: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 16

MW Model

• Manager – Initiates computation – Tracks progress – Handles worker’s requests – Interfaces with user

• Workers – Spawned and terminated by manager – Make requests to manager – Send results to manager

Page 17: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 17

Reduction

• Combine several sub-results into one

• Reduce r1 r2 … rn with op• Becomes r1 op r2 op … op rn

Page 18: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 18

Data Parallelism

• Also called – Domain Decomposition– Specialist

• Same operations performed on many data elements simultaneously – Matrix operations– Compiling several files

Page 19: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 19

Control Parallelism

• Different operations performed simultaneously on different processors

• E.g., Simulating a chemical plant; one processor simulates the preprocessing of chemicals, one simulates reactions in first batch, another simulates refining the products, etc.

Page 20: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 20

Process communication

• Shared Memory• Message Passing

Page 21: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 21

Shared Memory

• Process A writes to a memory location

• Process B reads from that memory location

• Synchronization is crucial• Excellent speed

Page 22: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 22

Shared Memory

• Needs hardware support: – multi-ported memory

• Atomic operations: – Test-and-Set– Semaphores

Page 23: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 23

Shared Memory Semantics: Assumptions

• Global time is available. Discrete increments.• Shared variable s, = vi at ti, i=0,…• Process A: s := v1 at time t1• Assume no other assignment occurred after t1.• Process B reads s at time t and gets value v.

Page 24: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 24

Shared Memory: Semantics

• Value of Shared Variable– v = v1, if t > t1– v = v0, if t < t1– v = ??, if t = t1

• t = t1 +- discrete quantum• Next Update of Shared Variable

– Occurs at t2– t2 = t1 + ?

Page 25: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 25

Condition Variables and Semaphores

• Semaphores– V(s) ::= < s := s + 1 >– P(s) ::= <when s > 0 do s := s – 1>

• Condition variables– C.wait()– C.signal()

Page 26: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 26

Distributed Shared Memory

• A common address space that all the computers in the cluster share.

• Difficult to describe semantics.

Page 27: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 27

Distributed Shared Memory: Issues

• Distributed– Spatially– LAN– WAN

• No global time available

Page 28: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 28

Messages

• Messages are sequences of bytes moving between processes

• The sender and receiver must agree on the type structure of values in the message

• “Marshalling” of data

Page 29: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 29

Message Passing

• Process A sends a data buffer as a message to process B.

• Process B waits for a message from A, and when it arrives copies it into its own local memory.

• No memory shared between A and B.

Page 30: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 30

Message Passing

• Obviously, – Messages cannot be received before they

are sent.– A receiver waits until there is a message.

• Asynchronous– Sender never blocks, even if infinitely many

messages are waiting to be received– Semi-asynchronous is a practical version of

above with large but finite amount of buffering

Page 31: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 31

Message Passing: Point to Point

• Q: send(m, P) – Send message M to process P

• P: recv(x, Q)– Receive message from process P, and

place it in variable x• The message data

– Type of x must match that of m– As if x := m

Page 32: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 32

Broadcast

• One sender, multiple receivers• Not all receivers may receive at

the same time

Page 33: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 33

Types of Sends

• Synchronous• Asynchronous

Page 34: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 34

Synchronous Message Passing

• Sender blocks until receiver is ready to receive.

• Cannot send messages to self.• No buffering.

Page 35: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 35

Message Passing: Speed

• Speed not so good– Sender copies message into system

buffers.– Message travels the network.– Receiver copies message from

system buffers into local memory.– Special virtual memory techniques

help.

Page 36: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 36

Message Passing: Programming

• Less error-prone cf. shared memory

Page 37: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 37

Message Passing: Synchronization

• Synchronous MP: – Sender waits until receiver is ready.– No intermediary buffering

Page 38: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 38

Barrier Synchronization

• Processes wait until “all” arrive

Page 39: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 39

Parallel Software Development

• Algorithmic conversion by compilers

Page 40: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 40

Development of Distributed+Parallel Programs

• New code + algorithms• Old programs rewritten

– in new languages that have distributed and parallel primitives

– With new libraries

• Parallelize legacy code

Page 41: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 41

Conversion of Legacy Software

• Mechanical conversion by software tools

• Reverse engineer its design, and re-code

Page 42: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 42

Automatically parallelizing compilers

Compilers analyze programs and parallelize (usually loops).

Easy to use, but with limited success

Page 43: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 43

OpenMP on Networks of Workstations

• The OpenMP is an API for shared memory architectures.

• User-gives hints as directives to the compiler

• http://www.openmp.org

Page 44: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 44

Message Passing Libraries

• Programmer is responsible for data distribution, synchronizations, and sending and receiving information

• Parallel Virtual Machine (PVM)• Message Passing Interface (MPI)• BSP

Page 45: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 45

BSP: Bulk Synchronous Parallel model

• Divides computation into supersteps• In each superstep a processor can work

on local data and send messages. • At the end of the superstep, a barrier

synchronization takes place and all processors receive the messages which were sent in the previous superstep

• http://www.bsp-worldwide.org/

Page 46: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 46

BSP Library

• Small number of subroutines to implement – process creation, – remote data access, and – bulk synchronization.

• Linked to C, Fortran, … programs

Page 47: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 47

Parallel Languages

• Shared-memory languages• Parallel object-oriented languages • Parallel functional languages• Concurrent logic languages

Page 48: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 48

Tuple Space: Linda

• <v1, v2, …, vk>• Atomic Primitives

– In (t)– Read (t)– Out (t)– Eval (t)

• Host language: e.g., JavaSpaces

Page 49: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 49

Data Parallel Languages

• Data is distributed over the processors as a arrays

• Entire arrays are manipulated:– A(1:100) = B(1:100) + C(1:100)

• Compiler generates parallel code– Fortran 90– High Performance Fortran (HPF)

Page 50: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 50

Parallel Functional Languages

• Erlang http://www.erlang.org/

• SISAL http://www.llnl.gov/sisal/• PCN Argonne

Page 51: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 51

Clusters

Page 52: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 52

Buildings-Full of Workstations

1. Distributed OS have not taken a foot hold.

2. Powerful personal computers are ubiquitous.

3. Mostly idle: more than 90% of the up-time?

4. 100 Mb/s LANs are common. 5. Windows and Linux are the top two OS

in terms of installed base.

Page 53: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 53

Cluster Configurations

• NOW -- Networks of Workstations• COW -- Clusters of Dedicated

Nodes• Clusters of Come-and-Go Nodes• Beowulf clusters

Page 54: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 54

Beowulf

• Collection of compute nodes• Full trust in each other

– Login from one node into another without authentication

– Shared file system subtree

• Dedicated

Page 55: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 55

Close Cluster Configuration

computenode

computenode

computenode

computenode

-Front end

High Speed Network

Service Network

gatewaynode

External Network

computenode

computenode

computenode

computenode

-Front end

High Speed Network

gatewaynode

External Network

FileServernode

Page 56: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 56

Open Cluster Configuration

computenode

computenode

computenode

computenode

computenode

computenode

computenode

computenode

-Front end

External Network

FileServernode

High Speed Network

Page 57: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 57

Interconnection Network

• Most popular: Fast Ethernet• Network topologies

– Mesh– Torus

• Switch v Hub

Page 58: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 58

Software Components

• Operating System– Linux, FreeBSD, …

• Parallel programming– PVM, MPI

• Utilities, …• Open source

Page 59: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 59

Software Structure of PC Cluster

- HIGH SPEED NETWORK

HARDWARELAYER

OS LAYER

HARDWARELAYER

OS LAYER

HARDWARELAYER

OS LAYER

PARALLEL VIRTUAL MACHINE LAYER

Parallel Program Parallel Program Parallel Program

Page 60: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 60

Single System View

• Single system view– Common filesystem structure view fro

m any node– Common accounts on all nodes– Single software installation point

• Benefits– Easy to install and maintain system– Easy to use for users

Page 61: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 61

Installation Steps• Install Operating system• Setup a Single System View

– Shared filesystem– Common accounts – Single software installation point

• Install parallel programming packages s uch as MPI, PVM, BSP

• Install utilities, libraries, and applications

Page 62: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 62

Linux Installation

• Linux has many distributions : Redh at, Caldera, SuSe, Debian, …

• Caldera is easy to install• All above u pgrade with RPM packag

e management• Mandrake and SuSe come with a ve

ry complete set of software

Page 63: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 63

Clusters with Part Time Nodes

• Cycle Stealing: Running of jobs on a workstation that don't belong to the owner.

• Definition of Idleness: E.g., No keyboard and no mouse activity

• Tools/Libraries– Condor– PVM– MPI

Page 64: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 64

Migration of Jobs

• Policies – Immediate-Eviction– Pause-and-Migrate

• Technical Issues– Checkpointing: Preserving the state of

the process so it can be resumed.– Migrating from one architecture to

another

Page 65: Networks of Workstations Prabhaker Mateti Wright State University

Prabhaker Mateti, Networks of Workstations 65

Summary

• Parallel – computers– computation

• Parallel Methods• Communication primitives

– Message Passing– Distributed Shared Memory

• Programming Tools• Cluster configurations