
Networks of Workstations

Prabhaker Mateti, Wright State University

Prabhaker Mateti, Networks of Workstations 2

Overview

• Parallel computers
• Concurrent computation
• Parallel methods
• Message passing
• Distributed shared memory
• Programming tools
• Cluster configurations

Prabhaker Mateti, Networks of Workstations 3

Granularity of Parallelism

• Fine-Grained Parallelism
• Medium-Grained Parallelism
• Coarse-Grained Parallelism
• NOWs (Networks of Workstations)

Prabhaker Mateti, Networks of Workstations 4

Fine-Grained Machines

• Tens of thousands of processors
• Processors
  – Slow (bit serial)
  – Small (K bits of RAM)
  – Distributed memory
• Interconnection networks
  – Message passing
• Single Instruction Multiple Data (SIMD)

Prabhaker Mateti, Networks of Workstations 5

Sample Meshes

• Massively Parallel Processor (MPP)
• TMC CM-2 (Connection Machine)
• MasPar MP-1/2

Prabhaker Mateti, Networks of Workstations 6

Medium-Grained Machines

• Typical configurations
  – Thousands of processors
  – Processors have power between coarse- and fine-grained
• Either shared or distributed memory
• Traditionally: research machines
• Single Code Multiple Data (SCMD)

Prabhaker Mateti, Networks of Workstations 7

Medium-Grained Machines

• Example: Cray T3E
• Processors
  – DEC Alpha EV5 (600 MFLOPS peak)
  – Max of 2048
• Peak performance: 1.2 TFLOPS
• 3-D torus interconnect
• Memory: 64 MB to 2 GB per CPU

Prabhaker Mateti, Networks of Workstations 8

Coarse-Grained Machines

• Typical configurations
  – Hundreds of processors
• Processors
  – Powerful (fast CPUs)
  – Large (cache, vectors, multiple fast buses)
• Memory: shared or distributed-shared
• Multiple Instruction Multiple Data (MIMD)

Prabhaker Mateti, Networks of Workstations 9

Coarse-Grained Machines

• SGI Origin 2000:
  – PEs (MIPS R10000): max of 128
  – Peak performance: 49 Gflops
  – Memory: 256 GBytes
  – Crossbar switches for interconnect
• HP/Convex Exemplar:
  – PEs (HP PA-RISC 8000): max of 64
  – Peak performance: 46 Gflops
  – Memory: max of 64 GBytes
  – Distributed crossbar switches for interconnect

Prabhaker Mateti, Networks of Workstations 10

Networks of Workstations

• Exploit inexpensive workstations/PCs
• Commodity network
• The NOW becomes a “distributed memory multiprocessor”
• Workstations send and receive messages
• C and Fortran programs with PVM, MPI, etc. libraries
• Programs developed on NOWs are portable to supercomputers for production runs

Prabhaker Mateti, Networks of Workstations 11

“Parallel” Computing

• Concurrent computing
• Distributed computing
• Networked computing
• Parallel computing

Prabhaker Mateti, Networks of Workstations 12

Definition of “Parallel”

• S1 begins at time b1, ends at e1
• S2 begins at time b2, ends at e2
• S1 || S2 (see the sketch below)
  – Begins at min(b1, b2)
  – Ends at max(e1, e2)
  – Equivalent to S2 || S1
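A minimal sketch of S1 || S2 in C with POSIX threads (the functions s1 and s2 and the variables x and y are illustrative, not from the slides): the composition begins when the first thread starts and ends only when both have finished, and swapping the two pthread_create calls does not change the outcome.

/* s1 || s2 with POSIX threads: compile with cc -pthread par.c */
#include <pthread.h>
#include <stdio.h>

static int x, y;

static void *s1(void *arg) { (void)arg; x = 1 + 2; return NULL; }  /* S1 */
static void *s2(void *arg) { (void)arg; y = 3 + 4; return NULL; }  /* S2 */

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, s1, NULL);   /* both begin: min(b1, b2) */
    pthread_create(&t2, NULL, s2, NULL);
    pthread_join(t1, NULL);                /* the composition ends only when */
    pthread_join(t2, NULL);                /* both end: max(e1, e2)          */
    printf("x=%d y=%d\n", x, y);
    return 0;
}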

Prabhaker Mateti, Networks of Workstations 13

Data Dependency

• x := a + b; y := c + d;
• x := a + b || y := c + d;
• y := c + d; x := a + b;
• x depends on a and b; y depends on c and d
• Assuming a, b, c, d are independent, all three forms compute the same result

Prabhaker Mateti, Networks of Workstations 14

Types of Parallelism

• Result
• Specialist
• Agenda

Prabhaker Mateti, Networks of Workstations 15

Perfect Parallelism

• Also called
  – Embarrassingly parallel
  – Result parallel
• Computations that can be subdivided into sets of independent tasks that require little or no communication (see the sketch below)
  – Monte Carlo simulations
  – F(x, y, z)
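As a concrete instance of the Monte Carlo bullet above, here is a hedged MPI sketch (assuming an MPI installation; the sample count and the pi-estimation task are illustrative). Each rank works on its own samples with no communication until a single final combining step.

/* Embarrassingly parallel Monte Carlo estimate of pi: mpicc pi.c && mpirun -np 4 ./a.out */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int rank, size;
    long hits = 0, total_hits = 0, samples = 1000000;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    srand(rank + 1);                      /* a crude independent stream per rank */
    for (long i = 0; i < samples; i++) {  /* no communication inside the loop */
        double x = rand() / (double)RAND_MAX;
        double y = rand() / (double)RAND_MAX;
        if (x * x + y * y <= 1.0) hits++;
    }

    /* one reduction at the very end combines the independent results */
    MPI_Reduce(&hits, &total_hits, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("pi ~= %f\n", 4.0 * total_hits / (double)(samples * size));

    MPI_Finalize();
    return 0;
}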

Prabhaker Mateti, Networks of Workstations 16

MW (Manager/Worker) Model

• Manager
  – Initiates computation
  – Tracks progress
  – Handles workers' requests
  – Interfaces with the user
• Workers (a sketch of this pattern follows the list)
  – Spawned and terminated by the manager
  – Make requests to the manager
  – Send results to the manager
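A minimal manager/worker sketch with MPI point-to-point messages (the tags and the "square a number" task are illustrative; here the workers are the nonzero MPI ranks rather than processes spawned at run time, and the manager dispatches tasks one at a time for simplicity). Run with at least two processes.

#include <mpi.h>
#include <stdio.h>

#define WORK_TAG 1
#define STOP_TAG 2

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                        /* manager */
        int ntasks = 10, result;
        MPI_Status st;
        for (int t = 0; t < ntasks; t++) {  /* hand tasks to workers round-robin */
            int dest = 1 + t % (size - 1);
            MPI_Send(&t, 1, MPI_INT, dest, WORK_TAG, MPI_COMM_WORLD);
            MPI_Recv(&result, 1, MPI_INT, dest, WORK_TAG, MPI_COMM_WORLD, &st);
            printf("task %d -> %d (from worker %d)\n", t, result, dest);
        }
        for (int w = 1; w < size; w++)      /* terminate the workers */
            MPI_Send(&w, 1, MPI_INT, w, STOP_TAG, MPI_COMM_WORLD);
    } else {                                /* worker */
        int task, result;
        MPI_Status st;
        for (;;) {
            MPI_Recv(&task, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == STOP_TAG) break;
            result = task * task;           /* the "work" */
            MPI_Send(&result, 1, MPI_INT, 0, WORK_TAG, MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
    return 0;
}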

Prabhaker Mateti, Networks of Workstations 17

Reduction

• Combine several sub-results into one

• Reduce r1, r2, …, rn with op (see the sketch below)
• Becomes r1 op r2 op … op rn
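In MPI, this reduction is one collective call; a small sketch where each rank contributes a single value r and rank 0 obtains r1 op r2 op … op rn with op = MPI_SUM (MPI_MAX, MPI_PROD, and so on work the same way). The per-rank value here is illustrative.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank;
    double r, combined = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    r = rank + 1.0;                        /* this rank's sub-result r_i */
    /* combined = r1 op r2 op ... op rn, here op = +, delivered to rank 0 */
    MPI_Reduce(&r, &combined, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("reduced value: %f\n", combined);

    MPI_Finalize();
    return 0;
}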

Prabhaker Mateti, Networks of Workstations 18

Data Parallelism

• Also called
  – Domain decomposition
  – Specialist
• Same operations performed on many data elements simultaneously
  – Matrix operations
  – Compiling several files

Prabhaker Mateti, Networks of Workstations 19

Control Parallelism

• Different operations performed simultaneously on different processors

• E.g., simulating a chemical plant: one processor simulates the preprocessing of chemicals, one simulates the reactions in the first batch, another simulates refining the products, etc.

Prabhaker Mateti, Networks of Workstations 20

Process communication

• Shared memory
• Message passing

Prabhaker Mateti, Networks of Workstations 21

Shared Memory

• Process A writes to a memory location

• Process B reads from that memory location

• Synchronization is crucial
• Excellent speed

Prabhaker Mateti, Networks of Workstations 22

Shared Memory

• Needs hardware support:
  – Multi-ported memory
• Atomic operations (see the sketch below):
  – Test-and-set
  – Semaphores
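A sketch of test-and-set used as a spinlock, written with C11 atomics rather than a raw machine instruction (the variable names and the counting loop are illustrative); the hardware's atomic read-modify-write is what makes the lock work.

/* Test-and-set spinlock protecting a shared counter: cc -std=c11 -pthread tas.c */
#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;
static long counter = 0;                        /* the shared memory */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        while (atomic_flag_test_and_set(&lock)) /* spin until we atomically set the flag */
            ;                                    /* busy wait */
        counter++;                               /* critical section */
        atomic_flag_clear(&lock);                /* release the lock */
    }
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("counter = %ld\n", counter);          /* 200000 with the lock in place */
    return 0;
}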

Prabhaker Mateti, Networks of Workstations 23

Shared Memory Semantics: Assumptions

• Global time is available, in discrete increments.
• Shared variable s has value vi at time ti, i = 0, 1, …
• Process A: s := v1 at time t1
• Assume no other assignment occurs after t1.
• Process B reads s at time t and gets value v.

Prabhaker Mateti, Networks of Workstations 24

Shared Memory: Semantics

• Value of the shared variable
  – v = v1, if t > t1
  – v = v0, if t < t1
  – v = ??, if t = t1
• t = t1 ± a discrete quantum
• Next update of the shared variable
  – Occurs at t2
  – t2 = t1 + ?

Prabhaker Mateti, Networks of Workstations 25

Condition Variables and Semaphores

• Semaphores (a sketch follows this list)
  – V(s) ::= < s := s + 1 >
  – P(s) ::= < when s > 0 do s := s - 1 >
• Condition variables
  – C.wait()
  – C.signal()
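The same P and V operations map directly onto POSIX semaphores (sem_wait and sem_post); a minimal sketch in which one thread performs V after producing a value and the main thread performs P before consuming it (the variable names are illustrative).

/* P and V with POSIX semaphores: cc -pthread sem.c */
#include <semaphore.h>
#include <pthread.h>
#include <stdio.h>

static sem_t items;        /* counts items available; P blocks while it is 0 */
static int shared_value;

static void *producer(void *arg)
{
    (void)arg;
    shared_value = 42;     /* produce */
    sem_post(&items);      /* V(items): items := items + 1 */
    return NULL;
}

int main(void)
{
    pthread_t t;
    sem_init(&items, 0, 0);           /* initial value 0 */
    pthread_create(&t, NULL, producer, NULL);
    sem_wait(&items);                 /* P(items): wait until items > 0, then decrement */
    printf("consumed %d\n", shared_value);
    pthread_join(t, NULL);
    sem_destroy(&items);
    return 0;
}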

Prabhaker Mateti, Networks of Workstations 26

Distributed Shared Memory

• A common address space that all the computers in the cluster share.

• Difficult to describe semantics.

Prabhaker Mateti, Networks of Workstations 27

Distributed Shared Memory: Issues

• Distributed
  – Spatially
  – LAN
  – WAN

• No global time available

Prabhaker Mateti, Networks of Workstations 28

Messages

• Messages are sequences of bytes moving between processes

• The sender and receiver must agree on the type structure of values in the message

• “Marshalling” of data
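A hedged sketch of marshalling in C (the struct and field names are illustrative): the sender flattens an agreed-upon structure into a byte sequence with a fixed byte order, and the receiver rebuilds the value from those bytes. Real systems typically use library routines such as XDR or MPI datatypes rather than hand-written code like this.

/* Marshalling a (int32, double-as-text) pair into a byte buffer */
#include <arpa/inet.h>   /* htonl/ntohl give a fixed network byte order */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct reading { int32_t id; double value; };

/* sender side: flatten the struct into bytes */
static size_t marshal(const struct reading *r, unsigned char *buf)
{
    uint32_t id_net = htonl((uint32_t)r->id);
    memcpy(buf, &id_net, 4);
    int n = snprintf((char *)buf + 4, 32, "%.17g", r->value); /* text avoids FP layout issues */
    return 4 + (size_t)n + 1;                                 /* include the '\0' */
}

/* receiver side: rebuild the struct from bytes */
static void unmarshal(const unsigned char *buf, struct reading *r)
{
    uint32_t id_net;
    memcpy(&id_net, buf, 4);
    r->id = (int32_t)ntohl(id_net);
    sscanf((const char *)buf + 4, "%lf", &r->value);
}

int main(void)
{
    unsigned char msg[64];
    struct reading out = { 7, 3.14159 }, in;
    size_t len = marshal(&out, msg);       /* this byte sequence is the "message" */
    unmarshal(msg, &in);
    printf("len=%zu id=%d value=%g\n", len, in.id, in.value);
    return 0;
}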

Prabhaker Mateti, Networks of Workstations 29

Message Passing

• Process A sends a data buffer as a message to process B.

• Process B waits for a message from A, and when it arrives copies it into its own local memory.

• No memory shared between A and B.

Prabhaker Mateti, Networks of Workstations 30

Message Passing

• Obviously,
  – Messages cannot be received before they are sent.
  – A receiver waits until there is a message.
• Asynchronous
  – Sender never blocks, even if infinitely many messages are waiting to be received.
  – Semi-asynchronous is a practical version of the above, with a large but finite amount of buffering.

Prabhaker Mateti, Networks of Workstations 31

Message Passing: Point to Point

• Q: send(m, P)
  – Send message m to process P
• P: recv(x, Q)
  – Receive a message from process Q, and place it in variable x
• The message data (see the sketch below)
  – The type of x must match that of m
  – As if x := m
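The same exchange written with MPI (a sketch requiring at least two processes; rank 1 plays Q, rank 0 plays P, and after the receive x holds the value of m, as if x := m).

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, m = 99, x = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1) {                       /* Q: send(m, P) */
        MPI_Send(&m, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    } else if (rank == 0) {                /* P: recv(x, Q) */
        MPI_Recv(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("x = %d\n", x);             /* as if x := m */
    }

    MPI_Finalize();
    return 0;
}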

Prabhaker Mateti, Networks of Workstations 32

Broadcast

• One sender, multiple receivers
• Not all receivers may receive at the same time
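In MPI, a broadcast is a single collective call that every rank makes; after it returns, all ranks hold the root's value (a small sketch; the value 2024 is arbitrary).

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) value = 2024;           /* only the sender has the value initially */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);  /* one sender, many receivers */
    printf("rank %d sees %d\n", rank, value);

    MPI_Finalize();
    return 0;
}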

Prabhaker Mateti, Networks of Workstations 33

Types of Sends

• Synchronous
• Asynchronous

Prabhaker Mateti, Networks of Workstations 34

Synchronous Message Passing

• Sender blocks until receiver is ready to receive.

• Cannot send messages to self.
• No buffering.

Prabhaker Mateti, Networks of Workstations 35

Message Passing: Speed

• Speed not so good
  – Sender copies the message into system buffers.
  – The message travels the network.
  – Receiver copies the message from system buffers into local memory.
  – Special virtual memory techniques help.

Prabhaker Mateti, Networks of Workstations 36

Message Passing: Programming

• Less error-prone than shared memory programming

Prabhaker Mateti, Networks of Workstations 37

Message Passing: Synchronization

• Synchronous MP:
  – Sender waits until the receiver is ready.
  – No intermediary buffering

Prabhaker Mateti, Networks of Workstations 38

Barrier Synchronization

• Processes wait until “all” arrive
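A small MPI sketch of a barrier: no rank leaves MPI_Barrier until every rank in the communicator has entered it (the sleep is only there to make the ranks arrive at different times).

#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    sleep((unsigned)rank);                 /* ranks arrive at different times */
    printf("rank %d reached the barrier\n", rank);
    MPI_Barrier(MPI_COMM_WORLD);           /* wait until "all" arrive */
    printf("rank %d passed the barrier\n", rank);

    MPI_Finalize();
    return 0;
}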

Prabhaker Mateti, Networks of Workstations 39

Parallel Software Development

• Algorithmic conversion by compilers

Prabhaker Mateti, Networks of Workstations 40

Development of Distributed+Parallel Programs

• New code + algorithms
• Old programs rewritten

– in new languages that have distributed and parallel primitives

– With new libraries

• Parallelize legacy code

Prabhaker Mateti, Networks of Workstations 41

Conversion of Legacy Software

• Mechanical conversion by software tools

• Reverse engineer its design, and re-code

Prabhaker Mateti, Networks of Workstations 42

Automatically parallelizing compilers

Compilers analyze programs and parallelize (usually loops).

Easy to use, but with limited success

Prabhaker Mateti, Networks of Workstations 43

OpenMP on Networks of Workstations

• OpenMP is an API for shared-memory architectures.

• The user gives hints to the compiler as directives (see the sketch below).

• http://www.openmp.org
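A minimal OpenMP sketch (the array size and contents are illustrative): the pragmas are the hints, and the same source still compiles and runs serially if a compiler ignores them; with, e.g., gcc -fopenmp the loops are split among threads.

#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void)
{
    static double a[N], b[N], c[N];
    double sum = 0.0;

    /* the pragma is only a directive: the compiler/runtime splits the loop among threads */
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        b[i] = i;
        c[i] = 2.0 * i;
        a[i] = b[i] + c[i];
    }

    /* a reduction clause combines the per-thread partial sums */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %f (threads available: %d)\n", sum, omp_get_max_threads());
    return 0;
}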

Prabhaker Mateti, Networks of Workstations 44

Message Passing Libraries

• Programmer is responsible for data distribution, synchronizations, and sending and receiving information

• Parallel Virtual Machine (PVM)
• Message Passing Interface (MPI)
• BSP

Prabhaker Mateti, Networks of Workstations 45

BSP: Bulk Synchronous Parallel model

• Divides computation into supersteps
• In each superstep, a processor can work on local data and send messages.
• At the end of the superstep, a barrier synchronization takes place; messages sent during the superstep are then delivered, so every processor sees them in the next superstep.

• http://www.bsp-worldwide.org/

Prabhaker Mateti, Networks of Workstations 46

BSP Library

• A small number of subroutines to implement
  – process creation,
  – remote data access, and
  – bulk synchronization.
• Linked with C, Fortran, … programs (a sketch follows)
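A hedged sketch in the style of BSPlib (the Oxford BSP toolset); the exact header name, link flags, and some details vary with the installation, so treat this as illustrative of the superstep structure rather than as a drop-in program.

/* BSPlib-style sketch: each process sends a value to its right neighbour */
#include <bsp.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    bsp_begin(bsp_nprocs());               /* process creation: one SPMD process per node */

    int p = bsp_pid(), n = bsp_nprocs();
    int left = 0;

    bsp_push_reg(&left, sizeof(int));      /* make 'left' remotely accessible */
    bsp_sync();                            /* superstep boundary: registration takes effect */

    /* superstep: local work plus remote writes; data arrives only after the barrier */
    int mine = p * p;
    bsp_put((p + 1) % n, &mine, &left, 0, sizeof(int));
    bsp_sync();                            /* bulk synchronization */

    printf("process %d received %d from its left neighbour\n", p, left);

    bsp_pop_reg(&left);
    bsp_end();
    return 0;
}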

Prabhaker Mateti, Networks of Workstations 47

Parallel Languages

• Shared-memory languages
• Parallel object-oriented languages
• Parallel functional languages
• Concurrent logic languages

Prabhaker Mateti, Networks of Workstations 48

Tuple Space: Linda

• Tuples: <v1, v2, …, vk>
• Atomic primitives
  – in(t)
  – read(t)
  – out(t)
  – eval(t)
• Host language: e.g., JavaSpaces

Prabhaker Mateti, Networks of Workstations 49

Data Parallel Languages

• Data is distributed over the processors as arrays
• Entire arrays are manipulated:
  – A(1:100) = B(1:100) + C(1:100)
• Compiler generates parallel code
  – Fortran 90
  – High Performance Fortran (HPF)

Prabhaker Mateti, Networks of Workstations 50

Parallel Functional Languages

• Erlang http://www.erlang.org/

• SISAL http://www.llnl.gov/sisal/
• PCN (Argonne)

Prabhaker Mateti, Networks of Workstations 51

Clusters

Prabhaker Mateti, Networks of Workstations 52

Buildings-Full of Workstations

1. Distributed operating systems have not taken a foothold.
2. Powerful personal computers are ubiquitous.
3. Mostly idle: more than 90% of the uptime?
4. 100 Mb/s LANs are common.
5. Windows and Linux are the top two OSs in terms of installed base.

Prabhaker Mateti, Networks of Workstations 53

Cluster Configurations

• NOW -- Networks of Workstations
• COW -- Clusters of Dedicated Nodes
• Clusters of Come-and-Go Nodes
• Beowulf clusters

Prabhaker Mateti, Networks of Workstations 54

Beowulf

• Collection of compute nodes
• Full trust in each other

– Login from one node into another without authentication

– Shared file system subtree

• Dedicated

Prabhaker Mateti, Networks of Workstations 55

Close Cluster Configuration

[Diagram: compute nodes connected to both a high-speed network and a service network; a front-end node and a file server node sit on these networks, and a gateway node links the cluster to the external network.]

Prabhaker Mateti, Networks of Workstations 56

Open Cluster Configuration

[Diagram: compute nodes, a front-end node, and a file server node share a high-speed network; in the open configuration the nodes are also connected directly to the external network.]

Prabhaker Mateti, Networks of Workstations 57

Interconnection Network

• Most popular: Fast Ethernet
• Network topologies
  – Mesh
  – Torus
• Switch vs. hub

Prabhaker Mateti, Networks of Workstations 58

Software Components

• Operating system
  – Linux, FreeBSD, …
• Parallel programming
  – PVM, MPI
• Utilities, …
• Open source

Prabhaker Mateti, Networks of Workstations 59

Software Structure of PC Cluster

[Diagram: each PC consists of a hardware layer and an OS layer; the parallel virtual machine layer spans all nodes over the high-speed network, and the parallel programs run on top of that layer.]

Prabhaker Mateti, Networks of Workstations 60

Single System View

• Single system view
  – Common filesystem structure viewed from any node
  – Common accounts on all nodes
  – Single software installation point
• Benefits
  – Easy to install and maintain the system
  – Easy for users to use

Prabhaker Mateti, Networks of Workstations 61

Installation Steps

• Install the operating system
• Set up a single system view
  – Shared filesystem
  – Common accounts
  – Single software installation point
• Install parallel programming packages such as MPI, PVM, BSP

• Install utilities, libraries, and applications

Prabhaker Mateti, Networks of Workstations 62

Linux Installation

• Linux has many distributions: Red Hat, Caldera, SuSE, Debian, …
• Caldera is easy to install
• All of the above upgrade with RPM package management
• Mandrake and SuSE come with a very complete set of software

Prabhaker Mateti, Networks of Workstations 63

Clusters with Part Time Nodes

• Cycle stealing: running jobs that do not belong to the owner on that owner's workstation.
• Definition of idleness: e.g., no keyboard and no mouse activity
• Tools/Libraries
  – Condor
  – PVM
  – MPI

Prabhaker Mateti, Networks of Workstations 64

Migration of Jobs

• Policies
  – Immediate eviction
  – Pause-and-migrate
• Technical issues
  – Checkpointing: preserving the state of the process so it can be resumed.
  – Migrating from one architecture to another

Prabhaker Mateti, Networks of Workstations 65

Summary

• Parallel
  – computers
  – computation
• Parallel methods
• Communication primitives
  – Message passing
  – Distributed shared memory
• Programming tools
• Cluster configurations
