Transcript
Page 1: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes

Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes

Maria Athanasaki, Evangelos Koukis, Nectarios Koziris

National Technical University of Athens

School of Electrical and Computer Engineering

Computing Systems Laboratory

Page 2: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


PDP 2004

Previous work

M. Athanasaki, A. Sotiropoulos, G. Tsoukalas, N. Koziris, "Pipelined Scheduling of Tiled Nested Loops onto Clusters of SMPs using Memory Mapped Network Interfaces", SuperComputing Conference on High Performance Networking and Computing (SC2002), Baltimore, Maryland, November 16-22, 2002.

G. Goumas, A. Sotiropoulos and N. Koziris, "Minimizing Completion Time for Loop Tiling with Computation and Communication Overlapping", Proceedings of the 2001 International Parallel and Distributed Processing Symposium (IPDPS2001), IEEE Press, San Francisco, California, April 2001.

Page 3: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Overview

Tiling for parallelization
Non-overlapping vs. Overlapping execution scheme
Grouping
Application on a cluster of SMPs with a fixed number of nodes
Experimental-Simulation Results

Page 4: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Nested For-Loops

for (i1=l1; i1<=u1; i1++)
  for (i2=l2; i2<=u2; i2++)
    …
      for (in=ln; in<=un; in++)
      {
        Loop Body
      }

Page 5: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


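Dependence Vectors

for (i1=0; i1<=7; i1++)
  for (i2=0; i2<=7; i2++)
    A[i1][i2] = A[i1-1][i2] + A[i1][i2-1];   /* values at index -1 are boundary data */

[Figure: iteration space with axes i1 and i2, illustrating the dependence vectors]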
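Each point (i1, i2) of this loop reads the values produced at (i1-1, i2) and (i1, i2-1), so its dependence vectors are (1,0) and (0,1). The following is a minimal C sketch of the recurrence. The boundary row and column initialised to 1, the shifted indices and the name `fill_recurrence` are assumptions made for illustration, so that the reads at index -1 stay in bounds.

```c
#include <stddef.h>

#define N 8  /* the iteration space is 0..7 in each dimension */

/* A has one extra boundary row and column (index 0), so the reads
 * A[i1-1][i2] and A[i1][i2-1] are always in bounds.
 * Iteration (i1, i2) of the slide is stored at A[i1+1][i2+1]. */
void fill_recurrence(long A[N + 1][N + 1])
{
    for (size_t i = 0; i <= N; i++) {
        A[i][0] = 1;   /* assumed boundary values */
        A[0][i] = 1;
    }
    for (size_t i1 = 1; i1 <= N; i1++)
        for (size_t i2 = 1; i2 <= N; i2++)
            A[i1][i2] = A[i1 - 1][i2]    /* dependence vector (1,0) */
                      + A[i1][i2 - 1];   /* dependence vector (0,1) */
}
```

Because both dependence vectors have only non-negative components, any execution order that never visits a point before its upper and left neighbours is legal.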

Page 6: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Tiling

[Figure: iteration space with axes i1 and i2, partitioned into tiles]
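Tiling can be sketched in C as two outer loops that walk over tiles and two inner loops that walk over the points inside one tile. The tile sizes `TS1` and `TS2`, the boundary layout and the name `fill_tiled` are illustrative assumptions; the loop body is the two-dimensional recurrence of the dependence-vector example.

```c
#include <stddef.h>

#define N   8   /* extent of the iteration space in each dimension */
#define TS1 4   /* assumed tile size along i1 */
#define TS2 2   /* assumed tile size along i2 */

/* Executes the recurrence tile by tile.
 * A[0][*] and A[*][0] are boundary values supplied by the caller. */
void fill_tiled(long A[N + 1][N + 1])
{
    for (size_t t1 = 1; t1 <= N; t1 += TS1)        /* loops over tiles */
        for (size_t t2 = 1; t2 <= N; t2 += TS2)
            /* loops over the points inside one tile */
            for (size_t i1 = t1; i1 < t1 + TS1 && i1 <= N; i1++)
                for (size_t i2 = t2; i2 < t2 + TS2 && i2 <= N; i2++)
                    A[i1][i2] = A[i1 - 1][i2] + A[i1][i2 - 1];
}
```

A tile only needs data from the tiles above it and to its left, which are already complete when the tiles are visited in this order, so the result matches the untiled loop.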

Page 7: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Tiling

[Figure: tiled iteration space with axes i1 and i2; regions labeled Processor 0 and Processor 1]

Page 8: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Overview

Tiling for parallelization
Non-overlapping vs. Overlapping execution scheme
Grouping
Application on a cluster of SMPs with a fixed number of nodes
Experimental-Simulation Results

Page 9: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Non-Overlapping Scheme

[Figure: tiled iteration space with axes i1 and i2; regions labeled Processor 0, Processor 1 and Processor 2]

Page 10: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Non-Overlapping vs. Overlapping Scheme

[Figure: two diagrams, each over processors P0, P1, P2 and P3, contrasting the non-overlapping and the overlapping scheme]
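One way to reason about the two schemes is a toy cost model, sketched below in C. Nothing in it is taken from the slides: it assumes a one-dimensional pipeline of `procs` processors that each execute `tiles_per_proc` tiles, a fixed computation time and a fixed communication time per tile, a wavefront of procs + tiles_per_proc - 1 steps without overlapping, and 2*(procs - 1) + tiles_per_proc steps with overlapping, on the assumption that data produced in one step is transferred during the next and consumed in the one after.

```c
/* Toy model: times are in arbitrary units, all names are illustrative. */

/* Non-overlapping: every step computes a tile and then communicates. */
double time_non_overlapping(int procs, int tiles_per_proc,
                            double t_comp, double t_comm)
{
    int steps = procs + tiles_per_proc - 1;       /* wavefront length */
    return steps * (t_comp + t_comm);
}

/* Overlapping: computation and communication of a step run concurrently,
 * but a neighbour can use the data only two steps later. */
double time_overlapping(int procs, int tiles_per_proc,
                        double t_comp, double t_comm)
{
    int steps = 2 * (procs - 1) + tiles_per_proc;
    double step = t_comp > t_comm ? t_comp : t_comm;
    return steps * step;
}
```

Under these assumptions, 4 processors with 10 tiles each, t_comp = 3 and t_comm = 2 give 65 time units without overlapping and 48 with it. With a single tile per processor the model gives 20 versus 21, so the longer pipeline can outweigh the cheaper step.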

Page 11: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Overlapping Scheme

[Figure: tiled iteration space with axes i1 and i2; regions labeled Processor 0, Processor 1 and Processor 2]

Page 12: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Overview

Tiling for parallelization
Non-overlapping vs. Overlapping execution scheme
Grouping
Application on a cluster of SMPs with a fixed number of nodes
Experimental-Simulation Results

Page 13: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Generalization to SMPs – “Grouping”

[Figure: four SMP nodes (SMP0, SMP1, SMP2, SMP3), each with two CPUs (CPU0, CPU1)]

Page 14: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Example: Grouping + Non-overlapping Communication Scheme

[Figure: Tile Space and Group Space, with SMP node0 and SMP node1]

Scheduling vector Π=(1,0)

Page 15: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Example: Grouping + Overlapping Communication Scheme

[Figure: Tile Space and Group Space, with SMP node0 and SMP node1]

Scheduling vector Π=(1,1)
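A linear schedule assigns to a point j the time step Π·j. The short C sketch below evaluates this dot product for the two scheduling vectors quoted in the grouping examples, Π=(1,0) and Π=(1,1). The function name and the sample coordinates are illustrative assumptions.

```c
/* Time step of a two-dimensional point j under the linear schedule pi:
 * t = pi . j */
int schedule_step(const int pi[2], const int j[2])
{
    return pi[0] * j[0] + pi[1] * j[1];
}
```

For the point (3, 2), Π=(1,0) gives step 3 and Π=(1,1) gives step 5: with the second vector, each unit of distance along the second coordinate delays execution by one step.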

Page 16: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Overview

Tiling for parallelization
Non-overlapping vs. Overlapping execution scheme
Grouping
Application on a cluster of SMPs with a fixed number of nodes
Experimental-Simulation Results

Page 17: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Scheduling onto a Fixed Number of SMPs

Dynamic Scheduling by the Operating System
  Run time overhead for generating a lot of processes
  Context switching slows down the execution
Static Scheduling at Compile Time

Page 18: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Scheduling onto a Fixed Number of SMPs

Cyclic Assignment Schedule

Mirror Assignment Schedule

Cluster Assignment Schedule

Retiling

Page 19: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Cyclic Assignment

[Figure: cyclic assignment on 2 SMP nodes with 2 CPUs each; four CPU0/CPU1 pairs labeled SMP0, SMP1, SMP0, SMP1]

Page 20: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Cyclic Assignment

[Figure: cyclic assignment on 2 SMP nodes with 2 CPUs each; four CPU0/CPU1 pairs labeled SMP0, SMP1, SMP0, SMP1, grouped into two chunks]
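The cyclic pattern in the figure labels (CPU0, CPU1 on SMP0, then CPU0, CPU1 on SMP1, then SMP0 again) can be sketched as an index mapping in C. The `placement` struct, the function name and the reading of a chunk as one full round over all CPUs are assumptions for illustration.

```c
/* Which SMP node and which CPU executes a given row of tiles. */
struct placement { int chunk; int node; int cpu; };

/* Cyclic assignment (sketch): rows are dealt to the CPUs in a fixed
 * order, and the same order repeats in every chunk. */
struct placement cyclic_assign(int row, int nodes, int cpus_per_node)
{
    struct placement p;
    int total  = nodes * cpus_per_node;   /* rows per chunk (assumed) */
    int within = row % total;             /* position inside the chunk */
    p.chunk = row / total;
    p.node  = within / cpus_per_node;
    p.cpu   = within % cpus_per_node;
    return p;
}
```

With 2 nodes and 2 CPUs each, rows 0 to 3 go to SMP0/CPU0, SMP0/CPU1, SMP1/CPU0, SMP1/CPU1, and row 4 starts the next chunk on SMP0/CPU0 again.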

Page 21: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Cyclic Assignment – Non Overlapping Communication

[Figure: execution over time (axis t) on CPU0 and CPU1 of SMP0 and SMP1; cyclic assignment on 2 SMP nodes with 2 CPUs each]

Page 22: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Cyclic Assignment - Overlapping Communication

[Figure: execution over time (axis t) on CPU0 and CPU1 of SMP0 and SMP1; cyclic assignment on 2 SMP nodes with 2 CPUs each]

Page 23: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Cyclic Assignment - Communication

[Figure: communication under cyclic assignment on 2 SMP nodes with 2 CPUs each; four CPU0/CPU1 pairs labeled SMP0, SMP1, SMP0, SMP1, grouped into two chunks]

Page 24: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Scheduling onto a Fixed Number of SMPs

Cyclic Assignment Schedule

Mirror Assignment Schedule

Cluster Assignment Schedule

Retiling

Page 25: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Mirror Assignment

[Figure: mirror assignment on 2 SMP nodes with 2 CPUs each; two chunks, the first labeled SMP0, SMP1 with CPU pairs in the order CPU0, CPU1, the second labeled SMP1, SMP0 with CPU pairs in the order CPU1, CPU0]
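Read from the figure labels, the mirror pattern runs through the CPUs in one order in the first chunk and in the reverse order in the next. A C sketch of that index mapping follows. The struct, the function name and the assumption that every second chunk is reversed are illustrative, not taken from the slides.

```c
/* Which SMP node and which CPU executes a given row of tiles. */
struct placement { int chunk; int node; int cpu; };

/* Mirror assignment (sketch): even chunks visit the CPUs in ascending
 * order, odd chunks visit them in descending order. */
struct placement mirror_assign(int row, int nodes, int cpus_per_node)
{
    struct placement p;
    int total  = nodes * cpus_per_node;   /* rows per chunk (assumed) */
    int within = row % total;
    p.chunk = row / total;
    if (p.chunk % 2 == 1)                 /* odd chunks are mirrored */
        within = total - 1 - within;
    p.node = within / cpus_per_node;
    p.cpu  = within % cpus_per_node;
    return p;
}
```

With 2 nodes and 2 CPUs each, row 3 and row 4 both land on SMP1/CPU1, so the boundary between two chunks stays on one CPU instead of crossing between nodes.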

Page 26: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Mirror Assignment – Non Overlapping Communication

[Figure: execution over time (axis t) on CPU0 and CPU1 of SMP0 and SMP1; mirror assignment on 2 SMP nodes with 2 CPUs each]

Page 27: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Mirror Assignment - Overlapping Communication

[Figure: execution over time (axis t) on CPU0 and CPU1 of SMP0 and SMP1; mirror assignment on 2 SMP nodes with 2 CPUs each]

Page 28: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Mirror Assignment - Communication

[Figure: communication under mirror assignment on 2 SMP nodes with 2 CPUs each; CPU pairs labeled SMP0, SMP1 in the order CPU0, CPU1, then SMP1, SMP0 in the order CPU1, CPU0]

Page 29: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Scheduling onto a Fixed Number of SMPs

Cyclic Assignment Schedule

Mirror Assignment Schedule

Cluster Assignment Schedule

Retiling

Page 30: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Cluster Assignment

[Figure: cluster assignment on 2 SMP nodes with 2 CPUs each (SMP0 and SMP1, each with CPU0 and CPU1); labels: tiles, “TILE”]

Page 31: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Cluster Assignment

[Figure: cluster assignment on 2 SMP nodes with 2 CPUs each (SMP0 and SMP1, each with CPU0 and CPU1); labels: TILES, GROUPS]
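A possible reading of cluster assignment is a block distribution: the rows of tiles are cut into as many contiguous blocks as there are CPUs, and each CPU executes one block. The C sketch below follows that reading. The struct, the function name and the ceiling-division block size are assumptions for illustration and are not confirmed by the slides.

```c
/* Which SMP node and which CPU executes a given row of tiles. */
struct placement { int node; int cpu; };

/* Cluster assignment (sketch): contiguous blocks of rows, one per CPU. */
struct placement cluster_assign(int row, int total_rows,
                                int nodes, int cpus_per_node)
{
    struct placement p;
    int workers = nodes * cpus_per_node;
    int block   = (total_rows + workers - 1) / workers;  /* ceiling division */
    int worker  = row / block;
    p.node = worker / cpus_per_node;
    p.cpu  = worker % cpus_per_node;
    return p;
}
```

With 8 rows on 2 nodes with 2 CPUs each, rows 0 and 1 go to SMP0/CPU0, rows 2 and 3 to SMP0/CPU1, rows 4 and 5 to SMP1/CPU0 and rows 6 and 7 to SMP1/CPU1. Neighbouring rows mostly share a CPU, which fits the low communication volume listed for this schedule.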

Page 32: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Cluster Assignment – Non Overlapping Communication

[Figure: execution over time (axis t) on CPU0 and CPU1 of SMP0 and SMP1; cluster assignment on 2 SMP nodes with 2 CPUs each]

Page 33: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Cluster Assignment – Overlapping Communication

[Figure: execution over time (axis t) on CPU0 and CPU1 of SMP0 and SMP1; cluster assignment on 2 SMP nodes with 2 CPUs each]

Page 34: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Cluster Assignment - Communication

[Figure: communication under cluster assignment on 2 SMP nodes with 2 CPUs each (SMP0 and SMP1, each with CPU0 and CPU1); labels: TILES, GROUPS]

Page 35: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Scheduling onto a Fixed Number of SMPs

Cyclic Assignment Schedule

Mirror Assignment Schedule

Cluster Assignment Schedule

Retiling

Page 36: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Retiling

[Figure: retiling on 2 SMP nodes with 2 CPUs each (SMP0 and SMP1, each with CPU0 and CPU1); old tiles and new tiles]

Page 37: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Retiling

[Figure: retiling on 2 SMP nodes with 2 CPUs each (SMP0 and SMP1, each with CPU0 and CPU1); old tiles are replaced by new tiles, retaining the computation volume of a tile]
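Retaining the computation volume of a tile while changing its shape can be sketched as follows in C. The sketch assumes two-dimensional rectangular tiles, assumes that the new tile is stretched along the dimension that is divided among the CPUs so that exactly one row of new tiles falls to each CPU, and assumes exact divisions; the struct and function names are illustrative.

```c
/* A rectangular two-dimensional tile. */
struct tile { int height; int width; };

/* Retiling (sketch): choose a new height so that `cpus` tiles cover
 * `space_height`, then shrink the width so that height * width
 * (the computation volume) is unchanged. */
struct tile retile(struct tile old, int space_height, int cpus)
{
    struct tile t;
    int volume = old.height * old.width;
    t.height = space_height / cpus;
    t.width  = volume / t.height;
    return t;
}
```

For an iteration space of height 64 on 4 CPUs, an old 4 x 8 tile (volume 32) becomes a 16 x 2 tile with the same volume.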

Page 38: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Retiling – Non Overlapping Communication

[Figure: execution over time (axis t) on CPU0 and CPU1 of SMP0 and SMP1; retiling on 2 SMP nodes with 2 CPUs each]

Page 39: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Retiling – Overlapping Communication

[Figure: execution over time (axis t) on CPU0 and CPU1 of SMP0 and SMP1; retiling on 2 SMP nodes with 2 CPUs each]

Page 40: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Retiling - Communication

[Figure: communication under retiling on 2 SMP nodes with 2 CPUs each (SMP0 and SMP1, each with CPU0 and CPU1)]

Page 41: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Overview

Tiling for parallelization
Non-overlapping vs. Overlapping execution scheme
Grouping
Application on a cluster of SMPs with a fixed number of nodes
Experimental-Simulation Results

Page 42: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Experimental Platform

Linux SMP (Symmetric Multi-Processors) Cluster
  2 nodes
  1GB RAM
  2 Pentium III 1266MHz
Myrinet high performance interconnect
GM low level message passing system

Page 43: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


The Myrinet interconnect

User-level Networking
  Based on the GM message passing interface
  All message exchange using DMA
    Directly to/from pinned userspace buffers
  Communication is offloaded to the NIC
Programmable NIC
  LANai RISC processor @ 133-333MHz
  2-8MB SRAM
2+2Gbps full duplex fiber links

Page 44: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


GM Architecture

Comprised of three main parts
  User library
  Kernel driver
  Firmware on NIC
OS bypass design
  Regions of NIC memory mapped to the VM of a process

[Figure: layers labeled User, Kernel and NIC, containing the Application, the GM Library, the GM kernel module and the GM firmware]

Page 45: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Sending and Receiving messages over Myrinet/GM

[Figure: a sending application and a receiving application, each with a Host part (Buffer, Event q) and a NIC part (Send q or Recv q, Send DMA, Recv DMA, Host DMA, LANai)]

Page 46: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Initial Code

for (i=1; i<=X; i++)
  for (j=1; j<=Y; j++)
    for (k=1; k<=Z; k++) {
      A[i][j][k] = func(A[i-1][j][k],
                        A[i][j-1][k],
                        A[i][j][k-1]);
    }
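A compilable C sketch of this three-dimensional loop is given below. The slides do not define `func`, the array type or the extents, so the addition used as the loop body, the element type `long`, the small values of `X`, `Y`, `Z` and the name `sweep` are all placeholders chosen for illustration. Index 0 in each dimension holds boundary values, which keeps the reads at i-1, j-1 and k-1 in bounds.

```c
#define X 4   /* assumed extents, far smaller than a real experiment */
#define Y 4
#define Z 4

/* Hypothetical loop body: the slides leave func unspecified. */
static long func(long a, long b, long c)
{
    return a + b + c;
}

/* The caller fills index 0 of every dimension with boundary values. */
void sweep(long A[X + 1][Y + 1][Z + 1])
{
    for (int i = 1; i <= X; i++)
        for (int j = 1; j <= Y; j++)
            for (int k = 1; k <= Z; k++)
                A[i][j][k] = func(A[i - 1][j][k],    /* dependence (1,0,0) */
                                  A[i][j - 1][k],    /* dependence (0,1,0) */
                                  A[i][j][k - 1]);   /* dependence (0,0,1) */
}
```

The three reads give the dependence vectors (1,0,0), (0,1,0) and (0,0,1), so the loop nest is a three-dimensional counterpart of the two-dimensional dependence example.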

Page 47: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Experimental results

[Figure: two plots of Speedup / #processors (0.5 to 1) against Height of Iteration Space (500 to 3500), one for the Non Overlapping Execution Scheme and one for the Overlapping Execution Scheme; curves: cyclic, mirror, cluster, retile]
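The quantity on the vertical axis, Speedup / #processors, can be computed as in the C sketch below. It uses the usual definition of speedup as sequential time divided by parallel time; how the authors measured the two times is not stated on the slides, and the function and parameter names are illustrative.

```c
/* Speedup divided by the number of processors (parallel efficiency). */
double speedup_per_processor(double t_sequential, double t_parallel,
                             int processors)
{
    double speedup = t_sequential / t_parallel;
    return speedup / processors;
}
```

A value of 1 means perfect scaling; for example, a run that takes 120 time units sequentially and 40 on 4 processors scores 0.75.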

Page 48: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Simulation results

[Figure: two plots of Speedup / #processors (0.3 to 1) against Height of Iteration Space (0 to 20000), one for the Overlapping Execution Scheme and one for the Non Overlapping Execution Scheme; curves: cyclic, mirror, cluster, retile]

Page 49: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Simulation results

[Figure: two plots of Speedup / #processors (0.3 to 1) against Height of Iteration Space (0 to 20000), one for the Non Overlapping Execution Scheme and one for the Overlapping Execution Scheme; curves: cyclic, mirror, cluster, retile]

Page 50: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Advantages - Disadvantages

          Advantages                                      Disadvantages
cyclic    + fast pipeline filling                         - communication
mirror    + better communication than cyclic              - idle time steps
                                                          - worse communication than cluster, retile
cluster   + communication:                                - slow pipeline filling
            1) little volume of data to be transferred
            2) data combined in fewer messages
retile    + fast pipeline filling                         - reorganizes tiles; annuls optimal tile
          + communication: little volume of data            shape for cache hits
            to be transferred

Page 51: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


The End

Page 52: Scheduling of Tiled Nested Loops onto a Cluster with a Fixed Number of SMP Nodes


Cyclic Assignment - Overlapping Communication

[Figure: equivalent schedulings for CPU0 and CPU1 of SMP0 and SMP1, drawn as processor (P) against time (t) diagrams: scheduling on an unlimited number of processors, and scheduling on a fixed number of processors, where the pipeline is empty while waiting for the necessary data to become available]

