Parallel Programming: Techniques and Applications using Networked Workstations and Parallel Computers
Barry Wilkinson and Michael Allen, Prentice Hall, 1998
Figure 1.1 Astrophysical N-body simulation by Scott Linssen (undergraduate University of North Carolina at Charlotte [UNCC] student).
Figure 1.2 Conventional computer having a single processor and memory.
(Diagram labels: Main memory; Processor; Instructions (to processor); Data (to or from processor))
Figure 1.3 Traditional shared memory multiprocessor model.
(Diagram labels: Processors; Interconnection network; Memory modules; One address space)
Figure 1.4 Message-passing multiprocessor model (multicomputer).
(Diagram labels: Processor; Local memory; Computers; Interconnection network; Messages)
Figure 1.5 Shared memory multiprocessor implementation.
(Diagram labels: Processor; Shared memory; Computers; Interconnection network; Messages)
Figure 1.6 MPMD structure.
(Diagram labels, shown for each of two processors: Program; Instructions; Processor; Data)
Figure 1.7 Static link multicomputer.
(Diagram labels: Computers, each drawn with blocks P, M and C; Network with direct links between computers)
Figure 1.8 Node with a switch for internode message transfers.
(Diagram labels: Computer (node); Processor; Memory; Switch; Links to other nodes)
Figure 1.9 A link between two nodes with separate wires in each direction.
(Diagram labels: Node; Link; Node)
Figure 1.10 Ring.
Figure 1.11 Two-dimensional array (mesh).
(Diagram labels: Links; Computer/processor)
Figure 1.12 Tree structure.
(Diagram labels: Root; Processing element; Links)
Figure 1.13 Three-dimensional hypercube.
(Node addresses: 000, 001, 010, 011, 100, 101, 110, 111)
Figure 1.14 Four-dimensional hypercube.
(Node addresses: 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111)
Figure 1.15 Embedding a ring onto a torus.
Ring
Figure 1.16 Embedding a mesh into a hypercube.
(Diagram labels: x and y axes, each labeled 00, 01, 11, 10; example nodal address 1011)
Figure 1.17 Embedding a tree into a mesh.
(Diagram labels: Root; A, marked at six positions)
Figure 1.18 Distribution of flits.
(Diagram labels: Packet; Head; Flit buffer; Movement; Request/Acknowledge signal(s))
Figure 1.19 A signaling method between processors for wormhole routing (Ni and McKinley, 1993).
(Diagram labels: Source processor; Destination processor; Data; R/A)
Figure 1.20 Network delay characteristics.
(Axes: Network latency against Distance (number of nodes between source and destination). Curves: Packet switching; Circuit switching; Wormhole routing)
Figure 1.21 Deadlock in store-and-forward networks.
(Diagram labels: Node 1; Node 2; Node 3; Node 4; Messages)
Figure 1.22 Multiple virtual channels mapped onto a single physical channel.
(Diagram labels: Node; Route buffer; Virtual channel; Physical link; Node)
Figure 1.23 Ethernet-type single wire network.
(Diagram labels: Ethernet; Workstations; Workstation/file server)
Figure 1.24 Ethernet frame format.
Fields as listed: Preamble (64 bits); Destination address (48 bits); Source address (48 bits); Type (16 bits); Data (variable); Frame check sequence (32 bits)
(Diagram label: Direction)
Figure 1.25 Network of workstations connected via a ring.
(Diagram labels: Network; Workstations; Workstation/file server)
Figure 1.26 Star connected network.
(Diagram labels: Workstations; Workstation/file server)
Figure 1.27 Overlapping connectivity Ethernets.
(a) Using specially designed adaptors
(b) Using separate Ethernet interfaces
Parallel programming cluster
Figure 1.28 Space-time diagram of a message-passing program.
(Diagram labels: Time; Process 1; Process 2; Process 3; Process 4; Computing; Waiting to send a message; Message; Slope indicating time to send message)
Figure 1.29 Parallelizing sequential problem: Amdahl's law.
(a) One processor: total time $t_s$, split into a serial section $f t_s$ and parallelizable sections $(1 - f) t_s$.
(b) Multiple processors ($n$ processors): total time $t_p$, split into $f t_s$ and $(1 - f) t_s / n$.
Figure 1.30 (a) Speedup against number of processors. (b) Speedup against serial fraction, f.
(Plot (a): speedup factor S(n) against number of processors, n, with curves for f = 0%, 5%, 10% and 20%. Plot (b): speedup factor S(n) against serial fraction, f, with curves for n = 16 and n = 256.)
Figure 2.1 Single program, multiple data operation.
(Diagram labels: Source file; Compile to suit processor; Executables; Processor 0 to Processor n − 1)
Figure 2.2 Spawning a process.
(Diagram labels: Time; Process 1; spawn(); Process 2; Start execution of process 2)
Figure 2.3 Passing a message between processes using send() and recv() library calls.
(Diagram labels: Process 1 with send(&x, 2); Process 2 with recv(&y, 1); Movement of data from x to y)
Figure 2.4 Synchronous send() and recv() library calls using a three-way protocol.
(a) When send() occurs before recv()
(Diagram labels: Time; Process 1 with send(); Suspend process; Request to send; Process 2 with recv(); Acknowledgment; Message; Both processes continue)
(b) When recv() occurs before send()
(Diagram labels: Time; Process 2 with recv(); Suspend process; Process 1 with send(); Request to send; Acknowledgment; Message; Both processes continue)
Figure 2.5 Using a message buffer.
(Diagram labels: Time; Process 1 with send(); Continue process; Message buffer; Process 2 with recv(); Read message buffer)
Figure 2.6 Broadcast operation.
(Diagram labels: Process 0, Process 1 and Process n − 1, each calling bcast(); buf; data; rows labeled Action and Code)
Figure 2.7 Scatter operation.
(Diagram labels: Process 0, Process 1 and Process n − 1, each calling scatter(); buf; data; rows labeled Action and Code)
Figure 2.8 Gather operation.
(Diagram labels: Process 0, Process 1 and Process n − 1, each calling gather(); buf; data; rows labeled Action and Code)
Figure 2.9 Reduce operation (addition).
(Diagram labels: Process 0, Process 1 and Process n − 1, each calling reduce(); buf; data; the operator +; rows labeled Action and Code)
Figure 2.10 Message passing between workstations using PVM.
(Diagram labels: three Workstations, each with a PVM daemon and an Application program (executable); Messages sent through network)
Figure 2.11 Multiple processes allocated to each processor (workstation).
(Diagram labels: three Workstations, each with a PVM daemon; Application program (executable); Messages sent through network)
Figure 2.12 pvm_psend() and pvm_precv() system calls.
(Diagram labels: Process 1 with pvm_psend(); Array holding data; Pack; Send buffer; Continue process; Process 2 with pvm_precv(); Wait for message; Array to receive data)
Figure 2.13 PVM packing messages, sending, and unpacking.
Process_1:
pvm_initsend();
pvm_pkint( … &x …);
pvm_pkstr( … &s …);
pvm_pkfloat( … &y …);
pvm_send(process_2 … );
Process_2:
pvm_recv(process_1 …);
pvm_upkint( … &x …);
pvm_upkstr( … &s …);
pvm_upkfloat( … &y … );
(Diagram labels: Send buffer holding x, s, y; Message; Receive buffer)
Figure 2.14 Sample PVM program.
Master

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <pvm3.h>
    #define SLAVE "spsum"
    #define PROC 10
    #define NELEM 1000

    int main() {
        int mytid, tids[PROC];
        int n = NELEM, nproc = PROC;
        int no, i, who, msgtype;
        int data[NELEM], result[PROC], tot = 0;
        char fn[255];
        FILE *fp;

        mytid = pvm_mytid();  /* Enroll in PVM */

        /* Start slave tasks */
        no = pvm_spawn(SLAVE, (char **)0, 0, "", nproc, tids);
        if (no < nproc) {
            printf("Trouble spawning slaves\n");
            for (i = 0; i < no; i++) pvm_kill(tids[i]);
            pvm_exit();
            exit(1);
        }

        /* Open input file and initialize data */
        strcpy(fn, getenv("HOME"));
        strcat(fn, "/pvm3/src/rand_data.txt");
        if ((fp = fopen(fn, "r")) == NULL) {
            printf("Can't open input file %s\n", fn);
            exit(1);
        }
        for (i = 0; i < n; i++) fscanf(fp, "%d", &data[i]);

        /* Broadcast data to slaves */
        pvm_initsend(PvmDataDefault);
        msgtype = 0;
        pvm_pkint(&nproc, 1, 1);
        pvm_pkint(tids, nproc, 1);
        pvm_pkint(&n, 1, 1);
        pvm_pkint(data, n, 1);
        pvm_mcast(tids, nproc, msgtype);

        /* Get results from slaves */
        msgtype = 5;
        for (i = 0; i < nproc; i++) {
            pvm_recv(-1, msgtype);
            pvm_upkint(&who, 1, 1);
            pvm_upkint(&result[who], 1, 1);
            printf("%d from %d\n", result[who], who);
        }

        /* Compute global sum */
        for (i = 0; i < nproc; i++) tot += result[i];
        printf("The total is %d.\n\n", tot);

        pvm_exit();  /* Program finished. Exit PVM */
        return 0;
    }

Slave

    #include <stdio.h>
    #include "pvm3.h"
    #define PROC 10
    #define NELEM 1000

    int main() {
        int mytid;
        int tids[PROC];
        int n, me, i, msgtype;
        int x, low, high, nproc, master;
        int data[NELEM], sum = 0;

        mytid = pvm_mytid();

        /* Receive data from master */
        msgtype = 0;
        pvm_recv(-1, msgtype);
        pvm_upkint(&nproc, 1, 1);
        pvm_upkint(tids, nproc, 1);
        pvm_upkint(&n, 1, 1);
        pvm_upkint(data, n, 1);

        /* Determine my position in the tids array */
        for (i = 0; i < nproc; i++)
            if (mytid == tids[i]) { me = i; break; }

        /* Add my portion of data */
        x = n / nproc;
        low = me * x;
        high = low + x;
        for (i = low; i < high; i++)
            sum += data[i];

        /* Send result to master */
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&me, 1, 1);
        pvm_pkint(&sum, 1, 1);
        msgtype = 5;
        master = pvm_parent();
        pvm_send(master, msgtype);

        /* Exit PVM */
        pvm_exit();
        return 0;
    }
Figure 2.15 Unsafe message passing with libraries.
(a) Intended behavior
(b) Possible behavior
(Diagram labels in each panel: Process 0; Process 1; lib(); send(…,1,…); recv(…,0,…); Destination; Source)
Figure 2.16 Sample MPI program.
    #include "mpi.h"
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <math.h>
    #define MAXSIZE 1000

    int main(int argc, char **argv)
    {
        int myid, numprocs;
        int data[MAXSIZE], i, x, low, high, myresult = 0, result;
        char fn[255];
        FILE *fp;

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);

        if (myid == 0) {  /* Open input file and initialize data */
            strcpy(fn, getenv("HOME"));
            strcat(fn, "/MPI/rand_data.txt");
            if ((fp = fopen(fn, "r")) == NULL) {
                printf("Can't open the input file: %s\n\n", fn);
                exit(1);
            }
            for (i = 0; i < MAXSIZE; i++) fscanf(fp, "%d", &data[i]);
        }

        /* Broadcast data */
        MPI_Bcast(data, MAXSIZE, MPI_INT, 0, MPI_COMM_WORLD);

        /* Add my portion of data */
        x = MAXSIZE / numprocs;
        low = myid * x;
        high = low + x;
        for (i = low; i < high; i++)
            myresult += data[i];
        printf("I got %d from %d\n", myresult, myid);

        /* Compute global sum */
        MPI_Reduce(&myresult, &result, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
        if (myid == 0) printf("The sum is %d.\n", result);

        MPI_Finalize();
        return 0;
    }
Figure 2.17 Theoretical communication time.
(Plot labels: Time; Number of data items (n); Startup time)
160
140
120
100
80
60
40
20
01 2 3 4 50
x0
c2g(x) = 6x2
c1g(x) = 2x2
f(x) = 4x2 + 2x + 12
Figure 2.18 Growth of function f(x) = 4x2 + 2x + 12.
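The two bounding curves in Figure 2.18 can be checked algebraically. The short derivation below is a sketch that assumes the usual definition, in which $f(x) = \Theta(g(x))$ when $c_1 g(x) \le f(x) \le c_2 g(x)$ for all $x \ge x_0$, with $g(x) = x^2$, $c_1 = 2$, and $c_2 = 6$. The value $x_0 = 3$ is computed here and is not read from the transcript.

$$
f(x) - c_1 g(x) = (4x^2 + 2x + 12) - 2x^2 = 2x^2 + 2x + 12 > 0 \quad \text{for all } x \ge 0
$$

$$
c_2 g(x) - f(x) = 6x^2 - (4x^2 + 2x + 12) = 2x^2 - 2x - 12 = 2(x - 3)(x + 2) \ge 0 \quad \text{for all } x \ge 3
$$

At $x = 3$ both $f(x)$ and $c_2 g(x)$ equal 54, so under these assumptions the upper bound holds from $x_0 = 3$ onward.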
Figure 2.19 Broadcast in a three-dimensional hypercube. [Diagram: nodes 000, 001, 010, 011, 100, 101, 110, 111; message transfers labeled 1st step, 2nd step, 3rd step.]
Figure 2.20 Broadcast as a tree construction. [Tree diagram of the processes holding the message:
Start: P000
Step 1: P000, P001
Step 2: P000, P010, P001, P011
Step 3: P000, P100, P010, P110, P001, P101, P011, P111]
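The step structure of Figures 2.19 and 2.20 can be expressed as a small rule: in each step, every node that already holds the message forwards it across one more dimension. The sketch below assumes the broadcast starts at node 0 and that step k (counting from 0) uses address bit k. The function names are hypothetical and are not taken from the book.

```c
#include <stdio.h>

/* Partner that node `id` sends to in step `step` (0-based) of a broadcast
   from node 0 in a hypercube, or -1 if `id` does not send in that step.
   Assumption: step k flips address bit k, lowest bit first. */
int hypercube_send_partner(int id, int step)
{
    int bit = 1 << step;
    /* After `step` steps, exactly the nodes 0 .. bit-1 hold the message. */
    return (id < bit) ? (id | bit) : -1;
}

/* Print every transfer of a broadcast in a d-dimensional hypercube. */
void print_broadcast_schedule(int d)
{
    for (int step = 0; step < d; step++) {
        for (int id = 0; id < (1 << d); id++) {
            int to = hypercube_send_partner(id, step);
            if (to >= 0)
                printf("step %d: node %d -> node %d\n", step + 1, id, to);
        }
    }
}
```

Calling `print_broadcast_schedule(3)` lists one transfer in the first step, two in the second, and four in the third, so all eight nodes are reached in three steps.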
Figure 2.21 Broadcast in a mesh. [Diagram: mesh links labeled with step numbers 1 to 6.]
Figure 2.22 Broadcast on an Ethernet network. [Diagram labels: Source, Destinations, Message.]
Figure 2.23 1-to-N fan-out broadcast. [Diagram labels: Source, N destinations, Sequential.]
Figure 2.24 1-to-N fan-out broadcast on a tree structure. [Diagram labels: Source, Sequential message issue, Destinations.]
Figure 2.25 Space-time diagram of a parallel program. [Diagram: Process 1, Process 2, and Process 3 against time; legend: Computing, Waiting, Message-passing system routine, Message.]
Figure 2.26 Program profile. [Plot: number of repetitions or time against statement number or regions of program (1 to 10).]
Figure 3.1 Disconnected computational graph (embarrassingly parallel problem). [Diagram labels: Input data, Processes, Results.]
Figure 3.2 Practical embarrassingly parallel computational graph with dynamic process creation and the master-slave approach. [Diagram labels: Master, Slaves, spawn(), send(), recv(), Send initial data, Collect results.]
Figure 3.3 Partitioning into regions for individual processes. [(a) Square region for each process: a 640 × 480 area with 80 × 80 regions. (b) Row region for each process: a 640 × 480 area with regions marked 10. Labels in both panels: Process, Map, x, y.]
Figure 3.4 Mandelbrot set. [Plot: real axis from −2 to +2, imaginary axis from −2 to +2.]
Figure 3.5 Work pool approach. [Diagram: work pool holding tasks (xa, ya), (xb, yb), (xc, yc), (xd, yd), (xe, ye); arrows labeled Task and Return results/request new task.]
Figure 3.6 Counter termination. [Diagram labels: 0, disp_height; Row sent, Row returned; Increment, Decrement; Rows outstanding in slaves (count); Terminate.]
Figure 3.7 Computing π by a Monte Carlo method. [Diagram: circle of area π inside a 2 × 2 square of total area 4.]
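The geometry of Figure 3.7 suggests a direct estimate: the fraction of random points in the 2 × 2 square that fall inside the circle approaches π/4. The following is a minimal sequential sketch under that reading. The function name `estimate_pi`, the use of `rand()`, and the seed parameter are assumptions for illustration and are not from the book.

```c
#include <stdlib.h>

/* Estimate pi by sampling points uniformly in the square [-1, 1] x [-1, 1]
   (total area 4) and counting those inside the unit circle (area pi).
   rand() is used only for brevity; it is not a high-quality generator. */
double estimate_pi(long samples, unsigned int seed)
{
    long inside = 0;
    srand(seed);
    for (long i = 0; i < samples; i++) {
        double x = 2.0 * rand() / RAND_MAX - 1.0;
        double y = 2.0 * rand() / RAND_MAX - 1.0;
        if (x * x + y * y <= 1.0)
            inside++;
    }
    /* (points inside) / (points total) approximates pi / 4 */
    return 4.0 * (double)inside / (double)samples;
}
```

With a million samples the estimate is typically accurate to two or three decimal places; the error shrinks only with the square root of the sample count.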
Figure 3.8 Function being integrated in computing π by a Monte Carlo method. [Plot of $f(x)$: the curve $y = \sqrt{1 - x^2}$, with both axes marked at 1.]
Figure 3.9 Parallel Monte Carlo integration. [Diagram labels: Master, Slaves, Random number process, Request, Random number, Partial sum.]
Figure 3.10 Parallel computation of a sequence. [Diagram labels: $x_1$, $x_2$, $x_{k-1}$, $x_k$, $x_{k+1}$, $x_{k+2}$, $x_{2k-1}$, $x_{2k}$.]
Figure 4.1 Partitioning a sequence of numbers into parts and adding the parts. [Diagram: parts $x_0 \ldots x_{(n/m)-1}$, $x_{n/m} \ldots x_{(2n/m)-1}$, …, $x_{(m-1)n/m} \ldots x_{n-1}$, each feeding a + node; the partial sums feed a final + node that produces Sum.]
Figure 4.2 Tree construction. [Diagram labels: Initial problem, Divide problem, Final tasks.]
Figure 4.3 Dividing a list into parts. [Tree diagram, from the original list $x_0 \ldots x_{n-1}$ downward:
P0
P0, P4
P0, P2, P4, P6
P0, P1, P2, P3, P4, P5, P6, P7]
Figure 4.4 Partial summation. [Tree diagram, from the list $x_0 \ldots x_{n-1}$ to the final sum:
P0, P1, P2, P3, P4, P5, P6, P7
P0, P2, P4, P6
P0, P4
P0 (Final sum)]
Figure 4.5 Part of a search tree. [Diagram: tree of OR nodes; result labeled Found/Not found.]
Figure 4.6 Quadtree.
Figure 4.7 Dividing an image. [Diagram labels: Image area, First division into four parts, Second division.]
Figure 4.8 Bucket sort. [Diagram, top to bottom: Unsorted numbers, Buckets, Sort contents of buckets, Merge lists, Sorted numbers.]
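The stages named in Figure 4.8 (distribute into buckets, sort each bucket, merge the lists) can be sketched sequentially as follows. This is an illustration under stated assumptions, not the book's code: the numbers are integers in the range 0 to `range` − 1, the buckets cover equal sub-ranges, `m` is at least 1, and each bucket is sorted with the C library's `qsort`. The function name `bucket_sort` and its parameters are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sort n integers, each in [0, range), using m >= 1 equal-width buckets.
   Returns 0 on success, -1 if memory allocation fails. */
int bucket_sort(int *a, size_t n, int range, size_t m)
{
    size_t *start = calloc(m + 1, sizeof *start);  /* bucket start offsets */
    size_t *next = malloc(m * sizeof *next);       /* next free slot per bucket */
    int *tmp = malloc((n ? n : 1) * sizeof *tmp);  /* buckets stored back to back */
    if (!start || !next || !tmp) {
        free(start); free(next); free(tmp);
        return -1;
    }

    /* Count the numbers that fall into each bucket. */
    for (size_t i = 0; i < n; i++)
        start[(size_t)a[i] * m / (size_t)range + 1]++;
    /* Prefix sums turn the counts into start offsets. */
    for (size_t b = 0; b < m; b++)
        start[b + 1] += start[b];
    for (size_t b = 0; b < m; b++)
        next[b] = start[b];

    /* Distribute the numbers into their buckets. */
    for (size_t i = 0; i < n; i++)
        tmp[next[(size_t)a[i] * m / (size_t)range]++] = a[i];

    /* Sort the contents of each bucket. */
    for (size_t b = 0; b < m; b++)
        qsort(tmp + start[b], start[b + 1] - start[b], sizeof *tmp, cmp_int);

    /* Buckets are already in range order, so merging is a plain copy. */
    memcpy(a, tmp, n * sizeof *a);
    free(start); free(next); free(tmp);
    return 0;
}
```

Because each bucket is independent once the numbers are distributed, the per-bucket sort is the part that the parallel versions in the following figures assign to separate processors.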
Figure 4.9 One parallel version of bucket sort. [Diagram, top to bottom: Unsorted numbers, Buckets, Sort contents of buckets (p processors), Merge lists, Sorted numbers.]
Figure 4.10 Parallel version of bucket sort. [Diagram, top to bottom: Unsorted numbers (n/m numbers), Small buckets, Empty small buckets, Large buckets, Sort contents of buckets, Merge lists, Sorted numbers; annotation: p processors.]
Figure 4.11 “All-to-all” broadcast. [Diagram labels: Process 0, Process 1, Process n − 2, Process n − 1; buffers indexed 0 to n − 1; Send, Receive.]
Figure 4.12 Effect of “all-to-all” on an array.
Before “all-to-all”:
P0: A0,0 A0,1 A0,2 A0,3
P1: A1,0 A1,1 A1,2 A1,3
P2: A2,0 A2,1 A2,2 A2,3
P3: A3,0 A3,1 A3,2 A3,3
After “all-to-all”:
P0: A0,0 A1,0 A2,0 A3,0
P1: A0,1 A1,1 A2,1 A3,1
P2: A0,2 A1,2 A2,2 A3,2
P3: A0,3 A1,3 A2,3 A3,3
Figure 4.13 Numerical integration using rectangles. [Plot of f(x) against x; labels: a, p, q, b on the x axis, f(p), f(q), δ.]
Figure 4.14 More accurate numerical integration using rectangles. [Plot of f(x) against x; labels: a, p, q, b on the x axis, f(p), f(q), δ.]
Figure 4.15 Numerical integration using the trapezoidal method. [Plot of f(x) against x; labels: a, p, q, b on the x axis, f(p), f(q), δ.]
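Using the labels of Figure 4.15, the trapezoidal method approximates the area of one strip between p and q by ½(f(p) + f(q))δ, where δ = q − p. Summing the strips from a to b gives the sketch below. The function names and the choice of `n` equal-width strips are assumptions made for illustration.

```c
/* Approximate the integral of f over [a, b] with n trapezoids of width delta.
   Interior points are shared by two strips, so they carry weight 1 and the
   two end points carry weight 1/2. */
double trapezoid(double (*f)(double), double a, double b, int n)
{
    double delta = (b - a) / n;
    double sum = 0.5 * (f(a) + f(b));
    for (int i = 1; i < n; i++)
        sum += f(a + i * delta);
    return sum * delta;
}

/* Hypothetical example integrand: f(x) = x * x. */
double square(double x)
{
    return x * x;
}
```

For example, `trapezoid(square, 0.0, 1.0, 1000)` approximates the exact value 1/3. In a parallel version, each process could apply the same function to its own sub-interval of [a, b] and the partial results could then be added.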
Figure 4.16 Adaptive quadrature construction. [Plot of f(x) against x with areas A, B, and C.]
Figure 4.17 Adaptive quadrature with false termination. [Plot of f(x) against x with areas A and B, and C = 0.]
Figure 4.18 Clustering distant bodies. [Diagram labels: Distant cluster of bodies, Center of mass, r.]
Figure 4.19 Recursive division of two-dimensional space. [Diagram labels: Particles, Subdivision direction, Partial quadtree.]
Figure 4.20 Orthogonal recursive bisection method.
Figure 4.21 Process diagram for Problem 4-12(b). [Diagram: + nodes; labels: Binary Tree, log n numbers, Result.]
Figure 4.22 Bisection method for finding the zero crossing location of a function. [Plot of f(x) with axes x and y; labels: a, b, f(a), f(b).]
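The bisection method of Figure 4.22 repeatedly halves the interval between a and b, keeping the half in which f changes sign. The following sequential sketch assumes that a < b, that f(a) and f(b) have opposite signs, and that f is continuous. The function names and the tolerance parameter are hypothetical.

```c
/* Return an approximate zero crossing of f in [a, b].
   Assumes a < b and that f(a) and f(b) have opposite signs. */
double bisect(double (*f)(double), double a, double b, double tol)
{
    double fa = f(a);
    while (b - a > tol) {
        double mid = 0.5 * (a + b);
        double fm = f(mid);
        if ((fa < 0) == (fm < 0)) {
            a = mid;    /* no sign change in [a, mid]: crossing is in [mid, b] */
            fa = fm;
        } else {
            b = mid;    /* sign change in [a, mid] */
        }
    }
    return 0.5 * (a + b);
}

/* Hypothetical example: zero crossing at sqrt(2). */
double x_squared_minus_two(double x)
{
    return x * x - 2.0;
}
```

For example, `bisect(x_squared_minus_two, 0.0, 2.0, 1e-6)` returns a value close to 1.41421.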
Figure 4.23 Convex hull (Problem 4-22).
P0 P1 P2 P3 P4 P5
Figure 5.1 Pipelined processes.
Figure 5.2 Pipeline for an unfolded loop. [Diagram: five stages, each with ports sin, sout, and a; inputs a[0], a[1], a[2], a[3], a[4]; output sum.]
Figure 5.3 Pipeline for a frequency filter. [Diagram: input f(t); five stages f0, f1, f2, f3, f4, each with ports fin and fout; stage outputs labeled Signal without frequency f0, Signal without frequency f1, Signal without frequency f2, Signal without frequency f3, and Filtered signal.]
Figure 5.4 Space-time diagram of a pipeline. [Diagram: processes P0 to P5 against time, with cells labeled Instance 1 to Instance 7; time spans marked p − 1 and m.]
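The two time spans marked in Figure 5.4 suggest a simple count of pipeline cycles. The short worked example below assumes that every process takes exactly one cycle per instance, that p is the number of pipeline processes, and that m is the number of problem instances; the symbol $t_{\text{total}}$ is introduced here for illustration.

$$
t_{\text{total}} = (p - 1) + m \ \text{cycles}
$$

The first $p - 1$ cycles fill the pipeline, after which one instance completes in each of the next $m$ cycles. With $p = 6$ and $m = 7$, for instance, this gives $5 + 7 = 12$ cycles, compared with $6 \times 7 = 42$ cycles if the same work were done by one process at a time.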
Figure 5.5 Alternative space-time diagram. [Diagram: rows Instance 0 to Instance 4 against time, each row showing P0, P1, P2, P3, P4, P5.]
Figure 5.6 Pipeline processing 10 data elements. [(a) Pipeline structure: input sequence d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 and processes P0 to P9. (b) Timing diagram: rows P0 to P9 against time, with cells labeled d0 to d9; time spans marked p − 1 and n.]
Figure 5.7 Pipeline processing where information passes to next stage before end of process. [(a) Processes with the same execution time. (b) Processes not with the same execution time. Both panels: P0 to P5 against time; labels: Information passed to next stage, Information transfer sufficient to start next process.]
Figure 5.8 Partitioning processes onto processors. [Diagram: processes P0 to P11 placed on Processor 0, Processor 1, and Processor 2.]
Figure 5.9 Multiprocessor system with a line configuration. [Diagram labels: Host computer, Multiprocessor.]
Figure 5.10 Pipelined addition. [Diagram: processes P0, P1, P2, P3, P4 with the partial sums $\sum_{1}^{1} i$, $\sum_{1}^{2} i$, $\sum_{1}^{3} i$, $\sum_{1}^{4} i$, $\sum_{1}^{5} i$.]
Figure 5.11 Pipelined addition of numbers with a master process and ring configuration. [Diagram: master process sending $d_{n-1} \ldots d_2\, d_1\, d_0$ to slaves P0, P1, P2, …, Pn−1 and receiving Sum.]
Figure 5.12 Pipelined addition of numbers with direct access to slave processes. [Diagram: master process passing numbers $d_0$, $d_1$, …, $d_{n-1}$ to slaves P0, P1, P2, …, Pn−1 and receiving Sum.]
![Page 102: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/102.jpg)
Figure 5.13 Steps in insertion sort with five numbers.
[Time-space diagram: columns P0, P1, P2, P3, P4; rows are time in cycles, 1 to 10. The sequence 4, 3, 1, 2, 5 enters P0 one number at a time, rightmost first; at the end the processes hold 5, 4, 3, 2, 1.]
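The steps in Figure 5.13 can be sketched as a cycle-by-cycle simulation. This is a minimal sketch rather than the book's code: the function name `pipeline_insertion_sort` and the list slots standing in for processes P0 to Pn−1 are assumptions made for illustration. Each process keeps the largest number it has seen and passes the smaller one to its right-hand neighbour.

```python
def pipeline_insertion_sort(numbers):
    """Simulate the insertion-sort pipeline, one list slot per process.

    numbers is consumed from the right, as in Figure 5.13
    (for 4, 3, 1, 2, 5 the value 5 enters P0 first).
    Returns (values held by P0..Pn-1, number of cycles simulated).
    """
    n = len(numbers)
    held = [None] * n        # value currently kept by each process
    incoming = [None] * n    # value arriving at each process this cycle
    pending = list(numbers)  # numbers still waiting to enter P0
    cycles = 0
    while pending or any(v is not None for v in incoming):
        cycles += 1
        if pending:
            incoming[0] = pending.pop()   # rightmost number enters first
        outgoing = [None] * n
        for p in range(n):
            x = incoming[p]
            if x is None:
                continue
            if held[p] is None:
                held[p] = x               # first number received is kept
            elif x > held[p]:
                outgoing[p] = held[p]     # pass the smaller one onward
                held[p] = x
            else:
                outgoing[p] = x
        # values sent in this cycle reach the next process in the next cycle
        incoming = [None] + outgoing[:-1]
    return held, cycles
```

In this model, where a process receives and compares in the same cycle, the five numbers of the figure take 2n − 1 = 9 cycles to settle.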
![Page 103: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/103.jpg)
Figure 5.14 Pipeline for sorting using insertion sort.
[Labels: P0, P1, P2; Series of numbers xn−1 … x1 x0; Compare; xmax; Smaller numbers; Largest number; Next largest number]
![Page 104: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/104.jpg)
Figure 5.15 Insertion sort with results returned to the master process using a bidirectional line configuration.
[Labels: Master process; dn−1 … d2 d1 d0; Sorted sequence; P0, P1, P2, …, Pn−1]
![Page 105: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/105.jpg)
Figure 5.16 Insertion sort with results returned.
[Time-space diagram for P0 to P4, shown for n = 5: Sorting phase (2n − 1), then Returning sorted numbers (n)]
![Page 106: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/106.jpg)
Figure 5.17 Pipeline for sieve of Eratosthenes.
[Labels: Series of numbers xn−1 … x1 x0; P0 (1st prime number), P1 (2nd prime number), P2 (3rd prime number); Compare multiples; Not multiples of 1st prime number]
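The pipeline of Figure 5.17 can be sketched sequentially as a chain of stages, each holding one prime and discarding its multiples. This is an illustrative sketch only: the function name `sieve_pipeline` and the use of a list in place of real processes are assumptions, not the book's implementation.

```python
def sieve_pipeline(limit):
    """Pass 2..limit through a chain of stages; each stage holds one prime."""
    stage_primes = []                 # stage_primes[k] is the prime held by stage P_k
    for x in range(2, limit + 1):     # series of numbers fed into P0
        for p in stage_primes:
            if x % p == 0:            # multiple of this stage's prime: discard
                break
        else:
            # x was not a multiple at any existing stage, so it becomes
            # the prime held by a new stage at the end of the pipeline
            stage_primes.append(x)
    return stage_primes
```

For example, `sieve_pipeline(30)` leaves the stages holding 2, 3, 5, 7, 11, 13, 17, 19, 23 and 29.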
![Page 107: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/107.jpg)
Figure 5.18 Solving an upper triangular set of linear equations using a pipeline.
[Stages: P0 Compute x0; P1 Compute x1; P2 Compute x2; P3 Compute x3. Values passed onward: x0 after P0; x0, x1 after P1; x0, x1, x2 after P2; x0, x1, x2, x3 after P3]
![Page 108: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/108.jpg)
Figure 5.19 Pipeline processing using back substitution.
[Time-space diagram: Processes P0 to P5 against Time; markers: First value passed onward; Final computed value]
![Page 109: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/109.jpg)
Figure 5.20 Operations in back substitution pipeline.
[Operations of each process in time order; each send(x) is matched by the recv(x) of the next process]
P0: divide; send(x0); end
P1: recv(x0); send(x0); multiply/add; divide/subtract; send(x1); end
P2: recv(x0); send(x0); multiply/add; recv(x1); send(x1); multiply/add; divide/subtract; send(x2); end
P3: recv(x0); send(x0); multiply/add; recv(x1); send(x1); multiply/add; recv(x2); send(x2); multiply/add; divide/subtract; send(x3); end
P4: recv(x0); send(x0); multiply/add; recv(x1); send(x1); multiply/add; recv(x2); send(x2); multiply/add; recv(x3); send(x3); multiply/add; divide/subtract; send(x4); end
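The operations of Figure 5.20 can be sketched as a sequential loop in which iteration i plays the role of process Pi. This is a hedged sketch: it assumes the system is stored so that equation i involves only x0 to xi (which the compute order x0, x1, x2, … in Figure 5.18 suggests), and the names `back_substitution_pipeline`, `a` and `b` are illustrative.

```python
def back_substitution_pipeline(a, b):
    """Solve sum(a[i][j] * x[j] for j <= i) == b[i], one stage per unknown.

    Stage i mirrors process P_i in Figure 5.20: one multiply/add for each
    value received from the left, then a final divide/subtract.
    """
    n = len(b)
    x = []                        # values already sent down the pipeline
    for i in range(n):            # stage P_i
        total = 0.0
        for j in range(i):        # recv(x_j), forward it, multiply/add
            total += a[i][j] * x[j]
        x.append((b[i] - total) / a[i][i])   # divide/subtract, then send(x_i)
    return x
```

In a real pipeline the stages overlap in time, as the diagram shows; the loop above only reproduces the arithmetic each stage performs.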
![Page 110: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/110.jpg)
Figure 5.21 Pipeline for Problem 5-9.
[Four stages in a line, each with inputs x, a, yin and output yout; stage inputs x1 … x4 and a1 … a4; Output: y4 y3 y2 y1]
![Page 111: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/111.jpg)
Figure 5.22 Audio histogram display.
(a) Pipeline solution [Labels: Audio input (digitized); Pipeline; Display]
(b) Direct decomposition [Labels: Audio input (digitized); Display]
![Page 112: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/112.jpg)
Figure 6.1 Processes reaching the barrier at different times.
[Labels: Processes P0, P1, P2, …, Pn−1; Time; Barrier; Active; Waiting]
![Page 113: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/113.jpg)
Figure 6.2 Library call barriers.
[Processes P0, P1, …, Pn−1 each call Barrier(); processes wait until all reach their barrier call]
![Page 114: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/114.jpg)
Figure 6.3 Barrier using a centralized counter.
[Processes P0, P1, …, Pn−1 each call Barrier(); Counter, C: increment and check for n]
![Page 115: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/115.jpg)
Figure 6.4 Barrier implementation in a message-passing system.
Master:
    for(i=0;i<n;i++) recv(Pany);    /* arrival phase */
    for(i=0;i<n;i++) send(Pi);      /* departure phase */
Slave processes:
    Barrier: send(Pmaster); recv(Pmaster);
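The arrival and departure phases of Figure 6.4 can be tried out with threads and queues standing in for processes and messages. This is a sketch under that assumption; `run_barrier_demo`, the event log and the queue names are hypothetical and only serve to make the ordering visible.

```python
import queue
import threading


def run_barrier_demo(n):
    """Run n slave threads through one master-coordinated barrier."""
    to_master = queue.Queue()                      # slaves -> master
    to_slave = [queue.Queue() for _ in range(n)]   # master -> slave i
    log = []                                       # events in the order they occur
    lock = threading.Lock()

    def record(event):
        with lock:
            log.append(event)

    def master():
        for _ in range(n):          # arrival phase: recv(Pany)
            to_master.get()
        record("release")
        for i in range(n):          # departure phase: send(Pi)
            to_slave[i].put("go")

    def slave(i):
        record(("arrive", i))
        to_master.put(i)            # send(Pmaster)
        to_slave[i].get()           # recv(Pmaster): blocks until released
        record(("depart", i))

    threads = [threading.Thread(target=master)]
    threads += [threading.Thread(target=slave, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

In the returned log every "arrive" entry precedes "release" and every "depart" entry follows it, which is the barrier property.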
![Page 116: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/116.jpg)
Figure 6.5 Tree barrier.
[Labels: P0 P1 P2 P3 P4 P5 P6 P7; Arrival at barrier; Departure from barrier; Synchronizing message]
![Page 117: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/117.jpg)
Figure 6.6 Butterfly construction.
[Labels: P0 P1 P2 P3 P4 P5 P6 P7; 1st stage; 2nd stage; 3rd stage; Time]
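The three stages of Figure 6.6 can be illustrated by computing, for each stage, which process each process synchronizes with. The pairing rule used here (partner = i XOR 2^s at stage s) is the usual butterfly pattern and is an assumption, since the transcript does not spell out the pairs; `butterfly_partners` is a hypothetical helper name.

```python
def butterfly_partners(n):
    """Return, for each stage, the partner of every process (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    stages = []
    distance = 1
    while distance < n:
        # at this stage process i pairs with the process `distance` away
        stages.append([i ^ distance for i in range(n)])
        distance *= 2
    return stages
```

For eight processes this gives three stages, so each process takes part in log2(8) = 3 pairwise synchronizations.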
![Page 118: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/118.jpg)
Figure 6.7 Data parallel computation.
[Instruction: a[] = a[] + k; Processors execute a[0]=a[0]+k; a[1]=a[1]+k; … a[n-1]=a[n-1]+k; on elements a[0], a[1], …, a[n-1]]
![Page 119: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/119.jpg)
Figure 6.8 Data parallel prefix sum operation.
[Numbers: x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15. Contents of position m after each step:]
Step 1 (j = 0), Add: $\sum_{i=\max(0,\,m-1)}^{m} x_i$
Step 2 (j = 1), Add: $\sum_{i=\max(0,\,m-3)}^{m} x_i$
Step 3 (j = 2), Add: $\sum_{i=\max(0,\,m-7)}^{m} x_i$
Final step (j = 3), Add: $\sum_{i=0}^{m} x_i$
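The four steps of Figure 6.8 can be reproduced with a short sequential sketch in which the inner loop stands for what all processors would do simultaneously. The function name `data_parallel_prefix_sum` is an assumption for illustration; at step j each position adds the value held 2^j positions to its left, which is the pattern the partial sums in the figure follow.

```python
def data_parallel_prefix_sum(x):
    """Prefix sums in ceil(log2 n) steps; step j adds the element 2**j to the left."""
    x = list(x)
    n = len(x)
    j = 0
    while 2 ** j < n:
        offset = 2 ** j
        old = x[:]                       # every position reads the previous step's values
        for i in range(offset, n):       # conceptually performed by all processors at once
            x[i] = old[i] + old[i - offset]
        j += 1
    return x
```

For 16 numbers the loop runs for j = 0, 1, 2, 3, matching the four steps in the figure.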
![Page 120: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/120.jpg)
Figure 6.9 Convergence rate.
[Plot: Computed value against Iteration (t, t+1); Exact value; Error]
![Page 121: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/121.jpg)
Figure 6.10 Allgather operation.
[Process 0, Process 1, …, Process n − 1 each call Allgather(); Labels: data; Send buffer; Receive buffer; x0, x1, …, xn−1]
![Page 122: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/122.jpg)
Figure 6.11 Effects of computation and communication in Jacobi iteration.
[Plot: x-axis Number of processors, p (0 to 32 in steps of 4); y-axis Execution time (τ = 1), ticks at 1 × 10^6 and 2 × 10^6; curves: Overall, Communication, Computation]
![Page 123: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/123.jpg)
Figure 6.12 Heat distribution problem.
[Labels: Metal plate with axes i and j; Enlarged view of point $h_{i,j}$ and its neighbours $h_{i-1,j}$, $h_{i+1,j}$, $h_{i,j-1}$, $h_{i,j+1}$]
![Page 124: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/124.jpg)
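Figure 6.13 Natural ordering of heat distribution problem.
[Points numbered row by row: $x_1, x_2, \ldots, x_{k-1}, x_k$; then $x_{k+1}, x_{k+2}, \ldots, x_{2k-1}, x_{2k}$; up to $x_{k^2}$. Point $x_i$ is shown with neighbours $x_{i-1}$, $x_{i+1}$, $x_{i-k}$, $x_{i+k}$]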
![Page 125: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/125.jpg)
Figure 6.14 Message passing for heat distribution problem.
[Grid of processes indexed by row i and column j; the figure shows the same sequence in each of five neighbouring processes:]
    send(g, Pi-1,j);
    send(g, Pi+1,j);
    send(g, Pi,j-1);
    send(g, Pi,j+1);
    recv(w, Pi-1,j);
    recv(x, Pi+1,j);
    recv(y, Pi,j-1);
    recv(z, Pi,j+1);
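What each process does with the four received values w, x, y and z in Figure 6.14 can be sketched sequentially. The update rule below (new value = average of the four neighbours) is the usual Jacobi rule for this problem and is an assumption here, since the transcript shows only the message exchange; `jacobi_step` and `solve_heat` are illustrative names.

```python
def jacobi_step(h):
    """One Jacobi iteration: each interior point becomes the mean of its 4 neighbours."""
    rows, cols = len(h), len(h[0])
    new = [row[:] for row in h]          # boundary values are copied unchanged
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # all reads use the old array h, as in a synchronous iteration
            new[i][j] = 0.25 * (h[i - 1][j] + h[i + 1][j]
                                + h[i][j - 1] + h[i][j + 1])
    return new


def solve_heat(h, iterations):
    """Apply a fixed number of Jacobi iterations and return the final grid."""
    for _ in range(iterations):
        h = jacobi_step(h)
    return h
```

Reading only from the old array and writing to a new one corresponds to every process sending its current value g before it receives and computes.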
![Page 126: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/126.jpg)
Figure 6.15 Partitioning heat distribution problem.
[Two partitions among P0, P1, …, Pp−1: Blocks; Strips (columns)]
![Page 127: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/127.jpg)
Figure 6.16 Communication consequences of partitioning.
[Square blocks: edge length $n/\sqrt{p}$; Strips: edge length n]
![Page 128: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/128.jpg)
Figure 6.17 Startup times for block and strip partitions.
[Plot: x-axis Processors, p (1, 10, 100, 1000); y-axis tstartup (0, 1000, 2000); regions: Strip partition best; Block partition best]
![Page 129: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/129.jpg)
Figure 6.18 Configurating array into contiguous rows for each process, with ghost points.
[Labels: Process i; Process i+1; Array held by process i; Array held by process i+1; One row of points; Ghost points; Copy]
![Page 130: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/130.jpg)
Figure 6.19 Room for Problem 6-14.
[Dimensions: 10 ft, 10 ft, 4 ft; temperatures: 20°C, 100°C]
![Page 131: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/131.jpg)
Figure 6.20 Road junction for Problem 6-16.
[Label: vehicle]
![Page 132: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/132.jpg)
Figure 6.21 Figure for Problem 6-23.
[Labels: Airflow; Actual dimensions selected at will]
![Page 133: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/133.jpg)
Figure 7.1 Load balancing.
(a) Imperfect load balancing leading to increased execution time
(b) Perfect load balancing
[Both panels: Processors P0 to P5 against Time; t]
![Page 134: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/134.jpg)
Figure 7.2 Centralized work pool.
[Labels: Master process; Work pool; Queue; Tasks; Slave “worker” processes; Request task (and possibly submit new tasks); Send task]
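The request/send cycle of Figure 7.2 can be sketched with worker threads drawing from a shared queue. This is a minimal sketch, not the book's message-passing code: threads stand in for slave processes, `queue.Queue` stands in for the master's work pool, and the names `run_work_pool` and `process` are hypothetical. The `process` callback returns a result and a list of new tasks to submit.

```python
import queue
import threading


def run_work_pool(initial_tasks, process, workers=4):
    """Centralized work pool: workers take tasks and may submit new ones."""
    pool = queue.Queue()             # the master's task queue
    results = []
    lock = threading.Lock()
    for task in initial_tasks:
        pool.put(task)

    def worker():
        while True:
            task = pool.get()        # request task
            if task is None:         # sentinel: no more work
                pool.task_done()
                return
            result, new_tasks = process(task)
            for t in new_tasks:      # possibly submit new tasks
                pool.put(t)
            with lock:
                results.append(result)
            pool.task_done()         # only after any new tasks are queued

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    pool.join()                      # queue empty and every taken task finished
    for _ in threads:
        pool.put(None)
    for t in threads:
        t.join()
    return results
```

Because a worker queues its new tasks before marking its own task as done, the pool cannot appear finished while work is still being generated. Results arrive in whatever order the workers finish.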
![Page 135: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/135.jpg)
Figure 7.3 A distributed work pool.
[Labels: Master, Pmaster; Initial tasks; Process M0 … Process Mn−1; Slaves]
![Page 136: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/136.jpg)
Figure 7.4 Decentralized work pool.
[Labels: four Process nodes; Requests/tasks]
![Page 137: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/137.jpg)
Figure 7.5 Decentralized selection algorithm requesting tasks between slaves.
[Labels: Slave Pi and Slave Pj, each with a Local selection algorithm; Requests]
![Page 138: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/138.jpg)
Figure 7.6 Load balancing using a pipeline structure.
[Labels: Master process; P0, P1, P2, P3, …, Pn−1]
![Page 139: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/139.jpg)
Figure 7.7 Using a communication process in line load balancing.
[Labels: Pcomm; Ptask; If buffer empty, make request; Receive task from request; If buffer full, send task; Request for task; If free, request task; Receive task from request]
![Page 140: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/140.jpg)
Figure 7.8 Load balancing using a tree.
[Labels: tree of processes P0, P1, P2, P3, P4, P5, P6; Task when requested]
![Page 141: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/141.jpg)
Figure 7.9 Termination using message acknowledgments.
[Labels: Process (Inactive, Active); Parent; First task; Task; Acknowledgment; Final acknowledgment; Other processes]
![Page 142: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/142.jpg)
Figure 7.10 Ring termination detection algorithm.
[Labels: P0, P1, P2, …, Pn−1; Token passed to next processor when reached local termination condition]
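The token rule of Figure 7.10 can be illustrated with a small timing model. This sketch assumes that a process never becomes active again after reaching its local termination condition, that P0 starts the token, and that each hop takes a fixed time; `ring_termination_time`, `local_done` and `hop` are hypothetical names introduced for the example.

```python
def ring_termination_time(local_done, hop=1):
    """Time at which the token returns to P0, given each process's local finish time.

    local_done[i] is when P_i reaches its local termination condition;
    hop is the assumed time for the token to travel to the next process.
    """
    token_time = local_done[0]           # P0 creates the token once it has terminated
    for done in local_done[1:]:
        arrival = token_time + hop
        token_time = max(arrival, done)  # token is held until this process has terminated
    return token_time + hop              # final hop back to P0 signals global termination
```

With finish times 3, 1, 10 and 2, the token waits at P2 until time 10 and returns to P0 at time 12.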
![Page 143: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/143.jpg)
Figure 7.11 Process algorithm for local termination.
[Labels: Token; Terminated; AND]
![Page 144: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/144.jpg)
Figure 7.12 Passing task to previous processes.
[Labels: processes P0, Pi, Pj, Pn−1; Task]
![Page 145: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/145.jpg)
Figure 7.13 Tree termination.
[Labels: AND (three nodes); Terminated (three signals)]
![Page 146: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/146.jpg)
Figure 7.14 Climbing a mountain.
[Labels: Base camp; Summit; Possible intermediate camps; points A, B, C, D, E, F]
![Page 147: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/147.jpg)
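Figure 7.15 Graph of mountain climb.
[Vertices A, B, C, D, E, F. Edge weights, with endpoints as given by the adjacency list of Figure 7.16: A→B 10; B→C 8; B→D 13; B→E 24; B→F 51; C→D 14; D→E 9; E→F 17]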
![Page 148: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/148.jpg)
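Figure 7.16 Representing a graph.

(a) Adjacency matrix (rows: Source; columns: Destination)

| Source | A | B | C | D | E | F |
|--------|---|----|---|----|----|----|
| A | ∞ | 10 | ∞ | ∞ | ∞ | ∞ |
| B | ∞ | ∞ | 8 | 13 | 24 | 51 |
| C | ∞ | ∞ | ∞ | 14 | ∞ | ∞ |
| D | ∞ | ∞ | ∞ | ∞ | 9 | ∞ |
| E | ∞ | ∞ | ∞ | ∞ | ∞ | 17 |
| F | ∞ | ∞ | ∞ | ∞ | ∞ | ∞ |

(b) Adjacency list (Source: destination and Weight entries, terminated by NULL)

A: B 10
B: C 8; D 13; E 24; F 51
C: D 14
D: E 9
E: F 17
F: NULL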
![Page 149: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/149.jpg)
Figure 7.17 Moore’s shortest-path algorithm.
[Labels: Vertex i with distance $d_i$; Vertex j with distance $d_j$; edge weight $w_{i,j}$]
![Page 150: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/150.jpg)
Figure 7.18 Distributed graph search.
[Labels: Master process; Start at source vertex; Process A, Process B, Process C, each with Vertex, w[], dist; New distance (messages between processes); Other processes]
![Page 151: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/151.jpg)
Figure 7.19 Sample maze for Problem 7-9.
[Labels: Entrance; Exit; Search path]
![Page 152: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/152.jpg)
Figure 7.20 Plan of rooms for Problem 7-10.
[Diagram labels: entrance, gold.]
![Page 153: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/153.jpg)
Figure 7.21 Graph representation for Problem 7-10.
[Diagram labels: room A, room B, door.]
![Page 154: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/154.jpg)
Figure 8.1 Shared memory multiprocessor using a single bus.
[Diagram labels: processors, cache, bus, memory modules.]
![Page 155: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/155.jpg)
TABLE 8.1 SOME EARLY PARALLEL PROGRAMMING LANGUAGES

| Language | Originator/date | Comments |
| --- | --- | --- |
| Concurrent Pascal | Brinch Hansen, 1975 [a] | Extension to Pascal |
| Ada | U.S. Dept. of Defense, 1979 [b] | Completely new language |
| Modula-P | Bräunl, 1986 [c] | Extension to Modula 2 |
| C* | Thinking Machines, 1987 [d] | Extension to C for SIMD systems |
| Concurrent C | Gehani and Roome, 1989 [e] | Extension to C |
| Fortran D | Fox et al., 1990 [f] | Extension to Fortran for data parallel programming |

a. Brinch Hansen, P. (1975), “The Programming Language Concurrent Pascal,” IEEE Trans. Software Eng., Vol. 1, No. 2 (June), pp. 199–207.
b. U.S. Department of Defense (1981), “The Programming Language Ada Reference Manual,” Lecture Notes in Computer Science, No. 106, Springer-Verlag, Berlin.
c. Bräunl, T., R. Norz (1992), Modula-P User Manual, Computer Science Report, No. 5/92 (August), Univ. Stuttgart, Germany.
d. Thinking Machines Corp. (1990), C* Programming Guide, Version 6, Thinking Machines System Documentation.
e. Gehani, N., and W. D. Roome (1989), The Concurrent C Programming Language, Silicon Press, New Jersey.
f. Fox, G., S. Hiranandani, K. Kennedy, C. Koelbel, U. Kremer, C. Tseng, and M. Wu (1990), Fortran D Language Specification, Technical Report TR90-141, Dept. of Computer Science, Rice University.
![Page 156: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/156.jpg)
Figure 8.2 FORK-JOIN construct.
[Diagram labels: main program, FORK, spawned processes, JOIN.]
![Page 157: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/157.jpg)
Figure 8.3 Differences between a process and threads.
(a) Process: code, heap, files, interrupt routines, with a single IP and stack.
(b) Threads: code, heap, files, interrupt routines, with a separate IP and stack for each thread.
![Page 158: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/158.jpg)
Figure 8.4 pthread_create() and pthread_join().
Main program:
    pthread_create(&thread1, NULL, proc1, &arg);
    pthread_join(thread1, *status);
thread1:
    proc1(&arg)
    {
        return(*status);
    }
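The create/join pattern of Figure 8.4 can be written out as a small compilable sketch. The wrapper `run_proc1` and the doubling done inside `proc1` are hypothetical, chosen only so that the effect of the thread is observable. In the POSIX interface the thread routine takes and returns `void *`, and the second argument of `pthread_join()` is a `void **` that receives the routine's return value.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Thread routine: doubles the int that arg points to and returns the
   same pointer as the thread's exit status. */
static void *proc1(void *arg)
{
    int *value = arg;
    *value *= 2;
    return value;
}

/* Spawns one thread running proc1 on *value and waits for it to finish.
   Returns 0 on success, otherwise the pthreads error code. */
int run_proc1(int *value)
{
    pthread_t thread1;
    void *status = NULL;

    int rc = pthread_create(&thread1, NULL, proc1, value);
    if (rc != 0)
        return rc;

    rc = pthread_join(thread1, &status);    /* blocks until proc1 returns */
    if (rc != 0)
        return rc;

    assert(status == value);                /* proc1 returned its argument */
    return 0;
}
```

On most systems the program is compiled with the `-pthread` option.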
![Page 159: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/159.jpg)
Figure 8.5 Detached threads.
[Diagram: the main program calls pthread_create() three times; each created thread runs to its own termination.]
![Page 160: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/160.jpg)
Figure 8.6 Conflict in accessing shared variable.
[Diagram: process 1 and process 2 each read the shared variable x, add 1, and write the result back.]
![Page 161: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/161.jpg)
Figure 8.7 Control of critical sections through busy waiting.
Process 1 and Process 2 each execute:
    while (lock == 1) do_nothing;
    lock = 1;
    Critical section
    lock = 0;
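In Figure 8.7 the test (`while (lock == 1)`) and the set (`lock = 1`) are separate operations, so on real hardware both processes could see the lock free and enter together unless the two are made indivisible. The sketch below is one way to get that effect, assuming C11 `<stdatomic.h>` and POSIX threads are available; it is not the book's code. The names `worker`, `run_two_workers` and `INCREMENTS` are hypothetical. Two threads increment a shared variable `x`, as in Figure 8.6, inside a busy-waiting lock.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define INCREMENTS 100000                    /* per-thread iterations (example value) */

static atomic_flag lock = ATOMIC_FLAG_INIT;  /* clear means unlocked */
static long x = 0;                           /* shared variable      */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        /* Indivisible test-and-set: returns the previous state of the flag,
           so the loop spins while another thread holds the lock. */
        while (atomic_flag_test_and_set(&lock))
            ;                                /* busy wait */
        x = x + 1;                           /* critical section */
        atomic_flag_clear(&lock);            /* lock = 0 */
    }
    return NULL;
}

/* Runs two workers to completion and returns the final value of x,
   or -1 if a thread could not be created. */
long run_two_workers(void)
{
    pthread_t t1, t2;

    x = 0;
    if (pthread_create(&t1, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_create(&t2, NULL, worker, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return x;
}
```

With the lock in place every increment is preserved, so the result is twice `INCREMENTS`; a `pthread_mutex_t` would serve the same purpose without spinning.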
![Page 162: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/162.jpg)
Figure 8.8 Deadlock (deadly embrace).
(a) Two-process deadlock: processes P1, P2 and resources R1, R2.
(b) n-process deadlock: processes P1, P2, …, Pn−1, Pn and resources R1, R2, …, Rn−1, Rn.
![Page 163: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/163.jpg)
Figure 8.9 False sharing in caches.
[Diagram labels: main memory holding a block with positions numbered 7 6 5 4 3 2 1 0; processor 1 and processor 2, each with a cache; block in cache; address tag.]
![Page 164: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/164.jpg)
Figure 8.10 Shared memory locations for Section 8.4.1 program example.
[Diagram labels: sum, array a[], addr.]
![Page 165: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/165.jpg)
Figure 8.11 Shared memory locations for Section 8.4.2 program example.
[Diagram labels: global_index, sum, array a[], addr.]
![Page 166: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/166.jpg)
Figure 8.12 Sample logic circuit.
[Diagram labels: inputs Test1, Test2, Test3; gates 1, 2, 3; outputs Output1, Output2.]

TABLE 8.2 LOGIC CIRCUIT DESCRIPTION FOR FIGURE 8.12

| Gate | Function | Input 1 | Input 2 | Output |
| --- | --- | --- | --- | --- |
| 1 | AND | Test1 | Test2 | Gate1 |
| 2 | NOT | Gate1 | | Output1 |
| 3 | OR | Test3 | Gate1 | Output2 |
![Page 167: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/167.jpg)
Figure 8.13 River and frog for Problem 8-23.
[Diagram labels: river, frog, log, movement of logs.]
![Page 168: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/168.jpg)
Figure 8.14 Thread pool for Problem 8-24.
[Diagram labels: master, request, signal, pool of threads (slaves), request serviced.]
![Page 169: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/169.jpg)
Figure 9.1 Finding the rank in parallel.
[Diagram: a[i] is compared with each of a[0] … a[n-1]; the comparisons increment the counter x; then b[x] = a[i].]
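The rank computation of Figure 9.1 can be sketched sequentially as follows. For each a[i], the counter x counts the elements smaller than a[i], and a[i] is then placed at b[x]. This is a minimal sketch under the assumption that all values are distinct (duplicates would receive the same rank and overwrite one another); the function name `rank_sort` is hypothetical. The iterations of the outer loop are independent of one another, which is what makes the method a candidate for one process per number.

```c
#include <assert.h>
#include <stddef.h>

/* Rank sort: copies the n distinct values of a[] into b[] in increasing
   order. The rank of a[i] is the number of elements smaller than it. */
void rank_sort(const int *a, int *b, size_t n)
{
    assert(a != b);                         /* needs a separate output array */

    for (size_t i = 0; i < n; i++) {        /* independent for each a[i]     */
        size_t x = 0;                       /* counter x, as in the figure   */
        for (size_t j = 0; j < n; j++)
            if (a[i] > a[j])                /* compare a[i] with a[j]        */
                x++;
        b[x] = a[i];                        /* place a[i] at its rank        */
    }
}
```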
![Page 170: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/170.jpg)
Figure 9.2 Parallelizing the rank computation.
[Diagram: a[i] is compared with a[0], a[1], a[2] and a[3], each comparison producing 0/1; a tree of adders combines these into 0/1/2 partial counts and a final count of 0/1/2/3/4.]
![Page 171: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/171.jpg)
Figure 9.3 Rank sort using a master and slaves.
[Diagram labels: master with arrays a[] and b[]; slaves; read numbers; place selected number.]
![Page 172: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/172.jpg)
Figure 9.4 Compare and exchange on a message-passing system, Version 1.
P1 holds A; P2 holds B. Sequence of steps:
1. P1: Send(A)
2. P2: Compare; if A > B load A, else load B
3. P2: If A > B send(B), else send(A)
![Page 173: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/173.jpg)
Figure 9.5 Compare and exchange on a message-passing system, Version 2.
P1 holds A; P2 holds B. Sequence of steps:
1. P1: Send(A)
2. P2: Send(B)
3. P1: Compare; if A > B load B
3. P2: Compare; if A > B load A
![Page 174: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/174.jpg)
Figure 9.6 Merging two sublists, Version 1.
Original numbers: P1 holds 88, 50, 28, 25; P2 holds 98, 80, 43, 42.
Merge (in P2): P2 merges the two sublists, keeps the higher numbers (98, 88, 80, 50) and returns the lower numbers (43, 42, 28, 25) to P1.
Final numbers: P1 holds 43, 42, 28, 25; P2 holds 98, 88, 80, 50.
![Page 175: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/175.jpg)
Figure 9.7 Merging two sublists, Version 2.
Original numbers: P1 holds 88, 50, 28, 25; P2 holds 98, 80, 43, 42.
Merge (in both P1 and P2): each process merges the two sublists into 98, 88, 80, 50, 43, 42, 28, 25.
Final numbers: P1 keeps the lower numbers (43, 42, 28, 25); P2 keeps the higher numbers (98, 88, 80, 50).
![Page 176: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/176.jpg)
Figure 9.8 Steps in bubble sort.
Original sequence: 4 2 7 8 5 1 3 6
Rows in time order, running through Phase 1 (place largest number), Phase 2 (place next largest number) and Phase 3:
    4 2 7 8 5 1 3 6
    2 4 7 8 5 1 3 6
    2 4 7 8 5 1 3 6
    2 4 7 8 5 1 3 6
    2 4 7 5 8 1 3 6
    2 4 7 5 1 8 3 6
    2 4 7 5 1 3 8 6
    2 4 7 5 1 3 6 8
    2 4 7 5 1 3 6 8
    2 4 7 5 1 3 6 8
    2 4 5 7 1 3 6 8
    2 4 5 1 7 3 6 8
    2 4 5 1 3 7 6 8
    2 4 5 1 3 6 7 8
    2 4 5 1 3 6 7 8
![Page 177: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/177.jpg)
Figure 9.9 Overlapping bubble sort actions in a pipeline.
[Diagram: phases 1, 2, 3 and 4 drawn against a time axis, with the actions of successive phases overlapping.]
![Page 178: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/178.jpg)
Figure 9.10 Odd-even transposition sort sorting eight numbers.

| Step | P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 4 | 2 | 7 | 8 | 5 | 1 | 3 | 6 |
| 1 | 2 | 4 | 7 | 8 | 1 | 5 | 3 | 6 |
| 2 | 2 | 4 | 7 | 1 | 8 | 3 | 5 | 6 |
| 3 | 2 | 4 | 1 | 7 | 3 | 8 | 5 | 6 |
| 4 | 2 | 1 | 4 | 3 | 7 | 5 | 8 | 6 |
| 5 | 1 | 2 | 3 | 4 | 5 | 7 | 6 | 8 |
| 6 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| 7 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |

(Time runs down the table.)
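The steps of Figure 9.10 can be simulated sequentially as in the sketch below. Even-numbered steps compare and exchange the pairs (P0, P1), (P2, P3), …, odd-numbered steps the pairs (P1, P2), (P3, P4), …, and n steps are enough to sort n numbers. The function names are hypothetical, and the inner loop stands in for compare-and-exchange operations that would take place simultaneously, one per pair of neighboring processes.

```c
#include <stddef.h>

/* Leaves the smaller value in *lo and the larger in *hi. */
static void compare_exchange(int *lo, int *hi)
{
    if (*lo > *hi) {
        int tmp = *lo;
        *lo = *hi;
        *hi = tmp;
    }
}

/* Odd-even transposition sort of a[0..n-1] into increasing order. */
void odd_even_transposition_sort(int *a, size_t n)
{
    for (size_t step = 0; step < n; step++) {
        /* Even steps start at index 0, odd steps at index 1. All pairs
           within one step are disjoint, so they could run in parallel. */
        for (size_t i = step % 2; i + 1 < n; i += 2)
            compare_exchange(&a[i], &a[i + 1]);
    }
}
```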
![Page 179: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/179.jpg)
Figure 9.11 Snakelike sorted list.
[Diagram labels: smallest number, largest number.]
![Page 180: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/180.jpg)
Figure 9.12 Shearsort.

(a) Original placement of numbers
     4 14  8  2
    10  3 13 16
     7 15  1  5
    12  6 11  9

(b) Phase 1: Row sort
     2  4  8 14
    16 13 10  3
     1  5  7 15
    12 11  9  6

(c) Phase 2: Column sort
     1  4  7  3
     2  5  8  6
    12 11  9 14
    16 13 10 15

(d) Phase 3: Row sort
     1  3  4  7
     8  6  5  2
     9 11 12 14
    16 15 13 10

(e) Phase 4: Column sort
     1  3  4  2
     8  6  5  7
     9 11 12 10
    16 15 13 14

(f) Final phase: Row sort
     1  2  3  4
     8  7  6  5
     9 10 11 12
    16 15 14 13
![Page 181: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/181.jpg)
Figure 9.13 Using the transpose operation to maintain operations in rows.
(a) Operations between elements in rows
(b) Transpose operation
(c) Operations between elements in rows (originally columns)
![Page 182: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/182.jpg)
Figure 9.14 Mergesort using tree allocation of processes.
Divide list:
    Unsorted list: 4 2 7 8 5 1 3 6 (P0)
    4 2 7 8 (P0) | 5 1 3 6 (P4)
    4 2 (P0) | 7 8 (P2) | 5 1 (P4) | 3 6 (P6)
    4 (P0) | 2 (P1) | 7 (P2) | 8 (P3) | 5 (P4) | 1 (P5) | 3 (P6) | 6 (P7)
Merge:
    2 4 (P0) | 7 8 (P2) | 1 5 (P4) | 3 6 (P6)
    2 4 7 8 (P0) | 1 3 5 6 (P4)
    Sorted list: 1 2 3 4 5 6 7 8 (P0)
![Page 183: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/183.jpg)
Figure 9.15 Quicksort using tree allocation of processes.
[Diagram: the unsorted list 4 2 7 8 5 1 3 6 is divided about a pivot into 3 2 1 and 4 5 7 8 6, and the sublists are divided again on further processes until the sorted list is formed. Labels: pivot, process allocation (P0, P1, P2, P4, P6, P7), unsorted list, sorted list.]
![Page 184: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/184.jpg)
Figure 9.16 Quicksort showing pivot withheld in processes.
[Diagram: the unsorted list 4 2 7 8 5 1 3 6 is divided into 3 2 1 and 5 7 8 6 with the pivot withheld, and the sublists are divided again in the same way until the sorted list is formed. Labels: pivot, pivots, unsorted list, sorted list.]
![Page 185: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/185.jpg)
Figure 9.17 Work pool implementation of quicksort.
[Diagram labels: work pool of sublists; slave processes; request sublist; return sublist.]
![Page 186: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/186.jpg)
Figure 9.18 Hypercube quicksort algorithm when the numbers are originally in node 000.
Nodes in each phase: 000 001 010 011 100 101 110 111
(a) Phase 1: ≤ p1 | > p1
(b) Phase 2: ≤ p2 | > p2 | ≤ p3 | > p3
(c) Phase 3: ≤ p4 | > p4 | ≤ p5 | > p5 | ≤ p6 | > p6 | ≤ p7 | > p7
![Page 187: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/187.jpg)
Figure 9.19 Hypercube quicksort algorithm when numbers are distributed among nodes.
Nodes in each phase: 000 001 010 011 100 101 110 111
(a) Phase 1: broadcast pivot p1; ≤ p1 | > p1
(b) Phase 2: broadcast pivots p2 and p3; ≤ p2 | > p2 | ≤ p3 | > p3
(c) Phase 3: broadcast pivots p4, p5, p6 and p7; ≤ p4 | > p4 | ≤ p5 | > p5 | ≤ p6 | > p6 | ≤ p7 | > p7
![Page 188: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/188.jpg)
Figure 9.20 Hypercube quicksort communication.
(a) Phase 1 communication
(b) Phase 2 communication
(c) Phase 3 communication
[Each panel shows the hypercube with nodes 000, 001, 010, 011, 100, 101, 110, 111.]
![Page 189: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/189.jpg)
Figure 9.21 Quicksort hypercube algorithm with Gray code ordering.
Nodes in each phase: 000 001 011 010 110 111 101 100
(a) Phase 1: broadcast pivot p1; ≤ p1 | > p1
(b) Phase 2: broadcast pivots p2 and p3; ≤ p2 | > p2 | ≤ p3 | > p3
(c) Phase 3: broadcast pivots p4, p5, p6 and p7; ≤ p4 | > p4 | ≤ p5 | > p5 | ≤ p6 | > p6 | ≤ p7 | > p7
![Page 190: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/190.jpg)
Figure 9.22 Odd-even merging of two sorted lists.
Sorted lists: a[] = 2 4 5 8; b[] = 1 3 6 7
Merge (odd indices): c[] = 1 2 5 6
Merge (even indices): d[] = 3 4 7 8
Compare and exchange
Final sorted list: e[] = 1 2 3 4 5 6 7 8
![Page 191: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/191.jpg)
Figure 9.23 Odd-even mergesort.
[Diagram: inputs a1, a2, a3, a4, …, an−1, an and b1, b2, b3, b4, …, bn−1, bn feed an even mergesort and an odd mergesort; their outputs pass through compare-and-exchange units to give c1, c2, c3, c4, c5, c6, c7, …, c2n−2, c2n−1, c2n.]
![Page 192: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/192.jpg)
Figure 9.24 Bitonic sequences.
(a) Single maximum
(b) Single maximum and single minimum
[Both panels plot value against a0, a1, a2, a3, …, an−2, an−1.]
![Page 193: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/193.jpg)
Figure 9.25 Creating two bitonic sequences from one bitonic sequence.
Bitonic sequence: 3 5 8 9 7 4 2 1
Compare and exchange
Two bitonic sequences: 3 4 2 1 | 7 5 8 9
![Page 194: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/194.jpg)
Figure 9.26 Sorting a bitonic sequence.
Unsorted numbers: 3 5 8 9 7 4 2 1
Compare and exchange:
    3 4 2 1 7 5 8 9
    2 1 3 4 7 5 8 9
Sorted list: 1 2 3 4 5 7 8 9
![Page 195: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/195.jpg)
Figure 9.27 Bitonic mergesort.
[Diagram labels: unsorted numbers, bitonic sorting operation, direction of increasing numbers, sorted list.]
![Page 196: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/196.jpg)
Figure 9.28 Bitonic mergesort on eight numbers.
Unsorted numbers: 8 3 4 7 9 2 1 5

| Step | Stage | Compare and exchange ai with ai+n/2 (n numbers) | Result |
| --- | --- | --- | --- |
| 1 | Form bitonic lists of four numbers | n = 2, ai with ai+1 | 3 8 7 4 2 9 5 1 |
| 2 | Form bitonic list of eight numbers (split) | n = 4, ai with ai+2 | 3 4 7 8 5 9 2 1 |
| 3 | Form bitonic list of eight numbers (sort) | n = 2, ai with ai+1 | 3 4 7 8 9 5 2 1 |
| 4 | Sort bitonic list (split) | n = 8, ai with ai+4 | 3 4 2 1 9 5 7 8 |
| 5 | Sort bitonic list (split) | n = 4, ai with ai+2 | 2 1 3 4 7 5 9 8 |
| 6 | Sort bitonic list (sort) | n = 2, ai with ai+1 | 1 2 3 4 5 7 8 9 |

[Legend: compare-and-exchange arrows point from lower to higher; marked groups are bitonic lists as in Fig. 9.24 (a) or (b).]
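The compare-and-exchange pattern of Figure 9.28 (a_i with a_{i+n/2}, for n = 2; then n = 4, 2; then n = 8, 4, 2) can be sketched sequentially as below. This is the common iterative formulation of bitonic mergesort, offered as an illustration rather than as the book's code. The function name is hypothetical, the length is assumed to be a power of two, and the loop over `i` stands in for exchanges that would be performed in parallel. Traced by hand on 8 3 4 7 9 2 1 5, it passes through the same six intermediate lists as the figure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bitonic mergesort of a[0..n-1] into increasing order; n must be a
   power of two. */
void bitonic_mergesort(int *a, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);

    /* k: length of the sorted runs produced by this stage (2, 4, ..., n). */
    for (size_t k = 2; k <= n; k *= 2) {
        /* j: distance between the two elements compared (k/2, ..., 1). */
        for (size_t j = k / 2; j > 0; j /= 2) {
            for (size_t i = 0; i < n; i++) {
                size_t partner = i ^ j;
                if (partner <= i)
                    continue;               /* handle each pair once */
                /* Runs alternate direction so that neighboring runs
                   together form a bitonic list for the next stage. */
                bool ascending = (i & k) == 0;
                if ((a[i] > a[partner]) == ascending) {
                    int tmp = a[i];
                    a[i] = a[partner];
                    a[partner] = tmp;
                }
            }
        }
    }
}
```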
![Page 197: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/197.jpg)
Figure 9.29 Compare-and-exchange algorithm for Problem 9-5.
[Diagram: steps 1 to 3 of exchanging numbers between two lists. Lists shown: 88, 50, 28, 25 and 98, 80, 43, 42; 43, 42, 28, 25 and 98, 88, 80, 50; 50, 42, 28, 25 and 98, 88, 80, 43. Terminates when insertions are at the top/bottom of the lists.]
![Page 198: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/198.jpg)
Figure 10.1 An n × m matrix.

$$
\begin{bmatrix}
a_{0,0} & a_{0,1} & \cdots & a_{0,m-2} & a_{0,m-1} \\
a_{1,0} & a_{1,1} & \cdots & a_{1,m-2} & a_{1,m-1} \\
\vdots & \vdots & & \vdots & \vdots \\
a_{n-2,0} & a_{n-2,1} & \cdots & a_{n-2,m-2} & a_{n-2,m-1} \\
a_{n-1,0} & a_{n-1,1} & \cdots & a_{n-1,m-2} & a_{n-1,m-1}
\end{bmatrix}
$$

[Diagram labels: row, column.]
![Page 199: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/199.jpg)
Figure 10.2 Matrix multiplication, C = A × B.
Row i of A and column j of B are multiplied element by element and the results summed to give $c_{i,j}$.
![Page 200: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/200.jpg)
Figure 10.3 Matrix-vector multiplication, c = A × b.
Row i of A is multiplied by b and the row sum gives $c_i$.
![Page 201: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/201.jpg)
Figure 10.4 Block matrix multiplication, C = A × B.
Labels: p, q; submatrices are multiplied and the results summed.
![Page 202: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/202.jpg)
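Figure 10.5 Submatrix multiplication.

(a) Matrices

$$
\begin{pmatrix}
a_{0,0} & a_{0,1} & a_{0,2} & a_{0,3} \\
a_{1,0} & a_{1,1} & a_{1,2} & a_{1,3} \\
a_{2,0} & a_{2,1} & a_{2,2} & a_{2,3} \\
a_{3,0} & a_{3,1} & a_{3,2} & a_{3,3}
\end{pmatrix}
\times
\begin{pmatrix}
b_{0,0} & b_{0,1} & b_{0,2} & b_{0,3} \\
b_{1,0} & b_{1,1} & b_{1,2} & b_{1,3} \\
b_{2,0} & b_{2,1} & b_{2,2} & b_{2,3} \\
b_{3,0} & b_{3,1} & b_{3,2} & b_{3,3}
\end{pmatrix}
$$

(b) Multiplying $A_{0,0} \times B_{0,0}$ to obtain $C_{0,0}$

$$
\begin{aligned}
A_{0,0} \times B_{0,0} + A_{0,1} \times B_{1,0}
&= \begin{pmatrix} a_{0,0} & a_{0,1} \\ a_{1,0} & a_{1,1} \end{pmatrix}
\times
\begin{pmatrix} b_{0,0} & b_{0,1} \\ b_{1,0} & b_{1,1} \end{pmatrix}
+
\begin{pmatrix} a_{0,2} & a_{0,3} \\ a_{1,2} & a_{1,3} \end{pmatrix}
\times
\begin{pmatrix} b_{2,0} & b_{2,1} \\ b_{3,0} & b_{3,1} \end{pmatrix} \\
&= \begin{pmatrix}
a_{0,0}b_{0,0}+a_{0,1}b_{1,0} & a_{0,0}b_{0,1}+a_{0,1}b_{1,1} \\
a_{1,0}b_{0,0}+a_{1,1}b_{1,0} & a_{1,0}b_{0,1}+a_{1,1}b_{1,1}
\end{pmatrix}
+
\begin{pmatrix}
a_{0,2}b_{2,0}+a_{0,3}b_{3,0} & a_{0,2}b_{2,1}+a_{0,3}b_{3,1} \\
a_{1,2}b_{2,0}+a_{1,3}b_{3,0} & a_{1,2}b_{2,1}+a_{1,3}b_{3,1}
\end{pmatrix} \\
&= \begin{pmatrix}
a_{0,0}b_{0,0}+a_{0,1}b_{1,0}+a_{0,2}b_{2,0}+a_{0,3}b_{3,0} & a_{0,0}b_{0,1}+a_{0,1}b_{1,1}+a_{0,2}b_{2,1}+a_{0,3}b_{3,1} \\
a_{1,0}b_{0,0}+a_{1,1}b_{1,0}+a_{1,2}b_{2,0}+a_{1,3}b_{3,0} & a_{1,0}b_{0,1}+a_{1,1}b_{1,1}+a_{1,2}b_{2,1}+a_{1,3}b_{3,1}
\end{pmatrix}
= C_{0,0}
\end{aligned}
$$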
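The submatrix computation in Figure 10.5 can be checked with a short sketch. The helper names (`matmul`, `mat_add`, `block`, `block_matmul`) and the block size parameter `s` are assumptions introduced for illustration; the matrices are assumed square with a side that is a multiple of `s`.

```python
def matmul(a, b):
    """Ordinary matrix product of two lists of lists."""
    inner = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(x, y):
    """Element-wise sum of two equally sized matrices."""
    return [[u + v for u, v in zip(rx, ry)] for rx, ry in zip(x, y)]


def block(m, p, q, s):
    """Return the s x s submatrix M_{p,q}."""
    return [row[q * s:(q + 1) * s] for row in m[p * s:(p + 1) * s]]


def block_matmul(a, b, s):
    """Compute C = A x B one submatrix at a time: C_{p,q} = sum over r of A_{p,r} x B_{r,q}."""
    n = len(a)
    nb = n // s
    c = [[0] * n for _ in range(n)]
    for p in range(nb):
        for q in range(nb):
            acc = [[0] * s for _ in range(s)]
            for r in range(nb):
                acc = mat_add(acc, matmul(block(a, p, r, s), block(b, r, q, s)))
            for i in range(s):
                c[p * s + i][q * s:(q + 1) * s] = acc[i]
    return c
```

With `s = 2` on 4 × 4 inputs, the `p = 0, q = 0` iteration reproduces the sum $A_{0,0} \times B_{0,0} + A_{0,1} \times B_{1,0}$ shown in the figure.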
![Page 203: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/203.jpg)
Figure 10.6 Direct implementation of matrix multiplication.
Processor $P_{i,j}$ takes row i of A (a[i][]) and column j of B (b[][j]) and produces c[i][j].
![Page 204: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/204.jpg)
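Figure 10.7 Accumulation using a tree construction.
P0, P1, P2, and P3 form the products $a_{0,0} b_{0,0}$, $a_{0,1} b_{1,0}$, $a_{0,2} b_{2,0}$, and $a_{0,3} b_{3,0}$; P0 and P2 add them in pairs; P0 adds the two partial sums to give $c_{0,0}$.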
![Page 205: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/205.jpg)
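Figure 10.8 Submatrix multiplication and summation.
A, B, and C are each divided into four submatrices ($A_{pp}$, $A_{pq}$, $A_{qp}$, $A_{qq}$, and likewise for B and C), with row i and column j marked. Processors P0 to P7 compute the submatrix products, and the results are summed in the pairs P0 + P1, P2 + P3, P4 + P5, and P6 + P7.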
![Page 206: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/206.jpg)
Figure 10.9 Movement of A and B elements.
Labels: directions of movement of A and B; row i, column j; processor $P_{i,j}$.
![Page 207: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/207.jpg)
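Figure 10.10 Step 2: alignment of elements of A and B.
Row i of A is shifted i places and column j of B is shifted j places, so that the processor in row i and column j holds $a_{i,j+i}$ and $b_{i+j,j}$.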
![Page 208: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/208.jpg)
Figure 10.11 Step 4: one-place shift of elements of A and B.
Labels: directions of the shift of A and B; row i, column j; processor $P_{i,j}$.
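The alignment and one-place shift in Figures 10.9 to 10.11 match the mesh algorithm usually attributed to Cannon; that identification, the function name `cannon_matmul`, and the shift directions (A to the left, B upward, both with wraparound) are assumptions of this sketch. It simulates the n × n processor mesh sequentially, one element per processor.

```python
def cannon_matmul(a, b):
    """Simulate an n x n mesh in which processor (i, j) accumulates c[i][j]."""
    n = len(a)
    # Alignment: row i of A shifted i places, column j of B shifted j places,
    # so processor (i, j) starts with a[i][j+i] and b[i+j][j] (indices mod n).
    a_loc = [[a[i][(j + i) % n] for j in range(n)] for i in range(n)]
    b_loc = [[b[(i + j) % n][j] for j in range(n)] for i in range(n)]
    c = [[0] * n for _ in range(n)]
    for _ in range(n):
        # Every processor multiplies its current pair and accumulates.
        for i in range(n):
            for j in range(n):
                c[i][j] += a_loc[i][j] * b_loc[i][j]
        # One-place shift: A moves one place along its row, B one place along its column.
        a_loc = [[a_loc[i][(j + 1) % n] for j in range(n)] for i in range(n)]
        b_loc = [[b_loc[(i + 1) % n][j] for j in range(n)] for i in range(n)]
    return c
```

After t shifts processor (i, j) holds $a_{i,k}$ and $b_{k,j}$ with k = (i + j + t) mod n, so n multiply-and-shift rounds cover every term of $c_{i,j}$.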
![Page 209: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/209.jpg)
Figure 10.12 Matrix multiplication using a systolic array.
A 4 × 4 array of cells accumulates $c_{0,0}$ to $c_{3,3}$. Row i of A ($a_{i,0}$ first, then $a_{i,1}$, $a_{i,2}$, $a_{i,3}$) is pumped into row i of the array and column j of B ($b_{0,j}$ first, then $b_{1,j}$, $b_{2,j}$, $b_{3,j}$) into column j. Labels: pumping action; one cycle delay between successive rows and columns.
![Page 210: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/210.jpg)
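Figure 10.13 Matrix-vector multiplication using a systolic array.
Four cells accumulate $c_0$ to $c_3$. The vector elements ($b_0$ first, then $b_1$, $b_2$, $b_3$) and the rows of A ($a_{i,0}$ first, then $a_{i,1}$, $a_{i,2}$, $a_{i,3}$) are pumped into the array. Label: pumping action.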
![Page 211: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/211.jpg)
Figure 10.14 Gaussian elimination.
With row i as the current row, the computation steps through each row j below it and clears $a_{ji}$ (column i) to zero. Columns to the left are already cleared to zero.
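The elimination step pictured in Figure 10.14 can be sketched sequentially as below. This is an illustration under stated assumptions: no pivoting is performed (so every diagonal element met is assumed nonzero), and the names `gaussian_eliminate` and `back_substitute` are hypothetical.

```python
def gaussian_eliminate(a, b):
    """Reduce A x = b to upper triangular form; returns new copies of a and b."""
    n = len(a)
    a = [row[:] for row in a]
    b = b[:]
    for i in range(n - 1):              # current row i
        for j in range(i + 1, n):       # step through the rows below it
            m = a[j][i] / a[i][i]       # multiplier that clears a[j][i]
            for k in range(i, n):
                a[j][k] -= m * a[i][k]
            b[j] -= m * b[i]
    return a, b


def back_substitute(a, b):
    """Solve the upper triangular system produced by gaussian_eliminate."""
    n = len(a)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x
```

For a given i, the updates of the different rows j are independent of one another, which is what a parallel version can exploit once row i is available to every process.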
![Page 212: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/212.jpg)
Figure 10.15 Broadcast in parallel implementation of Gaussian elimination.
The ith row is broadcast: n − i + 1 elements (including b[i]). Columns to the left are already cleared to zero.
![Page 213: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/213.jpg)
Figure 10.16 Pipeline implementation of Gaussian elimination.
Labels: broadcast rows; processors P0, P1, P2, …, Pn−1; row.
![Page 214: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/214.jpg)
Figure 10.17 Strip partitioning.
Rows are divided into strips starting at rows 0, n/p, 2n/p, and 3n/p, assigned to P0, P1, P2, and P3.
![Page 215: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/215.jpg)
Figure 10.18 Cyclic partitioning to equalize workload.
Row markers at 0, n/p, 2n/p, and 3n/p; rows are assigned to the processors (P0, P1, …) in rotation.
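The difference between the strip partitioning of Figure 10.17 and the cyclic partitioning of Figure 10.18 can be made concrete by counting how many row updates each processor performs during elimination. The sketch assumes that p divides n and counts one unit of work per row update (ignoring that rows get shorter); all function names are hypothetical.

```python
def strip_owner(row, n, p):
    """Strip partitioning: consecutive blocks of n/p rows per processor."""
    return row // (n // p)


def cyclic_owner(row, p):
    """Cyclic partitioning: rows dealt out to processors in rotation."""
    return row % p


def row_updates(n, p, owner):
    """Count row updates per processor over a full elimination of n rows."""
    work = [0] * p
    for i in range(n - 1):          # current row i
        for j in range(i + 1, n):   # every row below it is updated once
            work[owner(j)] += 1
    return work


strip = row_updates(8, 4, lambda r: strip_owner(r, 8, 4))
cyclic = row_updates(8, 4, lambda r: cyclic_owner(r, 4))
```

For n = 8 and p = 4 the strip counts are 1, 5, 9, 13 while the cyclic counts are 4, 6, 8, 10, so the cyclic assignment spreads the work more evenly.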
![Page 216: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/216.jpg)
Figure 10.19 Finite difference method.
Labels: solution space f(x, y) with axes x and y; grid spacing ∆ in each direction.
![Page 217: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/217.jpg)
Figure 10.20 Mesh of points numbered in natural order.
A 10 × 10 mesh numbered row by row: x1 to x10 in the first row, x11 to x20 in the second, and so on to x91 to x100 in the last. Boundary points are marked (see text).
![Page 218: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/218.jpg)
Figure 10.21 Sparse matrix for Laplace’s equation.
The system has the form A × x = 0 with x = ($x_1$, $x_2$, …, $x_{N-1}$, $x_N$). The ith equation (row i of A) has the nonzero entries

$$
a_{i,i-n} = 1, \quad a_{i,i-1} = 1, \quad a_{i,i} = -4, \quad a_{i,i+1} = 1, \quad a_{i,i+n} = 1.
$$

Annotations: to include boundary values and some zero entries (see text); those equations with a boundary point on the diagonal are unnecessary for solution.
![Page 219: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/219.jpg)
Figure 10.22 Gauss-Seidel relaxation with natural order, computed sequentially.
Legend: point computed; point to be computed. Arrows show the sequential order of computation.
![Page 220: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/220.jpg)
Red
Black
Figure 10.23 Red-black ordering.
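The red-black ordering of Figure 10.23 can be sketched for Laplace's equation on a small grid. The sketch assumes the usual four-neighbor average as the update, fixed boundary values, and an arbitrary choice of which parity is called "red"; the function names are hypothetical.

```python
def red_black_sweep(grid):
    """One iteration: update all points of one color, then all points of the other."""
    rows, cols = len(grid), len(grid[0])
    for colour in (0, 1):   # 0 and 1 stand for the two colors (labelling is an assumption)
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                if (i + j) % 2 == colour:
                    grid[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                         + grid[i][j - 1] + grid[i][j + 1])


def solve_laplace(grid, sweeps=100):
    """Apply a fixed number of red-black sweeps in place and return the grid."""
    for _ in range(sweeps):
        red_black_sweep(grid)
    return grid
```

Every neighbor of a point has the opposite color, so all points of one color depend only on values of the other color and could be updated simultaneously.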
![Page 221: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/221.jpg)
Figure 10.24 Nine-point stencil.
![Page 222: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/222.jpg)
Figure 10.25 Multigrid processor allocation.
Legend: coarsest grid points; finer grid points; processor.
![Page 223: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/223.jpg)
50°C
40°C 60°C
Ambient temperature at edges of board = 20°C
Figure 10.26 Printed circuit board for Problem 10-18.
![Page 224: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/224.jpg)
Figure 11.1 Pixmap.
Origin (0, 0); axes i and j; p(i, j) is a picture element (pixel).
![Page 225: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/225.jpg)
Figure 11.2 Image histogram.
Axes: gray level (0 to 255) against number of pixels.
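A histogram such as the one in Figure 11.2 can be computed with one pass over the image. The sketch assumes the image is a list of rows of integer gray levels in the range 0 to 255; the name `histogram` and the `levels` parameter are assumptions.

```python
def histogram(image, levels=256):
    """Count how many pixels have each gray level."""
    counts = [0] * levels
    for row in image:
        for pixel in row:
            counts[pixel] += 1
    return counts
```

In a parallel setting each process could build a histogram of its own part of the image, with the partial counts added together at the end.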
![Page 226: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/226.jpg)
Figure 11.3 Pixel values for a 3 × 3 group.
x0 x1 x2
x3 x4 x5
x6 x7 x8
![Page 227: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/227.jpg)
Figure 11.4 Four-step data transfer for the computation of mean.
Step 1: each pixel adds pixel from left.
Step 2: each pixel adds pixel from right.
Step 3: each pixel adds pixel from above.
Step 4: each pixel adds pixel from below.
![Page 228: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/228.jpg)
Figure 11.5 Parallel mean data accumulation.
(a) Step 1: the middle pixel of each row holds x0 + x1, x3 + x4, and x6 + x7.
(b) Step 2: the middle pixel of each row holds x0 + x1 + x2, x3 + x4 + x5, and x6 + x7 + x8.
(c) Step 3: the center pixel holds (x0 + x1 + x2) + (x3 + x4 + x5).
(d) Step 4: the center pixel holds (x0 + x1 + x2) + (x3 + x4 + x5) + (x6 + x7 + x8).
![Page 229: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/229.jpg)
Figure 11.6 Approximate median algorithm requiring six steps.
Labels: largest in row; next largest in row; next largest in column.
![Page 230: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/230.jpg)
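Figure 11.7 Using a 3 × 3 weighted mask.

$$
\begin{pmatrix} w_0 & w_1 & w_2 \\ w_3 & w_4 & w_5 \\ w_6 & w_7 & w_8 \end{pmatrix}
\otimes
\begin{pmatrix} x_0 & x_1 & x_2 \\ x_3 & x_4 & x_5 \\ x_6 & x_7 & x_8 \end{pmatrix}
= x_4'
$$

Mask ⊗ Pixels = Result (the new value $x_4'$ of the center pixel).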
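Applying a 3 × 3 weighted mask as in Figure 11.7 can be sketched as below. The sketch assumes the result is the weighted sum of the nine pixels multiplied by a scale factor, and that border pixels are left unchanged; `apply_mask`, `weights`, and `scale` are hypothetical names.

```python
def apply_mask(image, weights, scale):
    """Replace each interior pixel by scale * (sum of weights[i] * neighbor[i])."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]          # border pixels are copied unchanged
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            total = sum(weights[di + 1][dj + 1] * image[i + di][j + dj]
                        for di in (-1, 0, 1) for dj in (-1, 0, 1))
            out[i][j] = total * scale
    return out


MEAN_MASK = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

Calling `apply_mask(image, MEAN_MASK, 1 / 9)` gives the 3 × 3 mean; other masks change only `weights` and `scale`. Each output pixel depends only on the input image, so all pixels can be computed independently.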
![Page 231: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/231.jpg)
Figure 11.8 Mask to compute mean.
1 1 1
1 1 1
1 1 1
Scale shown with the label k: 1/9.
![Page 232: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/232.jpg)
Figure 11.9 A noise reduction mask.
1 1 1
1 8 1
1 1 1
Scale shown with the label k: 1/16.
![Page 233: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/233.jpg)
Figure 11.10 High-pass sharpening filter mask.
−1 −1 −1
−1 8 −1
−1 −1 −1
Scale shown with the label k: 1/9.
![Page 234: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/234.jpg)
Figure 11.11 Edge detection using differentiation.
Curves: intensity transition; first derivative; second derivative.
![Page 235: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/235.jpg)
Figure 11.12 Gray level gradient and direction.
Labels: image f(x, y) with axes x and y; gradient at angle φ; line of constant intensity.
![Page 236: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/236.jpg)
Figure 11.13 Prewitt operator.
First mask:
−1 −1 −1
0 0 0
1 1 1
Second mask:
−1 0 1
−1 0 1
−1 0 1
![Page 237: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/237.jpg)
Figure 11.14 Sobel operator.
First mask:
−1 −2 −1
0 0 0
1 2 1
Second mask:
−1 0 1
−2 0 2
−1 0 1
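A sequential sketch of edge detection with the Sobel operator of Figure 11.14 follows. It assumes the standard pair of Sobel masks, combines the two gradient components with the Euclidean magnitude (a sum of absolute values is a common cheaper alternative), and leaves border pixels at zero; `SOBEL_X`, `SOBEL_Y`, and `sobel_magnitude` are hypothetical names.

```python
import math

# Standard Sobel masks: horizontal and vertical gradient estimates.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude at each interior pixel (borders stay 0)."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            gx = gy = 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    pixel = image[i + di][j + dj]
                    gx += SOBEL_X[di + 1][dj + 1] * pixel
                    gy += SOBEL_Y[di + 1][dj + 1] * pixel
            out[i][j] = math.hypot(gx, gy)
    return out
```

A threshold on the returned magnitudes would then mark edge pixels.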
![Page 238: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/238.jpg)
Figure 11.15 Edge detection with Sobel operator.
(a) Original image (Annabel) (b) Effect of Sobel operator
![Page 239: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/239.jpg)
Figure 11.16 Laplace operator.
0 −1 0
−1 4 −1
0 −1 0
![Page 240: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/240.jpg)
Figure 11.17 Pixels used in Laplace operator.
x1 (upper pixel), x3 (left pixel), x4 (center), x5 (right pixel), x7 (lower pixel).
![Page 241: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/241.jpg)
Figure 11.18 Effect of Laplace operator.
![Page 242: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/242.jpg)
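Figure 11.19 Mapping a line into (a, b) space.
(a) (x, y) plane: the line y = ax + b passes through the image pixel (x1, y1).
(b) Parameter space: that pixel maps to the line b = −x1a + y1 (in general, b = −xa + y), and the original line corresponds to the point (a, b).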
![Page 243: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/243.jpg)
Figure 11.20 Mapping a line into (r, θ) space.
(a) (x, y) plane: the line y = ax + b in normal form, r = x cos θ + y sin θ, with r and θ marked.
(b) (r, θ) plane: the line maps to the point (r, θ).
![Page 244: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/244.jpg)
Figure 11.21 Normal representation using image coordinate system.
Labels: r, θ; axes x and y.
![Page 245: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/245.jpg)
Figure 11.22 Accumulators, acc[r][θ], for the Hough transform.
A grid of accumulators indexed by r (ticks at 0, 5, 10, 15) and θ (ticks at 0°, 10°, 20°, 30°).
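The accumulator array of Figure 11.22 can be filled as sketched below, using the normal form r = x cos θ + y sin θ. The bin widths (5 for r, 10° for θ) follow the tick marks in the figure; the θ range of 0° to 170°, the decision to skip negative r, and the names `hough_accumulate`, `r_max`, `r_step`, and `theta_step_deg` are assumptions.

```python
import math


def hough_accumulate(points, r_max, r_step=5, theta_step_deg=10):
    """Return acc[r_bin][theta_bin], counting edge pixels that vote for each line."""
    n_theta = 180 // theta_step_deg
    n_r = int(r_max // r_step) + 1
    acc = [[0] * n_theta for _ in range(n_r)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.radians(t * theta_step_deg)
            r = x * math.cos(theta) + y * math.sin(theta)
            if 0 <= r <= r_max:                      # negative r is skipped here
                acc[int(r // r_step)][t] += 1
    return acc
```

Accumulators with high counts correspond to lines passing through many of the given pixels. Each pixel's votes are independent, so the pixels can be divided among processes if the accumulator updates are combined afterward.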
![Page 246: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/246.jpg)
Figure 11.23 Two-dimensional DFT.
$x_{jk}$ (indices j, k) → transform rows → $X_{jm}$ → transform columns → $X_{lm}$.
![Page 247: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/247.jpg)
Figure 11.24 Convolution using Fourier transforms.
(a) Direct convolution: image $f_{j,k}$ ∗ filter/image $g_{j,k}$ = $h_{j,k}$.
(b) Using Fourier transform: f(j, k) and g(j, k) are transformed to F(j, k) and G(j, k), multiplied to give H(j, k), and inverse transformed to give h(j, k).
![Page 248: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/248.jpg)
Figure 11.25 Master-slave approach for implementing the DFT directly.
The master process sends $w^0$, $w^1$, …, $w^{n-1}$ to the slave processes, which return X[0], X[1], …, X[n−1].
![Page 249: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/249.jpg)
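Figure 11.26 One stage of a pipeline implementation of DFT algorithm.
Process j holds x[j] and receives X[k], a, and $w^k$. It forms a × x[j] and adds it to X[k], multiplies a by $w^k$, and passes X[k], a, and $w^k$ on as the values for the next iteration.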
![Page 250: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/250.jpg)
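Figure 11.27 Discrete Fourier transform with a pipeline.
(a) Pipeline structure: stages P0, P1, P2, P3, …, PN−1 hold x[0], x[1], x[2], x[3], …, x[N−1]. The values X[k] (initially 0), a (initially 1), and $w^k$ enter the first stage, and the output sequence X[0], X[1], X[2], X[3], … leaves the last stage.
(b) Timing diagram: pipeline stages P0, P1, P2, …, PN−2, PN−1 against time, with outputs X[0], X[1], X[2], X[3], X[4], X[5], X[6] produced in turn.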
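The pipeline of Figures 11.26 and 11.27 can be simulated sequentially: for each k, the triple (X[k], a, w^k) passes through stages 0 to N−1, and stage j performs X[k] = X[k] + a × x[j] followed by a = a × w^k. The sketch assumes w = e^{−2πi/N} and omits any 1/N scale factor; `dft_pipeline` is a hypothetical name.

```python
import cmath


def dft_pipeline(x):
    """Compute X[0..N-1] by passing (X[k], a, w^k) through N stages, one per x[j]."""
    n = len(x)
    w = cmath.exp(-2j * cmath.pi / n)
    out = []
    for k in range(n):            # in a real pipeline successive k overlap in time
        wk = w ** k
        acc, a = 0j, 1 + 0j       # X[k] starts at 0 and a at 1
        for j in range(n):        # stage j holds x[j]
            acc += a * x[j]
            a *= wk
        out.append(acc)
    return out
```

Because stage j only needs x[j] and the three values handed to it, different stages can work on different values of k at the same time.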
![Page 251: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/251.jpg)
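Figure 11.28 Decomposition of N-point DFT into two N/2-point DFTs.
[Diagram: the input sequence x0, x1, x2, x3, …, xN−2, xN−1 is split between two N/2-point DFTs, producing Xeven and Xodd. The transform is Xk = Xeven + wk × Xodd and Xk+N/2 = Xeven − wk × Xodd, for k = 0, 1, … N/2 − 1.]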
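The decomposition in Figure 11.28 leads directly to a recursive algorithm. The following is a minimal sketch, assuming the input length is a power of two, w = e^(−2πi/N), and no 1/N scale factor; the function name `fft` is chosen here for illustration only.

```python
import cmath

def fft(x):
    """Recursive FFT: split into even- and odd-indexed halves, then combine."""
    n = len(x)
    if n == 1:
        return list(x)
    x_even = fft(x[0::2])  # N/2-point DFT of even-indexed samples
    x_odd = fft(x[1::2])   # N/2-point DFT of odd-indexed samples
    out = [0] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * x_odd[k]  # w^k * Xodd
        out[k] = x_even[k] + t            # X[k]
        out[k + n // 2] = x_even[k] - t   # X[k + N/2]
    return out
```

For example, `fft([1, 2, 3, 4])` gives approximately `[10, -2+2j, -2, -2-2j]`.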
![Page 252: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/252.jpg)
Figure 11.29 Four-point discrete Fourier transform.
[Diagram: network of add (+) and subtract (−) nodes connecting inputs x0, x1, x2, x3 to outputs X0, X1, X2, X3.]
![Page 253: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/253.jpg)
Xk = Σ(0,2,4,6,8,10,12,14) + wkΣ(1,3,5,7,9,11,13,15)
= {Σ(0,4,8,12) + wkΣ(2,6,10,14)} + wk{Σ(1,5,9,13) + wkΣ(3,7,11,15)}
= {[Σ(0,8) + wkΣ(4,12)] + wk[Σ(2,10) + wkΣ(6,14)]} + wk{[Σ(1,9) + wkΣ(5,13)] + wk[Σ(3,11) + wkΣ(7,15)]}
x0 x8 x4 x12 x2 x10 x6 x14 x1 x9 x5 x13 x3 x11 x7 x15
0000 1000 0100 1100 0010 1010 0110 1110 0001 1001 0101 1101 0011 1011 0111 1111
Figure 11.30 Sixteen-point DFT decomposition.
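The input ordering in Figure 11.30 is the bit-reversal permutation of the indices 0 to 15. A small sketch of how that ordering could be generated follows; the helper name `bit_reverse` is hypothetical.

```python
def bit_reverse(index, bits):
    """Return index with its lowest `bits` bits in reversed order."""
    reversed_index = 0
    for _ in range(bits):
        reversed_index = (reversed_index << 1) | (index & 1)
        index >>= 1
    return reversed_index

# Input ordering for a 16-point FFT (4-bit indices)
order = [bit_reverse(i, 4) for i in range(16)]
print(order)
# [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]
```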
![Page 254: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/254.jpg)
Figure 11.31 Sixteen-point FFT computational flow.
[Diagram: flow from inputs x0, x8, x4, x12, x2, x10, x6, x14, x1, x9, x5, x13, x3, x11, x7, x15 to outputs X0, X1, …, X15.]
![Page 255: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/255.jpg)
Figure 11.32 Mapping processors onto 16-point FFT computation.
[Diagram: the same flow, from inputs x0, x8, x4, x12, x2, x10, x6, x14, x1, x9, x5, x13, x3, x11, x7, x15 to outputs X0, X1, …, X15, with rows numbered 0000 to 1111 and assigned to processors P0, P1, P2, P3. Labels: Process, Row, P/r, Inputs, Outputs.]
![Page 256: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/256.jpg)
Figure 11.33 FFT using transpose algorithm: first two steps.
P0   P1   P2   P3
x0   x1   x2   x3
x4   x5   x6   x7
x8   x9   x10  x11
x12  x13  x14  x15
![Page 257: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/257.jpg)
Figure 11.34 Transposing array for transpose algorithm.
P0   P1   P2   P3
x0   x1   x2   x3
x4   x5   x6   x7
x8   x9   x10  x11
x12  x13  x14  x15
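The data movement in Figure 11.34 amounts to transposing the array of indices held by the processors. The sketch below assumes each processor holds one column of a 4 × 4 array, as the figure labels suggest; the function name `transpose_distribution` is hypothetical, and in a message-passing program this step would be an all-to-all exchange rather than a local operation.

```python
def transpose_distribution(held):
    """held[p] lists the element indices on processor p; return the transposed layout."""
    return [list(group) for group in zip(*held)]

# Before the transpose: P0 holds x0, x4, x8, x12, and so on
before = [[0, 4, 8, 12], [1, 5, 9, 13], [2, 6, 10, 14], [3, 7, 11, 15]]
after = transpose_distribution(before)
print(after)
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
```

After the transpose each processor again holds all the elements it needs for the remaining steps, so no further communication is required within those steps.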
![Page 258: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/258.jpg)
Figure 11.35 FFT using transpose algorithm: last two steps.
P0   P1   P2   P3
x0   x4   x8   x12
x1   x5   x9   x13
x2   x6   x10  x14
x3   x7   x11  x15
![Page 259: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/259.jpg)
Figure 11.36 Image for Problem 11-3.
[Diagram: image grid with rows and columns numbered 1 to 7, and a mask.]
![Page 260: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/260.jpg)
Figure 12.1 State space tree.
[Diagram: tree with levels labeled First choice, Second choice, Third choice. Branches are labeled C0, C1, Cn−1, together with "Not including C0", "Not including C1", and "Not including Cn−1".]
![Page 261: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/261.jpg)
Figure 12.2 Single-point crossover.
Parent A: A1 A2
Parent B: B1 B2
Child 1: A1 B2
Child 2: B1 A2
[Each string has positions 1 to m and is cut between positions p and p+1.]
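Single-point crossover as drawn in Figure 12.2 can be sketched as follows. The function name `single_point_crossover` and the use of Python strings for chromosomes are assumptions made for illustration; `p` is the number of leading positions each child keeps from its first parent.

```python
def single_point_crossover(parent_a, parent_b, p):
    """Cut both parents after position p and swap the tails."""
    child_1 = parent_a[:p] + parent_b[p:]  # A1 B2
    child_2 = parent_b[:p] + parent_a[p:]  # B1 A2
    return child_1, child_2

child_1, child_2 = single_point_crossover("11110000", "00001111", 3)
print(child_1, child_2)
# 11101111 00010000
```

In a genetic algorithm the cut point p would normally be chosen at random for each pair of parents.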
![Page 262: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/262.jpg)
Figure 12.3 Island model.
[Labels: Subpopulation; Migration path; every island sends to every other island.]
![Page 263: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/263.jpg)
Figure 12.4 Stepping stone model.
[Labels: Island subpopulations; Limited migration path.]
![Page 264: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/264.jpg)
Figure D.1 PRAM model.
[Labels: Shared memory (Program, Data); Instructions; Processors with local memory; Clock.]
![Page 265: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/265.jpg)
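Figure D.2 List ranking by pointer jumping.
[Diagram: eight list elements, each with a value d[i] and a pointer s[i]; the last pointer is Null. The values of d at successive steps are:]
d[0] d[1] d[2] d[3] d[4] d[5] d[6] d[7]
1    1    1    1    1    1    1    0
2    2    2    2    2    2    1    0
4    4    4    4    3    2    1    0
7    6    5    4    3    2    1    0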
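The pointer-jumping steps in Figure D.2 can be simulated sequentially. This is a minimal sketch, assuming the list is given as an array of successor indices with `None` standing for Null; the function name `list_rank` is hypothetical. Each pass builds new arrays from the old ones to mimic the synchronous PRAM step, in which every processor reads before any processor writes.

```python
def list_rank(successor):
    """Return d[i], the distance from element i to the end of the list."""
    n = len(successor)
    s = list(successor)
    d = [0 if s[i] is None else 1 for i in range(n)]
    while any(pointer is not None for pointer in s):
        new_d, new_s = list(d), list(s)
        for i in range(n):               # conceptually, one processor per element
            if s[i] is not None:
                new_d[i] = d[i] + d[s[i]]  # add the successor's distance
                new_s[i] = s[s[i]]         # jump over the successor
        d, s = new_d, new_s
    return d

print(list_rank([1, 2, 3, 4, 5, 6, 7, None]))
# [7, 6, 5, 4, 3, 2, 1, 0]
```

For the eight-element list the loop runs three times, matching the three steps after the initial values in the figure.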
![Page 266: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/266.jpg)
Figure D.3 A view of the bulk synchronous parallel model.
[Diagram: threads or processes perform local computation (maximum time w), then communication (maximum of h sends or receives), then barrier synchronization.]
![Page 267: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/267.jpg)
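Figure D.4 LogP parameters.
[Diagram: a message sent between processors Pi and Pk, plotted against time, with the intervals o, L, and g marked and the next message shown after the first.]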
![Page 268: Astrophysical Na single processor and memory. Main memory Processor Instructions (to processor) Data (to or from processor) 3 ... Figure 1.4 Message-passing multiprocessor model (multicomputer)](https://reader034.vdocuments.mx/reader034/viewer/2022042612/5f6300dd3d7f95769e7689f7/html5/thumbnails/268.jpg)