2016/1/5part i1 models of parallel processing. 2016/1/5part i2 parallel processors come in many...

111/06/23 Part I 1 Models of Parallel Processing

Upload: leona-morton

Post on 21-Jan-2016



112/04/21 Part I 1

Models of Parallel Processing


• Parallel processors come in many different varieties.

• Thus, we often deal with abstract models of real machines.


Development of Early Models (1)

• Associative processing (AP) was perhaps the earliest form of parallel processing.

– Associative or content-addressable memories (AMs, CAMs) allow memory cells to be accessed based on their contents rather than their physical locations within the memory array.

– AM/AP architectures are essentially based on incorporating simple processing logic into the memory array, so as to remove the need for transferring large volumes of data through the limited-bandwidth interface between the memory and the processor (the von Neumann bottleneck).
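Content-based access can be illustrated with a toy software model. The sketch below is only an analogy: the class name `ContentAddressableMemory` and its methods are hypothetical, and the loop stands in for comparisons that a real associative memory would perform in all cells at once.

```python
from dataclasses import dataclass, field


@dataclass
class ContentAddressableMemory:
    """Toy CAM (hypothetical): cells are selected by content, not by address."""
    cells: list = field(default_factory=list)

    def store(self, word: int) -> int:
        # Append a word and return the cell index it landed in.
        self.cells.append(word)
        return len(self.cells) - 1

    def match(self, key: int, mask: int = ~0) -> list:
        # Conceptually, every cell compares itself against the broadcast key
        # in the bit positions selected by mask; the loop only simulates that.
        return [i for i, w in enumerate(self.cells) if (w ^ key) & mask == 0]


cam = ContentAddressableMemory()
for word in (0b1010, 0b0110, 0b1011, 0b1010):
    cam.store(word)

exact = cam.match(0b1010)                    # cells holding exactly 0b1010
high_bits = cam.match(0b1000, mask=0b1100)   # cells whose top two bits are 10
```

The masked search is the part that resembles processing logic inside the memory array: the selection happens where the data lives, so only the matching cell indices cross the memory-processor interface.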


Development of Early Models (2)

• The AM/AP model has evolved through the incorporation of additional capabilities, so that it is in essence converging with SIMD-type array processors.


Development of Early Models (3)

• Neural networks

• Cellular automata


SIMD Vs. MIMD (1)

• Most early parallel machines had SIMD designs.

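• Within the SIMD category, two fundamental design choices exist:

– Synchronous versus asynchronous SIMD

• In synchronous SIMD, conditional computations are inefficient because processors that do not satisfy the condition sit idle. A possible cure is to use the asynchronous version of SIMD, known as SPMD (single-program, multiple-data), where each processor runs its own copy of the common program.

– Custom- versus commodity-chip SIMD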
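The difference between synchronous SIMD and SPMD on a conditional can be sketched as follows. This is a simplified model under stated assumptions: each broadcast instruction counts as one step, SPMD synchronization overhead is ignored, and the function names `simd_conditional` and `spmd_conditional` are hypothetical.

```python
def simd_conditional(data, cond, then_ops, else_ops):
    """Synchronous SIMD: both branches are broadcast; masked-off PEs idle."""
    mask = [cond(x) for x in data]
    out = list(data)
    steps = 0
    for op in then_ops:
        # Only PEs whose mask bit is set execute; the rest ignore the broadcast.
        out = [op(v) if m else v for v, m in zip(out, mask)]
        steps += 1
    for op in else_ops:
        # Now the roles flip: PEs that took the "then" branch sit idle.
        out = [v if m else op(v) for v, m in zip(out, mask)]
        steps += 1
    return out, steps


def spmd_conditional(data, cond, then_ops, else_ops):
    """SPMD: each processor runs its own program copy and takes one branch."""
    out, steps = [], 0
    for x in data:
        ops = then_ops if cond(x) else else_ops
        v = x
        for op in ops:
            v = op(v)
        out.append(v)
        steps = max(steps, len(ops))  # processors proceed concurrently
    return out, steps


data = [3, -1, 4, -5]
is_negative = lambda x: x < 0
then_ops = [lambda v: -v]                       # one instruction
else_ops = [lambda v: 2 * v, lambda v: v + 1]   # two instructions

simd_out, simd_steps = simd_conditional(data, is_negative, then_ops, else_ops)
spmd_out, spmd_steps = spmd_conditional(data, is_negative, then_ops, else_ops)
```

Both versions compute the same result, but the synchronous version spends the sum of the two branch lengths while the SPMD version spends only the longer branch.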


SIMD Vs. MIMD (2)

• In the 1990s, the MIMD paradigm became more popular.

• MIMD machines are most effective for medium- to coarse-grain parallel applications, where the computation is divided into relatively large subcomputations or tasks whose executions are assigned to the various processors.
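A coarse-grain decomposition of this kind might look like the sketch below, which splits one computation into a few large tasks and hands each to a worker. The function names and the choice of Python's `ThreadPoolExecutor` are assumptions made for illustration; in CPython, threads show the task structure rather than a real speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum_of_squares(chunk):
    # One relatively large, independent subcomputation (a "task").
    return sum(x * x for x in chunk)


def coarse_grain_sum_of_squares(values, workers=4):
    # Split the input into about `workers` large chunks (ceiling division).
    size = max(1, -(-len(values) // workers))
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is assigned to a worker; results are combined at the end.
        return sum(pool.map(partial_sum_of_squares, chunks))


total = coarse_grain_sum_of_squares(list(range(1000)))
```

Few large tasks keep the coordination cost small relative to the work per task, which is the regime the slide describes as medium- to coarse-grain.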


SIMD Vs. MIMD (3)

• Within the MIMD class, three fundamental issues or design choices are subjects of ongoing debate in the research community.

– MPP: massively or moderately parallel processor

• Is it more cost-effective to build a parallel processor out of a relatively small number of powerful processors or a massive number of very simple processors?

– Tightly versus loosely coupled MIMD

• Network of workstations (NOW), cluster computing, grid computing

– Explicit message passing versus virtual shared memory


Global Vs. Distributed Memory (1)

• Within the MIMD class of parallel processors, memory can be global or distributed.

• Global memory may be visualized as being in a central location where all processors can access it with equal ease.

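• Because memory is accessed frequently, the processor-to-memory network must have very low latency; otherwise, memory latency-hiding techniques must be employed. An example of such methods is the use of multithreading.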
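The effect of multithreading on latency hiding can be estimated with a common back-of-the-envelope model, which is not taken from the slides. It assumes each thread computes for `run_cycles` between memory requests, each request takes `latency_cycles`, and context switching is free; the function name `utilization` is hypothetical.

```python
def utilization(threads, run_cycles, latency_cycles):
    """Fraction of time the processor does useful work (simplified model)."""
    # One thread is busy run_cycles out of every run_cycles + latency_cycles;
    # additional threads fill the waiting time until the processor saturates.
    return min(1.0, threads * run_cycles / (run_cycles + latency_cycles))


one_thread = utilization(1, run_cycles=10, latency_cycles=90)
five_threads = utilization(5, run_cycles=10, latency_cycles=90)
ten_threads = utilization(10, run_cycles=10, latency_cycles=90)
```

Under these assumptions a single thread keeps the processor busy a tenth of the time, and ten threads are enough to hide the memory latency completely.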


Global Vs. Distributed Memory (2)

• Examples for both the processor-to-memory and processor-to-processor networks include:

• An abstract model of global-memory computers is known as the PRAM.

• One approach to reducing the amount of data that must pass through the processor-to-memory interconnection network is to use a private cache memory. (This exploits locality of data access but gives rise to the cache coherence problem.)


Global Vs. Distributed Memory (3)

• Distributed-memory architectures can be conceptually viewed as in Fig. 4.5.

• In addition to the types of interconnection networks enumerated for shared-memory parallel processors, distributed-memory MIMD architectures can also be interconnected by a variety of direct networks. Such machines are sometimes described as nonuniform memory access (NUMA) architectures.


PRAM Shared-Memory Model (1)

• The theoretical model used for conventional or sequential computers (SISD class) is known as the random-access machine (RAM).

• The parallel version of RAM (PRAM) constitutes an abstract model of the class of global-memory parallel processors. The abstraction consists of ignoring the details of the processor-to-memory interconnection network and taking the view that each processor can access any memory location in each machine cycle, independent of what other processors are doing.


PRAM Shared-Memory Model (2)

• In the formal PRAM model, a single processor is assumed to be active initially. In each computation step, each active processor can read from and write into the shared memory and can also activate another processor.

• Even though the global-memory architecture was introduced as a subclass of the MIMD class, the abstract PRAM model depicted in Fig. 4.6 can be SIMD or MIMD.
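The activation rule of the formal model implies a start-up cost that can be counted directly. The sketch below assumes that each active processor activates exactly one other processor per computation step, which is one reading of "can also activate another processor"; the function name `steps_to_activate` is hypothetical.

```python
def steps_to_activate(p):
    """Computation steps until p processors are active, starting from one."""
    active, steps = 1, 0
    while active < p:
        # Every active processor wakes one more, so the count at most doubles.
        active = min(p, 2 * active)
        steps += 1
    return steps


startup = {p: steps_to_activate(p) for p in (1, 2, 8, 9, 1024)}
```

Under this assumption the number of active processors doubles each step, so activating p processors takes about log2 p steps (rounded up).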


PRAM Shared-Memory Model (3)

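• If a physical PRAM were built, the processor-to-memory connectivity would have to be realized by an interconnection network whose latency grows at least logarithmically with the number p of processors. This implies that each instruction cycle would have to consume Ω(log p) real time.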

• The above point is important when we try to compare PRAM algorithms with those for distributed-memory models. An O(log p)-step PRAM algorithm may not be faster than an O(log² p)-step algorithm for a hypercube architecture.
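The comparison can be made explicit with a short calculation. The constants c_1, c_2, c_3 are hypothetical, and the sketch assumes that one step of the hypercube algorithm takes constant real time.

```latex
T_{\mathrm{PRAM}}
  = \underbrace{c_1 \log p}_{\text{steps}} \times
    \underbrace{t_{\mathrm{cycle}}}_{\ge\, c_2 \log p}
  \;\ge\; c_1 c_2 \log^2 p,
\qquad
T_{\mathrm{hypercube}}
  = \underbrace{c_3 \log^2 p}_{\text{steps}} \times
    \underbrace{t_{\mathrm{step}}}_{O(1)}
  = O(\log^2 p).
```

Measured in real time rather than in steps, the two algorithms are of the same order, so the winner is decided by the constants rather than by the step counts.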


Distributed-Memory or Graph Models (1)

• Given the internal processor and memory structures in each node, a distributed-memory architecture is characterized primarily by the network used to interconnect the nodes.

• This network is usually represented as a graph.

• Important parameters of an interconnection network include:

– Network diameter: the longest of the shortest paths between various pairs of nodes

– Bisection (band)width: the smallest number (total capacity) of links that need to be cut in order to divide the network into two subnetworks of half the size

– Vertex or node degree: the number of communication ports required of each node
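These three parameters can be computed for small example networks. The sketch below is illustrative only: it assumes connected graphs with an even number of nodes and unit-capacity links, the helper names are hypothetical, and the brute-force bisection search is feasible only for tiny graphs.

```python
from collections import deque
from itertools import combinations


def hypercube(d):
    """Adjacency lists of a d-dimensional hypercube (2**d nodes)."""
    return {v: [v ^ (1 << k) for k in range(d)] for v in range(2 ** d)}


def ring(n):
    """Adjacency lists of an n-node ring (n >= 3)."""
    return {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}


def eccentricity(graph, src):
    # Breadth-first search gives shortest-path distances from src.
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())


def diameter(graph):
    # Longest of the shortest paths over all node pairs.
    return max(eccentricity(graph, v) for v in graph)


def max_degree(graph):
    return max(len(nbrs) for nbrs in graph.values())


def bisection_width(graph):
    # Try every split into two equal halves and keep the smallest cut.
    nodes = sorted(graph)
    best = None
    for half in combinations(nodes, len(nodes) // 2):
        side = set(half)
        cut = sum(1 for u in side for w in graph[u] if w not in side)
        best = cut if best is None else min(best, cut)
    return best


cube, loop = hypercube(3), ring(8)
cube_params = (diameter(cube), max_degree(cube), bisection_width(cube))
ring_params = (diameter(loop), max_degree(loop), bisection_width(loop))
```

For eight nodes, the hypercube trades a higher node degree for a smaller diameter and a larger bisection width than the ring.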


Distributed-Memory or Graph Models (2)

• Even though the distributed-memory architecture was introduced as a subclass of the MIMD class, machines based on networks of the type shown in Fig. 4.8 can be SIMD- or MIMD-type.

• Hierarchical bus architectures (Fig. 4.9) are available for reducing bus traffic by taking advantage of the locality of communication within small clusters of processors.
