Parallel Computing (Unit 5)
TRANSCRIPT
-
TOPICS TO BE COVERED
Introduction to parallel computing
Need for parallel computing
Parallel Architectural classification schemes
(1) Flynn's classification
(2) Feng's classification
-
INTRODUCTION TO PARALLEL
COMPUTING
Traditionally, software has been written for serial computation:
To be run on a single computer having a single Central Processing Unit (CPU);
A problem is broken into a discrete series of instructions.
Instructions are executed one after another.
Only one instruction may execute at any moment in time.
-
INTRODUCTION TO PARALLEL
COMPUTING
In the simplest sense, parallel computing is the simultaneous use
of multiple compute resources to solve a computational problem:
To be run using multiple CPUs
A problem is broken into discrete parts that can be solved
concurrently
Each part is further broken down to a series of instructions
Instructions from each part execute simultaneously on different CPUs, as sketched below.
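As a rough illustration of this decomposition (not part of the original slides; the squaring workload and the pool size of 4 are arbitrary assumptions), the following Python sketch breaks a problem into discrete parts and solves them concurrently on multiple CPUs:

```python
# Illustrative sketch: break a problem into discrete parts and solve them
# concurrently on multiple CPUs using Python's multiprocessing module.
from multiprocessing import Pool

def partial_sum(chunk):
    """One 'part' of the problem: sum of squares over a slice of the data (hypothetical workload)."""
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    parts = [data[i::4] for i in range(4)]      # break the problem into 4 discrete parts
    with Pool(processes=4) as pool:             # each part runs in a separate process/CPU
        results = pool.map(partial_sum, parts)  # parts execute simultaneously
    print(sum(results))                         # combine the partial results
```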
-
INTRODUCTION TO PARALLEL
COMPUTING
For parallel processing there may be:
A single computer with multiple processors;
An arbitrary number of computers connected by a network;
A combination of both.
The computational problem usually demonstrates characteristics such as the ability to be:
Broken apart into discrete pieces of work that can be solved simultaneously;
Executed as multiple program instructions at any moment in time;
Solved in less time with multiple compute resources than with a single compute resource.
-
DEFINITION
Parallel processing is an efficient form of information processing
which emphasizes the exploitation of concurrent events in the
computing process.
Concurrency implies parallelism, simultaneity, and pipelining.
Parallel events may occur in multiple resources during the same time interval.
Simultaneous events may occur at the same time instant.
Pipelined events may occur in overlapped time spans.
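To make the pipelined case concrete, here is a minimal, illustrative Python sketch (not from the slides; the stage names and instruction count are assumptions chosen only for illustration) that tabulates which instructions occupy which pipeline stages in each cycle:

```python
# Illustrative sketch: a toy timing table for a 3-stage pipeline
# (fetch, decode, execute) processing 4 instructions. It shows how
# pipelined events occupy overlapped time spans.

STAGES = ["fetch", "decode", "execute"]
NUM_INSTRUCTIONS = 4

def pipeline_timetable(num_instructions, stages):
    """Return a dict: cycle -> list of (instruction, stage) active in that cycle."""
    table = {}
    for instr in range(num_instructions):
        for stage_index, stage in enumerate(stages):
            cycle = instr + stage_index          # instruction i enters stage s at cycle i + s
            table.setdefault(cycle, []).append((f"I{instr}", stage))
    return table

if __name__ == "__main__":
    for cycle, work in sorted(pipeline_timetable(NUM_INSTRUCTIONS, STAGES).items()):
        print(f"cycle {cycle}: " + ", ".join(f"{i}:{s}" for i, s in work))
    # Total cycles = NUM_INSTRUCTIONS + len(STAGES) - 1 = 6, instead of 12 serially.
```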
-
NEED OF PARALLEL COMPUTING
Save time and/or money: In theory, throwing more resources at a task will shorten its time
to completion, with potential cost savings. Parallel clusters can be built from cheap,
commodity components.
Solve larger problems: Many problems are so large and/or complex that it is impractical or
impossible to solve them on a single computer, especially given limited computer memory. For example,
web search engines/databases processing millions of transactions per second.
Provide concurrency: A single compute resource can only do one thing at a time. Multiple
computing resources can be doing many things simultaneously. For example, the Access Grid
provides a global collaboration network where people from around the world can meet and conduct work "virtually".
Use of non-local resources: Using compute resources on a wide area network, or even the
Internet, when local compute resources are scarce.
-
APPLICATIONS OF PARALLEL
COMPUTING
1) Design of VLSI circuits
2) CAD/CAM applications in all spheres of engineering activity.
3) Solving field problems. These are modeled using partial differential
equations and require operations on large-sized matrices. Examples:
1. Structural dynamics in aerospace and civil engineering
2. Material and nuclear problems in physics
3. Particle system problems in physics
4. Weather forecasting
5. Intelligent systems
6. Modeling and simulation in economics, planning and many other areas
7. Remote sensing and processing of large data
8. Problems in nuclear energy
-
Parallelism can be achieved by:
(1) Parallelism in a uniprocessor system
(2) Parallel computers
-
PARALLELISM IN
UNIPROCESSOR SYSTEM
A number of parallel processing mechanisms have been
developed in uniprocessor computers:
(1) Multiplicity of functional units:
Many of the functions of the ALU can be distributed to multiple, specialized
functional units which can operate in parallel.
Example: the CDC 6600 has 10 functional units in its CPU, and they are independent of each
other. The IBM 360/91 has two parallel execution units, one for fixed-point and
another for floating-point arithmetic.
-
PARALLELISM IN
UNIPROCESSOR SYSTEM
(2) Parallelism and pipelining within the CPU:
In contrast to the bit-serial adder, carry-lookahead and carry-save adders
are used.
High-speed multiplication and division techniques are used for exploiting parallelism.
The various phases of instruction execution are pipelined, including
instruction fetch, decode, operand fetch, execution, and store.
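As a rough illustration of the carry-lookahead idea mentioned above, here is a minimal Python sketch (not from the slides; the 4-bit width and sample operands are arbitrary assumptions) that derives all carries from generate/propagate signals rather than rippling them bit by bit:

```python
# Illustrative sketch: 4-bit carry-lookahead addition.
# Generate (g) and propagate (p) signals are formed for every bit position,
# then each carry is computed from g, p and the input carry; in hardware
# these terms are expanded so all carries are produced in parallel.

def carry_lookahead_add(a, b, width=4, carry_in=0):
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    g = [x & y for x, y in zip(a_bits, b_bits)]   # generate: this bit produces a carry
    p = [x ^ y for x, y in zip(a_bits, b_bits)]   # propagate: this bit passes a carry on
    carries = [carry_in]
    for i in range(width):
        # c[i+1] = g[i] OR (p[i] AND c[i])
        carries.append(g[i] | (p[i] & carries[i]))
    sum_bits = [p[i] ^ carries[i] for i in range(width)]
    value = sum(bit << i for i, bit in enumerate(sum_bits))
    return value, carries[width]                  # (sum, carry-out)

if __name__ == "__main__":
    print(carry_lookahead_add(0b0110, 0b0111))    # (13, 0): 6 + 7 = 13, no carry-out
```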
-
PARALLELISM IN
UNIPROCESSOR SYSTEM
(3) Overlapped CPU and I/O operations:
I/O operations can be performed simultaneously with CPU computation by using separate I/O controllers, channels and I/O processors.
DMA can be used to provide direct communication between memory and I/O devices.
(4) Use of a hierarchical memory system:
A hierarchical memory system can be used to close up the speed gap between the
CPU and memory.
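A minimal, illustrative Python sketch of the same overlapping idea at the program level (not from the slides; the I/O is simulated with a sleep rather than a real device transfer): an I/O wait proceeds in the background while the CPU keeps computing.

```python
# Illustrative sketch: overlap an I/O wait with CPU work using a thread.
# The "I/O" here is simulated with a sleep; in practice it could be a disk
# read handled by a DMA controller while the CPU keeps computing.
import threading
import time

result = {}

def do_io():
    """Simulated I/O-bound operation running in the background."""
    time.sleep(1.0)               # stands in for a device transfer
    result["data"] = b"payload"

def compute():
    """CPU-bound work that proceeds while the I/O is in flight."""
    return sum(i * i for i in range(1_000_000))

if __name__ == "__main__":
    io_thread = threading.Thread(target=do_io)
    io_thread.start()             # start the (simulated) I/O operation
    total = compute()             # the CPU keeps working concurrently
    io_thread.join()              # wait for the I/O to complete
    print(total, result["data"])
```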
-
PARALLELISM IN
UNIPROCESSOR SYSTEM
(5) Multiprogramming:
Within the same time interval there may be multiple processes active in
a computer, competing for memory, I/O and CPU resources.
When a process P1 is tied up with I/O operations, the system scheduler
can switch the CPU to process P2. This allows the simultaneous
execution of several programs in the system.
When P2 is done, the CPU can be switched to P3 or back to P1.
This interleaving of CPU and I/O operations among several programs is
called multiprogramming.
-
PARALLELISM IN
UNIPROCESSOR SYSTEM
(6) Time sharing:
Sometimes a high-priority program may occupy the CPU for too long to
allow others to share. This problem can be overcome by using a time-sharing operating system.
The concept extends multiprogramming by assigning fixed or
variable time slices to multiple programs.
Each user thinks that he or she is the sole user of the system, because the response is so fast.
-
ARCHITECTURAL CLASSIFICATION
SCHEMES
(1) Flynn's Classification:
In general, digital computers may be classified into four categories according to the multiplicity of instruction and data streams.
This scheme was introduced by Michael J. Flynn.
The essential computing process is the execution of a sequence of instructions on a set of data. The term stream is used here to denote a sequence of items (instructions or data).
Instruction stream: a sequence of instructions.
Data stream: a sequence of data.
-
FLYNN'S CLASSIFICATION
Single instruction stream - single data stream (SISD)
Single instruction stream - multiple data stream (SIMD)
Multiple instruction stream - single data stream (MISD)
Multiple instruction stream - multiple data stream (MIMD)
Both instructions and data are fetched from memory. Instructions are decoded by the control unit, which sends the decoded instruction stream to the processor unit for execution.
Data streams flow between the processors and the memory module.
Each instruction stream is generated by an independent control unit.
-
SISD
Instructions are executed sequentially, but may be overlapped in their
execution. Most SISD uniprocessors are pipelined.
All functional units are under the supervision of a single control
unit.
-
SIMD
This class corresponds to array processors. There are multiple processing elements
supervised by the same control unit.
All PEs receive the same instruction but operate on different data sets. The shared memory subsystem may contain multiple modules.
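A loose software analogy (not from the slides; it uses NumPy purely for illustration): one "instruction", elementwise addition, is applied across many data elements at once, much as an array processor broadcasts the same instruction to all PEs.

```python
# Illustrative sketch: one operation applied to many data elements at once.
# This is only a software analogy for SIMD; NumPy applies the same
# elementwise operation over whole arrays instead of looping in Python.
import numpy as np

a = np.arange(8)                 # data set 1: [0, 1, ..., 7]
b = np.arange(8, 16)             # data set 2: [8, 9, ..., 15]
c = a + b                        # the "single instruction" operates on all elements
print(c)                         # [ 8 10 12 14 16 18 20 22]
```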
-
MISD
There are n processor units, each receiving distinct instructions but operating over the same
data stream and its derivatives.
The result of one processor becomes the input of the next processor.
Example: systolic arrays.
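A rough illustrative sketch (not from the slides; the three stage functions are hypothetical): distinct operations are applied in sequence to the same data stream, each stage's output feeding the next.

```python
# Illustrative sketch: n "processors", each applying a distinct operation,
# chained so that the result of one becomes the input of the next.

def p1(x):            # hypothetical stage 1: scale
    return 2 * x

def p2(x):            # hypothetical stage 2: shift
    return x + 3

def p3(x):            # hypothetical stage 3: square
    return x * x

PROCESSORS = [p1, p2, p3]

def misd_chain(data_stream):
    results = []
    for item in data_stream:
        for proc in PROCESSORS:   # the same data item flows through every processor
            item = proc(item)
        results.append(item)
    return results

if __name__ == "__main__":
    print(misd_chain([1, 2, 3]))  # [25, 49, 81]
```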
-
MIMD
Most multiprocessor systems and multiple-computer systems can be classified in this
category.
If the n data streams were derived from disjoint subspaces, then we would have MSISD,
which is nothing but a set of n independent SISD uniprocessor systems.
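A minimal illustrative sketch (not from the slides; the two tasks and their inputs are hypothetical): different instruction streams run on different data streams at the same time, each in its own process.

```python
# Illustrative sketch: independent instruction streams operating on
# independent data streams, each in its own process.
from multiprocessing import Process, Queue

def count_words(text, out):       # hypothetical task 1
    out.put(("words", len(text.split())))

def sum_numbers(numbers, out):    # hypothetical task 2
    out.put(("sum", sum(numbers)))

if __name__ == "__main__":
    out = Queue()
    procs = [
        Process(target=count_words, args=("parallel computing unit five", out)),
        Process(target=sum_numbers, args=([1, 2, 3, 4], out)),
    ]
    for p in procs:
        p.start()                 # different programs run on different data at once
    for p in procs:
        p.join()
    for _ in range(len(procs)):
        print(out.get())
```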
-
FENG'S CLASSIFICATION
Tse-yun Feng has suggested the use of the degree of parallelism to classify various computer architectures.
The maximum number of binary digits that can be processed within a unit time by a computer system is called the maximum parallelism degree P.
Let Pi be the number of bits that can be processed within the i-th processor cycle. If there are a total of T cycles, then the average parallelism degree Pa is defined as
Pa = (1/T) * sum(Pi, i = 1..T)
The utilization rate μ of a computer within T cycles is
μ = Pa / P
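A small worked example (the cycle-by-cycle bit counts and the value of P are made up purely for illustration) computing Pa and μ from the definitions above:

```python
# Illustrative worked example (hypothetical numbers): average parallelism
# degree Pa and utilization rate mu over T processor cycles.
P_MAX = 32                        # maximum parallelism degree P (bits per unit time)
bits_per_cycle = [32, 16, 32, 8]  # Pi: bits actually processed in each of T = 4 cycles

T = len(bits_per_cycle)
Pa = sum(bits_per_cycle) / T      # Pa = (1/T) * sum(Pi) = 88 / 4 = 22
mu = Pa / P_MAX                   # mu = 22 / 32 = 0.6875

print(Pa, mu)
```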
-
FENG'S CLASSIFICATION
Here the horizontal axis shows the word length n, and the vertical axis
shows the bit-slice length m. Both lengths are measured in number of bits.
The maximum parallelism degree P(C) of a given computer system C is represented by the product of the word length n and the bit-slice length m,
that is:
P(C) = n * m
The pair (n, m) corresponds to a point in this coordinate system. P(C) is equal to the area of the
rectangle defined by the integers n and m.
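A tiny illustrative calculation of P(C) = n * m (the machine names and (n, m) pairs are hypothetical, chosen only to show the three non-trivial regions of the plot):

```python
# Illustrative sketch: maximum parallelism degree P(C) = n * m for a few
# hypothetical machines, each described by (word length n, bit-slice length m).
machines = {
    "machine_A": (16, 1),   # word-serial, bit-parallel style
    "machine_B": (1, 256),  # word-parallel, bit-serial style
    "machine_C": (64, 64),  # word-parallel, bit-parallel style
}

for name, (n, m) in machines.items():
    print(name, "P(C) =", n * m)
```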
-
FENG'S CLASSIFICATION
There are 4 types of processing methods:
(1) Word-serial and bit-serial (WSBS): n = m = 1. One bit is processed
at a time. Found in first-generation computers.
(2) Word-parallel and bit-serial (WPBS): n = 1, m > 1. Also known as bit-slice
processing because an m-bit slice is processed at a time.
[Slide figure: a small bit array; each row is a word, and each column (one bit from every word) forms a bit slice.]
1 0 1 0
0 0 0 0
1 1 1 1
0 1 1 0
1 1 1 1
-
FENG'S CLASSIFICATION
(3) Word-serial and bit-parallel (WSBP): n > 1, m = 1. As found in most existing
computers. Also known as word-slice processing, because one word
of n bits is processed at a time.
(4) Word-parallel and bit-parallel (WPBP): n > 1, m > 1. Known as fully parallel
processing, in which an array of n x m bits is processed at one time.