Parallel Computing (Unit 5)


  • 8/6/2019 Parallel Computing(Unit5)

    1/25


    TOPICS TO BE COVERED

    Introduction to parallel computing

    Need for parallel computing

    Parallel Architectural classification schemes

    (1) Flynn's classification

    (2) Feng's classification


    INTRODUCTION TO PARALLEL

    COMPUTING

    Traditionally, software has been written for serial computation:

    To be run on a single computer having a single Central Processing Unit (CPU);

    A problem is broken into a discrete series of instructions.

    Instructions are executed one after another.

    Only one instruction may execute at any moment in time.


    In the simplest sense, parallel computing is the simultaneous use

    of multiple compute resources to solve a computational problem:

    To be run using multiple CPUs

    A problem is broken into discrete parts that can be solved

    concurrently

    Each part is further broken down to a series of instructions

    Instructions from each part execute simultaneously on different CPUs.
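The decomposition described above can be sketched in a few lines of Python. This is a minimal illustration, not a real parallel framework: a problem (summing a large list) is broken into discrete parts, and the parts are handed to a pool of workers. A thread pool stands in for "multiple CPUs" here; with a process pool the parts would run on separate cores.

```python
# Sketch: break a problem into discrete parts and solve them concurrently.
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    # Each "part" is its own series of instructions, run on its own worker.
    return sum(chunk)

def parallel_sum(data, workers=4):
    # Break the problem into discrete parts of roughly equal size...
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # ...solve the parts concurrently, then combine the partial results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

print(parallel_sum(list(range(1000))))  # same result as sum(range(1000))
```

The function names (`partial_sum`, `parallel_sum`) are illustrative choices, not standard library APIs.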


    For parallel processing there may be:

    A single computer with multiple processors;

    An arbitrary number of computers connected by a network;

    A combination of both.

    The computational problem usually demonstrates characteristics such as the ability to be:

    Broken apart into discrete pieces of work that can be solved simultaneously;

    Solved by executing multiple program instructions at any moment in time;

    Solved in less time with multiple compute resources than with a single compute resource.


    DEFINITION

    Parallel processing is an efficient form of information processing which emphasizes the exploitation of concurrent events in the computing process.

    Concurrency implies parallelism, simultaneity, and pipelining.

    Parallelism: events occurring within the same interval of time.

    Simultaneity: simultaneous events may occur at the same time instant.

    Pipelining: pipelined events may occur in overlapped time spans.


    NEED OF PARALLEL COMPUTING

    Save time and/or money: In theory, throwing more resources at a task will shorten its time

    to completion, with potential cost savings. Parallel clusters can be built from cheap,

    commodity components.

    Solve larger problems: Many problems are so large and/or complex that it is impractical or impossible to solve them on a single computer, especially given limited computer memory. For example, web search engines and databases process millions of transactions per second.

    Provide concurrency: A single compute resource can only do one thing at a time, whereas multiple computing resources can be doing many things simultaneously. For example, the Access Grid provides a global collaboration network where people from around the world can meet and conduct work "virtually."

    Use of non-local resources: Use compute resources on a wide area network, or even the Internet, when local compute resources are scarce.


    APPLICATIONS OF PARALLEL

    COMPUTING

    1) Design of VLSI circuits.

    2) CAD/CAM applications in all spheres of engineering activity.

    3) Solving field problems. These are modeled using partial differential equations and require operations on large matrices. Examples:

    1. Structural dynamics in aerospace and civil engineering
    2. Material and nuclear problems in physics
    3. Particle system problems in physics
    4. Weather forecasting
    5. Intelligent systems
    6. Modeling and simulation in economics, planning and many other areas
    7. Remote sensing and processing of large data
    8. Problems in nuclear energy


    Parallelism can be achieved by:

    (1) Parallelism in uniprocessor systems
    (2) Parallel computers


    PARALLELISM IN

    UNIPROCESSOR SYSTEM

    A number of parallel processing mechanisms have been developed in uniprocessor computers:

    (1) Multiplicity of functional units:

    Many of the functions of the ALU can be distributed to multiple specialized functional units which can operate in parallel.

    Examples: The CDC 6600 has 10 functional units in its CPU, independent of each other. The IBM 360/91 has two parallel execution units, one for fixed point and another for floating point.


    (2) Parallelism and pipelining within the CPU:

    In contrast to a bit-serial adder, carry-lookahead and carry-save adders are used.

    High-speed multiplication and division techniques are used for exploiting parallelism.

    Various phases of instruction execution are pipelined, including instruction fetch, decode, operand fetch, execution, and store.
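The payoff of pipelining these phases can be sketched numerically. This is a toy model, not any specific CPU: it assumes k ideal one-cycle stages (e.g. the five phases named above) with no stalls or hazards, in which case n instructions take k + (n - 1) cycles instead of k * n.

```python
# Toy model of pipeline speedup under ideal assumptions (no stalls).
def cycles_unpipelined(n, stages=5):
    # Each instruction runs all stages to completion before the next starts.
    return stages * n

def cycles_pipelined(n, stages=5):
    # The first instruction fills the pipe (stages cycles); every later
    # instruction retires one cycle after its predecessor.
    return stages + (n - 1)

# fetch, decode, operand fetch, execute, store -> 5 stages
print(cycles_unpipelined(100), cycles_pipelined(100))  # 500 vs 104
```

For large n the pipelined time approaches n cycles, i.e. a speedup approaching the number of stages.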


    (3) Overlapped CPU and I/O operations:

    I/O operations can be performed simultaneously with CPU computation by using separate I/O controllers, channels and processors.

    DMA can be used to provide direct communication between memory and I/O.

    (4) Use of a hierarchical memory system:

    A hierarchical memory system can be used to close the speed gap between the CPU and memory.


    (5) Multiprogramming:

    Within the same time interval there may be multiple processes active in a computer, competing for memory, I/O and CPU resources.

    When a process P1 is tied up with I/O operations, the system scheduler can switch the CPU to process P2. This allows the simultaneous execution of several programs in the system.

    When P2 is done, the CPU can be switched to P3 or back to P1.

    This interleaving of CPU and I/O operations among several programs is called multiprogramming.
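The P1/P2 scenario above can be sketched with Python threads. This is only an analogy under stated assumptions: `time.sleep` stands in for P1's I/O wait (during which it holds no CPU), while P2's short computation runs on the CPU in the meantime, so the two overlap instead of running back to back.

```python
# Sketch: while "process" P1 waits on (simulated) I/O, P2 uses the CPU.
import threading
import time

log = []

def p1_io_bound():
    time.sleep(0.2)        # simulated I/O wait; the CPU is free meanwhile
    log.append("P1 done")

def p2_cpu_bound():
    sum(range(100_000))    # short burst of CPU work
    log.append("P2 done")

t1 = threading.Thread(target=p1_io_bound)
t1.start()                 # P1 starts and immediately blocks on "I/O"
p2_cpu_bound()             # the scheduler runs P2 while P1 is blocked
t1.join()
print(log)                 # P2 typically finishes before P1's I/O completes
```

Exact ordering depends on scheduling, which is why the example only claims both complete, not who finishes first.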


    (6) Time sharing:

    Sometimes a high-priority program may occupy the CPU for too long to allow others to share. This problem can be overcome by using a time-sharing operating system.

    The concept extends multiprogramming by assigning fixed or variable time slices to multiple programs.

    Each user thinks that he or she is the sole user of the system, because the response is so fast.


    ARCHITECTURAL CLASSIFICATION

    SCHEMES

    (1) Flynn's classification:

    In general, digital computers may be classified into four categories, according to the multiplicity of instruction and data streams. This scheme was introduced by Michael J. Flynn.

    The essential computing process is the execution of a sequence of instructions on a set of data. The term stream is used here to denote a sequence of items (instructions or data).

    Instruction stream: a sequence of instructions.

    Data stream: a sequence of data.


    FLYNN'S CLASSIFICATION

    Single instruction stream, single data stream (SISD)

    Single instruction stream, multiple data streams (SIMD)

    Multiple instruction streams, single data stream (MISD)

    Multiple instruction streams, multiple data streams (MIMD)

    Both instructions and data are fetched from memory. Instructions are decoded by the control unit, which sends the decoded instruction stream to the processor units for execution.

    Data streams flow between the processors and memory modules.

    Each instruction stream is generated by an independent control unit.


    SISD

    Instructions are executed sequentially, but may be overlapped in execution. Most SISD uniprocessors are pipelined.

    All functional units are under the supervision of a single control unit.


    SIMD

    This class corresponds to array processors. There are multiple processing elements (PEs) supervised by the same control unit.

    All PEs receive the same instruction but operate on different data sets. The shared memory subsystem may contain multiple modules.
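The SIMD idea above can be sketched as a toy model in plain Python (real SIMD hardware does this in a single machine instruction; the loop here only models the behavior): one control unit broadcasts a single instruction, and each processing element applies it to its own data element.

```python
# Toy model of SIMD: the same instruction is applied to every element
# of the data set, as if each element had its own processing element.
def simd_broadcast(instruction, data):
    # Every PE executes the identical instruction on different data.
    return [instruction(x) for x in data]

add10 = lambda x: x + 10            # the broadcast "instruction"
print(simd_broadcast(add10, [1, 2, 3, 4]))  # [11, 12, 13, 14]
```

The names `simd_broadcast` and `add10` are illustrative, not standard APIs.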


    MISD

    There are n processor units, each receiving distinct instructions operating over the same data stream and its derivatives.

    The result of one processor becomes the input of the next processor.

    Example: systolic arrays.
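The MISD data flow just described can be sketched as function composition: each "processor" holds a distinct instruction, and the single data stream passes through them in turn, each stage consuming its predecessor's result. The stage operations below are arbitrary illustrative choices.

```python
# Sketch of MISD data flow: one datum, n distinct "processors" in a chain,
# the output of each becoming the input of the next (as in a systolic array).
from functools import reduce

def misd_chain(processors, datum):
    # Thread the single data stream through every processor in order.
    return reduce(lambda value, proc: proc(value), processors, datum)

stages = [lambda x: x * 2,   # processor 1: its own distinct instruction
          lambda x: x + 3,   # processor 2
          lambda x: x ** 2]  # processor 3
print(misd_chain(stages, 5))  # ((5*2)+3)**2 = 169
```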


    MIMD

    Most multiprocessor systems and multiple-computer systems can be classified in this category.

    If the n data streams were derived from disjoint subspaces, then we would have MSISD, which is nothing but a set of n independent SISD uniprocessor systems.


    FENG'S CLASSIFICATION

    Tse-yun Feng has suggested the use of the degree of parallelism to classify various computer architectures.

    The maximum number of binary digits that can be processed within a unit time by a computer system is called the maximum parallelism degree P.

    Let Pi be the number of bits that can be processed within the i-th processor cycle. If there are T cycles in total, the average parallelism degree Pa is defined as:

    Pa = (sum of Pi for i = 1, ..., T) / T

    The utilization rate mu of a computer within T cycles is:

    mu = Pa / P
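The two definitions above translate directly into code. The per-cycle bit counts in the example are hypothetical numbers chosen only to exercise the formulas.

```python
# Feng's measures: Pi = bits processed in cycle i, P = maximum degree.
def average_parallelism(bits_per_cycle):
    # Pa = (sum of Pi over T cycles) / T
    return sum(bits_per_cycle) / len(bits_per_cycle)

def utilization(bits_per_cycle, p_max):
    # mu = Pa / P; mu = 1 only if every cycle processes the full P bits.
    return average_parallelism(bits_per_cycle) / p_max

pi = [32, 16, 32, 32]            # hypothetical per-cycle bit counts, T = 4
print(average_parallelism(pi))   # 28.0
print(utilization(pi, 32))       # 0.875
```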


    Here the horizontal axis shows the word length n, and the vertical axis shows the bit-slice length m (both lengths are measured in bits).

    The maximum parallelism degree P(C) of a given computer system C is represented by the product of the word length n and the bit-slice length m, that is:

    P(C) = n * m

    The pair (n, m) corresponds to a point in this coordinate system, and P(C) is equal to the area of the rectangle defined by the integers n and m.


    There are four types of processing methods:

    (1) Word-serial and bit-serial (WSBS): (n = m = 1). One bit is processed at a time. First-generation computers.

    (2) Word-parallel, bit-serial (WPBS): (n = 1, m > 1). Also known as bit-slice processing, because an m-bit slice is processed at a time. Example bit array (five 4-bit words; a bit slice is one column):

    1 0 1 0
    0 0 0 0
    1 1 1 1
    0 1 1 0
    1 1 1 1


    (3) Word-serial, bit-parallel (WSBP): (n > 1, m = 1). As found in most existing computers. Also known as word-slice processing, because one word of n bits is processed at a time.

    (4) Word-parallel, bit-parallel (WPBP): (n > 1, m > 1). Known as fully parallel processing, in which an array of n * m bits is processed at one time.
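The four processing methods can be sketched as a small lookup on the pair (n, m); the conditions below simply restate the definitions from the slides (n = word length, m = bit-slice length), and the function names are illustrative.

```python
# Classify a computer system in Feng's scheme from its (n, m) pair.
def feng_class(n, m):
    if n == 1 and m == 1:
        return "WSBS"  # word-serial, bit-serial: one bit at a time
    if n == 1:
        return "WPBS"  # word-parallel, bit-serial: bit-slice processing
    if m == 1:
        return "WSBP"  # word-serial, bit-parallel: word-slice processing
    return "WPBP"      # word-parallel, bit-parallel: fully parallel

def max_parallelism(n, m):
    # P(C) = n * m, the area of the rectangle defined by n and m.
    return n * m

print(feng_class(16, 1), max_parallelism(16, 1))  # WSBP 16
```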