15 parallel processing

Upload: rajeevrajkumar

Post on 06-Jul-2018

TRANSCRIPT

  • 8/17/2019 15 Parallel Processing

    1/36

     

    Chapter 17

    Parallel Processing

    Computer Organizations

    Multiple Processor Organization

    • Single instruction, single data stream – SISD

    • Single instruction, multiple data stream – SIMD

    • Multiple instruction, single data stream – MISD

    • Multiple instruction, multiple data stream – MIMD

    Single Instruction, Single Data Stream - SISD

    • Single processor

    • Single instruction stream

    • Data stored in single memory

    Single Instruction, Multiple Data Stream - SIMD

    • Single machine instruction
      – Each instruction executed on different set of data by different processors

    • Number of processing elements
      – Machine controls simultaneous execution
        – Lockstep basis
      – Each processing element has associated data memory

    • Application: Vector and array processing
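
The lockstep idea above can be illustrated with a minimal sketch. It only models the behaviour in software: the function name `simd_step` and the four-element data set are assumptions made for illustration, not part of the slides.

```python
from typing import Callable, List


def simd_step(instruction: Callable[[int], int], local_data: List[int]) -> List[int]:
    """Apply one instruction to every processing element's local datum.

    Each list slot stands in for one processing element and its own data
    memory; the single comprehension models the control unit broadcasting
    the same instruction to all elements in lockstep.
    """
    return [instruction(x) for x in local_data]


# Four hypothetical processing elements, each holding one operand.
elements = [1, 2, 3, 4]

# Instruction 1: add 10 in every element.
elements = simd_step(lambda x: x + 10, elements)
# Instruction 2: double the value in every element.
elements = simd_step(lambda x: x * 2, elements)

print(elements)
```

Every element sees the same instruction sequence and differs only in the data it holds, which is why vector and array workloads map well onto this organization.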

    Multiple Instruction, Single Data Stream - MISD

    • Sequence of data

    • Transmitted to set of processors

    • Each processor executes different instruction sequence

    • Not clear if it has ever been implemented

    Multiple Instruction, Multiple Data Stream- MIMD

    • Set of processors

    • Simultaneously executes different instruction sequences

    • Different sets of data

    • Examples: SMPs, NUMA systems, and Clusters

    Taxonomy of Parallel Processor Architectures

    Block Diagram of Tightly Coupled Multiprocessor

    • Processors share memory

    • Communicate via that shared memory

    Symmetric Multiprocessor Organization

    Symmetric Multiprocessors

    • A stand alone computer with the following characteristics
      – Two or more similar processors of comparable capacity
      – Processors share same memory and I/O
      – Processors are connected by a bus or other internal connection
      – Memory access time is approximately the same for each processor
      – All processors share access to I/O
        – Either through same channels or different channels giving paths to same devices
      – All processors can perform the same functions (hence symmetric)
      – System controlled by integrated operating system
        – Providing interaction between processors
        – Interaction at job, task, file and data element levels

    IBM z990 Multiprocessor Structure

    SMP Advantages

    • Performance
      – If some work can be done in parallel

    • Availability
      – Since all processors can perform the same functions, failure of a single processor does not halt the system

    • Incremental growth
      – User can enhance performance by adding additional processors

    • Scaling
      – Vendors can offer range of products based on number of processors
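
The performance bullet depends on how much of the work can actually run in parallel. One common way to estimate this is Amdahl's law, which the slides do not name; the sketch below assumes it applies, and the 90% parallel fraction is purely illustrative.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Estimated speedup when only part of the work can run in parallel.

    Assumes Amdahl's law: the serial part takes the same time regardless
    of processor count, and the parallel part divides evenly.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# Hypothetical workload in which 90% of the work is parallelizable.
for n in (1, 2, 4, 8):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

Under this model, adding processors gives diminishing returns once the serial part dominates, which is one reason incremental growth has practical limits.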

    Cache Coherence Problems

    Popular solution - Snoopy Protocol

    • Distribute cache coherence responsibility among cache controllers

    • Cache recognizes that a line is shared

    • Updates announced to other caches
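
A toy model may help show what "announcing updates" means in practice. The sketch below assumes a write-through, write-invalidate variant of snooping, which is only one of several possible policies; the `Bus` and `Cache` classes and their methods are hypothetical names, not taken from the slides.

```python
from typing import Dict, List


class Bus:
    """Shared bus: main memory plus the list of snooping caches."""

    def __init__(self) -> None:
        self.memory: Dict[int, int] = {}
        self.caches: List["Cache"] = []

    def announce_write(self, writer: "Cache", addr: int, value: int) -> None:
        # Write through to memory, then let every other cache snoop the write.
        self.memory[addr] = value
        for cache in self.caches:
            if cache is not writer:
                cache.snoop_write(addr)


class Cache:
    """Per-processor cache whose controller watches the bus."""

    def __init__(self, bus: Bus) -> None:
        self.bus = bus
        self.lines: Dict[int, int] = {}  # address -> value, valid lines only
        bus.caches.append(self)

    def read(self, addr: int) -> int:
        if addr not in self.lines:  # miss: fetch the line from memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr: int, value: int) -> None:
        self.lines[addr] = value
        self.bus.announce_write(self, addr, value)

    def snoop_write(self, addr: int) -> None:
        # Another cache wrote this line: drop our now-stale copy.
        self.lines.pop(addr, None)


bus = Bus()
c0, c1 = Cache(bus), Cache(bus)
bus.memory[0x10] = 1
print(c0.read(0x10), c1.read(0x10))  # both caches now hold the line
c0.write(0x10, 2)                    # announced on the bus; c1 invalidates
print(c1.read(0x10))                 # c1 misses and refetches from memory
```

No central directory is involved: each cache controller keeps its own copies coherent by reacting to what it observes on the bus.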

    Loosely Coupled - Clusters

    • Collection of independent whole uniprocessors or SMPs
      – Usually called nodes

    • Interconnected to form a cluster

    • Working together as unified resource
      – Illusion of being one machine

    • Communication via fixed path or network connections

    Cluster Configurations

    Cluster Benefits

    • Absolute scalability

    • Incremental scalability

    • High availability

    • Superior price/performance

    Cluster Computer Architecture

    Cluster v. SMP

    • Both provide multiprocessor support to high demand applications.

    • Both available commercially

    • SMP:
      – Easier to manage and control
      – Closer to single processor systems
        – Scheduling is main difference
        – Less physical space
        – Lower power consumption

    • Clustering:
      – Superior incremental & absolute scalability
      – Less cost
      – Superior availability

    Nonuniform Memory Access (NUMA) (Tightly coupled)

    • Alternative to SMP & Clusters

    • Nonuniform memory access
      – All processors have access to all parts of memory
        – Using load & store
      – Access time of processor differs depending on region of memory
        – Different processors access different regions of memory at different speeds

    • Cache coherent NUMA
      – Cache coherence is maintained among the caches of the various processors
      – Significantly different from SMP and Clusters

    Motivation

    • SMP has practical limit to number of processors
      – Bus traffic limits to between 16 and 64 processors

    • In clusters each node has own memory
      – Apps do not see large global memory
      – Coherence maintained by software not hardware

    • NUMA retains SMP flavour while giving large scale multiprocessing

    • Objective is to maintain transparent system wide memory while permitting multiprocessor nodes, each with own bus or internal interconnection system

    CC-NUMA Organization

    NUMA Pros & Cons

    • Possibly effective performance at higher levels of parallelism than one SMP

    • Not very supportive of software changes

    • Performance can break down if too much access to remote memory
      – Can be avoided by:
        – L1 & L2 cache design reducing all memory access
        – Need good temporal locality of software

    • Not transparent
      – Page allocation, process allocation and load balancing changes can be difficult

    • Availability?
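
The remote-memory bullet can be made concrete with a simple weighted average. This is a rough model only: the function name and the 100 ns / 400 ns latencies are hypothetical values chosen for illustration, and cache effects are ignored.

```python
def average_access_time(remote_fraction: float, local_ns: float, remote_ns: float) -> float:
    """Mean memory access time when some accesses go to a remote node."""
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns


# Hypothetical latencies: 100 ns to local memory, 400 ns to a remote node.
for fraction in (0.0, 0.1, 0.5):
    print(fraction, average_access_time(fraction, 100.0, 400.0))
```

Even a modest share of remote accesses raises the average noticeably in this model, which is why good locality and careful page placement matter on NUMA systems.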

    Multithreading

    • Instruction stream divided into smaller streams (threads)

    • Executed in parallel

    • There are a wide variety of multithreading designs

    Definitions of Threads and Processes

    • Threads in multithreaded processors may or may not be same as software threads

    • Process:
      – An instance of program running on computer

    • Thread: dispatchable unit of work within process
      – Includes processor context (which includes the program counter and stack pointer) and data area for stack
      – Thread executes sequentially
      – Interruptible: processor can turn to another thread

    • Thread switch
      – Switching processor between threads within same process
      – Typically less costly than process switch
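
The process/thread distinction can be seen at the software level with a short sketch. It uses operating-system threads, which, as noted above, are not necessarily the same as hardware threads; the worker function and the counts are illustrative assumptions.

```python
import os
import threading

shared_total = 0          # lives in the process; visible to every thread
lock = threading.Lock()
seen_pids = set()


def worker(count: int) -> None:
    global shared_total
    local_sum = 0         # local variable: private to this thread's stack
    for i in range(count):
        local_sum += i
    with lock:            # serialize updates to shared process state
        shared_total += local_sum
        seen_pids.add(os.getpid())


threads = [threading.Thread(target=worker, args=(100,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_total, len(seen_pids))
```

All four threads report the same process ID and update the same variable, while each keeps its own `local_sum`: one process, several sequential execution contexts.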

    Implicit and Explicit Multithreading

    • All commercial processors and most experimental ones use explicit multithreading
      – Concurrently execute instructions from different explicit threads
      – Interleave instructions from different threads on shared pipelines or parallel execution on parallel pipelines

    • Implicit multithreading is concurrent execution of multiple threads extracted from single sequential program
      – Implicit threads defined statically by compiler or dynamically by hardware

    Approaches to Explicit Multithreading

    • Interleaved
      – Fine-grained
      – Processor deals with two or more thread contexts at a time
      – Switching thread at each clock cycle
      – If thread is blocked it is skipped

    • Blocked
      – Coarse-grained
      – Thread executed until event causes delay
      – E.g. cache miss
      – Effective on in-order processor
      – Avoids pipeline stall

    • Simultaneous (SMT)
      – Instructions simultaneously issued from multiple threads to execution units of superscalar processor

    • Chip multiprocessing
      – Processor is replicated on a single chip
      – Each processor handles separate threads

    Scalar Processor Approaches

    • Single-threaded scalar
      – Simple pipeline
      – No multithreading

    • Interleaved multithreaded scalar
      – Easiest multithreading to implement
      – Switch threads at each clock cycle
      – Pipeline stages kept close to fully occupied
      – Hardware needs to switch thread context between cycles

    • Blocked multithreaded scalar
      – Thread executed until latency event occurs
      – Would stop pipeline
      – Processor switches to another thread
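
The three scalar approaches can be compared with a small cycle-by-cycle sketch. It is a simplified model under stated assumptions: one instruction issues per cycle, an `M` instruction blocks its thread for a hypothetical `MISS_PENALTY` cycles, thread switches cost nothing, and the policy names and instruction strings are invented for illustration.

```python
from typing import List

MISS_PENALTY = 3  # hypothetical stall, in cycles, after an 'M' instruction


def simulate(threads: List[str], policy: str) -> str:
    """Return the issue trace, one character per cycle.

    A digit is the index of the thread that issued; '-' is an idle cycle.
    In each thread string, 'I' is an ordinary instruction and 'M' is one
    that triggers a latency event (for example, a cache miss).
    """
    n = len(threads)
    pc = [0] * n          # next instruction index per thread
    ready_at = [0] * n    # first cycle at which each thread may issue again
    trace = []
    current = 0           # preferred thread for the next cycle
    cycle = 0
    while any(pc[t] < len(threads[t]) for t in range(n)):
        if policy == "single":
            # No multithreading: only the first unfinished thread may issue.
            candidates = [next(t for t in range(n) if pc[t] < len(threads[t]))]
        else:
            # Search round-robin, starting from the preferred thread.
            candidates = [(current + k) % n for k in range(n)]
        chosen = next(
            (t for t in candidates
             if pc[t] < len(threads[t]) and ready_at[t] <= cycle),
            None,
        )
        if chosen is None:
            trace.append("-")  # every candidate is blocked: pipeline idles
        else:
            op = threads[chosen][pc[chosen]]
            pc[chosen] += 1
            if op == "M":
                ready_at[chosen] = cycle + 1 + MISS_PENALTY
            trace.append(str(chosen))
            if policy == "interleaved":
                current = (chosen + 1) % n  # switch thread every cycle
            elif policy == "blocked":
                # Stay on this thread unless it just hit a latency event.
                current = (chosen + 1) % n if op == "M" else chosen
        cycle += 1
    return "".join(trace)


workload = ["IMII", "IIMI"]  # two hypothetical threads
for policy in ("single", "interleaved", "blocked"):
    print(policy, simulate(workload, policy))
```

In this model both multithreaded policies finish the workload in fewer cycles than the single-threaded run, because another thread issues during most of the cycles that would otherwise be stalls.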

    Scalar Diagrams

    Multiple Instruction Issue Processors

    • Superscalar
      – No multithreading

    • Interleaved multithreading superscalar:
      – Each cycle, as many instructions as possible issued from single thread
      – Delays due to thread switches eliminated
      – Number of instructions issued in cycle limited by dependencies

    • Blocked multithreaded superscalar
      – Instructions from one thread
      – Blocked multithreading used

    Multiple Instruction Issue Diagram

    Multiple Instruction Issue Processors

    • Very long instruction word (VLIW)
      – E.g. IA-64
      – Multiple instructions in single word
      – Typically constructed by compiler
      – Operations may be executed in parallel in same word
      – May pad with no-ops

    • Interleaved multithreading VLIW
      – Similar efficiencies to interleaved multithreading on superscalar architecture

    • Blocked multithreaded VLIW
      – Similar efficiencies to blocked multithreading on superscalar architecture

    Multiple Instruction Issue Diagram

    Parallel Simultaneous

    Parallel, Simultaneous Execution of Multiple Threads

    • Simultaneous multithreading
      – Issue multiple instructions at a time
      – One thread may fill all horizontal slots
      – Instructions from two or more threads may be issued
      – With enough threads, can issue maximum number of instructions on each cycle

    • Chip multiprocessor
      – Multiple processors
      – Each has two-issue superscalar processor
      – Each processor is assigned thread
        – Can issue up to two instructions per cycle per thread
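
The difference in how issue slots are filled can be sketched for a single cycle. The model is deliberately crude: `ready[t]` is an assumed count of independent instructions thread t could issue, slots are handed out greedily in thread order, and the function names and widths are hypothetical.

```python
from typing import List


def smt_issue(ready: List[int], width: int) -> List[int]:
    """Fill up to `width` issue slots in one cycle from several threads.

    Slots are shared: one thread may take all of them, or several
    threads may split them between themselves.
    """
    issued = []
    free = width
    for r in ready:
        take = min(r, free)
        issued.append(take)
        free -= take
    return issued


def cmp_issue(ready: List[int], width_per_core: int = 2) -> List[int]:
    """One thread per core; each core issues at most `width_per_core`."""
    return [min(r, width_per_core) for r in ready]


print(smt_issue([4, 1], width=4))     # one thread can fill every slot
print(smt_issue([1, 2, 1], width=4))  # several threads share the slots
print(cmp_issue([4, 1]))              # each core is capped at two
```

With the same total issue width, the shared-slot design can use capacity that a fixed per-core split leaves idle when one thread has more ready instructions than its core can issue.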

    Parallel Diagram

    Examples

    • Some Pentium 4 (single processor)
      – Intel calls it hyperthreading
      – SMT with support for two threads
      – Single multithreaded processor, logically two processors

    • IBM Power5
      – High-end PowerPC
      – Combines chip multiprocessing with SMT
      – Chip has two separate processors
      – Each supporting two threads concurrently using SMT