parallel job scheduling algorithms and interfaces

Post on 13-Jan-2016

DESCRIPTION

Parallel Job Scheduling Algorithms and Interfaces. Research Exam for Cynthia Bailey Lee, Department of Computer Science and Engineering, University of California, San Diego, May 27, 2004. Outline: Introduction (Problem Overview, Why does this matter?, Problem Specification), History

TRANSCRIPT

  • Parallel Job Scheduling Algorithms and Interfaces
    Research Exam for Cynthia Bailey Lee

    Department of Computer Science and Engineering
    University of California, San Diego
    May 27, 2004

  • Outline
    Introduction: Problem Overview; Why does this matter?; Problem Specification
    History: Early Approaches; Backfilling; Priorities
    Evaluation: Metrics; Metric Pitfalls; User Perspectives
    Future Directions

  • What Are We Trying to Do?
    Introduction: Problem Overview | Why Does This Matter? | Problem Specification
    Job: message-passing parallel scientific code (CFD visualization: www.science-computing.de/products/powerviz.html)
    System: BlueHorizon
    Job Model and System Model: [figures, with the label "Idle space"]

  • Why Does This Matter?
    Systems in the Top500 typically range in price from $1 million to $50 million or more.
    Top500 data: www.top500.org

    [Chart: "Top500 Supercomputers: Architecture Trends". Number of systems in the Top500 by architecture (Cluster, Constellation, MPP, SIMD, Single-Processor, SMP), June 1993 to November 2003. A second chart groups the same systems as Time-sharing, Space-sharing, and Other.]

  • Problem Specification
    Purpose: process a workload of parallel batch jobs
    Processor Homogeneity: the machine consists of N identical processors
    Job Specification: processors by requested runtime
    Exclusivity: jobs do not share processors
    Non-Preemption: once begun, jobs run to completion
    Online: jobs arrive stochastically, with no knowledge of the future
    Accounting: there is a scheme to track users' resource consumption
    User Independence: users are in competition for system resources

  • Outline (next section: History)

  • First Come First Serve (FCFS)
    History: Early Approaches | Backfilling | Priorities
    [Diagram: Jobs 1 to 4 taken from the queue and placed on a chart of processors versus time]
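
A minimal sketch of FCFS under the problem specification above (N identical processors, exclusive use, no preemption). The `Job` fields and the function name `fcfs_schedule` are illustrative assumptions, not taken from the slides, and the sketch assumes each job runs for exactly its requested runtime.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    procs: int       # processors requested
    runtime: int     # requested runtime (assumed equal to actual runtime)
    submit: int = 0  # arrival time


def fcfs_schedule(jobs, total_procs):
    """Return {job name: start time} when jobs start strictly in arrival order."""
    running = []  # min-heap of (end_time, procs) for jobs that hold processors
    free = total_procs
    now = 0
    starts = {}
    # sorted() is stable, so jobs with equal submit times keep their queue order.
    for job in sorted(jobs, key=lambda j: j.submit):
        if job.procs > total_procs:
            raise ValueError(f"{job.name} requests more processors than exist")
        now = max(now, job.submit)
        # The head job blocks everything behind it until enough processors free up.
        while free < job.procs:
            end, procs = heapq.heappop(running)
            now = max(now, end)
            free += procs
        starts[job.name] = now
        free -= job.procs
        heapq.heappush(running, (now + job.runtime, job.procs))
    return starts


jobs = [Job("J1", 2, 3), Job("J2", 4, 2), Job("J3", 1, 1), Job("J4", 2, 2)]
print(fcfs_schedule(jobs, total_procs=4))
# {'J1': 0, 'J2': 3, 'J3': 5, 'J4': 5}
```

In this made-up workload, J3 needs only one processor and two are idle at time 0, yet it waits until time 5 behind J2. That idle space is what backfilling tries to reclaim.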

  • Tennis Court Scheduling [M93, P04]
    [Diagram: Jobs 1 to 7 placed on a chart of processors versus time]

  • EASY Backfilling [SCZL96]
    Allow backfills when the projected start of the first job in the queue is not delayed.
    No starvation: all jobs will eventually run.
    Claim: jobs in the queue are never delayed from running by jobs submitted to the queue after them. Disproved [MF01].
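
The EASY rule can be sketched as a small event-driven simulation. This is an illustration under simplifying assumptions, not the scheduler from [SCZL96]: every job runs for exactly its requested runtime, and the names `Job`, `easy_backfill`, `shadow`, and `extra` are chosen here for the example. Only the first queued job holds a reservation; a later job may jump ahead if it fits in the free processors now and either finishes by the reserved start (`shadow`) or uses only processors the first job will not need (`extra`).

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    procs: int       # processors requested
    runtime: int     # requested runtime (assumed equal to actual runtime)
    submit: int = 0  # arrival time


def easy_backfill(jobs, total_procs):
    """Return {job name: start time} under an EASY-style backfilling rule."""
    if any(j.procs > total_procs for j in jobs):
        raise ValueError("a job requests more processors than exist")
    pending = sorted(jobs, key=lambda j: j.submit)  # not yet arrived
    queue, running, starts = [], [], {}             # running holds (end, procs)
    now = 0

    def launch(job):
        starts[job.name] = now
        running.append((now + job.runtime, job.procs))

    while pending or queue:
        while pending and pending[0].submit <= now:
            queue.append(pending.pop(0))
        running = [(end, p) for end, p in running if end > now]
        free = total_procs - sum(p for _, p in running)

        # 1. Start jobs from the head of the queue while they fit.
        while queue and queue[0].procs <= free:
            job = queue.pop(0)
            launch(job)
            free -= job.procs

        # 2. Reserve a start time for the blocked head job, then backfill.
        if queue:
            head = queue[0]
            avail, shadow = free, now
            for end, p in sorted(running):
                avail, shadow = avail + p, end
                if avail >= head.procs:
                    break
            extra = avail - head.procs  # processors the head job leaves unused
            for job in queue[1:]:       # iterate over a copy; queue is mutated
                if job.procs > free:
                    continue
                if now + job.runtime > shadow:
                    if job.procs > extra:
                        continue        # would delay the head job's projected start
                    extra -= job.procs
                queue.remove(job)
                launch(job)
                free -= job.procs

        # 3. Jump to the next completion or arrival.
        events = [end for end, _ in running]
        if pending:
            events.append(pending[0].submit)
        if events:
            now = min(events)
    return starts


jobs = [Job("J1", 2, 3), Job("J2", 4, 2), Job("J3", 1, 1), Job("J4", 2, 2)]
print(easy_backfill(jobs, total_procs=4))
# {'J1': 0, 'J3': 0, 'J4': 1, 'J2': 3}
```

In this made-up workload the blocked job J2 still starts at time 3, as it would without backfilling, while J3 and J4 run in the space in front of it.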

  • Conservative Backfilling
    Allow backfills when the projected starts of all preceding jobs in the queue are not delayed.
    Worst-case start time guaranteed at submittal.
    Claim: guarantees that future arrivals do not delay previously queued jobs [MF01]. Disproved, depending on the semantics of "delay" [JSC01].
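
A minimal sketch of the conservative rule, assuming (as a simplification not stated on the slide) that every job runs for exactly its requested runtime, so reservations never need to be compressed. Each job receives a reservation when it is submitted, at the earliest time that leaves all earlier reservations untouched. The names `Job` and `conservative_backfill` are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    procs: int       # processors requested
    runtime: int     # requested runtime (assumed equal to actual runtime)
    submit: int = 0  # arrival time


def conservative_backfill(jobs, total_procs):
    """Return {job name: reserved start time}, fixed at submittal."""
    reservations = []  # (start, end, procs) for every job seen so far
    starts = {}

    def used_at(t):
        return sum(p for s, e, p in reservations if s <= t < e)

    def fits(job, t):
        end = t + job.runtime
        # Usage only rises where a reservation starts, so checking t and
        # every reservation start inside (t, end) covers the whole interval.
        checkpoints = [t] + [s for s, _, _ in reservations if t < s < end]
        return all(used_at(c) + job.procs <= total_procs for c in checkpoints)

    for job in sorted(jobs, key=lambda j: j.submit):
        if job.procs > total_procs:
            raise ValueError(f"{job.name} requests more processors than exist")
        # The earliest feasible start is the submit time or the end of some
        # existing reservation; the latest such end is always feasible.
        candidates = sorted(
            {job.submit} | {e for _, e, _ in reservations if e > job.submit}
        )
        start = next(t for t in candidates if fits(job, t))
        reservations.append((start, start + job.runtime, job.procs))
        starts[job.name] = start
    return starts


jobs = [Job("J1", 2, 3), Job("J2", 4, 2), Job("J3", 1, 1), Job("J4", 2, 2)]
print(conservative_backfill(jobs, total_procs=4))
# {'J1': 0, 'J2': 3, 'J3': 0, 'J4': 1}
```

Because a reservation is never moved later in this sketch, the start time returned for each job is the worst-case start time it was promised at submittal.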

  • Maui Scheduler [JS01]
    Priorities: a funct
