Faucets: A Framework for Developing Cluster and Grid Scheduling Solutions

Presented by Esteban Pauli, Parallel Programming Lab, UIUC
Charm++ Workshop 2005, October 18, 2005


Page 1: Faucets

A Framework for Developing Cluster and Grid Scheduling Solutions

Presented by Esteban Pauli, Parallel Programming Lab, UIUC
Charm++ Workshop 2005, October 18, 2005

Page 2: Outline

Motivation and Goals
System Overview
Meta Scheduler
Cluster Scheduler
Conclusions
Future Work

Page 3: Motivation

Clusters are becoming ubiquitous
Workloads come in bursts, resulting in alternation between low and high utilization
Need framework for sharing computing power
Traditional schedulers care about throughput, not deadlines, priorities, etc.

Page 4: Goals

Provide technical and economic framework for allowing organizations to share their resources (clusters)
Provide new cluster scheduler which facilitates the above
Provide platform for implementing new scheduling strategies

Page 5: System Overview

Faucets consists of two main components: meta scheduler and cluster scheduler
  • Meta scheduler provides mechanism for discovering and sharing resources
  • Cluster scheduler makes scheduling decisions based on local and global workload
Components interact to meet users' needs

Page 6: System Architecture

[Diagram labels: Users; Central Server with Database; two Clusters, each with a Cluster Daemon and a Scheduler]


Page 8: The Faucets Meta Scheduler

[Diagram labels: Job Submission, Job Monitor, Job Specs, Bids, Job Id, and three Clusters]

Page 9: The Faucets Meta Scheduler

Users provide job requirements
  • System requirements: architecture, number of processors, minimum memory, etc.
  • Software requirements: utilities, dynamic libraries, packages, etc.
  • Contract requirements: deadline, reliability, maximum price, etc.
  • Use XML, easily expandable
Clusters bid on job
Winning bidder executes job
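The slide does not show the actual XML format, so the following is only a sketch of what a job specification with the three requirement groups might look like and how it could be read. Every element and attribute name here (`job`, `system`, `software`, `contract`, `min_memory_mb`, and so on) and the helper `parse_job_spec` are hypothetical, not the Faucets schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical job specification; the real Faucets schema is not shown in the talk.
JOB_SPEC = """
<job name="example">
  <system arch="x86" processors="16" min_memory_mb="512"/>
  <software>
    <package>fftw</package>
  </software>
  <contract deadline="2005-10-19T12:00:00" max_price="1000"/>
</job>
"""


def parse_job_spec(text):
    """Read the system, software, and contract requirements into a dict."""
    root = ET.fromstring(text.strip())
    system = root.find("system")
    contract = root.find("contract")
    return {
        "arch": system.get("arch"),
        "processors": int(system.get("processors")),
        "min_memory_mb": int(system.get("min_memory_mb")),
        "packages": [p.text for p in root.findall("software/package")],
        "deadline": contract.get("deadline"),
        "max_price": float(contract.get("max_price")),
    }


spec = parse_job_spec(JOB_SPEC)
```

Because the requirements are looked up by name, a new requirement could be added as an extra attribute or element without breaking readers that ignore it, which is one plausible reading of "easily expandable".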

Page 10: The Faucets Meta Scheduler

Bidding requires no user intervention
Clusters bid on jobs based on current conditions
  • Local utilization
  • Account balances
Depending on scheduling strategy, might not be able to accept all jobs
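As an illustration of bidding on current conditions, here is a minimal sketch in which each cluster prices a job from its utilization and account balance, and may decline. The pricing formula, the 25% discount for clusters in debt, the rule that the lowest price wins, and all names (`ClusterState`, `make_bid`, `pick_winner`) are assumptions made for this example; the talk does not specify them.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    name: str
    total_pes: int   # processors in the cluster
    busy_pes: int    # processors currently in use
    balance: float   # cluster's account balance


def make_bid(cluster, pes_needed, max_price, base_price=100.0):
    """Return a price for the job, or None to decline (assumed policy)."""
    if cluster.total_pes - cluster.busy_pes < pes_needed:
        return None                       # cannot accept the job right now
    utilization = cluster.busy_pes / cluster.total_pes
    price = base_price * (1.0 + utilization)   # busier clusters charge more
    if cluster.balance < 0:
        price *= 0.75                     # a cluster in debt bids lower to earn credit
    return price if price <= max_price else None


def pick_winner(clusters, pes_needed, max_price):
    """Collect bids without user intervention; the lowest price wins (assumption)."""
    offers = []
    for c in clusters:
        price = make_bid(c, pes_needed, max_price)
        if price is not None:
            offers.append((price, c.name))
    return min(offers) if offers else None


clusters = [
    ClusterState("A", total_pes=64, busy_pes=32, balance=100.0),
    ClusterState("B", total_pes=32, busy_pes=8, balance=-200.0),
    ClusterState("C", total_pes=16, busy_pes=16, balance=0.0),
]
winner = pick_winner(clusters, pes_needed=8, max_price=1000.0)
```

In this example cluster C is fully utilized and declines, which corresponds to the slide's point that a cluster might not be able to accept every job.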

Page 11: The Faucets Meta Scheduler

Both users and clusters have account balances
Cluster administrators decide how to share balance among users

Balances held at the Central Server:
  Cluster 1: Shared (10000), Bob (100)
  Cluster 2: Shared (-200), John (0), Joan (0)

Page 12: The Faucets Meta Scheduler

Balances held at the Central Server:
  Cluster 1: Shared (10000), Bob (100)
  Cluster 2: Shared (-200), John (0), Joan (0)

Bob runs job worth 1000 units on Cluster 2

Page 13: The Faucets Meta Scheduler

Balances held at the Central Server:
  Cluster 1: Shared (9100), Bob (0)
  Cluster 2: Shared (-200), John (0), Joan (0)

Bob's account drained, remaining 900 units come from shared pool

Page 14: The Faucets Meta Scheduler

Balances held at the Central Server:
  Cluster 1: Shared (9100), Bob (0)
  Cluster 2: Shared (-200), John (0), Joan (0)

Cluster 2's policy: 50% to shared, rest divided equally

Page 15: The Faucets Meta Scheduler

Balances held at the Central Server:
  Cluster 1: Shared (9100), Bob (0)
  Cluster 2: Shared (300), John (250), Joan (250)

Cluster 2's shared balance up 500, John & Joan get 250 each

Page 16: The Faucets Meta Scheduler

Balances held at the Central Server:
  Cluster 1: Shared (9100), Bob (0)
  Cluster 2: Shared (300), John (250), Joan (250)

Global balance remains unchanged

Page 17: The Faucets Meta Scheduler

Balances held at the Central Server:
  Cluster 1: Shared (9100), Bob (0)
  Cluster 2: Shared (300), John (250), Joan (250)

Have limits to negative balances to prevent freeloading
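The account-balance walkthrough can be replayed in a few lines. This is a sketch only: the starting balances and the 50% policy come from the slides, while the data layout, the function names `run_job` and `global_balance`, the `debt_limit` value of -1000, and the choice to apply that limit to the paying cluster's shared pool are assumptions.

```python
def run_job(accounts, user, home, host, price,
            shared_fraction=0.5, debt_limit=-1000):
    """Charge `user` of cluster `home` for a job run on cluster `host`.

    accounts: {cluster: {"shared": balance, "users": {name: balance}}}
    debt_limit is a hypothetical floor on the paying cluster's shared pool.
    """
    payer = accounts[home]
    from_user = min(max(payer["users"][user], 0), price)  # drain the user first
    from_shared = price - from_user                       # remainder from shared pool
    if payer["shared"] - from_shared < debt_limit:
        raise ValueError("job rejected: negative-balance limit exceeded")
    payer["users"][user] -= from_user
    payer["shared"] -= from_shared

    earner = accounts[host]
    users = earner["users"]
    to_shared = price * shared_fraction if users else price
    earner["shared"] += to_shared
    for name in users:                                    # rest divided equally
        users[name] += (price - to_shared) / len(users)


def global_balance(accounts):
    """Sum of every shared pool and every user account."""
    return sum(c["shared"] + sum(c["users"].values()) for c in accounts.values())


accounts = {
    "Cluster 1": {"shared": 10000, "users": {"Bob": 100}},
    "Cluster 2": {"shared": -200, "users": {"John": 0, "Joan": 0}},
}
before = global_balance(accounts)
run_job(accounts, "Bob", home="Cluster 1", host="Cluster 2", price=1000)
after = global_balance(accounts)
```

Every unit charged to the payer is credited to the host cluster, so the transfer is zero-sum, which is why the global balance does not change.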


Page 19: Cluster Scheduler

Traditional schedulers are concerned only with throughput: they try to have the highest possible utilization
Faucets cluster scheduler provides different strategies to allow more efficient bidding
Leverage run-time systems
Flexible design allows for easy implementation of new strategies

Page 20: Deadline-Driven Scheduling (Gantt Chart)

Schedule based on number of processors, deadline, wall-time
As new jobs arrive, reschedule meeting all demands
Allows bidding based on deadline: can charge different amounts based on user's flexibility
Can leverage Charm++ runtime system to shrink and expand jobs

Jobs in the example:
  Job 1, 4 PEs, 4 time slices
  Job 2, 2 PEs, 6 time slices
  Job 3, 3 PEs, 3 time slices
  New Job, 2 PEs, 7 time slices

[Gantt charts over processors P1 to P4 and time slices 1 to 14: Original schedule and New schedule]
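A minimal sketch of the admission test behind such a Gantt chart: place jobs in earliest-deadline order at the first slot with enough free processors, and accept a new job only if every deadline is still met. The job sizes come from the slide, but the deadlines are hypothetical (the slide does not list them). The greedy placement is a simple heuristic rather than the Faucets algorithm, and it treats jobs as rigid, so it does not model shrinking or expanding them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    pes: int        # processors required
    slices: int     # wall-time in time slices
    deadline: int   # job must finish by the end of this slice (hypothetical values)


def plan(jobs, total_pes):
    """Return {job name: start slice} if all deadlines can be met, else None."""
    horizon = max((j.deadline for j in jobs), default=0)
    used = [0] * horizon                      # busy processors per time slice
    starts = {}
    for job in sorted(jobs, key=lambda j: j.deadline):
        start = None
        # Latest feasible start is deadline - slices; try the earliest first.
        for t in range(0, job.deadline - job.slices + 1):
            window = range(t, t + job.slices)
            if all(used[s] + job.pes <= total_pes for s in window):
                start = t
                break
        if start is None:
            return None                       # some deadline cannot be met
        for s in range(start, start + job.slices):
            used[s] += job.pes
        starts[job.name] = start
    return starts


running = [
    Job("Job 1", pes=4, slices=4, deadline=4),
    Job("Job 2", pes=2, slices=6, deadline=10),
    Job("Job 3", pes=3, slices=3, deadline=14),
]
new_job = Job("New Job", pes=2, slices=7, deadline=13)

original = plan(running, total_pes=4)
rescheduled = plan(running + [new_job], total_pes=4)
```

Under these assumed deadlines the new job is admitted by starting Job 3 one slice later. A scheduler that returns `None` here would decline to bid, and one that has slack could price the bid according to how flexible the user's deadline is.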

Page 21: Priority-Driven Scheduling

Leverage Charm++ and other checkpoint/restart mechanisms
Priority can be based on rank (military, institutional, etc.), price paid, or other factors

Jobs in the example:
  Job 1, normal priority
  Job 2, normal priority
  Job 3, high priority

[Gantt charts over processors P1 to P4: Original Schedule and New Schedule, with the events "Job 3 arrives" and "Job 3 terminates" marked]
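One way to picture the preemption step is the sketch below: when a higher-priority job arrives, the lowest-priority running jobs are selected for checkpointing until enough processors are free. The processor counts, the numeric priority convention, and the names `RunningJob` and `admit` are assumptions for illustration. The actual checkpoint and restart would be done by Charm++ or another mechanism and is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class RunningJob:
    name: str
    pes: int        # processors held (hypothetical values below)
    priority: int   # larger means more important (assumed convention)


def admit(running, newcomer, total_pes):
    """Return (jobs to checkpoint, jobs running afterwards), or None if the
    newcomer has to wait. Only strictly lower-priority jobs are preempted."""
    free = total_pes - sum(j.pes for j in running)
    victims = []
    for job in sorted(running, key=lambda j: j.priority):
        if free >= newcomer.pes or job.priority >= newcomer.priority:
            break
        victims.append(job)                   # checkpoint this job
        free += job.pes
    if free < newcomer.pes:
        return None
    survivors = [j for j in running if j not in victims]
    return victims, survivors + [newcomer]


running = [RunningJob("Job 1", pes=2, priority=0),
           RunningJob("Job 2", pes=2, priority=0)]
job3 = RunningJob("Job 3", pes=2, priority=1)

checkpointed, now_running = admit(running, job3, total_pes=4)
```

When the high-priority job terminates, the checkpointed jobs would be restarted from their saved state on the freed processors.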


Page 23: Conclusions

Clusters are becoming more common; Faucets provides economic and technical framework for sharing
Flexible cluster scheduler allows scheduling based on deadlines, priorities, etc.
Cluster scheduler leverages run-time systems to increase functionality

Page 24: Future Work

NCSA Faculty Fellowship: on-demand access
How do we control anonymous access?
  • Only allow pre-selected applications
  • Virtual machines (Xen, VMware, etc.)
Leverage virtualization to allow processor sharing
Re-architect Faucets to make it more robust and make strategies easier to write

Page 25: Questions?

Page 26: Thanks!