Download - Dynamic Fractional Resource Scheduling -- 2010, ARCS


Dynamic Fractional Resource Scheduling for Cluster Platforms

Mark Stillwell

Department of Information and Computer Sciences, University of Hawai’i at Manoa

Achievement Rewards for College Scientists, 2010 Scholarship Awards Program

Mark Stillwell UH Manoa ICS

DFRS for Clusters


Clusters

Definition: A cluster is a group of independent computers, or nodes, working together closely, usually connected by a high-speed network.




Jobs

- The system can accept user requests to run jobs, or the administrator can instantiate jobs directly
- Running jobs are made up of nearly identical tasks
  - The number of tasks is specified by the user/administrator
  - Tasks can block while communicating with each other
- The assignment of resources to tasks is called scheduling




Current Approaches

- Service Hosting (Off-Line Scheduling)
  - Traditionally, dedicated machines
  - Current interest in server consolidation
  - Few good theoretical models
  - Heavy “engineering” bias
- High-Performance Computing (On-Line Scheduling)
  - Usually First-Come-First-Served (FCFS) with backfilling
  - Backfilling needs (unreliable) compute time estimates
  - Unbounded wait times
  - Inefficient use of nodes/resources




Our Proposal

- Use virtual machine technology
  - Multiple tasks on one node
  - Performance isolation
  - Sharing of fractional resources
- Define a run-time computable metric that captures notions of performance and fairness
- Design heuristics that allocate resources to jobs while explicitly trying to achieve high ratings by our metric




Requirements, Needs, and Yield

- Tasks have memory requirements and CPU needs
- All tasks of a job have the same requirements and needs
- For a task to be placed on a node, the node must have available memory at least equal to the task's requirement
- A task can be allocated less CPU than its need; the ratio of the allocation to the need is the yield
- All tasks of a job must have the same yield, so we can also speak of the yield of a job
- The yield of a job gives its performance relative to running on a dedicated system
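The definitions above can be illustrated with a minimal Python sketch. The names `Task`, `fits`, and `task_yield` are hypothetical, resources are expressed as fractions of one node, and capping the yield at 1 is an assumption (a task is taken to gain nothing from CPU beyond its need).

```python
from dataclasses import dataclass


@dataclass
class Task:
    mem_requirement: float  # fraction of a node's memory the task must have
    cpu_need: float         # fraction of a node's CPU the task would use if unconstrained


def fits(task: Task, free_memory: float) -> bool:
    """Memory is a hard constraint: the full requirement must be available."""
    return free_memory >= task.mem_requirement


def task_yield(task: Task, cpu_allocation: float) -> float:
    """Yield = CPU allocation / CPU need, capped at 1 (assumption)."""
    if task.cpu_need <= 0:
        raise ValueError("cpu_need must be positive")
    return min(1.0, cpu_allocation / task.cpu_need)


if __name__ == "__main__":
    t = Task(mem_requirement=0.5, cpu_need=0.5)
    print(fits(t, free_memory=0.25))          # memory too small, cannot be placed
    print(task_yield(t, cpu_allocation=0.25))  # half of the CPU need is granted
```

A task given half of its CPU need has yield 0.5, which under the slide's interpretation means it runs at roughly half the speed it would reach on a dedicated system.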




Off-Line Problem Assumptions

- Steady-state execution with infinite jobs
  - Models an ideal service hosting environment
  - Makes the problem more tractable [Marchal et al., 2006]
  - Good when job duration is longer than schedule time
- Jobs are serial (single task)




Objective Function

- The performance of a job is correlated with its yield
- Maximizing the average or sum of the yields may be unfair to some jobs
- Instead, we seek to maximize the minimum yield
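The fairness argument can be seen on a toy case. The sketch below is an illustration under simplifying assumptions, not the method evaluated in the talk: a single node with CPU capacity 1, memory ignored, and hypothetical function names. It compares an allocation that maximizes the sum of yields with one that maximizes the minimum yield.

```python
def max_min_allocation(needs, capacity=1.0):
    """Give every job the same yield y = min(1, capacity / sum(needs))."""
    y = min(1.0, capacity / sum(needs))
    return [y * n for n in needs]


def max_sum_allocation(needs, capacity=1.0):
    """Maximize the sum of yields: each unit of CPU is worth 1/need,
    so jobs are served in order of increasing need."""
    alloc = [0.0] * len(needs)
    remaining = capacity
    for i in sorted(range(len(needs)), key=lambda i: needs[i]):
        alloc[i] = min(needs[i], remaining)
        remaining -= alloc[i]
    return alloc


def yields(needs, alloc):
    """Per-job yield: allocation divided by need."""
    return [a / n for a, n in zip(alloc, needs)]


if __name__ == "__main__":
    needs = [0.25, 0.25, 1.0]  # total need 1.5 exceeds the capacity of 1
    for name, fn in [("max-sum", max_sum_allocation), ("max-min", max_min_allocation)]:
        ys = yields(needs, fn(needs))
        print(name, "yields:", ys, "sum:", sum(ys), "min:", min(ys))
```

With these needs, the sum-maximizing allocation serves the two small jobs fully and leaves the large job at yield 0.5, while the max-min allocation gives every job a yield of about 0.67.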




Task Placement Heuristics

- Greedy Task Placement: incremental; puts each task on the node with the lowest computational load on which it can fit without violating memory constraints
- Randomized Rounding Task Placement: based on relaxing an MILP to an LP and rounding probabilistically
- MCB Task Placement: global; iteratively applies multi-capacity (vector) bin-packing heuristics during a binary search for the maximized minimum yield
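The greedy rule described above can be sketched as follows. This is a minimal illustration, assuming identical nodes with memory and CPU capacity 1 and tasks considered in the order given; the function names and the tie-breaking rule (lowest node index) are assumptions, not details taken from the talk.

```python
def greedy_placement(tasks, num_nodes):
    """tasks: list of (mem_requirement, cpu_need) pairs.
    Returns one node index per task, or None if some task fits nowhere."""
    free_mem = [1.0] * num_nodes
    load = [0.0] * num_nodes  # sum of CPU needs placed on each node
    placement = []
    for mem, cpu in tasks:
        # Only nodes with enough free memory are candidates.
        candidates = [n for n in range(num_nodes) if free_mem[n] >= mem]
        if not candidates:
            return None
        # Pick the least loaded candidate (ties go to the lowest index).
        node = min(candidates, key=lambda n: load[n])
        free_mem[node] -= mem
        load[node] += cpu
        placement.append(node)
    return placement


def min_yield(tasks, placement, num_nodes):
    """Minimum yield when each node shares its CPU proportionally to need."""
    load = [0.0] * num_nodes
    for (_, cpu), node in zip(tasks, placement):
        load[node] += cpu
    return min(1.0, 1.0 / max(load)) if max(load) > 0 else 1.0


if __name__ == "__main__":
    tasks = [(0.5, 0.75), (0.25, 0.75), (0.5, 0.5)]
    placement = greedy_placement(tasks, num_nodes=2)
    print(placement, min_yield(tasks, placement, num_nodes=2))
```

Because the rule never revisits earlier decisions, it is fast but can fail to place a task that a global method would accommodate.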




Large Problem Set: Minimum Yield vs. Free Memory

[Figure: minimum yield (y-axis, 0.3 to 0.55) as a function of slack (x-axis, 0.1 to 0.9). Series: bound, MCB8, GR, GB, SG, SGB, RRND, RRNZ.]




Conclusions

- The problem is tractable
- Multi-capacity bin packing algorithms perform well
- Greedy algorithms perform okay, and are very fast
- Randomized rounding approaches are not very promising
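The bin-packing approach combines a binary search on the yield with a vector packing test. The sketch below shows that structure only, under stated assumptions: a simple first-fit packing (largest items first) is swapped in for the multi-capacity heuristics used in the talk, nodes have capacity 1 in both memory and CPU, and all names are hypothetical. Since a heuristic packer is not guaranteed to be monotone in the yield, the result is a heuristic value rather than a proven optimum.

```python
def first_fit_pack(items, num_nodes):
    """Swapped-in vector packing: first-fit, largest items first.
    items: list of (mem, cpu) demands. Returns True if all items fit."""
    free = [[1.0, 1.0] for _ in range(num_nodes)]
    for mem, cpu in sorted(items, key=lambda it: max(it), reverse=True):
        for node in free:
            if node[0] >= mem and node[1] >= cpu:
                node[0] -= mem
                node[1] -= cpu
                break
        else:
            return False  # no node could hold this item
    return True


def best_min_yield(tasks, num_nodes, tol=1e-3):
    """Binary search for the largest yield y such that all tasks, with CPU
    demand scaled to y * cpu_need, can be packed. Returns None if the
    memory requirements alone cannot be packed."""
    if not first_fit_pack([(m, 0.0) for m, _ in tasks], num_nodes):
        return None
    if first_fit_pack(tasks, num_nodes):
        return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if first_fit_pack([(m, mid * c) for m, c in tasks], num_nodes):
            lo = mid  # packing succeeded, try a higher yield
        else:
            hi = mid  # packing failed, lower the yield
    return lo


if __name__ == "__main__":
    tasks = [(0.5, 1.0), (0.5, 1.0)]
    print(best_min_yield(tasks, num_nodes=1))
    print(best_min_yield(tasks, num_nodes=2))
```

Memory demands are never scaled, because memory is a hard requirement; only the CPU dimension shrinks with the candidate yield.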




On-Line Problem Assumptions

- Job arrival/completion times are not known in advance
- Jobs are temporary
  - The user wants a final result
  - Quick turnaround relative to runtime is desired
- Jobs are not interactive
  - So they can wait until resources are available to start
- We avoid the use of runtime estimates




Stretch

- Our goal: minimize maximum stretch (aka slowdown)
  - Stretch: the time a job spends in the system divided by the time that would be spent in a dedicated system [Bender et al., 1998]
  - Popular for quantifying schedule quality post-mortem
  - Not generally used to make scheduling decisions
  - Runtime computation requires (unreliable) user estimates
  - Minimizing average stretch is prone to starvation
  - Minimizing maximum stretch captures notions of both performance and fairness
- Our approach: try to maximize minimum yield
  - Similar to, but not the same as, minimizing maximum stretch
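The stretch definition above, and its link to yield, can be written out as a short sketch. The function names are hypothetical, and the last function rests on an assumption: a job that starts immediately and progresses at a rate proportional to a constant yield y finishes in runtime / y, giving a stretch of 1 / y. Waiting time and changing yields break this equivalence, which is one reason the two objectives are similar but not identical.

```python
def stretch(submit_time, finish_time, dedicated_runtime):
    """Time spent in the system divided by the runtime on a dedicated system."""
    if dedicated_runtime <= 0:
        raise ValueError("dedicated_runtime must be positive")
    return (finish_time - submit_time) / dedicated_runtime


def max_stretch(jobs):
    """jobs: iterable of (submit_time, finish_time, dedicated_runtime)."""
    return max(stretch(*job) for job in jobs)


def stretch_at_constant_yield(y):
    """Assumption: no waiting and progress proportional to a constant yield y."""
    if not 0 < y <= 1:
        raise ValueError("yield must be in (0, 1]")
    return 1.0 / y


if __name__ == "__main__":
    jobs = [(0, 30, 10), (5, 15, 10)]  # post-mortem data: dedicated runtimes are known
    print([stretch(*j) for j in jobs], max_stretch(jobs))
    print(stretch_at_constant_yield(0.5))
```

Note that `stretch` needs the dedicated runtime, which is only known after the job completes (or from a user estimate), whereas the yield can be computed while the job runs.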




Max Stretch Degradation vs. Load, no migration cost

[Figure: stretch degradation factor (y-axis, logarithmic, 1 to 1000) as a function of load (x-axis, 0.1 to 0.9). Series: EASY, FCFS, GREEDY, GREEDYP, GREEDYPM, DynMCB8, MCB8P 600, MCB8PG 600.]




Conclusions

- DFRS approaches can significantly outperform traditional approaches
- Aggressive repacking can lead to much better resource allocations
  - But also to heavy migration costs
- Greedy migration is not that useful




Summary

- We have proposed a novel approach to job scheduling on clusters, Dynamic Fractional Resource Scheduling, that makes use of modern virtual machine technology and seeks to optimize a runtime-computable, user-centric measure of performance called the minimum yield
- Multi-capacity bin packing heuristics can be used to find good solutions
- Our approach avoids the use of unreliable runtime estimates
- This approach has the potential to lead to order-of-magnitude improvements in performance over current technology




References I

Bender, M. A., Chakrabarti, S., and Muthukrishnan, S. (1998). Flow and Stretch Metrics for Scheduling Continuous Job Streams. In Proc. of the 9th ACM-SIAM Symp. on Discrete Algorithms, pages 270–279.

Marchal, L., Yang, Y., Casanova, H., and Robert, Y. (2006). Steady-state scheduling of multiple divisible load applications on wide-area distributed computing platforms. Intl. J. of High Performance Computing Applications, 20(3):365–381.




References II

Stillwell, M., Schanzenbach, D., Vivien, F., and Casanova, H. (2009). Resource Allocation using Virtual Clusters. In CCGrid, pages 260–267. IEEE.

Stillwell, M., Vivien, F., and Casanova, H. (2010). Dynamic fractional resource scheduling for HPC workloads. In IPDPS. To appear.

