Size-Based Scheduling Policies with Inaccurate Scheduling Information

Dong Lu*, Huanyuan Sheng+, Peter A. Dinda*

*Prescience Lab, Dept. of Computer Science

+Dept. of Industrial Engineering & Management Science

Northwestern University

Evanston, IL 60201 USA

Outline

• Review of size-based scheduling

• Motivation

• Simulation Setup

• Simulation Results

• New applications

Non-size-based scheduling

• FCFS, PS, etc.

• FCFS: First Come First Served
  – Intuitive
  – Easiest to implement

• PS: Processor Sharing
  – Fair: all jobs receive an equal share of the resource
  – Also easy to implement

Problem: both are unaware of job size information, which results in a large mean response time

Review of size-based scheduling

• SRPT, FSP, etc.

• Utilize job size (processing time, service time) information for scheduling
  – Optimal in mean response time
  – Fair?
  – Easy to implement?

We use Job Size to refer to the Processing Time (Service Time) of the job

Shortest Remaining Processing Time (SRPT)

• Preemptive scheduling: always serve the job with the minimum remaining processing time first

• Yields minimum mean response time [Schrage, Operations Research, 1968]

• Performance gains of SRPT over PS do not usually come at the expense of large jobs; in other words, SRPT is fair for heavy-tailed job size distributions [Bansal and Harchol-Balter, Sigmetrics ‘01]

• Easy to implement?
  – With accurate a priori job size information, YES
  – Otherwise, NO
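
The SRPT rule above can be sketched as a small event-driven simulation. This is a minimal illustration for a single unit-speed server, assuming exact job sizes are known at arrival; the function name `srpt_response_times` and the `(arrival_time, size)` input format are hypothetical and are not taken from the authors' simulator.

```python
import heapq

def srpt_response_times(jobs):
    """Simulate preemptive SRPT on one unit-speed server.

    jobs: list of (arrival_time, size) tuples (hypothetical format).
    Returns the response time of each job, in input order.
    """
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][0])
    response = [0.0] * len(jobs)
    ready = []            # min-heap of (remaining_work, job_index)
    now, nxt = 0.0, 0     # current time, position of next arrival in `order`
    while nxt < len(order) or ready:
        if not ready:
            # Server is idle: jump ahead to the next arrival.
            now = max(now, jobs[order[nxt]][0])
        # Admit every job that has arrived by `now`.
        while nxt < len(order) and jobs[order[nxt]][0] <= now:
            i = order[nxt]
            heapq.heappush(ready, (jobs[i][1], i))
            nxt += 1
        # Serve the job with the least remaining work.
        remaining, i = heapq.heappop(ready)
        next_arrival = jobs[order[nxt]][0] if nxt < len(order) else float("inf")
        gap = next_arrival - now
        if remaining <= gap:
            # The job finishes before anything new arrives.
            now += remaining
            response[i] = now - jobs[i][0]
        else:
            # Run until the next arrival, then re-evaluate (possible preemption).
            now = next_arrival
            heapq.heappush(ready, (remaining - gap, i))
    return response
```

For jobs `(0, 10)`, `(1, 2)`, `(2, 1)`, the two short jobs preempt the long one and the response times are 13, 2, and 2.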

Fair Sojourn Protocol (FSP)

• Combines SRPT with PS; preemptive scheduling

• Mean response time is close to that of SRPT, and FSP is fairer than PS [Friedman, et al, Sigmetrics ‘03]

• Easy to implement?
  – With accurate a priori job size information, YES
  – Otherwise, NO

Motivation

• Size-based scheduling requires accurate knowledge of job sizes

• In practice, a priori job size information is not always available

• All the previous work assumes perfect knowledge of job sizes a priori

• How does performance depend on quality of job size information?

Correlation

We study the performance of size-based schedulers as a function of the correlation coefficient (Pearson’s R) between actual job sizes and estimated job sizes.
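
As a reminder of what Pearson's R measures, the sketch below computes it directly from its definition (covariance divided by the product of the standard deviations). The function name `pearson_r` and the sample values are illustrative assumptions.

```python
import math

def pearson_r(actual, estimated):
    """Pearson's R between two equal-length sequences of job sizes."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_e = sum(estimated) / n
    # Unnormalized covariance and variances; the 1/n factors cancel in the ratio.
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, estimated))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / math.sqrt(var_a * var_e)

# Hypothetical sizes: an estimator that is always off by a factor of two
# is still perfectly correlated with the actual sizes.
actual = [1.0, 2.0, 3.0, 4.0]
doubled = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(actual, doubled)
```

Note that R captures how well the estimates track the actual sizes linearly, not how close they are in absolute terms.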

Outline

• Review of size-based scheduling

• Motivation

• Simulation Setup

• Simulation Results

• New applications

Simulation Setup: Trace generator

Inputs to the trace generator: correlation (Pearson’s R), distribution A, distribution B

Output: correlated random pairs of X and Y, for example:

X    Y
1    100
5    300
.    .
.    .

• X has distribution A

• Y has distribution B

• X and Y are correlated to R

Simulation Setup: Trace generator

• Algorithm: “Normal-To-Anything”
  – First developed by Cario and Nelson, INFORMS Journal on Computing 10, 1 (1998)
  – We simplified the algorithm and were the first to apply it in simulation studies of computer systems
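
The core idea of Normal-To-Anything can be sketched as follows. This is not the authors' simplified algorithm, only a minimal illustration: draw a correlated standard normal pair, map each component to a uniform value through the normal CDF, then apply the inverse CDF of the target distribution. The function names, the exponential and Pareto marginals, and the parameter `rho_z` are assumptions made for the example. `rho_z` is the correlation of the underlying normal pair; the resulting Pearson's R of (X, Y) generally differs from it, so a full implementation would search for the `rho_z` that yields the desired R.

```python
import math
import random
from statistics import NormalDist

def norta_pairs(n, rho_z, inv_cdf_a, inv_cdf_b, seed=0):
    """Return n pairs (x, y) with x ~ A and y ~ B, dependence set by rho_z."""
    rng = random.Random(seed)
    phi = NormalDist().cdf                  # standard normal CDF
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # z2 is standard normal with correlation rho_z to z1.
        z2 = rho_z * z1 + math.sqrt(1.0 - rho_z ** 2) * rng.gauss(0.0, 1.0)
        # Map to uniforms; clamp away from 0 and 1 so inverse CDFs stay finite.
        u1 = min(max(phi(z1), 1e-12), 1.0 - 1e-12)
        u2 = min(max(phi(z2), 1e-12), 1.0 - 1e-12)
        pairs.append((inv_cdf_a(u1), inv_cdf_b(u2)))
    return pairs

def exp_inv_cdf(rate):
    """Inverse CDF of an exponential distribution (illustrative marginal)."""
    return lambda u: -math.log(1.0 - u) / rate

def pareto_inv_cdf(scale, shape):
    """Inverse CDF of a Pareto distribution (illustrative heavy-tailed marginal)."""
    return lambda u: scale * (1.0 - u) ** (-1.0 / shape)

# Example trace: X is Pareto (actual size), Y is exponential (estimated size).
trace = norta_pairs(1000, 0.8, pareto_inv_cdf(1.0, 1.5), exp_inv_cdf(0.5))
```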

Scatter plot of example traces

[Figure: scatter plots of Y versus X for two example traces, one with R=0.13 and one with R=0.78]

Simulation Setup: Performance metrics

• Performance metrics
  – Mean response time (also called sojourn time or turn-around time)
  – Slowdown: the ratio of a job's response time to its size; used as the fairness metric
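
The two metrics can be computed from per-job records as in the sketch below. The `(arrival, size, completion)` record format and the function names are assumptions for illustration; the percentile grouping mirrors the way slowdown is typically plotted against job size percentile.

```python
def response_time(arrival, size, completion):
    """Time from arrival to completion (sojourn time)."""
    return completion - arrival

def slowdown(arrival, size, completion):
    """Response time divided by job size; 1.0 means the job never waited."""
    return (completion - arrival) / size

def mean_response_time(records):
    return sum(response_time(*r) for r in records) / len(records)

def mean_slowdown_by_size_bucket(records, buckets=10):
    """Sort jobs by size, split into equal-count buckets, average slowdown in each.

    Returns a list of length `buckets`; an empty bucket yields None.
    """
    ranked = sorted(records, key=lambda r: r[1])
    n = len(ranked)
    result = []
    for b in range(buckets):
        chunk = ranked[b * n // buckets:(b + 1) * n // buckets]
        if chunk:
            result.append(sum(slowdown(*r) for r in chunk) / len(chunk))
        else:
            result.append(None)
    return result
```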

Simulation Setup: Simulator

• Simulator
  – Written in C++
  – Supports M/G/1 and G/G/n/m queuing models

• Simulator validation
  – Little’s law
  – Repeat the simulations in the FSP paper [Friedman, et al, Sigmetrics ‘03]
  – Compare with available theoretical results [Bansal and Harchol-Balter, Sigmetrics ‘01]
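
A Little's law check of the kind listed above might look like the following sketch. It compares the time-average number of jobs in the system, L, with the product of arrival rate and mean response time, λW, and returns the difference. The function name, the `(arrival, completion)` record format, and the assumption that the system is empty at time 0 and at `horizon` are all hypothetical; the authors' actual validation procedure is not described in the slides.

```python
def littles_law_gap(records, horizon):
    """Return L - lambda * W for jobs that all arrive and depart in [0, horizon]."""
    # Sweep arrival (+1) and departure (-1) events to integrate N(t) over time.
    events = sorted([(a, +1) for a, _ in records] + [(c, -1) for _, c in records])
    area, in_system, last_t = 0.0, 0, 0.0
    for t, delta in events:
        area += in_system * (t - last_t)
        in_system += delta
        last_t = t
    mean_in_system = area / horizon                              # L
    arrival_rate = len(records) / horizon                        # lambda
    mean_response = sum(c - a for a, c in records) / len(records)  # W
    return mean_in_system - arrival_rate * mean_response
```

A gap far from zero would suggest a bookkeeping error in the simulator's event handling or statistics collection.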

Simulation Setup: Scheduling Policies

• PS: Processor sharing

• Size-based scheduling policies
  – SRPT: Ideal SRPT scheduler
  – SRPT-E: SRPT scheduler using estimated job size
  – FSP: Ideal Fair Sojourn Protocol
  – FSP-E: FSP scheduler using estimated job size

Each simulation is repeated 20 times and we present the average
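
To make the difference between the ideal and the "-E" variants concrete, here is a minimal sketch of SRPT-E on a single unit-speed server. The scheduler ranks jobs by estimated remaining size (estimate minus service received so far), while a job only completes once its actual size has been served. How an exhausted estimate is handled is an assumption of this sketch (the estimated remaining size simply goes negative, so the job keeps top priority); the slides do not specify this. The function name and the `(arrival, actual_size, estimated_size)` format are hypothetical.

```python
import heapq

def srpt_e_response_times(jobs):
    """jobs: list of (arrival, actual_size, estimated_size) tuples."""
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][0])
    response = [0.0] * len(jobs)
    ready = []   # min-heap of (estimated_remaining, job_index, actual_remaining)
    now, nxt = 0.0, 0
    while nxt < len(order) or ready:
        if not ready:
            now = max(now, jobs[order[nxt]][0])      # idle until next arrival
        while nxt < len(order) and jobs[order[nxt]][0] <= now:
            i = order[nxt]
            heapq.heappush(ready, (jobs[i][2], i, jobs[i][1]))
            nxt += 1
        # Priority uses the ESTIMATE; completion uses the ACTUAL remaining work.
        est_left, i, act_left = heapq.heappop(ready)
        next_arrival = jobs[order[nxt]][0] if nxt < len(order) else float("inf")
        gap = next_arrival - now
        if act_left <= gap:
            now += act_left
            response[i] = now - jobs[i][0]
        else:
            now = next_arrival
            # Assumption: est_left may go negative when the job was underestimated.
            heapq.heappush(ready, (est_left - gap, i, act_left - gap))
    return response

# With perfect estimates the schedule is plain SRPT.
perfect = srpt_e_response_times([(0.0, 10.0, 10.0), (1.0, 2.0, 2.0), (2.0, 1.0, 1.0)])
# With estimates that invert the true ordering, short jobs wait behind the long one.
inverted = srpt_e_response_times([(0.0, 10.0, 1.0), (1.0, 2.0, 5.0), (2.0, 1.0, 9.0)])
```

In this small example the mean response time roughly doubles when the estimates reverse the true size ordering, which is the kind of effect the correlation study quantifies.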

Outline

• Review of size-based scheduling

• Motivation

• Simulation Setup

• Simulation Results

• New applications

Simulation Results: Mean response time

[Figure: mean response time (log scale, 0.1 to 1000) versus correlation coefficient R (0 to 1) for PS, SRPT, SRPT-E, FSP, and FSP-E]

Simulation Results: Slowdown (R=0.0224)

[Figure: slowdown (log scale, 1 to 10000) versus job size percentile (0 to 100) at R=0.0224 for PS, SRPT, SRPT-E, FSP, and FSP-E]

Simulation Results: Slowdown (R=0.239)

[Figure: slowdown (log scale, 1 to 10000) versus job size percentile (0 to 100) at R=0.239 for PS, SRPT, SRPT-E, FSP, and FSP-E]

Simulation Results: Slowdown (R=0.4022)

[Figure: slowdown (log scale, 1 to 1000) versus job size percentile (0 to 100) at R=0.4022 for PS, SRPT, SRPT-E, FSP, and FSP-E]

Simulation Results: Slowdown (R=0.5366)

[Figure: slowdown (log scale, 1 to 1000) versus job size percentile (0 to 100) at R=0.5366 for PS, SRPT, SRPT-E, FSP, and FSP-E]

Simulation Results: Slowdown (R=0.7322)

[Figure: slowdown (log scale, 1 to 1000) versus job size percentile (0 to 100) at R=0.7322 for PS, SRPT, SRPT-E, FSP, and FSP-E]

Simulation Results: Slowdown (R=0.9779)

[Figure: slowdown (log scale, 1 to 1000) versus job size percentile (0 to 100) at R=0.9779 for PS, SRPT, SRPT-E, FSP, and FSP-E]

Simulation Results: Conclusions

• Performance depends heavily on correlation
  – SRPT-E and FSP-E can outperform PS given an effective job size estimator

• Crossover point of performance metrics is a function of correlation
  – Also of job size distributions (see TR NWU-CS-04-33)

Outline

• Review of size-based scheduling

• Motivation

• Simulation Setup

• Simulation Results

• New applications

New Applications: Web server scheduling (TR NWU-CS-04-33)

• Is file size a good estimator of a job’s service time (processing time)? Not Really (R 0.14)

[Figure: scatter plot of file size against service time (wall clock time)]

New Applications: Web server scheduling

• Domain-based estimator: much more accurate prediction of the service time at low overhead

[Figure: R (correlation coefficient between actual service time and estimated service time, 0 to 0.8) versus bits used to define a domain (0 to 32)]

New Applications: P2P server side scheduling (LCR ’04)

• The “server side” of current file-sharing P2P applications is superficially similar to a web server
  – Both send back files upon request

• However, a P2P application cannot even know the file size accurately a priori
  – Partial downloads

• Our ongoing work shows that SRPT-E performs well using our time-series-based job size estimators.

New Applications: Network backup system scheduling

• Incremental backup copies only the files that have been created or modified since a previous backup

• With incremental backup, the actual job size is difficult to know until the backup finishes

• We believe that SRPT-E or FSP-E can be applied with time-series-based job size predictors

Summary

• Performance of size-based scheduling policies depends on the correlation between size estimates and actual sizes
  – Fairness, mean response time, etc.

• Estimator must preserve ordering of job sizes for high performance
  – Performance degrades as correlation degrades

• Effective new estimators for Web and P2P

For More Information

• Prescience Laboratory – http://plab.cs.northwestern.edu

For more details on the applications, please also see our short paper “Applications of SRPT Scheduling with Inaccurate Scheduling Information” in the digital proceedings of MASCOTS ‘04 and our poster this evening.