Lecture slides from last week are posted on course web page


Post on 03-Feb-2022


Page 1: Lecture slides from last week are posted on course web page


Page 2

•  Lecture slides from last week are posted on course web page.

•  Project suggestions & deadlines are posted on web page.

•  Reading list is posted.

•  Volunteer now! :-)

•  Need one presenter for next week …

Page 3

•  Sept. 30: Project proposal

•  Oct. 7: Related work

•  Oct. 28: Status report I

•  Nov. 20: Status report II

•  Dec. 20: Final report

Page 4

•  Two case studies from my own research

•  Some project suggestions

•  A few words about paper presentations

•  Probably next week:

–  Queueing terminology

–  First operational laws

–  Little’s law

Page 5

Page 6

[Figure: a standard web server serving Sockets 1, 2, and 3 with PS (timesharing), next to an SRPT web server (kernel-level implementation) serving the same sockets by SRPT (shortest remaining processing time), with requests marked S, M, and L.]

Size-based scheduling for better response times.
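A toy calculation makes the benefit of size-based scheduling concrete. The sketch below assumes, purely for illustration, that three requests with hypothetical sizes S = 1, M = 2, and L = 3 all arrive at time 0 on a single server. The function names are invented for this example and are not taken from the lecture or from the kernel-level implementation.

```python
def ps_completion_times(sizes):
    """Completion times under processor sharing (PS), all jobs arriving at time 0."""
    order = sorted(range(len(sizes)), key=lambda i: sizes[i])
    done = [0.0] * len(sizes)
    clock = 0.0               # current time
    served = 0.0              # service each unfinished job has received so far
    remaining = len(sizes)    # number of jobs still sharing the server
    for i in order:
        # All remaining jobs share the server equally, so the smallest
        # unfinished job needs (size - served) * remaining time units.
        clock += (sizes[i] - served) * remaining
        served = sizes[i]
        done[i] = clock
        remaining -= 1
    return done


def srpt_completion_times(sizes):
    """Completion times under SRPT, all jobs arriving at time 0.

    With no later arrivals, SRPT simply runs the jobs shortest-first.
    """
    order = sorted(range(len(sizes)), key=lambda i: sizes[i])
    done = [0.0] * len(sizes)
    clock = 0.0
    for i in order:
        clock += sizes[i]
        done[i] = clock
    return done


def mean(values):
    return sum(values) / len(values)


sizes = [1.0, 2.0, 3.0]  # hypothetical S, M, L request sizes
ps = ps_completion_times(sizes)
srpt = srpt_completion_times(sizes)
print("PS  :", ps, "mean =", mean(ps))
print("SRPT:", srpt, "mean =", mean(srpt))
```

In this example the large request finishes at time 6 under both policies, while the small and medium requests finish earlier under SRPT, so the mean drops from 14/3 to 10/3.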

Page 7

[Figure: two panels, "Workload generator 1" and "Workload generator 2", each plotting response time (ms, axis ticks 100 to 300) against load (0 to 1) for PS and SRPT.]

WHY?

•  Mean file size, file size distribution, access pattern, request rate, CPU utilization, bandwidth, network effects: ALL THE SAME!

Tuning knob: request rate

Tuning knobs: number of users, think time

Page 8

[Figure: two panels, "TPC-W generator X" and "TPC-W generator Y", each plotting response time (sec, 0 to 10) against load (0 to 1) for a system of web server, app server, and database (PostgreSQL).]

Tuning knob: request rate

Tuning knobs: number of users, think time

Page 9

•  Based on trace from top-10 online auctioning site.

[Figure: two panels, "Simulator 1" and "Simulator 2", each plotting response time (sec, 0 to 20) against load (0 to 1) for PLJF, PS, and SRPT.]

Tuning knob: request rate

Tuning knobs: number of users, think time

Page 10

•  Simulation based on trace from Pittsburgh Supercomputing Center.

[Figure: two panels, "Simulator A" and "Simulator B", each plotting response time (min, log scale from 10^2 to 10^5) against load (0.1 to 1) for 10, 100, and 1000 clients. Labels: FCFS; Cray J90/C90.]

Tuning knob: request rate

Tuning knobs: number of users, think time

Page 11

[Figure: plot panels labeled PLJF / PS / PSJF, 10 / 100 / 1000 clnt., and PS / SRPT.]

Tuning knob: request rate

Tuning knobs: number of users, think time

Page 12

Model of user behavior

User requests web page, receives page, reads page, clicks on new link

•  Arrivals triggered by completions.

•  Fixed number of users, called the Multi-Programming-Level (MPL).

[Figure: each user cycles through think, send, receive.]
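The think/send/receive cycle can be sketched as a small simulation. This is a minimal illustration, not the generator used in the case study: it assumes a single FCFS server and exponentially distributed think times, and `simulate_closed` and its parameters are hypothetical names.

```python
import heapq
import random


def simulate_closed(mpl, mean_think, service_time, n_requests, seed=0):
    """Mean response time of a closed system with one FCFS server.

    mpl          -- fixed number of users (the multi-programming level)
    mean_think   -- mean of the exponential think time (an assumption)
    service_time -- function rng -> service requirement of one request
    """
    rng = random.Random(seed)
    # Each user starts by thinking; the heap holds the next submit times.
    submit = [rng.expovariate(1.0 / mean_think) for _ in range(mpl)]
    heapq.heapify(submit)
    server_free = 0.0
    total_response = 0.0
    for _ in range(n_requests):
        t = heapq.heappop(submit)       # earliest pending request
        start = max(t, server_free)     # wait if the server is busy
        finish = start + service_time(rng)
        server_free = finish
        total_response += finish - t
        # The completion triggers the next arrival: think, then resubmit.
        heapq.heappush(submit, finish + rng.expovariate(1.0 / mean_think))
    return total_response / n_requests


# Hypothetical setup: unit-size requests, mean think time of 5 time units.
for mpl in (1, 10, 50):
    r = simulate_closed(mpl, mean_think=5.0,
                        service_time=lambda rng: 1.0, n_requests=10000)
    print("MPL =", mpl, "mean response time =", round(r, 2))
```

Because a user cannot submit a new request before the previous one completes, at most MPL requests are ever in the system, so with unit-size requests no response time can exceed MPL time units.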

Page 13

[Figure: new arrivals join the queue at the server; the next arrival time is taken from a trace / probability distribution of arrival times.]

•  Arrivals are independent of completions.

•  There is no max number of simultaneous users.
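A model in which arrivals are independent of completions (what the later slides call the open system) can be sketched in a few lines. The example assumes a single FCFS server and, for the synthetic trace, exponential interarrival times; both function names are hypothetical.

```python
import random


def fcfs_response_times(arrival_times, service_times):
    """Response times at one FCFS server, given a trace of arrivals."""
    server_free = 0.0
    responses = []
    for arrival, service in zip(arrival_times, service_times):
        start = max(arrival, server_free)   # queue if the server is busy
        server_free = start + service
        responses.append(server_free - arrival)
    return responses


def poisson_arrivals(rate, n, seed=0):
    """n arrival times with exponential interarrival gaps (an assumed model)."""
    rng = random.Random(seed)
    t, times = 0.0, []
    for _ in range(n):
        t += rng.expovariate(rate)
        times.append(t)
    return times


# Arrival times are fixed up front, independently of any completion.
arrivals = poisson_arrivals(rate=0.8, n=10000)
services = [1.0] * len(arrivals)            # hypothetical unit-size requests
responses = fcfs_response_times(arrivals, services)
print("mean response time:", sum(responses) / len(responses))
print("max response time :", max(responses))
```

Nothing in this loop limits how many requests can be waiting at once, which is the practical meaning of "no max number of simultaneous users".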

Page 14

Workload generators: Surge, SPECWeb, TPC-W, Sclient, RUBiS, WebBench, Webjamma

•  Generators for same purpose use different models!

•  Often not clear which model generators use!

Page 15

•  Very little …

•  Limited to FCFS single server queue.

–  Response times under open system higher than under closed [Bondi and Whitt 1986].

–  For MPL → ∞, closed system converges to open system [Schatte83, Schatte84].

Page 16

–  What is the magnitude of the difference in response times?

–  What is the speed of convergence?

–  How does variability (heavy tails) affect results?

–  How are different scheduling disciplines affected?

–  … in practice?

Page 17

•  What is the magnitude of the difference in response times?

–  Orders of magnitude!

[Figure (ANALYSIS): mean response time (log scale, 10 to 1000) against load (0 to 1) for Open and Closed (MPL=50).]

•  Why?

–  Bounded number of jobs in closed system.
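The bound can be made explicit with two textbook formulas, sketched here under the assumption of exponential service times: mean value analysis (MVA) for a closed system with one server plus think time, and the M/M/1 formula for the open system. The function names are hypothetical and the numbers are illustrative only.

```python
def closed_mean_response(mpl, mean_service, mean_think):
    """Exact mean value analysis (MVA) for one queueing server plus think time.

    Assumes exponential service times, which is what makes MVA exact here.
    """
    queue_len = 0.0                  # mean jobs at the server with n-1 users
    response = mean_service
    for n in range(1, mpl + 1):
        # An arriving job sees the queue of a system with one user fewer.
        response = mean_service * (1.0 + queue_len)
        throughput = n / (response + mean_think)
        queue_len = throughput * response    # Little's law at the server
    return response


def open_mean_response(load, mean_service):
    """M/M/1 mean response time E[T] = E[S] / (1 - load), for load < 1."""
    return mean_service / (1.0 - load)


s = 1.0  # hypothetical mean service time
for load in (0.5, 0.9, 0.99, 0.999):
    print("open, load", load, "->", open_mean_response(load, s))
# The closed system can never exceed mpl * s, however short the think time.
print("closed, MPL=50 ->", closed_mean_response(50, s, mean_think=0.001))
```

In the closed system at most MPL jobs can be queued, so the mean response time is capped at MPL times the mean service time, whereas the open-system value grows without bound as the load approaches 1.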

Page 18

•  How does variability affect open/closed response times?

–  Huge effect on open, limited effect on closed system.

[Figure (ANALYSIS): mean response time (0 to 1500) for low versus high variability. Curves: Open, Closed (MPL=50), Closed (MPL=100), Closed (MPL=1000). Label: Web Workloads.]

•  Why?

–  Dependency between completions and arrivals in closed system reduces burstiness.
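For an open system with Poisson arrivals and FCFS service, the sensitivity to variability can be read off the Pollaczek-Khinchine formula for the M/G/1 queue. This is a standard result, not taken from these slides, and the scheduling policy behind the plot may differ; the function name and the parameter values are hypothetical.

```python
def mg1_fcfs_mean_response(arrival_rate, mean_service, scv):
    """Pollaczek-Khinchine mean response time for an M/G/1 FCFS queue.

    scv is the squared coefficient of variation of the service time,
    Var[S] / E[S]^2 (1 for exponential, 0 for deterministic).
    """
    load = arrival_rate * mean_service
    if not 0.0 <= load < 1.0:
        raise ValueError("load must be in [0, 1) for a stable open system")
    mean_wait = load * mean_service * (1.0 + scv) / (2.0 * (1.0 - load))
    return mean_service + mean_wait


# Same load and mean service time, increasing variability.
for scv in (0.0, 1.0, 10.0, 100.0):
    print("scv =", scv, "->", mg1_fcfs_mean_response(0.8, 1.0, scv))
```

The mean waiting time grows linearly in the squared coefficient of variation, so heavy-tailed service times inflate open-system response times even when the load is unchanged.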

Page 19

•  Can we make closed look like open, by increasing MPL?

[Figure: mean response time (0 to 1500) for low versus high variability. Curves: Open, Closed (MPL=50), Closed (MPL=100), Closed (MPL=1000). Label: Web Workloads.]

Page 20

•  What is the impact of scheduling?

–  Huge in open system, almost none in closed system.

[Figure (ANALYSIS): two panels, each comparing PLJF, FCFS, PS, and PSJF.]

•  Why?

–  Scheduling takes advantage of variability in the system.

–  Closed systems reduce the effect of variability.

Page 21

1.  Is there a more realistic model?

2.  What’s most representative of real systems?

Page 22

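[Figure: new arrivals join the queue at the server; each user cycles through think, send, receive; after a request the user returns to the system with probability q, and otherwise leaves the system.]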

Page 23

[Figure: mean response time (0 to 300) against mean think time (1 to 1000, log scale) for SRPT and PS.]

Page 24

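•  q → 1: number of requests per visit ↑ ?

•  q → 0: number of requests per visit ↓ ?

[Figure: new arrivals join the queue at the server; each user cycles through think, send, receive; after a request the user returns to the system with probability q, and otherwise leaves the system.]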
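The link between q and the number of requests per visit can be sketched directly. Assuming each return decision is made independently with the same probability q, a visit contains a geometrically distributed number of requests with mean 1 / (1 - q). The function names and the value q = 0.75 are hypothetical.

```python
import random


def mean_requests_per_visit(q):
    """Expected requests per visit when a user returns with probability q."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must be in [0, 1)")
    return 1.0 / (1.0 - q)


def return_probability(requests_per_visit):
    """Inverse: the q that yields a given mean number of requests per visit."""
    if requests_per_visit < 1.0:
        raise ValueError("a visit contains at least one request")
    return 1.0 - 1.0 / requests_per_visit


def sample_visit_length(q, rng):
    """Simulate one visit: make a request, then return with probability q."""
    requests = 1
    while rng.random() < q:
        requests += 1
    return requests


rng = random.Random(0)
q = 0.75                                   # hypothetical return probability
visits = [sample_visit_length(q, rng) for _ in range(100000)]
print("formula  :", mean_requests_per_visit(q))
print("simulated:", sum(visits) / len(visits))
```

With q = 0 every visit is a single request, and as q approaches 1 the same users keep returning, so the mean number of requests per visit grows without bound.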

Page 25

[Figure: mean response time (0 to 300) against mean number of requests per visit (0 to 20) for PS and SRPT. Labels: PS open, PS closed, OPEN, CLOSED.]

Page 26

Open or Closed? Use partly-open system to decide

Real web workloads (#req. / visit)

•  A site being “Slashdotted” (1.2)

•  Financial service provider (1.4)

•  CMU web server (1.8)

•  Kasparov vs Deep Blue (2.4)

•  Large corporate web site (2.4)

•  Science Institute USGS (3.6)

•  Online dept. store (5.4)

•  Supercomp. site (6.0)

•  World cup site (11.6)

•  Online gaming site (12.9)

Page 27

Page 28

Storage system (RAID)

•  Depends on probability that after one drive fails, a second drive fails while reconstructing data.

Page 29

•  Need probability of second failure during reconstruction

[Figure: probability (%) of a second failure, axis from 0 to 6 x 10^-3, for a 1 hour reconstruction time.]

Curves:

–  Standard approach: use datasheet MTTF and exponential distribution

Page 30

•  Need probability of second failure during reconstruction

[Figure: probability (%) of a second failure, axis from 0 to 6 x 10^-3, for a 1 hour reconstruction time.]

Curves:

–  Standard approach: use datasheet MTTF and exponential distribution

–  Estimate based on data

Page 31

•  Need probability of second failure during reconstruction

[Figure: probability (%) of a second failure, axis from 0 to 6 x 10^-3, for a 1 hour reconstruction time.]

Curves:

–  Standard approach: use datasheet MTTF and exponential distribution

–  Use measured MTTF and exponential distribution

–  Estimate based on data

Page 32

•  Need probability of second failure during reconstruction

[Figure: probability (%) of a second failure, axis from 0 to 6 x 10^-3, for a 1 hour reconstruction time.]

Curves:

–  Standard approach: use datasheet MTTF and exponential distribution

–  Use measured MTTF and exponential distribution

–  Use measured MTTF and Weibull distribution

–  Estimate based on data

Page 33

•  Need probability of second failure during reconstruction

[Figure: probability (%) of a second failure, axis from 0 to 1.6 x 10^-2, against reconstruction time.]

Curves:

–  Standard approach: use datasheet MTTF and exponential distribution

–  Use measured MTTF and exponential distribution

–  Use measured MTTF and Weibull distribution

–  Estimate based on data
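The model-based curves can be sketched as follows. This is a minimal illustration, not the computation behind the plots: the MTTF values, the Weibull shape, the drive age, and the function names are all hypothetical, and drives are assumed to fail independently, an assumption the data-based estimate does not need.

```python
import math


def p_fail_exponential(window_hours, mttf_hours, n_drives=1):
    """P(at least one of n drives fails within the window), exponential lifetimes."""
    return -math.expm1(-n_drives * window_hours / mttf_hours)


def p_fail_weibull(window_hours, age_hours, scale_hours, shape):
    """P(a drive of the given age fails within the window), Weibull lifetime.

    Conditional on having survived to age_hours. shape < 1 gives a
    decreasing hazard rate; shape = 1 reduces to the exponential case.
    """
    h_now = (age_hours / scale_hours) ** shape
    h_later = ((age_hours + window_hours) / scale_hours) ** shape
    return -math.expm1(h_now - h_later)


def weibull_scale_for_mean(mean_hours, shape):
    """Weibull scale parameter that yields the requested mean lifetime."""
    return mean_hours / math.gamma(1.0 + 1.0 / shape)


window = 1.0                   # 1 hour reconstruction time, as on the slide
datasheet_mttf = 1_000_000.0   # hypothetical datasheet value, in hours
measured_mttf = 300_000.0      # hypothetical measured value, in hours
shape = 0.7                    # hypothetical Weibull shape parameter
age = 8_760.0                  # hypothetical drive age: one year

scale = weibull_scale_for_mean(measured_mttf, shape)
print("datasheet MTTF, exponential:", p_fail_exponential(window, datasheet_mttf))
print("measured MTTF, exponential :", p_fail_exponential(window, measured_mttf))
print("measured MTTF, Weibull     :", p_fail_weibull(window, age, scale, shape))
```

Passing `n_drives` greater than 1 to `p_fail_exponential` gives the probability that any of the surviving drives in the array fails during the reconstruction window.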

Page 34

•  Intuition is not always good enough

•  Need back-of-the-envelope calculations and analytical tools to answer questions.

•  Workload / fault load matters hugely

•  Important to understand what the real world looks like!