SE-280 Dr. Mark L. Hornick 1 Prediction Intervals

Upload: christian-melton

Post on 26-Dec-2015

Page 1: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals

SE-280 Dr. Mark L. Hornick

Prediction Intervals

Page 2: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals


Review: PROBE’s regression calculations give us estimates (projections) for size (A+M LOC)…

[Figure: plot of Actual Size (LOC), 0 to 800, versus Estimated Proxy Size (LOC), 0 to 400]

Page 3: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals


…and time

[Figure: plot of Time (min), 0 to 800, versus Estimated Proxy Size (LOC), 0 to 400]

Page 4: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals


But just how good are these estimates???

Off by 5%, 10%, 50%, 100%, 500%?

Does it matter?

Do you want to bet: Your weekends? Your reputation? Your JOB?

Page 5: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals


Which of the following regression projections would you trust more?

Page 6: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals


Example A: 10 data points, correlation = 0.75

[Figure: plot of Actual Total LOC, 0 to 800, versus Estimated Object LOC, 0 to 400]

Page 7: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals


Example B: 25 data points, correlation = 0.75

[Figure: plot of Actual Total LOC, 0 to 800, versus Estimated Object LOC, 0 to 400]

Page 8: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals

A Prediction Interval calculation computes the bounds on the likely error of an estimate

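UPI = estimated A+M LOC + Range
LPI = estimated A+M LOC - Range

[Figure: regression projection (estimate) line, with the UPI line a distance Range above it and the LPI line a distance Range below it]

Strictly speaking, the UPI/LPI "lines" are curved (they are hyperbolas), and Range varies: it grows as the estimate moves away from the mean of the historical data.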
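As a minimal sketch of the UPI/LPI arithmetic above (the function name and the sample numbers are hypothetical, chosen only for illustration):

```python
def prediction_interval(estimate, range_):
    """Return (LPI, UPI) for a regression projection and its Range."""
    lpi = estimate - range_  # lower prediction interval limit
    upi = estimate + range_  # upper prediction interval limit
    return lpi, upi


# Hypothetical numbers: a 300 LOC projection with a Range of 80 LOC.
lpi, upi = prediction_interval(300.0, 80.0)
print(f"LPI = {lpi:.0f} LOC, UPI = {upi:.0f} LOC")
```

The same two lines of arithmetic apply to time estimates; only the units change.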

Page 9: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals


If you had this kind of information about your estimates, how would you use it?

Suppose your time projection said that a project would take 8 weeks.

But, your prediction interval has a range of 3 weeks.

[Figure: timeline from 0 to 15 weeks]

How should you make your plan?

What should you tell management?

Page 10: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals

[Figure: the same 0 to 15 week timeline, with a range of 3 weeks marked on either side of the 8-week projection (5 to 11 weeks)]

Page 11: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals

Suppose your time projection said that a project would take 8 weeks.

What if the range was 6 weeks?

[Figure: timeline from 0 to 15 weeks, with a range of 6 weeks marked on either side of the 8-week projection (2 to 14 weeks)]

How should you make your plan?

What should you tell management?

Page 12: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals


The prediction interval is based on the t distribution.

[Figure: t distribution centered on the regression-projected value. The 70% limits (area) extend a distance Range on either side, from the lower prediction interval limit (LPI) to the upper prediction interval limit (UPI).]

Page 13: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals


Prediction Interval Usage

- Range within which data is likely to fall
    - Assuming variation in this estimate is similar to that in prior estimates
- PSP uses 70% and 90% limits
    - Computes range in which actual value will likely fall
        - 70% of the time
        - 90% of the time
- Helps to assess planning quality

Page 14: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals

To get the prediction interval, we must calculate the range:

Text, page 128; may have error in formula (n instead of d), depending on textbook revision.

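$$\text{Range} = t(p, d)\,\sigma \sqrt{1 + \frac{1}{n} + \frac{(x_{est} - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$$

Note: this is for one independent variable.

$m = 1$: number of independent variables
$n$: number of historical data points
$d = n - (m + 1)$: degrees of freedom
$t(p, d)$: the unknown $t$ that gives the specified "p" value for the integral from $-t$ to $+t$
$p$: desired integral value (e.g., 0.7 for 70%)
$x_i$ $(1 \le i \le n)$: previous estimates
$\bar{x}$: mean of the $x_i$
$x_{est}$: new regression "input"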
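The single-variable range calculation can be sketched as follows. This is an illustration under stated assumptions, not the course's reference solution: the function name, the historical data, the sigma value, and the t value (roughly 1.25 for p = 0.70 and d = 3, as read from a t table) are all hypothetical inputs supplied by the caller.

```python
import math


def prediction_range(x_hist, x_est, sigma, t_value):
    """Range = t(p, d) * sigma * sqrt(1 + 1/n + (x_est - x_avg)^2 / sum((x_i - x_avg)^2)).

    x_hist  : previous estimates x_i (one independent variable)
    x_est   : new regression "input"
    sigma   : standard deviation of the regression residuals (computed elsewhere)
    t_value : t(p, d) for the desired p and d = n - 2 (found elsewhere)
    """
    n = len(x_hist)
    x_avg = sum(x_hist) / n
    sum_sq = sum((x - x_avg) ** 2 for x in x_hist)
    return t_value * sigma * math.sqrt(1.0 + 1.0 / n + (x_est - x_avg) ** 2 / sum_sq)


# Hypothetical history of five estimates, so d = 5 - 2 = 3.
x_hist = [100, 200, 300, 400, 500]
near_mean = prediction_range(x_hist, x_est=300, sigma=50.0, t_value=1.25)
far_from_mean = prediction_range(x_hist, x_est=500, sigma=50.0, t_value=1.25)
print(near_mean, far_from_mean)
```

The second value is larger than the first, which is the "Range varies" effect: the interval widens as the new input moves away from the mean of the historical data.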

Page 15: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals

For multiple regression, the range calculation is just extended a little.

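$$\text{Range} = t(p, d)\,\sigma \sqrt{1 + \frac{1}{n} + \frac{(x_{est,1} - \bar{x}_1)^2}{\sum_{i=1}^{n} (x_{i,1} - \bar{x}_1)^2} + \cdots + \frac{(x_{est,m} - \bar{x}_m)^2}{\sum_{i=1}^{n} (x_{i,m} - \bar{x}_m)^2}}$$

$m$: number of independent variables
$n$: number of historical data points
$d = n - (m + 1)$: degrees of freedom
$t(p, d)$: the limit ($t$) that gives "p" integral value from $-t$ to $+t$
$p$: desired integral value (e.g., 0.7 for 70%)
$x_{i,j}$ $(1 \le i \le n;\ 1 \le j \le m)$: previous estimates
$\bar{x}_j$: mean of the $x_{i,j}$ over $i$
$x_{est,j}$ $(1 \le j \le m)$: new regression "input"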
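A sketch of the multiple-regression range, assuming the historical data is stored as one row per previous estimate and one column per independent variable. The function name, the data layout, and the sample values are assumptions made for this illustration; sigma and t(p, d) are taken as given.

```python
import math


def prediction_range_multi(x_hist, x_est, sigma, t_value):
    """Range for m independent variables.

    x_hist  : n rows, each holding m values x_{i,1} .. x_{i,m}
    x_est   : the m new regression "inputs" x_{est,1} .. x_{est,m}
    sigma   : standard deviation of the regression residuals
    t_value : t(p, d) with d = n - (m + 1)
    """
    n = len(x_hist)
    m = len(x_est)
    under_root = 1.0 + 1.0 / n
    for j in range(m):
        column = [row[j] for row in x_hist]
        avg_j = sum(column) / n
        sum_sq_j = sum((x - avg_j) ** 2 for x in column)
        # One extra term per independent variable.
        under_root += (x_est[j] - avg_j) ** 2 / sum_sq_j
    return t_value * sigma * math.sqrt(under_root)


# Hypothetical history: five estimates, two independent variables.
x_hist = [[1, 10], [2, 30], [3, 20], [4, 40], [5, 50]]
print(prediction_range_multi(x_hist, [5, 50], sigma=1.0, t_value=1.0))
```

With a single column the loop adds one term and the result matches the one-variable formula.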

Page 16: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals

The σ ("sigma") value is computed in the following way.

$$\sigma = \sqrt{\frac{1}{n - (m + 1)} \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_{i,1} - \cdots - \beta_m x_{i,m}\right)^2}$$

Same for one independent variable.

Alternate form:

$$\sigma = \sqrt{\frac{1}{d} \sum_{i=1}^{n} \left(y_i - y_{pred,i}\right)^2}$$

$x_{i,j}$ = previous independent variable values
$y_i$ = previous dependent variable (estimate) values
$n$ = number of previous estimates
$m$ = number of independent variables
$d = n - (m + 1)$ [degrees of freedom]
$\beta_j$ = regression coefficients calculated from previous data
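The "alternate form" of sigma translates almost directly into code. The sketch below assumes the regression coefficients have already been calculated; the function name, the row-per-estimate data layout, and the sample data are hypothetical.

```python
import math


def regression_sigma(x_hist, y_hist, betas):
    """sigma = sqrt((1/d) * sum((y_i - y_pred_i)^2)), with d = n - (m + 1).

    x_hist : n rows of m independent variable values
    y_hist : n dependent variable values
    betas  : [beta_0, beta_1, ..., beta_m], already fitted to the data
    """
    n = len(y_hist)
    m = len(betas) - 1
    d = n - (m + 1)  # degrees of freedom
    sum_sq = 0.0
    for row, y in zip(x_hist, y_hist):
        y_pred = betas[0] + sum(b * x for b, x in zip(betas[1:], row))
        sum_sq += (y - y_pred) ** 2
    return math.sqrt(sum_sq / d)


# Hypothetical data with one independent variable and coefficients beta_0 = 0, beta_1 = 2.
x_hist = [[1], [2], [3], [4]]
y_hist = [2.1, 3.9, 6.1, 7.9]
print(regression_sigma(x_hist, y_hist, betas=[0.0, 2.0]))
```

Each residual here is 0.1 in magnitude, so the sum of squares is 0.04 and, with d = 2, sigma is the square root of 0.02.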

Page 17: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals

The range formula requires us to find the integration limit that yields the correct integral value.

For a 70% interval, we want p = 0.70.

For a 90% interval, we want p = 0.90.

[Figure: t distribution with the area between -t and t marked; two-sided integral value = p]

Question: what integration limit “t” gives this value?

We have to search (try t values) in order to find out.

Page 18: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals

When thinking about searching for the desired integral value, it may be helpful to plot the integral of the t-distribution function.

[Figure: t probability density function for t from -6 to 6, with curves labeled 1, 5, and 25]

[Figure: integral of the t distribution from -t to t (p, from 0.00 to 1.00) versus t (0 to 6), with curves labeled 1, 5, and 25]

Hint: create a "function object" that calculates the two-sided p integral, given a specified t value.
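One possible shape for such a function object is sketched below. It is an assumption-laden illustration rather than the required design: the class name, the fixed segment count, and the choice of Simpson's rule for the numerical integration are all choices made for this example.

```python
import math


class TwoSidedTIntegral:
    """Callable object: p = integral of the t density from -t to +t."""

    def __init__(self, dof, segments=200):
        if segments % 2:
            raise ValueError("Simpson's rule needs an even number of segments")
        self.dof = dof
        self.segments = segments
        # Normalizing constant Gamma((d+1)/2) / (sqrt(d*pi) * Gamma(d/2)),
        # computed with lgamma to avoid overflow for large d.
        log_norm = (math.lgamma((dof + 1) / 2.0) - math.lgamma(dof / 2.0)
                    - 0.5 * math.log(dof * math.pi))
        self.norm = math.exp(log_norm)

    def density(self, x):
        return self.norm * (1.0 + x * x / self.dof) ** (-(self.dof + 1) / 2.0)

    def __call__(self, t):
        # Simpson's rule on [0, t]; the density is symmetric, so double it.
        width = t / self.segments
        total = self.density(0.0) + self.density(t)
        for k in range(1, self.segments):
            weight = 4.0 if k % 2 else 2.0
            total += weight * self.density(k * width)
        return 2.0 * total * width / 3.0


p_of_t = TwoSidedTIntegral(dof=9)
print(p_of_t(1.1), p_of_t(1.833))
```

Because the object remembers the degrees of freedom, a search routine only needs to call it with trial t values.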

Page 19: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals


The calculation needed is the reverse of that used in the significance calculation, since we are seeking "t" instead of "p".

[Figure: plot of p versus t]

For a specified "p" (integral) value, we want to find the corresponding "t" (integration limit).

How should we do the search?

Page 20: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals

In the significance calculation, we calculated "p" for a given "t"; now we are seeking "t" that will give us a desired "p".

[Figure: plot of p versus t]

For a specified "p" (2-sided integral) value, we want to find the corresponding "t" (integration limit).

How should we do the search?

Page 21: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals

The textbook suggests a state-machine approach that requires the function to be monotonic (pg. 246).

The sign of the error (desired versus actual "p") tells you whether to increase or decrease the trial "t" value to get closer to the desired answer.

[Figure: plot of p versus t]

The increase/decrease step size is halved when changing search direction.
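The step-halving idea described above might be sketched like this. It follows the description on this slide, not necessarily the textbook's exact procedure; the function names, the starting values, and the tolerance are assumptions. To keep the example self-contained, the demo uses the closed form for one degree of freedom, p(t) = (2/π)·atan(t), in place of a numerically integrated t distribution.

```python
import math


def search_t(p_of_t, target_p, tolerance=1e-10, max_steps=1000):
    """Find t such that p_of_t(t) is within tolerance of target_p.

    Assumes p_of_t is monotonically increasing in t.
    """
    t = 1.0      # initial trial value (arbitrary choice)
    step = 0.5   # initial step size (arbitrary choice)
    last_direction = 0
    for _ in range(max_steps):
        error = target_p - p_of_t(t)
        if abs(error) < tolerance:
            return t
        # The sign of the error says whether to increase or decrease t.
        direction = 1 if error > 0 else -1
        # Halve the step whenever the search direction changes.
        if last_direction and direction != last_direction:
            step /= 2.0
        t += direction * step
        last_direction = direction
    raise RuntimeError("search did not converge")


def p_one_dof(t):
    """Two-sided t integral for 1 degree of freedom (closed form)."""
    return 2.0 / math.pi * math.atan(t)


print(search_t(p_one_dof, 0.70))
```

The monotonic requirement matters because the sign of the error is the only information used to pick the next direction.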

Page 22: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals

An alternative search method brackets the answer and bisects the interval.

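[State machine diagram "Search", with states FindInterval and BisectSearch. Transitions are listed as event / actions.]

(start) -> FindInterval: / Init, CheckInterval
FindInterval -> FindInterval: NotInInterval / ExpandInterval, CheckInterval
FindInterval -> BisectSearch: InInterval / CheckSegment
FindInterval -> (end): TimeoutInterval / Failure
BisectSearch -> BisectSearch: InLeftSegment / SelectLeft, CheckSegment
BisectSearch -> BisectSearch: InRightSegment / SelectRight, CheckSegment
BisectSearch -> (end): ValueOK / Success
BisectSearch -> (end): TimeoutBisect / Failure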

Page 23: SE-280 Dr. Mark L. Hornick 1 Prediction Intervals

How does the interval bisection method work?

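[Figure: plot of p versus t, with the error marked positive (+) on one side of the desired value and negative (-) on the other]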
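A minimal sketch of a bracket-then-bisect search, assuming a monotonically increasing p(t) and a target p below 1. The function names, the starting interval, the doubling rule for expanding it, and the iteration limits are assumptions for illustration. As a stand-in for a numerically integrated t distribution, the demo uses the closed form for one degree of freedom, p(t) = (2/π)·atan(t).

```python
import math


def bisect_t(p_of_t, target_p, tolerance=1e-10, max_steps=200):
    """Bracket the answer, then bisect until p is within tolerance."""
    # Phase 1: expand [low, high] until p(high) reaches the target.
    low, high = 0.0, 1.0
    for _ in range(max_steps):
        if p_of_t(high) >= target_p:
            break
        low, high = high, high * 2.0
    else:
        raise RuntimeError("timeout while finding the interval")

    # Phase 2: bisect; the sign of the error picks the half to keep.
    for _ in range(max_steps):
        mid = (low + high) / 2.0
        error = target_p - p_of_t(mid)
        if abs(error) < tolerance:
            return mid
        if error > 0:
            low = mid   # p(mid) too small: answer is in the right segment
        else:
            high = mid  # p(mid) too large: answer is in the left segment
    raise RuntimeError("timeout while bisecting")


def p_one_dof(t):
    """Two-sided t integral for 1 degree of freedom (closed form)."""
    return 2.0 / math.pi * math.atan(t)


print(bisect_t(p_one_dof, 0.90))
```

Each bisection step halves the interval regardless of the function's shape, so the number of function evaluations is predictable.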