SE-280Dr. Mark L. Hornick
1
Prediction Intervals
Review: PROBE’s regression calculations give us estimates (projections) for size (A+M LOC)…
[Figure: plot of Actual Size (LOC) versus Estimated Proxy Size (LOC)]
…and time
[Figure: plot of Time (min) versus Estimated Proxy Size (LOC)]
But just how good are these estimates???
Off by 5%, 10%, 50%, 100%, 500%?
Does it matter?
Do you want to bet: Your weekends? Your reputation? Your JOB?
Which of the following regression projections would you trust more?
Example A: 10 data points, correlation = 0.75
[Figure: plot of Actual Total LOC versus Estimated Object LOC]
Example B: 25 data points, correlation = 0.75
[Figure: plot of Actual Total LOC versus Estimated Object LOC]
A Prediction Interval calculation computes the bounds on the likely error of an estimate
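UPI = estimated A+M LOC + Range
LPI = estimated A+M LOC - Range
[Figure: the projection (estimate) line with the UPI line one Range above it and the LPI line one Range below it]
Strictly speaking, the UPI/LPI "lines" are curves (hyperbolas) that are closest together at the mean of the historical data, and Range varies with the estimate.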
If you had this kind of information about your estimates, how would you use it?
Suppose your time projection said that a project would take 8 weeks.
But, your prediction interval has a range of 3 weeks.
[Figure: timeline in weeks, from 0 to 15]
How should you make your plan?
What should you tell management?
[Figure: the same timeline with the range of 3 weeks marked on each side of the 8-week projection, spanning 5 to 11 weeks]
What if the range was 6 weeks?
[Figure: the same timeline with a range of 6 weeks marked on each side of the 8-week projection, spanning 2 to 14 weeks]
The prediction interval is based on the t distribution.
[Figure: t distribution centered on the regression-projected value; the 70% limits (area) extend one Range to each side, from the lower prediction interval limit (LPI) to the upper prediction interval limit (UPI)]
Prediction Interval Usage
Range within which data is likely to fall
    Assuming variation in this estimate is similar to that in prior estimates
PSP uses 70% and 90% limits
Computes range in which actual value will likely fall
    70% of the time
    90% of the time
Helps to assess planning quality
To get the prediction interval, we must calculate the range:
Text, page 128; may have error in formula (n instead of d), depending on textbook revision.
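$$\text{Range} = t(p, d)\,\sigma\,\sqrt{1 + \frac{1}{n} + \frac{(x_{est} - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$$

Note: this is for one independent variable.

$m = 1$: number of independent variables
$n$: number of historical data points
$d = n - (m + 1)$: degrees of freedom
$t(p, d)$: the unknown value of $t$ that gives the specified "p" integral value from $-t$ to $t$
$p$: desired integral value (e.g., 0.7 for 70%)
$x_i$ $(1 \le i \le n)$: previous estimates
$\bar{x}$: mean of the $x_i$
$x_{est}$: new regression "input"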
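As a rough illustration of the one-variable case, the sketch below computes Range as t times sigma times the square root of (1 + 1/n + the squared distance of the new input from the historical mean, divided by the summed squared deviations of the historical inputs). The function names `prediction_range` and `prediction_interval` are hypothetical, and the t and sigma values are assumed to have been found separately.

```python
import math


def prediction_range(t, sigma, xs, x_est):
    """Range for one independent variable.

    t      -- t(p, d) value, assumed to be supplied by the caller
    sigma  -- standard deviation of the regression residuals
    xs     -- previous estimates (historical x values)
    x_est  -- new regression input
    """
    n = len(xs)
    x_avg = sum(xs) / n
    spread = sum((x - x_avg) ** 2 for x in xs)
    return t * sigma * math.sqrt(1.0 + 1.0 / n + (x_est - x_avg) ** 2 / spread)


def prediction_interval(projection, rng):
    """Return (LPI, UPI) = (projection - Range, projection + Range)."""
    return projection - rng, projection + rng


# The timeline example: an 8-week projection with a range of 3 weeks.
print(prediction_interval(8.0, 3.0))  # (5.0, 11.0)
```

Because the squared-distance term grows as the new input moves away from the historical mean, the same t and sigma give a wider range for inputs far from past experience.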
For multiple regression, the range calculation is just extended a little.
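$$\text{Range} = t(p, d)\,\sigma\,\sqrt{1 + \frac{1}{n} + \frac{(x_{est,1} - \bar{x}_1)^2}{\sum_{i=1}^{n} (x_{i,1} - \bar{x}_1)^2} + \cdots + \frac{(x_{est,m} - \bar{x}_m)^2}{\sum_{i=1}^{n} (x_{i,m} - \bar{x}_m)^2}}$$

$m$: number of independent variables
$n$: number of historical data points
$d = n - (m + 1)$: degrees of freedom
$t(p, d)$: limit ($t$) that gives the "p" integral value from $-t$ to $t$
$p$: desired integral value (e.g., 0.7 for 70%)
$x_{i,j}$ $(1 \le i \le n;\ 1 \le j \le m)$: previous estimates
$\bar{x}_j$: mean of the $x_{i,j}$ over $i$
$x_{est,j}$ $(1 \le j \le m)$: new regression "input"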
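A minimal sketch of the multiple-regression extension follows: each independent variable contributes one more term (squared distance of the new input from that variable's historical mean, divided by that variable's summed squared deviations) under the square root. The function name `prediction_range_multi` and the row-per-data-point layout are assumptions made for illustration.

```python
import math


def prediction_range_multi(t, sigma, rows, x_est):
    """Range for m independent variables.

    t      -- t(p, d) value, assumed to be supplied by the caller
    sigma  -- standard deviation of the regression residuals
    rows   -- n historical data points, each a sequence of m x values
    x_est  -- the m new regression inputs
    """
    n = len(rows)
    m = len(x_est)
    under_root = 1.0 + 1.0 / n
    for j in range(m):
        column = [row[j] for row in rows]
        avg = sum(column) / n
        spread = sum((x - avg) ** 2 for x in column)
        under_root += (x_est[j] - avg) ** 2 / spread
    return t * sigma * math.sqrt(under_root)
```

With a single column (m = 1) this reduces to the one-variable formula.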
The σ ("sigma") value is computed in the following way.

$$\sigma = \sqrt{\frac{1}{n - (m + 1)} \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_{i,1} - \cdots - \beta_m x_{i,m}\right)^2}$$

Same for one independent variable ($m = 1$).

Alternate form:

$$\sigma = \sqrt{\frac{1}{d} \sum_{i=1}^{n} \left(y_i - y_{pred,i}\right)^2}$$

$x_{i,j}$ = previous independent variable values
$y_i$ = previous dependent variable (estimate) values
$n$ = number of previous estimates
$m$ = number of independent variables
$d = n - (m + 1)$ [degrees of freedom]
$\beta_j$ = regression coefficients calculated from previous data
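The sigma calculation can be sketched as below, using the alternate form: predict each historical y from the regression coefficients, sum the squared differences, divide by the degrees of freedom d = n - (m + 1), and take the square root. The function name `regression_sigma` and the argument layout are hypothetical; the coefficients are assumed to have been computed already.

```python
import math


def regression_sigma(rows, ys, betas):
    """Standard deviation of the regression residuals.

    rows   -- n historical data points, each a sequence of m x values
    ys     -- the n historical dependent-variable values
    betas  -- [beta_0, beta_1, ..., beta_m], assumed already calculated
    """
    n = len(ys)
    m = len(betas) - 1
    d = n - (m + 1)  # degrees of freedom
    if d <= 0:
        raise ValueError("need more data points than regression coefficients")
    total = 0.0
    for xi, yi in zip(rows, ys):
        y_pred = betas[0] + sum(b * x for b, x in zip(betas[1:], xi))
        total += (yi - y_pred) ** 2
    return math.sqrt(total / d)
```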
The range formula requires us to find the integration limit that yields the correct integral value.
For a 70% interval, we want p = 0.70.
For a 90% interval, we want p = 0.90.
Question: what integration limit “t” gives this value?
[Figure: t distribution with the area between -t and t marked; two-sided integral value = p]
We have to search (try t values) in order to find out.
When thinking about searching for the desired integral value, it may be helpful to plot the integral of the t-distribution function.
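[Figure: t probability density function for t from -6 to 6; curves labeled 1, 5, and 25 (presumably degrees of freedom)]
[Figure: integral of the t distribution from -t to t (p, from 0.00 to 1.00) versus t (from 0 to 6); curves labeled 1, 5, and 25]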
Hint: create a "function object" that calculates the two-sided p integral, given a specified t value.
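One way to follow this hint is sketched below: a small callable class that is built for a given number of degrees of freedom and returns the two-sided integral p when called with t. The class name `TwoSidedTIntegral`, the use of Simpson's rule, and the default of 200 segments are assumptions for illustration, not something the slides prescribe.

```python
import math


class TwoSidedTIntegral:
    """Callable that returns the integral of the t density from -t to t."""

    def __init__(self, dof, segments=200):
        if segments % 2:
            raise ValueError("Simpson's rule needs an even number of segments")
        self.dof = dof
        self.segments = segments
        # Normalizing constant of the t density for this many degrees of freedom.
        # (math.gamma overflows for very large dof; fine for small data sets.)
        self._norm = math.gamma((dof + 1) / 2.0) / (
            math.sqrt(dof * math.pi) * math.gamma(dof / 2.0)
        )

    def density(self, x):
        return self._norm * (1.0 + x * x / self.dof) ** (-(self.dof + 1) / 2.0)

    def __call__(self, t):
        # The density is symmetric, so integrate 0..t and double the result.
        h = t / self.segments
        total = self.density(0.0) + self.density(t)
        for k in range(1, self.segments):
            total += (4 if k % 2 else 2) * self.density(k * h)
        return 2.0 * total * h / 3.0


p_of_t = TwoSidedTIntegral(dof=8)
print(p_of_t(1.108))  # close to 0.70, per standard t tables for 8 degrees of freedom
```

Packaging the degrees of freedom inside the object means a search routine only has to deal with a one-argument function of t.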
The calculation needed is the reverse of that used in the significance calculation: there we calculated "p" for a given "t"; now we are seeking the "t" that will give us a desired "p".
[Figure: plot of p versus t]
For a specified "p" (2-sided integral) value, we want to find the corresponding "t" (integration limit).
How should we do the search?
The textbook suggests a state-machine approach that requires the function to be monotonic (pg. 246).
The sign of the error (desired versus actual "p") tells you whether to increase or decrease the trial "t" value to get closer to the desired answer.
[Figure: plot of p versus t]
The increase/decrease step size is halved when changing search direction.
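The search just described can be sketched as follows, under the assumption that p increases with t: the sign of the error picks the direction, and the step is halved each time the direction changes. The function name `find_t_step_halving`, the starting values, and the tolerance are hypothetical choices. For a self-contained demonstration the example uses the closed form for one degree of freedom, p = (2/π)·atan(t), in place of a numerical integral.

```python
import math


def find_t_step_halving(p_of_t, target_p, t_start=1.0, step=0.5,
                        tolerance=1e-6, max_iterations=200):
    """Search for t such that p_of_t(t) is within tolerance of target_p.

    Assumes p_of_t is monotonically increasing in t.
    """
    t = t_start
    last_sign = 0
    for _ in range(max_iterations):
        error = target_p - p_of_t(t)
        if abs(error) <= tolerance:
            return t
        sign = 1 if error > 0 else -1  # p too small: increase t; too big: decrease
        if last_sign and sign != last_sign:
            step /= 2.0  # direction changed, so halve the step size
        t += sign * step
        last_sign = sign
    raise RuntimeError("search did not converge")


def p_one_dof(t):
    """Two-sided integral of the t distribution with 1 degree of freedom."""
    return 2.0 / math.pi * math.atan(t)


print(find_t_step_halving(p_one_dof, 0.70))
```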
An alternative search method brackets the answer and bisects the interval.
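[State machine diagram: Search]
States: FindInterval, BisectSearch
Transitions (event / actions):
    start -> FindInterval: / Init, CheckInterval
    FindInterval -> FindInterval: NotInInterval / ExpandInterval, CheckInterval
    FindInterval -> BisectSearch: InInterval / CheckSegment
    FindInterval -> end: TimeoutInterval / Failure
    BisectSearch -> BisectSearch: InLeftSegment / SelectLeft, CheckSegment
    BisectSearch -> BisectSearch: InRightSegment / SelectRight, CheckSegment
    BisectSearch -> end: ValueOK / Success
    BisectSearch -> end: TimeoutBisect / Failure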
How does the interval bisection method work?
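[Figure: plot of p versus t, marking regions of positive error, error (+), and negative error, error (-), on either side of the desired value]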
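A minimal sketch of the bracket-and-bisect idea follows, assuming p increases with t and that p at the initial lower bound does not exceed the target. The first loop plays the role of finding an interval that contains the answer (expanding it when it does not); the second loop repeatedly keeps the half that contains the answer. The function name `find_t_bisection`, the doubling rule for expansion, and the iteration limits are assumptions. The demonstration again uses the closed form for one degree of freedom.

```python
import math


def find_t_bisection(p_of_t, target_p, t_low=0.0, t_high=1.0, tolerance=1e-6,
                     max_expansions=60, max_bisections=100):
    """Find t with p_of_t(t) within tolerance of target_p by bisection."""
    # Find an interval: expand until p(t_high) reaches the target.
    for _ in range(max_expansions):
        if p_of_t(t_high) >= target_p:
            break
        t_low, t_high = t_high, t_high * 2.0
    else:
        raise RuntimeError("could not bracket the answer")

    # Bisect: check the midpoint and keep the segment that holds the answer.
    for _ in range(max_bisections):
        t_mid = (t_low + t_high) / 2.0
        error = target_p - p_of_t(t_mid)
        if abs(error) <= tolerance:
            return t_mid
        if error > 0:
            t_low = t_mid   # p too small at midpoint: answer is in the right segment
        else:
            t_high = t_mid  # p too large at midpoint: answer is in the left segment
    raise RuntimeError("bisection did not converge")


def p_one_dof(t):
    """Two-sided integral of the t distribution with 1 degree of freedom."""
    return 2.0 / math.pi * math.atan(t)


print(find_t_bisection(p_one_dof, 0.90))
```

Unlike the step-halving search, the remaining interval is cut in half on every pass, so the number of evaluations needed for a given tolerance is predictable once the answer is bracketed.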