
Page 1: Simple Linear Regression: Prediction Instructor: G ...schwert.ssb.rochester.edu/a425/a425_pred.pdf · Simple Linear Regression: Prediction Instructor: G. William ... • Double-click

Prediction APS 425 - Advanced Managerial Data Analysis

(c) Prof. G. William Schwert, 2001-2015 1

APS 425, Fall 2015

Simple Linear Regression:

Prediction

Instructor: G. William Schwert, 585-275-2470

[email protected]

Ciba-Geigy Ritalin Experiment

• Ritalin is tested to see if it helps with Central Auditory Processing Disorder (CAPD)

– Similar symptoms to ADD/ADHD

• Experiment:

– “Randomly” select 64 children

– All receive auditory test

– 32 (control group) receive no drug (or placebo?)

– 32 (treatment group) receive varying doses of Ritalin

– All children are tested a second time


Ciba-Geigy Ritalin Experiment

• $\text{DOSAGE}_i$ = amount of Ritalin received by child $i$

– Measured in mg of Ritalin per kg of body weight

• $\text{IMPROVE}_i$ = child’s 2nd test score – 1st test score

– Dataset A425_ritalin.wf1 also contains:

• AGE of child in months

• Gender (FEMALE = 1, for girls)

Predictions


• Predictive model:

$\widehat{\text{IMPROVE}}_i = 0.226 + 12.18\,\text{DOSAGE}_i$ = the estimate of $E[\text{IMPROVE}_i \mid \text{DOSAGE}_i, b_0, b_1]$

• What is your estimate of the average IMPROVE score for all children who receive a dosage of 0.35 mg/kg?

$\widehat{\text{IMPROVE}}_i = 0.226 + 12.18 \times 0.35 = 4.488$

• This question asks about the average or expected value for all children who get a DOSAGE of 0.35 mg/kg.

Predictions

• A given child has been administered a DOSAGE of 0.35 mg/kg. What value do you predict for the child’s IMPROVE score?

$\widehat{\text{IMPROVE}}_i = 0.226 + 12.18 \times 0.35 = 4.488$

• This question asks you to predict the value for an individual child who gets a DOSAGE of 0.35 mg/kg.
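The point prediction on these two slides can be reproduced with a few lines of Python. This is only a sketch: the coefficients are the rounded values printed on the slide, and the function name `predict_improve` is a hypothetical helper, not part of the course materials. With the rounded coefficients the arithmetic gives 4.489; the slide's 4.488 presumably comes from unrounded estimates.

```python
# Rounded coefficient estimates as printed on the slide.
B0 = 0.226   # intercept
B1 = 12.18   # slope on DOSAGE (mg of Ritalin per kg of body weight)


def predict_improve(dosage: float) -> float:
    """Fitted IMPROVE at a given DOSAGE.

    The same number is used both as the estimate of the population mean
    and as the prediction for one individual child.
    """
    return B0 + B1 * dosage


print(round(predict_improve(0.35), 3))
```

The two questions share this point estimate; they differ only in the standard error attached to it.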

Predictions


• The predicted value for an individual child equals the estimated expected value for the population of all children with DOSAGE = 0.35 mg/kg (see the previous two slides)

• However, standard errors are different!

• Let’s derive them next

Predictions

Std Error of Predictions

• Standard error for predicting an individual value

– The linear model is: $Y_i = \beta_0 + \beta_1 X_i + e_i$

– Our prediction is: $\hat{Y}_i = b_0 + b_1 X_i$

– Three sources of error:

• Error in estimating $\beta_0$

• Error in estimating $\beta_1$

• Error in estimating $e_i$
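One way to see how the three sources of error combine is sketched below. It assumes the usual simple-regression conditions (independent errors with common variance $\sigma^2$, and a new observation $Y_0$ at $X_0$ whose error $e_0$ is independent of the sample), writes the model as $Y_i = \beta_0 + \beta_1 X_i + e_i$, and uses $\sigma^2$ and $X_0$ as notation introduced here rather than taken from the slides.

```latex
\begin{aligned}
Y_0 - \hat{Y}_0 &= e_0 - (b_0 - \beta_0) - (b_1 - \beta_1) X_0, \\
\hat{Y}_0 &= \bar{Y} + b_1 (X_0 - \bar{X})
  \quad \text{since } b_0 = \bar{Y} - b_1 \bar{X}, \\
\operatorname{Var}(\hat{Y}_0) &= \frac{\sigma^2}{n}
  + \frac{\sigma^2 (X_0 - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}
  \quad \text{because } \operatorname{Cov}(\bar{Y}, b_1) = 0, \\
\operatorname{Var}(Y_0 - \hat{Y}_0) &= \sigma^2 + \frac{\sigma^2}{n}
  + \frac{\sigma^2 (X_0 - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}.
\end{aligned}
```

Replacing $\sigma^2$ with the residual variance $s^2$ and taking the square root gives the standard error for an individual prediction.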


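• Standard error of $\hat{Y}_i$:

$$SDP = \left[\, s^2 + \frac{s^2}{n} + \frac{s^2 (X_i - \bar{X})^2}{\sum x_i^2} \,\right]^{1/2}$$

The three terms are, in order, the error term uncertainty, the intercept uncertainty, and the slope uncertainty,

where $s^2 = \sum \hat{e}_i^2 / (n-2)$ is the residual variance, $x_i = X_i - \bar{X}$ is the deviation of $X_i$ from the sample mean,

and $s^2 / \sum x_i^2$ is the variance of $b_1$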
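As a rough numerical check of this formula, the sketch below plugs in the rounded Ritalin summary statistics quoted later in the deck (s = 11.951, n = 64, SE(b1) = 5.723, mean DOSAGE = 0.257). The function name `sdp` and its argument names are illustrative choices, and the slope term is written as SE(b1) squared times the squared distance from the mean, which is the same quantity as the third term of the formula.

```python
import math


def sdp(s: float, n: int, se_b1: float, x: float, x_bar: float) -> float:
    """Standard error for predicting an individual value at X = x."""
    error_term = s ** 2                       # s^2
    intercept = s ** 2 / n                    # s^2 / n
    slope = se_b1 ** 2 * (x - x_bar) ** 2     # Var(b1) * (x - x_bar)^2
    return math.sqrt(error_term + intercept + slope)


# Rounded values as they appear on the slides.
ritalin_sdp = sdp(s=11.951, n=64, se_b1=5.723, x=0.35, x_bar=0.257)
print(round(ritalin_sdp, 3))
```

The error term dominates: it contributes about 142.8 of the roughly 145.3 total variance, which is why individual predictions stay imprecise even with 64 observations.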

Std Error of Predictions

Prediction Intervals

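• A $100(1-\alpha)\%$ confidence interval for an individual $Y_i$ (a prediction interval) is:

$[\hat{Y}_i - t_{\alpha/2}\, SDP,\ \hat{Y}_i + t_{\alpha/2}\, SDP]$

where $\Pr\{t_{n-2} > t_{\alpha/2}\} = \alpha/2$

• In the Ritalin case with one child getting a DOSAGE of 0.35 mg/kg, we have SDP = 12.056, so a 95% confidence interval for $\text{IMPROVE}_i$ (with $t_{0.025,\,62} \approx 2.00$) is

[4.488 – 2.00(12.056), 4.488 + 2.00(12.056)]

= [–19.61, 28.59]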
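The interval can be reproduced without a statistics package. The sketch below finds the critical value by integrating the t density with Simpson's rule and bisecting on the result; that numerical approach and the helper names (`t_pdf`, `t_cdf`, `t_critical`, `interval`) are assumptions made for illustration, not something the slides prescribe. In practice a statistics library would normally supply the quantile.

```python
import math


def t_pdf(t: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 + total * h / 3


def t_critical(alpha: float, df: int) -> float:
    """Value t with P(T > t) = alpha / 2, found by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < 1 - alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def interval(point: float, se: float, alpha: float, df: int):
    """Two-sided interval: point estimate +/- t critical value * se."""
    t = t_critical(alpha, df)
    return point - t * se, point + t * se


# Slide values: point estimate 4.488, SDP = 12.056, n = 64 observations.
low, high = interval(point=4.488, se=12.056, alpha=0.05, df=64 - 2)
print(round(low, 2), round(high, 2))
```

The computed critical value is slightly below 2.00, which would explain why the slide's endpoints differ a little from a plain 4.488 ± 2.00 × 12.056.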


Using Eviews to Get Prediction Intervals

• Redefine the workfile range so that you can generate an “out-of-sample” prediction

Using Eviews to Get Prediction Intervals

• Double-click dosage to open the spreadsheet

• Then click Edit+/-

• Move to observation 65 (which will say “NA”)

• Type .35 into the workspace bar to enter this value for the 65th observation


Using Eviews to Get Prediction Intervals

• Using the regression equation predicting IMPROVE as a function of DOSAGE, click FORECAST

• Then specify the forecast sample as 65 65

• This uses the value of DOSAGE = 0.35 you just entered

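[Screenshot: Eviews forecast output for observation 65, annotated with the upper limit, point estimate, and lower limit]

[Figure: Prediction Intervals for Improvement from DOSAGE. Horizontal axis: DOSAGE (0 to 0.8). Vertical axis: IMPROVE (–30 to 40). Series: Upper 95% PI, Predicted IMPROVE, Lower 95% PI. DOSAGE = 0.35 is marked.]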


Std Error of Prediction for the Average

• Standard error for predicting the expected or average value

– A group of children have been administered a DOSAGE of 0.35 mg/kg. What value do you predict for the average IMPROVE score of these children on the test?

$\widehat{\text{IMPROVE}}_i = 0.226 + 12.18 \times 0.35 = 4.488$

Std Error of Prediction for the Average

• Now let’s consider the standard error

– The linear model is: $Y_i = \beta_0 + \beta_1 X_i + e_i$

– We are predicting: $E[Y_0] = \beta_0 + \beta_1 X_0$

– Our prediction is: $\hat{Y}_i = b_0 + b_1 X_i$

– Two sources of uncertainty:

• Error in estimating $\beta_0$

• Error in estimating $\beta_1$


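• Standard error of $E(Y_i)$:

$$SEP = \left[\, \frac{s^2}{n} + \frac{s^2 (X_i - \bar{X})^2}{\sum x_i^2} \,\right]^{1/2}$$

The two terms are the intercept uncertainty and the slope uncertainty, with $x_i = X_i - \bar{X}$.

• A $100(1-\alpha)\%$ confidence interval for $E(Y_i)$ is:

$[\hat{Y}_i - t_{\alpha/2}\, SEP,\ \hat{Y}_i + t_{\alpha/2}\, SEP]$

where $\Pr\{t_{n-2} > t_{\alpha/2}\} = \alpha/2$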

Std Error of Prediction for the Average

• In the Ritalin case with a group of children getting a DOSAGE of 0.35 mg/kg, we have SEP = 1.585, so a 95% confidence interval for

$E(\text{IMPROVE}_i \mid \text{DOSAGE} = 0.35, b_0, b_1)$ is

[4.488 – 2.00(1.585), 4.488 + 2.00(1.585)]

= [1.319, 7.657]

Std Error of Prediction for the Average


[Figure: Prediction Intervals for Expected Improvement from DOSAGE. Horizontal axis: DOSAGE (0 to 0.8). Vertical axis: IMPROVE (–10 to 20). Series: Upper 95% PI, Predicted IMPROVE, Lower 95% PI.]

Note: the prediction intervals for expected improvement are much narrower, and their curvature is more apparent
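The narrowing and the curvature can be seen numerically by evaluating both standard errors across the sample range of dosages. The sketch below uses the rounded summary statistics quoted on the slides and the rounded critical value 2.00; the function names `sep` and `sdp` and the chosen grid of dosages are illustrative assumptions.

```python
import math

# Rounded summary statistics quoted on the slides.
S, N = 11.951, 64            # standard error of the regression, sample size
SE_B1, X_BAR = 5.723, 0.257  # standard error of the slope, mean DOSAGE
B0, B1 = 0.226, 12.18        # rounded coefficient estimates
T_CRIT = 2.00                # rounded critical value used on the slides


def sep(x: float) -> float:
    """Standard error of the estimated mean IMPROVE at DOSAGE = x."""
    return math.sqrt(S ** 2 / N + SE_B1 ** 2 * (x - X_BAR) ** 2)


def sdp(x: float) -> float:
    """Standard error for predicting one child's IMPROVE at DOSAGE = x."""
    return math.sqrt(S ** 2 + sep(x) ** 2)


print("dosage    fit  mean half-width  individual half-width")
for x in (0.0, 0.2, X_BAR, 0.4, 0.6, 0.71):
    fit = B0 + B1 * x
    print(f"{x:6.3f} {fit:6.2f} {T_CRIT * sep(x):16.2f} {T_CRIT * sdp(x):22.2f}")
```

The interval for the mean is narrowest at the mean dosage and roughly doubles in half-width by DOSAGE = 0.71, so its bowed shape is easy to see. The individual interval is about eight times wider near the mean dosage and changes little across the range, so it looks almost straight on the plot.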

Predictions and Eviews for DOSAGE = 0.35

$(SEP)^2 + (\text{SE of Regression})^2 = (SDP)^2$

$[1.585^2 + 11.951^2]^{1/2} = 12.056$

Note that SEP and SDP depend on $X_i = 0.35$

$(SEP)^2 = s^2 / n + (SE(b_1))^2 (X_i - \bar{X})^2$

$= 11.951^2 / 64 + (5.723)^2 (0.35 - 0.257)^2$

$= 2.513 \Rightarrow SEP = 1.585$


Predictions and Eviews

• The range of DOSAGE values in the data is 0 to 0.71

• Given this sample, the predictive model is:

$\widehat{\text{IMPROVE}}_i = 0.226 + 12.18 \times 0.35 = 4.488$

• The data give no support for whether this relation extends outside of the sample range (e.g., to dosages > 0.71 mg/kg)

• Predicting outside the sample range is called extrapolation

• Extrapolation is ill-advised and subject to much criticism
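One simple way to keep this caution in view is to have the prediction function flag any dosage outside the sample range. The sketch below is an illustrative assumption about how one might do that in Python; the function name and the decision to warn rather than raise an error are choices made here, not part of the course materials.

```python
import warnings

B0, B1 = 0.226, 12.18                 # rounded coefficients from the slides
DOSAGE_MIN, DOSAGE_MAX = 0.0, 0.71    # range of DOSAGE observed in the sample


def predict_improve(dosage: float) -> float:
    """Fitted IMPROVE, with a warning when the dosage is out of sample."""
    if not DOSAGE_MIN <= dosage <= DOSAGE_MAX:
        warnings.warn(
            f"DOSAGE = {dosage} is outside the sample range "
            f"[{DOSAGE_MIN}, {DOSAGE_MAX}]; this prediction is an extrapolation.",
            stacklevel=2,
        )
    return B0 + B1 * dosage


print(round(predict_improve(0.35), 3))   # inside the sample range, no warning
```

A warning still returns a number, which suits exploratory work; raising an exception instead would be the stricter choice.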

Predictions of IMPROVE from Eviews

Generate predicted values, “predict”, and the standard deviation of the prediction, “sdp”


Prediction Interval for IMPROVE from Eviews

Generate upper and lower limits for 95% prediction interval, “predup” and “preddown”

Create a “group” of “dosage”, “predict”, “predup”, and “preddown”
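For readers without Eviews, the same series can be built by hand. The sketch below fits the regression by ordinary least squares and generates "predict", "sdp", "predup", and "preddown". The ten data points are made up purely for illustration and are NOT the Ritalin dataset; the critical value 2.306 is the 97.5th percentile of the t distribution with 8 degrees of freedom, matching this toy sample size.

```python
import math

# Made-up illustrative data, NOT the Ritalin dataset.
dosage = [0.0, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.7]
improve = [-3.0, 4.0, 1.0, 6.0, 2.0, 9.0, 3.0, 12.0, 5.0, 11.0]
T_CRIT = 2.306   # 97.5th percentile of t with n - 2 = 8 degrees of freedom

n = len(dosage)
x_bar = sum(dosage) / n
y_bar = sum(improve) / n
sxx = sum((x - x_bar) ** 2 for x in dosage)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dosage, improve))

# Ordinary least squares estimates of the slope and intercept.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residual variance s^2 with n - 2 degrees of freedom.
resid = [y - (b0 + b1 * x) for x, y in zip(dosage, improve)]
s2 = sum(e * e for e in resid) / (n - 2)

# The four series described on the slides.
predict = [b0 + b1 * x for x in dosage]
sdp = [math.sqrt(s2 + s2 / n + s2 * (x - x_bar) ** 2 / sxx) for x in dosage]
predup = [p + T_CRIT * s for p, s in zip(predict, sdp)]
preddown = [p - T_CRIT * s for p, s in zip(predict, sdp)]

# The "group": one row per observation, predictor first.
group = list(zip(dosage, predict, predup, preddown))
for row in group:
    print(" ".join(f"{value:8.3f}" for value in row))
```

Keeping the predictor as the first element of each row mirrors the ordering Eviews expects when plotting one X against several Y series.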

Prediction Interval for IMPROVE from Eviews


Graph: XY line, One X against all Y’s

Make sure that the predictor variable, DOSAGE, is the first one in the set

Prediction Interval for IMPROVE from Eviews

Note that the 95% prediction interval for IMPROVE covers a wide range of outcomes for individual students

No assurance that any one child will improve if given Ritalin

Prediction Interval for IMPROVE from Eviews


Standard Error of the Regression Line for IMPROVE from Eviews

We will calculate the standard error of the regression line (SEP)

First, save the standard error of the regression, SER

Next, derive SEP from SDP and SER

Standard Errors Around the Regression Line for IMPROVE

Generate upper and lower limits for the 95% confidence interval around the regression line, “regrup” and “regrdown”
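The step of backing SEP out of SDP and SER can be sketched as follows, using the rounded slide values for the observation at DOSAGE = 0.35. The list-based layout and the variable names other than "regrup" and "regrdown" are assumptions for illustration. Because the inputs are rounded, the recovered SEP (about 1.588) differs slightly from the slide's 1.585.

```python
import math

SER = 11.951     # standard error of the regression, as quoted on the slides
T_CRIT = 2.00    # rounded critical value used on the slides

# One-element series for the observation at DOSAGE = 0.35 (slide values).
predict = [4.488]
sdp = [12.056]

# (SEP)^2 = (SDP)^2 - (SER)^2; guard against tiny negative rounding errors.
sep = [math.sqrt(max(d ** 2 - SER ** 2, 0.0)) for d in sdp]

regrup = [p + T_CRIT * e for p, e in zip(predict, sep)]
regrdown = [p - T_CRIT * e for p, e in zip(predict, sep)]

print(round(sep[0], 3), round(regrdown[0], 3), round(regrup[0], 3))
```

Subtracting two nearly equal squares magnifies rounding error, so this route is less precise than computing SEP directly from its formula when the underlying statistics are available.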


Create a “group” of “dosage”, “predict”, “regrup”, and “regrdown”

Standard Errors Around the Regression Line for IMPROVE

Graph simple scatter

Make sure that the predictor variable, DOSAGE, is the first one in the set

Standard Errors Around the Regression Line for IMPROVE


Note that the 95% confidence interval for the regression line is much narrower

For the dosages used in this experiment the entire interval covers positive improvement

Standard Errors Around the Regression Line for IMPROVE

Links

Ritalin Data: http://schwert.ssb.rochester.edu/a425/a425_ritalin.wf1

Return to APS 425 Home Page: http://schwert.ssb.rochester.edu/a425/a425main.htm