
Chapter 2: Simple Linear Regression

• 2.1 The Model
• 2.2-2.3 Parameter Estimation
• 2.4 Properties of Estimators
• 2.5 Inference
• 2.6 Prediction
• 2.7 Analysis of Variance
• 2.8 Regression Through the Origin
• 2.9 Related Models


2.1 The Model

• Measurement of y (the response) changes in a linear fashion with a setting of the variable x (the predictor):

$$y = \underbrace{\beta_0 + \beta_1 x}_{\text{linear relation}} + \underbrace{\varepsilon}_{\text{noise}}$$

◦ The linear relation is deterministic (non-random).

◦ The noise or error is random.

• Noise accounts for the variability of the observations about the straight line. No noise ⇒ the relation is deterministic. Increased noise ⇒ increased variability.


• Experiment with this simulation program:

simple.sim <- function(intercept = 0, slope = 1, x = seq(1, 10), sigma = 1) {
  noise <- rnorm(length(x), sd = sigma)
  y <- intercept + slope * x + noise
  title1 <- paste("sigma = ", sigma)
  plot(x, y, pch = 16, main = title1)
  abline(intercept, slope, col = 4, lwd = 2)
}

> source("simple.sim.R")

> simple.sim(sigma=.01)

> simple.sim(sigma=.1)

> simple.sim(sigma=1)

> simple.sim(sigma=10)


Simulation Examples

[Figure: four simulated scatterplots of y versus x with the true line overlaid, one for each of sigma = 0.01, 0.1, 1, and 10; the scatter about the line grows with sigma.]


The Setup

• Assumptions:
  1. $E[y|x] = \beta_0 + \beta_1 x$.
  2. $\mathrm{Var}(y|x) = \mathrm{Var}(\beta_0 + \beta_1 x + \varepsilon \mid x) = \sigma^2$.

• Data: Suppose data $y_1, y_2, \ldots, y_n$ are obtained at settings $x_1, x_2, \ldots, x_n$, respectively. Then the model on the data is
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$
(with the $\varepsilon_i$ i.i.d. $N(0, \sigma^2)$, so $E[y_i|x_i] = \beta_0 + \beta_1 x_i$). Either
  1. the x's are fixed values and measured without error (controlled experiment), OR
  2. the analysis is conditional on the observed values of x (observational study).


2.2-2.3 Parameter Estimation, Fitted Values and Residuals

1. Maximum Likelihood Estimation

Distributional assumptions are required

2. Least Squares Estimation

Distributional assumptions are not required


2.2.1 Maximum Likelihood Estimation

◦ Normal assumption is required:
$$f(y_i|x_i) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2\sigma^2}(y_i - \beta_0 - \beta_1 x_i)^2}$$

Likelihood:
$$L(\beta_0, \beta_1, \sigma) = \prod_{i=1}^n f(y_i|x_i) \propto \frac{1}{\sigma^n}\, e^{-\frac{1}{2\sigma^2}\sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)^2}$$

◦ Maximize with respect to $\beta_0$, $\beta_1$, and $\sigma^2$.

◦ $(\beta_0, \beta_1)$: equivalent to minimizing $\sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)^2$ (i.e. least squares). $\sigma^2$: the MLE is SSE/n, which is biased.
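• The equivalence is easy to check numerically. A minimal sketch (not from the slides): minimize the error sum of squares directly with optim() and compare with lm(), assuming the roller data frame introduced later in this chapter:

> sse <- function(beta, x, y) sum((y - beta[1] - beta[2] * x)^2)
> optim(c(0, 1), sse, x = roller$weight, y = roller$depression)$par
# ≈ (-2.09, 2.67), agreeing with the least-squares answer:
> coef(lm(depression ~ weight, data = roller))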


2.2.2 Least Squares Estimation

◦ Assumptions:

1. $E[\varepsilon_i] = 0$

2. $\mathrm{Var}(\varepsilon_i) = \sigma^2$

3. the $\varepsilon_i$'s are independent.

• Note that normality is not required.


Method

• Minimize
$$S(\beta_0, \beta_1) = \sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)^2$$
with respect to the parameters (regression coefficients) $\beta_0$ and $\beta_1$; the minimizers are denoted $\hat\beta_0$ and $\hat\beta_1$.

◦ Justification: we want the fitted line to pass as close to all of the points as possible.

◦ Aim: small residuals (observed minus fitted response values):
$$e_i = y_i - \hat\beta_0 - \hat\beta_1 x_i$$


Look at the following plots:

> source("roller2.plot")

> roller2.plot(a=14,b=0)

> roller2.plot(a=2,b=2)

> roller2.plot(a=12, b=1)

> roller2.plot(a=-2,b=2.67)


[Figure: four plots of depression in lawn (mm) versus roller weight (t), each showing the data values, a candidate fitted line, and the positive and negative residuals, for (a=14, b=0), (a=2, b=2), (a=12, b=1), and (a=−2, b=2.67).]

◦ The first three lines above do not pass as close to the plotted points as the fourth, even though the sum of the residuals is about the same in all four cases.

◦ Negative residuals cancel out positive residuals.


The Key: minimize squared residuals

> source("roller3.plot.R")
> roller3.plot(14,0); roller3.plot(2,2); roller3.plot(12,1)
> roller3.plot(a=-2, b=2.67)  # small SS

[Figure: the same four candidate lines, now with squared residuals displayed; the line with a=−2, b=2.67 gives the smallest sum of squares.]


Unbiased estimate of $\sigma^2$

$$\hat\sigma^2 = \frac{1}{n - \#\text{parameters estimated}}\sum_{i=1}^n e_i^2 = \frac{1}{n-2}\sum_{i=1}^n e_i^2$$

* n observations⇒ n degrees of freedom

* 2 degrees of freedom are required to estimate the parameters

* the residuals retain n− 2 degrees of freedom


Alternative Viewpoint

* $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ is a vector in n-dimensional space (n degrees of freedom).

* The fitted values $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$ also form a vector in n-dimensional space: $\hat{\mathbf{y}} = (\hat y_1, \hat y_2, \ldots, \hat y_n)$ (2 degrees of freedom).

* Least squares seeks to minimize the distance between $\mathbf{y}$ and $\hat{\mathbf{y}}$.

* The distance between n-dimensional vectors u and v is the square root of $\sum_{i=1}^n (u_i - v_i)^2$.

* Thus, the squared distance between $\mathbf{y}$ and $\hat{\mathbf{y}}$ is
$$\sum_{i=1}^n (y_i - \hat y_i)^2 = \sum_{i=1}^n e_i^2$$


Regression Coefficient Estimators

• The minimizers of
$$S(\beta_0, \beta_1) = \sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)^2$$
are
$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x \qquad\text{and}\qquad \hat\beta_1 = \frac{S_{xy}}{S_{xx}}$$
where
$$S_{xy} = \sum_{i=1}^n (x_i - \bar x)(y_i - \bar y) \qquad\text{and}\qquad S_{xx} = \sum_{i=1}^n (x_i - \bar x)^2$$


HomeMade R Estimators

> ls.est <- function(data) {
    x <- data[, 1]
    y <- data[, 2]
    xbar <- mean(x)
    ybar <- mean(y)
    Sxy <- S(x, y, xbar, ybar)
    Sxx <- S(x, x, xbar, xbar)
    b <- Sxy/Sxx
    a <- ybar - xbar*b
    list(a = a, b = b, data = data)
  }

> S <- function(x, y, xbar, ybar) {
    sum((x - xbar)*(y - ybar))
  }


Calculator Formulas and Other Properties

$$S_{xy} = \sum_{i=1}^n x_i y_i - \frac{\left(\sum_{i=1}^n x_i\right)\left(\sum_{i=1}^n y_i\right)}{n}$$

$$S_{xx} = \sum_{i=1}^n x_i^2 - \frac{\left(\sum_{i=1}^n x_i\right)^2}{n}$$

• Linearity property:
$$S_{xy} = \sum_{i=1}^n y_i (x_i - \bar x)$$
so $\hat\beta_1$ is linear in the $y_i$'s.

• Gauss-Markov Theorem: among linear unbiased estimators for $\beta_1$ and $\beta_0$, $\hat\beta_1$ and $\hat\beta_0$ are best (i.e. they have the smallest variance).

• Exercise: Find the expected value and variance of $\frac{y_1 - \bar y}{x_1 - \bar x}$. Is it unbiased for $\beta_1$? Is there a linear unbiased estimator with smaller variance?


Residuals

$$\hat\varepsilon_i = e_i = y_i - \hat\beta_0 - \hat\beta_1 x_i = y_i - \hat y_i = y_i - \bar y - \hat\beta_1 (x_i - \bar x)$$

• In R:

> res <- function(ls.object) {
    a <- ls.object$a
    b <- ls.object$b
    x <- ls.object$data[, 1]
    y <- ls.object$data[, 2]
    resids <- y - a - b*x
    resids
  }


2.3.1 Consequences of Least-Squares

1. $e_i = y_i - \bar y - \hat\beta_1(x_i - \bar x)$ (follows from the intercept formula)
2. $\sum_{i=1}^n e_i = 0$ (follows from 1.)
3. $\sum_{i=1}^n \hat y_i = \sum_{i=1}^n y_i$ (follows from 2.)
4. The regression line passes through the centroid $(\bar x, \bar y)$ (follows from the intercept formula)
5. $\sum x_i e_i = 0$ (set the partial derivative of $S(\beta_0, \beta_1)$ with respect to $\beta_1$ to 0)
6. $\sum \hat y_i e_i = 0$ (follows from 2. and 5.; see the numerical check below)
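A minimal numerical check of properties 2, 5, and 6 (a sketch, assuming the roller.lm fit introduced later in this chapter):

> e <- resid(roller.lm); yhat <- fitted(roller.lm)
> c(sum(e), sum(roller$weight * e), sum(yhat * e))  # all zero up to rounding error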


2.3.2 Estimation of $\sigma^2$

• The residual sum of squares or error sum of squares is given by
$$\mathrm{SSE} = \sum_{i=1}^n e_i^2 = S_{yy} - \hat\beta_1 S_{xy}$$
Note: $\mathrm{SSE} = \mathrm{SS}_{\mathrm{Res}}$ and $S_{yy} = \mathrm{SST}$.

• An unbiased estimator for the error variance is
$$\hat\sigma^2 = \mathrm{MSE} = \frac{\mathrm{SSE}}{n-2} = \frac{S_{yy} - \hat\beta_1 S_{xy}}{n-2}$$
Note: $\mathrm{MSE} = \mathrm{MS}_{\mathrm{Res}}$.

• $\hat\sigma$ = Residual Standard Error = $\sqrt{\mathrm{MSE}}$


Example

• roller data

   weight depression
1     1.9          2
2     3.1          1
3     3.3          5
4     4.8          5
5     5.3         20
6     6.1         20
7     6.4         23
8     7.6         10
9     9.8         30
10   12.4         25


Hand Calculation

• (y = depression, x = weight)

◦ $\sum_{i=1}^{10} x_i = 1.9 + 3.1 + \cdots + 12.4 = 60.7$

◦ $\bar x = \frac{60.7}{10} = 6.07$

◦ $\sum_{i=1}^{10} y_i = 2 + 1 + \cdots + 25 = 141$

◦ $\bar y = \frac{141}{10} = 14.1$

◦ $\sum_{i=1}^{10} x_i^2 = 1.9^2 + 3.1^2 + \cdots + 12.4^2 = 461$

◦ $\sum_{i=1}^{10} y_i^2 = 4 + 1 + 25 + \cdots + 625 = 3009$

◦ $\sum_{i=1}^{10} x_i y_i = (1.9)(2) + \cdots + (12.4)(25) = 1103$

◦ $S_{xx} = 461 - \frac{(60.7)^2}{10} = 92.6$


◦ $S_{xy} = 1103 - \frac{(60.7)(141)}{10} = 247$

◦ $S_{yy} = 3009 - \frac{(141)^2}{10} = 1021$

◦ $\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = \frac{247}{92.6} = 2.67$

◦ $\hat\beta_0 = \bar y - \hat\beta_1 \bar x = 14.1 - 2.67(6.07) = -2.11$

◦ $\hat\sigma^2 = \frac{1}{n-2}\left(S_{yy} - \hat\beta_1 S_{xy}\right) = \frac{1}{8}\left(1021 - 2.67(247)\right) = 45.2 = \mathrm{MSE}$

• Example summary: the fitted regression line relating depression (y) to weight (x) is
$$\hat y = -2.11 + 2.67x$$
The error variance is estimated as MSE = 45.2.


R commands (home-made version)

> roller.obj <- ls.est(roller)

> roller.obj[-3] # intercept and slope estimate

$a

[1] -2.09

$b

[1] 2.67

> res(roller.obj) # residuals

[1] -0.98 -5.18 -1.71 -5.71 7.95

[6] 5.82 8.02 -8.18 5.95 -5.98

> sum(res(roller.obj)^2)/8 # error variance (MSE)

[1] 45.4


R commands (built-in version)

> attach(roller)

> roller.lm <- lm(depression ~ weight)

> summary(roller.lm)

Call:

lm(formula = depression ~ weight)

Residuals:

Min 1Q Median 3Q Max

-8.180 -5.580 -1.346 5.920 8.020

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) -2.0871 4.7543 -0.439 0.67227

weight 2.6667 0.7002 3.808 0.00518


Residual standard error: 6.735 on 8 degrees of freedom

Multiple R-squared: 0.6445, Adjusted R-squared: 0.6001

F-statistic: 14.5 on 1 and 8 DF, p-value: 0.005175

> detach(roller)

• or

> roller.lm <- lm(depression ~ weight, data = roller)

> summary(roller.lm)


Using Extractor Functions

• Partial Output:

> coef(roller.lm)

(Intercept) weight

-2.09 2.67

> summary(roller.lm)$sigma

[1] 6.74

• From the output,

◦ the slope estimate is $\hat\beta_1 = 2.67$.

◦ the intercept estimate is $\hat\beta_0 = -2.09$.

◦ the Residual standard error is the square root of the MSE: $6.74^2 = 45.4$.


Other R commands

◦ fitted values: predict(roller.lm)

◦ residuals: resid(roller.lm)

◦ diagnostic plots: plot(roller.lm) (these include a plot of the residuals against the fitted values and a normal probability plot of the residuals)

◦ Also plot(roller); abline(roller.lm)

(this gives a plot of the data with the fitted line overlaid)


2.4: Properties of Least Squares Estimates

◦ $E[\hat\beta_1] = \beta_1$

◦ $E[\hat\beta_0] = \beta_0$

◦ $\mathrm{Var}(\hat\beta_1) = \dfrac{\sigma^2}{S_{xx}}$

◦ $\mathrm{Var}(\hat\beta_0) = \sigma^2\left(\dfrac{1}{n} + \dfrac{\bar x^2}{S_{xx}}\right)$


Standard Error Estimators

◦ $\widehat{\mathrm{Var}}(\hat\beta_1) = \frac{\mathrm{MSE}}{S_{xx}}$, so the standard error (s.e.) of $\hat\beta_1$ is estimated by $\sqrt{\mathrm{MSE}/S_{xx}}$.

roller e.g.: MSE = 45.2, $S_{xx}$ = 92.6
⇒ s.e. of $\hat\beta_1$ is $\sqrt{45.2/92.6} = .699$

◦ $\widehat{\mathrm{Var}}(\hat\beta_0) = \mathrm{MSE}\left(\frac{1}{n} + \frac{\bar x^2}{S_{xx}}\right)$, so the standard error (s.e.) of $\hat\beta_0$ is estimated by
$$\sqrt{\mathrm{MSE}\left(\frac{1}{n} + \frac{\bar x^2}{S_{xx}}\right)}$$

roller e.g.: MSE = 45.2, $S_{xx}$ = 92.6, $\bar x$ = 6.07, n = 10
⇒ s.e. of $\hat\beta_0$ is $\sqrt{45.2\left(\frac{1}{10} + \frac{6.07^2}{92.6}\right)} = 4.74$
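These standard errors also appear in the coefficient table; a quick check, assuming the roller.lm fit from above:

> summary(roller.lm)$coefficients[, "Std. Error"]
(Intercept)      weight
     4.7543      0.7002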


Distributions of $\hat\beta_1$ and $\hat\beta_0$

◦ $y_i$ is $N(\beta_0 + \beta_1 x_i,\ \sigma^2)$, and
$$\hat\beta_1 = \sum_{i=1}^n a_i y_i \qquad \left(a_i = \frac{x_i - \bar x}{S_{xx}}\right)$$
⇒ $\hat\beta_1$ is $N\!\left(\beta_1,\ \frac{\sigma^2}{S_{xx}}\right)$.

Also, $\frac{\mathrm{SSE}}{\sigma^2}$ is $\chi^2_{n-2}$ (independent of $\hat\beta_1$), so
$$\frac{\hat\beta_1 - \beta_1}{\sqrt{\mathrm{MSE}/S_{xx}}} \sim t_{n-2}$$

◦ $\hat\beta_0 = \sum_{i=1}^n c_i y_i \qquad \left(c_i = \frac{1}{n} - \frac{\bar x(x_i - \bar x)}{S_{xx}}\right)$
⇒ $\hat\beta_0$ is $N\!\left(\beta_0,\ \sigma^2\left(\frac{1}{n} + \frac{\bar x^2}{S_{xx}}\right)\right)$ and
$$\frac{\hat\beta_0 - \beta_0}{\sqrt{\mathrm{MSE}\left(\frac{1}{n} + \frac{\bar x^2}{S_{xx}}\right)}} \sim t_{n-2}$$


2.5 Inferences about the Regression Parameters

• Tests

• Confidence Intervals

for $\beta_1$ and $\beta_0 + \beta_1 x_0$


2.5.1 Inference for $\beta_1$

$$H_0: \beta_1 = \beta_{10} \qquad\text{vs.}\qquad H_1: \beta_1 \neq \beta_{10}$$

Under $H_0$,
$$t_0 = \frac{\hat\beta_1 - \beta_{10}}{\sqrt{\mathrm{MSE}/S_{xx}}}$$
has a t-distribution on n−2 degrees of freedom.

p-value = $P(|t_{n-2}| > |t_0|)$

e.g. testing significance of regression for roller:
$$H_0: \beta_1 = 0 \qquad\text{vs.}\qquad H_1: \beta_1 \neq 0$$
$$t_0 = \frac{2.67 - 0}{.699} = 3.82$$
p-value = $P(|t_8| > 3.82) = 2(1 - P(t_8 < 3.82)) = .00509$

R command: 2*(1-pt(3.82,8))


$(1-\alpha)$ Confidence Intervals

◦ slope:
$$\hat\beta_1 \pm t_{n-2,\alpha/2}\,\text{s.e.} \qquad\text{or}\qquad \hat\beta_1 \pm t_{n-2,\alpha/2}\sqrt{\mathrm{MSE}/S_{xx}}$$

roller e.g. (95% confidence interval):
$$2.67 \pm t_{8,.025}(.699) \qquad\text{or}\qquad 2.67 \pm 2.31(.699) = 2.67 \pm 1.61$$
R command: qt(.975, 8)

◦ intercept:
$$\hat\beta_0 \pm t_{n-2,\alpha/2}\,\text{s.e.}$$
e.g. $-2.11 \pm 2.31(4.74) = -2.11 \pm 10.9$
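The built-in confint() returns both intervals at once, without the hand rounding (a quick check; values below are rounded):

> confint(roller.lm)
              2.5 % 97.5 %
(Intercept) -13.05   8.88
weight        1.05   4.28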


2.5.2 Confidence Interval for Mean Response

◦ $E[y|x_0] = \beta_0 + \beta_1 x_0$, where $x_0$ is a possible value that the predictor could take.

◦ $\widehat{E[y|x_0]} = \hat\beta_0 + \hat\beta_1 x_0$ is a point estimate of the mean response at that value.

◦ To find a $1-\alpha$ confidence interval for $E[y|x_0]$, we need the variance of $\hat y_0 = \widehat{E[y|x_0]}$:
$$\mathrm{Var}(\hat y_0) = \mathrm{Var}(\hat\beta_0 + \hat\beta_1 x_0)$$
where
$$\hat\beta_0 = \sum_{i=1}^n c_i y_i \qquad\text{and}\qquad \hat\beta_1 x_0 = \sum_{i=1}^n b_i x_0 y_i$$
(here $b_i = \frac{x_i - \bar x}{S_{xx}}$, the weights called $a_i$ earlier).


Therefore,
$$\hat y_0 = \sum_{i=1}^n (c_i + b_i x_0) y_i$$
so
$$\mathrm{Var}(\hat y_0) = \sum_{i=1}^n (c_i + b_i x_0)^2\,\mathrm{Var}(y_i) = \sigma^2 \sum_{i=1}^n (c_i + b_i x_0)^2 = \sigma^2 \sum_{i=1}^n \left(\frac{1}{n} + \frac{(x_0 - \bar x)(x_i - \bar x)}{S_{xx}}\right)^2 = \sigma^2\left(\frac{1}{n} + \frac{(x_0 - \bar x)^2}{S_{xx}}\right)$$

◦ $\widehat{\mathrm{Var}}(\hat y_0) = \mathrm{MSE}\left(\frac{1}{n} + \frac{(x_0 - \bar x)^2}{S_{xx}}\right)$


◦ The confidence interval is then
$$\hat y_0 \pm t_{n-2,\alpha/2}\sqrt{\mathrm{MSE}\left(\frac{1}{n} + \frac{(x_0 - \bar x)^2}{S_{xx}}\right)}$$

◦ e.g. Compute a 95% confidence interval for the expected depression when the weight is 5 tonnes.

$x_0 = 5$, $\hat y_0 = -2.11 + 2.67(5) = 11.2$
$n = 10$, $\bar x = 6.07$, $S_{xx} = 92.6$, $\mathrm{MSE} = 45.2$

Interval:
$$11.2 \pm t_{8,.025}\sqrt{45.2\left(\frac{1}{10} + \frac{(5 - 6.07)^2}{92.6}\right)} = 11.2 \pm 2.31\sqrt{5.08} = 11.2 \pm 5.21 = (5.99,\ 16.41)$$


◦ R code:

> predict(roller.lm,
+         newdata = data.frame(weight = 5),
+         interval = "confidence")
      fit  lwr  upr
[1,] 11.2 6.04 16.5

◦ Ex. Write your own R function to compute this interval.
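One possible solution sketch, built directly from the formulas above (ci.mean is a made-up name):

ci.mean <- function(x, y, x0, level = 0.95) {
  n <- length(x); xbar <- mean(x)
  Sxx <- sum((x - xbar)^2)
  b1 <- sum((x - xbar)*(y - mean(y)))/Sxx    # slope estimate
  b0 <- mean(y) - b1*xbar                    # intercept estimate
  mse <- sum((y - b0 - b1*x)^2)/(n - 2)      # error variance estimate
  y0 <- b0 + b1*x0                           # estimated mean response
  half <- qt(1 - (1 - level)/2, n - 2) *
    sqrt(mse*(1/n + (x0 - xbar)^2/Sxx))
  c(fit = y0, lwr = y0 - half, upr = y0 + half)
}
# ci.mean(roller$weight, roller$depression, 5) should reproduce (6.04, 16.5)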


2.6: Predicted Responses

◦ If a new observation were to be taken at $x_0$, we would predict it to be $y_0 = \beta_0 + \beta_1 x_0 + \varepsilon$, where $\varepsilon$ is independent noise.

◦ $\hat y_0 = \hat\beta_0 + \hat\beta_1 x_0$ is a point prediction of the response at that value.

◦ To find a $1-\alpha$ prediction interval for $y_0$, we need the variance of $y_0 - \hat y_0$:
$$\mathrm{Var}(y_0 - \hat y_0) = \mathrm{Var}(\beta_0 + \beta_1 x_0 + \varepsilon - \hat\beta_0 - \hat\beta_1 x_0) = \mathrm{Var}(\hat\beta_0 + \hat\beta_1 x_0) + \mathrm{Var}(\varepsilon) = \sigma^2\left(\frac{1}{n} + \frac{(x_0 - \bar x)^2}{S_{xx}}\right) + \sigma^2$$


$$= \sigma^2\left(1 + \frac{1}{n} + \frac{(x_0 - \bar x)^2}{S_{xx}}\right)$$

◦ $\widehat{\mathrm{Var}}(y_0 - \hat y_0) = \mathrm{MSE}\left(1 + \frac{1}{n} + \frac{(x_0 - \bar x)^2}{S_{xx}}\right)$

◦ The prediction interval is then
$$\hat y_0 \pm t_{n-2,\alpha/2}\sqrt{\mathrm{MSE}\left(1 + \frac{1}{n} + \frac{(x_0 - \bar x)^2}{S_{xx}}\right)}$$

◦ e.g. Compute a 95% prediction interval for the depression for a single new observation where the weight is 5 tonnes.

$x_0 = 5$, $\hat y_0 = -2.11 + 2.67(5) = 11.2$
$n = 10$, $\bar x = 6.07$, $S_{xx} = 92.6$, $\mathrm{MSE} = 45.2$

Interval:
$$11.2 \pm t_{8,.025}\sqrt{45.2\left(1 + \frac{1}{10} + \frac{(5 - 6.07)^2}{92.6}\right)} = 11.2 \pm 2.31\sqrt{50.3} = 11.2 \pm 16.4 = (-5.2,\ 27.6)$$


R code for Prediction Intervals

> predict(roller.lm,
+         newdata = data.frame(weight = 5),
+         interval = "prediction")
      fit   lwr  upr
[1,] 11.2 -5.13 27.6

◦ Write your own R function to produce this interval.
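A sketch: the hypothetical ci.mean() above adapts directly; only the variance term changes, gaining the extra "1 +" for the new observation's own noise (pi.new is again a made-up name):

pi.new <- function(x, y, x0, level = 0.95) {
  n <- length(x); xbar <- mean(x)
  Sxx <- sum((x - xbar)^2)
  b1 <- sum((x - xbar)*(y - mean(y)))/Sxx
  b0 <- mean(y) - b1*xbar
  mse <- sum((y - b0 - b1*x)^2)/(n - 2)
  y0 <- b0 + b1*x0
  half <- qt(1 - (1 - level)/2, n - 2) *
    sqrt(mse*(1 + 1/n + (x0 - xbar)^2/Sxx))  # the "1 +" is the only change
  c(fit = y0, lwr = y0 - half, upr = y0 + half)
}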


Degrees of Freedom

• A random sample of size n coming from a normal population with mean $\mu$ and variance $\sigma^2$ has n degrees of freedom: $Y_1, \ldots, Y_n$.

• Each linearly independent restriction reduces the number of degrees of freedom by 1.

• $\sum_{i=1}^n \frac{(Y_i - \mu)^2}{\sigma^2}$ has a $\chi^2_{(n)}$ distribution.

• $\sum_{i=1}^n \frac{(Y_i - \bar Y)^2}{\sigma^2}$ has a $\chi^2_{(n-1)}$ distribution. (Calculating $\bar Y$ imposes one linear restriction.)

• $\sum_{i=1}^n \frac{(Y_i - \hat Y_i)^2}{\sigma^2}$ has a $\chi^2_{(n-2)}$ distribution. (Calculating $\hat\beta_0$ and $\hat\beta_1$ imposes two linearly independent restrictions.)

• Given the $x_i$'s, there are 2 degrees of freedom in the quantity $\hat\beta_0 + \hat\beta_1 x_i$.

• Given $x_i$ and $\bar Y$, there is one degree of freedom: $\hat y_i - \bar Y = \hat\beta_1(x_i - \bar x)$.


2.7: Analysis of Variance: Breaking down (analyzing) Variation

◦ The variation in the data (responses) is summarized by
$$S_{yy} = \sum_{i=1}^n (y_i - \bar y)^2 = \mathrm{TSS}$$
(Total sum of squares)

◦ 2 sources of variation in the responses:
  1. variation due to the straight-line relationship with the predictor
  2. deviation from the line (noise)

◦ $$\underbrace{y_i - \bar y}_{\text{deviation from data center}} = \underbrace{y_i - \hat y_i}_{\text{residual}} + \underbrace{\hat y_i - \bar y}_{\text{difference: line and center}}$$
i.e. $y_i - \bar y = e_i + \hat y_i - \bar y$


so
$$S_{yy} = \sum (y_i - \bar y)^2 = \sum (e_i + \hat y_i - \bar y)^2 = \sum e_i^2 + \sum (\hat y_i - \bar y)^2$$
since
$$\sum e_i \hat y_i = 0 \qquad\text{and}\qquad \bar y \sum e_i = 0$$
Hence
$$S_{yy} = \mathrm{SSE} + \sum (\hat y_i - \bar y)^2 = \mathrm{SSE} + \mathrm{SSR}$$
The last term is the regression sum of squares.


Relation between SSR and $\hat\beta_1$

◦ We saw earlier that
$$\mathrm{SSE} = S_{yy} - \hat\beta_1 S_{xy}$$

◦ Therefore,
$$\mathrm{SSR} = \hat\beta_1 S_{xy} = \hat\beta_1^2 S_{xx}$$
Note that, for a given set of x's, SSR depends only on $\hat\beta_1$.

$$\mathrm{MSR} = \mathrm{SSR}/\mathrm{d.f.} = \mathrm{SSR}/1$$
(1 degree of freedom for the slope parameter)


Expected Sums of Squares

$$E[\mathrm{SSR}] = E[S_{xx}\hat\beta_1^2] = S_{xx}\left(\mathrm{Var}(\hat\beta_1) + (E[\hat\beta_1])^2\right) = S_{xx}\left(\frac{\sigma^2}{S_{xx}} + \beta_1^2\right) = \sigma^2 + \beta_1^2 S_{xx} = E[\mathrm{MSR}]$$

Therefore, if $\beta_1 = 0$, then SSR is an unbiased estimator for $\sigma^2$.

◦ Development of E[MSE]:
$$E[S_{yy}] = E\left[\sum (y_i - \bar y)^2\right] = E\left[\sum y_i^2 - n\bar y^2\right] = \sum E[y_i^2] - nE[\bar y^2]$$


Development of E[MSE] (cont'd)

Consider the 2 terms on the RHS separately:

1st term:
$$E[y_i^2] = \mathrm{Var}(y_i) + (E[y_i])^2 = \sigma^2 + (\beta_0 + \beta_1 x_i)^2$$
$$\sum E[y_i^2] = n\sigma^2 + n\beta_0^2 + 2n\beta_0\beta_1\bar x + \sum \beta_1^2 x_i^2$$

2nd term:
$$E[\bar y^2] = \mathrm{Var}(\bar y) + (E[\bar y])^2 = \sigma^2/n + (\beta_0 + \beta_1\bar x)^2$$
$$nE[\bar y^2] = \sigma^2 + n\beta_0^2 + 2n\beta_0\beta_1\bar x + n\beta_1^2\bar x^2$$


Development of E[MSE] (cont'd)

$$E[S_{yy}] = (n-1)\sigma^2 + \beta_1^2\sum (x_i - \bar x)^2 = E[\mathrm{SST}]$$

$$E[\mathrm{SSE}] = E[S_{yy}] - E[\mathrm{SSR}] = (n-1)\sigma^2 + \beta_1^2\sum(x_i - \bar x)^2 - \left(\sigma^2 + \beta_1^2 S_{xx}\right) = (n-2)\sigma^2$$

$$E[\mathrm{MSE}] = E[\mathrm{SSE}/(n-2)] = \sigma^2$$


Another approach to testing $H_0: \beta_1 = 0$

◦ Under the null hypothesis, both MSE and MSR estimate $\sigma^2$.

◦ Under the alternative, only MSE estimates $\sigma^2$:
$$E[\mathrm{MSR}] = \sigma^2 + \beta_1^2 S_{xx} > \sigma^2$$

◦ A reasonable test is
$$F_0 = \frac{\mathrm{MSR}}{\mathrm{MSE}} \sim F_{1,n-2}$$

◦ Large $F_0$ ⇒ evidence against $H_0$.

◦ Note $t_\nu^2 = F_{1,\nu}$, so this is really the same test as
$$t_0^2 = \left(\frac{\hat\beta_1}{\sqrt{\mathrm{MSE}/S_{xx}}}\right)^2 = \frac{\hat\beta_1^2 S_{xx}}{\mathrm{MSE}} = \frac{\mathrm{MSR}}{\mathrm{MSE}}$$


The ANOVA table

Source   df    SS                               MS                       F
Reg.     1     $\hat\beta_1^2 S_{xx}$           $\hat\beta_1^2 S_{xx}$   MSR/MSE
Error    n−2   $S_{yy} - \hat\beta_1^2 S_{xx}$  SSE/(n−2)
Total    n−1   $S_{yy}$

roller data example:

> anova(roller.lm)  # R code
Analysis of Variance Table

Response: depression
          Df Sum Sq Mean Sq F value Pr(>F)
weight     1    658     658    14.5 0.0052
Residuals  8    363      45

(Recall that the t-statistic for testing $\beta_1 = 0$ had been $3.81 = \sqrt{14.5}$.)

◦ Ex. Write an R function to compute these ANOVA quantities.
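One possible sketch (anova.simple is a made-up name; it rebuilds the table entries from the formulas above):

anova.simple <- function(x, y) {
  n <- length(x)
  Sxx <- sum((x - mean(x))^2)
  Sxy <- sum((x - mean(x))*(y - mean(y)))
  Syy <- sum((y - mean(y))^2)
  SSR <- Sxy^2/Sxx                 # = b1^2 * Sxx
  SSE <- Syy - SSR
  MSE <- SSE/(n - 2)
  F0 <- SSR/MSE
  data.frame(source = c("Reg.", "Error", "Total"),
             df = c(1, n - 2, n - 1),
             SS = c(SSR, SSE, Syy),
             MS = c(SSR, MSE, NA),
             F = c(F0, NA, NA),
             p = c(1 - pf(F0, 1, n - 2), NA, NA))
}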


Confidence Interval for $\sigma^2$

◦ $\frac{\mathrm{SSE}}{\sigma^2} \sim \chi^2_{n-2}$, so
$$P\left(\chi^2_{n-2,1-\alpha/2} \le \frac{\mathrm{SSE}}{\sigma^2} \le \chi^2_{n-2,\alpha/2}\right) = 1 - \alpha$$
so
$$P\left(\frac{\mathrm{SSE}}{\chi^2_{n-2,\alpha/2}} \le \sigma^2 \le \frac{\mathrm{SSE}}{\chi^2_{n-2,1-\alpha/2}}\right) = 1 - \alpha$$

e.g. roller data:

SSE = 363
$\chi^2_{8,.975} = 2.18$ (R code: qchisq(.025, 8))
$\chi^2_{8,.025} = 17.5$ (R code: qchisq(.975, 8))

$(363/17.5,\ 363/2.18) = (20.7,\ 166.5)$
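The whole interval in two lines of R (a sketch, assuming roller.lm from above):

> SSE <- sum(resid(roller.lm)^2)        # 363
> SSE / qchisq(c(.975, .025), df = 8)   # (20.7, 166.5)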


2.7.1 $R^2$ - Coefficient of Determination

◦ $R^2$ is the fraction of the response variability explained by the regression:
$$R^2 = \frac{\mathrm{SSR}}{S_{yy}}$$

◦ $0 \le R^2 \le 1$. Values near 1 imply that most of the variability is explained by the regression.

◦ roller data: SSR = 658 and $S_{yy}$ = 1021, so
$$R^2 = \frac{658}{1021} = .644$$


R output

> summary(roller.lm)

...

Multiple R-Squared: 0.644, ...

◦ Ex. Write an R function which computes $R^2$.
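Two one-line sketches for this exercise, either rebuilding $R^2$ from the ANOVA table or extracting the stored value:

> anova(roller.lm)[1, "Sum Sq"] / sum(anova(roller.lm)[, "Sum Sq"])  # SSR/Syy
> summary(roller.lm)$r.squared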

◦ Another interpretation:
$$E[R^2] \doteq \frac{E[\mathrm{SSR}]}{E[S_{yy}]} = \frac{\beta_1^2 S_{xx} + \sigma^2}{(n-1)\sigma^2 + \beta_1^2 S_{xx}} = \frac{\beta_1^2\frac{S_{xx}}{n-1} + \frac{\sigma^2}{n-1}}{\sigma^2 + \beta_1^2\frac{S_{xx}}{n-1}} \doteq \frac{\beta_1^2\frac{S_{xx}}{n-1}}{\sigma^2 + \beta_1^2\frac{S_{xx}}{n-1}}$$
for large n. (Note: this differs from the textbook.)


Properties of $R^2$

• Thus, $R^2$ increases as
  1. $S_{xx}$ increases (the x's become more spread out)
  2. $\sigma^2$ decreases

◦ Cautions
  1. $R^2$ does not measure the magnitude of the regression slope.
  2. $R^2$ does not measure the appropriateness of the linear model.
  3. A large value of $R^2$ does not imply that the regression model will be an accurate predictor.


Hazards of Regression

• Extrapolation: predicting y values outside the range of observed x values. There is no guarantee that a future response would behave in the same linear manner outside the observed range.

e.g. Consider an experiment with a spring. The spring is stretched to several different lengths x (in cm) and the restoring force F (in Newtons) is measured:

  x    F
  3  5.1
  4  6.2
  5  7.9
  6  9.5

> spring.lm <- lm(F ~ x - 1, data = spring)
> summary(spring.lm)

Coefficients:

Page 57: Chapter 2: Simple Linear Regression 2.2-2.3 Parameter ... · PDF fileChapter 2: Simple Linear Regression ... Experiment with this simulation program: ... title1

  Estimate Std. Error t value Pr(>|t|)
x   1.5884     0.0232    68.6  6.8e-06

The fitted model relating F to x is
$$\hat F = 1.58x$$

Can we predict the restoring force for the spring if it has been extended to a length of 15 cm?

• High leverage observations: x values at the extremes of the range have more influence on the slope of the regression than observations near the middle of the range.

• Outliers can distort the regression line. Outliers may be incorrectly recorded OR may be an indication that the linear relation or constant variance assumption is incorrect.


• A regression relationship does not mean that there is a cause-and-effect relationship. e.g. The following data give the number of lawyers and the number of homicides in a given year for a number of towns:

  no. lawyers  no. homicides
            1              0
            2              0
            7              2
           10              5
           12              6
           14              6
           15              7
           18              8

Note that the number of homicides increases with the number of lawyers. Does this mean that in order to reduce the number of homicides, one should reduce the number of lawyers?


• Beware of nonsense relationships. e.g. It is possible to show that the area of some lakes in Manitoba is related to elevation. Do you think there is a real reason for this? Or is the apparent relation just a result of chance?


2.9 - Regression through the Origin

◦ intercept = 0:
$$y_i = \beta_1 x_i + \varepsilon_i$$

◦ Maximum likelihood and least squares ⇒ minimize
$$\sum_{i=1}^n (y_i - \beta_1 x_i)^2$$
giving
$$\hat\beta_1 = \frac{\sum x_i y_i}{\sum x_i^2}, \qquad e_i = y_i - \hat\beta_1 x_i, \qquad \mathrm{SSE} = \sum e_i^2$$

◦ Maximum likelihood ⇒
$$\hat\sigma^2 = \frac{\mathrm{SSE}}{n}$$


◦ Unbiased estimator:
$$\hat\sigma^2 = \mathrm{MSE} = \frac{\mathrm{SSE}}{n-1}$$

◦ Properties of $\hat\beta_1$:
$$E[\hat\beta_1] = \beta_1, \qquad \mathrm{Var}(\hat\beta_1) = \frac{\sigma^2}{\sum x_i^2}$$

◦ $1-\alpha$ C.I. for $\beta_1$:
$$\hat\beta_1 \pm t_{n-1,\alpha/2}\sqrt{\frac{\mathrm{MSE}}{\sum x_i^2}}$$

◦ $1-\alpha$ C.I. for $E[y|x_0]$:
$$\hat y_0 \pm t_{n-1,\alpha/2}\sqrt{\frac{\mathrm{MSE}\,x_0^2}{\sum x_i^2}}$$
since
$$\mathrm{Var}(\hat\beta_1 x_0) = \frac{\sigma^2}{\sum x_i^2}\,x_0^2$$


◦ $1-\alpha$ P.I. for y, given $x_0$:
$$\hat y_0 \pm t_{n-1,\alpha/2}\sqrt{\mathrm{MSE}\left(1 + \frac{x_0^2}{\sum x_i^2}\right)}$$

◦ R code:

> roller.lm <- lm(depression ~ weight - 1, data = roller)
> summary(roller.lm)

Coefficients:
       Estimate Std. Error t value Pr(>|t|)
weight    2.392      0.299    7.99  2.2e-05

Residual standard error: 6.43 on 9 degrees of freedom
Multiple R-Squared: 0.876, Adjusted R-squared: 0.863
F-statistic: 63.9 on 1 and 9 DF, p-value: 2.23e-005


> predict(roller.lm,
+         newdata = data.frame(weight = 5),
+         interval = "prediction")
      fit   lwr  upr
[1,] 12.0 -2.97 26.9


Ch. 2.9.2 Correlation

• Bivariate Normal Distribution:
$$f(x, y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\, e^{-\frac{1}{2(1-\rho^2)}\left(Y^2 - 2\rho XY + X^2\right)}$$
where
$$X = \frac{x - \mu_1}{\sigma_1} \qquad\text{and}\qquad Y = \frac{y - \mu_2}{\sigma_2}$$

◦ correlation coefficient:
$$\rho = \frac{\sigma_{12}}{\sigma_1\sigma_2}$$

◦ $\mu_1$ and $\sigma_1^2$ are the mean and variance of x

◦ $\mu_2$ and $\sigma_2^2$ are the mean and variance of y

◦ $\sigma_{12} = E[(x - \mu_1)(y - \mu_2)]$, the covariance


Conditional distribution of y given x

$$f(y|x) = \frac{1}{\sqrt{2\pi}\,\sigma_{1.2}}\, e^{-\frac{1}{2}\left(\frac{y - \beta_0 - \beta_1 x}{\sigma_{1.2}}\right)^2}$$
where
$$\beta_0 = \mu_2 - \mu_1\rho\frac{\sigma_2}{\sigma_1}, \qquad \beta_1 = \frac{\sigma_2}{\sigma_1}\rho, \qquad \sigma_{1.2}^2 = \sigma_2^2(1 - \rho^2)$$


An Explanation:

◦ Suppose
$$y = \beta_0 + \beta_1 x + \varepsilon$$
where $\varepsilon$ and x are independent $N(0, \sigma^2)$ and $N(\mu_1, \sigma_1^2)$ random variables.

◦ Define $\mu_2 = E[y]$ and $\sigma_2^2 = \mathrm{Var}(y)$:
$$\mu_2 = \beta_0 + \beta_1\mu_1, \qquad \sigma_2^2 = \beta_1^2\sigma_1^2 + \sigma^2$$

◦ Define $\sigma_{12} = \mathrm{Cov}(x - \mu_1, y - \mu_2)$:
$$\sigma_{12} = E[(x - \mu_1)(y - \mu_2)] = E[(x - \mu_1)(\beta_1 x + \varepsilon - \beta_1\mu_1)] = \beta_1 E[(x - \mu_1)^2] + E[(x - \mu_1)\varepsilon] = \beta_1\sigma_1^2$$


◦ Define $\rho = \frac{\sigma_{12}}{\sigma_1\sigma_2}$:
$$\rho = \frac{\beta_1\sigma_1^2}{\sigma_1\sigma_2} = \beta_1\frac{\sigma_1}{\sigma_2}$$
Therefore,
$$\beta_1 = \frac{\sigma_2}{\sigma_1}\rho \qquad\text{and}\qquad \beta_0 = \mu_2 - \beta_1\mu_1 = \mu_2 - \mu_1\frac{\sigma_2}{\sigma_1}\rho$$

◦ What is the conditional distribution of y, given $x = x$?
$$y|_{x=x} = \beta_0 + \beta_1 x + \varepsilon$$
must be normal with mean
$$E[y|x = x] = \beta_0 + \beta_1 x = \mu_2 + \rho\frac{\sigma_2}{\sigma_1}(x - \mu_1)$$


and variance
$$\mathrm{Var}(y|x = x) = \sigma_{1.2}^2 = \sigma^2 = \sigma_2^2 - \beta_1^2\sigma_1^2 = \sigma_2^2 - \rho^2\frac{\sigma_2^2}{\sigma_1^2}\sigma_1^2 = \sigma_2^2(1 - \rho^2)$$


Estimation

◦ Maximum likelihood estimation (using the bivariate normal) gives
$$\hat\beta_0 = \bar y - \hat\beta_1\bar x, \qquad \hat\beta_1 = \frac{S_{xy}}{S_{xx}}, \qquad r = \hat\rho = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}$$

◦ Note that
$$r = \hat\beta_1\sqrt{\frac{S_{xx}}{S_{yy}}}$$
so
$$r^2 = \hat\beta_1^2\frac{S_{xx}}{S_{yy}} = R^2$$


i.e. the coefficient of determination is the square of the correlation coefficient.
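A quick numerical confirmation with the roller data (a sketch; the two values agree):

> cor(roller$weight, roller$depression)^2
[1] 0.6445
> summary(lm(depression ~ weight, data = roller))$r.squared
[1] 0.6445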


Testing $H_0: \rho = 0$ vs. $H_1: \rho \neq 0$

◦ Conditional approach (equivalent to testing $\beta_1 = 0$):
$$t_0 = \frac{\hat\beta_1}{\sqrt{\mathrm{MSE}/S_{xx}}} = \frac{\hat\beta_1}{\sqrt{\frac{S_{yy} - \hat\beta_1^2 S_{xx}}{(n-2)S_{xx}}}} = \sqrt{\frac{\hat\beta_1^2(n-2)}{\frac{S_{yy}}{S_{xx}} - \hat\beta_1^2}} = \sqrt{\frac{(n-2)r^2}{1 - r^2}}$$
where we have used $\hat\beta_1^2 = r^2 S_{yy}/S_{xx}$.


◦ The above statistic is a t statistic on n−2 degrees of freedom ⇒ conclude $\rho \neq 0$ if the p-value $P(|t_{n-2}| > |t_0|)$ is small.


Testing $H_0: \rho = \rho_0$

$$Z = \frac{1}{2}\log\frac{1+r}{1-r}$$
has an approximate normal distribution with mean
$$\mu_Z = \frac{1}{2}\log\frac{1+\rho}{1-\rho}$$
and variance
$$\sigma_Z^2 = \frac{1}{n-3}$$
for large n.

◦ Thus,
$$Z_0 = \frac{\log\frac{1+r}{1-r} - \log\frac{1+\rho_0}{1-\rho_0}}{2\sqrt{1/(n-3)}}$$
has an approximate standard normal distribution when the null hypothesis is true.
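This test is short to code. A minimal sketch (rho.z.test is a made-up name; r is the sample correlation and n the sample size):

rho.z.test <- function(r, n, rho0 = 0) {
  # Fisher z statistic for H0: rho = rho0
  z0 <- (log((1 + r)/(1 - r)) - log((1 + rho0)/(1 - rho0))) /
        (2 * sqrt(1/(n - 3)))
  c(z0 = z0, p.value = 2 * (1 - pnorm(abs(z0))))
}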


Confidence Interval for $\rho$

• Confidence interval for $\frac{1}{2}\log\frac{1+\rho}{1-\rho}$:
$$Z \pm z_{\alpha/2}\sqrt{1/(n-3)}$$

• Find the endpoints $(l, u)$ of this confidence interval and solve for $\rho$:
$$\left(\frac{e^{2l} - 1}{1 + e^{2l}},\ \frac{e^{2u} - 1}{1 + e^{2u}}\right)$$


R code for fossum example

• Find a 95% confidence interval for the correlation between total length and head length:

> source("fossum.R")

> attach(fossum)

> n <- length(totlngth) # sample size

> r <- cor(totlngth,hdlngth)# correlation est.

> zci <- .5*log((1+r)/(1-r)) +

qnorm(c(.025,.975))*sqrt(1/(n-3))

> ci <- (exp(2*zci)-1)/(1+exp(2*zci))

> ci # transformed conf. interval

> detach(fossum)

[1] 0.62 0.83 # conf. interval for true

# correlation
