Inference in Simple Linear Regression

KNNL – Chapter 2

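Least Squares Estimate of β1

$$
b_1 = \frac{\sum_{i=1}^n (X_i-\bar X)(Y_i-\bar Y)}{\sum_{i=1}^n (X_i-\bar X)^2} = \frac{SS_{XY}}{SS_{XX}}
$$

Note the following results:

$$
\sum_{i=1}^n (X_i-\bar X) = \sum_{i=1}^n X_i - n\bar X = n\bar X - n\bar X = 0
$$

$$
\sum_{i=1}^n (X_i-\bar X)^2 = \sum_{i=1}^n (X_i-\bar X)X_i - \bar X\sum_{i=1}^n (X_i-\bar X) = \sum_{i=1}^n (X_i-\bar X)X_i
$$

$$
\sum_{i=1}^n (X_i-\bar X)(Y_i-\bar Y) = \sum_{i=1}^n (X_i-\bar X)Y_i - \bar Y\sum_{i=1}^n (X_i-\bar X) = \sum_{i=1}^n (X_i-\bar X)Y_i
$$

Therefore b1 is a linear function of the Y values:

$$
b_1 = \frac{\sum_{i=1}^n (X_i-\bar X)Y_i}{SS_{XX}} = \sum_{i=1}^n k_i Y_i
\qquad\text{where: } k_i = \frac{X_i-\bar X}{SS_{XX}}, \quad SS_{XX} = \sum_{i=1}^n (X_i-\bar X)^2
$$

Note:

$$
\sum_{i=1}^n k_i = \frac{\sum_{i=1}^n (X_i-\bar X)}{SS_{XX}} = 0
\qquad
\sum_{i=1}^n k_i X_i = \frac{\sum_{i=1}^n (X_i-\bar X)X_i}{SS_{XX}} = \frac{SS_{XX}}{SS_{XX}} = 1
\qquad
\sum_{i=1}^n k_i^2 = \frac{\sum_{i=1}^n (X_i-\bar X)^2}{SS_{XX}^2} = \frac{1}{SS_{XX}}
$$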
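A small numerical check can make the weight identities concrete. The sketch below uses a made-up five-point data set (the values of `X` and `Y` are assumptions for illustration, not from the text) and verifies that the ratio form of b1 agrees with the linear form Σ k_i Y_i, and that the weights satisfy Σ k_i = 0, Σ k_i X_i = 1 and Σ k_i² = 1/SS_XX.

```python
# Hypothetical toy data, chosen only for illustration.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

ss_xx = sum((x - x_bar) ** 2 for x in X)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

# Slope from the ratio form SS_XY / SS_XX.
b1_ratio = ss_xy / ss_xx

# Slope as a linear combination of the Y values with weights k_i.
k = [(x - x_bar) / ss_xx for x in X]
b1_linear = sum(ki * y for ki, y in zip(k, Y))

# The three weight identities.
sum_k = sum(k)
sum_kx = sum(ki * x for ki, x in zip(k, X))
sum_k2 = sum(ki ** 2 for ki in k)

print("b1 (ratio form): ", b1_ratio)
print("b1 (linear form):", b1_linear)
print("sum k, sum kX, sum k^2, 1/SS_XX:", sum_k, sum_kx, sum_k2, 1 / ss_xx)
```

Writing b1 as Σ k_i Y_i is what makes its sampling distribution easy to derive, since a linear combination of independent normal variables is itself normal.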

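Sampling Distribution of b1 – Normal Error Model

Note the following results assuming independent Normal Random Variables:

$$
Y_1,\ldots,Y_n:\ Y_i \sim N(\mu_i,\sigma_i^2)
\quad\Rightarrow\quad
U = \sum_{i=1}^n a_i Y_i \sim N(\mu_U,\sigma_U^2)
$$

$$
\text{where: } \mu_U = E\{U\} = \sum_{i=1}^n a_i\mu_i, \qquad \sigma_U^2 = \sigma^2\{U\} = \sum_{i=1}^n a_i^2\sigma_i^2
$$

Simple linear regression model with normal and independent errors:

$$
Y_i \sim N(\beta_0+\beta_1 X_i,\ \sigma^2)
$$

$$
b_1 = \sum_{i=1}^n k_i Y_i
\quad\text{where: } k_i = \frac{X_i-\bar X}{SS_{XX}}
\quad\text{with: } \sum_{i=1}^n k_i = 0,\ \ \sum_{i=1}^n k_i X_i = 1,\ \ \sum_{i=1}^n k_i^2 = \frac{1}{SS_{XX}}
$$

$$
\mu_{b_1} = E\{b_1\} = \sum_{i=1}^n k_i(\beta_0+\beta_1 X_i) = \beta_0\sum_{i=1}^n k_i + \beta_1\sum_{i=1}^n k_i X_i = \beta_0(0)+\beta_1(1) = \beta_1
$$

$$
\sigma_{b_1}^2 = \sigma^2\{b_1\} = \sum_{i=1}^n k_i^2\sigma^2 = \sigma^2\sum_{i=1}^n k_i^2 = \frac{\sigma^2}{SS_{XX}}
$$

$$
b_1 \sim N\!\left(\beta_1,\ \frac{\sigma^2}{SS_{XX}}\right)
$$

In practice σ² is unknown and is estimated by s² = MSE:

$$
s^2\{b_1\} = \frac{s^2}{SS_{XX}} = \frac{MSE}{SS_{XX}}, \qquad s\{b_1\} = \sqrt{\frac{MSE}{SS_{XX}}}
$$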

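Sampling Distribution of (b1-β1)/s{b1}

$$
b_1 \sim N\!\left(\beta_1,\ \frac{\sigma^2}{SS_{XX}}\right)
\quad\Rightarrow\quad
\frac{b_1-\beta_1}{\sigma\{b_1\}} = \frac{b_1-\beta_1}{\sqrt{\sigma^2/SS_{XX}}} \sim N(0,1)
$$

$$
\frac{(n-2)s^2}{\sigma^2} = \frac{(n-2)MSE}{\sigma^2} \sim \chi^2(n-2)
\qquad\text{also: }
\frac{(n-2)s^2\{b_1\}}{\sigma^2\{b_1\}} = \frac{(n-2)s^2/SS_{XX}}{\sigma^2/SS_{XX}} = \frac{(n-2)s^2}{\sigma^2}
\quad\text{and } b_1,\ s^2 \text{ independent}
$$

$$
\Rightarrow\quad
\frac{\dfrac{b_1-\beta_1}{\sigma\{b_1\}}}{\sqrt{\dfrac{(n-2)s^2\{b_1\}}{\sigma^2\{b_1\}}\Big/(n-2)}}
= \frac{b_1-\beta_1}{s\{b_1\}} \sim t(n-2)
$$

$$
\Pr\left\{ t(\alpha/2;\,n-2) \le \frac{b_1-\beta_1}{s\{b_1\}} \le t(1-\alpha/2;\,n-2) \right\} = 1-\alpha
$$

$$
\Pr\left\{ -t(1-\alpha/2;\,n-2) \le \frac{b_1-\beta_1}{s\{b_1\}} \le t(1-\alpha/2;\,n-2) \right\} = 1-\alpha,
\qquad s\{b_1\} = \frac{s}{\sqrt{SS_{XX}}}
$$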

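(1-α)100% Confidence Interval for β1

$$
\frac{b_1-\beta_1}{s\{b_1\}} \sim t(n-2), \qquad s\{b_1\} = \frac{s}{\sqrt{SS_{XX}}}
$$

$$
\Pr\left\{ -t(1-\alpha/2;\,n-2) \le \frac{b_1-\beta_1}{s\{b_1\}} \le t(1-\alpha/2;\,n-2) \right\} = 1-\alpha
$$

$$
\Rightarrow\ \Pr\left\{ -t(1-\alpha/2;\,n-2)\,s\{b_1\} \le b_1-\beta_1 \le t(1-\alpha/2;\,n-2)\,s\{b_1\} \right\} = 1-\alpha
$$

$$
\Rightarrow\ \Pr\left\{ -b_1-t(1-\alpha/2;\,n-2)\,s\{b_1\} \le -\beta_1 \le -b_1+t(1-\alpha/2;\,n-2)\,s\{b_1\} \right\} = 1-\alpha
$$

$$
\Rightarrow\ \Pr\left\{ b_1-t(1-\alpha/2;\,n-2)\,s\{b_1\} \le \beta_1 \le b_1+t(1-\alpha/2;\,n-2)\,s\{b_1\} \right\} = 1-\alpha
$$

$$
(1-\alpha)100\%\ \text{Confidence Interval for } \beta_1:\quad b_1 \pm t(1-\alpha/2;\,n-2)\,s\{b_1\}
$$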

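Test of Hypothesis for β1

2-sided test: H0: β1 = β10, HA: β1 ≠ β10 (Almost always β10 = 0)

$$
\text{Test Statistic: } t^* = \frac{b_1-\beta_{10}}{s\{b_1\}} \qquad \text{Note: if } \beta_{10}=0,\ \ t^* = \frac{b_1}{s\{b_1\}}
$$

$$
\text{Decision Rule: } |t^*| \ge t(1-\alpha/2;\,n-2) \Rightarrow \text{Reject } H_0, \text{ otherwise Fail to Reject } H_0
$$

$$
\text{P-value: } 2\Pr\{t(n-2) \ge |t^*|\}
$$

Upper-tail test: H0: β1 ≤ β10, HA: β1 > β10

$$
\text{Decision Rule: } t^* \ge t(1-\alpha;\,n-2) \Rightarrow \text{Reject } H_0, \text{ otherwise Fail to Reject } H_0
$$

$$
\text{P-value: } \Pr\{t(n-2) \ge t^*\}
$$

Lower-tail test: H0: β1 ≥ β10, HA: β1 < β10

$$
\text{Decision Rule: } t^* \le -t(1-\alpha;\,n-2) \Rightarrow \text{Reject } H_0, \text{ otherwise Fail to Reject } H_0
$$

$$
\text{P-value: } \Pr\{t(n-2) \le t^*\}
$$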
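The confidence interval and the two-sided test for β1 can be sketched numerically. This is a minimal illustration only: the data are made up, and the critical value `T_CRIT` is hard-coded from a standard t table for n - 2 = 3 degrees of freedom at α = 0.05 rather than computed, so that the example needs nothing beyond the Python standard library.

```python
import math

# Hypothetical toy data (n = 5), for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]
T_CRIT = 3.182  # t(0.975; 3), copied from a standard t table

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
ss_xx = sum((x - x_bar) ** 2 for x in X)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

# Least squares estimates.
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# MSE = SSE / (n - 2) estimates sigma^2.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
mse = sse / (n - 2)

# Estimated standard error of b1.
s_b1 = math.sqrt(mse / ss_xx)

# 95% confidence interval: b1 +/- t(1 - alpha/2; n - 2) * s{b1}.
ci_lower = b1 - T_CRIT * s_b1
ci_upper = b1 + T_CRIT * s_b1

# Two-sided test of H0: beta1 = 0.
t_star = b1 / s_b1
reject_h0 = abs(t_star) >= T_CRIT

print("b1 =", b1, " s{b1} =", s_b1)
print("95% CI for beta1:", (ci_lower, ci_upper))
print("t* =", t_star, " reject H0:", reject_h0)
```

The two procedures agree by construction: the two-sided test rejects H0: β1 = 0 at level α exactly when 0 falls outside the (1-α)100% interval.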

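Inference Concerning β0

$$
b_0 = \bar Y - b_1\bar X
\qquad\text{where: } \bar Y = \sum_{i=1}^n \frac{1}{n}Y_i \ \text{ and } \ b_1 = \sum_{i=1}^n k_i Y_i = \sum_{i=1}^n \frac{X_i-\bar X}{SS_{XX}}\,Y_i
$$

$$
E\{b_0\} = E\{\bar Y\} - \bar X E\{b_1\} = (\beta_0+\beta_1\bar X) - \beta_1\bar X = \beta_0
$$

$$
\sigma^2\{b_0\} = \sigma^2\{\bar Y - b_1\bar X\} = \sigma^2\{\bar Y\} + \bar X^2\sigma^2\{b_1\} - 2\bar X\,\sigma\{\bar Y, b_1\}
= \frac{\sigma^2}{n} + \frac{\bar X^2\sigma^2}{SS_{XX}} - 0
= \sigma^2\left[\frac{1}{n}+\frac{\bar X^2}{SS_{XX}}\right]
$$

$$
\text{Aside: } \sigma\{\bar Y, b_1\} = \sigma\left\{\sum_{i=1}^n \frac{1}{n}Y_i,\ \sum_{i=1}^n \frac{X_i-\bar X}{SS_{XX}}Y_i\right\}
= \sum_{i=1}^n \frac{1}{n}\cdot\frac{X_i-\bar X}{SS_{XX}}\,\sigma^2 = 0
$$

$$
b_0 \sim N\!\left(\beta_0,\ \sigma^2\left[\frac{1}{n}+\frac{\bar X^2}{SS_{XX}}\right]\right)
$$

$$
\text{Estimated Standard Error: } s\{b_0\} = s\sqrt{\frac{1}{n}+\frac{\bar X^2}{SS_{XX}}} = \sqrt{MSE\left[\frac{1}{n}+\frac{\bar X^2}{SS_{XX}}\right]}
$$

$$
\frac{b_0-\beta_0}{s\{b_0\}} \sim t(n-2)
\qquad
(1-\alpha)100\%\ \text{CI for } \beta_0:\quad b_0 \pm t(1-\alpha/2;\,n-2)\,s\{b_0\}
$$

Test Statistic for testing H0: β0 = β00 vs HA: β0 ≠ β00:

$$
t^* = \frac{b_0-\beta_{00}}{s\{b_0\}} \qquad \text{Reject } H_0 \text{ if } |t^*| \ge t(1-\alpha/2;\,n-2)
$$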

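Interval Estimation of E{Yh} = β0+β1Xh

Goal: Estimate population mean when X = Xh:

$$
\text{Parameter: } E\{Y_h\} = \beta_0+\beta_1 X_h
$$

$$
\text{Estimator: } \hat Y_h = b_0+b_1X_h = (\bar Y - b_1\bar X) + b_1X_h = \bar Y + b_1(X_h-\bar X)
$$

$$
E\{\hat Y_h\} = \beta_0+\beta_1X_h
$$

$$
\sigma^2\{\hat Y_h\} = \sigma^2\{\bar Y + b_1(X_h-\bar X)\}
= \sigma^2\{\bar Y\} + (X_h-\bar X)^2\sigma^2\{b_1\} + 2(X_h-\bar X)\,\sigma\{\bar Y,b_1\}
= \sigma^2\left[\frac{1}{n}+\frac{(X_h-\bar X)^2}{SS_{XX}}\right]
$$

$$
\text{Note: } \sigma\{\bar Y, b_1\} = 0
$$

$$
s^2\{\hat Y_h\} = s^2\left[\frac{1}{n}+\frac{(X_h-\bar X)^2}{SS_{XX}}\right] = MSE\left[\frac{1}{n}+\frac{(X_h-\bar X)^2}{SS_{XX}}\right]
$$

$$
(1-\alpha)100\%\ \text{CI for } E\{Y_h\} = \beta_0+\beta_1X_h:\quad \hat Y_h \pm t(1-\alpha/2;\,n-2)\,s\{\hat Y_h\}
$$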

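Prediction Interval for Yh(new) when X=Xh

Goal: Predict a new (future) observation when X = Xh:

$$
\text{Target: } Y_{h(\text{new})} = \beta_0+\beta_1X_h+\varepsilon_{h(\text{new})}
$$

$$
\text{Prediction: } \hat Y_h = b_0+b_1X_h
$$

$$
\text{Prediction Error: } Y_{h(\text{new})} - \hat Y_h
$$

$$
\sigma^2\{\text{pred}\} = \sigma^2\{Y_{h(\text{new})} - \hat Y_h\} = \sigma^2\{Y_{h(\text{new})}\} + \sigma^2\{\hat Y_h\}
= \sigma^2 + \sigma^2\left[\frac{1}{n}+\frac{(X_h-\bar X)^2}{SS_{XX}}\right]
= \sigma^2\left[1+\frac{1}{n}+\frac{(X_h-\bar X)^2}{SS_{XX}}\right]
$$

$$
\text{Note: } \sigma\{Y_{h(\text{new})},\ \hat Y_h\} = 0
$$

$$
s^2\{\text{pred}\} = s^2\left[1+\frac{1}{n}+\frac{(X_h-\bar X)^2}{SS_{XX}}\right] = MSE\left[1+\frac{1}{n}+\frac{(X_h-\bar X)^2}{SS_{XX}}\right]
$$

$$
(1-\alpha)100\%\ \text{Prediction Interval for } Y_{h(\text{new})}:\quad \hat Y_h \pm t(1-\alpha/2;\,n-2)\,s\{\text{pred}\}
$$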
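To see how the interval for the mean response differs from the prediction interval for a new observation, the following sketch evaluates both at a chosen Xh. The data, the helper name `interval_at`, and the hard-coded critical value `T_CRIT` (taken from a t table for 3 degrees of freedom) are all assumptions made for this illustration.

```python
import math

# Hypothetical toy data (n = 5), for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]
T_CRIT = 3.182  # t(0.975; 3), copied from a standard t table

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
ss_xx = sum((x - x_bar) ** 2 for x in X)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / ss_xx
b0 = y_bar - b1 * x_bar
mse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y)) / (n - 2)


def interval_at(x_h):
    """Return (fitted value, CI for E{Y_h}, PI for Y_h(new)) at X = x_h."""
    y_hat = b0 + b1 * x_h
    core = 1.0 / n + (x_h - x_bar) ** 2 / ss_xx
    s_mean = math.sqrt(mse * core)          # s{Y-hat_h}
    s_pred = math.sqrt(mse * (1.0 + core))  # s{pred}
    mean_ci = (y_hat - T_CRIT * s_mean, y_hat + T_CRIT * s_mean)
    pred_pi = (y_hat - T_CRIT * s_pred, y_hat + T_CRIT * s_pred)
    return y_hat, mean_ci, pred_pi


for x_h in (3.0, 5.0):
    y_hat, mean_ci, pred_pi = interval_at(x_h)
    print("X_h =", x_h, " fitted =", y_hat)
    print("  CI for mean:", mean_ci)
    print("  PI for new: ", pred_pi)
```

Both intervals are narrowest at Xh = X̄ and widen as Xh moves away from it. The prediction interval is always the wider of the two because it carries the extra σ² term for the new observation's own error.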

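Confidence Band for Regression Line

Goal: Simultaneously estimate the population mean for all values X = Xh (not extrapolating):

$$
\text{Parameter: } E\{Y_h\} = \beta_0+\beta_1X_h
$$

$$
\text{Estimator: } \hat Y_h = b_0+b_1X_h = (\bar Y - b_1\bar X) + b_1X_h = \bar Y + b_1(X_h-\bar X)
$$

$$
E\{\hat Y_h\} = \beta_0+\beta_1X_h
\qquad
\sigma^2\{\hat Y_h\} = \sigma^2\left[\frac{1}{n}+\frac{(X_h-\bar X)^2}{SS_{XX}}\right]
$$

$$
s^2\{\hat Y_h\} = s^2\left[\frac{1}{n}+\frac{(X_h-\bar X)^2}{SS_{XX}}\right] = MSE\left[\frac{1}{n}+\frac{(X_h-\bar X)^2}{SS_{XX}}\right]
$$

$$
(1-\alpha)100\%\ \text{CI for } E\{Y_h\} = \beta_0+\beta_1X_h:\quad \hat Y_h \pm W\,s\{\hat Y_h\},
\qquad W = \sqrt{2F(1-\alpha;\,2,\,n-2)}
$$

This can be used for any number of specific X levels, simultaneously.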

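Analysis of Variance Approach to Regression

$$
\text{Deviation of } i^{\text{th}} \text{ observation from the Mean: } Y_i - \bar Y
$$

$$
\text{Total Sum of Squares: } SSTO = \sum_{i=1}^n (Y_i-\bar Y)^2
$$

$$
\text{Deviation of } i^{\text{th}} \text{ observation from the Regression Line: } Y_i - \hat Y_i
$$

$$
\text{Error Sum of Squares: } SSE = \sum_{i=1}^n (Y_i-\hat Y_i)^2
$$

$$
\text{Deviation of } i^{\text{th}} \text{ fitted value from the Mean: } \hat Y_i - \bar Y
$$

$$
\text{Regression Sum of Squares: } SSR = \sum_{i=1}^n (\hat Y_i-\bar Y)^2
$$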

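ANOVA Partitioning - I

$$
Y_i - \bar Y = (\hat Y_i - \bar Y) + (Y_i - \hat Y_i)
$$

$$
(Y_i-\bar Y)^2 = (\hat Y_i-\bar Y)^2 + (Y_i-\hat Y_i)^2 + 2(\hat Y_i-\bar Y)(Y_i-\hat Y_i)
$$

$$
\sum_{i=1}^n (Y_i-\bar Y)^2 = \sum_{i=1}^n (\hat Y_i-\bar Y)^2 + \sum_{i=1}^n (Y_i-\hat Y_i)^2 + 2\sum_{i=1}^n (\hat Y_i-\bar Y)(Y_i-\hat Y_i)
$$

Note (from normal equations, Chapter 1, Slide 5):

$$
\sum_{i=1}^n (\hat Y_i-\bar Y)(Y_i-\hat Y_i) = \sum_{i=1}^n (\hat Y_i-\bar Y)e_i = \sum_{i=1}^n \hat Y_i e_i - \bar Y\sum_{i=1}^n e_i = 0 - 0 = 0
$$

$$
\Rightarrow\quad \sum_{i=1}^n (Y_i-\bar Y)^2 = \sum_{i=1}^n (\hat Y_i-\bar Y)^2 + \sum_{i=1}^n (Y_i-\hat Y_i)^2
\qquad\text{i.e. } SSTO = SSR + SSE
$$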
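The partition SSTO = SSR + SSE is easy to confirm numerically. The sketch below uses made-up data (an assumption for illustration) and checks that the cross-product term vanishes for a least squares fit, and that SSR equals b1² SS_XX.

```python
# Hypothetical toy data, for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
ss_xx = sum((x - x_bar) ** 2 for x in X)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / ss_xx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * x for x in X]

# The three sums of squares.
ssto = sum((y - y_bar) ** 2 for y in Y)
ssr = sum((f - y_bar) ** 2 for f in fitted)
sse = sum((y - f) ** 2 for y, f in zip(Y, fitted))

# Cross-product term; it is zero (up to rounding) for least squares fits.
cross = sum((f - y_bar) * (y - f) for y, f in zip(Y, fitted))

print("SSTO =", ssto)
print("SSR + SSE =", ssr + sse)
print("cross term =", cross)
print("b1^2 * SS_XX =", b1 ** 2 * ss_xx)
```

The cross term vanishes only because b0 and b1 satisfy the normal equations; for an arbitrary line through the data the three sums of squares would not add up.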

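ANOVA Partitioning

Note useful result regarding SSR:

$$
SSR = \sum_{i=1}^n (\hat Y_i-\bar Y)^2 = \sum_{i=1}^n (b_0+b_1X_i-\bar Y)^2 = \sum_{i=1}^n (\bar Y - b_1\bar X + b_1X_i - \bar Y)^2
$$

$$
= \sum_{i=1}^n (b_1X_i - b_1\bar X)^2 = b_1^2\sum_{i=1}^n (X_i-\bar X)^2 = b_1^2\,SS_{XX}
$$

Degrees of Freedom associated with each sum of squares:

$$
\text{Total: } SSTO = \sum_{i=1}^n (Y_i-\bar Y)^2 \qquad df_{TO} = n-1 \quad \text{(One parameter estimated)}
$$

$$
\text{Error: } SSE = \sum_{i=1}^n (Y_i-\hat Y_i)^2 \qquad df_E = n-2 \quad \text{(Two parameters estimated)}
$$

$$
\text{Regression: } SSR = \sum_{i=1}^n (\hat Y_i-\bar Y)^2 \qquad df_R = 2-1 = 1 \quad \text{(Fitted equation has 2 parameters, mean removes 1)}
$$

$$
df_{TO} = df_R + df_E
$$

Note: Mean Squares are Sums of Squares divided by degrees of freedom.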

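Analysis of Variance Table

$$
\begin{array}{lllll}
\text{Source} & SS & df & MS & E\{MS\} \\
\hline
\text{Regression} & SSR = \sum_{i=1}^n (\hat Y_i-\bar Y)^2 & 1 & MSR = \dfrac{SSR}{1} & \sigma^2 + \beta_1^2\sum_{i=1}^n (X_i-\bar X)^2 \\
\text{Error} & SSE = \sum_{i=1}^n (Y_i-\hat Y_i)^2 & n-2 & MSE = \dfrac{SSE}{n-2} & \sigma^2 \\
\text{Total} & SSTO = \sum_{i=1}^n (Y_i-\bar Y)^2 & n-1 & &
\end{array}
$$

Note:

$$
\frac{SSE}{\sigma^2} \sim \chi^2(n-2)
\ \Rightarrow\ E\left\{\frac{SSE}{\sigma^2}\right\} = n-2
\ \Rightarrow\ E\{SSE\} = (n-2)\sigma^2
\ \Rightarrow\ E\{MSE\} = E\left\{\frac{SSE}{n-2}\right\} = \sigma^2
$$

$$
MSR = SSR = b_1^2\sum_{i=1}^n (X_i-\bar X)^2 = b_1^2\,SS_{XX}
$$

$$
E\{MSR\} = SS_{XX}\,E\{b_1^2\} = SS_{XX}\left[\sigma^2\{b_1\} + \left(E\{b_1\}\right)^2\right]
= SS_{XX}\left[\frac{\sigma^2}{SS_{XX}} + \beta_1^2\right]
= \sigma^2 + \beta_1^2\sum_{i=1}^n (X_i-\bar X)^2
$$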

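F-Test for H0:β1=0 vs HA:β1≠0

$$
H_0: \beta_1 = 0 \qquad H_A: \beta_1 \ne 0
$$

$$
\text{Test Statistic: } F^* = \frac{MSR}{MSE} \qquad \text{Reject } H_0 \text{ if } F^* \ge F(1-\alpha;\,1,\,n-2) \quad \text{(See below for why)}
$$

$$
\text{Reject } H_0 \text{ for large } F^* \text{ since } \frac{E\{MSR\}}{E\{MSE\}} = \frac{\sigma^2+\beta_1^2\sum_{i=1}^n (X_i-\bar X)^2}{\sigma^2} \ge 1, \text{ with equality under } H_0
$$

Sampling Distribution of F* Under H0: β1 = 0 (Cochran's Theorem, p. 70):

$$
1)\quad SSR + SSE = SSTO \qquad df_R + df_E = 1 + (n-2) = n-1 = df_{TO}
$$

$$
2)\quad \frac{SSE}{\sigma^2} \sim \chi^2(n-2), \qquad \frac{SSR}{\sigma^2} \sim \chi^2(1), \qquad SSE,\ SSR \text{ independent}
$$

$$
3)\quad \frac{\left(\dfrac{SSR}{\sigma^2}\right)\Big/1}{\left(\dfrac{SSE}{\sigma^2}\right)\Big/(n-2)} \sim F(1,\,n-2)
\quad\Rightarrow\quad
F^* = \frac{MSR}{MSE} = \frac{SSR/1}{SSE/(n-2)} \sim F(1,\,n-2)
$$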

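Comments on F-Test

$$
1)\quad \frac{SSE}{\sigma^2} \sim \chi^2(n-2) \ \text{ regardless of whether or not } \beta_1 = 0. \ \text{ When } \beta_1 \ne 0:
$$

$$
2)\quad \frac{SSR}{\sigma^2} \sim \text{non-central } \chi^2(1), \qquad SSE,\ SSR \text{ independent regardless}
$$

$$
3)\quad F^* = \frac{MSR}{MSE} = \frac{SSR/1}{SSE/(n-2)} \sim \text{non-central } F(1,\,n-2)
$$

4) F-test and t-test are equivalent for H0: β1 = 0 vs HA: β1 ≠ 0:

$$
F^* = \frac{MSR}{MSE} = \frac{SSR/1}{SSE/(n-2)} = \frac{b_1^2\,SS_{XX}}{MSE} = \frac{b_1^2}{MSE/SS_{XX}} = \frac{b_1^2}{s^2\{b_1\}} = \left(\frac{b_1}{s\{b_1\}}\right)^2 = (t^*)^2
$$

$$
\text{Critical Value for } F^*:\ \ F(1-\alpha;\,1,\,n-2) = \left[t(1-\alpha/2;\,n-2)\right]^2 = \left(\text{Critical Value for } t^*\right)^2
$$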

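General Linear Test – Very Flexible Method

1) Fit the Full/Unrestricted Model (No restrictions on β0, β1): Yi = β0 + β1Xi + εi

Compute b0, b1 by least squares and the fitted values:

$$
\hat Y_i(F) = b_0 + b_1X_i
$$

$$
\text{Error sum of squares for Full Model: } SSE(F) = \sum_{i=1}^n \left(Y_i - \hat Y_i(F)\right)^2 \qquad df_F = n-2
$$

2) Fit the Reduced/Restricted Model (Restriction(s) on β0 and/or β1):

$$
H_0: \beta_0 = \beta_{00} \ \text{ and/or } \ \beta_1 = \beta_{10}
$$

Estimate any unspecified parameters by least squares with restriction(s) and obtain the fitted values Ŷi(R).

$$
\text{Error sum of squares for Reduced Model: } SSE(R) = \sum_{i=1}^n \left(Y_i - \hat Y_i(R)\right)^2 \qquad df_R = n - (\#\text{ of unrestricted } \beta\text{'s})
$$

$$
3)\ \text{Compute Test Statistic: } F^* = \frac{\left(SSE(R)-SSE(F)\right)/(df_R-df_F)}{SSE(F)/df_F}
$$

$$
4)\ \text{Reject } H_0 \text{ if } F^* \ge F(1-\alpha;\ df_R-df_F,\ df_F)
$$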

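Example 1 – H0: β1 = 0

Full (Unrestricted) Model: Yi = β0 + β1Xi + εi

$$
b_1 = \frac{SS_{XY}}{SS_{XX}} \qquad b_0 = \bar Y - b_1\bar X
$$

$$
\hat Y_i(F) = b_0+b_1X_i \qquad SSE(F) = \sum_{i=1}^n \left(Y_i-\hat Y_i(F)\right)^2 = SSE \qquad df_F = n-2
$$

Reduced (Restricted) Model: Yi = β0 + (0)Xi + εi = β0 + εi

$$
b_1 = 0 \ \Rightarrow\ b_0 = \bar Y - b_1\bar X = \bar Y \qquad \hat Y_i(R) = b_0 = \bar Y
$$

$$
SSE(R) = \sum_{i=1}^n \left(Y_i-\hat Y_i(R)\right)^2 = \sum_{i=1}^n (Y_i-\bar Y)^2 = SSTO \qquad df_R = n-1 \quad \text{(only 1 "free" parameter: } \beta_0)
$$

$$
\text{Test Statistic: } F^* = \frac{\left(SSE(R)-SSE(F)\right)/(df_R-df_F)}{SSE(F)/df_F}
= \frac{(SSTO-SSE)/\left((n-1)-(n-2)\right)}{SSE/(n-2)}
= \frac{SSR/1}{SSE/(n-2)} = \frac{MSR}{MSE}
$$

$$
\text{Reject } H_0 \text{ if } F^* \ge F(1-\alpha;\,1,\,n-2) \quad \text{(The ANOVA } F\text{-test)}
$$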
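The general linear test for H0: β1 = 0 can be sketched by fitting both models and comparing their error sums of squares. The data below are made up for illustration. The critical value is obtained as the square of a tabled t value, using the identity F(1-α; 1, n-2) = [t(1-α/2; n-2)]² that holds for this one-numerator-degree-of-freedom case; the constant `T_CRIT` itself is an assumption copied from a t table.

```python
import math

# Hypothetical toy data (n = 5), for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]
T_CRIT = 3.182          # t(0.975; 3), copied from a standard t table
F_CRIT = T_CRIT ** 2    # F(0.95; 1, 3) via the F = t^2 identity

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
ss_xx = sum((x - x_bar) ** 2 for x in X)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / ss_xx
b0 = y_bar - b1 * x_bar

# Full model: Y_i = beta0 + beta1 * X_i + error.
sse_full = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
df_full = n - 2

# Reduced model under H0 (beta1 = 0): fitted value is the sample mean.
sse_reduced = sum((y - y_bar) ** 2 for y in Y)
df_reduced = n - 1

# General linear test statistic.
f_star = ((sse_reduced - sse_full) / (df_reduced - df_full)) / (sse_full / df_full)

# The t statistic for the same hypothesis, for comparison.
t_star = b1 / math.sqrt((sse_full / df_full) / ss_xx)

print("F* =", f_star, " (t*)^2 =", t_star ** 2)
print("reject H0:", f_star >= F_CRIT)
```

The same pattern (fit full, fit reduced, compare SSE) carries over to other restrictions; only the reduced-model fit and its degrees of freedom change.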

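Descriptive Measures of Linear Association

Coefficient of Determination (Proportionate Reduction in Error):

$$
\text{Ignoring Predictor (Setting } \beta_1 = 0): \quad \hat Y_i = \bar Y \qquad \text{"Error SS"} = \sum_{i=1}^n (Y_i-\bar Y)^2 = SSTO
$$

$$
\text{Accounting for Predictor:} \quad \hat Y_i = b_0+b_1X_i \qquad \text{"Error SS"} = \sum_{i=1}^n (Y_i-\hat Y_i)^2 = SSE
$$

$$
\text{Difference: Portion "accounted for" by } X: \quad SSTO - SSE = SSR
$$

$$
R^2 = \frac{SSR}{SSTO} = 1 - \frac{SSE}{SSTO} \qquad 0 \le R^2 \le 1
$$

Note: See plots on slides 12-14.

Coefficient of Correlation (Often used when both X and Y are random):

$$
r = \frac{SS_{XY}}{\sqrt{SS_{XX}\,SS_{YY}}} = \pm\sqrt{R^2} \qquad -1 \le r \le 1
$$

1) r and b1 are of the same sign

2) r (but not b1) is not changed by linear transformations of Y and/or X (its sign flips if exactly one of the two scale factors is negative)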

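Correlation Models – Y1,Y2 Bivariate Normal

Y1, Y2: 2 Characteristics (Random) observed on Experimental Unit

Joint Density (at specific pairs of values y1, y2):

$$
f(y_1,y_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho_{12}^2}}
\exp\left\{ -\frac{1}{2(1-\rho_{12}^2)}\left[ \left(\frac{y_1-\mu_1}{\sigma_1}\right)^2 - 2\rho_{12}\left(\frac{y_1-\mu_1}{\sigma_1}\right)\left(\frac{y_2-\mu_2}{\sigma_2}\right) + \left(\frac{y_2-\mu_2}{\sigma_2}\right)^2 \right] \right\}
$$

$$
\text{where } \rho_{12} \text{ is the correlation between } Y_1, Y_2: \quad \rho_{12} = \frac{\sigma_{12}}{\sigma_1\sigma_2}, \qquad \sigma_{12} = \sigma\{Y_1,Y_2\} = E\{(Y_1-\mu_1)(Y_2-\mu_2)\}
$$

Marginal Densities:

$$
f_1(y_1) = \int_{-\infty}^{\infty} f(y_1,y_2)\,dy_2 = \frac{1}{\sqrt{2\pi}\,\sigma_1}\exp\left\{-\frac{1}{2}\left(\frac{y_1-\mu_1}{\sigma_1}\right)^2\right\}
\qquad Y_1 \sim N(\mu_1,\sigma_1^2), \quad Y_2 \sim N(\mu_2,\sigma_2^2)
$$

Conditional Densities Y1|Y2=y2 & Y2|Y1=y1:

$$
f(y_1\mid y_2) = \frac{f(y_1,y_2)}{f_2(y_2)} = \frac{1}{\sqrt{2\pi}\,\sigma_{1|2}}\exp\left\{-\frac{1}{2}\left(\frac{y_1-\alpha_{1|2}-\beta_{12}\,y_2}{\sigma_{1|2}}\right)^2\right\}
$$

$$
\text{where: } \alpha_{1|2} = \mu_1 - \mu_2\,\rho_{12}\frac{\sigma_1}{\sigma_2}, \qquad \beta_{12} = \rho_{12}\frac{\sigma_1}{\sigma_2}, \qquad \sigma_{1|2}^2 = \sigma_1^2\left(1-\rho_{12}^2\right)
$$

$$
Y_1 \mid Y_2 = y_2 \ \sim\ N\left(\alpha_{1|2}+\beta_{12}\,y_2,\ \sigma_{1|2}^2\right) = N\left(\mu_1 + \rho_{12}\frac{\sigma_1}{\sigma_2}(y_2-\mu_2),\ \sigma_1^2\left(1-\rho_{12}^2\right)\right)
$$

The conditional distribution of Y2 given Y1 = y1 has the same form with the subscripts 1 and 2 interchanged.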

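Inferences on Correlation Coefficients

$$
\text{Parameter: } \rho_{12} = \frac{\sigma_{12}}{\sigma_1\sigma_2}
$$

Point (maximum likelihood) Estimator (aka Pearson product-moment correlation coefficient):

$$
r_{12} = \frac{\sum_{i=1}^n (Y_{i1}-\bar Y_1)(Y_{i2}-\bar Y_2)}{\sqrt{\sum_{i=1}^n (Y_{i1}-\bar Y_1)^2\ \sum_{i=1}^n (Y_{i2}-\bar Y_2)^2}}
\qquad -1 \le r_{12} \le 1
$$

$$
\text{In } X, Y \text{ notation: } \quad r = \frac{\sum_{i=1}^n (X_i-\bar X)(Y_i-\bar Y)}{\sqrt{\sum_{i=1}^n (X_i-\bar X)^2\ \sum_{i=1}^n (Y_i-\bar Y)^2}} = \frac{SS_{XY}}{\sqrt{SS_{XX}\,SS_{YY}}}
$$

Testing H0: ρ12 = 0 vs HA: ρ12 ≠ 0:

$$
\text{Test Statistic: } t^* = \frac{r_{12}\sqrt{n-2}}{\sqrt{1-r_{12}^2}}
\qquad \text{Reject } H_0 \text{ if } |t^*| \ge t(1-\alpha/2;\,n-2)
$$

For 1-sided tests:

$$
H_A: \rho_{12} > 0: \quad \text{Reject } H_0 \text{ if } t^* \ge t(1-\alpha;\,n-2)
$$

$$
H_A: \rho_{12} < 0: \quad \text{Reject } H_0 \text{ if } t^* \le -t(1-\alpha;\,n-2)
$$

This test is mathematically equivalent to the t-test for H0: β1 = 0.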

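Confidence Interval for ρ12

Problem: When ρ12 ≠ 0, the sampling distribution of r12 is messy.

$$
\text{Fisher's } z \text{ transformation: } \quad z' = \frac{1}{2}\ln\left(\frac{1+r_{12}}{1-r_{12}}\right)
$$

$$
\text{For large } n \text{ (typically at least 25): } \quad z' \overset{\text{approx}}{\sim} N\left(\zeta,\ \frac{1}{n-3}\right), \qquad \zeta = \frac{1}{2}\ln\left(\frac{1+\rho_{12}}{1-\rho_{12}}\right)
$$

Compute an approximate (1-α)100% CI for ζ and transform back for ρ12:

$$
(1-\alpha)100\%\ \text{CI for } \zeta: \quad z' \pm z(1-\alpha/2)\sqrt{\frac{1}{n-3}}
$$

$$
\text{After computing the CI for } \zeta, \text{ use the identity } \quad \rho_{12} = \frac{e^{2\zeta}-1}{e^{2\zeta}+1}
$$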
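The Fisher z interval is short enough to sketch directly. The function name `fisher_ci` and the example inputs (r = 0.6, n = 28) are assumptions chosen for illustration; the normal quantile comes from `statistics.NormalDist` in the Python standard library.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, alpha=0.05):
    """Approximate (1 - alpha)100% CI for rho via Fisher's z transformation.

    r is the sample correlation (|r| < 1) and n is the sample size (n > 3).
    """
    # z' = (1/2) ln((1 + r) / (1 - r))
    z_prime = 0.5 * math.log((1.0 + r) / (1.0 - r))

    # Approximate standard error of z' and the standard normal quantile.
    se = math.sqrt(1.0 / (n - 3))
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)

    # Interval on the zeta scale.
    zeta_lo = z_prime - z_crit * se
    zeta_hi = z_prime + z_crit * se

    # Back-transform each endpoint: rho = (e^{2 zeta} - 1) / (e^{2 zeta} + 1).
    def to_rho(zeta):
        e2 = math.exp(2.0 * zeta)
        return (e2 - 1.0) / (e2 + 1.0)

    return to_rho(zeta_lo), to_rho(zeta_hi)


lo, hi = fisher_ci(0.6, 28)
print("approximate 95% CI for rho:", (lo, hi))
```

Because the back-transformation is nonlinear, the resulting interval is not symmetric about r: it extends further toward 0 than toward 1 when r is positive.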

Spearman Rank Correlation Coefficient

When the data are not normal, and no transformation makes them normal:

Spearman's Rank Correlation Method:

$$
1)\ \text{Rank } Y_{11},\ldots,Y_{n1} \text{ from 1 to } n \text{ (smallest to largest) and label: } R_{11},\ldots,R_{n1}
$$

$$
2)\ \text{Rank } Y_{12},\ldots,Y_{n2} \text{ from 1 to } n \text{ (smallest to largest) and label: } R_{12},\ldots,R_{n2}
$$

3) Compute Spearman's rank correlation coefficient:

$$
r_S = \frac{\sum_{i=1}^n (R_{i1}-\bar R_1)(R_{i2}-\bar R_2)}{\sqrt{\sum_{i=1}^n (R_{i1}-\bar R_1)^2\ \sum_{i=1}^n (R_{i2}-\bar R_2)^2}}
$$

To Test: H0: No Association Between Y1, Y2 vs HA: Association Exists

$$
\text{Test Statistic: } t^* = \frac{r_S\sqrt{n-2}}{\sqrt{1-r_S^2}}
\qquad \text{Reject } H_0 \text{ if } |t^*| \ge t(1-\alpha/2;\,n-2)
$$
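The three steps of Spearman's method can be sketched as follows. The data are made up for illustration, and the helper `ranks` assumes there are no tied values (ties would need average ranks, which this sketch does not handle).

```python
import math

# Hypothetical paired observations with no ties, for illustration only.
Y1 = [3.1, 1.2, 5.4, 4.0, 2.2, 6.5]
Y2 = [30.0, 12.0, 41.0, 55.0, 20.0, 70.0]


def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = float(rank)
    return result


def pearson(a, b):
    """Pearson product-moment correlation of two equal-length lists."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    num = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    den = math.sqrt(sum((x - a_bar) ** 2 for x in a) *
                    sum((y - b_bar) ** 2 for y in b))
    return num / den


# Steps 1 and 2: rank each variable separately.
R1 = ranks(Y1)
R2 = ranks(Y2)

# Step 3: Spearman's coefficient is Pearson's r applied to the ranks.
r_s = pearson(R1, R2)

# Test statistic, to be compared with t(1 - alpha/2; n - 2).
n = len(Y1)
t_star = r_s * math.sqrt(n - 2) / math.sqrt(1.0 - r_s ** 2)

print("ranks of Y1:", R1)
print("ranks of Y2:", R2)
print("r_S =", r_s, " t* =", t_star)
```

Working on ranks makes the coefficient insensitive to monotone transformations of either variable, which is why it is a natural fallback when normality cannot be achieved.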
