chapter 2. simple linear regression - welcome to health...


Upload: truongkhanh

Post on 06-Apr-2019


Page 1: Chapter 2. Simple Linear Regression - Welcome To Health …healthstat.snu.ac.kr/homepage/files/STATISTICAL_METHODS... · 2018-03-21 · Simple Linear Regression 1 Chapter 2. Simple

Simple Linear Regression


Chapter 2. Simple Linear Regression

Regression Analysis

Study a functional relationship between variables

- response variable $y$, $Y$ (dependent variable)

- explanatory variable $x$, $X$ (independent variable)

To explain the “variability” of Y ,


Simple linear regression model (§2.4)

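$Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ ($x_1, \dots, x_n$: non-random)

$\varepsilon_1, \dots, \varepsilon_n$: independent random errors

$E(\varepsilon_i) = 0$, $\mathrm{Var}(\varepsilon_i) = \sigma^2$ ($i = 1, \dots, n$)

(additional assumption: $\varepsilon_i \sim N(0, \sigma^2)$; needed for inference)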

Method of estimation (§2.5)

<Least Squares Method >

- minimize $\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$ w.r.t. $\beta_0$ and $\beta_1$

- normal equations:

$$\sum_{i=1}^{n} e_i = 0, \qquad \sum_{i=1}^{n} x_i e_i = 0, \qquad e_i = y_i - \hat\beta_0 - \hat\beta_1 x_i \ \text{(residual)}$$

($\Rightarrow \sum_{i=1}^{n} (x_i - \bar x)\, e_i = 0$)

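- least squares estimates:

$$\hat\beta_1 = S_{xy} / S_{xx}, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x$$

where

$$S_{xy} = \sum (x_i - \bar x)(y_i - \bar y), \qquad S_{xx} = \sum (x_i - \bar x)^2$$

- least squares regression fit: $\hat y = \hat\beta_0 + \hat\beta_1 x = \bar y + \hat\beta_1 (x - \bar x)$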

<“Unbiased” estimation of $\sigma^2$> (§2.6)

$$\hat\sigma^2 = \frac{1}{n-2} \sum (y_i - \hat y_i)^2 = \frac{1}{n-2}\, SSE, \qquad \hat y_i = \hat\beta_0 + \hat\beta_1 x_i$$

(SSE: residual sum of squares (error sum of squares))

($n - 2$: degrees of freedom, df)

Example (Computer Repair Data, §2.3)

data (n=14)

scatter plot: “simple linear regression” seems O.K.

model setting : eq. (2.10)

estimated “l.s. line” eq. (2.19) with residuals (Table 2.7)

estimated error variance: eq. (2.23)

$\hat\beta_1 = 15.509$, $\hat\beta_0 = 4.162$, $\hat\sigma^2 = 5.392^2$, $\hat y = 4.162 + 15.509\,x$
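The estimates quoted for this example can be checked with a few lines of Python. This is a minimal sketch, assuming the 14 (units, minutes) pairs are the ones listed in the SAS input program of this chapter; the function name `least_squares` and the variable names are illustrative and not part of the original notes.

```python
import math

# Computer repair data (n = 14), copied from the SAS "Cards" block of this chapter.
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]


def least_squares(x, y):
    """Return (b0, b1, sigma_hat) for the simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx                      # slope: S_xy / S_xx
    b0 = y_bar - b1 * x_bar               # intercept: y_bar - b1 * x_bar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(sse / (n - 2))  # unbiased variance estimate uses n - 2 d.f.
    return b0, b1, sigma_hat


b0, b1, sigma_hat = least_squares(units, minutes)
print(round(b1, 3), round(b0, 3), round(sigma_hat, 3))
```

Rounded to three decimals, the printed values should agree with the slope, intercept, and residual standard deviation quoted in the example.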


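Method of inference (properties of estimators, confidence intervals, and tests)

(1) Properties of estimates

i. $E(\hat\beta_1) = \beta_1$, $\mathrm{Var}(\hat\beta_1) = \sigma^2 / S_{xx}$

ii. $E(\hat\beta_0) = \beta_0$, $\mathrm{Var}(\hat\beta_0) = \sigma^2 \left( n^{-1} + \bar x^2 / S_{xx} \right)$

iii. $\mathrm{Cov}(\hat\beta_0, \hat\beta_1) = -\sigma^2 \bar x / S_{xx}$

iv. $E(\hat\sigma^2) = \sigma^2$, $\mathrm{Var}(e_i) = \sigma^2 (1 - p_{ii})$, $\mathrm{Cov}(e_i, e_j) = -\sigma^2 p_{ij}$ ($i \neq j$), and $E(e_i) = 0$, where

$$p_{ij} = \frac{1}{n} + \frac{(x_i - \bar x)(x_j - \bar x)}{S_{xx}}$$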


(2) Inference under additional normality assumption

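i. $\dfrac{\hat\beta_1 - \beta_1}{\mathrm{s.e.}(\hat\beta_1)} \sim t(n-2)$; $\ \mathrm{s.e.}(\hat\beta_1) = \sqrt{\widehat{\mathrm{var}}(\hat\beta_1)} = \hat\sigma\, S_{xx}^{-1/2}$

- $100(1-\alpha)\%$ C.I.: $\left[ \hat\beta_1 - t(n-2; \alpha/2)\, \mathrm{s.e.}(\hat\beta_1),\ \hat\beta_1 + t(n-2; \alpha/2)\, \mathrm{s.e.}(\hat\beta_1) \right]$

- Reject $H_0: \beta_1 = \beta_1^0$ in favor of $H_1: \beta_1 \neq \beta_1^0$ iff $\dfrac{|\hat\beta_1 - \beta_1^0|}{\mathrm{s.e.}(\hat\beta_1)} \geq t(n-2; \alpha/2)$

- p-value

ii. $\dfrac{\hat\beta_0 - \beta_0}{\mathrm{s.e.}(\hat\beta_0)} \sim t(n-2)$; $\ \mathrm{s.e.}(\hat\beta_0) = \sqrt{\widehat{\mathrm{var}}(\hat\beta_0)} = \hat\sigma \left( n^{-1} + \bar x^2 / S_{xx} \right)^{1/2}$

- $100(1-\alpha)\%$ C.I.: $\left[ \hat\beta_0 - t(n-2; \alpha/2)\, \mathrm{s.e.}(\hat\beta_0),\ \hat\beta_0 + t(n-2; \alpha/2)\, \mathrm{s.e.}(\hat\beta_0) \right]$

- Reject $H_0: \beta_0 = \beta_0^0$ in favor of $H_1: \beta_0 \neq \beta_0^0$ iff $\dfrac{|\hat\beta_0 - \beta_0^0|}{\mathrm{s.e.}(\hat\beta_0)} \geq t(n-2; \alpha/2)$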


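iii. $\mu_0 = E(Y \mid x_0) = \beta_0 + \beta_1 x_0$

$\hat\mu_0 = \hat\beta_0 + \hat\beta_1 x_0$

$\dfrac{\hat\mu_0 - \mu_0}{\mathrm{s.e.}(\hat\mu_0)} \sim t(n-2)$; $\ \mathrm{s.e.}(\hat\mu_0) = \sqrt{\widehat{\mathrm{var}}(\hat\mu_0)} = \hat\sigma \left( n^{-1} + (x_0 - \bar x)^2 / S_{xx} \right)^{1/2}$

- $100(1-\alpha)\%$ C.I.: $\left[ \hat\mu_0 - t(n-2; \alpha/2)\, \mathrm{s.e.}(\hat\mu_0),\ \hat\mu_0 + t(n-2; \alpha/2)\, \mathrm{s.e.}(\hat\mu_0) \right]$

- Test (not given)

iv. Prediction for $y_0 = \beta_0 + \beta_1 x_0 + \varepsilon_0$ ($\varepsilon_0$: independent of $\varepsilon_1, \dots, \varepsilon_n$)

$\hat y_0 = \hat\beta_0 + \hat\beta_1 x_0$

$\dfrac{y_0 - \hat y_0}{\mathrm{s.e.}(y_0 - \hat y_0)} \sim t(n-2)$; $\ \mathrm{s.e.}(y_0 - \hat y_0) = \hat\sigma \left( 1 + n^{-1} + (x_0 - \bar x)^2 / S_{xx} \right)^{1/2}$

$100(1-\alpha)\%$ prediction interval: $\left[ \hat y_0 - t(n-2; \alpha/2)\, \mathrm{s.e.}(y_0 - \hat y_0),\ \hat y_0 + t(n-2; \alpha/2)\, \mathrm{s.e.}(y_0 - \hat y_0) \right]$

** Note that $\hat\mu_0$ is identical to the predicted response $\hat y_0$ at any given $x_0$.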


Example (computer repair data) (cont'd)

① “Test of significance” (of explanatory variable)

$H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$

- $t = \hat\beta_1 / \mathrm{s.e.}(\hat\beta_1) = 30.71$ (Table 2.9)

p-value / meaning: “we have seen data that could hardly be observed under $H_0$”

- We may reject $H_0$

② 95% C.I. for $\beta_1$

③ 95% C.I. for $\mu_4 = \beta_0 + 4\beta_1$

④ 95% P.I. for $y_4 = \beta_0 + 4\beta_1 + \varepsilon$ ($\varepsilon$ independent of $\varepsilon_1, \dots, \varepsilon_n$) (wider than ③)

- All of these are valid only under “the model assumptions”, which need to be checked (Chapter 4).

- Note that $H_0: \beta_0 = 0$ vs. $H_1: \beta_0 \neq 0$ can't be rejected even at the 10% level (Table 2.9)

Meaning: We may start with a “simpler” model $y_i = \beta_1 x_i + \varepsilon_i$, $\varepsilon_i \overset{iid}{\sim} N(0, \sigma^2)$

Then, all the above inferences should be changed!
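The four inferences ① to ④ can be sketched numerically as follows. The data are assumed to be the 14 pairs from the SAS input program of this chapter, and `T_CRIT = 2.179` is an assumed table value for $t(12; 0.025)$ that is hard-coded here rather than computed; all variable names are illustrative.

```python
import math

# Computer repair data (n = 14), copied from the SAS "Cards" block of this chapter.
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]
T_CRIT = 2.179  # assumption: t(n - 2 = 12; 0.025) read from a t table

n = len(units)
x_bar = sum(units) / n
y_bar = sum(minutes) / n
s_xx = sum((x - x_bar) ** 2 for x in units)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(units, minutes))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((y - b0 - b1 * x) ** 2 for x, y in zip(units, minutes))
sigma_hat = math.sqrt(sse / (n - 2))

# (1) t statistics for H0: beta1 = 0 and H0: beta0 = 0
se_b1 = sigma_hat / math.sqrt(s_xx)
se_b0 = sigma_hat * math.sqrt(1 / n + x_bar ** 2 / s_xx)
t_b1 = b1 / se_b1
t_b0 = b0 / se_b0

# (2) 95% confidence interval for beta1
ci_b1 = (b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1)

# (3) 95% confidence interval for the mean response at x0 = 4
x0 = 4
mu0_hat = b0 + b1 * x0
se_mu0 = sigma_hat * math.sqrt(1 / n + (x0 - x_bar) ** 2 / s_xx)
ci_mu0 = (mu0_hat - T_CRIT * se_mu0, mu0_hat + T_CRIT * se_mu0)

# (4) 95% prediction interval for a new response at x0 = 4 (extra "1 +" term)
se_pred = sigma_hat * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)
pi_y0 = (mu0_hat - T_CRIT * se_pred, mu0_hat + T_CRIT * se_pred)

print("t(beta1) =", round(t_b1, 2), " t(beta0) =", round(t_b0, 2))
print("CI beta1:", ci_b1)
print("CI mu_4:", ci_mu0)
print("PI y_4:", pi_y0)
```

The prediction interval is wider than the confidence interval at the same $x_0$ because its standard error carries the additional variance of the new error term.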


Measuring the quality of fit

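i. Decomposition of Sum of Squares:

deviation: $y_i - \bar y = (y_i - \hat y_i) + (\hat y_i - \bar y)$

sum of squares: $\sum (y_i - \bar y)^2 = \sum (y_i - \hat y_i)^2 + \sum (\hat y_i - \bar y)^2$

SST = SSE + SSR, with d.f. $(n-1) = (n-2) + (1)$

The cross term vanishes:

$$2 \sum_{i=1}^{n} (y_i - \hat y_i)(\hat y_i - \bar y) = 2 \sum e_i\, \hat\beta_1 (x_i - \bar x) = 2 \hat\beta_1 \Big( \sum x_i e_i - \bar x \sum e_i \Big) = 0 \quad (\because\ \hat y_i = \bar y + \hat\beta_1 (x_i - \bar x))$$

(*) $SSR = \sum_{i=1}^{n} (\hat y_i - \bar y)^2 = \hat\beta_1^2 \sum_{i=1}^{n} (x_i - \bar x)^2 = \dfrac{S_{xy}^2}{S_{xx}} = \Big( \sum_{i=1}^{n} \dfrac{x_i - \bar x}{\sqrt{S_{xx}}}\, y_i \Big)^2$

ii. Coefficient of determination (or squared multiple correlation coefficient)

$R^2 = \dfrac{SSR}{SST} = 1 - \dfrac{SSE}{SST}$, $\quad 0 \leq R^2 \leq 1$

$R^2$: “proportion of variation of $y$ explained by $x$”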


Example (Computer Repair Data)

        s.s.         d.f.   R^2
Reg.    27419.500     1     0.987
Err.      348.848    12
Total   27768.348    13
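The entries of this table can be recomputed directly from the data. The sketch below assumes the 14 pairs from the SAS input program of this chapter and uses illustrative variable names; small differences in the last digits relative to the table are possible because the tabulated values are rounded.

```python
# Computer repair data (n = 14), copied from the SAS "Cards" block of this chapter.
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]

n = len(units)
x_bar = sum(units) / n
y_bar = sum(minutes) / n
s_xx = sum((x - x_bar) ** 2 for x in units)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(units, minutes))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in units]

sst = sum((y - y_bar) ** 2 for y in minutes)              # total, d.f. n - 1
sse = sum((y - f) ** 2 for y, f in zip(minutes, fitted))  # error, d.f. n - 2
ssr = sum((f - y_bar) ** 2 for f in fitted)               # regression, d.f. 1
r2 = ssr / sst

print(f"Reg.  {ssr:12.3f}  {1:3d}  R^2 = {r2:.3f}")
print(f"Err.  {sse:12.3f}  {n - 2:3d}")
print(f"Total {sst:12.3f}  {n - 1:3d}")
```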


Supplement I (ch.2)

(1) Geometry of Least Squares Method

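- minimize $\sum_{i=1}^{n} \{ y_i - (\beta_0 + \beta_1 x_i) \}^2$ w.r.t. $\beta_0$ & $\beta_1$

- equivalently, minimize $\| \mathbf{y} - (\beta_0 \mathbf{1} + \beta_1 \mathbf{x}) \|^2$ w.r.t. $\beta_0$ & $\beta_1$

where $\mathbf{y} = (y_1, \dots, y_n)^T$, $\mathbf{1} = (1, \dots, 1)^T$, $\mathbf{x} = (x_1, \dots, x_n)^T$ are “column vectors”

- Example

$(x, y) = (1,1), (1,2), (2,2)$ gives $\mathbf{1} = (1,1,1)^T$, $\mathbf{x} = (1,1,2)^T$, $\mathbf{y} = (1,2,2)^T$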


<perpendicular projection onto a vector>

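$(\mathbf{y} - X\hat\beta) \perp X \iff X^T (\mathbf{y} - X\hat\beta) = 0 \implies \hat\beta = (X^T X)^{-1} X^T \mathbf{y}$

i.e. $X\hat\beta = X (X^T X)^{-1} X^T \mathbf{y}$: projection of $\mathbf{y}$ onto $C(X)$ (the column space of $X$)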

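(*1) $\bar y\, \mathbf{1} = \mathbf{1} (\mathbf{1}^T \mathbf{1})^{-1} \mathbf{1}^T \mathbf{y}$: projection of $\mathbf{y}$ onto $\mathbf{1}$

(*2) $\hat\beta_1 (\mathbf{x} - \bar x \mathbf{1}) = (\mathbf{x} - \bar x \mathbf{1}) \{ (\mathbf{x} - \bar x \mathbf{1})^T (\mathbf{x} - \bar x \mathbf{1}) \}^{-1} (\mathbf{x} - \bar x \mathbf{1})^T \mathbf{y}$ $\quad (\because\ (\mathbf{x} - \bar x \mathbf{1})^T (\bar y\, \mathbf{1}) = 0)$

$\hat{\mathbf{y}} = \bar y\, \mathbf{1} + \hat\beta_1 (\mathbf{x} - \bar x \mathbf{1}) = \hat\beta_0 \mathbf{1} + \hat\beta_1 \mathbf{x}$, where $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$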
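The two-step projection can be traced on the three-point example $(x, y) = (1,1), (1,2), (2,2)$. The sketch below uses exact fractions; the helper `dot` and the variable names are illustrative and not part of the original notes.

```python
from fractions import Fraction

# Three-point example from the notes: 1 = (1,1,1)^T, x = (1,1,2)^T, y = (1,2,2)^T
ones = [Fraction(1)] * 3
x = [Fraction(1), Fraction(1), Fraction(2)]
y = [Fraction(1), Fraction(2), Fraction(2)]


def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


# Step 1: project x and y onto the vector 1 (this yields the sample means).
x_bar = dot(ones, x) / dot(ones, ones)
y_bar = dot(ones, y) / dot(ones, ones)

# Step 2: project y onto the centered vector x - x_bar * 1, which is orthogonal to 1.
xc = [xi - x_bar for xi in x]
b1 = dot(xc, y) / dot(xc, xc)
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * xi for xi in x]             # projection of y onto C(1, x)
resid = [yi - fi for yi, fi in zip(y, y_hat)]  # must be orthogonal to both 1 and x

print("b0 =", b0, " b1 =", b1)
print("y_hat =", [str(v) for v in y_hat])
print("resid . 1 =", dot(resid, ones), " resid . x =", dot(resid, x))
```

Both inner products of the residual vector come out as zero, which is exactly the pair of normal equations.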


<meaning of coefficient of determination>

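$SST = \sum (y_i - \bar y)^2$, $\quad SSE = \sum (y_i - \hat y_i)^2$, $\quad SSR = \sum (\hat y_i - \bar y)^2$

$R^2 = SSR / SST = \cos^2\theta$, where $\theta$ is the angle between $\mathbf{y} - \bar y\, \mathbf{1}$ and $\hat{\mathbf{y}} - \bar y\, \mathbf{1}$

$\cos^2\theta \to 1$ ($\theta \to 0$): $\mathbf{y}$ gets closer to the plane $C(\mathbf{1}, \mathbf{x})$, which is determined by $\mathbf{1}$ and $\mathbf{x}$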
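The identity between the sum-of-squares ratio and the squared cosine can be checked on the three-point example $(x, y) = (1,1), (1,2), (2,2)$. This is a small sketch with exact fractions; the vector names `u` and `v` are illustrative labels for $\mathbf{y} - \bar y\,\mathbf{1}$ and $\hat{\mathbf{y}} - \bar y\,\mathbf{1}$.

```python
from fractions import Fraction

x = [Fraction(1), Fraction(1), Fraction(2)]
y = [Fraction(1), Fraction(2), Fraction(2)]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

u = [yi - y_bar for yi in y]      # y - y_bar * 1
v = [fi - y_bar for fi in y_hat]  # y_hat - y_bar * 1


def dot(a, b):
    """Inner product of two vectors given as lists."""
    return sum(p * q for p, q in zip(a, b))


sst = dot(u, u)
ssr = dot(v, v)
r2 = ssr / sst
cos2_theta = dot(u, v) ** 2 / (dot(u, u) * dot(v, v))

print("R^2 =", r2, " cos^2(theta) =", cos2_theta)
```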


(2) Properties of Variance & Covariance of random variables

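$\mathrm{cov}(Y, Z) = E(Y - EY)(Z - EZ)$

① $\mathrm{cov}\Big( \sum_{i=1}^{m} a_i Y_i,\ \sum_{j=1}^{n} b_j Z_j \Big) = \sum_{i=1}^{m} \sum_{j=1}^{n} a_i b_j\, \mathrm{cov}(Y_i, Z_j)$

② $\mathrm{var}(a_1 Y_1 + \cdots + a_n Y_n) = \sum_{i=1}^{n} a_i^2\, \mathrm{var}(Y_i) + \sum_{i \neq j} a_i a_j\, \mathrm{cov}(Y_i, Y_j)$

③ $Y, Z$ indep. $\implies \mathrm{cov}(Y, Z) = 0$

$Y_1, \dots, Y_n$ indep. $\implies \mathrm{var}\Big( \sum_{i=1}^{n} a_i Y_i \Big) = \sum_{i=1}^{n} a_i^2\, \mathrm{var}(Y_i)$

(3) Expectation, Variance & Covariance of random vectors

For a random vector $Y = (Y_1, \dots, Y_n)^T$ (column vector notation),

$EY = (EY_1, \dots, EY_n)^T$ (mean vector),

$$\mathrm{var}(Y) = \begin{pmatrix} \mathrm{var}(Y_1) & \mathrm{cov}(Y_1, Y_2) & \cdots & \mathrm{cov}(Y_1, Y_n) \\ \vdots & \mathrm{var}(Y_i) & \mathrm{cov}(Y_i, Y_j) & \vdots \\ \mathrm{cov}(Y_n, Y_1) & \mathrm{cov}(Y_n, Y_2) & \cdots & \mathrm{var}(Y_n) \end{pmatrix}$$

(variance-covariance matrix)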


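Note that (*1)

$\mathrm{var}(Y) = E(Y - EY)(Y - EY)^T$

($\because\ \mathrm{cov}(Y_i, Y_j) = E(Y_i - EY_i)(Y_j - EY_j)$, $\ (a a^T)_{ij} = a_i a_j$)

① $E(AY + b) = A\,E(Y) + b$

$\mathrm{var}(AY + b) = \mathrm{var}(AY) = A\, \mathrm{var}(Y)\, A^T$

(*2)

$AY + b = \Big( \sum_{j=1}^{n} a_{ij} Y_j + b_i \Big)$ for $A = (a_{ij})$, $b = (b_i)$: constants

$E(AY + b) = \Big( E\Big( \sum_{j=1}^{n} a_{ij} Y_j + b_i \Big) \Big) = \Big( \sum_{j=1}^{n} a_{ij} EY_j + b_i \Big) = A\,EY + b$

$\mathrm{var}(AY + b) = E\{AY + b - E(AY + b)\}\{AY + b - E(AY + b)\}^T$

$= E\{A(Y - EY)\}\{A(Y - EY)\}^T$

$= E\, A (Y - EY)(Y - EY)^T A^T$

$= A\, E(Y - EY)(Y - EY)^T A^T$

$= A\, \mathrm{var}(Y)\, A^T$

② In simple (or multiple) linear regression model, $E(Y) = X\beta$, $\mathrm{var}(Y) = \sigma^2 I_n$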
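As a worked application of ① and ②, the following derivation takes the least squares estimator in the matrix form $\hat\beta = (X^T X)^{-1} X^T Y$ from the projection argument, with $X = (\mathbf{1}, \mathbf{x})$ assumed to be the $n \times 2$ design matrix, and recovers the variances and covariance of $\hat\beta_0$ and $\hat\beta_1$. It uses the identity $\sum x_i^2 = S_{xx} + n \bar x^2$.

```latex
\begin{aligned}
\hat\beta &= A Y, \qquad A = (X^T X)^{-1} X^T \\
\mathrm{var}(\hat\beta) &= A\, \mathrm{var}(Y)\, A^T
  = (X^T X)^{-1} X^T (\sigma^2 I_n) X (X^T X)^{-1}
  = \sigma^2 (X^T X)^{-1} \\
X^T X &= \begin{pmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix},
\qquad
(X^T X)^{-1} = \frac{1}{n S_{xx}}
  \begin{pmatrix} \sum x_i^2 & -n \bar x \\ -n \bar x & n \end{pmatrix} \\
\mathrm{var}(\hat\beta_0) &= \sigma^2 \frac{\sum x_i^2}{n S_{xx}}
  = \sigma^2 \Big( \frac{1}{n} + \frac{\bar x^2}{S_{xx}} \Big),
\qquad
\mathrm{var}(\hat\beta_1) = \frac{\sigma^2}{S_{xx}},
\qquad
\mathrm{cov}(\hat\beta_0, \hat\beta_1) = -\frac{\sigma^2 \bar x}{S_{xx}}
\end{aligned}
```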


(4) Gradient vector

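① For $(n \times 1)$ vectors $x = (x_1, \dots, x_n)^T$ and $c = (c_1, \dots, c_n)^T$, with $c^T x = \sum_{i=1}^{n} c_i x_i$, the partial derivative of $c^T x$ w.r.t. $x$ is

$$\frac{\partial (c^T x)}{\partial x} = \begin{pmatrix} \partial (c^T x) / \partial x_1 \\ \vdots \\ \partial (c^T x) / \partial x_n \end{pmatrix} = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} = c$$

Similarly,

$$\frac{\partial (x^T c)}{\partial x} = \begin{pmatrix} \partial (x^T c) / \partial x_1 \\ \vdots \\ \partial (x^T c) / \partial x_n \end{pmatrix} = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} = c$$

② For any $(n \times n)$ matrix $A$ and $y = (y_1, \dots, y_n)^T$,

$$\frac{\partial (y^T A y)}{\partial y} = (A + A^T)\, y$$

When $A$ is symmetric, $\dfrac{\partial (y^T A y)}{\partial y} = 2Ay$.

$$y^T A y = \sum_{l=1}^{n} \sum_{k=1}^{n} y_l a_{lk} y_k = y_i \sum_{k \neq i} a_{ik} y_k + y_i \sum_{l \neq i} y_l a_{li} + a_{ii} y_i^2 + \sum_{l \neq i} \sum_{k \neq i} y_l a_{lk} y_k$$

$$\frac{\partial (y^T A y)}{\partial y_i} = \sum_{k \neq i} a_{ik} y_k + \sum_{l \neq i} y_l a_{li} + 2 a_{ii} y_i = \sum_{k=1}^{n} a_{ik} y_k + \sum_{l=1}^{n} y_l a_{li} = \big( (A + A^T)\, y \big)_i$$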

(5) Properties of Least Squares Estimates

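$Y_i$: independent, $x_1, \dots, x_n$: constants

$EY_i = \beta_0 + \beta_1 x_i$, $\mathrm{var}(Y_i) = \sigma^2$ ($i = 1, \dots, n$)

$$\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = \frac{1}{S_{xx}} \sum_{i=1}^{n} (x_i - \bar x)(Y_i - \bar Y) = \sum_{i=1}^{n} \frac{x_i - \bar x}{S_{xx}}\, Y_i \quad \Big( \because\ \sum_{i=1}^{n} (x_i - \bar x) = 0 \Big)$$

$$\hat\beta_0 = \bar Y - \hat\beta_1 \bar x = \sum_{i=1}^{n} \Big( \frac{1}{n} - \bar x\, \frac{x_i - \bar x}{S_{xx}} \Big) Y_i$$

$$e_i = Y_i - \hat\beta_0 - \hat\beta_1 x_i = Y_i - \bar Y - \hat\beta_1 (x_i - \bar x) = Y_i - \sum_{j=1}^{n} \Big( \frac{1}{n} + \frac{(x_i - \bar x)(x_j - \bar x)}{S_{xx}} \Big) Y_j$$

i. $$E(\hat\beta_1) = \sum_{i=1}^{n} \frac{x_i - \bar x}{S_{xx}}\, EY_i = \sum_{i=1}^{n} \frac{x_i - \bar x}{S_{xx}} (\beta_0 + \beta_1 x_i) = \beta_1 \quad \Big( \because\ \sum (x_i - \bar x) x_i = \sum (x_i - \bar x)(x_i - \bar x) = S_{xx} \Big)$$

$$\mathrm{var}(\hat\beta_1) = \sum_{i=1}^{n} \Big( \frac{x_i - \bar x}{S_{xx}} \Big)^2 \mathrm{var}(Y_i) = \frac{\sigma^2}{S_{xx}} \quad (\because\ Y_1, \dots, Y_n \text{: indep.},\ \mathrm{var}(Y_i) = \sigma^2)$$

ii. $$E(\hat\beta_0) = \sum_{i=1}^{n} \Big( n^{-1} - \bar x\, \frac{x_i - \bar x}{S_{xx}} \Big) EY_i = \sum_{i=1}^{n} \Big( n^{-1} - \bar x\, \frac{x_i - \bar x}{S_{xx}} \Big) (\beta_0 + \beta_1 x_i) = \beta_0 + \beta_1 \bar x - \beta_1 \bar x = \beta_0$$

$$\mathrm{var}(\hat\beta_0) = \sum_{i=1}^{n} \Big( n^{-1} - \bar x\, \frac{x_i - \bar x}{S_{xx}} \Big)^2 \mathrm{var}(Y_i) = \sigma^2 \sum_{i=1}^{n} \Big( n^{-2} - 2 n^{-1} \bar x\, \frac{x_i - \bar x}{S_{xx}} + \bar x^2\, \frac{(x_i - \bar x)^2}{S_{xx}^2} \Big) = \sigma^2 \Big( n^{-1} + \frac{\bar x^2}{S_{xx}} \Big)$$

(the middle sum is 0 because $\sum (x_i - \bar x) = 0$)

iii. $$\mathrm{cov}(\hat\beta_0, \hat\beta_1) = \mathrm{cov}\Big( \sum_{i=1}^{n} \Big( n^{-1} - \bar x\, \frac{x_i - \bar x}{S_{xx}} \Big) Y_i,\ \sum_{j=1}^{n} \frac{x_j - \bar x}{S_{xx}}\, Y_j \Big) = \sum_{i=1}^{n} \sum_{j=1}^{n} \Big( n^{-1} - \bar x\, \frac{x_i - \bar x}{S_{xx}} \Big) \frac{x_j - \bar x}{S_{xx}}\, \mathrm{cov}(Y_i, Y_j)$$

$$= \sigma^2 \sum_{i=1}^{n} \Big( n^{-1} - \bar x\, \frac{x_i - \bar x}{S_{xx}} \Big) \frac{x_i - \bar x}{S_{xx}} \quad (\because\ \mathrm{cov}(Y_i, Y_j) = 0 \text{ for } i \neq j)$$

$$= -\sigma^2\, \bar x / S_{xx}$$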
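Because both estimators are linear combinations of the $Y_i$, the moment formulas reduce to identities about the weights, and these can be verified exactly. The sketch below uses the $x$ values of the computer repair data (assumed to be those in the SAS input program of this chapter); the weight names `c` and `d` are illustrative.

```python
from fractions import Fraction

# x values of the computer repair data, copied from the SAS "Cards" block.
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
n = len(units)
x_bar = Fraction(sum(units), n)
s_xx = sum((x - x_bar) ** 2 for x in units)

# beta1_hat = sum(c_i * Y_i) and beta0_hat = sum(d_i * Y_i)
c = [(x - x_bar) / s_xx for x in units]
d = [Fraction(1, n) - x_bar * ci for ci in c]

# Unbiasedness: E(sum w_i Y_i) = beta0 * sum(w_i) + beta1 * sum(w_i * x_i)
b1_coef_on_beta0 = sum(c)                                 # should be 0
b1_coef_on_beta1 = sum(ci * x for ci, x in zip(c, units))  # should be 1
b0_coef_on_beta0 = sum(d)                                 # should be 1
b0_coef_on_beta1 = sum(di * x for di, x in zip(d, units))  # should be 0

# Variances and covariance, each divided by sigma^2
var_b1 = sum(ci ** 2 for ci in c)                # 1 / S_xx
var_b0 = sum(di ** 2 for di in d)                # 1/n + x_bar^2 / S_xx
cov_b0_b1 = sum(ci * di for ci, di in zip(c, d))  # -x_bar / S_xx

print(var_b1, var_b0, cov_b0_b1)
```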


<SAS>

Computer Repair Data

1. Input Program

Data repair;

Input units minutes @@;

Cards;

1 23 2 29 3 49 4 64 4 74 5 87 6 96 6 97 7 109 8 119 9 149 9 145 10 154 10 166

;

run;


2. Scatter plot and Linear regression line

symbol1 interpol = RL c=black h=1 v=dot;

axis1 minor=none order=(0,40,80,120,160);

axis2 minor=none order=(0,2,4,6,8,10);

proc gplot data=repair;

plot minutes*units / haxis=axis2 vaxis=axis1;

run;

[Figure: scatter plot of minutes (vertical axis, 0 to 160) against units (horizontal axis, 0 to 10) with the fitted linear regression line]

3. Regression Analysis

proc reg data=repair;

model minutes = units;

run;

<Regression without intercept>

proc reg data=repair;

model minutes = units / noint;

run;
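For readers without SAS, the no-intercept fit can be cross-checked in Python. This is a minimal sketch, assuming the model $y_i = \beta_1 x_i + \varepsilon_i$, for which minimizing $\sum (y_i - \beta_1 x_i)^2$ gives $\hat\beta_1 = \sum x_i y_i / \sum x_i^2$ and the residual variance uses $n - 1$ degrees of freedom; the variable names are illustrative and the output layout does not mimic `proc reg`.

```python
import math

# Computer repair data (n = 14), copied from the SAS "Cards" block above.
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]

n = len(units)
sum_xy = sum(x * y for x, y in zip(units, minutes))
sum_xx = sum(x * x for x in units)

# Least squares slope for the model through the origin.
b1_noint = sum_xy / sum_xx

# Only one parameter is estimated, so the error d.f. is n - 1.
sse_noint = sum((y - b1_noint * x) ** 2 for x, y in zip(units, minutes))
sigma_hat_noint = math.sqrt(sse_noint / (n - 1))
se_b1_noint = sigma_hat_noint / math.sqrt(sum_xx)

print("slope:", b1_noint)
print("s.e.(slope):", se_b1_noint)
```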


Anscombe’s Quartet