fechter - mcmaster universitycousingd/stats_3y_3j_notes/...tea to its large implies that more of the...

6
fechter 32 Analysis of Vannice Approach to Test SBnframe of regression ANOVA Assure a two sided test Ho p O H B 40 Recall from last time hat SSE SST SSR 9 To EsE s5 Eiichi y 5 Etsitty p say mean Square regression ht fo MSI SSRI SEEMS E SSE kn Rn mean square error Rmk the SSRI is weird but we write it lose this because I mareskhafh sgfaesmmbsymtdeoseegee.me dpeedbon.dmB Assuring Ho p 0 Fo follows a F n a distro This is a new dostrbutron F dosimbutron with 1 deg af freedom in the numerator and n 2 degrees of freedom mthedenoma idea T is large implies that more of the

Upload: others

Post on 27-Apr-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: fechter - McMaster Universitycousingd/stats_3y_3j_notes/...tea to its large implies that more of the variability on Y is explained by the regression which means more eurderce of a

fechter 32

Analysis of Vannice Approach to Test SBnframeof regression

ANOVAAssure a two sided test Ho p O H B 40

Recall from last time hat

SSE SST SSR9 To

EsE s5 Eiichi y5 Etsittyp say

meanSquareregression

htfo MSI SSRISEEMSE SSE kn Rn

meansquareerror

Rmk the SSRI is weird but we write it lose thisbecause

I mareskhafh sgfaesmmbsymtdeoseegee.me dpeedbon.dmBAssuring Ho p 0 Fo follows a F n a distro

This is a new dostrbutron F dosimbutron with1 deg af freedom in the numerator and n 2 degreesof freedommthedenomaidea T is large implies that more of the

Page 2: fechter - McMaster Universitycousingd/stats_3y_3j_notes/...tea to its large implies that more of the variability on Y is explained by the regression which means more eurderce of a

tea to its large implies that more of thevariability on Y is explained by theregression which means more eurderceof a linear relationship

Given an observation to fo based on ondata

ad given a significance level K we should

reject Ho il fo fdj n 2fond n in f tableColum 1 row h 2

Notre that

To ITEM TE ff.IE FoSSEn 2

So two sided t test and f test are

equivalent

for l sided use the t vest

T.si ifm X

f 7 Adequacy ofthe Regression

model

Page 3: fechter - McMaster Universitycousingd/stats_3y_3j_notes/...tea to its large implies that more of the variability on Y is explained by the regression which means more eurderce of a

we have made many assumptionsin order

to apply the regression model

we assume

y p t p Xt E

where a N 0,11

i e we are assuring E rs normallydistributed

the model is actually linearie no

higher order terms MX

How would we examine the normality of E

Here are two techniquesbased on residual analysis

In this context we call the terns

ei yi Jithe error tens 1

residualss we expect the residuals

to be uniformly dosmbutedso Reminder

setee E e E E een

Page 4: fechter - McMaster Universitycousingd/stats_3y_3j_notes/...tea to its large implies that more of the variability on Y is explained by the regression which means more eurderce of a

be a reordering of the residuals in Mueang order

Let Ci be such that

PCz Eciig.su

Then the pants Ceci ci should be

roughly along a straight line use the fat

pencilStandardize the residuals

If we are correct in our assumption

that E B normally doshrbuted

with mean 0 and Vannice 02 then

the sample ee of residuals

can be standardizedto

di i

we should find that 95 of the di's

will be on the internal C 2,2since

PC 2 EZE 2 20 95

Page 5: fechter - McMaster Universitycousingd/stats_3y_3j_notes/...tea to its large implies that more of the variability on Y is explained by the regression which means more eurderce of a

Another way tomeasure the adequacy is

Lou RI the coeffrcoed of determination

we define

R2 R7 l SSISST

E it 755

Since SST Set SSE O E RE EI

Haring largeR is usually end decree of

adequacy of thelinear regression

model

R2 clone to 1 means that the regression

model explains much of the vamaloniktsm

the datanote that one should use RZ cautiously

R2 can be artificially inflatedin various waysere using higher order regression

Page 6: fechter - McMaster Universitycousingd/stats_3y_3j_notes/...tea to its large implies that more of the variability on Y is explained by the regression which means more eurderce of a

techniques g then any set of n postsLx y Cxn Yu there is a polynomial

of degree Cn D that goesthrough

every point and so gives 122 1

how thin rs an example ofoverfitting data and

usually is not useful frofgood for predictions