Transcript
Page 1:

Multiple Linear Regression and the General Linear Model

More Statistics tutorial at www.dumblittledoctor.com

Page 2:

Outline

1. Introduction to Multiple Linear Regression
2. Statistical Inference
3. Topics in Regression Modeling
4. Example
5. Variable Selection Methods
6. Regression Diagnostics and Strategy for Building a Model

Page 3:

1. Introduction to Multiple Linear Regression

Page 4:

Multiple Linear Regression

• Regression analysis is a statistical methodology to estimate the relationship of a response variable to a set of predictor variables.
• Multiple linear regression extends the simple linear regression model to the case of two or more predictor variables.

Example: A multiple regression analysis might show us that the demand for a product varies directly with changes in demographic characteristics (age, income) of a market area.

Historical Background
• Francis Galton started using the term "regression" in his biology research.
• Karl Pearson and Udny Yule extended Galton's work to the statistical context.
• Legendre and Gauss developed the method of least squares used in regression analysis.
• Ronald Fisher developed the maximum likelihood method used in the related statistical inference (tests of the significance of regression, etc.).

Page 5:

Probabilistic Model

$y_i$ is the observed value of the random variable (r.v.) $Y_i$, which depends on the fixed predictor values $x_{i1}, x_{i2}, \ldots, x_{ik}$ according to the following model:

$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i, \quad i = 1, 2, \ldots, n$$

Here $\beta_0, \beta_1, \ldots, \beta_k$ are unknown model parameters, and $n$ is the number of observations.

The random errors $\varepsilon_i$ are assumed to be independent r.v.'s with mean 0 and variance $\sigma^2$. Thus the $Y_i$ are independent r.v.'s with variance $\sigma^2$ and mean $\mu_i$, where

$$\mu_i = E(Y_i) = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik}$$
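The mean-response formula above is just a weighted sum; a minimal sketch, where the coefficient values and predictor rows are made up for illustration:

```python
# Mean response mu_i = beta_0 + beta_1*x_i1 + ... + beta_k*x_ik.
# The coefficients and predictor rows below are made-up illustration values.
beta = [1.0, 2.0, -1.0]          # beta_0, beta_1, beta_2 (assumed, k = 2)

def mean_response(beta, x_row):
    """E(Y_i) for one observation; x_row = (x_i1, ..., x_ik)."""
    mu = beta[0]
    for b, x in zip(beta[1:], x_row):
        mu += b * x
    return mu

rows = [(1.0, 0.0), (2.0, 1.0), (0.0, 3.0)]
mus = [mean_response(beta, r) for r in rows]
print(mus)  # [3.0, 4.0, -2.0]
```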

Page 6:

Fitting the Model

• The least squares (LS) method is used to find the line that best fits the equation

$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i, \quad i = 1, 2, \ldots, n$$

• Specifically, LS provides estimates of the unknown model parameters $\beta_0, \beta_1, \ldots, \beta_k$ which minimize $Q$, the sum of squared differences between the observed values $y_i$ and the corresponding points on the line with the same $x$'s:

$$Q = \sum_{i=1}^{n} \left[y_i - (\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik})\right]^2$$

• The LS estimates can be found by taking partial derivatives of $Q$ with respect to the unknown parameters and setting them equal to 0. The result is a set of simultaneous linear equations.

• The resulting solutions, $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k$, are the least squares (LS) estimators of $\beta_0, \beta_1, \ldots, \beta_k$, respectively.

• Note that the LS method is nonparametric: no probability distribution assumptions on $Y$ or $\varepsilon$ are needed.
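For the special case of a single predictor (k = 1), setting the two partial derivatives of Q to zero gives the familiar closed-form solution; a minimal sketch:

```python
# Least-squares fit for one predictor (k = 1), obtained by setting
# dQ/dbeta_0 = dQ/dbeta_1 = 0 and solving the resulting two equations.
def ls_fit(xs, ys):
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1 = sxy / sxx            # slope estimate
    b0 = ybar - b1 * xbar     # intercept estimate
    return b0, b1

# Data lying exactly on y = 3 + 2x is recovered exactly.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0, 9.0]
print(ls_fit(xs, ys))  # (3.0, 2.0)
```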

Page 7:

Goodness of Fit of the Model

• To evaluate the goodness of fit of the LS model, we use the residuals, defined by

$$e_i = y_i - \hat y_i \quad (i = 1, 2, \ldots, n)$$

where the $\hat y_i$ are the fitted values:

$$\hat y_i = \hat\beta_0 + \hat\beta_1 x_{i1} + \hat\beta_2 x_{i2} + \cdots + \hat\beta_k x_{ik} \quad (i = 1, 2, \ldots, n)$$

• An overall measure of the goodness of fit is the error sum of squares (SSE):

$$SSE = \min Q = \sum_{i=1}^{n} e_i^2$$

• A few other definitions, similar to those in simple linear regression:

total sum of squares (SST): $SST = \sum (y_i - \bar y)^2$

regression sum of squares (SSR): $SSR = SST - SSE$

Page 8:

• coefficient of multiple determination:

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

• $0 \le R^2 \le 1$; values closer to 1 represent better fits
• adding predictor variables never decreases $R^2$ and generally increases it

• multiple correlation coefficient (positive square root of $R^2$):

$$R = +\sqrt{R^2}$$

• only the positive square root is used
• $R$ is a measure of the strength of the association between the predictors (the $x$'s) and the one response variable $Y$
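The SSE/SST/R² definitions above translate directly into code; a minimal sketch with made-up observed and fitted values:

```python
# SSE, SST, and R^2 = 1 - SSE/SST from observed and fitted values.
def fit_summary(ys, yhats):
    n = len(ys)
    ybar = sum(ys) / n
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, yhats))
    sst = sum((y - ybar) ** 2 for y in ys)
    return sse, sst, 1 - sse / sst

ys = [1.0, 2.0, 3.0, 4.0]
yhats = [1.25, 1.75, 3.25, 3.75]   # made-up fitted values
sse, sst, r2 = fit_summary(ys, yhats)
print(sse, sst, r2)  # 0.25 5.0 0.95
```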

Page 9:

Multiple Regression Model in Matrix Notation

The multiple regression model can be represented in a compact form using matrix notation. Let:

$$Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}, \quad y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$

be the $n \times 1$ vectors of the r.v.'s $Y_i$, their observed values $y_i$, and the random errors $\varepsilon_i$, respectively, for all $n$ observations. Let:

$$X = \begin{bmatrix} 1 & x_{11} & \cdots & x_{1k} \\ 1 & x_{21} & \cdots & x_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \cdots & x_{nk} \end{bmatrix}$$

be the $n \times (k+1)$ matrix of the values of the predictor variables for all $n$ observations (the first column corresponds to the constant term $\beta_0$).

Page 10:

Let:

$$\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix} \quad \text{and} \quad \hat\beta = \begin{bmatrix} \hat\beta_0 \\ \hat\beta_1 \\ \vdots \\ \hat\beta_k \end{bmatrix}$$

be the $(k+1) \times 1$ vectors of unknown model parameters and their LS estimates, respectively.

• The model can be rewritten as:

$$Y = X\beta + \varepsilon$$

• The simultaneous linear equations whose solution yields the LS estimates:

$$X'X\hat\beta = X'y$$

• If the inverse of the matrix $X'X$ exists, then the solution is given by:

$$\hat\beta = (X'X)^{-1}X'y$$
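The matrix solution above can be sketched with plain Python by forming the normal equations X'Xβ̂ = X'y and solving them with Gaussian elimination. This is a teaching sketch, not a numerically robust solver; the data are made up so that an exact plane is recovered:

```python
# Solve the normal equations X'X beta = X'y with Gaussian elimination.
def transpose(A):
    return [list(r) for r in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def solve(A, b):
    """Gaussian elimination with partial pivoting for A x = b."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def ols(X, y):
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(xr * yi for xr, yi in zip(row, y)) for row in Xt]
    return solve(XtX, Xty)

# First column of ones corresponds to the constant term beta_0.
X = [[1, 0, 1], [1, 1, 0], [1, 2, 2], [1, 3, 1]]
y = [1 + 2 * r[1] - r[2] for r in X]    # exact plane: beta = (1, 2, -1)
print([round(b, 6) for b in ols(X, y)])  # [1.0, 2.0, -1.0]
```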

Page 11:

2. Statistical Inference

Page 12:

Determining the Statistical Significance of the Predictor Variables

• For statistical inference, we need the assumption that

$$\varepsilon_i \overset{iid}{\sim} N(0, \sigma^2)$$

(i.i.d. means independent and identically distributed)

• We test the hypotheses:

$$H_{0j}: \beta_j = 0 \quad \text{vs.} \quad H_{1j}: \beta_j \ne 0$$

• If we can't reject $H_{0j}: \beta_j = 0$, then the corresponding variable $x_j$ is not a significant predictor of $y$.

• It is easily shown that each $\hat\beta_j$ is normal with mean $\beta_j$ and variance $\sigma^2 v_{jj}$, where $v_{jj}$ is the $j$th diagonal entry of the matrix $V = (X'X)^{-1}$.

Page 13:

Deriving a Pivotal Quantity for the Inference on $\beta_j$

• Recall $\hat\beta_j \sim N(\beta_j, \sigma^2 v_{jj})$

• The unbiased estimator of the unknown error variance $\sigma^2$ is given by

$$S^2 = \frac{SSE}{n - (k+1)} = \frac{\sum e_i^2}{n - (k+1)} = \frac{SSE}{\text{d.o.f.}} = MSE$$

• We also know that

$$W = \frac{(n - (k+1))S^2}{\sigma^2} = \frac{SSE}{\sigma^2} \sim \chi^2_{n-(k+1)}$$

and that $S^2$ and $\hat\beta_j$ are statistically independent.

• With

$$Z = \frac{\hat\beta_j - \beta_j}{\sigma\sqrt{v_{jj}}} \sim N(0, 1)$$

and by the definition of the t-distribution, we obtain the pivotal quantity for the inference on $\beta_j$:

$$T = \frac{Z}{\sqrt{W / (n - (k+1))}} = \frac{\hat\beta_j - \beta_j}{S\sqrt{v_{jj}}} \sim t_{n-(k+1)}$$
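The variance estimator S² = SSE / (n − (k + 1)) above is a one-liner in code; a minimal sketch with made-up residuals:

```python
# Unbiased estimate of sigma^2: s^2 = SSE / (n - (k + 1)).
residuals = [0.5, -0.5, 1.0, -1.0, 0.0]    # made-up residuals, n = 5
k = 1                                       # one predictor (assumed)
sse = sum(e * e for e in residuals)         # 2.5
s2 = sse / (len(residuals) - (k + 1))       # 2.5 / 3
print(round(s2, 4))  # 0.8333
```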

Page 14:

Derivation of the Confidence Interval for $\beta_j$

$$P\!\left(-t_{\alpha/2,\, n-(k+1)} \le \frac{\hat\beta_j - \beta_j}{S\sqrt{v_{jj}}} \le t_{\alpha/2,\, n-(k+1)}\right) = 1 - \alpha$$

$$P\!\left(\hat\beta_j - t_{\alpha/2,\, n-(k+1)}\, s\sqrt{v_{jj}} \le \beta_j \le \hat\beta_j + t_{\alpha/2,\, n-(k+1)}\, s\sqrt{v_{jj}}\right) = 1 - \alpha$$

Thus the 100(1 − α)% confidence interval for $\beta_j$ is:

$$\hat\beta_j \pm t_{\alpha/2,\, n-(k+1)}\, SE(\hat\beta_j)$$

where $SE(\hat\beta_j) = s\sqrt{v_{jj}}$
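The interval above is straightforward to compute once the t quantile is in hand. A sketch using the father-height coefficient and standard error from the Galton example later in these slides; since the degrees of freedom there (894) are large, the standard normal quantile from the stdlib is used as an approximation to the t quantile:

```python
from statistics import NormalDist

# 95% CI for a coefficient: beta_hat +/- t * SE(beta_hat).
# For df = 894 the t quantile is close to the normal one, so z_{0.025}
# from NormalDist is used here as an approximation.
beta_hat, se = 0.40598, 0.02921    # father-height estimate and SE
z = NormalDist().inv_cdf(0.975)    # ~1.96
lo, hi = beta_hat - z * se, beta_hat + z * se
print(round(lo, 3), round(hi, 3))  # 0.349 0.463
```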

Page 15:

Derivation of the Hypothesis Test for $\beta_j$ at the Significance Level $\alpha$

Hypotheses:

$$H_0: \beta_j = 0 \quad \text{vs.} \quad H_a: \beta_j \ne 0$$

The test statistic is:

$$T_0 = \frac{\hat\beta_j - 0}{S\sqrt{v_{jj}}} \overset{H_0}{\sim} t_{n-(k+1)}$$

The decision rule of the test is derived based on the Type I error rate $\alpha$. That is,

$$\alpha = P(\text{Reject } H_0 \mid H_0 \text{ is true}) = P(|T_0| \ge c), \quad \text{so} \quad c = t_{\alpha/2,\, n-(k+1)}$$

Therefore, we reject $H_0$ at the significance level $\alpha$ if and only if $|t_0| \ge t_{\alpha/2,\, n-(k+1)}$, where $t_0$ is the observed value of $T_0$.

Page 16:

Another Hypothesis Test

Now consider:

$$H_0: \beta_j = 0 \text{ for all } 0 \le j \le k \quad \text{vs.} \quad H_a: \beta_j \ne 0 \text{ for at least one } 0 \le j \le k$$

When $H_0$ is true, the test statistic is

$$F_0 = \frac{MSR}{MSE} \sim f_{k,\, n-(k+1)}$$

• An alternative and equivalent way to make a decision for a statistical test is through the p-value, defined as:

p = P(observe a test statistic value at least as extreme as the one observed | $H_0$)

• At the significance level $\alpha$, we reject $H_0$ if and only if p < $\alpha$.

Page 17:

The General Hypothesis Test

• Consider the full model:

$$Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \varepsilon_i \quad (i = 1, 2, \ldots, n)$$

• Now consider a partial model:

$$Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_{k-m} x_{i,k-m} + \varepsilon_i \quad (i = 1, 2, \ldots, n)$$

• Hypotheses:

$$H_0: \beta_{k-m+1} = \cdots = \beta_k = 0 \quad \text{vs.} \quad H_a: \beta_j \ne 0 \text{ for at least one } k-m+1 \le j \le k$$

• Test statistic:

$$F_0 = \frac{(SSE_{k-m} - SSE_k)/m}{SSE_k/[n - (k+1)]} \sim f_{m,\, n-(k+1)}$$

• Reject $H_0$ when $F_0 > f_{m,\, n-(k+1),\, \alpha}$
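The test statistic above is simple arithmetic once the two error sums of squares are known; a minimal sketch where the SSE values and sample sizes are made up:

```python
# Partial F statistic for dropping the last m predictors from a full
# model: F = ((SSE_reduced - SSE_full) / m) / (SSE_full / (n - (k+1))).
def partial_F(sse_reduced, sse_full, m, n, k):
    return ((sse_reduced - sse_full) / m) / (sse_full / (n - (k + 1)))

# Made-up values: n = 20 observations, k = 4 predictors, drop m = 2.
F = partial_F(sse_reduced=120.0, sse_full=90.0, m=2, n=20, k=4)
print(F)  # 2.5
```

The computed F would then be compared against the f quantile with (m, n − (k + 1)) degrees of freedom.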

Page 18:

Estimating and Predicting Future Observations

• Let $x^* = (x_0^*, x_1^*, \ldots, x_k^*)'$ and let

$$\hat Y^* = \hat\beta_0 + \hat\beta_1 x_1^* + \cdots + \hat\beta_k x_k^* = x^{*\prime}\hat\beta$$

$$\mu^* = \beta_0 + \beta_1 x_1^* + \cdots + \beta_k x_k^* = x^{*\prime}\beta$$

• The pivotal quantity for $\mu^*$ is

$$T = \frac{\hat Y^* - \mu^*}{s\sqrt{x^{*\prime}Vx^*}} \sim t_{n-(k+1)}$$

• Using this pivotal quantity, we can derive a CI for the estimated mean $\mu^*$:

$$\hat Y^* \pm t_{n-(k+1),\, \alpha/2}\, s\sqrt{x^{*\prime}Vx^*}$$

• Additionally, we can derive a prediction interval (PI) to predict $Y^*$:

$$\hat Y^* \pm t_{n-(k+1),\, \alpha/2}\, s\sqrt{1 + x^{*\prime}Vx^*}$$

Page 19:

3. Topics in Regression Modeling

Page 20:

3.1 Multicollinearity

Definition. The predictor variables are linearly dependent.

This can cause serious numerical and statistical difficulties in fitting the regression model unless "extra" predictor variables are deleted.

Page 21:

How does multicollinearity cause difficulties?

$\hat\beta$ is the solution to the equation $X^T X \hat\beta = X^T y$. Thus $X^T X$ must be invertible in order for $\hat\beta$ to be unique and computable.

If approximate multicollinearity happens:

1. $X^T X$ is nearly singular, which makes $\hat\beta$ numerically unstable. This is reflected in large changes in their magnitudes with small changes in the data.

2. The matrix $V = (X^T X)^{-1}$ has very large elements. Therefore the variances $Var(\hat\beta_j) = \sigma^2 v_{jj}$ are large, which makes the $\hat\beta_j$ statistically nonsignificant.

Page 22:

Measures of Multicollinearity

1. The correlation matrix R. Easy, but can't reflect linear relationships between more than two variables.

2. The determinant of R can be used as a measurement of the singularity of $X^T X$.

3. Variance Inflation Factors (VIF): the diagonal elements of $R^{-1}$. VIF > 10 is regarded as unacceptable.
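Equivalently, the jth diagonal element of $R^{-1}$ equals $1/(1 - R_j^2)$, where $R_j^2$ comes from regressing $x_j$ on the remaining predictors. A minimal sketch of that form, with made-up $R_j^2$ values:

```python
# VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing
# x_j on the other predictors (the R_j^2 values below are made up).
def vif(r2_j):
    return 1.0 / (1.0 - r2_j)

print(round(vif(0.5), 2))   # 2.0  -> harmless
print(round(vif(0.95), 2))  # 20.0 -> above the VIF > 10 red flag
```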

Page 23:

3.2 Polynomial Regression

A special case of a linear model:

$$y = \beta_0 + \beta_1 x + \cdots + \beta_k x^k + \varepsilon$$

Problems:
1. The powers of x, i.e., $x, x^2, \ldots, x^k$, tend to be highly correlated.
2. If k is large, the magnitudes of these powers tend to vary over a rather wide range.

So, keeping k ≤ 3 is a good idea, and almost never use k > 5.

Page 24:

Solutions

1. Centering the x-variable:

$$y = \beta_0^* + \beta_1^*(x - \bar x) + \cdots + \beta_k^*(x - \bar x)^k + \varepsilon$$

Effect: removes the non-essential multicollinearity in the data.

2. Furthermore, we can standardize the data by dividing by the standard deviation of x:

$$\frac{x - \bar x}{s_x}$$

Effect: helps to alleviate the second problem.

3. Use the first few principal components of the original variables instead of the original variables.
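Centering and standardizing, as described in solutions 1 and 2, can be sketched in a few lines (the data values below are made up):

```python
from statistics import mean, stdev

# Center a predictor, then standardize it by its sample standard deviation.
xs = [2.0, 4.0, 6.0, 8.0]            # made-up predictor values
xbar = mean(xs)                      # 5.0
centered = [x - xbar for x in xs]    # [-3.0, -1.0, 1.0, 3.0]
s = stdev(xs)                        # sample standard deviation, ~2.582
standardized = [c / s for c in centered]
print(centered)  # [-3.0, -1.0, 1.0, 3.0]
```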

Page 25:

3.3 Dummy Predictor Variables & The General Linear Model

How do we handle categorical predictor variables?

1. If we have categories of an ordinal variable, such as the prognosis of a patient (poor, average, good), one can assign numerical scores to the categories (poor = 1, average = 2, good = 3).

Page 26:

2. If we have a nominal variable with c ≥ 2 categories, use c − 1 indicator variables, $x_1, \ldots, x_{c-1}$, called dummy variables, to code:

$x_i = 1$ for the ith category, $1 \le i \le c - 1$;

$x_1 = \cdots = x_{c-1} = 0$ for the cth category.

Page 27:

Why don't we just use c indicator variables $x_1, x_2, \ldots, x_c$?

If we use that, there will be a linear dependency among them:

$$x_1 + x_2 + \cdots + x_c = 1$$

This will cause multicollinearity.

Page 28:

Example of the Dummy Variables

For instance, suppose we have four years of quarterly sales data for a certain brand of soda. How can we model the time trend by fitting a multiple regression equation?

Solution: We use quarter as a predictor variable x1. To model the seasonal trend, we use indicator variables x2, x3, x4 for Winter, Spring, and Summer, respectively. For Fall, all three equal zero. That means: Winter = (1,0,0), Spring = (0,1,0), Summer = (0,0,1), Fall = (0,0,0). Then we have the model:

$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i4} + \varepsilon_i \quad (i = 1, \ldots, 16)$$
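The season coding above can be sketched as a small helper that emits c − 1 indicators, with the last category (Fall) as the all-zeros reference:

```python
# c - 1 indicator variables for a c-category nominal variable; the last
# category (Fall) is the reference, coded as all zeros.
SEASONS = ["Winter", "Spring", "Summer", "Fall"]

def dummies(season):
    return tuple(1 if season == s else 0 for s in SEASONS[:-1])

print(dummies("Winter"))  # (1, 0, 0)
print(dummies("Spring"))  # (0, 1, 0)
print(dummies("Fall"))    # (0, 0, 0)
```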

Page 29:

3. Once the dummy variables are included, the resulting regression model is referred to as a "General Linear Model".

This term must be differentiated from that of the "Generalized Linear Model", which includes the "General Linear Model" as a special case with the identity link function:

$$\mu_i = E(Y_i) = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik}$$

The generalized linear model links the model parameters to the predictors through a link function. For another example, we will check out the logit link in the logistic regression this afternoon.

Page 30:

4. Example

Here we revisit the classic "Regression towards Mediocrity in Hereditary Stature" by Francis Galton. He performed a simple regression to predict offspring height based on the average parent height. The slope of the regression line was less than 1, showing that extremely tall parents had less extremely tall children. At the time, Galton did not have multiple regression as a tool, so he had to use other methods to account for the difference between male and female heights. We can now perform multiple regression on parent-offspring height and use multiple variables as predictors.

Page 31:

Example

OUR MODEL:

$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \varepsilon_i$$

Y = height of child
x1 = height of father
x2 = height of mother
x3 = gender of child

Page 32:

Example

In matrix notation:

$$Y = X\beta + \varepsilon$$

$$\hat\beta = (X'X)^{-1}X'y$$

We find that:
β0 = 15.34476
β1 = 0.405978
β2 = 0.321495
β3 = 5.22595

Page 33:

Example

        Father   Mother   Gender   Child
    1   78.5     67       1        73.2
    1   78.5     67       1        69.2
    1   78.5     67       0        69
    1   78.5     67       0        69
X = 1   75.5     66.5     1        73.5
    1   75.5     66.5     1        72.5
    1   75.5     66.5     0        65.5
    1   75.5     66.5     0        65.5
    1   75       64       1        71
    1   75       64       0        68
    …   …        …        …        …

Page 34:

Example

Important calculations:

$$SSE = \sum (y_i - \hat y_i)^2 = 4149.162$$

$$SST = \sum (y_i - \bar y)^2 = 11{,}515.060$$

$\hat y_i = x_i'\hat\beta$ is the predicted height of each child given a set of predictor variables.

$$r^2 = 1 - \frac{SSE}{SST} = 0.6397$$

$$MSE = \frac{SSE}{n - (k+1)} = 4.64$$

Page 35:

Example

Are these values significantly different from zero?

Ho: βj = 0
Ha: βj ≠ 0

Reject $H_0$ if

$$|t_j| = \left|\frac{\hat\beta_j}{SE(\hat\beta_j)}\right| > t_{n-(k+1),\, \alpha/2} = t_{894,\, 0.025} = 2.245$$

$$V = (X'X)^{-1} = \begin{bmatrix} 1.63 & -0.0118 & -0.0125 & -0.00596 \\ -0.0118 & 0.000184 & -0.0000143 & 0.0000223 \\ -0.0125 & -0.0000143 & 0.000211 & -0.0000327 \\ -0.00596 & 0.0000223 & -0.0000327 & 0.00447 \end{bmatrix}$$

$$SE(\hat\beta_j) = \sqrt{MSE \cdot v_{jj}}$$

Page 36:

Example

                β-estimate   SE       t
Intercept       15.3         2.75      5.59*
Father Height    0.406       0.0292   13.9*
Mother Height    0.321       0.0313   10.3*
Gender           5.23        0.144    36.3*

* p < 0.05. We conclude that all β are significantly different from zero.

Page 37:

EXAMPLE

Testing the model as a whole:

Ho: β0 = β1 = β2 = β3 = 0
Ha: The above is not true.

Reject H0 if

$$F = \frac{r^2[n - (k+1)]}{k(1 - r^2)} > f_{k,\, n-(k+1),\, \alpha} = f_{3,\, 894,\, .05} = 2.615$$

$$r^2 = 1 - \frac{SSE}{SST} = 0.6397$$

$$SSE = \sum (y_i - \hat y_i)^2 = 4149.162$$

$$SST = \sum (y_i - \bar y)^2 = 11{,}515.060$$

Since F = 529.032 > 2.615, we reject Ho and conclude that our model predicts height better than by chance.

Page 38:

EXAMPLE

Making Predictions

Let's say George Clooney (71 inches) and Madonna (64 inches) have a baby boy.

$$\hat Y^* = 15.34476 + 0.405978(71) + 0.321495(64) + 5.225951(1) = 69.97$$

95% Prediction interval:

$$\hat Y^* - t_{894,\, .025}\sqrt{MSE(1 + x^{*\prime}Vx^*)} \le Y^* \le \hat Y^* + t_{894,\, .025}\sqrt{MSE(1 + x^{*\prime}Vx^*)}$$

69.97 ± 4.84 = (65.13, 74.81)
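The point prediction above is just the dot product of the predictor vector with the fitted coefficients; a minimal sketch using the estimates from this example:

```python
# Point prediction Y* = x*' beta_hat with the fitted Galton coefficients.
beta = [15.34476, 0.405978, 0.321495, 5.22595]  # intercept, father, mother, gender
x_star = [1, 71, 64, 1]                          # 71" father, 64" mother, boy
y_star = sum(b * x for b, x in zip(beta, x_star))
print(round(y_star, 2))  # 69.97
```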

Page 39:

EXAMPLE

SAS code:

ods graphics on;

data revise;
  set mylib.galton;
  if sex = 'M' then gender = 1.0;
  else gender = 0.0;
run;

proc reg data=revise;
  title "Dependence of Child Heights on Parental Heights";
  model height = father mother gender / vif;
run;
quit;

Alternatively, one can use the proc GLM procedure, which can incorporate the categorical variable (sex) directly via the class statement.

Page 40:

Dependence of Child Heights on Parental Heights

The REG Procedure
Model: MODEL1
Dependent Variable: height

Number of Observations Read  898
Number of Observations Used  898

Analysis of Variance

                             Sum of         Mean
Source            DF        Squares       Square    F Value    Pr > F
Model              3     7365.90034   2455.30011     529.03    <.0001
Error            894     4149.16204      4.64112
Corrected Total  897          11515

Root MSE          2.15433    R-Square   0.6397
Dependent Mean   66.76069    Adj R-Sq   0.6385
Coeff Var         3.22694

Parameter Estimates

                           Parameter    Standard
Variable   Label      DF    Estimate       Error    t Value    Pr > |t|
Intercept  Intercept   1    15.34476     2.74696       5.59      <.0001
father     father      1     0.40598     0.02921      13.90      <.0001
mother     mother      1     0.32150     0.03128      10.28      <.0001
gender                 1     5.22595     0.14401      36.29      <.0001

Parameter Estimates

                           Variance
Variable   Label      DF   Inflation
Intercept  Intercept   1           0
father     father      1     1.00607
mother     mother      1     1.00660
gender                 1     1.00188

Page 41:

EXAMPLE

Page 42:

EXAMPLE

By Gary Bedford & Christine Vendikos

Page 43:

5. Variable Selection Methods

A. Stepwise Regression

Page 44:

Variable Selection Methods

(1) Why do we need to select the variables?

(2) How do we select variables?
* stepwise regression
* best subsets regression

Page 45:

Stepwise Regression

(p − 1)-variable model:

$$Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_{p-1} x_{i,p-1} + \varepsilon_i$$

p-variable model:

$$Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_{p-1} x_{i,p-1} + \beta_p x_{ip} + \varepsilon_i$$

Page 46:

Hypothesis test:

$$H_0: \beta_p = 0 \quad \text{vs.} \quad H_1: \beta_p \ne 0$$

Partial F-test:

$$F = \frac{(SSE_{p-1} - SSE_p)/1}{SSE_p/[n - (p+1)]} > f_{1,\, n-(p+1),\, \alpha}$$

Partial t-test statistic:

$$t = \frac{\hat\beta_p}{SE(\hat\beta_p)}$$

Reject $H_0$ if $|t| > t_{n-(p+1),\, \alpha/2}$

Page 47:

Partial Correlation Coefficients

$$r_{yx_p | x_1 \ldots x_{p-1}}^2 = \frac{SSE(x_1 \ldots x_{p-1}) - SSE(x_1 \ldots x_p)}{SSE(x_1 \ldots x_{p-1})}$$

$$F = t_p^2 = \frac{r_{yx_p | x_1 \ldots x_{p-1}}^2\, [n - (p+1)]}{1 - r_{yx_p | x_1 \ldots x_{p-1}}^2}$$
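The squared partial correlation above is the fraction of the remaining SSE that adding $x_p$ removes; a minimal sketch where the two SSE values are made up:

```python
# Squared partial correlation of x_p with y, given x_1 ... x_{p-1}:
# (SSE without x_p  -  SSE with x_p) / (SSE without x_p).
def partial_r2(sse_without, sse_with):
    return (sse_without - sse_with) / sse_without

# Made-up SSEs: adding x_p drops the SSE from 100 to 80.
print(partial_r2(100.0, 80.0))  # 0.2
```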

Page 48:

5. Variable Selection Methods

A. Stepwise Regression: SAS Example

Page 49:

Example 11.5 (T&D pg. 416), 11.9 (T&D pg. 431)

The following table shows data on the heat evolved in calories during the hardening of cement on a per gram basis (y) along with the percentages of four ingredients: tricalcium aluminate (x1), tricalcium silicate (x2), tetracalcium alumino ferrite (x3), and dicalcium silicate (x4).

No.   X1   X2   X3   X4       Y
 1     7   26    6   60    78.5
 2     1   29   15   52    74.3
 3    11   56    8   20   104.3
 4    11   31    8   47    87.6
 5     7   52    6   33    95.9
 6    11   55    9   22   109.2
 7     3   71   17    6   102.7
 8     1   31   22   44    72.5
 9     2   54   18   22    93.1
10    21   47    4   26   115.9
11     1   40   23   34    83.8
12    11   66    9   12   113.3
13    10   68    8   12   109.4

Page 50:

SAS Program (stepwise variable selection is used)

data example115;
input x1 x2 x3 x4 y;
datalines;
7 26 6 60 78.5
1 29 15 52 74.3
11 56 8 20 104.3
11 31 8 47 87.6
7 52 6 33 95.9
11 55 9 22 109.2
3 71 17 6 102.7
1 31 22 44 72.5
2 54 18 22 93.1
21 47 4 26 115.9
1 40 23 34 83.8
11 66 9 12 113.3
10 68 8 12 109.4
;
run;
proc reg data=example115;
  model y = x1 x2 x3 x4 /selection=stepwise;
run;

Page 51:

Selected SAS output

The SAS System    22:10 Monday, November 26, 2006

The REG Procedure
Model: MODEL1
Dependent Variable: y

Stepwise Selection: Step 4

             Parameter    Standard
Variable      Estimate       Error    Type II SS    F Value    Pr > F
Intercept     52.57735     2.28617    3062.60416     528.91    <.0001
x1             1.46831     0.12130     848.43186     146.52    <.0001
x2             0.66225     0.04585    1207.78227     208.58    <.0001

Bounds on condition number: 1.0551, 4.2205

Page 52: Generalized linear regression - Little Dumb doctor to Multiple Linear Regression 3 tics tutorial at Multiple Linear RegressionMultiple Linear Regression • Regression analysis is

SAS Output (cont)SAS Output (cont)All variables left in the model are significant at the 0.1500 level.

No other variable met the 0.1500 significance level for entry into the model.

Summary of Stepwise SelectionSummary of Stepwise Selection

Variable Variable Number Partial ModelStep Entered Removed Vars In R-Square R-Square C(p) F Value Pr > F

1 x4 1 0.6745 0.6745 138.731 22.80 0.00062 x1 2 0.2979 0.9725 5.4959 108.22 <.00013 x2 3 0.0099 0.9823 3.0182 5.03 0.05174 x4 2 0 0037 0 9787 2 6782 1 86 0 20544 x4 2 0.0037 0.9787 2.6782 1.86 0.2054


Page 53:

5. Variable Selection Methods

B. Best Subsets Regression


Page 54:

Best Subsets Regression

For the stepwise regression algorithm, the final model is not guaranteed to be optimal in any specified sense.

In best subsets regression, a subset of variables is chosen from the collection of all subsets of the k predictor variables that optimizes a well-defined objective criterion.


Page 55:

Best Subsets Regression

In stepwise regression, we get only a single final model.

In best subsets regression, the investigator can specify the size of the predictor subset for the model.


Page 56:

Best Subsets Regression

Optimality Criteria

• rp2-Criterion:
\[ r_p^2 = 1 - \frac{SSE_p}{SST} = \frac{SSR_p}{SST} \]

• Adjusted rp2-Criterion:
\[ r_{adj,p}^2 = 1 - \frac{MSE_p}{MST} \]

• Cp-Criterion (recommended for its ease of computation and its ability to judge the predictive power of a model):
\[ \Gamma_p = \frac{1}{\sigma^2} \sum_{i=1}^{n} \left( E[\hat{Y}_{ip}] - E[Y_i] \right)^2 \]

The sample estimator, Mallows' Cp-statistic, is given by
\[ C_p = \frac{SSE_p}{\hat{\sigma}^2} + 2(p+1) - n \]
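Mallows' Cp is easy to compute once the residual sums of squares are available. Below is a minimal NumPy sketch (Python rather than SAS), with sigma-squared estimated by the MSE of the full four-predictor model, applied to the Hald cement data from the example; `sse` and `mallows_cp` are illustrative helper names:

```python
import numpy as np

# Hald cement data (columns x1..x4), as in the SAS example above
X_raw = np.array([
    [7, 26, 6, 60], [1, 29, 15, 52], [11, 56, 8, 20], [11, 31, 8, 47],
    [7, 52, 6, 33], [11, 55, 9, 22], [3, 71, 17, 6], [1, 31, 22, 44],
    [2, 54, 18, 22], [21, 47, 4, 26], [1, 40, 23, 34], [11, 66, 9, 12],
    [10, 68, 8, 12]], dtype=float)
y = np.array([78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5,
              93.1, 115.9, 83.8, 113.3, 109.4])
n = len(y)

def sse(cols):
    """Residual sum of squares for the model with an intercept and the given predictors."""
    X = np.column_stack([np.ones(n), X_raw[:, list(cols)]])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

# Estimate sigma^2 by the MSE of the full four-predictor model (5 parameters)
sigma2_hat = sse([0, 1, 2, 3]) / (n - 5)

def mallows_cp(cols):
    p = len(cols)  # number of predictors in the subset
    return sse(cols) / sigma2_hat + 2 * (p + 1) - n

print(round(mallows_cp([0, 1]), 4))  # subset {x1, x2}: about 2.68
```

Note that the full model always gets Cp = p + 1 by construction, so Cp is informative only for comparing proper subsets.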


Page 57:

Best Subsets Regression

Algorithm

Note that our problem is to find the minimum of a given criterion function.

Use the stepwise regression algorithm and replace the partial F criterion with another criterion such as Cp.

Enumerate all possible cases and find the minimum of the criterion function. Other possibilities?
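The brute-force enumeration mentioned above is feasible for small k: with k predictors there are only 2^k - 1 non-empty subsets to score. A Python sketch (illustrative, not SAS; `adj_r2` is a hypothetical helper) using the adjusted R2 criterion on the Hald data from the example:

```python
import numpy as np
from itertools import combinations

# Hald cement data (columns x1..x4), as in the SAS example above
X_raw = np.array([
    [7, 26, 6, 60], [1, 29, 15, 52], [11, 56, 8, 20], [11, 31, 8, 47],
    [7, 52, 6, 33], [11, 55, 9, 22], [3, 71, 17, 6], [1, 31, 22, 44],
    [2, 54, 18, 22], [21, 47, 4, 26], [1, 40, 23, 34], [11, 66, 9, 12],
    [10, 68, 8, 12]], dtype=float)
y = np.array([78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5,
              93.1, 115.9, 83.8, 113.3, 109.4])
n = len(y)
sst = float(((y - y.mean()) ** 2).sum())

def adj_r2(cols):
    """Adjusted R^2 for the model with an intercept and the given predictor columns."""
    X = np.column_stack([np.ones(n), X_raw[:, list(cols)]])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    sse = float(r @ r)
    p = len(cols)
    return 1.0 - (sse / (n - p - 1)) / (sst / (n - 1))

# Enumerate all 2^4 - 1 = 15 non-empty subsets and keep the best one
subsets = [c for k in range(1, 5) for c in combinations(range(4), k)]
best = max(subsets, key=adj_r2)
print(best, round(adj_r2(best), 4))
```

For larger k exhaustive search becomes expensive, which is why branch-and-bound algorithms and the stepwise shortcuts are used in practice.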


Page 58:

Best Subsets Regression & SAS

proc reg data=example115;
model y = x1 x2 x3 x4 / selection=ADJRSQ;
run;

For the selection option, SAS has implemented 9 methods in total. For the best subsets method, we have the following options:
Maximum R2 Improvement (MAXR)
Minimum R2 Improvement (MINR)
R2 Selection (RSQUARE)
Adjusted R2 Selection (ADJRSQ)
Mallows' Cp Selection (CP)


Page 59:

6. Building a Multiple Regression Model

Steps and Strategy


Page 60:

Modeling is an iterative process. Several cycles of the steps may be needed before arriving at the final model. The basic process consists of seven steps.


Page 61:

Get Started and Follow the Steps

Categorization by Usage → Collect the Data → Explore the Data → Divide the Data → Fit Candidate Models → Select and Evaluate → Select the Final Model


Page 62:

Step I

Decide the type of model needed, according to its intended usage.
Main categories include:
Predictive
Theoretical
Control
Inferential
Data Summary

• Sometimes, models serve multiple purposes.


Page 63:

Step II

Collect the Data
Predictor (X)
Response (Y)

Data should be relevant and bias-free


Page 64:

Step III

Explore the Data

The linear regression model is sensitive to noise. Thus, we should treat outliers and influential observations cautiously.


Page 65:

Step IV

Divide the Data
Training set: for building the model
Test set: for checking the model

How to divide?
Large sample: half and half
Small sample: size of training set > 16
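These division rules can be sketched in a few lines; below is a minimal Python example (illustrative only; `split_data` is a hypothetical helper, and a simple seeded random shuffle is assumed):

```python
import random

def split_data(rows, seed=0):
    """Randomly divide the data into a training set (for model building)
    and a test set (for model checking): half-half for large samples,
    training-set size kept above 16 for small ones."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n = len(rows)
    cut = n // 2 if n // 2 > 16 else min(n, 17)
    return rows[:cut], rows[cut:]

train, holdout = split_data(range(100))
print(len(train), len(holdout))  # 50 50
```

With only 20 observations, the same helper would put 17 in the training set, honoring the small-sample rule above.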


Page 66:

Step V

Fit Several Candidate Models
Using the training set.


Page 67:

Step VI

Select and Evaluate a Good Model
Check for, and remedy, violations of the model assumptions.


Page 68:

Step VII

Select the Final ModelSelect the Final ModelUse test set to compare competing models by cross validating themmodels by cross-validating them.


Page 69:

Regression Diagnostics (Step VI)

Graphical Analysis of Residuals
Plot estimated errors vs. Xi values
The estimated errors are the differences between the actual Yi and the predicted Yi
Estimated errors are called residuals
Plot a histogram or stem-and-leaf of the residuals
Purposes:
Examine functional form (linearity)
Evaluate violations of assumptions
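Residuals are straightforward to compute once a model is fitted; an illustrative NumPy sketch (Python, not SAS) using the two-predictor Hald model from the example:

```python
import numpy as np

# Fit y on x1, x2 for the Hald data, then form residuals (actual Yi minus predicted Yi)
x1 = np.array([7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10], dtype=float)
x2 = np.array([26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68], dtype=float)
y = np.array([78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5,
              93.1, 115.9, 83.8, 113.3, 109.4])
X = np.column_stack([np.ones_like(y), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ beta  # the estimated errors
# With an intercept in the model, least-squares residuals sum to (numerically) zero
print(abs(float(residuals.mean())) < 1e-8)  # True
```

These `residuals` are what would be plotted against the Xi values, or summarized in a histogram or stem-and-leaf display, for the diagnostic checks listed above.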


Page 70:

Linear Regression Assumptions

The mean of the probability distribution of the error is 0
The probability distribution of the error has constant variance
The probability distribution of the error is normal
Errors are independent


Page 71:

Residual Plot for Functional Form (Linearity)

[Two plots of residuals (e) vs. X: a curved pattern suggests adding an X^2 term; a random scatter indicates correct specification.]


Page 72:

Residual Plot for Equal Variance

[Two plots of standardized residuals (SR) vs. X: a fan-shaped pattern indicates unequal variance; a random scatter indicates correct specification.]
Standardized residuals are typically used (residual divided by the standard error of prediction).
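One common convention divides each residual by its estimated standard deviation, sigma-hat times sqrt(1 - h_ii), where h_ii is the leverage from the hat matrix; a NumPy sketch under that assumption (the slide's "standard error of prediction" may differ slightly from this choice):

```python
import numpy as np

# Same fitted model as in the example: y on x1, x2 for the Hald data
x1 = np.array([7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10], dtype=float)
x2 = np.array([26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68], dtype=float)
y = np.array([78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5,
              93.1, 115.9, 83.8, 113.3, 109.4])
X = np.column_stack([np.ones_like(y), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta

n, k = X.shape
sigma_hat = np.sqrt(float(e @ e) / (n - k))       # residual standard error
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)     # leverages (hat-matrix diagonal)
sr = e / (sigma_hat * np.sqrt(1.0 - h))           # standardized residuals

print(sr.shape)  # (13,)
```

Plotting `sr` against each predictor is what the fan-shape check above calls for: roughly constant spread supports the equal-variance assumption.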


Page 73:

Residual Plot for Independence

[Two plots of standardized residuals (SR) vs. X: a systematic pattern indicates the errors are not independent; a random scatter indicates correct specification.]


