Lecture 11 - Matrix Approach to Linear Regression


Page 1

Matrix Approach to Linear Regression

Dr. Frank Wood

Page 2

Random Vectors and Matrices

• Let’s say we have a vector consisting of three random variables

The expectation of a random vector is defined
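Presumably the standard element-by-element definitions:

\[
\mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix},
\qquad
\mathrm{E}\{\mathbf{Y}\} = \begin{pmatrix} \mathrm{E}\{Y_1\} \\ \mathrm{E}\{Y_2\} \\ \mathrm{E}\{Y_3\} \end{pmatrix}
\]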

Page 3

Expectation of a Random Matrix

• The expectation of a random matrix is defined similarly
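Presumably, element by element for an n × p random matrix:

\[
\mathrm{E}\{\mathbf{Y}\} = \big[\,\mathrm{E}\{Y_{ij}\}\,\big], \qquad i = 1, \ldots, n; \; j = 1, \ldots, p
\]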

Page 4

Covariance Matrix of a Random Vector

• The collection of variances and covariances of and between the elements of a random vector can be collected into a matrix called the covariance matrix

remember that σ{Yi, Yj} = σ{Yj, Yi}, so the covariance matrix is symmetric
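For the three-variable vector above, this matrix is presumably:

\[
\sigma^2\{\mathbf{Y}\} =
\begin{pmatrix}
\sigma^2\{Y_1\} & \sigma\{Y_1, Y_2\} & \sigma\{Y_1, Y_3\} \\
\sigma\{Y_2, Y_1\} & \sigma^2\{Y_2\} & \sigma\{Y_2, Y_3\} \\
\sigma\{Y_3, Y_1\} & \sigma\{Y_3, Y_2\} & \sigma^2\{Y_3\}
\end{pmatrix}
\]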

Page 5

Derivation of Covariance Matrix

• In vector terms the covariance matrix is defined by the expected outer product of the centered random vector (verify the first entry on the board)
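Reconstructed, the definition is presumably:

\[
\sigma^2\{\mathbf{Y}\} = \mathrm{E}\big\{ (\mathbf{Y} - \mathrm{E}\{\mathbf{Y}\})(\mathbf{Y} - \mathrm{E}\{\mathbf{Y}\})' \big\}
\]

because the (i, j) entry of the outer product is (Yi − E{Yi})(Yj − E{Yj}), whose expectation is σ{Yi, Yj}; in particular the first entry has expectation E{(Y1 − E{Y1})²} = σ²{Y1}.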

Page 6

Regression Example

• Take a regression example with n = 3, constant error standard deviation σ{εi} = σ, and uncorrelated errors, so that σ{εi, εj} = 0 for all i ≠ j

• The covariance matrix for the random vector ε is

which can be written as
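Reconstructed, presumably:

\[
\sigma^2\{\boldsymbol{\epsilon}\} =
\begin{pmatrix}
\sigma^2 & 0 & 0 \\
0 & \sigma^2 & 0 \\
0 & 0 & \sigma^2
\end{pmatrix}
= \sigma^2 \mathbf{I}
\]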

Page 7

Basic Results

• If A is a constant matrix and Y is a random matrix then

is a random matrix
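Presumably W = AY, with the standard results:

\[
\mathrm{E}\{\mathbf{A}\} = \mathbf{A}, \qquad
\mathrm{E}\{\mathbf{W}\} = \mathrm{E}\{\mathbf{A}\mathbf{Y}\} = \mathbf{A}\,\mathrm{E}\{\mathbf{Y}\}, \qquad
\sigma^2\{\mathbf{W}\} = \sigma^2\{\mathbf{A}\mathbf{Y}\} = \mathbf{A}\,\sigma^2\{\mathbf{Y}\}\,\mathbf{A}'
\]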

Page 8

Multivariate Normal Density

• Let Y be a vector of p observations

• Let µ be a vector of p means for each of the p observations
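Presumably:

\[
\mathbf{Y} = \begin{pmatrix} Y_1 \\ \vdots \\ Y_p \end{pmatrix}, \qquad
\boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_p \end{pmatrix}
\]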

Page 9

Multivariate Normal Density

• Let Σ be the covariance matrix of Y

• Then the multivariate normal density is given by
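Presumably the standard multivariate normal density:

\[
f(\mathbf{Y}) = \frac{1}{(2\pi)^{p/2}\,|\boldsymbol{\Sigma}|^{1/2}}
\exp\!\left( -\tfrac{1}{2}\,(\mathbf{Y} - \boldsymbol{\mu})'\,\boldsymbol{\Sigma}^{-1}\,(\mathbf{Y} - \boldsymbol{\mu}) \right)
\]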

Page 10

Example 2d Multivariate Normal Distribution

[Figure: 3-D surface plot of the bivariate normal density mvnpdf([0 0], [10 2; 2 2]) over x, y ∈ [−10, 10]; the density peaks near 0.04.]

Run multivariate_normal_plots.m
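The script multivariate_normal_plots.m is not included in the transcript; a minimal MATLAB sketch that regenerates the surface above (assuming the Statistics Toolbox's mvnpdf) might be:

% Evaluate and plot the bivariate normal density from the slide
[x, y] = meshgrid(-10:0.2:10, -10:0.2:10);   % grid over the plotted range
mu    = [0 0];                               % mean vector
Sigma = [10 2; 2 2];                         % covariance matrix
p = mvnpdf([x(:) y(:)], mu, Sigma);          % density at each grid point
surf(x, y, reshape(p, size(x)));             % 3-D surface of the density
xlabel('x'); ylabel('y');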

Page 11

Matrix Simple Linear Regression

• Nothing new – only matrix formalism for previous results

• Remember the normal error regression model

• This implies
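Reconstructed, the model and its implication are presumably:

\[
Y_i = \beta_0 + \beta_1 X_i + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma^2) \text{ independent}
\]

\[
\Rightarrow \quad Y_i \sim N(\beta_0 + \beta_1 X_i, \; \sigma^2), \text{ independent}
\]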

Page 12

Regression Matrices

• If we identify the following matrices

• We can write the linear regression equations in a compact form
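Presumably the standard identifications:

\[
\mathbf{Y} = \begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix}, \quad
\mathbf{X} = \begin{pmatrix} 1 & X_1 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix}, \quad
\boldsymbol{\beta} = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}, \quad
\boldsymbol{\epsilon} = \begin{pmatrix} \epsilon_1 \\ \vdots \\ \epsilon_n \end{pmatrix}
\]

giving the compact form

\[
\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}
\]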

Page 13

Regression Matrices

• Of course, since in the normal regression model the expected value of each of the εi's is zero, we can write

• This is because
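Presumably, the claim and its justification:

\[
\mathrm{E}\{\mathbf{Y}\} = \mathbf{X}\boldsymbol{\beta}
\]

\[
\mathrm{E}\{\mathbf{Y}\} = \mathrm{E}\{\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}\}
= \mathbf{X}\boldsymbol{\beta} + \mathrm{E}\{\boldsymbol{\epsilon}\}
= \mathbf{X}\boldsymbol{\beta} + \mathbf{0}
\]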

Page 14

Error Covariance

• Because the error terms are independent and have constant variance σ², the covariance matrix of ε is
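Presumably:

\[
\sigma^2\{\boldsymbol{\epsilon}\} = \sigma^2 \mathbf{I}_{n \times n} =
\begin{pmatrix}
\sigma^2 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & \sigma^2
\end{pmatrix}
\]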

Page 15

Matrix Normal Regression Model

• In matrix terms the normal regression model can be written as shown below, where the error vector is multivariate normal
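Presumably:

\[
\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon},
\qquad \text{where } \mathrm{E}\{\boldsymbol{\epsilon}\} = \mathbf{0}
\text{ and } \sigma^2\{\boldsymbol{\epsilon}\} = \sigma^2\mathbf{I},
\]

i.e. ε ∼ N(0, σ²I), so that E{Y} = Xβ and σ²{Y} = σ²I.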

Page 16

Least Squares Estimation

• Starting from the normal equations you have derived, we can see that these equations are equivalent to a single matrix equation, shown below
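Reconstructed, the normal equations and their matrix form are presumably:

\[
n b_0 + b_1 \sum X_i = \sum Y_i, \qquad
b_0 \sum X_i + b_1 \sum X_i^2 = \sum X_i Y_i
\]

\[
\mathbf{X}'\mathbf{X}\,\mathbf{b} = \mathbf{X}'\mathbf{Y},
\qquad
\mathbf{b} = \begin{pmatrix} b_0 \\ b_1 \end{pmatrix}
\]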

demonstrate this on board

Page 17

Estimation

• We can solve this equation (if the inverse of X'X exists) by premultiplying both sides by that inverse, as shown below
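Presumably:

\[
(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{X}\,\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}
\]

and since (X'X)⁻¹X'X = I we have

\[
\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}
\]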

Page 18

Least Squares Solution

• The matrix normal equations can be derived directly from the minimization of the least squares criterion Q w.r.t. β

Do this on board.
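A sketch of the board derivation, assuming the standard criterion:

\[
Q = (\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})'(\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})
\]

\[
\frac{\partial Q}{\partial \boldsymbol{\beta}} = -2\,\mathbf{X}'\mathbf{Y} + 2\,\mathbf{X}'\mathbf{X}\boldsymbol{\beta} = \mathbf{0}
\;\Rightarrow\;
\mathbf{X}'\mathbf{X}\,\mathbf{b} = \mathbf{X}'\mathbf{Y}
\]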

Page 19

Fitted Values and Residuals

• Let the vector of the fitted values be

in matrix notation we then have
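Presumably:

\[
\hat{\mathbf{Y}} = \begin{pmatrix} \hat{Y}_1 \\ \vdots \\ \hat{Y}_n \end{pmatrix} = \mathbf{X}\mathbf{b}
\]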

Page 20

Hat Matrix – Puts hat on Y

• We can also directly express the fitted values in terms of only the X and Y matrices

and we can further define H, the “hat matrix”

• The hat matrix plays an important role in diagnostics for regression analysis.

write H on board
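Reconstructed, presumably:

\[
\hat{\mathbf{Y}} = \mathbf{X}\mathbf{b} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\,\mathbf{Y} = \mathbf{H}\mathbf{Y},
\qquad
\mathbf{H} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'
\]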

Page 21

Hat Matrix Properties

• The hat matrix is symmetric

• The hat matrix is idempotent, i.e.

demonstrate on board
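A sketch of the board demonstration:

\[
\mathbf{H}' = \big(\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\big)' = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}' = \mathbf{H}
\]

\[
\mathbf{H}\mathbf{H} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\,\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'
= \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}' = \mathbf{H}
\]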

Page 22

Residuals

• The residuals, like the fitted values Ŷi, can be expressed as linear combinations of the response variable observations Yi
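Presumably:

\[
\mathbf{e} = \mathbf{Y} - \hat{\mathbf{Y}} = \mathbf{Y} - \mathbf{H}\mathbf{Y} = (\mathbf{I} - \mathbf{H})\,\mathbf{Y}
\]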

Page 23

Covariance of Residuals

• Starting with e = (I − H)Y we can derive the covariance matrix of the residuals; since σ²{Y} = σ²I and I − H is symmetric and idempotent (check), it simplifies as shown below, and we can plug in MSE for σ² as an estimate
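The chain of equations, reconstructed:

\[
\sigma^2\{\mathbf{e}\} = (\mathbf{I} - \mathbf{H})\,\sigma^2\{\mathbf{Y}\}\,(\mathbf{I} - \mathbf{H})'
= \sigma^2 (\mathbf{I} - \mathbf{H})(\mathbf{I} - \mathbf{H})'
= \sigma^2 (\mathbf{I} - \mathbf{H})
\]

with estimate s²{e} = MSE · (I − H).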

Page 24

ANOVA

• We can express the ANOVA results in matrix form as well, starting with

where J is the n × n matrix of all ones (do a 3×3 example on the board)
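Presumably:

\[
\mathrm{SSTO} = \sum_i (Y_i - \bar{Y})^2
= \mathbf{Y}'\mathbf{Y} - \tfrac{1}{n}\,\mathbf{Y}'\mathbf{J}\mathbf{Y}
\]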

Page 25

SSE

• Remember

• We have

• Simplified

derive these on board
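Reconstructed, presumably:

\[
\mathrm{SSE} = \sum_i (Y_i - \hat{Y}_i)^2 = \mathbf{e}'\mathbf{e}
= (\mathbf{Y} - \mathbf{X}\mathbf{b})'(\mathbf{Y} - \mathbf{X}\mathbf{b})
\]

\[
= \mathbf{Y}'\mathbf{Y} - 2\,\mathbf{b}'\mathbf{X}'\mathbf{Y} + \mathbf{b}'\mathbf{X}'\mathbf{X}\mathbf{b}
= \mathbf{Y}'\mathbf{Y} - \mathbf{b}'\mathbf{X}'\mathbf{Y}
\]

(the last step uses the normal equations X'Xb = X'Y).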

Page 26

SSR

• It can be shown that

– for instance, remember SSR = SSTO − SSE

write these on board
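Presumably:

\[
\mathrm{SSR} = \mathrm{SSTO} - \mathrm{SSE}
= \mathbf{b}'\mathbf{X}'\mathbf{Y} - \tfrac{1}{n}\,\mathbf{Y}'\mathbf{J}\mathbf{Y}
\]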

Page 27

Tests and Inference

• The ANOVA tests and inferences we can perform are the same as before

• Only the algebraic method of getting the quantities changes

• Matrix notation is a writing shortcut, not a computational shortcut

Page 28

Quadratic Forms

• The ANOVA sums of squares can be shown to be quadratic forms. An example of a quadratic form is given by

• Note that this can be expressed in matrix notation as (where A is a symmetric matrix)

do on board
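The slide's specific example is not preserved; a representative quadratic form and its matrix expression, with A symmetric:

\[
5Y_1^2 + 6Y_1Y_2 + 4Y_2^2
= \begin{pmatrix} Y_1 & Y_2 \end{pmatrix}
\begin{pmatrix} 5 & 3 \\ 3 & 4 \end{pmatrix}
\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}
= \mathbf{Y}'\mathbf{A}\mathbf{Y}
\]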

Page 29

Quadratic Forms

• In general, a quadratic form is defined by

A is the matrix of the quadratic form.
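Presumably the standard definition, for symmetric A:

\[
\mathbf{Y}'\mathbf{A}\mathbf{Y} = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}\, Y_i Y_j
\]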

• The ANOVA sums SSTO, SSE, and SSR are all quadratic forms.

Page 30

ANOVA quadratic forms

• Consider the following re-expression of b'X'

• With this it is easy to see that
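A sketch of the re-expression and the resulting quadratic forms, reconstructed from Ŷ = Xb = HY and H' = H:

\[
\mathbf{b}'\mathbf{X}' = (\mathbf{X}\mathbf{b})' = (\mathbf{H}\mathbf{Y})' = \mathbf{Y}'\mathbf{H}
\]

\[
\mathrm{SSTO} = \mathbf{Y}'\big(\mathbf{I} - \tfrac{1}{n}\mathbf{J}\big)\mathbf{Y}, \quad
\mathrm{SSE} = \mathbf{Y}'(\mathbf{I} - \mathbf{H})\mathbf{Y}, \quad
\mathrm{SSR} = \mathbf{Y}'\big(\mathbf{H} - \tfrac{1}{n}\mathbf{J}\big)\mathbf{Y}
\]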

Page 31

Inference

• We can derive the sampling variance of the β vector estimator by remembering that

where A is a constant matrix

which yields
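Presumably:

\[
\sigma^2\{\mathbf{A}\mathbf{Y}\} = \mathbf{A}\,\sigma^2\{\mathbf{Y}\}\,\mathbf{A}'
\]

and, since b = (X'X)⁻¹X'Y is of this form with A = (X'X)⁻¹X',

\[
\sigma^2\{\mathbf{b}\} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\;\sigma^2\mathbf{I}\;\big((\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\big)'
\]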

Page 32

Variance of b

• Since the inverse of X'X is symmetric we can write

and thus
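Presumably:

\[
\big((\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\big)' = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}
\]

\[
\sigma^2\{\mathbf{b}\}
= \sigma^2\,(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}
= \sigma^2\,(\mathbf{X}'\mathbf{X})^{-1}
\]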

Page 33

Variance of b

• Of course this assumes that we know σ². If we don't, we, as usual, replace it with the MSE, giving the estimated covariance matrix

Page 34

Mean Response

• To estimate the mean response we can create the following matrix

• The fit (or prediction) and its expectation are then
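Presumably, for a new predictor level Xh:

\[
\mathbf{X}_h = \begin{pmatrix} 1 \\ X_h \end{pmatrix}, \qquad
\hat{Y}_h = \mathbf{X}_h'\,\mathbf{b}, \qquad
\mathrm{E}\{\hat{Y}_h\} = \mathbf{X}_h'\,\boldsymbol{\beta} = \mathrm{E}\{Y_h\}
\]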

Page 35

Variance of Mean Response

• Is given by

and is arrived at in the same way as for the variance of b

• Similarly the estimated variance in matrix notation is given by
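Presumably:

\[
\sigma^2\{\hat{Y}_h\} = \mathbf{X}_h'\,\sigma^2\{\mathbf{b}\}\,\mathbf{X}_h
= \sigma^2\,\mathbf{X}_h'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}_h,
\qquad
s^2\{\hat{Y}_h\} = \mathrm{MSE}\,\mathbf{X}_h'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}_h
\]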

Page 36

Wrap-Up

• Expectation and variance of random vectors and matrices

• Simple linear regression in matrix form

• Next: multiple regression