STA 6857—Model Selection & Exploratory Data Analysis (§2.2 cont., 2.3)


Model Selection Removing the Trend Homework 2c

Outline

1 Model Selection

2 Removing the Trend

3 Homework 2c

Arthur Berg STA 6857—Model Selection & Exploratory Data Analysis (§2.2 cont., 2.3) 2/ 27


Model Selection with the F-Statistic

Forward Step: Start with a minimal model. Enlarge the model by adding the variable that produces the largest F-statistic. Repeat until no additional variable produces a significant F-statistic.

Backward Step: Start with a full model. Shrink the model by removing the variable that makes the smallest contribution. Repeat until all remaining variables are significant.

Forward/Backward Step: Enlarge the model by adding the variable that produces the largest F-statistic, then remove the weakest variable as long as its contribution is not significant. Repeat until no additional variable produces a significant F-statistic and all variables in the model are significant.

Example

Minimal Model: $x_t = \beta_1 + w_t$

Complete Model: $x_t = \beta_1 + \beta_2 t + \beta_3 t^2 + \beta_4 \sin t + w_t$

Use the above procedures to find a model between the minimal and complete models, such as $x_t = \beta_1 + \beta_4 \sin t + w_t$.
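A single forward step can be sketched in R with `add1()`, which reports an F-statistic for each candidate term. The simulated series, the seed, and the names `tt`, `minimal`, and `best` are assumptions made for illustration; they are not from the slides.

```r
# Simulated data (an assumption): the true model is x_t = 2 + 3 sin(t) + w_t
set.seed(1)
tt <- 1:100
x <- 2 + 3 * sin(tt) + rnorm(100)

# Minimal model: intercept only
minimal <- lm(x ~ 1)

# F-statistic for adding each candidate term, one at a time
step1 <- add1(minimal, scope = ~ tt + I(tt^2) + sin(tt), test = "F")
print(step1)

# Candidate with the largest F-statistic (the "<none>" row has NA and is skipped)
best <- rownames(step1)[which.max(step1[["F value"]])]
best
```

If the chosen term is significant, it would be added with `update()` and the step repeated on the enlarged model.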


Example Using Backward Step with the F-Statistic in R

You may wish to view my slides on Linear Models with R.

> x <- seq(0, 10, .1)
> y <- 1 + x + cos(x) + rnorm(x)
> yy <- 1 + x + cos(x)
> plot(x, y, lwd = 3)
> lines(x, yy, lwd = 3)


Example in R (cont.)

> f1 <- lm(y ~ x + I(x^2) + I(x^3) + sin(x) + cos(x) + exp(x))
> anova(f1)

Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq  F value    Pr(>F)
x          1 692.94  692.94 515.5497 < 2.2e-16 ***
I(x^2)     1   0.03    0.03   0.0214 0.8841297
I(x^3)     1  19.69   19.69  14.6468 0.0002335 ***
sin(x)     1   1.42    1.42   1.0552 0.3069485
cos(x)     1  11.01   11.01   8.1934 0.0051832 **
exp(x)     1   0.03    0.03   0.0192 0.8900193
Residuals 94 126.34    1.34
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

> f2 <- lm(y ~ x + I(x^2) + I(x^3) + sin(x) + cos(x))
> anova(f2)

Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq  F value    Pr(>F)
x          1 692.94  692.94 520.9277 < 2.2e-16 ***
I(x^2)     1   0.03    0.03   0.0216 0.8835280
I(x^3)     1  19.69   19.69  14.7996 0.0002164 ***
sin(x)     1   1.42    1.42   1.0662 0.3044251
cos(x)     1  11.01   11.01   8.2789 0.0049533 **
Residuals 95 126.37    1.33
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Example in R (cont.)

> f3 <- lm(y ~ x + I(x^3) + sin(x) + cos(x))
> anova(f3)

Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq  F value    Pr(>F)
x          1 692.94  692.94 526.2840 < 2.2e-16 ***
I(x^3)     1   0.34    0.34   0.2549    0.6148
sin(x)     1   0.16    0.16   0.1231    0.7265
cos(x)     1  31.62   31.62  24.0135  3.88e-06 ***
Residuals 96 126.40    1.32
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

> f4 <- lm(y ~ x + I(x^3) + cos(x))
> anova(f4)

Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq  F value    Pr(>F)
x          1 692.94  692.94 531.1043 < 2.2e-16 ***
I(x^3)     1   0.34    0.34   0.2573    0.6132
cos(x)     1  31.62   31.62  24.2370 3.492e-06 ***
Residuals 97 126.56    1.30
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Example in R (cont.)

> f5 <- lm(y ~ x + cos(x))
> anova(f5)

Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value    Pr(>F)
x          1 692.94  692.94 533.564 < 2.2e-16 ***
cos(x)     1  31.24   31.24  24.057 3.717e-06 ***
Residuals 98 127.27    1.30
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

> lines(x, predict(f5), lwd=3, col="magenta")
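The backward step can also be automated. The sketch below is one possible variant, not the procedure used on the slides: `anova()` reports sequential F tests, whose values depend on the order of the terms, whereas `drop1()` tests each term given all the others. The seed, the 0.05 threshold, and the loop itself are assumptions made for illustration, and the selected model will vary with the simulated noise.

```r
# Same simulated setup as in the slides, with a fixed seed (an assumption)
set.seed(6857)
x <- seq(0, 10, 0.1)
y <- 1 + x + cos(x) + rnorm(length(x))

# Start from the full model
fit <- lm(y ~ x + I(x^2) + I(x^3) + sin(x) + cos(x) + exp(x))

repeat {
  labs <- attr(terms(fit), "term.labels")
  if (length(labs) == 0) break              # nothing left to drop
  tab <- drop1(fit, test = "F")             # marginal F test for each term
  pvals <- tab[["Pr(>F)"]][-1]              # skip the "<none>" row
  if (max(pvals) < 0.05) break              # all remaining terms significant
  weakest <- rownames(tab)[-1][which.max(pvals)]
  fit <- update(fit, as.formula(paste(". ~ . -", weakest)))
}

formula(fit)
```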


Estimation of $\sigma^2$

The unbiased estimate of the variance of the white noise (WN), $\sigma^2$, based on a model with $k$ coefficients is given by

$s_k^2 = \frac{RSS_k}{n-k}$

The maximum likelihood estimate of the variance with $k$ coefficients is given by

$\hat\sigma_k^2 = \frac{RSS_k}{n}$

From classical estimation of the variance:

Unbiased: $\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
ML: $\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$
Best (in terms of MSE and under normality): $\frac{1}{n+1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
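As a quick check of the two regression estimates, both can be computed from a fitted `lm` object. The simulated data and the variable names here are assumptions for illustration only.

```r
# Simulated linear-trend data (an assumption, not from the slides)
set.seed(1)
n <- 50
tt <- 1:n
x <- 1 + 0.5 * tt + rnorm(n)

fit <- lm(x ~ tt)
k <- length(coef(fit))        # number of regression coefficients (here 2)
rss <- sum(resid(fit)^2)      # residual sum of squares RSS_k

s2_k <- rss / (n - k)         # unbiased estimate
sigma2_k <- rss / n           # maximum likelihood estimate

c(unbiased = s2_k, ml = sigma2_k)
```

The unbiased estimate is the square of the "Residual standard error" that `summary(fit)` prints, and the ML estimate is always the smaller of the two.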


Akaike’s Information Criterion (AIC)

General model selection idea:

Criterion = Model Error + Penalty for Model Complexity

Think of the model error as the RSS, to be made as small as possible. The penalty increases with the number of parameters.

Definition (AIC)
The AIC for $k$ parameters is defined as

$AIC = \ln \hat\sigma_k^2 + \frac{n + 2k}{n}$

Definition (AICc: AIC for small sample sizes (small n))
The bias-corrected AIC (AICc) for $k$ parameters is defined as

$AICc = \ln \hat\sigma_k^2 + \frac{n + k}{n - k - 2}$
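The two definitions translate directly into a small helper. The function name `ic_slide` and the simulated data are hypothetical; the formulas are the ones given above, with $k$ taken to be the number of regression coefficients.

```r
# AIC and AICc as defined on this slide, computed from an lm fit.
# ic_slide is a hypothetical helper name, not part of any package.
ic_slide <- function(fit) {
  n <- length(resid(fit))
  k <- length(coef(fit))
  sigma2_k <- sum(resid(fit)^2) / n          # ML variance estimate
  c(AIC  = log(sigma2_k) + (n + 2 * k) / n,
    AICc = log(sigma2_k) + (n + k) / (n - k - 2))
}

# Example on simulated data (an assumption)
set.seed(2)
z <- rnorm(40)
y <- 1 + 2 * z + rnorm(40)
small <- lm(y ~ z)
big   <- lm(y ~ z + I(z^2) + I(z^3))
rbind(small = ic_slide(small), big = ic_slide(big))
```

Smaller values are preferred under either criterion.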


Schwarz's Information Criterion (SIC), aka Bayesian Information Criterion (BIC)

Definition (SIC/BIC)
The SIC for $k$ parameters is defined as

$SIC = \ln \hat\sigma_k^2 + \frac{k \ln n}{n}$

The penalty term in the AIC is $\frac{2k}{n}$.

The penalty term in the SIC is $\frac{k \ln n}{n}$.

The SIC penalizes larger models more severely than the AIC.

The SIC is proven to be a statistically consistent model selection method, whereas the AIC is not.
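Comparing the two penalty terms directly shows when the SIC penalty is the larger one. This is a short worked inequality using only the definitions above:

```latex
\frac{k \ln n}{n} > \frac{2k}{n}
\iff \ln n > 2
\iff n > e^{2} \approx 7.39
```

So for any sample with $n \ge 8$ observations, each additional parameter costs more under the SIC than under the AIC.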


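Mallows' $C_p$

The $C_p$ statistic is defined as

$C_p = \frac{RSS_p}{\hat\sigma_q^2} - n + 2p$

where $p$ is the number of parameters in the candidate model and $\hat\sigma_q^2$ is the variance estimate from the largest model under consideration (with $q$ parameters).

Pick the model with the smallest $C_p$ among those with $C_p \approx p$.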
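A minimal sketch of the computation, assuming that $\hat\sigma_q^2$ is the unbiased variance estimate from the full model (a common convention; the slide does not say which estimate is meant). The data, the candidate models, and the helper name `mallows_cp` are illustrative assumptions.

```r
# Simulated data (an assumption)
set.seed(2)
x <- seq(0, 10, 0.1)
y <- 1 + x + cos(x) + rnorm(length(x))
n <- length(y)

# Full model with q parameters and its unbiased variance estimate
full <- lm(y ~ x + I(x^2) + sin(x) + cos(x))
sigma2_q <- sum(resid(full)^2) / df.residual(full)

# Mallows' Cp for a candidate model with p parameters (hypothetical helper)
mallows_cp <- function(fit) {
  p <- length(coef(fit))
  sum(resid(fit)^2) / sigma2_q - n + 2 * p
}

candidates <- list(trend = lm(y ~ x), trend_cos = lm(y ~ x + cos(x)), full = full)
sapply(candidates, mallows_cp)
```

With this choice of $\hat\sigma_q^2$, the full model always has $C_p = q$ exactly, so the comparison is informative only for the smaller candidates.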


Adjusted $R^2$

The adjusted $R^2$ statistic is defined as

$R^2 = 1 - \frac{RSS_p}{\sum (y_i - \bar{y})^2} \cdot \frac{n}{n - p}$

Pick the model with the largest adjusted $R^2$.
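The formula above can be computed by hand and set beside the value R reports. Note that `summary.lm()` scales by $(n-1)/(n-p)$ rather than $n/(n-p)$ for models with an intercept, so the two numbers differ slightly; for a fixed data set both versions rank candidate models the same way, since each is a decreasing function of $RSS_p/(n-p)$. The data and names below are assumptions.

```r
# Simulated data (an assumption)
set.seed(4)
n <- 60
z <- rnorm(n)
y <- 2 + z + rnorm(n)

fit <- lm(y ~ z)
p <- length(coef(fit))
rss <- sum(resid(fit)^2)
tss <- sum((y - mean(y))^2)

adj_slide <- 1 - (rss / tss) * n / (n - p)        # formula as written on the slide
adj_r     <- 1 - (rss / tss) * (n - 1) / (n - p)  # scaling used by summary.lm()

c(slide = adj_slide, r = adj_r, summary = summary(fit)$adj.r.squared)
```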


Example from Text


Scatterplot Matrix

> pairs(cbind(mort, temp, part))


Four Models

Data from 1970 to 1979 in LA.

$M_t$ represents the cardiac mortality.

$T_t$ represents the temperatures.

$P_t$ represents the particulate pollution.

Four proposed models:

1 $M_t = \beta_0 + \beta_1 t + w_t$

2 $M_t = \beta_0 + \beta_1 t + \beta_2 (T_t - T_\cdot) + w_t$

3 $M_t = \beta_0 + \beta_1 t + \beta_2 (T_t - T_\cdot) + \beta_3 (T_t - T_\cdot)^2 + w_t$

4 $M_t = \beta_0 + \beta_1 t + \beta_2 (T_t - T_\cdot) + \beta_3 (T_t - T_\cdot)^2 + \beta_4 P_t + w_t$


Example from Text

> temp = temp-mean(temp)
> temp2 = temp^2
> trend = time(mort)
> fit = lm(mort~ trend + temp + temp2 + part, na.action=NULL)
> summary(fit)

Call:
lm(formula = mort ~ trend + temp + temp2 + part, na.action = NULL)

Residuals:
     Min       1Q   Median       3Q      Max
-19.0760  -4.2153  -0.4878   3.7435  29.2448

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.831e+03  1.996e+02   14.19  < 2e-16 ***
trend       -1.396e+00  1.010e-01  -13.82  < 2e-16 ***
temp        -4.725e-01  3.162e-02  -14.94  < 2e-16 ***
temp2        2.259e-02  2.827e-03    7.99 9.26e-15 ***
part         2.554e-01  1.886e-02   13.54  < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.385 on 503 degrees of freedom
Multiple R-Squared: 0.5954, Adjusted R-squared: 0.5922
F-statistic: 185 on 4 and 503 DF, p-value: < 2.2e-16


Example from Text

> summary(aov(fit))

             Df  Sum Sq Mean Sq F value    Pr(>F)
trend         1 10666.9 10666.9 261.621 < 2.2e-16 ***
temp          1  8606.6  8606.6 211.090 < 2.2e-16 ***
temp2         1  3428.7  3428.7  84.094 < 2.2e-16 ***
part          1  7476.1  7476.1 183.362 < 2.2e-16 ***
Residuals   503 20508.4    40.8
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Example from Text

> num = length(mort)
> AIC(fit)/num - log(2*pi) # AIC

[1] 4.721732

> AIC(fit, k=log(num))/num - log(2*pi) # BIC

[1] 4.771699

> log(sum(resid(fit)^2)/num)+(num+5)/(num-5-2) # AICc

[1] 4.722062

Model  RSS     AICc
1      40,020  5.38
2      31,413  5.14
3      27,985  5.03
4      20,509  4.72
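When reading the rescaled `AIC()` values above, it may help to know how R's built-in function relates to the slide definitions. For an `lm` fit, `AIC()` counts the error variance as an extra parameter, so `AIC(fit)/n - log(2*pi)` matches the slide's AIC formula with $k + 1$ in place of $k$, and the `k = log(n)` version equals $\ln\hat\sigma_k^2 + 1 + (k+1)\ln n / n$. The sketch below checks this on simulated data; the data and names are assumptions.

```r
# Simulated data (an assumption)
set.seed(3)
n <- 200
z <- rnorm(n)
y <- 1 + 2 * z + rnorm(n)

fit <- lm(y ~ z)
k <- length(coef(fit))                 # regression coefficients only
sigma2_k <- sum(resid(fit)^2) / n      # ML variance estimate

# Rescaled values as computed on the slide
aic_r <- AIC(fit) / n - log(2 * pi)
bic_r <- AIC(fit, k = log(n)) / n - log(2 * pi)

# Closed forms with the variance counted as one more parameter
aic_formula <- log(sigma2_k) + (n + 2 * (k + 1)) / n
bic_formula <- log(sigma2_k) + 1 + (k + 1) * log(n) / n

all.equal(aic_r, aic_formula)
all.equal(bic_r, bic_formula)
```

The offsets are the same for every model fitted to the same data, so they do not affect which model is selected.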

2 Removing the Trend

General Principle

If the time series is given as

$x_t = \mu_t + y_t$

where $\mu_t$ denotes a trend and $y_t$ is stationary, then we estimate the trend to uncover the underlying stationary process by considering

$\hat{y}_t = x_t - \hat\mu_t$


Detrending Global Temperature

> globtemp = ts(scan("mydata/globtemp.dat"), start=1856)
> gtemp = window(globtemp, start=1900)
> fit = lm(gtemp~time(gtemp), na.action=NULL)
> plot(gtemp, type="o", ylab="Global Temperature Deviation")
> abline(fit)

> plot(fit$res)
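For readers without the data file, the same detrending steps can be tried on a simulated series. The series below (a linear trend plus noise, starting in 1900) is an assumption that stands in for the temperature data; it is not the real record.

```r
# Simulated stand-in for the temperature series (an assumption)
set.seed(5)
series <- ts(0.01 * (1:100) + rnorm(100, sd = 0.1), start = 1900)

# Estimate a linear trend and subtract it
tm <- as.numeric(time(series))
fit <- lm(series ~ tm)
detrended <- as.numeric(resid(fit))    # estimated y_t = x_t - estimated trend

# Original series with fitted line, then the detrended residuals
plot(series, type = "o", ylab = "Simulated deviation")
abline(fit)
plot(tm, detrended, type = "o", xlab = "Time", ylab = "Detrended")
```

By construction, least-squares residuals have mean zero and no remaining linear trend.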


Global Temperature as a Random Walk

Suppose the mean is stochastic (not fixed as in the regression context). We can model the mean as a random walk with drift, i.e.

$\mu_t = \delta + \mu_{t-1} + w_t$

In this case, differencing $x_t$ makes it stationary:

$\Delta x_t = x_t - x_{t-1} = (\mu_t + y_t) - (\mu_{t-1} + y_{t-1}) = \delta + w_t + y_t - y_{t-1}$

(You will show this is stationary in your homework.)

> plot(diff(gtemp), type="o", main="first difference")
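The differencing identity can be checked numerically on a simulated series. Everything here (the drift value, the noise scales, the starting value $\mu_0 = 0$, and taking $y_t$ to be white noise) is an assumption made for illustration.

```r
# Simulate x_t = mu_t + y_t with a random-walk-with-drift mean
set.seed(7)
n <- 300
delta <- 0.2
w <- rnorm(n)                 # innovations of the random walk
y <- rnorm(n, sd = 0.5)       # stationary component (white noise here)

mu <- cumsum(delta + w)       # mu_t = delta + mu_{t-1} + w_t, with mu_0 = 0
x <- mu + y

dx <- diff(x)                 # first difference of the observed series

# Right-hand side of the identity: delta + w_t + y_t - y_{t-1}
rhs <- delta + w[-1] + diff(y)
all.equal(dx, rhs)

plot(dx, type = "o", main = "first difference")
```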


Backshift and Difference Operators

Definition (Backshift Operator)
The backshift operator, $B$, is defined by

$B x_t = x_{t-1}$

and repeated backshift operations are represented by

$B^k x_t = \underbrace{B B \cdots B}_{k \text{ times}} x_t = x_{t-k}$

Definition (Difference Operator)
The difference operator, $\Delta$, is defined by

$\Delta x_t = (1 - B) x_t = x_t - x_{t-1}$

and applying it $d$ times gives differences of order $d$, which are represented by

$\Delta^d x_t = \underbrace{(1 - B)(1 - B) \cdots (1 - B)}_{d \text{ times}} x_t = (1 - B)^d x_t$
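In R, `diff()` implements these operators: the `differences` argument gives $\Delta^d$, while the `lag` argument gives $(1 - B^k)$, which is a different operation. The short sketch below uses a toy vector (an assumption) to check the second-order expansion $(1 - B)^2 x_t = x_t - 2x_{t-1} + x_{t-2}$.

```r
# Toy series: squares, so second differences are constant
x <- c(1, 4, 9, 16, 25)
n <- length(x)

d1 <- diff(x)                      # first differences: x_t - x_{t-1}
d2 <- diff(x, differences = 2)     # second-order differences

# Expansion (1 - B)^2 x_t = x_t - 2 x_{t-1} + x_{t-2}
d2_manual <- x[3:n] - 2 * x[2:(n - 1)] + x[1:(n - 2)]

# Lag-2 difference (1 - B^2) x_t = x_t - x_{t-2}, not the same as d2
lag2 <- diff(x, lag = 2)

d2
d2_manual
lag2
```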

3 Homework 2c

Textbook Reading

Read the following section from the textbook: §2.4 (Smoothing in the Time Series Context)


Textbook Problems

Do the following exercises from the textbook: 2.6, 2.7
