Chapter 12: Inference for Linear Regression


Upload: tallys

Post on 07-Feb-2016


Page 1: Chapter 12 Inference for Linear Regression

Chapter 12: Inference for Linear Regression

12.1a

H.W.: pg. 759: 1 – 11 odd

Target Goals:

I can make predictions using regression for normal distributions.

I can check conditions for performing inference about the slope β of the population (true) regression line.

Page 2: Chapter 12 Inference for Linear Regression

Inference about the Model

We can use the LSRL fitted to data to predict y for a given value of x for two quantitative variables.

Now we will do tests and construct confidence intervals in this setting.

Page 3: Chapter 12 Inference for Linear Regression

Pg. 752

Page 4: Chapter 12 Inference for Linear Regression

Ex. Crying and IQ

Infants who cry easily may be more easily stimulated than others, and this may be a sign of higher IQ.

The researchers snapped a rubber band on the sole of the foot of infants and caused the infants to cry.

At age 3 years, they measured the children's IQ.

Page 5: Chapter 12 Inference for Linear Regression

Step 1: Make a scatterplot of the data.

Explanatory variable: Crying

Response variable: IQ

Enter "crying" data into L1 and "IQ" data into L2.

Plot and interpret.

STAT:CALC:LinReg(a+bx) L1,L2,Y1

Y1: (VARS:Y-VARS:FUNCT:Y1)

The scatterplot shows a roughly linear pattern.

The correlation r describes the direction and strength of the relationship.

Page 6: Chapter 12 Inference for Linear Regression

Step 2: Calculate the LSRL
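For readers working without a calculator, the least-squares line can be computed directly from the usual formulas b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and a = ȳ − b·x̄. The sketch below is only an illustration: the function name `fit_lsrl` and the five data points are made up for this example and are not the crying/IQ data.

```python
def fit_lsrl(x, y):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # intercept: the line passes through (x-bar, y-bar)
    return a, b


# Made-up illustration data (NOT the crying/IQ study data).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_lsrl(x, y)
print(f"y-hat = {a:.2f} + {b:.2f}x")
```

This plays the same role as LinReg(a+bx) on the calculator: it returns the intercept a and the slope b.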

Page 7: Chapter 12 Inference for Linear Regression

Step 3: Identify outliers and influential points

Influential points

Outliers

• No extreme outliers or potentially influential observations.

Page 8: Chapter 12 Inference for Linear Regression

Step 4: Calculate the Correlation (r value)

The correlation between crying and IQ is r = 0.455.

$\hat{y} = 91.27 + 1.493x$

The $\hat{y}$ reminds us that the regression line gives predictions of IQ.

Page 9: Chapter 12 Inference for Linear Regression

Interpret r² = 0.207: only about 21% of the variation in IQ scores (response variable) is explained by crying intensity.

r² is called the coefficient of determination.

Is prediction of IQ accurate with this model? No.
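As a quick check on the arithmetic, squaring the correlation reported for these data reproduces the coefficient of determination:

```latex
r^2 = (0.455)^2 = 0.207025 \approx 0.207 \quad (\text{about } 21\%)
```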

Page 10: Chapter 12 Inference for Linear Regression

It is interesting, though, that behavior shortly after birth can partly predict IQ.

Page 11: Chapter 12 Inference for Linear Regression

Conditions for Regression Inference

3 SRSs of 20 Old Faithful Eruptions

The values of the slope b for the 1000 sample regression lines are plotted.

The regression predicts how long it will take before Old Faithful erupts again based on the duration of the previous eruption.
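The idea of plotting the slopes from many sample regression lines can be imitated with a small simulation. The sketch below does not use the Old Faithful data: the population values `ALPHA`, `BETA`, and `SIGMA`, the range of x values, and the seed are all assumptions chosen only for illustration. It draws 1000 samples of size 20 from a population that follows a true line plus Normal scatter and records the slope b of each sample's least-squares line.

```python
import random
import statistics


def slope(x, y):
    """Slope b of the least-squares line for the points (x, y)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical population model (assumed values, NOT the Old Faithful data):
# mean response = ALPHA + BETA * x, with Normal scatter of std. dev. SIGMA.
ALPHA, BETA, SIGMA = 30.0, 10.0, 6.0
rng = random.Random(12)  # fixed seed so the run is repeatable


def sample_slope(n=20):
    """Draw one random sample of n points and return its LSRL slope."""
    x = [rng.uniform(1.5, 5.0) for _ in range(n)]
    y = [ALPHA + BETA * xi + rng.gauss(0, SIGMA) for xi in x]
    return slope(x, y)


slopes = [sample_slope() for _ in range(1000)]
print(f"mean of the 1000 sample slopes: {statistics.mean(slopes):.2f}")
print(f"std. dev. of the sample slopes: {statistics.stdev(slopes):.2f}")
```

The sample slopes vary from sample to sample but center near the true slope used to generate the data, which is the behavior that inference about the slope relies on.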

Page 12: Chapter 12 Inference for Linear Regression

Pg. 742

Page 13: Chapter 12 Inference for Linear Regression

Conditions for Regression Inference

Our goal is to predict the behavior of y for a given value of x.

1) Linear: The y responses for various samples vary according to a normal distribution.

The mean response μ_y has a straight-line relationship with x.

The true regression line is written in the form:

$\mu_y = \alpha + \beta x$

Page 14: Chapter 12 Inference for Linear Regression

$\mu_y = \alpha + \beta x$

where μ_y is the mean response, α is the true y-intercept, and β is the true slope.

Page 15: Chapter 12 Inference for Linear Regression

2) Independent: The y responses are independent of each other.

3) Normal: For any fixed value of x, the observed response value y varies according to a normal distribution having mean μ_y.

Page 16: Chapter 12 Inference for Linear Regression

4) Equal Variance: The standard deviation σ about the true regression line is the same for all values of x (constant).

It is usually an unknown parameter.

5) Random: The data come from a well-designed random sample or randomized experiment.

Page 17: Chapter 12 Inference for Linear Regression

Linear

Independent

Normal

Equal Variance

Random

Page 18: Chapter 12 Inference for Linear Regression

The LSRL: $\hat{y} = a + bx$, where b is an unbiased estimator of the true slope β and a is an unbiased estimator of the true intercept α.

Page 19: Chapter 12 Inference for Linear Regression

The line $\mu_y = \alpha + \beta x$ is the true regression line, which shows how the mean response μ_y changes as the explanatory variable x changes.

Page 20: Chapter 12 Inference for Linear Regression

Standard Deviation

σ determines whether the points fall close to the true regression line (small σ) or are widely scattered (large σ).

This is also the size of a typical prediction error if we use the least-squares regression line to predict "how long it will take before Old Faithful erupts again" based on the duration of the previous eruption.

Page 21: Chapter 12 Inference for Linear Regression

Ex: Slope and Intercept

The LSRL is $\hat{y} = 91.27 + 1.493x$, where $\hat{y}$ is the predicted IQ and x is the number of crying peaks.

The slope measures rate of change: how much higher average IQ is for children with one more peak in their crying measurements.

b estimates the unknown β; we estimate that, on average, IQ is about 1.5 points higher for each additional crying peak.
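To see the slope interpretation in numbers, the fitted line from this slide can be wrapped in a small function. The function name `predict_iq` and the crying counts 10 and 11 are hypothetical choices for illustration; only the coefficients 91.27 and 1.493 come from the slide.

```python
def predict_iq(crying_peaks):
    """Predicted IQ from the slide's LSRL: y-hat = 91.27 + 1.493x."""
    return 91.27 + 1.493 * crying_peaks


# Hypothetical crying counts chosen for illustration.
for peaks in (10, 11):
    print(f"{peaks} crying peaks -> predicted IQ {predict_iq(peaks):.2f}")

# Predictions one peak apart differ by the slope b.
difference = predict_iq(11) - predict_iq(10)
print(f"difference for one more peak: {difference:.3f}")
```

An infant with 10 crying peaks has a predicted IQ of about 106.2, and each additional peak raises the prediction by the slope, about 1.5 points.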

Page 22: Chapter 12 Inference for Linear Regression

Standard Deviation

σ describes the variability of the response y about the true regression line.

Recall that residuals estimate how much y varies about the true line and are the vertical deviations of the data points from the least-squares line:

Residual = observed y – predicted y

Page 23: Chapter 12 Inference for Linear Regression

Standard Error about the LSRL

We estimate σ with s, the sample standard deviation, which is also called the standard error (this is the key to inference about the regression).

Since σ is unknown, we use s to estimate the value of σ.

Note: (n – 2) is the degrees of freedom for the regression model.

$s = \sqrt{\dfrac{\sum \text{RESID}^2}{n - 2}}$
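The calculation s = sqrt(Σ resid² / (n − 2)) can be sketched in a few lines. The function name `regression_standard_error` and the five data points are made up for illustration (they are not the crying/IQ data); a = 2.2 and b = 0.6 are the least-squares intercept and slope for that made-up data.

```python
import math


def regression_standard_error(x, y, a, b):
    """Standard error s about the line y-hat = a + b*x."""
    # Residual = observed y - predicted y
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    sum_sq_resid = sum(r ** 2 for r in residuals)
    # Divide by the degrees of freedom n - 2, then take the square root.
    return math.sqrt(sum_sq_resid / (len(x) - 2))


# Made-up illustration data (NOT the crying/IQ study data).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6  # least-squares intercept and slope for this made-up data

s = regression_standard_error(x, y, a, b)
print(f"s = {s:.4f}")
```

Here the squared residuals sum to 2.4, so s² = 2.4 / 3 = 0.8 and s is about 0.894.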

Page 24: Chapter 12 Inference for Linear Regression

Ex. Calculating Residuals and Standard Error

The quickest way to do this is to: (use ex 14.1 data).

Enter "crying" data into L1 and "IQ" data into L2. (We already did this.)

Recall: LinReg(a+bx) automatically calculates the residuals and stores them in "RESID."

Store "RESID" in L3.

STAT:CALC:1-Var Stats L3 reports ∑x², which here is ∑ resid².

Page 25: Chapter 12 Inference for Linear Regression

To find s, first find s².

To find s²:

• Enter the value of ∑X² by hand or (VARS:5: : ∑X²) and divide by (n – 2).

Take sqrt to find s.

$s = \sqrt{\dfrac{\sum \text{RESID}^2}{n - 2}}$

Page 26: Chapter 12 Inference for Linear Regression

A level C confidence interval for the slope β of the true regression line is

$b \pm t^* SE_b$

where the standard error of the slope is

$SE_b = \dfrac{s}{\sqrt{\sum (x - \bar{x})^2}}$

and t* is the critical value for the t distribution with df = n – 2.
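The standard error of the slope, SE_b = s / sqrt(Σ(x − x̄)²), is simple to compute once s is known. In the sketch below the function name `slope_standard_error`, the x values, and the value of s are all assumptions made up for illustration, not the crying/IQ data.

```python
import math


def slope_standard_error(x, s):
    """SE_b = s / sqrt(sum of squared deviations of x from its mean)."""
    x_bar = sum(x) / len(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return s / math.sqrt(sxx)


# Made-up illustration values (NOT the crying/IQ study data).
x = [1, 2, 3, 4, 5]
s = math.sqrt(0.8)  # assumed standard error about the line

se_b = slope_standard_error(x, s)
print(f"SE_b = {se_b:.4f}")
```

The formula shows why more spread in the x values gives a more precise slope: a larger Σ(x − x̄)² makes SE_b smaller.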

Page 27: Chapter 12 Inference for Linear Regression

You will rarely have to calculate this by hand.

Regression software gives you the standard error SE_b and b itself.

Page 28: Chapter 12 Inference for Linear Regression

Ex. Regression Output: Crying and IQ

Statistic

Page 29: Chapter 12 Inference for Linear Regression

There are 38 data points, so df = n – 2 = 36.

Find the critical value t*. For a 95% C.I. for the true slope β, use critical value t* = 2.042 with df = 30 from Table C. (Table C has no row for df = 36, so we use the next smaller df.)

Do

$b \pm t^* SE_b = 1.4929 \pm (2.042)(0.4870) = 1.4929 \pm 0.9944$

= 0.4985 to 2.4873
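The interval b ± t*·SE_b can be reproduced with the values given on this slide (b = 1.4929, t* = 2.042, SE_b = 0.4870). The function name `slope_confidence_interval` is a hypothetical helper written for this sketch.

```python
def slope_confidence_interval(b, t_star, se_b):
    """Return (low, high) for the interval b +/- t* * SE_b."""
    margin = t_star * se_b
    return b - margin, b + margin


# Values taken from the slide: slope, critical value, standard error of slope.
low, high = slope_confidence_interval(b=1.4929, t_star=2.042, se_b=0.4870)
print(f"95% CI for the slope: {low:.3f} to {high:.3f}")
```

Carrying full precision through the calculation gives endpoints that agree with the slide's interval to about three decimal places; small differences in the last digit come from rounding the margin of error.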

Page 30: Chapter 12 Inference for Linear Regression

Conclude

We are 95% confident that mean IQ increases by between 0.5 and 2.5 points for each additional peak in crying.

Page 31: Chapter 12 Inference for Linear Regression

Interpret SE_b

SE_b estimates how much the slope of the sample regression line typically varies from the slope of the population (true) regression line if we repeat the data production process many times.

If we repeated the experiment many times, the slope of the sample regression line would typically vary by about 0.4870 from the slope of the true regression line for predicting IQ from cry count of infants.