Slides by John Loucks, St. Edward’s University

© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.



Chapter 14, Part A: Simple Linear Regression

• Simple Linear Regression Model
• Least Squares Method
• Coefficient of Determination
• Model Assumptions
• Testing for Significance



Simple Linear Regression

Managerial decisions often are based on the relationship between two or more variables.

Regression analysis can be used to develop an equation showing how the variables are related.

The variable being predicted is called the dependent variable and is denoted by y.

The variables being used to predict the value of the dependent variable are called the independent variables and are denoted by x.

Simple linear regression involves one independent variable and one dependent variable.

The relationship between the two variables is approximated by a straight line.

Regression analysis involving two or more independent variables is called multiple regression.



Simple Linear Regression Model

The equation that describes how y is related to x and an error term is called the regression model.

The simple linear regression model is:

$y = \beta_0 + \beta_1 x + \varepsilon$

where:
• $\beta_0$ and $\beta_1$ are called parameters of the model,
• $\varepsilon$ is a random variable called the error term.



Simple Linear Regression Equation

The simple linear regression equation is:

$E(y) = \beta_0 + \beta_1 x$

• The graph of the regression equation is a straight line.
• $\beta_0$ is the y intercept of the regression line.
• $\beta_1$ is the slope of the regression line.
• $E(y)$ is the expected value of y for a given x value.




Positive Linear Relationship

[Figure: regression line of E(y) against x, with intercept $\beta_0$; the slope $\beta_1$ is positive.]




Negative Linear Relationship

[Figure: regression line of E(y) against x, with intercept $\beta_0$; the slope $\beta_1$ is negative.]




No Relationship

[Figure: regression line of E(y) against x, with intercept $\beta_0$; the slope $\beta_1$ is 0.]



Estimated Simple Linear Regression Equation

The estimated simple linear regression equation is:

$\hat{y} = b_0 + b_1 x$

• The graph is called the estimated regression line.
• $b_0$ is the y intercept of the line.
• $b_1$ is the slope of the line.
• $\hat{y}$ is the estimated value of y for a given x value.



Estimation Process

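• Regression model: $y = \beta_0 + \beta_1 x + \varepsilon$
• Regression equation: $E(y) = \beta_0 + \beta_1 x$
• Unknown parameters: $\beta_0$, $\beta_1$
• Sample data: $(x_1, y_1), \ldots, (x_n, y_n)$
• Estimated regression equation: $\hat{y} = b_0 + b_1 x$
• Sample statistics: $b_0$, $b_1$
• $b_0$ and $b_1$ provide estimates of $\beta_0$ and $\beta_1$.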



Least Squares Method

Least Squares Criterion

$\min \sum (y_i - \hat{y}_i)^2$

where:
• $y_i$ = observed value of the dependent variable for the ith observation
• $\hat{y}_i$ = estimated value of the dependent variable for the ith observation
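The criterion can be checked numerically. The sketch below is illustrative only: the four data points and the two competing candidate lines are made up for this example and do not come from the slides. It evaluates the sum of squared differences between observed and estimated values for several lines and confirms that the least squares line gives the smallest value among them.

```python
# Hypothetical data, chosen only to illustrate the criterion.
x = [1, 2, 3, 4]
y = [2, 4, 5, 8]

def sum_squared_errors(b0, b1, x, y):
    """Sum of (y_i - y_hat_i)^2 for the line y_hat = b0 + b1*x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

def least_squares(x, y):
    """Return (b0, b1) that minimize the sum of squared errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = least_squares(x, y)
best = sum_squared_errors(b0, b1, x, y)

# Two arbitrary competing lines (assumptions for the comparison).
candidates = [(0.0, 2.0), (1.0, 1.5)]
others = [sum_squared_errors(c0, c1, x, y) for c0, c1 in candidates]

print(best, others)
```

Any other choice of intercept and slope would also give a sum of squared errors at least as large as `best`.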



Slope for the Estimated Regression Equation

$b_1 = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$

where:
• $x_i$ = value of independent variable for ith observation
• $y_i$ = value of dependent variable for ith observation
• $\bar{x}$ = mean value for independent variable
• $\bar{y}$ = mean value for dependent variable



y-Intercept for the Estimated Regression Equation


$b_0 = \bar{y} - b_1 \bar{x}$



Example: Reed Auto Sales

Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown on the next slide.




Example: Reed Auto Sales

Number of TV Ads (x)    Number of Cars Sold (y)
         1                        14
         3                        24
         2                        18
         1                        17
         3                        27

$\sum x = 10$, $\sum y = 100$, $\bar{x} = 2$, $\bar{y} = 20$



Estimated Regression Equation

Slope for the Estimated Regression Equation

$b_1 = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \dfrac{20}{4} = 5$

y-Intercept for the Estimated Regression Equation

$b_0 = \bar{y} - b_1 \bar{x} = 20 - 5(2) = 10$

Estimated Regression Equation

$\hat{y} = 10 + 5x$
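The slope and intercept for the Reed Auto data can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `least_squares` is an illustrative choice, not something defined in the slides.

```python
# Reed Auto sample from the slides: TV ads (x) and cars sold (y).
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]

def least_squares(x, y):
    """Return (b0, b1) for the estimated regression line y_hat = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = least_squares(x, y)
print(b0, b1)  # 10.0 5.0
```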



Coefficient of Determination

Relationship Among SST, SSR, SSE

SST = SSR + SSE

$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$

where:
• SST = total sum of squares
• SSR = sum of squares due to regression
• SSE = sum of squares due to error



The coefficient of determination is:

$r^2 = \mathrm{SSR}/\mathrm{SST}$

where:
• SSR = sum of squares due to regression
• SST = total sum of squares




$r^2 = \mathrm{SSR}/\mathrm{SST} = 100/114 = .8772$

The regression relationship is very strong; 87.72% of the variability in the number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.
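The three sums of squares behind this result can be computed directly from the Reed Auto data and the estimated equation with intercept 10 and slope 5. The following is a sketch with illustrative variable names.

```python
# Reed Auto sample and the estimated regression equation from the slides.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0

y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]  # estimated values

sst = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # due to regression
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # due to error

r_squared = ssr / sst

print(sst, ssr, sse)        # 114.0 100.0 14.0
print(round(r_squared, 4))  # 0.8772
```

The printed values also confirm the identity SST = SSR + SSE for this sample (114 = 100 + 14).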



Sample Correlation Coefficient

$r_{xy} = (\text{sign of } b_1)\sqrt{\text{Coefficient of Determination}}$

$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$

where: $b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$



Sample Correlation Coefficient

$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$

The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is “+”.

$r_{xy} = +\sqrt{.8772}$

$r_{xy} = +.9366$
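As a cross-check, the correlation coefficient obtained from the sign of the slope and the coefficient of determination can be compared with the usual product-moment formula. The product-moment formula is not shown on these slides and is included here as an assumption for comparison; variable names are illustrative.

```python
import math

# Reed Auto sample from the slides.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
b1 = 5.0              # slope of the estimated regression equation
r_squared = 100 / 114  # SSR / SST from the slides

# Route 1: sign of b1 times the square root of r^2.
r_from_r2 = math.copysign(1.0, b1) * math.sqrt(r_squared)

# Route 2 (assumption, not on the slides): product-moment formula.
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
r_direct = sxy / math.sqrt(sxx * syy)

print(round(r_from_r2, 4), round(r_direct, 4))  # 0.9366 0.9366
```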



Assumptions About the Error Term $\varepsilon$

1. The error $\varepsilon$ is a random variable with mean of zero.

2. The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.

3. The values of $\varepsilon$ are independent.

4. The error $\varepsilon$ is a normally distributed random variable.



Testing for Significance

To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero.

Two tests are commonly used: the t test and the F test.

Both the t test and F test require an estimate of $\sigma^2$, the variance of $\varepsilon$ in the regression model.



An Estimate of $\sigma^2$

The mean square error (MSE) provides the estimate of $\sigma^2$, and the notation $s^2$ is also used.

$s^2 = \mathrm{MSE} = \mathrm{SSE}/(n - 2)$

where:

$\mathrm{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$




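An Estimate of $\sigma$

• To estimate $\sigma$ we take the square root of $s^2$, the estimate of $\sigma^2$.
• The resulting s is called the standard error of the estimate.

$s = \sqrt{\mathrm{MSE}} = \sqrt{\dfrac{\mathrm{SSE}}{n - 2}}$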
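For the Reed Auto example, SSE = 114 - 100 = 14 and n = 5, so both estimates follow directly. A short sketch (variable names are illustrative):

```python
import math

# Values from the Reed Auto example in the slides.
sse = 14.0  # sum of squares due to error (SST - SSR = 114 - 100)
n = 5       # number of observations

mse = sse / (n - 2)  # s^2, the estimate of the error variance
s = math.sqrt(mse)   # standard error of the estimate

print(round(mse, 3), round(s, 3))  # 4.667 2.16
```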



Testing for Significance: t Test

Hypotheses

$H_0: \beta_1 = 0$

$H_a: \beta_1 \neq 0$

Test Statistic

$t = \dfrac{b_1}{s_{b_1}}$

where

$s_{b_1} = \dfrac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$



Rejection Rule

Reject $H_0$ if p-value < $\alpha$ or t < $-t_{\alpha/2}$ or t > $t_{\alpha/2}$

where: $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom



1. Determine the hypotheses.

$H_0: \beta_1 = 0$

$H_a: \beta_1 \neq 0$

2. Specify the level of significance.

$\alpha = .05$

3. Select the test statistic.

$t = \dfrac{b_1}{s_{b_1}}$

4. State the rejection rule.

Reject $H_0$ if p-value < .05 or |t| > 3.182 (with 3 degrees of freedom)




5. Compute the value of the test statistic.

$t = \dfrac{b_1}{s_{b_1}} = \dfrac{5}{1.08} = 4.63$

6. Determine whether to reject $H_0$.

t = 4.541 provides an area of .01 in the upper tail. Hence, the p-value is less than .02. (Also, t = 4.63 > 3.182.) We can reject $H_0$.
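The test statistic can be reproduced from the Reed Auto data. This sketch uses only the standard library, so it compares t with the critical value 3.182 quoted on the slides rather than computing a p-value; variable names are illustrative.

```python
import math

# Reed Auto sample and slope estimate from the slides.
x = [1, 3, 2, 1, 3]
b1 = 5.0
sse = 14.0
n = len(x)

s = math.sqrt(sse / (n - 2))  # standard error of the estimate
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

s_b1 = s / math.sqrt(sxx)  # estimated standard deviation of b1
t = b1 / s_b1

t_critical = 3.182  # t value for alpha/2 = .025 with 3 d.f., from the slides
reject_h0 = abs(t) > t_critical

print(round(s_b1, 2), round(t, 2), reject_h0)  # 1.08 4.63 True
```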



Confidence Interval for $\beta_1$

We can use a 95% confidence interval for $\beta_1$ to test the hypotheses just used in the t test.

$H_0$ is rejected if the hypothesized value of $\beta_1$ is not included in the confidence interval for $\beta_1$.



The form of a confidence interval for $\beta_1$ is:

$b_1 \pm t_{\alpha/2}\, s_{b_1}$

where:
• $b_1$ is the point estimator
• $t_{\alpha/2}\, s_{b_1}$ is the margin of error
• $t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with n - 2 degrees of freedom




Rejection Rule

Reject $H_0$ if 0 is not included in the confidence interval for $\beta_1$.

95% Confidence Interval for $\beta_1$

$b_1 \pm t_{\alpha/2}\, s_{b_1} = 5 \pm 3.182(1.08) = 5 \pm 3.44$

or 1.56 to 8.44

Conclusion

0 is not included in the confidence interval. Reject $H_0$.
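The interval can be recomputed from the quantities in the Reed Auto example. This is a sketch with illustrative names; the t value 3.182 is taken from the slides rather than computed.

```python
import math

# Quantities from the Reed Auto example in the slides.
b1 = 5.0          # point estimator of the slope
sse = 14.0
n = 5
sxx = 4.0         # sum of squared deviations of x about its mean
t_half_alpha = 3.182  # t value for a 95% interval with 3 d.f., from the slides

s_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
margin = t_half_alpha * s_b1

lower, upper = b1 - margin, b1 + margin
zero_in_interval = lower <= 0.0 <= upper

print(round(lower, 2), round(upper, 2), zero_in_interval)  # 1.56 8.44 False
```

Because 0 lies outside the interval, the sketch reaches the same conclusion as the t test.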



Testing for Significance: F Test

Hypotheses

$H_0: \beta_1 = 0$

$H_a: \beta_1 \neq 0$

Test Statistic

F = MSR/MSE



Rejection Rule

Reject $H_0$ if p-value < $\alpha$ or F > $F_\alpha$

where: $F_\alpha$ is based on an F distribution with 1 degree of freedom in the numerator and n - 2 degrees of freedom in the denominator



1. Determine the hypotheses.

$H_0: \beta_1 = 0$

$H_a: \beta_1 \neq 0$

2. Specify the level of significance.

$\alpha = .05$

3. Select the test statistic.

F = MSR/MSE

4. State the rejection rule.

Reject $H_0$ if p-value < .05 or F > 10.13 (with 1 d.f. in numerator and 3 d.f. in denominator)




5. Compute the value of the test statistic.

F = MSR/MSE = 100/4.667 = 21.43

6. Determine whether to reject $H_0$.

F = 17.44 provides an area of .025 in the upper tail. Thus, the p-value corresponding to F = 21.43 is less than .025. Hence, we reject $H_0$.

The statistical evidence is sufficient to conclude that we have a significant relationship between the number of TV ads aired and the number of cars sold.
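The F statistic follows from the sums of squares in the Reed Auto example. In this sketch MSR is taken as SSR divided by the 1 numerator degree of freedom stated in the rejection rule; variable names are illustrative, and the critical value 10.13 is the one quoted on the slides.

```python
# Sums of squares from the Reed Auto example in the slides.
ssr = 100.0
sse = 14.0
n = 5

msr = ssr / 1        # 1 degree of freedom in the numerator
mse = sse / (n - 2)  # n - 2 degrees of freedom in the denominator
f_stat = msr / mse

f_critical = 10.13  # F value for alpha = .05 with (1, 3) d.f., from the slides
reject_h0 = f_stat > f_critical

print(round(f_stat, 2), reject_h0)  # 21.43 True
```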



Some Cautions about the Interpretation of Significance Tests

Just because we are able to reject $H_0: \beta_1 = 0$ and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y.

Rejecting $H_0: \beta_1 = 0$ and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y.



End of Chapter 14, Part A
