Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal, yuppal@ysu.edu
TRANSCRIPT
![Page 2: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/2.jpg)
Sampling Distribution of b1
Expected value of b1:
$E(b_1) = \beta_1$
Variance of b1:
$\mathrm{Var}(b_1) = \sigma^2 / SS_x$
![Page 3: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/3.jpg)
Estimate of σ²
The mean square error (MSE) provides the estimate of σ².
$s^2 = \mathrm{MSE} = \mathrm{SSE}/(n - 2)$
where: $\mathrm{SSE} = \sum (y_i - \hat{y}_i)^2$
![Page 4: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/4.jpg)
Sample variance of b1
Estimate of variance of b1:
$\mathrm{Var}(b_1) = s^2 / SS_x = \mathrm{MSE} / SS_x$
Standard error of b1:
$SE(b_1) = \sqrt{s^2 / SS_x} = \sqrt{\mathrm{MSE} / SS_x} = s / \sqrt{SS_x}$
s is called the standard error of the estimate.
![Page 5: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/5.jpg)
Interval Estimate of β1
The (1 − α)100% confidence interval for β1 is:
$b_1 \pm t_{\alpha/2}\, SE(b_1)$
where $t_{\alpha/2}$ is the value from the t distribution with (n − 2) degrees of freedom such that the probability in the upper tail is α/2.
![Page 6: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/6.jpg)
Example: Reed Auto Sales
$s^2 = \mathrm{MSE} = \mathrm{SSE}/(n - 2) = 8.2/3 = 2.73$
$SE(b_1) = \sqrt{s^2 / SS_x} = \sqrt{2.73/4} = 0.83$
95% confidence interval for β1:
$4.5 \pm 3.182(0.83) = 4.5 \pm 2.63$
We can say with 95% confidence that β1 lies between 1.87 and 7.13.
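The Reed Auto Sales interval can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `slope_confidence_interval` is hypothetical, n = 5 is inferred from the slide's n − 2 = 3, and the critical value 3.182 is taken from the slide rather than computed.

```python
import math

# Values read off the Reed Auto Sales slide.
sse = 8.2        # sum of squared errors
n = 5            # sample size, inferred from n - 2 = 3 degrees of freedom
ss_x = 4.0       # sum of squared deviations of x
b1 = 4.5         # estimated slope
t_crit = 3.182   # t_{.025} with 3 degrees of freedom, as quoted on the slide


def slope_confidence_interval(b1, sse, n, ss_x, t_crit):
    """Return (standard error, lower bound, upper bound) for the slope."""
    mse = sse / (n - 2)              # s^2 = MSE = SSE / (n - 2)
    se_b1 = math.sqrt(mse / ss_x)    # SE(b1) = sqrt(MSE / SS_x)
    margin = t_crit * se_b1
    return se_b1, b1 - margin, b1 + margin


se_b1, lower, upper = slope_confidence_interval(b1, sse, n, ss_x, t_crit)
print(f"SE(b1) = {se_b1:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Rounded to two decimals, the result agrees with the slide: a standard error of 0.83 and an interval from 1.87 to 7.13.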
![Page 7: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/7.jpg)
Testing for Significance: t Test
Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Test Statistic
$t = \dfrac{b_1 - 0}{SE(b_1)}$
where b1 is the slope estimate and SE(b1) is the standard error of b1.
![Page 8: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/8.jpg)
Testing for Significance: t Test
Rejection Rule
Reject $H_0$ if p-value ≤ α, or if $t \le -t_{\alpha/2}$ or $t \ge t_{\alpha/2}$,
where $t_{\alpha/2}$ is based on a t distribution with n − 2 degrees of freedom.
![Page 9: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/9.jpg)
Testing for Significance: t Test
1. Determine the hypotheses.
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
2. Specify the level of significance.
α = .05
3. Select the test statistic.
$t = \dfrac{b_1}{SE(b_1)}$
4. State the rejection rule.
Reject $H_0$ if p-value ≤ .05, or if t ≤ −3.182 or t ≥ 3.182.
![Page 10: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/10.jpg)
Testing for Significance: t Test
5. Compute the value of the test statistic.
$t = \dfrac{b_1}{SE(b_1)} = \dfrac{4.5}{0.83} = 5.42$
6. Determine whether to reject $H_0$.
t = 5.42 > $t_{\alpha/2}$ = 3.182. We can reject $H_0$.
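Steps 5 and 6 can be sketched in code as follows. The helper `slope_t_test` is a hypothetical name used only for illustration; it compares the statistic against the tabulated critical value from the slide (3.182) instead of computing a p-value, and it uses the rounded standard error 0.83 exactly as the slide does.

```python
def slope_t_test(b1, se_b1, t_crit):
    """Two-tailed t test of H0: beta1 = 0 against a tabulated critical value."""
    t_stat = b1 / se_b1
    # Reject when the statistic falls in either tail.
    reject = abs(t_stat) >= t_crit
    return t_stat, reject


# Slope, rounded standard error, and critical value from the Reed Auto Sales slides.
t_stat, reject = slope_t_test(b1=4.5, se_b1=0.83, t_crit=3.182)
print(f"t = {t_stat:.2f}, reject H0: {reject}")
```

Writing the rule with `abs(t_stat)` covers both tails at once, which is equivalent to checking t ≤ −3.182 or t ≥ 3.182.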
![Page 11: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/11.jpg)
Some Cautions about the Interpretation of Significance Tests
Just because we are able to reject $H_0: \beta_1 = 0$ and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y.
Rejecting $H_0: \beta_1 = 0$ and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y.
![Page 12: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/12.jpg)
Multiple Regression Model
The equation that describes how the dependent variable y is related to the independent variables x1, x2, . . . , xp and an error term is called the multiple regression model.
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon$
where: $\beta_0, \beta_1, \beta_2, \dots, \beta_p$ are the parameters, and $\varepsilon$ is a random variable called the error term.
![Page 13: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/13.jpg)
Estimated Multiple Regression Equation
A simple random sample is used to compute sample statistics $b_0, b_1, b_2, \dots, b_p$ that are used as the point estimators of the parameters $\beta_0, \beta_1, \beta_2, \dots, \beta_p$.
The estimated multiple regression equation is:
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$
![Page 14: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/14.jpg)
Interpreting the Coefficients
In multiple regression analysis, we interpret each regression coefficient as follows: $b_i$ represents an estimate of the change in y corresponding to a 1-unit increase in $x_i$ when all other independent variables are held constant.
![Page 15: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/15.jpg)
Example: Car Sales
Multiple Regression Model
Suppose we believe that the number of cars sold (y) is related not only to the number of ads (x1), but also to the minimum down payment required (x2). The regression model can be given by:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
where
y = number of cars sold
x1 = number of ads
x2 = minimum down payment required (‘000)
![Page 16: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/16.jpg)
Estimated Regression Equation
$\hat{y} = 14.4 + 3.7 x_1 + 0.251 x_2$
Interpretation? Estimated values of y? Error? Prediction?
![Page 17: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/17.jpg)
Multiple Coefficient of Determination
Relationship Among SST, SSR, SSE
SST = SSR + SSE
$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$
where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error
![Page 18: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/18.jpg)
Multiple Coefficient of Determination
$R^2 = \mathrm{SSR}/\mathrm{SST}$
$R^2 = 84.63/89.2 = .949$
Adjusted Multiple Coefficient of Determination
$R_a^2 = 1 - (1 - R^2)\dfrac{n - 1}{n - p - 1}$
Standard Error of Estimate
$s = \sqrt{\mathrm{MSE}} = \sqrt{\dfrac{\mathrm{SSE}}{n - p - 1}}$
![Page 19: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/19.jpg)
Testing for Significance: t Test
Hypotheses
$H_0: \beta_i = 0$
$H_a: \beta_i \neq 0$
Test Statistic
$t = \dfrac{b_i}{SE(b_i)}$
Rejection Rule
Reject $H_0$ if p-value ≤ α, or if $t \le -t_{\alpha/2}$ or $t \ge t_{\alpha/2}$, where $t_{\alpha/2}$ is based on a t distribution with n − p − 1 degrees of freedom.
![Page 20: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/20.jpg)
Example: Testing for significance of coefficients
Hypotheses
$H_0: \beta_i = 0$
$H_a: \beta_i \neq 0$
Rejection Rule
For α = .05 and d.f. = ?, $t_{.025}$ =
Test Statistic
$t = \dfrac{b_i}{SE(b_i)}$
![Page 21: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/21.jpg)
Testing for Significance of Regression: F Test
Hypotheses
$H_0: \beta_1 = \beta_2 = \dots = \beta_p = 0$
$H_a$: One or more of the parameters is not equal to zero.
Test Statistic
F = MSR/MSE
Rejection Rule
Reject $H_0$ if p-value ≤ α or if $F \ge F_\alpha$, where $F_\alpha$ is based on an F distribution with p d.f. in the numerator and n − p − 1 d.f. in the denominator.
![Page 22: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/22.jpg)
Example 2: Programmer Salary Survey
Multiple Regression Model
A software firm collected data for a sample of 20 computer programmers. A suggestion was made that regression analysis could be used to determine if salary was related to the years of experience and the score on the firm’s programmer aptitude test.
The years of experience, score on the aptitude test, and corresponding annual salary ($1000s) for a sample of 20 programmers are shown on the next slide.
![Page 23: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/23.jpg)
Multiple Regression Model

| Exper. | Score | Salary | Exper. | Score | Salary |
|---|---|---|---|---|---|
| 4 | 78 | 24 | 9 | 88 | 38 |
| 7 | 100 | 43 | 2 | 73 | 26.6 |
| 1 | 86 | 23.7 | 10 | 75 | 36.2 |
| 5 | 82 | 34.3 | 5 | 81 | 31.6 |
| 8 | 86 | 35.8 | 6 | 74 | 29 |
| 10 | 84 | 38 | 8 | 87 | 34 |
| 0 | 75 | 22.2 | 4 | 79 | 30.1 |
| 1 | 80 | 23.1 | 6 | 94 | 33.9 |
| 6 | 83 | 30 | 3 | 70 | 28.2 |
| 6 | 91 | 33 | 3 | 89 | 30 |
![Page 24: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/24.jpg)
Multiple Regression Model
Suppose we believe that salary (y) is related to the years of experience (x1) and the score on the programmer aptitude test (x2) by the following regression model:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
where
y = annual salary ($1000)
x1 = years of experience
x2 = score on programmer aptitude test
![Page 25: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/25.jpg)
Solving for β0, β1 and β2:

| | Coeffic. | Std. Err. |
|---|---|---|
| Intercept | 3.17394 | 6.15607 |
| Experience | 1.4039 | 0.19857 |
| Test Score | 0.25089 | 0.07735 |
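The coefficients above come from spreadsheet output. As a rough cross-check, the sketch below fits the same model by solving the normal equations in plain Python. It assumes the 20 observations are as recovered from the survey table on the earlier slide (years of experience, test score, salary in $1000s). The function names `solve_linear_system` and `fit_ols` are hypothetical, and the slides do not say which computational method produced their output.

```python
# Observations recovered from the survey slide: (experience, test score, salary in $1000s).
data = [
    (4, 78, 24.0), (7, 100, 43.0), (1, 86, 23.7), (5, 82, 34.3), (8, 86, 35.8),
    (10, 84, 38.0), (0, 75, 22.2), (1, 80, 23.1), (6, 83, 30.0), (6, 91, 33.0),
    (9, 88, 38.0), (2, 73, 26.6), (10, 75, 36.2), (5, 81, 31.6), (6, 74, 29.0),
    (8, 87, 34.0), (4, 79, 30.1), (6, 94, 33.9), (3, 70, 28.2), (3, 89, 30.0),
]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    design = [[1.0, x1, x2] for x1, x2, _ in rows]
    y = [salary for _, _, salary in rows]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


b0, b1, b2 = fit_ols(data)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}")
```

The normal equations are fine for a small, well-behaved problem like this one; for larger or nearly collinear designs a QR-based solver is usually the safer choice.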
![Page 26: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/26.jpg)
Anova Table

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F-statistic |
|---|---|---|---|---|
| Regression | 500.34 | …… | …… | …… |
| Error | …… | …… | …… | |
| Total | 599.8 | …… | | |
![Page 27: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/27.jpg)
Estimated Regression Equation
SALARY = 3.174 + 1.404(EXPER) + 0.251(SCORE)
b1 = 1.404 implies that salary is expected to increase by $1,404 for each additional year of experience (when the variable score on programmer aptitude test is held constant).
b2 = 0.251 implies that salary is expected to increase by $251 for each additional point scored on the programmer aptitude test (when the variable years of experience is held constant).
![Page 28: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/28.jpg)
Prediction
Suppose Bob had 4 years of experience and a score of 78 on the aptitude test. What would you estimate (or expect) his salary to be?
ŷ = 3.174 + 1.404(4) + 0.251(78) ≈ 28.358
Bob’s estimated salary is $28,358.
![Page 29: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/29.jpg)
Error
Bob’s actual salary is $24,000. How much error did we make in estimating his salary based on his experience and score?
error = y − ŷ = 24000 − 28358 = −4358
So we overestimate Bob’s salary.
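The prediction and its error can be checked with a short sketch. It uses the unrounded coefficients from the regression output (3.17394, 1.4039, 0.25089); the function name `predict_salary` is hypothetical. Note that the rounded coefficients shown in the equation sum to 28.368, while the unrounded ones give about 28.359, which is closer to the slide's figure of 28.358.

```python
# Unrounded coefficients from the regression output: intercept, experience, test score.
B0, B1, B2 = 3.17394, 1.4039, 0.25089


def predict_salary(experience_years, test_score):
    """Estimated salary in $1000s from the fitted equation."""
    return B0 + B1 * experience_years + B2 * test_score


y_hat = predict_salary(4, 78)   # Bob: 4 years of experience, score of 78
actual = 24.0                   # Bob's actual salary in $1000s
residual = actual - y_hat       # negative means the equation overestimates

print(f"estimated salary = {y_hat:.3f} (in $1000s)")
print(f"residual = {residual:.3f} (in $1000s)")
```

A negative residual means the fitted equation predicts more than Bob actually earns, which is the overestimate described on the slide.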
![Page 30: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/30.jpg)
Multiple Coefficient of Determination
Relationship Among SST, SSR, SSE
SST = SSR + SSE
$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$
where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error
![Page 31: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/31.jpg)
Multiple Coefficient of Determination
$R^2 = \mathrm{SSR}/\mathrm{SST}$
$R^2 = 500.3285/599.7855 = .83418$
Adjusted Multiple Coefficient of Determination
$R_a^2 = 1 - (1 - R^2)\dfrac{n - 1}{n - p - 1}$
$R_a^2 = 1 - (1 - .834179)\dfrac{20 - 1}{20 - 2 - 1} = .814671$
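Both measures are simple enough to verify directly. The sketch below plugs the slide's sums of squares into the two formulas; the function names are hypothetical and chosen only for readability.

```python
def r_squared(ssr, sst):
    """Multiple coefficient of determination: R^2 = SSR / SST."""
    return ssr / sst


def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Sums of squares, sample size, and number of predictors from the programmer salary example.
r2 = r_squared(ssr=500.3285, sst=599.7855)
r2_adj = adjusted_r_squared(r2, n=20, p=2)
print(f"R^2 = {r2:.5f}, adjusted R^2 = {r2_adj:.6f}")
```

The adjustment always pulls the value down when p ≥ 1, because (n − 1)/(n − p − 1) is greater than 1; here it moves R² from about .834 to about .815.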
![Page 32: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/32.jpg)
Testing for Significance: t Test
Hypotheses
$H_0: \beta_i = 0$
$H_a: \beta_i \neq 0$
Test Statistic
$t = \dfrac{b_i}{SE(b_i)}$
Rejection Rule
Reject $H_0$ if p-value ≤ α, or if $t \le -t_{\alpha/2}$ or $t \ge t_{\alpha/2}$, where $t_{\alpha/2}$ is based on a t distribution with n − p − 1 degrees of freedom.
![Page 33: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/33.jpg)
Example
Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Rejection Rule
For α = .05 and d.f. = 17, $t_{.025}$ = 2.11
Reject $H_0$ if p-value ≤ .05, or if t ≤ −2.11 or t ≥ 2.11
Test Statistic
$t = \dfrac{b_1}{SE(b_1)} = \dfrac{1.404}{0.199} = 7.07$
Since t = 7.07 > $t_{.025}$ = 2.11, we reject $H_0$.
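The same test can be run for both slope coefficients at once. This sketch takes the unrounded coefficients and standard errors from the regression output and the critical value 2.11 from the slide; the dictionary layout is only an illustrative choice. Using the unrounded inputs gives 7.07 for experience, whereas the rounded 1.404/0.199 works out to about 7.06.

```python
# (coefficient, standard error) pairs from the regression output.
coefficients = {
    "Experience": (1.4039, 0.19857),
    "Test Score": (0.25089, 0.07735),
}

n, p = 20, 2
df = n - p - 1     # degrees of freedom for each t test
t_crit = 2.11      # t_{.025} with 17 d.f., as quoted on the slide

# t statistic for H0: beta_i = 0 is the coefficient divided by its standard error.
t_stats = {name: b / se for name, (b, se) in coefficients.items()}

for name, t in t_stats.items():
    decision = "reject H0" if abs(t) >= t_crit else "do not reject H0"
    print(f"{name}: t = {t:.2f} with {df} d.f., {decision}")
```

Both statistics exceed 2.11 in absolute value, so under these inputs each coefficient is individually significant at the .05 level.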
![Page 34: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/34.jpg)
Testing for Significance of Regression: F Test
Hypotheses
$H_0: \beta_1 = \beta_2 = \dots = \beta_p = 0$
$H_a$: One or more of the parameters is not equal to zero.
Test Statistic
F = MSR/MSE
Rejection Rule
Reject $H_0$ if p-value ≤ α or if $F \ge F_\alpha$, where $F_\alpha$ is based on an F distribution with p d.f. in the numerator and n − p − 1 d.f. in the denominator.
![Page 35: Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal yuppal@ysu.edu](https://reader030.vdocuments.mx/reader030/viewer/2022032611/56649e855503460f94b86f79/html5/thumbnails/35.jpg)
Example
Hypotheses
$H_0: \beta_1 = \beta_2 = 0$
$H_a$: One or both of the parameters is not equal to zero.
Rejection Rule
For α = .05 and d.f. = 2, 17; $F_{.05}$ = 3.59
Reject $H_0$ if p-value ≤ .05 or F ≥ 3.59
Test Statistic
F = MSR/MSE = 250.17/5.86 = 42.8
F = 42.8 > $F_{.05}$ = 3.59, so we can reject $H_0$.
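The F statistic follows mechanically from the sums of squares, the sample size, and the number of predictors. The sketch below uses the more precise sums of squares quoted on the coefficient-of-determination slide (500.3285 and 599.7855), so its mean squares differ slightly in the last digit from the rounded 250.17 and 5.86 shown here; the function name `anova_f` is hypothetical.

```python
def anova_f(ssr, sst, n, p):
    """Return (SSE, MSR, MSE, F) for a regression with p independent variables."""
    sse = sst - ssr              # SST = SSR + SSE
    msr = ssr / p                # regression mean square, p d.f.
    mse = sse / (n - p - 1)      # error mean square, n - p - 1 d.f.
    return sse, msr, mse, msr / mse


sse, msr, mse, f_stat = anova_f(ssr=500.3285, sst=599.7855, n=20, p=2)
f_crit = 3.59   # F_{.05} with 2 and 17 d.f., as quoted on the slide

print(f"SSE = {sse:.3f}, MSR = {msr:.2f}, MSE = {mse:.2f}, F = {f_stat:.1f}")
print("reject H0" if f_stat >= f_crit else "do not reject H0")
```

With these inputs F rounds to 42.8, matching the slide, and comfortably exceeds the critical value of 3.59.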