
Upload: beatrice-fox

Post on 28-Dec-2015


Time Series Analysis – Chapter 4: Hypothesis Testing

Hypothesis testing is basic to the scientific method and statistical theory gives us a way of conducting tests of scientific hypotheses. Scientific philosophy today rests on the idea of falsification: For a theory to be a valid scientific theory it must be possible, at least in principle, to make observations that would prove the theory false. For example, here is a simple theory:

All swans are white


This is a valid scientific theory because there is a way to falsify it: observing a single black swan would make the theory fall. For more on the history and philosophy of falsification, I suggest reading Karl Popper.


Besides the idea of falsification, we must keep in mind the other basic tenet of the scientific method: All evidence that supports a theory or falsifies it must be empirically based and reproducible.


In other words, data! Just holding a belief (no matter how firm) that a theory is true or false is not a justifiable stance. This chapter gives us the most basic statistical tools for taking data or empirical evidence and using it to substantiate or nullify (show to be false) a hypothesis.


I have just used the word hypothesis, and this chapter is concerned with hypothesis testing, not theory testing. This is because theories are composed of many hypotheses and, usually, a theory is not directly supported or attacked; instead, one or more of its supporting hypotheses are scrutinized.


Discrimination or Not Activity

Null Hypothesis Ho: No Discrimination

Alternative Hypothesis Ha: Discrimination

How do we choose which hypothesis to support?


The p-value

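• The p-value is the probability, computed assuming the null hypothesis is true, of seeing data at least as extreme as what was observed.

• The smaller the p-value, the stronger the evidence against the null hypothesis (and so the more support for the alternative hypothesis).

• The typical significance level is 5% or 0.05: reject the null hypothesis when the p-value falls below it.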


Fourth Graders Feet Data Set

Predictor variable (x): Childs Age
Response variable (y): Foot Length
Model: $y = \beta_0 + \beta_1 x + u$
Test: Ho: $\beta_1 = 0$ -> x has no effect on y

H1: $\beta_1 \neq 0$ -> x has an effect on y


Fourth Graders Feet Data Set – Minitab Output

Predictor variable (x): Childs Age
Response variable (y): Foot Length

The regression equation is
Foot Length = 18.1 + 0.0358 Childs Age

Predictor    Coef      SE Coef   T      P
Constant     18.138    3.753     4.83   0.000
Childs Age   0.03575   0.02922   1.22   0.229
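The T column in output like this is the coefficient divided by its standard error. The following is a minimal Python sketch using only the numbers printed above. The function names are ours, not Minitab's, and the p-value uses a standard normal approximation. That is an assumption, since Minitab works from the t distribution and the sample size is not given on these slides, so the result will be close to 0.229 but not equal to it.

```python
from statistics import NormalDist


def t_ratio(coef, se_coef):
    """Test statistic for Ho: coefficient = 0."""
    return coef / se_coef


def two_sided_p_normal(t):
    """Two-sided p-value from a standard normal approximation (an assumption)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


t_age = t_ratio(0.03575, 0.02922)   # Childs Age row
t_const = t_ratio(18.138, 3.753)    # Constant row

print(round(t_age, 2), round(t_const, 2))  # 1.22 4.83
print(two_sided_p_normal(t_age))           # approximate p-value for Childs Age
```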


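Ho: $\beta_1 = 0$ -> x has no effect on y ($\beta_1$ is the coefficient on Childs Age)

H1: $\beta_1 \neq 0$ -> x has an effect on y

P-value = 0.229 > 0.05 -> we fail to reject Ho: x has no STATISTICALLY significant effect on y given the model we used!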


Fourth Graders Feet Data Set – One Tailed Alternative

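Predictor variable (x): Age
Response variable (y): Foot Length

Ho: $\beta_1 = 0$ -> x has no effect on y

H1: $\beta_1 > 0$ -> x has a positive effect on y

P-value = (0.229)/2 = 0.1145 -> x has no statistically significant positive effect on y given the model we used. Halving the two-tailed p-value is valid here because the estimated slope (0.03575) is positive, which is the direction of H1.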
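The conversion from a two-tailed to a one-tailed p-value can be written as a small helper. This is a sketch under the assumption that the test statistic is symmetric about zero (as the t statistic is). The function name and the `alpha` value of 0.05 are illustrative choices, not part of the Minitab output.

```python
def one_sided_p(two_sided_p, estimate_in_h1_direction):
    """Convert a two-tailed p-value into a one-tailed p-value.

    If the estimate points in the direction of H1, the one-tailed
    p-value is half the two-tailed value; otherwise it is one minus half.
    """
    half = two_sided_p / 2.0
    return half if estimate_in_h1_direction else 1.0 - half


alpha = 0.05                                   # typical significance level
p = one_sided_p(0.229, estimate_in_h1_direction=True)
print(p)                                       # 0.1145
print("reject Ho" if p < alpha else "fail to reject Ho")
```

Had the estimated slope been negative while H1 claimed a positive effect, the same two-tailed value would give a one-tailed p-value of 1 - 0.1145, so halving is not automatic.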


Statistical vs. Practical Significance

401K data set
Predictor variables x1: mrate

x2: age

x3: totemp

Response variable (y): prate


The regression equation is:
prate = 80.3 + 5.44 mrate + 0.269 age - 0.000130 totemp

Predictor   Coef          SE Coef      T        P
Constant    80.2943       0.7777       103.25   0.000
mrate       5.4414        0.5244       10.38    0.000
age         0.26941       0.04515      5.97     0.000
totemp      -0.00012978   0.00003672   -3.53    0.000


All predictors are statistically significant.


If the total number of employees increases by ten thousand, then the predicted participation rate changes by -0.000130*10,000 = -1.3, that is, it decreases by 1.3 percentage points (other predictors held constant).
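The arithmetic behind this kind of statement is just coefficient times change in the predictor. Here is a short sketch using the unrounded totemp coefficient from the Minitab table; the helper name `predicted_change` is hypothetical.

```python
def predicted_change(coef, delta_x):
    """Predicted change in the response when one predictor changes by
    delta_x and all other predictors are held constant."""
    return coef * delta_x


# totemp coefficient from the output, employees up by ten thousand
change = predicted_change(-0.00012978, 10_000)
print(round(change, 2))  # -1.3
```

Judging practical significance means asking whether a change of this size in prate matters for a firm-size change this large, a question the p-value alone does not answer.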


Boeing 747 Jet

What does an empty Boeing 747 jet weigh?


My point estimate: 250,000 lbs

Answer: 358,000 lbs

I am wrong! A point estimate is almost always wrong!


My confidence interval estimate: (0, ∞)

Answer: 358,000 lbs

I am right! But, my interval is not useful!


Point and Interval Estimates – Minitab will compute both

401K data set
Predictor variables x1: age

Response variable (y): prate

In Minitab go to Regression -> General Regression and select the correct model variables then click on the Results box and make sure the “Display confidence intervals” box is selected.


Regression Equation

prate = 83.4231 + 0.298893 age

Coefficients

Term       Coef      SE Coef    T         P       95% CI
Constant   83.4231   0.737593   113.102   0.000   (81.9763, 84.8699)
age        0.2989    0.045938   6.506     0.000   ( 0.2088,  0.3890)


Confidence Intervals

General structure of all confidence intervals:

point estimate ± (critical value) × (standard error)

The standard error is an estimate of the standard deviation of the point estimator.


0.2989 + 1.960*0.045938 ≈ 0.3890
0.2989 – 1.960*0.045938 ≈ 0.2088
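Both calculations follow the same pattern: estimate plus or minus critical value times standard error. A minimal sketch with the rounded values from the output; the function name is ours.

```python
def confidence_interval(estimate, std_error, critical_value):
    """Return (lower, upper) = estimate -/+ critical_value * std_error."""
    margin = critical_value * std_error
    return estimate - margin, estimate + margin


# age coefficient and its standard error from the Minitab table
lower, upper = confidence_interval(0.2989, 0.045938, 1.960)
print(round(lower, 3), round(upper, 3))  # 0.209 0.389
```

Because the inputs are rounded, the fourth decimal place can differ slightly from the interval Minitab prints.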


Where does 1.960 come from?


t distribution with n – k – 1 degrees of freedom where k is the number of predictors in the model.

For our model, n = 1533 and k = 1
We also need to know the confidence level of the interval (typically 95%)

Then, use a t table!
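In place of a printed t table, the critical value can be approximated in Python. The sketch below uses only the standard library: it takes the normal quantile and adds the first-order large-sample correction z + (z³ + z)/(4·df). That approximation is our choice for illustration (an exact t quantile needs a statistics package), and it works well here only because n - k - 1 = 1531 is large.

```python
from statistics import NormalDist


def t_critical_approx(confidence, df):
    """Approximate two-sided t critical value for large df.

    Starts from the standard normal quantile z and applies the
    first-order correction z + (z**3 + z) / (4 * df).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z + (z**3 + z) / (4.0 * df)


n, k = 1533, 1
df = n - k - 1                                  # 1531 degrees of freedom
z_star = NormalDist().inv_cdf(0.975)
t_star = t_critical_approx(0.95, df)
print(round(z_star, 3))                         # 1.96

# Interval for the age coefficient, using the unrounded estimate 0.298893
margin = t_star * 0.045938
print(round(0.298893 - margin, 4), round(0.298893 + margin, 4))
```

With this many degrees of freedom the t critical value is only slightly larger than 1.960, which is why the slide's hand calculation agrees with the Minitab interval to about three decimal places.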


Testing Linear Combinations of Parameters

TWOYEAR data set
Predictor variables x1: jc – # years attending a two-year college

x2: univ – # years attending a four-year college

x3: exper – months in workforce

Response variable (y): log(wage)


Ho: $\beta_1 = \beta_2$ (the coefficients on jc and univ are equal)

“one year at two-year college is worth the same as one year at four-year college”


H1: $\beta_1 < \beta_2$ (the coefficient on jc is smaller than the coefficient on univ)

“one year at two-year college is worth less than one year at four-year college”


Ho: $\beta_1 = \beta_2$ -> Ho: $\beta_1 - \beta_2 = 0$

H1: $\beta_1 < \beta_2$ -> H1: $\beta_1 - \beta_2 < 0$


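Let $\theta_1 = \beta_1 - \beta_2$ -> $\beta_1 = \theta_1 + \beta_2$, where $\beta_1$, $\beta_2$, $\beta_3$ are the coefficients on jc, univ and exper

Then $\log(wage) = \beta_0 + (\theta_1 + \beta_2)\,jc + \beta_2\,univ + \beta_3\,exper + u = \beta_0 + \theta_1\,jc + \beta_2\,(jc + univ) + \beta_3\,exper + u$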


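Now, after we create the new variable jc + univ, we can conduct the following test on $\theta_1 = \beta_1 - \beta_2$, the coefficient on jc in the new regression:

Ho: $\beta_1 = \beta_2$ -> Ho: $\beta_1 - \beta_2 = 0$ -> Ho: $\theta_1 = 0$

H1: $\beta_1 < \beta_2$ -> H1: $\beta_1 - \beta_2 < 0$ -> H1: $\theta_1 < 0$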


lwage = 1.47233 - 0.0101795 jc + 0.0768763 jc+univ + 0.00494422 exper

Coefficients

Term       Coef       SE Coef     T         P       95% CI
Constant   1.47233    0.0210602   69.9102   0.000   ( 1.43104, 1.51361)
jc         -0.01018   0.0069359   -1.4677   0.142   (-0.02378, 0.00342)
jc+univ    0.07688    0.0023087   33.2981   0.000   ( 0.07235, 0.08140)
exper      0.00494    0.0001575   31.3972   0.000   ( 0.00464, 0.00525)
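In this output the jc coefficient estimates the difference between the jc and univ effects, because univ has been replaced by the sum jc+univ. The sketch below checks numerically that the reparameterized equation gives the same prediction as the original form, using the fitted coefficients above. The example person (2 years of junior college, 1 year of university, 60 months of experience) is hypothetical, as are the function names.

```python
# Fitted values from the reparameterized regression above
beta0 = 1.47233
theta1 = -0.0101795      # coefficient on jc (= beta1 - beta2)
beta2 = 0.0768763        # coefficient on jc+univ
beta3 = 0.00494422       # coefficient on exper

beta1 = theta1 + beta2   # implied jc coefficient in the original model


def lwage_original(jc, univ, exper):
    """Original form: separate terms for jc and univ."""
    return beta0 + beta1 * jc + beta2 * univ + beta3 * exper


def lwage_reparam(jc, univ, exper):
    """Reparameterized form: jc and the new variable jc + univ."""
    return beta0 + theta1 * jc + beta2 * (jc + univ) + beta3 * exper


a = lwage_original(2, 1, 60)
b = lwage_reparam(2, 1, 60)
print(abs(a - b) < 1e-12)           # True: both forms predict the same value

t_theta1 = theta1 / 0.0069359       # T statistic for Ho: theta1 = 0
print(t_theta1)
```

The point of the rewrite is that the standard error of the difference comes straight from the jc row, so no separate covariance calculation is needed.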


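This is a one-tailed test and the estimated jc coefficient (-0.01018) is negative, the direction of H1, so the p-value needs to be divided by 2:

0.142/2 = 0.071

Conclusion: since 0.071 > 0.05, we fail to reject the null hypothesis at the 5% level. The data do not provide enough evidence against “one year at a junior college is worth the same as one year at a university.”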


Testing Linear Combinations of Parameters

Use the TWOYEAR data set to test the following hypothesis:

Ho:

H1:


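The ANOVA F Test

For a multiple regression model:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + u$

The ANOVA F test is:

Ho: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$

H1: at least one $\beta_j$ is not equal to 0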
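The slides do not show how the F statistic is computed. One standard form, for a model with an intercept, uses R-squared: F = (R²/k) / ((1 - R²)/(n - k - 1)), with k and n - k - 1 degrees of freedom. The sketch below implements that formula. The inputs in the example are hypothetical and do not come from any data set in this chapter.

```python
def anova_f(r_squared, n, k):
    """Overall F statistic for Ho: all k slope coefficients are zero.

    Returns (F, numerator df, denominator df). Assumes the model
    includes an intercept and 0 <= r_squared < 1.
    """
    df_model = k
    df_error = n - k - 1
    f_stat = (r_squared / df_model) / ((1.0 - r_squared) / df_error)
    return f_stat, df_model, df_error


# Hypothetical example: R-squared of 0.5 with n = 13 observations, k = 2 predictors
f_stat, df1, df2 = anova_f(0.5, n=13, k=2)
print(round(f_stat, 2), df1, df2)  # 5.0 2 10
```

A large F, judged against the F distribution with those degrees of freedom, is evidence that at least one slope coefficient is not zero.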


Multiple Linear Regression Assumptions

MLR Assumption 1: the model is linear in the parameters


MLR Assumption 2: Data comes from a random sample


MLR Assumption 3: None of the independent or predictor variables are perfectly correlated (if they were, Minitab would not run a regression analysis).


MLR Assumption 4: The error, u, has an expected value of zero.


MLR Assumption 5: The error, u, has the same variance given any values of the explanatory variables. This is the assumption of homoskedasticity.


MLR Assumption 6: The error, u, is independent of the explanatory or predictor variables and is normally distributed with mean zero and variance $\sigma^2$.
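To see what data satisfying these assumptions look like, the sketch below simulates a one-predictor model in which the error is drawn independently of x from a normal distribution with mean zero and constant variance, then fits the line by least squares. All parameter values, the sample size and the seed are hypothetical choices for illustration.

```python
import random

random.seed(0)

# Hypothetical true parameters
beta0, beta1, sigma = 2.0, 0.5, 1.0
n = 5000

x = [random.uniform(0, 10) for _ in range(n)]     # random sample of x values
u = [random.gauss(0, sigma) for _ in range(n)]    # normal errors, mean 0, same variance, independent of x
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]   # linear in the parameters

# Least-squares estimates for a single predictor
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

print(b0, b1)   # estimates should land close to 2.0 and 0.5
```

When the assumptions hold, as they do here by construction, the estimates cluster around the true values, and the t and F procedures in this chapter have their stated distributions.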