TESTS OF HYPOTHESIS IN LINEAR REGRESSION MODELS Hendrik Wolff – [email protected]

Post on 16-Feb-2016

Page 1: Tests  of Hypothesis in Linear Regression Models

TESTS OF HYPOTHESIS IN LINEAR REGRESSION MODELS

Hendrik Wolff – [email protected]

Page 2: Tests  of Hypothesis in Linear Regression Models

SIMPLE TESTS IN LINEAR REGRESSION MODELS

t-Test

F-Test

Autocorrelation Test

Heteroskedasticity Test

Chow Test for Structural Breaks

Page 3: Tests  of Hypothesis in Linear Regression Models

STUDENT T-TEST IN THE LINEAR REGRESSION MODEL

William Gosset, an employee of Guinness, developed the t-distribution and published it under the pseudonym ‘Student’ in Biometrika in 1908.

Is the result of an experiment
• “random” or
• “statistically significant”?

Page 5: Tests  of Hypothesis in Linear Regression Models

WHAT IS “STATISTICALLY SIGNIFICANT”?

Google: “statistically significant”

Context of Linear Regression Model

How do we know that an estimated coefficient β̂ is “statistically significant”?

Procedure: Specify a “Null Hypothesis” and perform a t-test!

Page 6: Tests  of Hypothesis in Linear Regression Models

STATA EXAMPLE
Are Black American Women Discriminated Against in the U.S. Labor Market?
STATA provides wage data for the year 1988 “for free”. Simply type “sysuse nlsw88”.
The 'National Longitudinal Surveys of Young Women and Mature Women' (NLSW) extract is a dataset of 2229 individuals and has rich information on
- hourly wage
- ethnicity
- grade (years of education)
- tenure (years of job experience)

Regression Equation:

Whereby the dummy African is defined as:
African = 1 if the person is Black (African American)
African = 0 if the person is White
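The regression equation itself was an image on the slide and did not survive in this transcript. A plausible reconstruction, assuming the coefficient numbering beta1 to beta4 that the later slides and the homework use (the do file writes the same model with alpha and beta1 to beta3), is:

```latex
\text{wage}_i = \beta_1 + \beta_2\,\text{grade}_i + \beta_3\,\text{tenure}_i + \beta_4\,\text{African}_i + \varepsilon_i
```

Under this assumed numbering, the discrimination question is a question about the coefficient on the dummy, beta4.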

Page 8: Tests  of Hypothesis in Linear Regression Models

* Do file for the lecture on testing for 'racial discrimination' using a t-test.
* The purpose is to estimate the regression equation
*   hourly wage = alpha + beta1*"years of education" + beta2*"years of job experience" + beta3*black + epsilon
* and then test for racial discrimination.
* Background: The 'National Longitudinal Surveys of Young Women and Mature Women' (NLSW)
* comprises two separate surveys. The Young Women's survey includes women who were ages 14–24
* when first interviewed in 1968. The Mature Women's survey includes women who were ages 30–44
* when first interviewed in 1967. These surveys were discontinued in 2003.
* Here we use the 1988 extract of the NLSW that ships with STATA.
sysuse nlsw88
* browse the dataset
browse
* Prepare the 'black' variable
gen black = 0
replace black = 1 if race == 2
* regression
reg wage grade tenure black

* Exercises:
* What is the regression equation?
* If I go one more year to school, by how much will my hourly wage increase?
* A 'white' woman of age 40 with 15 years of schooling and three years of job tenure: What hourly wage is she likely to earn?
* If this woman were black, what would be her hourly wage?
* Which of the parameters are statistically significant?
* How is the t-test for beta3 computed? (Show your calculations.)
* For the two-sided t-test on beta3: what is the H_0 and what is the H_A?
* In the above regression, does STATA report the one-sided or the two-sided t-test?
* Provide a numerical example of a one-sided t-test for beta3. What is the corresponding p-value? How would you define H_0 vs. H_A?

Page 9: Tests  of Hypothesis in Linear Regression Models

T-TEST

Page 10: Tests  of Hypothesis in Linear Regression Models

T-TEST INTUITION

[Figure: wage plotted against grade (years of schooling)]

Page 12: Tests  of Hypothesis in Linear Regression Models

T-TEST INTUITION

β̂4 = -0.74 with std. err. of 0.26: Is this statistically different from zero?

[Figure: wage plotted against grade (years of schooling)]

Page 13: Tests  of Hypothesis in Linear Regression Models

INTUITION
Under the “Null Hypothesis” of NO discrimination, β̂4 is distributed as N(0, δβ4), where δβ4 denotes the variance of β̂4.

Our point estimate β̂4 = -0.74 is a random draw from N(0, δβ4).
Plot N(0, δβ4) and mark -0.74: Is this ‘significantly’ different from 0? See blackboard.
To avoid having a different test statistic for each parameter estimate, let’s normalize: standardize the distribution N(0, δβ4) to N(0, 1) by dividing β̂4 by SQRT(δβ4).
Crux: We don’t know δβ4 exactly; we only have an estimate of it. This introduces additional noise, which produces fatter tails than those of the normal distribution.

From Statistics: N(0,1) divided by SQRT(Chi-square(N-K)/(N-K)) = t(N-K), provided numerator and denominator are independent.
The estimate of δβ4 is a function of a Chi-square distribution with N-K degrees of freedom.
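The standardization argument on this slide can be written out as a worked equation. This is a sketch in standard textbook notation, not the author's own derivation: it assumes normally distributed errors, writes \sigma^2 for the variance the slide calls δβ4 and \hat\sigma^2 for its estimate, and assumes that Z and V are independent.

```latex
Z = \frac{\hat\beta_4}{\sqrt{\sigma^2_{\hat\beta_4}}} \sim N(0,1),
\qquad
V = (N-K)\,\frac{\hat\sigma^2_{\hat\beta_4}}{\sigma^2_{\hat\beta_4}} \sim \chi^2(N-K),
\qquad
t = \frac{\hat\beta_4}{\sqrt{\hat\sigma^2_{\hat\beta_4}}}
  = \frac{Z}{\sqrt{V/(N-K)}} \sim t(N-K).
```

The unknown variance cancels in the ratio, which is why the t-statistic can be computed from the estimate and its standard error alone.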

Page 14: Tests  of Hypothesis in Linear Regression Models

T-TEST INTUITION
Trick: To avoid having a different test statistic for each parameter estimate, let’s standardize N(0, δβ4) to N(0, 1) by dividing β̂4 by SQRT(δβ4).

In large samples, use the two-sided critical values of N(0,1):
5% critical value is 1.96
1% critical value is 2.58

In small samples:
Crux: We don’t know δβ4 exactly; we only have an estimate of it. This introduces additional noise, which produces fatter tails than those of the normal distribution.
From Statistics: N(0,1) divided by SQRT(Chi-square(N-K)/(N-K)) = t(N-K), provided numerator and denominator are independent.
The estimate of δβ4 is a function of a Chi-square distribution with N-K degrees of freedom.
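As a quick numerical check of the large-sample rule, the following sketch plugs in the point estimate (-0.74) and standard error (0.26) quoted on the earlier slide. It uses only Python's standard library and the normal approximation; the variable names are illustrative, and the exact t(N-K) p-value reported by STATA will differ slightly.

```python
from statistics import NormalDist

# Numbers quoted on the slides: point estimate and its standard error.
beta_hat = -0.74
std_err = 0.26

# t-ratio for the null hypothesis beta4 = 0.
t_stat = beta_hat / std_err

# Large-sample (standard normal) two-sided critical values.
std_normal = NormalDist()
crit_5pct = std_normal.inv_cdf(1 - 0.05 / 2)  # roughly 1.96
crit_1pct = std_normal.inv_cdf(1 - 0.01 / 2)  # roughly 2.58

# Two-sided p-value under the normal approximation.
p_two_sided = 2 * (1 - std_normal.cdf(abs(t_stat)))

print(f"t = {t_stat:.2f}")
print(f"reject at 5%: {abs(t_stat) > crit_5pct}")
print(f"reject at 1%: {abs(t_stat) > crit_1pct}")
print(f"two-sided p-value (normal approx.): {p_two_sided:.4f}")
```

The comparison is made on the absolute value of the t-ratio because the two-sided test rejects in either tail.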

Page 15: Tests  of Hypothesis in Linear Regression Models

WHAT IS SIGNIFICANT ENOUGH?

“Statisticians are people, whose aim in life is to be wrong 5% of the time”

(Kempthorne and Doerfler, 1969)

Page 16: Tests  of Hypothesis in Linear Regression Models

T-TEST

Page 18: Tests  of Hypothesis in Linear Regression Models

BLACK: N(0,1) = T(∞) VS. RED = T(N-K=3)
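The "fatter tails" in this comparison can be made concrete with a small sketch. It relies on the closed-form CDF of the t-distribution with 3 degrees of freedom (a standard result, not taken from the slides) and compares the probability mass beyond +/-1.96 under t(3) and under N(0,1). The function name `t3_cdf` is hypothetical.

```python
import math
from statistics import NormalDist

def t3_cdf(x):
    """Closed-form CDF of Student's t with 3 degrees of freedom."""
    u = x / math.sqrt(3)
    return 0.5 + (u / (1 + u * u) + math.atan(u)) / math.pi

cutoff = 1.96

# Two-sided tail mass beyond +/- cutoff under each distribution.
tail_normal = 2 * (1 - NormalDist().cdf(cutoff))
tail_t3 = 2 * (1 - t3_cdf(cutoff))

print(f"P(|Z| > {cutoff}) under N(0,1): {tail_normal:.3f}")
print(f"P(|T| > {cutoff}) under t(3):   {tail_t3:.3f}")
```

With so few degrees of freedom, using 1.96 as the cutoff would reject a true null far more often than 5% of the time, which is why small samples need the larger t critical values.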

Page 19: Tests  of Hypothesis in Linear Regression Models

TWO SIDED TEST VS. ONE SIDED TEST
Repeat: The two-sided test is today the ‘standard’ and is more conservative: H0: β = 0, HA: β ≠ 0
Critical value for the 5% significance level is +/-1.96 (asymptotically)

A one-sided test, however, absolutely makes sense too: H0: β = 0, HA: β > 0
Then the critical value is +1.64
Generally, we can also formalize any Null with β = b0, and test for this distance as t = [β̂ - b0]/s.e.(β̂)
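The difference between the two tests can be sketched numerically. The estimate and standard error below are hypothetical numbers chosen for illustration (they are not from the NLSW regression), the helper `t_ratio` is likewise an illustrative name, and the normal approximation stands in for the exact t(N-K) distribution.

```python
from statistics import NormalDist

def t_ratio(estimate, std_err, b0=0.0):
    """Distance of the estimate from the hypothesized value b0, in standard errors."""
    return (estimate - b0) / std_err

std_normal = NormalDist()
alpha = 0.05

two_sided_crit = std_normal.inv_cdf(1 - alpha / 2)  # roughly 1.96
one_sided_crit = std_normal.inv_cdf(1 - alpha)      # roughly 1.64

# Hypothetical estimate and standard error, for illustration only.
t = t_ratio(estimate=0.50, std_err=0.28)

p_two_sided = 2 * (1 - std_normal.cdf(abs(t)))  # HA: beta != 0
p_one_sided = 1 - std_normal.cdf(t)             # HA: beta > 0

print(f"t = {t:.2f}")
print(f"two-sided: reject = {abs(t) > two_sided_crit}, p = {p_two_sided:.3f}")
print(f"one-sided: reject = {t > one_sided_crit}, p = {p_one_sided:.3f}")
```

For a positive t-ratio the one-sided p-value is half the two-sided one, so an estimate can be significant at 5% in the one-sided test and not in the two-sided test. This is the sense in which the two-sided test is more conservative.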

Page 20: Tests  of Hypothesis in Linear Regression Models

SUMMARY OF TERMS: BY NOW YOU SHOULD BE FAMILIAR WITH THE FOLLOWING TERMS:
Null Hypothesis
Alternative Hypothesis
Critical values
t value
p value
Two-sided t-test
One-sided t-test
Statistically significant

Page 21: Tests  of Hypothesis in Linear Regression Models

HOMEWORK: USING NLSW88 AND RACIAL DISCRIMINATION EQUATION DISCUSSED IN CLASS

• Run the STATA do file. What is the regression equation?
• Define the Null Hypothesis and the Alternative Hypothesis for whether or not Black women are discriminated against.
• If I go one more year to school, by how much will my hourly wage increase?
• A 'white' woman of age 40 with 15 years of schooling and 3 years of job tenure: What hourly wage is she likely to earn?
• If this woman were not white but black, what would be her hourly wage?
• Which of the estimated parameters [beta1, beta2, beta3, beta4] are statistically significant?
• How is the t-test for beta4 computed? (Show your calculations!)
• For the two-sided t-test on beta4: what is the H_0 and what is the H_A?
• In the above regression, does STATA report the one-sided or the two-sided t-test?
• Provide a numerical example of a one-sided t-test for beta4. What is the corresponding p-value? How would you define H_0 vs. H_A for such a one-sided test?