Econ 495 - Econometric Review 1 Linear Regression Analysis


Page 1: Econ 495 - Econometric Review 1 Linear Regression Analysisfaculty.arts.ubc.ca/asiwan/documents/nunn-econ495... · 2013-09-17 · Econ 495 - Econometric Review 27 1.8 Interpretation


1 Linear Regression Analysis

1.1 The Mincer Wage Equation

• Our first exercise in empirical analysis will focus on the determinants of wages in a cross-section of individuals, that is, observations on individuals at a specific point in time.

• A complete wage equation model would include the following human capital variables

\[ \log(wages_i) = \beta_0 + \beta_1 educ_i + \beta_2 exper_i + \beta_3 exper_i^2 + \dots + u_i \tag{1} \]

where the term \(u_i\) contains factors such as ability, quality of education, family background and other factors influencing a person’s wage.


• For some specific purposes, we will also include gender and union status.

• We may think of the relationship between wages and their determinants, including institutions and industrial characteristics, as the wage structure.

• Let’s suppose to begin with that we are interested in the effect of education, measured in years of schooling, on wages, which is captured by β1 in

\[ wages_i = \beta_0 + \beta_1 educ_i + u_i \tag{2} \]


1.2 Data

• The Labour Force Survey selects individuals (close to) randomly and asks them about their wage (\(Y_i\)), education and other characteristics (\(X_i\)).

• These data \(\{(X_i, Y_i) : i = 1, \dots, n\}\) will constitute our random sample (A2) of size n from the population.

• A scatter plot of wages and education level indicates a positive relationship.


[Figure 1: Wages and Years of Schooling (scatter plot of rwage against schooling)]

• As do the average wages by education level


• But we may want to know by how much wages increase when schooling increases by one year.

1.3 Econometric Model

• The (population) regression function

\[ E(wages_i \mid educ_i) = \beta_0 + \beta_1 educ_i \tag{3} \]

describes wages conditional on a level of schooling as a linear (A1) function of the parameters, under the zero conditional mean (A3) assumption \(E(u_i \mid educ_i) = 0\).

• For any given value of schooling, the distribution of wages is centered about E(wages|schooling).


[Figure 2: E(wages|schooling) as a linear function of schooling (rwage against schooling)]

• Note that \(E(u_i \mid educ_i) = 0\) implies, by the law of iterated expectations, that \(E(u_i) = 0\) and that \(Cov(u_i, educ_i) = E(u_i \cdot educ_i) = 0\).

• This means that \(u_i\) has a zero mean and is uncorrelated with educ, which may be far-fetched in this case.
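The two implications of the zero conditional mean assumption (A3) can be spelled out in a few lines. The following is only a sketch of the standard argument, using the symbols already defined on this slide:

```latex
\begin{align*}
E(u_i) &= E\big[E(u_i \mid educ_i)\big] = E[0] = 0, \\
E(u_i \cdot educ_i) &= E\big[educ_i \cdot E(u_i \mid educ_i)\big] = E[educ_i \cdot 0] = 0, \\
Cov(u_i, educ_i) &= E(u_i \cdot educ_i) - E(u_i)\,E(educ_i) = 0 - 0 \cdot E(educ_i) = 0.
\end{align*}
```

The first two lines condition on schooling and then average over it; the third is the definition of the covariance.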


• Another typical assumption (A5) is that \(Var(u_i \mid educ) = \sigma^2\) is constant, a property called homoskedasticity.

• But it appears problematic here! We will see later how to test for it.

1.4 Estimation

• The objective is to obtain an estimate, called \(\hat\beta_1\), of the unknown parameter \(\beta_1\) from the data sample.


. regress rwage schooling

      Source |       SS       df       MS              Number of obs =    9720
-------------+------------------------------           F(  1,  9718) = 1729.94
       Model |  117804.085     1  117804.085           Prob > F      =  0.0000
    Residual |  661769.296  9718  68.0972727           R-squared     =  0.1511
-------------+------------------------------           Adj R-squared =  0.1510
       Total |   779573.38  9719  80.2112749           Root MSE      =  8.2521

------------------------------------------------------------------------------
       rwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   schooling |   1.541137   .0370532    41.59   0.000     1.468505    1.613769
       _cons |  -2.426309    .489984    -4.95   0.000    -3.386779   -1.465838
------------------------------------------------------------------------------

. predict prwage
(option xb assumed; fitted values)

. predict reswage, residuals


1.5 Diagnostics - Goodness of Fit

• The STATA output gives many measures of whether our regression model fits the data well

\[ \text{Model/Explained: } SSE \equiv \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 \]

\[ \text{Residual: } SSR \equiv \sum_{i=1}^{n} \hat{u}_i^2 \]

\[ \text{Total: } SST \equiv \sum_{i=1}^{n} (Y_i - \bar{Y})^2 \]

and the \(R^2\), which is the ratio of the explained variation to the total variation

\[ R^2 = SSE/SST = 1 - SSR/SST \]


• The \(R^2\) can also be shown to equal the squared correlation coefficient between the actual \(Y_i\) and the fitted values \(\hat{Y}_i\).

• The adjusted \(R^2\) takes into account the number of explanatory variables: \(R^2_a = 1 - (1 - R^2)(n - c)/(n - k)\), where k is the number of variables in the model and c = 1 if there is a constant.

• Here, an \(R^2\) of 0.15 means that 15% of the variation in wages across individuals is explained by their education level.

• This means that 85% of the variation in wages remains unexplained! We will want to add more variables!
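The goodness-of-fit numbers can be recomputed by hand from the sums of squares in the STATA output for `regress rwage schooling`. The sketch below is only a check on the formulas, not part of the original slides; it assumes that k in the adjusted R2 formula counts the constant as well as schooling (k = 2), which is the reading that appears to reproduce the reported value.

```python
# Sums of squares and sample size copied from the STATA output
# for "regress rwage schooling".
sse = 117804.085   # Model (explained) sum of squares
ssr = 661769.296   # Residual sum of squares
sst = 779573.38    # Total sum of squares
n = 9720           # number of observations
k = 2              # ASSUMPTION: k counts schooling and the constant
c = 1              # the model includes a constant

r2 = sse / sst                               # R2 = SSE/SST
r2_alt = 1 - ssr / sst                       # R2 = 1 - SSR/SST
r2_adj = 1 - (1 - r2) * (n - c) / (n - k)    # adjusted R2

print(f"R-squared          = {r2:.4f}")
print(f"1 - SSR/SST        = {r2_alt:.4f}")
print(f"Adjusted R-squared = {r2_adj:.4f}")
```

Both versions of the R2 should agree with each other, and the results can be compared with the R-squared and Adj R-squared lines of the output.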


1.6 Inference - Hypothesis Testing

• The success of a model also depends on whether the variables included in the model belong there, that is, are statistically significant.

• Under the assumption (A6) that the \(u_i\) are normally distributed with zero mean and variance \(\sigma^2\), that is \(u \sim Normal(0, \sigma^2)\), the estimates \(\hat\beta\) will also be normally distributed, and

\[ (\hat\beta - \beta)/se(\hat\beta) \sim t_{DF} \]

will follow the Student-t distribution, where DF = n − k − 1, the degrees of freedom in the model, is equal to the number of observations minus the number of variables minus 1 for the constant.


• We can use the t-statistic reported by STATA to test the null hypothesis \(H_0: \beta = 0\) against \(H_1: \beta \neq 0\).

• If the absolute value of the t-statistic is greater than the critical value corresponding to our degrees of freedom and the desired level of the test (5% or 1%), we can reject the null.

• The rule of thumb is: if |t| ≥ 2.0 then reject \(H_0: \beta = 0\) at the 5% significance level. For more robustness, sometimes we prefer even higher values.

• But we do not have to look up the critical values in a table since STATA gives us the p-value corresponding to our t-statistic


– If p ≤ 0.01 then the relationship is significant at the 1% level,

– If p ≤ 0.05 then the relationship is significant at the 5% level,

– If p ≤ 0.10 then the relationship is significant at the 10% level,

• Here, with a t-statistic of 41.59, we can say that schooling is a highly significant factor explaining the variation in wages.

• It is all very good to know that the coefficient of schooling is different from zero, but we would also like to know how precisely it is estimated.

• The confidence intervals tells us that, under the classical OLS assump-
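As a rough check on the inference results, the t-statistic and the 95% confidence interval for schooling can be recomputed from the reported coefficient and standard error. The sketch below assumes the large-sample critical value 1.96, which is not stated in the slides; STATA uses the exact t critical value for 9718 degrees of freedom, so the bounds should agree only to about four decimal places.

```python
coef = 1.541137    # estimated coefficient on schooling (STATA output)
se = 0.0370532     # its standard error (STATA output)

# t-statistic for H0: beta = 0
t_stat = coef / se

# ASSUMPTION: large-sample two-sided 5% critical value of the t distribution
crit = 1.96

# 95% confidence interval: estimate plus or minus crit standard errors
ci_low = coef - crit * se
ci_high = coef + crit * se

reject_at_5pct = abs(t_stat) > crit

print(f"t = {t_stat:.2f}, reject H0 at 5%: {reject_at_5pct}")
print(f"approximate 95% CI: [{ci_low:.4f}, {ci_high:.4f}]")
```

A narrow interval relative to the size of the coefficient is what "precisely estimated" means here.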


1.7 Reporting the results

• The results from a STATA output are reported in a table that typically

contains

– estimated coefficients

– standard errors of the coefficients

– number of observations

– \(R^2\) or \(R^2_a\)

• In some instances, it may be worthwhile to report other statistics. We will discuss these issues when we cover the readings.


1.8 Interpretation of the Estimates

• In general, the β parameters measure the marginal effect of increasing X by one unit on the predicted wages \(\hat{Y}\).

• In our example,

\[ \Delta \widehat{wage} = \hat\beta_1 \Delta educ \]

tells us that the wage value of one additional year of schooling in this sample is $1.54.

• But in this simple regression, we cannot claim to have found a causal relationship, so we should be cautious in our interpretation.


• The value of −2.42 for \(\hat\beta_0\) says that a person with zero years of schooling has a negative predicted wage, which is silly. This occurs because no one in our sample has less than 8 years of schooling. For a person with eight years of schooling, the predicted wage is

\[ \widehat{wage} = -2.42 + 1.54 \times 8 = 9.90 \]

which is above the minimum wage.

• If this person completes high school (4 more years), our model predicts that the wage would be higher by 4 × $1.54 = $6.16 per hour, for a predicted wage of about $16.06! This is more than the average wage of $15.80 for high school graduates in Table 1, which may make us question the linearity in our functional form assumption.
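The predictions discussed above are simple arithmetic on the fitted line. A minimal sketch, using the rounded coefficients quoted in the text (the function name `predicted_wage` is only illustrative):

```python
b0 = -2.42   # rounded intercept, as used in the text
b1 = 1.54    # rounded coefficient on schooling

def predicted_wage(years_of_schooling):
    """Fitted hourly wage from the linear model wage = b0 + b1 * schooling."""
    return b0 + b1 * years_of_schooling

wage_8 = predicted_wage(8)     # eight years of schooling
wage_12 = predicted_wage(12)   # high school completed (4 more years)
gain = wage_12 - wage_8        # equals 4 * b1 in a linear model

print(f"predicted wage,  8 years: {wage_8:.2f}")
print(f"predicted wage, 12 years: {wage_12:.2f}")
print(f"gain from 4 more years:   {gain:.2f}")
```

Because the model is linear, the gain from four more years is the same at every starting level of schooling, which is exactly the restriction the comparison with Table 1 calls into question.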


• Indeed, it is more common to estimate the following log-linear model

\[ \log(wages_i) = \beta_0 + \beta_1 educ_i + u_i \tag{8} \]

where log(·) denotes the natural logarithm. Since wages tend to be lognormal, this reduces the problem of heteroscedasticity.

• This is equivalent to writing \(wages_i = \exp(\beta_0 + \beta_1 educ_i + u_i)\), which is consistent with the increasing returns to education that we found in Table 1.

• In this case the interpretation of β1 is

\[ \%\Delta wage \approx (100 \cdot \beta_1)\Delta educ \]

that is, multiplying β1 by 100 gives us the percentage change in predicted wage given an additional year of schooling.


• We run the log wage regression by first taking the log of the dependent

variable

. regress lrwage schooling

      Source |       SS       df       MS              Number of obs =    9720
-------------+------------------------------           F(  1,  9718) = 1585.31
       Model |  331.915467     1  331.915467           Prob > F      =  0.0000
    Residual |   2034.6566  9718  .209369891           R-squared     =  0.1403
-------------+------------------------------           Adj R-squared =  0.1402
       Total |  2366.57207  9719  .243499544           Root MSE      =  .45757

------------------------------------------------------------------------------
      lrwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   schooling |    .081804   .0020546    39.82   0.000     .0777767    .0858314
       _cons |   1.684045    .027169    61.98   0.000     1.630788    1.737302
------------------------------------------------------------------------------

• The coefficient on schooling has a percentage interpretation when it


is multiplied by 100. That is, predicted wages increase by 8.2 percent

for every additional year of education.

• In the human capital interpretation of the wage equation, this means

that the rate of return of one year of schooling is 8.2%, not bad!

• This easy interpretation of the rate of return of schooling is one of the

reasons why the log wage specification is the preferred one.

• The intercept of 1.684 is again not very meaningful, since it gives the

predicted log(wages) when schooling = 0
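The "multiply by 100" reading of the schooling coefficient is an approximation that works well for small coefficients. As a sketch of how close it is here, the snippet below compares it with the exact percentage change implied by the log model, exp(β1) − 1; the comparison itself is an addition for illustration and is not part of the slides.

```python
import math

b1 = 0.081804   # coefficient on schooling from the log wage regression

# Approximate percentage change in predicted wage per extra year of schooling
approx_pct = 100 * b1

# Exact percentage change implied by log(wage) = b0 + b1 * educ
exact_pct = 100 * (math.exp(b1) - 1)

print(f"approximate return to schooling: {approx_pct:.2f}%")
print(f"exact return to schooling:       {exact_pct:.2f}%")
```

The gap between the two numbers grows with the size of the coefficient, so the approximation deserves more care for large coefficients.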


1.9 Multivariate Regression Analysis

• We have already improved our wage equation model by using log(wages),

now we would like to add more variables, in particular labour market

experience

• We can also use a more flexible functional form by adding higher order

terms (polynomial) in the explanatory variables

• For example, here a quadratic in experience can capture diminishing

returns to on-the-job training

\[ \log(wages_i) = \beta_0 + \beta_1 educ_i + \beta_2 exper_i + \beta_3 exper_i^2 + \dots + u_i \tag{9} \]


. regress lwage educ exper exp2 [weight=weight]
(analytic weights assumed)
(sum of wgt is 2.6311e+07)

      Source |       SS       df       MS              Number of obs =   11893
-------------+------------------------------           F(  3, 11889) = 1943.58
       Model |  1231.32118     3  410.440394           Prob > F      =  0.0000
    Residual |  2510.68698 11889  .211177305           R-squared     =  0.3291
-------------+------------------------------           Adj R-squared =  0.3289
       Total |  3742.00816 11892  .314666008           Root MSE      =  .45954

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1084982   .0017458    62.15   0.000     .1050761    .1119202
       exper |   .0383817   .0012279    31.26   0.000     .0359747    .0407887
        exp2 |   -.000639   .0000296   -21.57   0.000     -.000697   -.0005809
       _cons |   .5751865   .0250984    22.92   0.000     .5259896    .6243835
------------------------------------------------------------------------------


. test exper= exp2=0

 ( 1)  exper - exp2 = 0
 ( 2)  exper = 0

       F(  2, 11889) =  873.19
            Prob > F =  0.0000

1.10 Diagnostics - Goodness of Fit

• As before, we can use the t-statistic to determine whether each variable

is statistically significant individually


• But we can also use the F-statistic to test the significance of the whole

model, that is the hypothesis that the variables are jointly significant

H0 : β1 = β2 = β3 = 0 vs. H1 : H0 is not true

• Here, we would overwhelmingly reject H0.

• The F-statistic can also be used to test a restricted model against an

unrestricted model.

• The model \(\log(wages_i) = \beta_0 + \beta_1 educ_i\) can be seen as a restricted version of the model with experience, where \(H_0: \beta_2 = \beta_3 = 0\).


1.11 Interpretation of the Estimates

• The general model

\[ Y_i = \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_k X_k \tag{10} \]

can be written in terms of changes

\[ \Delta Y = \beta_1 \Delta X_1 + \beta_2 \Delta X_2 + \beta_3 \Delta X_3 + \dots + \beta_k \Delta X_k \tag{11} \]

• The coefficient on the variable \(X_k\) measures the change in \(Y_i\) due to a one-unit increase in \(X_k\), holding all the other explanatory variables fixed (the so-called ceteris paribus assumption): \(\Delta Y = \beta_k \Delta X_k\).

• These effects are sometimes called marginal or partial effects.
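When a variable enters with a squared term, as experience does in equation (9), its partial effect is no longer a single coefficient but depends on the level of the variable: the derivative of β2·exper + β3·exper² with respect to exper is β2 + 2·β3·exper. The sketch below applies this to the estimates from the regression of lwage on educ, exper and exp2; the function name and the evaluation points are illustrative choices, not part of the slides.

```python
b_exper = 0.0383817   # coefficient on exper (STATA output)
b_exp2 = -0.000639    # coefficient on exp2, experience squared (STATA output)

def marginal_effect(exper):
    """Approximate change in log(wage) from one more year of experience,
    holding education fixed: b_exper + 2 * b_exp2 * exper."""
    return b_exper + 2 * b_exp2 * exper

# Experience level at which the estimated return to experience reaches zero
turning_point = -b_exper / (2 * b_exp2)

for years in (0, 10, 20, 40):
    print(f"exper = {years:2d}: marginal effect = {marginal_effect(years):.4f}")
print(f"turning point at about {turning_point:.1f} years of experience")
```

The negative coefficient on exp2 is what produces the diminishing returns mentioned earlier: the marginal effect falls steadily as experience accumulates.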


1.12 Choosing the Functional Form

• We have already tried a few functional forms

\[ wages_i = \beta_0 + \beta_1 educ_i + u_i \]

\[ \log(wages_i) = \beta_0 + \beta_1 educ_i + u_i \]

\[ \log(wages_i) = \beta_0 + \beta_1 educ_i + \beta_2 exper_i + \beta_3 exper_i^2 + u_i \]

• Perhaps we could soften the curvature of the relationship between exper and log(wages) with a quartic

\[ \log(wages_i) = \beta_0 + \beta_1 educ_i + \beta_2 exper_i + \beta_3 exper_i^2 + \beta_4 exper_i^3 + \beta_5 exper_i^4 + u_i \]

. regress lwage educ exper exp2 exp3 exp4 [weight=weight]
(analytic weights assumed)
(sum of wgt is 2.6311e+07)

      Source |       SS       df       MS              Number of obs =   11893
-------------+------------------------------           F(  5, 11887) = 1187.74
       Model |  1246.66628     5  249.333256           Prob > F      =  0.0000
    Residual |  2495.34188 11887  .209921921           R-squared     =  0.3332
-------------+------------------------------           Adj R-squared =  0.3329
       Total |  3742.00816 11892  .314666008           Root MSE      =  .45817

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1086498   .0017533    61.97   0.000     .1052131    .1120864
       exper |   .0637706     .00501    12.73   0.000     .0539502    .0735909
        exp2 |  -.0023563   .0004593    -5.13   0.000    -.0032567    -.001456
        exp3 |   .0000358   .0000154     2.32   0.020     5.60e-06     .000066
        exp4 |  -1.80e-07   1.70e-07    -1.06   0.289    -5.13e-07    1.53e-07
       _cons |   .4973948   .0269574    18.45   0.000     .4445538    .5502357
------------------------------------------------------------------------------

• Now the model with the quadratic in experience is the restricted model, and we get F = [(0.3332 − 0.3291)/(1 − 0.3332)](11887/2) = 36.55, which is greater than the 5% critical value \(F_{2,11887} = 3.00\), so we reject H0.
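The F-statistic for the quadratic (restricted) against the quartic (unrestricted) model can be recomputed in two ways: from the two R2 values, as in the text, or from the residual sums of squares in the two STATA outputs. The second form is the standard equivalent formula and is added here as an assumption-free cross-check only in the sense that both regressions share the same dependent variable and sample; the variable names are illustrative.

```python
q = 2            # number of restrictions (coefficients on exp3 and exp4)
df_ur = 11887    # residual degrees of freedom of the unrestricted model

# R-squared form, using the rounded R2 values quoted in the text
r2_r = 0.3291    # restricted model (quadratic in experience)
r2_ur = 0.3332   # unrestricted model (quartic in experience)
f_from_r2 = ((r2_ur - r2_r) / (1 - r2_ur)) * (df_ur / q)

# Sum-of-squared-residuals form, using the Residual SS lines of the outputs
ssr_r = 2510.68698
ssr_ur = 2495.34188
f_from_ssr = ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)

crit_5pct = 3.00   # 5% critical value quoted in the text

print(f"F from R-squared: {f_from_r2:.2f}")
print(f"F from SSR:       {f_from_ssr:.2f}")
print(f"reject H0 at 5%:  {f_from_ssr > crit_5pct}")
```

The two versions differ only through the rounding of the R2 values.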


1.13 Potential Problems

1.13.1 Multicollinearity

• We could be tempted to use

\[ \log(wages_i) = \beta_0 + \beta_1 educ_i + \beta_2 \log(exper_i) + \beta_3 \log(exper_i^2) + u_i \]

• But this would not work because \(\log(exper_i^2) = 2 \log(exper_i)\), so that \(\log(exper_i^2)\) and \(\log(exper_i)\) would be perfectly correlated: we would have a problem of multicollinearity.

• In this case, STATA would drop \(\log(exper_i)\), so you would know that something is wrong.
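The perfect collinearity is easy to see numerically. The sketch below uses a handful of made-up experience values (purely hypothetical, not from the LFS data) and a small hand-written correlation function to show that the two log terms are exact multiples of each other.

```python
import math

# HYPOTHETICAL years of experience, chosen only for illustration
exper = [1, 2, 5, 10, 20, 35]

log_exper = [math.log(x) for x in exper]
log_exper2 = [math.log(x ** 2) for x in exper]

def correlation(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

corr = correlation(log_exper, log_exper2)
ratios = [b / a for a, b in zip(log_exper, log_exper2) if a != 0]

print(f"correlation: {corr:.6f}")
print(f"ratios log(exper^2)/log(exper): {ratios}")
```

With one regressor an exact multiple of another, the separate coefficients β2 and β3 cannot be identified, which is why one of the two variables has to go.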


2 Dummy Variables

2.1 The case of one dummy variable

• Oftentimes, variables that take on continuous values are not available; instead, dummy variables have to be used.

• For example, Statistics Canada often classifies continuous variables,

such as age, into categories to preserve confidentiality; education is

often available in the form of the highest degree or diploma attained.

• A dummy variable is a variable that takes on the value 1 or 0, thus

dummy variables are also called binary variables.


• Examples: male (= 1 if the worker is male, 0 otherwise), part-time work status (= 1 if the worker is part-time, 0 otherwise), etc.

• Consider a simple model with one continuous variable X and one dummy D

\[ Y_i = \beta_0 + \beta_1 X_1 + \delta_0 D + u_i \]

This can be interpreted as an intercept shift

\[ \text{If } D = 0, \text{ then } Y_i = \beta_0 + \beta_1 X_1 + u_i \]

\[ \text{If } D = 1, \text{ then } Y_i = (\beta_0 + \delta_0) + \beta_1 X_1 + u_i \]

where the case of D = 0 is the base or reference group.

• Returning to our Canadian wage regression, let’s include a male dummy

\[ \log(wages_i) = \beta_0 + \beta_1 educ_i + \delta_0 male_i + u_i \]


. gen male=(sex==1)

. regress lrwage male schooling [weight=fweight]
(analytic weights assumed)
(sum of wgt is 2.2780e+06)

      Source |       SS       df       MS              Number of obs =    9720
-------------+------------------------------           F(  2,  9717) = 1156.78
       Model |  452.291246     2  226.145623           Prob > F      =  0.0000
    Residual |  1899.62851  9717   .19549537           R-squared     =  0.1923
-------------+------------------------------           Adj R-squared =  0.1921
       Total |  2351.91976  9719  .241991949           Root MSE      =  .44215

------------------------------------------------------------------------------
      lrwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .2138704   .0089866    23.80   0.000     .1962548    .2314861
   schooling |   .0841955   .0019724    42.69   0.000     .0803292    .0880618
       _cons |   1.568616   .0268082    58.51   0.000     1.516066    1.621165
------------------------------------------------------------------------------


• Notice that we have included only 1 dummy when there are two groups.

Since female + male = 1, putting both in would have resulted in

perfect collinearity and one variable would have been dropped!

• This is perhaps the simplest example of the dummy variable trap.

• Since we have chosen females as our base group, β0 represents the intercept for females and δ0 represents the male advantage.

• In terms of expectations, under the zero conditional mean assumption E(u|male, schooling) = 0,

\[ \delta_0 = E(lrwage \mid male = 1, schooling) - E(lrwage \mid male = 0, schooling) \]


• \(\hat\delta_0 = 0.214\) is the difference in log hourly wage between males and females, given the same amount of education (and the same error term u).

• It says that for the same level of education, men earn about 21% more than women. The exact calculation gives a somewhat different figure: \((wage_M - wage_F)/wage_F = \exp(0.2138704) - 1 \approx 0.238\), or equivalently \((wage_F - wage_M)/wage_M = \exp(-0.2138704) - 1 \approx -0.193\).

• Of course, other factors would have to be taken into account to determine whether this is a discrimination effect.

• Instead, we could have constructed a female dummy,

\[ \log(wages_i) = \alpha_0 + \alpha_1 educ_i + \gamma_0 female_i + u_i \]


. gen female=(sex==2)

. regress lrwage female schooling [weight=fweight]
(analytic weights assumed)
(sum of wgt is 2.2780e+06)

      Source |       SS       df       MS              Number of obs =    9720
-------------+------------------------------           F(  2,  9717) = 1156.78
       Model |  452.291246     2  226.145623           Prob > F      =  0.0000
    Residual |  1899.62851  9717   .19549537           R-squared     =  0.1923
-------------+------------------------------           Adj R-squared =  0.1921
       Total |  2351.91976  9719  .241991949           Root MSE      =  .44215

------------------------------------------------------------------------------
      lrwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |  -.2138704   .0089866   -23.80   0.000    -.2314861   -.1962548
   schooling |   .0841955   .0019724    42.69   0.000     .0803292    .0880618
       _cons |   1.782486    .026398    67.52   0.000      1.73074    1.834232
------------------------------------------------------------------------------


• Notice that −γ0 = δ0: the coefficients of the dummy variables are of the same magnitude but of opposite sign.

• Also α0 = β0 + δ0 and β0 = α0 + γ0.

• It does not matter which group is chosen to be the base group, but keeping track of which group is the base group is important for the interpretation.
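The relationships between the two parameterizations can be checked against the two STATA outputs, together with the exact percentage gaps implied by the log specification. This is only a numerical sketch; the variable names are illustrative, and the exact gap uses the standard exp(coefficient) − 1 transformation.

```python
import math

# Estimates from the regression with the male dummy (females are the base group)
b0 = 1.568616     # _cons
d0 = 0.2138704    # coefficient on male

# Estimates from the regression with the female dummy (males are the base group)
a0 = 1.782486     # _cons
g0 = -0.2138704   # coefficient on female

# Identities between the two parameterizations
check_alpha0 = b0 + d0   # should reproduce a0 up to rounding
check_beta0 = a0 + g0    # should reproduce b0 up to rounding

# Exact percentage wage gaps implied by the log specification
pct_men_vs_women = 100 * (math.exp(d0) - 1)   # relative to women's wages
pct_women_vs_men = 100 * (math.exp(g0) - 1)   # relative to men's wages

print(f"beta0 + delta0 = {check_alpha0:.6f} (alpha0 = {a0})")
print(f"alpha0 + gamma0 = {check_beta0:.6f} (beta0 = {b0})")
print(f"men vs women: {pct_men_vs_women:.1f}%")
print(f"women vs men: {pct_women_vs_men:.1f}%")
```

The two percentage gaps differ in size because they are measured relative to different base wages, which is one more reason to keep track of the base group.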

2.2 Multiple categories

• We can use dummy variables to control for something with multiple categories


• In our LFS data, education was initially available in terms of 7 categories, as in Table 1.

• In this case, we can construct dummy variables for each category, but we have to omit one category.

• We could let STATA choose the category by including edd* in the list of explanatory variables, but often it is best to choose ourselves.

• An intermediate category that has a sufficiently large number of observations is a good choice; in the context of wage regressions, high school graduates are often the base group.

• Because the base group is absorbed in the intercept, if there are n categories there should be n − 1 dummy variables.
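The rule of n − 1 dummies can be sketched in a few lines. The example below builds one 0/1 column per education category except the chosen base group; the helper `make_dummies`, the sample values and the choice of code 2 as the base are all hypothetical illustrations, with the column names chosen to resemble the edd* variables mentioned above.

```python
def make_dummies(values, categories, base):
    """Return one 0/1 column per category, omitting the base category."""
    return {
        f"edd{cat + 1}": [1 if v == cat else 0 for v in values]
        for cat in categories
        if cat != base
    }

# HYPOTHETICAL education codes for eight workers, with categories coded 0 to 6
educ90 = [0, 2, 2, 5, 6, 3, 1, 4]
categories = range(7)   # 7 categories

# Omit code 2 as the base group, so 7 - 1 = 6 dummies remain
dummies = make_dummies(educ90, categories, base=2)

for name, column in dummies.items():
    print(name, column)
```

Workers in the base group have zeros in every column, so their average outcome is picked up by the intercept and the other coefficients are read relative to them.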


. tab educ90, gen(edd)

    highest |
educational |
 attainment |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        547        4.61        4.61   /* 0 to 8 years */
          1 |      1,878       15.82       20.43   /* Some secondary */
          2 |      2,459       20.72       41.15   /* Grade 11 to 13 */
          3 |      1,088        9.17       50.31   /* Some post secondary */
          4 |      4,002       33.72       84.03   /* Post secondary diploma */
          5 |      1,321       11.13       95.16   /* Bachelors */
          6 |        575        4.84      100.00   /* Graduate degree */
------------+-----------------------------------
      Total |     11,870      100.00

. regress lrwage edd1 edd2 edd4 edd5 edd6 edd7 [weight=fweight] /*edd3 omitted */
(analytic weights assumed)
(sum of wgt is 2.2780e+06)


      Source |       SS       df       MS              Number of obs =    9720
-------------+------------------------------           F(  6,  9713) =  316.00
       Model |  384.123205     6  64.0205341           Prob > F      =  0.0000
    Residual |  1967.79655  9713  .202594106           R-squared     =  0.1633
-------------+------------------------------           Adj R-squared =  0.1628
       Total |  2351.91976  9719  .241991949           Root MSE      =   .4501

------------------------------------------------------------------------------
      lrwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        edd1 |  -.1295773   .0241357    -5.37   0.000    -.1768883   -.0822663
        edd2 |   -.173165   .0157088   -11.02   0.000    -.2039575   -.1423724
        edd4 |  -.0482456   .0173476    -2.78   0.005    -.0822504   -.0142407
        edd5 |   .1495457   .0126066    11.86   0.000     .1248341    .1742573
        edd6 |   .3869386   .0163941    23.60   0.000     .3548028    .4190744
        edd7 |   .5895479   .0226934    25.98   0.000     .5450641    .6340316
       _cons |   2.692276   .0097833   275.19   0.000     2.673099    2.711454
------------------------------------------------------------------------------
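
Because the dependent variable is in logs, each coefficient in the output above is a difference in log points relative to the omitted group (edd3). The exact percentage difference is 100·(exp(b) − 1). The short Python sketch below applies that conversion to three coefficients copied from the table; the helper name `log_points_to_percent` is hypothetical.

```python
import math

def log_points_to_percent(coef):
    """Exact percentage difference implied by a coefficient in a log-wage regression."""
    return 100.0 * (math.exp(coef) - 1.0)

# Coefficients copied from the regression output above (base group: edd3).
coefs = {"edd4": -0.0482456, "edd6": 0.3869386, "edd7": 0.5895479}

for name, b in coefs.items():
    approx = 100.0 * b                    # log-point approximation
    exact = log_points_to_percent(b)      # exact conversion
    print(f"{name}: {approx:6.1f} log points -> {exact:6.1f}% exact")
```

For small coefficients the two numbers nearly coincide. For edd6 and edd7 the exact premiums come out at roughly 47% and 80%, noticeably above the log-point readings.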


• The wage premiums are interpreted relative to high school educated workers (the omitted group, edd3)

• Since the dependent variable is log(wages), the coefficients of edd6 and edd7 mean that workers with a bachelor’s degree make about 39% more than workers with a high school degree, and that the premium is about 59% for workers with a graduate degree. These are log-point approximations; the exact percentage difference is 100·(exp(β) − 1), which is larger for coefficients of this size

• If there are a lot of categories, it may make sense to group some together

. gen lesshs=0
. replace lesshs=1 if educ90<=1
(2425 real changes made)
. gen hs=0


. replace hs=1 if educ90==2
(2459 real changes made)
. gen somecol=0
. replace somecol=1 if educ90==3 | educ90==4
(5090 real changes made)
. gen univ=0
. replace univ=1 if educ90==5 | educ90==6
(1896 real changes made)

. sum lrwage lesshs hs somecol univ [weight=fweight]
(analytic weights assumed)

    Variable |     Obs      Weight        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------------------
      lrwage |    9720     2278028    2.783154   .4919268   1.261226   4.371922
      lesshs |    9720     2278028    .1807976   .3848702          0          1
          hs |    9720     2278028    .2177673   .4127496          0          1
     somecol |    9720     2278028    .4312809   .4952807          0          1
        univ |    9720     2278028    .1701542   .3757875          0          1

. regress lrwage lesshs somecol univ [weight=fweight]


(analytic weights assumed)
(sum of wgt is 2.2780e+06)

      Source |       SS       df       MS              Number of obs =    9720
-------------+------------------------------           F(  3,  9716) =  547.22
       Model |  339.954055     3  113.318018           Prob > F      =  0.0000
    Residual |   2011.9657  9716  .207077573           R-squared     =  0.1445
-------------+------------------------------           Adj R-squared =  0.1443
       Total |  2351.91976  9719  .241991949           Root MSE      =  .45506

------------------------------------------------------------------------------
      lrwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      lesshs |   -.162843   .0146856   -11.09   0.000    -.1916297   -.1340562
     somecol |   .1029682   .0121338     8.49   0.000     .0791835    .1267529
        univ |   .4461324   .0149344    29.87   0.000     .4168579     .475407
       _cons |   2.692276   .0098909   272.20   0.000     2.672888    2.711665
------------------------------------------------------------------------------
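
A quick arithmetic check on the grouped regression, sketched in Python with numbers copied from the output above. Since the only regressors are group dummies, the constant should be the weighted mean log wage of the omitted hs group and each coefficient a difference in group means. The constant plus the share-weighted coefficients should therefore reproduce the overall weighted mean of lrwage reported by sum. The variable names are illustrative.

```python
# Weighted sample shares (from `sum`) and coefficients (from `regress`).
shares = {"lesshs": 0.1807976, "somecol": 0.4312809, "univ": 0.1701542}
coefs = {"lesshs": -0.162843, "somecol": 0.1029682, "univ": 0.4461324}
cons = 2.692276          # constant: mean log wage of the omitted group (hs)
mean_lrwage = 2.783154   # weighted mean of lrwage reported by `sum`

# Mean log wage of each group implied by the regression.
group_means = {group: cons + b for group, b in coefs.items()}
group_means["hs"] = cons

# Overall mean implied by the regression: constant + sum of share * coefficient.
implied_mean = cons + sum(shares[g] * coefs[g] for g in coefs)

for group, m in sorted(group_means.items()):
    print(f"{group:8s} {m:.4f}")
print(f"implied overall mean {implied_mean:.4f} vs reported {mean_lrwage:.4f}")
```

The two overall means agree up to the rounding in the printed output, which is a useful sanity check that the dummies were built as intended.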

• In the case of educational attainment, the dummy variables reflect the choices of individuals. The question of causality is a central issue. Are


Notes from “The Long-Term Effects of Africa’s Slave Trades” (Nunn)

• Africa’s economic performance in the second half of the 20th century has been very poor

• One explanation for Africa’s underdevelopment is its history of extraction, characterized by two events:
  - slave trades
  - colonialism


• Earlier work (Acemoglu et al.) focuses on countries’ colonial experience and current economic development

• There are reasons to expect that the slave trades may have been at least as important as colonial rule for Africa’s development

• For a period of nearly 500 years (1400-1900), the African continent simultaneously experienced four slave trades

• By comparison, colonial rule lasted from 1885 to 1960 (only 75 years)


Empirical examination

• Examine the importance of Africa’s slave trades in shaping Africa’s current economic development

• Construct measures of the number of slaves from each country in Africa in each century between 1400 and 1900

• Estimates are constructed by combining:
  - data from ship records on the number of slaves shipped from each African port or region
  - data from a variety of historical documents that report the ethnic identities of slaves that were shipped from Africa


• Find a robust negative relationship between the number of slaves exported from each country and subsequent economic performance

• The African countries that are the poorest today are the ones from which the most slaves were taken

• This finding cannot be taken as conclusive evidence that the slave trades caused differences in subsequent economic development

• An alternative explanation is that the countries that were initially the most economically and socially underdeveloped selected into the slave trades, and these countries continue to be the most underdeveloped today

• Need to identify the direction of causality (find an instrumental variable)


Historical Background

• Between 1400 and 1900, the African continent experienced four simultaneous slave trades

• The largest and best known is the trans-Atlantic slave trade, beginning in the 15th century
  - slaves shipped from West Central Africa and Eastern Africa to European colonies in the New World

• Three other slave trades:
  - Trans-Saharan
  - Red Sea
  - Indian Ocean


• These trades were much older and predated the trans-Atlantic slave trade

• Trans-Saharan slave trade: slaves taken from south of the Saharan desert to Northern Africa

• Red Sea slave trade: slaves taken from inland of the Red Sea and shipped to the Middle East and India

• Indian Ocean slave trade: slaves taken from Eastern Africa and shipped either to the Middle East and India or to plantation islands in the Indian Ocean


THE LONG-TERM EFFECTS OF AFRICA’S SLAVE TRADES 147

TABLE I
SLAVE ETHNICITY DATA FOR THE TRANS-ATLANTIC SLAVE TRADE

Location                  Years      Num. ethnic.  Num. obs.  Record type

Valencia, Spain           1482–1516            77      2,675  Crown records
Puebla, Mexico            1540–1556            14        115  Notarial records
Dominican Republic        1547–1591            26         22  Records of sale
Peru                      1548–1560            16        202  Records of sale
Mexico                    1549                 12         80  Plantation accounts
Peru                      1560–1650            30      6,754  Notarial records
Lima, Peru                1583–1589            15        288  Baptism records
Colombia                  1589–1607             9         19  Various records
Mexico                    1600–1699            28        102  Records of sale
Dominican Republic        1610–1696            33         55  Government records
Chile                     1615                  6        141  Sales records
Lima, Peru                1630–1702            33        409  Parish records
Peru (Rural)              1632                 25        307  Parish records
Lima, Peru                1640–1680            33        936  Marriage records
Colombia                  1635–1695             6         17  Slave inventories
Guyane (French Guiana)    1690                 12         69  Plantation records
Colombia                  1716–1725            33         59  Government records
French Louisiana          1717–1769            23        223  Notarial records
Dominican Republic        1717–1827            11         15  Government records
South Carolina            1732–1775            35        681  Runaway notices
Colombia                  1738–1778            11        100  Various records
Spanish Louisiana         1770–1803            79      6,615  Notarial records
St. Dominique (Haiti)     1771–1791            25      5,413  Sugar plantations
Bahia, Brazil             1775–1815            14        581  Slave lists
St. Dominique (Haiti)     1778–1791            36      1,280  Coffee plantations
Guadeloupe                1788                  8         45  Newspaper reports
St. Dominique (Haiti)     1788–1790            21      1,297  Fugitive slave lists
Cuba                      1791–1840            59      3,093  Slave registers
St. Dominique (Haiti)     1796–1797            56      5,632  Plantation inventories
American Louisiana        1804–1820            62        223  Notarial records
Salvador, Brazil          1808–1842             6        456  Records of manumission
Trinidad                  1813                100     12,460  Slave registers
St. Lucia                 1815                 62      2,333  Slave registers
Bahia, Brazil             1816–1850            27      2,666  Slave lists
St. Kitts                 1817                 48      2,887  Slave registers
Senegal                   1818                 17         80  Captured slave ship
Berbice (Guyana)          1819                 66      1,127  Slave registers
Salvador, Brazil          1819–1836            12        871  Manumission certificates
Salvador, Brazil          1820–1835            11      1,106  Probate records
Sierra Leone              1821–1824            68        605  Child registers
Rio de Janeiro, Brazil    1826–1837            31        772  Prison records
Anguilla                  1827                  7         51  Slave registers
Rio de Janeiro, Brazil    1830–1852           190      2,921  Free africans’ records
Rio de Janeiro, Brazil    1833–1849            35        476  Death certificates


152 QUARTERLY JOURNAL OF ECONOMICS

TABLE II
ESTIMATED TOTAL SLAVE EXPORTS BETWEEN 1400 AND 1900 BY COUNTRY

Isocode  Country name                  Trans-Atlantic  Indian Ocean  Trans-Saharan  Red Sea  All slave trades

AGO      Angola                             3,607,020             0              0        0         3,607,020
NGA      Nigeria                            1,406,728             0        555,796   59,337         2,021,859
GHA      Ghana                              1,614,793             0              0        0         1,614,793
ETH      Ethiopia                                   0           200        813,899  633,357         1,447,455
SDN      Sudan                                    615           174        408,261  454,913           863,962
MLI      Mali                                 331,748             0        509,950        0           841,697
ZAR      Democratic Republic of Congo         759,468         7,047              0        0           766,515
MOZ      Mozambique                           382,378       243,484              0        0           625,862
TZA      Tanzania                              10,834       523,992              0        0           534,826
TCD      Chad                                     823             0        409,368  118,673           528,862
BEN      Benin                                456,583             0              0        0           456,583
SEN      Senegal                              278,195             0         98,731        0           376,926
GIN      Guinea                               350,149             0              0        0           350,149
TGO      Togo                                 289,634             0              0        0           289,634
GNB      Guinea-Bissau                        180,752             0              0        0           180,752
BFA      Burkina Faso                         167,201             0              0        0           167,201
MRT      Mauritania                               417             0        164,017        0           164,434
MWI      Malawi                                88,061        37,370              0        0           125,431
MDG      Madagascar                            36,349        88,927              0        0           125,275
COG      Congo                                 94,663             0              0        0            94,663
KEN      Kenya                                    303        12,306         60,351   13,490            86,448
SLE      Sierra Leone                          69,607             0              0        0            69,607
CMR      Cameroon                              66,719             0              0        0            66,719
DZA      Algeria                                    0             0         61,835        0            61,835
CIV      Ivory Coast                           52,646             0              0        0            52,646
SOM      Somalia                                    0           229         26,194    5,855            32,277
ZMB      Zambia                                 6,552        21,406              0        0            27,958
GAB      Gabon                                 27,403             0              0        0            27,403
GMB      Gambia                                16,039             0          5,693        0            21,731
NER      Niger                                    133             0              0   19,779            19,912
LBY      Libya                                      0             0          8,848        0             8,848
LBR      Liberia                                6,790             0              0        0             6,790
UGA      Uganda                                   900         3,654              0        0             4,554
ZAF      South Africa                           1,944            87              0        0             2,031
CAF      Central African Republic               2,010             0              0        0             2,010
EGY      Egypt                                      0             0          1,492        0             1,492
ZWE      Zimbabwe                                 554           536              0        0             1,089
NAM      Namibia                                  191             0              0        0               191
BDI      Burundi                                    0            87              0        0                87
GNQ      Equatorial Guinea                         11             0              0        0                11
DJI      Djibouti                                   0             5              0        0                 5
BWA      Botswana                                   0             0              0        0                 0


TABLE II (CONTINUED)

Isocode  Country name                  Trans-Atlantic  Indian Ocean  Trans-Saharan  Red Sea  All slave trades

CPV      Cape Verde Islands                         0             0              0        0                 0
COM      Comoros                                    0             0              0        0                 0
LSO      Lesotho                                    0             0              0        0                 0
MUS      Mauritius                                  0             0              0        0                 0
MAR      Morocco                                    0             0              0        0                 0
RWA      Rwanda                                     0             0              0        0                 0
STP      Sao Tome & Principe                        0             0              0        0                 0
SWZ      Swaziland                                  0             0              0        0                 0
SYC      Seychelles                                 0             0              0        0                 0
TUN      Tunisia                                    0             0              0        0                 0

FIGURE III
Relationship between Log Slave Exports Normalized by Land Area, ln(exports/area), and Log Real Per Capita GDP in 2000, ln y

between 1400 and 1900 normalized by land area and the natural log of per capita GDP in 2000.⁷ As shown in the figure, a negative

7. Because the natural log of zero is undefined, I take the natural log of 0.1. As I show in the Appendix, the results are robust to the omission of these zero-export countries.
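
The zero-export adjustment in footnote 7 can be sketched as follows. This is only an illustration in Python, under the assumption that a country with zero exports is assigned ln(0.1) for the whole normalized measure; the footnote does not spell out the exact step. The function name and the land areas used in the example are hypothetical; only the export totals come from Table II.

```python
import math

def log_exports_per_area(exports, area, zero_value=0.1):
    """ln(exports/area); zero-export countries get ln(zero_value) instead.

    Assumption: the substitute value replaces the whole ratio, since the
    footnote only says that the natural log of 0.1 is taken.
    """
    if exports < 0 or area <= 0:
        raise ValueError("exports must be >= 0 and area must be > 0")
    if exports == 0:
        return math.log(zero_value)
    return math.log(exports / area)

# Export totals from Table II; the areas are made-up placeholder values.
sample = {"AGO": (3607020, 1000.0), "BWA": (0, 500.0)}

for iso, (exports, area) in sample.items():
    print(iso, round(log_exports_per_area(exports, area), 3))
```

Any such substitution is arbitrary, which is why the footnote also reports that the results hold when the zero-export countries are dropped.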


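Estimation Equation

$\ln y_i = \beta_0 + \beta_1 \ln(\text{exports}_i/\text{area}_i) + C_i'\delta + X_i'\gamma + \varepsilon_i$

$y_i$ is per capita GDP in country i

$C_i$ is a vector of variables reflecting the origin of the colonizer prior to independence

$X_i$ is a vector of variables reflecting geography and climate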


TABLE III
RELATIONSHIP BETWEEN SLAVE EXPORTS AND INCOME

Dependent variable is log real per capita GDP in 2000, ln y

                          (1)        (2)        (3)        (4)        (5)        (6)

ln(exports/area)          -0.112***  -0.076***  -0.108***  -0.085**   -0.103***  -0.128***
                          (0.024)    (0.029)    (0.037)    (0.035)    (0.034)    (0.034)
Distance from equator                 0.016     -0.005      0.019      0.023      0.006
                                     (0.017)    (0.020)    (0.018)    (0.017)    (0.017)
Longitude                             0.001     -0.007     -0.004     -0.004     -0.009
                                     (0.005)    (0.006)    (0.006)    (0.005)    (0.006)
Lowest monthly rainfall              -0.001      0.008      0.0001    -0.001     -0.002
                                     (0.007)    (0.008)    (0.007)    (0.006)    (0.008)
Avg max humidity                      0.009      0.008      0.009      0.015      0.013
                                     (0.012)    (0.012)    (0.012)    (0.011)    (0.010)
Avg min temperature                  -0.019     -0.039     -0.005     -0.015     -0.037
                                     (0.028)    (0.028)    (0.027)    (0.026)    (0.025)
ln(coastline/area)                    0.085**    0.092**    0.095**    0.082**    0.083**
                                     (0.039)    (0.042)    (0.042)    (0.040)    (0.037)
Island indicator                                           -0.398     -0.150
                                                           (0.529)    (0.516)
Percent Islamic                                            -0.008***  -0.006*    -0.003
                                                           (0.003)    (0.003)    (0.003)
French legal origin                                         0.755      0.643     -0.141
                                                           (0.503)    (0.470)    (0.734)
North Africa indicator                                      0.382     -0.304
                                                           (0.484)    (0.517)
ln(gold prod/pop)                                                      0.011      0.014
                                                                      (0.017)    (0.015)
ln(oil prod/pop)                                                       0.078***   0.088***
                                                                      (0.027)    (0.025)
ln(diamond prod/pop)                                                  -0.039     -0.048
                                                                      (0.043)    (0.041)
Colonizer fixed effects   Yes        Yes        Yes        Yes        Yes        Yes

Number obs.               52         52         42         52         52         42
R2                        .51        .60        .63        .71        .77        .80

Notes. OLS estimates of (1) are reported. The dependent variable is the natural log of real per capita GDP in 2000, ln y. The slave export variable ln(exports/area) is the natural log of the total number of slaves exported from each country between 1400 and 1900 in the four slave trades normalized by land area. The colonizer fixed effects are indicator variables for the identity of the colonizer at the time of independence. Coefficients are reported with standard errors in brackets. ***, **, and * indicate significance at the 1%, 5%, and 10% levels.
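
Since both ln y and ln(exports/area) are in logs, the coefficient on ln(exports/area) reads as an elasticity. The Python sketch below works through the column (1) estimate: the percent difference in per capita GDP associated with a doubling of exports per area, and the t-ratio implied by the reported standard error. The function name is hypothetical, and the numbers describe an association in the OLS fit, not a causal effect.

```python
beta = -0.112   # column (1) coefficient on ln(exports/area)
se = 0.024      # its reported standard error

def pct_change_in_y(elasticity, ratio):
    """Percent change in y when x is multiplied by `ratio` in a log-log model."""
    return 100.0 * (ratio ** elasticity - 1.0)

doubling = pct_change_in_y(beta, 2.0)   # exports/area twice as large
t_ratio = beta / se                     # coefficient divided by standard error

print(f"doubling exports/area: {doubling:.1f}% difference in per capita GDP")
print(f"t-ratio: {t_ratio:.2f}")
```

Under these numbers, a country with twice the exports per area is predicted to have per capita GDP about 7.5% lower, and the t-ratio is well beyond the usual 1% critical value, consistent with the three stars in the table.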

for slave exports remains negative and significant, and the magnitude of the estimated coefficient actually increases.⁹

9. One may also be concerned that the inclusion of the countries in southern Africa, namely South Africa, Swaziland, and Lesotho, may also be biasing the results. As I report in the Appendix, the results are robust to also omitting this