GLM estimation of trade gravity models with fixed effects

Peter H. Egger∗

ETH Zurich

CEPR, CESifo, GEP, WIFO

Kevin E. Staub†

University of Melbourne

June 2, 2014

Abstract

Many empirical gravity models are now based on generalized linear models (GLM), of which the Poisson pseudo-maximum likelihood estimator is a prominent example and the most-frequently used estimator. Previous literature on the performance of these estimators has primarily focussed on the role of the variance function for the estimators’ behavior. We add to this literature by studying the small-sample performance of estimators in a data-generating process that is fully consistent with general equilibrium economic models of international trade. Economic theory suggests that (i) importer- and exporter-specific effects need to be accounted for in estimation, and (ii) that they are correlated with bilateral trade costs through general-equilibrium (or balance-of-payments) restrictions. We compare the performance of structural estimators, fixed effects estimators, and quasi-differences estimators in such settings, using the GLM approach as a unifying framework.

Keywords: Gravity models; Generalized linear models; Fixed effects.

JEL-codes: F14, C23.

∗Corresponding author: ETH Zurich, Department of Management, Technology, and Economics, Weinbergstr. 35, 8092 Zurich, Switzerland; E-mail: [email protected]; +41 44 632 41 08. Egger acknowledges funding by the Czech Science Fund (GA ČR) through grant number P402/12/0982.

†Department of Economics, 111 Barry Street, The University of Melbourne, 3010 VIC, Australia; E-mail: [email protected]; +61 3 903 53776. The authors acknowledge financial support from the Faculty of Business and Economics, University of Melbourne.

1 Introduction

Models of international trade often imply a gravity equation for bilateral international exports

of country i to country j (Xij),

Xij = Ei Mj Tij ,

where Ei and Mj are exporter-specific factors (a function of goods prices and mass of suppliers)

and importer-specific factors (a function of the price index and total expenditures on goods),

and Tij are bilateral, pair-specific ones (a function of country-pair-specific consumer preferences

and ad-valorem trade costs). Examples include Eaton and Kortum (2002), Anderson and van

Wincoop (2003), Baier and Bergstrand (2009), Waugh (2010), Anderson and Yotov (2012),

Arkolakis, Costinot, and Rodríguez-Clare (2012), and Bergstrand, Egger, and Larch (2013).

Baltagi, Egger, and Pfaffermayr (2014) and Head and Mayer (2014) provide recent surveys of

this literature.

Writing ei = lnEi, mj = lnMj , and tij = lnTij and using the parametrization tij = d′ijβ,

where dij is a vector of observable bilateral variables and β a conformable vector of parameters,

empirical models of the gravity equation follow the general structure

E(Xij |ei,mj , dij) = exp(ei +mj + d′ijβ). (1)

Thus, there are two departures from the original equation. The first relates to the parametriza-

tion of the unknown tij in terms of observables. The linear index structure coupled with the

exponential function allows the inclusion of variables in a flexible way, while giving the elements

of β the convenient interpretation of direct (semi-)elasticities of exports with respect to the

variables in dij and restricting the domain of exports to be positive. The second departure from

the original is that the relationship is stochastic and assumed to hold (only) in expectation.

The stochastic formulation implies an error term which makes the relationship between Xij and

its specified expectation E(Xij |ei,mj , dij) exact. Thus, the moment condition (1) implies two

equivalent representations with stochastic errors, either additive or multiplicative:

Xij = exp(ei +mj + d′ijβ) + εij = exp(ei +mj + d′ijβ)ηij , (2)

where ηij = 1+εij exp(−ei−mj−d′ijβ). Together with (1), this implies that E(εij |ei,mj , dij) = 0

for the additive error εij , and similarly E(ηij |ei,mj , dij) = 1 for the multiplicative error ηij , i.e.,

these errors are mean-independent of the covariates.

It is the goal of this paper to discuss a number of estimators which are consistent, if model

(1) is correctly specified, and explore their small-sample properties under varying distributions

of ηij . This is a subject which has attracted considerable attention recently, and the body of

work studying the performance of estimators of gravity models for trade includes Santos Silva

and Tenreyro (2006, 2011), Martin and Pham (2008), Gómez Herrera (2012), Martínez Zarzoso

(2013), and Head and Mayer (2014), among others.

This paper offers three additions to this literature. First, it uses a data-generating process

(DGP) which corresponds to a general equilibrium model of international trade as the basis for

Monte Carlo simulations. The model of international trade is calibrated to match some features

from real-world data. Thus, unlike the previous literature, the performance of estimators is

analyzed in a setting with exporter- and importer-specific effects which are correlated with

dij , as suggested by economic theory and as imposed in all applied structural work on gravity

models (e.g., in Anderson and van Wincoop, 2003, or Bergstrand, Egger, and Larch, 2013).1

Second, apart from estimators used by the previous literature, the comparison also includes

(i) iterative-structural estimators which fully exploit information from the underlying economic

general equilibrium model of international trade, and (ii) estimators based on quasi-differenced

moment conditions recently proposed by Charbonneau (2012). Third, we provide a unified

discussion of the available estimators within the framework of generalized linear models (GLM).

As exemplified in an application to a cross-section of 94 countries in the year 2008, the GLM

framework can be useful for choosing between competing estimators.

Because our focus is on consistent estimators of the gravity equation (1), one common es-

timator which we will ignore is the OLS estimator of the regression with dependent variable

lnXij . The log-linearized OLS estimator will generally deliver inconsistent estimates for β, even

if the true model is (1). A lucid exposition of this issue is given in Santos Silva and Tenreyro

(2006). Following their illustration, note that the logarithmized equation for bilateral exports is

lnXij = ei +mj + d′ijβ + ln ηij ,

and OLS estimation will only be consistent if E(ln ηij |ei,mj , dij) = 0. But, as (2) makes clear,

ln ηij is a function of the covariates and εij , and so the condition for consistency of OLS will be

violated in general.
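The argument can be illustrated with a small simulation. The sketch below is not the paper's Monte Carlo design: it assumes a single regressor without fixed effects, a true slope of 1, and a log-normal multiplicative error with V(η|x) = 1/µ, so that E(ln η|x) varies with x. All function names are hypothetical. The Poisson PML slope is obtained by concentrating the intercept out of the first-order conditions and applying Newton steps to the remaining scalar equation.

```python
import math
import random

def simulate(n=20000, beta0=0.0, beta1=1.0, seed=0):
    """Draw (x, X) with E(X|x) = exp(beta0 + beta1*x) and V(eta|x) = 1/mu."""
    rng = random.Random(seed)
    xs, Xs = [], []
    for _ in range(n):
        x = rng.uniform(0.0, 2.0)
        mu = math.exp(beta0 + beta1 * x)
        s2 = math.log(1.0 + 1.0 / mu)  # log-normal variance parameter
        eta = math.exp(rng.gauss(-0.5 * s2, math.sqrt(s2)))  # E(eta|x) = 1
        xs.append(x)
        Xs.append(mu * eta)
    return xs, Xs

def ols_log_slope(xs, Xs):
    """Slope from OLS of ln X on x (with intercept)."""
    ys = [math.log(X) for X in Xs]
    xbar = sum(xs) / len(xs)
    ybar = sum(ys) / len(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx

def ppml_slope(xs, Xs, iters=50, tol=1e-12):
    """Poisson PML slope; the intercept is concentrated out of the score."""
    total = sum(Xs)
    sxX = sum(x * X for x, X in zip(xs, Xs))
    b = 0.0
    for _ in range(iters):
        w = [math.exp(b * x) for x in xs]
        sw = sum(w)
        m1 = sum(x * wi for x, wi in zip(xs, w)) / sw
        m2 = sum(x * x * wi for x, wi in zip(xs, w)) / sw
        step = (sxX / total - m1) / (m2 - m1 ** 2)  # Newton step on profile score
        b += step
        if abs(step) < tol:
            break
    return b

xs, Xs = simulate()
b_ols = ols_log_slope(xs, Xs)
b_ppml = ppml_slope(xs, Xs)
print(f"log-linear OLS slope: {b_ols:.3f}, Poisson PML slope: {b_ppml:.3f}")
```

Under these assumptions the log-linear OLS slope should settle visibly above the true value of 1, while the Poisson PML slope should stay close to it; a different variance function would change the size and sign of the OLS bias.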

Approaches that estimate equation (1) consistently are discussed below in Section 2. The

Monte Carlo setup and results from the simulation study are presented in Section 3, followed

by an application with real-world data in Section 4. We summarize and discuss our findings in Section 5.

1 In some of the aforementioned work, e.g., in Eaton and Kortum (2002) or Waugh (2010), structural constraints are assumed by the underlying theory but not imposed in estimation.

2 Estimation of gravity models of bilateral trade

In gravity equations derived from economic models of trade, the exporter- and importer-specific

effects ei and mj are functions of the bilateral-specific trade costs tij . This implies that the

exporter- and importer-specific terms are, in general, correlated with the bilateral variables

dij . Thus, estimators relegating these terms to the error will be inconsistent. This includes all

(non-structural) random effects approaches.

Approaches which do allow for dependence between exporter- and importer-specific effects

are structural estimation and fixed effects estimation. Structural estimation explicitly specifies

the exporter- and importer-specific terms as a function of the economic model’s variables. In

contrast, the fixed effects approach is to leave the relationship between dij and the exporter-

and importer-specific terms –the fixed effects– unspecified, and is thus consistent regardless of

the nature of their dependence. Within this approach, there are two ways of handling the fixed

effects. The first way is to estimate β from moment conditions which are derived from (1) but

which do not depend on the fixed effects. The second way treats the fixed effects as parameters

to be estimated. Although both fall into the fixed effects domain, for convenience we will from

now on refer to estimators under the first approach as quasi-differences estimators and to the

ones under the latter approach as fixed effects estimators.

Structural estimators fully exploit the information on the data-generating process and are

thus potentially more efficient than other approaches. However, to be consistent they require

the underlying economic model to be correctly specified. On the other hand, quasi-differences

estimators and fixed effects estimators are consistent under much weaker assumptions. The

quasi-differences approach has the feature of avoiding estimating a potentially large number of

parameters. This can be an advantage because estimation of the set of fixed effects can be both

computationally difficult and lead to less precise estimates. On the other hand, some objects of

interest (such as predictions and average marginal effects) are functions not only of β, but also

of ei and mj . Without having estimates of these fixed effects, it may be impossible to recover

estimates for such objects of interest.

An important drawback of fixed effects estimators of nonlinear models is that they suffer

from the incidental parameters problem. In the context of the gravity equation, this means

that as the number of observations grows, the number of parameters ei and mj that has to be

estimated grows as well. In general, this implies that ei and mj cannot be estimated consistently

and, worse, the inconsistency passes over to β. However, the exponential mean model has

the attractive feature that, like the linear model, it does not exhibit the incidental parameters

problem. This fact is well known in the context of panel models with one-way fixed effects (i.e., one dimension of fixed effects). Recent simulation evidence suggests that this is also the case with two-way fixed

effects (Fernández-Val and Weidner, 2013); and our own Monte Carlo results of section 3 also

point in this direction.

We begin by discussing the GLM approach (Section 2.1), which can be used to estimate a

variety of fixed effects and structural estimators, and in principle even serve as a “wrapper” for

some quasi-differenced moment conditions. To fix ideas, we explain the GLM approach in the

context of fixed-effects estimation. Then, we discuss structural and quasi-differences estimation

in some more detail in sections 2.2 and 2.3.

2.1 GLM estimation of gravity models with fixed effects

Nonlinear models with a linear index, such as (1), can be estimated consistently by a host of so-

called generalized linear model (GLM) estimators. These models have a common structure, as is

well-known and outlined briefly in this section. The GLM framework (Nelder and Wedderburn,

1972; McCullagh and Nelder, 1989) is a likelihood-based approach that remains consistent for

the parameters of interest as long as the conditional expectation function is correctly specified.

It is widely used to estimate nonlinear models.

GLM is based on likelihood estimation of potentially misspecified densities of the linear-

exponential family (LEF) of distributions, which includes the Poisson, Negative Binomial,

Gamma, Normal, and other distributions. A GLM estimator is determined by (i) the speci-

fication of a link function – which governs the relationship between the conditional expectation

function and the linear index of covariates – together with (ii) the specification of a family, a

density from the LEF.

Let us view ei and mj under the fixed-effects approach, where the fixed effects are parameters

to be estimated. The distribution of Xij is part of the LEF class if its density can be written as

$$ f_\theta(X_{ij}) = \exp\left\{ \frac{X_{ij}\theta_{ij} - b(\theta_{ij})}{a(\phi)} + c(\theta_{ij}, X_{ij}) \right\}, $$

with parameters θij = (β′, ei,mj)′, ϕ (which we will later show to be related to the variance of

Xij , denoted as V (Xij)), and arbitrary functions a(·), b(·) and c(·). Such a random variable will

have the log-likelihood

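$$ l(\theta_{ij}) = \frac{X_{ij}\theta_{ij} - b(\theta_{ij})}{a(\phi)} + c(\theta_{ij}, X_{ij}), $$

and score

$$ s(\theta_{ij}) = \frac{\partial l(\theta_{ij})}{\partial \theta_{ij}} = \frac{X_{ij} - b'(\theta_{ij})}{a(\phi)}. $$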

A fundamental result of likelihood theory is that the expected score evaluated at the true value

of the parameter is zero. For members of the LEF, this implies

$$ E[s(\theta_{ij})] = \frac{E(X_{ij}) - b'(\theta_{ij})}{a(\phi)} = 0, $$

and, therefore, E(Xij) = b′(θij) ≡ µij . In GLM estimation, the mean of Xij , µij , is specified as

a function of a linear index, in our context d′ijβ + ei +mj , so that µij = g−1(d′ijβ + ei +mj), or

g(µij) = d′ijβ + ei +mj . This function, g(·), is called the link function. The gravity model (1),

consequently, implies a log-link.

A second result from likelihood theory, the information matrix equality, allows us to derive the variance function. The Hessian of a LEF random variable is

$$ H(\theta_{ij}) = \frac{\partial^2 l(\theta_{ij})}{\partial \theta_{ij}^2} = \frac{-b''(\theta_{ij})}{a(\phi)}. $$

By the information matrix equality, V [s(θ)] = −E[H(θ)], and therefore

$$ \frac{E[X_{ij} - b'(\theta_{ij})]^2}{a(\phi)^2} = \frac{b''(\theta_{ij})}{a(\phi)}, $$

which is equivalent to V (Xij) = b′′(θij)a(ϕ).
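As a worked illustration of these two results, consider the Poisson and Gamma members of the LEF. The parametrizations below are the standard textbook ones and are stated here as an aid to the reader rather than taken from the paper; subscripts ij are dropped for brevity.

```latex
% Poisson: canonical parameter \theta = \ln\mu, dispersion a(\phi) = 1
b(\theta) = e^{\theta}, \qquad
E(X) = b'(\theta) = e^{\theta} = \mu, \qquad
V(X) = b''(\theta)\, a(\phi) = e^{\theta} = \mu .

% Gamma: canonical parameter \theta = -1/\mu, dispersion a(\phi) = \phi
b(\theta) = -\ln(-\theta), \qquad
E(X) = b'(\theta) = -\frac{1}{\theta} = \mu, \qquad
V(X) = b''(\theta)\, a(\phi) = \frac{\phi}{\theta^{2}} = \phi\,\mu^{2} .
```

In both cases the variance is a function of the mean only (up to the dispersion factor), which is what makes the family choice equivalent to a choice of variance function.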

The parameter vector θ is estimated by maximizing the sample (quasi-)likelihood function

$$ l_C(\beta) = \sum_{i=1}^{C} \sum_{j=1}^{C} \left[ \frac{X_{ij}\theta_{ij} - b(\theta_{ij})}{a(\phi)} + c(\theta_{ij}, X_{ij}) \right], $$

with corresponding score for β

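$$
\begin{aligned}
s_C(\beta) &= \sum_{i=1}^{C} \sum_{j=1}^{C} \frac{X_{ij} - b'(\theta_{ij})}{a(\phi)} \frac{\partial \theta_{ij}}{\partial \beta} \\
&= \sum_{i=1}^{C} \sum_{j=1}^{C} \frac{X_{ij} - b'(\theta_{ij})}{a(\phi)} \frac{\partial \theta_{ij}}{\partial \mu_{ij}} \frac{\partial \mu_{ij}}{\partial \beta} \\
&= \sum_{i=1}^{C} \sum_{j=1}^{C} \frac{X_{ij} - b'(\theta_{ij})}{b''(\theta_{ij})\, a(\phi)} \frac{\partial \mu_{ij}}{\partial \beta}.
\end{aligned}
$$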

Analogous equations hold for ei and mj . The last equality follows from µij = b′(θij), which implies ∂µij/∂θij = b′′(θij). Thus, the first order conditions sC(β) = 0 can be written as

$$ \sum_{i=1}^{C} \sum_{j=1}^{C} \frac{[X_{ij} - E(X_{ij})]}{V(X_{ij})} \frac{\partial E(X_{ij})}{\partial \beta} = 0. \qquad (3) $$

Equation (3) shows that GLM estimators are moment estimators based on the conditional ex-

pectation residual (the expression in square brackets). As long as the expectation function is

correctly specified, they will be consistent for β. Equation (3) also shows that GLM estimators

weight the residuals in the first-order conditions by the inverse of the variance, V (Xij). There-

fore, relative efficiency gains – but not consistency of β – depend on the correct specification

of the variance function. Different choices of the LEF distribution (i.e., of the family) lead to

different variance functions.

As mentioned above, gravity models use the log-link function, so that in general the first

order conditions for β in models based on (1) have the form

$$ \sum_{i=1}^{C} \sum_{j=1}^{C} \frac{[X_{ij} - \exp(e_i + m_j + d_{ij}'\beta)]}{V(X_{ij})} \exp(e_i + m_j + d_{ij}'\beta)\, d_{ij} = 0. \qquad (4) $$
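To make the first-order conditions concrete, the following sketch solves them for the Poisson family, where V (Xij) = µij and the weight in (4) is one. It is an illustration under simplifying assumptions, not the implementation used in the paper: dij is a scalar, intra-national flows are excluded, and the function name `ppml_two_way` is hypothetical. For the Poisson family the conditions for ei and mj can be solved in closed form given the other parameters, so the sketch alternates between these updates and one Newton step for β.

```python
import math
import random

def ppml_two_way(X, d, iters=2000, tol=1e-13):
    """Poisson PML for E(X_ij) = exp(e_i + m_j + beta*d_ij), i != j.

    X and d are C x C nested lists; diagonal entries are ignored.
    Returns (beta, e, m). A scalar d_ij is a simplifying assumption.
    """
    C = len(X)
    pairs = [(i, j) for i in range(C) for j in range(C) if i != j]
    e = [0.0] * C
    m = [0.0] * C
    beta = 0.0
    for _ in range(iters):
        # Exporter effects: sum_j (X_ij - mu_ij) = 0 has a closed-form solution.
        for i in range(C):
            num = sum(X[i][j] for j in range(C) if j != i)
            den = sum(math.exp(m[j] + beta * d[i][j]) for j in range(C) if j != i)
            e[i] = math.log(num / den)
        # Importer effects: sum_i (X_ij - mu_ij) = 0.
        for j in range(C):
            num = sum(X[i][j] for i in range(C) if i != j)
            den = sum(math.exp(e[i] + beta * d[i][j]) for i in range(C) if i != j)
            m[j] = math.log(num / den)
        # One Newton step on sum_ij (X_ij - mu_ij) d_ij = 0, holding e and m fixed.
        score = 0.0
        hess = 0.0
        for i, j in pairs:
            mu = math.exp(e[i] + m[j] + beta * d[i][j])
            score += (X[i][j] - mu) * d[i][j]
            hess += mu * d[i][j] ** 2
        step = score / hess
        beta += step
        if abs(step) < tol:
            break
    return beta, e, m

# Noise-free check with hypothetical values: the estimator should return beta = 1.
rng = random.Random(1)
C = 6
d = [[rng.uniform(-2.0, 0.0) for _ in range(C)] for _ in range(C)]
e_true = [0.3 * i for i in range(C)]
m_true = [-0.2 * j for j in range(C)]
X = [[math.exp(e_true[i] + m_true[j] + 1.0 * d[i][j]) for j in range(C)]
     for i in range(C)]
beta_hat, e_hat, m_hat = ppml_two_way(X, d)
print(round(beta_hat, 6))
```

The fixed effects are only identified up to a common shift between ei and mj, so the sketch does not normalize them. Other families change the weights in (4) and no longer admit closed-form updates of the fixed effects.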

Various distributions are suitable candidates for gravity models in the sense that they are mem-

bers of LEF and that they imply variance functions V (Xij) that are potentially compatible with

bilateral export flows. Poisson and Gamma estimators have been discussed in Santos Silva and

Tenreyro (2006) and others; estimators based on Gaussian, inverse Gaussian, and Negative Bi-

nomial quasi-likelihood are also possible and have been used in the literature, though much less

so than Poisson and Gamma estimators. The implied variance functions, in terms of the condi-

tional expectation function µij , are given in Table 1 below. Because of the exponential function,

the derivative ∂E(Xij)/∂β is again µij . The column on the right of the table reports the weight

given in the first-order conditions to the residual of country-pair ij, i.e., wij = µij/V (Xij).

Table 1: Variance functions and F.O.C. weights of some GLM gravity estimators

Estimator            V (Xij)          wij
Poisson              µij              1
Gamma                µij^2            µij^(−1)
Negative Binomial    µij + µij^2      (1 + µij)^(−1)
Gaussian             1                µij
Inverse Gaussian     µij^3            µij^(−2)

Thus, as pointed out by Santos Silva and Tenreyro (2006), the Poisson estimator weights every

observation equally, which might be optimal when wanting to remain agnostic with respect to

higher-order specification. The other estimators in Table 1, with the exception of the Gaussian

one, down-weight observations with larger means. If these observations are also noisier in the

sense of having a larger variance, this would result in efficiency gains. In contrast, the Gaussian

estimator gives more weight to country-pairs with larger expected exports, a weighting scheme

which might be beneficial, for instance, if there are trustworthiness issues with small trade

flows.2 The Gaussian GLM gravity estimator is equivalent to the nonlinear least squares (NLS) estimator of (1).

2 Such trade flows may emerge particularly with or among less-developed countries, where reporting standards are poor. For instance, Egger and Nigai (2014) illustrate that structural parameterized GLM gravity models tend to predict large bilateral trade flows with much smaller error than small trade flows.
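The weighting schemes of Table 1 can be tabulated directly. The short sketch below encodes the variance functions and computes wij = µij/V (Xij) for a few illustrative values of the mean; the dictionary and function names are hypothetical.

```python
# Variance functions V(mu) from Table 1, keyed by GLM family.
VARIANCE = {
    "poisson": lambda mu: mu,
    "gamma": lambda mu: mu ** 2,
    "negative_binomial": lambda mu: mu + mu ** 2,
    "gaussian": lambda mu: 1.0,
    "inverse_gaussian": lambda mu: mu ** 3,
}

def foc_weight(family, mu):
    """Weight on the residual X_ij - mu_ij in (4): w_ij = mu_ij / V(X_ij)."""
    return mu / VARIANCE[family](mu)

# Small, medium, and large expected trade flows (illustrative values).
for mu in (0.5, 2.0, 50.0):
    row = {fam: round(foc_weight(fam, mu), 4) for fam in VARIANCE}
    print(mu, row)
```

Reading across a row shows the pattern described in the text: the Poisson weight is constant, the Gaussian weight rises with the mean, and the remaining weights fall with it.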

2.2 Iterative-structural estimation

A structural approach to estimating the gravity equation makes use of economic theory. Most

modern gravity models assume resource constraints of the form

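$$ E\Big[\sum_{j=1}^{C} X_{ij}\Big] = E\Big[\sum_{j=1}^{C} \exp(e_i + m_j + d_{ij}'\beta)\Big] = E\Big[\exp(e_i) \sum_{j=1}^{C} \exp(m_j + d_{ij}'\beta)\Big] \qquad (5) $$

and

$$ E\Big[\sum_{i=1}^{C} X_{ij}\Big] = E\Big[\sum_{i=1}^{C} \exp(e_i + m_j + d_{ij}'\beta)\Big] = E\Big[\exp(m_j) \sum_{i=1}^{C} \exp(e_i + d_{ij}'\beta)\Big]. \qquad (6) $$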

Hence, (5) and (6) imply that the country-specific effects ei and mj are implicitly fully determined given data on Xij and dij together with estimates of β. Clearly, this implies that E[eidij ] ≠ 0, E[mjdij ] ≠ 0, and E[eimj ] ≠ 0 according to economic theory.3

In our Monte Carlo study, we will employ data-generating processes which respect this

feature. Specifically we will consider an endowment economy. Many data-generating processes

considered in the literature are isomorphic to the one of endowment economies (see Anderson and

van Wincoop, 2003, for such an approach, and Eaton and Kortum, 2002; Arkolakis, Costinot,

and Rodríguez-Clare, 2012; and Bergstrand, Egger, and Larch, 2013, for isomorphic processes).

Suppose an economy i is endowed with a production volume of Hi which it sells at a mill

price of Pi, earning it an aggregate income (GDP) of Yi = PiHi. In general, sales to market j are

impeded by some trade costs, for which we use the notation Tij as in the introduction. Then,

in an endowment economy with Armington differentiation (between the goods from different

countries) at an elasticity of 1 − α, nominal aggregate bilateral trade flows in a world akin to

the one in Anderson and van Wincoop (2003) are determined as4

$$ E(X_{ij}) = \frac{P_i^{\alpha}\, T_{ij}\, Y_j}{\sum_{k=1}^{C} P_k^{\alpha}\, T_{kj}}, \qquad (7) $$

where, at given Hi and Tij , Pi is implicitly determined from the resource constraint as generically

introduced in (5) and (6), which in the present context becomes

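$$ Y_i = P_i H_i = \sum_{j=1}^{C} \frac{P_i^{\alpha}\, T_{ij}\, P_j H_j}{\sum_{k=1}^{C} P_k^{\alpha}\, T_{kj}}, \qquad (8) $$

$$ \Rightarrow \quad P_i^{1-\alpha} = \frac{1}{H_i} \sum_{j=1}^{C} \frac{T_{ij}\, P_j H_j}{\sum_{k=1}^{C} P_k^{\alpha}\, T_{kj}}. \qquad (9) $$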

3 As a consequence, omitting ei + mj from the specification in (2) or simply replacing it by a log-additive function of GDP and/or GDP per capita of countries i and j, as had been done for decades in a-theoretical empirical gravity models, will generally lead to inconsistent estimates of β, the semi-elasticity of trade with respect to dij .

4 We skip the exporter-specific preference parameter introduced in Anderson and van Wincoop (2003) without loss of generality in the interest of brevity.

This is exactly the data-generating process which we will use for the endogenous variables in the

model, {E(Xij), Pi, Yi}, before adding a stochastic term to (7). In terms of the gravity equation

(1), $\ln T_{ij} = d_{ij}'\beta$, $\ln P_i^{\alpha} = e_i$, and $\ln\big( P_j H_j / \sum_{k=1}^{C} P_k^{\alpha} T_{kj} \big) = m_j$.
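The structural part of this process can be sketched as a fixed-point problem in prices. The code below is a minimal illustration, not the authors' implementation: it iterates on (9), normalizes the first country's price to one (prices are only determined up to scale), and then evaluates (7). The function names and the numerical inputs are hypothetical, and convergence is argued here only for α < 0, the case considered in the paper.

```python
import math

def solve_prices(H, T, alpha, tol=1e-13, max_iter=10000):
    """Fixed-point iteration on (9) with P[0] normalized to 1 (numeraire)."""
    C = len(H)
    P = [1.0] * C
    for _ in range(max_iter):
        # Denominator of (7)-(9) for each destination j: sum_k P_k^alpha T_kj.
        denom = [sum(P[k] ** alpha * T[k][j] for k in range(C)) for j in range(C)]
        new = [
            (sum(T[i][j] * P[j] * H[j] / denom[j] for j in range(C)) / H[i])
            ** (1.0 / (1.0 - alpha))
            for i in range(C)
        ]
        new = [p / new[0] for p in new]  # prices are only determined up to scale
        gap = max(abs(a - b) for a, b in zip(new, P))
        P = new
        if gap < tol:
            break
    return P

def structural_means(H, T, alpha):
    """Return prices P, GDP Y, and mean exports mu_ij = E(X_ij) from (7)."""
    C = len(H)
    P = solve_prices(H, T, alpha)
    Y = [P[i] * H[i] for i in range(C)]
    denom = [sum(P[k] ** alpha * T[k][j] for k in range(C)) for j in range(C)]
    mu = [[P[i] ** alpha * T[i][j] * Y[j] / denom[j] for j in range(C)]
          for i in range(C)]
    return P, Y, mu

# Illustrative inputs (hypothetical numbers, not the paper's calibration).
H = [150.0, 200.0, 250.0, 180.0]
T = [[1.0 if i == j else 0.3 + 0.1 * ((i + 2 * j) % 4) for j in range(4)]
     for i in range(4)]
P, Y, mu = structural_means(H, T, alpha=-4.0)
print([round(p, 4) for p in P])
```

At the solution, each country's total sales (including sales to itself) add up to its GDP, which is the resource constraint (8). The fixed effects then follow as ei = α ln Pi and mj = ln(PjHj) minus the log of the denominator for destination j.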

Assuming that data on GDP of each country i (Yi) are available and α is known, a structural

estimator for β, ei, mj (i = 1, . . . , C, j = 1, . . . , C) can be defined as the solution obtained by

iterating between the following Step 1 and Step 2 until convergence:

Step 0 Initial values:

Set initial values for ei, mj (i = 1, . . . , C, j = 1, . . . , C). Call vectors of these values {e(n), m(n)}.

Step 1 Update β:

Given current values {e(n), m(n)}, generate the auxiliary variable $em^{(n)}_{ij} = e^{(n)}_i + m^{(n)}_j$. Estimate β by GLM regression of Xij on dij and $em^{(n)}_{ij}$, under the constraint that the coefficient corresponding to $em^{(n)}_{ij}$ is set to 1. The resulting estimator of β is called β(n).

Step 2 Update {e, m}:

Given current estimates β(n), {e(n), m(n)}, GDP {Yi} (i = 1, . . . , C), and α, obtain $T^{(n)}_{ij} = \exp\{d_{ij}'\beta^{(n)}(1-\alpha)\}$, $e^{(n+1)}_i = \ln E^{(n+1)}_i$ and $m^{(n+1)}_j = \ln M^{(n+1)}_j$, where

$$ E^{(n+1)}_i = \frac{Y_i}{\sum_j T^{(n)}_{ij} \dfrac{Y_j}{\sum_k E^{(n)}_k T^{(n)}_{kj}}} \quad \text{and} \quad M^{(n+1)}_j = \frac{Y_j}{\sum_i E^{(n)}_i T^{(n)}_{ij}}. $$

After Step 2, (n + 1) is redefined as (n) and Step 1 is repeated. Convergence criteria can be

defined by norms on the vector of differences (β(n+1)′ − β(n)′ , e(n+1)′ − e(n)′, m(n+1)′ − m(n)′). As

in the fixed-effects approach, different iterative-structural estimators are defined by the choice

of the respective family of distributions for the GLM regression in Step 1.
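The update of the country-specific terms in Step 2 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the trade cost matrix T is held fixed (i.e., β is treated as known, so Step 1 is skipped), the inputs are hypothetical numbers, and the function name `step2_update` is invented. With T fixed, repeating the update amounts to rescaling the rows and columns of T until the fitted flows add up to GDP, in the manner of iterative proportional fitting.

```python
import math

def step2_update(E, T, Y):
    """One pass of Step 2: returns (E_new, M_new) given current E, T, and GDP Y."""
    C = len(Y)
    # M_j^(n+1) = Y_j / sum_i E_i^(n) T_ij
    M_new = [Y[j] / sum(E[i] * T[i][j] for i in range(C)) for j in range(C)]
    # E_i^(n+1) = Y_i / sum_j T_ij * [Y_j / sum_k E_k^(n) T_kj]
    #           = Y_i / sum_j T_ij * M_j^(n+1)
    E_new = [Y[i] / sum(T[i][j] * M_new[j] for j in range(C)) for i in range(C)]
    return E_new, M_new

# Hypothetical inputs; T is held fixed here, i.e. beta is treated as known.
Y = [100.0, 250.0, 180.0, 90.0]
T = [[1.0 if i == j else 0.2 + 0.15 * ((2 * i + j) % 3) for j in range(4)]
     for i in range(4)]

E = [1.0] * 4
for _ in range(500):
    E_next, M = step2_update(E, T, Y)
    change = max(abs(math.log(a / b)) for a, b in zip(E_next, E))
    E = E_next
    if change < 1e-14:
        break

e = [math.log(v) for v in E]  # e_i = ln E_i
m = [math.log(v) for v in M]  # m_j = ln M_j
fitted = [[E[i] * T[i][j] * M[j] for j in range(4)] for i in range(4)]
print([round(sum(row), 6) for row in fitted])
```

In the full estimator, each pass of this update would be followed by a new constrained GLM regression (Step 1) and a recomputed T, so the sketch covers only one half of the iteration.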

2.3 Quasi-differences estimation

The approaches in the previous sections estimate the gravity model by either viewing the fixed

effects as additional parameters to be jointly estimated with β or by obtaining them from the

solution to an economic system of resource constraints. An advantage of these approaches is

that the estimated fixed effects can be used to calculate general equilibrium comparative statics

(see e.g. Fally, 2012, who uses a Poisson estimation with fixed effects to recover parameters

of an Anderson and van Wincoop, 2003, model). Compared to the structural estimation, the

approach of estimating the fixed effects allows one to account for additional unobserved importer- and exporter-specific trade costs not part of the structural importer- and exporter-specific terms

(see e.g. Egger et al., 2012, for an example of such an empirical approach).

An alternative approach is to transform the model to obtain a moment condition that does

not depend on the fixed effects at all. This has the advantage that the number of parameters

to be estimated is reduced substantially which can lead to considerable efficiency gains in the

estimation of β. Since βk (the kth element of β) has the interpretation of the ceteris paribus or

partial equilibrium semi-elasticity of exports with respect to the bilateral trade cost dij,k, the

vector β is often a first-order object of interest in gravity estimation.

In linear models, differencing observations with respect to i and j yields a fixed-effects-

free expression (see for instance Baltagi, 2013, or Cameron and Trivedi, 2005). In nonlinear

models with an exponential mean and one-way fixed effects, a quasi-differencing analog exists:

fixed effects can be eliminated by using i-specific ratios of observations. Charbonneau (2012)

introduced a quasi-differencing approach for exponential mean models with two-way fixed effects,

deriving a moment condition from (1) which does not depend on either ei or mj .

In the context of the gravity model (1), this moment condition requires sets s of 4 country-

pairs involving two exporters (i, l) and two importers (k, j). The two products of exports,

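$$ X_{ik} X_{lj} = \exp[(e_i + e_l) + (m_k + m_j) + (d_{ik} + d_{lj})'\beta] + (\varepsilon_{ik} + \varepsilon_{lj}), \qquad (10) $$

$$ X_{lk} X_{ij} = \exp[(e_i + e_l) + (m_k + m_j) + (d_{lk} + d_{ij})'\beta] + (\varepsilon_{lk} + \varepsilon_{ij}), \qquad (11) $$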

have the same sums of exporter- and importer-specific terms. Dividing by the bilateral-specific

part and taking conditional expectations yields

$$ E\{X_{ik} X_{lj} \exp[-(d_{ik} + d_{lj})'\beta] \mid \cdot\} = \exp[(e_i + e_l) + (m_k + m_j)], \qquad (12) $$

$$ E\{X_{lk} X_{ij} \exp[-(d_{lk} + d_{ij})'\beta] \mid \cdot\} = \exp[(e_i + e_l) + (m_k + m_j)], \qquad (13) $$

where the conditioning variables have been suppressed to avoid cluttered notation.5 Since the

expectations of these terms are the same, a method of moment estimator can be based, for

instance, on the difference (12) minus (13). Specifically, Charbonneau (2012) proposes to use

the unconditional moment conditions

$$ E\{[X_{ik} X_{lj} - X_{lk} X_{ij} \exp((d_{ik} + d_{lj} - d_{lk} - d_{ij})'\beta)]\, (d_{ik} + d_{lj} - d_{lk} - d_{ij})\} = 0 \qquad (14) $$

as a basis for GMM estimation.

5 The conditioning variables are dik, dlj , ei, el,mk,mj in the first equation, and the corresponding variables in the second.

If the data do not contain observations where bilateral export flows are zero, alternative moment conditions can be used. Defining Xs ≡ XikXlj/(XlkXij) and ds = (dik + dlj) − (dlk + dij), one obtains

E(Xs|ds) = exp(d′sβ). (15)

The quasi-differenced model (15) lends itself to estimation by any of the GLM estimators dis-

cussed before.
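The way the quasi-differencing removes both sets of fixed effects can be checked numerically. The sketch below builds noise-free exports from hypothetical values of ei, mj , and a scalar dij , forms Xs and ds for one set of two exporters and two importers, and compares Xs with exp(ds β). The function name is invented for illustration.

```python
import math

def quasi_difference(X, d, i, l, k, j):
    """Return (X_s, d_s) for exporters (i, l) and importers (k, j)."""
    X_s = (X[i][k] * X[l][j]) / (X[l][k] * X[i][j])
    d_s = (d[i][k] + d[l][j]) - (d[l][k] + d[i][j])
    return X_s, d_s

# Noise-free illustration with hypothetical fixed effects and trade costs.
beta = 1.0
e = [0.5, -0.3, 1.2, 0.1]
m = [0.2, 0.9, -0.4, 0.7]
d = [[-0.1 * ((((a + 1) * (b + 2)) % 5) + 1) for b in range(4)] for a in range(4)]
X = [[math.exp(e[a] + m[b] + beta * d[a][b]) for b in range(4)] for a in range(4)]

X_s, d_s = quasi_difference(X, d, i=0, l=1, k=2, j=3)
print(X_s, math.exp(beta * d_s))
```

Changing any element of e or m leaves Xs unchanged, which is the property that (15) exploits. With stochastic errors the equality holds only in the sense of the moment condition, and the ratio is undefined whenever one of the flows in the denominator is zero.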

Even if the country-pair-specific errors εij are independent, the corresponding errors εs =

Xs−E(Xs|ds) will exhibit a mechanical dependence structure as a result of overlaps in exporter

and importer countries in two observations s and s′. Therefore, an appropriate cluster-robust

variance estimator should be adopted when using estimators based on (14) or (15).

The number of possible sets of countries s is very large, so much that for the number of

countries typically used in applications, it might cause computational difficulties. Sacrificing

efficiency, one might only select a subset of all possible sets s. We only use sets for which

i = 1, . . . , C − 2; C − 1 ≥ j ≥ i; l = i + 1; and k ≥ l. This still increases the observations

by an order of magnitude. To get a perspective of the number of sets, for 10 countries (90

country-pairs) this results in 245 sets; for 50 countries (2,450 country-pairs) it results in 55,225

sets.

3 Monte Carlo experiments

If the data are generated from the gravity equation (1) and according to an underlying eco-

nomic model satisfying equations (5), (6), (7), and (9), all estimators discussed in section 2 are

consistent. To explore the performance of the estimators in finite samples, we set up a Monte

Carlo experiment. The questions that we seek to explore are how the relative performance of the

estimators is affected by features of the economic model, by the distribution of the stochastic

errors, and by an increase in the number of countries.

3.1 Data-generating process

The data generating process (DGP) consists of a structural part, in which the bilateral-, exporter-

, and importer-specific determinants of exports are drawn; and a stochastic part, in which random

errors are drawn and joined to the structural part of exports. In terms of the gravity equation,

the structural part is the conditional expectation function given in (1) (denoted by µij), and the

stochastic part is εij .

I. Structural economic model

The structural part of the DGP follows the economic model (5)–(9). Given

(i) a value of the substitution elasticity α,

(ii) endowments Hi for every country (i = 1, . . . , C), and

(iii) bilateral trade cost factors Tij for every country-pair ij,

the model implicitly determines GDP Yi, prices Pi, and structural exports E(Xij) = µij through

general equilibrium constraints. Thus, these variables are determined by α and the joint distri-

bution of Hi and Tij .

Broadly in line with Anderson and van Wincoop (2003) and a host of other estimates in the

literature, the parameter α is set at the value -4 as a baseline. The sensitivity of the gravity

estimators to this parameter is explored by varying α to -9 in an alternative scenario. Ceteris

paribus, a higher elasticity of substitution between goods (higher absolute value of α) leads to

a higher variability of exports while endowments remain constant.

The parameters Hi and Tij are drawn in a way which is consistent with economic theory.

In particular, we set intra-national trade frictions to zero (see Eaton and Kortum, 2002; or

Anderson and van Wincoop, 2003), whereby Tii = 1 for all i. For j ≠ i, Tij ∈ (0, 1]. First, two

auxiliary bivariate normal variables are drawn:

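$$ (z^H_{ij}, z^T_{ij})' \sim N(\mu_z, \Sigma_z), \quad \mu_z = (3, -2)', \quad \Sigma_z = v_{HT} \begin{pmatrix} (3\sqrt{C}/4)^2 & 0.95 \times 15\sqrt{C}/4 \\ 0.95 \times 15\sqrt{C}/4 & 25 \end{pmatrix}. $$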

Then, the parameters of interest are obtained as

$$ H_i = \exp\Big(\sum_j z^H_{ij}/C\Big)\, C > 0 \quad \text{and} \quad T_{ij} = \left( \frac{\exp(z^T_{ij})}{1 + \exp(z^T_{ij})} \right)^{-\alpha/4} \in (0, 1). $$

The number of countries C is incorporated in Hi (and zH) such that the share of a country’s

endowment in world endowment remains constant as C increases. With a baseline of 10 countries,

the correlation between Hi and Tij is about 25% and weakens to 12% as C increases to 50.

Endowments Hi have a mean of about 200.
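The draws of endowments and trade costs can be sketched as below. This is one reading of the description above, not the authors' code: a separate pair of auxiliary variables is drawn for every ordered country pair (so T need not be symmetric), the correlation of 0.95 is imposed through a Cholesky factor, and the function name and seed are hypothetical.

```python
import math
import random

def draw_endowments_and_costs(C=10, alpha=-4.0, v_HT=0.1, rho=0.95, seed=0):
    """Draw H_i and T_ij from the auxiliary bivariate normal (T_ii = 1)."""
    rng = random.Random(seed)
    sd_H = math.sqrt(v_HT) * 3.0 * math.sqrt(C) / 4.0  # sqrt of Sigma_z[1,1]
    sd_T = math.sqrt(v_HT) * 5.0                        # sqrt of Sigma_z[2,2]
    zH = [[0.0] * C for _ in range(C)]
    zT = [[0.0] * C for _ in range(C)]
    for i in range(C):
        for j in range(C):
            u1 = rng.gauss(0.0, 1.0)
            u2 = rng.gauss(0.0, 1.0)
            zH[i][j] = 3.0 + sd_H * u1
            # Cholesky construction: corr(zH, zT) = rho.
            zT[i][j] = -2.0 + sd_T * (rho * u1 + math.sqrt(1.0 - rho ** 2) * u2)
    # Endowments: H_i = exp(mean_j zH_ij) * C.
    H = [math.exp(sum(zH[i]) / C) * C for i in range(C)]
    # Trade cost factors: logistic transform raised to the power -alpha/4.
    T = [[1.0 if i == j else
          (math.exp(zT[i][j]) / (1.0 + math.exp(zT[i][j]))) ** (-alpha / 4.0)
          for j in range(C)] for i in range(C)]
    return H, T

H, T = draw_endowments_and_costs()
print([round(h, 1) for h in H])
```

The implied covariance is sd_H × sd_T × 0.95 = vHT × 0.95 × 15√C/4, matching the off-diagonal element of Σz. Moments computed from a single draw with C = 10 will of course differ from the population values quoted in the text.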

The parameter vHT is a scaling constant for the variance of Hi and Tij which is set equal to

0.1 at baseline. This produces a variance of about 2,200. In an alternative scenario, we increase

the cross-country inequality in endowments by setting vHT = 0.3 which increases the variance

to about 7,800. Similar to the increase in |α| as discussed above, more inequality in endowments

(higher vHT ) leads to more variability in exports as well. However, it also leads to higher average

exports. The resulting coefficient of variation of exports is even slightly lower (1.37) than in

the baseline (1.5), whereas the increase in |α| is associated with a significantly higher coefficient

of variation (2.0). Thus, the challenge of the experiment of increasing |α| lies in the increased

dispersion of exports, whereas the challenge of increasing vHT lies in the greater variance of the

fixed effects.

Finally, we parameterize Tij by a single observable trade cost parameter dij for each ij as

lnTij = β0 + β1dij , (16)

with β0 = 0 and β1 = 1.

II. Stochastic shocks

Observed exports are obtained by joining stochastic shocks with the mean of exports, µij , as de-

termined by the structural model: Xij = µijηij . Provided the errors ηij are mean-independent,

E(ηij) = 1, all considered estimators are consistent. Their asymptotic efficiency depends on

the variance of ηij . We consider six different variance functions for ηij . Five of these imply

asymptotic efficiency of the Poisson, Gamma, Negative Binomial, Gaussian, and Inverse Gaus-

sian estimators, respectively. The sixth variance function of ηij is not optimal for any of these

five estimators.

The errors ηij are drawn from a heteroskedastic log-normal distribution

$$ \eta_{ij} = \exp(z^{\eta}_{ij}), \qquad z^{\eta}_{ij} \sim N(-0.5\,\sigma^2_{\eta}, \sigma^2_{\eta}). $$

Hence, E(ηij) = 1 and V(ηij) = exp(σ2η) − 1. Since V(Xij |dij , ei,mj) = µ2ij V(ηij), we determine

σ2η as detailed in Table 2. The term “Optimal estimator” in the table refers to asymptotic efficiency. This does not exclude the possibility of poor small-sample performance. While the sixth (dubbed “Other”) variance function we consider does not correspond to any one of the

discussed estimators, we note that among the estimators of Table 2 it is ‘closest’ to the one of

Poisson for large values of µij .

Table 2: Variance functions in Monte Carlo simulations

Number   σ2η                    V(ηij)            V(Xij)          Optimal estimator
1        ln(2)                  1                 µij^2           Gamma
2        ln(1 + µij^(−1))       µij^(−1)          µij             Poisson
3        ln(2 + µij^(−1))       1 + µij^(−1)      µij + µij^2     NegBin
4        ln(1 + µij^(−2))       µij^(−2)          1               Gaussian
5        ln(1 + µij)            µij               µij^3           Inv. Gaussian
6        ln(1 + µij^(−1.5))     µij^(−1.5)        √µij            Other
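The mapping from a target variance function to the log-normal parameter can be sketched as follows. The code assumes the relation σ2η = ln(1 + V(ηij)), which is what the entries of Table 2 imply, and it omits the subsequent rescaling to a common pseudo-R2. The dictionary and function names are hypothetical.

```python
import math
import random

# V(eta_ij) as a function of mu_ij for the six cases in Table 2.
ETA_VARIANCE = {
    1: lambda mu: 1.0,             # Gamma
    2: lambda mu: mu ** -1,        # Poisson
    3: lambda mu: 1.0 + mu ** -1,  # NegBin
    4: lambda mu: mu ** -2,        # Gaussian
    5: lambda mu: mu,              # Inverse Gaussian
    6: lambda mu: mu ** -1.5,      # Other
}

def sigma2_eta(case, mu):
    """sigma_eta^2 = ln(1 + V(eta)), so a log-normal eta has the target variance."""
    return math.log(1.0 + ETA_VARIANCE[case](mu))

def draw_eta(case, mu, rng):
    """Draw eta = exp(z), z ~ N(-sigma^2/2, sigma^2), which has E(eta) = 1."""
    s2 = sigma2_eta(case, mu)
    return math.exp(rng.gauss(-0.5 * s2, math.sqrt(s2)))

# Check case 2 (Poisson-type variance) at an illustrative mean of mu = 2.
rng = random.Random(0)
mu = 2.0
draws = [draw_eta(2, mu, rng) for _ in range(200000)]
mean = sum(draws) / len(draws)
var = sum((x - mean) ** 2 for x in draws) / len(draws)
print(round(mean, 3), round(var, 3))  # should be close to 1 and 1/mu
```

Multiplying each draw by µij then gives exports with mean µij and variance µ2ij V(ηij), i.e., the column V(Xij) of the table.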

In every case, we rescale the errors ηij to achieve a constant pseudo-R2 = V(µij)/[V(µij) +

V(εij)] of about 50% for all scenarios and variance functions. The errors εij in this pseudo R2

are the implicit additive errors Xij − µij .

III. Specifications and estimators

For estimation, we consider four alternative specifications:

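$$ \text{S1:} \quad X_{ij} = \exp(\beta_{10} + \beta_{11} d_{ij} + e_i + m_j) + \varepsilon_{1ij}, \qquad (17) $$

$$ \text{S2:} \quad X_{ij} = \exp\Big(\beta_{20} + \beta_{21} d_{ij} + \sum_{i=2}^{C} e_{2i} D_i + \sum_{j=2}^{C} m_{2j} D_j\Big) + \varepsilon_{2ij}, \qquad (18) $$

$$ \text{S3:} \quad X_{ij} = \exp(\beta_{30} + \beta_{31} d_{ij} + \beta_{32} y_i + \beta_{33} y_j) + \varepsilon_{3ij}, \qquad (19) $$

$$ \text{S4:} \quad X_{ij} = \exp(\beta_{40} + \beta_{41} d_{ij}) + \varepsilon_{4ij}. \qquad (20) $$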

All specifications include a constant, denoted by {β10, β20, β30, β40}. The object of interest is β1 from (16), which here is estimated by the coefficients on dij, {β11, β21, β31, β41}. S1 is a structural model which includes the true terms ei and mj with their parameters constrained to unity, in line with economic theory. The true value of β11 is unity, but the parameter is estimated freely. S2 is a two-way fixed effects model which estimates e2i and m2j as country-specific constants through exporter and importer indicator variables Di and Dj; the true value of β21 is again unity. Hence, S2 estimates ei + mj by 2(C − 1) constants rather than by structural constraints as in S1, which makes it less efficient than S1. S3 is an old-fashioned ad-hoc gravity model which replaces {ei, mj} by log exporter and importer GDP, {yi, yj}, and estimates parameters {β32, β33} on them, about which we do not have priors. Clearly, this model is misspecified, as {yi, tij, yj} are correlated with ε3ij. However, given its vast application in the past, it is interesting to see how this model fares in a laboratory experiment relative to S1 and S2. Finally, S4 is an ad-hoc model which only includes tij apart from a constant. For the same reason as S3, this model is misspecified, as tij is correlated with ε4ij. Notice that while both S3 and S4 are misspecified, the problem that E[ei dij] ≠ 0 and E[mj dij] ≠ 0 vanishes as the number of countries C grows. Since the number of countries is large empirically, the bias of β due to omitting ei + mj should be small in theory (even though countries vary a lot in terms of their size). This issue is commonly disregarded. Finally, we also estimate two quasi-differences models, based on equations (14) and (15).
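To make the structure of the fixed effects specification (18) concrete, the following Python sketch builds the constant and the exporter and importer indicators for all country pairs with i ≠ j, using country 1 as the reference in both dimensions, as implied by the summation bounds in (18). The distance regressor dij is left out, and the function name `two_way_fe_design` is an illustrative assumption.

```python
def two_way_fe_design(C):
    """Rows of [1, D_2..D_C (exporter), D_2..D_C (importer)] for all pairs i != j."""
    pairs = [(i, j) for i in range(1, C + 1) for j in range(1, C + 1) if i != j]
    rows = []
    for i, j in pairs:
        exporter = [1 if i == k else 0 for k in range(2, C + 1)]
        importer = [1 if j == k else 0 for k in range(2, C + 1)]
        rows.append([1] + exporter + importer)
    return pairs, rows

pairs, rows = two_way_fe_design(10)
n_obs = len(rows)              # C * (C - 1) country pairs without intra-national trade
n_dummies = len(rows[0]) - 1   # (C - 1) exporter plus (C - 1) importer indicators
```

With C = 10 this yields 90 rows and 18 indicator columns, so the fixed effects absorb a sizeable share of the available degrees of freedom in the small-sample design.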

For every specification S1-S4 we use the five GLM estimators based on the Gamma, Poisson,

NegBin, Gaussian, and Inverse Gaussian LEF family. In our baseline scenario α = −4, vHT =

0.1, and C = 10. Since we disregard intra-national trade in estimation (as is commonly the case


in applied work), C = 10 results in 90 observations. We consider three alternative scenarios: a

higher value of |α| with α = −9, a higher value of vHT with vHT = 0.3, and a larger number

of countries with C = 50 (2,450 observations). In each alternative scenario, we leave the two

remaining parameters of {α, vHT , C} at baseline values. Results of a fourth alternative with

α = −2 are displayed in Table 9 in the Appendix.
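The five GLM estimators share the same exponential conditional mean and differ only in the variance function V(µ) that weights the first-order conditions. A minimal sketch of this idea, assuming a log link, unit dispersion, and a model with just a constant and one regressor (as in S4), is the following Fisher scoring routine. It is a toy illustration with noise-free made-up data, not the estimation code behind the tables, and the function name `fit_log_link_glm` is hypothetical.

```python
import math

# Variance functions V(mu) of the five GLM families (unit dispersion assumed).
VARIANCE = {
    "gamma": lambda m: m ** 2,
    "poisson": lambda m: m,
    "negbin": lambda m: m + m ** 2,
    "gaussian": lambda m: 1.0,
    "inv_gaussian": lambda m: m ** 3,
}

def fit_log_link_glm(x, d, family, iterations=100):
    """Fisher scoring for E[x | d] = exp(b0 + b1 * d) with variance function V(mu)."""
    b0, b1 = math.log(sum(x) / len(x)), 0.0   # start at the constant-only fit
    for _ in range(iterations):
        s0 = s1 = h00 = h01 = h11 = 0.0
        for xi, di in zip(x, d):
            mu = math.exp(b0 + b1 * di)
            v = VARIANCE[family](mu)
            score_w = (xi - mu) * mu / v      # (x - mu) * dmu/deta / V(mu)
            info_w = mu * mu / v              # (dmu/deta)^2 / V(mu)
            s0 += score_w
            s1 += score_w * di
            h00 += info_w
            h01 += info_w * di
            h11 += info_w * di * di
        det = h00 * h11 - h01 * h01           # solve the 2x2 system by hand
        step0 = (h11 * s0 - h01 * s1) / det
        step1 = (h00 * s1 - h01 * s0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < 1e-12:
            break
    return b0, b1

d = [-1.0, -0.5, 0.0, 0.5, 1.0]
x = [math.exp(0.5 + 1.0 * di) for di in d]    # noise-free data, true slope of 1
b0_p, b1_p = fit_log_link_glm(x, d, "poisson")
b0_g, b1_g = fit_log_link_glm(x, d, "gamma")
```

Without noise, every family recovers the same coefficients; the estimators only start to differ once errors are added, because each family then gives different weight to observations with large or small µ.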

3.2 Simulation results

All statistics presented throughout this subsection are based on 1,000 replications of the DGP.

Baseline scenario α = −4, vHT = 0.1, C = 10

Table 3 presents our main results for iterative structural (IS) estimation (dubbed S1 above), fixed effects (FE) estimation (dubbed S2 above), and quasi-differences (QD) estimation based on equations (14) and (15).

Estimators are grouped in columns. Six panels present statistics for the estimated β1 under the

six different variance functions of the errors in the DGP: the number of convergences achieved out

of 1,000 runs (CR), the mean of β1 over converged estimations (Mean), the standard deviation

(SD), median (Med), and the 5th and 95th percentile of the distribution of estimates.
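For readers who wish to reproduce this kind of summary, the sketch below computes the six statistics from a list of replication results in which non-converged runs are recorded as `None`. The choice of the sample (rather than population) standard deviation and of the "inclusive" percentile method are assumptions of this sketch, and the input values are made up.

```python
import statistics

def summarize(estimates):
    """estimates: one entry per replication, None where estimation did not converge."""
    converged = sorted(e for e in estimates if e is not None)
    # With n=20, statistics.quantiles returns the 5%, 10%, ..., 95% cut points.
    cuts = statistics.quantiles(converged, n=20, method="inclusive")
    return {
        "CR": len(converged),
        "Mean": statistics.fmean(converged),
        "SD": statistics.stdev(converged),
        "Med": statistics.median(converged),
        "5th": cuts[0],
        "95th": cuts[-1],
    }

estimates = [0.90, 1.10, None, 1.00, 0.95, 1.05, None]  # made-up replication results
stats = summarize(estimates)
```

Because all statistics are computed over converged replications only, a low CR should be read alongside the other rows: a well-centered mean based on few converged runs is less informative than it looks.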

A glance at the results for the IS estimators reveals that despite the small sample size of

only 90 country-pairs, exploiting all available economic structure of the DGP results in extremely

precise estimates. All estimators are virtually unbiased (mean equal to 1) and display standard

deviations which more often than not are smaller than the two decimal places we report in the

table. The excellent performance of the IS estimators in this case stems from the fact that they

specify the model absolutely correctly, and that the DGP with a single well-behaved explanatory

variable dij is very simple. In this case, the IS estimators may serve as a benchmark against which

the other estimators can be measured. The only exception to the outstanding performance of

the IS estimators lies in the inverse Gaussian (IS-IG) estimator’s serious convergence difficulties.

With a convergence failure rate as high as 40%, this issue points to major numerical difficulties

of this estimator in handling the optimization process reliably for DGPs other than its optimal

variance function.

In the group of FE estimators, the performances of Poisson (FE-P) and Negative Binomial

(FE-NB) come closest to the one of the IS estimators. They display no bias, and present small

standard deviations for FE estimators, lagging only behind the estimator that specifies the variance process correctly. While the Gamma FE estimator (FE-Gam), too, has a similar dispersion,

it is slightly biased upwards in most DGPs. In contrast, the Gaussian –i.e., Nonlinear Least

Squares– (FE-Gau) and inverse Gaussian estimators (FE-IG) display more serious problems.


FE-Gau’s mean is seriously biased upwards under the Gamma and Negative Binomial variance

function DGPs. The problem is not just one of a few outliers, as can be seen from FE-Gau’s

median which does not seem much better-behaved. FE-IG’s performance is dismal. Its mean

is substantially distorted and so is its median. Only if the variance function of exports is cubic

(i.e., with Variance function 5 in the table) does FE-IG produce good results (indeed, in this

case FE-IG provides the best performance among FE estimators, as one would expect).

The last two columns of Table 3 give descriptive statistics of two quasi-differences (QD)

estimators’ small-sample distribution in these DGPs. Using 245 observations of country-tetrads, a two-step GMM estimator (QD-GMM) and a simple Poisson estimator (QD-P) are used. The

performance of QD-GMM is clearly superior to QD-P. Its mean and median are both centered

at the true value of 1. Its standard deviation tends to be larger than that of the better FE

estimators, though. On the other hand, QD-P is visibly biased in some DGPs.

As the results from the alternative scenarios will show, these results are quite robust and to

a large degree they remain unchanged throughout the alterations that the DGP is subjected to.

Finally, results of the baseline DGP for these IS, FE, and QD estimators can be compared to

the inconsistent ad-hoc approaches. The results are displayed in Table 4. The first five columns

contain results for GLM estimators using log-GDP (yi and yj) as proxies for ei and mj (dubbed

S3 above); the last five columns contain results from GLM regressions of Xij on dij without

any additional regressors (dubbed S4 above). Table 4 illustrates that large biases can arise from

the omission and incorrect treatment of fixed effects. We will not consider these estimators any

further in detail. It will suffice to say that the biases in the alternative scenarios with increased

α or vHT are substantially larger than those in this baseline. On the other hand, the biases

decrease with a larger sample size; a consequence of the fact that in our structural model the

correlation between endowments and bilateral trade costs decreases in a growing world (i.e.,

with more trade partners becoming available).6

6 We would like to emphasize this point and put it in context. Anderson and van Wincoop (2003) correctly remarked that omitting the structural country-specific effects ei + mj from the estimation in gravity models is at odds with economic theory. Parameter bias of β may be a consequence, since E[ei dij] ≠ 0 and E[mj dij] ≠ 0, as we remarked above. However, with a large pair-specific component in trade costs (as documented in Egger and Nigai, 2014) and more than 200 countries in the world economy, the bias of β from omitting ei + mj in estimation should be small, especially if the observations (country-pairs) are weighted equally, as is the case with Poisson estimation. To some extent, this may be viewed as a potential rehabilitation of ad-hoc gravity models as had been used in the past.

Higher elasticity of substitution α = −9, and higher variance of endowments, vHT = 0.3

Results for the DGP with α = −9 are displayed in Table 5. As discussed above, this experiment increases the variability of trade flows by making countries react more strongly to price differences of traded goods, while the world endowment is held constant. In general, this is a more challenging DGP for the estimators. The IS estimators maintain their good performance, but convergence failures become more prevalent.

The IG estimator, both in its IS and its FE variant, cannot be used. Its convergence rates are

close to 0%, and even the instances recorded as converged are likely to be convergence failures

as the estimates were extremely large numbers. We have marked such cases where the statistic was not credible with the symbol “†”.

Among the FE estimators, the performances broadly echo the baseline. FE-P and FE-NB

continue to show favorable performances, and likewise FE-Gau continues having difficulties in

the Gamma and Negative Binomial DGP. The most striking result is the stark deterioration

of FE-Gam, which now only shows acceptable properties in the Gamma and Inverse Gaussian

DGPs, while having up to 30% bias in the median in the other DGPs.

In contrast, QD-GMM is only slightly negatively affected by this DGP relative to the baseline.

Similarly, there is not much difference in QD-P’s medians. Its means, however, seem quite

distorted by outliers.

By increasing vHT to 0.3, the variability of the endowments (and, hence, of the fixed effects

ei and mj) is increased. The simulation results in Table 6 suggest that this kind of change

can be handled better by the estimators than the increase in the substitution elasticity; the IS

estimators, for instance, have a visibly better convergence rate. The IS-IG estimator works quite

reliably; however, its FE-IG counterpart shows the same poor performance as before.

Higher number of countries C = 50

Finally, a much larger sample is considered. With C = 50, one obtains 2,450 observations on

country-pairs that can be used in estimation. This change in the DGP can help in assessing to

what extent the problems described above are small-sample difficulties that can be resolved with

more observations. The results in Table 7 indicate that by and large most of the issues indeed

vanish when data on more countries are available. FE-Gam, FE-P and FE-NB all have little to

no bias and standard deviations which are often not far from IS. The average bias of FE-Gau

is substantially smaller, and even more so is its median bias. However, the bias is still visible,

and this suggests that, while vanishing asymptotically, it can be quite persistent. At C = 50,


QD-GMM has essentially zero remaining small-sample bias. Its standard deviation is also small,

although it is most often larger than that of FE-Gam, FE-P, and FE-NB. It seems unlikely

that the two-step QD-GMM will catch up and overtake this group of FE estimators in terms of

efficiency with further increases in sample size. However, we only used a small fraction of the

available sets of country-tetrads s, and further efficiency gains might be achieved by increasing

the number of sets s included in estimation. The estimator FE-IG cannot be recommended, in

general. It is the only FE estimator exhibiting tremendous biases in both mean and median.

QD-P cannot be recommended either. Worryingly, some of the biases are larger than in the case

of C = 10, which suggests that QD-P and estimators like it might not have moments even in

larger samples (at least not for the DGPs we considered), or that the rate of convergence is very

slow.

4 Application

The purpose of this section is to apply the models discussed in the previous section to data. For

this, we employ cross-sectional bilateral export data in nominal U.S. dollars from the United

Nations’ Comtrade Database among 94 countries in the year 2008 and combine them with data

on bilateral trade costs from the Centre d’Études Prospectives et d’Informations Internationales’

Geographical Database (GeoDist Database, 2013). In particular, we use two variables from the

latter database: bilateral distance (entitled dist in the variable list) and common language

(entitled comlang_off in the variable list). Rather than using (log) distance in the original

format, we generate five indicator variables based on the quintiles of log distance (=1 if a pair

ij exhibits distance in that quintile, =0 else). This strategy is akin to, e.g., Eaton and Kortum

(2002). Then, we use these five indicators (we suppress the fifth quintile as the reference category in order to estimate a parameter on the constant) and additionally interact them with the language

indicator. Altogether the model then includes nine arguments in the trade cost function: four

distance quintile main indicator variables and five interacted distance quintile with language

indicator variables:

$$T_{ij} = \exp\Big(\sum_{d=1}^{4} \beta^{D}_{d}\, \text{Dist}_{d,ij} + \sum_{d=1}^{5} \beta^{L}_{d}\, \text{Dist}_{d,ij}\, \text{Lang}_{ij}\Big) \tag{21}$$

While this model seems parsimonious (it excludes other candidates from the trade cost function

such as cultural, economic, historical, and institutional similarity indicators), it captures many

of those aspects due to their collinearity with geographical proximity. In any case, the analysis

here is meant to be illustrative for applied researchers, and the trade cost function could be

altered at discretion.
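The construction of the nine arguments of the trade cost function in (21) can be sketched in Python as follows. The rank-based quintile rule and the distances used below are assumptions made for this illustration only; they are not taken from the GeoDist data.

```python
import math

def quintile_index(value, sample):
    """Return 1..5: the quintile of `value` within `sample` (rank-based rule)."""
    rank = sum(1 for v in sample if v <= value)   # between 1 and len(sample)
    return min(5, math.ceil(5 * rank / len(sample)))

def trade_cost_regressors(log_dist, lang, all_log_dist):
    """Nine regressors: Dist_1..Dist_4, then Dist_1*Lang..Dist_5*Lang."""
    q = quintile_index(log_dist, all_log_dist)
    main = [1 if q == k else 0 for k in range(1, 5)]           # fifth quintile omitted
    interacted = [lang if q == k else 0 for k in range(1, 6)]  # all five quintiles
    return main + interacted

# Made-up bilateral distances in kilometres, for illustration only.
sample = [math.log(km) for km in
          (100, 300, 800, 1500, 2500, 4000, 6000, 9000, 12000, 18000)]
near_same_lang = trade_cost_regressors(math.log(100), 1, sample)
far_same_lang = trade_cost_regressors(math.log(18000), 1, sample)
```

A pair in the fifth distance quintile without a common language has all nine regressors equal to zero, so its trade costs are captured by the constant alone.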


Beyond those variables, all considered GLM models include a constant and some include

exporter and importer country fixed effects to estimate ei and mj (referred to as Fixed-effects

Models) while others include the iteratively-determined structural counterparts to ei and mj as

functions of the estimates Tij (referred to as Iterative-structural Models). Similarly, we tried to

estimate two versions of the quasi-differenced estimator with differenced-out fixed effects, one

relying on GMM (QD-GMM ) and one on Poisson-type estimation (QD-P). QD-GMM failed to

converge and we thus only report results for QD-P.

Table 8 summarises the parameter estimates and robust standard errors (in parentheses)

for all considered models7 and, at the bottom of the table, the information on the Akaike

information criterion for the GLM models. The latter is a particularly meaningful statistic with

gravity models as considered here, since the iterative-structural (IS) models are nested in (are

constrained versions of) the fixed-effects (FE) models.

The parameter results may be summarised as follows. First, we observe a relatively stark

difference among the parameter estimates across the considered estimators. Only a rough pattern

is common to all estimators: Trade between countries in the first quintile of distance is higher

by orders of magnitude relative to the more distant countries. Distant country-pairs that share

a common language tend to trade more than those with different languages; but country-pairs in

the first distance quintile that share the same language trade only about half as much as those

with different languages.

From our Monte Carlo simulations in the previous section, we would conclude that the

differences across estimators can hardly be due to a misspecification of the variance process

alone, since there was virtually no bias in much smaller samples considered before. In any

case, it turns out that the Gamma-GLM and the NegBin-GLM estimators obtain very similar

parameter estimates whereas those for Gaussian-GLM and Poisson-GLM are relatively different.

The Akaike Information Criterion suggests that the Gamma-GLM performs best among both

the FE and the IS GLMs each. But NegBin-GLM is very close to Gamma-GLM in that regard.

Hence, in view of the similarity in the parameter estimates and the comparably low values of

the Akaike information Criterion, Gamma-GLM and NegBin-GLM seem to be the preferred

estimators with the data and specification at hand.

7 The standard errors for the QD-P estimator are derived from resampling the data 100 times for subsamples of one-quarter of the number of countries.

Let us define the Pearson residuals of a GLM model of the family type ℓ as

$$\varepsilon^{\text{Pearson}}_{\ell,ij} = \frac{X_{ij} - \mu_{ij}}{\sigma_{\ell X,ij}}, \tag{22}$$

where $\sigma_{\ell X,ij}$ is the square root of the variance function evaluated at the estimated value of the conditional mean, µij. With a proper specification of the variance function, it should be the case that $\varepsilon^{\text{Pearson}}_{\ell,ij}$ is independent of µij. This is illustrated for the four FE GLMs by way

of scatterplots in Figure 1. In the scatterplots we include a linear regression line to visualise

the relative mean-dependence of the Pearson residuals. If the variance process was correctly

specified, we would expect a horizontal line, i.e., (mean-)independence of the Pearson residuals of µij. This is largely the case with Gamma-GLM and NegBin-GLM and not so with Poisson-GLM and Gaussian-GLM. Hence, the Pearson residual plots provide further evidence against

the latter two models.
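The diagnostic can be sketched in a few lines of Python: compute the Pearson residuals under a given variance function and regress them on the fitted means. The sketch assumes unit dispersion and that the regression line is fitted against µij itself (rather than, say, its logarithm); both are assumptions, as are the toy numbers.

```python
import math

# Variance functions of the four families considered in the application
# (unit dispersion assumed).
VARIANCE = {
    "gamma": lambda m: m ** 2,
    "poisson": lambda m: m,
    "negbin": lambda m: m + m ** 2,
    "gaussian": lambda m: 1.0,
}

def pearson_residuals(x, mu_hat, family):
    """(x - mu) divided by the square root of the family's variance function."""
    v = VARIANCE[family]
    return [(xi - mi) / math.sqrt(v(mi)) for xi, mi in zip(x, mu_hat)]

def ols_slope(u, r):
    """Slope of the least-squares line of r on u (the line drawn in a scatterplot)."""
    u_bar, r_bar = sum(u) / len(u), sum(r) / len(r)
    num = sum((ui - u_bar) * (ri - r_bar) for ui, ri in zip(u, r))
    den = sum((ui - u_bar) ** 2 for ui in u)
    return num / den

mu_hat = [1.0, 2.0, 4.0]   # made-up fitted means
x = [2.0, 4.0, 8.0]        # made-up outcomes whose errors scale with mu
slope_gamma = ols_slope(mu_hat, pearson_residuals(x, mu_hat, "gamma"))
slope_poisson = ols_slope(mu_hat, pearson_residuals(x, mu_hat, "poisson"))
```

In this toy example the deviations are proportional to µ, so the Gamma residuals show a flat line while the Poisson residuals trend upwards, which is the pattern one would read as evidence against the Poisson variance function.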

Furthermore, we may scrutinize the question of the proper specification of the variance

function by way of so-called deviance residuals:

$$\varepsilon^{\text{deviance}}_{\ell,ij} = \operatorname{sign}(X_{ij} - \mu_{ij}) \sqrt{2[f_\ell(X_{ij}) - f_\ell(\mu_{ij})]\phi}, \tag{23}$$

where fℓ(·) measures the linear-exponential family-ℓ conditional density evaluated at the argument. The statistic $\varepsilon^{\text{deviance}}_{\ell,ij}$ should have mean zero and be approximately normally distributed

for any GLM of family type ℓ. We shed light on this issue for the FE GLMs in Figure 2.8 To

facilitate the readability, we present kernel density plots of εdevianceℓ,ij illustrated by a black dashed

curve and add a normal density plot based on the same variance as model ℓ. Figure 2 suggests

that, among the considered models, NegBin-GLM performs best, followed by Poisson-GLM.

Hence, there appears to be some indication of a larger degree of misspecification of the variance

function for Gamma-GLM than for NegBin-GLM. Overall, Figure 2 in conjunction with Figure

1 and the Akaike Information Criterion from Table 8 leads us to classify NegBin-GLM as the

preferred model for the data and specification at hand.
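For concreteness, deviance residuals for three of the families can be computed as in the Python sketch below, which uses the textbook unit deviances of the Gaussian, Poisson, and Gamma families and sets the dispersion to one. The Negative Binomial case is omitted for brevity, and the function names are illustrative assumptions rather than the code used for Figure 2.

```python
import math

def unit_deviance(x, mu, family):
    """Textbook unit deviance d(x, mu) with the dispersion set to one."""
    if family == "gaussian":
        return (x - mu) ** 2
    if family == "poisson":
        log_term = x * math.log(x / mu) if x > 0 else 0.0   # 0 * log(0) := 0
        return 2.0 * (log_term - (x - mu))
    if family == "gamma":                                    # requires x > 0
        return 2.0 * ((x - mu) / mu - math.log(x / mu))
    raise ValueError(f"unknown family: {family}")

def deviance_residual(x, mu, family):
    """Signed square root of the unit deviance."""
    d = max(unit_deviance(x, mu, family), 0.0)   # guard against tiny negative rounding
    return math.copysign(math.sqrt(d), x - mu)

res_zero_flow = deviance_residual(0.0, 2.0, "poisson")   # a zero trade flow
res_gamma = deviance_residual(2.0, 1.0, "gamma")
```

The Poisson deviance is defined for zero trade flows, whereas the Gamma deviance requires strictly positive exports, a practical difference worth keeping in mind when comparing the density plots across families.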

8 Corresponding figures for the IS estimators can be found in the Appendix.

5 Conclusions

This paper addresses issues in the application of generalized linear models for the estimation of (structural) gravity models of bilateral international trade. The current status of research on the matter is the following. First, in a host of theoretical models, bilateral trade flows are structurally determined by an exponential function based on a log-linear index (see Arkolakis, Costinot, and Rodríguez-Clare, 2012). Second, it has been remarked and well received that exponential-family, generalized-linear models rather than log-linearized models should be employed for consistent estimation (see Santos Silva and Tenreyro, 2006). The literature appears to favor Poisson-, Gamma-, and to a lesser extent Gaussian-type GLMs in both analysis and application. Other approaches such as the inverse Gaussian or the Negative Binomial model tend to be ignored. The Gaussian and the Negative Binomial models are even deemed improper (see Bosquet and Boulhol, 2014; Head and Mayer, 2014). Yet other approaches such as quasi-differencing and generalized method of moments estimation have not been considered at all. As to model selection, researchers are either advised to resort to goodness-of-fit measures (see Santos Silva and Tenreyro, 2006) or given no strong guidance at all. In terms of the small-to-medium-sample analysis provided in earlier work, the data-generating process has never been selected in accordance with structural models of bilateral trade, and little is known as to how fixed-country-effects estimators fare relative to structural-iterative models.

This paper takes the generic gravity model of bilateral trade literally and focuses on data-

generating processes that are fully aligned with general-equilibrium or resource constraints

present in such models. The paper presents a rich set of Monte Carlo results for various sizes

of the world economy (in terms of the number of countries) and various assumptions about the

error process. Moreover, the paper takes those insights to cross-sectional data for the year 2008

and illustrates issues with the model selection.

The main insights of the paper are the following. First, we find that the Poisson and Negative

Binomial quasi-maximum likelihood estimators as well as the quasi-differenced GMM estimators

appear to be the best all-round estimators for small as well as larger sample cases, and for

various stochastic processes. However, we encountered difficulties with the GMM estimator in

our application with year 2008 data. For the chosen specification, the Negative Binomial model

was the preferred model. Conclusions drawn in earlier research about the inappropriateness

of the Negative Binomial model should be taken with a grain of salt: they relate to a two-

step estimation approach towards estimation of the variance and the other model parameters

(see Bosquet and Boulhol, 2014). The problems related to this approach do not emerge with

straightforward one-step estimation. An appealing feature of the Negative Binomial estimator

is that it provides a weighted average of the Poisson and the Gamma models.

A further insight is that the iterative-structural GLM estimators perform better in the simulations than fixed-country-effects estimators due to their greater parsimony. This result is somewhat different from the analysis in Head and Mayer (2014), who find their “structurally iterated least squares” (SILS) estimators to be less efficient than the least squares dummy variables estimator. The difference from the results in this paper may stem from the linearity of the models considered in Head and Mayer (2014), or from the different algorithms that SILS and IS-GLM estimators are based on.

The Monte Carlo simulation revealed potentially serious problems with the quasi-differenced Poisson model which is based on ratios of bilateral exports.9 Likewise, the poor performance of the inverse Gaussian model emerged both in the Monte Carlo simulations with non-inverse-Gaussian data and in the application with real data. It is possible that the problems associated with the Inverse Gaussian estimator are to some degree numerical, though; further research is needed to determine whether using better starting values and optimizing algorithms might improve this estimator’s performance.

9 Some of these issues might carry over to the tetradic-differenced estimators as discussed in Head and Mayer (2014) which are also based on ratios of exports.

References

1. Anderson, James E. and Eric van Wincoop, 2003. Gravity with gravitas: a solution to the border puzzle. American Economic Review 93(1), 170-192.

2. Anderson, James E. and Yoto V. Yotov, 2012. Gold standard gravity. Boston College Working Papers in Economics 795.

3. Arkolakis, Costas, Arnaud Costinot, and Andrés Rodríguez-Clare, 2012. New trade models, same old gains? American Economic Review 102(1), 94-130.

4. Baier, Scott L. and Jeffrey H. Bergstrand, 2009. Bonus vetus OLS: A simple method for approximating international trade-cost effects using the gravity equation. Journal of International Economics 77(1), 77-85.

5. Baltagi, Badi H., 2013. Econometric Analysis of Panel Data, 5th edition. John Wiley and Sons.

6. Baltagi, Badi H., Peter H. Egger, and Michael Pfaffermayr, 2003. A generalized design for bilateral trade flow models. Economics Letters 80(3), 391-397.

7. Baltagi, Badi H., Peter H. Egger, and Michael Pfaffermayr, 2014. Panel data gravity models of international trade. CESifo Working Paper No. 4616, to appear as Chapter 20 of The Oxford Handbook of Panel Data, edited by Badi H. Baltagi.

8. Bergstrand, Jeffrey H., Peter H. Egger, and Mario Larch, 2013. Gravity redux: estimation of gravity-equation coefficients, elasticities of substitution, and general equilibrium comparative statics under asymmetric bilateral trade costs. Journal of International Economics 89(1), 110-121.

9. Bosquet, Clément, and Hervé Boulhol, 2014. Applying the GLM variance assumption to

overcome the scale-dependence of the Negative Binomial QGPML estimator. Econometric

Reviews 33(7), 772-784.

10. Cameron, A. Colin and Pravin K. Trivedi, 2005. Microeconometrics: Methods and Applications. Cambridge University Press.

11. Charbonneau, Karyne B., 2012. Multiple Fixed Effects in Nonlinear Panel Data Models –

Theory and Evidence. Unpublished Manuscript, Princeton University.

12. Eaton, Jonathan, and Samuel S. Kortum, 2002. Technology, Geography, and Trade.

Econometrica, 70, 1741-1779.

13. Egger, Peter H., Mario Larch, and Kevin Staub, 2012. The log of gravity with correlated

sectors: An application to structural estimation of bilateral goods and services trade.

Revised version of CEPR Discussion Paper No. 9051.

14. Egger, Peter H., and Sergey Nigai, 2014. Structural gravity with dummies only. Con-

strained ANOVA-type estimation of gravity panel data models. CEPR Discussion Paper.

15. Fally, Thibault, 2012. Structural gravity with fixed effects. Unpublished Manuscript,

University of Colorado.

16. GeoDist Database. Centre d’Études Prospectives et d’Informations Internationales, ac-

cessed in October 2013. http://www.cepii.fr

17. Gómez Herrera, Estrella, 2013. Comparing alternative methods to estimate gravity models

of bilateral trade. Empirical Economics, 44, 1087-1111.

18. Harrigan, James, 1996. Openness to trade in manufactures in the OECD. Journal of

International Economics 40(1-2), 23-39.

19. Head, Keith and Thierry Mayer, 2014. Gravity Equations: Workhorse, Toolkit, and Cook-

book. To appear in Handbook of International Economics Vol. 4, eds. Gopinath, Helpman,

and Rogoff.

20. Martin, Will, and Cong S. Pham, 2008. Estimating the Gravity Model When Zero Trade

Flows Are Frequent. Working Paper Economics Series 2008_03, Deakin University.

21. Martínez Zarzoso, Immaculada, 2013. The log of gravity revisited. Applied Economics,

45(3), 311-327.


22. McCullagh, Peter and John A. Nelder, 1989. Generalized Linear Models. Second ed. London:

Chapman and Hall.

23. Nelder, John A. and Robert W.M. Wedderburn, 1972. Generalized linear models. Journal

of the Royal Statistical Society, Series A 135(3), 370-384.

24. Santos Silva, João M.C. and Silvana Tenreyro, 2006. The log of gravity. The Review of

Economics and Statistics 88(4), 641-658.

25. Santos Silva, João M.C. and Silvana Tenreyro, 2011. Further simulation evidence on the

performance of the Poisson pseudo-maximum likelihood estimator. Economics Letters,

112, 220-222.

26. Waugh, Michael E., 2010. International trade and income differences. American Economic

Review 100(5), 2093-2124.


Tables

Table 3: Baseline Monte Carlo results, 1,000 replications

      Iterative Structural Estimators    Fixed Effects Estimators           QD
      Gam    P      NB     Gau    IG     Gam    P      NB     Gau    IG     GMM    P

Variance function 1 (Gamma)
CR    1000   1000   1000   909    988    1000   1000   988    992    999    1000   1000
Mean  1.00   1.00   1.00   1.01   1.00   1.00   1.00   0.99   1.55   1.26   0.99   0.93
SD    0.00   0.01   0.01   0.03   0.02   0.07   0.12   0.11   1.03   0.23   0.11   0.22
Med   1.00   1.00   1.00   1.00   1.00   1.00   0.99   0.99   1.31   1.21   0.99   0.89
5th   1.00   1.00   1.00   1.00   1.00   0.88   0.82   0.82   0.85   0.99   0.81   0.66
95th  1.00   1.00   1.00   1.08   1.00   1.11   1.21   1.19   2.98   1.67   1.18   1.33

Variance function 2 (Poisson)
CR    1000   1000   1000   998    818    1000   1000   961    1000   969    1000   1000
Mean  1.00   1.00   1.00   1.00   1.00   1.03   1.00   1.00   1.00   1.46   1.00   1.19
SD    0.00   0.00   0.00   0.01   0.02   0.05   0.01   0.01   0.01   0.40   0.05   0.36
Med   1.00   1.00   1.00   1.00   1.00   1.03   1.00   1.00   1.00   1.36   1.01   1.11
5th   1.00   1.00   1.00   1.00   1.00   0.95   0.98   0.98   0.98   1.04   0.93   0.84
95th  1.00   1.00   1.00   1.02   1.01   1.12   1.02   1.02   1.02   2.24   1.05   1.78

Variance function 3 (Negative Binomial)
CR    1000   1000   1000   911    816    1000   1000   980    993    920    1000   1000
Mean  1.00   1.00   1.00   1.02   1.00   1.05   1.00   1.00   1.63   1.88   1.00   1.09
SD    0.00   0.01   0.01   0.04   0.02   0.10   0.12   0.12   1.87   0.56   0.14   0.43
Med   1.00   1.00   1.00   1.00   1.00   1.06   0.99   0.99   1.30   1.74   1.01   1.00
5th   1.00   1.00   1.00   1.00   1.00   0.89   0.82   0.82   0.87   1.21   0.78   0.66
95th  1.00   1.00   1.00   1.09   1.00   1.23   1.23   1.21   3.01   3.01   1.23   1.82

Variance function 4 (Gaussian)
CR    1000   1000   1000   1000   594    1000   1000   950    1000   799    1000   1000
Mean  1.00   1.00   1.00   1.00   1.00   1.06   1.00   1.00   1.00   1.86   1.00   1.36
SD    0.00   0.00   0.00   0.00   0.03   0.06   0.01   0.01   0.00   0.72   0.04   0.88
Med   1.00   1.00   1.00   1.00   1.00   1.06   1.00   1.00   1.00   1.68   1.01   1.18
5th   1.00   1.00   1.00   1.00   1.00   0.95   0.98   0.98   1.00   1.02   0.95   0.76
95th  1.00   1.00   1.01   1.00   1.01   1.16   1.01   1.01   1.00   3.25   1.05   2.39

Variance function 5 (Inverse Gaussian)
CR    996    999    995    982    996    996    1000   951    993    996    1000   1000
Mean  1.00   1.00   1.00   1.00   1.00   0.98   0.97   0.97   1.09   0.99   0.98   0.97
SD    0.00   0.00   0.00   0.01   0.00   0.02   0.08   0.06   0.50   0.02   0.07   0.07
Med   1.00   1.00   1.00   1.00   1.00   0.98   0.97   0.97   0.98   0.99   0.98   0.97
5th   1.00   1.00   1.00   1.00   1.00   0.95   0.90   0.90   0.90   0.96   0.92   0.90
95th  1.00   1.00   1.00   1.02   1.00   1.01   1.04   1.04   1.66   1.02   1.03   1.03

Variance function 6 (Other)
CR    1000   1000   1000   1000   691    1000   1000   951    1000   910    1000   1000
Mean  1.00   1.00   1.00   1.00   1.00   1.05   1.00   1.00   1.00   1.65   1.00   1.25
SD    0.01   0.00   0.00   0.00   0.02   0.06   0.01   0.01   0.00   0.57   0.04   0.63
Med   1.00   1.00   1.00   1.00   1.00   1.04   1.00   1.00   1.00   1.51   1.01   1.12
5th   1.00   1.00   1.00   1.00   1.00   0.95   0.98   0.98   1.00   1.01   0.94   0.78
95th  1.00   1.00   1.00   1.00   1.00   1.14   1.02   1.02   1.01   2.75   1.05   1.97

Notes: The table shows descriptive statistics for the distribution of estimates of β1 = 1 in equation (16) for 1,000 replications of the DGP of section 3.1 for each of six different variance functions of Table 2. The statistics are: number of converged replications (CR), as well as mean, standard deviation (SD), median (Med), 5th and 95th percentiles, all of these computed over converged replications. Gam, P, NB, Gau, and IG stand for Gamma, Poisson, Negative Binomial, Gaussian, and Inverse Gaussian GLM estimators. QD stands for Quasi-differences estimators, and GMM and P correspond to equations (14) estimated by two-step GMM and (15) estimated by Poisson GLM. Where statistics were larger than 10 they have been replaced by the symbol †.


Table 4: Baseline scenario: inconsistent approaches, 1,000 replications

      Est. with GDP as proxy for FE      Est. without add. regressors
      Gam    P      NB     Gau    IG     Gam    P      NB     Gau    IG

Variance function 1 (Gamma)
CR    1000   1000   1000   998    989    1000   1000   1000   1000   1000
Mean  0.92   0.84   0.85   0.98   1.15   0.89   0.81   0.83   0.88   1.09
SD    0.08   0.12   0.11   1.10   0.33   0.09   0.12   0.11   2.30   0.42
Med   0.93   0.83   0.85   0.79   1.08   0.90   0.80   0.83   0.74   1.02
5th   0.78   0.65   0.69   0.54   0.83   0.74   0.62   0.66   0.51   0.76
95th  1.05   1.04   1.03   1.58   1.66   1.03   1.03   1.01   1.23   1.60

Variance function 2 (Poisson)
CR    1000   1000   943    1000   872    1000   1000   967    1000   991
Mean  0.93   0.84   0.84   0.79   1.83   0.90   0.81   0.82   0.77   2.08
SD    0.07   0.06   0.06   0.07   1.54   0.07   0.07   0.07   0.09   2.65
Med   0.94   0.85   0.85   0.79   1.28   0.91   0.82   0.82   0.77   1.11
5th   0.82   0.74   0.74   0.66   0.84   0.78   0.70   0.70   0.62   0.75
95th  1.03   0.93   0.93   0.90   4.43   1.01   0.91   0.91   0.90   6.93

Variance function 3 (Negative Binomial)
CR    1000   1000   998    996    817    1000   1000   999    1000   977
Mean  0.94   0.84   0.86   1.01   2.10   0.91   0.82   0.83   0.85   1.99
SD    0.11   0.12   0.11   1.53   1.90   0.12   0.12   0.11   0.92   2.23
Med   0.94   0.84   0.85   0.80   1.48   0.91   0.81   0.83   0.75   1.16
5th   0.76   0.65   0.68   0.55   0.77   0.70   0.62   0.65   0.52   0.67
95th  1.12   1.07   1.05   1.54   5.24   1.09   1.04   1.04   1.22   6.96

Variance function 4 (Gaussian)
CR    1000   1000   946    1000   630    1000   1000   968    1000   885
Mean  0.95   0.84   0.84   0.79   2.79   0.91   0.82   0.82   0.77   3.35
SD    0.08   0.06   0.06   0.07   2.91   0.08   0.07   0.07   0.09   4.59
Med   0.95   0.85   0.85   0.79   1.46   0.92   0.82   0.82   0.77   1.21
5th   0.80   0.74   0.74   0.66   0.80   0.76   0.70   0.70   0.62   0.74
95th  1.06   0.93   0.93   0.91   8.58   1.04   0.92   0.92   0.90   12.56

Variance function 5 (Inverse Gaussian)
CR    996    1000   958    988    996    997    1000   966    999    996
Mean  0.90   0.82   0.82   †      1.02   0.88   0.80   0.80   †      1.00
SD    0.07   0.13   0.09   †      0.09   0.09   0.15   0.11   †      0.15
Med   0.91   0.82   0.83   0.77   1.01   0.88   0.79   0.80   0.74   0.98
5th   0.81   0.70   0.70   0.61   0.88   0.76   0.66   0.66   0.57   0.83
95th  0.98   0.92   0.92   0.92   1.18   0.97   0.91   0.91   0.91   1.21

Variance function 6 (Other)
CR    1000   1000   955    1000   781    1000   1000   955    1000   944
Mean  0.94   0.84   0.84   0.79   2.36   0.91   0.82   0.82   0.77   2.71
SD    0.07   0.06   0.06   0.07   2.29   0.08   0.07   0.07   0.09   3.99
Med   0.94   0.85   0.85   0.79   1.39   0.91   0.82   0.82   0.77   1.14
5th   0.81   0.74   0.73   0.66   0.81   0.77   0.70   0.70   0.62   0.73
95th  1.05   0.92   0.92   0.90   6.11   1.02   0.91   0.91   0.90   8.81

Notes: See Table 3.
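As a rough illustration of how rows such as CR, Mean, SD, Med, 5th and 95th can be computed from simulation output, the following Python sketch summarizes one estimator across replications. It assumes that CR counts the replications in which the estimator converged and that the remaining rows are computed over converged replications only. The function names, the convention of recording a non-converged replication as None, and the linear-interpolation percentile rule are illustrative assumptions, not the authors' implementation.

```python
import statistics


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    if not sorted_vals:
        raise ValueError("empty sample")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def summarize(estimates):
    """Summarize one estimator across replications.

    `estimates` has one entry per replication: a float if the estimator
    converged, or None if it did not (hypothetical convention).
    """
    converged = sorted(e for e in estimates if e is not None)
    return {
        "CR": len(converged),                    # assumed: count of converged runs
        "Mean": statistics.mean(converged),
        "SD": statistics.stdev(converged),       # sample standard deviation
        "Med": statistics.median(converged),
        "5th": percentile(converged, 5),
        "95th": percentile(converged, 95),
    }


# Toy input: five replications, one of which failed to converge.
draws = [0.9, 1.0, 1.1, None, 1.0]
summary = summarize(draws)
```

With 1,000 replications per design, a call like this would be made once per estimator and variance function to fill one column of a panel.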


Table 5: Alternative 1 (α = −9), 1,000 replications


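| | Gam | P | NB | Gau | IG | Gam | P | NB | Gau | IG | GMM | P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Variance function 1 (Gamma) | | | | | | | | | | | | |
| CR | 1000 | 971 | 999 | 757 | 799 | 1000 | 1000 | 961 | 965 | 876 | 1000 | 1000 |
| Mean | 1.00 | 1.01 | 1.00 | 1.02 | 1.01 | 1.00 | 1.00 | 1.00 | 1.60 | 1.24 | 0.99 | 0.96 |
| SD | 0.00 | 0.02 | 0.01 | 0.04 | 0.03 | 0.03 | 0.11 | 0.10 | 0.93 | 0.32 | 0.06 | 0.46 |
| Med | 1.00 | 1.01 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 1.32 | 1.17 | 0.99 | 0.86 |
| 5th | 1.00 | 0.99 | 1.00 | 0.97 | 1.00 | 0.95 | 0.84 | 0.85 | 0.89 | 0.99 | 0.89 | 0.59 |
| 95th | 1.00 | 1.03 | 1.01 | 1.08 | 1.01 | 1.05 | 1.20 | 1.18 | 3.13 | 1.77 | 1.10 | 1.57 |
| Variance function 2 (Poisson) | | | | | | | | | | | | |
| CR | 990 | 999 | 1000 | 957 | 33 | 937 | 1000 | 933 | 1000 | 17 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.01 | 1.00 | 1.34 | 1.00 | 1.00 | 1.00 | † | 1.03 | 4.02 |
| SD | 0.01 | 0.01 | 0.00 | 0.02 | 0.01 | 0.12 | 0.02 | 0.02 | 0.01 | † | 0.10 | 22.00 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.33 | 1.00 | 1.00 | 1.00 | † | 1.03 | 0.92 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.15 | 0.97 | 0.97 | 0.99 | 2.20 | 0.87 | 0.33 |
| 95th | 1.00 | 1.02 | 1.01 | 1.05 | 1.03 | 1.54 | 1.03 | 1.03 | 1.02 | † | 1.17 | 10.34 |
| Variance function 3 (Negative Binomial) | | | | | | | | | | | | |
| CR | 974 | 978 | 999 | 721 | 34 | 914 | 1000 | 955 | 960 | 14 | 1000 | 1000 |
| Mean | 1.00 | 1.01 | 1.00 | 1.02 | 1.01 | 1.36 | 1.01 | 1.01 | 1.66 | † | 1.06 | 3.60 |
| SD | 0.01 | 0.02 | 0.01 | 0.04 | 0.04 | 0.13 | 0.11 | 0.11 | 2.11 | † | 0.15 | 18.16 |
| Med | 1.00 | 1.01 | 1.00 | 1.01 | 1.00 | 1.36 | 1.00 | 1.00 | 1.32 | † | 1.07 | 0.88 |
| 5th | 1.00 | 0.99 | 1.00 | 0.97 | 1.00 | 1.16 | 0.84 | 0.84 | 0.89 | 0.20 | 0.81 | 0.34 |
| 95th | 1.00 | 1.03 | 1.01 | 1.08 | 1.12 | 1.57 | 1.22 | 1.22 | 3.03 | † | 1.29 | 11.07 |
| Variance function 4 (Gaussian) | | | | | | | | | | | | |
| CR | 883 | 998 | 1000 | 1000 | 0 | 602 | 1000 | 936 | 1000 | 44 | 1000 | 1000 |
| Mean | 1.00 | 1.01 | 1.00 | 1.00 | – | 1.62 | 1.01 | 1.01 | 1.00 | † | 1.04 | 3.59 |
| SD | 0.01 | 0.01 | 0.00 | 0.00 | – | 0.17 | 0.03 | 0.03 | 0.00 | † | 0.12 | 26.88 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | – | 1.61 | 1.01 | 1.01 | 1.00 | † | 1.04 | 0.69 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | – | 1.35 | 0.97 | 0.97 | 1.00 | † | 0.83 | 0.24 |
| 95th | 1.00 | 1.02 | 1.01 | 1.00 | – | 1.91 | 1.03 | 1.03 | 1.00 | † | 1.23 | 6.80 |
| Variance function 5 (Inverse Gaussian) | | | | | | | | | | | | |
| CR | 998 | 988 | 996 | 912 | 996 | 998 | 1000 | 927 | 992 | 942 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.01 | 1.00 | 0.99 | 0.98 | 0.98 | 1.03 | 1.00 | 0.98 | 0.99 |
| SD | 0.00 | 0.01 | 0.01 | 0.03 | 0.00 | 0.01 | 0.04 | 0.04 | 0.27 | 0.01 | 0.02 | 0.14 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.98 | 0.98 | 0.99 | 1.00 | 0.99 | 0.99 |
| 5th | 0.99 | 0.99 | 1.00 | 0.99 | 1.00 | 0.98 | 0.94 | 0.94 | 0.91 | 0.99 | 0.97 | 0.92 |
| 95th | 1.00 | 1.02 | 1.01 | 1.06 | 1.00 | 1.00 | 1.01 | 1.01 | 1.23 | 1.01 | 1.00 | 1.04 |
| Variance function 6 (Other) | | | | | | | | | | | | |
| CR | 950 | 999 | 1000 | 1000 | 7 | 780 | 1000 | 933 | 1000 | 47 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.08 | 1.50 | 1.00 | 1.00 | 1.00 | † | 1.04 | 2.85 |
| SD | 0.01 | 0.01 | 0.00 | 0.00 | 0.24 | 0.15 | 0.02 | 0.02 | 0.00 | † | 0.10 | 16.98 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.49 | 1.01 | 1.01 | 1.00 | † | 1.04 | 0.72 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 0.93 | 1.26 | 0.98 | 0.98 | 1.00 | † | 0.86 | 0.29 |
| 95th | 1.00 | 1.02 | 1.01 | 1.00 | 1.62 | 1.77 | 1.03 | 1.03 | 1.00 | † | 1.22 | 9.55 |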

Notes: See Table 3.


Table 6: Alternative 2 (vHT = 0.3), 1,000 replications


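| | Gam | P | NB | Gau | IG | Gam | P | NB | Gau | IG | GMM | P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Variance function 1 (Gamma) | | | | | | | | | | | | |
| CR | 1000 | 998 | 1000 | 820 | 929 | 1000 | 1000 | 981 | 985 | 965 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.02 | 1.00 | 1.00 | 1.00 | 1.00 | 1.66 | 1.26 | 0.99 | 0.92 |
| SD | 0.00 | 0.01 | 0.01 | 0.05 | 0.01 | 0.04 | 0.11 | 0.11 | 1.23 | 0.29 | 0.08 | 0.29 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 1.33 | 1.20 | 0.99 | 0.86 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.93 | 0.84 | 0.84 | 0.83 | 1.00 | 0.86 | 0.62 |
| 95th | 1.00 | 1.01 | 1.00 | 1.09 | 1.00 | 1.07 | 1.20 | 1.18 | 3.65 | 1.67 | 1.13 | 1.40 |
| Variance function 2 (Poisson) | | | | | | | | | | | | |
| CR | 1000 | 1000 | 1000 | 976 | 262 | 1000 | 1000 | 943 | 1000 | 200 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.01 | 1.00 | 1.17 | 1.00 | 1.00 | 1.00 | 2.67 | 1.01 | 1.97 |
| SD | 0.00 | 0.00 | 0.00 | 0.02 | 0.01 | 0.09 | 0.02 | 0.02 | 0.01 | 0.82 | 0.07 | 3.88 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.16 | 1.00 | 1.00 | 1.00 | 2.53 | 1.02 | 1.13 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.02 | 0.97 | 0.97 | 0.98 | 1.74 | 0.90 | 0.50 |
| 95th | 1.00 | 1.01 | 1.00 | 1.05 | 1.00 | 1.33 | 1.03 | 1.03 | 1.02 | 4.24 | 1.11 | 5.17 |
| Variance function 3 (Negative Binomial) | | | | | | | | | | | | |
| CR | 1000 | 999 | 1000 | 834 | 313 | 1000 | 1000 | 977 | 986 | 186 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.02 | 1.00 | 1.20 | 1.01 | 1.00 | 1.67 | 2.67 | 1.04 | 2.05 |
| SD | 0.00 | 0.01 | 0.01 | 0.05 | 0.03 | 0.11 | 0.12 | 0.11 | 1.25 | 0.86 | 0.14 | 4.55 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.19 | 1.00 | 1.00 | 1.35 | 2.54 | 1.05 | 1.07 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.02 | 0.83 | 0.83 | 0.88 | 1.64 | 0.82 | 0.46 |
| 95th | 1.00 | 1.01 | 1.00 | 1.09 | 1.01 | 1.39 | 1.24 | 1.20 | 3.40 | 4.01 | 1.26 | 5.49 |
| Variance function 4 (Gaussian) | | | | | | | | | | | | |
| CR | 1000 | 1000 | 1000 | 1000 | 31 | 998 | 1000 | 951 | 1000 | 32 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.33 | 1.00 | 1.00 | 1.00 | † | 1.03 | 3.57 |
| SD | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.14 | 0.02 | 0.02 | 0.00 | † | 0.07 | 15.01 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.32 | 1.01 | 1.01 | 1.00 | † | 1.03 | 1.03 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.12 | 0.97 | 0.97 | 1.00 | 3.16 | 0.92 | 0.37 |
| 95th | 1.00 | 1.01 | 1.01 | 1.00 | 1.00 | 1.56 | 1.03 | 1.03 | 1.00 | † | 1.13 | 11.11 |
| Variance function 5 (Inverse Gaussian) | | | | | | | | | | | | |
| CR | 997 | 996 | 997 | 959 | 992 | 997 | 1000 | 947 | 992 | 994 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.01 | 1.00 | 0.99 | 0.98 | 0.98 | 1.05 | 1.00 | 0.98 | 0.98 |
| SD | 0.00 | 0.01 | 0.00 | 0.02 | 0.00 | 0.01 | 0.06 | 0.05 | 0.41 | 0.01 | 0.04 | 0.07 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.98 | 0.98 | 0.98 | 0.98 | 0.99 | 0.98 | 0.98 |
| 5th | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 0.98 | 0.92 | 0.92 | 0.90 | 0.98 | 0.95 | 0.92 |
| 95th | 1.00 | 1.00 | 1.00 | 1.05 | 1.00 | 1.00 | 1.04 | 1.03 | 1.46 | 1.01 | 1.01 | 1.04 |
| Variance function 6 (Other) | | | | | | | | | | | | |
| CR | 1000 | 1000 | 1000 | 998 | 96 | 1000 | 1000 | 943 | 1000 | 44 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.01 | 1.26 | 1.00 | 1.00 | 1.00 | † | 1.03 | 2.75 |
| SD | 0.00 | 0.00 | 0.00 | 0.00 | 0.04 | 0.12 | 0.02 | 0.02 | 0.00 | † | 0.07 | 7.27 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.25 | 1.01 | 1.01 | 1.00 | 2.70 | 1.03 | 1.10 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.07 | 0.97 | 0.97 | 1.00 | 1.46 | 0.92 | 0.44 |
| 95th | 1.00 | 1.01 | 1.01 | 1.00 | 1.01 | 1.46 | 1.03 | 1.03 | 1.01 | † | 1.13 | 9.07 |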

Notes: See Table 3.


Table 7: Alternative 3 (C = 50), 1,000 replications


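| | Gam | P | NB | Gau | IG | Gam | P | NB | Gau | IG | GMM | P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Variance function 1 (Gamma) | | | | | | | | | | | | |
| CR | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 1.15 | 1.13 | 1.00 | 0.98 |
| SD | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.02 | 0.02 | 0.12 | 0.04 | 0.02 | 0.07 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 1.13 | 1.13 | 1.00 | 0.97 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.98 | 0.96 | 0.96 | 1.01 | 1.07 | 0.97 | 0.89 |
| 95th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.02 | 1.04 | 1.02 | 1.36 | 1.22 | 1.03 | 1.12 |
| Variance function 2 (Poisson) | | | | | | | | | | | | |
| CR | 1000 | 1000 | 1000 | 1000 | 844 | 1000 | 1000 | 849 | 1000 | 640 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.01 | 1.00 | 1.00 | 1.00 | 3.03 | 1.00 | 1.60 |
| SD | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 0.00 | 0.00 | 0.62 | 0.01 | 0.51 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.01 | 1.00 | 1.00 | 1.00 | 2.95 | 1.00 | 1.49 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 | 2.17 | 0.99 | 1.13 |
| 95th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.03 | 1.00 | 1.00 | 1.00 | 4.08 | 1.01 | 2.35 |
| Variance function 3 (Negative Binomial) | | | | | | | | | | | | |
| CR | 1000 | 1000 | 1000 | 1000 | 824 | 1000 | 1000 | 1000 | 1000 | 432 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.02 | 1.00 | 0.99 | 1.16 | 3.60 | 1.00 | 1.53 |
| SD | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | 0.02 | 0.02 | 0.12 | 0.66 | 0.03 | 0.45 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.02 | 1.00 | 1.00 | 1.14 | 3.52 | 1.00 | 1.41 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.96 | 0.96 | 1.00 | 2.76 | 0.96 | 1.09 |
| 95th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.05 | 1.04 | 1.03 | 1.38 | 4.78 | 1.04 | 2.37 |
| Variance function 4 (Gaussian) | | | | | | | | | | | | |
| CR | 1000 | 1000 | 1000 | 1000 | 273 | 1000 | 1000 | 835 | 1000 | 3 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.03 | 1.00 | 1.00 | 1.00 | 5.99 | 1.00 | 2.42 |
| SD | 0.00 | 0.00 | 0.00 | 0.00 | 0.03 | 0.02 | 0.00 | 0.00 | 0.00 | 0.49 | 0.01 | 3.97 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.03 | 1.00 | 1.00 | 1.00 | 5.94 | 1.00 | 1.59 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 5.53 | 0.99 | 1.03 |
| 95th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.05 | 1.00 | 1.00 | 1.00 | 6.51 | 1.01 | 6.06 |
| Variance function 5 (Inverse Gaussian) | | | | | | | | | | | | |
| CR | 997 | 1000 | 997 | 988 | 997 | 997 | 1000 | 908 | 987 | 997 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.98 | 0.98 | 0.97 | 1.48 | 0.99 | 0.98 | 0.98 |
| SD | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.06 | 0.02 | 1.50 | 0.00 | 0.04 | 0.04 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.98 | 0.97 | 0.97 | 1.00 | 0.99 | 0.98 | 0.98 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.97 | 0.94 | 0.95 | 0.93 | 0.99 | 0.95 | 0.95 |
| 95th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 1.06 | 1.01 | 3.32 | 1.00 | 1.03 | 1.02 |
| Variance function 6 (Other) | | | | | | | | | | | | |
| CR | 1000 | 1000 | 1000 | 1000 | 572 | 1000 | 1000 | 831 | 1000 | 60 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.02 | 1.00 | 1.00 | 1.00 | 4.19 | 1.00 | 2.19 |
| SD | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.02 | 0.00 | 0.00 | 0.00 | 0.82 | 0.01 | 6.98 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.02 | 1.00 | 1.00 | 1.00 | 4.26 | 1.00 | 1.55 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 | 3.16 | 0.99 | 1.10 |
| 95th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.04 | 1.00 | 1.00 | 1.00 | 5.60 | 1.01 | 3.77 |

Notes: See Table 3.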


Table 8: Estimation results: Gravity model of trade, C = 94, N = 8,836

| | Fixed-effects Gamma | Fixed-effects Poisson | Fixed-effects NegBin | Fixed-effects Gaussian | Iterative-structural Gamma | Iterative-structural Poisson | Iterative-structural NegBin | Iterative-structural Gaussian | QD Poisson |
|---|---|---|---|---|---|---|---|---|---|
| dist1 | 10.096∗∗∗ | 7.103∗∗∗ | 9.821∗∗∗ | 8.246∗∗∗ | 8.890∗∗∗ | 7.076∗∗∗ | 8.881∗∗∗ | 6.048∗∗∗ | 7.038∗∗∗ |
| | (0.161) | (0.261) | (0.150) | (0.854) | (0.199) | (0.292) | (0.199) | (0.511) | (0.253) |
| dist2 | 2.227∗∗∗ | 1.049∗∗∗ | 2.084∗∗∗ | 0.294 | 2.262∗∗∗ | 1.029∗∗∗ | 2.244∗∗∗ | -0.085 | 0.180 |
| | (0.101) | (0.277) | (0.092) | (1.320) | (0.182) | (0.282) | (0.180) | (0.522) | (0.274) |
| dist3 | 1.239∗∗∗ | 1.203∗∗∗ | 1.157∗∗∗ | 0.477 | 1.365∗∗∗ | 1.208∗∗∗ | 1.363∗∗∗ | 0.559 | 0.501∗ |
| | (0.082) | (0.281) | (0.075) | (0.496) | (0.092) | (0.281) | (0.091) | (0.508) | (0.266) |
| dist4 | 0.799∗∗∗ | 0.813∗∗∗ | 0.755∗∗∗ | 0.088 | 0.967∗∗∗ | 0.818∗∗∗ | 0.965∗∗∗ | 0.278 | 0.306 |
| | (0.077) | (0.296) | (0.071) | (0.947) | (0.091) | (0.291) | (0.091) | (0.538) | (0.337) |
| lang×dist1 | -5.618∗∗∗ | -4.536∗∗∗ | -5.546∗∗∗ | -4.969∗∗∗ | -5.577∗∗∗ | -4.517∗∗∗ | -5.578∗∗∗ | -3.899∗∗∗ | -4.794∗∗∗ |
| | (0.191) | (0.244) | (0.186) | (0.430) | (0.245) | (0.214) | (0.245) | (0.204) | (0.221) |
| lang×dist2 | 0.260∗∗ | 0.603∗∗ | 0.283∗∗ | 0.815 | -0.608∗∗∗ | 0.640∗∗ | -0.598∗∗∗ | 1.351∗∗∗ | 1.530∗∗∗ |
| | (0.127) | (0.299) | (0.124) | (0.991) | (0.209) | (0.277) | (0.208) | (0.359) | (0.357) |
| lang×dist3 | 0.876∗∗∗ | -0.024 | 0.860∗∗∗ | -0.446 | 0.260 | -0.032 | 0.254 | -0.292 | 0.315 |
| | (0.179) | (0.306) | (0.174) | (0.575) | (0.185) | (0.257) | (0.184) | (0.325) | (0.251) |
| lang×dist4 | 0.349∗∗∗ | 0.030 | 0.381∗∗∗ | -0.423 | 0.416∗∗∗ | 0.059 | 0.410∗∗∗ | -0.405 | 0.544∗ |
| | (0.126) | (0.294) | (0.120) | (0.557) | (0.141) | (0.192) | (0.140) | (0.289) | (0.294) |
| lang×dist5 | 0.584∗∗∗ | 0.718∗ | 0.572∗∗∗ | -0.278 | 1.075∗∗∗ | 0.772∗∗ | 1.067∗∗∗ | -0.055 | 1.145∗∗∗ |
| | (0.160) | (0.382) | (0.154) | (0.543) | (0.176) | (0.325) | (0.175) | (0.523) | (0.326) |
| AIC | 119202.631 | 9.750e+08 | 120023.246 | 246273.375 | 133824.280 | 9.757e+08 | 133938.511 | 249670.440 | 603.449 |

Notes: The table contains estimates of a gravity model with specification (21). Robust standard errors in parentheses. Bootstrapped standard errors for QD-Poisson in parentheses. The variables dist1,...,dist4 are indicator variables =1 if the country-pair is in the 1,...,4 distance quintile. The variables lang×dist1,...,lang×dist5 are interactions of the distance quintile indicators with an indicator variable =1 if the country-pair shares the same language. ∗, ∗∗ and ∗∗∗ indicate statistical significance at the 10%, 5% and 1% level. AIC is the Akaike Information Criterion.
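The regressors described in the notes to Table 8 can be built mechanically from a distance measure and a common-language flag. The Python sketch below is one way to do this under stated assumptions: the dictionary keys `distance` and `same_language`, the function names, and the rank-based quintile rule (including its handling of ties) are hypothetical choices for illustration and are not taken from the paper.

```python
def quintile_of(value, sorted_values):
    """Return the quintile (1..5) of `value` within `sorted_values`."""
    n = len(sorted_values)
    rank = sum(1 for v in sorted_values if v <= value)  # 1..n
    # Integer ceiling of 5 * rank / n, clipped to the range 1..5.
    return min(5, max(1, -(-5 * rank // n)))


def build_regressors(pairs):
    """Build distance-quintile indicators and their language interactions.

    `pairs` is a list of dicts with hypothetical keys 'distance' (float)
    and 'same_language' (bool), one dict per country pair.
    """
    distances = sorted(p["distance"] for p in pairs)
    rows = []
    for p in pairs:
        q = quintile_of(p["distance"], distances)
        row = {}
        for k in range(1, 6):
            in_k = 1 if q == k else 0
            if k <= 4:
                # The table reports dist1..dist4 only; no dist5 indicator is built.
                row[f"dist{k}"] = in_k
            row[f"lang_x_dist{k}"] = in_k * int(p["same_language"])
        rows.append(row)
    return rows


# Toy data: ten country pairs with increasing distance.
pairs = [{"distance": 100.0 * i, "same_language": i % 2 == 0} for i in range(1, 11)]
rows = build_regressors(pairs)
```

In this toy data the two closest pairs fall in the first quintile, so both get `dist1 = 1`, and only the one that shares a language gets `lang_x_dist1 = 1`.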


Figures

[Figure: four scatter panels (Gamma, Poisson, NegBin, Gaussian). Horizontal axis: predicted conditional mean. Vertical axis: residuals.]

Figure 1: Fixed Effects estimators: Predictions and Pearson Residuals

Notes: The figure plots predictions of trade flows against Pearson residuals for fixed effects GLM estimates of the trade gravity model (21), corresponding to columns 2 to 5 of Table 8.
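For reference, a Pearson residual scales the raw residual by the square root of the variance function, r = (y − μ) / sqrt(V(μ)). The Python sketch below computes it for the four families shown in the figure, using the standard GLM variance functions. The function names and the negative binomial dispersion value `alpha` are hypothetical, and any family-specific dispersion scaling is omitted, so this is an illustration and not the authors' code.

```python
import math


def variance(family, mu, alpha=1.0):
    """Standard GLM variance function V(mu) for each family.

    `alpha` is an assumed NegBin dispersion parameter, used for illustration.
    """
    if family == "gamma":
        return mu ** 2
    if family == "poisson":
        return mu
    if family == "negbin":
        return mu + alpha * mu ** 2
    if family == "gaussian":
        return 1.0
    raise ValueError(f"unknown family: {family}")


def pearson_residuals(y, mu, family, alpha=1.0):
    """Pearson residuals (y - mu) / sqrt(V(mu)), one per observation."""
    return [
        (yi - mi) / math.sqrt(variance(family, mi, alpha))
        for yi, mi in zip(y, mu)
    ]


# Toy observed flows and fitted conditional means.
y = [3.0, 0.0, 10.0]
mu = [4.0, 1.0, 9.0]
res_poisson = pearson_residuals(y, mu, "poisson")
res_gamma = pearson_residuals(y, mu, "gamma")
```

Plotting such residuals against `mu` gives the kind of scatter shown in the figure: a variance function that matches the data should produce residuals with roughly constant spread across predicted means.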


[Figure: four kernel density panels (Gamma, Poisson, NegBin, Gaussian). Horizontal axis: deviance residuals. Vertical axis: kernel density. Kernel = epanechnikov; reported bandwidths: 0.2714 (Gamma), 34.3910 (Gaussian). Legend: Residuals, Normal density.]

Figure 2: Fixed Effects estimators: Density of deviance residuals

Notes: The figure plots kernel density of deviance residuals for fixed effects GLM estimates of the trade gravity model (21), corresponding to columns 2 to 5 of Table 8.
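To make the construction of such a plot concrete, the Python sketch below computes Poisson deviance residuals and evaluates a kernel density estimate with an Epanechnikov kernel. It uses the textbook kernel K(u) = 0.75 (1 − u²) on |u| ≤ 1; statistical packages may scale this kernel differently, so the bandwidths reported in the figure need not carry over. The function names and toy data are hypothetical.

```python
import math


def poisson_deviance_residual(y, mu):
    """Signed square root of the unit Poisson deviance for one observation."""
    # y * log(y / mu) is taken as 0 when y == 0 (its limit as y -> 0).
    term = y * math.log(y / mu) if y > 0 else 0.0
    d = 2.0 * (term - (y - mu))
    # max(..., 0.0) guards against tiny negative values from rounding.
    return math.copysign(math.sqrt(max(d, 0.0)), y - mu)


def epanechnikov_kde(data, x, bandwidth):
    """Kernel density estimate at x with K(u) = 0.75 * (1 - u^2) on |u| <= 1."""
    total = 0.0
    for xi in data:
        u = (x - xi) / bandwidth
        if abs(u) <= 1.0:
            total += 0.75 * (1.0 - u * u)
    return total / (len(data) * bandwidth)


# Toy observed flows and fitted means; bandwidth chosen arbitrarily.
y = [0.0, 2.0, 5.0, 1.0]
mu = [1.0, 2.0, 3.0, 1.5]
residuals = [poisson_deviance_residual(yi, mi) for yi, mi in zip(y, mu)]
density_at_zero = epanechnikov_kde(residuals, 0.0, bandwidth=1.0)
```

Evaluating `epanechnikov_kde` on a grid of points and overlaying a normal density with matching mean and variance would reproduce the comparison that the figure legend describes.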


Appendix

Table 9: Alternative scenario α = −2 (1,000 replications)


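| | Gam | P | NB | Gau | IG | Gam | P | NB | Gau | IG | GMM | P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Variance function 1 (Gamma) | | | | | | | | | | | | |
| CR | 1000 | 1000 | 1000 | 989 | 1000 | 1000 | 1000 | 977 | 997 | 1000 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 1.43 | 1.25 | 0.99 | 0.95 |
| SD | 0.00 | 0.01 | 0.00 | 0.02 | 0.00 | 0.14 | 0.17 | 0.17 | 0.85 | 0.25 | 0.20 | 0.24 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 1.26 | 1.23 | 0.98 | 0.92 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.77 | 0.73 | 0.73 | 0.75 | 0.87 | 0.66 | 0.61 |
| 95th | 1.00 | 1.00 | 1.00 | 1.02 | 1.00 | 1.21 | 1.30 | 1.29 | 2.54 | 1.71 | 1.34 | 1.38 |
| Variance function 2 (Poisson) | | | | | | | | | | | | |
| CR | 1000 | 1000 | 1000 | 1000 | 999 | 1000 | 1000 | 999 | 1000 | 1000 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.03 | 1.00 | 1.02 |
| SD | 0.01 | 0.00 | 0.00 | 0.00 | 0.01 | 0.03 | 0.01 | 0.01 | 0.01 | 0.06 | 0.03 | 0.07 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.02 | 1.00 | 1.01 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.95 | 0.97 | 0.97 | 0.98 | 0.94 | 0.95 | 0.93 |
| 95th | 1.00 | 1.00 | 1.01 | 1.00 | 1.00 | 1.05 | 1.02 | 1.02 | 1.02 | 1.14 | 1.04 | 1.13 |
| Variance function 3 (Negative Binomial) | | | | | | | | | | | | |
| CR | 1000 | 1000 | 1000 | 990 | 997 | 1000 | 1000 | 980 | 1000 | 999 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.02 | 1.00 | 0.99 | 1.53 | 1.41 | 0.99 | 0.96 |
| SD | 0.00 | 0.00 | 0.00 | 0.02 | 0.00 | 0.15 | 0.18 | 0.18 | 1.79 | 0.35 | 0.22 | 0.27 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.02 | 0.99 | 0.98 | 1.26 | 1.36 | 1.00 | 0.93 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.77 | 0.72 | 0.72 | 0.73 | 0.93 | 0.65 | 0.61 |
| 95th | 1.00 | 1.00 | 1.00 | 1.02 | 1.00 | 1.27 | 1.33 | 1.31 | 2.91 | 2.03 | 1.34 | 1.42 |
| Variance function 4 (Gaussian) | | | | | | | | | | | | |
| CR | 1000 | 1000 | 1000 | 1000 | 998 | 1000 | 1000 | 997 | 1000 | 1000 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.01 | 1.00 | 1.01 |
| SD | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 0.00 | 0.00 | 0.06 | 0.01 | 0.06 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.98 | 0.99 | 0.99 | 1.00 | 0.97 | 0.99 | 0.97 |
| 95th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.02 | 1.01 | 1.01 | 1.00 | 1.06 | 1.01 | 1.06 |
| Variance function 5 (Inverse Gaussian) | | | | | | | | | | | | |
| CR | 993 | 999 | 993 | 982 | 993 | 993 | 1000 | 991 | 992 | 994 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.97 | 0.97 | 0.96 | 1.16 | 0.98 | 0.96 | 0.96 |
| SD | 0.00 | 0.01 | 0.06 | 0.06 | 0.06 | 0.06 | 0.19 | 0.17 | 0.90 | 0.07 | 0.18 | 0.12 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.97 | 0.97 | 0.97 | 0.98 | 0.98 | 0.97 | 0.97 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.90 | 0.82 | 0.83 | 0.81 | 0.93 | 0.79 | 0.84 |
| 95th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.04 | 1.12 | 1.11 | 2.36 | 1.05 | 1.10 | 1.05 |
| Variance function 6 (Other) | | | | | | | | | | | | |
| CR | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 |
| Mean | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.01 | 1.00 | 1.01 |
| SD | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | 0.01 | 0.01 | 0.00 | 0.04 | 0.01 | 0.04 |
| Med | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.01 | 1.00 | 1.00 |
| 5th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.97 | 0.99 | 0.99 | 0.99 | 0.96 | 0.98 | 0.95 |
| 95th | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.03 | 1.01 | 1.01 | 1.01 | 1.08 | 1.02 | 1.08 |

Notes: See Table 3.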


[Figure: four scatter panels (Gamma, Poisson, NegBin, Gaussian). Vertical axis: residuals.]

Figure 3: Iterative Structural estimators: Predictions and Pearson Residuals

Notes: The figure plots predictions of trade flows against Pearson residuals for iterative structural GLM estimates of the trade gravity model (21), corresponding to columns 6 to 9 of Table 8.


[Figure: four kernel density panels (Gamma, Poisson, NegBin, Gaussian). Horizontal axis: deviance residuals. Vertical axis: kernel density. Kernel = epanechnikov; reported bandwidth: 321.4131 (Gaussian). Legend: Residuals, Normal density.]

Figure 4: Iterative Structural estimators: Density of deviance residuals

Notes: The figure plots kernel density of deviance residuals for iterative structural GLM estimates of the trade gravity model (21), corresponding to columns 6 to 9 of Table 8.
