GLM estimation of trade gravity models with fixed effects
Peter H. Egger∗
ETH Zurich
CEPR, CESifo, GEP, WIFO
Kevin E. Staub†
University of Melbourne
June 2, 2014
Abstract
Many empirical gravity models are now based on generalized linear models (GLM), of which the Poisson pseudo-maximum likelihood estimator is a prominent example and the most-frequently used estimator. Previous literature on the performance of these estimators has primarily focussed on the role of the variance function for the estimators’ behavior. We add to this literature by studying the small-sample performance of estimators in a data-generating process that is fully consistent with general equilibrium economic models of international trade. Economic theory suggests that (i) importer- and exporter-specific effects need to be accounted for in estimation, and (ii) that they are correlated with bilateral trade costs through general-equilibrium (or balance-of-payments) restrictions. We compare the performance of structural estimators, fixed effects estimators, and quasi-differences estimators in such settings, using the GLM approach as a unifying framework.
Keywords: Gravity models; Generalized linear models; Fixed effects.
JEL-codes: F14, C23.
∗Corresponding author: ETH Zurich, Department of Management, Technology, and Economics, Weinbergstr. 35, 8092 Zurich, Switzerland; E-mail: [email protected]; +41 44 632 41 08. Egger acknowledges funding by Czech Science Fund (GA ČR) through grant number P402/12/0982.
†Department of Economics, 111 Barry Street, The University of Melbourne, 3010 VIC, Australia; E-mail: [email protected]; +61 3 903 53776. The authors acknowledge financial support from the Faculty of Business and Economics, University of Melbourne.
1 Introduction
Models of international trade often imply a gravity equation for bilateral international exports
of country i to country j (Xij),
$$X_{ij} = E_i M_j T_{ij},$$
where Ei and Mj are exporter-specific factors (a function of goods prices and mass of suppliers)
and importer-specific factors (a function of the price index and total expenditures on goods),
and Tij are bilateral, pair-specific ones (a function of country-pair-specific consumer preferences
and ad-valorem trade costs). Examples include Eaton and Kortum (2002), Anderson and van
Wincoop (2003), Baier and Bergstrand (2009), Waugh (2010), Anderson and Yotov (2012),
Arkolakis, Costinot, and Rodríguez-Clare (2012), and Bergstrand, Egger, and Larch (2013).
Baltagi, Egger, and Pfaffermayr (2014) and Head and Mayer (2014) provide recent surveys of
this literature.
Writing ei = lnEi, mj = lnMj , and tij = lnTij and using the parametrization tij = d′ijβ,
where dij is a vector of observable bilateral variables and β a conformable vector of parameters,
empirical models of the gravity equation follow the general structure
$$E(X_{ij} \mid e_i, m_j, d_{ij}) = \exp(e_i + m_j + d_{ij}'\beta). \qquad (1)$$
Thus, there are two departures from the original equation. The first relates to the parametrization
of the unknown tij in terms of observables. The linear index structure coupled with the
exponential function allows the inclusion of variables in a flexible way, while giving the elements
of β the convenient interpretation of direct (semi-)elasticities of exports with respect to the
variables in dij and restricting the domain of exports to be positive. The second departure from
the original is that the relationship is stochastic and assumed to hold (only) in expectation.
The stochastic formulation implies an error term which makes the relationship between Xij and
its specified expectation E(Xij |ei,mj , dij) exact. Thus, the moment condition (1) implies two
equivalent representations with stochastic errors, either additive or multiplicative:
$$X_{ij} = \exp(e_i + m_j + d_{ij}'\beta) + \varepsilon_{ij} = \exp(e_i + m_j + d_{ij}'\beta)\,\eta_{ij}, \qquad (2)$$
where ηij = 1+εij exp(−ei−mj−d′ijβ). Together with (1), this implies that E(εij |ei,mj , dij) = 0
for the additive error εij , and similarly E(ηij |ei,mj , dij) = 1 for the multiplicative error ηij , i.e.,
these errors are mean-independent of the covariates.
It is the goal of this paper to discuss a number of estimators which are consistent, if model
(1) is correctly specified, and explore their small-sample properties under varying distributions
of ηij . This is a subject which has attracted considerable attention recently, and the body of
work studying the performance of estimators of gravity models for trade includes Santos Silva
and Tenreyro (2006, 2011), Martin and Pham (2008), Gómez Herrera (2012), Martínez Zarzoso
(2013), and Head and Mayer (2014), among others.
This paper offers three additions to this literature. First, it uses a data-generating process
(DGP) which corresponds to a general equilibrium model of international trade as the basis for
Monte Carlo simulations. The model of international trade is calibrated to match some features
from real-world data. Thus, unlike the previous literature, the performance of estimators is
analyzed in a setting with exporter- and importer-specific effects which are correlated with
dij , as suggested by economic theory and as imposed in all applied structural work on gravity
models (e.g., in Anderson and van Wincoop, 2003, or Bergstrand, Egger, and Larch, 2013).1
Second, apart from estimators used by the previous literature, the comparison also includes
(i) iterative-structural estimators which fully exploit information from the underlying economic
general equilibrium model of international trade, and (ii) estimators based on quasi-differenced
moment conditions recently proposed by Charbonneau (2012). Third, we provide a unified
discussion of the available estimators within the framework of generalized linear models (GLM).
As exemplified in an application to a cross-section of 94 countries in the year 2008, the GLM
framework can be useful for choosing between competing estimators.
Because our focus is on consistent estimators of the gravity equation (1), one common estimator
which we will ignore is the OLS estimator of the regression with dependent variable
lnXij . The log-linearized OLS estimator will generally deliver inconsistent estimates for β, even
if the true model is (1). A lucid exposition of this issue is given in Santos Silva and Tenreyro
(2006). Following their illustration, note that the logarithmized equation for bilateral exports is
$$\ln X_{ij} = e_i + m_j + d_{ij}'\beta + \ln\eta_{ij},$$
and OLS estimation will only be consistent if E(ln ηij |ei,mj , dij) = 0. But, as (2) makes clear,
ln ηij is a function of the covariates and εij , and so the condition for consistency of OLS will be
violated in general.
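A small simulation can make this inconsistency concrete. The sketch below is illustrative only and is not part of the paper's own experiments: it assumes a single regressor, no fixed effects, a true slope of 1, and log-normal multiplicative errors whose variance depends on the regressor. The helper `ppml` is a hypothetical name for a bare-bones Newton solver of the Poisson first-order conditions.

```python
import numpy as np

def ppml(X, y, n_iter=100, tol=1e-10):
    """Poisson pseudo-ML for E(y|X) = exp(X @ b), solved by Newton's method."""
    b = np.zeros(X.shape[1])
    b[0] = np.log(y.mean())              # start at the constant-only fit (column 0 is the constant)
    for _ in range(n_iter):
        mu = np.exp(X @ b)
        score = X.T @ (y - mu)           # first-order conditions
        hess = X.T @ (X * mu[:, None])   # negative Hessian of the Poisson log-likelihood
        step = np.linalg.solve(hess, score)
        b = b + step
        if np.max(np.abs(step)) < tol:
            break
    return b

rng = np.random.default_rng(0)
n = 20000
d = rng.normal(size=n)
X = np.column_stack([np.ones(n), d])
mu = np.exp(1.0 * d)                     # conditional mean with true slope 1
sigma2 = np.log(1.0 + 1.0 / mu)          # log-scale variance, chosen so that V(y|d) = mu
eta = np.exp(rng.normal(-0.5 * sigma2, np.sqrt(sigma2)))   # E(eta|d) = 1
y = mu * eta

b_ols = np.linalg.lstsq(X, np.log(y), rcond=None)[0]   # log-linearized OLS
b_ppml = ppml(X, y)                                     # estimator based on the mean function
print("OLS slope:", b_ols[1], "PPML slope:", b_ppml[1])
```

Under these assumptions the mean of ln η varies with the regressor, so the OLS slope should drift away from 1, while the Poisson pseudo-ML slope should stay close to it.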
Approaches that estimate equation (1) consistently are discussed below in Section 2. The
Monte Carlo setup and results from the simulation study are presented in Section 3, followed
by an application with real-world data in Section 4. We summarize and discuss our findings in
Section 5.
1 In some of the aforementioned work, e.g., in Eaton and Kortum (2002) or Waugh (2010), structural constraints
are assumed by the underlying theory but not imposed in estimation.
2 Estimation of gravity models of bilateral trade
In gravity equations derived from economic models of trade, the exporter- and importer-specific
effects ei and mj are functions of the bilateral-specific trade costs tij . This implies that the
exporter- and importer-specific terms are, in general, correlated with the bilateral variables
dij . Thus, estimators relegating these terms to the error will be inconsistent. This includes all
(non-structural) random effects approaches.
Approaches which do allow for dependence between exporter- and importer-specific effects
are structural estimation and fixed effects estimation. Structural estimation explicitly specifies
the exporter- and importer-specific terms as a function of the economic model’s variables. In
contrast, the fixed effects approach is to leave the relationship between dij and the exporter-
and importer-specific terms –the fixed effects– unspecified, and is thus consistent regardless of
the nature of their dependence. Within this approach, there are two ways of handling the fixed
effects. The first way is to estimate β from moment conditions which are derived from (1) but
which do not depend on the fixed effects. The second way treats the fixed effects as parameters
to be estimated. Although both fall into the fixed effects domain, for convenience we will from
now on refer to estimators under the first approach as quasi-differences estimators and to the
ones under the latter approach as fixed effects estimators.
Structural estimators fully exploit the information on the data-generating process and are
thus potentially more efficient than other approaches. However, to be consistent they require
the underlying economic model to be correctly specified. On the other hand, quasi-differences
estimators and fixed effects estimators are consistent under much weaker assumptions. The
quasi-differences approach has the feature of avoiding estimating a potentially large number of
parameters. This can be an advantage because estimation of the set of fixed effects can be both
computationally difficult and lead to less precise estimates. On the other hand, some objects of
interest (such as predictions and average marginal effects) are functions not only of β, but also
of ei and mj . Without having estimates of these fixed effects, it may be impossible to recover
estimates for such objects of interest.
An important drawback of fixed effects estimators of nonlinear models is that they suffer
from the incidental parameters problem. In the context of the gravity equation, this means
that as the number of observations grows, the number of parameters ei and mj that has to be
estimated grows as well. In general, this implies that ei and mj cannot be estimated consistently
and, worse, the inconsistency passes over to β. However, the exponential mean model has
the attractive feature that, like the linear model, it does not exhibit the incidental parameters
problem. This fact is well-known in the context of panel models with one way (i.e., one dimension
of fixed effects). Recent simulation evidence suggests that this is also the case with two fixed
effects (Fernández-Val and Weidner, 2013); and our own Monte Carlo results of section 3 also
point in this direction.
We begin by discussing the GLM approach (Section 2.1), which can be used to estimate a
variety of fixed effects and structural estimators, and in principle even serve as a “wrapper" for
some quasi-differenced moment conditions. To fix ideas, we explain the GLM approach in the
context of fixed-effects estimation. Then, we discuss structural and quasi-differences estimation
in some more detail in sections 2.2 and 2.3.
2.1 GLM estimation of gravity models with fixed effects
Nonlinear models with a linear index, such as (1), can be estimated consistently by a host of so-
called generalized linear model (GLM) estimators. These models have a common structure, as is
well-known and outlined briefly in this section. The GLM framework (Nelder and Wedderburn,
1972; McCullagh and Nelder, 1989) is a likelihood-based approach that remains consistent for
the parameters of interest as long as the conditional expectation function is correctly specified.
It is widely used to estimate nonlinear models.
GLM is based on likelihood estimation of potentially misspecified densities of the linear-
exponential family (LEF) of distributions, which includes the Poisson, Negative Binomial,
Gamma, Normal, and other distributions. A GLM estimator is determined by (i) the specification
of a link function – which governs the relationship between the conditional expectation
function and the linear index of covariates – together with (ii) the specification of a family, a
density from the LEF.
Let us view ei and mj under the fixed-effects approach, where the fixed effects are parameters
to be estimated. The distribution of Xij is part of the LEF class if its density can be written as
$$f_{\theta}(X_{ij}) = \exp\left\{\frac{X_{ij}\theta_{ij} - b(\theta_{ij})}{a(\phi)} + c(\theta_{ij}, X_{ij})\right\},$$
with parameters θij = (β′, ei,mj)′, ϕ (which we will later show to be related to the variance of
Xij , denoted as V (Xij)), and arbitrary functions a(·), b(·) and c(·). Such a random variable will
have the log-likelihood
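$$l(\theta_{ij}) = \frac{X_{ij}\theta_{ij} - b(\theta_{ij})}{a(\phi)} + c(\theta_{ij}, X_{ij}),$$
and score
$$s(\theta_{ij}) = \frac{\partial l(\theta_{ij})}{\partial\theta_{ij}} = \frac{X_{ij} - b'(\theta_{ij})}{a(\phi)}.$$
A fundamental result of likelihood theory is that the expected score evaluated at the true value
of the parameter is zero. For members of the LEF, this implies
$$E[s(\theta_{ij})] = \frac{E(X_{ij}) - b'(\theta_{ij})}{a(\phi)} = 0,$$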
and, therefore, E(Xij) = b′(θij) ≡ µij . In GLM estimation, the mean of Xij , µij , is specified as
a function of a linear index, in our context d′ijβ + ei +mj , so that µij = g−1(d′ijβ + ei +mj), or
g(µij) = d′ijβ + ei +mj . This function, g(·), is called the link function. The gravity model (1),
consequently, implies a log-link.
A second result from likelihood theory, the information matrix equality, allows us to derive the
variance function. The Hessian of a LEF random variable is
$$H(\theta_{ij}) = \frac{\partial^2 l(\theta_{ij})}{\partial\theta_{ij}^2} = \frac{-b''(\theta_{ij})}{a(\phi)}.$$
By the information matrix equality, $V[s(\theta)] = -E[H(\theta)]$, and therefore
$$\frac{E[X_{ij} - b'(\theta_{ij})]^2}{a(\phi)^2} = \frac{b''(\theta_{ij})}{a(\phi)},$$
which is equivalent to $V(X_{ij}) = b''(\theta_{ij})\,a(\phi)$.
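As a worked illustration of the mean and variance results, consider two members of the LEF. The parametrizations below are the standard textbook ones and are an addition for illustration, not taken from the text above; $\theta_{ij}$ is treated here as the scalar canonical parameter.

```latex
\begin{align*}
\text{Poisson:}\quad & \theta_{ij} = \ln\mu_{ij},\quad b(\theta_{ij}) = e^{\theta_{ij}},\quad a(\phi) = 1
  \;\Rightarrow\; b'(\theta_{ij}) = \mu_{ij},\quad
  V(X_{ij}) = b''(\theta_{ij})\,a(\phi) = \mu_{ij},\\
\text{Gamma:}\quad & \theta_{ij} = -1/\mu_{ij},\quad b(\theta_{ij}) = -\ln(-\theta_{ij}),\quad a(\phi) = \phi
  \;\Rightarrow\; b'(\theta_{ij}) = -1/\theta_{ij} = \mu_{ij},\quad
  V(X_{ij}) = \phi\,\theta_{ij}^{-2} = \phi\,\mu_{ij}^{2}.
\end{align*}
```

In both cases $E(X_{ij}) = b'(\theta_{ij})$, and the variance is a function of the mean, up to the scale factor $a(\phi)$.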
The parameter vector θ is estimated by maximizing the sample (quasi-)likelihood function
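$$l_C(\beta) = \sum_{i=1}^{C}\sum_{j=1}^{C}\left[\frac{X_{ij}\theta_{ij} - b(\theta_{ij})}{a(\phi)} + c(\theta_{ij}, X_{ij})\right],$$
with corresponding score for β
$$s_C(\beta) = \sum_{i=1}^{C}\sum_{j=1}^{C}\frac{X_{ij} - b'(\theta_{ij})}{a(\phi)}\frac{\partial\theta_{ij}}{\partial\beta}
= \sum_{i=1}^{C}\sum_{j=1}^{C}\frac{X_{ij} - b'(\theta_{ij})}{a(\phi)}\frac{\partial\theta_{ij}}{\partial\mu_{ij}}\frac{\partial\mu_{ij}}{\partial\beta}
= \sum_{i=1}^{C}\sum_{j=1}^{C}\frac{X_{ij} - b'(\theta_{ij})}{b''(\theta_{ij})\,a(\phi)}\frac{\partial\mu_{ij}}{\partial\beta}.$$
Analogous equations hold for ei and mj . The last equality follows from µij = b′(θij), which implies
∂µij/∂θij = b′′(θij). Thus, the first order conditions sC(β) = 0 can be written as
$$\sum_{i=1}^{C}\sum_{j=1}^{C}\frac{[X_{ij} - E(X_{ij})]}{V(X_{ij})}\frac{\partial E(X_{ij})}{\partial\beta} = 0. \qquad (3)$$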
Equation (3) shows that GLM estimators are moment estimators based on the conditional expectation
residual (the expression in square brackets). As long as the expectation function is
correctly specified, they will be consistent for β. Equation (3) also shows that GLM estimators
weight the residuals in the first-order conditions by the inverse of the variance, V (Xij). Therefore,
relative efficiency gains – but not consistency of β – depend on the correct specification
of the variance function. Different choices of the LEF distribution (i.e., of the family) lead to
different variance functions.
As mentioned above, gravity models use the log-link function, so that in general the first
order conditions for β in models based on (1) have the form
$$\sum_{i=1}^{C}\sum_{j=1}^{C}\frac{[X_{ij} - \exp(e_i + m_j + d_{ij}'\beta)]}{V(X_{ij})}\exp(e_i + m_j + d_{ij}'\beta)\,d_{ij} = 0. \qquad (4)$$
Various distributions are suitable candidates for gravity models in the sense that they are members
of LEF and that they imply variance functions V (Xij) that are potentially compatible with
bilateral export flows. Poisson and Gamma estimators have been discussed in Santos Silva and
Tenreyro (2006) and others; estimators based on Gaussian, inverse Gaussian, and Negative Binomial
quasi-likelihood are also possible and have been used in the literature, though much less
so than Poisson and Gamma estimators. The implied variance functions, in terms of the conditional
expectation function µij , are given in Table 1 below. Because of the exponential function,
the derivative ∂E(Xij)/∂β is again proportional to µij , namely µij dij . The column on the right of the table reports the weight
given in the first-order conditions to the residual of country-pair ij, i.e., wij = µij/V (Xij).
Table 1: Variance functions and F.O.C. weights of some GLM gravity estimators
Estimator            V(X_ij)                      w_ij
Poisson              $\mu_{ij}$                   $1$
Gamma                $\mu_{ij}^2$                 $\mu_{ij}^{-1}$
Negative Binomial    $\mu_{ij} + \mu_{ij}^2$      $(1 + \mu_{ij})^{-1}$
Gaussian             $1$                          $\mu_{ij}$
Inverse Gaussian     $\mu_{ij}^3$                 $\mu_{ij}^{-2}$
Thus, as pointed out by Santos Silva and Tenreyro (2006), the Poisson estimator weights every
observation equally, which might be optimal when wanting to remain agnostic with respect to
higher-order specification. The other estimators in Table 1, with the exception of the Gaussian
one, down-weight observations with larger means. If these observations are also noisier in the
sense of having a larger variance, this would result in efficiency gains. In contrast, the Gaussian
estimator gives more weight to country-pairs with larger expected exports, a weighting scheme
which might be beneficial, for instance, if there are trustworthiness issues with small trade
flows.2 The Gaussian GLM gravity estimator is equivalent to the nonlinear least squares (NLS)
estimator of (1).
2 Such trade flows may emerge particularly with or among less-developed countries, where reporting standards
are poor. For instance, Egger and Nigai (2014) illustrate that structural parameterized GLM gravity models tend
to predict large bilateral trade flows with much smaller error than small trade flows.
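The first-order conditions (4) and the weights in Table 1 can be illustrated with a minimal Fisher-scoring sketch. This is an assumption-laden illustration rather than the authors' implementation: the names `VARIANCE`, `foc_weight`, and `glm_log_link` are hypothetical, the dispersion is normalized to one, and the fixed effects are assumed to enter as ordinary columns of the regressor matrix `Z`, whose first column is a constant.

```python
import numpy as np

# Variance functions V(mu) as listed in Table 1 (dispersion normalized to one).
VARIANCE = {
    "poisson": lambda mu: mu,
    "gamma": lambda mu: mu**2,
    "negbin": lambda mu: mu + mu**2,
    "gaussian": lambda mu: np.ones_like(mu),
    "inverse_gaussian": lambda mu: mu**3,
}

def foc_weight(mu, family):
    """Weight w = mu / V(mu) given to each residual in the first-order conditions."""
    return mu / VARIANCE[family](mu)

def glm_log_link(Z, x, family, n_iter=200, tol=1e-10):
    """Solve sum w (x - mu) z = 0 with mu = exp(Z @ b) by Fisher scoring."""
    b = np.zeros(Z.shape[1])
    b[0] = np.log(x.mean())                      # constant-only starting value
    for _ in range(n_iter):
        mu = np.exp(Z @ b)
        w = foc_weight(mu, family)
        score = Z.T @ (w * (x - mu))             # left-hand side of the first-order conditions
        info = Z.T @ (Z * (w * mu)[:, None])     # expected information: sum (mu^2 / V) z z'
        step = np.linalg.solve(info, score)
        b = b + step
        if np.max(np.abs(step)) < tol:
            break
    return b

# Usage on noiseless data: every family should recover the same coefficients,
# because only the weighting of the residuals differs across families.
d = np.linspace(-1.0, 1.0, 50)
Z = np.column_stack([np.ones_like(d), d])
x = np.exp(Z @ np.array([0.3, 0.5]))
estimates = {fam: glm_log_link(Z, x, fam) for fam in VARIANCE}
```

With noisy data the families would generally give different point estimates in finite samples, since each one weights the residuals differently.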
2.2 Iterative-structural estimation
A structural approach to estimating the gravity equation makes use of economic theory. Most
modern gravity models assume resource constraints of the form
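$$E\Big[\sum_{j=1}^{C} X_{ij}\Big] = E\Big[\sum_{j=1}^{C}\exp(e_i + m_j + d_{ij}'\beta)\Big] = E\Big[\exp(e_i)\sum_{j=1}^{C}\exp(m_j + d_{ij}'\beta)\Big] \qquad (5)$$
and
$$E\Big[\sum_{i=1}^{C} X_{ij}\Big] = E\Big[\sum_{i=1}^{C}\exp(e_i + m_j + d_{ij}'\beta)\Big] = E\Big[\exp(m_j)\sum_{i=1}^{C}\exp(e_i + d_{ij}'\beta)\Big]. \qquad (6)$$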
Hence, (5) and (6) imply that the country-specific effects ei and mj are implicitly fully determined
given data on Xij and dij together with estimates of β. Clearly, this implies that
$E[e_i d_{ij}] \neq 0$, $E[m_j d_{ij}] \neq 0$, and $E[e_i m_j] \neq 0$ according to economic theory.3
In our Monte Carlo study, we will employ data-generating processes which respect this
feature. Specifically we will consider an endowment economy. Many data-generating processes
considered in the literature are isomorphic to the one of endowment economies (see Anderson and
van Wincoop, 2003, for such an approach, and Eaton and Kortum, 2002; Arkolakis, Costinot,
and Rodríguez-Clare, 2012; and Bergstrand, Egger, and Larch, 2013, for isomorphic processes).
Suppose an economy i is endowed with a production volume of Hi which it sells at a mill
price of Pi, earning it an aggregate income (GDP) of Yi = PiHi. In general, sales to market j are
impeded by some trade costs, for which we use the notation Tij as in the introduction. Then,
in an endowment economy with Armington differentiation (between the goods from different
countries) at an elasticity of 1 − α, nominal aggregate bilateral trade flows in a world akin to
the one in Anderson and van Wincoop (2003) are determined as4
$$E(X_{ij}) = \frac{P_i^{\alpha} T_{ij} Y_j}{\sum_{k=1}^{C} P_k^{\alpha} T_{kj}}, \qquad (7)$$
where, at given Hi and Tij , Pi is implicitly determined from the resource constraint as generically
introduced in (5) and (6), which in the present context becomes
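$$Y_i = P_i H_i = \sum_{j=1}^{C}\frac{P_i^{\alpha} T_{ij} P_j H_j}{\sum_{k=1}^{C} P_k^{\alpha} T_{kj}}, \qquad (8)$$
$$\Rightarrow\quad P_i^{1-\alpha} = \frac{1}{H_i}\sum_{j=1}^{C}\frac{T_{ij} P_j H_j}{\sum_{k=1}^{C} P_k^{\alpha} T_{kj}}. \qquad (9)$$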
3 As a consequence, omitting ei + mj from the specification in (2) or simply replacing it by a log-additive
function of GDP and/or GDP per capita of countries i and j, as had been done for decades in a-theoretical
empirical gravity models, will generally lead to inconsistent estimates of β, the semi-elasticity of trade with
respect to dij .
4 We skip the exporter-specific preference parameter introduced in Anderson and van Wincoop (2003) without
loss of generality in the interest of brevity.
This is exactly the data-generating process which we will use for the endogenous variables in the
model, {E(Xij), Pi, Yi}, before adding a stochastic term to (7). In terms of the gravity equation
(1), $\ln T_{ij} = d_{ij}'\beta$, $\ln P_i^{\alpha} = e_i$, and $\ln\big(P_j H_j / \sum_{k=1}^{C} P_k^{\alpha} T_{kj}\big) = m_j$.
Assuming that data on GDP of each country i (Yi) are available and α is known, a structural
estimator for β, ei, mj (i = 1, . . . , C, j = 1, . . . , C) can be defined as the solution obtained by
iterating between the following Step 1 and Step 2 until convergence:
Step 0 Initial values:
Set initial values for $e_i$, $m_j$ ($i = 1, \ldots, C$, $j = 1, \ldots, C$).
Call vectors of these values $\{e^{(n)}, m^{(n)}\}$.
Step 1 Update β:
Given current values $\{e^{(n)}, m^{(n)}\}$, generate the auxiliary variable
$$em_{ij}^{(n)} = e_i^{(n)} + m_j^{(n)}.$$
Estimate β by GLM regression of $X_{ij}$ on $d_{ij}$ and $em_{ij}^{(n)}$, under the constraint
that the coefficient corresponding to $em_{ij}^{(n)}$ is set to 1.
The resulting estimator of β is called $\beta^{(n)}$.
Step 2 Update {e, m}:
Given current estimates $\beta^{(n)}$, $\{e^{(n)}, m^{(n)}\}$, GDP $\{Y_i\}_{i=1}^{C}$, and α, obtain
$T_{ij}^{(n)} = \exp\{d_{ij}'\beta^{(n)}(1-\alpha)\}$, $e_i^{(n+1)} = \ln E_i^{(n+1)}$ and $m_j^{(n+1)} = \ln M_j^{(n+1)}$, where
$$E_i^{(n+1)} = \frac{Y_i}{\sum_j T_{ij}^{(n)}\dfrac{Y_j}{\sum_k E_k^{(n)} T_{kj}^{(n)}}} \quad\text{and}\quad M_j^{(n+1)} = \frac{Y_j}{\sum_i E_i^{(n)} T_{ij}^{(n)}}.$$
After Step 2, (n + 1) is redefined as (n) and Step 1 is repeated. Convergence criteria can be
defined by norms on the vector of differences $(\beta^{(n+1)\prime} - \beta^{(n)\prime},\; e^{(n+1)\prime} - e^{(n)\prime},\; m^{(n+1)\prime} - m^{(n)\prime})$. As
in the fixed-effects approach, different iterative-structural estimators are defined by the choice
of the respective family of distributions for the GLM regression in Step 1.
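The update of the country-specific terms in Step 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `update_fixed_effects` is hypothetical, the trade-cost matrix `T` and GDP vector `Y` are made-up inputs taken as given (β is held fixed), and Step 1 is omitted. In most GLM software the unit-coefficient constraint of Step 1 can be imposed by passing the auxiliary variable as an offset.

```python
import numpy as np

def update_fixed_effects(Y, T, E):
    """One pass of Step 2 in levels: importer terms M, then exporter terms E."""
    M_new = Y / (E @ T)          # M_j = Y_j / sum_i E_i T_ij
    E_new = Y / (T @ M_new)      # E_i = Y_i / sum_j T_ij M_j
    return E_new, M_new

# Illustrative inputs (assumptions): 5 countries, frictionless internal trade.
rng = np.random.default_rng(1)
C = 5
Y = rng.uniform(1.0, 10.0, size=C)
T = rng.uniform(0.1, 1.0, size=(C, C))
np.fill_diagonal(T, 1.0)

E = np.ones(C)                   # Step 0: initial values
for _ in range(500):             # repeat the update with T held fixed
    E, M = update_fixed_effects(Y, T, E)

flows = E[:, None] * T * M[None, :]   # implied mean flows E_i T_ij M_j
print(flows.sum(axis=1) - Y)          # row sums match Y by construction
print(flows.sum(axis=0) - Y)          # column sums match Y once the updates have settled
```

In the full procedure only one such pass would be made per outer iteration, with `T` rebuilt from the latest estimate of β before each pass; the inner loop here only shows that repeated updates settle on terms that satisfy both sets of resource constraints.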
2.3 Quasi-differences estimation
The approaches in the previous sections estimate the gravity model by either viewing the fixed
effects as additional parameters to be jointly estimated with β or by obtaining them from the
solution to an economic system of resource constraints. An advantage of these approaches is
that the estimated fixed effects can be used to calculate general equilibrium comparative statics
(see e.g. Fally, 2012, who uses a Poisson estimation with fixed effects to recover parameters
of an Anderson and van Wincoop, 2003, model). Compared to the structural estimation, the
approach of estimating the fixed effects allows one to account for additional unobserved importer-
and exporter-specific trade costs not part of the structural importer- and exporter-specific terms
(see e.g. Egger et al., 2012, for an example of such an empirical approach).
An alternative approach is to transform the model to obtain a moment condition that does
not depend on the fixed effects at all. This has the advantage that the number of parameters
to be estimated is reduced substantially which can lead to considerable efficiency gains in the
estimation of β. Since βk (the kth element of β) has the interpretation of the ceteris paribus or
partial equilibrium semi-elasticity of exports with respect to the bilateral trade cost dij,k, the
vector β is often a first-order object of interest in gravity estimation.
In linear models, differencing observations with respect to i and j yields a fixed-effects-
free expression (see for instance Baltagi, 2013, or Cameron and Trivedi, 2005). In nonlinear
models with an exponential mean and one-way fixed effects, a quasi-differencing analog exists:
fixed effects can be eliminated by using i-specific ratios of observations. Charbonneau (2012)
introduced a quasi-differencing approach for exponential mean models with two-way fixed effects,
deriving a moment condition from (1) which does not depend on either ei or mj .
In the context of the gravity model (1), this moment condition requires sets s of 4 country-
pairs involving two exporters (i, l) and two importers (k, j). The two products of exports,
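$$X_{ik}X_{lj} = \exp[(e_i + e_l) + (m_k + m_j) + (d_{ik} + d_{lj})'\beta] + (\varepsilon_{ik} + \varepsilon_{lj}), \qquad (10)$$
$$X_{lk}X_{ij} = \exp[(e_i + e_l) + (m_k + m_j) + (d_{lk} + d_{ij})'\beta] + (\varepsilon_{lk} + \varepsilon_{ij}), \qquad (11)$$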
have the same sums of exporter- and importer-specific terms. Dividing by the bilateral-specific
part and taking conditional expectations yields
$$E\{X_{ik}X_{lj}\exp[-(d_{ik} + d_{lj})'\beta] \mid \cdot\} = \exp[(e_i + e_l) + (m_k + m_j)], \qquad (12)$$
$$E\{X_{lk}X_{ij}\exp[-(d_{lk} + d_{ij})'\beta] \mid \cdot\} = \exp[(e_i + e_l) + (m_k + m_j)], \qquad (13)$$
where the conditioning variables have been suppressed to avoid cluttered notation.5 Since the
expectations of these terms are the same, a method of moment estimator can be based, for
instance, on the difference (12) minus (13). Specifically, Charbonneau (2012) proposes to use
the unconditional moment conditions
$$E\{[X_{ik}X_{lj} - X_{lk}X_{ij}\exp((d_{ik} + d_{lj} - d_{lk} - d_{ij})'\beta)]\,(d_{ik} + d_{lj} - d_{lk} - d_{ij})\} = 0 \qquad (14)$$
as a basis for GMM estimation.
5 The conditioning variables are $d_{ik}, d_{lj}, e_i, e_l, m_k, m_j$ in the first equation, and the corresponding variables in
the second.
If the data contain no observations with zero bilateral export flows,
alternative moment conditions can be used. Defining $X_s \equiv X_{ik}X_{lj}/(X_{lk}X_{ij})$ and
$d_s = (d_{ik} + d_{lj}) - (d_{lk} + d_{ij})$, one obtains
$$E(X_s \mid d_s) = \exp(d_s'\beta). \qquad (15)$$
The quasi-differenced model (15) lends itself to estimation by any of the GLM estimators discussed
before.
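The cancellation of the fixed effects in the ratio of export products can be checked numerically. The sketch below is illustrative: it uses noiseless exports generated from (1) with a single bilateral variable, and the function name `quasi_difference` and all numerical values are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
C, beta = 6, 1.0
e = rng.normal(size=C)                  # exporter effects e_i
m = rng.normal(size=C)                  # importer effects m_j
d = rng.normal(size=(C, C))             # one bilateral variable d_ij
X = np.exp(e[:, None] + m[None, :] + beta * d)   # noiseless exports from (1)

def quasi_difference(X, d, i, l, k, j):
    """Return (X_s, d_s) for the set with exporters (i, l) and importers (k, j)."""
    X_s = X[i, k] * X[l, j] / (X[l, k] * X[i, j])
    d_s = (d[i, k] + d[l, j]) - (d[l, k] + d[i, j])
    return X_s, d_s

X_s, d_s = quasi_difference(X, d, i=0, l=1, k=2, j=3)
print(X_s, np.exp(beta * d_s))          # equal: e and m have dropped out
```

Because every exporter and importer term appears once in the numerator and once in the denominator, the ratio depends only on the quasi-differenced regressor, whatever the values of the fixed effects.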
Even if the country-pair-specific errors εij are independent, the corresponding errors εs =
Xs−E(Xs|ds) will exhibit a mechanical dependence structure as a result of overlaps in exporter
and importer countries in two observations s and s′. Therefore, an appropriate cluster-robust
variance estimator should be adopted when using estimators based on (14) or (15).
The number of possible sets of countries s is very large, so much that for the number of
countries typically used in applications, it might cause computational difficulties. Sacrificing
efficiency, one might only select a subset of all possible sets s. We only use sets for which
i = 1, . . . , C − 2; C − 1 ≥ j ≥ i; l = i + 1; and k ≥ l. This still increases the observations
by an order of magnitude. To get a perspective of the number of sets, for 10 countries (90
country-pairs) this results in 245 sets; for 50 countries (2,450 country-pairs) it results in 55,225
sets.
3 Monte Carlo experiments
If the data are generated from the gravity equation (1) and according to an underlying economic
model satisfying equations (5), (6), (7), and (9), all estimators discussed in section 2 are
consistent. To explore the performance of the estimators in finite samples, we set up a Monte
Carlo experiment. The questions that we seek to explore are how the relative performance of the
estimators is affected by features of the economic model, by the distribution of the stochastic
errors, and by an increase in the number of countries.
3.1 Data-generating process
The data generating process (DGP) consists of a structural part, in which the bilateral-, exporter-,
and importer-specific determinants of exports are drawn; and a stochastic part, in which random
errors are drawn and joined to the structural part of exports. In terms of the gravity equation,
the structural part is the conditional expectation function given in (1) (denoted by µij), and the
stochastic part is εij .
I. Structural economic model
The structural part of the DGP follows the economic model (5)–(9). Given
(i) a value of the substitution elasticity α,
(ii) endowments Hi for every country (i = 1, . . . , C), and
(iii) bilateral trade cost factors Tij for every country-pair ij,
the model implicitly determines GDP Yi, prices Pi, and structural exports E(Xij) = µij through
general equilibrium constraints. Thus, these variables are determined by α and the joint distribution
of Hi and Tij .
Broadly in line with Anderson and van Wincoop (2003) and a host of other estimates in the
literature, the parameter α is set at the value -4 as a baseline. The sensitivity of the gravity
estimators to this parameter is explored by varying α to -9 in an alternative scenario. Ceteris
paribus, a higher elasticity of substitution between goods (higher absolute value of α) leads to
a higher variability of exports while endowments remain constant.
The parameters Hi and Tij are drawn in a way which is consistent with economic theory.
In particular, we set intra-national trade frictions to zero (see Eaton and Kortum, 2002; or
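Anderson and van Wincoop, 2003), whereby $T_{ii} = 1$ for all i. For $j \neq i$, $T_{ij} \in (0, 1]$. First, two
auxiliary bivariate normal variables are drawn:
$$(z^H_{ij}, z^T_{ij})' \sim N(\mu_z, \Sigma_z), \quad \mu_z = (3, -2)', \quad
\Sigma_z = v_{HT}\begin{pmatrix} (3\sqrt{C}/4)^2 & 0.95 \times 15\sqrt{C}/4 \\ 0.95 \times 15\sqrt{C}/4 & 25 \end{pmatrix}.$$
Then, the parameters of interest are obtained as
$$H_i = \exp\Big(\sum_j z^H_{ij}/C\Big)\,C > 0 \quad\text{and}\quad
T_{ij} = \left(\frac{\exp(z^T_{ij})}{1 + \exp(z^T_{ij})}\right)^{-\alpha/4} \in (0, 1).$$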
The number of countries C is incorporated in Hi (and zH) such that the share of a country’s
endowment in world endowment remains constant as C increases. With a baseline of 10 countries,
the correlation between Hi and Tij is about 25% and weakens to 12% as C increases to 50.
Endowments Hi have a mean of about 200.
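The draws of endowments and trade cost factors can be sketched as below. This is an illustrative reading of the description above, not the authors' code: the function name `draw_endowments_and_trade_costs` is hypothetical, the standard deviation of the first auxiliary variable is read as (3/4) times the square root of C, and one pair of auxiliary variables is assumed to be drawn independently for every ordered country pair.

```python
import numpy as np

def draw_endowments_and_trade_costs(C, alpha=-4.0, v_ht=0.1, seed=0):
    """Draw endowments H (length C) and trade cost factors T (C x C)."""
    rng = np.random.default_rng(seed)
    sd_h = 3.0 * np.sqrt(C) / 4.0                  # assumed reading of 3*sqrt(C)/4
    cov_ht = 0.95 * 15.0 * np.sqrt(C) / 4.0        # off-diagonal element of Sigma_z
    mean = np.array([3.0, -2.0])
    cov = v_ht * np.array([[sd_h**2, cov_ht],
                           [cov_ht, 25.0]])
    z = rng.multivariate_normal(mean, cov, size=(C, C))   # shape (C, C, 2)
    z_h, z_t = z[..., 0], z[..., 1]
    H = C * np.exp(z_h.sum(axis=1) / C)            # H_i = exp(sum_j z^H_ij / C) * C
    T = (np.exp(z_t) / (1.0 + np.exp(z_t))) ** (-alpha / 4.0)
    np.fill_diagonal(T, 1.0)                       # no intra-national trade frictions
    return H, T

H, T = draw_endowments_and_trade_costs(C=10)
print(H.mean(), T.min(), T.max())
```

Solving the general equilibrium system (8) and (9) for prices given `H` and `T` is a separate step that this sketch does not attempt.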
The parameter vHT is a scaling constant for the variance of Hi and Tij which is set equal to
0.1 at baseline. This produces a variance of about 2,200. In an alternative scenario, we increase
the cross-country inequality in endowments by setting vHT = 0.3 which increases the variance
to about 7,800. Similar to the increase in |α| as discussed above, more inequality in endowments
(higher vHT ) leads to more variability in exports as well. However, it also leads to higher average
exports. The resulting coefficient of variation of exports is even slightly lower (1.37) than in
the baseline (1.5), whereas the increase in |α| is associated with a significantly higher coefficient
of variation (2.0). Thus, the challenge of the experiment of increasing |α| lies in the increased
dispersion of exports, whereas the challenge of increasing vHT lies in the greater variance of the
fixed effects.
Finally, we parameterize Tij by a single observable trade cost parameter dij for each ij as
$$\ln T_{ij} = \beta_0 + \beta_1 d_{ij}, \qquad (16)$$
with β0 = 0 and β1 = 1.
II. Stochastic shocks
Observed exports are obtained by joining stochastic shocks with the mean of exports, µij , as determined
by the structural model: Xij = µijηij . Provided the errors ηij are mean-independent,
E(ηij) = 1, all considered estimators are consistent. Their asymptotic efficiency depends on
the variance of ηij . We consider six different variance functions for ηij . Five of these imply
asymptotic efficiency of the Poisson, Gamma, Negative Binomial, Gaussian, and Inverse Gaussian
estimators, respectively. The sixth variance function of ηij is not optimal for any of these
five estimators.
The errors ηij are drawn from a heteroskedastic log-normal distribution
$$\eta_{ij} = \exp(z^{\eta}_{ij}), \quad z^{\eta}_{ij} \sim N(-0.5\,\sigma^2_{\eta}, \sigma^2_{\eta}).$$
Hence, $E(\eta_{ij}) = 1$ and $V(\eta_{ij}) = \exp(\sigma^2_{\eta}) - 1$. Since $V(X_{ij} \mid d_{ij}, e_i, m_j) = \mu_{ij}^2 V(\eta_{ij})$, we determine
$\sigma^2_{\eta}$ as detailed in Table 2. The term “Optimal estimator” in the table refers to asymptotic
efficiency. This does not exclude the possibility of poor small-sample performance. While the
sixth (dubbed “Other”) variance function we consider does not correspond to any one of the
discussed estimators, we note that among the estimators of Table 2 it is ‘closest’ to the one of
Poisson for large values of µij .
Table 2: Variance functions in Monte Carlo simulations
Number   σ²_η                          V(η_ij)                 V(X_ij)                     Optimal estimator
1        $\ln(2)$                      $1$                     $\mu_{ij}^2$                Gamma
2        $\ln(1 + \mu_{ij}^{-1})$      $\mu_{ij}^{-1}$         $\mu_{ij}$                  Poisson
3        $\ln(2 + \mu_{ij}^{-1})$      $1 + \mu_{ij}^{-1}$     $\mu_{ij} + \mu_{ij}^2$     NegBin
4        $\ln(1 + \mu_{ij}^{-2})$      $\mu_{ij}^{-2}$         $1$                         Gaussian
5        $\ln(1 + \mu_{ij})$           $\mu_{ij}$              $\mu_{ij}^3$                Inv. Gaussian
6        $\ln(1 + \mu_{ij}^{-1.5})$    $\mu_{ij}^{-1.5}$       $\sqrt{\mu_{ij}}$           Other
In every case, we rescale the errors ηij to achieve a constant pseudo-R2 = V(µij)/[V(µij) +
V(εij)] of about 50% for all scenarios and variance functions. The errors εij in this pseudo R2
are the implicit additive errors Xij − µij .
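The six error designs can be sketched as follows, using the standard log-normal result that V(η) equals exp(σ²) minus one when the log of η has variance σ² and mean equal to minus one half of σ². The names `SIGMA2` and `draw_exports` are hypothetical, and the rescaling to a common pseudo-R² described above is deliberately left out.

```python
import numpy as np

# Log-scale variance sigma_eta^2 as a function of mu, one entry per row of Table 2.
SIGMA2 = {
    "gamma": lambda mu: np.full_like(mu, np.log(2.0)),
    "poisson": lambda mu: np.log1p(mu**-1.0),
    "negbin": lambda mu: np.log(2.0 + mu**-1.0),
    "gaussian": lambda mu: np.log1p(mu**-2.0),
    "inverse_gaussian": lambda mu: np.log1p(mu),
    "other": lambda mu: np.log1p(mu**-1.5),
}

def draw_exports(mu, kind, rng):
    """Draw X = mu * eta with log-normal eta, E(eta) = 1, and variance design `kind`."""
    s2 = SIGMA2[kind](mu)
    eta = np.exp(rng.normal(-0.5 * s2, np.sqrt(s2)))
    return mu * eta

mu = np.array([0.5, 2.0, 10.0])                    # illustrative conditional means
implied_var = {k: mu**2 * np.expm1(f(mu)) for k, f in SIGMA2.items()}   # V(X) = mu^2 V(eta)
X = draw_exports(mu, "poisson", np.random.default_rng(3))
```

The dictionary `implied_var` reproduces the V(X) column implied by each choice of σ², which gives a quick consistency check on the variance designs.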
III. Specifications and estimators
For estimation, we consider four alternative specifications:
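$$\text{S1:}\quad X_{ij} = \exp(\beta_{10} + \beta_{11}d_{ij} + e_i + m_j) + \varepsilon_{1ij}, \qquad (17)$$
$$\text{S2:}\quad X_{ij} = \exp\Big(\beta_{20} + \beta_{21}d_{ij} + \sum_{i=2}^{C} e_{2i}D_i + \sum_{j=2}^{C} m_{2j}D_j\Big) + \varepsilon_{2ij}, \qquad (18)$$
$$\text{S3:}\quad X_{ij} = \exp(\beta_{30} + \beta_{31}d_{ij} + \beta_{32}y_i + \beta_{33}y_j) + \varepsilon_{3ij}, \qquad (19)$$
$$\text{S4:}\quad X_{ij} = \exp(\beta_{40} + \beta_{41}d_{ij}) + \varepsilon_{4ij}. \qquad (20)$$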
All specifications include a constant which is denoted by {β10, β20, β30, β40}. The object of
interest is β1 from (16) which here is estimated by the coefficients on dij , {β11, β21, β31, β41}.
S1 is a structural model which includes the true terms of ei and mj with constrained unitary
parameters on them, in line with economic theory. The true unknown parameter β11 is unity
but allowed to be estimated differently from that. S2 is a two-way fixed effects model which
estimates e2i and m2j by country-specific constants through exporter and importer indicator
variables $D_i$ and $D_j$, and the true parameter $\beta_{21}$ is unity. Hence, S2 estimates $e_i + m_j$ by
$2(C - 1)$ constants rather than by structural constraints as in S1, and is therefore less efficient than
S1. S3 is an old-fashioned ad-hoc gravity model which replaces $\{e_i, m_j\}$ by log exporter and
importer GDP, {yi, yj}, and estimates parameters {β32, β33} on them – about which we do not
have priors. Clearly, this model is misspecified, as {yi, tij , yj} are correlated with ε3ij . However,
given its vast application in the past, it is interesting to see how this model fares in a laboratory
experiment relative to S1 and S2. Finally, S4 is an ad-hoc model which only includes tij apart
from a constant. For the same reason as S3, this model is misspecified, as tij is correlated with
ε4ij . Notice that while both S3 and S4 are misspecified, the underlying problem,
namely that E[eidij ] ≠ 0 and E[mjdij ] ≠ 0, vanishes as the number of countries C grows. Since the
number of countries is large empirically, the bias of β due to omitting ei +mj should be small
in theory (even though countries vary a lot in terms of their size). This issue is commonly
disregarded. Finally, we also estimate two quasi-differences models, based on equations (14) and
(15).
For every specification S1-S4 we use the five GLM estimators based on the Gamma, Poisson,
NegBin, Gaussian, and Inverse Gaussian LEF family. In our baseline scenario α = −4, vHT =
0.1, and C = 10. Since we disregard intra-national trade in estimation (as is commonly the case
in applied work), C = 10 results in 90 observations. We consider three alternative scenarios: a
higher value of |α| with α = −9, a higher value of vHT with vHT = 0.3, and a larger number
of countries with C = 50 (2,450 observations). In each alternative scenario, we leave the two
remaining parameters of {α, vHT , C} at baseline values. Results of a fourth alternative with
α = −2 are displayed in Table 9 in the Appendix.
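The five GLM estimators differ only in the working variance function that weights the first-order conditions. A minimal sketch of this idea, using Fisher scoring with a log link, is given below. It is not the authors' implementation: the function names, the starting values, the convergence rule, and the Negative Binomial variance with dispersion fixed at one (matching variance function 3 of Table 2) are all assumptions made for illustration. The helper `fe_design` builds the regressors of specification S2 with country 0 as the omitted category.

```python
import numpy as np

# Working variance functions V(mu); "NB" assumes a dispersion parameter of one.
VARIANCE = {
    "Gam": lambda mu: mu ** 2,
    "P": lambda mu: mu,
    "NB": lambda mu: mu + mu ** 2,
    "Gau": lambda mu: np.ones_like(mu),
    "IG": lambda mu: mu ** 3,
}

def fe_design(d, exp_id, imp_id, n_countries):
    """Regressors of S2: constant, d_ij, exporter dummies, importer dummies."""
    ids = np.arange(1, n_countries)                      # country 0 is the reference
    exp_dummies = (exp_id[:, None] == ids).astype(float)
    imp_dummies = (imp_id[:, None] == ids).astype(float)
    return np.column_stack([np.ones_like(d), d, exp_dummies, imp_dummies])

def fit_glm_log_link(x, design, family, max_iter=100, tol=1e-8):
    """Fisher scoring for E[x | design] = exp(design @ beta).

    Returns (beta, converged). Assumes the first column of design is the constant.
    """
    variance = VARIANCE[family]
    beta = np.zeros(design.shape[1])
    beta[0] = np.log(x.mean())                           # hypothetical starting value
    for _ in range(max_iter):
        eta = design @ beta
        mu = np.exp(eta)
        weights = mu ** 2 / variance(mu)                 # (dmu/deta)^2 / V(mu)
        working = eta + (x - mu) / mu                    # linearized response
        weighted = design * weights[:, None]
        beta_new = np.linalg.solve(design.T @ weighted, weighted.T @ working)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new, True
        beta = beta_new
    return beta, False
```

With C = 10 and intra-national flows dropped, `fe_design` returns a 90 × 20 matrix: the constant, dij, and 2(C − 1) = 18 dummies. Returning a convergence flag alongside the estimate is what allows a convergence count such as CR in the results tables to be tallied.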
3.2 Simulation results
All statistics presented throughout this subsection are based on 1,000 replications of the DGP.
Baseline scenario α = −4, vHT = 0.1, C = 10
Table 3 presents our main results corresponding to iterative structural (IS; dubbed S1 above),
fixed effects (FE; dubbed S2 above), and quasi-differences (QD) estimation.
Estimators are grouped in columns. Six panels present statistics for the estimated β1 under the
six different variance functions of the errors in the DGP: the number of convergences achieved out
of 1,000 runs (CR), the mean of β1 over converged estimations (Mean), the standard deviation
(SD), median (Med), and the 5th and 95th percentile of the distribution of estimates.
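The statistics reported in each panel can be computed from the replication output in a few lines. The sketch below assumes that each replication returns an estimate of β1 together with a convergence flag; the function name and the use of the sample standard deviation (ddof = 1) are illustrative choices, since the paper does not state these details.

```python
import numpy as np

def summarize(estimates, converged):
    """CR, Mean, SD, Med, 5th and 95th percentile over converged replications."""
    est = np.asarray(estimates, dtype=float)[np.asarray(converged, dtype=bool)]
    return {
        "CR": int(est.size),                 # number of converged replications
        "Mean": est.mean(),
        "SD": est.std(ddof=1),               # sample standard deviation (assumed)
        "Med": np.median(est),
        "5th": np.percentile(est, 5),
        "95th": np.percentile(est, 95),
    }
```

Because all moments are taken over converged runs only, a low CR means that the remaining statistics describe a selected subsample of the replications.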
A glance at the results for the IS estimators reveals that despite the small sample size of
only 90 country-pairs, exploiting all available economic structure of the DGP results in extremely
precise estimates. All estimators are virtually unbiased (mean equal to 1) and display standard
deviations which more often than not are smaller than the two decimal places we report in the
table. The excellent performance of the IS estimators in this case stems from the fact that they
specify the model absolutely correctly, and that the DGP with a single well-behaved explanatory
variable dij is very simple. In this case, the IS estimators may serve as a benchmark against which
the other estimators can be measured. The only exception to the outstanding performance of
the IS estimators lies in the inverse Gaussian (IS-IG) estimator’s serious convergence difficulties.
With a convergence failure rate as high as 40%, this issue points to major numerical difficulties
of this estimator in handling the optimization process reliably for DGPs other than its optimal
variance function.
In the group of FE estimators, the performances of Poisson (FE-P) and Negative Binomial
(FE-NB) come closest to the one of the IS estimators. They display no bias, and present small
standard deviations for FE estimators, lagging only behind the estimator that specifies the vari-
ance process correctly. While the Gamma FE estimator (FE-Gam), too, has a similar dispersion,
it is slightly biased upwards in most DGPs. In contrast, the Gaussian (i.e., nonlinear least
squares; FE-Gau) and inverse Gaussian (FE-IG) estimators display more serious problems.
FE-Gau’s mean is seriously biased upwards under the Gamma and Negative Binomial variance
function DGPs. The problem is not just one of a few outliers, as can be seen from FE-Gau’s
median which does not seem much better-behaved. FE-IG’s performance is dismal. Its mean
is substantially distorted and so is its median. Only if the variance function of exports is cubic
(i.e., with Variance function 5 in the table) does FE-IG produce good results (indeed, in this
case FE-IG provides the best performance among FE estimators, as one would expect).
The last two columns of Table 3 give descriptive statistics of two quasi-differences (QD)
estimators’ small-sample distribution in these DGPs. Both a two-step GMM estimator (QD-GMM)
and a simple Poisson estimator (QD-P) are estimated on 245 observations of country-tetrads. The
performance of QD-GMM is clearly superior to QD-P. Its mean and median are both centered
at the true value of 1. Its standard deviation tends to be larger than that of the better FE
estimators, though. On the other hand, QD-P is visibly biased in some DGPs.
As the results from the alternative scenarios will show, these results are quite robust and to
a large degree they remain unchanged throughout the alterations that the DGP is subjected to.
Finally, results of the baseline DGP for these IS, FE, and QD estimators can be compared to
the inconsistent ad-hoc approaches. The results are displayed in Table 4. The first five columns
contain results for GLM estimators using log-GDP (yi and yj) as proxies for ei and mj (dubbed
S3 above); the last five columns contain results from GLM regressions of Xij on dij without
any additional regressors (dubbed S4 above). Table 4 illustrates that large biases can arise from
the omission and incorrect treatment of fixed effects. We will not consider these estimators any
further in detail. It will suffice to say that the biases in the alternative scenarios with increased
|α| or vHT are substantially larger than those in this baseline. On the other hand, the biases
decrease with a larger sample size; a consequence of the fact that in our structural model the
correlation between endowments and bilateral trade costs decreases in a growing world (i.e.,
with more trade partners becoming available).6
6 We would like to emphasize this point and put it in context. Anderson and van Wincoop (2003) correctly
remarked that omitting the structural country-specific effects ei +mj from the estimation in gravity models is at
odds with economic theory. Parameter bias of β may be a consequence, since E[eidij ] ≠ 0 and E[mjdij ] ≠ 0, as
we remarked above. However, with a large pair-specific component in trade costs (as documented in Egger and
Nigai, 2014) and more than 200 countries in the world economy, the bias of β from omitting ei+mj in estimation
should be small, especially if the observations (country-pairs) are weighted equally, as is the case with Poisson
estimation. To some extent, this may be viewed as a potential rehabilitation of ad-hoc gravity models as had
been used in the past.
Higher elasticity of substitution α = −9, and higher variance of endowments, vHT =
0.3
Results for the DGP with α = −9 are displayed in Table 5. As discussed above, this experiment
increases the variability of trade flows by making countries react more strongly to price
differences of traded goods, while the world endowment is held constant. In general,
this is a more challenging DGP for the estimators. The IS estimators maintain their good
performance, but convergence failures become more prevalent.
The IG estimator, both in its IS and its FE variant, cannot be used. Its convergence rates are
close to 0%, and even the instances recorded as converged are likely to be convergence failures
as the estimates were extremely large numbers. We have marked such cases where the statistic
was not credible with the symbol “†”.
Among the FE estimators, the performances broadly echo the baseline. FE-P and FE-NB
continue to show favorable performances, and likewise FE-Gau continues having difficulties in
the Gamma and Negative Binomial DGP. The most striking result is the stark deterioration
of FE-Gam, which now only shows acceptable properties in the Gamma and Inverse Gaussian
DGPs, while having up to 30% bias in the median in the other DGPs.
In contrast, QD-GMM is only slightly negatively affected by this DGP relative to the baseline.
Similarly, there is not much difference in QD-P’s medians. Its means, however, seem quite
distorted by outliers.
By increasing vHT to 0.3, the variability of the endowments (and, hence, of the fixed effects
ei and mj) is increased. The simulation results in Table 6 suggest that this kind of change
can be handled better by the estimators than the increase in the substitution elasticity; the IS
estimators, for instance, have a visibly better convergence rate. The IS-IG estimator works quite
reliably; however, its FE-IG counterpart shows the same poor performance as before.
Higher number of countries C = 50
Finally, a much larger sample is considered. With C = 50, one obtains 2,450 observations on
country-pairs that can be used in estimation. This change in the DGP can help in assessing to
what extent the problems described above are small-sample difficulties that can be resolved with
more observations. The results in Table 7 indicate that by and large most of the issues indeed
vanish when data on more countries are available. FE-Gam, FE-P and FE-NB all have little to
no bias and standard deviations which are often not far from IS. The average bias of FE-Gau
is substantially smaller, and even more so is its median bias. However, the bias is still visible,
and this suggests that, while vanishing asymptotically, it can be quite persistent. At C = 50,
QD-GMM has essentially zero remaining small-sample bias. Its standard deviation is also small,
although it is most often larger than that of FE-Gam, FE-P, and FE-NB. It seems unlikely
that the two-step QD-GMM will catch up and overtake this group of FE estimators in terms of
efficiency with further increases in sample size. However, we only used a small fraction of the
available sets of country-tetrads s, and further efficiency gains might be achieved by increasing
the number of sets s included in estimation. The estimator FE-IG cannot be recommended, in
general. It is the only FE estimator exhibiting tremendous biases in both mean and median.
QD-P cannot be recommended either. Worryingly, some of the biases are larger than in the case
of C = 10, which suggests that QD-P and estimators like it might not have moments even in
larger samples (at least not for the DGPs we considered), or that the rate of convergence is very
slow.
4 Application
The purpose of this section is to apply the models discussed in the previous section to data. For
this, we employ cross-sectional bilateral export data in nominal U.S. dollars from the United
Nations’ Comtrade Database among 94 countries in the year 2008 and combine them with data
on bilateral trade costs from the Centre d’Études Prospectives et d’Informations Internationales’
Geographical Database (GeoDist Database, 2013). In particular, we use two variables from the
latter database: bilateral distance (entitled dist in the variable list) and common language
(entitled comlang_off in the variable list). Rather than using (log) distance in the original
format, we generate five indicator variables based on the quintiles of log distance (=1 if a pair
ij exhibits distance in that quintile, =0 else). This strategy is akin to, e.g., Eaton and Kortum
(2002). Then, we use these five indicators (we suppress the fifth quintile as the norm in order
to estimate a parameter on the constant) and additionally interact them with the language
indicator. Altogether the model then includes nine arguments in the trade cost function: four
distance quintile main indicator variables and five interacted distance quintile with language
indicator variables:
$T_{ij} = \exp\left( \sum_{d=1}^{4} \beta^{D}_{d}\, Dist_{d,ij} + \sum_{d=1}^{5} \beta^{L}_{d}\, Dist_{d,ij}\, Lang_{ij} \right)$ (21)
While this model seems parsimonious (it excludes other candidates from the trade cost function
such as cultural, economic, historical, and institutional similarity indicators), it captures many
of those aspects due to their collinearity with geographical proximity. In any case, the analysis
here is meant to be illustrative for applied researchers, and the trade cost function could be
altered at discretion.
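The nine regressors of the trade cost function in (21) can be constructed along the following lines. This is a sketch only: the function name is hypothetical, the quintile cut-offs are taken from the empirical distribution of log distance via `np.quantile`, and ties at a cut-off are assigned to the lower quintile, none of which is specified in the text.

```python
import numpy as np

def trade_cost_regressors(log_dist, lang):
    """Return an n x 9 matrix: four quintile indicators plus five language interactions."""
    edges = np.quantile(log_dist, [0.2, 0.4, 0.6, 0.8])   # quintile cut-offs
    q = np.searchsorted(edges, log_dist, side="left")     # quintile index 0..4
    dist = (q[:, None] == np.arange(5)).astype(float)     # five quintile indicators
    main = dist[:, :4]                                    # fifth quintile is the omitted norm
    inter = dist * lang[:, None]                          # Dist_d,ij * Lang_ij for d = 1..5
    return np.column_stack([main, inter])
```

Dropping the fifth main indicator while keeping all five interactions means that the constant refers to the most distant pairs without a common language.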
Beyond those variables, all considered GLM models include a constant and some include
exporter and importer country fixed effects to estimate ei and mj (referred to as Fixed-effects
Models) while others include the iteratively-determined structural counterparts to ei and mj as
functions of the estimates Tij (referred to as Iterative-structural Models). Similarly, we tried to
estimate two versions of the quasi-differenced estimator with differenced-out fixed effects, one
relying on GMM (QD-GMM) and one on Poisson-type estimation (QD-P). QD-GMM failed to
converge and we thus only report results for QD-P.
Table 8 summarises the parameter estimates and robust standard errors (in parentheses)
for all considered models7 and, at the bottom of the table, the information on the Akaike
information criterion for the GLM models. The latter is a particularly meaningful statistic with
gravity models as considered here, since the iterative-structural (IS) models are nested in (are
constrained versions of) the fixed-effects (FE) models.
The parameter results may be summarised as follows. First, we observe a relatively stark
difference among the parameter estimates across the considered estimators. Only a rough pattern
is common to all estimators: Trade between countries in the first quintile of distance is higher
by orders of magnitude relative to the more distant countries. Distant country-pairs that share
a common language tend to trade more than those with different languages; but country-pairs in
the first distance quintile that share the same language trade only about half as much as those
with different languages.
From our Monte Carlo simulations in the previous section, we would conclude that the
differences across estimators can hardly be due to a misspecification of the variance process
alone, since there was virtually no bias in much smaller samples considered before. In any
case, it turns out that the Gamma-GLM and the NegBin-GLM estimators obtain very similar
parameter estimates whereas those for Gaussian-GLM and Poisson-GLM are relatively different.
The Akaike Information Criterion suggests that the Gamma-GLM performs best among both
the FE and the IS GLMs each. But NegBin-GLM is very close to Gamma-GLM in that regard.
Hence, in view of the similarity in the parameter estimates and the comparably low values of
the Akaike information Criterion, Gamma-GLM and NegBin-GLM seem to be the preferred
estimators with the data and specification at hand.
Let us define the Pearson residuals of a GLM model of the family type ℓ as
$\varepsilon^{Pearson}_{\ell,ij} = \dfrac{X_{ij} - \mu_{ij}}{\sigma_{\ell X,ij}}$, (22)
where σℓX,ij is the square root of the variance function evaluated at the estimated value of
the conditional mean, µij . With a proper specification of the variance function, it should be
the case that εPearsonℓ,ij is independent of µij . This is illustrated for the four FE GLMs by way
of scatterplots in Figure 1. In the scatterplots we include a linear regression line to visualise
the relative mean-dependence of the Pearson residuals. If the variance process were correctly
specified, we would expect a horizontal line, i.e., (mean-)independence of the Pearson residuals
from µij . This is largely the case with Gamma-GLM and NegBin-GLM and not so with Poisson-GLM
and Gaussian-GLM. Hence, the Pearson residual plots provide further evidence against
the latter two models.
7 The standard errors for the QD-P estimator are derived from resampling the data 100 times for subsamples
of one-quarter of the number of countries.
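The Pearson-residual diagnostic in (22) is straightforward to compute once fitted means are available. The sketch below assumes that the variance function is passed as a callable of the fitted mean and that the regression line in the scatterplots is an ordinary least squares fit of the residuals on µij; the function names are illustrative.

```python
import numpy as np

def pearson_residuals(x, mu, variance):
    """(X_ij - mu_ij) divided by the square root of the variance function at mu_ij."""
    return (x - mu) / np.sqrt(variance(mu))

def slope_on_mean(resid, mu):
    """OLS slope of the residuals on the fitted mean; near zero for a horizontal line."""
    return np.polyfit(mu, resid, 1)[0]
```

For example, `pearson_residuals(x, mu, lambda m: m)` gives the Poisson version and `lambda m: m ** 2` the Gamma version; a slope far from zero points to the kind of mean-dependence discussed for Figure 1.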
Furthermore, we may scrutinize the question of the proper specification of the variance
function by way of so-called deviance residuals:
$\varepsilon^{deviance}_{\ell,ij} = \operatorname{sign}(X_{ij} - \mu_{ij}) \sqrt{2[f_\ell(X_{ij}) - f_\ell(\mu_{ij})]\phi}$, (23)
where fℓ(·) measures the linear-exponential family-ℓ conditional density evaluated at the argu-
ment. The statistic εdevianceℓ,ij should have mean zero and be approximately normally distributed
for any GLM of family type ℓ. We shed light on this issue for the FE GLMs in Figure 2.8 To
facilitate the readability, we present kernel density plots of εdevianceℓ,ij illustrated by a black dashed
curve and add a normal density plot based on the same variance as model ℓ. Figure 2 suggests
that, among the considered models, NegBin-GLM performs best, followed by Poisson-GLM.
Hence, there appears to be some indication of a larger degree of misspecification of the variance
function for Gamma-GLM than for NegBin-GLM. Overall, Figure 2 in conjunction with Figure
1 and the Akaike Information Criterion from Table 8 leads us to classify NegBin-GLM as the
preferred model for the data and specification at hand.
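For readers who wish to reproduce a diagnostic of this kind, the deviance residuals of several families have well-known closed forms. The sketch below uses the textbook unit deviances for the Gaussian, Poisson, Gamma, and inverse Gaussian families; it omits the dispersion scaling ϕ in (23) and the Negative Binomial case, and the function names are illustrative rather than taken from the paper.

```python
import numpy as np

def unit_deviance(y, mu, family):
    """Textbook unit deviance d(y, mu) for selected linear-exponential families."""
    if family == "gaussian":
        return (y - mu) ** 2
    if family == "poisson":
        safe_y = np.where(y > 0, y, 1.0)               # y * log(y) is taken as 0 at y = 0
        ylog = np.where(y > 0, y * np.log(safe_y / mu), 0.0)
        return 2.0 * (ylog - (y - mu))
    if family == "gamma":                              # requires y > 0
        return 2.0 * (-np.log(y / mu) + (y - mu) / mu)
    if family == "inverse_gaussian":                   # requires y > 0
        return (y - mu) ** 2 / (mu ** 2 * y)
    raise ValueError(f"unknown family: {family}")

def deviance_residuals(y, mu, family):
    """sign(y - mu) * sqrt(d(y, mu)), without dispersion scaling."""
    d = np.maximum(unit_deviance(y, mu, family), 0.0)  # guard against rounding below zero
    return np.sign(y - mu) * np.sqrt(d)
```

A kernel density estimate of these residuals can then be compared with a normal density of the same variance, as in Figure 2.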
5 Conclusions
This paper alludes to issues in the application of generalized linear models for the estimation of
(structural) gravity models of bilateral international trade. The current status of research on the
matter is the following. First, in a host of theoretical models, bilateral trade flows are structurally
determined by an exponential function based on a log-linear index (see Arkolakis, Costinot, and
Rodriguez-Clare, 2012). Second, it has been remarked and well received that exponential-family,
generalized-linear models rather than log-linearized models should be employed for consistent
estimation (see Santos Silva and Tenreyro, 2006). The literature appears to favor Poisson-,
Gamma-, and to a lesser extent Gaussian-type GLMs in both analysis and application. Other
approaches such as the inverse Gaussian or the Negative Binomial model tend to be ignored.
The Gaussian and the Negative Binomial models are even deemed improper (see Bosquet and
Boulhol, 2014; Head and Mayer, 2014). Yet other approaches such as quasi-differencing and
generalized method of moments estimation have not been considered at all. As to model
selection, researchers are advised to resort to goodness-of-fit measures (see Santos Silva and
Tenreyro, 2006) or are not given strong guidance at all. In terms of the small-to-medium-sample analysis
provided in earlier work, the data-generating process has never been selected in accordance with
structural models of bilateral trade, and little is known as to how fixed-country-effects estimators
fare relative to structural-iterative models.
8 Corresponding figures for the IS estimators can be found in the Appendix.
This paper takes the generic gravity model of bilateral trade literally and focuses on data-
generating processes that are fully aligned with general-equilibrium or resource constraints
present in such models. The paper presents a rich set of Monte Carlo results for various sizes
of the world economy (in terms of the number of countries) and various assumptions about the
error process. Moreover, the paper takes those insights to cross-sectional data for the year 2008
and illustrates issues with the model selection.
The main insights of the paper are the following. First, we find that the Poisson and Negative
Binomial quasi-maximum likelihood estimators as well as the quasi-differenced GMM estimators
appear to be the best all-round estimators for small as well as larger sample cases, and for
various stochastic processes. However, we encountered difficulties with the GMM estimator in
our application with year 2008 data. For the chosen specification, the Negative Binomial model
was the preferred model. Conclusions drawn in earlier research about the inappropriateness
of the Negative Binomial model should be taken with a grain of salt: they relate to a two-
step estimation approach towards estimation of the variance and the other model parameters
(see Bosquet and Boulhol, 2014). The problems related to this approach do not emerge with
straightforward one-step estimation. An appealing feature of the Negative Binomial estimator
is that it provides a weighted average of the Poisson and the Gamma models.
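One way to see in what sense the Negative Binomial estimator lies between Poisson and Gamma is to compare the estimating equations. The following is a sketch under two assumptions that are not stated in this section: the conditional mean is written as µij = exp(x′ij β), and the Negative Binomial working variance is µij + µ²ij (dispersion fixed at one, as in variance function 3 of Table 2).

```latex
\begin{align*}
\text{General (log link):}\quad
  & \sum_{i \neq j} x_{ij}\,\frac{\mu_{ij}}{V(\mu_{ij})}\,(X_{ij}-\mu_{ij}) = 0 \\
\text{Poisson, } V=\mu_{ij}:\quad
  & \sum_{i \neq j} x_{ij}\,(X_{ij}-\mu_{ij}) = 0 \\
\text{Gamma, } V=\mu_{ij}^{2}:\quad
  & \sum_{i \neq j} x_{ij}\,\frac{X_{ij}-\mu_{ij}}{\mu_{ij}} = 0 \\
\text{NegBin, } V=\mu_{ij}+\mu_{ij}^{2}:\quad
  & \sum_{i \neq j} x_{ij}\,\frac{X_{ij}-\mu_{ij}}{1+\mu_{ij}} = 0
\end{align*}
```

Under these assumptions the Negative Binomial weight 1/(1 + µij) is close to the Poisson weight of one for small µij and close to the Gamma weight 1/µij for large µij.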
A further insight is that the iterative-structural GLM estimators perform better in the sim-
ulations than fixed-country-effects estimators due to their greater parsimony. This result is
somewhat different from the analysis in Head and Mayer (2014) who find their “structurally
iterated least squares” (SILS) estimators to be less efficient than the least squares dummy variables
estimator. The differences to the results in this paper may stem from the linearity of the
models considered in Head and Mayer (2014), or from the different algorithms that SILS and
IS-GLM estimators are based on.
The Monte Carlo simulation revealed potentially serious problems with the quasi-differenced
Poisson model which is based on ratios of bilateral exports.9 Likewise, the poor performance
of the inverse Gaussian model emerged both in the Monte Carlo simulations with non-inverse-Gaussian
data and in the application with real data. It is possible that the problems associated
with the Inverse Gaussian estimator are to some degree numerical, though; further research
is needed to determine whether using better starting values and optimizing algorithms might
improve this estimator’s performance.
9 Some of these issues might carry over to the tetradic-differenced estimators as discussed in Head and Mayer
(2014) which are also based on ratios of exports.
References
1. Anderson, James E. and Eric van Wincoop, 2003. Gravity with gravitas: a solution to the
border puzzle. American Economic Review 93(1), 170-192.
2. Anderson, James E. and Yoto V. Yotov, 2012. Gold standard gravity. Boston College
Working Papers in Economics 795.
3. Arkolakis, Costas, Arnaud Costinot, and Andrés Rodríguez-Clare, 2012. New trade models,
same old gains? American Economic Review 102(1), 94-130.
4. Baier, Scott L. and Jeffrey H. Bergstrand, 2009. Bonus vetus OLS: A simple method
for approximating international trade-cost effects using the gravity equation. Journal of
International Economics 77(1), 77-85.
5. Baltagi, Badi H., 2013. Econometric Analysis of Panel Data, 5th edition. John Wiley and
Sons.
6. Baltagi, Badi H., Peter H. Egger, and Michael Pfaffermayr, 2003. A generalized design for
bilateral trade flow models. Economics Letters 80(3), 391-397.
7. Baltagi, Badi H., Peter H. Egger, and Michael Pfaffermayr, 2014. Panel data gravity
models of international trade. CESifo Working Paper No. 4616, to appear as Chapter 20
of The Oxford Handbook of Panel Data, edited by Badi H. Baltagi.
8. Bergstrand, Jeffrey H., Peter H. Egger, and Mario Larch, 2013. Gravity redux: estimation
of gravity-equation coefficients, elasticities of substitution, and general equilibrium
comparative statics under asymmetric bilateral trade costs. Journal of International
Economics 89(1), 110-121.
9. Bosquet, Clément, and Hervé Boulhol, 2014. Applying the GLM variance assumption to
overcome the scale-dependence of the Negative Binomial QGPML estimator. Econometric
Reviews 33(7), 772-784.
10. Cameron, A. Colin and Pravin K. Trivedi, 2005. Microeconometrics: Methods and Applications.
Cambridge University Press.
11. Charbonneau, Karyne B., 2012. Multiple Fixed Effects in Nonlinear Panel Data Models –
Theory and Evidence. Unpublished Manuscript, Princeton University.
12. Eaton, Jonathan, and Samuel S. Kortum, 2002. Technology, Geography, and Trade.
Econometrica, 70, 1741-1779.
13. Egger, Peter H., Mario Larch, and Kevin Staub, 2012. The log of gravity with correlated
sectors: An application to structural estimation of bilateral goods and services trade.
Revised version of CEPR Discussion Paper No. 9051.
14. Egger, Peter H., and Sergey Nigai, 2014. Structural gravity with dummies only. Con-
strained ANOVA-type estimation of gravity panel data models. CEPR Discussion Paper.
15. Fally, Thibault, 2012. Structural gravity with fixed effects. Unpublished Manuscript,
University of Colorado.
16. GeoDist Database. Centre d’Études Prospectives et d’Informations Internationales, ac-
cessed in October 2013. http://www.cepii.fr
17. Gómez Herrera, Estrella, 2013. Comparing alternative methods to estimate gravity models
of bilateral trade. Empirical Economics, 44, 1087-1111.
18. Harrigan, James, 1996. Openness to trade in manufactures in the OECD. Journal of
International Economics 40(1-2), 23-39.
19. Head, Keith and Thierry Mayer, 2014. Gravity Equations: Workhorse, Toolkit, and Cook-
book. To appear in Handbook of International Economics Vol. 4, eds. Gopinath, Helpman,
and Rogoff.
20. Martin, Will, and Cong S. Pham, 2008. Estimating the Gravity Model When Zero Trade
Flows Are Frequent. Working Paper Economics Series 2008_03, Deakin University.
21. Martínez Zarzoso, Immaculada, 2013. The log of gravity revisited. Applied Economics,
45(3), 311-327.
22. McCullagh, Peter and John A. Nelder, 1989. Generalized Linear Models. Second ed. London:
Chapman and Hall.
23. Nelder, John A. and Robert W.M. Wedderburn, 1972. Generalized linear models. Journal
of the Royal Statistical Society, Series A 135(3), 370-384.
24. Santos Silva, João M.C. and Silvana Tenreyro, 2006. The log of gravity. The Review of
Economics and Statistics 88(4), 641-658.
25. Santos Silva, João M.C. and Silvana Tenreyro, 2011. Further simulation evidence on the
performance of the Poisson pseudo-maximum likelihood estimator. Economics Letters,
112, 220-222.
26. Waugh, Michael E., 2010. International trade and income differences. American Economic
Review 100(5), 2093-2124.
Tables
Table 3: Baseline Monte Carlo results, 1,000 replications
      Iterative Structural Estimators  Fixed Effects Estimators       QD
         Gam     P    NB   Gau    IG   Gam     P    NB   Gau    IG   GMM     P
Variance function 1 (Gamma)
CR      1000  1000  1000   909   988  1000  1000   988   992   999  1000  1000
Mean    1.00  1.00  1.00  1.01  1.00  1.00  1.00  0.99  1.55  1.26  0.99  0.93
SD      0.00  0.01  0.01  0.03  0.02  0.07  0.12  0.11  1.03  0.23  0.11  0.22
Med     1.00  1.00  1.00  1.00  1.00  1.00  0.99  0.99  1.31  1.21  0.99  0.89
5th     1.00  1.00  1.00  1.00  1.00  0.88  0.82  0.82  0.85  0.99  0.81  0.66
95th    1.00  1.00  1.00  1.08  1.00  1.11  1.21  1.19  2.98  1.67  1.18  1.33
Variance function 2 (Poisson)
CR      1000  1000  1000   998   818  1000  1000   961  1000   969  1000  1000
Mean    1.00  1.00  1.00  1.00  1.00  1.03  1.00  1.00  1.00  1.46  1.00  1.19
SD      0.00  0.00  0.00  0.01  0.02  0.05  0.01  0.01  0.01  0.40  0.05  0.36
Med     1.00  1.00  1.00  1.00  1.00  1.03  1.00  1.00  1.00  1.36  1.01  1.11
5th     1.00  1.00  1.00  1.00  1.00  0.95  0.98  0.98  0.98  1.04  0.93  0.84
95th    1.00  1.00  1.00  1.02  1.01  1.12  1.02  1.02  1.02  2.24  1.05  1.78
Variance function 3 (Negative Binomial)
CR      1000  1000  1000   911   816  1000  1000   980   993   920  1000  1000
Mean    1.00  1.00  1.00  1.02  1.00  1.05  1.00  1.00  1.63  1.88  1.00  1.09
SD      0.00  0.01  0.01  0.04  0.02  0.10  0.12  0.12  1.87  0.56  0.14  0.43
Med     1.00  1.00  1.00  1.00  1.00  1.06  0.99  0.99  1.30  1.74  1.01  1.00
5th     1.00  1.00  1.00  1.00  1.00  0.89  0.82  0.82  0.87  1.21  0.78  0.66
95th    1.00  1.00  1.00  1.09  1.00  1.23  1.23  1.21  3.01  3.01  1.23  1.82
Variance function 4 (Gaussian)
CR      1000  1000  1000  1000   594  1000  1000   950  1000   799  1000  1000
Mean    1.00  1.00  1.00  1.00  1.00  1.06  1.00  1.00  1.00  1.86  1.00  1.36
SD      0.00  0.00  0.00  0.00  0.03  0.06  0.01  0.01  0.00  0.72  0.04  0.88
Med     1.00  1.00  1.00  1.00  1.00  1.06  1.00  1.00  1.00  1.68  1.01  1.18
5th     1.00  1.00  1.00  1.00  1.00  0.95  0.98  0.98  1.00  1.02  0.95  0.76
95th    1.00  1.00  1.01  1.00  1.01  1.16  1.01  1.01  1.00  3.25  1.05  2.39
Variance function 5 (Inverse Gaussian)
CR       996   999   995   982   996   996  1000   951   993   996  1000  1000
Mean    1.00  1.00  1.00  1.00  1.00  0.98  0.97  0.97  1.09  0.99  0.98  0.97
SD      0.00  0.00  0.00  0.01  0.00  0.02  0.08  0.06  0.50  0.02  0.07  0.07
Med     1.00  1.00  1.00  1.00  1.00  0.98  0.97  0.97  0.98  0.99  0.98  0.97
5th     1.00  1.00  1.00  1.00  1.00  0.95  0.90  0.90  0.90  0.96  0.92  0.90
95th    1.00  1.00  1.00  1.02  1.00  1.01  1.04  1.04  1.66  1.02  1.03  1.03
Variance function 6 (Other)
CR      1000  1000  1000  1000   691  1000  1000   951  1000   910  1000  1000
Mean    1.00  1.00  1.00  1.00  1.00  1.05  1.00  1.00  1.00  1.65  1.00  1.25
SD      0.01  0.00  0.00  0.00  0.02  0.06  0.01  0.01  0.00  0.57  0.04  0.63
Med     1.00  1.00  1.00  1.00  1.00  1.04  1.00  1.00  1.00  1.51  1.01  1.12
5th     1.00  1.00  1.00  1.00  1.00  0.95  0.98  0.98  1.00  1.01  0.94  0.78
95th    1.00  1.00  1.00  1.00  1.00  1.14  1.02  1.02  1.01  2.75  1.05  1.97
Notes: The table shows descriptive statistics for the distribution of estimates of β1 = 1 in equation (16) for 1,000 replications of the DGP of section 3.1 for each of six different variance functions of Table 2. The statistics are: number of converged replications (CR), as well as mean, standard deviation (SD), median (Med), 5th and 95th percentiles, all of these computed over converged replications. Gam, P, NB, Gau, and IG stand for Gamma, Poisson, Negative Binomial, Gaussian and Inverse Gaussian GLM estimators. QD stands for Quasi-differences estimators, and GMM and P correspond to equations (14) estimated by two-step GMM and (15) estimated by Poisson GLM. Where statistics were larger than 10 they have been replaced by the symbol †.
Table 4: Baseline scenario: inconsistent approaches, 1,000 replications
      Est. with GDP as proxy for FE    Est. without add. regressors
         Gam     P    NB   Gau    IG   Gam     P    NB   Gau    IG
Variance function 1 (Gamma)
CR      1000  1000  1000   998   989  1000  1000  1000  1000  1000
Mean    0.92  0.84  0.85  0.98  1.15  0.89  0.81  0.83  0.88  1.09
SD      0.08  0.12  0.11  1.10  0.33  0.09  0.12  0.11  2.30  0.42
Med     0.93  0.83  0.85  0.79  1.08  0.90  0.80  0.83  0.74  1.02
5th     0.78  0.65  0.69  0.54  0.83  0.74  0.62  0.66  0.51  0.76
95th    1.05  1.04  1.03  1.58  1.66  1.03  1.03  1.01  1.23  1.60
Variance function 2 (Poisson)
CR      1000  1000   943  1000   872  1000  1000   967  1000   991
Mean    0.93  0.84  0.84  0.79  1.83  0.90  0.81  0.82  0.77  2.08
SD      0.07  0.06  0.06  0.07  1.54  0.07  0.07  0.07  0.09  2.65
Med     0.94  0.85  0.85  0.79  1.28  0.91  0.82  0.82  0.77  1.11
5th     0.82  0.74  0.74  0.66  0.84  0.78  0.70  0.70  0.62  0.75
95th    1.03  0.93  0.93  0.90  4.43  1.01  0.91  0.91  0.90  6.93
Variance function 3 (Negative Binomial)
CR      1000  1000   998   996   817  1000  1000   999  1000   977
Mean    0.94  0.84  0.86  1.01  2.10  0.91  0.82  0.83  0.85  1.99
SD      0.11  0.12  0.11  1.53  1.90  0.12  0.12  0.11  0.92  2.23
Med     0.94  0.84  0.85  0.80  1.48  0.91  0.81  0.83  0.75  1.16
5th     0.76  0.65  0.68  0.55  0.77  0.70  0.62  0.65  0.52  0.67
95th    1.12  1.07  1.05  1.54  5.24  1.09  1.04  1.04  1.22  6.96
Variance function 4 (Gaussian)
CR      1000  1000   946  1000   630  1000  1000   968  1000   885
Mean    0.95  0.84  0.84  0.79  2.79  0.91  0.82  0.82  0.77  3.35
SD      0.08  0.06  0.06  0.07  2.91  0.08  0.07  0.07  0.09  4.59
Med     0.95  0.85  0.85  0.79  1.46  0.92  0.82  0.82  0.77  1.21
5th     0.80  0.74  0.74  0.66  0.80  0.76  0.70  0.70  0.62  0.74
95th    1.06  0.93  0.93  0.91  8.58  1.04  0.92  0.92  0.90 12.56
Variance function 5 (Inverse Gaussian)
CR       996  1000   958   988   996   997  1000   966   999   996
Mean    0.90  0.82  0.82     †  1.02  0.88  0.80  0.80     †  1.00
SD      0.07  0.13  0.09     †  0.09  0.09  0.15  0.11     †  0.15
Med     0.91  0.82  0.83  0.77  1.01  0.88  0.79  0.80  0.74  0.98
5th     0.81  0.70  0.70  0.61  0.88  0.76  0.66  0.66  0.57  0.83
95th    0.98  0.92  0.92  0.92  1.18  0.97  0.91  0.91  0.91  1.21
Variance function 6 (Other)
CR      1000  1000   955  1000   781  1000  1000   955  1000   944
Mean    0.94  0.84  0.84  0.79  2.36  0.91  0.82  0.82  0.77  2.71
SD      0.07  0.06  0.06  0.07  2.29  0.08  0.07  0.07  0.09  3.99
Med     0.94  0.85  0.85  0.79  1.39  0.91  0.82  0.82  0.77  1.14
5th     0.81  0.74  0.73  0.66  0.81  0.77  0.70  0.70  0.62  0.73
95th    1.05  0.92  0.92  0.90  6.11  1.02  0.91  0.91  0.90  8.81
Notes: See Table 3.
Table 5: Alternative 1 (α = −9), 1,000 replications
      Iterative Structural Estimators      Fixed Effects Estimators             QD
      Gam    P      NB     Gau    IG       Gam    P      NB     Gau    IG       GMM    P
Variance function 1 (Gamma)
CR    1000   971    999    757    799      1000   1000   961    965    876      1000   1000
Mean  1.00   1.01   1.00   1.02   1.01     1.00   1.00   1.00   1.60   1.24     0.99   0.96
SD    0.00   0.02   0.01   0.04   0.03     0.03   0.11   0.10   0.93   0.32     0.06   0.46
Med   1.00   1.01   1.00   1.00   1.00     1.00   0.99   0.99   1.32   1.17     0.99   0.86
5th   1.00   0.99   1.00   0.97   1.00     0.95   0.84   0.85   0.89   0.99     0.89   0.59
95th  1.00   1.03   1.01   1.08   1.01     1.05   1.20   1.18   3.13   1.77     1.10   1.57
Variance function 2 (Poisson)
CR    990    999    1000   957    33       937    1000   933    1000   17       1000   1000
Mean  1.00   1.00   1.00   1.01   1.00     1.34   1.00   1.00   1.00   †        1.03   4.02
SD    0.01   0.01   0.00   0.02   0.01     0.12   0.02   0.02   0.01   †        0.10   22.00
Med   1.00   1.00   1.00   1.00   1.00     1.33   1.00   1.00   1.00   †        1.03   0.92
5th   1.00   1.00   1.00   1.00   1.00     1.15   0.97   0.97   0.99   2.20     0.87   0.33
95th  1.00   1.02   1.01   1.05   1.03     1.54   1.03   1.03   1.02   †        1.17   10.34
Variance function 3 (Negative Binomial)
CR    974    978    999    721    34       914    1000   955    960    14       1000   1000
Mean  1.00   1.01   1.00   1.02   1.01     1.36   1.01   1.01   1.66   †        1.06   3.60
SD    0.01   0.02   0.01   0.04   0.04     0.13   0.11   0.11   2.11   †        0.15   18.16
Med   1.00   1.01   1.00   1.01   1.00     1.36   1.00   1.00   1.32   †        1.07   0.88
5th   1.00   0.99   1.00   0.97   1.00     1.16   0.84   0.84   0.89   0.20     0.81   0.34
95th  1.00   1.03   1.01   1.08   1.12     1.57   1.22   1.22   3.03   †        1.29   11.07
Variance function 4 (Gaussian)
CR    883    998    1000   1000   0        602    1000   936    1000   44       1000   1000
Mean  1.00   1.01   1.00   1.00   –        1.62   1.01   1.01   1.00   †        1.04   3.59
SD    0.01   0.01   0.00   0.00   –        0.17   0.03   0.03   0.00   †        0.12   26.88
Med   1.00   1.00   1.00   1.00   –        1.61   1.01   1.01   1.00   †        1.04   0.69
5th   1.00   1.00   1.00   1.00   –        1.35   0.97   0.97   1.00   †        0.83   0.24
95th  1.00   1.02   1.01   1.00   –        1.91   1.03   1.03   1.00   †        1.23   6.80
Variance function 5 (Inverse Gaussian)
CR    998    988    996    912    996      998    1000   927    992    942      1000   1000
Mean  1.00   1.00   1.00   1.01   1.00     0.99   0.98   0.98   1.03   1.00     0.98   0.99
SD    0.00   0.01   0.01   0.03   0.00     0.01   0.04   0.04   0.27   0.01     0.02   0.14
Med   1.00   1.00   1.00   1.00   1.00     0.99   0.98   0.98   0.99   1.00     0.99   0.99
5th   0.99   0.99   1.00   0.99   1.00     0.98   0.94   0.94   0.91   0.99     0.97   0.92
95th  1.00   1.02   1.01   1.06   1.00     1.00   1.01   1.01   1.23   1.01     1.00   1.04
Variance function 6 (Other)
CR    950    999    1000   1000   7        780    1000   933    1000   47       1000   1000
Mean  1.00   1.00   1.00   1.00   1.08     1.50   1.00   1.00   1.00   †        1.04   2.85
SD    0.01   0.01   0.00   0.00   0.24     0.15   0.02   0.02   0.00   †        0.10   16.98
Med   1.00   1.00   1.00   1.00   1.00     1.49   1.01   1.01   1.00   †        1.04   0.72
5th   1.00   1.00   1.00   1.00   0.93     1.26   0.98   0.98   1.00   †        0.86   0.29
95th  1.00   1.02   1.01   1.00   1.62     1.77   1.03   1.03   1.00   †        1.22   9.55
Notes: See Table 3.
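The row labels in these tables (CR, Mean, SD, Med, 5th, 95th) are defined in the notes to Table 3. As a rough illustration of how such per-estimator summaries can be computed from replication output, the sketch below assumes that CR counts the replications in which the estimator converged and that the remaining rows are moments and percentiles of the estimates across those replications. The function name `summarize`, the use of `None` for a non-converged replication, and this reading of CR are assumptions of the sketch, not statements from the paper.

```python
import statistics
from typing import Optional, Sequence


def summarize(estimates: Sequence[Optional[float]]) -> dict:
    """Summarize one estimator across Monte Carlo replications.

    `estimates` holds one entry per replication; None marks a replication
    that is treated as non-converged (an assumption of this sketch).
    """
    converged = [e for e in estimates if e is not None]
    # 19 cut points at 5%, 10%, ..., 95%; the first and last entries are
    # the 5th and 95th percentiles.
    cuts = statistics.quantiles(converged, n=20, method="inclusive")
    return {
        "CR": len(converged),
        "Mean": statistics.fmean(converged),
        "SD": statistics.stdev(converged),  # sample standard deviation
        "Med": statistics.median(converged),
        "5th": cuts[0],
        "95th": cuts[-1],
    }


if __name__ == "__main__":
    # Hypothetical draws for one estimator; None is a failed replication.
    draws = [0.98, 1.01, None, 1.00, 1.03, 0.99, 1.02]
    for label, value in summarize(draws).items():
        print(label, round(value, 3))
```

Percentile conventions differ across software, so the 5th and 95th rows produced this way need not match a table computed with another interpolation rule.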
Table 6: Alternative 2 (vHT = 0.3), 1,000 replications
      Iterative Structural Estimators      Fixed Effects Estimators             QD
      Gam    P      NB     Gau    IG       Gam    P      NB     Gau    IG       GMM    P
Variance function 1 (Gamma)
CR    1000   998    1000   820    929      1000   1000   981    985    965      1000   1000
Mean  1.00   1.00   1.00   1.02   1.00     1.00   1.00   1.00   1.66   1.26     0.99   0.92
SD    0.00   0.01   0.01   0.05   0.01     0.04   0.11   0.11   1.23   0.29     0.08   0.29
Med   1.00   1.00   1.00   1.00   1.00     1.00   0.99   0.99   1.33   1.20     0.99   0.86
5th   1.00   1.00   1.00   1.00   1.00     0.93   0.84   0.84   0.83   1.00     0.86   0.62
95th  1.00   1.01   1.00   1.09   1.00     1.07   1.20   1.18   3.65   1.67     1.13   1.40
Variance function 2 (Poisson)
CR    1000   1000   1000   976    262      1000   1000   943    1000   200      1000   1000
Mean  1.00   1.00   1.00   1.01   1.00     1.17   1.00   1.00   1.00   2.67     1.01   1.97
SD    0.00   0.00   0.00   0.02   0.01     0.09   0.02   0.02   0.01   0.82     0.07   3.88
Med   1.00   1.00   1.00   1.00   1.00     1.16   1.00   1.00   1.00   2.53     1.02   1.13
5th   1.00   1.00   1.00   1.00   1.00     1.02   0.97   0.97   0.98   1.74     0.90   0.50
95th  1.00   1.01   1.00   1.05   1.00     1.33   1.03   1.03   1.02   4.24     1.11   5.17
Variance function 3 (Negative Binomial)
CR    1000   999    1000   834    313      1000   1000   977    986    186      1000   1000
Mean  1.00   1.00   1.00   1.02   1.00     1.20   1.01   1.00   1.67   2.67     1.04   2.05
SD    0.00   0.01   0.01   0.05   0.03     0.11   0.12   0.11   1.25   0.86     0.14   4.55
Med   1.00   1.00   1.00   1.00   1.00     1.19   1.00   1.00   1.35   2.54     1.05   1.07
5th   1.00   1.00   1.00   1.00   1.00     1.02   0.83   0.83   0.88   1.64     0.82   0.46
95th  1.00   1.01   1.00   1.09   1.01     1.39   1.24   1.20   3.40   4.01     1.26   5.49
Variance function 4 (Gaussian)
CR    1000   1000   1000   1000   31       998    1000   951    1000   32       1000   1000
Mean  1.00   1.00   1.00   1.00   1.00     1.33   1.00   1.00   1.00   †        1.03   3.57
SD    0.00   0.00   0.00   0.00   0.01     0.14   0.02   0.02   0.00   †        0.07   15.01
Med   1.00   1.00   1.00   1.00   1.00     1.32   1.01   1.01   1.00   †        1.03   1.03
5th   1.00   1.00   1.00   1.00   1.00     1.12   0.97   0.97   1.00   3.16     0.92   0.37
95th  1.00   1.01   1.01   1.00   1.00     1.56   1.03   1.03   1.00   †        1.13   11.11
Variance function 5 (Inverse Gaussian)
CR    997    996    997    959    992      997    1000   947    992    994      1000   1000
Mean  1.00   1.00   1.00   1.01   1.00     0.99   0.98   0.98   1.05   1.00     0.98   0.98
SD    0.00   0.01   0.00   0.02   0.00     0.01   0.06   0.05   0.41   0.01     0.04   0.07
Med   1.00   1.00   1.00   1.00   1.00     0.98   0.98   0.98   0.98   0.99     0.98   0.98
5th   0.99   1.00   1.00   1.00   1.00     0.98   0.92   0.92   0.90   0.98     0.95   0.92
95th  1.00   1.00   1.00   1.05   1.00     1.00   1.04   1.03   1.46   1.01     1.01   1.04
Variance function 6 (Other)
CR    1000   1000   1000   998    96       1000   1000   943    1000   44       1000   1000
Mean  1.00   1.00   1.00   1.00   1.01     1.26   1.00   1.00   1.00   †        1.03   2.75
SD    0.00   0.00   0.00   0.00   0.04     0.12   0.02   0.02   0.00   †        0.07   7.27
Med   1.00   1.00   1.00   1.00   1.00     1.25   1.01   1.01   1.00   2.70     1.03   1.10
5th   1.00   1.00   1.00   1.00   1.00     1.07   0.97   0.97   1.00   1.46     0.92   0.44
95th  1.00   1.01   1.01   1.00   1.01     1.46   1.03   1.03   1.01   †        1.13   9.07
Notes: See Table 3.
Table 7: Alternative 3 (C = 50), 1,000 replications
      Iterative Structural Estimators      Fixed Effects Estimators             QD
      Gam    P      NB     Gau    IG       Gam    P      NB     Gau    IG       GMM    P
Variance function 1 (Gamma)
CR    1000   1000   1000   1000   1000     1000   1000   1000   1000   1000     1000   1000
Mean  1.00   1.00   1.00   1.00   1.00     1.00   1.00   0.99   1.15   1.13     1.00   0.98
SD    0.00   0.00   0.00   0.00   0.00     0.01   0.02   0.02   0.12   0.04     0.02   0.07
Med   1.00   1.00   1.00   1.00   1.00     1.00   1.00   0.99   1.13   1.13     1.00   0.97
5th   1.00   1.00   1.00   1.00   1.00     0.98   0.96   0.96   1.01   1.07     0.97   0.89
95th  1.00   1.00   1.00   1.00   1.00     1.02   1.04   1.02   1.36   1.22     1.03   1.12
Variance function 2 (Poisson)
CR    1000   1000   1000   1000   844      1000   1000   849    1000   640      1000   1000
Mean  1.00   1.00   1.00   1.00   1.00     1.01   1.00   1.00   1.00   3.03     1.00   1.60
SD    0.00   0.00   0.00   0.00   0.00     0.01   0.00   0.00   0.00   0.62     0.01   0.51
Med   1.00   1.00   1.00   1.00   1.00     1.01   1.00   1.00   1.00   2.95     1.00   1.49
5th   1.00   1.00   1.00   1.00   1.00     0.99   1.00   1.00   1.00   2.17     0.99   1.13
95th  1.00   1.00   1.00   1.00   1.00     1.03   1.00   1.00   1.00   4.08     1.01   2.35
Variance function 3 (Negative Binomial)
CR    1000   1000   1000   1000   824      1000   1000   1000   1000   432      1000   1000
Mean  1.00   1.00   1.00   1.00   1.00     1.02   1.00   0.99   1.16   3.60     1.00   1.53
SD    0.00   0.00   0.00   0.00   0.00     0.02   0.02   0.02   0.12   0.66     0.03   0.45
Med   1.00   1.00   1.00   1.00   1.00     1.02   1.00   1.00   1.14   3.52     1.00   1.41
5th   1.00   1.00   1.00   1.00   1.00     0.99   0.96   0.96   1.00   2.76     0.96   1.09
95th  1.00   1.00   1.00   1.00   1.00     1.05   1.04   1.03   1.38   4.78     1.04   2.37
Variance function 4 (Gaussian)
CR    1000   1000   1000   1000   273      1000   1000   835    1000   3        1000   1000
Mean  1.00   1.00   1.00   1.00   1.00     1.03   1.00   1.00   1.00   5.99     1.00   2.42
SD    0.00   0.00   0.00   0.00   0.03     0.02   0.00   0.00   0.00   0.49     0.01   3.97
Med   1.00   1.00   1.00   1.00   1.00     1.03   1.00   1.00   1.00   5.94     1.00   1.59
5th   1.00   1.00   1.00   1.00   1.00     1.00   1.00   1.00   1.00   5.53     0.99   1.03
95th  1.00   1.00   1.00   1.00   1.00     1.05   1.00   1.00   1.00   6.51     1.01   6.06
Variance function 5 (Inverse Gaussian)
CR    997    1000   997    988    997      997    1000   908    987    997      1000   1000
Mean  1.00   1.00   1.00   1.00   1.00     0.98   0.98   0.97   1.48   0.99     0.98   0.98
SD    0.00   0.00   0.00   0.00   0.00     0.01   0.06   0.02   1.50   0.00     0.04   0.04
Med   1.00   1.00   1.00   1.00   1.00     0.98   0.97   0.97   1.00   0.99     0.98   0.98
5th   1.00   1.00   1.00   1.00   1.00     0.97   0.94   0.95   0.93   0.99     0.95   0.95
95th  1.00   1.00   1.00   1.00   1.00     0.99   1.06   1.01   3.32   1.00     1.03   1.02
Variance function 6 (Other)
CR    1000   1000   1000   1000   572      1000   1000   831    1000   60       1000   1000
Mean  1.00   1.00   1.00   1.00   1.00     1.02   1.00   1.00   1.00   4.19     1.00   2.19
SD    0.00   0.00   0.00   0.00   0.01     0.02   0.00   0.00   0.00   0.82     0.01   6.98
Med   1.00   1.00   1.00   1.00   1.00     1.02   1.00   1.00   1.00   4.26     1.00   1.55
5th   1.00   1.00   1.00   1.00   1.00     0.99   1.00   1.00   1.00   3.16     0.99   1.10
95th  1.00   1.00   1.00   1.00   1.00     1.04   1.00   1.00   1.00   5.60     1.01   3.77
Notes: See Table 3.
Table 8: Estimation results: Gravity model of trade, C = 94, N = 8,836
            Fixed-effects estimators                        Iterative-structural estimators                 QD estimator
            Gamma       Poisson     NegBin      Gaussian    Gamma       Poisson     NegBin      Gaussian    Poisson
dist1       10.096∗∗∗   7.103∗∗∗    9.821∗∗∗    8.246∗∗∗    8.890∗∗∗    7.076∗∗∗    8.881∗∗∗    6.048∗∗∗    7.038∗∗∗
            (0.161)     (0.261)     (0.150)     (0.854)     (0.199)     (0.292)     (0.199)     (0.511)     (0.253)
dist2       2.227∗∗∗    1.049∗∗∗    2.084∗∗∗    0.294       2.262∗∗∗    1.029∗∗∗    2.244∗∗∗    -0.085      0.180
            (0.101)     (0.277)     (0.092)     (1.320)     (0.182)     (0.282)     (0.180)     (0.522)     (0.274)
dist3       1.239∗∗∗    1.203∗∗∗    1.157∗∗∗    0.477       1.365∗∗∗    1.208∗∗∗    1.363∗∗∗    0.559       0.501∗
            (0.082)     (0.281)     (0.075)     (0.496)     (0.092)     (0.281)     (0.091)     (0.508)     (0.266)
dist4       0.799∗∗∗    0.813∗∗∗    0.755∗∗∗    0.088       0.967∗∗∗    0.818∗∗∗    0.965∗∗∗    0.278       0.306
            (0.077)     (0.296)     (0.071)     (0.947)     (0.091)     (0.291)     (0.091)     (0.538)     (0.337)
lang×dist1  -5.618∗∗∗   -4.536∗∗∗   -5.546∗∗∗   -4.969∗∗∗   -5.577∗∗∗   -4.517∗∗∗   -5.578∗∗∗   -3.899∗∗∗   -4.794∗∗∗
            (0.191)     (0.244)     (0.186)     (0.430)     (0.245)     (0.214)     (0.245)     (0.204)     (0.221)
lang×dist2  0.260∗∗     0.603∗∗     0.283∗∗     0.815       -0.608∗∗∗   0.640∗∗     -0.598∗∗∗   1.351∗∗∗    1.530∗∗∗
            (0.127)     (0.299)     (0.124)     (0.991)     (0.209)     (0.277)     (0.208)     (0.359)     (0.357)
lang×dist3  0.876∗∗∗    -0.024      0.860∗∗∗    -0.446      0.260       -0.032      0.254       -0.292      0.315
            (0.179)     (0.306)     (0.174)     (0.575)     (0.185)     (0.257)     (0.184)     (0.325)     (0.251)
lang×dist4  0.349∗∗∗    0.030       0.381∗∗∗    -0.423      0.416∗∗∗    0.059       0.410∗∗∗    -0.405      0.544∗
            (0.126)     (0.294)     (0.120)     (0.557)     (0.141)     (0.192)     (0.140)     (0.289)     (0.294)
lang×dist5  0.584∗∗∗    0.718∗      0.572∗∗∗    -0.278      1.075∗∗∗    0.772∗∗     1.067∗∗∗    -0.055      1.145∗∗∗
            (0.160)     (0.382)     (0.154)     (0.543)     (0.176)     (0.325)     (0.175)     (0.523)     (0.326)
AIC         119202.631  9.750e+08   120023.246  246273.375  133824.280  9.757e+08   133938.511  249670.440  603.449
Notes: The table contains estimates of a gravity model with specification (21). Robust standard errors in parentheses.
Bootstrapped standard errors for QD-Poisson in parentheses. The variables dist1, ..., dist4 are indicator variables = 1 if
the country-pair is in the 1, ..., 4 distance quintile. The variables lang×dist1, ..., lang×dist5 are interactions of the
distance quintile indicators with an indicator variable = 1 if the country-pair shares the same language. ∗, ∗∗ and ∗∗∗
indicate statistical significance at the 10%, 5% and 1% level. AIC is the Akaike Information Criterion.
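The regressors described in the notes to Table 8 can be built along the following lines. This is only a sketch: the input names `distance` and `same_language`, the output names `lang_x_dist1`, ..., `lang_x_dist5`, and the choice of quintile cut points (Python's inclusive interpolation rule) are assumptions made for illustration and may differ from the construction actually used for the table.

```python
from statistics import quantiles


def quintile_index(value, cuts):
    """Return the quintile (1..5) of `value` given four ascending cut points."""
    return 1 + sum(value > c for c in cuts)


def build_regressors(distance, same_language):
    """Build distance-quintile indicators and their language interactions.

    distance      : bilateral distances, one per country pair
    same_language : truthy if the pair shares a language, one per pair
    """
    cuts = quantiles(distance, n=5, method="inclusive")  # 4 cut points
    rows = []
    for d, lang in zip(distance, same_language):
        q = quintile_index(d, cuts)
        # dist1..dist4 only: the fifth quintile has no indicator of its own,
        # matching the variable list in the table notes.
        row = {f"dist{k}": int(q == k) for k in range(1, 5)}
        # lang x dist1..dist5: language indicator times each quintile dummy.
        row.update({f"lang_x_dist{k}": int(bool(lang) and q == k)
                    for k in range(1, 6)})
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Hypothetical data: ten pairs with alternating language status.
    dist = [float(v) for v in range(1, 11)]
    lang = [True, False] * 5
    for row in build_regressors(dist, lang):
        print(row)
```

With ten evenly spaced distances, each quintile receives two pairs, which makes the output easy to check by eye.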
Figures
[Plot panels: Gamma, Poisson, NegBin, Gaussian. Vertical axis: Residuals. Horizontal axis: Predicted conditional mean.]
Figure 1: Fixed Effects estimators: Predictions and Pearson Residuals
Notes: The figure plots predictions of trade flows against Pearson residuals for fixed effects GLM estimates of the trade
gravity model (21), corresponding to columns 2 to 5 of Table 8.
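For reference, a Pearson residual scales the raw residual by the square root of the variance function evaluated at the fitted mean, r = (y - mu) / sqrt(V(mu)). The sketch below uses the textbook GLM variance functions; the negative binomial form V(mu) = mu + alpha * mu^2 and the dispersion parameter `alpha` are assumptions of this illustration, and any scale factor is ignored, so the values need not coincide with those plotted in the figure.

```python
import math

# Textbook GLM variance functions V(mu), without a dispersion factor.
VARIANCE_FUNCTIONS = {
    "gaussian": lambda mu: 1.0,
    "poisson": lambda mu: mu,
    "gamma": lambda mu: mu ** 2,
    "inverse_gaussian": lambda mu: mu ** 3,
}


def negbin_variance(alpha):
    """Variance function mu + alpha * mu^2 (assumed parameterization)."""
    return lambda mu: mu + alpha * mu ** 2


def pearson_residuals(y, mu, variance):
    """Return (y_i - mu_i) / sqrt(V(mu_i)) for each observation."""
    return [(yi - mi) / math.sqrt(variance(mi)) for yi, mi in zip(y, mu)]


if __name__ == "__main__":
    y = [6.0, 0.0, 12.0]    # hypothetical observed flows
    mu = [4.0, 1.0, 9.0]    # hypothetical fitted conditional means
    for name, v in VARIANCE_FUNCTIONS.items():
        print(name, pearson_residuals(y, mu, v))
    print("negbin", pearson_residuals(y, mu, negbin_variance(0.5)))
```

Because the denominator grows with the fitted mean at different rates, the same raw residual yields very different Pearson residuals across families, which is why the vertical scales of the panels differ so much.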
[Plot panels: Gamma, Poisson, NegBin, Gaussian. Vertical axis: Kernel density. Horizontal axis: Deviance residuals.
Legend: Residuals, Normal density. Kernel: epanechnikov; bandwidth = 0.2714 (Gamma), 4.1121 (Poisson), 0.2381 (NegBin),
34.3910 (Gaussian).]
Figure 2: Fixed Effects estimators: Density of deviance residuals
Notes: The figure plots kernel density of deviance residuals for fixed effects GLM estimates of the trade gravity model (21),
corresponding to columns 2 to 5 of Table 8.
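A deviance residual is the signed square root of an observation's contribution to the deviance, sign(y - mu) * sqrt(d(y, mu)). The sketch below implements the standard unit deviances for the Gaussian, Poisson and Gamma families only; the function names are hypothetical, dispersion is ignored, and the negative binomial case is left out because it depends on a parameterization not restated in this part of the paper.

```python
import math


def unit_deviance(y, mu, family):
    """Standard unit deviance d(y, mu) for a few GLM families."""
    if family == "gaussian":
        return (y - mu) ** 2
    if family == "poisson":
        # By convention y * log(y / mu) is taken to be 0 when y == 0.
        ylogy = y * math.log(y / mu) if y > 0 else 0.0
        return 2.0 * (ylogy - (y - mu))
    if family == "gamma":
        # Requires y > 0; zero flows have no finite Gamma deviance.
        return 2.0 * (-math.log(y / mu) + (y - mu) / mu)
    raise ValueError(f"unsupported family: {family}")


def deviance_residual(y, mu, family):
    """Signed square root of the unit deviance."""
    d = max(unit_deviance(y, mu, family), 0.0)  # guard against rounding
    return math.copysign(math.sqrt(d), y - mu)


if __name__ == "__main__":
    # Hypothetical observation and fitted mean.
    for fam in ("gaussian", "poisson", "gamma"):
        print(fam, deviance_residual(3.0, 1.5, fam))
```

If the chosen family described the data well, residuals of this kind would be roughly symmetric around zero, which is the comparison the normal-density overlay in the figure invites.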
Appendix
Table 9: Alternative scenario α = −2 (1,000 replications)
      Iterative Structural Estimators      Fixed Effects Estimators             QD
      Gam    P      NB     Gau    IG       Gam    P      NB     Gau    IG       GMM    P
Variance function 1 (Gamma)
CR    1000   1000   1000   989    1000     1000   1000   977    997    1000     1000   1000
Mean  1.00   1.00   1.00   1.00   1.00     1.00   1.00   0.99   1.43   1.25     0.99   0.95
SD    0.00   0.01   0.00   0.02   0.00     0.14   0.17   0.17   0.85   0.25     0.20   0.24
Med   1.00   1.00   1.00   1.00   1.00     1.00   0.99   0.99   1.26   1.23     0.98   0.92
5th   1.00   1.00   1.00   1.00   1.00     0.77   0.73   0.73   0.75   0.87     0.66   0.61
95th  1.00   1.00   1.00   1.02   1.00     1.21   1.30   1.29   2.54   1.71     1.34   1.38
Variance function 2 (Poisson)
CR    1000   1000   1000   1000   999      1000   1000   999    1000   1000     1000   1000
Mean  1.00   1.00   1.00   1.00   1.00     1.00   1.00   1.00   1.00   1.03     1.00   1.02
SD    0.01   0.00   0.00   0.00   0.01     0.03   0.01   0.01   0.01   0.06     0.03   0.07
Med   1.00   1.00   1.00   1.00   1.00     1.00   1.00   1.00   1.00   1.02     1.00   1.01
5th   1.00   1.00   1.00   1.00   1.00     0.95   0.97   0.97   0.98   0.94     0.95   0.93
95th  1.00   1.00   1.01   1.00   1.00     1.05   1.02   1.02   1.02   1.14     1.04   1.13
Variance function 3 (Negative Binomial)
CR    1000   1000   1000   990    997      1000   1000   980    1000   999      1000   1000
Mean  1.00   1.00   1.00   1.00   1.00     1.02   1.00   0.99   1.53   1.41     0.99   0.96
SD    0.00   0.00   0.00   0.02   0.00     0.15   0.18   0.18   1.79   0.35     0.22   0.27
Med   1.00   1.00   1.00   1.00   1.00     1.02   0.99   0.98   1.26   1.36     1.00   0.93
5th   1.00   1.00   1.00   1.00   1.00     0.77   0.72   0.72   0.73   0.93     0.65   0.61
95th  1.00   1.00   1.00   1.02   1.00     1.27   1.33   1.31   2.91   2.03     1.34   1.42
Variance function 4 (Gaussian)
CR    1000   1000   1000   1000   998      1000   1000   997    1000   1000     1000   1000
Mean  1.00   1.00   1.00   1.00   1.00     1.00   1.00   1.00   1.00   1.01     1.00   1.01
SD    0.00   0.00   0.00   0.00   0.00     0.01   0.00   0.00   0.00   0.06     0.01   0.06
Med   1.00   1.00   1.00   1.00   1.00     1.00   1.00   1.00   1.00   1.00     1.00   1.00
5th   1.00   1.00   1.00   1.00   1.00     0.98   0.99   0.99   1.00   0.97     0.99   0.97
95th  1.00   1.00   1.00   1.00   1.00     1.02   1.01   1.01   1.00   1.06     1.01   1.06
Variance function 5 (Inverse Gaussian)
CR    993    999    993    982    993      993    1000   991    992    994      1000   1000
Mean  1.00   1.00   1.00   1.00   1.00     0.97   0.97   0.96   1.16   0.98     0.96   0.96
SD    0.00   0.01   0.06   0.06   0.06     0.06   0.19   0.17   0.90   0.07     0.18   0.12
Med   1.00   1.00   1.00   1.00   1.00     0.97   0.97   0.97   0.98   0.98     0.97   0.97
5th   1.00   1.00   1.00   1.00   1.00     0.90   0.82   0.83   0.81   0.93     0.79   0.84
95th  1.00   1.00   1.00   1.00   1.00     1.04   1.12   1.11   2.36   1.05     1.10   1.05
Variance function 6 (Other)
CR    1000   1000   1000   1000   1000     1000   1000   1000   1000   1000     1000   1000
Mean  1.00   1.00   1.00   1.00   1.00     1.00   1.00   1.00   1.00   1.01     1.00   1.01
SD    0.00   0.00   0.00   0.00   0.00     0.02   0.01   0.01   0.00   0.04     0.01   0.04
Med   1.00   1.00   1.00   1.00   1.00     1.00   1.00   1.00   1.00   1.01     1.00   1.00
5th   1.00   1.00   1.00   1.00   1.00     0.97   0.99   0.99   0.99   0.96     0.98   0.95
95th  1.00   1.00   1.00   1.00   1.00     1.03   1.01   1.01   1.01   1.08     1.02   1.08
Notes: See Table 3.
[Plot panels: Gamma, Poisson, NegBin, Gaussian. Vertical axis: Residuals. Horizontal axis: Predicted conditional mean.]
Figure 3: Iterative Structural estimators: Predictions and Pearson Residuals
Notes: The figure plots predictions of trade flows against Pearson residuals for iterative structural GLM estimates of the
trade gravity model (21), corresponding to columns 6 to 9 of Table 8.
[Plot panels: Gamma, Poisson, NegBin, Gaussian. Vertical axis: Kernel density. Horizontal axis: Deviance residuals.
Legend: Residuals, Normal density. Kernel: epanechnikov; bandwidth = 0.2751 (Gamma), 4.1978 (Poisson), 0.2536 (NegBin),
321.4131 (Gaussian).]
Figure 4: Iterative Structural estimators: Density of deviance residuals
Notes: The figure plots kernel density of deviance residuals for iterative structural GLM estimates of the trade gravity
model (21), corresponding to columns 6 to 9 of Table 8.
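The density panels are labelled with an Epanechnikov kernel and a panel-specific bandwidth. A kernel density estimate of this type can be sketched as below, using the textbook kernel K(u) = 0.75 * (1 - u^2) on |u| <= 1. Some statistical packages apply a rescaled variant of this kernel under the same name, so this is an illustration of the technique under that assumption rather than a reproduction of the plotted curves; the function names are hypothetical.

```python
def epanechnikov(u):
    """Textbook Epanechnikov kernel with support [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_density(x, data, bandwidth):
    """Kernel density estimate at point x: (1 / (n h)) * sum K((x - x_i) / h)."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    total = sum(epanechnikov((x - xi) / bandwidth) for xi in data)
    return total / (len(data) * bandwidth)


if __name__ == "__main__":
    # Hypothetical deviance residuals and an arbitrary bandwidth.
    residuals = [-1.2, -0.4, -0.1, 0.0, 0.3, 0.9, 2.5]
    grid = [-2.0 + 0.5 * i for i in range(11)]
    for g in grid:
        print(g, round(kernel_density(g, residuals, 0.8), 4))
```

A larger bandwidth smooths the estimate over a wider window, which is relevant when comparing panels whose reported bandwidths differ by orders of magnitude.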