QML Estimation of Dynamic Spatial Panel Data Models with Endogenous Spatial Weight Matrices
Xi Qu*
Antai College of Economics and Management, Shanghai Jiao Tong University
Lung-fei Lee
Department of Economics, The Ohio State University
May 20, 2014
Abstract
This paper investigates the adjusted quasi-maximum likelihood estimation of spatial panel models with both individual and time fixed effects. The spatial weight matrices are constructed from economic variables and can be endogenous and time varying. In this setting, we consider a dynamic spatial panel data model when the time dimension is short and the same model when the time dimension is long. We establish the consistency and asymptotic normality of the QML estimators in these two settings and investigate their finite sample properties by a Monte Carlo study.
JEL classification: C31; C51
Keywords: Spatial panel models; Endogenous spatial weight matrices; Fixed effects; Maximum likelihood
1 Introduction
Spatial panel data models are standard tools for analyzing data with both cross-sectional and dynamic dependence among economic units. They generalize the cross-sectional spatial autoregressive (SAR) model proposed by Cliff & Ord (1973). Recently, there has been much progress in empirical and theoretical work on spatial panel data models. For the static case, spatial panel data models have been applied to agricultural economics (Druska & Horrace, 2004), transportation research (Frazier & Kockelman, 2005), public economics (Egger et al., 2005), and consumer demand (Baltagi & Li, 2006), to name a few. For the dynamic case, spatial dynamic panel data models have been applied to the growth convergence of countries and regions (Ertur & Koch, 2007), regional markets (Keller & Shiue, 2007), labor economics (Foote, 2007), public economics (Revelli, 2001; Tao, 2005; Franzese, 2007), and some other fields. For estimation and statistical inference, random effects and fixed effects spatial panel models are most commonly used. For the random effects model, Baltagi et al. (2003, 2007a, 2007b), Mutl (2006) and Kapoor et al. (2007) investigate various specifications with error components. For the fixed effects model, Elhorst (2005), Korniotis (2010), Su & Yang (2007), Yu et al. (2008, 2012) and Lee & Yu (2010a) study static or dynamic models under various spatial structures.
*Corresponding author: [email protected], Antai College of Economics and Management, Shanghai Jiao Tong University, Shanghai, China, 200052.
Mutl & Pfaffermayr (2010) and Lee & Yu (2010b) consider the estimation of spatial panel data models with both fixed and random effects specifications, and propose Hausman-type specification tests.
In the current literature on spatial panel data models, the spatial weights matrix is usually specified to be exogenous and time invariant. This is plausible if spatial weight matrices are based on contiguity or geographic distances among regions. But there are plenty of cases in which spatial weights are constructed from economic or socioeconomic distances. For example, Aiello and Cardamone (2008) construct their spatial weights from a variable that reflects firms' technological similarity and geographical proximity to study R&D spillovers in Italy. In Crabb and Vandenbussche (2008), in addition to physical distance, spatial weight matrices are constructed from inverse trade shares and the inverse distance between GDP per capita. When elements of a spatial weights matrix are constructed from economic or socioeconomic characteristics of regions (or districts) in a panel setting, these characteristics might be endogenous and changing over time. Qu and Lee (2013) study a cross-sectional SAR model with endogenous spatial weights and find that ignoring the endogeneity in spatial weights matrices would have substantial consequences for the estimates. Lee & Yu (2012) consider spatial dynamic panel data models with time varying spatial weights matrices, but they assume the weights are still exogenous. One may wonder whether ignoring the endogeneity in spatial weights matrices would have severe consequences in a panel setting, and whether spatial panel models with endogenous time varying spatial weights can be easily handled and estimated. These questions motivate our investigation of spatial panel data models with endogenous spatial weights. This paper investigates the quasi-maximum likelihood (QML) estimation of static and dynamic spatial panel models under the setting of endogenous and time varying spatial weights matrices.
This paper is organized as follows. Section 2 introduces the model and presents the likelihood function to be maximized. Section 3 establishes asymptotic properties of the QML estimators. We show that the QML estimates are consistent and asymptotically normal. Monte Carlo results for various estimators are provided in Section 4. Section 5 concludes the paper. Some lemmas and proofs are collected in the Appendices.
2 The Model
2.1 Model specification
Following Jenish and Prucha (2009, 2012), we consider spatial processes located on a (possibly) unevenly spaced lattice $D \subseteq \mathbb{R}^d$, $d \ge 1$. The asymptotic methods we employ are increasing domain asymptotics: growth of the sample is ensured by an unbounded expansion of the sample region as in Jenish and Prucha (2012).¹ Let $\{(\varepsilon'_{i,nt}, v_{i,nt});\ i \in D_n,\ n \in \mathbb{N},\ t = 1, \ldots, T\}$ be a triangular double array of real random variables defined on a probability space $(\Omega, \mathcal{F}, P)$, where the index set $D_n \subset D$ is a finite set.
In this paper, we consider a dynamic spatial panel data model

$$Y_{nt} = \lambda_1 W_{nt} Y_{nt} + \lambda_2 W_{n,t-1} Y_{n,t-1} + \gamma Y_{n,t-1} + X_{1nt}\beta + c_n + \alpha_t l_n + V_{nt}, \tag{1}$$

where $Y_{nt} = (y_{1,nt}, y_{2,nt}, \ldots, y_{n,nt})'$ and $V_{nt} = (v_{1,nt}, v_{2,nt}, \ldots, v_{n,nt})'$ are $n$-dimensional column vectors, and the $v_{i,nt}$'s are i.i.d. across $i$ and $t$ with zero mean and variance $\sigma_v^2$. $X_{1nt}$ is an $n \times k_1$ matrix of individually and time varying non-stochastic regressors. $\beta$ is a $k_1$-dimensional vector of coefficients, and $\lambda_1$, $\lambda_2$, and $\gamma$ are scalar coefficients. $c_n$ is an $n$-dimensional vector of individual fixed effects, $\alpha_t$ is a scalar time fixed effect, and $l_n$ is the $n$-dimensional vector of ones. The spatial weight matrix $W_{nt}$ is an $n \times n$ matrix with each entry constructed as $(W_{nt})_{ij} = w_{ij,nt} = g(z_{i,nt}, z_{j,nt})$, where $z_{i,nt}$ is a $p$-dimensional row vector. For any $i = 1, \ldots, n$, $z_{i,nt}$ has the
¹ Infill asymptotics have not been developed for a NED process in the literature.
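model

$$z_{i,nt} = x'_{i,2nt}\Gamma + d'_{i,n} + g'_t + \varepsilon'_{i,nt},$$

where $x_{i,2nt}$ is a $k_2$-dimensional vector of individually and time varying non-stochastic regressors, $\Gamma$ is a $k_2 \times p$ matrix of coefficients, $d_{i,n}$ is a $p$-dimensional constant vector invariant over time, $g_t$ is a $p$-dimensional constant vector invariant over individuals, and $\varepsilon_{i,nt}$ is a $p$-dimensional random vector. Denote the $n \times k_2$ matrix $X_{2nt} = (x_{1,2nt}, \ldots, x_{n,2nt})'$ and the $n \times p$ matrices $Z_{nt} = (z'_{1,nt}, \ldots, z'_{n,nt})'$, $d_n = (d_{1,n}, \ldots, d_{n,n})'$, and $\varepsilon_{nt} = (\varepsilon_{1,nt}, \ldots, \varepsilon_{n,nt})'$, so that the $i$th rows of these matrices are $x'_{i,2nt}$, $z_{i,nt}$, $d'_{i,n}$, and $\varepsilon'_{i,nt}$, respectively. Then we can write in matrix form that

$$Z_{nt} = X_{2nt}\Gamma + d_n + l_n \otimes g'_t + \varepsilon_{nt}. \tag{2}$$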
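
As an illustration of equation (2) and of how $W_{nt}$ is built from $Z_{nt}$, the following is a minimal sketch for a single period. The function $g$ is left general in the text; the inverse-distance form used here, the helper name `row_normalized_weights`, and all numerical values are assumptions made only for this example.

```python
import numpy as np

def row_normalized_weights(z, eps=1e-8):
    """Build W with w_ij = g(z_i, z_j), zero diagonal, rows summing to one.

    The choice g(z_i, z_j) = 1 / (eps + ||z_i - z_j||) is a hypothetical
    example; the paper leaves g unspecified.
    """
    diff = z[:, None, :] - z[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    w = 1.0 / (eps + dist)
    np.fill_diagonal(w, 0.0)              # w_ii = 0
    return w / w.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n, p, k2 = 6, 2, 3                        # illustrative dimensions
X2 = rng.normal(size=(n, k2))             # X_{2nt}, n x k2
Gamma = rng.normal(size=(k2, p))          # Gamma, k2 x p
d_n = rng.normal(size=(n, p))             # individual effects in the Z equation
g_t = rng.normal(size=p)                  # time effect in the Z equation
eps_nt = rng.normal(size=(n, p))          # disturbances epsilon_nt

# Equation (2): Z_nt = X_2nt Gamma + d_n + l_n (x) g_t' + eps_nt
Z = X2 @ Gamma + d_n + np.ones((n, 1)) * g_t + eps_nt
W = row_normalized_weights(Z)
```

Because $W_{nt}$ is a function of $Z_{nt}$, and hence of $\varepsilon_{nt}$, any correlation between $\varepsilon_{nt}$ and the outcome disturbance carries over to the weights.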
2.2 Source of endogeneity
We consider $n$ agents in an area, each endowed with a predetermined location $i$. Due to competition or spillover effects, at period $t$ each agent $i$ has an outcome $y_{i,nt}$ directly affected by its neighbors' current outcomes $y_{j,nt}$, its own outcome from the last period $y_{i,n,t-1}$, and its neighbors' outcomes from the last period $y_{j,n,t-1}$. The spatial weight $w_{ij,nt}$ is a measure of the relative strength of the linkage between agents $i$ and $j$ at time $t$. However, this weight $w_{ij,nt}$ is not predetermined but depends on some observable random variable $Z_{nt}$. We can think of $z_{i,nt}$ as economic variables at location $i$ and time $t$, such as GDP, consumption, or the economic growth rate, which influence the strength of links across units. We have the following assumptions.
Assumption 1 The lattice $D \subset \mathbb{R}^{d_0}$, $d_0 \ge 1$, is infinitely countable. All elements in $D$ are located at distances of at least $dis_0 > 0$ from each other, i.e., $\forall i, j \in D$: $\rho_{ij} \ge dis_0$, where $\rho_{ij}$ is the distance between locations $i$ and $j$; w.l.o.g. we assume that $dis_0 = 1$.
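Assumption 2 The error terms $v_{i,nt}$ and $\varepsilon_{i,nt}$ have a joint distribution: $(v_{i,nt}, \varepsilon'_{i,nt})' \sim i.i.d.(0, \Sigma_{v\varepsilon})$, where

$$\Sigma_{v\varepsilon} = \begin{pmatrix} \sigma_v^2 & \sigma'_{v\varepsilon} \\ \sigma_{v\varepsilon} & \Sigma_\varepsilon \end{pmatrix}$$

is positive definite, $\sigma_v^2$ is a scalar variance, the covariance $\sigma_{v\varepsilon} = (\sigma_{v\varepsilon_1}, \ldots, \sigma_{v\varepsilon_p})'$ is a $p$-dimensional vector, and $\Sigma_\varepsilon$ is a $p \times p$ matrix. The moments $\sup_{i,n,t} E|v_{i,nt}|^{4+\delta_\varepsilon}$ and $\sup_{i,n,t} E\|\varepsilon_{i,nt}\|^{4+\delta_\varepsilon}$ exist for some $\delta_\varepsilon > 0$. Furthermore, $E(v_{i,nt} \mid \varepsilon_{i,nt}) = \varepsilon'_{i,nt}\delta$ and $Var(v_{i,nt} \mid \varepsilon_{i,nt}) = \sigma_\xi^2$.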
The endogeneity of $W_{nt}$ comes from the correlation between $v_{i,nt}$ and $\varepsilon_{i,nt}$. If $\sigma_{v\varepsilon}$ is zero, the spatial weight matrix $W_{nt}$ can be treated as strictly exogenous and we can apply the conventional methodology of spatial panel data models for estimation. However, if $\sigma_{v\varepsilon}$ is not zero, $W_{nt}$ becomes an endogenous spatial weights matrix.
2.3 The QML estimation
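From the two conditional moment assumptions in Assumption 2, we have the $p$-dimensional column vector $\delta = \Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}$ and the scalar $\sigma_\xi^2 = \sigma_v^2 - \sigma'_{v\varepsilon}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}$. Denote $\xi_{nt} = V_{nt} - \varepsilon_{nt}\delta$; then its mean conditional on $\varepsilon_{nt}$ is zero and its conditional variance matrix is $\sigma_\xi^2 I_n$. In particular, $\xi_{nt}$ is uncorrelated with the terms of $\varepsilon_{nt}$ and the variance of $\xi_{nt}$ is $\sigma_{\xi 0}^2 I_n$.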
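The outcome equation (1) becomes

$$Y_{nt} = \lambda_1 W_{nt}Y_{nt} + \lambda_2 W_{n,t-1}Y_{n,t-1} + \gamma Y_{n,t-1} + X_{1nt}\beta + (Z_{nt} - X_{2nt}\Gamma - d_n - l_n \otimes g'_t)\delta + c_n + \alpha_t l_n + \xi_{nt}, \tag{3}$$

with $E(\xi_{i,nt} \mid \varepsilon_{i,nt}) = 0$ and $E(\xi^2_{i,nt} \mid \varepsilon_{i,nt}) = \sigma_\xi^2$, and the $\xi_{i,nt}$'s are i.i.d. across $i$ and $t$. Our subsequent asymptotic analysis will mainly rely on equation (3), where $(Z_{nt} - X_{2nt}\Gamma - d_n - l_n \otimes g'_t)$ are control variables that control for the endogeneity of $W_{nt}$. Assumption 2 is relatively general: it imposes no specific distribution on the disturbances, as it is based only on conditional moment restrictions. In the special case that $(v_{i,nt}, \varepsilon'_{i,nt})'$ has a jointly normal distribution, $v_{i,nt} \mid \varepsilon_{i,nt} \sim N(\sigma'_{v\varepsilon}\Sigma_\varepsilon^{-1}\varepsilon_{i,nt},\ \sigma_v^2 - \sigma'_{v\varepsilon}\Sigma_\varepsilon^{-1}\sigma_{v\varepsilon})$ and $\xi_{nt}$ is independent of $\varepsilon_{nt}$ in equation (3).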
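
The control-function decomposition above can be checked numerically. The sketch below computes $\delta = \Sigma_\varepsilon^{-1}\sigma_{v\varepsilon}$ and $\sigma_\xi^2$ for an assumed covariance structure, then simulates jointly normal draws to confirm that $\xi = v - \varepsilon'\delta$ is uncorrelated with $\varepsilon$. The numerical values of $\sigma_v^2$, $\sigma_{v\varepsilon}$, and $\Sigma_\varepsilon$ are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Illustrative (assumed) second moments with p = 2
sigma_v2 = 1.0
sigma_ve = np.array([0.5, 0.3])
Sigma_e = np.array([[1.0, 0.2],
                    [0.2, 1.0]])

# delta = Sigma_e^{-1} sigma_ve, sigma_xi^2 = sigma_v^2 - sigma_ve' Sigma_e^{-1} sigma_ve
delta = np.linalg.solve(Sigma_e, sigma_ve)
sigma_xi2 = sigma_v2 - sigma_ve @ delta

# Population covariance between xi = v - eps' delta and eps is exactly zero
cov_xi_eps = sigma_ve - Sigma_e @ delta

# Monte Carlo check under joint normality
Sigma = np.block([[np.array([[sigma_v2]]), sigma_ve[None, :]],
                  [sigma_ve[:, None], Sigma_e]])
rng = np.random.default_rng(1)
draws = rng.multivariate_normal(np.zeros(3), Sigma, size=200_000)
v, eps = draws[:, 0], draws[:, 1:]
xi = v - eps @ delta

sample_cov = (xi[:, None] * eps).mean(axis=0)   # should be close to zero
sample_var = xi.var()                            # should be close to sigma_xi2
```

When $\sigma_{v\varepsilon} = 0$ the same computation gives $\delta = 0$ and $\xi_{nt} = V_{nt}$, which is the exogenous-weights case.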
2.3.1 A dynamic spatial panel model with large T
In this setting, we consider the row-normalized $W_{nt}$. Let $(F_{n,n-1},\ l_n/\sqrt{n})$ be the orthonormal matrix of eigenvectors of $J_n = I_n - (1/n)\,l_n l'_n$, where $F_{n,n-1}$ corresponds to the eigenvalues of one and $l_n/\sqrt{n}$ corresponds to the eigenvalue zero. From Lee and Yu (2012), denoting $Y^*_{nt} = F'_{n,n-1}Y_{nt}$ and other variables similarly, we have

$$Y^*_{nt} = \lambda_1 W^*_{nt}Y^*_{nt} + \lambda_2 W^*_{n,t-1}Y^*_{n,t-1} + \gamma Y^*_{n,t-1} + X^*_{1nt}\beta + (Z^*_{nt} - X^*_{2nt}\Gamma - d^*_n)\delta + c^*_n + \xi^*_{nt}, \tag{4}$$

where $W^*_{nt} = F'_{n,n-1}W_{nt}F_{n,n-1}$, $X^*_{nt} = F'_{n,n-1}X_{nt}$, $c^*_n = F'_{n,n-1}c_n$, $Z^*_{nt} = F'_{n,n-1}Z_{nt}$, $d^*_n = F'_{n,n-1}d_n$, $\xi^*_{nt} = F'_{n,n-1}\xi_{nt}$, and $\xi^*_{nt}$ is an $(n-1)$-dimensional disturbance vector with zero mean and variance matrix $\sigma_\xi^2 I_{n-1}$. In this format, the time effects are eliminated and the number of observations is $T(n-1)$. Denote $\theta = (\lambda_1, \lambda_2, \gamma, \beta', [vec(\Gamma)]', \delta', \sigma_\xi^2, \alpha')'$ with $\alpha$ being the vector of all distinct elements in $\Sigma_\varepsilon$. The quasi log likelihood function is
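$$\begin{aligned}
\ln L_{n,T}(\theta, c^*_n, d^*_n) ={}& -\frac{(n-1)T}{2}\ln\!\left(2\pi\sigma_\xi^2|\Sigma_\varepsilon|\right) + \sum_{t=1}^T \ln|I_{n-1} - \lambda_1 W^*_{nt}| - \frac{1}{2\sigma_\xi^2}\sum_{t=1}^T \xi^{*\prime}_{nt}(\theta, c^*_n, d^*_n)\,\xi^*_{nt}(\theta, c^*_n, d^*_n) \\
& -\frac{1}{2}\sum_{t=1}^T vec(Z^*_{nt} - X^*_{2nt}\Gamma - d^*_n)'(\Sigma_\varepsilon^{-1} \otimes I_{n-1})\,vec(Z^*_{nt} - X^*_{2nt}\Gamma - d^*_n),
\end{aligned}$$

where $\xi^*_{nt}(\theta, c^*_n, d^*_n) = (I_{n-1} - \lambda_1 W^*_{nt})Y^*_{nt} - (\lambda_2 W^*_{n,t-1}Y^*_{n,t-1} + \gamma Y^*_{n,t-1}) - [X^*_{1nt}\beta + c^*_n + (Z^*_{nt} - X^*_{2nt}\Gamma - d^*_n)\delta]$. As $|I_{n-1} - \lambda_1 W^*_{nt}| = \frac{1}{1-\lambda_1}|I_n - \lambda_1 W_{nt}|$ and $(I_{n-1} - \lambda_1 W^*_{nt})^{-1} = F'_{n,n-1}(I_n - \lambda_1 W_{nt})^{-1}F_{n,n-1}$, the quasi log likelihood function for $Y^*_{nt}$ can be expressed in terms of $Y_{nt}$ as

$$\begin{aligned}
\ln L_{n,T}(\theta, c_n, d_n) ={}& -\frac{(n-1)T}{2}\ln\!\left(2\pi\sigma_\xi^2|\Sigma_\varepsilon|\right) - T\ln(1-\lambda_1) + \sum_{t=1}^T \ln|I_n - \lambda_1 W_{nt}| - \frac{1}{2\sigma_\xi^2}\sum_{t=1}^T \xi'_{nt}(\theta, c_n, d_n)\,J_n\,\xi_{nt}(\theta, c_n, d_n) \\
& -\frac{1}{2}\sum_{t=1}^T vec(Z_{nt} - X_{2nt}\Gamma - d_n)'(\Sigma_\varepsilon^{-1} \otimes J_n)\,vec(Z_{nt} - X_{2nt}\Gamma - d_n),
\end{aligned}$$

where $\xi_{nt}(\theta, c_n, d_n) = (I_n - \lambda_1 W_{nt})Y_{nt} - [\lambda_2 W_{n,t-1}Y_{n,t-1} + \gamma Y_{n,t-1} + X_{1nt}\beta + (Z_{nt} - X_{2nt}\Gamma - d_n)\delta + c_n]$. We can concentrate out $c_n$ (and $d_n$) by substituting $c_n(\theta)$ for $c_n$ with $c_n(\theta) = \frac{1}{T}\sum_{t=1}^T[(I_n - \lambda_1 W_{nt})Y_{nt} - (\lambda_2 W_{n,t-1}Y_{n,t-1} + \gamma Y_{n,t-1}) - X_{1nt}\beta]$ and $d_n(\Gamma)$ for $d_n$ with $d_n(\Gamma) = \frac{1}{T}\sum_{t=1}^T(Z_{nt} - X_{2nt}\Gamma)$. Then the concentrated quasi log likelihood function is

$$\begin{aligned}
\ln L^c_{n,T}(\theta) ={}& -\frac{(n-1)T}{2}\ln\!\left(2\pi\sigma_\xi^2|\Sigma_\varepsilon|\right) - T\ln(1-\lambda_1) + \sum_{t=1}^T \ln|S_{nt}(\lambda_1)| \\
& -\frac{1}{2\sigma_\xi^2}\sum_{t=1}^T \widetilde{\xi}'_{nt}(\theta)\,J_n\,\widetilde{\xi}_{nt}(\theta) - \frac{1}{2}\sum_{t=1}^T vec(\widetilde{\varepsilon}_{nt}(\Gamma))'(\Sigma_\varepsilon^{-1} \otimes J_n)\,vec(\widetilde{\varepsilon}_{nt}(\Gamma)),
\end{aligned} \tag{5}$$

where $S_{nt}(\lambda_1) = I_n - \lambda_1 W_{nt}$, $\widetilde{\xi}_{nt}(\theta) = \widetilde{Y}_{nt} - \lambda_1\widetilde{W_{nt}Y_{nt}} - (\lambda_2\widetilde{W_{n,t-1}Y_{n,t-1}} + \gamma\widetilde{Y}_{n,t-1}) - [\widetilde{X}_{1nt}\beta + \widetilde{\varepsilon}_{nt}(\Gamma)\delta]$, and $\widetilde{\varepsilon}_{nt}(\Gamma) = \widetilde{Z}_{nt} - \widetilde{X}_{2nt}\Gamma$, with $\widetilde{U}_{nt} = U_{nt} - (1/T)\sum_{t=1}^T U_{nt}$ for any $U_{nt}$.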
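
The algebra behind the transformation by $F_{n,n-1}$ can be verified numerically. The following minimal sketch builds $F_{n,n-1}$ from the eigenvectors of $J_n$ and checks, for a randomly generated row-normalized weight matrix, that $F'_{n,n-1}W_{nt}Y_{nt} = W^*_{nt}Y^*_{nt}$ and that $|I_{n-1} - \lambda_1 W^*_{nt}| = |I_n - \lambda_1 W_{nt}|/(1-\lambda_1)$. The dimension, the value of $\lambda_1$, and the random weights are assumptions chosen only for illustration.

```python
import numpy as np

n, lam = 5, 0.4                          # illustrative size and lambda_1
rng = np.random.default_rng(2)

# A random row-normalized weight matrix with zero diagonal
W = rng.uniform(size=(n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

l_n = np.ones((n, 1))
J = np.eye(n) - l_n @ l_n.T / n

# eigh returns eigenvalues in ascending order, so the single zero
# eigenvalue (eigenvector l_n / sqrt(n)) comes first.
eigval, eigvec = np.linalg.eigh(J)
F = eigvec[:, 1:]                        # n x (n-1), eigenvalue-one eigenvectors

W_star = F.T @ W @ F
y = rng.normal(size=n)

lhs_vec = F.T @ (W @ y)                  # F' W Y
rhs_vec = W_star @ (F.T @ y)             # W* Y*

lhs_det = np.linalg.det(np.eye(n - 1) - lam * W_star)
rhs_det = np.linalg.det(np.eye(n) - lam * W) / (1.0 - lam)
```

Both identities rely on row normalization ($W_{nt}l_n = l_n$); they do not hold for a general weight matrix.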
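The QMLE $\hat\theta$ is the solution to the first order conditions $\partial \ln L^c_{n,T}(\hat\theta)/\partial\theta = 0$, where the expression of $\partial \ln L^c_{n,T}(\theta)/\partial\theta$ can be found in the Appendix. To simplify the estimation procedure, we may use a consistent estimator $\hat\Sigma_\varepsilon$ of $\Sigma_{\varepsilon 0}$ in the first stage and then plug it into the log likelihood function to estimate the other parameters. In the first stage, we estimate $J_n\widetilde{Z}_{nt} = J_n\widetilde{X}_{2nt}\Gamma + J_n\widetilde{\varepsilon}_{nt}$ by pooled OLS. Hence,

$$\hat\Gamma_{ols} = \Big(\sum_{t=1}^T \widetilde{X}'_{2nt}J_n\widetilde{X}_{2nt}\Big)^{-1}\Big(\sum_{t=1}^T \widetilde{X}'_{2nt}J_n\widetilde{Z}_{nt}\Big)$$

and

$$\hat\Sigma_\varepsilon = \frac{1}{(T-1)(n-1-k_2)}\sum_{t=1}^T (\widetilde{Z}_{nt} - \widetilde{X}_{2nt}\hat\Gamma_{ols})'J_n(\widetilde{Z}_{nt} - \widetilde{X}_{2nt}\hat\Gamma_{ols}).$$

Denote $\theta = (\theta^{*\prime}, \alpha')'$. In the second stage, our QMLE $\hat\theta^*$ is the solution to the first order conditions $\partial \ln L^c_{n,T}(\hat\theta^*, \hat\alpha)/\partial\theta^* = 0$, where $\hat\alpha$ collects the distinct elements of $\hat\Sigma_\varepsilon$.
3 Asymptotic property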
To analyze the asymptotic properties of our QMLE of the dynamic spatial panel data model, we need further assumptions.

Assumption 3
3.1) For any $i$, $j$, $n$, and $t$, the spatial weight $w_{ij,nt} \ge 0$, $w_{ii,nt} = 0$, $\sup_{n,t}\|W_{nt}\|_\infty = c_w < \infty$ and $\sup_{n,t}\|W_{nt}\|_1 = c_u < \infty$.
3.2) The parameter $\theta = (\lambda_1, \lambda_2, \gamma, \beta', vec(\Gamma)', \sigma_v^2, \alpha', \sigma'_{v\varepsilon})'$ is in a compact set $\Theta$ in the Euclidean space $\mathbb{R}^{k_\theta}$, where $\alpha$ is a vector of distinct parameters in $\Sigma_\varepsilon$ and $k_\theta = k + 4 + kp + p + J$; $k$ is the dimension of $\beta$, $p$ is the dimension of $\sigma_{v\varepsilon}$, $kp$ is the number of parameters in $\Gamma$, and $J$ is the dimension of $\alpha$. In this set, $\sigma_v^2 > 0$ and $\Sigma_\varepsilon(\alpha)$ is positive definite. The true parameter $\theta_0$ is contained in the interior of $\Theta$. Furthermore, $\lambda_{10}$, $\lambda_{20}$, and $\gamma_0$ satisfy $|\lambda_{10}|c_w < 1$.
3.3) The matrix $S_{nt}(\lambda_1)$ is nonsingular for all $\lambda_1$, and for any $n$ and $t$.
3.4) Let the $n \times k$ matrix $X_{nt}$ collect all distinct column vectors in $X_{1nt}$ and $X_{2nt}$. All elements in $X_{nt}$ are deterministic and bounded in absolute value. $\frac{1}{nT}\sum_{t=1}^T X'_{nt}J_nX_{nt}$ is nonsingular.
3.1 The dynamic spatial panel model with large T
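Denote $G_{nt}(\lambda_1) = W_{nt}S_{nt}^{-1}(\lambda_1)$ with $S_{nt} = I_n - \lambda_{10}W_{nt}$ and $G_{nt} = W_{nt}S_{nt}^{-1}$. The reduced form of $Y_{nt}$ is

$$\begin{aligned}
Y_{nt} &= \sum_{h=0}^\infty S_{nt}^{-1}(\gamma_0 I_n + \lambda_{20}W_{n,t-1})S_{n,t-1}^{-1}\cdots(\gamma_0 I_n + \lambda_{20}W_{n,t-h})S_{n,t-h}^{-1} \times (X_{1n,t-h}\beta_0 + c_{n0} + \alpha_{t-h,0}l_n + \varepsilon_{n,t-h}\delta_0 + \xi_{n,t-h}) \\
&= S_{nt}^{-1}\sum_{h=0}^\infty B^{(h)}_{nt}(X_{1n,t-h}\beta_0 + c_{n0} + \alpha_{t-h,0}l_n + \varepsilon_{n,t-h}\delta_0 + \xi_{n,t-h}),
\end{aligned} \tag{6}$$

where $B^{(h)}_{nt} = (\gamma_0 S_{n,t-1}^{-1} + \lambda_{20}G_{n,t-1})(\gamma_0 S_{n,t-2}^{-1} + \lambda_{20}G_{n,t-2})\cdots(\gamma_0 S_{n,t-h}^{-1} + \lambda_{20}G_{n,t-h}) = \prod_{k=1}^h(\gamma_0 S_{n,t-k}^{-1} + \lambda_{20}G_{n,t-k})$, with $B^{(0)}_{nt} = I_n$. In this setting, the asymptotic analysis is based on independence across different time periods $t$. We need further assumptions.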
Assumption 4 $\sup_{t,n}\sum_{s=0}^t\sum_{h=0}^\infty\|B^{(s+h)}_{nt}\|_\infty < \infty$ and $\sup_{t,n}\sum_{s=0}^t\sum_{h=0}^\infty\|B^{(s+h)}_{nt}\|_1 < \infty$.

For this assumption to hold, a sufficient condition is that $\sup_{t,n}\|\gamma_0 S_{n,t}^{-1} + \lambda_{20}G_{n,t}\|_\infty < q_w < 1$ and, for any $t$ and $n$, there exist at most $K$ ($K \ge 1$) columns of $\gamma_0 S_{n,t}^{-1} + \lambda_{20}G_{n,t}$ whose column sums exceed $q_w$, where $K$ is a fixed number that does not depend on $n$ or $t$.
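
The row-sum part of this sufficient condition is easy to examine numerically. The sketch below, under assumed parameter values and randomly generated row-normalized weights (so that $c_w = 1$), checks the identity $\gamma_0 S_{n,t}^{-1} + \lambda_{20}G_{n,t} = \gamma_0 I_n + (\gamma_0\lambda_{10} + \lambda_{20})G_{n,t}$, the bound $|\gamma_0| + |\gamma_0\lambda_{10} + \lambda_{20}|\,c_w/(1 - |\lambda_{10}|c_w)$ on its row-sum norm, and the resulting geometric decay of $\|B^{(h)}_{nt}\|_\infty$. The helper names and all numbers are hypothetical.

```python
import numpy as np

def inf_norm(a):
    """Maximum absolute row sum."""
    return np.abs(a).sum(axis=1).max()

def random_row_normalized(n, rng):
    """Hypothetical helper: random nonnegative weights, zero diagonal, unit row sums."""
    w = rng.uniform(size=(n, n))
    np.fill_diagonal(w, 0.0)
    return w / w.sum(axis=1, keepdims=True)

rng = np.random.default_rng(3)
n, H = 8, 5
lam1, lam2, gam = 0.3, 0.1, 0.2      # assumed values of lambda_10, lambda_20, gamma_0
c_w = 1.0                             # row-normalized weights

I_n = np.eye(n)
A_list, identity_ok = [], True
for _ in range(H):                    # A_k = gam * S_{n,t-k}^{-1} + lam2 * G_{n,t-k}
    W = random_row_normalized(n, rng)
    S_inv = np.linalg.inv(I_n - lam1 * W)
    G = W @ S_inv
    A = gam * S_inv + lam2 * G
    identity_ok &= np.allclose(A, gam * I_n + (gam * lam1 + lam2) * G)
    A_list.append(A)

bound = abs(gam) + abs(gam * lam1 + lam2) * c_w / (1.0 - abs(lam1) * c_w)

# B^{(h)} = A_1 A_2 ... A_h and its row-sum norm for h = 1, ..., H
B, norms = I_n.copy(), []
for A in A_list:
    B = B @ A
    norms.append(inf_norm(B))
```

With the values above the bound is below one, so $\|B^{(h)}_{nt}\|_\infty$ decays at least geometrically in $h$; the column-sum requirement in the condition has to be checked separately.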
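Assumption 5 Either a) $\lim_{T\to\infty}\frac{1}{nT}\sum_{t=1}^T E[(\widetilde{K}_{nt}, \widetilde{T}_{nt})'J_n(\widetilde{K}_{nt}, \widetilde{T}_{nt})]$ exists and is nonsingular, where $\widetilde{T}_{nt} = [\widetilde{W_{n,t-1}Y_{n,t-1}}, \widetilde{Y}_{n,t-1}, \widetilde{X}_{nt}, \widetilde{\varepsilon}_{nt}]$ and $K_{nt} = \lambda_{20}W_{n,t-1}Y_{n,t-1} + \gamma_0Y_{n,t-1} + X_{1nt}\beta_0 + \varepsilon_{nt}\delta_0$,
or b) $\lim_{T\to\infty}\frac{1}{nT}\sum_{t=1}^T E(\widetilde{T}'_{nt}J_n\widetilde{T}_{nt})$ exists and is nonsingular and $\lim_{T\to\infty}\frac{1}{nT}\sum_{t=1}^T S_{nt}(\lambda_1)'S_{nt}(\lambda_1)$ is not proportional to $\lim_{T\to\infty}\frac{1}{nT}\sum_{t=1}^T S'_{nt}S_{nt}$ with probability one whenever $\lambda_1 \ne \lambda_{10}$.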
Assumption 5 is an identification condition for the model. Assumption 5a) is a strong rank condition. Assumption 5b) exploits the i.i.d. disturbances of the model, so that the reduced form of $Y_{nt}$ has a unique variance structure. Assumption 5 also implies that the information matrix of this model is nonsingular.
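Theorem 1 Under Assumptions 1-3, and 5, $\theta_0$ is the unique maximizer of $\lim_{T\to\infty}\frac{1}{nT}E[\ln L^c_{n,T}(\theta)]$.

Theorem 2 Under Assumptions 1-3, and 5, the 2-stage QMLE $\hat\theta^*$ that maximizes $\ln L^c_{n,T}(\theta^*, \hat\alpha)$ has $\hat\theta^* \xrightarrow{p} \theta^*_0$ as $T \to \infty$ and

$$\sqrt{(n-1)T}\,(\hat\theta^* - \theta^*_0) + \sqrt{\frac{n-1}{T}}\,\Sigma^{-1}_{\theta_0,nT}\,a_{\theta_0,nT} + O_p\!\left(\max\!\left(\frac{\sqrt{n-1}}{T}, \frac{1}{\sqrt{T}}\right)\right) \xrightarrow{d} N\!\left(0,\ \Sigma^{-1}_{\theta_0}(\Sigma_{\theta_0} + \Omega_{\theta_0})\Sigma^{-1}_{\theta_0}\right),$$

where the expressions of $a_{\theta_0,nT}$, $\Sigma_{\theta_0}$, and $\Omega_{\theta_0}$ can be found in the Appendix.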
4 Monte Carlo simulation
5 Conclusion
This paper investigates the QML estimation of spatial panel models with both individual and time fixed effects. The spatial weight matrices are constructed from economic variables and can be endogenous and time varying. In this setting, we consider a static spatial panel model when the time dimension is short and a dynamic spatial panel model when the time dimension is long. We establish the consistency and asymptotic normality of the QML estimators in these two models. For the static model, the asymptotic analysis is based on near-epoch dependence across individuals, so we impose additional assumptions on the structure of the spatial weight matrices. For the dynamic model, the asymptotic analysis is based on independence over time, so we impose assumptions on the summability of the spatial weight matrices. Finite sample properties are studied by a Monte Carlo simulation.
6 Appendix
6.1 Expressions of the adjusted quasi log likelihood function
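In the large $T$ setting, the first order derivatives of (5) are

$$\begin{aligned}
\frac{\partial \ln L^c_{n,T}(\theta)}{\partial\lambda_1} &= \frac{1}{\sigma_\xi^2}\sum_{t=1}^T\Big\{\widetilde{W_{nt}Y_{nt}}'J_n\widetilde{\xi}_{nt}(\theta) - \sigma_\xi^2\,tr[J_nW_{nt}S_{nt}^{-1}(\lambda_1)]\Big\}, \\
\frac{\partial \ln L^c_{n,T}(\theta)}{\partial\lambda_2} &= \frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{W_{n,t-1}Y_{n,t-1}}'J_n\widetilde{\xi}_{nt}(\theta), \qquad
\frac{\partial \ln L^c_{n,T}(\theta)}{\partial\gamma} = \frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{Y}'_{n,t-1}J_n\widetilde{\xi}_{nt}(\theta), \\
\frac{\partial \ln L^c_{n,T}(\theta)}{\partial\beta} &= \frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{X}'_{1nt}J_n\widetilde{\xi}_{nt}(\theta), \qquad
\frac{\partial \ln L^c_{n,T}(\theta)}{\partial\delta} = \frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{\varepsilon}_{nt}(\Gamma)'J_n\widetilde{\xi}_{nt}(\theta), \\
\frac{\partial \ln L^c_{n,T}(\theta)}{\partial vec(\Gamma)} &= \sum_{t=1}^T(\Sigma_\varepsilon^{-1}\otimes\widetilde{X}'_{2nt}J_n)\,vec(\widetilde{\varepsilon}_{nt}(\Gamma)) - \frac{1}{\sigma_\xi^2}\,\delta\otimes\sum_{t=1}^T\widetilde{X}'_{2nt}J_n\widetilde{\xi}_{nt}(\theta), \\
\frac{\partial \ln L^c_{n,T}(\theta)}{\partial\sigma_\xi^2} &= -\frac{(n-1)T}{2\sigma_\xi^2} + \frac{1}{2\sigma_\xi^4}\sum_{t=1}^T\widetilde{\xi}_{nt}(\theta)'J_n\widetilde{\xi}_{nt}(\theta), \\
\frac{\partial \ln L^c_{n,T}(\theta)}{\partial\alpha} &= -\frac{(n-1)T}{2}\frac{\partial\ln|\Sigma_\varepsilon|}{\partial\alpha} - \frac{1}{2}\sum_{t=1}^T\frac{\partial\,tr\big(\Sigma_\varepsilon^{-1}\widetilde{\varepsilon}_{nt}(\Gamma)'J_n\widetilde{\varepsilon}_{nt}(\Gamma)\big)}{\partial\alpha}.
\end{aligned}$$

The first equality holds because $tr\,G_n(\lambda) - tr(J_nG_n(\lambda)) = 1/(1-\lambda)$ from Lee & Yu (2010c).
The second order derivatives are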
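Writing $\ell = \ln L^c_{n,T}(\theta)$, $\widetilde{\xi}_{nt} = \widetilde{\xi}_{nt}(\theta)$, and $\widetilde{\varepsilon}_{nt} = \widetilde{\varepsilon}_{nt}(\Gamma)$ for brevity,

$$\begin{aligned}
\frac{\partial^2\ell}{\partial\lambda_1^2} &= -\frac{1}{\sigma_\xi^2}\sum_{t=1}^T\Big\{\widetilde{W_{nt}Y_{nt}}'J_n\widetilde{W_{nt}Y_{nt}} + \sigma_\xi^2\,tr[(J_nW_{nt}S_{nt}^{-1}(\lambda_1))^2]\Big\}, &
\frac{\partial^2\ell}{\partial\lambda_1\partial\alpha} &= 0, \\
\frac{\partial^2\ell}{\partial\lambda_1\partial\lambda_2} &= -\frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{W_{nt}Y_{nt}}'J_n\widetilde{W_{n,t-1}Y_{n,t-1}}, &
\frac{\partial^2\ell}{\partial\lambda_1\partial\gamma} &= -\frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{W_{nt}Y_{nt}}'J_n\widetilde{Y}_{n,t-1}, \\
\frac{\partial^2\ell}{\partial\lambda_1\partial\beta} &= -\frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{W_{nt}Y_{nt}}'J_n\widetilde{X}_{1nt}, &
\frac{\partial^2\ell}{\partial\lambda_1\partial vec(\Gamma)} &= \frac{1}{\sigma_\xi^2}\,\delta\otimes\sum_{t=1}^T\widetilde{X}'_{2nt}J_n\widetilde{W_{nt}Y_{nt}}, \\
\frac{\partial^2\ell}{\partial\lambda_1\partial\sigma_\xi^2} &= -\frac{1}{\sigma_\xi^4}\sum_{t=1}^T\widetilde{W_{nt}Y_{nt}}'J_n\widetilde{\xi}_{nt}, &
\frac{\partial^2\ell}{\partial\lambda_1\partial\delta} &= -\frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{W_{nt}Y_{nt}}'J_n\widetilde{\varepsilon}_{nt}, \\
\frac{\partial^2\ell}{\partial\lambda_2^2} &= -\frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{W_{n,t-1}Y_{n,t-1}}'J_n\widetilde{W_{n,t-1}Y_{n,t-1}}, &
\frac{\partial^2\ell}{\partial\lambda_2\partial\gamma} &= -\frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{W_{n,t-1}Y_{n,t-1}}'J_n\widetilde{Y}_{n,t-1}, \\
\frac{\partial^2\ell}{\partial\lambda_2\partial\delta} &= -\frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{W_{n,t-1}Y_{n,t-1}}'J_n\widetilde{\varepsilon}_{nt}, &
\frac{\partial^2\ell}{\partial\lambda_2\partial vec(\Gamma)} &= \frac{1}{\sigma_\xi^2}\,\delta\otimes\sum_{t=1}^T\widetilde{X}'_{2nt}J_n\widetilde{W_{n,t-1}Y_{n,t-1}}, \\
\frac{\partial^2\ell}{\partial\lambda_2\partial\beta} &= -\frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{W_{n,t-1}Y_{n,t-1}}'J_n\widetilde{X}_{1nt}, &
\frac{\partial^2\ell}{\partial\lambda_2\partial\sigma_\xi^2} &= -\frac{1}{\sigma_\xi^4}\sum_{t=1}^T\widetilde{W_{n,t-1}Y_{n,t-1}}'J_n\widetilde{\xi}_{nt}, \\
\frac{\partial^2\ell}{\partial\lambda_2\partial\alpha} &= 0, &
\frac{\partial^2\ell}{\partial\gamma^2} &= -\frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{Y}'_{n,t-1}J_n\widetilde{Y}_{n,t-1}, \\
\frac{\partial^2\ell}{\partial\gamma\partial\beta} &= -\frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{Y}'_{n,t-1}J_n\widetilde{X}_{1nt}, &
\frac{\partial^2\ell}{\partial\gamma\partial vec(\Gamma)} &= \frac{1}{\sigma_\xi^2}\,\delta\otimes\sum_{t=1}^T\widetilde{X}'_{2nt}J_n\widetilde{Y}_{n,t-1}, \\
\frac{\partial^2\ell}{\partial\gamma\partial\delta} &= -\frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{Y}'_{n,t-1}J_n\widetilde{\varepsilon}_{nt}, &
\frac{\partial^2\ell}{\partial\gamma\partial\sigma_\xi^2} &= -\frac{1}{\sigma_\xi^4}\sum_{t=1}^T\widetilde{Y}'_{n,t-1}J_n\widetilde{\xi}_{nt}, \\
\frac{\partial^2\ell}{\partial\gamma\partial\alpha} &= 0, &
\frac{\partial^2\ell}{\partial\beta\partial\beta'} &= -\frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{X}'_{1nt}J_n\widetilde{X}_{1nt}, \\
\frac{\partial^2\ell}{\partial\beta\partial vec(\Gamma)'} &= \frac{1}{\sigma_\xi^2}\,\delta\otimes\sum_{t=1}^T\widetilde{X}'_{2nt}J_n\widetilde{X}_{1nt}, &
\frac{\partial^2\ell}{\partial\beta\partial\delta} &= -\frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{X}'_{1nt}J_n\widetilde{\varepsilon}_{nt}, \\
\frac{\partial^2\ell}{\partial\beta\partial\sigma_\xi^2} &= -\frac{1}{\sigma_\xi^4}\sum_{t=1}^T\widetilde{X}'_{1nt}J_n\widetilde{\xi}_{nt}, &
\frac{\partial^2\ell}{\partial\beta\partial\alpha} &= 0, \\
\frac{\partial^2\ell}{\partial vec(\Gamma)\partial vec(\Gamma)'} &= -\Big(\Sigma_\varepsilon^{-1} + \frac{\delta\delta'}{\sigma_\xi^2}\Big)\otimes\sum_{t=1}^T\widetilde{X}'_{2nt}J_n\widetilde{X}_{2nt}, &
\frac{\partial^2\ell}{\partial vec(\Gamma)\partial\sigma_\xi^2} &= \frac{\delta}{\sigma_\xi^4}\otimes\sum_{t=1}^T\widetilde{X}'_{2nt}J_n\widetilde{\xi}_{nt}, \\
\frac{\partial^2\ell}{\partial vec(\Gamma)\partial\delta'} &= -\frac{1}{\sigma_\xi^2}\,I_p\otimes\sum_{t=1}^T\widetilde{X}'_{2nt}J_n\widetilde{\xi}_{nt} + \frac{\delta}{\sigma_\xi^2}\otimes\sum_{t=1}^T\widetilde{X}'_{2nt}J_n\widetilde{\varepsilon}_{nt}, &
\frac{\partial^2\ell}{\partial vec(\Gamma)\partial\alpha'} &= \Big[I_p\otimes\sum_{t=1}^T\widetilde{X}'_{2nt}J_n\widetilde{\varepsilon}_{nt}\Big]\frac{\partial vec(\Sigma_\varepsilon^{-1})}{\partial\alpha'}, \\
\frac{\partial^2\ell}{\partial\delta\partial\delta'} &= -\frac{1}{\sigma_\xi^2}\sum_{t=1}^T\widetilde{\varepsilon}'_{nt}J_n\widetilde{\varepsilon}_{nt}, &
\frac{\partial^2\ell}{\partial\delta\partial\sigma_\xi^2} &= -\frac{1}{\sigma_\xi^4}\sum_{t=1}^T\widetilde{\varepsilon}'_{nt}J_n\widetilde{\xi}_{nt}, \\
\frac{\partial^2\ell}{\partial\delta\partial\alpha} &= 0, &
\frac{\partial^2\ell}{\partial(\sigma_\xi^2)^2} &= \frac{(n-1)T}{2\sigma_\xi^4} - \frac{1}{\sigma_\xi^6}\sum_{t=1}^T\widetilde{\xi}'_{nt}J_n\widetilde{\xi}_{nt}, \\
\frac{\partial^2\ell}{\partial\sigma_\xi^2\partial\alpha} &= 0, &
\frac{\partial^2\ell}{\partial\alpha\partial\alpha'} &= -\frac{(n-1)T}{2}\frac{\partial^2\ln|\Sigma_\varepsilon|}{\partial\alpha\partial\alpha'} - \frac{1}{2}\sum_{t=1}^T\frac{\partial^2tr\big(\Sigma_\varepsilon^{-1}\widetilde{\varepsilon}'_{nt}J_n\widetilde{\varepsilon}_{nt}\big)}{\partial\alpha\partial\alpha'}.
\end{aligned}$$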
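
The trace identity $tr\,G_n(\lambda) - tr(J_nG_n(\lambda)) = 1/(1-\lambda)$ used for the derivative with respect to $\lambda_1$, and the equivalence $tr[(J_nG_n)^2] = tr(J_nG_n^2)$ that underlies the corresponding second derivative, both follow from $G_n(\lambda)l_n = l_n/(1-\lambda)$ under row normalization. A minimal numerical check, with an assumed value of $\lambda$ and randomly generated weights, is:

```python
import numpy as np

rng = np.random.default_rng(4)
n, lam = 7, 0.35                         # illustrative size and lambda

# Random row-normalized weight matrix with zero diagonal
W = rng.uniform(size=(n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

l_n = np.ones((n, 1))
J = np.eye(n) - l_n @ l_n.T / n
G = W @ np.linalg.inv(np.eye(n) - lam * W)   # G(lambda) = W S^{-1}(lambda)

gap = np.trace(G) - np.trace(J @ G)          # expected: 1 / (1 - lambda)
tr_sq_a = np.trace(J @ G @ J @ G)            # tr[(J G)^2]
tr_sq_b = np.trace(J @ G @ G)                # tr[J G^2]
```

As with the determinant identity, these equalities depend on $W_{nt}l_n = l_n$ and need not hold without row normalization.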
6.2 Some useful lemmas
6.2.1 Lemmas related to the large T setting
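Claim 1 If $\sup_{t,n}\|\gamma_0S_{n,t}^{-1} + \lambda_{20}G_{n,t}\|_\infty < q_w < 1$ and, for any $t$ and $n$, there exist at most $K$ ($K \ge 1$) columns of $\gamma_0S_{n,t}^{-1} + \lambda_{20}G_{n,t}$ whose column sums exceed $q_w$, where $K$ is a fixed number that does not depend on $n$ or $t$, then Assumption 4 holds.

Proof. In this setting, $\sup_{t,n}\|B^{(h)}_{nt}\|_1 \le h\,q_uK\,q_w^{h-1}$, where $q_u = \sup_{t,n}\|\gamma_0S_{n,t}^{-1} + \lambda_{20}G_{n,t}\|_1$. Denote an index set $S_{nt}$ with $q_w \le \sum_{j=1}^n(\gamma_0S_{n,t}^{-1} + \lambda_{20}G_{n,t})_{ji} \le q_u$ if $i \in S_{nt}$ and $\sum_{j=1}^n(\gamma_0S_{n,t}^{-1} + \lambda_{20}G_{n,t})_{ji} < q_w$ if $i \notin S_{nt}$. Then $|S_{nt}| \le K$ for any $n$ and $t$. Consider the $k$th column sum of $B^{(h)}_{nt}$, i.e., $e'_nB^{(h)}_{nt}e_{k,n}$, where $e_n = (1, \ldots, 1)'$ and $e_{k,n}$ is the unit column vector with one in its $k$th entry and zeros in its other entries. As $I_n = \sum_{i=1}^n e_{i,n}e'_{i,n}$,

$$\begin{aligned}
e'_nB^{(h)}_{nt}e_{k,n} &= \sum_{i=1}^n e'_nB^{(1)}_{nt}e_{i,n}\,e'_{i,n}B^{(h-1)}_{n,t-1}e_{k,n} = \sum_{i\in S_{nt}} e'_nB^{(1)}_{nt}e_{i,n}\,e'_{i,n}B^{(h-1)}_{n,t-1}e_{k,n} + \sum_{i\notin S_{nt}} e'_nB^{(1)}_{nt}e_{i,n}\,e'_{i,n}B^{(h-1)}_{n,t-1}e_{k,n} \\
&\le K\Big(\max_{i\in S_{nt}} e'_nB^{(1)}_{nt}e_{i,n}\Big)\Big(\max_{i\in S_{nt}} e'_{i,n}B^{(h-1)}_{n,t-1}e_{k,n}\Big) + \Big(\max_{i\notin S_{nt}} e'_nB^{(1)}_{nt}e_{i,n}\Big)\sum_{i\notin S_{nt}} e'_{i,n}B^{(h-1)}_{n,t-1}e_{k,n} \\
&\le Kq_u\|B^{(h-1)}_{n,t-1}\|_\infty + q_w\|B^{(h-1)}_{n,t-1}\|_1 \le Kq_uq_w^{h-1} + q_w\|B^{(h-1)}_{n,t-1}\|_1 \le Kq_uq_w^{h-1} + q_w\sup_{t,n}\|B^{(h-1)}_{nt}\|_1.
\end{aligned}$$

As this inequality holds for any $k = 1, \ldots, n$ and any $t$, we have $\sup_{t,n}\|B^{(h)}_{nt}\|_1 \le q_uKq_w^{h-1} + q_w\sup_{t,n}\|B^{(h-1)}_{nt}\|_1$. By induction, we have $\sup_{t,n}\|B^{(h)}_{nt}\|_1 \le (h-1)q_uKq_w^{h-1} + q_uq_w^{h-1} \le h\,q_uKq_w^{h-1}$. Therefore, we can check

$$\sum_{s=0}^t\sum_{h=0}^\infty\|B^{(s+h)}_{nt}\|_\infty = \sum_{s=0}^t\sum_{h=0}^\infty\Big\|\prod_{k=1}^{s+h}(\gamma_0S_{n,t-k}^{-1} + \lambda_{20}G_{n,t-k})\Big\|_\infty \le \sum_{s=0}^t\sum_{h=0}^\infty\prod_{k=1}^{s+h}\|\gamma_0S_{n,t-k}^{-1} + \lambda_{20}G_{n,t-k}\|_\infty \le \sum_{s=0}^t\sum_{h=0}^\infty q_w^{s+h} \le \frac{1}{(1-q_w)^2}$$

and, for any $t$ and $n$,

$$\sum_{s=0}^t\sum_{h=0}^\infty\|B^{(s+h)}_{nt}\|_1 \le q_uK\sum_{s=0}^t\sum_{h=0}^\infty(h+s)\,q_w^{s+h-1} < \infty.$$

A sufficient condition for $\sup_{t,n}\|\gamma_0S_{n,t}^{-1} + \lambda_{20}G_{n,t}\|_\infty < 1$ is that $|\gamma_0| + |\gamma_0\lambda_{10} + \lambda_{20}|\frac{c_w}{1-|\lambda_{10}|c_w} < 1$. This is so because $S_{n,t}^{-1} = I_n + \lambda_{10}G_{n,t}$, which gives

$$\sup_{t,n}\|\gamma_0S_{n,t}^{-1} + \lambda_{20}G_{n,t}\|_\infty = \sup_{t,n}\|\gamma_0I_n + (\gamma_0\lambda_{10} + \lambda_{20})G_{n,t}\|_\infty \le |\gamma_0| + |\gamma_0\lambda_{10} + \lambda_{20}|\frac{c_w}{1-|\lambda_{10}|c_w}.$$

To bound the $k$th column sum of $\gamma_0S_{n,t}^{-1} + \lambda_{20}G_{n,t}$, we only need to consider the $k$th column sum of $G_{n,t}$. If at most $K$ ($K \ge 1$) columns of $G_{n,t}$ have column sums exceeding $\|G_{n,t}\|_\infty$, then at most $K$ columns of $\gamma_0S_{n,t}^{-1} + \lambda_{20}G_{n,t}$ have column sums exceeding $q_w$.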
Claim 2 Suppose �nt = f("nt; X; �) is an n�1 vector of variables, Cnt is an n�n matrix with each elementjCnt(i; j)j = jg("nt; X; �)j � c�ij deterministically. If supi;n;tEj�4i;ntj < c� and supi;n
Pj c�ij � Cc with Cc
not depending on t or n, then supt1;t2;tEj�0nt1Cnt�nt2 j � nCcc
1=2� , supt1;t2;tEj�
0nt1Cnt�nt2 j
2 � n2C2c c�,and Cov(�0nt1C1nt�nt2 ; �
0nt3C2nt�nt4) � n2c�Cc1Cc2, where Cc1 and Cc2 are the deterministic bounds for
supi;nP
j jC1nt(i; j)j and supi;nP
j jC2nt(i; j)j.
Proof. It is straightforward to show
Ej�0nt1Cnt�nt2 j = EjnXi=1
nXj=1
�i;nt1�j;nt2Cnt(i; j)j � EjnXi=1
nXj=1
�i;nt1�j;nt2c�ij j
�nXi=1
nXj=1
c�ijEj�i;nt1�j;nt2 j � c1=2�
nXi=1
nXj=1
c�ij � nc1=2� Cc:
The last inequality is from Cauchy�s inequality. And similarly,
supt1;t2;t
Ej�0nt1Cnt�nt2 j2 �
nXi=1
nXj=1
nXk=1
nXl=1
Ej�i;nt1�j;nt2�k;nt1�l;nt2Cnt(i; j)Cnt(k; l)j
�nXi=1
nXj=1
nXk=1
nXl=1
c�ijc�klEj�i;nt1�j;nt2�k;nt1�l;nt2 j � n
2c�C2c :
Therefore,
Cov(�0nt1C1nt�nt2 ; �0nt3C2nt�nt4) �
qV ar(�0nt1C1nt�nt2)
qV ar(�0nt3C2nt�nt4)
�rsupt1;t2;t
Ej�0nt1C1nt�nt2 j2rsupt1;t2;t
Ej�0nt1C2nt�nt2 j2 � n2c�Cc1Cc2:
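Denote

$$\mathbb{U}_{nt} = G_{nt}\sum_{h=0}^\infty B^{(h)}_{nt}\varepsilon_{n,t-h} = \sum_{h=0}^\infty P_{nt,h}\varepsilon_{n,t-h}, \quad\text{and}\quad \mathbb{V}_{ns} = \sum_{g=0}^\infty Q_{ns,g}\varepsilon_{n,s-g},$$

where $P_{nt,h} = G_{nt}B^{(h)}_{nt}$ and $Q_{ns,g} = G_{ns}B^{(g)}_{ns}$ are sequences of $n \times n$ square matrices. Let $\widetilde{\mathbb{U}}_{nt} = \mathbb{U}_{nt} - \bar{\mathbb{U}}_{nT}$ where $\bar{\mathbb{U}}_{nT} = \frac{1}{T}\sum_{t=1}^T\mathbb{U}_{nt}$, and $\widetilde{\widetilde{\mathbb{U}}}_{nt} = \mathbb{U}_{n,t-1} - \bar{\mathbb{U}}_{nT,-1}$ where $\bar{\mathbb{U}}_{nT,-1} = \frac{1}{T}\sum_{t=0}^{T-1}\mathbb{U}_{nt}$. Similar definitions apply to $\widetilde{\mathbb{V}}_{nt}$, $\bar{\mathbb{V}}_{nT}$, and $\widetilde{\widetilde{\mathbb{V}}}_{nt}$. As

$$\frac{1}{nT}\sum_{t=1}^T(\widetilde{W_{nt}Y_{nt}})'J_n\widetilde{W_{nt}Y_{nt}} = \frac{1}{nT}\sum_{t=1}^T(W_{nt}Y_{nt})'J_nW_{nt}Y_{nt} - \frac{1}{nT^2}\sum_{t=1}^T(W_{nt}Y_{nt})'J_n\sum_{s=1}^TW_{ns}Y_{ns},$$

we want to show a general result that, for any constant matrix $A_n$ that does not depend on the $\varepsilon_{nt}$'s,

$$\frac{1}{nT}\sum_{t=1}^T\mathbb{U}'_{nt}A_n\mathbb{V}_{nt} - E\Big(\frac{1}{nT}\sum_{t=1}^T\mathbb{U}'_{nt}A_n\mathbb{V}_{nt}\Big) = O_p\Big(\frac{1}{\sqrt{T}}\Big) \quad\text{and}\quad \frac{1}{n}\bar{\mathbb{U}}'_{nT}A_n\bar{\mathbb{V}}_{nT} - E\Big(\frac{1}{n}\bar{\mathbb{U}}'_{nT}A_n\bar{\mathbb{V}}_{nT}\Big) = O_p\Big(\frac{1}{\sqrt{T}}\Big).$$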
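Lemma 1 Under Assumptions 1-3, for any uniformly bounded constant matrix $A_n$ that does not depend on the $\varepsilon_{nt}$'s,

$$\frac{1}{nT}\sum_{t=1}^T\mathbb{U}'_{nt}A_n\mathbb{V}_{nt} - E\Big(\frac{1}{nT}\sum_{t=1}^T\mathbb{U}'_{nt}A_n\mathbb{V}_{nt}\Big) = O_p\Big(\frac{1}{\sqrt{T}}\Big).$$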
Proof. For t > s; Unt =Pt�s�1
h=0 Pnt;h"n;t�h +P1
l=0 Pnt;t�s+l"n;s�l. Therefore,
Cov(U0ntAnVnt;U
0nsAnVns)
= Cov[(t�s�1Xh1=0
Pnt;h1"n;t�h1 +1Xl1=0
Pnt;t�s+l1"n;s�l1)0An(
t�s�1Xh2=0
Qnt;h2"n;t�h2 +1Xl2=0
Qnt;t�s+l2"n;s�l2);
(
1Xg1=0
Pns;g1"n;s�g1)0An
1Xg2=0
Qns;g2"n;s�g2 ]
= Cov[(t�s�1Xh1=0
Pnt;h1"n;t�h1)0An
t�s�1Xh2=0
Qnt;h2"n;t�h2 ; (1Xg1=0
Pns;g1"n;s�g1)0An
1Xg2=0
Qns;g2"n;s�g2 ]
+Cov[(
1Xl1=0
Pnt;t�s+l1"n;s�l1)0An
t�s�1Xh2=0
Qnt;h2"n;t�h2 ; (
1Xg1=0
Pns;g1"n;s�g1)0An
1Xg2=0
Qns;g2"n;s�g2 ]
+Cov[(1X
h1=0
Pnt;h1"n;t�h1)0An
1Xl2=0
Qnt;t�s+l2"n;s�l2 ; (1Xg1=0
Pns;g1"n;s�g1)0An
1Xg2=0
Qns;g2"n;s�g2 ] (7)
= Cov[(
1Xl1=0
Pnt;t�s+l1"n;s�l1)0An
t�s�1Xh2=0
Qnt;h2"n;t�h2 ; (
1Xg1=0
Pns;g1"n;s�g1)0An
1Xg2=0
Qns;g2"n;s�g2 ]
+Cov[(
1Xh1=0
Pnt;h1"n;t�h1)0An
1Xl2=0
Qnt;t�s+l2"n;s�l2 ; (
1Xg1=0
Pns;g1"n;s�g1)0An
1Xg2=0
Qns;g2"n;s�g2 ]
� n2jjAnjj21c"1Xl1=0
jjPnt;t�s+l1 jj1 �t�s�1Xh2=0
jjQnt;h2 jj1 �1Xg1=0
jjPns;g1 jj1 �1Xg2=0
jjQns;g2 jj1
+n2c"jjAnjj211X
h1=0
1Xl2=0
1Xg1=0
1Xg2=0
jjPnt;h1 jj1 � jjQnt;t�s+l2 jj1 �1Xg1=0
jjPns;g1 jj1 �1Xg2=0
jjQns;g2 jj1 (8)
� n2jjAnjj21jjGntjj21jjGntjj21c"
1Xl1=0
jjB(t�s+l1)nt jj1 +1Xl2=0
jjB(t�s+l2)nt jj1
!: (9)
where c" = supi;n;tEjj"i;ntjj4+�" . The second equality holds because the third summand in (7) is a summationof two terms as
1Xh1=0
Pnt;h1"n;t�h1 =t�s�1Xh=0
Pnt;h"n;t�h +1Xl=0
Pnt;t�s+l"n;s�l:
The third equality holds because
Cov[(t�s�1Xh1=0
Pnt;h1"n;t�h1)0An
t�s�1Xh2=0
Qnt;h2"n;t�h2 ; (1Xg1=0
Pns;g1"n;s�g1)0An
1Xg2=0
Qns;g2"n;s�g2 ] = 0
from the independence of "nt�s across t. The fourth equality holds because P 0nt1;lAnQnt2;g 1 = kAnk1 � jjPnt1;ljj1 � jjQnt2;gjj1 � kAnk1 kGntk1 kGntk1 jjB(l)nt1 jj1jjB
(g)nt2 jj1
From this result, for any uniformly bounded constant matrix An, as
V ar(U0ntAnVnt) � E[(U0
ntAnVnt)2] � c"n2
(1Xh=0
Pnt;h)0An
1Xg2=0
Qnt;g2
2
1
;
V ar(1
n
TXt=1
U0ntAnVnt) =
1
n2
TXt=1
TXs=1
Cov(U0ntAnVnt;U
0nsAnVns)
=1
n2
TXt=1
V ar(U0ntAnVnt) +
2
n2
TXt=2
t�1Xs=1
Cov(U0ntAnVnt;U
0nsAnVns)
� 2
n2
TXt=1
tXs=1
n2Cq
1Xl1=0
jjB(t�s+l1)nt jj1 +1Xl2=0
jjB(t�s+l2)nt jj1
!= Tc;
where c is a constant not depending on T or n. Therefore,
1
nT
TXt=1
U0ntAnVnt � E
1
nT
TXt=1
U0ntAnVnt
!= Op(
1pT):
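Lemma 2 Under Assumptions 1-4, for any uniformly bounded constant matrix $A_n$ that does not depend on the $\varepsilon_{nt}$'s,

$$\frac{1}{n}\bar{\mathbb{U}}'_{nT}A_n\bar{\mathbb{V}}_{nT} - E\Big(\frac{1}{n}\bar{\mathbb{U}}'_{nT}A_n\bar{\mathbb{V}}_{nT}\Big) = O_p\Big(\frac{1}{\sqrt{T}}\Big).$$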
Proof. As1
nU0nTAnVnT =
1
nT 2
TXt=1
1Xh1=0
Pnt;h1"n;t�h1
!0An
TXs=1
1Xg=0
Qns;g"n;s�g
!;
we will show
V ar
�1
nU0nTAnVnT
�=
1
n2T 4
TXt1=1
TXs1=1
TXt2=1
TXs2=1
Cov[(1Xh=0
Pnt1;h"n;t1�h)0An
1Xg=0
Pns1;g"n;s1�g;
(1Xh=0
Pnt2;h"n;t2�h)0An
1Xg=0
Pns2;g"n;s2�g] = O(1
T):
To show this, we consider three di¤erent cases (we use Vs to denote the variance in each case, where s is thecase index):(a) At least two summation indices from t1, s1, t2, and s2 are the same. In this case, Va = O(1=T )
because
Cov[(1Xh=0
Pnt1;h"n;t1�h)0An
1Xg=0
Qns1;g"n;s1�g; (1Xh=0
Pnt2;h"n;t2�h)0An
1Xg=0
Qns2;g"n;s2�g]
� O(n2)
1Xh=0
1Xg=0
k(Pnt;h)0AnQns1;gk1
!2= O(n2):
(b) All the indices are di¤erent. We further divide this case into the following three sub cases.(b.1) min(t1; s1) > max(t2; s2) or min(t2; s2) > max(t1; s1): By symmetry, w.l.o.g, we assume t1 > s1 >
t2 > s2. Apparently,
Cov[(
t1�t2�1Xh=0
Pnt1;h"n;t1�h)0An
s1�t2�1Xg=0
Pns1;g"n;s1�g; (1Xh=0
Pnt2;h"n;t2�h)0An
1Xg=0
Pns2;g"n;s2�g] = 0:
Hence,
Cov[(1X
h1=0
Pnt1;h1"n;t1�h1)0An
1Xg1=0
Pns1;g1"n;s1�g1 ; (1X
h2=0
Pnt2;h2"n;t2�h2)0An
1Xg2=0
Pns2;g2"n;s2�g2 ]
= Cov[(1Xl=0
Pnt1;t1�t2+l"n;t2�l)0An
s1�t2�1Xh=0
Pns1;h"n;s1�h; (1Xh=0
Pnt2;h"n;t2�h)0An
1Xg=0
Pns2;g"n;s2�g]
+Cov[(
t1Xh=0
Pnt1;h"n;t1�h)0An
1Xl=0
Pns1;s1�t2+l"n;t2�l; (
1Xh2=0
Pnt2;h2"n;t2�h2)0An
1Xg2=0
Pns2;g2"n;s2�g2 ]
� O(n2)
1Xl1=0
jjB(t1�t2+l1)nt jj1 +1Xl2=0
jjB(s1�t2+l2)nt jj1
!: ((*))
Therefore,
Vb1 =1
n2T 4
TXt1=1
TXs1=1
TXt2=1
TXs2=1
O(n2)
1Xl1=0
jjB(t1�t2+l1)nt jj1 +1Xl2=0
jjB(s1�t2+l2)nt jj1
!= O(
1
T):
(b.2) max(t1; s1) > max(t2; s2) > min(t2; s2) > min(t1; s1). By symmetry, w.l.o.g, we assume t1 > t2 >
s2 > s1.
Cov[(1X
h1=0
Pnt1;h1"n;t1�h1)0An
1Xg1=0
Pns1;g1"n;s1�g1 ; (1X
h2=0
Pnt2;h2"n;t2�h2)0An
1Xg2=0
Pns2;g2"n;s2�g2 ]
= Cov[(
t1�t2�1Xh1=0
Pnt1;h1"n;t1�h1 +1Xl1=0
Pnt1;t1�t2+l1"n;t2�l1)0An
1Xg1=0
Pns1;g1"n;s1�g1 ;
(1X
h2=0
Pnt2;h2"n;t2�h2)0An
1Xg2=0
Pns2;g2"n;s2�g2 ]
= Cov[(
t1�t2�1Xh1=0
Pnt1;h1"n;t1�h1)0An
1Xg1=0
Pns1;g1"n;s1�g1 ; (1X
h2=0
Pnt2;h2"n;t2�h2)0An
1Xg2=0
Pns2;g2"n;s2�g2 ]
+Cov[(1Xl1=0
Pnt1;t1�t2+l1"n;t2�l1)0An
1Xg1=0
Pns1;g1"n;s1�g1 ; (1X
h2=0
Pnt2;h2"n;t2�h2)0An
1Xg2=0
Pns2;g2"n;s2�g2 ]
=nXi=1
E(e0i;nA0n
t1�t2�1Xh1=0
Pnt1;h1"n;t1�h1)Cov[e0i;n
1Xg1=0
Pns1;g1"n;s1�g1 ;
(1X
h2=0
Pnt2;h2"n;t2�h2)0An
1Xg2=0
Pns2;g2"n;s2�g2 ] +O(n2)
1Xl1=0
jjB(t1�t2+l1)nt jj1
= C2
nXi=1
Cov[e0i;n
1Xg1=0
Pns1;g1"n;s1�g1 ; (1X
h2=0
Pnt2;h2"n;t2�h2)0An
1Xg2=0
Pns2;g2"n;s2�g2 ] +O(n2)
1Xl1=0
jjB(t1�t2+l1)nt jj1
= O(n2)
1Xl2=0
jjB(s1�s1+l2)nt jj1 +O(n2)
1Xl1=0
jjB(t1�t2+l1)nt jj1:
The last equality holds because of ((*)) in (b.1). Therefore,
Vb2 =1
n2T 4
TXt1=1
TXs1=1
TXt2=1
TXs2=1
O(n2)[
1Xl2=0
jjB(s1�s1+l2)nt jj1 +O(n2)
1Xl1=0
jjB(t1�t2+l1)nt jj1] = O(1
T):
(b.3) max(t1; s1) > max(t2; s2) > min(t1; s1) > min(t2; s2). By symmetry, w.l.o.g, we assume t1 > t2 >
s1 > s2.
Cov[(1X
h1=0
Pnt1;h1"n;t1�h1)0An
1Xg1=0
Pns1;g1"n;s1�g1 ; (1X
h2=0
Pnt2;h2"n;t2�h2)0An
1Xg2=0
Pns2;g2"n;s2�g2 ]
= Cov[(
t1�t2�1Xh1=0
Pnt1;h1"n;t1�h1 +1Xl1=0
Pnt1;t1�t2+l1"n;t2�l1)0An
1Xg1=0
Pns1;g1"n;s1�g1 ;
(1X
h2=0
Pnt2;h2"n;t2�h2)0An
1Xg2=0
Pns2;g2"n;s2�g2 ]
= Cov[(
t1�t2�1Xh1=0
Pnt1;h1"n;t1�h1)0An
1Xg1=0
Pns1;g1"n;s1�g1 ; (1X
h2=0
Pnt2;h2"n;t2�h2)0An
1Xg2=0
Pns2;g2"n;s2�g2 ]
+Cov[(1Xl1=0
Pnt1;t1�t2+l1"n;t2�l1)0An
1Xg1=0
Pns1;g1"n;s1�g1 ; (1X
h2=0
Pnt2;h2"n;t2�h2)0An
1Xg2=0
Pns2;g2"n;s2�g2 ]
=nXi=1
E(e0i;nA0n
t1�t2�1Xh1=0
Pnt1;h1"n;t1�h1)Cov[e0i;n
1Xg1=0
Pns1;g1"n;s1�g1 ; (1X
h2=0
Pnt2;h2"n;t2�h2)0An
1Xg2=0
Pns2;g2"n;s2�g2 ]
+O(n2)1Xl1=0
jjB(t1�t2+l1)nt jj1
= C2
nXi=1
Cov[e0i;n
1Xg1=0
Pns1;g1"n;s1�g1 ; (
t2�s1�1Xh2=0
Pnt2;h"n;t2�h2 +1Xl2=0
Pnt2;t2�s1+l2"n;s1�l2)0An
1Xg2=0
Pns2;g2"n;s2�g2 ]
+O(n2)1Xl1=0
jjB(t1�t2+l1)nt jj1
= C2
nXi=1
Cov[e0i;n
1Xg1=0
Pns1;g1"n;s1�g1 ; (
t2�s1�1Xh2=0
Pnt2;h"n;t2�h2)0An
1Xg2=0
Pns2;g2"n;s2�g2 ]
+O(n2)
1Xl1=0
jjB(t2�s1+l1)nt jj1 +O(n2)1Xl1=0
jjB(t1�t2+l1)nt jj1
= C2
nXi=1
nXj=1
Cov[e0i;n
1Xg1=0
Pns1;g1"n;s1�g1 ; e0j;n
1Xg2=0
Pns2;g2"n;s2�g2 ]E(e0j;nA
0n
t2�s1�1Xh2=0
Pnt2;h"n;t2�h2)
+O(n2)1Xl1=0
jjB(t2�s1+l1)nt jj1 +O(n2)1Xl1=0
jjB(t1�t2+l1)nt jj1
= O(n2)jjB(s1�s2+l1)nt jj1 +O(n2)1Xl1=0
jjB(t2�s1+l1)nt jj1 +O(n2)1Xl1=0
jjB(t1�t2+l1)nt jj1:
Therefore,
Vb3 =1
n2T 4
TXt1=1
TXs1=1
TXt2=1
TXs2=1
O(n2)[1Xl=0
jjB(s1�s2+l)nt jj1 +1Xl=0
jjB(t2�s1+l)nt jj1 +1Xl=0
jjB(t1�t2+l)nt jj1] = O(1
T):
Combine all cases together, we have V ar�1nU
0nTAnVnT
�= O(1=T ) and hence,
1
nU0nTAnVnT � E
�1
nU0nTAnVnT
�= Op(
1pT):
1
nT
TXt=1
eU0ntAn
eVnt � E(1
nT
TXt=1
eU0ntAn
eVnt) = Op(1pT):
6.3 Proofs of the main result
Proof. First we show that limT!11nT E[L
cn;T (�)] attains its unique maximum at �0: Consider the objective
function
Lcn;T (�) = � (n� 1)T2
ln 2��2� j�"j+TXt=1
ln jSnt(�1)j � T ln(1� �1)
� 1
2�2�
TXt=1
e�0nt(�)Jne�nt(�)� 12TXt=1
( eZnt � eX2nt�)0(��1" Jn)( eZnt � eX2nt�): (10)
Let Xnt collect all distinct column vectors in X1nt and X2nt: The associated coe¢ cients are �+ and �+
corresponding to � and �. Denote Tnt = [Wn;t�1Yn;t�1; Yn;t�1; Xnt, "nt] and � = (�2; �; �+0; �0)0: As
Snt(�1)Ynt = (�10 � �1)Gnt(Tnt�0 + cn0 + at0ln + �nt) + Tnt�0 + cn0 + at0ln + �nt;
we have
Jne�nt(�) = Jnf(�10 � �1)( gGntTnt�0 + eGntcn0 + gGnt�nt) + eTnt(�0 � �) + eXnt(�+ � �+0 )� + e�ntg= (�10 � �1)Jn( gGntTnt�0 + eGntcn0) + Jn eTnt[�0 � �+ (02; (�+ � �+0 )�; 0p)]
+(�10 � �1)Jn gGnt�nt + Jne�ntand
TXt=1
[e�0nt(�)Jne�nt(�)]= [�10 � �1; �0 � �+ (02; (�+ � �+0 )�; 0p)]
TXt=1
[( gGntTnt�0 + eGntcn0; eTnt)0Jn( gGntTnt�0 + eGntcn0; eTnt)]�[�10 � �1; �0 � �+ (02; (�+ � �+0 )�; 0p)]0 +
TXt=1
[(�10 � �1)gGnt�nt + e�nt]0Jn[(�10 � �1)gGnt�nt + e�nt]+2
TXt=1
[(�10 � �1)gGnt�nt + e�nt]0Jn[(�10 � �1)( gGntTnt�0 + eGntcn0) + eTnt(�0 � �+ (0; 0; (�+ � �+0 )�; 0p))];16
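Therefore,
\[
\begin{aligned}
\frac{1}{(n-1)T}E\sum_{t=1}^{T}\big[\tilde\xi_{nt}'(\theta)J_n\tilde\xi_{nt}(\theta)\big]
&= \big[\lambda_{10}-\lambda_1,\ \eta_0-\eta+(0_2, (\Gamma^{+}-\Gamma^{+}_{0})\delta, 0_p)\big]H_{nT}\big[\lambda_{10}-\lambda_1,\ \eta_0-\eta+(0_2, (\Gamma^{+}-\Gamma^{+}_{0})\delta, 0_p)\big]' \\
&\quad+ \frac{\sigma_{\xi0}^{2}}{(n-1)T}\Big(1-\frac{1}{T}\Big)E\sum_{t=1}^{T}\mathrm{tr}\big(S^{*\prime-1}_{nt}S^{*\prime}_{nt}(\lambda_1)S^{*}_{nt}(\lambda_1)S^{*-1}_{nt}\big) + O_p(1/\sqrt{T}),
\end{aligned}
\]
where $H_{nT} = \frac{1}{(n-1)T}\sum_{t=1}^{T}E\big[(\widetilde{G_{nt}T_{nt}}\eta_0 + \tilde G_{nt}c_{n0},\ \tilde T_{nt})'J_n(\widetilde{G_{nt}T_{nt}}\eta_0 + \tilde G_{nt}c_{n0},\ \tilde T_{nt})\big]$. This holds because, from Lemma 2 in Lee and Yu (2012),
\[
\frac{1}{(n-1)T}\sum_{t=1}^{T}E\big[(\lambda_{10}-\lambda_1)(\widetilde{G_{nt}T_{nt}}\eta_0 + \tilde G_{nt}c_{n0}) + \tilde T_{nt}(\eta_0-\eta+(0_2, (\Gamma^{+}-\Gamma^{+}_{0})\delta, 0_p))\big]'J_n\big[(\lambda_{10}-\lambda_1)\widetilde{G_{nt}\xi_{nt}} + \tilde\xi_{nt}\big] = O\Big(\frac{1}{T}\Big)
\]
and
\[
\begin{aligned}
E\sum_{t=1}^{T}\big[(\lambda_{10}-\lambda_1)\widetilde{G_{nt}\xi_{nt}} + \tilde\xi_{nt}\big]'J_n\big[(\lambda_{10}-\lambda_1)\widetilde{G_{nt}\xi_{nt}} + \tilde\xi_{nt}\big]
&= \Big(1-\frac{1}{T}\Big)\sigma_{\xi0}^{2}E\sum_{t=1}^{T}\mathrm{tr}\big(S^{\prime-1}_{nt}S'_{nt}(\lambda_1)J_nS_{nt}(\lambda_1)S^{-1}_{nt}\big) \\
&= \Big(1-\frac{1}{T}\Big)\sigma_{\xi0}^{2}E\sum_{t=1}^{T}\mathrm{tr}\big(S^{*\prime-1}_{nt}S^{*\prime}_{nt}(\lambda_1)S^{*}_{nt}(\lambda_1)S^{*-1}_{nt}\big).
\end{aligned}
\]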
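The last equality holds because
\[
\begin{aligned}
S^{*}_{nt}(\lambda_1)S^{*-1}_{nt} &= F'_{n,n-1}(I_n-\lambda_1W_n)F_{n,n-1}F'_{n,n-1}(I_n-\lambda_{10}W_n)^{-1}F_{n,n-1} \\
&= F'_{n,n-1}(I_n-\lambda_1W_n)\Big(I_n-\frac{1}{n}l_nl_n'\Big)(I_n-\lambda_{10}W_n)^{-1}F_{n,n-1} \\
&= F'_{n,n-1}(I_n-\lambda_1W_n)(I_n-\lambda_{10}W_n)^{-1}F_{n,n-1}
\end{aligned}
\]
and hence
\[
\begin{aligned}
\mathrm{tr}\big(S^{*\prime-1}_{nt}S^{*\prime}_{nt}(\lambda_1)S^{*}_{nt}(\lambda_1)S^{*-1}_{nt}\big)
&= \mathrm{tr}\big[F'_{n,n-1}(I_n-\lambda_1W_n)'(I_n-\lambda_{10}W_n')^{-1}J_n(I_n-\lambda_1W_n)(I_n-\lambda_{10}W_n)^{-1}F_{n,n-1}\big] \\
&= \mathrm{tr}\Big[(I_n-\lambda_1W_n)'(I_n-\lambda_{10}W_n')^{-1}J_n(I_n-\lambda_1W_n)(I_n-\lambda_{10}W_n)^{-1}\Big(I_n-\frac{1}{n}l_nl_n'\Big)\Big] \\
&= \mathrm{tr}\big(S^{\prime-1}_{nt}S'_{nt}(\lambda_1)J_nS_{nt}(\lambda_1)S^{-1}_{nt}\big).
\end{aligned}
\]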
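The identities above can be checked numerically. The following is a minimal sketch, assuming a row-normalised weight matrix (so that $W_nl_n = l_n$), with $F_{n,n-1}$ built as an orthonormal basis of the orthogonal complement of $l_n$; the dimension, the random weights and the two values of $\lambda_1$ are arbitrary illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
lam1, lam10 = 0.3, 0.5  # hypothetical values of lambda_1 and lambda_10

# Row-normalised weights: zero diagonal, rows sum to one, hence W l_n = l_n.
W = rng.random((n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

l = np.ones((n, 1))
J = np.eye(n) - l @ l.T / n  # deviation-from-mean projector J_n

# F: n x (n-1) matrix of eigenvectors of J_n with eigenvalue one,
# so that F'F = I_{n-1}, F F' = J_n and F' l_n = 0.
eigval, eigvec = np.linalg.eigh(J)
F = eigvec[:, eigval > 0.5]


def S(lam):
    """S_n(lambda) = I_n - lambda * W_n."""
    return np.eye(n) - lam * W


def S_star(lam):
    """S*_n(lambda) = F' S_n(lambda) F."""
    return F.T @ S(lam) @ F


# S*(lam1) S*^{-1} versus F' S(lam1) S^{-1} F
lhs = S_star(lam1) @ np.linalg.inv(S_star(lam10))
rhs = F.T @ S(lam1) @ np.linalg.inv(S(lam10)) @ F

# Trace in the (n-1)-dimensional form versus the n-dimensional form with J_n
M = S(lam1) @ np.linalg.inv(S(lam10))
trace_star = np.trace(lhs.T @ lhs)
trace_full = np.trace(M.T @ J @ M)

# Related determinant fact: |S_n(lambda)| = (1 - lambda) |S*_n(lambda)|
det_gap = np.linalg.det(S(lam1)) - (1 - lam1) * np.linalg.det(S_star(lam1))

print(np.allclose(lhs, rhs), np.isclose(trace_star, trace_full), abs(det_gap) < 1e-10)
```

The determinant line is included because it explains why the objective function carries $\sum_t\ln|S_{nt}(\lambda_1)| - T\ln(1-\lambda_1)$: under row normalisation this equals $\sum_t\ln|S^{*}_{nt}(\lambda_1)|$. All three checks rely on $W_nl_n = l_n$ and need not hold for weights that are not row-normalised.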
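Therefore,
\[
\begin{aligned}
\frac{1}{(n-1)T}E\big[\ln L^{c}_{n,T}(\theta)-\ln L^{c}_{n,T}(\theta_0)\big]
&= -\frac{1}{2}\ln\frac{\sigma_{\xi}^{2}|\Sigma_{\varepsilon}|}{\sigma_{\xi0}^{2}|\Sigma_{\varepsilon0}|} + \frac{1}{(n-1)T}\sum_{t=1}^{T}E\Big[\ln\frac{(1-\lambda_{10})|S_{nt}(\lambda_1)|}{(1-\lambda_1)|S_{nt}|}\Big] \\
&\quad- \frac{1}{2(n-1)T}\sum_{t=1}^{T}\mathrm{vec}[\tilde X_{2nt}(\Gamma_0-\Gamma)]'(\Sigma_{\varepsilon}^{-1}\otimes J_n)\,\mathrm{vec}[\tilde X_{2nt}(\Gamma_0-\Gamma)] - \frac{T-1}{2T}\big[\mathrm{tr}(\Sigma_{\varepsilon}^{-1}\Sigma_{\varepsilon0})-p\big] \\
&\quad- \frac{\sigma_{\xi0}^{2}}{2(n-1)T\sigma_{\xi}^{2}}\Big(1-\frac{1}{T}\Big)E\sum_{t=1}^{T}\mathrm{tr}\big[S^{*\prime-1}_{nt}S^{*\prime}_{nt}(\lambda_1)S^{*}_{nt}(\lambda_1)S^{*-1}_{nt}\big] + \frac{T-1}{2T} + O_p(1/T) \\
&\quad- \frac{1}{2\sigma_{\xi}^{2}}\big[\lambda_{10}-\lambda_1,\ \eta_0-\eta+(0_2, (\Gamma^{+}-\Gamma^{+}_{0})\delta, 0_p)\big]H_{nT}\big[\lambda_{10}-\lambda_1,\ \eta_0-\eta+(0_2, (\Gamma^{+}-\Gamma^{+}_{0})\delta, 0_p)\big]'.
\end{aligned}
\]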
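\[
\begin{aligned}
\lim_{T\to\infty}\frac{1}{(n-1)T}E\big[\ln L^{c}_{n,T}(\theta)-\ln L^{c}_{n,T}(\theta_0)\big]
&= -\frac{1}{2\sigma_{\xi}^{2}}\big[\lambda_{10}-\lambda_1,\ \eta_0-\eta+(0_2, (\Gamma^{+}-\Gamma^{+}_{0})\delta, 0_p)\big]\lim_{T\to\infty}H_{nT}\big[\lambda_{10}-\lambda_1,\ \eta_0-\eta+(0_2, (\Gamma^{+}-\Gamma^{+}_{0})\delta, 0_p)\big]' \\
&\quad- \frac{1}{2}\Big[\mathrm{tr}(\Sigma_{\varepsilon}^{-1}\Sigma_{\varepsilon0}) - \ln\frac{|\Sigma_{\varepsilon0}|}{|\Sigma_{\varepsilon}|} - p\Big] \\
&\quad- \lim_{T\to\infty}\frac{1}{2(n-1)T}\sum_{t=1}^{T}\mathrm{vec}[\tilde X_{2nt}(\Gamma_0-\Gamma)]'(\Sigma_{\varepsilon}^{-1}\otimes J_n)\,\mathrm{vec}[\tilde X_{2nt}(\Gamma_0-\Gamma)] \\
&\quad- \lim_{T\to\infty}\frac{1}{2(n-1)T}\sum_{t=1}^{T}E\Big[\mathrm{tr}\Big(\frac{\sigma_{\xi0}^{2}}{\sigma_{\xi}^{2}}S^{*\prime-1}_{nt}S^{*\prime}_{nt}(\lambda_1)S^{*}_{nt}(\lambda_1)S^{*-1}_{nt}\Big) - \ln\Big|\frac{\sigma_{\xi0}^{2}}{\sigma_{\xi}^{2}}S^{*\prime-1}_{nt}S^{*\prime}_{nt}(\lambda_1)S^{*}_{nt}(\lambda_1)S^{*-1}_{nt}\Big| - (n-1)\Big] \\
&\le 0.
\end{aligned}
\]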
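If $\lim_{T\to\infty}\frac{1}{(n-1)T}E[\ln L^{c}_{n,T}(\theta)-\ln L^{c}_{n,T}(\theta_0)] = 0$, then it must be that $\Sigma_{\varepsilon} = \Sigma_{\varepsilon0}$ from the second term and $\Gamma_0 = \Gamma$ from the third term. Under Assumption 5a), that $\lim_{T\to\infty}H_{nT}$ is nonsingular, the first term gives $\lambda_{10} = \lambda_1$, $\lambda_{20} = \lambda_2$, $\gamma_0 = \gamma$, $\beta_0 = \beta$, and $\delta_0 = \delta$. Under Assumption 5b), the fourth term gives $\lambda_{10} = \lambda_1$ and $\sigma_{\xi0}^{2} = \sigma_{\xi}^{2}$. In this case, the first term reduces to
\[
\big(\lambda_{20}-\lambda_2,\ \gamma_0-\gamma,\ \beta_0'-\beta'+[(\Gamma^{+}-\Gamma^{+}_{0})\delta]',\ \delta_0'-\delta'\big)\lim_{T\to\infty}\frac{1}{(n-1)T}\sum_{t=1}^{T}E(\tilde T_{nt}'J_n\tilde T_{nt})\big(\lambda_{20}-\lambda_2,\ \gamma_0-\gamma,\ \beta_0'-\beta'+[(\Gamma^{+}-\Gamma^{+}_{0})\delta]',\ \delta_0'-\delta'\big)' = 0.
\]
Hence, $\theta = \theta_0$. Therefore, $\lim_{T\to\infty}\frac{1}{nT}E[\ln L^{c}_{n,T}(\theta)]$ attains its unique maximum at $\theta_0$.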
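The non-positivity of the second and fourth terms, and the conditions under which they vanish, rest on a standard matrix inequality. As a worked step, for a $k\times k$ matrix $A$ whose eigenvalues $\nu_1,\dots,\nu_k$ are real and positive (the symbols $A$, $k$ and $\nu_i$ are illustrative shorthand, not the paper's notation):

```latex
\[
\mathrm{tr}(A) - \ln|A| - k \;=\; \sum_{i=1}^{k}\big(\nu_i - 1 - \ln\nu_i\big) \;\ge\; 0,
\]
% because x - 1 - ln(x) >= 0 for every x > 0, with equality only at x = 1.
% For symmetric positive definite A, equality therefore holds iff A = I_k.
```

The second term uses $A = \Sigma_{\varepsilon}^{-1}\Sigma_{\varepsilon0}$ with $k = p$; this matrix has the same eigenvalues as the symmetric positive definite matrix $\Sigma_{\varepsilon}^{-1/2}\Sigma_{\varepsilon0}\Sigma_{\varepsilon}^{-1/2}$, so the term is zero only if $\Sigma_{\varepsilon} = \Sigma_{\varepsilon0}$. The fourth term uses the symmetric positive definite matrix inside its trace with $k = n-1$.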
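Using the same arguments as in Qu and Lee (2013), we can show the uniform stochastic equicontinuity of $\ln L^{c}_{n,T}(\theta^{*},\alpha)$. Then, based on the pointwise convergence, we have the uniform convergence
\[
\sup_{\theta^{*},\alpha}\lim_{T\to\infty}\frac{1}{(n-1)T}\big|\ln L^{c}_{n,T}(\theta^{*},\alpha) - E\ln L^{c}_{n,T}(\theta^{*},\alpha)\big| \overset{p}{\to} 0,
\]
\[
\sup_{\theta^{*},\alpha}\lim_{T\to\infty}\frac{1}{(n-1)T}\Big\|\frac{\partial\ln L^{c}_{n,T}(\theta^{*},\alpha)}{\partial\theta} - E\frac{\partial\ln L^{c}_{n,T}(\theta^{*},\alpha)}{\partial\theta}\Big\| \overset{p}{\to} 0,
\]
\[
\sup_{\theta^{*},\alpha}\lim_{T\to\infty}\frac{1}{(n-1)T}\Big\|\frac{\partial^{2}\ln L^{c}_{n,T}(\theta^{*},\alpha)}{\partial\theta\,\partial\theta'} - E\frac{\partial^{2}\ln L^{c}_{n,T}(\theta^{*},\alpha)}{\partial\theta\,\partial\theta'}\Big\| \overset{p}{\to} 0.
\]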
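As
\[
\frac{1}{(n-1)T}\ln L^{c}_{n,T}(\theta^{*},\hat\alpha) - \frac{1}{(n-1)T}\ln L^{c}_{n,T}(\theta^{*},\alpha_0) = \frac{1}{(n-1)T}\frac{\partial\ln L^{c}_{n,T}(\theta^{*},\tilde\alpha)}{\partial\alpha'}(\hat\alpha-\alpha_0) = O_p(1)\cdot O_p\Big(\frac{1}{\sqrt{nT}}\Big),
\]
we have
\[
\lim_{T\to\infty}\frac{1}{(n-1)T}E\big[\ln L^{c}_{n,T}(\theta^{*},\hat\alpha) - \ln L^{c}_{n,T}(\theta^{*}_{0},\hat\alpha)\big] = \lim_{T\to\infty}\frac{1}{(n-1)T}E\big[\ln L^{c}_{n,T}(\theta^{*},\alpha_0) - \ln L^{c}_{n,T}(\theta^{*}_{0},\alpha_0)\big].
\]
Therefore, $\lim_{T\to\infty}\frac{1}{nT}E[\ln L^{c}_{n,T}(\theta^{*},\hat\alpha)]$ attains its unique maximum at $\theta^{*}_{0}$. Together with the uniform convergence, we can conclude that the QMLE $\hat\theta^{*}\to\theta^{*}_{0}$ as $T\to\infty$.

For the CLT, as the scores only involve linear and quadratic forms of $\xi_{nt}$ (all $\varepsilon_{nt}$'s are independent of $\xi_{nt}$), we can apply the martingale central limit theorem to show the asymptotic normality of the scores. Let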
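\[
R_{1nT} = \sum_{t=1}^{T}\big[U'_{n,t-1}\xi_{nt} + D'_{nt}\xi_{nt} + \xi'_{nt}B_{nt}\xi_{nt} - \sigma_{\xi0}^{2}E\,\mathrm{tr}(B_{nt})\big].
\]
From Lemma 5 in Yu and Lee (2012), if $(1/nT)\sigma^{2}_{R_{1nT}}$ is bounded away from zero, then $R_{1nT}/\sigma_{R_{1nT}} \overset{d}{\to} N(0,1)$. In our case, we have a slightly different form,
\[
R_{2nT} = \sum_{t=1}^{T}\big[U'_{n,t-1}\xi_{nt} + D'_{nt}\xi_{nt} + \xi'_{nt}B_{nt}\xi_{nt} - \sigma_{\xi0}^{2}\mathrm{tr}(B_{nt})\big],
\]
where $U_{n,t-1}$, $D_{nt}$, and $B_{nt}$ may contain $\varepsilon_{ns}$ with $s\le t$.

Lemma 3 Let $R_{2nT} = \sum_{t=1}^{T}[U'_{n,t-1}\xi_{nt} + D'_{nt}\xi_{nt} + \xi'_{nt}B_{nt}\xi_{nt} - \sigma_{\xi0}^{2}\mathrm{tr}(B_{nt})]$. Under Assumptions 1-3, $R_{2nT}/\sigma_{R_{2nT}} \overset{d}{\to} N(0,1)$.

Denote
\[
r_{i,nt} = (u_{in,t-1} + d_{nt,i})\xi_{i,nt} + b_{nt,ii}(\xi^{2}_{i,nt} - \sigma_{\xi0}^{2}) + 2\sum_{j=1}^{i-1}b_{nt,ij}\xi_{j,nt}\xi_{i,nt}
\]
and the $\sigma$-field $\mathcal{F}_{n,t,i} = \sigma(v_{11}, v_{21}, \dots, v_{n1}, \dots, v_{1t}, \dots, v_{it})$. Then $E(r_{i,nt}\,|\,\mathcal{F}_{n,t,i-1}) = 0$ and $E(r_{i,nt}\,|\,\mathcal{F}_{n,t-1,i}) = 0$. Thus, $\{r_{i,nt}, \mathcal{F}_{n,t,i}, 1\le t\le T, 1\le i\le n\}$ forms a martingale difference array. Using arguments similar to those in Yu and Lee (2012), we have $R_{2nT}/\sigma_{R_{2nT}} \overset{d}{\to} N(0,1)$.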
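The martingale argument depends on the period-$t$ summand of $R_{2nT}$ being exactly the sum over $i$ of the terms $r_{i,nt}$. A minimal numerical sketch of that algebra follows, assuming a symmetric $B_{nt}$ (which the factor 2 on the cross terms presumes) and using arbitrary simulated inputs; the variable names and the value of the variance are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
sigma2 = 1.7  # hypothetical value of sigma_{xi0}^2

u = rng.normal(size=n)  # stands in for U_{n,t-1}
d = rng.normal(size=n)  # stands in for D_nt
A = rng.normal(size=(n, n))
B = (A + A.T) / 2.0     # symmetric stand-in for B_nt (assumption)
xi = rng.normal(scale=np.sqrt(sigma2), size=n)

# Period-t summand of R_2nT: linear part plus centred quadratic form.
total = u @ xi + d @ xi + xi @ B @ xi - sigma2 * np.trace(B)

# r_{i,nt}: own term plus cross terms with earlier coordinates j < i only,
# so each r_i multiplies xi_i by quantities that involve xi_1, ..., xi_{i-1}.
r = np.array([
    (u[i] + d[i]) * xi[i]
    + B[i, i] * (xi[i] ** 2 - sigma2)
    + 2.0 * xi[i] * (B[i, :i] @ xi[:i])
    for i in range(n)
])

print(np.isclose(r.sum(), total))
```

Because each $r_{i,nt}$ is linear in $\xi_{i,nt}$ and in $\xi^{2}_{i,nt}-\sigma_{\xi0}^{2}$ with coefficients that depend only on earlier coordinates, its conditional mean given those coordinates is zero, which is the martingale difference property used above.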
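Next we show the asymptotic distribution of $\hat\theta^{*}$. From a Taylor expansion,
\[
\begin{aligned}
\sqrt{(n-1)T}\,(\hat\theta^{*}-\theta^{*}_{0})
&= -\Big(\frac{1}{(n-1)T}\frac{\partial^{2}\ln L^{c}_{nT}(\tilde\theta^{*},\hat\alpha)}{\partial\theta^{*}\partial\theta^{*\prime}}\Big)^{-1}\frac{1}{\sqrt{(n-1)T}}\frac{\partial\ln L^{c}_{nT}(\theta^{*}_{0},\hat\alpha)}{\partial\theta^{*}} \\
&= -\Big(E\Big[\frac{1}{(n-1)T}\frac{\partial^{2}\ln L^{c}_{nT}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}\partial\theta^{*\prime}}\Big] + O_p\Big(\frac{1}{\sqrt{T}}\Big)\Big)^{-1}\Big(\frac{1}{\sqrt{(n-1)T}}\frac{\partial\ln L^{c}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}} + O_p\Big(\frac{1}{\sqrt{nT}}\Big)\Big).
\end{aligned}
\]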
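The second equality holds because
\[
\frac{1}{(n-1)T}\Big[\frac{\partial^{2}\ln L^{c}_{nT}(\tilde\theta^{*},\hat\alpha)}{\partial\theta^{*}\partial\theta^{*\prime}} - E\frac{\partial^{2}\ln L^{c}_{nT}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}\partial\theta^{*\prime}}\Big] = O_p\Big(\frac{1}{\sqrt{T}}\Big)
\]
and
\[
\begin{aligned}
\frac{1}{\sqrt{(n-1)T}}\frac{\partial\ln L^{c}_{n,T}(\theta^{*}_{0},\hat\alpha)}{\partial\theta^{*}}
&= \frac{1}{\sqrt{(n-1)T}}\Big(\frac{\partial\ln L^{c}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}} + \frac{\partial^{2}\ln L^{c}_{n,T}(\theta^{*}_{0},\tilde\alpha)}{\partial\theta^{*}\partial\alpha'}(\hat\alpha-\alpha_0)\Big) \\
&= \frac{1}{\sqrt{(n-1)T}}\frac{\partial\ln L^{c}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}} + O_p\Big(\frac{1}{\sqrt{nT}}\Big),
\end{aligned}
\]
as all entries in $\frac{\partial^{2}\ln L^{c}_{n,T}(\theta^{*}_{0},\alpha)}{\partial\theta^{*}\partial\alpha'}$ are zeros except for $\frac{\partial^{2}\ln L^{c}_{n,T}(\theta^{*}_{0},\alpha)}{\partial\mathrm{vec}(\Gamma)\partial\alpha'}$, and
\[
\frac{1}{\sqrt{(n-1)T}}\frac{\partial^{2}\ln L^{c}_{n,T}(\theta^{*}_{0},\tilde\alpha)}{\partial\theta^{*}\partial\alpha'}(\hat\alpha-\alpha_0) = \frac{1}{(n-1)T}\frac{\partial^{2}\ln L^{c}_{n,T}(\theta^{*}_{0},\tilde\alpha)}{\partial\mathrm{vec}(\Gamma)\partial\alpha'}\sqrt{(n-1)T}\,(\hat\alpha-\alpha_0) = O_p\Big(\frac{1}{\sqrt{nT}}\Big).
\]
Now we need to analyze the asymptotic distribution of $\frac{1}{\sqrt{(n-1)T}}\frac{\partial\ln L^{c}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}}$.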
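At the true parameter values, the score is
\[
\frac{\partial\ln L^{c}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}} =
\begin{pmatrix}
\dfrac{\partial\ln L^{c}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\lambda_1} \\[2ex]
\dfrac{\partial\ln L^{c}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\eta} \\[2ex]
\dfrac{\partial\ln L^{c}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\mathrm{vec}(\Gamma)} \\[2ex]
\dfrac{\partial\ln L^{c}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\sigma_{\xi}^{2}}
\end{pmatrix}
= \frac{1}{\sigma_{\xi0}^{2}}
\begin{pmatrix}
\sum_{t=1}^{T}\big[(\widetilde{W_{nt}Y_{nt}})'J_n\tilde\xi_{nt} - \sigma_{\xi0}^{2}\mathrm{tr}(J_nG_{nt})\big] \\[1ex]
\sum_{t=1}^{T}\tilde T_{nt}'J_n\tilde\xi_{nt} \\[1ex]
\sigma_{\xi0}^{2}\sum_{t=1}^{T}(\Sigma_{\varepsilon0}^{-1}\otimes\tilde X_{2nt}'J_n)\,\mathrm{vec}(\tilde\varepsilon_{nt}) - \delta_0\otimes\sum_{t=1}^{T}\tilde X_{2nt}'J_n\tilde\xi_{nt} \\[1ex]
-(n-1)T/2 + \sum_{t=1}^{T}\tilde\xi_{nt}'J_n\tilde\xi_{nt}/(2\sigma_{\xi0}^{2})
\end{pmatrix}.
\]
We have a decomposition of the score. From (6),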
Ynt = �ntcn0 + �nt�0 +1
1� �10
1Xh=0
at�h;0ln(�20 + �01� �10
)h + #nt�0 + Unt;
where
�nt � S�1nt1Xh=0
B(h)nt , �nt � S�1nt
1Xh=0
B(h)nt X1n;t�h, #nt � S�1nt
1Xh=0
B(h)nt "n;t�h, and Unt � S�1nt
1Xh=0
B(h)nt �n;t�h;
we can decompose Jn eTnt = Jn[ gWn;t�1Y n;t�1;eYn;t�1; eX1nt; e"nt] into
Jn eTnt = Jn eT (u)nt � (JnUnT;�1; JnWn;T�1UnT;�1; 0)
where eT (u)nt = [ gWn;t�1�n;t�1cn0 +gWn;t�1�n;t�1�0 +
gWn;t�1#n;t�1�0 +Wn;t�1Un;t�1;e�n;t�1cn0 + e�n;t�1�0 + e#n;t�1�0 + Un;t�1; eX1nt;e"nt)]20
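with $\bar U_{nT,-1} = \sum_{t=1}^{T}U_{n,t-1}/T$ and $\bar W_{n,T-1}\bar U_{nT,-1} = \sum_{t=1}^{T}W_{n,t-1}U_{n,t-1}/T$. Hence, $J_n\tilde T_{nt}$ has two components: one is $J_n\tilde T^{(u)}_{nt}$, uncorrelated with $\xi_{nt}$; the other is $-(J_n\bar W_{n,T-1}\bar U_{nT,-1},\ J_n\bar U_{nT,-1},\ 0_{n\times(k_1+p)})$, correlated with $\xi_{nt}$ when $t\le T-1$. Therefore, the score can be decomposed into two parts such that
\[
\frac{1}{\sqrt{(n-1)T}}\frac{\partial\ln L^{c}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}} = \frac{1}{\sqrt{(n-1)T}}\frac{\partial\ln L^{c(u)}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}} - \Delta_{nT},
\]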
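where
\[
\frac{\partial\ln L^{c(u)}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}} = \frac{1}{\sigma_{\xi0}^{2}}
\begin{pmatrix}
\sum_{t=1}^{T}\big[(\widetilde{G_{nt}T^{(u)}_{nt}}\eta_0 + \tilde G_{nt}c_{n0})'J_n\xi_{nt} + (G_{nt}\xi_{nt})'J_n\xi_{nt} - \sigma_{\xi0}^{2}\mathrm{tr}(J_nG_{nt})\big] \\[1ex]
\sum_{t=1}^{T}\tilde T^{(u)\prime}_{nt}J_n\xi_{nt} \\[1ex]
\sigma_{\xi0}^{2}\sum_{t=1}^{T}(\Sigma_{\varepsilon0}^{-1}\otimes\tilde X_{2nt}'J_n)\,\mathrm{vec}(\varepsilon_{nt}) - \delta_0\otimes\sum_{t=1}^{T}\tilde X_{2nt}'J_n\xi_{nt} \\[1ex]
-(n-1)T/2 + \sum_{t=1}^{T}\xi_{nt}'J_n\xi_{nt}/(2\sigma_{\xi0}^{2})
\end{pmatrix}
\]
with $E\Big(\frac{\partial\ln L^{c(u)}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}}\Big) = 0$ and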
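\[
\Delta_{nT} = \frac{1}{\sqrt{(n-1)T}}\frac{1}{\sigma_{\xi0}^{2}}
\begin{pmatrix}
T\big\{\big[(\bar G_{nT}\bar W_{n,T-1}\bar U_{nT,-1},\ \bar G_{nT}\bar U_{nT,-1},\ 0_{n\times(k_1+p)})\eta_0\big]'J_n\bar\xi_{nT} + (\bar G_{nT}\bar\xi_{nT})'J_n\bar\xi_{nT}\big\} \\[1ex]
T(\bar W_{n,T-1}\bar U_{nT,-1},\ \bar U_{nT,-1},\ 0_{n\times(k_1+p)})'J_n\bar\xi_{nT} \\[1ex]
0_{k_2p\times1} \\[1ex]
T\bar\xi_{nT}'J_n\bar\xi_{nT}/(2\sigma_{\xi0}^{2})
\end{pmatrix}.
\]
Similarly to Lee and Yu (2012), $\Delta_{nT} = \sqrt{(n-1)/T}\,a_{\theta_0,nT} + O_p(1/\sqrt{T})$, where $a_{\theta_0,nT} = O(1)$ and
\[
a_{\theta,nT} = \frac{1}{(n-1)T}E
\begin{pmatrix}
\mathrm{tr}\Big[J_n\sum_{t=1}^{T-1}\sum_{h=0}^{T-t-1}\big(\lambda_2G_{n,t+h+1}(\lambda_1)S^{-1}_{n,t+h}(\lambda_1)B^{(h)}_{n,t+h}(\theta) + \gamma G_{n,t+h+1}(\lambda_1)G_{n,t+h}(\lambda_1)B^{(h)}_{n,t+h}(\theta)\big)\Big] \\[1ex]
\mathrm{tr}\Big(J_n\sum_{t=1}^{T-1}\sum_{h=0}^{T-t-1}S^{-1}_{n,t+h}(\lambda_1)B^{(h)}_{n,t+h}(\theta)\Big) \\[1ex]
\mathrm{tr}\Big(J_n\sum_{t=1}^{T-1}\sum_{h=0}^{T-t-1}G_{n,t+h}(\lambda_1)B^{(h)}_{n,t+h}(\theta)\Big) \\[1ex]
0_{(k_1+k_2p)\times1} \\[1ex]
(n-1)T/(2\sigma_{\xi}^{2})
\end{pmatrix}.
\]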
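Denote
\[
\begin{aligned}
\Sigma_{\theta_0,nT} &= \frac{1}{\sigma_{\xi0}^{2}(n-1)T}E
\begin{pmatrix}
\sum_{t=1}^{T}(\widetilde{G_{nt}T^{(u)}_{nt}}\eta_0 + \tilde G_{nt}c_{n0})'J_n(\widetilde{G_{nt}T^{(u)}_{nt}}\eta_0 + \tilde G_{nt}c_{n0}) & * & * & * \\[1ex]
\sum_{t=1}^{T}\tilde T^{(u)\prime}_{nt}J_n(\widetilde{G_{nt}T^{(u)}_{nt}}\eta_0 + \tilde G_{nt}c_{n0}) & \sum_{t=1}^{T}\tilde T^{(u)\prime}_{nt}J_n\tilde T^{(u)}_{nt} & * & * \\[1ex]
-\delta_0\otimes\sum_{t=1}^{T}\tilde X_{2nt}'J_n(\widetilde{G_{nt}T^{(u)}_{nt}}\eta_0 + \tilde G_{nt}c_{n0}) & -\delta_0\otimes\sum_{t=1}^{T}(\tilde X_{2nt}'J_n\tilde T^{(u)}_{nt}) & 0_{k_2p\times k_2p} & * \\[1ex]
0 & 0_{1\times(2+k_1+p)} & 0_{1\times k_2p} & 0
\end{pmatrix} \\
&\quad+ \frac{1}{\sigma_{\xi0}^{2}(n-1)T}E
\begin{pmatrix}
\sigma_{\xi0}^{2}\sum_{t=1}^{T}\mathrm{tr}[G_{nt}'J_nG_{nt} + (J_nG_{nt})^{2}] & * & * & * \\[1ex]
0_{(2+k_1+p)\times1} & 0_{(2+k_1+p)\times(2+k_1+p)} & * & * \\[1ex]
0_{k_2p\times1} & 0_{k_2p\times(2+k_1+p)} & (\Sigma_{\varepsilon0}^{-1}\sigma_{\xi0}^{2} + \delta_0\delta_0')\otimes\sum_{t=1}^{T}\tilde X_{2nt}'J_n\tilde X_{2nt} & * \\[1ex]
\sum_{t=1}^{T}\mathrm{tr}(J_nG_{nt}) & 0_{1\times(2+k_1+p)} & 0_{1\times k_2p} & \dfrac{(n-1)T}{2\sigma_{\xi0}^{2}}
\end{pmatrix}
\end{aligned}
\]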
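and
\[
\begin{aligned}
\Omega_{\theta_0,nT} &= \frac{\mu_3}{\sigma_{\xi0}^{6}(n-1)T}E
\begin{pmatrix}
2\sigma_{\xi0}^{2}\sum_{t=1}^{T}\sum_{i=1}^{n}E\big[(J_nG_{nt})_{ii}(J_n\widetilde{G_{nt}T^{(u)}_{nt}}\eta_0 + J_n\tilde G_{nt}c_{n0})_i\big] & * & * & * \\[1ex]
\sigma_{\xi0}^{2}\sum_{t=1}^{T}\sum_{i=1}^{n}E\big[(J_nG_{nt})_{ii}(J_n\tilde T^{(u)}_{nt})_i'\big] & 0_{(2+k_1+p)\times(2+k_1+p)} & * & * \\[1ex]
-\sigma_{\xi0}^{2}\sum_{t=1}^{T}\sum_{i=1}^{n}E\big[(J_nG_{nt})_{ii}(\delta_0\otimes\sum_{t=1}^{T}\tilde X_{2nt}'J_n)_i\big] & 0_{k_2p\times(2+k_1+p)} & 0_{k_2p\times k_2p} & * \\[1ex]
\frac{1}{2}\sum_{t=1}^{T}\sum_{i=1}^{n}E\big[(J_n\widetilde{G_{nt}T^{(u)}_{nt}}\eta_0 + J_n\tilde G_{nt}c_{n0})_i\big] & 0_{1\times(2+k_1+p)} & 0_{1\times k_2p} & 0
\end{pmatrix} \\
&\quad+ E
\begin{pmatrix}
\dfrac{\mu_4-3\sigma_{\xi0}^{4}}{\sigma_{\xi0}^{4}(n-1)T}\sum_{t=1}^{T}\sum_{i=1}^{n}(J_nG_{nt})^{2}_{ii} & * & * & * \\[1ex]
0_{(2+k_1+p)\times1} & 0_{(2+k_1+p)\times(2+k_1+p)} & * & * \\[1ex]
0_{k_2p\times1} & 0_{k_2p\times(2+k_1+p)} & 0_{k_2p\times k_2p} & * \\[1ex]
\dfrac{\mu_4-3\sigma_{\xi0}^{4}}{2\sigma_{\xi0}^{6}(n-1)T}\sum_{t=1}^{T}\mathrm{tr}(J_nG_{nt}) & 0_{1\times(2+k_1+p)} & 0_{1\times k_2p} & \dfrac{\mu_4-3\sigma_{\xi0}^{4}}{\sigma_{\xi0}^{8}}
\end{pmatrix}.
\end{aligned}
\]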
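Then
\[
-E\Big(\frac{1}{(n-1)T}\frac{\partial^{2}\ln L^{c}_{nT}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}\partial\theta^{*\prime}}\Big) = \Sigma_{\theta_0,nT} + O\Big(\frac{1}{T}\Big)
\]
and
\[
E\Big(\frac{1}{\sqrt{(n-1)T}}\frac{\partial\ln L^{c(u)}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}}\cdot\frac{1}{\sqrt{(n-1)T}}\frac{\partial\ln L^{c(u)}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*\prime}}\Big) = \Sigma_{\theta_0,nT} + \Omega_{\theta_0,nT} + O\Big(\frac{1}{T}\Big).
\]
Therefore,
\[
\begin{aligned}
\sqrt{(n-1)T}\,(\hat\theta^{*}-\theta^{*}_{0})
&= \Big(\Sigma_{\theta_0,nT} + O_p\Big(\frac{1}{\sqrt{T}}\Big)\Big)^{-1}\Big(\frac{1}{\sqrt{(n-1)T}}\frac{\partial\ln L^{c}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}} + O_p\Big(\frac{1}{\sqrt{nT}}\Big)\Big) \\
&= \Big(\Sigma_{\theta_0,nT} + O_p\Big(\frac{1}{\sqrt{T}}\Big)\Big)^{-1}\Big(\frac{1}{\sqrt{(n-1)T}}\frac{\partial\ln L^{c(u)}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}} - \sqrt{\frac{n-1}{T}}\,a_{\theta_0,nT} + O_p\Big(\frac{1}{\sqrt{T}}\Big)\Big).
\end{aligned}
\]
Combining these results,
\[
\sqrt{(n-1)T}\,(\hat\theta^{*}-\theta^{*}_{0}) + \sqrt{\frac{n-1}{T}}\,\Sigma^{-1}_{\theta_0,nT}a_{\theta_0,nT} + O_p\Big(\max\Big(\frac{\sqrt{n-1}}{T}, \frac{1}{\sqrt{T}}\Big)\Big) = \Sigma^{-1}_{\theta_0,nT}\frac{1}{\sqrt{(n-1)T}}\frac{\partial\ln L^{c(u)}_{n,T}(\theta^{*}_{0},\alpha_0)}{\partial\theta^{*}} \overset{d}{\to} N\big(0,\ \Sigma^{-1}_{\theta_0}(\Sigma_{\theta_0}+\Omega_{\theta_0})\Sigma^{-1}_{\theta_0}\big).
\]
The result also implies that $\hat\theta^{*}-\theta^{*}_{0} = O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$.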
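The stated rate follows by dividing the last display by $\sqrt{(n-1)T}$. As a worked step, assuming $\Sigma^{-1}_{\theta_0,nT} = O(1)$ and writing $s_{nT}$ as illustrative shorthand for the normalised score $\frac{1}{\sqrt{(n-1)T}}\,\partial\ln L^{c(u)}_{n,T}(\theta^{*}_{0},\alpha_0)/\partial\theta^{*}$, which is $O_p(1)$:

```latex
\[
\hat\theta^{*}-\theta^{*}_{0}
= \underbrace{-\frac{1}{T}\,\Sigma^{-1}_{\theta_0,nT}\,a_{\theta_0,nT}}_{O(1/T)}
+ \underbrace{\frac{1}{\sqrt{(n-1)T}}\,\Sigma^{-1}_{\theta_0,nT}\,s_{nT}}_{O_p(1/\sqrt{nT})}
+ O_p\!\Big(\max\Big(\frac{1}{T^{3/2}},\ \frac{1}{T\sqrt{n-1}}\Big)\Big).
\]
```

The remainder is of smaller order than the first two terms, so the estimator's deviation is governed by the larger of the $O(1/T)$ bias term and the $O_p(1/\sqrt{nT})$ sampling term.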
7 References
Anselin, L., LeGallo, J. & Jayet, H. (2008) Spatial panel econometrics, in: L. Matyas & P. Sevestre (eds) The Econometrics of Panel Data, Chapter 19, pp. 625-660, Berlin, Springer-Verlag.
Baicker, K. (2005) The spillover effects of state spending, Journal of Public Economics, 89, 529-544.
Baltagi, B. & Li, D. (2006) Prediction in the panel data model with spatial correlation: the case of liquor, Spatial Economic Analysis, 1, 175-185.
Baltagi, B., Song, S. H. & Koh, W. (2003) Testing panel data regression models with spatial error correlation, Journal of Econometrics, 117, 123-150.
Baltagi, B., Egger, P. & Pfaffermayr, M. (2007a) A generalized spatial panel data model with random effects, Working Paper, Syracuse University.
Baltagi, B., Song, S. H., Jung, B. C. & Koh, W. (2007b) Testing for serial correlation, spatial autocorrelation and random effects using panel data, Journal of Econometrics, 140, 5-51.
Brueckner, J. K. (1998) Testing for strategic interaction among local governments: the case of growth controls, Journal of Urban Economics, 44, 438-467.
Brueckner, J. K. & Saavedra, L. A. (2001) Do local governments engage in strategic property tax competition? National Tax Journal, 54, 203-229.
Case, A., Hines, J. R. & Rosen, H. S. (1993) Budget spillovers and fiscal policy interdependence: evidence from the States, Journal of Public Economics, 52, 285-307.
Cliff, A. D. & Ord, J. K. (1973) Spatial Autocorrelation, London, Pion Ltd.
Druska, V. & Horrace, W. C. (2004) Generalized moments estimation for spatial panel data: Indonesian rice farming, American Journal of Agricultural Economics, 86, 185-198.
Egger, P., Pfaffermayr, M. & Winner, H. (2005) An unbalanced spatial panel data approach to US state tax competition, Economics Letters, 88, 329-335.
Kapoor, M., Kelejian, H. H. & Prucha, I. R. (2007) Panel data models with spatially correlated error components, Journal of Econometrics, 140, 97-130.
Kelejian, H. H. & Prucha, I. R. (1998) A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbance, Journal of Real Estate Finance and Economics, 17, 99-121.
Kelejian, H. H. & Prucha, I. R. (2001) On the asymptotic distribution of the Moran I test statistic with applications, Journal of Econometrics, 104, 219-257.
Keller, W. & Shiue, C. H. (2007) The origin of spatial interaction, Journal of Econometrics, 140, 304-332.
Korniotis, G. M. (2010) Estimating panel models with internal and external habit formation, Journal of Business and Economic Statistics, 28, 145-158.
Lee, L. F. (2004) Asymptotic distributions of quasi-maximum likelihood estimators for spatial econometric models, Econometrica, 72, 1899-1925.
Lee, L. F. (2007) GMM and 2SLS estimation of mixed regressive, spatial autoregressive models, Journal of Econometrics, 137, 489-514.
Lee, L. F. & Yu, J. (2010a) Estimation of spatial autoregressive panel data models with fixed effects, Journal of Econometrics, 154, 165-185.
Lee, L. F. & Yu, J. (2010b) Estimation of spatial panels: random components vs. fixed effects, Manuscript, The Ohio State University.
Lee, L. F. & Yu, J. (2010c) A spatial dynamic panel data model with both time and individual fixed effects, Econometric Theory, 26, 564-597.
Lee, L. F. & Yu, J. (2010d) Some recent developments in spatial panel data models, Regional Science and Urban Economics, 40, 255-271.
Lee, L. F. & Yu, J. (2012) QML estimation of spatial dynamic panel data models with time varying spatial weights matrices, Spatial Economic Analysis, 7, 31-74.
LeSage, J. P. & Pace, R. K. (2009) Introduction to Spatial Econometrics, Boca Raton, FL, Chapman and Hall/CRC.
Mutl, J. (2006) Dynamic panel data models with spatially correlated disturbances, PhD thesis, University of Maryland, College Park.
Mutl, J. & Pfaffermayr, M. (2011) The Hausman test in a Cliff and Ord panel model, Econometrics Journal, 14, 48-76.
Qu, X. & Lee, L. F. (2013) Estimating a spatial autoregressive model with an endogenous spatial weight matrix, Working Paper.
Revelli, F. (2001) Spatial patterns in local taxation: tax mimicking or error mimicking? Applied Economics, 33, 1101-1107.
Rincke, J. (2010) A commuting-based refinement of the contiguity matrix for spatial models, and an application to local police expenditures, Regional Science and Urban Economics, 40, 324-330.
Su, L. & Yang, Z. (2007) QML estimation of dynamic panel data models with spatial errors, Manuscript, Singapore Management University.
Tao, J. (2005) Analysis of local school expenditures in a dynamic game, Manuscript, Shanghai University of Finance and Economics.
Yu, J., de Jong, R. & Lee, L. F. (2007) Quasi-maximum likelihood estimators for spatial dynamic panel data with fixed effects when both n and T are large: a nonstationary case, Manuscript, The Ohio State University.
Yu, J., de Jong, R. & Lee, L. F. (2008) Quasi-maximum likelihood estimators for spatial dynamic panel data with fixed effects when both n and T are large, Journal of Econometrics, 146, 118-134.
Yu, J., de Jong, R. & Lee, L. F. (2012) Estimation for spatial dynamic panel data with fixed effects: the case of spatial cointegration, Journal of Econometrics, 167, 16-37.