
Nonstationary Logistic Regression1

Yoosoon Chang2

Bibo Jiang3

and Joon Y. Park4

Abstract

In this paper, we consider the logistic regression model with an integrated regressor of the ARIMA type. It is shown that the model can be consistently estimated by the usual nonlinear least squares (NLS) method. The convergence rates of the NLS estimators are derived and their limiting distributions are also provided. The problem of asymptotic inference in the nonstationary logistic regression is also addressed. In particular, we show that the limiting distributions of the NLS estimators are usually nonnormal due to the asymptotic correlation between the innovations of the regressors and the regression errors. This nonnormality invalidates the t- and chi-square tests. To overcome this problem, we use efficient nonlinear least squares (EN-NLS) estimators which have asymptotic normal distributions with smaller variances. The finite sample properties of the usual and efficient NLS estimators and their t-statistics are investigated using Monte Carlo simulations.

First Draft: April 22, 2004
This version: February 24, 2006

JEL Classification: C22, C51

Key words and phrases: logistic regression, integrated time series, nonlinear least squares, nonstationary nonlinear asymptotics.

1 This paper was presented in the econometrics seminar on February 17th, 2006, at Rice University. We would like to thank all the participants for helpful suggestions.

2 Department of Economics, Rice University
3 Department of Economics, Rice University
4 Department of Economics, Rice University and Sungkyunkwan University

1. Introduction

Logistic regression models have been used intensively in empirical studies. Although most applications involve cross-section data, models of this type are also very useful in time series settings, which often include integrated regressors. Unfortunately, no theory has been established so far to support those applications. In this paper, we focus on developing an asymptotic theory for a logistic regression model that accommodates an integrated regressor. In particular, we derive the asymptotic theories of nonlinear least squares estimators and efficient nonstationary nonlinear least squares estimators as functions of Brownian motions and Brownian local time. The t-statistics based on these estimators are also considered.

Some well established asymptotic results for nonlinear and nonstationary regressions are essential in developing our asymptotic theories for the nonstationary logistic regression model. The reader is referred to Park and Phillips (1999, 2001), and Chang, Park and Phillips (2001) for detailed discussions. In those papers, the general theories of nonlinear regressions with integrated time series are established. Together with the earlier work on linear cointegrated regressions (Phillips and Hansen 1990) and nonlinear regressions with fixed and/or weakly dependent regressors (see Jennrich 1969 and Wu 1981 for early developments, Wooldridge 1994 and Andrews and McDermott 1995 for some important extensions), the asymptotic theories for nonlinear regressions with integrated regressors greatly extend the class of regression functions that we can evaluate. In particular, Park and Phillips (2001) derive asymptotic distributions of NLS estimators for integrable functions and asymptotically homogeneous functions. A model accommodating a linear time trend and stationary regressors, as well as multiple I(1) regressors, is considered in Chang et al. (2001), which is an important extension of Park and Phillips (1999, 2001).

Although these papers provide sufficient tools for analyzing many types of nonstationary nonlinear regression models, some commonly used regression functions are still not fully covered. The integrated logistic regression model we consider here is an important case. The logistic regression function is neither integrable nor asymptotically homogeneous. Therefore, the asymptotic theories developed in Park and Phillips (2001) are not applicable, since those theories are derived for regression functions that are either I-regular or H-regular. The Asymptotics for Additive Model introduced in Chang et al. (2001) extends the asymptotic analysis to a function which is not regular itself but can be written as a sum of two regular functions. The theory requires that the component functions do not have unknown parameters in common. This theory seems very helpful in analyzing our model. However, because of the restriction on the component functions, we cannot apply the theory directly to our model. In this paper, we develop a new limiting theory which shows the asymptotic equivalence of the original logistic regression function and another regression function to which the Asymptotics for Additive Model is applicable. By analyzing the asymptotically equivalent model, we develop the asymptotic theories of the NLS estimators for the original model.

In the sense of Phillips (1991) and Saikkonen (1991), the usual NLS estimators for the integrated logistic model are generally inefficient, just as the usual OLS estimators are not efficient for linear cointegrating regressions. This inefficiency is caused by the failure to utilize the information contained in the unit roots of the explanatory variables. Moreover, the limit distributions of NLS estimators are often nonnormal and dependent on nuisance parameters, which invalidates the usual t- and chi-square tests. To overcome this difficulty, Chang et al. (2001) introduce EN-NLS estimators which have mixed normal limit distributions and thus yield asymptotically valid t- and chi-square tests. The same approach is applied to our model to construct efficient estimators.

The rest of the paper is organized as follows. Section 2 lays out the model, assumptions and preliminary results, on which all the subsequent discussions heavily rely. The asymptotic theories of the NLS estimators are presented in Section 3. The efficient estimation of and hypothesis testing on the model are considered subsequently in Section 4. Section 5 reports some simulations to show the finite sample behavior of the estimators and test statistics. Section 6 concludes the paper. Mathematical proofs are provided in Section 7.

2. The Model, Assumptions, and Preliminary Results

We consider the nonstationary logistic regression for $(y_t)$ given by

\[ y_t = f(x_t, \theta_0) + u_t \tag{1} \]

where $f : \mathbb{R} \to \mathbb{R}$ is a logistic function such that

\[ f(x_t, \theta_0) = \mu_0 + \frac{\alpha_0}{1 + e^{-\beta_0 (x_t - \gamma_0)}}, \tag{2} \]

$(x_t)$ and $(u_t)$ are the integrated regressor and the regression error respectively, and $\theta_0 = (\mu_0, \alpha_0, \beta_0, \gamma_0)$ is the true parameter vector, which lies in the parameter set $\Theta \subset \mathbb{R} \times (\mathbb{R} \setminus \{0\}) \times \mathbb{R}_+ \times \mathbb{R}$. We assume that the parameter set $\Theta$ is compact and convex, and that $\theta_0$ is an interior point of $\Theta$. The parameter set is constrained for the model to be identified. The bandwidth parameter $\alpha_0$ is assumed to be away from zero, and the exponential rate $\beta_0$ is required to be a positive number. The shift parameter $\mu_0$, together with the bandwidth parameter $\alpha_0$, determines the range of the regression function. With the domain of $x_t$ being the real line, $\mu_0$ and $\mu_0 + \alpha_0$ are the two limit points of the regression function, i.e. the value of the function $f$ fluctuates between these two limits. The exponential rate $\beta_0$ determines the slope of the S-shaped curve. The bigger $\beta_0$ is, the steeper the regression curve. $\gamma_0$ is the location parameter, which shifts the curve along the $x_t$ axis and does not affect the shape of the regression curve.
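The roles of the four parameters in (2) can be checked numerically. The following is a minimal sketch, not part of the original analysis; the function name and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def logistic_regression_function(x, mu, alpha, beta, gamma):
    """Evaluate f(x, theta) = mu + alpha / (1 + exp(-beta * (x - gamma)))."""
    z = beta * (x - gamma)
    # Use the algebraically equivalent form for z < 0 to avoid overflow in exp.
    if z >= 0:
        return mu + alpha / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return mu + alpha * ez / (1.0 + ez)

# Hypothetical parameter values, used only to illustrate the shape of f.
theta = dict(mu=0.5, alpha=2.0, beta=1.5, gamma=1.0)

lower = logistic_regression_function(-50.0, **theta)            # approaches mu
upper = logistic_regression_function(50.0, **theta)             # approaches mu + alpha
middle = logistic_regression_function(theta["gamma"], **theta)  # equals mu + alpha / 2

# Central difference at x = gamma; the analytic slope there is alpha * beta / 4.
h = 1e-6
slope = (logistic_regression_function(theta["gamma"] + h, **theta)
         - logistic_regression_function(theta["gamma"] - h, **theta)) / (2.0 * h)
```

The two limits depend only on $\mu$ and $\alpha$, the slope at the midpoint scales with $\beta$, and changing $\gamma$ only moves the point at which the midpoint value is attained.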

The well established asymptotic theories for nonlinear and nonstationary regressions are not sufficient for analyzing the nonstationary logistic regression model. Park and Phillips (2001) derive asymptotic theories of NLS estimators for regular functions, including integrable and asymptotically homogeneous functions. The nonstationary logistic model specified in (1) and (2), however, is not regular. Therefore, we cannot directly apply their theories to our model. Chang, Park and Phillips (2001) introduce the Asymptotics for Additive Model, which extends the asymptotic theories to a regression function which is not regular itself but can be written as a sum of two regular functions. However, the theory requires that the two regular functions do not have unknown parameters in common. Due to this restriction, the Asymptotics for Additive Model is also not applicable to our model, since the model cannot be written as a sum of two regular functions which do not share common unknown parameters. Thus a new limiting theory needs to be developed in order to analyze the nonstationary logistic regression model. Following the assumptions in Chang et al. (2001), we state our assumptions as follows:

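Assumption 1 Let $(\mathcal{F}_t)$ be a filtration such that

(a) $(x_t)$ is adapted to $(\mathcal{F}_{t-1})$ for each $t$,

(b) $(u_t, \mathcal{F}_t)$ is a martingale difference sequence with $E(u_t^2 \mid \mathcal{F}_{t-1}) = \sigma_u^2$ for all $t$ and $\sup_{1 \le t \le n} E(|u_t|^q \mid \mathcal{F}_{t-1}) < \infty$ for some $q > 2$.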

This assumption is satisfied by a variety of data generating processes. Under condition (a), $(x_t)$ is predetermined. The condition can simply be met by choosing for $(\mathcal{F}_t)$ the natural filtration generated by $(u_t, x_{t+1})$. The martingale difference assumption for the regression errors in (b) is standard in stationary time series regression, and is used here to develop the nonstationary regression theory.

Assumption 2 Let $(x_t)$ be generated by

\[ x_t = x_{t-1} + v_t \]

where $(v_t)$ follows the linear process

\[ v_t = \varphi(L)\varepsilon_t = \sum_{k=0}^{\infty} \varphi_k \varepsilon_{t-k}, \]

with $\varphi(1) \neq 0$ and $\sum_{k=0}^{\infty} k|\varphi_k| < \infty$. We assume that $(\varepsilon_t)$ is a sequence of i.i.d. random variables with mean zero and variance $\sigma_\varepsilon^2$. Moreover, $E|\varepsilon_t|^r < \infty$ for some $r > 8$. The distribution of $\varepsilon_t$ is absolutely continuous with respect to the Lebesgue measure and has the characteristic function $c(\lambda)$ such that $\lim_{\lambda \to \infty} \lambda^p c(\lambda) = 0$ for some $p > 0$.

Assumption 2 puts more restrictions on $(x_t)$. However, it is still satisfied by many of the models that are used in empirical studies, including all invertible Gaussian ARMA models.

We define the stochastic processes $U_n(r)$ and $V_n(r)$ on $[0, 1]$ by

\[ U_n(r) = \frac{1}{\sqrt{n}} \sum_{t=1}^{[nr]} u_t \quad \text{and} \quad V_n(r) = \frac{1}{\sqrt{n}} \sum_{t=1}^{[nr]} v_t \]

where $[s]$ denotes the largest integer not exceeding $s$. Under Assumptions 1 and 2, as shown in Phillips and Solo (1992),

\[ (U_n, V_n) \to_d (U, V) \]

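as $n \to \infty$, where $(U, V)$ is a bivariate Brownian motion. Most of our results will be represented with the Brownian motions introduced here. The covariance matrix of the limit Brownian motion $(U, V)$ is written as

\[ \Omega = \begin{pmatrix} \omega_u^2 & \omega_{uv} \\ \omega_{vu} & \omega_v^2 \end{pmatrix}. \]

Note that $\omega_u^2 = \sigma_u^2$, since $u_t$ is a martingale difference sequence. We let the covariance between $u$ and $\varepsilon$ be $\sigma_{\varepsilon u}$. Then we have $\omega_{uv} = \varphi(1)\sigma_{\varepsilon u}$ and $\omega_v^2 = \varphi(1)^2 \sigma_\varepsilon^2$.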

Our asymptotic theory involves the local time of the Brownian motion $V$, which we introduce briefly below. The reader is referred to Park and Phillips (1999, 2001), Chang et al. (2001) and the references cited there for the concept of local time and its use in the asymptotics for nonlinear models with integrated time series. The local time $L$ of $V$ is defined by

\[ L(t, s) = \lim_{\epsilon \to 0} \frac{1}{2\epsilon} \int_0^t 1\{|V(r) - s| < \epsilon\}\, dr. \]

The local time $L$, as the occupation density, measures the time that the Brownian motion $V$ spends in the neighborhood of $s$, up to time $t$. It is well known that $L$ is continuous in both $t$ and $s$.
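The occupation-density interpretation of local time can be illustrated on a simulated path. The sketch below is only an approximation under stated assumptions: it uses a standard Brownian motion (unit variance, whereas $V$ in the text has variance $\omega_v^2$), a fixed grid, and a small but fixed $\epsilon$ instead of the limit; the function name, seed, and tuning values are hypothetical.

```python
import numpy as np

def local_time_estimate(path, dt, s, eps):
    """Approximate L(t, s) by (1 / (2 * eps)) times the time the path spends within eps of s."""
    inside = np.abs(path - s) < eps
    return inside.sum() * dt / (2.0 * eps)

rng = np.random.default_rng(0)
n = 200_000
dt = 1.0 / n

# Standard Brownian motion on [0, 1], evaluated at the left endpoint of each grid cell.
increments = rng.normal(0.0, np.sqrt(dt), size=n)
V = np.concatenate(([0.0], np.cumsum(increments)))[:-1]

# Estimated local time at the origin up to time 1, with a small fixed eps.
L_hat = local_time_estimate(V, dt, s=0.0, eps=0.02)
```

Smaller values of `eps` track the definition more closely but require a finer grid, since fewer grid points then fall inside the band around `s`.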

In developing our asymptotic theories, we apply the asymptotics for I-, H- and H0-regular functions introduced in Park and Phillips (2001). Roughly, the class of I-regular functions includes integrable functions. Indicators over compact intervals such as $f(x, \beta) = \beta\, 1\{0 \le x \le 1\}$ and smooth functions like $f(x, \beta) = e^{-\beta x^2}$ are examples of I-regular functions. The class of H-regular functions consists of functions that are asymptotically equivalent to some usual homogeneous functions, which are called limit homogeneous functions. An H0-regular function is a special type of H-regular function. Polynomials, logarithmic functions and all distribution function-like functions are asymptotically homogeneous functions. In particular, the asymptotically homogeneous functions $T(s) = |s|^k,\ \log|s|,\ 1/(1 + e^{-s})$ have asymptotic orders $\kappa(\lambda) = \lambda^k,\ \log\lambda,\ 1$ and limit homogeneous functions $h(s) = |s|^k,\ 1,\ 1\{s \ge 0\}$, respectively. For more detailed discussions of regular functions, the reader is referred to Park and Phillips (2001).

3. Asymptotics for Nonstationary Logistic Model

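As usual, the NLS estimator $\hat\theta_n$ is defined as

\[ \hat\theta_n = \operatorname*{argmin}_{\theta \in \Theta} Q_n(\theta) \]

where

\[ Q_n(\theta) = \sum_{t=1}^{n} (y_t - f(x_t, \theta))^2. \tag{3} \]

The corresponding error variance estimate is given by $\hat\sigma_n^2 = \frac{1}{n}\sum_{t=1}^{n} \hat u_t^2$, where $\hat u_t = y_t - f(x_t, \hat\theta_n)$. As in Park and Phillips (2001), we define

\[ \dot f = \left( \frac{\partial f}{\partial \theta_i} \right), \quad \ddot f = \left( \frac{\partial^2 f}{\partial \theta_i \partial \theta_j} \right), \quad \dddot f = \left( \frac{\partial^3 f}{\partial \theta_i \partial \theta_j \partial \theta_k} \right) \]

to be vectors arranged by the lexicographic ordering of their indices. For convenience, we define $\ddot F = \partial^2 f / \partial\theta \partial\theta'$ as the matrix form of the second derivative of $f$. Clearly, we may obtain $\ddot f$ from $\ddot F$ by stacking its rows into a column vector. Moreover, we denote by $\dot h$ the limit homogeneous function of the H-regular function $\dot f$. The asymptotic orders of the H-regular functions $\dot f$, $\ddot f$ and $\dddot f$ will be denoted as $\dot\kappa$, $\ddot\kappa$ and $\dddot\kappa$, respectively.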
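As an illustration of how the criterion (3) might be minimized in practice, the following is a minimal Gauss-Newton sketch with step halving, using the analytic first derivatives $\dot f$ of the logistic function. It is not the authors' implementation; the function names, starting values, tolerances, and the simulated data (including the seed and the error scale) are all assumptions made for the example.

```python
import numpy as np

def f(x, theta):
    """Logistic regression function mu + alpha / (1 + exp(-beta * (x - gamma)))."""
    mu, alpha, beta, gamma = theta
    z = np.clip(beta * (x - gamma), -700.0, 700.0)  # guard against overflow in exp
    return mu + alpha / (1.0 + np.exp(-z))

def jacobian(x, theta):
    """First derivatives of f with respect to (mu, alpha, beta, gamma), one row per observation."""
    mu, alpha, beta, gamma = theta
    z = np.clip(beta * (x - gamma), -700.0, 700.0)
    p = 1.0 / (1.0 + np.exp(-z))
    dp = p * (1.0 - p)
    return np.column_stack([
        np.ones_like(x),           # df/dmu
        p,                         # df/dalpha
        alpha * (x - gamma) * dp,  # df/dbeta
        -alpha * beta * dp,        # df/dgamma
    ])

def nls_gauss_newton(y, x, theta_init, max_iter=200, tol=1e-10):
    """Minimize Q_n(theta) = sum (y_t - f(x_t, theta))^2 by Gauss-Newton with step halving."""
    theta = np.asarray(theta_init, dtype=float)
    q = np.sum((y - f(x, theta)) ** 2)
    for _ in range(max_iter):
        resid = y - f(x, theta)
        step, *_ = np.linalg.lstsq(jacobian(x, theta), resid, rcond=None)
        scale = 1.0
        while scale > 1e-8:
            cand = theta + scale * step
            q_new = np.sum((y - f(x, cand)) ** 2)
            if q_new <= q:
                break          # accept the first step length that does not increase Q_n
            scale *= 0.5
        else:
            break              # no acceptable step length found; stop iterating
        converged = q - q_new < tol * (1.0 + q)
        theta, q = cand, q_new
        if converged:
            break
    return theta, q / len(y)   # estimate and error variance estimate (1/n) * sum of squared residuals

# Illustrative data: a random-walk regressor with hypothetical parameter values.
rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(0.0, 0.3, size=500))
theta_true = np.array([0.0, 1.0, 1.0, 0.0])
y = f(x, theta_true) + rng.normal(0.0, 0.05, size=500)
theta_hat, sigma2_hat = nls_gauss_newton(y, x, theta_init=[0.1, 0.8, 1.2, 0.1])
```

The sketch does not impose the restrictions on $\Theta$ (for example $\beta > 0$), so in applications the starting value should lie in the admissible region and the result should be checked against it.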

When the regression function $f$ is I-regular or H-regular and satisfies some identification conditions (as in Park and Phillips (2001)) or higher order differentiability conditions (as in Chang et al. (2001)), the asymptotic distributions of NLS estimators are well established. However, our logistic function is neither I-regular nor H-regular. Therefore, the theories developed for regular functions are not applicable in deriving the asymptotics for the NLS estimator $\hat\theta_n$ in the nonstationary logistic model. We show, however, that the logistic regression function can be written as a sum of one integrable function and one asymptotically homogeneous function. The Asymptotics for Additive Model derived in Chang et al. (2001) provides a possible way to deal with this type of regression function. The theory shows that integrable and asymptotically homogeneous functions do not interact in the limit. Because of this orthogonality, the regression on the sum of the two functions is asymptotically equivalent to the regressions on the two component functions separately, provided that these two functions do not have common unknown parameters. Because the asymptotic theories for regular functions are well established, this asymptotic equivalence greatly simplifies the analysis of some non-regular regression functions if their component functions are regular and do not have unknown parameters in common.

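Here, we rewrite the logistic model in an additive form:

\[
\begin{aligned}
f(x_t, \theta) &= \mu + \frac{\alpha}{1 + e^{-\beta(x_t - \gamma)}} \\
&= \mu + \alpha \cdot 1\{x_t \ge 0\} + \frac{\alpha e^{\beta(x_t - \gamma)}}{e^{\beta(x_t - \gamma)} + 1} \cdot 1\{x_t < 0\} - \frac{\alpha}{e^{\beta(x_t - \gamma)} + 1} \cdot 1\{x_t \ge 0\} \\
&= f_1(x_t, \theta) + f_2(x_t, \theta)
\end{aligned}
\]

where

\[ f_1(x_t, \theta) = \mu + \alpha \cdot 1\{x_t \ge 0\} \]

\[ f_2(x_t, \theta) = \frac{\alpha e^{\beta(x_t - \gamma)}}{e^{\beta(x_t - \gamma)} + 1} \cdot 1\{x_t < 0\} - \frac{\alpha}{e^{\beta(x_t - \gamma)} + 1} \cdot 1\{x_t \ge 0\}. \]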

Although the functions $f_1$ and $f_2$ are respectively H0-regular and I-regular, they share the unknown parameter $\alpha$, which makes the Asymptotics for Additive Model in Chang et al. (2001) inapplicable. In order to apply their theories, we need to find another regression function which is asymptotically equivalent to the original function, and to which the Asymptotics for Additive Model is applicable.
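The additive form can be verified numerically: the step component $f_1$ plus the integrable component $f_2$ should reproduce the logistic function at every point, and $f_2$ should vanish in both tails. The sketch below is only a check of the algebra; the parameter values and the grid are hypothetical.

```python
import math

def f(x, mu, alpha, beta, gamma):
    """Original logistic regression function."""
    return mu + alpha / (1.0 + math.exp(-beta * (x - gamma)))

def f1(x, mu, alpha):
    """Step component: mu + alpha * 1{x >= 0}."""
    return mu + alpha * (1.0 if x >= 0 else 0.0)

def f2(x, alpha, beta, gamma):
    """Integrable component: the remainder after removing the step component."""
    e = math.exp(beta * (x - gamma))
    if x < 0:
        return alpha * e / (e + 1.0)
    return -alpha / (e + 1.0)

# Hypothetical parameter values for the check.
mu, alpha, beta, gamma = 0.3, 1.7, 0.9, -0.4
grid = [k / 10.0 for k in range(-100, 101)]

# Largest discrepancy between f and f1 + f2 over the grid.
max_gap = max(
    abs(f(x, mu, alpha, beta, gamma) - f1(x, mu, alpha) - f2(x, alpha, beta, gamma))
    for x in grid
)

# f2 decays to zero in both tails, which is what makes it integrable.
tail_left = f2(-30.0, alpha, beta, gamma)
tail_right = f2(30.0, alpha, beta, gamma)
```

Note that $\alpha$ enters both `f1` and `f2`, which is exactly the shared-parameter problem described above.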

Before we consider our logistic regression model, we first look at the following model in general form:

\[ f(x_t, \theta) = f_1(x_t, \theta_1) + f_2(x_t, \theta_2) \]

where $f_1$ and $f_2$ are H0-regular and I-regular functions respectively, and $\theta_1$ and $\theta_2$ are unknown parameters. We write the parameters $\theta_1$ and $\theta_2$ more explicitly as

\[ \theta_1 = \alpha \quad \text{and} \quad \theta_2 = (\alpha', \beta')'. \]

Also, let $\theta = (\alpha', \beta')'$ and $\theta_0 = (\alpha_0', \beta_0')'$. Define

\[ f_2^*(x_t, \beta) = f_2(x_t, \alpha_0, \beta) \]

and $f^* = f_1 + f_2^*$. We have

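Theorem 1 Let Assumption 2 hold. Assume:

(a) $\dot f_1$, $\ddot f_1$ and $\dddot f_1$ exist and are H0-regular in a neighborhood of $\theta_0$ with $\|(\dot\kappa_1 \otimes \dot\kappa_1)^{-1} \ddot\kappa_1\|$, $\|(\dot\kappa_1 \otimes \dot\kappa_1 \otimes \dot\kappa_1)^{-1} \dddot\kappa_1\| < \infty$, and $\lambda^{1/2} \dot\kappa_1(\lambda) \to \infty$.

(b) $\int_{|s| \le \delta} \dot h_1(s, \theta_0) \dot h_1(s, \theta_0)'\, ds > 0$ for all $\delta > 0$.

(c) $\dot f_2$, $\ddot f_2$ and $\dddot f_2$ exist and are I-regular in a neighborhood of $\theta_0$, and $\int_{|s| \le \delta} \dot f_2^*(s, \theta_0) \dot f_2^*(s, \theta_0)'\, ds > 0$ for all $\delta > 0$.

Then the regression on $f$ is asymptotically equivalent to that on $f^*$, for which $f_1$ and $f_2^*$ are separable.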

Assumptions (a) and (b) are standard and satisfied by many smooth functions which are H0-regular. Assumption (c) is also satisfied by many integrable functions. We note that since $\dot f_2$, $\ddot f_2$ and $\dddot f_2$ exist and are I-regular in a neighborhood of $\theta_0$, so do $\dot f_2^*$, $\ddot f_2^*$ and $\dddot f_2^*$. Therefore, the separability of $f_1$ and $f_2^*$ follows immediately from the Asymptotics for Additive Model. The reader is referred to Chang et al. (2001) for the details. The asymptotic equivalence between $f$ and $f^*$ is naturally expected, since we replace $\alpha$ with the true value $\alpha_0$ in $f_2$ to generate $f_2^*$, and the NLS estimators are consistent.

According to Theorem 1, we let

\[ f_2^*(x_t, \theta) = \frac{\alpha_0 e^{\beta(x_t - \gamma)}}{e^{\beta(x_t - \gamma)} + 1} \cdot 1\{x_t < 0\} - \frac{\alpha_0}{e^{\beta(x_t - \gamma)} + 1} \cdot 1\{x_t \ge 0\} \]

\[ f^* = f_1 + f_2^*. \]

It is easy to check that $f_1$, $f_2$ and $f_2^*$ satisfy the assumptions in Theorem 1. Due to Theorem 1, the regression on $f$ is asymptotically equivalent to that on $f^*$, and $f_1$ and $f_2^*$ are separable. Now, we may obtain the asymptotic distributions of the NLS estimates simply by considering the regression functions $f_1$ and $f_2^*$ separately, for which the asymptotic theories developed in the previous works are applicable.

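Define $\bar 1(V) = 1 - 1\{V \ge 0\}$ and $\tilde 1(V) = 1\{V \ge 0\} - \int_0^1 1\{V \ge 0\}$. The limit theories for the NLS estimator $\hat\theta_n$ of $\theta$ in the nonstationary logistic regression given by (1) and (2) are as follows:

Theorem 2 Let Assumptions 1 and 2 hold. We have

(a) $\sqrt{n}(\hat\mu_n - \mu_0) \to_d \left( \int_0^1 \bar 1(V)^2 \right)^{-1} \int_0^1 \bar 1(V)\, dU$

(b) $\sqrt{n}(\hat\alpha_n - \alpha_0) \to_d \left( \int_0^1 \tilde 1(V)^2 \right)^{-1} \int_0^1 \tilde 1(V)\, dU$

(c) $\sqrt[4]{n}(\hat\beta_n - \beta_0) \to_d MN\left( 0,\ \sigma_u^2 \left( \dfrac{\alpha_0^2 (\pi^2 - 6)}{18 \beta_0^3}\, L(1, 0) \right)^{-1} \right)$

(d) $\sqrt[4]{n}(\hat\gamma_n - \gamma_0) \to_d MN\left( 0,\ \sigma_u^2 \left( \dfrac{\alpha_0^2 \beta_0}{6}\, L(1, 0) \right)^{-1} \right)$

jointly as $n \to \infty$.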

From Theorem 2, we see that the NLS estimators for the integrated logistic regression are consistent. The rates of convergence, however, may differ from the standard $\sqrt{n}$ rate in the stationary logistic regression. In particular, $\hat\beta_n$ and $\hat\gamma_n$ converge to the true parameters at the $\sqrt[4]{n}$ rate. The limiting distributions of $\hat\beta_n$ and $\hat\gamma_n$ are mixed normal, and thus the t- and chi-square tests in the usual manner are asymptotically valid. $\hat\mu_n$ and $\hat\alpha_n$ converge to the true parameters at the $\sqrt{n}$ rate. Their asymptotic distributions, however, are generally nonnormal, and become normal mixtures only when the Brownian motions $U$ and $V$ are uncorrelated, i.e. when $x_t$ is exogenous. The standard tests are therefore not applicable for the parameters $\mu$ and $\alpha$. This problem is addressed in more detail in the following discussions.
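The constants $(\pi^2 - 6)/18$ and $1/6$ in the mixed normal variances for $\beta$ and $\gamma$ can be cross-checked by direct integration. With $p(w) = 1/(1 + e^{-w})$ and the change of variable $w = \beta(x - \gamma)$, the squared derivatives of the logistic function integrate to $\alpha^2 \beta^{-3} \int w^2 [p(1-p)]^2\, dw$ for $\beta$ and $\alpha^2 \beta \int [p(1-p)]^2\, dw$ for $\gamma$. The sketch below evaluates the two integrals with a composite Simpson rule; the integration range and step count are arbitrary choices made for the example.

```python
import math

def logistic_density(w):
    """p(w) * (1 - p(w)) for p(w) = 1 / (1 + exp(-w)), written to avoid overflow."""
    e = math.exp(-abs(w))
    return e / (1.0 + e) ** 2

def simpson(g, a, b, m):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = g(a) + g(b)
    for k in range(1, m):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

# Integral behind the beta variance: int w^2 [p(1-p)]^2 dw, to be compared with (pi^2 - 6) / 18.
int_beta = simpson(lambda w: (w * logistic_density(w)) ** 2, -60.0, 60.0, 24000)

# Integral behind the gamma variance: int [p(1-p)]^2 dw, to be compared with 1 / 6.
int_gamma = simpson(lambda w: logistic_density(w) ** 2, -60.0, 60.0, 24000)

target_beta = (math.pi ** 2 - 6.0) / 18.0
target_gamma = 1.0 / 6.0
```

Neither integral depends on $\gamma$, which is consistent with the location parameter shifting the curve without changing its shape.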

Corollary 3 Let Assumptions 1 and 2 hold. Then $\hat\sigma_n^2 \to_p \sigma_u^2$ as $n \to \infty$.

The error variance estimator $\hat\sigma_n^2$ is also consistent in the nonstationary logistic regression model, as shown in Corollary 3. This result is not surprising, since the logistic regression function can be viewed as a sum of I-regular and H0-regular functions. It is shown in Chang et al. (2001) that $\hat\sigma_n^2$ is consistent in general nonlinear additive regressions with I- and H-regular, stationary, and deterministic regression functions.

We now consider the t-tests for the coefficients $\mu$, $\alpha$, $\beta$ and $\gamma$, which we denote by $t_n(\mu)$, $t_n(\alpha)$, $t_n(\beta)$ and $t_n(\gamma)$, respectively. Let $\rho$ be the correlation coefficient of the Brownian motions $U$ and $V$, and let $P$ and $Q$ be two independent standard Brownian motions.

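Theorem 4 Under Assumptions 1 and 2, we have

(a) $t_n(\mu) \to_d \sqrt{1 - \rho^2}\, P(1) + \rho\, \dfrac{\int_0^1 \bar 1(Q)\, dQ}{\left( \int_0^1 \bar 1(Q)^2 \right)^{1/2}}$

(b) $t_n(\alpha) \to_d \sqrt{1 - \rho^2}\, P(1) + \rho\, \dfrac{\int_0^1 \tilde 1(Q)\, dQ}{\left( \int_0^1 \tilde 1(Q)^2 \right)^{1/2}}$

(c) $t_n(\beta) \to_d N(0, 1)$

(d) $t_n(\gamma) \to_d N(0, 1)$

as $n \to \infty$.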

Theorem 4 shows that the limiting distributions of $t_n(\mu)$ and $t_n(\alpha)$ are nonnormal and dependent upon the asymptotic correlation coefficient of the Brownian motions $U$ and $V$. Thus, if the correlation coefficient $\rho$ is not zero, the conventional t-tests for the coefficients $\mu$ and $\alpha$ are invalid. In order to use the usual inferences, we need to remove the nuisance parameter $\rho$. This is addressed in the next section in the spirit of the efficient nonlinear least squares estimation introduced in Chang et al. (2001).

4. Efficient Nonlinear Least Squares Estimation

As shown in Theorem 2, the usual NLS estimators of $\mu$ and $\alpha$ are generally nonnormal and inefficient. Nonnormality of the estimators invalidates t- and chi-square tests, and the failure to utilize the information contained in the unit root of the regressor is partly responsible for the inefficiency. It is shown by Phillips (1991) for linear cointegrating regressions that a more efficient estimator can be constructed if this information is used. The FM-OLS method by Phillips and Hansen (1990) and the CCR method by Park (1992) both yield efficient estimators for parameters in linear cointegrating regressions. Following these, EN-NLS was introduced by Chang, Park and Phillips (2001) for nonlinear cointegrating regression models. The EN-NLS estimation is more efficient, and has a mixed normal limiting distribution for every parameter estimate. It yields asymptotically valid t- and chi-square tests in the usual manner. In order to obtain the EN-NLS estimates for our model, we need more assumptions about the innovation process $v_t = \varphi(L)\varepsilon_t$ of the integrated regressor $x_t$.

Assumption 3 We assume that

(a) $\varphi(z)$ is bounded and bounded away from zero for $|z| \le 1$, and

(b) if we write $\varphi(z)^{-1} = 1 - \sum_{k=1}^{\infty} \Pi_k z^k$, then $\ell^s \sum_{k=\ell+1}^{\infty} |\Pi_k|^2 < \infty$ for some $s \ge 9$.

To estimate our model efficiently, we first run the regression

\[ v_t = \Pi_1 v_{t-1} + \cdots + \Pi_\ell v_{t-\ell} + \varepsilon_{\ell,t}. \]

As in Chang et al. (2001), we let $\ell = n^\delta$, which allows $\ell$ to increase as $n \to \infty$. We select $\delta$ so that

\[ \frac{r + 2}{2r(s - 3)} < \delta < \frac{r}{6 + 8r}, \tag{4} \]

where $r$ is given by the moment condition for $(\varepsilon_t)$, i.e. $E|\varepsilon_t|^r < \infty$ for some $r > 8$ given in Assumption 2. It is easy to see that $\delta$ satisfying condition (4) exists for all $r > 8$ if $s \ge 9$, as is assumed in Assumption 3. For ARMA models, Assumptions 2 and 3 hold for any finite $r$ and $s$, so we may choose any $\delta$ such that $0 < \delta < 1/8$.
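The admissible range for $\delta$ in (4) is simple to tabulate. The sketch below computes the two bounds for given $r$ and $s$ and converts a chosen $\delta$ into a lag order by rounding $n^\delta$ down; the rounding rule, the function names, and the value $\delta = 0.11$ are assumptions made for illustration and are not taken from the text.

```python
def delta_bounds(r, s):
    """Lower and upper bounds for delta in condition (4)."""
    lower = (r + 2.0) / (2.0 * r * (s - 3.0))
    upper = r / (6.0 + 8.0 * r)
    return lower, upper

def lag_order(n, delta):
    """Lag order l = n^delta, rounded down to an integer (an assumed rounding rule)."""
    return int(n ** delta)

# Bounds at r = 9, s = 9, close to the boundary of the moment and decay conditions.
low_9, high_9 = delta_bounds(9.0, 9.0)

# With many finite moments and fast coefficient decay the interval approaches (0, 1/8).
low_big, high_big = delta_bounds(1e6, 1e6)

# A hypothetical delta inside the interval for r = s = 9, and the implied lag orders.
delta = 0.11
lags = {n: lag_order(n, delta) for n in (250, 1000)}
```

For this hypothetical $\delta$ the implied lag orders are 1 for $n = 250$ and 2 for $n = 1000$.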

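We define

\[ y_t^* = y_t - \hat\sigma_{u\varepsilon} (\hat\sigma_\varepsilon^2)^{-1} \hat\varepsilon_{\ell,t+1} \]

where

\[ \hat\sigma_{u\varepsilon} = \frac{1}{n} \sum_{t=1}^{n} \hat u_t \hat\varepsilon_{\ell,t+1} \quad \text{and} \quad \hat\sigma_\varepsilon^2 = \frac{1}{n} \sum_{t=1}^{n} \hat\varepsilon_{\ell,t}^2 \]

with the first step NLS residual $\hat u_t$. Then consider the regression

\[ y_t^* = f(x_t, \theta) + u_t^* \tag{5} \]

in place of (1).

The efficient estimator is the NLS estimator $\hat\theta_n^*$ of $\theta$, computed from the transformed regression (5). Just like the FM-OLS and CCR methods, the EN-NLS corrects for the long-run dependency between the regression errors and the innovations of the integrated regressor.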
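The transformation leading to (5) can be sketched in a few lines once first-step residuals are available. The code below fits the autoregression of $v_t$ by OLS without an intercept, pairs each observation at time $t$ with the fitted innovation at time $t + 1$, and returns the corrected response together with the matching regressor values. It is a minimal sketch under assumptions: the function names are hypothetical, the sample is trimmed to the observations for which the lead innovation exists, and the averages are taken over that trimmed sample.

```python
import numpy as np

def fitted_innovations(v, lag):
    """OLS residuals from regressing v_t on (v_{t-1}, ..., v_{t-lag}), aligned with v[lag:]."""
    n = len(v)
    X = np.column_stack([v[lag - k: n - k] for k in range(1, lag + 1)])
    target = v[lag:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ coef

def ennls_corrected_response(y, x, u_hat, lag):
    """Return (y_star, x_used) with y*_t = y_t - sigma_ue / sigma_e^2 * eps_{l,t+1}."""
    v = np.diff(x)                    # v[j] = x[j+1] - x[j], the innovation dated j + 1
    eps = fitted_innovations(v, lag)  # eps[i] is dated lag + 1 + i
    # Observation t is paired with the innovation dated t + 1, so t = lag, ..., n - 2.
    t_index = np.arange(lag, len(x) - 1)
    sigma_ue = np.mean(u_hat[t_index] * eps)
    sigma_ee = np.mean(eps ** 2)
    y_star = y[t_index] - sigma_ue / sigma_ee * eps
    return y_star, x[t_index]

# Illustrative inputs: a random walk regressor and placeholder first-step residuals.
rng = np.random.default_rng(3)
x_demo = np.cumsum(rng.normal(size=300))
u_demo = rng.normal(size=300)
y_demo = u_demo.copy()                # regression function set to zero for the illustration
y_star, x_used = ennls_corrected_response(y_demo, x_demo, u_demo, lag=2)
```

The second-step estimate is then obtained by applying the same NLS routine to the pairs `(y_star, x_used)` instead of the original data.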

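The following theorem presents the limit theory for the EN-NLS estimator $\hat\theta_n^*$. Let $\hat\theta_n^* = (\hat\mu_n^*, \hat\alpha_n^*, \hat\beta_n^*, \hat\gamma_n^*)'$ and define $U^* = U - \omega_{uv}(\omega_v^2)^{-1} V$. The process $U^*$ is independent of $V$, and its variance is given by $\sigma_*^2 = \sigma_u^2 - \omega_{uv}(\omega_v^2)^{-1}\omega_{vu}$, i.e. the long-run conditional variance of $U$ given $V$, which is less than $\sigma_u^2$ whenever $\omega_{uv} \neq 0$.

Theorem 5 Under Assumptions 1-3, we have

(a) $\sqrt{n}(\hat\mu_n^* - \mu_0) \to_d \left( \int_0^1 \bar 1(V)^2 \right)^{-1} \int_0^1 \bar 1(V)\, dU^*$

(b) $\sqrt{n}(\hat\alpha_n^* - \alpha_0) \to_d \left( \int_0^1 \tilde 1(V)^2 \right)^{-1} \int_0^1 \tilde 1(V)\, dU^*$

(c) $\sqrt[4]{n}(\hat\beta_n^* - \beta_0) \to_d MN\left( 0,\ \sigma_*^2 \left( \dfrac{\alpha_0^2 (\pi^2 - 6)}{18 \beta_0^3}\, L(1, 0) \right)^{-1} \right)$

(d) $\sqrt[4]{n}(\hat\gamma_n^* - \gamma_0) \to_d MN\left( 0,\ \sigma_*^2 \left( \dfrac{\alpha_0^2 \beta_0}{6}\, L(1, 0) \right)^{-1} \right)$

jointly as $n \to \infty$, where $\bar 1(V) = 1 - 1\{V \ge 0\}$ and $\tilde 1(V) = 1\{V \ge 0\} - \int_0^1 1\{V \ge 0\}$.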

From Theorem 5, we can see that the EN-NLS estimates have the same convergence rates as the usual NLS estimates derived in Theorem 2. However, EN-NLS estimation corrects the nonnormality of the estimators of $\mu$ and $\alpha$. The new estimators now have mixed normal limit distributions. As a result, the usual t- and chi-square tests based on the EN-NLS estimators are now valid for all parameters. Moreover, the asymptotic variances are reduced when the regression errors are correlated with the future or past as well as the present values of the innovation of the integrated regressor. Therefore, EN-NLS estimators are generally more efficient than the usual NLS estimators.
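The size of the variance reduction follows directly from the long-run moments defined in Section 2: $\omega_{uv} = \varphi(1)\sigma_{\varepsilon u}$, $\omega_v^2 = \varphi(1)^2\sigma_\varepsilon^2$, and $\sigma_*^2 = \sigma_u^2 - \omega_{uv}^2/\omega_v^2 = \sigma_u^2(1 - \rho^2)$. The sketch below evaluates these formulas; the function name and the numerical inputs are hypothetical and serve only as a worked example.

```python
import math

def long_run_moments(phi_1, sigma_u2, sigma_e2, sigma_eu):
    """Long-run covariance terms, correlation rho, and the conditional variance sigma_*^2."""
    omega_u2 = sigma_u2                  # u_t is a martingale difference sequence
    omega_uv = phi_1 * sigma_eu          # omega_uv = phi(1) * sigma_eu
    omega_v2 = phi_1 ** 2 * sigma_e2     # omega_v^2 = phi(1)^2 * sigma_e^2
    rho = omega_uv / math.sqrt(omega_u2 * omega_v2)
    sigma_star2 = sigma_u2 - omega_uv ** 2 / omega_v2
    return omega_uv, omega_v2, rho, sigma_star2

# Hypothetical inputs: phi(1) = 1.5, unit variances, covariance 1 / sqrt(2).
omega_uv, omega_v2, rho, sigma_star2 = long_run_moments(
    phi_1=1.5, sigma_u2=1.0, sigma_e2=1.0, sigma_eu=1.0 / math.sqrt(2.0)
)
```

In this example $\rho^2 = 1/2$, so the efficient estimators would have half the long-run error variance of the usual NLS estimators; the value of $\varphi(1)$ cancels in $\rho$ and $\sigma_*^2$.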

5. Simulations

In this section, we perform a set of simulations to investigate the finite sample properties of the NLS and EN-NLS estimators in the nonstationary logistic regression model specified in (1) and (2). We choose the true values of the parameters as $\mu_0 = 0$, $\alpha_0 = 1$, $\beta_0 = 1$, $\gamma_0 = 0$, which meet the parameter restrictions. In order to satisfy the conditions in Assumptions 1 and 2, the regressor $x_t$ and error term $u_t$ are generated as follows:

\[ u_t = \varepsilon_{1,t+1}/\sqrt{2} + \varepsilon_{2,t+1}/\sqrt{2}, \]

and

\[ \Delta x_t = v_t = \varepsilon_{2,t} + 0.5\, \varepsilon_{2,t-1}, \]

where $(\varepsilon_{1,t})$ and $(\varepsilon_{2,t})$ are randomly drawn from $N(0, \sigma^2)$ distributions with $\sigma^2 = 0.12$.

From the data generating process, it is easy to see that the regression error is a martingale difference sequence and is asymptotically correlated with the innovation that generates the integrated process $x_t$. Theorems 2 and 5 are readily applicable to this model. The NLS estimates for the parameters $\mu$ and $\alpha$ on the asymptotically homogeneous component converge at the $\sqrt{n}$ rate to nonGaussian distributions due to the asymptotic correlation between the error term and the innovation. The asymptotic distributions of the t-statistics based on these two estimators are consequently nonstandard. In contrast, the NLS estimators of $\beta$ and $\gamma$ converge to mixed normal distributions at the convergence rate $n^{1/4}$, which implies that the standard t-statistics based on these two estimates are valid. On the other hand, the limiting distributions of the EN-NLS estimates of $\mu$, $\alpha$, $\beta$, and $\gamma$ are all mixed normal, implying that the t-statistics constructed from them have standard normal limiting distributions. Moreover, the EN-NLS estimates have reduced long-run variances, making them asymptotically more efficient than their NLS counterparts.

Figure 1: Densities of NLS and EN-NLS estimates, n=250
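A single replication of a data generating process of this form can be sketched as follows. The code is an illustration rather than the authors' simulation program: the function name, the starting value $x_0 = 0$, the seed, and the value passed for the innovation variance are assumptions, and the variance is left as a parameter so that the intended value can be supplied.

```python
import numpy as np

def simulate(n, sigma2, theta0, rng):
    """Draw (y, x, u, v) with u_t built from time t + 1 innovations and MA(1) increments v_t."""
    mu0, alpha0, beta0, gamma0 = theta0
    sd = np.sqrt(sigma2)
    e1 = rng.normal(0.0, sd, size=n + 2)   # e1[t] for t = 0, ..., n + 1
    e2 = rng.normal(0.0, sd, size=n + 2)
    t = np.arange(1, n + 1)
    v = e2[t] + 0.5 * e2[t - 1]            # v_t = e2_t + 0.5 * e2_{t-1}
    x = np.cumsum(v)                       # integrated regressor, started at x_0 = 0 (assumed)
    u = (e1[t + 1] + e2[t + 1]) / np.sqrt(2.0)
    y = mu0 + alpha0 / (1.0 + np.exp(-beta0 * (x - gamma0))) + u
    return y, x, u, v

rng = np.random.default_rng(0)
theta0 = (0.0, 1.0, 1.0, 0.0)              # true parameter values used in the simulations
y, x, u, v = simulate(250, 0.01, theta0, rng)  # 0.01 is an illustrative variance value
```

Because $u_t$ loads on $\varepsilon_{2,t+1}$ while $v_{t+1}$ also contains $\varepsilon_{2,t+1}$, the error at time $t$ is correlated with the next-period innovation of the regressor, which is the dependence that the one-period-ahead correction term targets.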

In the simulation, samples of size 250 and 1000 are drawn 5000 times to compute the NLS and EN-NLS estimators and the t-statistics based on these estimators. One-period ahead fitted innovations $\hat\varepsilon_{\ell,t+1}$ are used to construct the EN-NLS correction term. The fitted innovations are obtained from the $\ell$th order autoregressions of $v_t$ with $\ell = 1, 2$, respectively for $n = 250$ and $n = 1000$. For the nonlinear estimation, we use the GAUSS optimization routine and the Gauss-Newton algorithm. The simulation results are summarized in Figures 1-3. The estimators are scaled by their respective convergence rates as provided by the asymptotic theories. Figures 1 and 2 present the densities of the scaled NLS and EN-NLS estimators for sample sizes $n = 250$ and 1000. The distributions of the t-ratios based on these estimates are included in Figure 3.

Figure 2: Densities of NLS and EN-NLS estimates, n=1000

The finite sample behaviors of the NLS and EN-NLS estimators are mostly as expected. The NLS estimators of $\mu$ and $\alpha$ suffer from bias that does not vanish as the sample size increases. The finite sample distributions of $\hat\mu_n$ and $\hat\alpha_n$ are skewed and therefore nonnormal, which is also consistent with the limiting distribution theory of Theorem 2.

In contrast, the finite sample distribution of $\hat\gamma_n$ is symmetric and well centered, as expected from its asymptotics. However, from Figures 1 and 2, we can see that the NLS estimator for $\beta$ has a noticeable bias which does not go away quickly as the sample size increases. We may therefore say that the asymptotic approximation for the NLS estimator of $\beta$ is not good enough. The performances of the EN-NLS estimators reflect our asymptotic theory even better. Although the small sample behaviors of the EN-NLS estimators do not perfectly reproduce the asymptotic results, Figure 1 shows a substantial improvement in the behaviors of the EN-NLS estimates compared with the NLS estimates. In the large sample case, all the EN-NLS estimators are well centered and have symmetric distributions, as Theorem 5 predicts. Our correction not only fixes the bias and nonGaussianity problems of the NLS estimators of $\mu$ and $\alpha$, but also effectively removes the finite sample bias and the distribution asymmetry of the NLS estimator of $\beta$. Moreover, it is worthwhile to mention that the EN-NLS estimators are noticeably more concentrated around the true parameter values, as our theory suggests.

Figure 3: Densities of t-statistics, n=250 and 1000

The distributions of the t-statistics constructed from the estimators are also mostly consistent with our asymptotic theory. As shown in Figure 3, the distributions of the t-ratios constructed from the EN-NLS estimators are very close to the standard normal distribution, and the approximation improves as the sample size gets larger. However, the distributions of the t-ratios based on the NLS estimators of $\mu$ and $\alpha$ are nonstandard in both small and large samples, as the theory predicts. The distribution of the t-ratio based on the NLS estimator $\hat\gamma_n$ is symmetric and well centered, only a little lower than the standard normal distribution. This improves as the sample size increases. Since the finite sample NLS estimator of $\beta$ suffers from bias as discussed before, the distribution of the t-ratio based on this estimator does not properly approximate its limiting standard normal distribution, as we can expect.

6. Conclusion

In this paper, we establish the asymptotic theories for the nonstationary logistic regression model. The NLS estimators are consistent and have well defined limiting distributions which can be represented as functions of Brownian motion and Brownian local time. The NLS estimators for the constant term $\mu$ and the coefficient $\alpha$ converge at the rate $\sqrt{n}$ and have nonGaussian, biased and nuisance parameter dependent limiting distributions. In contrast, the NLS estimators for the coefficients $\beta$ and $\gamma$ are unbiased and have mixed normal limiting distributions. For those estimators whose distributions are nonGaussian, the standard t- and chi-square tests are invalid. To correct this, we apply EN-NLS estimation to get better estimators for this model. The EN-NLS estimators are unbiased and have mixed normal limiting distributions, which allow us to use standard t- or chi-square tests. We only include one integrated regressor in our model. Our technical results do not cover the logistic model which also includes stationary regressors and/or a time trend. Such extensions of this model provide an interesting area for future research.

7. Mathematical Appendix

Proof of Theorem 1 To prove Theorem 1, for an appropriately chosen normalizing sequence $\nu_n$ we need to show that

\[ \nu_n^{-1} \sum_{t=1}^{n} \dot f(x_t, \theta_0) u_t = \nu_n^{-1} \sum_{t=1}^{n} \dot f^*(x_t, \theta_0) u_t + o_p(1) \tag{6} \]

\[ (\nu_n \otimes \nu_n)^{-1} \sum_{t=1}^{n} \ddot f(x_t, \theta_0) u_t = (\nu_n \otimes \nu_n)^{-1} \sum_{t=1}^{n} \ddot f^*(x_t, \theta_0) u_t + o_p(1) \tag{7} \]

\[ \nu_n^{-1} \sum_{t=1}^{n} \dot f(x_t, \theta_0) \dot f(x_t, \theta_0)' \nu_n^{-1\prime} = \nu_n^{-1} \sum_{t=1}^{n} \dot f^*(x_t, \theta_0) \dot f^*(x_t, \theta_0)' \nu_n^{-1\prime} + o_p(1). \tag{8} \]

Moreover, we need to establish the regularity condition AD7 required for deriving limit theories for nonstationary NLS estimators. The reader is referred to Park and Phillips (2001) for the details of AD7. Given AD7, it follows from (6)-(8) that the regression on $f$ is asymptotically equivalent to that on $f^*$.

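Write $\dot\kappa_{1n} = \dot\kappa_1(\sqrt{n})$, and let $c_n = \|n^{-1/4} \dot\kappa_{1n}^{-1}\|$. By assumption (a), $c_n \to 0$ as $n \to \infty$. Also, we define $\dot f_i$ and $\ddot F_i$, for $i = 1, 2$, similarly as $\dot f$ and $\ddot F$. Moreover, let

\[ \dot f_2 = \begin{pmatrix} \dot f_{2\alpha} \\ \dot f_{2\beta} \end{pmatrix} \quad \text{and} \quad \ddot F_2 = \begin{pmatrix} \ddot F_{2\alpha\alpha} & \ddot F_{2\alpha\beta} \\ \ddot F_{2\beta\alpha} & \ddot F_{2\beta\beta} \end{pmatrix}. \]

It follows that

\[ \dot f = \begin{pmatrix} \dot f_1 + \dot f_{2\alpha} \\ \dot f_{2\beta} \end{pmatrix}, \quad \ddot F = \begin{pmatrix} \ddot F_1 + \ddot F_{2\alpha\alpha} & \ddot F_{2\alpha\beta} \\ \ddot F_{2\beta\alpha} & \ddot F_{2\beta\beta} \end{pmatrix} \tag{9} \]

and

\[ \dot f^* = \begin{pmatrix} \dot f_1 \\ \dot f_{2\beta} \end{pmatrix}, \quad \ddot F^* = \begin{pmatrix} \ddot F_1 & 0 \\ 0 & \ddot F_{2\beta\beta} \end{pmatrix}. \tag{10} \]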

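To show (6)-(8), we note that

\[ \dot f - \dot f^* = \begin{pmatrix} \dot f_{2\alpha} \\ 0 \end{pmatrix}, \quad \ddot F - \ddot F^* = \begin{pmatrix} \ddot F_{2\alpha\alpha} & \ddot F_{2\alpha\beta} \\ \ddot F_{2\beta\alpha} & 0 \end{pmatrix} \]

and

\[ \dot f \dot f' - \dot f^* \dot f^{*\prime} = \begin{pmatrix} \dot f_1 \dot f_{2\alpha}' + \dot f_{2\alpha} \dot f_1' + \dot f_{2\alpha} \dot f_{2\alpha}' & \dot f_{2\alpha} \dot f_{2\beta}' \\ \dot f_{2\beta} \dot f_{2\alpha}' & 0 \end{pmatrix}, \]

which follow directly from (9)-(10). Now we may easily deduce (6), since

\[ \left\| \nu_n^{-1} \sum_{t=1}^{n} (\dot f - \dot f^*)(x_t, \theta_0) u_t \right\| \le c_n \left\| \frac{1}{\sqrt[4]{n}} \sum_{t=1}^{n} \dot f_{2\alpha}(x_t, \theta_0) u_t \right\| \to_p 0. \]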

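It also follows that

\[
\begin{aligned}
\left\| (\nu_n \otimes \nu_n)^{-1} \sum_{t=1}^{n} (\ddot f - \ddot f^*)(x_t, \theta_0) u_t \right\|
&\le \left\| (\nu_n \otimes \nu_n)^{-1} \sum_{t=1}^{n} \ddot f_2(x_t, \theta_0) u_t \right\| \\
&\le \frac{1 + c_n + c_n^2}{\sqrt[4]{n}} \left\| \frac{1}{\sqrt[4]{n}} \sum_{t=1}^{n} \ddot f_2(x_t, \theta_0) u_t \right\| = o_p(1),
\end{aligned}
\]

which proves (7), due to the fact that $\|(\nu_n \otimes \nu_n)^{-1}\| \le n^{-1/2}(1 + c_n + c_n^2)$. Also, if we define $s_{\max} = \max_{0 \le r \le 1} V(r)$, $s_{\min} = \min_{0 \le r \le 1} V(r)$, and $K = [s_{\min} - 1, s_{\max} + 1]$, it follows that

\[ \left\| \nu_{1n}^{-1} \sum_{t=1}^{n} \dot f_1(x_t, \theta_0) \dot f_{2\alpha}(x_t, \theta_0)' \nu_{1n}^{-1\prime} \right\| \le \frac{c_n}{\sqrt[4]{n}} \left\| \dot h_1 \right\|_K \frac{1}{\sqrt{n}} \sum_{t=1}^{n} \left\| \dot f_{2\alpha}(x_t, \theta_0) \right\| + o_p(1) \]

\[ \left\| \nu_{1n}^{-1} \sum_{t=1}^{n} \dot f_{2\alpha}(x_t, \theta_0) \dot f_{2\alpha}(x_t, \theta_0)' \nu_{1n}^{-1\prime} \right\| \le c_n^2\, \frac{1}{\sqrt{n}} \sum_{t=1}^{n} \left\| \dot f_{2\alpha}(x_t, \theta_0) \dot f_{2\alpha}(x_t, \theta_0)' \right\| \]

\[ \left\| \nu_{1n}^{-1} \sum_{t=1}^{n} \dot f_{2\alpha}(x_t, \theta_0) \dot f_{2\beta}(x_t, \theta_0)' \nu_{2n}^{-1\prime} \right\| \le c_n\, \frac{1}{\sqrt{n}} \sum_{t=1}^{n} \left\| \dot f_{2\alpha}(x_t, \theta_0) \dot f_{2\beta}(x_t, \theta_0)' \right\|. \]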

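The right-hand sides of these inequalities are all $o_p(1)$, which proves (8).

Now we show that AD7 holds for some $\eta_n$. Let $\delta$ be any number such that $0 < \delta < 1/12$, and let $\eta_n = n^{-\delta}\nu_n$. Also, define $N_n = \{\theta : \|\eta_n'(\theta - \theta_0)\| \le 1\}$ as in AD7. Clearly, $\eta_n\nu_n^{-1} \to 0$ as required. Furthermore, we have

\[
\left\|\eta_n^{-1}\dot f(x,\theta)\right\| \le n^{-1/2+\delta}\left\|\dot\kappa_{1n}^{-1}\dot f_1(x,\alpha)\right\| + n^{-1/4+\delta}\left\|\dot f_2(x,\alpha,\beta)\right\| \tag{11}
\]

\[
\left\|(\eta_n\otimes\eta_n)^{-1}\ddot f(x,\theta)\right\| \le n^{-1+2\delta}\left\|(\dot\kappa_1\otimes\dot\kappa_1)^{-1}\ddot\kappa_1\right\|\left\|\ddot\kappa_{1n}^{-1}\ddot f_1\right\| + n^{-1/2+2\delta}(1 + c_n + c_n^2)\left\|\ddot f_2(x,\alpha,\beta)\right\| \tag{12}
\]

for all $\theta \in N_n$. We write $\ddot Q_n(\theta) - \ddot Q_n(\theta_0)$ as

\[
\ddot Q_n(\theta) - \ddot Q_n(\theta_0) = \left(D_{1n}(\theta) + D_{1n}(\theta)'\right) + D_{2n}(\theta) + D_{3n}(\theta) + D_{4n}(\theta) \tag{13}
\]

where

\[
\begin{aligned}
D_{1n}(\theta) &= \sum_{t=1}^{n}\dot f(x_t,\theta_0)\left(\dot f(x_t,\theta) - \dot f(x_t,\theta_0)\right)', \\
D_{2n}(\theta) &= \sum_{t=1}^{n}\left(\dot f(x_t,\theta) - \dot f(x_t,\theta_0)\right)\left(\dot f(x_t,\theta) - \dot f(x_t,\theta_0)\right)', \\
D_{3n}(\theta) &= \sum_{t=1}^{n}\ddot F(x_t,\theta)\left(f(x_t,\theta) - f(x_t,\theta_0)\right), \\
D_{4n}(\theta) &= -\sum_{t=1}^{n}\left(\ddot F(x_t,\theta) - \ddot F(x_t,\theta_0)\right)u_t,
\end{aligned}
\]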

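and define

\[
\varpi_{in}^2(\theta) = \left\|\eta_n^{-1}D_{in}(\theta)\eta_n^{-1\prime}\right\|
\]

for $i = 1, \ldots, 4$. It follows immediately from (11) and (12) that

\[
\varpi_{1n}^2(\theta),\ \varpi_{2n}^2(\theta)\ \text{and}\ \varpi_{3n}^2(\theta) \to_p 0
\]

as $n \to \infty$, uniformly in $\theta \in N_n$.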

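Now

\[
\varpi_{4n}^2(\theta) \le \left\|(\eta_n\otimes\eta_n\otimes\eta_n)^{-1}\sum_{t=1}^{n}\dddot f(x_t,\bar\theta)u_t\right\|
\]

with $\bar\theta$ between $\theta$ and $\theta_0$. However,

\[
\begin{aligned}
\left\|(\eta_n\otimes\eta_n\otimes\eta_n)^{-1}\sum_{t=1}^{n}\dddot f(x_t,\bar\theta)u_t\right\|
\le{}& \frac{n^{3\delta}}{\sqrt{n}}\left\|(\dot\kappa_1\otimes\dot\kappa_1\otimes\dot\kappa_1)^{-1}\dddot\kappa_1\right\|\left\|\frac{1}{n}\sum_{t=1}^{n}\dddot\kappa_{1n}^{-1}\dddot f_1(x_t,\alpha)u_t\right\| \\
&+ \frac{n^{3\delta}(1 + c_n + c_n^2 + c_n^3)}{\sqrt[4]{n}}\left\|\frac{1}{\sqrt{n}}\sum_{t=1}^{n}\dddot f_2(x_t,\alpha,\beta)u_t\right\|.
\end{aligned}
\]

Notice that $\|(\eta_n\otimes\eta_n\otimes\eta_n)^{-1}\| \le n^{-3/4+3\delta}(1 + c_n + c_n^2 + c_n^3)$. Since we have, by Lemma A7 (p. 145) in Park and Phillips (2001),

\[
\left\|\frac{1}{n}\sum_{t=1}^{n}\dddot\kappa_{1n}^{-1}\dddot f_1(x_t,\alpha)u_t\right\|,\quad
\left\|\frac{1}{\sqrt{n}}\sum_{t=1}^{n}\dddot f_2(x_t,\alpha,\beta)u_t\right\| \to_p 0
\]

uniformly in $\alpha$ and $\beta$, it follows that $\varpi_{4n}^2(\theta) \to_p 0$ uniformly on $N_n$. The condition AD7 is thus satisfied.

Because $\dot f_2$, $\ddot f_2$ and $\dddot f_2$ exist and are $I$-regular in a neighborhood of $\theta_0$, the same holds for $\dot f_2^*$, $\ddot f_2^*$ and $\dddot f_2^*$. The separability of $f_1$ and $f_2^*$ follows immediately from the asymptotics for additive models in Chang et al. (2001).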

Proof of Theorem 2. To prove Theorem 2, we only need to look at the regression functions $f_1$ and $f_2^*$ separately, due to Theorem 1. First, we look at the regression function $f_1$:

\[
y_t = \mu + \alpha\cdot 1\{x_t \ge 0\} + u_t.
\]

We note that the function $1\{x_t \ge 0\}$ is an $H$-regular function with asymptotic order 1 and with limit homogeneous function equal to itself. Hence, due to Park and Phillips (2001), we have

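\[
\frac{1}{n}\sum_{t=1}^{n}1\{x_t \ge 0\} \to_{a.s.} \int_0^1 1\{V \ge 0\}\,dr,
\qquad
\frac{1}{\sqrt{n}}\sum_{t=1}^{n}1\{x_t \ge 0\}u_t \to_d \int_0^1 1\{V \ge 0\}\,dU
\]

and

\[
\frac{1}{\sqrt{n}}\sum_{t=1}^{n}\frac{1}{n}\sum_{t=1}^{n}1\{x_t \ge 0\}u_t
= \frac{1}{n}\sum_{t=1}^{n}\left(\frac{1}{\sqrt{n}}\sum_{t=1}^{n}1\{x_t \ge 0\}u_t\right)
\to_d \int_0^1\!\!\int_0^1 1\{V \ge 0\}\,dU\,dr
\]

as $n \to \infty$.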
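The first of these limits can be illustrated by simulation. The sketch below is not part of the paper: it assumes the simplest integrated regressor, a Gaussian random walk $x_t = x_{t-1} + v_t$ with i.i.d. standard normal $v_t$, and the helper names `occupation_fraction` and `simulate_fractions` are hypothetical. For such a walk the sample fraction of nonnegative observations does not settle at a constant; its limit is the occupation time of a Brownian motion on the positive half-line, which follows the arcsine law (mean 1/2, with much of the mass near 0 and 1).

```python
import random

def occupation_fraction(n, rng):
    """Return (1/n) * sum_{t=1}^{n} 1{x_t >= 0} for one simulated random walk."""
    x = 0.0
    hits = 0
    for _ in range(n):
        x += rng.gauss(0.0, 1.0)  # assumed innovation: v_t ~ N(0, 1), i.i.d.
        if x >= 0.0:
            hits += 1
    return hits / n

def simulate_fractions(n=1000, reps=400, seed=0):
    """Draw `reps` independent occupation fractions with a seeded generator."""
    rng = random.Random(seed)
    return [occupation_fraction(n, rng) for _ in range(reps)]

fractions = simulate_fractions()
mean_fraction = sum(fractions) / len(fractions)
# Share of paths that spend over 90% of the sample on one side of zero.
tail_share = sum(f < 0.1 or f > 0.9 for f in fractions) / len(fractions)
print(f"mean = {mean_fraction:.3f}, tail share = {tail_share:.3f}")
```

Under the arcsine law the tail share is roughly 0.41, far above the value a degenerate limit at 1/2 would give. This is the sense in which the limits above are random functionals of $V$ rather than constants.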

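For part (a),

\[
\begin{aligned}
\sqrt{n}(\hat\mu_n - \mu_0)
&= \left(1 - \frac{1}{n}\sum_{t=1}^{n}1\{x_t \ge 0\}\right)^{-1}\cdot\left(\frac{1}{\sqrt{n}}\sum_{t=1}^{n}u_t - \frac{1}{\sqrt{n}}\sum_{t=1}^{n}1\{x_t \ge 0\}u_t\right) \\
&\to_d \left(1 - \int_0^1 1\{V \ge 0\}\right)^{-1}\cdot\left(\int_0^1\left(1 - \int_0^1 1\{V \ge 0\}\,dr\right)dU\right) \\
&= \left(\int_0^1 1(V)^2\right)^{-1}\int_0^1 1(V)\,dU,
\end{aligned}
\]

where $1(V) = 1 - 1\{V \ge 0\}$.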


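For part (b),

\[
\begin{aligned}
\sqrt{n}(\hat\alpha_n - \alpha_0)
&= \left(\frac{1}{n}\sum_{t=1}^{n}1\{x_t \ge 0\} - \frac{1}{n}\sum_{t=1}^{n}1\{x_t \ge 0\}\cdot\frac{1}{n}\sum_{t=1}^{n}1\{x_t \ge 0\}\right)^{-1} \\
&\quad\cdot\left(\frac{1}{\sqrt{n}}\sum_{t=1}^{n}1\{x_t \ge 0\}u_t - \frac{1}{\sqrt{n}}\sum_{t=1}^{n}\left(\frac{1}{n}\sum_{t=1}^{n}1\{x_t \ge 0\}\right)u_t\right) \\
&\to_d \left(\int_0^1 1\{V \ge 0\} - \left(\int_0^1 1\{V \ge 0\}\right)^2\right)^{-1}\cdot\left(\int_0^1 1\{V \ge 0\}\,dU - \int_0^1\!\!\int_0^1 1\{V \ge 0\}\,dr\,dU\right) \\
&= \left(\int_0^1 1(V)^2\right)^{-1}\int_0^1 1(V)\,dU,
\end{aligned}
\]

where $1(V) = 1\{V \ge 0\} - \int_0^1 1\{V \ge 0\}$.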

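For parts (c) and (d), we can look at the regression function

\[
y_t = f_2^*(x_t,\beta,\gamma) + u_t.
\]

We know that $f_2^*$ is an $I$-regular function satisfying the regularity conditions required for Theorem 7 in Chang et al. (2001). Thus we may deduce directly from the result given there that

\[
\sqrt[4]{n}\begin{pmatrix}\hat\beta_n - \beta_0 \\ \hat\gamma_n - \gamma_0\end{pmatrix}
\to_d MN\left(0,\ \sigma_u^2\left(L(1,0)\int_{-\infty}^{\infty}\begin{pmatrix}\dot f_{2\beta}^*\dot f_{2\beta}^* & \dot f_{2\beta}^*\dot f_{2\gamma}^* \\ \dot f_{2\gamma}^*\dot f_{2\beta}^* & \dot f_{2\gamma}^*\dot f_{2\gamma}^*\end{pmatrix}ds\right)^{-1}\right)
\]

as $n \to \infty$, where $\dot f_{2\beta}^*$ and $\dot f_{2\gamma}^*$ are the partial derivatives of the function $f_2^*(s,\beta,\gamma)$ with respect to $\beta$ and $\gamma$, evaluated at the true parameter values. We have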

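\[
\begin{aligned}
\int_{-\infty}^{\infty}\dot f_{2\beta}^*\dot f_{2\beta}^*\,ds
&= \int_{-\infty}^{\infty}\frac{\alpha_0^2(s - \gamma_0)^2}{\left(e^{-\frac{1}{2}\beta_0(s-\gamma_0)} + e^{\frac{1}{2}\beta_0(s-\gamma_0)}\right)^4}\,ds \\
&= \frac{8\alpha_0^2}{\beta_0^3}\int_{-\infty}^{\infty}\frac{m^2}{(e^{-m} + e^{m})^4}\,dm \\
&= \frac{\alpha_0^2(\pi^2 - 6)}{18\beta_0^3},
\end{aligned}
\]

\[
\begin{aligned}
\int_{-\infty}^{\infty}\dot f_{2\gamma}^*\dot f_{2\gamma}^*\,ds
&= \int_{-\infty}^{\infty}\frac{\alpha_0^2\beta_0^2}{\left(e^{-\frac{1}{2}\beta_0(\gamma_0-s)} + e^{\frac{1}{2}\beta_0(\gamma_0-s)}\right)^4}\,ds \\
&= 2\alpha_0^2\beta_0\int_{-\infty}^{\infty}\frac{1}{(e^{m} + e^{-m})^4}\,dm \\
&= \frac{\alpha_0^2\beta_0}{6},
\end{aligned}
\]

\[
\begin{aligned}
\int_{-\infty}^{\infty}\dot f_{2\beta}^*\dot f_{2\gamma}^*\,ds
&= \int_{-\infty}^{\infty}\frac{\alpha_0^2\beta_0(s - \gamma_0)}{\left(e^{-\frac{1}{2}\beta_0(\gamma_0-s)} + e^{\frac{1}{2}\beta_0(\gamma_0-s)}\right)^4}\,ds \\
&= \int_{-\infty}^{\infty}\frac{\alpha_0^2\beta_0 m}{\left(e^{-\frac{1}{2}\beta_0 m} + e^{\frac{1}{2}\beta_0 m}\right)^4}\,dm \\
&= 0.
\end{aligned}
\]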

Now the stated result follows immediately.
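The three closed forms above are easy to confirm numerically. The following sketch is only an illustration and not part of the paper: it integrates the three integrands with a composite Simpson rule and compares them with $\alpha_0^2(\pi^2-6)/(18\beta_0^3)$, $\alpha_0^2\beta_0/6$ and $0$. The helper names, the truncation of the integration range to $\gamma_0 \pm 40/\beta_0$, and the parameter values used in the check are all assumptions made for the example, and $\beta_0 > 0$ is assumed throughout.

```python
import math

def simpson(g, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def numeric_integrals(alpha0, beta0, gamma0):
    """Integrate the beta-beta, gamma-gamma and beta-gamma integrands numerically."""
    def denom(s):
        z = 0.5 * beta0 * (s - gamma0)
        return (math.exp(-z) + math.exp(z)) ** 4

    def bb(s):
        return alpha0 ** 2 * (s - gamma0) ** 2 / denom(s)

    def gg(s):
        return alpha0 ** 2 * beta0 ** 2 / denom(s)

    def bg(s):
        return alpha0 ** 2 * beta0 * (s - gamma0) / denom(s)

    # Truncated range: the integrands decay like exp(-2 * beta0 * |s - gamma0|),
    # so the neglected tails are far below double precision.
    half_width = 40.0 / beta0
    a, b = gamma0 - half_width, gamma0 + half_width
    return simpson(bb, a, b), simpson(gg, a, b), simpson(bg, a, b)

def closed_forms(alpha0, beta0):
    """Closed-form values stated in the text."""
    bb = alpha0 ** 2 * (math.pi ** 2 - 6) / (18 * beta0 ** 3)
    gg = alpha0 ** 2 * beta0 / 6
    return bb, gg, 0.0

numeric = numeric_integrals(2.0, 1.5, 0.3)
exact = closed_forms(2.0, 1.5)
print(numeric, exact)
```

The cross term vanishes because its integrand is odd about $\gamma_0$, which is why the limiting covariance matrix of the estimators of $\beta$ and $\gamma$ is diagonal.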

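Proof of Corollary 3. Let

\[
\hat\sigma_n^2 = \frac{1}{n}\sum_{t=1}^{n}\left(y_t - f(x_t,\hat\theta_n)\right)^2
\quad\text{and}\quad
\tilde\sigma_n^2 = \frac{1}{n}\sum_{t=1}^{n}\left(y_t - f(x_t,\theta_0)\right)^2.
\]

It follows from Assumption 1(b) that $\tilde\sigma_n^2 \to_p \sigma_u^2$. Therefore, we only need to show that $\hat\sigma_n^2 - \tilde\sigma_n^2 \to_p 0$ to finish the proof.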

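\[
\begin{aligned}
\hat\sigma_n^2 - \tilde\sigma_n^2
&= \frac{1}{n}\left(\sum_{t=1}^{n}\left(y_t - f(x_t,\hat\theta_n)\right)^2 - \sum_{t=1}^{n}u_t^2\right) \\
&= \frac{1}{n}\left(\sum_{t=1}^{n}\left(f(x_t,\theta_0) + u_t - f(x_t,\hat\theta_n)\right)^2 - \sum_{t=1}^{n}u_t^2\right) \\
&= \frac{1}{n}\sum_{t=1}^{n}\left(f(x_t,\hat\theta_n) - f(x_t,\theta_0)\right)^2 + \frac{2}{n}\sum_{t=1}^{n}\left(f(x_t,\theta_0) - f(x_t,\hat\theta_n)\right)u_t.
\end{aligned}
\]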
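The decomposition above is a purely algebraic identity, so it can be checked on simulated data. The sketch below is not from the paper and makes several labeled assumptions: the regression function is taken to be $\mu + \alpha/(1 + e^{-\beta(x-\gamma)})$, the regressor is a Gaussian random walk, and `theta_hat` is an arbitrary perturbation of `theta0` rather than an actual NLS estimate. All function names are hypothetical.

```python
import math
import random

def logistic(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def f(x, theta):
    """Assumed regression function: mu + alpha * logistic(beta * (x - gamma))."""
    mu, alpha, beta, gamma = theta
    return mu + alpha * logistic(beta * (x - gamma))

def variance_gap_both_ways(theta0, theta_hat, n=500, seed=1):
    """Return the variance gap computed directly and via the two-term decomposition."""
    rng = random.Random(seed)
    x, xs, us = 0.0, [], []
    for _ in range(n):
        x += rng.gauss(0.0, 1.0)        # integrated regressor (random walk)
        xs.append(x)
        us.append(rng.gauss(0.0, 0.5))  # regression errors u_t
    ys = [f(xt, theta0) + ut for xt, ut in zip(xs, us)]
    # Direct form: mean squared residual at theta_hat minus mean of u_t^2.
    direct = (sum((y - f(xt, theta_hat)) ** 2 for y, xt in zip(ys, xs))
              - sum(u * u for u in us)) / n
    # Decomposed form: quadratic term plus cross term.
    quad = sum((f(xt, theta_hat) - f(xt, theta0)) ** 2 for xt in xs) / n
    cross = 2.0 * sum((f(xt, theta0) - f(xt, theta_hat)) * u
                      for xt, u in zip(xs, us)) / n
    return direct, quad + cross

direct, decomposed = variance_gap_both_ways((0.5, 2.0, 1.0, 0.0), (0.6, 1.9, 1.1, 0.1))
print(direct, decomposed)
```

The two numbers agree up to floating-point rounding for any choice of `theta_hat`. The proof therefore only has to show that the quadratic term and the cross term each vanish in probability.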

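Therefore, we only need to show that $\frac{1}{n}\sum_{t=1}^{n}\left(f(x_t,\hat\theta_n) - f(x_t,\theta_0)\right)^2 = o_p(1)$ and $\frac{1}{n}\sum_{t=1}^{n}\left(f(x_t,\theta_0) - f(x_t,\hat\theta_n)\right)u_t = o_p(1)$ to complete the proof. Since $f(x_t,\hat\theta_n) - f(x_t,\theta_0) = (\hat\theta_n - \theta_0)'\dot f(x_t,\bar\theta_n)$ for some $\bar\theta_n$ between $\hat\theta_n$ and $\theta_0$, we have

\[
\begin{aligned}
\frac{1}{n}\sum_{t=1}^{n}\left(f(x_t,\hat\theta_n) - f(x_t,\theta_0)\right)^2
&= \frac{1}{n}\left|\sum_{t=1}^{n}\left(f(x_t,\hat\theta_n) - f(x_t,\theta_0)\right)^2\right| \\
&\le \frac{1}{n}\sum_{t=1}^{n}\left|(\hat\theta_n - \theta_0)'\cdot\dot f(x_t,\bar\theta_n)\right|^2 \\
&\le \|\hat\theta_n - \theta_0\|^2\cdot\frac{1}{n}\sum_{t=1}^{n}\left\|\dot f(x_t,\bar\theta_n)\right\|^2 \\
&= \left((\hat\mu_n - \mu_0)^2 + (\hat\alpha_n - \alpha_0)^2 + (\hat\beta_n - \beta_0)^2 + (\hat\gamma_n - \gamma_0)^2\right) \\
&\quad\cdot\frac{1}{n}\sum_{t=1}^{n}\left(\dot f_{\mu n}^2 + \dot f_{\alpha n}^2 + \dot f_{\beta n}^2 + \dot f_{\gamma n}^2\right).
\end{aligned}
\]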

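Here $\dot f_{\mu n}$, $\dot f_{\alpha n}$, $\dot f_{\beta n}$ and $\dot f_{\gamma n}$ are the partial derivatives of $f(x_t,\theta)$ with respect to $\mu$, $\alpha$, $\beta$ and $\gamma$, evaluated at $\bar\theta_n$. It is easy to check that $\dot f_{\mu n}$ and $\dot f_{\alpha n}$ are $H$-regular functions, both with asymptotic order $\kappa = 1$. As a result, $\dot f_{\mu n}^2$ and $\dot f_{\alpha n}^2$ are also $H$-regular functions with asymptotic order $\kappa = 1$. On the other hand, $\dot f_{\beta n}$ and $\dot f_{\gamma n}$ are $I$-regular functions, and so are $\dot f_{\beta n}^2$ and $\dot f_{\gamma n}^2$. Since the convergence rates of the sample mean for $I$- and $H$-regular functions are $n^{-1/2}$ and $n^{-1}\kappa^{-1}$, respectively, according to the asymptotic theory for regular functions derived in Park and Phillips (2001), we can easily deduce that

\[
\frac{1}{n}\sum_{t=1}^{n}\left(\dot f_{\mu n}^2 + \dot f_{\alpha n}^2 + \dot f_{\beta n}^2 + \dot f_{\gamma n}^2\right) = O_p(1). \tag{14}
\]

Because of the consistency of $\hat\theta_n$, we have

\[
(\hat\mu_n - \mu_0)^2 + (\hat\alpha_n - \alpha_0)^2 + (\hat\beta_n - \beta_0)^2 + (\hat\gamma_n - \gamma_0)^2 = o_p(1). \tag{15}
\]

The result $\frac{1}{n}\sum_{t=1}^{n}\left(f(x_t,\hat\theta_n) - f(x_t,\theta_0)\right)^2 = o_p(1)$ therefore follows immediately from (14) and (15).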

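Moreover,

\[
\begin{aligned}
\frac{1}{n}\sum_{t=1}^{n}\left(f(x_t,\theta_0) - f(x_t,\hat\theta_n)\right)u_t
&= -\frac{1}{n}\sum_{t=1}^{n}(\hat\theta_n - \theta_0)'\cdot\dot f(x_t,\bar\theta_n)u_t \\
&= -(\hat\theta_n - \theta_0)'\cdot\frac{1}{n}\sum_{t=1}^{n}\dot f(x_t,\bar\theta_n)u_t.
\end{aligned}
\]

Here $\dot f(x_t,\bar\theta_n)$ is a vector whose elements are regular functions. Again, according to the asymptotic theory for regular functions, the convergence rates for the sample covariance of $I$- and $H$-regular functions are $n^{-1/4}$ and $n^{-1/2}\kappa^{-1}$. It follows that

\[
\frac{1}{n}\sum_{t=1}^{n}\dot f(x_t,\bar\theta_n)u_t = o_p(1).
\]

Since $\hat\theta_n - \theta_0$ is also $o_p(1)$ by the consistency of $\hat\theta_n$, the proof is complete.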

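Proof of Theorem 4. Theorem 2 shows that

\[
\sqrt{n}(\hat\mu_n - \mu_0) \to_d \left(\int_0^1 1(V)^2\right)^{-1}\int_0^1 1(V)\,dU.
\]

Let $s(\hat\mu_n)$ be the estimator of the standard error of $\hat\mu_n$ and $\hat\sigma_u^2$ be the OLS estimator of $\sigma_u^2$. Then

\[
\sqrt{n}\,s(\hat\mu_n) = \left(1 - \frac{1}{n}\sum_{t=1}^{n}1\{x_t \ge 0\}\right)^{-1/2}\cdot\hat\sigma_u.
\]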


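Therefore,

\[
\begin{aligned}
t_n(\mu) = \frac{\hat\mu_n - \mu_0}{s(\hat\mu_n)}
&\to_d \frac{1}{\sigma_u}\left(\int_0^1 1(V)^2\right)^{-1/2}\int_0^1 1(V)\,dU \\
&= \frac{1}{\sigma_u}\left(\int_0^1 1(V)^2\right)^{-1/2}\int_0^1 1(V)\,d\!\left(\rho\,\frac{\sigma_u}{\sigma_v}\,V + \sqrt{1-\rho^2}\,\sigma_u\,P\right) \\
&= \rho\left(\int_0^1 1\!\left(\frac{V}{\sigma_v}\right)^2\right)^{-1/2}\int_0^1 1\!\left(\frac{V}{\sigma_v}\right)d\frac{V}{\sigma_v}
 + \sqrt{1-\rho^2}\left(\int_0^1 1(V)^2\right)^{-1/2}\int_0^1 1(V)\,dP \\
&= \sqrt{1-\rho^2}\,P(1) + \rho\,\frac{\int_0^1 1(Q)\,dQ}{\left(\int_0^1 1(Q)^2\right)^{1/2}},
\end{aligned}
\]

where $Q = V/\sigma_v$ and $P$ are independent standard Brownian motions. The last equality holds in distribution: because $P$ is independent of $V$, the normalized integral with respect to $P$ is standard normal, like $P(1)$. The limiting distribution of the t-statistic for the NLS estimator of $\alpha$ can be derived in the same way. The limiting distributions of the t-statistics for $\hat\beta_n$ and $\hat\gamma_n$ are easy to obtain, since their asymptotic distributions are mixed normal. Because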

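\[
\sqrt[4]{n}(\hat\beta_n - \beta_0) \to_d MN\left(0,\ \left(\frac{\alpha_0^2(\pi^2 - 6)}{18\beta_0^3}\,L(1,0)\right)^{-1}\sigma_u^2\right)
\]

and

\[
\sqrt[4]{n}\cdot s(\hat\beta_n) \to_p \left(\frac{\alpha_0^2(\pi^2 - 6)}{18\beta_0^3}\,L(1,0)\right)^{-1/2}\sigma_u,
\]

it follows directly that

\[
t_n(\beta) \to_d N(0,1).
\]

In the same way, we can show that

\[
t_n(\gamma) \to_d N(0,1).
\]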
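The nonnormal limit of $t_n(\mu)$ can be explored by simulation. The sketch below is an illustration under stated assumptions, not the authors' Monte Carlo design: it discretizes $Q$ on a grid of `n` steps, approximates the stochastic integral $\int_0^1 1(Q)\,dQ$ by a left-endpoint sum with $1(Q) = 1 - 1\{Q \ge 0\}$, and draws $P(1)$ as an independent standard normal. The function names, grid size and number of replications are hypothetical choices, and paths on which the denominator is zero on the grid are simply discarded.

```python
import math
import random

def t_limit_draw(rho, n, rng):
    """One approximate draw of sqrt(1-rho^2) P(1) + rho * int 1(Q) dQ / (int 1(Q)^2)^(1/2)."""
    q = 0.0
    num = 0.0  # left-endpoint sum approximating the integral of 1(Q) dQ
    den = 0.0  # Riemann sum approximating the integral of 1(Q)^2 dr
    scale = 1.0 / math.sqrt(n)
    for _ in range(n):
        dq = rng.gauss(0.0, 1.0) * scale
        ind = 0.0 if q >= 0.0 else 1.0  # 1(Q) = 1 - 1{Q >= 0}
        num += ind * dq
        den += ind / n
        q += dq
    if den == 0.0:
        return None  # Q never negative on the grid: ratio undefined
    p1 = rng.gauss(0.0, 1.0)  # P(1), independent of Q
    return math.sqrt(1.0 - rho ** 2) * p1 + rho * num / math.sqrt(den)

def simulate_t_limit(rho, reps=1000, n=200, seed=0):
    """Collect the well-defined draws from `reps` simulated paths."""
    rng = random.Random(seed)
    draws = (t_limit_draw(rho, n, rng) for _ in range(reps))
    return [d for d in draws if d is not None]

def sample_mean(values):
    return sum(values) / len(values)

def sample_variance(values):
    m = sample_mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

for rho in (0.0, 0.8):
    draws = simulate_t_limit(rho)
    print(rho, round(sample_mean(draws), 3), round(sample_variance(draws), 3))
```

With `rho = 0.0` the draws are standard normal by construction. For nonzero `rho` the second term enters, and comparing the two sets of draws gives a rough picture of how the correlation between the regressor innovations and the regression errors distorts the t-statistic.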

Proof of Theorem 5. For part (a), we define $d_{nt} = 1 - 1\{x_t \ge 0\}$ and $d(r) = 1(V) = 1 - 1\{V \ge 0\}$, and for part (b), we define $d_{nt} = 1\{x_t \ge 0\} - \frac{1}{n}\sum_{t=1}^{n}1\{x_t \ge 0\}$ and $d(r) = 1(V) = 1\{V \ge 0\} - \int_0^1 1\{V \ge 0\}$. The results follow immediately from the proof of part (b) of Theorem 10 in Chang et al. (2001). For the proof of parts (c) and (d), the reader is referred to the proof of part (a) of Theorem 10 in Chang et al. (2001).


References

Akonom, J. (1993). Comportement asymptotique du temps d’occupation du processus des sommes partielles. Annales de l’Institut Henri Poincaré 29, 57–81.

Andrews, D.W.K. and C.J. McDermott (1995). Nonlinear econometric models with deterministically trending variables. Review of Economic Studies 62, 343–360.

Bates, D.M. and D.G. Watts (1988). Nonlinear Regression Analysis and Its Applications. Wiley, New York.

Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.

Chang, Y. and J.Y. Park (2003). Index models with integrated time series. Journal of Econometrics 114, 73–106.

Chang, Y., J.Y. Park and P.C.B. Phillips (2001). Nonlinear econometric models with cointegrated and deterministically trending regressors. Econometrics Journal 4, 1–36.

Chung, K.L. and R.J. Williams (1990). Introduction to Stochastic Integration, 2nd ed. Birkhäuser, Boston.

Granger, C.W.J. (1995). Nonlinear relationships between extended-memory variables. Econometrica 63, 265–280.

Hansen, B.E. (1992). Convergence to stochastic integrals for dependent heterogeneous processes. Econometric Theory 8, 489–500.

Hansen, L.P. (1982). Large sample properties of generalized method of moments estimators. Econometrica 50, 1029–1054.

Jennrich, R.I. (1969). Asymptotic properties of non-linear least squares estimation. Annals of Mathematical Statistics 40, 633–643.

Karatzas, I. and S.E. Shreve (1988). Brownian Motion and Stochastic Calculus. Springer-Verlag, New York.

Malinvaud, E. (1970). The consistency of nonlinear regressions. Annals of Mathematical Statistics 41, 956–969.

Park, J.Y. and P.C.B. Phillips (1988). Statistical inference in regressions with integrated processes: Part 1. Econometric Theory 4, 468–497.

Park, J.Y. and P.C.B. Phillips (1999). Asymptotics for nonlinear transformations of integrated time series. Econometric Theory 15, 269–298.

Park, J.Y. and P.C.B. Phillips (2001). Nonlinear regressions with integrated time series. Econometrica 69, 117–161.


Phillips, P.C.B. (1986). Understanding spurious regressions in econometrics. Journal of Econometrics 33, 311–340.

Phillips, P.C.B. (1987). Time series regression with a unit root. Econometrica 55, 277–301.

Phillips, P.C.B. (1991). Optimal inference in cointegrated systems. Econometrica 59, 283–306.

Phillips, P.C.B. and S.N. Durlauf (1986). Multiple time series with integrated variables. Review of Economic Studies 53, 473–496.

Phillips, P.C.B. and J.Y. Park (1998). Nonstationary density estimation and kernel autoregression. Yale University, mimeographed.

Phillips, P.C.B. and W. Ploberger (1996). An asymptotic theory of Bayesian inference for time series. Econometrica 64, 381–412.

Phillips, P.C.B. and V. Solo (1992). Asymptotics for linear processes. Annals of Statistics 20, 971–1001.

Pollard, D. (1984). Convergence of Stochastic Processes. Springer-Verlag, New York.

Revuz, D. and M. Yor (1994). Continuous Martingales and Brownian Motion, 2nd ed. Springer-Verlag, New York.

Saikkonen, P. (1995). Problems with the asymptotic theory of maximum likelihood estimation in integrated and cointegrated systems. Econometric Theory 11, 888–911.

Shorack, G.R. and J.A. Wellner (1986). Empirical Processes with Applications to Statistics. Wiley, New York.

Wooldridge, J.M. (1994). Estimation and inference for dependent processes. In R.F. Engle and D.L. McFadden (eds.), Handbook of Econometrics, Vol. IV, pp. 2639–2738. Elsevier, Amsterdam.

Wu, C.F. (1981). Asymptotic theory of nonlinear least squares estimation. Annals of Statistics 9, 501–513.
