Statistics and Probability Letters 81 (2011) 250–258
Some inequalities for strong mixing random variables with applications to density estimation✩
Yongming Li a,∗, Shanchao Yang b, Chengdong Wei c
a Department of Mathematics, Shangrao Normal University, Shangrao, Jiangxi 334001, PR China
b Department of Mathematics, Guangxi Normal University, Guilin, Guangxi 541004, PR China
c School of Mathematical Science, Guangxi Teachers Education University, Nanning, Guangxi 530004, PR China
Article info

Article history:
Received 19 June 2010
Received in revised form 5 October 2010
Accepted 6 October 2010
Available online 14 October 2010

MSC:
60E15
62G05
62G20

Keywords:
Strong mixing processes
Esseen-type inequality
Kernel estimate

Abstract
In this paper, we establish an inequality of the characteristic functions for strongly mixing random vectors, by which an upper bound is provided for the supremum of the absolute value of the difference of two multivariate probability density functions based on strongly mixing random vectors. As an application, we consider the consistency and asymptotic normality of a kernel estimate of a density function under strong mixing. Our results generalize some known results in the literature.
© 2010 Elsevier B.V. All rights reserved.
1. Introduction
Sadikova (1966) obtained a two-dimensional Esseen inequality (Sadikova's inequality) for distribution functions (d.f.'s), and Gamkrelidze (1977) generalized Sadikova's inequality to multidimensional distribution functions. It is well known that Sadikova's inequality has proved useful in nonparametric estimation of d.f.'s and probability density functions (p.d.f.'s) in the framework of association; see, for example, Bagai and Prakasa Rao (1991) and Roussas (1991, 1995).
In a recent paper, Roussas (2001), following the ideas in Sadikova (1966), obtained an upper bound for the supremum of the absolute value of the difference of the probability density functions of two k-dimensional random vectors and, as an application, established the consistency of a kernel estimate of a p.d.f. under association. Rao (2002) obtained an alternative inequality for the supremum of the absolute value of the difference of the probability density functions of two k-dimensional random vectors.
Next, let us introduce briefly the result of Roussas (2001). Let $\xi = (\xi_1, \ldots, \xi_m)$ and $\xi' = (\xi'_1, \ldots, \xi'_m)$ be two $m$-dimensional random vectors with respective characteristic functions (ch.f.'s) $\varphi_\xi$ and $\varphi_{\xi'}$, which satisfy the following:
✩ This research is supported by the National Natural Science Foundation of China (11061029 and 11061007) and the Natural Science Foundation of Jiangxi (2008GZS0046).
∗ Corresponding author.
E-mail addresses: [email protected] (Y. Li), [email protected] (S. Yang), [email protected] (C. Wei).
0167-7152/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.spl.2010.10.004
Assumption 1.1. The ch.f.'s $\varphi_\xi$ and $\varphi_{\xi'}$ are absolutely integrable and the p.d.f.'s $f_\xi$ and $f_{\xi'}$ are bounded and satisfy a Lipschitz condition of order one; that is, for every $x \in \mathbb{R}^m$ and some finite positive constant $C$, $|f_\xi(x+u) - f_\xi(x)| \le C\sum_{j=1}^{m}|u_j|$ and $|f_{\xi'}(x+u) - f_{\xi'}(x)| \le C\sum_{j=1}^{m}|u_j|$. Then
Theorem 1.1 (Roussas, 2001). For any $T_j > 0$, $j = 1, \ldots, m$,
$$\sup_{x\in\mathbb{R}^m}|f_\xi(x) - f_{\xi'}(x)| \le \frac{1}{(2\pi)^m}\int_{-T_m}^{T_m}\cdots\int_{-T_1}^{T_1}|\varphi_\xi(t) - \varphi_{\xi'}(t)|\,dt + \frac{4C}{\sqrt{3}}\sum_{j=1}^{m}\frac{1}{T_j}. \quad (1.1)$$
Further, if $\xi_1,\ldots,\xi_m$ are associated, $\xi'_1,\ldots,\xi'_m$ are independent and $\xi'_j$ is distributed as $\xi_j$, then
$$\sup_{x_j\in\mathbb{R},\,j=1,\ldots,m}|f_{\xi_1,\ldots,\xi_m}(x_1,\ldots,x_m) - f_{\xi_1}(x_1)\cdots f_{\xi_m}(x_m)| \le \frac{1}{4\pi^m}\sum_{1\le i<j\le m}\Big(T_i^2T_j^2\prod_{l\ne i,j}T_l\Big)|\mathrm{Cov}(\xi_i,\xi_j)| + \frac{4C}{\sqrt{3}}\sum_{j=1}^{m}\frac{1}{T_j}. \quad (1.2)$$
In particular, for $m = 2$, assuming that $\mathrm{Cov}(\xi_1,\xi_2) \ne 0$, then
$$\sup_{x_1,x_2\in\mathbb{R}}|f_{\xi_1,\xi_2}(x_1,x_2) - f_{\xi_1}(x_1)f_{\xi_2}(x_2)| \le \Big(\frac{1}{4\pi^2} + \frac{8C}{\sqrt{3}}\Big)|\mathrm{Cov}(\xi_1,\xi_2)|^{1/5}. \quad (1.3)$$
As is well known, the above inequalities are used effectively in studying approximations of p.d.f.'s by way of the Central Limit Theorem. To the best of our knowledge, however, no results related to Esseen-type inequalities are available under mixing dependence, which is quite different from the dependence structure of associated or martingale difference variables.
Among the various mixing conditions used in the literature, strong mixing is reasonably weak and has many practical applications (see Cai (1998) for more details). We know that stationary autoregressive moving average processes, which are widely applied in time series analysis, are strong mixing with exponentially decaying mixing coefficients. Recently, Genon-Catalot et al. (2000) proved that continuous time diffusion models and stochastic volatility models are strongly mixing as well; these are among the most popular models in the pricing theory of financial assets, such as the Black–Scholes pricing theory of options.
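As a quick numerical illustration of the exponential decay just mentioned (a sketch of ours, not part of the paper; the AR(1) model and all parameter values are illustrative), the autocovariances of a stationary AR(1) process decay geometrically:

```python
import numpy as np

# Illustrative sketch: a stationary AR(1) process X_t = rho * X_{t-1} + eps_t
# is strong mixing with exponentially decaying mixing coefficients; its lag-k
# autocovariance rho^k * sigma^2 / (1 - rho^2) decays geometrically. We compare
# the empirical autocovariance of a long simulated path against this rate.
rng = np.random.default_rng(0)
rho, n = 0.5, 200_000
eps = rng.standard_normal(n)
x = np.empty(n)
x[0] = eps[0] / np.sqrt(1 - rho**2)   # start in the stationary distribution
for t in range(1, n):
    x[t] = rho * x[t - 1] + eps[t]

def autocov(x, k):
    """Empirical lag-k autocovariance of a sample."""
    xc = x - x.mean()
    return np.mean(xc[:-k] * xc[k:]) if k > 0 else np.mean(xc * xc)

for k in range(1, 6):
    theory = rho**k / (1 - rho**2)    # sigma^2 = 1 here
    print(k, round(autocov(x, k), 3), round(theory, 3))
```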
To meet this practical demand, in this paper we mainly discuss inequalities for the characteristic functions and probability density functions of stationary strong mixing random sequences. As an application, we establish the consistency and asymptotic normality of a kernel density estimate for strong mixing random variables.
Recall that a sequence $\{\xi_i : i \in \mathbb{Z}\}$ is said to be strong mixing (or $\alpha$-mixing) if the $\alpha$-mixing coefficient
$$\alpha(n) = \sup\{|P(AB) - P(A)P(B)| : A \in \mathcal{F}_1^k,\ B \in \mathcal{F}_{n+k}^{\infty}\}$$
converges to zero as $n \to \infty$, where $\mathcal{F}_m^n$ denotes the $\sigma$-algebra generated by $\{\xi_i : m \le i \le n\}$.

Throughout this paper, the symbol $C$ denotes a positive constant whose value may change from one place to another; $b_n = O(a_n)$ means $b_n \le Ca_n$; $x_n \sim y_n$ means that $x_n/y_n$ tends to a constant as $n \to \infty$; $[x]$ denotes the integral part of $x$; and $\|\cdot\|_r = (E|\cdot|^r)^{1/r}$.
2. Main inequalities
In this section, we establish inequalities for the supremum of the absolute value of the difference of characteristic functions of strongly mixing random vectors, and from these we derive inequalities for the supremum of the absolute value of the difference of the corresponding probability density functions.
We first give the following assumptions:
Assumption 2.1. Let $\xi = (\xi_1,\ldots,\xi_m)$ and $\xi' = (\xi'_1,\ldots,\xi'_m)$ be two $m$-dimensional random vectors, where $\xi_1,\ldots,\xi_m$ are strong mixing random variables, $\xi'_1,\ldots,\xi'_m$ are independent and $\xi'_j$ is distributed as $\xi_j$.
Assumption 2.2. The ch.f.'s $\varphi_\xi$ and $\varphi_{\xi'}$ are absolutely integrable and the p.d.f.'s $f_\xi$ and $f_{\xi'}$ are bounded and satisfy a Lipschitz condition of order one.
Remark 2.1. By Assumption 2.1, we get that $f_{\xi'_1,\ldots,\xi'_m}(x_1,\ldots,x_m) = f_{\xi_1}(x_1)\cdots f_{\xi_m}(x_m)$ and $\varphi_{\xi'_1,\ldots,\xi'_m}(t_1,\ldots,t_m) = \varphi_{\xi_1}(t_1)\cdots\varphi_{\xi_m}(t_m)$.
Our main results in this section are then the following:
Theorem 2.1. Let $\{\xi_i;\ 1 \le i \le m\}$ be stationary strong mixing random variables, and let $r > 0$, $s > 0$ satisfy $1/s + 1/r = 1$. Then there exists a constant $C$ such that
$$|\varphi_{\xi_1,\ldots,\xi_m}(t_1,\ldots,t_m) - \varphi_{\xi_1}(t_1)\cdots\varphi_{\xi_m}(t_m)| \le C\,\alpha^{1/s}(1)\sum_{i=1}^{m}|t_i|\,\|\xi_i\|_r. \quad (2.1)$$
In particular, for $\xi_i, \xi_j$ $(j > i)$, inequality (2.1) yields
$$|\varphi_{\xi_i,\xi_j}(t_i,t_j) - \varphi_{\xi_i}(t_i)\varphi_{\xi_j}(t_j)| \le C\,\alpha^{1/s}(j-i)\sum_{l=i,j}|t_l|\,\|\xi_l\|_r. \quad (2.2)$$
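Although Theorem 2.1 concerns mixing variables, the quantity it bounds can be computed exactly in a simple Gaussian toy case (our illustration, not from the paper): for a standard bivariate normal pair with correlation $\rho$, both the joint and the marginal ch.f.'s are explicit, and the gap is dominated by $|\rho\,t_1t_2|$, the same covariance-weighted, $|t_1||t_2|$-type quantity that drives the bounds here:

```python
import math

def joint_cf(t1, t2, rho):
    """Characteristic function of a standard bivariate normal with correlation rho."""
    return math.exp(-0.5 * (t1**2 + 2 * rho * t1 * t2 + t2**2))

def product_cf(t1, t2):
    """Product of the marginal N(0,1) characteristic functions."""
    return math.exp(-0.5 * t1**2) * math.exp(-0.5 * t2**2)

# For rho * t1 * t2 >= 0, the gap |phi_joint - phi_1 * phi_2| is at most
# |rho * t1 * t2|, since |e^{-a}| <= 1 and |1 - e^{-b}| <= b for b >= 0.
rho, t1, t2 = 0.1, 0.5, 0.5
gap = abs(joint_cf(t1, t2, rho) - product_cf(t1, t2))
```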
Theorem 2.2. Let $\{\xi_i;\ i \ge 1\}$ be a stationary strong mixing sequence of random variables. Let $p_0$ and $q_0$ be positive integers and set $\eta_j = \sum_{i=(j-1)(p_0+q_0)+1}^{(j-1)(p_0+q_0)+p_0}\xi_i$ for $1 \le j \le m$. If $r > 0$, $s > 0$ and $1/s + 1/r = 1$, then there exists a constant $C$ such that
$$|\varphi_{\eta_1,\ldots,\eta_m}(t,\ldots,t) - \varphi_{\eta_1}(t)\cdots\varphi_{\eta_m}(t)| \le C\,|t|\,\alpha^{1/s}(q_0)\sum_{j=1}^{m}\|\eta_j\|_r.$$
Theorem 2.3. Assume that Assumptions 2.1 and 2.2 hold. Then for any $T_j > 0$, $j = 1,\ldots,m$, we have
$$\sup_{x_j\in\mathbb{R},\,j=1,\ldots,m}|f_{\xi_1,\ldots,\xi_m}(x_1,\ldots,x_m) - f_{\xi_1}(x_1)\cdots f_{\xi_m}(x_m)| \le \frac{1}{2\pi^m}\,\alpha^{1/s}(1)\sum_{i=1}^{m}\Big(T_i^2\prod_{l\ne i}T_l\Big)\|\xi_i\|_r + \frac{4C}{\sqrt{3}}\sum_{j=1}^{m}\frac{1}{T_j}.$$
Theorem 2.4. Assume that Assumptions 2.1 and 2.2 hold. If $\alpha^{1/s}(|j-i|) \ne 0$, then we have
$$\sup_{x_i,x_j\in\mathbb{R}}|f_{\xi_i,\xi_j}(x_i,x_j) - f_{\xi_i}(x_i)f_{\xi_j}(x_j)| \le \Big(\frac{1}{2\pi^2}\sum_{l=i,j}\|\xi_l\|_r + \frac{8C}{\sqrt{3}}\Big)\alpha^{\frac{1}{4s}}(|j-i|).$$
Now, we provide the proofs of the theorems above.
Proof of Theorem 2.1. We only give the proof of inequality (2.1); the proof of inequality (2.2) is analogous. It is easy to see that
$$I_0 := |\varphi_{\xi_1,\ldots,\xi_m}(t_1,\ldots,t_m) - \varphi_{\xi_1}(t_1)\cdots\varphi_{\xi_m}(t_m)| \le |\varphi_{\xi_1,\ldots,\xi_m}(t_1,\ldots,t_m) - \varphi_{\xi_1,\ldots,\xi_{m-1}}(t_1,\ldots,t_{m-1})\,\varphi_{\xi_m}(t_m)| + |\varphi_{\xi_1,\ldots,\xi_{m-1}}(t_1,\ldots,t_{m-1}) - \varphi_{\xi_1}(t_1)\cdots\varphi_{\xi_{m-1}}(t_{m-1})| =: I_1 + I_2. \quad (2.3)$$
Noting that $\exp(ix) = \cos(x) + i\sin(x)$, $\sin(x+y) = \sin(x)\cos(y) + \cos(x)\sin(y)$ and $\cos(x+y) = \cos(x)\cos(y) - \sin(x)\sin(y)$, we have that
$$I_1 = \Big|E\exp\Big(i\sum_{l=1}^{m}t_l\xi_l\Big) - E\exp\Big(i\sum_{l=1}^{m-1}t_l\xi_l\Big)E\exp(it_m\xi_m)\Big|$$
$$\le \Big|\mathrm{Cov}\Big(\cos\Big(\sum_{l=1}^{m-1}t_l\xi_l\Big),\cos(t_m\xi_m)\Big)\Big| + \Big|\mathrm{Cov}\Big(\sin\Big(\sum_{l=1}^{m-1}t_l\xi_l\Big),\sin(t_m\xi_m)\Big)\Big| + \Big|\mathrm{Cov}\Big(\sin\Big(\sum_{l=1}^{m-1}t_l\xi_l\Big),\cos(t_m\xi_m)\Big)\Big| + \Big|\mathrm{Cov}\Big(\cos\Big(\sum_{l=1}^{m-1}t_l\xi_l\Big),\sin(t_m\xi_m)\Big)\Big| =: I_{11} + I_{12} + I_{13} + I_{14}. \quad (2.4)$$
By Lemma A.1 in the Appendix and $|\sin(x)| \le |x|$, we have
$$I_{12} \le C\alpha^{1/s}(1)\|\sin(t_m\xi_m)\|_r \le C\alpha^{1/s}(1)|t_m|\,\|\xi_m\|_r, \qquad I_{14} \le C\alpha^{1/s}(1)|t_m|\,\|\xi_m\|_r. \quad (2.5)$$
Now, note that $\cos(2x) = 1 - 2\sin^2(x)$. Then it follows that
$$I_{11} = \Big|\mathrm{Cov}\Big(\cos\Big(\sum_{l=1}^{m-1}t_l\xi_l\Big),\,1 - 2\sin^2(t_m\xi_m/2)\Big)\Big| = 2\Big|\mathrm{Cov}\Big(\cos\Big(\sum_{l=1}^{m-1}t_l\xi_l\Big),\,\sin^2(t_m\xi_m/2)\Big)\Big|$$
$$\le C\alpha^{1/s}(1)E^{1/r}\big[|\sin(t_m\xi_m/2)|^{2r}\big] \le C\alpha^{1/s}(1)E^{1/r}\big[|\sin(t_m\xi_m/2)|^{r}\big] \le C\alpha^{1/s}(1)|t_m|\,\|\xi_m\|_r. \quad (2.6)$$
Similarly,
$$I_{13} \le C\alpha^{1/s}(1)|t_m|\,\|\xi_m\|_r. \quad (2.7)$$
From (2.4)–(2.7), we obtain
$$I_1 \le C\alpha^{1/s}(1)|t_m|\,\|\xi_m\|_r. \quad (2.8)$$
Thus, from (2.3) and (2.8), we obtain
$$I_0 = |\varphi_{\xi_1,\ldots,\xi_m}(t_1,\ldots,t_m) - \varphi_{\xi_1}(t_1)\cdots\varphi_{\xi_m}(t_m)| \le C\alpha^{1/s}(1)|t_m|\,\|\xi_m\|_r + I_2. \quad (2.9)$$
For $I_2$ in (2.9), using the same decomposition as in (2.3) above, we obtain
$$I_2 = |\varphi_{\xi_1,\ldots,\xi_{m-1}}(t_1,\ldots,t_{m-1}) - \varphi_{\xi_1}(t_1)\cdots\varphi_{\xi_{m-1}}(t_{m-1})| \le |\varphi_{\xi_1,\ldots,\xi_{m-1}}(t_1,\ldots,t_{m-1}) - \varphi_{\xi_1,\ldots,\xi_{m-2}}(t_1,\ldots,t_{m-2})\,\varphi_{\xi_{m-1}}(t_{m-1})| + |\varphi_{\xi_1,\ldots,\xi_{m-2}}(t_1,\ldots,t_{m-2}) - \varphi_{\xi_1}(t_1)\cdots\varphi_{\xi_{m-2}}(t_{m-2})| =: I_3 + I_4, \quad (2.10)$$
and similarly to the calculation of $I_1$, we get $I_3 \le C\alpha^{1/s}(1)|t_{m-1}|\,\|\xi_{m-1}\|_r$. Thus, we obtain
$$I_2 \le C\alpha^{1/s}(1)|t_{m-1}|\,\|\xi_{m-1}\|_r + I_4. \quad (2.11)$$
From (2.9)–(2.11), repeating the above procedure, we complete the proof of (2.1). $\square$
Proof of Theorem 2.2. The proof is almost identical to that of Theorem 2.1, so we omit the details. $\square$
Proof of Theorem 2.3. By the Esseen-type inequality (1.1) in Theorem 1.1, we have
$$\sup_{x_j\in\mathbb{R},\,j=1,\ldots,m}|f_{\xi_1,\ldots,\xi_m}(x_1,\ldots,x_m) - f_{\xi_1}(x_1)\cdots f_{\xi_m}(x_m)| \le \frac{1}{(2\pi)^m}\int_{-T_m}^{T_m}\cdots\int_{-T_1}^{T_1}|\varphi_{\xi_1,\ldots,\xi_m}(t_1,\ldots,t_m) - \varphi_{\xi_1}(t_1)\cdots\varphi_{\xi_m}(t_m)|\,dt_1\cdots dt_m + \frac{4C}{\sqrt{3}}\sum_{j=1}^{m}\frac{1}{T_j}. \quad (2.12)$$
Thus, the proof of the theorem is completed by (2.1) in Theorem 2.1. $\square$
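For the reader's convenience, the elementary integration step behind this bound (spelled out here; it is left implicit in the proof) is the following: substituting (2.1) into (2.12) and integrating term by term gives
$$\frac{C\alpha^{1/s}(1)}{(2\pi)^m}\sum_{i=1}^{m}\|\xi_i\|_r\int_{-T_m}^{T_m}\cdots\int_{-T_1}^{T_1}|t_i|\,dt_1\cdots dt_m = \frac{C\alpha^{1/s}(1)}{(2\pi)^m}\sum_{i=1}^{m}\|\xi_i\|_r\,T_i^2\prod_{l\ne i}2T_l = \frac{C}{2\pi^m}\,\alpha^{1/s}(1)\sum_{i=1}^{m}\Big(T_i^2\prod_{l\ne i}T_l\Big)\|\xi_i\|_r,$$
using $\int_{-T_i}^{T_i}|t_i|\,dt_i = T_i^2$ and $\prod_{l\ne i}2T_l = 2^{m-1}\prod_{l\ne i}T_l$, which is the first term in the bound of Theorem 2.3 (with the constant absorbed into $C$).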
Proof of Theorem 2.4. By Theorem 2.3, we obtain
$$\sup_{x_i,x_j\in\mathbb{R}}|f_{\xi_i,\xi_j}(x_i,x_j) - f_{\xi_i}(x_i)f_{\xi_j}(x_j)| \le \frac{T_iT_j}{2\pi^2}\,\alpha^{1/s}(|j-i|)\sum_{l=i,j}T_l\|\xi_l\|_r + \frac{4C}{\sqrt{3}}\sum_{l=i,j}\frac{1}{T_l}. \quad (2.13)$$
Taking $T_i = T_j = (\alpha^{1/s}(|j-i|))^{-1/4}$, inequality (2.13) yields
$$\sup_{x_i,x_j\in\mathbb{R}}|f_{\xi_i,\xi_j}(x_i,x_j) - f_{\xi_i}(x_i)f_{\xi_j}(x_j)| \le \Big(\frac{1}{2\pi^2}\sum_{l=i,j}\|\xi_l\|_r + \frac{8C}{\sqrt{3}}\Big)(\alpha^{1/s}(|j-i|))^{1/4}.$$
The proof of Theorem 2.4 is completed. $\square$
3. Applications
To illustrate the application of the inequalities of Section 2, we discuss here the consistency and asymptotic normality of the kernel estimate of a probability density function based on strongly mixing random variables.
Nonparametric estimation of a probability density is an interesting problem in statistical inference and plays an important role in communication theory and pattern recognition. There are too many references to be included; we only refer to
Alejandro (1997) for the asymptotic properties of kernel-type nonparametric estimators of the derivatives of the density function of a strongly mixing time series, and to Kim and Lee (2005) for the consistency and Central Limit Theorem of the kernel density estimator for strong mixing processes.
Consider the problem of estimating the p.d.f. $f$ of the strictly stationary strong mixing random variables $X_1,\ldots,X_n$ by the kernel estimate. The kernel estimate of $f(x)$ is given by
$$f_n(x) = \frac{1}{nh_n}\sum_{j=1}^{n}K\Big(\frac{x-X_j}{h_n}\Big), \quad (3.1)$$
where $h_n$ is a sequence of positive bandwidths tending to zero as $n \to \infty$ and $K(x)$ is a known kernel function. Set
$$Z_{nj} = \frac{1}{\sqrt{nh_n}}\Big[K\Big(\frac{x-X_j}{h_n}\Big) - EK\Big(\frac{x-X_j}{h_n}\Big)\Big]; \quad \text{then} \quad f_n(x) - Ef_n(x) = \frac{1}{\sqrt{nh_n}}\sum_{j=1}^{n}Z_{nj}. \quad (3.2)$$
We split the sum $\sum_{j=1}^{n}Z_{nj}$ into large blocks and small blocks as follows. Let $0 < p = p_n < n$ and $0 < q = q_n < n$ be integers tending to $\infty$ as $n \to \infty$, with $q < p$. Let $0 \le k = k_n \to \infty$ (as $n \to \infty$) be defined by $k = [n/(p+q)]$. Then $k(p+q) \le n$ and $k(p+q)/n \to 1$. For $m = 1,\ldots,k$, split the set $\{1,2,\ldots,n\}$ into $k$ (large) $p$-blocks $I_m$ and $k$ (small) $q$-blocks $J_m$ as follows:
$$I_m = \{i;\ i = (m-1)(p+q)+1,\ldots,(m-1)(p+q)+p\}, \qquad J_m = \{j;\ j = (m-1)(p+q)+p+1,\ldots,m(p+q)\};$$
the remaining points form the set $\{l;\ k(p+q)+1 \le l \le n\}$ (which may be $\varnothing$). For $m = 1,\ldots,k$, set
$$y_{nm} = \sum_{i=(m-1)(p+q)+1}^{(m-1)(p+q)+p}Z_{ni}, \qquad y'_{nm} = \sum_{j=(m-1)(p+q)+p+1}^{m(p+q)}Z_{nj}, \qquad y''_{nk} = \sum_{l=k(p+q)+1}^{n}Z_{nl}. \quad (3.3)$$
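The block bookkeeping above is easy to get wrong in code; a small sketch (function and variable names are ours, not the paper's) that builds the blocks $I_m$, $J_m$ and the remainder set:

```python
def split_blocks(n, p, q):
    """Partition {1,...,n} into k large p-blocks I_m, k small q-blocks J_m
    (k = [n/(p+q)]) and a possibly empty remainder, as in Section 3."""
    k = n // (p + q)
    large_blocks = [list(range((m - 1) * (p + q) + 1,
                               (m - 1) * (p + q) + p + 1)) for m in range(1, k + 1)]
    small_blocks = [list(range((m - 1) * (p + q) + p + 1,
                               m * (p + q) + 1)) for m in range(1, k + 1)]
    remainder = list(range(k * (p + q) + 1, n + 1))
    return large_blocks, small_blocks, remainder

I, J, rest = split_blocks(n=100, p=7, q=3)
# k = 10 blocks of each kind; here 10 * (7 + 3) = 100, so the remainder is empty.
```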
Also set
$$S_n = \sum_{j=1}^{n}Z_{nj}, \qquad T_n = \sum_{m=1}^{k}y_{nm}, \qquad T'_n = \sum_{m=1}^{k}y'_{nm}, \qquad T''_n = y''_{nk}, \quad (3.4)$$
so that
$$S_n = T_n + T'_n + T''_n. \quad (3.5)$$
We will make use of the following assumptions gathered together for easy reference.
Assumption 3.1. (i) The random variables $X_1, X_2, \ldots$ form a strictly stationary strong mixing sequence, having the p.d.f. $f$ with a bounded derivative on $\mathbb{R}$.
(ii) For each $j \ge 2$, the joint p.d.f.'s $f_{X_1,X_j}(x_1,x_j)$ are bounded and satisfy a Lipschitz condition of order one, and the ch.f.'s $\varphi_{X_1,X_j}$ are absolutely integrable.
(iii) There exists $\delta > 0$ such that $\sum_{n=1}^{\infty}\alpha(n)^{\delta/(2+\delta)} < \infty$.
(iv) $\sup_{j\ge 1}E|X_j|^r < \infty$, where $s > 0$, $r > 0$ with $1/s + 1/r = 1$.
Assumption 3.2. The function $K(\cdot)$ is a known p.d.f. such that $K(\cdot) \le C$, $|u|K(u) \to 0$ as $|u| \to \infty$, $\int_{\mathbb{R}}|u|K(u)\,du < \infty$ and $\int_{\mathbb{R}}|u|K^2(u)\,du < \infty$.
Assumption 3.3. The bandwidth $h_n$ is such that $0 < h_n \to 0$ and $nh_n \to \infty$.
Assumption 3.4. (i) $np^{-1}\alpha(q) \to 0$; (ii) $p^2/(nh_n) \to 0$.
Our main results of this section are then the following:
Theorem 3.1. Let the strong mixing random variables $X_1,\ldots,X_n$ satisfy Assumptions 1.1, 2.1 and 3.1–3.4. Then
$$f_n(x) \xrightarrow{P} f(x), \quad x \in \mathbb{R}.$$
Theorem 3.2. Let the strong mixing random variables $X_1,\ldots,X_n$ satisfy Assumptions 1.1, 2.1 and 3.1–3.4. Then, for all continuity points $x$ of $f$, $x \in C(f)$,
$$\frac{f_n(x) - Ef_n(x)}{\sqrt{\mathrm{Var}(f_n(x))}} \xrightarrow{d} N(0,1).$$
Remark 3.1. Assumptions 3.1–3.3 are fairly mild. Assumption 3.1(iii) is used by Alejandro (1997) and implies that $h_n\sum_{j=1}^{p-1}(\alpha^{1/s}(j))^{1/4} \to 0$ upon taking $\delta \le 2/(4s-1)$, where $s > 0$, $r > 0$ with $1/s + 1/r = 1$; this fact is used in the proofs of the preliminary lemmas.
Remark 3.2. Assumption 3.4 is easily satisfied. For example, with $h_n \to 0$, choose $p \sim h_n^{-\delta_1}$ and $q \sim h_n^{-\delta_2}$ with $0 < \delta_2 < \delta_1 < 1$, and let the mixing coefficients be $\alpha(n) = r_0n^{-\theta}$ ($\theta > 1$, $r_0 > 0$); then $np^{-1}\alpha(q) \sim nh_n^{\delta_1+\theta\delta_2}$. Thus Assumption 3.4(i) is satisfied provided $nh_n^{\delta_1+\theta\delta_2} \to 0$. It is easily seen that Assumption 3.4(ii) is satisfied provided $nh_n^{1+2\delta_1} \to \infty$.
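These rate conditions can be checked numerically; in the following sketch the concrete values ($h_n = n^{-1/5}$, $\delta_1 = 0.8$, $\delta_2 = 0.4$, $\theta = 15$) are our illustrative picks, chosen so that $\delta_1 + \theta\delta_2 > 5$ forces $np^{-1}\alpha(q) \to 0$:

```python
# Numeric sketch of the rate conditions in Assumption 3.4 under the choices of
# Remark 3.2 (the values h_n = n^{-1/5}, d1 = 0.8, d2 = 0.4, theta = 15 are
# illustrative, not the paper's): both quantities should shrink as n grows.
d1, d2, theta = 0.8, 0.4, 15.0

def rates(n):
    h = n ** (-1 / 5)          # bandwidth h_n -> 0
    p = h ** (-d1)             # large-block length p ~ h_n^{-d1}
    q = h ** (-d2)             # small-block length q ~ h_n^{-d2}
    alpha_q = q ** (-theta)    # polynomial mixing coefficient alpha(q) = q^{-theta}
    return n / p * alpha_q, p * p / (n * h)

r4 = rates(10**4)
r8 = rates(10**8)
```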
Now, we provide the proofs of Theorems 3.1 and 3.2.
Proof of Theorem 3.1. We observe
$$|f_n(x) - f(x)| \le |f_n(x) - Ef_n(x)| + |Ef_n(x) - f(x)|. \quad (3.6)$$
For the second term, according to Lemma A.5, we obtain
$$|Ef_n(x) - f(x)| \to 0, \quad x \in \mathbb{R}. \quad (3.7)$$
Now, we handle the first term. By the Cauchy–Schwarz inequality and Lemma A.4, we obtain $\mathrm{Cov}(T_n,T'_n) \to 0$, $\mathrm{Cov}(T_n,T''_n) \to 0$ and $\mathrm{Cov}(T'_n,T''_n) \to 0$. Then, we derive that
$$\mathrm{Var}\Big(\sum_{j=1}^{n}Z_{nj}\Big) = \mathrm{Var}(T_n) + \mathrm{Var}(T'_n) + \mathrm{Var}(T''_n) + 2\mathrm{Cov}(T_n,T'_n) + 2\mathrm{Cov}(T_n,T''_n) + 2\mathrm{Cov}(T'_n,T''_n) \to \sigma^2(x). \quad (3.8)$$
Hence, according to Assumption 3.3, we obtain
$$P(|f_n(x) - Ef_n(x)| > \varepsilon) = P\Big(\Big|\sum_{j=1}^{n}Z_{nj}\Big| > \varepsilon\sqrt{nh_n}\Big) \le \frac{1}{\varepsilon^2 nh_n}\mathrm{Var}\Big(\sum_{j=1}^{n}Z_{nj}\Big) \to 0,$$
which implies that
$$f_n(x) - Ef_n(x) \xrightarrow{P} 0, \quad x \in \mathbb{R}. \quad (3.9)$$
Therefore, Theorem 3.1 is verified from (3.6), (3.7) and (3.9). $\square$
Proof of Theorem 3.2. Let $\sigma^2(x) = f(x)\int_{\mathbb{R}}K^2(u)\,du$. Note that
$$\frac{f_n(x) - Ef_n(x)}{\sqrt{\mathrm{Var}(f_n(x))}} = \frac{\sqrt{nh_n}\,[f_n(x) - Ef_n(x)]}{\sqrt{nh_n\mathrm{Var}(f_n(x))}}. \quad (3.10)$$
By (3.8) in the proof of Theorem 3.1, we get that $nh_n\mathrm{Var}(f_n(x)) \to \sigma^2(x) \in (0,+\infty)$. Then, to prove Theorem 3.2, by means of (3.2), (3.5) and (3.10), it suffices to show that
$$\sigma^{-1}(x)\sqrt{nh_n}\,[f_n(x) - Ef_n(x)] = \sigma^{-1}(x)(T_n + T'_n + T''_n) \xrightarrow{d} N(0,1), \quad x \in C(f). \quad (3.11)$$
By Lemma A.4, we obtain that $\mathrm{Var}(\sigma^{-1}(x)T'_n) \to 0$ and $\mathrm{Var}(\sigma^{-1}(x)T''_n) \to 0$. Thus, by (3.11), all we have to do is establish that
$$\sigma^{-1}(x)T_n = \sigma^{-1}(x)\sum_{m=1}^{k}y_{nm} = \sum_{m=1}^{k}\xi_{nm} \xrightarrow{d} N(0,1). \quad (3.12)$$
To establish the asymptotic normality of $\sum_{m=1}^{k}\xi_{nm}$, assume that $\{\eta_{nm} : m = 1,\ldots,k\}$ are independent random variables such that the distribution of $\eta_{nm}$ is the same as that of $\xi_{nm}$ for $m = 1,\ldots,k$. Then $E\eta_{nm} = 0$ and $\mathrm{Var}(\eta_{nm}) = \mathrm{Var}(\xi_{nm})$. Let $s_n^2 = \sum_{m=1}^{k}\mathrm{Var}(\eta_{nm})$; then, by Lemma A.3(i), we know that $s_n^2 \to 1$. Setting $T_{nm} = \eta_{nm}/s_n$, we obtain that $\{T_{nm} : m = 1,\ldots,k\}$ is an independent random sequence with $ET_{nm} = 0$ and $\sum_{m=1}^{k}\mathrm{Var}(T_{nm}) = 1$.
Noticing that $\varphi_{\sum_{m=1}^{k}\xi_{nm}}(t) = \varphi_{\xi_{n1},\ldots,\xi_{nk}}(t,\ldots,t)$, we obtain
$$\Big|\varphi_{\sum_{m=1}^{k}\xi_{nm}}(t) - \exp\Big(-\frac{t^2}{2}\Big)\Big| \le |\varphi_{\xi_{n1},\ldots,\xi_{nk}}(t,\ldots,t) - \varphi_{\xi_{n1}}(t)\cdots\varphi_{\xi_{nk}}(t)| + \Big|\varphi_{T_{n1}}(t)\cdots\varphi_{T_{nk}}(t) - \exp\Big(-\frac{t^2}{2}\Big)\Big| =: T_1 + T_2.$$
Thus, convergence (3.12) will be established by showing that $T_1 \to 0$ and $T_2 \to 0$.
Applying Theorem 2.2 and Lemma A.2(i), we get
$$T_1 \le C|t|\,\alpha^{1/2}(q)\sum_{m=1}^{k}\|\xi_{nm}\|_2 \le C|t|\,\alpha^{1/2}(q)\sum_{m=1}^{k}\Big(\sum_{i=k_m}^{k_m+p-1}\sigma^{-2}(x)EZ_{ni}^2\Big)^{1/2} \le C|t|\,\alpha^{1/2}(q)\sum_{m=1}^{k}\Big(\sum_{i=k_m}^{k_m+p-1}(kp)^{-1}\Big)^{1/2} \le C|t|\,(np^{-1}\alpha(q))^{1/2},$$
where $k_m = (m-1)(p+q)+1$ denotes the first index of the $m$th large block; by Assumption 3.4(i), this implies that $T_1 \to 0$.
Next, in order to prove $T_2 \to 0$, we need to show that $\sum_{m=1}^{k}T_{nm} \xrightarrow{d} N(0,1)$. According to Lemma A.3 and (3.4), it suffices to show that $\sum_{m=1}^{k}\eta_{nm} \xrightarrow{d} N(0,1)$. By the Feller–Lindeberg criterion (Loeve, 1963, p. 280), it suffices to show that, for every $\varepsilon > 0$,
$$g_n(\varepsilon) = k\int_{\{|x|\ge\varepsilon\}}x^2\,dG_n \to 0,$$
where $G_n$ is the distribution function of $\eta_{n1}$. From (3.3), $|y_{n1}| \le Cp/\sqrt{nh_n}$, where $C$ is a bound for $K(x)$, so that $|\eta_{n1}| \le C\sigma^{-1}(x)p/\sqrt{nh_n}$. Thus, by Assumption 3.4(ii), we obtain that
$$g_n(\varepsilon) = kE\big[\eta_{n1}^2 I(|\eta_{n1}|\ge\varepsilon)\big] \le \frac{C^2kp^2}{\sigma^2(x)nh_n}\,P(|\eta_{n1}| \ge \varepsilon) \le \frac{C^2kp^2}{\sigma^2(x)nh_n}\cdot\frac{\mathrm{Var}(\eta_{n1})}{\varepsilon^2} = \frac{C^2p^2}{\sigma^2(x)nh_n}\cdot\frac{k\,\mathrm{Var}(\eta_{n1})}{\varepsilon^2} \le \frac{C^2}{\varepsilon^2}\cdot\frac{p^2}{nh_n} \to 0,$$
since $k\,\mathrm{Var}(\eta_{n1}) \to 1$, which completes the proof of Theorem 3.2. $\square$
Acknowledgements
The authors would like to express their thanks to the referees and editors for their many helpful comments. We are also thankful to Guodong Xing, who gave us some helpful advice on improving the quality of the paper.
Appendix
Lemma A.1 (Roussas and Ioannides, 1987). Let $\{\xi_i, i \ge 1\}$ be a sequence of strong mixing random variables, and let $\xi \in \mathcal{F}_1^m$, $\eta \in \mathcal{F}_{m+n}^{\infty}$. If $E|\xi|^s < \infty$, $|\eta| \le B < \infty$ a.s. and $1/s + 1/r = 1$ where $s, r > 0$, then $|E(\xi\eta) - (E\xi)(E\eta)| \le 6B\alpha^{1/r}(n)\|\xi\|_s$.
Let $\sigma^2(x) = f(x)\int_{\mathbb{R}}K^2(u)\,du$. Under some or all of Assumptions 2.1, 2.2 and 3.1–3.3, we have the following lemmas.
Lemma A.2. With $Z_{nj}$ given by (3.2), the following hold: (i) $pk\mathrm{Var}(Z_{n1}) \to \sigma^2(x)$; (ii) $|\mathrm{Cov}(Z_{n1},Z_{nj})| \le Ch_nn^{-1}(\alpha^{1/s}(j-1))^{1/4}$, $j \ge 2$; (iii) $k\big|\sum_{1\le i<j\le p}\mathrm{Cov}(Z_{ni},Z_{nj})\big| \to 0$.
Lemma A.3. With $y_{nj}$ given by (3.3), the following hold: (i) $k\mathrm{Var}(y_{n1}) \to \sigma^2(x)$; (ii) $|\mathrm{Cov}(y_{n1},y_{n,l+1})| \to 0$, $l \ge 1$; (iii) $\big|\sum_{1\le l<v\le k}\mathrm{Cov}(y_{nl},y_{nv})\big| \to 0$; (iv) $k\mathrm{Var}(y'_{n1}) \to 0$; (v) $\big|\sum_{1\le l<v\le k}\mathrm{Cov}(y'_{nl},y'_{nv})\big| \to 0$.
Lemma A.4. With $T_n$, $T'_n$ and $T''_n$ given by (3.4), the following hold: (i) $\mathrm{Var}(T_n) \to \sigma^2(x)$; (ii) $\mathrm{Var}(T'_n) \to 0$; (iii) $\mathrm{Var}(T''_n) \to 0$.
Lemma A.5. $Ef_n(x) \to f(x)$, $x \in \mathbb{R}$.

Now, we give the proofs of Lemmas A.2–A.5.
Proof of Lemma A.2. (i) Using $pk/n \to 1$ and $h_n \to 0$, we have
$$pk\mathrm{Var}(Z_{n1}) = (pk/n)\int_{\mathbb{R}}K^2(u)f(x-h_nu)\,du - (pk/n)h_n\Big[\int_{\mathbb{R}}K(u)f(x-h_nu)\,du\Big]^2 \to f(x)\int_{\mathbb{R}}K^2(u)\,du.$$
(ii) By Theorem 2.4 and Assumption 3.1(iv), it is easily seen that
$$|\mathrm{Cov}(Z_{n1},Z_{nj})| = \frac{1}{nh_n}\Big|\int_{\mathbb{R}^2}K\Big(\frac{x-u}{h_n}\Big)K\Big(\frac{x-v}{h_n}\Big)[f_{X_1,X_j}(u,v) - f_{X_1}(u)f_{X_j}(v)]\,du\,dv\Big|$$
$$= \frac{h_n}{n}\Big|\int_{\mathbb{R}^2}K(u)K(v)[f_{X_1,X_j}(x-uh_n,x-vh_n) - f_{X_1}(x-uh_n)f_{X_j}(x-vh_n)]\,du\,dv\Big| \le \frac{h_n}{n}\sup_{u,v\in\mathbb{R}}|f_{X_1,X_j}(u,v) - f_{X_1}(u)f_{X_j}(v)| \le Ch_nn^{-1}(\alpha^{1/s}(j-1))^{1/4}.$$
(iii) By stationarity, part (ii) of the lemma and Remark 3.1, taking $\delta \le 2/(4s-1)$ in Assumption 3.1(iii), we obtain
$$k\Big|\sum_{1\le i<j\le p}\mathrm{Cov}(Z_{ni},Z_{nj})\Big| \le k\sum_{j=1}^{p-1}(p-1)|\mathrm{Cov}(Z_{n1},Z_{n,j+1})| \le pk\sum_{j=1}^{p-1}|\mathrm{Cov}(Z_{n1},Z_{n,j+1})| \le Cpkn^{-1}h_n\sum_{j=1}^{p-1}(\alpha^{1/s}(j))^{1/4} \to 0. \quad \square$$
Proof of Lemma A.3. (i) Observe that, by Assumption 3.1,
$$k\mathrm{Var}(y_{n1}) = k\mathrm{Var}\Big(\sum_{i=1}^{p}Z_{ni}\Big) = pk\mathrm{Var}(Z_{n1}) + 2k\sum_{1\le i<j\le p}\mathrm{Cov}(Z_{ni},Z_{nj}).$$
By parts (i) and (iii) of Lemma A.2, part (i) follows.
(ii) By stationarity, part (ii) of Lemma A.2 and Remark 3.1, taking $\delta \le 2/(4s-1)$ in Assumption 3.1(iii), we obtain
$$|\mathrm{Cov}(y_{n1},y_{n,l+1})| = \Big|\sum_{i=1}^{p}\sum_{j=l(p+q)+1}^{l(p+q)+p}\mathrm{Cov}(Z_{ni},Z_{nj})\Big| \le \sum_{u=1}^{p}(p-u+1)|\mathrm{Cov}(Z_{n1},Z_{n,l(p+q)+u})| + \sum_{u=2}^{p}(p-u+1)|\mathrm{Cov}(Z_{nu},Z_{n,l(p+q)+1})|$$
$$\le \sum_{u=1}^{p}(p-u+1)|\mathrm{Cov}(Z_{n1},Z_{n,l(p+q)+u})| + \sum_{u=1}^{p}(p-u)|\mathrm{Cov}(Z_{n1},Z_{n,l(p+q)-u+1})| \le 2p\sum_{u=l(p+q)-p}^{l(p+q)+p}|\mathrm{Cov}(Z_{n1},Z_{nu})| \le Ch_nn^{-1}p\sum_{u=l(p+q)-p}^{l(p+q)+p}(\alpha^{1/s}(u-1))^{1/4} \to 0.$$
(iii) By stationarity, part (ii), Remark 3.1, and again taking $\delta \le 2/(4s-1)$ in Assumption 3.1(iii), we have
$$\Big|\sum_{1\le l<v\le k}\mathrm{Cov}(y_{nl},y_{nv})\Big| = \Big|\sum_{l=1}^{k-1}(k-l)\mathrm{Cov}(y_{n1},y_{n,l+1})\Big| \le k\sum_{l=1}^{k-1}|\mathrm{Cov}(y_{n1},y_{n,l+1})| \le k\sum_{l=1}^{k-1}Ch_nn^{-1}p\sum_{u=l(p+q)-p}^{l(p+q)+p}(\alpha^{1/s}(u-1))^{1/4} \le Ckh_nn^{-1}p\sum_{u=q}^{(k-1)(p+q)+p}(\alpha^{1/s}(u-1))^{1/4} \to 0.$$
(iv) By parts (i) and (iii) of Lemma A.2 and $q/p \to 0$, we have
$$k\mathrm{Var}(y'_{n1}) = qk\mathrm{Var}(Z_{n1}) + 2k\sum_{1\le i<j\le q}\mathrm{Cov}(Z_{ni},Z_{nj}) \le (q/p)\,pk\mathrm{Var}(Z_{n1}) + 2k\Big|\sum_{1\le i<j\le q}\mathrm{Cov}(Z_{ni},Z_{nj})\Big| \to 0.$$
(v) Again, working as in the proof of part (iii) of the lemma, using Remark 3.1 and taking $\delta \le 2/(4s-1)$ in Assumption 3.1(iii), we obtain
$$\Big|\sum_{1\le l<v\le k}\mathrm{Cov}(y'_{nl},y'_{nv})\Big| \le Ckqn^{-1}h_n\sum_{l=1}^{k-1}\sum_{u=l(p+q)}^{l(p+q)+q}(\alpha^{1/s}(u-1))^{1/4} \le Ckqn^{-1}h_n\sum_{u=p+q}^{(k-1)(p+q)+q}(\alpha^{1/s}(u-1))^{1/4} \to 0. \quad \square$$
Proof of Lemma A.4. (i) Since
$$\mathrm{Var}(T_n) = \mathrm{Var}\Big(\sum_{m=1}^{k}y_{nm}\Big) = k\mathrm{Var}(y_{n1}) + 2\sum_{1\le l<v\le k}\mathrm{Cov}(y_{nl},y_{nv}),$$
the result follows by parts (i) and (iii) of Lemma A.3.
(ii) Note that
$$\mathrm{Var}(T'_n) = k\mathrm{Var}(y'_{n1}) + 2\sum_{1\le l<v\le k}\mathrm{Cov}(y'_{nl},y'_{nv}),$$
so part (ii) follows easily from parts (iv) and (v) of Lemma A.3.
(iii) It can easily be seen that
$$\mathrm{Var}(T''_n) = \mathrm{Var}\Big(\sum_{i=k(p+q)+1}^{n}Z_{ni}\Big) \le p\mathrm{Var}(Z_{n1}) + 2\Big|\sum_{1\le i<j\le p}\mathrm{Cov}(Z_{ni},Z_{nj})\Big|,$$
and part (iii) follows by parts (i) and (iii) of Lemma A.2. $\square$
Proof of Lemma A.5. Note that the probability density function $f$ has a bounded derivative on $\mathbb{R}$, and that Assumptions 3.2 and 3.3 hold. Then the result follows by standard arguments. $\square$
References
Alejandro, Q.R., 1997. Nonparametric estimation of density derivatives of dependent data. J. Statist. Plann. Inference 61, 155–174.
Bagai, I., Prakasa Rao, B.L.S., 1991. Estimation of the survival function for stationary associated processes. Statist. Probab. Lett. 12, 385–391.
Cai, Z., 1998. Asymptotic properties of Kaplan–Meier estimator for censored dependent data. Statist. Probab. Lett. 37, 381–389.
Gamkrelidze, N.G., 1977. Esseen's inequality for multidimensional distribution functions. Theory Probab. Appl. 22, 877–880.
Genon-Catalot, V., Jeantheau, T., Laredo, C., 2000. Stochastic volatility models as hidden Markov models and applications. Bernoulli 6 (6), 1051–1079.
Kim, T.Y., Lee, S., 2005. Kernel density estimator for strong mixing processes. J. Statist. Plann. Inference 133 (2), 273–284.
Loeve, M., 1963. Probability Theory, 3rd ed. Van Nostrand, Princeton, NJ.
Rao, P., 2002. Another Esseen-type inequality for multivariate probability density functions. Statist. Probab. Lett. 60, 191–199.
Roussas, G.G., 1991. Kernel estimates under association: strong uniform consistency. Statist. Probab. Lett. 12, 393–403.
Roussas, G.G., 1995. Asymptotic normality of a smooth estimate of a random field distribution function under association. Statist. Probab. Lett. 24, 77–90.
Roussas, G.G., 2001. An Esseen-type inequality for probability density functions with an application. Statist. Probab. Lett. 51, 397–408.
Roussas, G.G., Ioannides, D.A., 1987. Moment inequalities for mixing sequences of random variables. Stoch. Anal. Appl. 5, 61–120.
Sadikova, S.M., 1966. Two-dimensional analogues of an inequality of Esseen with applications to the central limit theorem. Theory Probab. Appl. 11, 325–335.