ARTICLE IN PRESS
Statistics & Probability Letters 74 (2005) 205–211
0167-7152/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.spl.2005.04.049
www.elsevier.com/locate/stapro
Testing independent sparse heterogeneous mixtures☆
Johan Lim*
Department of Statistics, Texas A&M University, College Station, TX 77843-3143, USA
Received 12 November 2003; received in revised form 13 December 2004; accepted 1 April 2005
Available online 24 June 2005
☆ This work is supported in part by NSF grant #DMS-0072331.
* Tel.: +979 8621 576; fax: +979 8453 144. E-mail address: [email protected].
Abstract
We study the sequential discernibility between the independent fair coin tossing $\mu_0$ and the sparse heterogeneous mixtures $\mathrm{HM}(\alpha,\gamma) \equiv (1-\epsilon_n)\cdot\mathrm{Bernoulli}(1/2) + \epsilon_n\cdot\mathrm{Bernoulli}((1+\theta_n)/2)$ with $\epsilon_n \asymp n^{-\gamma}$ and $\theta_n \asymp n^{-\alpha}$. Extending the result in Lim (2003), we show that $\mathrm{HM}(\alpha,\gamma)$ and $\mu_0$ are sequentially discernible when $0 < \alpha+\gamma \le 0.5$, but are not so when $0.5 < \gamma+\alpha < 1$. It will be shown that the three different modes of discernibility in [Lim, 2003. Testing stochastic processes: stationarity, independence, and ergodicity. Technical Report], (1) discernibility with an entire sample path, (2) uniform discernibility with an entire sample path, and (3) sequential discernibility, are equivalent to each other under this setting, and furthermore, it follows that sequential decision procedures are equivalent to the decisions based on an entire sequence. Finally, we show that the coin tossing with finitely many trials of biased coins is not sequentially discernible from that with infinitely many trials.
© 2005 Elsevier B.V. All rights reserved.
MSC: primary 62M07; 62G10, secondary 62F03
Keywords: Discernibility; Heterogeneous mixtures
1. Introduction
Mixture models are useful in describing a wide variety of random phenomena and, because of their flexibility in modeling, they have continued to receive increasing attention over the years
from both practical and theoretical points of view. Fields in which mixture models are successfully applied include astronomy, biology, genetics, medicine, engineering, and many other fields in the physical and social sciences. In those disciplines, testing the number of mixture components or testing homogeneity in mixture distributions is often the main research objective or the first step toward it.
Hypothesis testing in statistics is a procedure for deciding between two exclusive sets of measures $H_0$ and $H_1$ using observations $\{X_i\}_{i=1}^{\infty}$. For deciding between $H_0$ and $H_1$, at each $n$, we construct a testing function $f_n(\{X_i\}_{i=1}^{n})$ which assigns observations to 0 ($H_0$) or 1 ($H_1$). Then, one of the most fundamental questions is when there exists a sequence of testing functions $\{f_n\}_{n=1}^{\infty}$ that makes finitely many errors with probability 1. In this paper, we are particularly interested in testing "independent sparse heterogeneous mixtures" (ISHM) against the "homogeneous" counterpart. To be specific, the problem is on testing
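\[
\mathrm{HM}(\alpha,\gamma) \equiv (1-\epsilon_n)\cdot\mathrm{Bernoulli}\left(\frac{1}{2}\right) + \epsilon_n\cdot\mathrm{Bernoulli}\left(\frac{1+\theta_n}{2}\right) \tag{1.1}
\]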
against the fair coin tossing, when $\epsilon_n \asymp n^{-\gamma}$ and $\theta_n \asymp n^{-\alpha}$ for $\gamma$ and $\alpha \in [0,1]$. Here $u_n \asymp n^{-\gamma}$ means $0 < \liminf_{n\to\infty} u_n n^{\gamma} \le \limsup_{n\to\infty} u_n n^{\gamma} < \infty$. The sparsity implies that the average proportion of the biased coin in the first $n$ draws is $n^{-\gamma}$; it thus follows that a finite dimensional empirical measure does not deviate from that of the fair coin tossing when $n \to \infty$. Also, it should be pointed out that, though our discussion in this paper is limited to model (1.1) of binary observations, the results of this paper can be extended to more general settings without difficulty, as in Section 3.2 in Lim (2003).
Testing ISHM is closely related to the phase transition in random coin tossing. The phase transition in random coin tossing studies the minimal amount of bias that is detectable against a fair coin tossing using an infinite sequence of observations $\{X_n\}_{n=1}^{\infty}$. One interesting result on the issue is shown by Kakutani (1948), which states that $\mu_b$ and $\mu_0$ are mutually singular if and only if $\sum_{n=1}^{\infty} b_n^2 = \infty$; $\mu_b$ is the distribution of independent coin tosses where $b = (b_1, b_2, \ldots)$ and $b_n$ is the bias of the $n$th coin, and $\mu_0$ is the distribution of independent fair coin tosses. Here the mutual singularity implies that the bias in $\mu_b$ is detectable from the fair coin tossing when the entire sequence of observations $\{X_n\}_{n=1}^{\infty}$ is given. However, from a statistical hypothesis testing point of view, it is more interesting to assume the sequential observability of $\{X_n\}_{n=1}^{\infty}$. In the remainder of the paper, partially motivated by the results of Kakutani (1948), we answer two questions on testing the ISHM. First, we examine whether the sequential decisions are different from those based on an entire sample path. Second, we provide a necessary and sufficient condition for the minimal class of heterogeneity that is sequentially discernible from the homogeneous counterpart.
Before we proceed, it should be stated that this paper uses the following three different modes of discernibility in Lim (2003).
Definition 1.1. Two classes of probability measures $H_0$ and $H_1$ on $(\mathcal{B}^{\infty}, \mathbb{R}^{\infty})$ are Discernible with an Entire Sample path (DES) if, for each $P \in H_0$ and $Q \in H_1$, there exists a measurable function $f_{P,Q} : \mathbb{R}^{\infty} \to \{0,1\}$ such that $f_{P,Q}(X_1^{\infty}) = 1$, $Q$-a.s., and $f_{P,Q}(X_1^{\infty}) = 0$, $P$-a.s. When the discerning function $f_{P,Q}$ does not depend on $(P,Q)$, $H_0$ and $H_1$ are called Uniformly Discernible with an Entire Sample path (UDES).
Definition 1.2. Two classes of probability measures $H_0$ and $H_1$ are Sequentially Discernible (SD) if there exists a sequence of measurable functions $f_n : \mathbb{R}^n \to \{0,1\}$ such that $\lim_{n\to\infty} f_n(x_1, x_2, \ldots, x_n) = 1$ $Q$-a.s. and $\lim_{n\to\infty} f_n(x_1, x_2, \ldots, x_n) = 0$ $P$-a.s., for every $P \in H_0$ and $Q \in H_1$.
Under the IID setting, UDES is equivalent to DES, and they are trivial in the sense that any disjoint families of probability measures are UDES (DES), because an infinite number of samples provides exact knowledge of the true distribution. On the contrary, several different notions of SD have been proposed in the previous literature. Hoeffding and Wolfowitz (1958) proposed five different classes of tests relying on the sample size function $N$ and the decision function $f(X_1, \ldots, X_n)$. Here, $N$ is a stopping time with respect to $\{\sigma(X_1, \ldots, X_n)\}_{n=1}^{\infty}$, and $f \in \sigma(X_1, \ldots, X_n)$ is the testing function in $\{0,1\}$. Among their five classes, the class of tests with $P(N < \infty) = 1$ has been widely used in the previous literature and is known as SD (Fisher and Van Ness, 1969; Cover, 1973; Dembo and Peres, 1994; Kulkarni and Zeitouni, 1996; Lim, 2003), whereas recently Nobel (2003) considered the test with continuous functions $f_n$ with a range of $[0,1]$.
Suppose $\{D_n\}_{n=1}^{\infty}$ is an unobserved binary sequence and an independent coin with bias $\theta_n > 0$ is tossed when $D_n = 1$; a fair coin is tossed otherwise. A coin with bias $\theta$ means that $P(X = 1) = (1+\theta)/2$ and $P(X = 0) = (1-\theta)/2$. Let $\mu(\Theta, \lambda)$ with $\Theta = (\theta_1, \ldots, \theta_n, \ldots)$ and $\lambda = (t_1, t_2, \ldots, t_n, \ldots)$ be the probability measure for the record of independent coin tosses, where $D_n = 1$ if $n \in \{t_1, t_2, \ldots\}$, and 0 otherwise. We say that $\lambda$ has 0-density when $\lim_{n\to\infty} |\lambda \cap \{1, 2, \ldots, n\}|/n = 0$, where $|A|$ is the number of elements in $A$. For every $\lambda$, in particular, $\mu(0, \lambda)$ is the same measure as fair coin tossing, denoted by $\mu_0$. Finally, let $\mathrm{INF}(\theta)$ be the collection of $\mu(\Theta, \lambda)$ with $\lim_{n\to\infty} |\lambda \cap \{1, 2, \ldots, n\}|/n = 0$ and $|\lambda| = \infty$, and let $\mathrm{FIN}(\theta)$ be the collection of $\mu(\Theta, \lambda)$ with $|\lambda| < \infty$, where $\theta_n$ (in $\Theta$) is $\theta$ for every $n$.
In the time series setting, Lim (2003) shows that sequential procedures are not equivalent to those with an entire sample path; Lim provides an example that is UDES but not SD by using the ergodic process constructed by the cut-and-stack procedure (Shields, 1991). In this paper, we show that SD = UDES = DES in the above IID setting; hence the sequential procedures are equivalent to the procedures based on an entire sequence. In addition, Lemma 2.3 provides an example of non-equivalence under general settings, which is simpler than that presented in Lim (2003).
In the next section, we prove the following results. Firstly, we show that $\mathrm{HM}(\alpha,\gamma)$ and $\mu_0$ are not DES when $0.5 < \alpha+\gamma < 1$ by using Kakutani's Dichotomy (Theorem 2.A). Secondly, $\mathrm{HM}(\alpha,\gamma)$ and $\mu_0$ are shown to be SD when $0 < \alpha+\gamma < 0.5$ (Theorem 2.1). Theorem 2.2 proves that $\mathrm{HM}(\alpha,\gamma)$ and $\mu_0$ are SD when $\alpha+\gamma = 0.5$. In addition, it can be shown that the boundary for UDES and DES is $\alpha+\gamma = 0.5$; thus, SD, UDES, and DES are equivalent under this setting. Finally, we prove that $\mathrm{INF}(\theta)$ and $\mathrm{FIN}(\theta)$ are not SD for every $\theta \in (0,1]$ (Theorem 2.4).
2. Main results
2.1. $\alpha+\gamma = 0.5$ is the boundary for SD, UDES, and DES
First, Theorem 2.A shows that $\mathrm{HM}(\alpha,\gamma)$ is not DES from $\mu_0$ when $\alpha+\gamma \in (0.5, 1)$. Second, Theorem 2.1 shows that they are SD when $\alpha+\gamma \in (0, 0.5)$. Finally, Theorem 2.2 shows that they are SD when $\alpha+\gamma = 0.5$.
Theorem 2.A. $\mathrm{HM}(\alpha,\gamma)$ and $\mu_0$ are not DES when $\alpha+\gamma \in (0.5, 1)$.
Proof. Let $\{X_n\}_{n=1}^{\infty}$ be the observations from $\mathrm{HM}(\alpha,\gamma)$. Then it has the same law as the independent random coin tossing $Y_n$, where $Y_n \sim \mathrm{Bernoulli}((1+\epsilon_n\theta_n)/2)$. The bias of the $n$th coin (denoted by $b_n$) has the order of $n^{-(\alpha+\gamma)}$, and $\sum_{n=1}^{\infty} b_n^2 < \infty$. It then follows from Kakutani's Dichotomy that $\mathrm{HM}(\alpha,\gamma)$ is not DES from $\mu_0$; the dichotomy states that the law of the tosses is mutually singular to $\mu_0$ when $\sum_{n=1}^{\infty} b_n^2 = \infty$, and otherwise it is absolutely continuous with respect to $\mu_0$ (Kakutani, 1948). □
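As a numerical illustration of the summability criterion used in this proof, the following Python sketch tabulates partial sums of $b_n^2$ under the simplifying assumption that $b_n = n^{-(\alpha+\gamma)}$ exactly (the theorem only needs the order of $b_n$). The function name `partial_sum_sq` and the chosen exponents are illustrative and do not come from the paper.

```python
def partial_sum_sq(n_terms, s):
    """Return sum_{k=1}^{n_terms} k**(-2*s).

    This is the partial sum of b_k**2 under the assumption
    b_k = k**(-s), with s standing for alpha + gamma.
    """
    return sum(k ** (-2.0 * s) for k in range(1, n_terms + 1))


if __name__ == "__main__":
    # s > 0.5: the sums level off; s <= 0.5: they keep growing.
    for s in (0.25, 0.5, 0.75):
        sums = [partial_sum_sq(10 ** p, s) for p in (2, 3, 4, 5)]
        print(f"alpha+gamma={s}: " + ", ".join(f"{v:.3f}" for v in sums))
```

For $\alpha+\gamma = 0.75$ the partial sums stay below $\zeta(1.5) \approx 2.612$, while for $\alpha+\gamma \le 0.5$ they grow without bound, which is the split between the absolutely continuous and the mutually singular case in the dichotomy.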
Theorem 2.1. $\mathrm{HM}(\alpha,\gamma)$ and $\mu_0$ are SD when $\alpha+\gamma \in (0, 0.5)$.
Proof. Let $S_n = \sum_{i=1}^{n} X_i$. In fair coin tossing, it can be shown that $S_n$ is eventually smaller than $(1+\epsilon)\sqrt{2n\log\log n}$, $\mu_0$-a.s., by the law of the iterated logarithm. Thus it suffices to show that, for every $\mu_\lambda \in \mathrm{HM}(\alpha,\gamma)$,
\[
\mu_\lambda\left\{ S_n > (1+\epsilon)\sqrt{2n\log\log n} \ \text{eventually} \right\} = 1. \tag{2.2}
\]
To show (2.2), under $\mu_\lambda$,
\begin{align}
P\left\{ S_n < (1+\epsilon)\sqrt{2n\log\log n} \right\}
&= P\left\{ \{a_n - S_n\} > a_n - (1+\epsilon)\sqrt{2n\log\log n} \right\} \tag{2.3} \\
&\le \mathrm{Const.} \cdot \exp\left\{ -2\left( a_n - (1+\epsilon)\sqrt{2n\log\log n} \right)^2 / n \right\} \tag{2.4} \\
&= \mathrm{Const.} \cdot \exp\left\{ -2 \cdot n^{1-2(\alpha+\gamma)} \right\}, \tag{2.5}
\end{align}
where $a_n = E(S_n) = \sum_{k=1}^{n} k^{-(\alpha+\gamma)} \asymp n^{1-(\alpha+\gamma)}$. Eq. (2.4) is from the Hoeffding inequality for the sum of independent bounded random variables (Hoeffding, 1963). Since (2.5) has a finite sum for $\alpha+\gamma \in (0, 0.5)$, $S_n$ is eventually larger than $(1+\epsilon)\sqrt{2n\log\log n}$ by the Borel–Cantelli lemma. Finally, the function
\[
f_n(X_1, \ldots, X_n) = I\left\{ S_n > (1+\epsilon)\sqrt{2n\log\log n} \right\}
\]
sequentially discerns $\mathrm{HM}(\alpha,\gamma)$ from $\mu_0$. □
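The decision rule in this proof can be tried numerically. The sketch below is a minimal Python simulation under assumptions that go beyond the text: tosses are coded as $\pm 1$ (so that the fair-coin sum is centered), $\epsilon_k = k^{-\gamma}$ and $\theta_k = k^{-\alpha}$ hold exactly rather than up to order, and the helper names `lil_threshold`, `simulate_sum`, and `f_n` are hypothetical.

```python
import math
import random


def lil_threshold(n, eps=0.5):
    """(1 + eps) * sqrt(2 n log log n), the cutoff in the rule; needs n >= 3."""
    return (1.0 + eps) * math.sqrt(2.0 * n * math.log(math.log(n)))


def simulate_sum(n, alpha, gamma, rng, biased=True):
    """Return S_n for n independent +/-1 tosses.

    Assumption: toss k uses the biased coin with probability k**(-gamma),
    and the biased coin shows +1 with probability (1 + k**(-alpha)) / 2.
    With biased=False every toss is fair.
    """
    s = 0
    for k in range(1, n + 1):
        p_one = 0.5
        if biased and rng.random() < k ** (-gamma):
            p_one = (1.0 + k ** (-alpha)) / 2.0
        s += 1 if rng.random() < p_one else -1
    return s


def f_n(s_n, n, eps=0.5):
    """Indicator that S_n exceeds the iterated-logarithm cutoff."""
    return 1 if s_n > lil_threshold(n, eps) else 0


if __name__ == "__main__":
    n = 100_000
    s_mix = simulate_sum(n, 0.1, 0.1, random.Random(0))
    s_fair = simulate_sum(n, 0.1, 0.1, random.Random(1), biased=False)
    print("mixture:", s_mix, "decision:", f_n(s_mix, n))
    print("fair:   ", s_fair, "decision:", f_n(s_fair, n))
```

With $\alpha+\gamma = 0.2$ the drift of order $n^{0.8}$ dominates the cutoff of order $\sqrt{n\log\log n}$, so the rule typically returns 1 under the mixture. A single finite run only illustrates the almost-sure statement; it does not verify it.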
Theorem 2.2. $\mathrm{HM}(\alpha,\gamma)$ and $\mu_0$ are SD when $\alpha+\gamma = 0.5$.
Proof. Let $\{X_n\}_{n=1}^{\infty}$ be sequential binary observations. Define the sequence $\{t(n)\}_{n=1}^{\infty}$ such that $t(n)$ is the smallest integer satisfying $t(n) > t(n-1)$ and
\[
d_1 < \frac{1}{\sqrt{t(n)-t(n-1)}} \sum_{j=1}^{t(n)-t(n-1)} \frac{1}{\sqrt{t(n-1)+j}} < d_2
\]
for positive constants $d_1$ and $d_2$.
Let $m_n = \max\{k : t(k) \le n\}$ and let the statistic $Z_n$ be
\[
Z_n = \frac{1}{m_n} \left\{ \frac{1}{\sqrt{b_n}} \sum_{k=1}^{m_n} S_{t(k-1)+1,\,t(k)} + \frac{1}{\sqrt{n - t(m_n)}}\, S_{t(m_n)+1,\,n} \right\},
\]
where $S_{a,b} = X_a + X_{a+1} + \cdots + X_b$. Now we will show that
(i) $\lim_{n\to\infty} Z_n = 0$, $\mu_0$-a.s.
(ii) For every $\epsilon > 0$ and $\mu_1 \in \mathrm{HM}(\alpha,\gamma)$, $|Z_n| > \epsilon$ eventually, $\mu_1$-a.s.
First, it can be shown that $m_n \asymp \sqrt{n}$ from
\[
d_1 m_n < \sum_{k=1}^{n} \frac{1}{\sqrt{k}} \asymp \sqrt{n} < d_2 m_n .
\]
To show (i), letting $Y_k = S_{t(k-1),\,t(k)}/\sqrt{b_k}$, we find $E Y_k = 0$ and the $(2p)$th moment
\[
E|Y_k|^{2p} = \frac{1}{b_k^{p}} \cdot E\{X_{t(k-1)+1} + \cdots + X_{t(k)}\}^{2p} = O(1). \tag{2.6}
\]
It follows that
\begin{align}
P\{|Z_n| > \epsilon\}
&= P\left\{ \left| \frac{1}{m_n} \left( \sum_{k=1}^{m_n} Y_k + \frac{1}{\sqrt{n - t(m_n)}}\, S_{t(m_n)+1,\,n} \right) \right| > \epsilon \right\} \notag \\
&\le P\left\{ \left| \frac{1}{m_n} \sum_{k=1}^{m_n} Y_k \right| > \frac{\epsilon}{2} \right\} + P\left\{ \left| \frac{1}{m_n} \frac{1}{\sqrt{n - t(m_n)}}\, S_{t(m_n)+1,\,n} \right| > \frac{\epsilon}{2} \right\} \notag \\
&\le 64 \cdot E\left\{ \left| \sum_{k=1}^{m_n} Y_k \right|^{6} \right\} \Big/ \{m_n^{6} \epsilon^{6}\} + 64 \cdot E\{|S_{t(m_n)+1,\,n}|^{6}\} \Big/ \{(n - t(m_n))^{3} m_n^{6} \epsilon^{6}\} \tag{2.7} \\
&= O(1/m_n^{3}), \tag{2.8}
\end{align}
where the inequality in (2.7) is from (2.6). Then (2.8) has a finite sum and, by the Borel–Cantelli lemma,
\[
\mu_0\{|Z_n| < \epsilon, \ \text{eventually}\} = 1.
\]
To show (ii), let $\mu_1$ be a measure in $\mathrm{HM}(\alpha,\gamma)$ with $\alpha+\gamma = 0.5$. Under $\mu_1$,
\begin{align}
\frac{1}{m_n} \sum_{k=1}^{m_n} Y_k
&\overset{d}{=} \frac{1}{m_n} \sum_{k=1}^{m_n} \frac{1}{\sqrt{b_n}} \sum_{j=1}^{b_n} X^{*}_{(k-1)b_n + j} + \frac{1}{m_n} \sum_{k=1}^{m_n} \frac{1}{\sqrt{b_n}} \sum_{j=1}^{b_n} \frac{1}{\sqrt{(k-1)b_n + j}} \notag \\
&> \frac{1}{m_n} \sum_{k=1}^{m_n} \frac{1}{\sqrt{b_n}} \sum_{j=1}^{b_n} X^{*}_{(k-1)b_n + j} + d_1, \tag{2.9}
\end{align}
where $X^{*}_i$ is the result of fair coin tossing and $A \overset{d}{=} B$ in (2.9) means that $A$ and $B$ have the same distribution. Therefore, it can be shown that $|Z_n|$ is larger than $\epsilon$, eventually $\mu_1$-a.s. □
Theorems 2.1 and 2.2 show that $\mathrm{HM}(\alpha,\gamma)$ and $\mu_0$ are SD when $\alpha+\gamma \le 0.5$, so they are UDES and DES. On the contrary, Theorem 2.A shows that they are not DES when $\alpha+\gamma > 0.5$. Therefore, the boundary for SD is the same as those of UDES and DES.
Similar problems in Gaussian mixtures have been recently addressed by Jin and Donoho (2003), where they obtain the boundary of SD for $\gamma \in (0, 0.5)$ in the model $\epsilon_n \cdot N(0,1) + (1-\epsilon_n) \cdot N(\mu_n, 1)$ with $\epsilon_n \asymp n^{-\gamma}$ and $\mu_n \asymp \sqrt{2q\log n}$. In their work, the diverging mean of the heterogeneous components allows the positive result, although $\gamma$ is larger than 0.5. In contrast, Lim (2003) provides a positive result for regular mixture components with $0 < \gamma < 0.5$ by introducing a specific type of dependence structure induced by a hidden renewal process; he assumes $\lambda$ is a random set of times where a renewal process $\{D_n\}_{n=1}^{\infty}$ visits a distinguished state.
2.2. $\mathrm{INF}(\theta)$ and $\mathrm{FIN}(\theta)$ are not SD for every $\theta \in (0,1]$
We first show that $\mathrm{INF}(\theta)$ and $\mathrm{FIN}(\theta)$ are not SD when $\theta = 1$. Then we prove that $\mathrm{INF}(\theta)$ and $\mathrm{FIN}(\theta)$ are not SD for every $\theta \in (0,1]$ by using the monotonicity between the models in Theorem 2.4.
Lemma 2.3. Let $A$ be the set of infinite binary sequences (of 0 and 1) with finitely many 1s, and let $B$ be the set of infinite binary sequences (of 0 and 1) with infinitely many 1s such that the density of 1 is 0 in the sense that $\lim_{n\to\infty} |\lambda \cap \{1, 2, \ldots, n\}|/n = 0$. Then there is no sequence of functions $f_n : \{0,1\}^n \to \{0,1\}$ satisfying $\lim_{n\to\infty} f_n(a) = 0$ for all $a \in A$ and $\lim_{n\to\infty} f_n(b) = 1$ for all $b \in B$. Here $f_n(a)$ is defined as $f_n(a_1, a_2, \ldots, a_n)$, where $(a_1, a_2, \ldots, a_n)$ is the first $n$ components of $a$.
Proof. Suppose that there exist sequential discerning functions $\{f_n\}_{n=1}^{\infty}$ satisfying the condition in the Lemma. Then it suffices to find an element $\lambda \in B$ and a subsequence of $\{f_n\}_{n=1}^{\infty}$, say $\{f_{n_k}\}$, that satisfy $f_{n_k}(\lambda) = 0$ for all $k$. Take any sequence $b$ in $B$ and choose any $f_{n_1}$ satisfying $f_{n_1}(0, 0, 0, \ldots) = 0$. Replace the first $n_1$ digits of $b$ with 0 and denote the resulting sequence by $b_1$, which is clearly in $B$. Find the first nonzero entry in $b_1$, denoted by $m_1$, and let $a_1$ be the sequence obtained by truncating $b_1$ after the first nonzero entry and replacing those entries with 0. Since $a_1$ is in $A$, there is $n_2$ greater than $\max(n_1, m_1)$ such that $f_{n_2}(a_1) = 0$. Replace the first $n_2$ digits of $b_1$ with those of $a_1$. Let $b_2$ be the resulting sequence, which is again clearly in $B$. Note that the first $n_1$ digits of $b_2$ are identical to those of $(0, 0, \ldots)$; thus, $f_{n_1}(b_2) = f_{n_1}(0, 0, \ldots) = 0$. Similarly, since the first $n_2$ digits of $b_2$ are identical to those of $a_1$, $f_{n_2}(b_2) = f_{n_2}(a_1) = 0$. Inductively find a sequence $b_n$ in $B$. Finally, $\lambda = \lim_{n\to\infty} b_n$ defines a sequence in $B$ such that $f_{n_k}(\lambda) = 0$ for all $k$. This is a contradiction. □
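The diagonal construction in this proof can be traced on a concrete candidate rule. The Python sketch below is illustrative only: `recent_one_rule` is a hypothetical rule (not from the paper) that outputs 1 when a 1 appears in the second half of the observed prefix, so it converges to 0 on every sequence in $A$. The base sequence $b$ is assumed to have 1s exactly at the perfect squares, which gives it density 0. The helper names `prefix_of`, `next_square_after`, and `build_counterexample` are likewise assumptions.

```python
import math


def recent_one_rule(prefix):
    """Hypothetical candidate f_n: 1 if the second half of the prefix holds a 1."""
    n = len(prefix)
    return 1 if any(prefix[n // 2:]) else 0


def prefix_of(ones, n):
    """First n digits of the 0/1 sequence whose 1s sit at the 1-indexed positions in `ones`."""
    digits = [0] * n
    for pos in ones:
        if pos <= n:
            digits[pos - 1] = 1
    return digits


def next_square_after(n):
    """Smallest perfect square strictly greater than n (next 1 of the assumed base sequence)."""
    r = math.isqrt(n) + 1
    return r * r


def build_counterexample(f, rounds, max_n=100_000):
    """Return (ones, checkpoints): positions m_k of the kept 1s and indices n_k with f = 0."""
    ones, checkpoints = [], []
    start = 1
    for _ in range(rounds):
        n = start
        # Search for n_k >= start with f_{n_k} = 0 on the current finite sequence.
        while f(prefix_of(ones, n)) != 0:
            n += 1
            if n > max_n:
                raise RuntimeError("no checkpoint found below max_n")
        checkpoints.append(n)
        m = next_square_after(n)  # keep the first 1 of the base sequence beyond n_k
        ones.append(m)
        start = m + 1  # forces the next checkpoint to exceed max(n_k, m_k)
    return ones, checkpoints


if __name__ == "__main__":
    ones, checkpoints = build_counterexample(recent_one_rule, rounds=5)
    print("positions of 1s:", ones)
    print("checkpoints n_k:", checkpoints)
    print("f at checkpoints:", [recent_one_rule(prefix_of(ones, n)) for n in checkpoints])
```

Every 1 that is kept lies beyond the checkpoint found just before it, so the rule evaluated on the limiting sequence is 0 at each checkpoint, even though that sequence contains infinitely many 1s when the construction is continued indefinitely.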
Theorem 2.4. $\mathrm{INF}(\theta)$ and $\mathrm{FIN}(\theta)$ are not SD for every $\theta \in (0,1]$.
Proof. Let $B$ be the set of sequences $\lambda$ such that $\lim_{n\to\infty} |\lambda \cap \{1, 2, \ldots, n\}|/n = 0$ and $|\lambda| = \infty$, and let $A$ be that of sequences $\lambda$ with $|\lambda| < \infty$.
Let $\{X_n\}_{n=1}^{\infty}$ be a binary process with values in $\{-1, 1\}$ from $\mathrm{INF}(\theta)$. Then there is a sequence $\{Y_n\}_{n=1}^{\infty}$ in $B$ satisfying
\[
X_n = U_n \cdot Y_n + V_n \cdot (1 - Y_n), \tag{2.10}
\]
where $U_n$ has a value 1 with probability $(1+\theta)/2$ and otherwise $-1$. Suppose $\mathrm{INF}(\theta)$ is SD from $\mathrm{FIN}(\theta)$ with the sequence of testing functions $\{f_n\}_{n=1}^{\infty}$; hence, $\lim_{n\to\infty} f_n(X_1, \ldots, X_n) = 0$ if $\mu \in \mathrm{FIN}(\theta)$ and $\lim_{n\to\infty} f_n(X_1, \ldots, X_n) = 1$ if $\mu \in \mathrm{INF}(\theta)$. Then, define the sequence of testing functions for $\mathrm{FIN}(\theta)$ against $\mathrm{INF}(\theta)$ as follows:
\begin{align}
g_n(Y_1, \ldots, Y_n) &= f_n(X_1, \ldots, X_n) \notag \\
&= f'_n((Y_1, U_1, V_1), \ldots, (Y_n, U_n, V_n)), \tag{2.11}
\end{align}
where $f'_n$ is obtained by substituting $X_n$ with $U_n \cdot Y_n + V_n \cdot (1 - Y_n)$ in $f_n$. It follows that $g_n$ converges to 1 when $\{Y_n\}_{n=1}^{\infty}$ is from $B$. Likewise, it can be shown that $\lim_{n\to\infty} f_n(X_1, \ldots, X_n) = 0$ if $\mu \in \mathrm{FIN}(\theta)$, and $\{g_n\}_{n=1}^{\infty}$ converges to 0 if $\{Y_n\}_{n=1}^{\infty} \in A$. Hence, $B$ and $A$ are SD, which contradicts Lemma 2.3. □
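The representation (2.10) can be written out as a small sampler. The Python sketch below is a minimal illustration under one assumption that the text leaves implicit: $V_n$ is taken to be a fair $\pm 1$ coin. The function names `toss` and `record` are hypothetical.

```python
import random


def toss(y, theta, rng):
    """One observation X = U*Y + V*(1 - Y), following (2.10).

    U is +1 with probability (1 + theta) / 2 and -1 otherwise.
    V is assumed to be a fair +/-1 coin (not stated explicitly in the text).
    """
    u = 1 if rng.random() < (1.0 + theta) / 2.0 else -1
    v = 1 if rng.random() < 0.5 else -1
    return u * y + v * (1 - y)


def record(ys, theta, seed=0):
    """Observed +/-1 record for a given 0/1 indicator sequence ys."""
    rng = random.Random(seed)
    return [toss(y, theta, rng) for y in ys]


if __name__ == "__main__":
    ys = [1, 0, 0, 1, 0, 0, 0, 0, 1]  # biased coin used at positions 1, 4, 9
    print(record(ys, theta=1.0))
```

With $\theta = 1$ the biased positions always show $+1$, so the observed record carries the indicator sequence $\{Y_n\}$ only through positions that a fair coin could also have produced, which is the case $\theta = 1$ reduced to Lemma 2.3.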
Devroye and Lugosi (2002) recently considered a similar testing problem in mixture models and showed that the finite mixture classes are not SD from the mixture classes with an infinite number of mixture components. Their proof is similar to Lemma 2.3 in the sense that it finds a probability measure which contradicts the existence of testing functions.
Acknowledgements
The author is grateful to Amir Dembo and anonymous referees for many suggestions.
References
Cover, T., 1973. On determining the irrationality of the mean of a random variable. Ann. Statist. 1, 862–871.
Dembo, A., Peres, Y., 1994. A topological criterion for hypothesis testing. Ann. Statist. 22, 106–117.
Devroye, L., Lugosi, G., 2002. Almost sure classification of densities. J. Nonparametric Statist. 14, 675–698.
Fisher, L., Van Ness, J.W., 1969. Distinguishability of probability measures. Ann. Math. Statist. 40, 381–399.
Hoeffding, W., 1963. Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58, 13–30.
Hoeffding, W., Wolfowitz, J., 1958. Distinguishability of sets of distributions (the case of independent and identically distributed chance variables). Ann. Math. Statist. 29, 700–718.
Kakutani, S., 1948. On equivalence of infinite product measures. Ann. Math. 49, 214–224.
Kulkarni, S.R., Zeitouni, O., 1996. A general classification rule for probability measures. Ann. Statist. 23, 1393–1407.
Lim, J., 2003. Testing stochastic processes: stationarity, independence, and ergodicity. Technical Report, Department of Statistics, Stanford University.
Nobel, A., 2003. Hypothesis testing for families of ergodic processes. Preprint.
Shields, P.C., 1991. Cutting and stacking: a method for constructing stationary processes. IEEE Trans. Inform. Theory 37, 1605–1617.