a note on estimation in bernoulli trials with dependence
Publisher: Taylor & Francis, Informa Ltd, Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
Communications in Statistics - Theory and Methods
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lsta20
A note on estimation in Bernoulli trials with dependence
Bertram Price, Graduate School of Business Administration, New York University
Published online: 27 Jun 2007.
To cite this article: Bertram Price (1976) A note on estimation in Bernoulli trials with dependence, Communications in Statistics - Theory and Methods, 5:7, 661-671, DOI: 10.1080/03610927608827383
To link to this article: http://dx.doi.org/10.1080/03610927608827383
COMMUN. STATIST.-THEOR. METH., A5(7), 661-671 (1976)
A NOTE ON ESTIMATION IN BERNOULLI TRIALS WITH DEPENDENCE
Bertram Price
Graduate School of Business Administration New York University
Key Words & Phrases: Markov chains; ratio estimator; Monte Carlo evaluation; dependence parameter.
ABSTRACT
Finite sample properties of estimators for the parameters of
a dependent Bernoulli process are investigated using Monte Carlo
techniques. A ratio estimator is proposed for the dependence
parameter of the model and is compared to the approximate maximum
likelihood estimator given by Klotz. It is shown that both esti-
mators have a downward bias that is extreme in certain cases and
that samples well in excess of 200 may be necessary before the
asymptotic theory can be applied.
1. INTRODUCTION
The simple generalization of the Bernoulli trials model to a
Markov chain with an additional parameter that measures dependence
has great potential for analyzing many processes that arise in
both the physical and social sciences. The model has been used in
Copyright © 1976 by Marcel Dekker, Inc. All Rights Reserved. Neither this work nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage and retrieval system, without permission in writing from the publisher.
studies of rainfall, births, behavioral change, effectiveness of
weapon systems and with analyses of queueing systems (see Gabriel,
1959; Kabak and Price, 1974; Klotz, 1972; Reiger, 1968; Rustagi
and Srivastava, 1968). In general it can serve as a starting
point for constructing models of any two state sequential
phenomenon.
Klotz (1973) has given an analytic formulation of the model.
He obtained the joint distribution of the sufficient statistics
for the process and has established the asymptotic distribution of
maximum likelihood estimators (m.l.e.'s) and approximate m.l.e.'s
for the model parameters when both parameters are unknown.
Motivation for the approximation of the m.l.e.'s stems from the fact
that the m.l.e.'s arise as solutions to a pair of nonlinear
equations. The solutions can be obtained numerically but it is
doubtful that the extra complication of the computations can be
justified when compared to the approximate m.l.e.'s which can be obtained
as closed form expressions. Klotz (1973) proves that the m.l.e.'s
and the approximate estimators have the same limiting distributions.
One very intuitive and easy to compute alternative estimator
that was not considered is a ratio estimator for the dependence
parameter. Using a combination of analytic and Monte Carlo
techniques it is shown in this paper that the ratio estimator is as
good as the approximate m.l.e. As a by-product, some small sample
properties of both estimators are established. Both the ratio
estimator and the approximate m.l.e. of the dependence parameter
are shown to exhibit a negative bias that can be serious in certain
cases.
2. THE MODEL
Let $X_1, X_2, \ldots, X_n$ be a sequence of random variables that take
the values 0 or 1 as in the standard Bernoulli model.
Following Klotz (1973),
$$P[X_i = 1] = 1 - P[X_i = 0] = p, \quad i = 1, 2, \ldots, n, \tag{2.1}$$
and
$$P[X_i = 1 \mid X_{i-1} = 1] = \lambda, \quad i = 2, 3, \ldots, n. \tag{2.2}$$
The remaining transition probabilities may be easily derived from
(2.1) and (2.2). The parameters of the model are $p$ and $\lambda$
satisfying (i) $0 \le p \le 1$ and (ii) $\max(0, (2p - 1)/p) \le \lambda \le 1$.
Note that the case $\lambda = p$ corresponds to independent Bernoulli
trials.
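As an illustration of how the remaining transition probabilities follow from (2.1) and (2.2), the Python sketch below solves the stationarity condition p = λp + aq for a = P[X_i = 1 | X_{i-1} = 0] and generates a sequence from the resulting chain. The function names, the use of Python's `random` module and the restriction to 0 < p < 1 are assumptions made for illustration; they are not part of the original paper.

```python
import random


def transition_probs(p, lam):
    """Return (a, b): a = P[X_i=1 | X_{i-1}=0], b = P[X_i=1 | X_{i-1}=1]."""
    if not 0.0 < p < 1.0:
        raise ValueError("this sketch assumes 0 < p < 1")
    if not max(0.0, (2.0 * p - 1.0) / p) <= lam <= 1.0:
        raise ValueError("lam violates max(0, (2p - 1)/p) <= lam <= 1")
    q = 1.0 - p
    # (2.1) requires P[X_i = 1] = p for every i, so p = lam * p + a * q.
    a = p * (1.0 - lam) / q
    return a, lam


def simulate_chain(n, p, lam, rng):
    """Generate X_1, ..., X_n from the dependent Bernoulli model."""
    a, b = transition_probs(p, lam)
    x = [1 if rng.random() < p else 0]      # X_1 is Bernoulli(p)
    for _ in range(n - 1):
        prob_one = b if x[-1] == 1 else a   # depends on the previous trial only
        x.append(1 if rng.random() < prob_one else 0)
    return x


if __name__ == "__main__":
    print(transition_probs(0.5, 0.8))
    print(simulate_chain(20, 0.5, 0.8, random.Random(1)))
```

When λ = p the two transition probabilities coincide and the sketch reduces to independent Bernoulli trials, in agreement with the remark above.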
The m.l.e.'s as well as the competing estimators are functions
of the sufficient statistics $R = \sum_{i=2}^{n} X_{i-1} X_i$, $S = \sum_{i=1}^{n} X_i$ and
$T = X_1 + X_n$. The estimators given in Klotz (1973) which approximate
the m.l.e.'s are
[display illegible in source]
where $q = 1 - p$, $m = n - 1$ and the positive root is retained. The
pair $(n^{1/2}(\hat{\lambda} - \lambda), n^{1/2}(\hat{p} - p))$ is shown to be asymptotically equivalent
to the m.l.e.'s. The asymptotic distribution is bivariate normal
with zero means and covariance matrix
[display illegible in source]
An alternate estimator for $\lambda$ that is suggested by intuition
is the ratio
[display illegible in source]
The asymptotic properties of $\tilde{\lambda}$ can be established using series
expansions in R and S about their expected values. The
required moments of R and S are given below followed by the
properties of $\tilde{\lambda}$.
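The sufficient statistics and a ratio estimator of the kind described above can be sketched in Python as follows. The display defining the ratio is not legible in this copy, so the form used here, (R/(n-1))/(S/n), is an assumption chosen to be consistent with the moments of r discussed in Section 3; the convention that the estimator is zero when S = 0 is taken from Section 4. The function names are hypothetical.

```python
def sufficient_statistics(x):
    """Return (R, S, T) for a 0/1 sequence x = [X_1, ..., X_n]."""
    R = sum(x[i - 1] * x[i] for i in range(1, len(x)))  # count of adjacent (1, 1) pairs
    S = sum(x)                                          # count of ones
    T = x[0] + x[-1]                                    # sum of the two end trials
    return R, S, T


def ratio_estimator(x):
    """Assumed ratio estimator (R / (n - 1)) / (S / n); zero when S = 0."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two trials")
    R, S, _ = sufficient_statistics(x)
    if S == 0:
        return 0.0
    return (R / (n - 1)) / (S / n)


if __name__ == "__main__":
    x = [1, 1, 0, 1, 1, 1]
    print(sufficient_statistics(x), ratio_estimator(x))
```

The numerator estimates P[X_{i-1} = 1, X_i = 1] = λp and the denominator estimates p, which is why their ratio is a natural estimator of λ.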
3. ASYMPTOTIC PROPERTIES OF $\tilde{\lambda}$
Let
[display illegible in source]
so that
[display illegible in source]
By direct computation it follows that
$$(n-1)\,\mathrm{Var}(r) = \lambda p(1 - \lambda p) + \cdots$$
where
[display illegible in source]
and $(n-1)\,\mathrm{Cov}(r,s)$
[display illegible in source]
Using these expressions in a series expansion (see Rao, 1965)
of $\tilde{\lambda}$ about E(r) and E(s) and retaining first order terms, it
follows that $n^{1/2}(\tilde{\lambda} - \lambda)$ has a limiting distribution that is normal
and that
[display illegible in source]
By adding second order terms to the expansion it follows that
$$E(\tilde{\lambda}) = \lambda - \frac{\lambda q}{np} + o(n^{-1}). \tag{3.6}$$
In summary, it follows from (3.1) through (3.6) that for any
parameter point $(p, \lambda)$ contained in the interior of the parameter
space (i) $\tilde{\lambda}$ is consistent for $\lambda$, (ii) the asymptotic variance of
$n^{1/2}(\hat{\lambda} - \lambda)$ and $n^{1/2}(\tilde{\lambda} - \lambda)$ are the same and hence $\tilde{\lambda}$ is an
asymptotically efficient estimator for $\lambda$, and (iii) for large samples
the bias in $\tilde{\lambda}$ is $-\lambda q/(np)$ where terms of order smaller than $n^{-1}$
are ignored.
These asymptotic results suggest that it may be difficult to
obtain accurate estimates of $\lambda$ especially when p is small and
$\lambda$ is large. However, these results are not necessarily indicative
of the properties of $\hat{\lambda}$ and $\tilde{\lambda}$ for reasonably sized finite samples.
An analysis of these estimators for finite samples is given in
Section 4.
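To see why small p combined with large λ is troublesome, the large-sample bias -λq/(np) from result (iii) can be tabulated directly. The Python sketch below does so for four (p, λ) pairs with p = .1 and for n = 40, 100 and 200; the function name is hypothetical and only the first-order term is computed.

```python
def asymptotic_bias(p, lam, n):
    """First-order bias of the ratio estimator: -lam * q / (n * p)."""
    q = 1.0 - p
    return -lam * q / (n * p)


if __name__ == "__main__":
    sample_sizes = (40, 100, 200)
    for lam in (0.3, 0.5, 0.7, 0.9):
        row = ["%8.4f" % asymptotic_bias(0.1, lam, n) for n in sample_sizes]
        print("p=0.1 lam=%.1f" % lam, *row)
```

For p = .1 and λ = .9 the first-order bias at n = 200 is about -.04, roughly 4.5 percent of λ, and it grows in proportion to 1/n as the sample shrinks.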
4. FINITE SAMPLE ANALYSIS
Comparisons by analytic methods of the small sample properties
of $\hat{\lambda}$ and $\tilde{\lambda}$ begin with the joint distribution of the sufficient
statistics for the dependent Bernoulli process. The distribution
is given in Klotz (1973). Because of the complexity of $\hat{\lambda}$ and the
distribution of the sufficient statistics it is not possible to
obtain simple closed form expressions for the mean and variance of
the competing estimators except for the case $p = \lambda = 1/2$. In that
case the expected value of $\tilde{\lambda}$ can be obtained by noting that
[display illegible in source]
(When $S = \sum_{i=1}^{n} X_i = 0$, $\tilde{\lambda}$ is defined to be zero). Then
[display illegible in source]
Using (4.1) leads to
[display illegible in source]
This expression is useful in that it is an exact result that serves
to confirm the existence of negative bias and is a reference point
that can be used for comparison with the empirical results that
follow. Expressions for the expected value of $\hat{\lambda}$ and the variances
of $\hat{\lambda}$ and $\tilde{\lambda}$ are very difficult to obtain in closed form even for
this simple case. It is doubtful that the expressions could be made
simple enough to be of much use in the remainder of this paper.
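The existence of negative bias at p = λ = 1/2 can also be checked by brute force for small n. The Python sketch below enumerates all 2^n equally likely sequences and averages a ratio estimator exactly with rational arithmetic. It assumes the form (R/(n-1))/(S/n), set to zero when S = 0; the normalization is an assumption, since the defining display is not legible in this copy. Under that assumption, conditioning on S gives E[R | S = k] = k(k-1)/n and hence the closed form (n/2 - 1 + 2^{-n})/(n-1), which the code uses as a cross-check.

```python
from fractions import Fraction
from itertools import product


def ratio_estimator_exact(x):
    """Assumed ratio estimator as an exact rational; zero when S = 0."""
    n = len(x)
    R = sum(x[i - 1] * x[i] for i in range(1, n))
    S = sum(x)
    if S == 0:
        return Fraction(0)
    return Fraction(R, n - 1) / Fraction(S, n)


def exact_mean_fair_coin(n):
    """E of the estimator when p = lam = 1/2 (all 2**n sequences equally likely)."""
    total = sum(ratio_estimator_exact(x) for x in product((0, 1), repeat=n))
    return total / 2 ** n


def closed_form_mean(n):
    """(n/2 - 1 + 2**-n) / (n - 1), obtained by conditioning on S."""
    return (Fraction(n, 2) - 1 + Fraction(1, 2 ** n)) / (n - 1)


if __name__ == "__main__":
    for n in (2, 3, 4, 8, 12):
        mean = exact_mean_fair_coin(n)
        print(n, mean, float(mean - Fraction(1, 2)))
```

The closed form equals 1/2 - (1/2 - 2^{-n})/(n-1), so under the stated assumption the bias is negative for every n ≥ 2 and of order 1/n.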
The remaining analysis is based on estimates of small sample
means and mean square errors that were obtained using the Monte
Carlo technique. Sequences of Bernoulli trials of length n = 40,
100, and 200 were used in the analysis. For each fixed pair of
parameter values and sample size ($(p, \lambda)$ and n), n dependent
Bernoulli variates were constructed using the random number
generation methods available on an IBM 370/145. The n variates were
used to compute $\hat{p}$, $\hat{\lambda}$ and $\tilde{\lambda}$. The process was replicated and means
and mean square errors were computed using the replications. For
each case (choice of p, $\lambda$, n) the number of replications was large
enough to ensure an estimation error less than .01 with probability
equal to 95%. The required number of replications was determined
using asymptotic expressions for variance and the normal
approximation.
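The Monte Carlo procedure described above can be sketched in Python. This is a minimal illustration, not the original IBM 370/145 program: the random number generator, the function names, the ratio estimator form (R/(n-1))/(S/n) and the asymptotic variance λ(1-λ)/(np) used to size the experiment are all assumptions. The variance expression is chosen because it reproduces the asymptotic n MSE column of Table IV; the replication count N solves 1.96 sqrt(Var/N) ≤ .01.

```python
import math
import random


def simulate_chain(n, p, lam, rng):
    """Generate n dependent Bernoulli variates (assumes 0 < p < 1)."""
    a = p * (1.0 - lam) / (1.0 - p)         # P[X_i = 1 | X_{i-1} = 0]
    x = [1 if rng.random() < p else 0]
    for _ in range(n - 1):
        prob_one = lam if x[-1] == 1 else a
        x.append(1 if rng.random() < prob_one else 0)
    return x


def ratio_estimator(x):
    """Assumed form (R / (n - 1)) / (S / n); zero when S = 0."""
    n = len(x)
    R = sum(x[i - 1] * x[i] for i in range(1, n))
    S = sum(x)
    return 0.0 if S == 0 else (R / (n - 1)) / (S / n)


def required_replications(p, lam, n, err=0.01, z=1.96):
    """Replications so that the Monte Carlo mean is within err with prob. 95%."""
    var = lam * (1.0 - lam) / (n * p)       # assumed asymptotic variance
    return math.ceil((z / err) ** 2 * var)


def monte_carlo(p, lam, n, reps, seed=0):
    """Return (mean, mean square error) of the estimator over reps samples."""
    rng = random.Random(seed)
    est = [ratio_estimator(simulate_chain(n, p, lam, rng)) for _ in range(reps)]
    mean = sum(est) / reps
    mse = sum((e - lam) ** 2 for e in est) / reps
    return mean, mse


if __name__ == "__main__":
    p, lam, n = 0.5, 0.5, 40
    reps = required_replications(p, lam, n)
    mean, mse = monte_carlo(p, lam, n, reps)
    print(reps, round(mean - lam, 4), round(mse, 4))
```

Because the replication count scales with λ(1-λ)/(np), cases with small p require many more replications than cases with large p at the same n.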
Eighteen different parameter pairs were used. The results on
estimating $\lambda$ appear in Tables I, II, and III.
5. DISCUSSION OF RESULTS
The empirical results show that the negative bias in the
estimators of $\lambda$ suggested by the asymptotic expressions holds for
finite samples. From Tables I through III it can be seen that there
is no essential difference in bias between $\hat{\lambda}$ and $\tilde{\lambda}$.
The empirical results confirm that serious bias exists in both
estimators for cases where p is small and $\lambda$ is large. See for
TABLE I
Empirical Results; n = 40
example the $(p, \lambda)$ pairs for p less than .5 and in particular
(.1, .7), (.1, .9), (.3, .7) and (.3, .9). Turning to efficiency,
the ratios $MSE(\tilde{\lambda})/MSE(\hat{\lambda})$ are relatively close to one in all cases.
However, there is a hint that $\hat{\lambda}$ may be slightly better than $\tilde{\lambda}$
when $p \ge .5$ and the sample is small. A final judgement cannot be
given on efficiency of $\hat{\lambda}$ versus $\tilde{\lambda}$ since the ratio $MSE(\tilde{\lambda})/MSE(\hat{\lambda})$
found in the tables is subject to sampling variation and the design
of the Monte Carlo experiment was not constructed to control this
variation.
Bias and mean square error for samples of size n = 200 are
compared with bias and mean square error determined from asymptotic
TABLE II
Empirical Results; n = 100
expressions in Table IV. It appears that samples of size 200 are
not sufficiently large for the asymptotic values to be accurate.
However, from a practical point of view, even at n = 200 the
asymptotic expressions may only cause serious trouble in a few cases
namely for $(p, \lambda)$ pairs (.1, .3), (.1, .5), (.1, .7) and (.1, .9).
It is clear that when the true value of p is smaller than .1,
with the exception of cases where $\lambda \le p$, both bias and mean square
error would be excessive and estimation would only be possible with
samples much larger than 200.
TABLE III
Empirical Results; n = 200
In general, a recommendation for samples much larger than 200
is sound unless there is some solid prior information available
about the plausible ranges of $\lambda$ and p. When both $\lambda$ and p
are unknown at the outset, an insufficient sample may make it
impossible to identify those processes that have very small p
and/or relatively large $\lambda$.
TABLE IV
Comparison of Empirical and Asymptotic Properties of $\tilde{\lambda}$ for n = 200

(p, λ)      Estimated   Asymptotic   Estimated     Asymptotic
            Bias*       Bias*        n MSE(λ̃)      n MSE(λ̃)
(.7, .7)    .0020       .0015        .346          .300
(.7, .9)    .0020       .0019        .124          .129
(.9, .9)    .0010       .0005        .094          .100

* All values are negative.
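The asymptotic columns of Table IV can be reproduced from two closed-form expressions. The bias -λq/(np) is result (iii) of Section 3. The variance expression λ(1-λ)/p for n^{1/2}(λ̃ - λ) is an assumption here, since the corresponding display is not legible in this copy; it is adopted because it matches the three tabulated values. The Python names are hypothetical.

```python
# Rows of Table IV: (p, lam, asymptotic |bias|, asymptotic n * MSE) at n = 200.
TABLE_IV = [
    (0.7, 0.7, 0.0015, 0.300),
    (0.7, 0.9, 0.0019, 0.129),
    (0.9, 0.9, 0.0005, 0.100),
]


def asymptotic_abs_bias(p, lam, n):
    """Magnitude of the first-order bias, lam * q / (n * p)."""
    return lam * (1.0 - p) / (n * p)


def asymptotic_n_mse(p, lam):
    """Assumed limiting variance lam * (1 - lam) / p; squared bias is ignored."""
    return lam * (1.0 - lam) / p


if __name__ == "__main__":
    for p, lam, bias, n_mse in TABLE_IV:
        print(p, lam,
              round(asymptotic_abs_bias(p, lam, 200), 4), bias,
              round(asymptotic_n_mse(p, lam), 3), n_mse)
```

Agreement to the printed precision in all three rows supports reading the asymptotic n MSE column as λ(1-λ)/p, although three rows cannot establish the formula on their own.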
ACKNOWLEDGEMENT
The author acknowledges the referee for the concise derivation
of $E(\tilde{\lambda})$ that appears in the paper and for other comments that have
been used to improve the presentation.
BIBLIOGRAPHY
Gabriel, K.R. (1959). The Distribution of the Number of Successes in a Sequence of Dependent Trials. Biometrika 46, 454-60.
Kabak, I. and Price, B. (1974). Ratio Estimates in Monte Carlo Simulations. Proc. 1974 Winter Simul. Conf. Elmont, N.Y.: Assoc. Comp. Mach., Inc.
Klotz, J. (1972). Markov Chain Clustering of Births by Sex. Proc. 6th Berkeley Symp. Math. Statist. Prob., II. Berkeley: Univ. of California.
Klotz, J. (1973). Statistical Inference in Bernoulli Trials with Dependence. Ann. Statist. 1, 373-9.
Rao, C.R. (1965). Linear Statistical Inference and Its Applications. New York: John Wiley and Sons, Inc.
Reiger, Mary H. (1968). A Two State Markov Model for Behavioral Change. J. Amer. Statist. Assoc. 63, 993-9.
Rustagi, J. and Srivastava, R.C. (1968). Parameter Estimation in a Markov Dependent Firing Distribution. Operat. Res. 16, 1222-7.
Received March 1975; revised December 1975; retyped February 1976.
Recommended by J. S. Rustagi, The Ohio State University.
Refereed anonymously.