
METHODS FOR CONSTRUCTING ASYMPTOTICALLY NORMAL ESTIMATORS

by

Grete Usterud Fenstad

University of Oslo

September 1965

0. INTRODUCTION

In this paper, which is a sequel to [4], we shall give a rather general theorem on constructing asymptotically normal estimators with minimum asymptotic variances. This theorem generalizes previous work of Ferguson [5], Chiang [3] and Stene [6]. The methods of proof are similar to Chiang [3]; in fact, his methods carry over completely, and we therefore refer the reader to his paper for the details of proof.

The theorem is stated in Section 1. In Section 2 we use this result to give a simple treatment of the multinomial case, and in Section 3 we shall apply it to the gamma-distribution, considering estimators that can be constructed by means of the general method of Section 1.


ASSUMPTION 3. The $k \times s$ matrix

$$\dot A(\theta) = \begin{pmatrix} \dfrac{\partial A_1(\theta)}{\partial \theta_1} & \cdots & \dfrac{\partial A_s(\theta)}{\partial \theta_1} \\ \vdots & & \vdots \\ \dfrac{\partial A_1(\theta)}{\partial \theta_k} & \cdots & \dfrac{\partial A_s(\theta)}{\partial \theta_k} \end{pmatrix}$$

has rank $k$ for every $\theta \in \Theta$.

ASSUMPTION 4. There exists a symmetric, positive definite $s \times s$ matrix $d(\theta)$ such that

$$\dot A(\theta)\, d(\theta)\, a(\theta) = \dot A(\theta),$$

where $a(\theta) = (a_{ij}(\theta))$ is the covariance matrix of each $Z_\alpha$.

Assumption 4 generalizes the usual condition that $a(\theta)$ is positive definite. If $a(\theta)$ is positive definite one may choose $d(\theta) = a(\theta)^{-1}$. The generalization makes the treatment of the multinomial distribution very simple (see Section 2). The observation that Assumption 4 is sufficient is due to Ferguson [5].

In a previous paper [4; Theorem 1], we have proved that if $Z_\alpha$, $\alpha = 1, 2, \ldots$ satisfy Assumptions 1 to 4, there exists a consistent, asymptotically normal estimator of $\theta$. This estimator depends on $Z_\alpha$, $\alpha = 1, 2, \ldots$ only through

$$\bar Z_n = (\bar Z_{n1}, \ldots, \bar Z_{ns}) = \frac{1}{n} \sum_{\alpha=1}^{n} Z_\alpha$$

on a neighborhood $S$ of $R = A(\Theta) = \{A(\theta);\ \theta \in \Theta\}$. The estimator is continuously differentiable with respect to $\bar Z_n$ on $S$. We now make the following

DEFINITION. The matrix $A$ is greater than or equal to $B$ ($A \geq B$) if $A - B$ is positive semi-definite.

If $A$ and $B$ are two covariance matrices of the same order and $A \geq B$, it follows that the variances of $A$ are greater than or equal to the corresponding variances of $B$.

In the aforementioned paper it was also proved [4; Theorem 2] that any consistent estimator which is a function of $\bar Z_n$ on $S$ and continuously differentiable on $S$ has an asymptotic covariance matrix greater than or equal to

$$\frac{1}{n} \left[ \dot A(\theta_0)\, d(\theta_0)\, \dot A(\theta_0)' \right]^{-1}.$$

Thus the problem is to construct estimators with this asymptotic covariance matrix. The following theorem gives one particular method to construct such estimators.

THEOREM. Let the functions $f = (f_1, \ldots, f_s)$ on $\mathcal{R}^s \times \Theta$ to $\mathcal{R}^s$, $g = (g_1, \ldots, g_s)$ on $\mathcal{R}^s$ to $\mathcal{R}^s$, and $c_{ij}$, $i,j = 1, \ldots, s$ on $\mathcal{R}^s \times \Theta$ to $\mathcal{R}^1$ satisfy the conditions below. Define a quadratic form $Q$ by

$$Q(\bar Z_n, \theta) = [\bar Z_n - f(\bar Z_n, \theta)]\, c(\bar Z_n, \theta)\, [\bar Z_n - f(\bar Z_n, \theta)]',$$

where $c = (c_{ij}(\bar Z_n, \theta))$. Then

(i) as $n \to \infty$, there exists, with a probability tending to 1, one and only one function $\hat\theta(\bar Z_n)$ which locally minimizes the quadratic form $Q(\bar Z_n, \theta)$;

(ii) $\hat\theta(\bar Z_n)$ is a consistent estimator of $\theta_0$, the true parameter;

(iii) $\hat\theta(\bar Z_n)$ is continuously differentiable with respect to $\bar Z_n$;

(iv) $\sqrt{n}\,[\hat\theta(\bar Z_n) - \theta_0]$ is asymptotically normally distributed with mean 0;

(v) $\hat\theta(\bar Z_n)$ has asymptotic covariance matrix

$$\left[ \dot A(\theta_0)\, d(\theta_0)\, \dot A(\theta_0)' \right]^{-1}.$$

The conditions on $f$ are:

(a) $f$ may be differentiated twice with respect to $\theta$ and once with respect to $\bar Z_n$, and the derivatives are all simultaneously continuous in $\bar Z_n$ and $\theta$;

(b) the $k \times s$ matrix

$$\dot f(\theta) = \begin{pmatrix} \dfrac{\partial f_1(\bar Z_n, \theta)}{\partial \theta_1} & \cdots & \dfrac{\partial f_s(\bar Z_n, \theta)}{\partial \theta_1} \\ \vdots & & \vdots \\ \dfrac{\partial f_1(\bar Z_n, \theta)}{\partial \theta_k} & \cdots & \dfrac{\partial f_s(\bar Z_n, \theta)}{\partial \theta_k} \end{pmatrix}_{(A(\theta),\, \theta)}$$

is of rank $k$ for every $\theta \in \Theta$; and

(c) $\left( \dfrac{\partial f_i(\bar Z_n, \theta)}{\partial \bar Z_{nj}} \right)_{(A(\theta),\, \theta)} = 0$ for every $i,j = 1, \ldots, s$.

The conditions on $g$ are:

(d) $g$ is continuously differentiable with respect to $\bar Z_n$;

(e) $g(A(\theta)) = f(A(\theta), \theta)$ for every $\theta \in \Theta$;

(f) $g(A(\theta')) \neq f(A(\theta'), \theta)$ for every $\theta' \neq \theta$;

(g) the $s \times s$ matrix

$$\dot g(\theta) = \begin{pmatrix} \dfrac{\partial g_1(\bar Z_n)}{\partial \bar Z_{n1}} & \cdots & \dfrac{\partial g_s(\bar Z_n)}{\partial \bar Z_{n1}} \\ \vdots & & \vdots \\ \dfrac{\partial g_1(\bar Z_n)}{\partial \bar Z_{ns}} & \cdots & \dfrac{\partial g_s(\bar Z_n)}{\partial \bar Z_{ns}} \end{pmatrix}_{A(\theta)}$$

is non-singular for every $\theta \in \Theta$; and

(h) $\dot f(\theta) = \dot A(\theta)\, \dot g(\theta)$.

The conditions on $c = (c_{ij})$ are:

(i) $c_{ij}$, $i,j = 1, \ldots, s$ may be differentiated twice with respect to $\theta$ and once with respect to $\bar Z_n$, and the derivatives are simultaneously continuous in $\bar Z_n$ and $\theta$;

(j) $c$ is symmetric and positive definite for every $(\bar Z_n, \theta) \in \mathcal{R}^s \times \Theta$; and

(k) $\dot g(\theta)\, c(\theta)\, \dot g(\theta)' = d(\theta)$ for every $\theta \in \Theta$, where $c(\theta) = (c_{ij}(A(\theta), \theta))$.

The proof of this theorem is similar to the proof of Theorem 6 (and Theorem 2) in Chiang [3], and the reader may be referred to that paper for the details.

In the definition of $Q$ and in the conditions imposed on $f$, $g$ and $c$ we have implicitly assumed that the various requirements hold for all $\bar Z_n$ and $\theta$. It suffices, however, to require that $Q$ is defined on, and that (a)-(h) hold in, some open set containing $(A(\theta_0), \theta_0)$, and that (i)-(k) hold in some open set containing $A(\theta_0)$, because $\bar Z_n$ converges to $A(\theta_0)$ in probability and because we are only dealing with asymptotic properties.

The extension made in relation to Ferguson [5] and Chiang [3] is that we allow the function $f$ in the quadratic form to depend on both $\bar Z_n$ and $\theta$, requiring only that $f(A(\theta), \theta) = g(A(\theta))$.

This added freedom gives us a convenient method for finding an $f(\bar Z_n, \theta)$ in applying the theorem (this method was first proposed by Stene [6]): For simplicity, let $g(\bar Z_n) = \bar Z_n$; then $f(A(\theta), \theta)$ must be equal to $A(\theta)$. Let $\theta^*(\bar Z_n)$ be a consistent, continuously differentiable estimator of $\theta$. Expanding $A(\theta)$ in a Taylor series about $\theta = \theta^*(\bar Z_n)$, we obtain for the two first terms

$$A(\theta) \approx A(\theta^*(\bar Z_n)) + (\theta - \theta^*(\bar Z_n))\, \dot A(\theta^*(\bar Z_n)).$$

We then propose to choose

$$f(\bar Z_n, \theta) = A(\theta^*(\bar Z_n)) + (\theta - \theta^*(\bar Z_n))\, \dot A(\theta^*(\bar Z_n)),$$

and one easily verifies that this choice of $f$ satisfies the conditions of the theorem if $A(\theta)$ may be differentiated twice. This particular $f$ leads to linear equations if $c$ depends only on $\bar Z_n$.
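The defining property of this linearized $f$ is easy to check numerically. The sketch below is an illustration only, with a made-up one-parameter, one-component example $A(\theta) = e^\theta$ and preliminary estimator $\theta^*(z) = \ln z$ (neither is from the paper); it verifies condition (e), $f(A(\theta), \theta) = A(\theta) = g(A(\theta))$, which holds because $\theta^*(A(\theta)) = \theta$ collapses the expansion.

```python
import math

# Hypothetical example (not from the paper): A(theta) = exp(theta),
# with the consistent preliminary estimator theta*(z) = ln z.

def A(theta):
    return math.exp(theta)

def A_dot(theta):          # derivative of A with respect to theta
    return math.exp(theta)

def theta_star(z):         # preliminary consistent estimator
    return math.log(z)

def f(z, theta):
    """Stene's linearized f: the first two terms of the Taylor
    expansion of A(theta) about theta = theta*(z)."""
    t = theta_star(z)
    return A(t) + (theta - t) * A_dot(t)

# Condition (e): f(A(theta), theta) = A(theta) for every theta.
for theta in (0.3, 1.0, 2.5):
    assert abs(f(A(theta), theta) - A(theta)) < 1e-12
print("condition (e) holds numerically")
```

Note that for fixed $z$ the map $\theta \mapsto f(z, \theta)$ is affine, which is what makes the resulting estimating equations linear in $\theta$ when $c$ does not depend on $\theta$.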

An analogous generalization is possible for the linear form of Ferguson [5].

Let $f$ and $g$ satisfy the conditions (a)-(h) (actually, condition (a) on $f$ may be replaced by (a)': $f$ is continuously differentiable with respect to $(\bar Z_n, \theta)$), and let $b(\bar Z_n, \theta) = (b_{\mu i}(\bar Z_n, \theta))$ be a $k \times s$ matrix with real elements satisfying the conditions

(l) $b_{\mu i}$, $\mu = 1, \ldots, k$, $i = 1, \ldots, s$ is continuously differentiable with respect to $(\bar Z_n, \theta)$;

(m) $b(\theta)\, \dot f(\theta)'$ is non-singular for every $\theta \in \Theta$, where $b(\theta) = b(A(\theta), \theta)$; and

(n) $b(\theta)\, \dot g(\theta)' = \dot A(\theta)\, d(\theta)$.

Then, as $n \to \infty$, there exists with a probability tending to 1 one and only one function $\hat\theta(\bar Z_n)$ which locally solves the equation

$$L(\bar Z_n, \theta) = b(\bar Z_n, \theta)\, [\bar Z_n - f(\bar Z_n, \theta)]' = 0.$$

$\hat\theta(\bar Z_n)$ is a consistent estimator of $\theta_0$ and continuously differentiable with respect to $\bar Z_n$. Further, $\hat\theta(\bar Z_n)$ is asymptotically normally distributed with covariance matrix $\frac{1}{n} [\dot A(\theta_0)\, d(\theta_0)\, \dot A(\theta_0)']^{-1}$.

The proof of this statement may be carried out similarly to the proof of the theorem. The essential point is the use of the implicit function theorem.

The definition of $L$ and the conditions made on $f$, $g$ and $b$ need only be satisfied "locally", as the remarks made in connection with the above theorem also apply in this case.

2. APPLICATION TO THE MULTINOMIAL DISTRIBUTION

To illustrate the theorem of Section 1, we shall give an application to the multinomial distribution. Consider a multinomial experiment with $s$ mutually exclusive events $R_1, \ldots, R_s$. For each $i$ the probability of $R_i$ is $p_i(\theta)$, $p_i(\theta) > 0$. Let

$$Z_{\alpha i} = \begin{cases} 1 & \text{if } R_i \text{ occurs in the } \alpha\text{-th trial} \\ 0 & \text{if } R_i \text{ does not occur in the } \alpha\text{-th trial} \end{cases} \qquad i = 1, \ldots, s, \quad \alpha = 1, 2, \ldots
$$


Using the notation already introduced we have

$$Z_\alpha = (Z_{\alpha 1}, \ldots, Z_{\alpha s}), \qquad \alpha = 1, 2, \ldots,$$

$$A(\theta) = (p_1(\theta), \ldots, p_s(\theta)), \qquad \sum_{i=1}^{s} p_i(\theta) = 1,$$

$$a(\theta) = \big( \delta_{ij}\, p_i(\theta) - p_i(\theta)\, p_j(\theta) \big).$$

Because of the condition $\sum_{i=1}^{s} p_i(\theta) = 1$ we have that $a(\theta)$ is singular, but

$$d(\theta) = \begin{pmatrix} p_1(\theta)^{-1} & & 0 \\ & \ddots & \\ 0 & & p_s(\theta)^{-1} \end{pmatrix}$$

satisfies Assumption 4.

It is tacitly understood that Assumptions 2 and 3 are satisfied.

If we generate estimators of $\theta$ by minimizing $Q(\bar Z_n, \theta)$ or solving $L(\bar Z_n, \theta) = 0$ we get consistent, asymptotically normal estimators with covariance matrix $\frac{1}{n}[\dot A(\theta)\, d(\theta)\, \dot A(\theta)']^{-1}$, which is also the Cramér-Rao lower bound for estimators of $\theta$, viz.

$$\frac{1}{n} \left( - E \left[ \frac{\partial^2 \ln p(Z_\alpha; \theta)}{\partial \theta_\mu\, \partial \theta_\nu} \right] \right)^{-1},$$

where

$$p(z_\alpha; \theta) = \prod_{i=1}^{s} p_i(\theta)^{z_{\alpha i}}.$$


(It has been shown ([1] and [4]) that if the common probability density of $Z_\alpha$, $\alpha = 1, 2, \ldots$, $p(\cdot\,; \theta)$, satisfies some additional regularity conditions, estimators generated by one of these methods have asymptotic covariance matrix equal to the Cramér-Rao lower bound if and only if $p(\cdot\,; \theta)$ is of the exponential type, i.e.

$$p(z; \theta) = \exp \Big\{ \alpha_0(\theta) + \beta(z) + \sum_{i=1}^{s} \alpha_i(\theta)\, z_i \Big\}.\Big)$$

We may, if $p_i(\theta)$, $i = 1, \ldots, s$ is twice continuously differentiable, for example choose in the quadratic form

$$f_i(\bar Z_n, \theta) = p_i(\theta), \qquad g_i(\bar Z_n) = \bar Z_{ni}, \qquad c_{ij}(\bar Z_n, \theta) = \delta_{ij}\, p_i(\theta)^{-1}, \qquad i,j = 1, \ldots, s,$$

or

$$f_i(\bar Z_n, \theta) = p_i(\theta), \qquad g_i(\bar Z_n) = \bar Z_{ni}, \qquad c_{ij}(\bar Z_n, \theta) = \delta_{ij}\, \bar Z_{nj}^{-1}, \qquad i,j = 1, \ldots, s,$$

getting respectively the minimum-$\chi^2$ estimators and the minimum modified-$\chi^2$ estimators.

This is easily generalized to the case of several independent multinomial experiments depending on the same basic parameter $\theta$. In this case special choices of $f$, $g$ and $c$ will give the result of Taylor [8; Theorem, p. 88] and thus, as pointed out by Taylor, Berkson's minimum logit $\chi^2$ method [2].
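As a purely numerical illustration (not part of the paper), the sketch below computes a minimum modified-$\chi^2$ estimate, i.e. the second choice of $c$ above, for a hypothetical one-parameter trinomial with Hardy-Weinberg cell probabilities $p(\theta) = (\theta^2,\ 2\theta(1-\theta),\ (1-\theta)^2)$; the observed frequencies are made-up numbers.

```python
import math

# Minimum modified chi-square for a trinomial with Hardy-Weinberg
# cell probabilities (hypothetical example, not from the paper).
# Here f_i = p_i(theta), g_i(Z) = Z_i and c_ij = delta_ij / Z_nj.

def p(theta):
    return (theta**2, 2*theta*(1 - theta), (1 - theta)**2)

def modified_chi2(z, theta):
    # quadratic form Q(Z_n, theta) with weights 1/Z_ni
    return sum((zi - pi)**2 / zi for zi, pi in zip(z, p(theta)))

def minimize(z, lo=1e-6, hi=1 - 1e-6, iters=200):
    # crude golden-section search; adequate for a 1-d illustration
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c1, c2 = b - g*(b - a), a + g*(b - a)
        if modified_chi2(z, c1) < modified_chi2(z, c2):
            b = c2
        else:
            a = c1
    return (a + b) / 2

# observed relative frequencies from n trials (made-up numbers)
z = (0.36, 0.47, 0.17)
theta_hat = minimize(z)
print(round(theta_hat, 3))
```

With these frequencies the minimizer lands near $\theta = 0.6$, the value that reproduces the first cell frequency $0.36 = 0.6^2$ almost exactly.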


3. APPLICATION TO THE GAMMA-DISTRIBUTION

We shall use the results of Section 1 to propose some new estimators of the unknown parameters in the gamma-distribution. Let $X_j$, $j = 1, 2, \ldots$ be independent, identically distributed random variables with probability density

$$p(x; \alpha, \sigma) = \frac{1}{\Gamma(\alpha)\, \sigma^{\alpha}}\, x^{\alpha - 1} e^{-x/\sigma}, \qquad x > 0, \quad \alpha, \sigma > 0,$$

with respect to the Lebesgue measure.

The maximum likelihood estimators of $\alpha$, $\sigma$ are found as the solution of the equations

$$(1) \qquad \psi(\alpha^*) + \ln \sigma^* = \bar Z_{n2}, \qquad \alpha^* \sigma^* = \bar Z_{n1},$$

where

$$\bar Z_{n1} = \frac{1}{n} \sum_{j=1}^{n} X_j, \qquad \bar Z_{n2} = \frac{1}{n} \sum_{j=1}^{n} \ln X_j,$$

and $\psi(\alpha) = \Gamma'(\alpha)/\Gamma(\alpha)$.

$(\alpha^*, \sigma^*)$ is asymptotically normally distributed with mean $(\alpha, \sigma)$ and covariance matrix

$$\frac{1}{n} \cdot \frac{1}{\alpha \psi'(\alpha) - 1} \begin{pmatrix} \alpha & -\sigma \\ -\sigma & \psi'(\alpha)\, \sigma^2 \end{pmatrix}$$

(see for example Sverdrup [7; p. 120]).
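Eliminating $\sigma^* = \bar Z_{n1}/\alpha^*$ from (1) leaves the scalar equation $\psi(\alpha) - \ln \alpha = \bar Z_{n2} - \ln \bar Z_{n1}$, whose left side is strictly increasing in $\alpha$, so bisection applies. The Python sketch below is illustrative only (the digamma series and the simulation settings are my choices, not the paper's); `random.gammavariate(alpha, beta)` has mean `alpha*beta`, matching the $(\alpha, \sigma)$ parametrization of the text.

```python
import math
import random

def digamma(x):
    # psi(x) via the recurrence psi(x) = psi(x+1) - 1/x and the
    # standard asymptotic series for large arguments
    s = 0.0
    while x < 8.0:
        s -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return s + math.log(x) - 0.5/x - inv2*(1/12 - inv2*(1/120 - inv2/252))

def gamma_mle(xs):
    """Solve psi(a) - ln a = mean(ln x) - ln mean(x) by bisection,
    then recover sigma* = mean(x)/a. The right side is <= 0 by
    Jensen's inequality, the left side increases from -inf to 0."""
    zbar1 = sum(xs) / len(xs)
    zbar2 = sum(math.log(x) for x in xs) / len(xs)
    target = zbar2 - math.log(zbar1)
    lo, hi = 1e-8, 1.0
    while digamma(hi) - math.log(hi) < target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if digamma(mid) - math.log(mid) < target:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    return a, zbar1 / a          # (alpha*, sigma*)

random.seed(1)
xs = [random.gammavariate(2.0, 3.0) for _ in range(20000)]
a, s = gamma_mle(xs)
print(a, s)
```

With $n = 20000$ the estimates should sit close to the simulated values $(\alpha, \sigma) = (2, 3)$.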


" " We get another set of estimators (((, cr) by solving the

equations

EX = 1 n

LX.= X n j=1 J

var X 1 n 2

= L (X. - X) = n j=1 J

One easily finds

EX = c<.. 0'"' and var X = cL C5' 2 ,

thus we get

" -2 X 0(_=

s2 ' " "

It may be shown that (ex., cr) is asymptotically normal distributed

with mean ( o(., <1 ) and covariance matrix

11 2o£.(ot.+1) iii

\-2(ot ... +1 )0"

-2 ( 0(_ + 1 )<J \

2 o( + 3 0'2 J oL

The asymptotic variances of $\alpha^*$ and $\sigma^*$ are smaller than the asymptotic variances of $\hat\alpha$ and $\hat\sigma$ (this will be shown later). This suggests that $(\alpha^*, \sigma^*)$ should be preferred as estimator, since we do not know any non-asymptotic properties of $(\alpha^*, \sigma^*)$ or $(\hat\alpha, \hat\sigma)$. However, a Monte-Carlo experiment has suggested, for some finite $n$ and special values of $(\alpha, \sigma)$, that the estimator $(\hat\alpha, \hat\sigma)$ gives values closer to the real parameter than $(\alpha^*, \sigma^*)$.

Introducing

$$\bar Z_{n3} = \frac{1}{n} \sum_{j=1}^{n} X_j^2,$$

we may write

$$(2) \qquad \hat\alpha = \frac{\bar Z_{n1}^2}{\bar Z_{n3} - \bar Z_{n1}^2}, \qquad \hat\sigma = \frac{\bar Z_{n3} - \bar Z_{n1}^2}{\bar Z_{n1}}.$$
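A direct sketch of the moment estimators (2); the simulated sample is an illustrative choice, not from the paper, and the $1/n$ variance (rather than $1/(n-1)$) follows the definition of $s^2$ above.

```python
import random

# Method-of-moments estimators (2): alpha_hat = Xbar^2 / s^2,
# sigma_hat = s^2 / Xbar, from EX = alpha*sigma, var X = alpha*sigma^2.

def gamma_moments(xs):
    n = len(xs)
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / n   # 1/n, as in the text
    return xbar * xbar / s2, s2 / xbar

random.seed(2)
# random.gammavariate(alpha, beta) has mean alpha*beta, matching the
# (alpha, sigma) parametrization of the text
xs = [random.gammavariate(2.0, 3.0) for _ in range(20000)]
a_hat, s_hat = gamma_moments(xs)
print(a_hat, s_hat)
```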


Thus $(\alpha^*, \sigma^*)$ depends on $(\bar Z_{n1}, \bar Z_{n2})$ and $(\hat\alpha, \hat\sigma)$ depends on $(\bar Z_{n1}, \bar Z_{n3})$. We shall propose an estimator depending on $\bar Z_n = (\bar Z_{n1}, \bar Z_{n2}, \bar Z_{n3})$, hoping it will have the asymptotic properties of $(\alpha^*, \sigma^*)$ and the finite properties of $(\hat\alpha, \hat\sigma)$.

Put

$$Z_j = (X_j,\ \ln X_j,\ X_j^2);$$

then $Z_j$, $j = 1, 2, \ldots$ is a sequence of independent, identically distributed random vectors. In order to apply the theorem of Section 1 we need the first and second order moments of $Z = (X, \ln X, X^2)$ (deleting the index $j$).

We know that

$$(3) \qquad \Gamma(\alpha)\, \sigma^{\alpha} = \int_0^{\infty} x^{\alpha - 1} e^{-x/\sigma}\, dx;$$

differentiating on both sides with respect to $\alpha$ gives

$$(4) \qquad \Gamma'(\alpha)\, \sigma^{\alpha} + \Gamma(\alpha)\, \sigma^{\alpha} \ln \sigma = \int_0^{\infty} \ln x \cdot x^{\alpha - 1} e^{-x/\sigma}\, dx,$$

and once more

$$(5) \qquad \Gamma''(\alpha)\, \sigma^{\alpha} + 2\Gamma'(\alpha)\, \sigma^{\alpha} \ln \sigma + \Gamma(\alpha)\, \sigma^{\alpha} (\ln \sigma)^2 = \int_0^{\infty} (\ln x)^2\, x^{\alpha - 1} e^{-x/\sigma}\, dx.$$

Substituting $\alpha + r$ for $\alpha$ in (3) and dividing by $\Gamma(\alpha)\, \sigma^{\alpha}$ gives

$$E X^r = \frac{\Gamma(\alpha + r)}{\Gamma(\alpha)}\, \sigma^r.$$

In the same way we obtain from (4)

$$E X^r \ln X = \left( \frac{\Gamma'(\alpha + r)}{\Gamma(\alpha)} + \frac{\Gamma(\alpha + r)}{\Gamma(\alpha)} \ln \sigma \right) \sigma^r,$$

and from (5)

$$E X^r (\ln X)^2 = \left( \frac{\Gamma''(\alpha + r)}{\Gamma(\alpha)} + 2\, \frac{\Gamma'(\alpha + r)}{\Gamma(\alpha)} \ln \sigma + \frac{\Gamma(\alpha + r)}{\Gamma(\alpha)} (\ln \sigma)^2 \right) \sigma^r.$$
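The moment formula $E X^r = \Gamma(\alpha + r)\, \sigma^r / \Gamma(\alpha)$ can be spot-checked by simulation; the values $\alpha = 2$, $\sigma = 3$ below are arbitrary test values, not from the paper.

```python
import math
import random

# Monte Carlo check of E X^r = Gamma(alpha + r) sigma^r / Gamma(alpha).

def ex_r(alpha, sigma, r):
    return math.gamma(alpha + r) * sigma**r / math.gamma(alpha)

random.seed(3)
alpha, sigma = 2.0, 3.0
xs = [random.gammavariate(alpha, sigma) for _ in range(200000)]

for r in (1, 2, 3):
    empirical = sum(x**r for x in xs) / len(xs)
    # within 5% relative error at this sample size
    assert abs(empirical / ex_r(alpha, sigma, r) - 1) < 0.05
print("moment formula confirmed for r = 1, 2, 3")
```

For $r = 1$ the formula reduces to $EX = \alpha\sigma$, and for $r = 2$ to $EX^2 = \alpha(\alpha+1)\sigma^2$, the values used for $A(\alpha, \sigma)$ below.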


These formulae give us

$$A(\alpha, \sigma) = \big( \alpha\sigma,\ \psi(\alpha) + \ln \sigma,\ \alpha(\alpha + 1)\sigma^2 \big)$$

and

$$a(\alpha, \sigma) = \begin{pmatrix} \alpha\sigma^2 & \sigma & 2\alpha(\alpha + 1)\sigma^3 \\ \sigma & \psi'(\alpha) & (2\alpha + 1)\sigma^2 \\ 2\alpha(\alpha + 1)\sigma^3 & (2\alpha + 1)\sigma^2 & 2\alpha(\alpha + 1)(2\alpha + 3)\sigma^4 \end{pmatrix}.$$

Differentiating $A(\alpha, \sigma)$ we get

$$\dot A(\alpha, \sigma) = \begin{pmatrix} \sigma & \psi'(\alpha) & (2\alpha + 1)\sigma^2 \\ \alpha & \dfrac{1}{\sigma} & 2\alpha(\alpha + 1)\sigma \end{pmatrix}.$$

Assumptions 1 to 3 of Section 1 are easily verified to hold. The covariance matrix $a(\alpha, \sigma)$ is positive definite for all $(\alpha, \sigma)$; if not, there would have been a linear relation between $X$, $\ln X$ and $X^2$ with probability 1 (Sverdrup [7; Ch. XII, p. 16]). We may therefore choose $d(\alpha, \sigma) = a(\alpha, \sigma)^{-1}$ in Assumption 4. Thus, applying one of the methods in Section 1, the constructed estimators will have asymptotic covariance matrix

$$\frac{1}{n} \left[ \dot A(\alpha, \sigma)\, d(\alpha, \sigma)\, \dot A(\alpha, \sigma)' \right]^{-1}.$$

To evaluate this matrix it is necessary to calculate $d(\alpha, \sigma) = a(\alpha, \sigma)^{-1}$:

$$d(\alpha, \sigma) = \frac{1}{|a(\alpha, \sigma)|} \begin{pmatrix} \big( 2\alpha(\alpha{+}1)(2\alpha{+}3)\psi'(\alpha) - (2\alpha{+}1)^2 \big)\sigma^4 & -4\alpha(\alpha{+}1)\sigma^5 & \big( {-}2\alpha(\alpha{+}1)\psi'(\alpha) + (2\alpha{+}1) \big)\sigma^3 \\ -4\alpha(\alpha{+}1)\sigma^5 & 2\alpha^2(\alpha{+}1)\sigma^6 & \alpha\sigma^4 \\ \big( {-}2\alpha(\alpha{+}1)\psi'(\alpha) + (2\alpha{+}1) \big)\sigma^3 & \alpha\sigma^4 & \big( \alpha\psi'(\alpha) - 1 \big)\sigma^2 \end{pmatrix}$$

where $|a(\alpha, \sigma)| = \alpha \big[ 2\alpha(\alpha + 1)\psi'(\alpha) - (2\alpha + 3) \big] \sigma^6$. This gives

$$(6) \qquad \frac{1}{n} \cdot \frac{1}{\alpha\psi'(\alpha) - 1} \begin{pmatrix} \alpha & -\sigma \\ -\sigma & \psi'(\alpha)\, \sigma^2 \end{pmatrix},$$

which is the same asymptotic covariance matrix as the one given above for the maximum likelihood estimators.

REMARK. We are now able to show, as previously stated, that the asymptotic variances of $\hat\alpha$ and $\hat\sigma$ are greater than the asymptotic variances of $\alpha^*$ and $\sigma^*$. Since $a(\alpha, \sigma)$ is positive definite, it follows that

$$|a(\alpha, \sigma)| = \alpha \big[ 2\alpha(\alpha + 1)\psi'(\alpha) - (2\alpha + 3) \big] \sigma^6 > 0,$$

and therefore

$$\psi'(\alpha) > \frac{2\alpha + 3}{2\alpha(\alpha + 1)}.$$

Thus the ratios between the asymptotic variances are

$$\frac{\frac{1}{n}\, 2\alpha(\alpha + 1)}{\frac{1}{n} \cdot \frac{\alpha}{\alpha\psi'(\alpha) - 1}} = 2\alpha(\alpha + 1)\psi'(\alpha) - 2(\alpha + 1) > (2\alpha + 3) - 2(\alpha + 1) = 1$$

and

$$\frac{\frac{1}{n} \cdot \frac{2\alpha + 3}{\alpha}\, \sigma^2}{\frac{1}{n} \cdot \frac{\psi'(\alpha)\, \sigma^2}{\alpha\psi'(\alpha) - 1}} = (2\alpha + 3) - \frac{2\alpha + 3}{\alpha\psi'(\alpha)} > (2\alpha + 3) - 2(\alpha + 1) = 1,$$

the last inequality because $\psi'(\alpha) > \frac{2\alpha + 3}{2\alpha(\alpha + 1)}$ gives $\frac{2\alpha + 3}{\alpha\psi'(\alpha)} < 2(\alpha + 1)$.
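Both ratios can be checked numerically over a range of $\alpha$ values with a trigamma routine; the implementation below (recurrence plus asymptotic series) is an illustrative sketch, not from the paper.

```python
# Numerical check that both asymptotic variance ratios exceed 1,
# i.e. that the moment estimators are less efficient than the
# maximum likelihood estimators, for several values of alpha.

def trigamma(x):
    # psi'(x) via the recurrence psi'(x) = psi'(x+1) + 1/x^2 and
    # the standard asymptotic series for large arguments
    s = 0.0
    while x < 8.0:
        s += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    return s + inv + 0.5*inv2 + inv*inv2*(1/6 - inv2*(1/30 - inv2/42))

for alpha in (0.5, 1.0, 2.0, 5.0, 20.0):
    t = trigamma(alpha)
    ratio_alpha = 2*(alpha + 1)*(alpha*t - 1)
    ratio_sigma = (2*alpha + 3)*(alpha*t - 1)/(alpha*t)
    assert ratio_alpha > 1 and ratio_sigma > 1
print("both ratios exceed 1")
```

For large $\alpha$ both ratios approach 1 from above, so the efficiency loss of the moment estimators is most severe for small $\alpha$.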

At this point we have to choose the functions $f$ and $g$ satisfying the conditions (a)-(h), and the matrix $c$ satisfying (i)-(k) if we want to use the quadratic form $Q$, or the matrix $b$ satisfying (l)-(n) if we want to use the linear form $L$. Each possible choice of these functions will result in estimators which are asymptotically normally distributed with mean $(\alpha, \sigma)$ and covariance matrix (6).


We shall give some examples of specific choices of $f$, $g$, $c$ and $b$.

Let

$$f(\bar Z_n, \alpha, \sigma) = A(\alpha, \sigma), \qquad g(\bar Z_n) = \bar Z_n;$$

then $\dot g(\alpha, \sigma) = I$ and $\dot f(\alpha, \sigma) = \dot A(\alpha, \sigma)$. It is easily proved that $f$ and $g$ satisfy conditions (a)-(h). To apply $L$ we need a $2 \times 3$ matrix $b(\bar Z_n, \alpha, \sigma)$ satisfying (n):

$$b(A(\alpha, \sigma), \alpha, \sigma) = b(\alpha, \sigma) = \dot A(\alpha, \sigma)\, d(\alpha, \sigma)\, (\dot g(\alpha, \sigma)')^{-1} = \dot A(\alpha, \sigma)\, d(\alpha, \sigma) = \begin{pmatrix} 0 & 1 & 0 \\ \sigma^{-2} & 0 & 0 \end{pmatrix}.$$

We simply choose

$$b(\bar Z_n, \alpha, \sigma) = \begin{pmatrix} 0 & 1 & 0 \\ \sigma^{-2} & 0 & 0 \end{pmatrix}.$$

The conditions (l) and (m) are now satisfied. The solution of

$$L(\bar Z_n, \alpha, \sigma) = b(\bar Z_n, \alpha, \sigma)\, [\bar Z_n - A(\alpha, \sigma)]' = 0$$

with respect to $(\alpha, \sigma)$ gives estimators with the stated properties. Because of the simple form of $b(\bar Z_n, \alpha, \sigma)$, the equation above is equivalent to

$$\bar Z_{n2} = A_2(\alpha, \sigma), \qquad \bar Z_{n1} = A_1(\alpha, \sigma).$$

Thus we get the estimators $(\tilde\alpha, \tilde\sigma)$ as the solution of

$$\psi(\tilde\alpha) + \ln \tilde\sigma = \bar Z_{n2}, \qquad \tilde\alpha\, \tilde\sigma = \bar Z_{n1},$$

which is seen to be the maximum likelihood estimators (1).


Another possibility is to choose

$$g(\bar Z_n) = \bar Z_n, \qquad f(\bar Z_n, \alpha, \sigma) = A(\hat\alpha, \hat\sigma) + (\alpha - \hat\alpha,\ \sigma - \hat\sigma)\, \dot A(\hat\alpha, \hat\sigma).$$

$f(\bar Z_n, \alpha, \sigma)$ is obtained as mentioned after the theorem in Section 1; we have expanded $A(\alpha, \sigma)$ "about" the estimator (2). One may verify that the conditions (a)-(h) are satisfied, and that

$$c(\bar Z_n, \alpha, \sigma) = d(\hat\alpha, \hat\sigma)$$

satisfies conditions (i)-(k). Differentiating

$$Q(\bar Z_n, \alpha, \sigma) = [\bar Z_n - f(\bar Z_n, \alpha, \sigma)]\, d(\hat\alpha, \hat\sigma)\, [\bar Z_n - f(\bar Z_n, \alpha, \sigma)]'$$

with respect to $(\alpha, \sigma)$ and setting the derivatives equal to 0 we get

$$-2\, \dot A(\hat\alpha, \hat\sigma)\, d(\hat\alpha, \hat\sigma)\, [\bar Z_n - f(\bar Z_n, \alpha, \sigma)]' = 0.$$

(In fact we would have obtained an equivalent equation using $b(\bar Z_n, \alpha, \sigma) = \dot A(\hat\alpha, \hat\sigma)\, d(\hat\alpha, \hat\sigma)$ in the linear form $L$.) As

$$\dot A(\hat\alpha, \hat\sigma)\, d(\hat\alpha, \hat\sigma) = \begin{pmatrix} 0 & 1 & 0 \\ \hat\sigma^{-2} & 0 & 0 \end{pmatrix},$$

the equation is equivalent to

$$\bar Z_{n2} = f_2(\bar Z_n, \alpha, \sigma), \qquad \bar Z_{n1} = f_1(\bar Z_n, \alpha, \sigma).$$

Using the fact that $\hat\alpha \hat\sigma = \bar Z_{n1}$, we get the estimator $(\check\alpha, \check\sigma)$ defined by

$$\check\alpha = \hat\alpha \left[ 1 + \frac{\bar Z_{n2} - \psi(\hat\alpha) - \ln \hat\sigma}{\hat\alpha\, \psi'(\hat\alpha) - 1} \right], \qquad \check\sigma = \hat\sigma \left[ 1 - \frac{\bar Z_{n2} - \psi(\hat\alpha) - \ln \hat\sigma}{\hat\alpha\, \psi'(\hat\alpha) - 1} \right].$$


" ... * •* Substituting (OL , a' ) for ( ~, a) in the expression

" " " zn2- 'P( o() -ln if

I

d. 'yJ (ex.) -1

" " "' F(Z~1' 2n2' 2n3) =

we see that F = 0 in this case. Generally it is to be expected

that F

hope that

is small (at least asymptotically, ~0 ), thus one may v v

(of.., if ) is an estimator with finite properties similar " ...

to those of ( cJ.., 0" ), having at the same time the best possible

asymptotic properties. However, the only way to obtain a more de-

finite conclusion concerning the finite properties of the estimators

considered in this section, is through numerical calculations.
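The proposed estimator $(\check\alpha, \check\sigma)$ is cheap to compute from the three means $\bar Z_{n1}$, $\bar Z_{n2}$, $\bar Z_{n3}$ alone; the following sketch applies the correction term $F$ to the moment estimators (2). The digamma/trigamma series and the simulation settings are illustrative choices, not from the paper.

```python
import math
import random

# One-step corrected estimator: moment estimators (2) adjusted by
# F = (Z_n2 - psi(a) - ln s) / (a psi'(a) - 1).

def digamma(x):
    s = 0.0
    while x < 8.0:
        s -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return s + math.log(x) - 0.5/x - inv2*(1/12 - inv2*(1/120 - inv2/252))

def trigamma(x):
    s = 0.0
    while x < 8.0:
        s += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    return s + inv + 0.5*inv2 + inv*inv2*(1/6 - inv2*(1/30 - inv2/42))

def check_estimator(xs):
    n = len(xs)
    z1 = sum(xs) / n
    z2 = sum(math.log(x) for x in xs) / n
    z3 = sum(x * x for x in xs) / n
    a_hat = z1 * z1 / (z3 - z1 * z1)          # moment estimators (2)
    s_hat = (z3 - z1 * z1) / z1
    F = (z2 - digamma(a_hat) - math.log(s_hat)) / (a_hat*trigamma(a_hat) - 1)
    return a_hat * (1 + F), s_hat * (1 - F)   # (alpha-check, sigma-check)

random.seed(4)
xs = [random.gammavariate(2.0, 3.0) for _ in range(20000)]
a_t, s_t = check_estimator(xs)
print(a_t, s_t)
```

The denominator $\hat\alpha\, \psi'(\hat\alpha) - 1$ is always positive, so the correction is well defined for any sample with non-degenerate variance.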

REFERENCES

[1] E.W. Barankin and J. Gurland (1951): "On asymptotically normal, efficient estimators: I", University of California Publications in Statistics, Vol. 1, No. 6, pp. 89-130.

[2] J. Berkson (1944): "Application of the logistic function to bio-assay", J. Amer. Statist. Ass., Vol. 39, pp. 357-365.

[3] C.L. Chiang (1956): "On regular best asymptotically normal estimates", Ann. Math. Stat., Vol. 27, pp. 336-351.

[4] G.U. Fenstad (1965): "On asymptotically normal estimators", University of Oslo (to be published).

[5] T.S. Ferguson (1958): "A method of generating best asymptotically normal estimates with application to the estimation of bacterial densities", Ann. Math. Stat., Vol. 29, pp. 1046-1062.

[6] J. Stene (1963): "Bidrag til teorien om beste, asymptotisk normale estimatorer" [Contributions to the theory of best, asymptotically normal estimators], University of Oslo (unpublished).

[7] E. Sverdrup (1964): "Lov og tilfeldighet II" [Law and Chance II], Universitetsforlaget, Oslo.

[8] W.F. Taylor (1953): "Distance functions and regular best asymptotically normal estimates", Ann. Math. Stat., Vol. 24, pp. 85-92.