[Wiley Series in Probability and Statistics] Methods and Applications of Linear Models (Regression and the Analysis of Variance) || Appendix B: Statistics


Appendix B: Statistics

This appendix contains a summary of some of the basic results in the classical theory of estimation and hypothesis testing. It is assumed that the reader is familiar with much of this material; the results are stated here for ease of reference and conform to the notation used in this book. (For more detail, see Mood, Graybill, and Boes (1974).)

B.I ESTIMATION

B.I.1 Maximum Likelihood Estimation

If $y$ is a vector of random variables whose density depends on the parameter vector $\theta$, then the density function $g(y, \theta)$, when viewed as a function of $\theta$, is called the likelihood function and written $L(\theta)$. Maximization of this function with respect to $\theta$ yields a solution, say $\hat{\theta}(y)$, that is called the maximum likelihood estimator. Evaluation of this function for the observed data $y$ yields a maximum likelihood estimate, denoted by $\hat{\theta}$. In this book the maximum will occur at a stationary point of the likelihood function or, equivalently, of its natural logarithm, denoted by $\ell(\theta)$.

For fixed effects models and certain mixed models, the maximizer has a closed-form expression and it is possible to establish some exact distributional results. In other cases, the maximum is determined numerically and we must appeal to general large-sample properties of the estimates. In particular, if we let $F$ denote the information matrix whose $(i, j)$th element is given by

\[
  f_{ij} = -E\!\left[ \frac{\partial^{2} \ell(\theta)}{\partial \theta_{i}\, \partial \theta_{j}} \right],
\]

then the asymptotic distribution of the maximum likelihood estimator is

\[
  \hat{\theta} \;\sim\; N\!\left( \theta,\, F^{-1} \right).
\]

A lower bound on the variance of an unbiased estimate of $\theta_i$ is given by the $i$th diagonal element of $F^{-1}$. The Cramér-Rao lower bound is given by the reciprocal of $f_{ii}$.
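As a minimal numerical illustration of these large-sample results (a sketch, not part of the original text; SciPy and names such as `neg_log_lik` are assumptions of the example), a normal log-likelihood is maximized numerically and the inverse information matrix is approximated by the optimizer's inverse-Hessian estimate:

```python
# Minimal sketch (not from the text): numerical maximum likelihood for an
# i.i.d. normal sample, with asymptotic standard errors taken from an
# approximation to F^{-1} (the BFGS inverse Hessian of the negative
# log-likelihood, in the (mu, log sigma) parameterization).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
y = rng.normal(loc=5.0, scale=2.0, size=200)      # simulated data

def neg_log_lik(theta, y):
    mu, log_sigma = theta                         # log scale keeps sigma > 0
    sigma = np.exp(log_sigma)
    # negative log-likelihood, up to an additive constant
    return -np.sum(-np.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2)

fit = minimize(neg_log_lik, x0=np.array([0.0, 0.0]), args=(y,), method="BFGS")
theta_hat = fit.x                                 # maximum likelihood estimate
cov_hat = fit.hess_inv                            # approximate F^{-1}
se = np.sqrt(np.diag(cov_hat))                    # asymptotic standard errors
print(theta_hat, se)
```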



B.I.2 Constrained Maximum Likelihood Estimation

In some applications, the allowable values of the maximizer are constrained to lie in a subspace, say $\Theta$, of the parameter space. In this case the maximum likelihood estimators are given by the solution of the constrained problem, written as

\[
  \max_{\theta}\; L(\theta) \quad \text{subject to: } \theta \in \Theta .
\]

With fixed-effects models, the constraints are defined by linear equalities. This special case yields closed-form solutions and exact distributional results. With mixed models the variance component estimates may be constrained to be non-negative. Except in special cases the solutions must be determined numerically, and there are no exact distributional results. (See Appendix A for a discussion of constrained optimization methods.)
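As a sketch of how such a constrained maximization might be carried out numerically (not from the text; the balanced one-way random model and all names below are illustrative assumptions), the variance components can be kept non-negative through simple bounds on the optimizer:

```python
# Minimal sketch (not from the text): constrained ML via bounded optimization
# for a balanced one-way random model y_ij = mu + a_i + e_ij, with the
# variance components constrained to be non-negative.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
a, n = 6, 10                                      # groups, observations per group
y = 2.0 + rng.normal(0.0, 1.0, size=(a, 1)) + rng.normal(0.0, 0.5, size=(a, n))

def neg_log_lik(theta, y):
    mu, s2a, s2e = theta
    a, n = y.shape
    V = s2e * np.eye(n) + s2a * np.ones((n, n))   # covariance matrix of one group
    _, logdet = np.linalg.slogdet(V)
    Vinv = np.linalg.inv(V)
    r = y - mu                                    # residuals, group by group
    quad = np.einsum("ij,jk,ik->", r, Vinv, r)    # sum_i r_i' Vinv r_i
    # negative log-likelihood, up to an additive constant
    return 0.5 * (a * logdet + quad)

fit = minimize(neg_log_lik, x0=np.array([0.0, 1.0, 1.0]), args=(y,),
               method="L-BFGS-B",
               bounds=[(None, None), (0.0, None), (1e-8, None)])
print(fit.x)    # (mu_hat, sigma2_a_hat, sigma2_e_hat), variance components >= 0
```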

B.I.3 Complete, Sufficient Statistics

Let $t(y)$ be a vector of statistics such that the conditional distribution of $y$ given $t$ does not depend on the parameter vector $\theta$. Then $t$ is said to be sufficient for $\theta$. We are interested in a minimal set of sufficient statistics that are complete in the sense that no nontrivial function of $t$ has expected value zero for all $\theta$.

Exponential Family. The exponential family of densities is defined by the general expression

\[
  g(y; \theta) = c(\theta)\, h(y) \exp\!\Big\{ \sum_{j} q_{j}(\theta)\, t_{j}(y) \Big\},
\]

where the range on the variables does not depend on the parameters. Subject to some mild conditions, the statistics $t_j(y)$ are a set of complete, sufficient statistics for $\theta$ and the density is said to be complete.
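For example (a standard illustration, not taken from the text), the joint density of an i.i.d. $N(\mu, \sigma^2)$ sample of size $n$ can be written in this form:

\[
  g(y; \mu, \sigma^{2})
  = (2\pi\sigma^{2})^{-n/2}
    \exp\!\Big\{ -\frac{n\mu^{2}}{2\sigma^{2}} \Big\}
    \exp\!\Big\{ \frac{\mu}{\sigma^{2}} \sum_{i} y_{i}
                 - \frac{1}{2\sigma^{2}} \sum_{i} y_{i}^{2} \Big\},
\]

so that $t_1(y) = \sum_i y_i$ and $t_2(y) = \sum_i y_i^2$ are a set of complete, sufficient statistics for $(\mu, \sigma^2)$.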

An extension of this result that applies to mixed models was given by Gautschi (1959), who showed that the family of distributions given by

is also complete. A discussion of the application of this extension to the mixed model is given by Clason and Murray (1992).

Rao-Blackwell Theorem. Let $g(y)$ be an unbiased estimator for a scalar parameter function $h(\theta)$ and let $t(y)$ be a vector of sufficient statistics. Then


there exists an estimator $p(t)$, a function of $t$, such that $E[p(t)] = h(\theta)$ and $\mathrm{Var}[p(t)] \le \mathrm{Var}[g(y)]$. This estimator can be constructed by determining the conditional expected value $p(t) = E[\,g(y) \mid t\,]$.

Lehmann-Scheffé Theorem. If $t$ is a complete, sufficient vector for $\theta$, then $p(t)$, as defined by the Rao-Blackwell theorem, is the unique minimum variance unbiased estimator for $h(\theta)$.
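As a standard illustration (not from the original text), for an i.i.d. $N(\mu, \sigma^2)$ sample the sample mean is a function of the complete, sufficient statistics identified above and is unbiased for $\mu$:

\[
  \bar{y} = \frac{1}{n}\, t_{1}(y) = \frac{1}{n} \sum_{i} y_{i}, \qquad E[\bar{y}] = \mu,
\]

so by the Lehmann-Scheffé theorem $\bar{y}$ is the unique minimum variance unbiased estimator of $h(\theta) = \mu$.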

B.II TESTS OF HYPOTHESES AND CONFIDENCE REGIONS

The test statistics in this book are derived using the likelihood ratio principle. To define the test statistic, let $L(\theta)$ denote the likelihood function, let the parameters be constrained by $\theta \in \Theta$, and assume that we are interested in testing the hypothesis $H_0\colon \theta \in \Theta_1$ against the alternative $H_A\colon \theta \notin \Theta_1$. Then we reject $H_0$ in favor of $H_A$ if $\tau \le \tau^*$, where

\[
  \tau = \frac{\max\, L(\theta) \ \text{subject to: } \theta \in \Theta_1 \cap \Theta}
              {\max\, L(\theta) \ \text{subject to: } \theta \in \Theta}\, .
\]

We choose $\tau^*$ so that

\[
  \Pr\!\big[\, \tau \le \tau^* \mid \theta \in \Theta_1 \,\big] \le \alpha .
\]

In practice, we may use any monotone function of $\tau$ as our test statistic.

The probability of rejecting $H_0$, as a function of $\theta$, is called the power of the test and is defined by

\[
  \pi(\theta) = \Pr\!\big[\, \tau \le \tau^* \mid \theta \,\big].
\]

The exact, small-sample distribution of the test statistic is generally difficult to obtain, but a useful approximation is that under the null hypothesis

\[
  -2 \log_{e}(\tau) \;\sim\; \chi^{2}(\nu),
\]

where $\nu$ denotes the number of independent constraints on the hypothesis.
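As a minimal numerical sketch of this approximation (not from the text; the normal-mean example and the SciPy calls below are assumptions of the illustration), consider testing $H_0\colon \mu = 0$ for an i.i.d. $N(\mu, \sigma^2)$ sample, where the hypothesis imposes $\nu = 1$ constraint:

```python
# Minimal sketch (not from the text): a likelihood ratio test of H0: mu = 0
# for an i.i.d. N(mu, sigma^2) sample, using -2 log(tau) ~ chi^2(1).
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2)
y = rng.normal(loc=0.3, scale=1.0, size=50)
n = y.size

s2_hat = np.mean((y - y.mean()) ** 2)   # unconstrained MLE of sigma^2 (mu free)
s2_0 = np.mean(y ** 2)                  # MLE of sigma^2 under H0 (mu = 0)

# The maximized log-likelihoods differ only through the log-variance terms,
# so -2 log(tau) reduces to n * log(s2_0 / s2_hat).
lrt = n * np.log(s2_0 / s2_hat)
p_value = chi2.sf(lrt, df=1)            # nu = 1 independent constraint
print(lrt, p_value)
```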

Confidence regions on $\theta$, or confidence intervals on individual parameters or functions of them, are generally obtained by inverting test statistics. In general, a $100(1 - \alpha)\%$ confidence region on $\theta$ consists of all values $\theta^*$ that are acceptable under the hypothesis $H_0\colon \theta = \theta^*$ of size $\alpha$.
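Continuing the sketch above (again an illustration, not from the text), an approximate 95% confidence interval for $\mu$ can be obtained by collecting all values $\mu^*$ that are not rejected by the size-0.05 likelihood ratio test of $H_0\colon \mu = \mu^*$:

```python
# Minimal sketch (not from the text), continuing the example above: a 95%
# confidence interval for mu obtained by inverting the likelihood ratio test.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2)
y = rng.normal(loc=0.3, scale=1.0, size=50)
n = y.size
s2_hat = np.mean((y - y.mean()) ** 2)

def lrt_stat(mu_star, y):
    s2_star = np.mean((y - mu_star) ** 2)   # MLE of sigma^2 with mu fixed
    return n * np.log(s2_star / s2_hat)

crit = chi2.ppf(0.95, df=1)                 # size-0.05 critical value
grid = np.linspace(y.mean() - 1.0, y.mean() + 1.0, 2001)
accepted = [m for m in grid if lrt_stat(m, y) <= crit]
print(min(accepted), max(accepted))         # approximate 95% interval for mu
```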