
International Journal of Basic & Applied Sciences IJBAS-IJENS, Vol. 10, No. 06, December 2010 (paper 109406-8484)

Estimation for Multivariate Linear Mixed Models

I Nyoman Latra1, Susanti Linuwih2, Purhadi2, and Suhartono2

1 Doctoral Candidate, Department of Statistics, FMIPA-ITS Surabaya. Email: [email protected]

2 Promotor and Co-promotors, Department of Statistics, FMIPA-ITS Surabaya.

Abstract: This paper discusses estimation for the multivariate linear mixed model, or multivariate variance components model, with equal numbers of replications. We focus on two estimation methods, namely Maximum Likelihood Estimation (MLE) and Restricted Maximum Likelihood Estimation (REMLE). The results show that estimation of the fixed effects yields unbiased estimators, whereas estimation of the random effects or variance components yields biased estimators. Moreover, assuming that both the likelihood and ln-likelihood functions satisfy certain regularity conditions, it can be proved that the estimators, as a solution set of the likelihood equations, are strongly consistent for large sample sizes, asymptotically normal, and efficient.

Key Words: Linear Mixed Model, Multivariate Linear Model, Maximum Likelihood, Asymptotic Normality and Efficiency, Consistency.

    I. INTRODUCTION

Linear mixed models, or variance components models, have been used effectively and extensively by statisticians for analyzing data with a univariate response. Reference [12] discussed a latent variable model for mixed ordinal (or discrete) and continuous outcomes that was applied to birth-defects data. Reference [16] showed that maximum likelihood estimation of variance components from twin data can be parameterized in the framework of linear mixed models; specialized variance-component estimation software that can handle pedigree data and user-defined covariance structures can then be used to analyze multivariate data for simple and complex models with a large number of random effects. Reference [2] showed that Linear Mixed Models (LMM) can handle data in which the observations are not independent, i.e., they can model data with correlated errors. There are several technical terms for predictor variables in linear mixed models: (i) random effects, i.e., categorical predictor variables whose observed values are not exhaustive but are a random sample of all possible values (for example, a product variable whose values represent only 5 of a possible 42 brands); (ii) hierarchical effects, i.e., predictor variables measured at more than one level; and (iii) fixed effects, i.e., predictor variables for which all possible category values (levels) are measured.

In contrast, several papers have discussed linear models for multivariate cases. Reference [5] applied a multivariate linear mixed model to Scholastic Aptitude Test data and proposed Restricted Maximum Likelihood (REML) to estimate the parameters. Reference [6] used the multivariate linear mixed model, or multivariate variance components model with equal replication, to predict the sum of the regression mean and the random effects of the model; this prediction problem reduces to estimating the ratio of two covariance matrices. For the estimation of the ratio matrix, they obtained James-Stein type estimators based on the Bartlett decomposition, Stein type orthogonally equivariant estimators, and Efron-Morris type estimators. Recently, Reference [10] applied a linear mixed model in systems biology statistics to account for both covariates and correlations between signals that follow non-stationary time series; they used an estimation algorithm based on Expectation-Maximization (EM), with dynamic programming for the segmentation step. Reference [11] discussed a joint model for multivariate mixed ordinal and continuous responses, in which the likelihood is derived and modified Pearson residuals are obtained; the model is applied to medical data from an observational study on women.

Most previous papers on multivariate linear mixed models focused on estimation methods and did not discuss the asymptotic properties of the estimators. Assuming that both the likelihood and ln-likelihood functions satisfy certain regularity conditions, we show that a solution set of the likelihood equations is strongly consistent for large sample sizes, asymptotically normal, and efficient.

    II. MULTIVARIATE LINEAR MIXED MODEL

In this section we first discuss the univariate linear mixed model and then the multivariate linear model without replications; the multivariate linear mixed model is treated in the next section. Linear mixed models are statistical models for continuous outcome variables in which the residuals are normally distributed but may not be independent or have constant variance. A linear mixed model is a parametric linear model for clustered, longitudinal, or repeated-measures data. It may include


both fixed-effect parameters associated with one or more continuous or categorical covariates and random effects associated with one or more random factors. The fixed-effect parameters describe the relationships of the covariates to the dependent variable for an entire population, while the random effects are specific to clusters or subjects within a population [4], [9], [15], [17].

In matrix and vector notation, a (univariate) linear mixed population model is written as [4], [7], [9], [17]

$$y = \underbrace{X\beta}_{\text{fixed}} + \underbrace{Z\gamma}_{\text{random}} + e, \qquad (1)$$

where $y$ is an $n \times 1$ vector of observations, $X$ is an $n \times (p+1)$ matrix of known covariates, $Z$ is an $n \times m$ known matrix, $\beta$ is a $(p+1) \times 1$ vector of unknown regression coefficients usually known as the fixed effects, $\gamma$ is an $m \times 1$ vector of random effects, and $e$ is an $n \times 1$ vector of random errors. Both $\gamma$ and $e$ are unobservable. The basic assumptions of Eq. (1) are $\gamma \sim N(0, G)$ and $e \sim N(0, R)$, where $G$ and $R$ are variance-covariance matrices (hereafter called variance matrices). To simplify the estimation of the model parameters, the random effect $\gamma$ is marginalized into the errors $e$ to create $e^* \sim N(0, \Sigma)$, where $\Sigma = R + ZGZ^T$. After this marginalization, (1) can be written as

$$y = X\beta + e^* \quad \text{or} \quad y \sim N(X\beta, \Sigma). \qquad (2)$$

Equation (2) is called the marginal model, or sometimes the population-mean model.
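To make the marginalization concrete, the following NumPy sketch builds $\Sigma = R + ZGZ^T$ and draws one response from the marginal model (2). The dimensions and the choices of $G$ and $R$ are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, m = 50, 2, 4                     # arbitrary illustrative sizes

X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])   # n x (p+1)
Z = rng.normal(size=(n, m))                                  # n x m
beta = rng.normal(size=p + 1)                                # fixed effects

G = 0.5 * np.eye(m)    # Var(gamma); diagonal here by assumption
R = 1.0 * np.eye(n)    # Var(e); iid errors here by assumption

Sigma = R + Z @ G @ Z.T                        # marginal covariance of y
y = rng.multivariate_normal(X @ beta, Sigma)   # one draw from Eq. (2)
```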

For point estimation, each variance component of the model is denoted by $\theta_q$ and collected in the vector $\theta = \operatorname{vech}(\Sigma) = [\theta_1, \theta_2, \ldots, \theta_{n(n+1)/2}]^T$, where $\operatorname{vech}(\Sigma)$ indicates the vector formed by stacking the lower-triangular elements of $\Sigma$ by column. The Gaussian mixed model then has the joint probability density function

$$f(y; \beta, \theta) = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}}\exp\left\{-\tfrac{1}{2}(y - X\beta)^T\Sigma^{-1}(y - X\beta)\right\}, \qquad (3)$$

where $n$ is the dimension of the vector $y$. The ln-likelihood function of (3) is

$$l(\beta, \theta) = \ln L(\beta, \theta; y) = -\tfrac{n}{2}\ln(2\pi) - \tfrac{1}{2}\ln|\Sigma| - \tfrac{1}{2}(y - X\beta)^T\Sigma^{-1}(y - X\beta). \qquad (4)$$
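For concreteness, the ln-likelihood (4) translates directly into a few lines of NumPy; `lnlik` is an illustrative name, and the sketch assumes $\Sigma$ is supplied as a positive definite matrix.

```python
import numpy as np

def lnlik(beta, Sigma, y, X):
    """Gaussian ln-likelihood of Eq. (4)."""
    n = y.shape[0]
    r = y - X @ beta                       # residual y - X beta
    _, logdet = np.linalg.slogdet(Sigma)   # numerically stable ln|Sigma|
    return -0.5 * (n * np.log(2 * np.pi) + logdet
                   + r @ np.linalg.solve(Sigma, r))
```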

Taking the partial derivatives of (4) with respect to $\beta$ and $\theta_q$ and setting each equal to zero, we obtain the estimating equations

$$\hat\beta = (X^T\Sigma^{-1}X)^{-1}X^T\Sigma^{-1}y, \qquad (5)$$

$$\operatorname{tr}\!\left(\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta_q}\right) = y^T P\,\frac{\partial\Sigma}{\partial\theta_q}\,P\,y, \qquad q = 1, 2, \ldots, n(n+1)/2, \qquad (6)$$

where $P = \Sigma^{-1} - \Sigma^{-1}X(X^T\Sigma^{-1}X)^{-1}X^T\Sigma^{-1}$. Equation (5) has a closed form, whereas (6) does not; therefore [4], [17] proposed three algorithms that can be used to compute the estimates $\hat\theta_q$: EM (expectation-maximization), Newton-Raphson (N-R), and Fisher scoring. Because the MLE is consistent and asymptotically normal with asymptotic covariance matrix equal to the inverse of the Fisher information matrix, we need the components of the Fisher information matrix. In both the univariate and multivariate linear mixed models, the regression coefficients and the covariance-matrix components are collected in a vector $\psi$, that is,

$\psi = (\beta^T, \theta^T)^T$. Thus, the Fisher information matrix is

$$I(\psi) = \operatorname{Var}\!\left(\frac{\partial l}{\partial\psi}\right) = -E\!\left[\frac{\partial^2 l}{\partial\psi\,\partial\psi^T}\right], \qquad (7)$$

under the assumption that the second partial derivatives of $l$ exist. The expected values of the second partial derivatives with respect to the fixed components and the variance components are [4]

$$-E\!\left[\frac{\partial^2 l}{\partial\beta\,\partial\beta^T}\right] = X^T\Sigma^{-1}X, \qquad -E\!\left[\frac{\partial^2 l}{\partial\beta\,\partial\theta_q}\right] = 0, \quad 1 \le q \le n(n+1)/2,$$

$$-E\!\left[\frac{\partial^2 l}{\partial\theta_q\,\partial\theta_{q'}}\right] = \tfrac{1}{2}\operatorname{tr}\!\left(\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta_q}\,\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta_{q'}}\right), \quad 1 \le q, q' \le n(n+1)/2.$$

Based on (5) and a result of Theorem 1 in [15], the ML iterative algorithm can be used to compute the MLE of the unknown parameters in model (3). The iteration yields the variance-component estimates as the elements of $\hat\theta = \operatorname{vech}(\hat\Sigma)$, so that the estimated covariance matrix can be written as

$$\hat\Sigma = \Sigma(\hat\theta), \qquad (8)$$

which is generally biased. Substituting (8) into (5) yields $\widehat{\operatorname{Var}}(\hat\beta) = (X^T\hat\Sigma^{-1}X)^{-1}$, which is biased because (8) is biased.
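A minimal Fisher-scoring sketch for the ML equations (5)-(6) is given below. It assumes the simplifying linear covariance structure $\Sigma(\theta) = \sum_q \theta_q V_q$ with known symmetric matrices $V_q$ (a common special case, not the paper's fully unstructured $\theta = \operatorname{vech}(\Sigma)$); the function name and convergence controls are illustrative.

```python
import numpy as np

def ml_fisher_scoring(y, X, V_list, theta0, n_iter=50, tol=1e-8):
    """Fisher scoring for Eqs. (5)-(6) under the assumed linear
    structure Sigma(theta) = sum_q theta_q * V_q."""
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(n_iter):
        Sigma = sum(t * V for t, V in zip(theta, V_list))
        Si = np.linalg.inv(Sigma)
        XtSiX = X.T @ Si @ X
        beta = np.linalg.solve(XtSiX, X.T @ Si @ y)        # Eq. (5)
        P = Si - Si @ X @ np.linalg.solve(XtSiX, X.T @ Si)
        Py = P @ y
        # Score for theta_q; Eq. (6) says score = 0 at the optimum.
        score = np.array([0.5 * (Py @ V @ Py - np.trace(Si @ V))
                          for V in V_list])
        # Expected information: 0.5 tr(Si V_q Si V_q'), as above.
        info = np.array([[0.5 * np.trace(Si @ Va @ Si @ Vb)
                          for Vb in V_list] for Va in V_list])
        step = np.linalg.solve(info, score)
        theta += step
        if np.linalg.norm(step) < tol:
            break
    return beta, theta
```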

    Bias in MLE can be eliminated by Restricted Maximum Likelihood Estimation (REMLE).

REML estimation uses the transformation $y_1 = A^T y$, where $\operatorname{rank}(X) = p + 1$ and $A$ is an $n \times (n - p - 1)$ matrix with $\operatorname{rank}(A) = n - p - 1$ such that $A^T X = 0$. It is easy to show that $y_1 \sim N(0, A^T\Sigma A)$. Furthermore, the joint probability density function of $y_1$ is

$$f_R(y_1) = \frac{1}{(2\pi)^{(n-p-1)/2}\,|A^T\Sigma A|^{1/2}}\exp\left\{-\tfrac{1}{2}\,y_1^T (A^T\Sigma A)^{-1} y_1\right\}. \qquad (9)$$

    The ln-likelihood of (9) can be written as


$$l_R(\theta) = -\tfrac{1}{2}\left[\ln|A^T\Sigma A| + y^T A (A^T\Sigma A)^{-1} A^T y\right] \text{ (up to an additive constant)}. \qquad (10)$$

Estimation is carried out by partially differentiating (10) with respect to $\theta_q$, $q = 1, \ldots, n(n+1)/2$ (it does not contain components of $\beta$):

$$\frac{\partial l_R}{\partial\theta_q} = -\tfrac{1}{2}\left\{\operatorname{tr}\!\left(P\frac{\partial\Sigma}{\partial\theta_q}\right) - y^T P\,\frac{\partial\Sigma}{\partial\theta_q}\,P\,y\right\}, \qquad (11)$$

where $P$ in (11) is $A(A^T\Sigma A)^{-1}A^T$, which equals the matrix $P$ appearing in (6). The Fisher information matrix of (11) is

$$I_R(\theta) = \operatorname{Var}\!\left(\frac{\partial l_R}{\partial\theta}\right) = -E\!\left[\frac{\partial^2 l_R}{\partial\theta\,\partial\theta^T}\right]. \qquad (12)$$

Assuming that the second derivatives of $l_R$ in (10) exist, it can be proven that the components of the Fisher information are

$$-E\!\left[\frac{\partial^2 l_R}{\partial\theta_q\,\partial\theta_{q'}}\right] = \tfrac{1}{2}\operatorname{tr}\!\left(P\frac{\partial\Sigma}{\partial\theta_q}\,P\frac{\partial\Sigma}{\partial\theta_{q'}}\right), \qquad 1 \le q, q' \le n(n+1)/2. \qquad (13)$$

The estimator $\hat\theta$ obtained from the REMLE method is unbiased. However, the linear mixed model still yields a biased $\widehat{\operatorname{Var}}(\hat\beta)$, because $\widehat{\operatorname{Var}}(\hat\beta)$ is obtained from (5) by substituting $\hat\Sigma$ for $\Sigma$. The bias in the variances of the fixed-effect estimators, which constitute the diagonal elements of $\widehat{\operatorname{Var}}(\hat\beta)$, is downward for both the MLE and REMLE methods [17].
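The REML score (11) and information (13) can be sketched in the same illustrative setting, with $\partial\Sigma/\partial\theta_q$ again assumed to be a known matrix $V_q$; the identity $P = A(A^T\Sigma A)^{-1}A^T = \Sigma^{-1} - \Sigma^{-1}X(X^T\Sigma^{-1}X)^{-1}X^T\Sigma^{-1}$ avoids constructing $A$ explicitly.

```python
import numpy as np

def reml_score_and_info(y, X, Sigma, V_list):
    """REML score, Eq. (11), and expected information, Eq. (13)."""
    Si = np.linalg.inv(Sigma)
    XtSiX = X.T @ Si @ X
    # P = A (A' Sigma A)^{-1} A' = Si - Si X (X' Si X)^{-1} X' Si.
    P = Si - Si @ X @ np.linalg.solve(XtSiX, X.T @ Si)
    Py = P @ y
    score = np.array([-0.5 * (np.trace(P @ V) - Py @ V @ Py)
                      for V in V_list])                     # Eq. (11)
    info = np.array([[0.5 * np.trace(P @ Va @ P @ Vb)
                      for Vb in V_list] for Va in V_list])  # Eq. (13)
    return score, info
```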

A multivariate linear model that does not depend on random factors was discussed in [1], [9]. The multivariate linear model without replications can be written in matrix form as

$$Y = XB + E, \qquad (14)$$

where $Y = [Y_{ij}]$, $E = [\varepsilon_{ij}]$, $X = [x_{ia}]$, and $B = [\beta_{aj}]$; $i = 1, \ldots, n$; $j = 1, \ldots, s$; $a = 0, \ldots, p$; with intercepts $x_{i0} = 1$ for all $i$.

For individual $i$, (14) can be presented in matrix and vector form as

$$y_i = B^T x_i + e_i, \qquad (15)$$

where

$$y_i = \begin{bmatrix} Y_{i1} \\ Y_{i2} \\ \vdots \\ Y_{is} \end{bmatrix}, \qquad B = \begin{bmatrix} \beta_{01} & \beta_{02} & \cdots & \beta_{0s} \\ \beta_{11} & \beta_{12} & \cdots & \beta_{1s} \\ \vdots & \vdots & \ddots & \vdots \\ \beta_{p1} & \beta_{p2} & \cdots & \beta_{ps} \end{bmatrix}, \qquad x_i = \begin{bmatrix} 1 \\ x_{i1} \\ \vdots \\ x_{ip} \end{bmatrix}, \qquad e_i = \begin{bmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \vdots \\ \varepsilon_{is} \end{bmatrix}.$$

The error vectors $e_i$ in (15) are random vectors assumed to be jointly multivariate normal with zero mean vector in $\mathbb{R}^{ns}$. The rows of $E$ are assumed independent because they correspond to different observations; the columns of $E$, however, are allowed to be correlated. Thus,

$$\operatorname{Cov}(e_i, e_{i'}) = 0, \;\; i \ne i', \qquad \operatorname{Cov}(e_i, e_i) = \Sigma_i \qquad (16)$$

for all $i = 1, 2, \ldots, n$. It is further assumed that the $s \times s$ error variance matrix is the same for every observation, $\Sigma_i = \Sigma$, and positive definite. Then (16) can be written as [13]

$$\operatorname{Cov}(\operatorname{vec}(E^T), \operatorname{vec}(E^T)) = I_n \otimes \Sigma,$$

where $\operatorname{vec}(E^T)$ denotes the vector formed by stacking the elements of $E$ by rows, and $I_n$ is the $n \times n$ identity matrix.
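The vec/Kronecker conventions used here can be verified numerically; a small sketch with arbitrary illustrative dimensions:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, s = 6, 2, 3                      # arbitrary illustrative sizes

X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
B = rng.normal(size=(p + 1, s))

def vec(M):
    """vec(M): stack the columns of M; vec(M^T) then stacks M by rows."""
    return M.reshape(-1, order="F")

# Identity behind Eqs. (17) and (22): vec((XB)^T) = (X kron I_s) vec(B^T).
lhs = vec((X @ B).T)
rhs = np.kron(X, np.eye(s)) @ vec(B.T)
assert np.allclose(lhs, rhs)
```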

Moreover, $B$ can be estimated by assuming that $\Sigma$ is known. The likelihood function of (14) for $n$ observations is

$$L(\operatorname{vec}(B^T), \operatorname{vech}(\Sigma)) = \frac{1}{(2\pi)^{ns/2}\,|I_n \otimes \Sigma|^{1/2}}\exp\{-Q_{\mathrm{MLM}}/2\}, \qquad (17)$$

where

$$Q_{\mathrm{MLM}} = \big(\operatorname{vec}(Y^T) - (X \otimes I_s)\operatorname{vec}(B^T)\big)^T (I_n \otimes \Sigma)^{-1} \big(\operatorname{vec}(Y^T) - (X \otimes I_s)\operatorname{vec}(B^T)\big).$$

Maximizing (17) with $\Sigma$ treated as known yields the estimator

$$\hat{B} = (X^T X)^{-1} X^T Y \quad \text{or} \quad \operatorname{vec}(\hat{B}^T) = \big((X^T X)^{-1} X^T \otimes I_s\big)\operatorname{vec}(Y^T). \qquad (18)$$

The estimate of $\Sigma$ can be obtained by assuming that $B$ is known. Maximizing (17) with $B$ known yields the biased variance estimator $\hat\Sigma = n^{-1}\,Y^T\big(I_n - X(X^T X)^{-1}X^T\big)Y$. The unbiased estimator of the variance [1] is

$$S = \frac{Y^T\big(I_n - X(X^T X)^{-1}X^T\big)Y}{n - \operatorname{rank}(X)}. \qquad (19)$$

Hence, the estimator in (18) is an unbiased estimator, with variance matrix

$$\operatorname{Var}(\operatorname{vec}(\hat{B}^T)) = (X^T X)^{-1} \otimes \Sigma. \qquad (20)$$
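Equations (18)-(19) amount to ordinary least squares applied columnwise; a minimal sketch (`mlm_fit` is an illustrative name):

```python
import numpy as np

def mlm_fit(Y, X):
    """Fit the multivariate linear model (14): Y = X B + E.
    Returns Bhat from Eq. (18) and the unbiased S from Eq. (19)."""
    n = Y.shape[0]
    Bhat = np.linalg.solve(X.T @ X, X.T @ Y)          # Eq. (18)
    H = X @ np.linalg.solve(X.T @ X, X.T)             # hat matrix
    resid_quad = Y.T @ (np.eye(n) - H) @ Y
    S = resid_quad / (n - np.linalg.matrix_rank(X))   # Eq. (19)
    return Bhat, S
```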


III. PARAMETER ESTIMATION FOR THE MULTIVARIATE LINEAR MIXED MODEL

The multivariate linear mixed model is an extension of the multivariate linear model: it is constructed by adding a random component to (14), assuming that each element of $Y$ is linearly related to the systematic part of the model. This model is called the multivariate linear mixed model without replication. There are three types of linear mixed model, namely cluster, longitudinal, and replication models; this paper focuses on the linear mixed model without replication, i.e.

$$Y = XB + ZD + E, \qquad (21)$$

where $X$ and $Z$ are known covariate matrices of sizes $n \times (p+1)$ and $n \times k$ respectively, $B$ is a $(p+1) \times s$ matrix of unknown fixed-effect regression coefficients, $D$ is a $k \times s$ matrix of specific random-effect coefficients, and $E$ is an $n \times s$ matrix of errors. It is also assumed that the $n$ rows of $E$ are independent, each distributed $N(0, \Sigma)$, and that the random effect $D$ satisfies $\operatorname{vec}(D^T) \sim N(0, \Omega)$. Thus, the distribution of the response $Y$ in model (21) can be written in matrix and vector form as

$$\operatorname{vec}(Y^T) \sim N\big((X \otimes I_s)\operatorname{vec}(B^T),\; (Z \otimes I_s)\,\Omega\,(Z^T \otimes I_s) + (I_n \otimes \Sigma)\big). \qquad (22)$$

Let the variance matrix be $V = (Z \otimes I_s)\,\Omega\,(Z^T \otimes I_s) + (I_n \otimes \Sigma)$, of size $ns \times ns$; the variance components of model (21) are then the elements of $\theta = \operatorname{vech}(V) = [\theta_1, \ldots, \theta_{ns(ns+1)/2}]^T$. The likelihood function for the probability density function (22) is

$$L(\operatorname{vec}(B^T), \operatorname{vech}(V)) = \frac{1}{(2\pi)^{ns/2}\,|V|^{1/2}}\exp\{-Q_{\mathrm{MLMM}}/2\}, \qquad (23)$$

where

$$Q_{\mathrm{MLMM}} = \big(\operatorname{vec}(Y^T) - (X \otimes I_s)\operatorname{vec}(B^T)\big)^T V^{-1} \big(\operatorname{vec}(Y^T) - (X \otimes I_s)\operatorname{vec}(B^T)\big).$$

Partially differentiating the ln-likelihood function of (23) with respect to $\operatorname{vec}(B^T)$ and $\theta_q$ and setting the derivatives equal to zero yields the estimator

$$\operatorname{vec}(\hat{B}^T) = \big[(X^T \otimes I_s)\,V^{-1}(X \otimes I_s)\big]^{-1}(X^T \otimes I_s)\,V^{-1}\operatorname{vec}(Y^T), \qquad (24)$$

together with a nonlinear optimization problem for $\theta$, with inequality constraints imposed so that the positive-definiteness requirements on the $\Omega$ and $\Sigma$ matrices are satisfied. There is no closed-form solution for $\hat\theta$, so the estimate of $\theta$ is obtained by the Fisher scoring or Newton-Raphson algorithm. After an iterative computational process we have $\hat{V} = V(\hat\theta)$, i.e.

$$\hat{V} = (Z \otimes I_s)\,\hat\Omega\,(Z^T \otimes I_s) + (I_n \otimes \hat\Sigma). \qquad (25)$$

The component estimators $\hat\Omega$ and $\hat\Sigma$ are obtained within the same iteration used to compute $\hat{V}$. The estimator in (24) is unbiased, with variance matrix $\operatorname{Var}(\operatorname{vec}(\hat{B}^T)) = \big[(X^T \otimes I_s)\,V^{-1}(X \otimes I_s)\big]^{-1}$.
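A direct, dense implementation of the GLS estimator (24) is sketched below; it treats $V$ as given (or as the current iterate $\hat V$ of (25)) and is only practical for small $ns$, since it forms $V^{-1}$ explicitly.

```python
import numpy as np

def mlmm_gls(Y, X, V):
    """GLS estimator of Eq. (24) given the ns x ns variance matrix V."""
    n, s = Y.shape
    y = Y.reshape(-1)                  # row-stacking of Y equals vec(Y^T)
    Xs = np.kron(X, np.eye(s))         # X kron I_s
    Vi = np.linalg.inv(V)
    A = Xs.T @ Vi @ Xs                 # (X' kron I_s) V^{-1} (X kron I_s)
    vecBt = np.linalg.solve(A, Xs.T @ Vi @ y)
    Bhat = vecBt.reshape(X.shape[1], s)    # unstack vec(B^T) row-wise
    return Bhat, np.linalg.inv(A)      # A^{-1} = Var(vec(Bhat^T))
```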

IV. ESTIMATOR PROPERTIES: CONSISTENCY, ASYMPTOTIC NORMALITY, AND EFFICIENCY

Let $y_1, y_2, \ldots, y_n$ be random sample vectors from a random vector $y$ with distribution $F_\theta$ belonging to a family $\mathcal{F} = \{F_\theta, \theta \in \Theta\}$, where $\theta = [\theta_1, \ldots, \theta_k]^T$. For each $\theta^0 \in \Theta$, a subset $N(\theta^0)$ of $\Theta$ is a neighborhood of $\theta^0$ iff $N(\theta^0)$ is a superset of an open set $G$ containing $\theta^0$ [8]. Assume $\Theta \subseteq \mathbb{R}^k$ and impose three regularity conditions on $\mathcal{F}$ [3], [14]:

(R1). For each $\theta \in \Theta$, the derivatives

$$\frac{\partial l(\theta, y)}{\partial\theta_i}, \qquad \frac{\partial^2 l(\theta, y)}{\partial\theta_i\,\partial\theta_{i'}}, \qquad \frac{\partial^3 l(\theta, y)}{\partial\theta_i\,\partial\theta_{i'}\,\partial\theta_{i''}}$$

exist for all $y$ and all $i, i', i'' = 1, 2, \ldots, k$.

(R2). For each $\theta^0 \in \Theta$, there exist functions $g(y)$, $h(y)$, and $H(y)$ (possibly depending on $\theta^0$) such that, for $\theta$ in a neighborhood $N(\theta^0)$, the relations

$$\left|\frac{\partial L(\theta, y)}{\partial\theta_i}\right| \le g(y), \qquad \left|\frac{\partial^2 L(\theta, y)}{\partial\theta_i\,\partial\theta_{i'}}\right| \le h(y), \qquad \left|\frac{\partial^3 l(\theta, y)}{\partial\theta_i\,\partial\theta_{i'}\,\partial\theta_{i''}}\right| \le H(y)$$

hold for all $y$, with $\int g(y)\,dy < \infty$ and $\int h(y)\,dy < \infty$.

(R3). For each $\theta^0 \in \Theta$, $E_{\theta^0}[H(y)] < \infty$ [14].


Lemma: Let $L(\theta, y)$ and $l(\theta, y)$ be the likelihood and ln-likelihood functions, respectively, and assume that the regularity conditions (R1) and (R2) hold. The Fisher information is the variance of the random variable $\partial l(\theta, y)/\partial\theta$; i.e.,

$$I(\theta) = \operatorname{Var}\!\left(\frac{\partial l(\theta, y)}{\partial\theta}\right) = E\!\left[\frac{\partial l(\theta, y)}{\partial\theta}\frac{\partial l(\theta, y)}{\partial\theta^T}\right] = -E\!\left[\frac{\partial^2 l(\theta, y)}{\partial\theta\,\partial\theta^T}\right]. \qquad (27)$$

Proof: Since $L(\theta, y)$ is a likelihood function, it can be considered a joint probability density function, so

$$\int L(\theta, y)\,dy = 1.$$

Taking the partial derivative with respect to the vector $\theta$ gives

$$\int \frac{\partial L(\theta, y)}{\partial\theta}\,dy = 0. \qquad (28)$$

Equation (28) can be written as

$$\int \frac{\partial L(\theta, y)}{\partial\theta}\,\frac{L(\theta, y)}{L(\theta, y)}\,dy = 0 \quad \text{or} \quad \int \frac{\partial \ln L(\theta, y)}{\partial\theta}\,L(\theta, y)\,dy = 0. \qquad (29)$$

Writing (29) as an expectation, we have established

$$E\!\left[\frac{\partial l(\theta, y)}{\partial\theta}\right] = 0. \qquad (30)$$

This means that the mean vector of the random variable $\partial l(\theta, y)/\partial\theta$ is $0$. If (29) is partially differentiated again, it follows that

$$\int \left\{\frac{\partial^2 \ln L(\theta, y)}{\partial\theta\,\partial\theta^T}\,L(\theta, y) + \frac{\partial \ln L(\theta, y)}{\partial\theta}\,\frac{\partial L(\theta, y)}{\partial\theta^T}\right\} dy = 0.$$

This is equivalent to

$$\int \frac{\partial^2 \ln L(\theta, y)}{\partial\theta\,\partial\theta^T}\,L(\theta, y)\,dy + \int \frac{\partial \ln L(\theta, y)}{\partial\theta}\,\frac{\partial \ln L(\theta, y)}{\partial\theta^T}\,L(\theta, y)\,dy = 0.$$

This last equation can be written as an expectation, i.e.,

$$E\!\left[\frac{\partial^2 l(\theta, y)}{\partial\theta\,\partial\theta^T}\right] + E\!\left[\frac{\partial l(\theta, y)}{\partial\theta}\frac{\partial l(\theta, y)}{\partial\theta^T}\right] = 0.$$

By using (30) and the definition of variance, the latter expression can be rewritten as

$$E\!\left[\frac{\partial^2 l(\theta, y)}{\partial\theta\,\partial\theta^T}\right] + \operatorname{Var}\!\left(\frac{\partial l(\theta, y)}{\partial\theta}\right) = 0,$$

or

$$I(\theta) = \operatorname{Var}\!\left(\frac{\partial l(\theta, y)}{\partial\theta}\right) = -E\!\left[\frac{\partial^2 l(\theta, y)}{\partial\theta\,\partial\theta^T}\right] = E\!\left[\frac{\partial l(\theta, y)}{\partial\theta}\frac{\partial l(\theta, y)}{\partial\theta^T}\right] \quad \blacksquare \;\; \text{[3], [14]}.$$
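The identity (27) can be checked by simulation for the simplest case, a $N(\theta, 1)$ location model (an illustrative choice, not a model from the paper), where the score is $y - \theta$ and the Hessian is $-1$:

```python
import numpy as np

# Monte Carlo check of Eq. (27) for a N(theta, 1) location model.
rng = np.random.default_rng(2)
theta = 1.5
y = rng.normal(theta, 1.0, size=200_000)

score = y - theta            # d l / d theta for N(theta, 1)
hessian = -np.ones_like(y)   # d^2 l / d theta^2

print(np.mean(score))                    # ~ 0, matching Eq. (30)
print(np.var(score), -np.mean(hessian))  # both ~ 1 = I(theta), Eq. (27)
```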

Theorem: Let the $m$ observation vectors $y_1, y_2, \ldots, y_m$ be iid with distribution $F_\theta$, for $\theta \in \Theta$. Assume that the regularity conditions (R1), (R2), and (R3) hold on the family $\mathcal{F}$. Then, as $m \to \infty$, the likelihood equations admit a sequence of solutions $\{\hat\theta_m\}$ satisfying

a) strong consistency:

$$\hat\theta_m \to \theta \quad \text{wp1}, \quad m \to \infty; \qquad (31)$$

b) asymptotic normality and efficiency:

$$\hat\theta_m \xrightarrow{\;d\;} AN\!\big(\theta, (1/m)\,I^{-1}(\theta)\big), \qquad (32)$$

where $I(\theta) = E\!\left[\dfrac{\partial l(\theta, y)}{\partial\theta}\dfrac{\partial l(\theta, y)}{\partial\theta^T}\right]$ and $AN(\cdot, \cdot)$ denotes a multivariate asymptotic normal distribution.

Proof:

Expanding the function $\partial l(\theta, y)/\partial\theta$ in a second-order Taylor series about $\theta^0$ and evaluating it at $\hat\theta_m$ gives

$$\frac{\partial l(\hat\theta_m, y)}{\partial\theta} = \frac{\partial l(\theta^0, y)}{\partial\theta} + \frac{\partial^2 l(\theta^0, y)}{\partial\theta\,\partial\theta^T}\,(\hat\theta_m - \theta^0) + \tfrac{1}{2}\big[(\hat\theta_m - \theta^0)^T H(y, \theta_m^*)\big](\hat\theta_m - \theta^0), \qquad (33)$$

where $\|\theta_m^* - \theta^0\| \le \|\hat\theta_m - \theta^0\|$.


Let

$$B_m = -\frac{1}{m}\sum_{i=1}^m \frac{\partial^2 l(y_i, \theta^0)}{\partial\theta\,\partial\theta^T}, \quad \text{so that} \quad E[B_m] = I(\theta^0),$$

and

$$C_m = \frac{1}{2m}\sum_{i=1}^m H(y_i, \theta_m^*), \quad \text{so that} \quad E[C_m] = \frac{1}{2m}\,m\,E[H(y, \theta_m^*)] = \tfrac{1}{2}E[H(y, \theta_m^*)].$$

Thus, (34) can be written as

$$\big\{B_m + \big[(\hat\theta_m - \theta^0)^T C_m\big]\big\}\,\sqrt{m}\,(\hat\theta_m - \theta^0) = \sqrt{m}\,a_m, \qquad (35)$$

where $a_m$ denotes the averaged score $\frac{1}{m}\sum_{i=1}^m \partial l(y_i, \theta^0)/\partial\theta$ from (34). By applying the Law of Large Numbers (LLN), it can be shown that

$$a_m \xrightarrow{\;\text{wp1}\;} 0, \qquad B_m \xrightarrow{\;\text{wp1}\;} I(\theta^0), \qquad C_m \xrightarrow{\;\text{wp1}\;} \tfrac{1}{2}E[H(y, \theta_m^*)]. \qquad (36)$$

Next, it will be seen that $H(y, \theta_m^*)$ is bounded in probability. Let $c_0$ be a constant and $\|\hat\theta_m - \theta^0\| \le c_0$


    REFERENCES

[1] R. Christensen, Linear Models for Multivariate, Time Series, and Spatial Data. New York: Springer-Verlag, 1991, ch. 1.

[2] G. D. Garson, "Linear mixed models: random effects, hierarchical linear, multilevel, random coefficients, and repeated measures models," lecture notes, Department of Public Administration, North Carolina State University, 2008, pp. 1-49.

[3] R. V. Hogg, J. W. McKean, and A. T. Craig, Introduction to Mathematical Statistics, 6th ed., international ed. New Jersey: Pearson-Prentice Hall, 2005, ch. 6-7.

[4] J. Jiang, Linear and Generalized Linear Mixed Models and Their Applications. New York: Springer Science+Business Media, LLC, 2007, ch. 2.

[5] H. A. Kalaian and S. W. Raudenbush, "A multivariate mixed linear model for meta-analysis," Psychological Methods, vol. 1(3), pp. 227-235, Mar. 1996.

[6] T. Kubokawa and M. S. Srivastava, "Prediction in multivariate mixed linear models," Discussion Paper CIRJE-F-180, pp. 1-24, Oct. 2002.

[7] L. R. LaMotte, "A direct derivation of the REML likelihood function," Statistical Papers, vol. 48, pp. 321-327, 2007.

[8] S. Lipschutz, Schaum's Outline of Theory and Problems of General Topology. New York: McGraw-Hill Book Co., 1965, ch. 1.

[9] K. E. Muller and P. W. Stewart, Linear Model Theory: Univariate, Multivariate, and Mixed Models. New Jersey: John Wiley & Sons, Inc., 2006, ch. 2-5, 12-17.

[10] F. Picard, E. Lebarbier, E. Budinská, and S. Robin, "Joint segmentation of multivariate Gaussian processes using mixed linear models," Statistics for Systems Biology Group, Research Report (5), pp. 1-10, 2007. http://genome.jouy.inra.fr/ssb/

[11] E. B. Samani and M. Ganjali, "A multivariate latent variable model for mixed continuous and ordinal responses," World Applied Sciences Journal, vol. 3(2), pp. 294-299, 2008.

[12] M. D. Sammel, L. M. Ryan, and J. M. Legler, "Latent variable models for mixed discrete and continuous outcomes," Journal of the Royal Statistical Society B, vol. 59(3), pp. 667-678, 1997.

[13] S. Sawyer, "Multivariate linear models," unpublished, pp. 1-29, 2008. www.math.wustl.edu/~sawyer/handouts/multivar.pdf

[14] R. J. Serfling, Approximation Theorems of Mathematical Statistics. New York: John Wiley & Sons, Inc., 1980, ch. 1, 4.

[15] C. Shin, "On the multivariate random and mixed coefficient analysis," Ph.D. dissertation, Dept. of Statistics, Iowa State University, Ames, Iowa, 1995.

[16] P. M. Visscher, B. Benyamin, and I. White, "The use of linear mixed models to estimate variance components from data on twin pairs by maximum likelihood," Twin Research, vol. 7(6), pp. 670-674, 2004.

[17] B. T. West, K. B. Welch, and A. T. Gałecki, Linear Mixed Models: A Practical Guide Using Statistical Software. New York: Chapman & Hall, 2007.