    General linear models in small area estimation: an assessment in agricultural surveys

    Carlo Russo, Massimo Sabbatini, and Renato Salvatore
    University of Cassino, Italy

    The MEXSAI Conference

    Some small area estimation references

    Ghosh M., Rao J. N. K. (1994), Small area estimation: an appraisal, Statistical Science, Vol. 9, No. 1, pp. 55-93

    He Z., Sun D. (2000), Hierarchical Bayes estimation of hunting success rates with spatial correlations, Biometrics, 56, 102-109

    Malec D., Sedransk J., Moriarity C. L., LeClere F. B. (1997), Small area inference for binary variables in the National Health Interview Survey, Journal of the American Statistical Association, Vol. 92, 439, 815-826

    Rao J. N. K. (2002), Small area estimation with applications to Agriculture, Proceedings of the Conference on agricultural and environmental statistical applications in Rome, Vol. III, 555-564

    Rao J. N. K. (2003), Small area estimation, Wiley, London

    Small area estimation: a simple outline

    The term small area usually denotes a small geographical area, such as a county, a province, an administrative area or a census division

    From a statistical point of view, a small area is a small domain, that is, a small subpopulation made up of a specific demographic and socioeconomic group of people within a larger geographical area

    Sample survey data provide reliable estimators of totals and means for large areas and domains. But it is recognized that the usual direct survey estimators, when applied to a small area, have unacceptably large standard errors, because the sample size in the area is small
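To see why a small sample gives an unstable direct estimate, the sketch below computes the standard error of a sample mean under simple random sampling without replacement. The sampling design, the function name `se_srs_mean` and all figures are illustrative assumptions, not taken from the surveys discussed here.

```python
import math

def se_srs_mean(s: float, n: int, N: int) -> float:
    """Standard error of a sample mean under simple random sampling
    without replacement: s * sqrt((1 - n/N) / n).

    s is the population standard deviation (assumed known here),
    n the sample size and N the population size.
    """
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    return s * math.sqrt((1.0 - n / N) / n)

# Hypothetical figures: same variability, a regional sample and a small area sample.
se_region = se_srs_mean(s=10.0, n=2500, N=1_000_000)
se_small_area = se_srs_mean(s=10.0, n=25, N=10_000)
```

With the sampling fraction held equal, a sample one hundred times smaller gives a standard error ten times larger, which is the problem small area methods try to address.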

    In fact, sample sizes in small areas are small because the overall sample size of a survey is usually determined to provide a specified accuracy at a macro-area level of aggregation, that is, national territories, regions and so on (Ghosh and Rao, 1994)

    Small area statistics are important tools for planning agricultural policies in specific regional and administrative areas

    The demand for information from other sectors, such as the private sector, is also important, especially for questions related to local social and economic conditions, local area marketing research, and so on

    Small area statistics are based on a collection of statistical methods that borrow strength from related or similar small areas through statistical models that connect the variables of interest in small areas with vectors of supplementary data, such as demographic, behavioral and economic information, coming from administrative, census and specific sample survey records

    Efficient small area statistics also provide excellent local estimates of population, farms, and other characteristics of interest in postcensal years

    The most commonly used techniques for small area estimation are the empirical Bayes (EB), the hierarchical Bayes (HB) and the empirical best linear unbiased prediction (EBLUP) procedures (Rao, 2003)

    Some uses of these techniques in agricultural statistics involve satellite data and, more generally, sample surveys with different orientations combined in model-based frameworks

    There are two types of small area models that include random area-specific effects. In the first type, the basic area level model, the response is connected to area-specific auxiliary variables, because such auxiliary data are of limited availability at the unit level

    The second type are the unit level models, in which element-specific auxiliary data are available for the population elements (Ghosh and Rao, 1994; Rao, 2002)

    The simplest way to produce small area statistics is, however, to derive synthetic estimates by applying large area data to related local areas under suitable assumptions: synthetic estimators are generally used because they apply to general sampling designs and because they gain efficiency by exploiting information from similar small areas

    The problem is that estimators of this type are potentially design-biased. Following the composite estimate approach to small area analysis, the way to balance the bias of the synthetic estimator against the
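As a minimal sketch of the synthetic and composite ideas, the code below pools all areas to obtain a synthetic estimate and then forms a composite estimate as a weighted average of the direct and synthetic estimates. Equal-probability sampling, the fixed weight `w`, the function names and the data are assumptions made for illustration only.

```python
from statistics import mean

def direct_estimate(sample):
    """Direct estimate of an area mean from that area's own sample
    (equal sampling weights assumed)."""
    return mean(sample)

def synthetic_estimate(samples_by_area):
    """Synthetic estimate: the pooled large-area mean, applied to every
    small area on the assumption that the areas are similar."""
    pooled = [y for sample in samples_by_area.values() for y in sample]
    return mean(pooled)

def composite_estimate(direct, synthetic, w):
    """Weighted average of the direct and synthetic estimates, 0 <= w <= 1.
    A larger w trusts the (unbiased but unstable) direct estimate more."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * direct + (1.0 - w) * synthetic

# Hypothetical yields observed in three small areas of one region.
samples = {"A": [2.0, 4.0], "B": [3.0, 5.0, 4.0, 4.0], "C": [6.0, 8.0, 7.0]}
synthetic = synthetic_estimate(samples)
composite_A = composite_estimate(direct_estimate(samples["A"]), synthetic, w=0.5)
```

The weight is fixed by hand here; in model-based approaches it is derived from the relative sizes of the sampling variance and the between-area variance.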

    4) $\hat{\theta}_i = \theta_i + e_i, \quad i = 1, \ldots, m$

    Equation 4) is the direct area estimator with sampling errors $e_i$

    Combining 2) and 4):

    5) $\hat{\theta}_i = \mathbf{x}_i^T \boldsymbol{\beta} + z_i u_i + e_i, \quad i = 1, \ldots, m$

    Model 5) involves design random variables and, at the same time, model-based random variables. It is an example of a general linear mixed model (GLMM) with diagonal covariance structure
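The diagonal covariance structure can be made explicit under the usual assumptions, which are stated here as assumptions rather than taken from the slides: the area effects $u_i$ are independent with mean zero and common variance $\sigma_u^2$, and the sampling errors $e_i$ are independent of them, with mean zero and known variances $\psi_i$. The symbols $\sigma_u^2$ and $\psi_i$ are notation introduced for this sketch. For the direct estimates $\hat{\theta}_i$ of model 5) this gives:

```latex
\operatorname{Var}(\hat{\theta}_i) = z_i^2 \sigma_u^2 + \psi_i,
\qquad
\operatorname{Cov}(\hat{\theta}_i, \hat{\theta}_j) = 0 \quad (i \neq j),
\qquad
\mathbf{V} = \operatorname{diag}\left(z_1^2 \sigma_u^2 + \psi_1, \ldots, z_m^2 \sigma_u^2 + \psi_m\right)
```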

    The BLUP (best linear unbiased prediction) estimator is a weighted average of the design-based estimator and the regression-synthetic estimator

    The MSE of the BLUP estimator depends on the variance parameter of the random area effects

    In practical applications this parameter is unknown, and it is replaced by an estimator

    We then have a two-stage estimator, called the empirical BLUP (EBLUP)

    The MSE of the EBLUP estimator is fairly insensitive to the choice of the estimator of the random area effect variance, but it is larger than the MSE of the BLUP estimator, because the variance parameter has to be estimated

    Assuming normality of the random effects, the related area variance parameters can be estimated by either maximum likelihood (ML) or restricted maximum likelihood (REML) methods
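The two stages described above can be sketched for a basic area level model. The sketch assumes $z_i = 1$, a single covariate with an intercept, known sampling variances `psi`, and normal random effects; it also replaces a proper ML optimizer with a crude grid search. All names and data are hypothetical.

```python
import math

def wls_fit(x, y, v):
    """Weighted least squares for y = b0 + b1 * x with weights 1 / v."""
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    b1 = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    b0 = (sy - b1 * sx) / sw
    return b0, b1

def blup(direct, x, psi, sigma2_u):
    """BLUP of each area mean for a given area-effect variance sigma2_u."""
    b0, b1 = wls_fit(x, direct, [sigma2_u + p for p in psi])
    estimates = []
    for d, xi, p in zip(direct, x, psi):
        gamma = sigma2_u / (sigma2_u + p)   # weight on the direct estimator
        synthetic = b0 + b1 * xi            # regression-synthetic estimator
        estimates.append(gamma * d + (1.0 - gamma) * synthetic)
    return estimates

def profile_loglik(sigma2_u, direct, x, psi):
    """Normal log-likelihood (up to a constant) with the regression
    coefficients replaced by their weighted least squares values."""
    v = [sigma2_u + p for p in psi]
    b0, b1 = wls_fit(x, direct, v)
    return -0.5 * sum(math.log(vi) + (d - b0 - b1 * xi) ** 2 / vi
                      for vi, d, xi in zip(v, direct, x))

def ml_sigma2_u(direct, x, psi, upper, steps=2000):
    """Crude ML estimate of sigma2_u by grid search on [0, upper]."""
    grid = [upper * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda s2: profile_loglik(s2, direct, x, psi))

def eblup(direct, x, psi, upper):
    """Two-stage estimator: estimate sigma2_u, then plug it into the BLUP."""
    return blup(direct, x, psi, ml_sigma2_u(direct, x, psi, upper))

# Hypothetical direct estimates, covariate values and sampling variances.
direct = [10.2, 12.9, 9.1, 15.3, 11.8, 14.0]
x = [1.0, 2.0, 1.0, 3.0, 2.0, 3.0]
psi = [1.0, 0.5, 2.0, 0.8, 1.5, 0.6]
area_estimates = eblup(direct, x, psi, upper=10.0)
```

The weight `gamma` moves towards 1 when the area-effect variance is large relative to the sampling variance, so areas with precise direct estimates are shrunk less towards the regression-synthetic value.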

    Instead of EBLUP and EB, if we follow the HB (hierarchical Bayes) approach, a prior distribution on the model parameters is specified first, and then the posterior distribution of the parameters of interest is obtained

    The usual small area estimation problems are solved within the posterior distribution framework. A parameter of interest is estimated by its posterior mean, and the precision of the estimate, in terms of its MSE, is measured by the posterior variance

    The HB approach is computationally intensive, in many cases involving high-dimensional integration

    Some tools, such as Gibbs sampling and importance sampling, the latter employed jointly with Monte Carlo numerical integration methods, are commonly used to overcome some of the computational problems
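A minimal Gibbs sampler illustrates how posterior means and variances can be obtained without explicit integration. The model is an assumption chosen for simplicity, not the one used in the paper: direct estimates are normal around the area means with known sampling variances `psi`, the area means are normal around a common mean `mu` with variance `sigma2`, `mu` has a flat prior and `sigma2` an inverse-gamma prior with hypothetical parameters `a` and `b`.

```python
import random
from statistics import mean, pvariance

def gibbs_hb(direct, psi, n_iter=3000, burn_in=1000, a=2.0, b=1.0, seed=0):
    """Return (posterior mean, posterior variance) for each area mean."""
    rng = random.Random(seed)
    m = len(direct)
    mu, sigma2 = mean(direct), 1.0          # starting values
    draws = [[] for _ in range(m)]
    for it in range(n_iter):
        # 1. Area means given mu and sigma2: normal, shrunk towards mu.
        theta = []
        for d, p in zip(direct, psi):
            g = sigma2 / (sigma2 + p)
            theta.append(rng.gauss(g * d + (1.0 - g) * mu, math_sqrt(g * p)))
        # 2. Common mean given the area means (flat prior on mu).
        mu = rng.gauss(mean(theta), math_sqrt(sigma2 / m))
        # 3. Variance given the rest: inverse gamma, drawn as 1 / gamma.
        shape = a + m / 2.0
        rate = b + 0.5 * sum((t - mu) ** 2 for t in theta)
        sigma2 = 1.0 / rng.gammavariate(shape, 1.0 / rate)
        if it >= burn_in:
            for i, t in enumerate(theta):
                draws[i].append(t)
    return [(mean(d), pvariance(d)) for d in draws]

def math_sqrt(value):
    """Square root helper kept separate so the sampler reads like the model."""
    return value ** 0.5

# Hypothetical direct estimates and sampling variances for six areas.
direct = [10.2, 12.9, 9.1, 15.3, 11.8, 14.0]
psi = [1.0, 0.5, 2.0, 0.8, 1.5, 0.6]
posterior = gibbs_hb(direct, psi)
```

Each pair in `posterior` plays the role described above: the first element is the point estimate and the second is the measure of its precision.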

    In recent years, comparative studies of the EBLUP, EB, and HB approaches have in general led to close values of the predictors. In particular situations, each of the three can work better than the others

    Qualitative data

    Qualitative data are becoming relevant in the agricultural economics field for two major reasons: firstly, theoretical developments stress the relevance of discrete and intrinsically qualitative phenomena; secondly, the increasing sophistication of the statistical approach in the field allows economists to draw quantitative conclusions from discrete data

    Qualitative data about households, including the role of women, the availability of services, and the presence or absence of infrastructure, are considered relevant factors that require close consideration in any economic model in the field

    Agricultural economists are also interested in the social analysis of rural territory. The segmentation of the universe, based on qualitative variables (such as gender, age, education), becomes relevant for defining the dynamics of specific groups and for analyzing issues of interest

    The shift of the policy focus from producer support to rural development in high-income countries is one of the major factors driving the new interest in the analysis of the qualitative aspects of agriculture

    In the context of qualitative data analysis, both continuous and binary or non-numeric data become available through the exploration of large data sets, such as agricultural census data. The complete exploitation of that large amount of information about farms is often feasible only with exploratory data analyses, in particular homogeneity and correspondence analysis

    On the other hand, it is recognized that some complex aspects of farm structure are correctly brought out only if all the available information is used in economic models at the same time

    Small area statistics are powerful methods for estimating small area farm characteristics, but some agricultural policies need further information, especially policies related to particular classes of farms

    The membership of farms in well-recognized classes, used jointly with other area information, is then a basic tool for policy makers

    From this standpoint, it is very useful to develop small area random effects models that combine continuous and categorical predictors and use binary response variables. The goal is to estimate proportions of farms

    The general logistic-linear mixed model small area analysis

    The extension of GLM models to small area analysis with binary response variables is given in Malec et al., 1997 and 1999. In the application example of that paper, the related unit level model combines small area-specific covariates with unit level demographic and socioeconomic data. Estimates were then obtained for individuals and classes, using an HB approach

    In that GLM model, it is assumed that each individual in the population is assigned to one of a set of mutually exclusive and exhaustive classes, based on the individual's demographic and socioeconomic status

    Given a vector of random effects, the estimation of the parameters of a GLMM for binary responses requires the computation of high-dimensional integrals, with dimension equal to the number of levels of the

    One approach in the literature was developed in the context of the HB framework. He and Sun, 2000, give an example of a hierarchical Bayes estimation procedure for a logistic-linear mixed model of hunting success rates at the sub-area level in post-season harvest surveys

    The model implements fixed week effects and random geographic effects, in the context of the autoregressive (AR) and conditional autoregressive (CAR) approaches to the analysis of spatial correlations between neighboring sub-areas. As in the case of the GLM represented by the logistic-linear model above, the estimation process needs Gibbs sampling procedures

    In the paper we introduce a Monte Carlo Newton-Raphson ML procedure (McCulloch, 1997) for estimating the parameters of the following general logistic-linear mixed model

    $\operatorname{logit}(p_{ik}) = \log\left(\frac{p_{ik}}{1 - p_{ik}}\right) = \mathbf{X}_k^T \boldsymbol{\beta}_i + u_i$

    The likelihood involves integral expressions with no closed form, so we propose to solve the estimation problem numerically via a Monte Carlo approach

    Another problem is how to generate starting values of the parameters in the likelihood expressions if the vector of random effects is not specified beforehand. A natural way to solve the problem is to adopt the Metropolis algorithm, which is a simple Markov Chain Monte Carlo (MCMC) algorithm
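The Monte Carlo idea for the likelihood integral can be sketched for a single area: the contribution of area i is the expectation, over its random effect, of the product of Bernoulli probabilities, and that expectation is approximated by an average over simulated effects. The sketch assumes a normal random effect with standard deviation `sigma_u` and a known coefficient vector `beta_i`; the function names and data are hypothetical, and this is not the estimation code of the paper.

```python
import math
import random

def inv_logit(eta: float) -> float:
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def mc_area_likelihood(y, x_classes, beta_i, sigma_u, n_draws=5000, seed=0):
    """Monte Carlo estimate of the marginal likelihood of one area.

    y          binary responses of the sampled units in the area
    x_classes  covariate vector of the class each unit belongs to
    beta_i     regression coefficients for the area
    sigma_u    standard deviation of the normal random effect
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        u = rng.gauss(0.0, sigma_u)          # simulated area effect
        lik = 1.0
        for yk, xk in zip(y, x_classes):
            eta = sum(b * v for b, v in zip(beta_i, xk)) + u
            p = inv_logit(eta)
            lik *= p if yk == 1 else (1.0 - p)
        total += lik
    return total / n_draws                   # average over the draws

# Hypothetical area with three sampled farms (intercept plus one covariate).
y = [1, 0, 1]
x_classes = [[1.0, 0.5], [1.0, -0.5], [1.0, 0.0]]
approx = mc_area_likelihood(y, x_classes, beta_i=[0.2, 1.0], sigma_u=1.0)
```

Drawing the effects from their own distribution is the simplest choice; importance sampling or MCMC draws can reduce the Monte Carlo error when the integrand is concentrated away from zero.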

    The basic characteristic of MCMC is that the sequence of generated points takes a kind of random walk in parameter space, instead of each point being generated independently of the others

    Moreover, the probability of jumping from one point to another depends only on the last point and not on the entire previous history (this is the defining property of a Markov chain)

    The paper shows the Monte Carlo approach to the Newton-Raphson procedure for estimating the logistic-linear parameters: an iterative procedure that leads to convergent ML estimates, under the assumption of normality
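The random walk behaviour can be illustrated with a Metropolis sampler for the random effect of one area, given its binary responses. This is a sketch under stated assumptions: the fixed part of the linear predictor (`eta_fixed`) is held at known values, the random effect is normal with standard deviation `sigma_u`, and the proposal is a symmetric normal step. The proposal used in the paper or in McCulloch (1997) may differ, and all names are hypothetical.

```python
import math
import random
from statistics import mean

def log_target(u, y, eta_fixed, sigma_u):
    """Log of f(y | u) times the normal density of u, up to a constant."""
    value = -0.5 * (u / sigma_u) ** 2
    for yk, ek in zip(y, eta_fixed):
        eta = ek + u
        # log p if yk == 1, log(1 - p) if yk == 0, with p = inv_logit(eta)
        value += yk * eta - math.log1p(math.exp(eta))
    return value

def metropolis_u(y, eta_fixed, sigma_u, n_iter=5000, step=1.0, seed=0):
    """Random-walk Metropolis chain for u; returns (chain, acceptance rate)."""
    rng = random.Random(seed)
    u = 0.0
    current = log_target(u, y, eta_fixed, sigma_u)
    chain, accepted = [], 0
    for _ in range(n_iter):
        proposal = u + rng.gauss(0.0, step)   # depends only on the last point
        candidate = log_target(proposal, y, eta_fixed, sigma_u)
        # Accept with probability min(1, target ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            u, current = proposal, candidate
            accepted += 1
        chain.append(u)
    return chain, accepted / n_iter

# Hypothetical area where all eight sampled farms have response 1.
chain, rate = metropolis_u(y=[1] * 8, eta_fixed=[0.0] * 8, sigma_u=1.0)
u_mean = mean(chain[1000:])                   # discard a burn-in period
```

In a Monte Carlo Newton-Raphson scheme, averages over such draws would stand in for the conditional expectations needed in each Newton-Raphson update of the fixed parameters.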

    Conclusions

    In this paper, a Monte Carlo Newton-Raphson algorithm has been outlined, assuming normality of the random area effects, in order to address the ML estimation issues related to the logistic-linear mixed model, in the context of qualitative small area estimation

    As generally recognized, the focus of recent economic theory on qualitative data can be summarized in two major points: the increasing interest in the analysis of discrete phenomena, and the explanatory power of qualitative variables in describing the current trends in the agricultural sector

    Statistical methods able to convey qualitative information in estimation models are able to increase efficiency

    Thank you

    Please find many more methodological details in the paper available on the conference website

    e-mail to: [email protected]