7/31/2019 Gen Linear
General linear models in small area estimation: an assessment in agricultural surveys
Carlo Russo, Massimo Sabbatini, and Renato Salvatore
University of Cassino, Italy
The MEXSAI Conference
Some small area estimation references
Ghosh M., Rao J. N. K. (1994), Small area estimation: an appraisal, Statistical Science, Vol. 9, No. 1, pp. 55-93
He Z., Sun D. (2000), Hierarchical Bayes estimation of hunting success rates with spatial correlations, Biometrics, 56, 102-109
Malec D., Sedransk J., Moriarity C. L., LeClere F. B. (1997), Small area inference for binary variables in the National Health Interview Survey, Journal of the American Statistical Association, Vol. 92, 439, 815-826
Rao J. N. K. (2002), Small area estimation with applications to agriculture, Proceedings of the Conference on agricultural and environmental statistical applications in Rome, Vol. III, 555-564
Rao J. N. K. (2003), Small area estimation, Wiley, London
Small area estimation: a simple outline
The term small area usually denotes a small geographical area, such as a county, a province, an administrative area or a census division
From a statistical point of view, a small area is a small domain, that is, a small subpopulation made up of a specific demographic and socioeconomic group of people within a larger geographical area
Sample survey data provide reliable estimators of totals and means for large areas and domains. However, it is recognized that the usual direct survey estimators for a small area have unacceptably large standard errors, because the sample size in the area is small
Small area estimation: a simple outline
Sample sizes in small areas are small because the overall sample size of a survey is usually chosen to provide a specified accuracy at a macro level of aggregation, that is, national territories, regions and so on (Ghosh and Rao, 1994)
Small area statistics are important tools for planning agricultural policies in specific regional and administrative areas
Demand for this information also comes from other sectors, such as the private sector, especially for questions related to local social and economic conditions, local area marketing research, and so on
Small area estimation: a simple outline
Small area statistics are based on a collection of statistical methods that borrow strength from related or similar small areas, through statistical models that connect the variables of interest in small areas with vectors of supplementary data, such as demographic, behavioral and economic information coming from administrative, census and specific sample survey records
In addition, efficient small area statistics provide excellent local estimates of population, farms, and other characteristics of interest in post-censal years
Small area estimation: a simple outline
The most commonly used techniques for small area estimation are the empirical Bayes (EB), hierarchical Bayes (HB) and empirical best linear unbiased prediction (EBLUP) procedures (Rao, 2003)
Some uses of these techniques in agricultural statistics involve satellite data and, more generally, sample surveys with different purposes combined in model-based frameworks
There are two types of small area models that include random area-specific effects. In the first type, the basic area level model, the response is linked to area-specific auxiliary variables, because auxiliary data of this kind have limited availability at the unit level
Small area estimation: a simple outline
The second type is the unit level model, in which element-specific auxiliary data are available for the population elements (Ghosh and Rao, 1994; Rao, 2002)
The simplest way to produce small area statistics is, however, to derive synthetic estimates from large area data under assumptions on the related local areas: synthetic estimators are generally used because they apply to general sampling designs and because they gain efficiency by exploiting information from similar small areas
The problem is that estimators of this type are potentially design-biased. Following the composite estimate approach to small area analysis, the way to balance the bias of the synthetic estimator against the ...
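As a rough illustration of the synthetic idea and of its potential bias, the sketch below applies a large-area ratio to each small area. Everything in it is an assumption made for illustration: the function name, the made-up figures, and the use of true totals in place of survey estimates.

```python
import numpy as np

def synthetic_estimates(y_large_total, x_large_total, x_area_totals):
    """Ratio-synthetic estimates: apply the large-area ratio to each small area."""
    ratio = y_large_total / x_large_total
    return ratio * np.asarray(x_area_totals, dtype=float)

# Hypothetical figures: an auxiliary total x (e.g. farm land) and a target
# total y (e.g. crop output) for one large area split into three small areas.
x_area = np.array([120.0, 300.0, 80.0])    # auxiliary totals, assumed known
true_ratio = np.array([2.0, 2.5, 4.0])     # area-specific ratios, unknown in practice
y_area = true_ratio * x_area               # true small area totals

synth = synthetic_estimates(y_area.sum(), x_area.sum(), x_area)
bias = synth - y_area                      # nonzero whenever an area departs from the common ratio
```

The synthetic estimates add up to the large-area total, but each one is off by an amount that reflects how far the area departs from the common ratio, which is the design bias the slide refers to.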
Small area estimation: a simple outline
4) $\hat{\theta}_i = \theta_i + e_i, \quad i = 1, \dots, m$
Equation 4) is the direct area estimator with sampling errors
Combining 2) and 4):
5) $\hat{\theta}_i = \mathbf{x}_i^{T}\boldsymbol{\beta} + z_i u_i + e_i, \quad i = 1, \dots, m$
Model 5) involves design-based random variables and, at the same time, model-based random variables. It is an example of a general linear mixed model (GLMM) with diagonal covariance structure
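To make the phrase "diagonal covariance structure" concrete, model 5) can be written in matrix form. This is a sketch under assumptions not stated on the slide: the area effects $u_i$ are independent with common variance $\sigma_u^2$, the sampling errors $e_i$ are independent with known variances $\psi_i$, and the two are independent of each other.

```latex
\hat{\boldsymbol{\theta}} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{u} + \mathbf{e},
\qquad
\mathbf{Z} = \operatorname{diag}(z_1, \dots, z_m),
\qquad
\operatorname{Var}(\hat{\boldsymbol{\theta}}) = \mathbf{V}
= \operatorname{diag}\left(z_1^{2}\sigma_u^{2} + \psi_1, \dots, z_m^{2}\sigma_u^{2} + \psi_m\right)
```

Because $\mathbf{V}$ is diagonal, the areas are uncorrelated given $\boldsymbol{\beta}$, and all the matrix inversions needed for estimation reduce to element-wise divisions.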
Small area estimation: a simple outline
The BLUP (best linear unbiased prediction) estimator is a weighted average of the design-based estimator and the regression-synthetic estimator
The MSE of the BLUP estimator depends on the variance parameter of the random area effects
In practical applications this parameter is unknown, and it is replaced by an estimator
We then have a two-stage estimator, called the empirical BLUP (EBLUP)
The MSE of the EBLUP estimator is insensitive to the choice of the estimator of the random area effect variance, but it is larger than the MSE of the BLUP estimator, since the variance parameter has to be estimated
Assuming normality of the random effects, the area variance parameters can be estimated by either maximum likelihood (ML) or restricted maximum likelihood (REML) methods
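The two-stage logic above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the area-level model is taken with $z_i = 1$, the sampling variances `psi` are treated as known, and the ML estimate of the area-effect variance is found by a plain grid search over the profile log-likelihood. The function name and the simulated data are hypothetical.

```python
import numpy as np

def eblup_area_level(theta_hat, X, psi, grid=None):
    """EBLUP for the area-level model with z_i = 1 and known sampling variances psi."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    X = np.asarray(X, dtype=float)
    psi = np.asarray(psi, dtype=float)
    if grid is None:
        grid = np.linspace(0.0, 10.0 * theta_hat.var(), 2001)

    def gls_beta(sigma2_u):
        w = 1.0 / (sigma2_u + psi)          # inverse of the diagonal covariance
        XtW = X.T * w
        return np.linalg.solve(XtW @ X, XtW @ theta_hat)

    def profile_loglik(sigma2_u):
        resid = theta_hat - X @ gls_beta(sigma2_u)
        v = sigma2_u + psi
        return -0.5 * np.sum(np.log(v) + resid**2 / v)

    # Stage 1: ML estimate of the area-effect variance (simple grid search)
    sigma2_u = grid[np.argmax([profile_loglik(s) for s in grid])]
    # Stage 2: plug the estimate into the BLUP formula
    beta = gls_beta(sigma2_u)
    gamma = sigma2_u / (sigma2_u + psi)     # weight on the direct estimator
    synthetic = X @ beta                    # regression-synthetic estimator
    return gamma * theta_hat + (1.0 - gamma) * synthetic, gamma, sigma2_u

# Simulated example with hypothetical parameter values
rng = np.random.default_rng(0)
m = 30
X = np.column_stack([np.ones(m), rng.uniform(0.0, 1.0, m)])
psi = rng.uniform(0.5, 2.0, m)
theta = X @ np.array([1.0, 2.0]) + rng.normal(0.0, 1.0, m)
theta_hat = theta + rng.normal(0.0, np.sqrt(psi))
eblup, gamma, sigma2_u = eblup_area_level(theta_hat, X, psi)
```

Areas with a large sampling variance get a small weight `gamma`, so their estimate leans on the regression-synthetic part; areas with precise direct estimates stay close to them.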
Small area estimation: a simple outline
If, instead of EBLUP and EB, we follow the HB (hierarchical Bayes) approach, a prior distribution on the model parameters is specified first, and then the posterior distribution of the parameters of interest is obtained
The usual small area estimation problems are solved within the posterior distribution framework. A parameter of interest is estimated by its posterior mean, and the precision of the estimate, in terms of its MSE, is measured by the posterior variance
The HB approach is computationally intensive, in many cases involving high dimensional integration
Small area estimation: a simple outline
Some tools, such as Gibbs sampling and importance sampling, the latter employed jointly with Monte Carlo numerical integration methods, are commonly used to overcome some of the computational problems
In recent years, comparative studies of the EBLUP, EB and HB approaches have in general led to close values of the predictors. Each of the three can work better than the others in particular situations
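A minimal sketch of how Gibbs sampling delivers the HB posterior means and variances may help here. It assumes a deliberately simple hierarchy that is not taken from the slides: $\hat{\theta}_i \mid \theta_i \sim N(\theta_i, \psi_i)$ with known $\psi_i$, $\theta_i \sim N(\mu, \sigma^2)$, a flat prior on $\mu$ and an inverse-gamma$(a, b)$ prior on $\sigma^2$. The function name, the prior settings and the data are hypothetical.

```python
import numpy as np

def gibbs_area_level(theta_hat, psi, n_iter=4000, burn_in=1000, a=2.0, b=1.0, seed=0):
    """Gibbs sampler for theta_hat_i ~ N(theta_i, psi_i), theta_i ~ N(mu, sigma2)."""
    rng = np.random.default_rng(seed)
    theta_hat = np.asarray(theta_hat, dtype=float)
    psi = np.asarray(psi, dtype=float)
    m = theta_hat.size
    mu, sigma2 = theta_hat.mean(), theta_hat.var() + 1e-6
    draws = np.empty((n_iter - burn_in, m))
    for t in range(n_iter):
        # theta_i | rest: normal, centred between the direct estimate and mu
        gamma = sigma2 / (sigma2 + psi)
        theta = rng.normal(gamma * theta_hat + (1.0 - gamma) * mu, np.sqrt(gamma * psi))
        # mu | rest, under the flat prior
        mu = rng.normal(theta.mean(), np.sqrt(sigma2 / m))
        # sigma2 | rest: inverse-gamma, drawn as rate / Gamma(shape, 1)
        shape = a + m / 2.0
        rate = b + 0.5 * np.sum((theta - mu) ** 2)
        sigma2 = rate / rng.gamma(shape)
        if t >= burn_in:
            draws[t - burn_in] = theta
    # Posterior mean = HB estimate, posterior variance = its measure of precision
    return draws.mean(axis=0), draws.var(axis=0)

# Hypothetical direct estimates and sampling variances for five areas
theta_hat = np.array([2.1, 3.5, 1.2, 4.8, 2.9])
psi = np.array([0.5, 1.0, 0.8, 1.2, 0.6])
post_mean, post_var = gibbs_area_level(theta_hat, psi)
```

Each full conditional is a standard distribution, so no numerical integration is needed; the high dimensional integral is replaced by averages over the simulated draws.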
Qualitative data
Qualitative data are becoming relevant in the field of agricultural economics for two major reasons: first, theoretical developments stress the relevance of discrete and intrinsically qualitative phenomena; second, the increasing sophistication of the statistical approaches used in the field allows economists to draw quantitative conclusions from discrete data
Qualitative data about households, including the role of women, the availability of services and the presence or absence of infrastructure, are considered relevant factors that require close consideration in any economic model in the field
Qualitative data
Agricultural economists are also interested in the social analysis of the rural territory. Segmentation of the universe based on qualitative variables (such as gender, age, education) becomes relevant to define the dynamics of specific groups and to analyze issues of interest
The shift of the policy focus from producer support to rural development in high income countries is one of the major factors behind the new interest in the analysis of the qualitative aspects of agriculture
In the context of qualitative data analysis, continuous, binary and nonnumeric data are all available in large data sets, such as agricultural census data. Full exploitation of this large amount of information about farms is often feasible only with exploratory data analyses, in particular homogeneity and correspondence analysis
Qualitative data
On the other hand, it is recognized that some complex aspects of farm structure are correctly brought out only if all the available information is used in the economic models at the same time
Small area statistics are powerful methods for estimating the characteristics of farms in small areas, but some agricultural policies need further information, especially those aimed at particular classes of farms
The membership of farms in well-recognized classes, used jointly with other area information, is therefore a basic tool for policy makers
From this standpoint, it is very useful to develop small area random effects models that combine continuous and categorical predictors and use binary response variables. The goal is to estimate proportions of farms
The general logistic-linear mixed model: small area analysis
The extension of GLM models to small area analysis with binary response variables is given in Malec et al., 1997 and 1999. In the application example of that paper, the unit level model combines small area-specific covariates with unit level demographic and socioeconomic data. Estimates relating individuals and classes were then obtained using an HB approach
In that GLM model, it is assumed that each individual in the population is assigned to one of a set of mutually exclusive and exhaustive classes, based on the individual's demographic and socioeconomic status
Given a vector of random effects, the estimation of the parameters of a GLMM for binary responses requires the computation of high dimensional integrals, with dimension equal to the number of levels of the ...
The general logistic-linear mixed model: small area analysis
One approach in the literature works within the HB framework. He and Sun, 2000, give an example of a hierarchical Bayes estimation procedure for a logistic-linear mixed model of hunting success rates at the sub-area level in post-season harvest surveys
The model includes fixed week effects and random geographic effects, in the context of the autoregressive (AR) and conditional autoregressive (CAR) approaches to the analysis of spatial correlations between neighboring sub-areas. As in the case of the GLM represented by the logistic-linear model above, the estimation process requires Gibbs sampling procedures
The general logistic-linear mixed model: small area analysis
In the paper we introduce a Monte Carlo Newton-Raphson ML procedure (McCulloch, 1997) for estimating the parameters of the following general logistic-linear mixed model
$\log\left(p_{ik}/(1-p_{ik})\right) = \operatorname{logit}(p_{ik}) = \mathbf{X}_{ik}^{T}\boldsymbol{\beta} + u_i$
The estimation problem, which involves likelihood integrals without closed form expressions, is solved numerically via a Monte Carlo approach.
Another problem is how to generate starting values of the parameters in the likelihood expressions if the vector of random effects is not specified beforehand. A natural way to solve the problem is to adopt the Metropolis algorithm, which is a simple Markov Chain Monte Carlo (MCMC) algorithm
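As a hedged illustration of the Metropolis idea, the sketch below draws the random effect $u_i$ of a single area from its conditional distribution given the binary responses, with the fixed part of the linear predictor and the variance treated as known. The random-walk proposal, the step size, the function names and the data are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def log_target(u, y, eta_fixed, sigma2):
    """Log of p(y | u) * p(u), up to a constant, for one area."""
    eta = eta_fixed + u
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))   # Bernoulli-logit log-likelihood
    return loglik - 0.5 * u**2 / sigma2                 # N(0, sigma2) density of u

def metropolis_u(y, eta_fixed, sigma2, n_draws=2000, step=1.0, seed=0):
    """Random-walk Metropolis draws of the area effect u given the data."""
    rng = np.random.default_rng(seed)
    u = 0.0
    current = log_target(u, y, eta_fixed, sigma2)
    draws = np.empty(n_draws)
    for t in range(n_draws):
        proposal = u + step * rng.normal()              # symmetric proposal
        candidate = log_target(proposal, y, eta_fixed, sigma2)
        # Accept with probability min(1, target ratio)
        if np.log(rng.uniform()) < candidate - current:
            u, current = proposal, candidate
        draws[t] = u
    return draws

# Hypothetical area with 7 successes out of 8 units and a zero fixed part
y = np.array([1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0])
draws = metropolis_u(y, eta_fixed=np.zeros(8), sigma2=1.0)
u_mean = draws[500:].mean()     # Monte Carlo estimate of E(u | y) after burn-in
```

Only ratios of the target density enter the acceptance step, so the normalizing integral that makes the likelihood intractable is never computed.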
The general logistic-linear mixed model: small area analysis
The basic characteristic of an MCMC algorithm is that the sequence of generated points takes a kind of random walk in parameter space, instead of each point being generated independently of the others
Moreover, the probability of jumping from one point to another depends only on the last point and not on the entire previous history (this is the defining property of a Markov chain)
The paper shows the Monte Carlo approach to the Newton-Raphson procedure for estimating the logistic-linear parameters: an iterative procedure that leads to convergent ML estimates, under the assumption of normality
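The iteration can be sketched as follows for a random-intercept logistic model. This is a simplified reading of the Monte Carlo Newton-Raphson idea, not the procedure of the paper: random effects are drawn with a Metropolis step that proposes from $N(0, \sigma^2)$, so the acceptance ratio reduces to a likelihood ratio, and the score and information are replaced by averages over the draws. Function names, tuning constants and the simulated data are all hypothetical.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def area_loglik(u, y, X, beta, area, m):
    """Bernoulli-logit log-likelihood of each area, given its random effect."""
    eta = X @ beta + u[area]
    return np.bincount(area, weights=y * eta - np.logaddexp(0.0, eta), minlength=m)

def mcnr(y, X, area, m, n_iter=30, n_draws=200, burn_in=50, seed=0):
    rng = np.random.default_rng(seed)
    beta, sigma2 = np.zeros(X.shape[1]), 1.0
    u = np.zeros(m)
    for _ in range(n_iter):
        current = area_loglik(u, y, X, beta, area, m)
        score = np.zeros_like(beta)
        info = np.zeros((beta.size, beta.size))
        u2 = 0.0
        for s in range(burn_in + n_draws):
            # Metropolis step, one accept/reject decision per area
            proposal = rng.normal(0.0, np.sqrt(sigma2), m)
            candidate = area_loglik(proposal, y, X, beta, area, m)
            accept = np.log(rng.uniform(size=m)) < candidate - current
            u = np.where(accept, proposal, u)
            current = np.where(accept, candidate, current)
            if s >= burn_in:
                p = expit(X @ beta + u[area])
                score += X.T @ (y - p)                   # score for beta given u
                info += (X.T * (p * (1.0 - p))) @ X      # information for beta given u
                u2 += np.mean(u**2)
        # Newton-Raphson update with Monte Carlo averages
        beta = beta + np.linalg.solve(info / n_draws, score / n_draws)
        sigma2 = u2 / n_draws
    return beta, sigma2

# Simulated data: 40 areas with 25 units each, hypothetical parameter values
rng = np.random.default_rng(1)
m, n_per = 40, 25
area = np.repeat(np.arange(m), n_per)
X = np.column_stack([np.ones(m * n_per), rng.normal(size=m * n_per)])
u_true = rng.normal(0.0, 1.0, m)
y = rng.binomial(1, expit(X @ np.array([-0.5, 1.0]) + u_true[area])).astype(float)
beta_hat, sigma2_hat = mcnr(y, X, area, m)
```

Because the averages are simulated, the estimates fluctuate slightly from one iteration to the next; in practice the number of draws is usually increased as the iterations proceed.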
Conclusions
In this paper, a Monte Carlo Newton-Raphson algorithm has been outlined, assuming normality of the random area effects, in order to address the ML estimation issues related to the logistic-linear mixed model, in the context of qualitative small area estimation
As is generally recognized, the focus of recent economic theory on qualitative data can be summarized in two major points: the increasing interest in the analysis of discrete phenomena, and the explanatory power of qualitative variables in describing current trends in the agricultural sector
Statistical methods able to bring qualitative information into the estimation models are able to increase efficiency
Thank you
Please find many more methodological details in the paper available on the conference website
e-mail to: [email protected]