Post on 17-Jan-2016
Rod Little
A modeler’s perspective on Total Survey Error
Modeler’s view of TSE 2
Outline
• Total Survey Error: strengths and weaknesses, according to Groves and Lyberg
• A modeler’s view of TSE
• An application: multiple imputation for regression involving covariates with measurement error, including data from a calibration sample
– Epidemiological, but with implications for survey practice
– Involving calibration samples, heteroscedastic measurement errors
– Compared with classical calibration methods
My philosophy: Calibrated Bayes
• Bayes for inference, frequentist for model development and assessment: seek inferences that are “frequentist calibrated”
– In surveys this often leads to “weak, robust” models with “reference” prior distributions: “model-based, design-assisted inference” (Little 2012)
• Bayes inference is optimal under well-specified models
• Frequentist calibration creates resistance to “bad models”
– E.g. “design-consistency” forces models that account for survey design
– unlike e.g. models in Hansen, Madow and Tepping (1983)
• Frequentist simulations suggest this approach can yield better repeated-sampling properties than design-based approaches
– Bayes propagates error in estimating parameters
Total Survey Error (TSE)
• Groves and Lyberg (GL, 2010)
• TSE paradigm is the conceptual foundation of the field of survey methodology.
• Quality properties of survey statistics are functions of essential survey conditions that are independent of the sample design.
• More successful as an intellectual framework than a unified statistical model of error properties of survey statistics.
• More importation of modeling perspectives from other disciplines could enrich the paradigm.
– In this case, biostatistics and epidemiology
Strengths of TSE (GL)
• Explicit attention to the decomposition of errors
• Separation of phenomena affecting statistics in various ways
• Success in forming the conceptual basis of the field of survey methodology, pointing the direction for new research.
Current weaknesses of TSE (GL)
• Key quality concepts are not included (notably those of the user)
• Quantitative measurement of many components burdensome and lagging
• Has not led to enriched error measurement in practical surveys
• Assumptions required for some estimators of error terms are frequently not true
• Mismatch between existing error models and theoretical causal models of the error mechanisms
• Misplaced focus on descriptive statistics, and failure to integrate error models developed in other fields
Unified modeling addresses these points
TSE Components Linked to Steps in Measurement and Representational Inference Process (Groves et al. 2004)
[Figure: flow diagram with two sides leading to the Survey Statistic.
Measurement side (errors of observation): Construct, Measurement, Response, Edited Data; error components: Validity, Measurement Error, Processing Error.
Representation side (errors of nonobservation): Inferential Population, Target Population, Sampling Frame, Sample, Respondents; error components: Coverage Error, Sampling Error, Nonresponse Error.]
Commentary 1
• Dual inferential approaches (model-based for measurement, design-based for analysis) inhibit integration of the two streams into a unified analysis
– “The isolation of survey statisticians and methodologists from the mainstream of social statistics has, in our opinion, retarded the importation of model-based approaches to many of the error components in the total survey error format.” (GL)
– The great disappointment regarding the TSE perspective is that it has not led to routine fuller measurement of the statistical error properties of survey statistics. While official statisticians and much of social science have accepted the probability sampling paradigm and routinely provide estimates of sampling variance, there is little evidence that the current practice of surveys in the early 21st century measures anything more than sampling variance routinely.
Commentary 1
• Dual inferential approaches (model for measurement, design for analysis) inhibit integration of the two streams into a unified analysis
– “There are exceptions worth noting. The tendency for some continuing surveys to develop error or quality profiles is encouraging (Kalton, Winglee, and Jabine, 1998; Kalton, Winglee, Krawchuk, and Levine, 2000; Lynn, 2003). These profiles contain the then-current understanding of statistical error properties of the key statistics produced by the survey. Through these quality profiles, surveys with rich methodological traditions produce well-documented sets of study results auxiliary to the publications of findings. None of the quality profiles have attempted full measurement of all known errors for a particular statistic.” (GL)
Commentary 2
• The standard decomposition of RMSE into components of bias and variance in the TSE approach implies a particular model that may not be realistic, and is restricted to simple statistics like means
– “3.6 Assumptions Patently Wrong for Large Classes of Statistics. Many of the error model assumptions are wrong most of the time. For example, the Kish (1962) linear model for interviewer effects assume that the response deviations are random effects of the interviewer and respondent, uncorrelated to the true value for the respondent. However, for example, reporting of drug usage has been found to have an error structure highly related to the true value.” (GL)
– “3.7 Mismatch between Error Models and Theoretical Causal Models of Error. The existing survey models are specified as variance components models devoid of the causes of the error source itself… Missing in the history of the TSE formulation is the partnership between scientists who study the causes of the behavior producing the statistical error and the statistical models used to describe them.” (GL)
Commentary 2
• The standard decomposition of RMSE into components of bias and variance in the TSE approach implies a particular model that may not be realistic, and is restricted to simple statistics like means
– “3.8 Misplaced Focus on Descriptive Statistics. Hansen, Hurwitz, and Pritzker (1964); Hansen, Hurwitz, and Bershad (1961), Biemer and Stokes (1991), and Groves (1989) all describe error models for sample means and/or estimates of population totals… survey data in our age are used only briefly for estimates of means and totals. Most analysts are interested more in subclass differences, order statistics, analytic statistics and a whole host of model-based parameter estimates.” (GL)
• A unified model for sampling and nonsampling error makes assumptions explicit, and allows inference for parameters other than means
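As a reminder of what the bias/variance decomposition of (R)MSE looks like for a simple statistic, here is a minimal sketch that is not part of the slides. It checks numerically that mean squared error equals squared bias plus variance for a deliberately biased estimator of a mean. The shrinkage factor 0.9, the sample sizes and all names are illustrative assumptions.

```python
import numpy as np

def mse_decomposition(estimates, truth):
    """Return (mse, bias_squared, variance) for a set of simulated estimates."""
    estimates = np.asarray(estimates, dtype=float)
    bias = estimates.mean() - truth
    variance = estimates.var()                  # ddof=0, so the identity is exact
    mse = np.mean((estimates - truth) ** 2)
    return mse, bias ** 2, variance

rng = np.random.default_rng(0)
truth = 5.0
# 2000 replicate samples of size 50; the estimator 0.9 * mean is biased on purpose
samples = rng.normal(truth, 2.0, size=(2000, 50))
estimates = 0.9 * samples.mean(axis=1)

mse, bias_sq, var = mse_decomposition(estimates, truth)
rmse = np.sqrt(mse)
```

The decomposition is exact for this scalar estimator, which is the point made on the slide: it is tied to simple statistics such as means and says nothing about how the errors arise.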
TSE as a missing data problem (Rubin, 1974)
                      Z   A   X   Y1  Y2  …  YJ   D1  …  DK
Experimental units    ?   ?   √   √   ×   …  ×    ?   …  ?
                      ?   ?   √   ×   ×   …  √    ?   …  ?
Sample respondents    √   √   ×   √   ×   …  ×    ?   …  ?
                      √   √   ×   ×   ×   …  √    ?   …  ?
Unit nonrespondents   √   √   ×   ×   ×   …  ×    ×   …  ×
                      √   √   ×   ×   ×   …  ×    ×   …  ×
Nonsampled units      √   ×   ×   ×   ×   …  ×    ×   …  ×
                      √   ×   ×   ×   ×   …  ×    ×   …  ×
√ = observed, × = missing, ? = observed or missing
Z = frame/design variables, A = meta-data (e.g. mode)
X = unobserved true, Yj = observed for mode j
Dk = kth variable not subject to major measurement error
Errors of observation concern columns
Errors of non-observation concern rows
Application: measurement error in epidemiology
• Many variables in epidemiology are measured with error (dietary intake, biomarkers, …)
• Measurement error attenuates effect of variables, distorts inferences for other variables
• E.g. effect of dietary intake on cancer
• Existing methods (e.g. regression calibration) assume measurement variance is constant, but this is often a poor assumption
• Proposed approach: multiple imputation under a Bayesian model with non-constant variance
– (Guo and Little 2011; Guo, Little and McConnell 2011)
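The attenuation point above can be illustrated with a small simulation. This is a sketch under assumed values, not taken from the slides: a standard normal true covariate, classical additive measurement error with standard deviation 0.8, and a true slope of 0.75. With these choices the naive slope should shrink by roughly the reliability ratio var(X) / (var(X) + var(error)).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
gamma0, gamma1 = 0.0, 0.75            # illustrative outcome model (assumed values)

x = rng.normal(0.0, 1.0, n)           # true covariate, unobserved in practice
y = x + rng.normal(0.0, 0.8, n)       # error-prone measurement, classical additive error
d = gamma0 + gamma1 * x + rng.normal(0.0, 1.0, n)

slope_true = np.polyfit(x, d, 1)[0]   # regression of D on the true covariate
slope_naive = np.polyfit(y, d, 1)[0]  # regression of D on the error-prone covariate

reliability = 1.0 / (1.0 + 0.8 ** 2)  # var(x) / (var(x) + var(error))
expected_naive = gamma1 * reliability
```

Here `slope_naive` lands close to `expected_naive`, well below `slope_true`, which is the attenuation the correction methods in the following slides try to undo.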
Data for two vitamins
Measurement Error Model
This model links unobserved covariate X with error-prone measurement Y, considering potentially nonlinear mean functions and heteroscedastic measurement error
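\[ (y_i \mid x_i, \theta) \overset{\text{ind}}{\sim} N\big(\mu(x_i; \beta),\; g(x_i; \lambda, \sigma^2)\big) \]
with $\theta = (\beta, \lambda, \sigma^2)$, and the function $g$ to model heteroscedasticity. Specifically, we assume that
\[ \mu(x_i; \beta) = \beta_0 + \beta_1 x_i, \qquad g(x_i; \lambda, \sigma^2) = \sigma^2 x_i^{2\lambda} \]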
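To make the heteroscedastic measurement model concrete, the following sketch simulates measurements with a linear mean β0 + β1·x and variance σ²·|x|^(2λ), then does a rough check of λ from the residuals. The lognormal distribution for the true values, the parameter values and the function name are assumptions made for illustration only.

```python
import numpy as np

def simulate_measurements(x, beta0, beta1, sigma, lam, rng):
    """Draw y_i ~ N(beta0 + beta1 * x_i, sigma^2 * |x_i|^(2 * lam))."""
    x = np.asarray(x, dtype=float)
    sd = sigma * np.abs(x) ** lam          # standard deviation grows with |x| when lam > 0
    return beta0 + beta1 * x + rng.normal(size=x.shape) * sd

rng = np.random.default_rng(2)
x = rng.lognormal(mean=0.0, sigma=0.5, size=20000)   # positive true values (assumed)
y = simulate_measurements(x, beta0=0.0, beta1=0.8, sigma=0.5, lam=0.4, rng=rng)

# Rough estimate of lambda: slope of log squared residuals on log(x^2)
b1, b0 = np.polyfit(x, y, 1)
resid = y - (b0 + b1 * x)
lam_hat = np.polyfit(np.log(x ** 2), np.log(resid ** 2), 1)[0]
```

Setting `lam=0` recovers the constant-variance model assumed by the classical methods.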
Estimates of λ for eight analytes, linear model.

Analyte                    Residual regression   ML estimate   Bayes post. mean   Bayes 95% HPD
Gamma tocopherol           0.56                  0.56          0.56               (0.51, 0.61)
Lutein                     0.61                  0.63          0.63               (0.58, 0.68)
Alpha tocopherol           0.62                  0.62          0.62               (0.57, 0.67)
Delta tocopherol           0.63                  0.65          0.65               (0.53, 0.77)
Beta Cryptoxanthin (BC)    0.65                  0.64          0.64               (0.58, 0.71)
Lycopene                   0.70                  0.72          0.71               (0.67, 0.76)
Retinol                    0.71                  0.68          0.67               (0.63, 0.72)
Carotene                   0.77                  0.72          0.72               (0.67, 0.77)
Regression on Covariates with measurement error
X: covariate of interest but unobserved
Y: observed error-prone measurement related to X
D: response variable, interest in regression of D on X (more generally D can include other covariates).
[Figure: data patterns over D, X and Y for the main sample and the calibration sample. Panels: (a) External Calibration design; (b) Internal Calibration design, with D measured in calibration sample.]
Analysis Model
This model links unobserved covariate X with outcome D. For simplicity we assume the model
\[ (d_i \mid x_i, \gamma) \overset{\text{ind}}{\sim} N(\gamma_0 + \gamma_1 x_i,\; \tau^2) \]
where $\gamma = (\gamma_0, \gamma_1, \tau^2)$, although more generally nonlinear relationships between D and X can be modeled.
Our aim is to estimate the unknown regression parameters, taking into account the measurement error in X.
(B) Non-differential measurement error (NDME) assumption
[Figure: data patterns over D, X and Y for the calibration sample and the main study. Panels: (a) External calibration design; (b) Internal calibration design.]
Non-differential measurement error: D is independent of Y given X
With external calibration design, the assumption is needed to identify parameters
With internal calibration design, the assumption is not needed, but it improves efficiency if assumed and true
Conventional Calibration (CA)
Fits an appropriate curve to the calibration data
Estimates the true value of X by inverting the fitted calibration curve (usually assumes a linear association)
Regresses D on calibration estimates of X
[Figure: calibration curve, axes X and Y.]
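The three CA steps can be sketched as follows, assuming a straight-line calibration curve. The function name and the simulated data (standard normal X, measurement slope 0.8, error standard deviation 0.5, true outcome slope 0.75) are illustrative assumptions, not the settings used in the talk.

```python
import numpy as np

def conventional_calibration(x_cal, y_cal, y_main, d_main):
    """Invert a straight-line calibration curve, then regress D on the inverted values."""
    b, a = np.polyfit(x_cal, y_cal, 1)                  # calibration curve: y ~ a + b * x
    x_hat = (np.asarray(y_main, dtype=float) - a) / b   # inverse prediction of X
    slope, intercept = np.polyfit(x_hat, d_main, 1)     # regress D on calibration estimates
    return intercept, slope

rng = np.random.default_rng(3)

def draw(n):
    x = rng.normal(0.0, 1.0, n)
    y = 0.8 * x + rng.normal(0.0, 0.5, n)
    d = 0.75 * x + rng.normal(0.0, 1.0, n)
    return x, y, d

x_cal, y_cal, _ = draw(2000)          # calibration sample: X and Y observed
_, y_main, d_main = draw(20000)       # main study: Y and D observed
ca_intercept, ca_slope = conventional_calibration(x_cal, y_cal, y_main, d_main)
```

In this setup the inverted values still carry the measurement noise, so `ca_slope` comes out clearly below the true 0.75.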
Regression Calibration (RC)
\[ \hat{X} = E(X \mid Y) \]
Estimates the regression of X on Y using the calibration data
Replaces the unknown values of X in main study with predictions
Simple and easy to apply (Carroll and Stefanski, 1990)
Standard errors: asymptotic formula, or bootstrap of both data sources
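A minimal sketch of RC under a linear approximation to E(X | Y) follows. The function name and the simulated data are illustrative assumptions (homoscedastic normal errors, true outcome slope 0.75); standard errors are not computed here.

```python
import numpy as np

def regression_calibration(x_cal, y_cal, y_main, d_main):
    """Regress X on Y in the calibration data, predict X in the main study, regress D on it."""
    g1, g0 = np.polyfit(y_cal, x_cal, 1)                # E(X | Y) approximated by g0 + g1 * y
    x_hat = g0 + g1 * np.asarray(y_main, dtype=float)   # predictions replace the unknown X
    slope, intercept = np.polyfit(x_hat, d_main, 1)
    return intercept, slope

rng = np.random.default_rng(4)

def draw(n):
    x = rng.normal(0.0, 1.0, n)
    y = 0.8 * x + rng.normal(0.0, 0.5, n)
    d = 0.75 * x + rng.normal(0.0, 1.0, n)
    return x, y, d

x_cal, y_cal, _ = draw(2000)          # calibration sample: X and Y observed
_, y_main, d_main = draw(20000)       # main study: Y and D observed
rc_intercept, rc_slope = regression_calibration(x_cal, y_cal, y_main, d_main)
```

With constant measurement variance, as simulated here, `rc_slope` is close to 0.75. The later slides concern what happens when that assumption fails.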
“Efficient” Regression Calibration (ERC)
When internal calibration data are available, direct estimates of the regression of D on X are available from the calibration sample
These can be combined with the RC estimates, weighting the two estimates by their precision
Spiegelman et al. (2001) call this efficient regression calibration
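The precision weighting can be sketched as an inverse-variance combination of the two estimates. This assumes the two estimates are approximately independent, which is a simplification made for illustration. The function name and the numbers are hypothetical.

```python
import math

def precision_weighted(est_a, se_a, est_b, se_b):
    """Combine two estimates with weights proportional to 1 / variance."""
    w_a = 1.0 / se_a ** 2
    w_b = 1.0 / se_b ** 2
    combined = (w_a * est_a + w_b * est_b) / (w_a + w_b)
    combined_se = math.sqrt(1.0 / (w_a + w_b))   # valid only under independence
    return combined, combined_se

# Hypothetical inputs: RC estimate 0.70 (SE 0.10), calibration-sample estimate 0.80 (SE 0.20)
erc_estimate, erc_se = precision_weighted(0.70, 0.10, 0.80, 0.20)
```

The more precise estimate dominates: the weights are 100 and 25, so the combined estimate is 0.72.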
Multiple Imputation (MI)
Multiply impute all the values of X using draws from their predictive distribution given the observed data.
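We develop MI methods based on a fully Bayes model and the factorization (in this display W denotes the error-prone measurement and Y the outcome)
\[
\begin{aligned}
p(X \mid W, Y, \beta, \sigma^2, \lambda, \gamma, \tau^2)
&\propto p(W, Y \mid X, \beta, \sigma^2, \lambda, \gamma, \tau^2)\; p(X, \beta, \sigma^2, \lambda, \gamma, \tau^2) \\
&= p(W \mid X, Y, \beta, \sigma^2, \lambda, \gamma, \tau^2)\; p(Y \mid X, \gamma, \tau^2)\; p(X, \beta, \sigma^2, \lambda, \gamma, \tau^2) \\
&= p(W \mid X, \beta, \sigma^2, \lambda, \gamma, \tau^2)\; p(Y \mid X, \gamma, \tau^2)\; p(X, \beta, \sigma^2, \lambda, \gamma, \tau^2)
\end{aligned}
\]
In the last line the first factor is the measurement error model, the second is the main study model, and the third is the prior distribution.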
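Once X has been multiply imputed, each completed dataset is analyzed and the results are pooled. The slides do not spell out the pooling step. The sketch below uses the standard MI combining rules (mean of the estimates, within-imputation plus inflated between-imputation variance) as an assumption about how that step would be done, with hypothetical numbers.

```python
import numpy as np

def rubin_combine(estimates, variances):
    """Pool point estimates and variances from m completed-data analyses."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = estimates.size
    q_bar = estimates.mean()               # pooled point estimate
    u_bar = variances.mean()               # average within-imputation variance
    b = estimates.var(ddof=1)              # between-imputation variance
    total_var = u_bar + (1.0 + 1.0 / m) * b
    return q_bar, total_var

# Hypothetical slope estimates and squared standard errors from m = 5 imputed datasets
pooled_est, pooled_var = rubin_combine(
    [0.70, 0.74, 0.72, 0.76, 0.73],
    [0.01, 0.01, 0.01, 0.01, 0.01],
)
```

The between-imputation term is what carries the extra uncertainty from not observing X in the main study.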
Comparisons with constant measurement error variance
• Freedman et al. (2008) evaluate the performance of CA, RC, ERC and MI for the case of internal calibration data.
• CA biased, ERC better than RC, MI
• But: ERC assumes non-differential measurement error, and MI based on a model that does not make this assumption
– This accounts for superiority of ERC
• A limitation of their work is that it assumes the variance of the measurement errors is constant. As discussed, in many real applications, the variance of the measurement error increases with the underlying true value.
• We compare methods when measurement variance is not constant
Weighted Regression Calibration (WRC)
An alternative to RC, taking into account heteroscedastic measurement error. We reformulate the measurement error model as
\[ (x_i \mid y_i, \alpha, \omega^2, \lambda) \overset{\text{ind}}{\sim} N(\alpha_0 + \alpha_1 y_i,\; \omega^2 y_i^{2\lambda}) \]
• Estimate λ as the slope of a simple regression of the logarithm of the squared residuals from the regression of X on Y on the logarithm of the squared Y, using the calibration data.
• Estimate $\alpha_0$ and $\alpha_1$ by weighted least squares.
• Substitute for the unknown values of X in the main study the estimates
\[ \hat{X}_{WRC} = \hat{\alpha}_0 + \hat{\alpha}_1 Y \]
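The WRC steps can be sketched as follows. The function name, the simulated calibration data (lognormal Y, intercept 0.5, slope 0.9, λ = 0.4) and the use of `np.polyfit` for the weighted fit are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def weighted_regression_calibration(x_cal, y_cal, y_main):
    """Estimate lambda from residuals, fit X on Y by WLS, predict X in the main study."""
    x_cal = np.asarray(x_cal, dtype=float)
    y_cal = np.asarray(y_cal, dtype=float)
    # Step 1: OLS of X on Y, then lambda = slope of log(resid^2) on log(y^2)
    a1, a0 = np.polyfit(y_cal, x_cal, 1)
    resid = x_cal - (a0 + a1 * y_cal)
    lam = np.polyfit(np.log(y_cal ** 2), np.log(resid ** 2), 1)[0]
    # Step 2: weighted least squares with variance proportional to |y|^(2 * lam);
    # np.polyfit takes weights on the residual scale, i.e. 1 / sd
    a1_w, a0_w = np.polyfit(y_cal, x_cal, 1, w=1.0 / np.abs(y_cal) ** lam)
    # Step 3: plug-in predictions of X for the main study
    x_hat = a0_w + a1_w * np.asarray(y_main, dtype=float)
    return lam, a0_w, a1_w, x_hat

rng = np.random.default_rng(5)
y_cal = rng.lognormal(0.0, 0.5, 5000)                       # positive measurements (assumed)
x_cal = 0.5 + 0.9 * y_cal + rng.normal(size=5000) * 0.3 * y_cal ** 0.4
lam_hat, a0_hat, a1_hat, x_hat = weighted_regression_calibration(x_cal, y_cal, y_cal[:10])
```

The predictions `x_hat` would then replace X in the regression of D, exactly as in RC.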
Multiple Imputation (MI)
MI is applied with a measurement error model that incorporates non-constant variance of form $\sigma^2 x_i^{2\lambda}$.
(a) Full posterior distribution requires Metropolis step for draws of
(b) Approximation based on weighted least squares avoids Metropolis step
Prior distributions
Noninformative prior distributions for the marginal distribution of X and the parameters. Specifically, we assumed
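\[ p(X, \beta, \sigma^2, \lambda, \gamma, \tau^2) = p(X)\, p(\beta, \sigma^2, \lambda, \gamma, \tau^2) \]
where the prior distribution of X is normal with mean 0 and variance 1000, and
\[ p(\beta, \log \sigma^2, \lambda, \gamma, \log \tau^2) = \text{const.} \]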
Simulation Study
• Twelve simulation scenarios were generated by combining the following choices of parameters:
analysis model: γ0 = 0, γ1 = 0.3 or 0.75 (γ1 is the slope reported as γx in Tables 1 and 2)
measurement error model: β0 = 0, β1 = 0.6 or 0.8; Var(Y | X): σ = 0.5, 0.8 or 1; λ = 0.4
• Factors varied:
study design: external calibration and internal calibration
measurement error size
outcome-covariate relationship
• Main study sample size = 400, calibration data sample size = 80
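The parameter grid above can be enumerated directly: 2 values of β1 × 3 of σ × 2 of γ1 gives the twelve scenarios. The sketch below also draws one internal-calibration dataset. The slide does not state the distribution of X, the residual variance of D, or whether the 80 calibration units are separate from the 400 main-study units; the lognormal X, unit residual variance and separate samples used here are assumptions.

```python
import numpy as np
from itertools import product

N_MAIN, N_CAL = 400, 80      # sample sizes from the slide
LAMBDA = 0.4                 # variance power from the slide

scenarios = [
    {"beta1": b1, "sigma": s, "gamma1": g1}
    for b1, s, g1 in product((0.8, 0.6), (0.5, 0.8, 1.0), (0.3, 0.75))
]

def simulate_internal(scn, rng):
    """One dataset: calibration units have D, X, Y; main-study units have D, Y only."""
    n = N_MAIN + N_CAL
    x = rng.lognormal(0.0, 0.5, n)                                # assumed distribution of X
    y = scn["beta1"] * x + rng.normal(size=n) * scn["sigma"] * x ** LAMBDA
    d = scn["gamma1"] * x + rng.normal(size=n)                    # assumed unit residual variance
    in_cal = np.arange(n) < N_CAL                                 # first 80 units form the calibration sample
    x_obs = np.where(in_cal, x, np.nan)                           # X is missing in the main study
    return d, y, x_obs, in_cal

rng = np.random.default_rng(6)
d, y, x_obs, in_cal = simulate_internal(scenarios[0], rng)
```

For an external calibration design, D would additionally be set to missing for the calibration units.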
Table 1. Empirical bias × 1000 of estimators of γx with internal calibration data based on 500 simulations, when measurement error is heteroscedastic. (Empirical standard deviation × 1000 in parentheses.)

β1    σ     γx     CA          RC           WRC          WERC        MI
0.8   0.5   0.3    -164 (39)   13 (100)     10 (95)      -5 (87)     3 (90)
0.8   0.8   0.3    -226 (30)   34 (170)     26 (143)     -12 (96)    -5 (95)
0.8   1     0.3    -248 (26)   72 (528)     51 (343)     -19 (105)   -4 (101)
0.8   0.5   0.75   -409 (55)   34 (164)     32 (152)     -8 (102)    -2 (97)
0.8   0.8   0.75   -564 (47)   92 (604)     73 (463)     -16 (108)   -3 (103)
0.8   1     0.75   -619 (41)   195 (1843)   105 (931)    -22 (113)   -6 (106)
0.6   0.5   0.3    -202 (33)   26 (129)     17 (121)     3 (97)      5 (93)
0.6   0.8   0.3    -251 (25)   58 (1077)    29 (520)     -23 (114)   -7 (100)
0.6   1     0.3    -269 (21)   124 (2111)   60 (1136)    -25 (118)   -5 (104)
0.6   0.5   0.75   -502 (53)   69 (198)     61 (185)     -5 (105)    6 (101)
0.6   0.8   0.75   -633 (41)   148 (1637)   120 (988)    -18 (118)   -7 (105)
0.6   1     0.75   -669 (33)   360 (3265)   144 (1428)   -24 (123)   -9 (109)
Table 2. Coverage (%) of 95% confidence interval of the estimator of γx with internal calibration data based on 500 simulations.

β1    σ     γx     CA     RC     WRC    WERC   MI
0.8   0.5   0.3    3.0    71.6   72.4   76.4   94.5
0.8   0.8   0.3    0.2    77.2   80.4   80.2   94.5
0.8   1     0.3    0.0    80.2   82.4   82.6   94.1
0.8   0.5   0.75   0.0    89.2   90.6   90.0   94.0
0.8   0.8   0.75   0.0    90.0   90.0   90.2   93.8
0.8   1     0.75   0.0    89.2   90.8   89.6   93.5
0.6   0.5   0.3    0.0    74.2   78.2   78.6   94.3
0.6   0.8   0.3    0.0    80.0   81.4   81.0   94.5
0.6   1     0.3    0.0    83.4   83.0   82.6   94.0
0.6   0.5   0.75   0.0    90.0   91.0   91.0   94.2
0.6   0.8   0.75   0.0    89.6   90.6   90.0   94.0
0.6   1     0.75   0.0    90.8   91.4   91.0   93.5
Conclusions
• Conventional approach is very biased and the estimate is attenuated when measurement error increases.
• Regression calibration works poorly in presence of heteroscedastic measurement error. Weighted regression calibration is better, but still biased with below-nominal coverage.
• Multiple imputation yields satisfactory results with small biases and good coverage.
• Similar findings for external calibration, under NDME
Implications for survey research
• Combining data from measurement error experiments and main survey yields better inferences
• Survey applications are more complex:
– Including other covariates is an easy extension
– Include complex design features in model, or apply design-based analysis to multiply-imputed data
– Relax normal assumptions
– Models needed for categorical data
• Bayesian prediction framework incorporates sample design and measurement features seamlessly
• Already starting to happen …
A Method of Correcting for Misreporting Applied to the Food Stamp Program
Nikolas Mittag (U.S. Census Bureau Dissertation Fellow), University of Chicago March 28, 2013
Using administrative and survey data I show that survey misreporting leads to biases in common statistical analyses. Standard corrections for measurement error cannot remove these biases. I develop a method to obtain consistent estimates by combining parameter estimates from the linked data with publicly available data… Administrative data on SNAP receipt and amounts linked to American Community Survey data from New York State show that survey data can misrepresent the program in important ways… The conditional density method I describe recovers the correct estimates using public use data only… Extrapolation to the entire U.S. yields substantive differences to survey data and reduces deviations from official aggregates by a factor of 4 to 8 compared to survey aggregates.
References
Groves, R.M. and Lyberg, L. (2010). Total Survey Error: Past, Present, and Future. POQ, 74(5), 849-879.
Guo, Y. and Little, R.J. (2011). Regression Analysis Involving Covariates with Heteroscedastic Measurement Error. Statistics in Medicine, 30(18), 2278-2294.
Guo, Y., Little, R.J. and McConnell, D.S. (2011). On Using Summary Statistics from an External Calibration Sample to Correct for Covariate Measurement Error. Epidemiology, 23(1), 165-174.
Hansen, M.H., Madow, W.G. and Tepping, B.J. (1983). An evaluation of model-dependent and probability-sampling inferences in sample surveys (with discussion). JASA, 78, 776-793.
Little, R.J. (2012). Calibrated Bayes: an Alternative Inferential Paradigm for Official Statistics (with discussion and rejoinder). J. Official Statist., 28(3), 309-372.
Rubin, D.B. (1974). Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies. J. Educ. Psych., 66, 688-701.