Rod Little

A modeler’s perspective on Total Survey Error

Outline

• Total Survey Error – strengths and weaknesses, according to Groves and Lyberg
• A modeler’s view of TSE
• An application: multiple imputation for regression involving covariates with measurement error, including data from a calibration sample
– Epidemiological, but implications for survey practice
– Involving calibration samples, heteroscedastic measurement errors
– Compare with classical calibration methods

My philosophy: Calibrated Bayes

• Bayes for inference, frequentist for model development and assessment – seek inferences that are “frequentist calibrated”
– In surveys this often leads to “weak, robust” models with “reference” prior distributions – “model-based, design-assisted inference” (Little 2012)
• Bayes inference is optimal under well-specified models
• Frequentist calibration creates resistance to “bad models”
– E.g. “design-consistency” forces models that account for survey design
– unlike e.g. models in Hansen, Madow and Tepping (1983)
• Frequentist simulations suggest this approach can yield better repeated-sampling properties than design-based approaches
– Bayes propagates error in estimating parameters

Total Survey Error (TSE)

• Groves and Lyberg (GL, 2010)

• TSE paradigm is the conceptual foundation of the field of survey methodology.

• Quality properties of survey statistics are functions of essential survey conditions that are independent of the sample design.

• More successful as an intellectual framework than a unified statistical model of error properties of survey statistics.

• More importation of modeling perspectives from other disciplines could enrich the paradigm.
– In this case, biostatistics and epidemiology

Strengths of TSE (GL)

• Explicit attention to the decomposition of errors

• Separation of phenomena affecting statistics in various ways

• Success in forming the conceptual basis of the field of survey methodology, pointing the direction for new research.

Current weaknesses of TSE (GL)

• Key quality concepts are not included (notably those of user)

• Quantitative measurement of many components burdensome and lagging

• Has not led to enriched error measurement in practical surveys

• Assumptions required for some estimators of error terms are frequently not true

• Mismatch between existing error models and theoretical causal models of the error mechanisms

• Misplaced focus on descriptive statistics, and failure to integrate error models developed in other fields

Unified modeling addresses these points

TSE Components Linked to Steps in Measurement and Representational Inference Process (Groves et al. 2004)

[Figure: two parallel chains leading to the Survey Statistic. Measurement: Construct, Measurement, Response, Edited Data, with the associated errors Validity, Measurement Error and Processing Error (errors of observation). Representation: Inferential Population, Target Population, Sampling Frame, Sample, Respondents, with the associated errors Coverage, Sampling Error and Nonresponse Error (errors of nonobservation).]

Commentary 1

• Dual inferential approaches – model-based for measurement, design-based for analysis – inhibit integration of the two streams into a unified analysis
– “The isolation of survey statisticians and methodologists from the mainstream of social statistics has, in our opinion, retarded the importation of model-based approaches to many of the error components in the total survey error format.” (GL)

– The great disappointment regarding the TSE perspective is that it has not led to routine fuller measurement of the statistical error properties of survey statistics. While official statisticians and much of social science have accepted the probability sampling paradigm and routinely provide estimates of sampling variance, there is little evidence that the current practice of surveys in the early 21st century measures anything more than sampling variance routinely.

Commentary 1

• Dual inferential approaches – model for measurement, design for analysis – inhibit integration of the two streams into a unified analysis
– “There are exceptions worth noting. The tendency for some continuing surveys to develop error or quality profiles is encouraging (Kalton, Winglee, and Jabine, 1998; Kalton, Winglee, Krawchuk, and Levine, 2000; Lynn, 2003). These profiles contain the then-current understanding of statistical error properties of the key statistics produced by the survey. Through these quality profiles, surveys with rich methodological traditions produce well-documented sets of study results auxiliary to the publications of findings. None of the quality profiles have attempted full measurement of all known errors for a particular statistic.” (GL)

Commentary 2

• The standard decomposition of RMSE into components of bias and variance in the TSE approach implies a particular model that may not be realistic, and is restricted to simple statistics like means
– “3.6 Assumptions Patently Wrong for Large Classes of Statistics. Many of the error model assumptions are wrong most of the time. For example, the Kish (1962) linear model for interviewer effects assume that the response deviations are random effects of the interviewer and respondent, uncorrelated to the true value for the respondent. However, for example, reporting of drug usage has been found to have an error structure highly related to the true value.” (GL)

– “3.7 Mismatch between Error Models and Theoretical Causal Models of Error. The existing survey models are specified as variance components models devoid of the causes of the error source itself… Missing in the history of the TSE formulation is the partnership between scientists who study the causes of the behavior producing the statistical error and the statistical models used to describe them.” (GL)

Commentary 2

• The standard decomposition of RMSE into components of bias and variance in the TSE approach implies a particular model that may not be realistic, and is restricted to simple statistics like means
– “3.8 Misplaced Focus on Descriptive Statistics. Hansen, Hurwitz, and Pritzker (1964); Hansen, Hurwitz, and Bershad (1961), Biemer and Stokes (1991), and Groves (1989) all describe error models for sample means and/or estimates of population totals… survey data in our age are used only briefly for estimates of means and totals. Most analysts are interested more in subclass differences, order statistics, analytic statistics and a whole host of model-based parameter estimates.” (GL)
• A unified model for sampling and nonsampling error makes assumptions explicit, and allows inference for parameters other than means
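For reference, the standard decomposition referred to in this commentary can be written for a generic estimator $\hat\theta$ of a scalar quantity $\theta$. The notation here is generic and not taken from the slides:

```latex
\mathrm{MSE}(\hat\theta)
  = E\big[(\hat\theta - \theta)^2\big]
  = \mathrm{Var}(\hat\theta) + \big[E(\hat\theta) - \theta\big]^2,
\qquad
\mathrm{RMSE}(\hat\theta) = \sqrt{\mathrm{MSE}(\hat\theta)}
```

The second term is the squared bias. The expectation is taken over whatever sources of randomness the error model includes, which is where the modeling assumptions enter.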

TSE as a missing data problem (Rubin, 1974)

Unit type             Z   A   X   Y1  Y2  …   YJ  D1  …   DK
Experimental units    ?   ?   √   √   ×   …   ×   ?   …   ?
                      ?   ?   √   ×   ×   …   √   ?   …   ?
Sample respondents    √   √   ×   √   ×   …   ×   ?   …   ?
                      √   √   ×   ×   ×   …   √   ?   …   ?
Unit nonrespondents   √   √   ×   ×   ×   …   ×   ×   …   ×
                      √   √   ×   ×   ×   …   ×   ×   …   ×
Nonsampled units      √   ×   ×   ×   ×   …   ×   ×   …   ×
                      √   ×   ×   ×   ×   …   ×   ×   …   ×

√ = observed, × = missing, ? = observed or missing

Z = frame/design variables, A = meta-data (e.g. mode)
X = unobserved true value, Yj = value observed for mode j
Dk = kth variable not subject to major measurement error

Errors of observation concern columns

Errors of non-observation concern rows

Application: measurement error in epidemiology

• Many variables in epidemiology are measured with error (dietary intake, biomarkers, …)

• Measurement error attenuates effect of variables, distorts inferences for other variables

• E.g. effect of dietary intake on cancer
• Existing methods (e.g. regression calibration) assume measurement variance is constant, but this is often a poor assumption
• Proposed approach: multiple imputation under a Bayesian model with non-constant variance
– (Guo and Little 2011; Guo, Little and McConnell 2011)
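The attenuation mentioned above can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not in the slides: classical additive error with constant variance (Y = X + error), standard normal X, and a true slope of 0.75. The function names are hypothetical.

```python
import random


def ols_slope(predictor, response):
    """Least-squares slope of response on predictor."""
    n = len(predictor)
    mp = sum(predictor) / n
    mr = sum(response) / n
    spr = sum((p - mp) * (r - mr) for p, r in zip(predictor, response))
    spp = sum((p - mp) ** 2 for p in predictor)
    return spr / spp


def simulate_attenuation(n=20000, slope=0.75, error_sd=1.0, seed=1):
    """Return (slope of D on true X, slope of D on error-prone Y)."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]             # true covariate
    d = [slope * xi + rng.gauss(0, 1) for xi in x]      # outcome
    y = [xi + rng.gauss(0, error_sd) for xi in x]       # error-prone measurement
    return ols_slope(x, d), ols_slope(y, d)


slope_true_x, slope_noisy_y = simulate_attenuation()
print(round(slope_true_x, 2), round(slope_noisy_y, 2))
```

Under these assumptions the slope on Y is shrunk by the factor Var(X) / (Var(X) + error variance), here 1/2, so the second number should be roughly half the first.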

Data for two vitamins

Measurement Error Model

This model links the unobserved covariate X with the error-prone measurement Y, allowing for potentially nonlinear mean functions and heteroscedastic measurement error:

$$(y_i \mid x_i, \beta, \sigma^2, \lambda) \overset{\text{ind}}{\sim} N\big(\mu(x_i; \beta),\ g(x_i; \sigma^2, \lambda)\big)$$

where the function g models heteroscedasticity. Specifically, we assume that [expression for g missing in source]

Estimates of the heteroscedasticity parameter λ for eight analytes, linear model.

Analyte                    Residual regression   ML estimate   Bayes post. mean   Bayes 95% HPD
Gamma tocopherol           0.56                  0.56          0.56               (0.51, 0.61)
Lutein                     0.61                  0.63          0.63               (0.58, 0.68)
Alpha tocopherol           0.62                  0.62          0.62               (0.57, 0.67)
Delta tocopherol           0.63                  0.65          0.65               (0.53, 0.77)
Beta Cryptoxanthin (BC)    0.65                  0.64          0.64               (0.58, 0.71)
Lycopene                   0.70                  0.72          0.71               (0.67, 0.76)
Retinol                    0.71                  0.68          0.67               (0.63, 0.72)
Carotene                   0.77                  0.72          0.72               (0.67, 0.77)

Regression on Covariates with measurement error

X: covariate of interest but unobserved

Y: observed error-prone measurement related to X

D: response variable, interest in regression of D on X (more generally D can include other covariates).

[Figure: data layout for the calibration sample and the main sample under (a) external calibration design and (b) internal calibration design, in which D is measured in the calibration sample.]

Analysis Model

This model links the unobserved covariate X with the outcome D. For simplicity we assume the model

$$(d_i \mid x_i, \gamma) \overset{\text{ind}}{\sim} N(\gamma_0 + \gamma_1 x_i,\ \tau^2)$$

where $\gamma = (\gamma_0, \gamma_1, \tau^2)$, although more generally nonlinear relationships between D and X can be modeled.

Our aim is to estimate the unknown regression parameters, taking into account the measurement error in X.

(B) Non-differential measurement error (NDME) assumption

[Figure: calibration and main study data under (a) external calibration design and (b) internal calibration design.]

• Non-differential measurement error: D is independent of Y given X
• With external calibration design, the assumption is needed to identify parameters
• With internal calibration design, the assumption is not needed but improves efficiency if assumed and true

Conventional Calibration (CA)

• Fits an appropriate curve to the calibration data
• Estimates the true value of X by inverting the fitted calibration curve (usually assumes a linear association)
• Regresses D on calibration estimates of X
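The three steps above can be sketched as follows for the linear case. The helper names and the tiny noise-free data set are hypothetical and serve only to show the mechanics of inverting the fitted curve.

```python
def fit_line(predictor, response):
    """Ordinary least squares; returns (intercept, slope)."""
    n = len(predictor)
    mp = sum(predictor) / n
    mr = sum(response) / n
    spp = sum((p - mp) ** 2 for p in predictor)
    spr = sum((p - mp) * (r - mr) for p, r in zip(predictor, response))
    slope = spr / spp
    return mr - slope * mp, slope


def conventional_calibration(cal_x, cal_y, main_y, main_d):
    """CA estimate of (intercept, slope) for the regression of D on X."""
    b0, b1 = fit_line(cal_x, cal_y)               # calibration curve: Y on X
    x_hat = [(y - b0) / b1 for y in main_y]       # invert the fitted curve
    return fit_line(x_hat, main_d)                # regress D on estimated X


# Hypothetical noise-free data: Y = 1 + 2 X and D = 0.5 + 0.5 X.
cal_x = [0.0, 1.0, 2.0, 3.0]
cal_y = [1.0, 3.0, 5.0, 7.0]
main_y = [1.0, 5.0, 9.0]
main_d = [0.5, 1.5, 2.5]
ca_intercept, ca_slope = conventional_calibration(cal_x, cal_y, main_y, main_d)
print(ca_intercept, ca_slope)
```

With noisy measurements the inverted values still carry the measurement error, which is consistent with the bias reported for CA later in the slides.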

Regression Calibration (RC)

$$\hat{X} = E(X \mid Y)$$

• Estimates the regression of X on Y using the calibration data
• Replaces the unknown values of X in the main study with predictions
• Simple and easy to apply (Carroll and Stefanski, 1990)
• Standard errors: asymptotic formula, or bootstrap both data sources
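A minimal sketch of RC with an external calibration sample follows. It assumes a linear regression of X on Y and constant error variance. The simulated data and sample sizes are hypothetical and deliberately much larger than those in the simulation study, so that the illustration is stable. Standard errors are not computed here.

```python
import random


def fit_line(predictor, response):
    """Ordinary least squares; returns (intercept, slope)."""
    n = len(predictor)
    mp = sum(predictor) / n
    mr = sum(response) / n
    spp = sum((p - mp) ** 2 for p in predictor)
    spr = sum((p - mp) * (r - mr) for p, r in zip(predictor, response))
    slope = spr / spp
    return mr - slope * mp, slope


def regression_calibration(cal_x, cal_y, main_y, main_d):
    """RC estimate of (intercept, slope) for the regression of D on X."""
    a0, a1 = fit_line(cal_y, cal_x)               # regression of X on Y (calibration data)
    x_hat = [a0 + a1 * y for y in main_y]         # predicted X in the main study
    return fit_line(x_hat, main_d)                # regress D on the predictions


rng = random.Random(2024)


def draw(n):
    """Hypothetical data: X ~ N(0,1), Y = X + N(0,1), D = 0.75 X + N(0,1)."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [xi + rng.gauss(0, 1) for xi in x]
    d = [0.75 * xi + rng.gauss(0, 1) for xi in x]
    return x, y, d


cal_x, cal_y, _ = draw(5000)        # external calibration sample: (X, Y) only
_, main_y, main_d = draw(20000)     # main study: (Y, D) only, X unobserved
rc_intercept, rc_slope = regression_calibration(cal_x, cal_y, main_y, main_d)
naive_slope = fit_line(main_y, main_d)[1]
print(round(naive_slope, 2), round(rc_slope, 2))
```

In this constant-variance setting the RC slope should land near the true 0.75, while the naive slope of D on Y is attenuated.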

“Efficient” Regression Calibration (ERC)

• When internal calibration data are available, direct estimates of the regression of D on X are available from the calibration sample
• These can be combined with the RC estimates, weighting the two estimates by their precision
• Spiegelman et al. (2001) call this efficient regression calibration
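Precision weighting of two estimates can be sketched as below. This is a generic inverse-variance combination that assumes the two estimates are independent. It is not claimed to reproduce the exact ERC estimator of Spiegelman et al. (2001). The function name and inputs are hypothetical.

```python
import math


def combine_by_precision(est_a, se_a, est_b, se_b):
    """Inverse-variance weighted average of two independent estimates."""
    w_a = 1.0 / se_a ** 2                      # precision of the first estimate
    w_b = 1.0 / se_b ** 2                      # precision of the second estimate
    combined = (w_a * est_a + w_b * est_b) / (w_a + w_b)
    combined_se = math.sqrt(1.0 / (w_a + w_b))
    return combined, combined_se


# Hypothetical inputs: a direct calibration-sample slope and an RC slope.
erc_est, erc_se = combine_by_precision(1.0, 1.0, 3.0, 1.0)
print(erc_est, round(erc_se, 3))
```

With equal standard errors the combined estimate is the simple average, and its standard error is smaller than either input.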

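Multiple Imputation (MI)

• Multiply impute all the values of X using draws from their predictive distribution given the observed data.
• We develop MI methods based on a fully Bayes model. In the display below, W denotes the error-prone measurement and Y the outcome (Y and D, respectively, in the notation of the other slides); (β, σ², λ) are the measurement error model parameters and (γ, τ²) the main study model parameters:

$$\begin{aligned}
p(X \mid W, Y, \beta, \sigma^2, \lambda, \gamma, \tau^2)
&\propto p(W, Y \mid X, \beta, \sigma^2, \lambda, \gamma, \tau^2)\; p(X, \beta, \sigma^2, \lambda, \gamma, \tau^2) \\
&= p(W \mid X, Y, \beta, \sigma^2, \lambda, \gamma, \tau^2)\; p(Y \mid X, \gamma, \tau^2)\; p(X, \beta, \sigma^2, \lambda, \gamma, \tau^2) \\
&= p(W \mid X, \beta, \sigma^2, \lambda, \gamma, \tau^2)\; p(Y \mid X, \gamma, \tau^2)\; p(X, \beta, \sigma^2, \lambda, \gamma, \tau^2)
\end{aligned}$$

The three factors in the last line are the measurement error model, the main study model and the prior distribution. The last equality uses non-differential measurement error: given X, the measurement does not depend on the outcome.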
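Once the values of X have been multiply imputed, the completed data sets are analyzed separately and the results pooled. A minimal sketch of the standard MI combining rules for a scalar estimate is shown below. The function name and the example numbers are hypothetical, and the degrees-of-freedom calculation for interval estimates is omitted.

```python
def rubin_combine(estimates, variances):
    """Pool estimates and within-imputation variances from m imputed data sets.

    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                                 # pooled point estimate
    within = sum(variances) / m                                # average within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    total = within + (1 + 1 / m) * between                     # total variance
    return q_bar, total


# Hypothetical slope estimates and squared standard errors from m = 3 imputations.
pooled_est, pooled_var = rubin_combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled_est, round(pooled_var, 4))
```

The between-imputation term is how uncertainty about the imputed values of X enters the final standard error.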

Comparisons with constant measurement error variance

• Freedman et al. (2008) evaluate the performance of CA, RC, ERC and MI for the case of internal calibration data.

• CA biased, ERC better than RC, MI

• But: ERC assumes non-differential measurement error, and MI based on a model that does not make this assumption

– This accounts for superiority of ERC

• A limitation of their work is that it assumes the variance of the measurement errors is constant. As discussed, in many real applications, the variance of the measurement error increases with the underlying true value.

• We compare methods when measurement variance is not constant

Weighted Regression Calibration (WRC)

An alternative to RC that takes into account heteroscedastic measurement error. We reformulate the measurement error model as

$$(x_i \mid y_i, \alpha, \omega^2, \lambda) \overset{\text{ind}}{\sim} N(\alpha_0 + \alpha_1 y_i,\ \omega^2 y_i^{2\lambda})$$

• Estimate λ as the slope of a simple regression of the logarithm of the squared residuals from the regression of X on Y on the logarithm of the squared Y, using the calibration data.
• Estimate $\alpha_0$ and $\alpha_1$ by weighted least squares.
• Substitute the unknown values of X in the main study with the estimates

$$\hat{X}_{WRC} = \hat{\alpha}_0 + \hat{\alpha}_1 Y$$
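The WRC steps can be sketched as follows. This is an illustration under assumptions not stated in the slides: Y is strictly positive, the weights are taken as the inverse of (Y²) raised to the estimated λ, and the simulated calibration data (with λ = 0.8, intercept 1 and slope 0.5) are hypothetical.

```python
import math
import random


def fit_line(predictor, response, weights=None):
    """(Weighted) least squares; returns (intercept, slope)."""
    if weights is None:
        weights = [1.0] * len(predictor)
    sw = sum(weights)
    mp = sum(w * p for w, p in zip(weights, predictor)) / sw
    mr = sum(w * r for w, r in zip(weights, response)) / sw
    spp = sum(w * (p - mp) ** 2 for w, p in zip(weights, predictor))
    spr = sum(w * (p - mp) * (r - mr)
              for w, p, r in zip(weights, predictor, response))
    slope = spr / spp
    return mr - slope * mp, slope


def weighted_regression_calibration(cal_x, cal_y):
    """Return (lambda estimate, intercept, slope) for predicting X from Y."""
    # Step 1: unweighted regression of X on Y, keeping the residuals.
    a0, a1 = fit_line(cal_y, cal_x)
    resid = [x - (a0 + a1 * y) for x, y in zip(cal_x, cal_y)]
    # Step 2: slope of log(residual^2) on log(Y^2) estimates lambda.
    pairs = [(math.log(y * y), math.log(r * r))
             for y, r in zip(cal_y, resid) if y != 0 and r != 0]
    _, lam = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    # Step 3: weighted least squares, weights inverse to (Y^2)^lambda (Y nonzero assumed).
    weights = [1.0 / (y * y) ** lam for y in cal_y]
    w0, w1 = fit_line(cal_y, cal_x, weights)
    return lam, w0, w1


# Hypothetical calibration data with error SD proportional to Y^0.8.
rng = random.Random(7)
cal_y = [rng.uniform(1, 10) for _ in range(5000)]
cal_x = [1 + 0.5 * y + 0.2 * y ** 0.8 * rng.gauss(0, 1) for y in cal_y]
lam_hat, a0_hat, a1_hat = weighted_regression_calibration(cal_x, cal_y)
x_hat_wrc = [a0_hat + a1_hat * y for y in (2.0, 5.0)]   # predictions for main-study Y values
print(round(lam_hat, 2), round(a0_hat, 2), round(a1_hat, 2))
```

The predictions in `x_hat_wrc` would then replace the unobserved X in the regression of D on X, as in ordinary RC.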

Multiple Imputation (MI)

MI is applied with a measurement error model that incorporates non-constant variance of form [expression missing in source].

(a) Full posterior distribution requires a Metropolis step for draws of [symbol missing in source]

(b) Approximation based on weighted least squares avoids the Metropolis step

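Prior distributions

Noninformative prior distributions for the marginal distribution of X and the parameters. Specifically, with (β, σ², λ) the measurement error model parameters and (γ, τ²) the analysis model parameters, we assumed

$$p(X, \beta, \sigma^2, \lambda, \gamma, \tau^2) = p(X)\, p(\beta, \sigma^2, \lambda, \gamma, \tau^2)$$

where the prior distribution of X is normal with mean 0 and variance 1000, and

$$p(\beta, \log(\sigma^2), \lambda, \gamma, \log(\tau^2)) = \text{const.}$$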

Simulation Study

• Twelve simulation scenarios were generated by combining the following choices of parameters:
– analysis model: $\gamma_0 = 0$, $\gamma_1 = 0.3$ or $0.75$ ($\gamma_1$ is the coefficient labeled γx in Tables 1 and 2)
– measurement error model: $\beta_0 = 0$, $\beta_1 = 0.6$ or $0.8$; Var(Y | X): $\sigma = 0.5$, $0.8$ or $1$; $\lambda = 0.4$
• Factors varied:
– study design: external calibration and internal calibration
– measurement error size
– outcome-covariate relationship
• Main study sample size = 400, calibration data sample size = 80

Table 1. Empirical bias ×1000 of estimators of γx with internal calibration data, based on 500 simulations, when measurement error is heteroscedastic. (Empirical standard deviation ×1000 in parentheses.)

β1    σ     γx     CA           RC            WRC           WERC         MI
0.8   0.5   0.3    -164 (39)    13 (100)      10 (95)       -5 (87)      3 (90)
0.8   0.8   0.3    -226 (30)    34 (170)      26 (143)      -12 (96)     -5 (95)
0.8   1     0.3    -248 (26)    72 (528)      51 (343)      -19 (105)    -4 (101)
0.8   0.5   0.75   -409 (55)    34 (164)      32 (152)      -8 (102)     -2 (97)
0.8   0.8   0.75   -564 (47)    92 (604)      73 (463)      -16 (108)    -3 (103)
0.8   1     0.75   -619 (41)    195 (1843)    105 (931)     -22 (113)    -6 (106)
0.6   0.5   0.3    -202 (33)    26 (129)      17 (121)      3 (97)       5 (93)
0.6   0.8   0.3    -251 (25)    58 (1077)     29 (520)      -23 (114)    -7 (100)
0.6   1     0.3    -269 (21)    124 (2111)    60 (1136)     -25 (118)    -5 (104)
0.6   0.5   0.75   -502 (53)    69 (198)      61 (185)      -5 (105)     6 (101)
0.6   0.8   0.75   -633 (41)    148 (1637)    120 (988)     -18 (118)    -7 (105)
0.6   1     0.75   -669 (33)    360 (3265)    144 (1428)    -24 (123)    -9 (109)

Table 2. Coverage (%) of the 95% confidence interval of the estimator of γx with internal calibration data, based on 500 simulations.

β1    σ     γx     CA     RC      WRC     WERC    MI
0.8   0.5   0.3    3.0    71.6    72.4    76.4    94.5
0.8   0.8   0.3    0.2    77.2    80.4    80.2    94.5
0.8   1     0.3    0.0    80.2    82.4    82.6    94.1
0.8   0.5   0.75   0.0    89.2    90.6    90.0    94.0
0.8   0.8   0.75   0.0    90.0    90.0    90.2    93.8
0.8   1     0.75   0.0    89.2    90.8    89.6    93.5
0.6   0.5   0.3    0.0    74.2    78.2    78.6    94.3
0.6   0.8   0.3    0.0    80.0    81.4    81.0    94.5
0.6   1     0.3    0.0    83.4    83.0    82.6    94.0
0.6   0.5   0.75   0.0    90.0    91.0    91.0    94.2
0.6   0.8   0.75   0.0    89.6    90.6    90.0    94.0
0.6   1     0.75   0.0    90.8    91.4    91.0    93.5

Conclusions

• The conventional calibration approach is very biased, and the estimate is attenuated as measurement error increases.
• Regression calibration works poorly in the presence of heteroscedastic measurement error. Weighted regression calibration is better, but still biased, with below-nominal coverage.
• Multiple imputation yields satisfactory results, with small biases and good coverage.
• Similar findings for external calibration, under NDME

Implications for survey research

• Combining data from measurement error experiments and main survey yields better inferences

• Survey applications are more complex:

– Including other covariates is an easy extension

– Include complex design features in model, or apply design-based analysis to multiply-imputed data

– Relax normal assumptions

– Models needed for categorical data

• Bayesian prediction framework incorporates sample design and measurement features seamlessly

• Already starting to happen …

A Method of Correcting for Misreporting Applied to the Food Stamp Program

Nikolas Mittag (U.S. Census Bureau Dissertation Fellow), University of Chicago March 28, 2013

Using administrative and survey data I show that survey misreporting leads to biases in common statistical analyses. Standard corrections for measurement error cannot remove these biases. I develop a method to obtain consistent estimates by combining parameter estimates from the linked data with publicly available data… Administrative data on SNAP receipt and amounts linked to American Community Survey data from New York State show that survey data can misrepresent the program in important ways… The conditional density method I describe recovers the correct estimates using public use data only… Extrapolation to the entire U.S. yields substantive differences to survey data and reduces deviations from official aggregates by a factor of 4 to 8 compared to survey aggregates.

References

Groves, R.M. and Lyberg, L. (2010). Total Survey Error: Past, Present, and Future. POQ, 74 (5): 849-879.

Guo, Y. & Little, R.J. (2011). Regression Analysis Involving Covariates with Heteroscedastic Measurement Error. Statistics in Medicine, 30, 18, 2278–2294.

Guo, Y., Little, R.J. and McConnell, D.S. (2011). On Using Summary Statistics from an External Calibration Sample to Correct for Covariate Measurement Error. Epidemiology, 23(1), 165-174.

Hansen, M.H., Madow, W.G. & Tepping, B.J. (1983), “An evaluation of model-dependent and probability-sampling inferences in sample surveys” (with discussion), JASA 78, 776-793.

Little, R.J. (2012). Calibrated Bayes: an Alternative Inferential Paradigm for Official Statistics (with discussion and rejoinder). J. Official Statist., 28, 3, 309-372.

Rubin, D. B. (1974), "Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies," J. Educ. Psych., 66, 688-701.
