Using Modern Missing Data Analyses for Effective Inference about Hunters’ Satisfaction towards the OFW Program
Muhammad Imran Khan
Motivation of Study
• Hunting & fishing are part of Nebraska's heritage
• NGPC is interested in improving hunter/angler recruitment & retention (NGPC, 2008)
• Data were collected in 2013 to learn about hunters’ motivations & satisfaction regarding OFW lands
• The purpose of this study is to compare estimates obtained with appropriate imputation methods
Missing Data
• Missingness in surveys (Groves et al., 2004)
– Noncoverage
– Unit nonresponse
– Item nonresponse
– Partial nonresponse (Brick & Kalton, 1996)
– Data entry error (Anne & Andrea, 2014)
• Missing data mechanism (Buuren, 2012)
– Missing Completely At Random (MCAR)
– Missing At Random (MAR)
– Missing Not At Random (MNAR)
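The three mechanisms can be illustrated with a small simulation. This is only a sketch on made-up data, not the survey data: the variables `ages` and `incomes`, the linear relation between them, and the missingness probabilities are all assumptions chosen for illustration.

```python
import random


def simulate_missingness(n=10_000, seed=0):
    """Generate (age, income) pairs and delete income under three mechanisms."""
    rng = random.Random(seed)
    ages = [rng.uniform(18, 80) for _ in range(n)]
    # Hypothetical model: income rises with age, plus noise.
    incomes = [20 + 0.5 * a + rng.gauss(0, 10) for a in ages]

    # MCAR: every income value has the same 30% chance of being missing.
    mcar = [None if rng.random() < 0.3 else y for y in incomes]
    # MAR: missingness depends only on the fully observed variable (age).
    mar = [None if rng.random() < (0.5 if a > 50 else 0.1) else y
           for a, y in zip(ages, incomes)]
    # MNAR: missingness depends on the income value that goes missing.
    mnar = [None if rng.random() < (0.5 if y > 50 else 0.1) else y
            for y in incomes]
    return incomes, mcar, mar, mnar


def observed_mean(values):
    """Mean of the values that are not missing."""
    kept = [v for v in values if v is not None]
    return sum(kept) / len(kept)


incomes, mcar, mar, mnar = simulate_missingness()
full_mean = observed_mean(incomes)
for name, column in [("MCAR", mcar), ("MAR", mar), ("MNAR", mnar)]:
    print(f"{name}: observed mean = {observed_mean(column):.2f} "
          f"(full-data mean = {full_mean:.2f})")
```

In this toy setup the observed mean stays close to the full-data mean only under MCAR. Under MAR the gap can in principle be corrected using the observed variable (here, age); under MNAR it cannot be corrected from the observed data alone.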
How much missing data is “problematic”?
• Researchers suggest different thresholds:
– > 5% (Schafer, 1999)
– > 10% (Bennett, 2001)
– > 20% (Peng et al., 2006)
– Widaman (2006) specified the following scale:
o 1%-2% (negligible)
o 5%-10% (minor)
o 10%-25% (moderate)
o 25%-50% (high)
o > 50% (excessive)
• Important problems of missingness (Bell & Fairclough, 2013)
– Decrease in precision
– Increased bias in parameter estimates
NGPC & UNL conducted survey
• Sampling frame: hunters who purchased a license to hunt in NE in 2012
– The survey contained three parts:
o Where & what respondents hunted; environmental impact
o Motivations (relatedness, competence, autonomy)
o Socio-demographic factors
• About the collected data
– Total questions = 42 (19 used for analysis)
– Sample size = 8181
– Completely filled = 1555 (19%)
– Unit nonresponse = 627 (8%)
– Item nonresponse = 5999 (73%)
o Between 1 and 8 of the 19 questions missing per respondent
– Incomplete overall (unit + item nonresponse) = 81%
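The percentages on this slide can be checked with a few lines of arithmetic. The counts are taken from the slide; reading the 81% figure as unit plus item nonresponse is an assumption.

```python
# Response breakdown reported on the slide.
total = 8181
counts = {
    "completely filled": 1555,
    "unit nonresponse": 627,
    "item nonresponse": 5999,
}

# The three groups should partition the full sample.
assert sum(counts.values()) == total

shares = {name: n / total for name, n in counts.items()}
for name, share in shares.items():
    print(f"{name}: {share:.0%}")

# Assumed reading of the 81% figure: any missingness (unit or item nonresponse).
incomplete_share = shares["unit nonresponse"] + shares["item nonresponse"]
print(f"incomplete: {incomplete_share:.0%}")
```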
Determining Type of Missing Data
• Test for MCAR (Little, 1988)
– Little’s test of MCAR (omnibus test of all specified variables)
o If the test is not significant, the data can be assumed MCAR
o If the test is significant, the data may be MAR or MNAR
• For the given data the test is significant, so the data are not MCAR and are treated as MAR
– χ² = 3256.783 with …
• Table shows number & proportion missing

Variable     Satisf.  Rel_1  Rel_2  Comp.  Auto.  H_Days  “Harvest”  Educ.  Income  Age
N missing    5171     332    332    345    397    5096    0          1088   1465    1263
Proportion   0.685    0.04   0.04   0.046  0.053  0.675   0          0.144  0.194   0.167
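The last row of the missingness table holds proportions (0.685 means 68.5%). The values appear to be computed on the respondents who returned a questionnaire, that is 8181 − 627 = 7554. That denominator is an inference from the numbers, not something the slides state. A quick check under that assumption:

```python
# Missing counts per variable, copied from the table.
missing_counts = {
    "Satisf.": 5171, "Rel_1": 332, "Rel_2": 332, "Comp.": 345, "Auto.": 397,
    "H_Days": 5096, "Harvest": 0, "Educ.": 1088, "Income": 1465, "Age": 1263,
}

# Assumption: unit nonrespondents (627) are excluded from the denominator.
respondents = 8181 - 627

proportions = {name: round(n / respondents, 3)
               for name, n in missing_counts.items()}
for name, p in proportions.items():
    print(f"{name:8s} {missing_counts[name]:5d}  {p:.3f}")
```

With this denominator the computed proportions agree with the table; Rel_1 and Rel_2 come out as 0.044, which the table shows rounded to 0.04.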
Data used for analysis
• 13 questions on motivation based on SDT (Self-Determination Theory)
– 5 questions on relatedness transformed to 2 factors
– 4 questions each on competence & autonomy, each set transformed to 1 factor
Model used for the analysis
Satisfaction = Rel_1 + Rel_2 + Comp + Auto + Educ + Age + Income + H_Days + Harvest

Variable        Description of the variable [measured on 7-point Likert scale]
Satisfaction    How satisfied were you with your experience on private lands enrolled in the Open Fields and Waters (OFW)?
Relatedness_1   I enjoy mentoring other hunters
Relatedness_2   I go hunting primarily to spend time with others & people I care about
Competence      Overall, hunting makes me feel competent in other areas of my life
Autonomy        Hunting helps me to feel independent, self-sufficient and more in control in life
Education       Highest level of education that you have completed (<HS; HS; S.C; C; ≥G)
Age             Age (approximately, in years)
Income          Total annual income for your household before taxes (8 different levels)
Hunting_Days    Visiting OFW sites allowed me to increase total days I spent hunting
“Harvest”       If you hunted in 2012 on an OFW site, did you harvest? (Yes/No)
Methods for Handling Missing Data
• Deletion or non-imputing methods:
o List-wise Deletion (Pigott, 2001)
o Pair-wise Deletion (Bennett, 2001)
• Nonstochastic or ad-hoc methods:
o Mean Imputation (Graham, 2003)
o Regression Imputation (Qin et al., 2007)
• Stochastic or established methods:
o Stochastic Regression (Todd et al., 2013)
o Multiple Imputation (MI) (John et al., 2007)
o Full Information Maximum Likelihood (FIML)
o Expectation Maximization (EM) (Yiran & Chao-Ying, 2013)
Mean Imputation
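As a rough illustration of the two simplest approaches, the sketch below applies list-wise deletion and mean imputation to a tiny made-up dataset. The rows and the helper names `listwise_delete` and `mean_impute` are hypothetical and not taken from the study.

```python
import statistics


def listwise_delete(rows):
    """Keep only rows with no missing (None) entries."""
    return [r for r in rows if all(v is not None for v in r)]


def mean_impute(rows):
    """Replace each missing entry with the mean of its column's observed values."""
    n_cols = len(rows[0])
    means = [statistics.fmean([r[j] for r in rows if r[j] is not None])
             for j in range(n_cols)]
    return [tuple(means[j] if r[j] is None else r[j] for j in range(n_cols))
            for r in rows]


# Toy data: two variables, None marks a missing answer.
rows = [(5, 3), (6, None), (None, 4), (7, 6), (2, None), (4, 2)]

complete = listwise_delete(rows)
imputed = mean_impute(rows)

col1_observed = [r[1] for r in rows if r[1] is not None]
col1_imputed = [r[1] for r in imputed]
sd_observed = statistics.stdev(col1_observed)
sd_imputed = statistics.stdev(col1_imputed)

print(f"rows kept by list-wise deletion: {len(complete)} of {len(rows)}")
print(f"SD of second variable, observed values only: {sd_observed:.3f}")
print(f"SD of second variable, after mean imputation: {sd_imputed:.3f}")
```

List-wise deletion throws away half of the toy rows, while mean imputation keeps every row but shrinks the spread of the imputed variable. That shrinkage is one reason standard errors after mean imputation tend to be too small, which is consistent with the much smaller standard errors reported for mean imputation in the comparison of results.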
Comparing Results
Fitted model: estimate (Est.), standard error (SE) and p-value by method

                 List-wise Deletion         Mean Imputation
Variable           Est.      SE  p-value      Est.      SE  p-value
Intercept         0.415   0.205    0.043     0.381   0.062    0.000
Relatedness_1    -0.023   0.040    0.565    -0.005   0.010    0.614
Relatedness_2     0.038   0.045    0.401     0.017   0.011    0.120
Competence        0.147   0.079    0.062     0.023   0.019    0.227
Autonomy          0.049   0.075    0.514     0.009   0.018    0.619
Education        -0.045   0.039    0.241    -0.011   0.010    0.296
Age              -0.001   0.003    0.682     0.000   0.001    0.563
Income            0.003   0.022    0.903     0.002   0.006    0.754
Hunting_Days      0.135   0.017    0.000     0.162   0.007    0.000
“Harvest”         0.569   0.077    0.000     0.364   0.028    0.000

List-wise Deletion: 5999 cases (rows) are deleted. Mean Imputation: m=1, maxit=1.
Multiple Imputation
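Multiple imputation fits the model once per imputed dataset and then combines the results. A common way to combine them is Rubin's rules, sketched below. The slides do not say which pooling procedure or software was used, so this is an assumption, and the five coefficient values are hypothetical (the study reports m=20 imputations).

```python
import math
import statistics


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    pooled_estimate = statistics.fmean(estimates)
    # Within-imputation variance: average of the squared standard errors.
    within = statistics.fmean([se ** 2 for se in std_errors])
    # Between-imputation variance: sample variance of the m estimates.
    between = statistics.variance(estimates)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, math.sqrt(total)


# Hypothetical coefficient and standard error from m = 5 imputed datasets.
estimates = [0.14, 0.16, 0.15, 0.17, 0.13]
std_errors = [0.012, 0.012, 0.012, 0.012, 0.012]

pooled_estimate, pooled_se = pool_rubin(estimates, std_errors)
print(f"pooled estimate = {pooled_estimate:.3f}, pooled SE = {pooled_se:.4f}")
```

The pooled standard error is larger than any single-dataset standard error because it adds the variation between imputations, which is the uncertainty that single imputation ignores.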
Comparing Results
Fitted model: estimate (Est.), standard error (SE) and p-value by method

                 List-wise Deletion         Mean Imputation            Multiple Imputation
Variable           Est.      SE  p-value      Est.      SE  p-value      Est.      SE  p-value
Intercept         0.415   0.205    0.043     0.381   0.062    0.000     0.316   0.183    0.093
Relatedness_1    -0.023   0.040    0.565    -0.005   0.010    0.614    -0.019   0.037    0.605
Relatedness_2     0.038   0.045    0.401     0.017   0.011    0.120     0.048   0.037    0.205
Competence        0.147   0.079    0.062     0.023   0.019    0.227     0.097   0.077    0.219
Autonomy          0.049   0.075    0.514     0.009   0.018    0.619     0.017   0.061    0.787
Education        -0.045   0.039    0.241    -0.011   0.010    0.296    -0.032   0.027    0.245
Age              -0.001   0.003    0.682     0.000   0.001    0.563    -0.001   0.002    0.731
Income            0.003   0.022    0.903     0.002   0.006    0.754     0.007   0.022    0.761
Hunting_Days      0.135   0.017    0.000     0.162   0.007    0.000     0.152   0.013    0.000
“Harvest”         0.569   0.077    0.000     0.364   0.028    0.000     0.575   0.060    0.000

List-wise Deletion: 5999 cases (rows) are deleted. Mean Imputation: m=1, maxit=1. Multiple Imputation: m=20, maxit=10.
Comparing Results
Fitted model: estimate (Est.), standard error (SE) and p-value by method

                 List-wise Deletion         FIML Imputation            EM Imputation
Variable           Est.      SE  p-value      Est.      SE  p-value      Est.      SE  p-value
Intercept         0.415   0.205    0.043     0.309   0.185    0.096     0.301   0.155    0.053
Relatedness_1    -0.023   0.040    0.565    -0.012   0.032    0.713    -0.010   0.034    0.781
Relatedness_2     0.038   0.045    0.401     0.061   0.036    0.089     0.061   0.034    0.076
Competence        0.147   0.079    0.062     0.102   0.065    0.116     0.106   0.065    0.106
Autonomy          0.049   0.075    0.514     0.016   0.062    0.798     0.013   0.062    0.839
Education        -0.045   0.039    0.241    -0.034   0.034    0.319    -0.030   0.033    0.359
Age              -0.001   0.003    0.682    -0.001   0.002    0.779     0.005   0.020    0.803
Income            0.003   0.022    0.903     0.006   0.020    0.766    -0.001   0.002    0.752
Hunting_Days      0.135   0.017    0.000     0.148   0.014    0.000     0.148   0.015    0.000
“Harvest”         0.569   0.077    0.000     0.599   0.062    0.000     0.598   0.060    0.000

FIML = Full Information Maximum Likelihood; EM = Expectation Maximization.
List-wise Deletion: 5999 cases (rows) are deleted. EM algorithm (MLE) converges in 37 iterations.
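To show what an EM iteration does, here is a minimal sketch for the simplest case: two jointly normal variables, `x` fully observed and `y` partly missing. It is not the authors' ten-variable analysis; the simulated data, the function name `em_bivariate`, and the starting values are assumptions made for illustration.

```python
import random


def em_bivariate(xs, ys, tol=1e-8, max_iter=500):
    """ML estimates of (mu_y, s_xy, s_yy) by EM; x is complete, y has None gaps."""
    n = len(xs)
    mu_x = sum(xs) / n
    s_xx = sum((x - mu_x) ** 2 for x in xs) / n  # ML variance of x (divide by n)
    observed = [y for y in ys if y is not None]
    # Starting values: complete-case mean and variance of y, zero covariance.
    mu_y = sum(observed) / len(observed)
    s_yy = sum((y - mu_y) ** 2 for y in observed) / len(observed)
    s_xy = 0.0
    for iteration in range(1, max_iter + 1):
        slope = s_xy / s_xx
        resid_var = s_yy - s_xy ** 2 / s_xx
        # E-step: expected sufficient statistics given the current parameters.
        sum_y = sum_yy = sum_xy = 0.0
        for x, y in zip(xs, ys):
            if y is None:
                ey = mu_y + slope * (x - mu_x)   # E[y | x]
                eyy = ey ** 2 + resid_var        # E[y^2 | x]
            else:
                ey, eyy = y, y ** 2
            sum_y += ey
            sum_yy += eyy
            sum_xy += x * ey
        # M-step: update the parameters from the expected statistics.
        new_mu_y = sum_y / n
        new_s_yy = sum_yy / n - new_mu_y ** 2
        new_s_xy = sum_xy / n - mu_x * new_mu_y
        change = (abs(new_mu_y - mu_y) + abs(new_s_yy - s_yy)
                  + abs(new_s_xy - s_xy))
        mu_y, s_yy, s_xy = new_mu_y, new_s_yy, new_s_xy
        if change < tol:
            break
    return mu_y, s_xy, s_yy, iteration


# Simulated MAR data: y is mostly missing when x is positive; true mean of y is 2.
rng = random.Random(1)
xs = [rng.gauss(0, 1) for _ in range(5000)]
ys_full = [2 + 1.5 * x + rng.gauss(0, 1) for x in xs]
ys = [None if (x > 0 and rng.random() < 0.8) else y
      for x, y in zip(xs, ys_full)]

observed = [y for y in ys if y is not None]
cc_mean = sum(observed) / len(observed)
em_mean, em_s_xy, em_s_yy, n_iter = em_bivariate(xs, ys)
print(f"complete-case mean of y: {cc_mean:.3f}")
print(f"EM estimate of mean of y: {em_mean:.3f} after {n_iter} iterations")
```

In this simulation the complete-case mean is pulled well below the true value, while the EM estimate recovers it by borrowing information from `x`. As in the study, the number of iterations needed depends on the data and the tolerance.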
Summary
• EM only shows that Relatedness_2 is significant
• EM estimates smallest standard error for Income
• Comparison of Imputation Methods

% of the 10 variables with a smaller value than under List-wise Deletion
Approach                               Estimates   Std. Err.   P-value   Suggestion
List-wise Deletion                     Base        Base        Base      Avoid
Mean Imputation                        60%         100%        40%       Use with care
Multiple Imputation                    30%         100%        20%       Better
Full Information Maximum Likelihood    30%         100%        20%       Better
Expectation Maximization               40%         90%         20%       Preferred if converged
Thanks for your kind attention
Special thanks to:
Dr. Andrew Tyre, University of Nebraska, Lincoln
Dr. Lisa Pennisi, University of Nebraska, Lincoln
Dr. Allan McCutcheon, University of Nebraska, Lincoln
Nebraska Game & Parks Commission
References
Anne-Kathrin, F. & Andrea B. (2014). The economic performance of Swiss drinking water utilities. Journal of Prod. Analysis, 41, 383-397. doi:10.1007/s11123-013-0344-0
Bell, M. L., & Fairclough, D. L. (2013). Practical and statistical issues in missing data for longitudinal patient reported outcomes. Statistical Methods in Medical Research, 0(0), 1-20. doi:10.1177/0962280213476378
Bennett, D. A. (2001). How can I deal with missing data in my study? Australian and New Zealand Journal of Public Health, 25, 464-469.
Brick, J., & Kalton, J. (1996). Handling missing data in survey research. Statistical Methods in Medical Research, 5, 215-238. doi:10.1177/096228029600500302
Buuren, S. V. (2012). Flexible imputation of missing data. Taylor & Francis, FL: CRC Press.
John, W. G., Allison E. O., & Tamika D. G. (2007). How many imputations are really needed? Some practical clarifications of multiple imputation theory. Springer, 8, 206-213.
Graham, J. W. (2003). Adding missing-data-relevant variables to FIML based structural equation models. Structural Equation Modeling, 10, 80-100.
Groves, R., Fowler, F., Couper, M., Lepkowski, J., Singer, E., & Tourangeau, R. (2004). Survey methodology. Hoboken, NJ: John Wiley.
Little, R. J. A. (1988). A test of missing completely at random for multivariate data with missing values. Journal of the American Statistical Association, 83, 1198-1202.
NGPC (2008). Nebraska 20 year hunter/angler recruitment, development and retention plan. Lincoln, NE.
Pigott, T. D. (2001). A review of methods for missing data. Educational Research and Evaluation, 7(4), 353-383.
Peng, C. Y., Harwell, M., Liou, S. M., & Ehman, L. H. (2006). Advances in missing data methods and implications for educational research. In S. Sawilowsky (Ed.), Real data analysis (pp. 31-78). Greenwich, CT: Information Age.
Qin, Y., Zhang, S., Zhu, X., Zang, J., & Zhang, C. (2007). Semi-parametric optimization for missing data imputation. Appl Intell, 27, 79-88. doi:10.1007/s10489-006-0032-0
Schafer, J. L. (1999). Multiple imputation: A primer. Statistical Methods in Medical Research, 8, 3-15.
Todd D. L., Terrence D. J., Kyle M. L., & Whitney M. (2013). On the joys of missing data. Journal of Pediatric Psychology, 1-12. doi:10.1093/jpepsy/jst048
Yiran D. & Chao-Ying J. P. (2013). Principled missing data methods for researchers. Springer, 2:222.
Questions & comments are most welcome!
Contact Information: [email protected]