Post on 25-Dec-2015
Who are the Nonrespondents?
An Analysis Based on a New Subsample of the German Socio-Economic Panel (SOEP)
including Microgeographic Characteristics and Survey-Based Interviewer Characteristics
Jörg-Peter Schräpler (1, 2), Jürgen Schupp (2, 3) and Gert G. Wagner (2, 4)
(1) Ruhr-University Bochum, LDS NRW; (2) DIW Berlin; (3) FU Berlin; (4) Berlin University of Technology
Q2008 Conference, Rome, 8-11 July 2008
2
Outline
• Introduction
• Reasons for Unit Nonresponse
• Nonresponse in Sample H
• Descriptive Analysis
• Microgeographic data
• Interviewer data
• Multilevel Analysis
• Consequences
• Summary and Conclusion
3
Introduction
Unit nonresponse is one of the most important issues in empirical social science
Danger of selectivity: it leads to biased samples; the samples are no longer random
It is important to investigate in what way the realized sample differs from the intended sample and to look at the consequences
Main reasons for nonresponse:
• Problem of non-accessibility
• Problem of non-ability
• Problem of refusals
4
Reasons for Nonresponse, 1st level: Accessibility
Nonresponse results from the impossibility of contacting household members. It can be seen as a function of (Groves/Couper 1998):
• the physical reachability of the household
• the circadian rhythm of the household members
• contact strategies of the interviewers
Problem: causes often can't be measured directly
Some empirical findings:
socio-economic status, household size, vocational status and age are important for mobility (cf. Goyder 1987, Schneekloth/Leven 2003, Koch 1997, Schräpler 2000)
Interviewers with higher workload have less nonresponse due to non-reachability (cf. Schräpler 2000)
5
Reasons for Nonresponse, 2nd level: Ability
Unit nonresponse depends on the ability of the household member to participate
Individuals are ill and can't participate. Assumption: health problems increase with the age of the respondent (cf. Schneekloth/Leven 2003)
Assumption: "not able to participate" is sometimes an alibi and a "soft refusal"
6
Reasons for Nonresponse, 3rd level: Motivation/Cooperation
Cooperation depends on respondents' assessment of the interview situation and their evaluation of the consequences of possible actions (rational choice (RC) theory)
• Opportunity costs: an interview takes time; the survey has to serve a meaningful purpose
• Privacy and confidentiality concerns: invasion of privacy (cf. Singer et al. 1993); critical distance and possible mistrust of surveys in more intellectual environments in Germany (Schneekloth/Leven 2003)
• Fear of crime: high population density areas, anonymous residential zones (cf. Schnell 1997, Koch 1997, Goyder 1987, DeMaio 1980)
• Interviewer: interviewer's age, gender, motivation, attitudes and experience (cf. Esser 1986, Loosveldt et al. 1998, Schräpler 2006, 2004, 2000)
7
SOEP - Sample H - Fieldwork
Subsample H of the SOEP started in 2006
Of 6,000 household addresses (4 per sample point), 3,931 household addresses overall were recorded by random walk
The process of address recording is separated from the interviewing process: the interviewer receives fixed addresses from the fieldwork organization
The first wave was launched by 234 interviewers. Of these, 143 were already members of the SOEP staff.
All interviews were carried out by CAPI
8
Nonresponse in Sample H
Reasons for Nonresponse in Sample H               N        %
Gross Sample                                   3931   100.00
./. Non-systematic Drop-Outs
    Household not detectable                    169     4.30
    At the moment not feasible                   12     0.31
Adjusted Gross Sample                          3750   100.00
./. Systematic Drop-Outs
    Not accessible                              485    12.93
    Refusal                                    1487    39.65
    Not able to participate (cf. nursing case)  172     4.59
    Whole Sample Point lost                      15     0.40
    Individual household without treatment       82     2.19
Analyzable Interviews                          1509    40.24
Source: SOEP 2006, Sample H, household level
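The percentages in this table can be re-derived from the counts. The short Python sketch below does that arithmetic; the counts come from the slide, while the variable and function names are illustrative only. Non-systematic drop-outs are related to the gross sample, all other rows to the adjusted gross sample.

```python
# Disposition counts taken from the Sample H table (household level).
GROSS_SAMPLE = 3931
NON_SYSTEMATIC = {
    "Household not detectable": 169,
    "At the moment not feasible": 12,
}
SYSTEMATIC = {
    "Not accessible": 485,
    "Refusal": 1487,
    "Not able to participate": 172,
    "Whole sample point lost": 15,
    "Individual household without treatment": 82,
}


def percent(count, base):
    """Share of `count` in `base`, in percent, rounded to two decimals."""
    return round(100.0 * count / base, 2)


# Adjusted gross sample = gross sample minus non-systematic drop-outs.
adjusted_gross = GROSS_SAMPLE - sum(NON_SYSTEMATIC.values())
# Analyzable interviews = adjusted gross sample minus systematic drop-outs.
interviews = adjusted_gross - sum(SYSTEMATIC.values())

for label, count in NON_SYSTEMATIC.items():
    print(f"{label}: {percent(count, GROSS_SAMPLE)} % of gross sample")
for label, count in SYSTEMATIC.items():
    print(f"{label}: {percent(count, adjusted_gross)} % of adjusted gross sample")
print(f"Analyzable interviews: {interviews} ({percent(interviews, adjusted_gross)} %)")
```

Run as is, the sketch arrives at the 3,750 adjusted addresses and the 1,509 analyzable interviews reported on the slide.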
9
Nonresponse Analyses – Information gap
Serious problem for nonresponse analysis: Information gap on respondents and nonrespondents
To fill the gap we use
• commercial microgeographic data on the households' immediate neighbourhood
• demographic variables of the interviewers
• results of an interviewer questionnaire
10
Microgeographic Information
Use of additional commercial microgeographic data on the households' immediate neighbourhoods from the MOSAIC data system
• contains more than 75 individual characteristics used to analyse and describe customer databases or markets, for instance Sinus Milieus®, status, removal volume etc.
• information is available at the address level and covers 17.8 million buildings in Germany
• the building level contains seven or eight households on average (at least five households)
Important: the linked information is not necessarily in line with the reality of the particular household (it is only an approximation for the neighbourhood)
11
Interviewer data
Use of interviewer data from the SOEP interviewer data set
• mainly demographic variables like gender, age, education, family status etc.
Use of a dataset based on an interviewer questionnaire
• mainly personality variables and self-assessments
• filled out by 165 of the 234 SOEP interviewers in sample H
12
Respondents by Sinus Milieus (N=1,449)
[Figure: relative bias in % by Sinus Milieu (reference: milieu distribution for addresses). Legend classes: < -50; > -50 till -30; > -30 till -10; > -10 till +10; > +10 till +30; > +30 till +50; > +50]
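The slides do not spell out how the relative bias is computed. A minimal sketch, assuming it is the percentage deviation of a milieu's share in the group (for example respondents) from its share among all sampled addresses, and assuming boundary values fall into the lower class, could look like this. The shares in the usage example are hypothetical and not taken from the SOEP data.

```python
def relative_bias(group_share, reference_share):
    """Relative bias in percent (assumed formula, see the note above).

    group_share:     share of a milieu among e.g. respondents
    reference_share: share of the same milieu among all sampled addresses
    """
    if reference_share <= 0:
        raise ValueError("reference share must be positive")
    return 100.0 * (group_share - reference_share) / reference_share


# Upper bounds and labels of the legend classes shown on the slide.
LEGEND_CLASSES = [
    (-50.0, "< -50"),
    (-30.0, "> -50 till -30"),
    (-10.0, "> -30 till -10"),
    (10.0, "> -10 till +10"),
    (30.0, "> +10 till +30"),
    (50.0, "> +30 till +50"),
]


def legend_class(bias):
    """Map a relative bias value to its legend class.

    Assumption: a value exactly on a boundary is put into the lower class.
    """
    for upper_bound, label in LEGEND_CLASSES:
        if bias <= upper_bound:
            return label
    return "> +50"


# Hypothetical example: a milieu covers 10 % of the addresses
# but 12 % of the respondents.
example = relative_bias(0.12, 0.10)
print(round(example, 1), legend_class(example))
```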
13
Refusals by Sinus Milieus (N=1,435)
[Figure: relative bias in % by Sinus Milieu (reference: milieu distribution for addresses). Legend classes: < -50; > -50 till -30; > -30 till -10; > -10 till +10; > +10 till +30; > +30 till +50; > +50]
14
Noncontact by Sinus Milieus (N=470)
[Figure: relative bias in % by Sinus Milieu (reference: milieu distribution for addresses). Legend classes: < -50; > -50 till -30; > -30 till -10; > -10 till +10; > +10 till +30; > +30 till +50; > +50]
15
"Not Able to Participate" by Sinus Milieus (N=167)
[Figure: relative bias in % by Sinus Milieu (reference: milieu distribution for addresses). Legend classes: < -50; > -50 till -30; > -30 till -10; > -10 till +10; > +10 till +30; > +30 till +50; > +50]
16
Four Multilevel Logit Models
• Model 1 – probability for response variable "interview" (participation) vs. non-response
• Model 2 – probability for response variable "refuse to participate" vs. "participate"
• Model 3 – probability for response variable "household not reachable" vs. "participate"
• Model 4 – probability for response variable "household not able to participate" vs. "participate"
Two sets of predictors:
1. Model version A with demographic and household variables for the potential respondents, microgeographic variables and demographic variables for the interviewer
2. Model version B with additional interviewer variables from the interviewer questionnaire
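How the four binary response variables relate to one fieldwork outcome per household can be sketched as follows. The outcome codes and the function name are hypothetical, chosen only for illustration. The sketch assumes that Models 2 to 4 compare one nonresponse type with participation only, so households with a different nonresponse type are left out of that model (marked `None`).

```python
# Hypothetical outcome codes for the final fieldwork result of a household.
OUTCOMES = ("interview", "refusal", "noncontact", "not_able")

# Nonresponse type that is contrasted with participation in Models 2 to 4.
EVENT_BY_MODEL = {
    "model_2": "refusal",
    "model_3": "noncontact",
    "model_4": "not_able",
}


def model_responses(outcome):
    """Return the binary response of one household for each of the four models.

    1 = event of interest, 0 = reference category,
    None = household is not part of that model's comparison.
    """
    if outcome not in OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome!r}")
    participated = outcome == "interview"
    # Model 1: participation vs. any kind of nonresponse.
    responses = {"model_1": 1 if participated else 0}
    # Models 2 to 4: one nonresponse type vs. participation.
    for model, event in EVENT_BY_MODEL.items():
        if outcome == event:
            responses[model] = 1
        elif participated:
            responses[model] = 0
        else:
            responses[model] = None
    return responses


print(model_responses("refusal"))
```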
17
Two-level Logit Models
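Level 1: respondents, Level 2: interviewers

Participation:
\[ y_{ij} = \begin{cases} 1, & \text{if } y^{*}_{ij} > 0 \\ 0, & \text{otherwise} \end{cases} \]

Unit nonresponse (refuse, nocontact, not able):
\[ y_{ij} = \begin{cases} 1, & \text{if } y^{*}_{ij} > 0 \\ 0, & \text{otherwise} \end{cases} \]

Latent variable:
\[ y^{*}_{ij} = \eta_{ij} + u_{ij} \]

Random-Intercept Model:
\[ \pi_{ij} = \frac{1}{1 + \exp\left(-\left(\beta_{0} + \sum_{h=1}^{H} \beta_{h} x_{h,ij} + v_{0j}\right)\right)} \]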
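A minimal sketch of the random-intercept logit, assuming the usual notation: a fixed intercept `beta0`, fixed slopes `betas` for the respondent-level predictors `x`, and an interviewer-specific random intercept `v0j`. These names and the numbers in the usage example are illustrative and not estimates from the slides.

```python
import math


def response_probability(beta0, betas, x, v0j):
    """Probability of the modelled outcome for respondent i of interviewer j.

    beta0: fixed intercept
    betas: fixed-effect coefficients, one per predictor
    x:     predictor values of the respondent (same length as betas)
    v0j:   random intercept of interviewer j (level-2 residual)
    """
    if len(betas) != len(x):
        raise ValueError("betas and x must have the same length")
    # Linear predictor: fixed part plus interviewer-specific shift.
    eta = beta0 + sum(b * value for b, value in zip(betas, x)) + v0j
    # Inverse logit link.
    return 1.0 / (1.0 + math.exp(-eta))


# Two hypothetical interviewers facing the same respondent profile:
# the one with the larger random intercept gets the higher probability.
p_low = response_probability(-0.5, [0.3], [1.0], v0j=-0.6)
p_high = response_probability(-0.5, [0.3], [1.0], v0j=0.6)
print(round(p_low, 3), round(p_high, 3))
```

The random intercept shifts the whole response curve of an interviewer up or down, which is how the model captures interviewer-specific success rates.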
18
Version A: Multilevel logit estimates – age of the potential respondents
Models: (1) Participation vs. Nonparticipation; (2) Refused vs. Participation; (3) Nocontact vs. Participation; (4) Not Able vs. Participation
                         (1)                  (2)                  (3)                  (4)
Variable                 Coeff. z-value       Coeff. z-value       Coeff. z-value       Coeff. z-value
Fixed effect
(Intercept)              -1.482   -2.13 *      0.360    0.47        0.279    0.22       -1.106   -0.58
Age <= 35 years (Ref.)
Age > 35 - 40 y.         -0.160   -0.78        0.221    0.99        0.297    0.77       -0.238   -0.38
Age > 40 - 45 y.         -0.292   -1.52        0.239    1.14        0.309    0.84        0.550    0.97
Age > 45 - 50 y.         -0.454   -2.39 *      0.519    2.52 *      0.437    1.18        0.181    0.32
Age > 50 - 55 y.         -0.019   -0.10        0.050    0.24        0.167    0.43        0.079    0.14
Age > 55 - 60 y.         -0.128   -0.64        0.102    0.47        0.561    1.46       -0.275   -0.45
Age > 60 - 65 y.         -0.294   -1.45        0.313    1.43        0.407    1.02        0.232    0.39
Age > 65 y.              -0.212   -1.00        0.251    1.10        0.306    0.71        0.315    0.52
...
(to be continued)
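Logit coefficients are easier to read as odds ratios, and the z-values can be turned into approximate p-values. The sketch below assumes that the reported z-values are Wald statistics evaluated against a standard normal distribution; the slides do not state the significance thresholds behind the +, * and ** markers, so none are hard-coded here. The example uses the Model 1 estimate for the age group > 45 - 50 years (coefficient -0.454, z = -2.39).

```python
import math


def odds_ratio(coefficient):
    """Odds ratio implied by a logit coefficient."""
    return math.exp(coefficient)


def two_sided_p_value(z):
    """Two-sided p-value of a z statistic under the standard normal.

    Uses the identity P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


# Model 1 (participation), age > 45 - 50 years vs. age <= 35 years.
print(round(odds_ratio(-0.454), 3))
print(round(two_sided_p_value(-2.39), 4))
```

An odds ratio below 1 means lower odds of participation than in the reference group (age <= 35 years), holding the other predictors and the interviewer effect constant.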
19
Version A: Multilevel logit estimates – Sinus Milieus for the potential respondents
Models: (1) Participation vs. Nonparticipation; (2) Refused vs. Participation; (3) Nocontact vs. Participation; (4) Not Able vs. Participation
                           (1)                  (2)                  (3)                  (4)
Variable                   Coeff. z-value       Coeff. z-value       Coeff. z-value       Coeff. z-value
Sinus Milieu
Well-Established (Ref.)
Post-Materialists          -0.194   -1.12        0.282    1.51        0.023    0.07        0.190    0.30
Modern Performers          -0.419   -2.06 *      0.499    2.25 *      0.247    0.68        0.764    1.20
Upper-Conservatives        -0.249   -1.20        0.200    0.89       -0.547   -1.30        1.986    3.27 ***
Traditionalists            -0.108   -0.62        0.146    0.78       -0.634   -1.83 +      1.483    2.60 **
Nostalgics of former DDR   -0.052   -0.24        0.064    0.26       -0.255   -0.63        0.359    0.49
New Middle Class           -0.288   -1.64 +      0.351    1.87 +      0.060    0.17        0.980    1.66 +
Materialists               -0.267   -1.36        0.336    1.58        0.099    0.27        0.219    0.32
Hedonists/Escapists        -0.222   -1.02        0.276    1.15        0.044    0.11        0.769    1.13
Experimentalists           -0.565   -2.24 *      0.508    1.84 +      0.635    1.47        0.695    0.80
...
(to be continued)
20
Version A: Multilevel logit estimates – Interviewer variables
Models: (1) Participation vs. Nonparticipation; (2) Refused vs. Participation; (3) Nocontact vs. Participation; (4) Not Able vs. Participation
                                   (1)                  (2)                  (3)                  (4)
Variable                           Coeff. z-value       Coeff. z-value       Coeff. z-value       Coeff. z-value
...
purch. power per HH > 530 €        -0.329   -0.96        0.452    1.22       -0.097   -0.15        0.331    0.29
Status                              0.018    0.71       -0.017   -0.63       -0.029   -0.61        0.082    1.14
East-Germany                        0.075    0.37       -0.117   -0.54       -0.199   -0.50        0.601    1.04
Interviewer
Isex (1 - men)                      0.074    0.54       -0.127   -0.87       -0.271   -1.02        0.599    1.69 +
age of interviewer                  0.002    0.27        0.000   -0.05       -0.002   -0.15       -0.023   -1.16
interviewer age < 40 & male        -0.508   -1.32        0.785    1.95 +      0.519    0.72       -0.981   -0.86
secondary modern school (Ref.)
secondary school                   -0.068   -0.40        0.037    0.20       -0.004   -0.01       -0.081   -0.18
high school diploma                -0.455   -1.67 +      0.401    1.39        0.618    1.22        0.597    0.89
university with and without deg.    0.008    0.04        0.049    0.23       -0.158   -0.41       -0.456   -0.87
Workload                            0.018    3.26 **    -0.019   -3.17 **    -0.018   -1.63 +     -0.024   -1.78 +
SOEP Experience                     0.066    0.45       -0.210   -1.36        0.323    1.16       -0.059   -0.16
...
(to be continued)
21
Version A: Multilevel logit estimates – area description
Models: (1) Participation vs. Nonparticipation; (2) Refused vs. Participation; (3) Nocontact vs. Participation; (4) Not Able vs. Participation
                                                  (1)                  (2)                  (3)                  (4)
Variable                                          Coeff. z-value       Coeff. z-value       Coeff. z-value       Coeff. z-value
...
City                                              -0.442   -2.58 ***    0.407    2.22 *      0.834    2.79 **    -0.047   -0.10
simple urban row estate (Ref.)
good earning families, new privately owned home    1.030    2.16 *     -0.770   -1.48       -1.251   -1.29       -3.404   -2.31 *
old families in outskirts                          1.028    3.00 **    -1.002   -2.61 **    -0.724   -1.30       -1.841   -1.89 +
self-employed in new houses                        0.994    2.92 **    -0.842   -2.24 *     -1.382   -2.19 *     -2.015   -2.01 +
good new detached houses                           0.957    1.94 +     -0.847   -1.55       -1.487   -1.52       -0.488   -0.39
villages in outskirts                              0.842    2.44 *     -0.732   -1.92 +     -1.087   -1.71 +     -1.366   -1.47
old city centre                                    0.834    2.33 *     -0.605   -1.55       -1.398   -2.12 *     -2.139   -1.84 +
social climber, upscale professions, outskirts     0.771    2.00 *     -0.697   -1.63 +     -1.064   -1.67 +     -1.424   -1.36
dignified detached houses                          0.760    2.10 *     -0.638   -1.61       -0.812   -1.28       -2.289   -1.92 +
simple vocations in rural areas                    0.752    2.15 *     -0.494   -1.29       -1.118   -1.83 +     -2.364   -2.07 *
social housing, simple apartment buildings         0.654    2.05 *     -0.588   -1.64 +     -0.859   -1.69 +     -0.592   -0.69
low qualified worker                               0.643    1.71 +     -0.428   -1.05       -1.095   -1.60       -2.109   -1.88 +
middle class in older accommodations               0.629    2.08 *     -0.478   -1.42       -0.897   -1.84 +     -1.426   -1.61
younger villager                                   0.605    1.73 +     -0.234   -0.61       -1.856   -2.61 **    -2.185   -2.19 *
social hotspot                                     0.507    1.52       -0.416   -1.11       -1.117   -2.09 *      0.238    0.29
humble-borns in apartments                         0.297    0.91       -0.195   -0.54       -0.879   -1.68 +     -0.598   -0.68
attractive address in city                         0.022    0.06        0.021    0.05       -0.382   -0.70        0.919    1.05
old social housing                                -0.301   -0.72        0.462    1.03       -0.180   -0.30        1.103    1.05
...
(to be continued)
22
Version A: Multilevel logit estimates – size of houses and frequency of moves
Models: (1) Participation vs. Nonparticipation; (2) Refused vs. Participation; (3) Nocontact vs. Participation; (4) Not Able vs. Participation
                                               (1)                  (2)                  (3)                  (4)
Variable                                       Coeff. z-value       Coeff. z-value       Coeff. z-value       Coeff. z-value
...
1-2 family houses in homog. street section (Ref.)
1-2 family houses in inhomog. street section   -0.167   -1.12        0.158    1.00        0.308    0.87       -0.033   -0.07
3-5 family houses                              -0.011   -0.06       -0.086   -0.45        0.519    1.35       -0.494   -0.88
6-9 family houses                               0.133    0.62       -0.078   -0.34        0.032    0.07       -0.976   -1.52
apartment buildings with 10 - 19 HH             0.290    1.11       -0.234   -0.82        0.157    0.32       -2.253   -2.85 **
high-rises with 10 and more HH                  0.636    1.49       -0.502   -1.03       -0.127   -0.19       -2.591   -2.23 *
mainly commercially used                       -0.392   -0.87        0.196    0.40        1.529    2.06 *     -1.725   -1.03
MOVE                                           -0.003   -0.13       -0.033   -1.16        0.082    1.74 +      0.140    1.91 +
...
(to be continued)
23
Version A: Multilevel logit estimates – family structure in the neighbourhood
Models: (1) Participation vs. Nonparticipation; (2) Refused vs. Participation; (3) Nocontact vs. Participation; (4) Not Able vs. Participation
                                                       (1)                  (2)                  (3)                  (4)
Variable                                               Coeff. z-value       Coeff. z-value       Coeff. z-value       Coeff. z-value
...
mainly single household (Ref.)
far above average share of single HH                   -0.005   -0.02        0.574    2.31 *     -0.597   -1.90 +     -0.319   -0.58
above average share of single HH                       -0.130   -0.60        0.686    2.70 **    -0.765   -2.28 *      0.255    0.47
slightly above average share of single HH               0.104    0.47        0.398    1.54       -0.778   -2.25 *      0.078    0.14
mixed family structure                                 -0.087   -0.39        0.704    2.71 **    -0.674   -1.92 +     -0.052   -0.09
slightly above average share of family with children    0.169    0.73        0.434    1.63       -1.062   -2.78 **     0.082    0.14
above average share of family with children             0.164    0.69        0.460    1.69 +     -1.211   -3.08 **    -0.226   -0.37
far above average share of family with children         0.135    0.55        0.581    2.09 *     -1.605   -3.70 ***   -1.052   -1.56
almost only families with children                      0.278    1.08        0.451    1.55       -1.536   -3.13 **    -1.793   -2.06 *
...
(to be continued)
24
Version A: Multilevel logit estimates – Random effects
Models: (1) Participation vs. Nonparticipation; (2) Refused vs. Participation; (3) Nocontact vs. Participation; (4) Not Able vs. Participation
                      (1)                   (2)                   (3)                   (4)
...
Random effects        Coeff. (95% Interv.)  Coeff. (95% Interv.)  Coeff. (95% Interv.)  Coeff. (95% Interv.)
u_ij                  π²/3                  π²/3                  π²/3                  π²/3
v_0j (intercept)      0.400 (0.38 - 0.83)   0.395 (0.39 - 0.94)   1.300 (1.25 - 3.0)    1.370 (0.52 - 1.6)
ICC                   0.108                 0.107                 0.283                 0.294
Households            3408                  2774                  1825                  1523
LogLikelihood         -2142                 -1805                 -829                  -405
Interviewers (column assignment not recoverable from the transcript): 227, 219, 224, 215
Pseudo-R² (column assignment not recoverable from the transcript): 0.14, 0.14, 0.07, 0.07
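The ICC values on this slide are consistent with the usual intraclass correlation of a random-intercept logit model, in which the level-1 residual variance is fixed at π²/3 (the variance of the standard logistic distribution) and the level-2 variance is that of the interviewer intercept v_0j. A minimal sketch, with illustrative function and variable names:

```python
import math

# Level-1 residual variance of a logit model (standard logistic distribution).
LEVEL1_VARIANCE = math.pi ** 2 / 3


def intraclass_correlation(interviewer_variance):
    """Share of the total residual variance that lies between interviewers."""
    return interviewer_variance / (interviewer_variance + LEVEL1_VARIANCE)


# Interviewer-level variances of the four Version A models, as on the slide.
version_a_variances = [0.400, 0.395, 1.300, 1.370]
version_a_icc = [round(intraclass_correlation(v), 3) for v in version_a_variances]
print(version_a_icc)
```

The rounded results match the ICC row of the slide (0.108, 0.107, 0.283, 0.294): interviewers account for roughly 11 % of the residual variance in the participation and refusal models and for close to 30 % in the noncontact and "not able" models.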
25
Version B: Variables from the interviewer data set
Models: (1) Participation vs. Nonparticipation; (2) Refused vs. Participation; (3) Nocontact vs. Participation; (4) Not Able vs. Participation
                                   (1)                  (2)                  (3)                  (4)
Variable                           Coeff. z-value       Coeff. z-value       Coeff. z-value       Coeff. z-value
...
Interviewer
Isex (1 - men)                      0.079    0.53       -0.241   -1.45        0.235    0.76        0.959    2.70 *
age of interviewer                 -0.006   -0.74        0.010    1.10       -0.011   -0.68        0.000    0.00
interviewer age < 40 & male        -0.266   -0.60        0.809    1.64 +     -0.717   -0.78       -0.770   -0.73
secondary modern school (Ref.)
secondary school                   -0.211   -1.16        0.219    1.07        0.269    0.71       -0.394   -0.92
high school diploma                -0.634   -2.27 *      0.625    1.99 *      0.975    1.80 +      0.313    0.51
university with and without deg.   -0.296   -1.34        0.437    1.75 +      0.236    0.53       -0.583   -1.13
Workload                            0.021    3.15 **    -0.024   -3.24 **    -0.009   -0.68       -0.043   -2.56 *
SOEP Experience                     0.295    1.97 +     -0.461   -2.71 **     0.014    0.05       -0.022   -0.06
...
(to be continued)
26
Version B: Variables from the interviewer questionnaire
Models: (1) Participation vs. Nonparticipation; (2) Refused vs. Participation; (3) Nocontact vs. Participation; (4) Not Able vs. Participation
                                      (1)                  (2)                  (3)                  (4)
Variable                              Coeff. z-value       Coeff. z-value       Coeff. z-value       Coeff. z-value
...
years for SOEP in future               0.215    1.54       -0.239   -1.53        0.042    0.15       -0.297   -0.92
amicable (1 - 7)                       0.193    2.66 **    -0.231   -2.87 **    -0.070   -0.47       -0.073   -0.40
reserved (1 - 7)                       0.147    2.86 **    -0.104   -1.80 +     -0.336   -3.16 **    -0.235   -1.91 +
life satisfaction (0 - 10)             0.082    2.02 +     -0.081   -1.78 +     -0.072   -0.92       -0.277   -2.79 **
communicative (1 - 7)                  0.079    0.83       -0.030   -0.28       -0.109   -0.55       -0.699   -3.01 **
inquisitive (1 - 7)                    0.074    0.92       -0.039   -0.43        0.030    0.19       -0.341   -1.91 +
often worry about things (1 - 7)       0.073    1.50       -0.096   -1.75 +     -0.049   -0.50       -0.020   -0.17
creative (1 - 7)                       0.052    0.75       -0.013   -0.16       -0.397   -2.84 **     0.259    1.37
sometimes too brusque (1 - 7)          0.031    0.48       -0.057   -0.79        0.178    1.44       -0.340   -1.86 +
own risk propensity (1 - 10)          -0.027   -0.76        0.022    0.55       -0.052   -0.74        0.169    1.83 +
forgive others (1 - 7)                -0.054   -0.92        0.052    0.79       -0.033   -0.27        0.318    2.01 *
sluggish (1 - 7)                      -0.073   -1.14        0.079    1.09        0.024    0.19        0.276    1.82 +
patient (1 - 10)                      -0.097   -2.75 **     0.082    2.06 *      0.094    1.32        0.164    1.94 +
easily flustered (1 - 7)              -0.103   -1.76 +      0.095    1.44        0.071    0.60        0.285    1.99 *
social desirability indicator (1-0)   -0.163   -1.12        0.127    0.78        0.252    0.86        0.773    2.28 *
...
(to be continued)
27
Version B: Multilevel logit estimates – Random effects
Models: (1) Participation vs. Nonparticipation; (2) Refused vs. Participation; (3) Nocontact vs. Participation; (4) Not Able vs. Participation
                      (1)       (2)       (3)       (4)
...
Random effects
u_ij                  π²/3      π²/3      π²/3      π²/3
v_0j (intercept)      0.180     0.243     0.740     0.242
ICC                   0.052     0.069     0.184     0.069
Interviewers          165       165       163       162
Households            2592      2111      1409      1184
LogLikelihood         -1641     -1368     -636      -310
Pseudo-R²             0.07      0.07      0.14      0.22
28
Summary (1)
Refusals, noncontact and "unable to participate" relate to different respondent, area and interviewer characteristics.
Respondent is easy to persuade:
• well-established Sinus Milieu
• age <= 35 years
• high-income families, new privately owned houses, old families in outskirts
• interviewer with high workload, with experience, with self-assessment: amicable, satisfied with own life, not easily flustered
29
Summary (2)
Respondent is more likely to refuse:
• Sinus Milieu: new middle class, experimentalists, modern performers
• age > 45 – 50 years
• families with children
• cities, simple urban estate
• interviewer with low workload, with less experience, high-level education, age < 40 & male, with self-assessment: not amicable, unsatisfied with own life, patient, not reserved
30
Summary (3)
Respondent is difficult to contact:
• Sinus Milieu: experimentalists, modern performers
• age > 45 – 50 & age > 55 – 60 years
• single household
• cities, simple urban estate, areas with high frequency of moves
• interviewer with high-level education, with self-assessment: not creative, not reserved
31
Summary (4)
Respondent uses "not able to participate":
• Sinus Milieu: upper conservatives, traditionalists, new middle class, modern performers
• areas smaller than cities
• areas with higher frequency of moves
• interviewer: male, low workload, with self-assessment: not communicative, unsatisfied with own life, sluggish, not inquisitive, easily flustered, patient, not reserved, with higher need of social approval
The result does not indicate illness of respondents, as had been expected, but rather that "not able to participate" may be an alibi used by respondents to avoid participation
32
Conclusion
Microgeographic data, interviewer data and interviewer questionnaires are important sources for filling the information gap on respondents and nonrespondents.
Next steps of analysis:
• interaction terms between respondent, interviewer and area
• multilevel Poisson regressions for the number of contacts used in this sample