Weighting for Nonresponse on Round Two of the New Immigrant Survey

Douglas S. Massey

Princeton University

Guillermina Jasso

New York University

Monica Espinoza

Princeton University

January 17, 2017

Abstract

The New Immigrant Survey, a longitudinal survey of persons who became legal

permanent residents in 2003 and were re-interviewed approximately five years later, experienced

a marked decline in response rates between the baseline (R1) and follow-up (R2) interviews.

Using R1 data we develop a statistical model to identify the determinants of response on R2. We

then use that model to derive weights that correct for nonresponse and evaluate their efficacy

through a counterfactual analysis. We then examine the effect of weighting for nonresponse on

estimated trends. Although few variables were associated with the probability of response, the

likelihood of inclusion in the follow-up survey was by no means random. However, a

counterfactual analysis we undertook to test the weighting scheme suggested that the weights were

effective in correcting for nonresponse bias and important to use in assessing trends between the

two survey rounds. We recommend the use of nonresponse weights that are now available to

users on the NIS website. Although the precision of estimates based on R2 data is somewhat

reduced because of the smaller sample size, point estimates computed using nonresponse weights

should yield valid and unbiased inferences about the progress of new immigrants in the United

States.

The New Immigrant Survey (NIS) is a longitudinal survey of adults who became legal

permanent residents of the United States during May through November of 2003. Follow-up

interviews with the same respondents were conducted from June 2007 through December 2009.

The goal of the survey project was to generate a representative, reliable, and accurate public use

data file on the characteristics of new legal immigrants and their progress in the United States.

Data from the first round of the survey, officially labeled NIS 2003-1 but here referred to simply

as Round 1 (R1), were released in 2006. Data from the follow-up survey, labeled NIS 2003-2 but

here referred to as Round 2 (R2), were released in April 2014. Data from both rounds of the

survey are available for download from the project website (http://nis.princeton.edu/). Whereas

the R1 survey achieved a robust response rate of 68.6%, the R2 response rate came in at a more

anemic 46.1%. In this paper we undertake a methodological analysis to consider the implications

of the one-third reduction in response rate between R1 and R2. Although the NIS included

samples of children and spouses, in this paper we address nonresponse only among sampled

adults.

We begin with a description of the design and implementation of the survey’s two rounds

and discuss the likely reasons for the loss of respondents between R1 and R2. Drawing on R1

data we then specify and estimate an equation to predict the likelihood of a successful

re-interview on R2. After interpreting the model’s coefficients to identify the determinants of

selection into R2, we use the estimated equation to generate predicted probabilities of response

and use them to develop a set of case weights to correct for nonresponse bias. We evaluate the

efficacy of the weighting scheme by undertaking a counterfactual analysis for Round 1. This

exercise compares parameter estimates for R1 derived under three conditions: from the entire set

of R1 respondents, from R1 respondents without R2 non-respondents, and from R1 respondents

without R2 non-respondents but weighted to correct for nonresponse. Finally, we examine

changes over time for variables included on both survey rounds, comparing trends observed with

and without using weights to correct for nonresponse. We conclude with an appraisal of the

reliability and validity of the R2 data for making inferences about the progress of new legal

immigrants to the United States.

TWO ROUNDS OF THE NEW IMMIGRANT SURVEY

As noted above, the baseline NIS survey is a representative sample of adults who became

legal permanent residents (LPRs) in the United States during May through November of 2003,

with a midpoint roughly in August of 2003. Respondents were randomly selected from a list of

new permanent residents who received their “Green Cards” during this period. The list was

obtained from the U.S. Citizenship and Immigration Services (USCIS), a successor agency to the

earlier Immigration and Naturalization Service (INS). The average time between admission to

permanent residence and interview was 17 weeks. A total of 12,488 immigrants were sampled

and 8,573 completed the survey for a response rate of 68.6%. In addition to the main sampled

immigrant, interviewers also surveyed spouses if they lived in the household (n=4,334) and

interviewed up to two co-resident children aged 8-12 (n=1,072).

The R1 survey employed a stratified sampling design that under-sampled immigrants

with spouse-of-U.S.-citizen visas and over-sampled persons admitted as principals on

employment and diversity visas. In order to derive representative, unbiased point estimates of

population parameters design weights must be used, where the weights are the inverse of the

probability of selection into the sample. These are applied to individual cases when estimating

parameters such as means and proportions and their associated standard errors. Since the

selection probabilities are fractions under 1.0, weighting by their inverse increases the

contribution of the case as the selection probability falls, thus giving more weight to respondents

who were under-sampled and less weight to those who were over-sampled.
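
The logic of inverse-probability design weighting can be illustrated with a minimal Python sketch. The selection probabilities and outcome values below are invented for illustration and are not NIS figures; `design_weight` and `weighted_mean` are hypothetical helper names.

```python
# Hypothetical illustration: the selection probabilities and outcome
# values are invented, not taken from the NIS.

def design_weight(selection_prob):
    """Inverse-probability design weight for one sampled case."""
    if not 0.0 < selection_prob <= 1.0:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_prob

def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w)."""
    return sum(w * y for y, w in zip(values, weights)) / sum(weights)

# (outcome, selection probability): two cases from an under-sampled
# stratum (p = 0.1) and four from an over-sampled stratum (p = 0.5).
cases = [(1.0, 0.1), (1.0, 0.1), (0.0, 0.5), (0.0, 0.5), (0.0, 0.5), (1.0, 0.5)]
values = [y for y, _ in cases]
weights = [design_weight(p) for _, p in cases]

unweighted = sum(values) / len(values)      # ignores the sample design
weighted = weighted_mean(values, weights)   # up-weights the under-sampled stratum
```

In this toy example each under-sampled case counts five times as much as each over-sampled case, so the weighted mean moves toward the outcome that prevails in the under-sampled stratum.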

Follow-up interviews with these same respondents were conducted from June 2007

through December 2009, yielding a rough midpoint in September 2008, just over five years from

the midpoint of the baseline survey. Between the time of the R1 and R2 surveys, however, the

context for interviewing immigrants had changed quite dramatically. Between August 2003 and

September 2008, GDP growth dropped from 3.74% to 1.88% per year and from September 2008

to the end of the interview period in December 2009 GDP shrank by a remarkable 9%. Over the

same period, consumer confidence fell by 52% and business bankruptcies rose by 24%, a wave

of economic turmoil that coincided with an upsurge in anti-immigrant sentiment (Massey and

Sanchez 2010).

Indeed, between 2000 and 2006 the percentage of Americans who labeled immigrants as

a burden to society rose from 38% to 52% and by the latter date 53% had come to believe that

illegal immigrants should be required to go home (Pew Research Center 2006). Not

coincidentally immigrant deportations rose by 50% from 2000 to 2006 and by the end of

fieldwork in 2009 the annual total reached 392,000. Under these conditions it is hardly surprising

that in 2007 some 72% of foreign born Hispanics said that the immigration policy debate had

made life in the U.S. more difficult for Hispanics and 67% agreed that they worried some or a lot

about deportation (Pew Research Center 2007).

The context for interviewing immigrants thus became much more hostile between the

baseline and follow-up survey, a shift that logically can be expected to have made locating R1

respondents and securing their cooperation more difficult. The tasks of finding and

re-interviewing R1 respondents were further complicated by authorities at USCIS, who reneged on

an agreement signed in January 2007 to provide address updates from the Change-of-Address

(AR-11) files for NIS investigators to use in locating respondents for the follow-up. LPRs are

required to notify USCIS of changes of address within ten days of the change, thus yielding an

up-to-date address file for all legal resident aliens, at least in theory.

Unfortunately access to this resource was unexpectedly denied without explanation by

authorities at USCIS and investigators were forced to rely on their own data and other methods

to track down respondents from the baseline survey (which is ironic since USCIS was an original

funder of the study). NIS interviewers naturally recorded the respondent’s address at the time of

the R1 survey, and in addition asked about future travel plans while also obtaining the “name of

a friend or relative who does not live with you at this address but who resides in the United

States and who will know how to get in touch with you if you move.”

Immigrants are by definition a mobile population, and with an average gap of around five

years between R1 and R2, many of the original respondents and their relatives had indeed

moved. Moreover, in the hostile anti-immigrant context that had emerged by the time of the

follow-up interviews, friends and relatives were often reluctant to provide contact information to

outsiders they did not know. As a result, apart from the difficulty of securing respondents’

cooperation it proved challenging to locate respondents in the first place and roughly half of the

R1 respondents were never found.

In the end, the downturn in the economy, the rise of anti-immigrant sentiment, and the

sharp increase in deportations between the R1 and R2 surveys, when combined with the lack of

cooperation from the USCIS, augured for a lower response rate than achieved on the baseline

survey; and as already noted this outcome did, in fact, come to pass. The number of completed

interviews with adult immigrants was 3,902, for a response rate of 46.1%. In addition, follow-up

interviews were completed with just 1,771 spouses (40.9% of the R1 cohort) and 392 children

(36.6% of the R1 cohort). Although we cannot truly know the reason for the marked decline in

response rates, it is clear that something happened to trigger the drop. The more important

question, however, is what effect the decline in response rates has on the reliability and validity

of the NIS as a longitudinal survey.

NONRESPONSE, RELIABILITY, AND VALIDITY

At a minimum, the drop in sample size from 8,573 to 3,902 will reduce the precision of

parameter estimates, increasing standard errors and thus decreasing the reliability with which R2

variables and R1-R2 trends can be measured. Little can be done to offset this decline in

precision. In practical terms, the reduction in sample size makes it more difficult to find

significant effects and thus increases the likelihood of Type II errors. In making inferences from

R2 data, therefore, researchers are statistically more likely to miss a true effect than to

confirm a false one.
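
A rough sense of the size of this loss can be had from a back-of-the-envelope calculation. It assumes simple random sampling, which ignores both the NIS design weights and the nonresponse weights, so it should be read only as an approximation; the symbols are our own shorthand.

```latex
\frac{SE_{R2}}{SE_{R1}} \approx \sqrt{\frac{n_{R1}}{n_{R2}}}
  = \sqrt{\frac{8573}{3902}} \approx 1.48
```

Under that assumption, standard errors for comparable estimates would be roughly 48% larger on R2 than on R1.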

A reduced sample size by itself does not necessarily introduce bias, however. Whether

estimates are biased because of a loss to follow-up depends on the degree to which R2

respondents differ from those in R1 and the extent to which these differences are associated with

variables under study. Mathematically there is no fixed relationship between response rate and

bias (Bethlehem 2002) and a meta-analysis by Peytchev (2013) indeed found no empirical

correlation between response rates and degree of bias across a broad sample of surveys.

According to an equation derived by Bethlehem (2002), the degree of bias introduced by

nonresponse is inversely related to the response rate but directly related to the correlation

between the probability of response and survey variables of interest (Olson 2013). If variables of

interest are uncorrelated with the factors that produced the nonresponse, no bias will be

introduced into estimates no matter how low the response rate is (Lessler and Kalsbeek 1992;

Massey and Tourangeau 2013).
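
The relationship described above is commonly written in the following approximate form. The notation is ours and may differ from that of the cited sources: $\bar{y}_r$ is the respondent mean, $\rho$ the individual probability of response with mean $\bar{\rho}$, $R_{\rho y}$ the correlation between response probability and the survey variable $y$, and $S_\rho$ and $S_y$ their standard deviations.

```latex
\mathrm{Bias}(\bar{y}_r) \approx \frac{R_{\rho y}\, S_\rho\, S_y}{\bar{\rho}}
```

When $R_{\rho y} = 0$ the numerator vanishes and the bias is approximately zero whatever the value of $\bar{\rho}$, whereas for a fixed nonzero correlation the bias grows as the mean response probability falls.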

In simple cross-sectional surveys it is difficult to know the degree to which factors that

produced a high degree of nonresponse are correlated with survey variables and thus likely to

produce biased results. At best, “paradata” can be used to estimate a model predicting whether or

not an interview was completed. Paradata are auxiliary data outside the survey that are available

for both respondents and nonrespondents (Olson 2013). Examples include information such as

the time of attempted contacts, the number of call-backs, interviewer observations about

recruitment encounters, and any administrative or census data that can be linked to sampling

elements (Smith and Kim 2013).

If paradata are available, models to predict the likelihood of response can be estimated

and used to generate predicted probabilities of response, the inverse of which yield weights that

in theory correct for nonresponse bias (Czajka 2013). As with design weights, cases with a

response probability near 1.0 will yield weights that confer less influence in computing the

parameter estimate whereas those with low response probabilities carry more influence (Czajka

2013; Massey and Tourangeau 2013). In general, the greater the amount of reliable paradata

available to the researcher, the more accurate the predicted probability of response and the more

effective the correction for nonresponse bias (Smith and Kim 2013).
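
The general procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the estimation routine used for the NIS: the data are invented, the single predictor is hypothetical, and the logistic model is fit by plain gradient ascent in place of standard statistical software.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, n_iter=2000):
    """Fit a logistic regression by gradient ascent on the mean log-likelihood.

    X is a list of feature lists (no intercept column); y is a list of 0/1
    response indicators. Returns [intercept, b1, ...].
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            row = [1.0] + list(xi)
            p = sigmoid(sum(b * v for b, v in zip(beta, row)))
            for j, v in enumerate(row):
                grad[j] += (yi - p) * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def response_weights(X, beta):
    """Nonresponse weight = 1 / predicted probability of response."""
    weights = []
    for xi in X:
        row = [1.0] + list(xi)
        p = sigmoid(sum(b * v for b, v in zip(beta, row)))
        weights.append(1.0 / p)
    return weights

# Invented toy data: one binary auxiliary variable known for every case.
# Group 0: 2 of 5 respond; group 1: 4 of 5 respond.
X = [[0]] * 5 + [[1]] * 5
responded = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0]

beta = fit_logistic(X, responded)
weights = response_weights(X, beta)
```

Because the toy model is saturated, the fitted probabilities reproduce the group response rates, so cases in the low-response group receive about twice the weight of cases in the high-response group. In practice the weights would be applied only to cases that actually responded.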

In the case of longitudinal surveys, of course, considerable data are available from the

baseline sample to predict the likelihood of response on the follow-up (Schoeni et al. 2013). To

the extent that variables measured on R1 of the NIS are related to the likelihood of response on

R2, therefore, we are in a position to generate weights that correct for nonresponse bias. To the

extent that unobserved variables are correlated with those included on the R1 survey, weighted

parameter estimates will also correct for bias introduced by unobserved heterogeneity, though

how well the correction works to eliminate bias from unobserved factors cannot be known in

practice.

MODELING RESPONSE PROBABILITIES

Our goal in selecting independent variables for this analysis was to be as encompassing

as possible and to use all available R1 data to maximize the fit of the model and to identify

which of the many possible predictors determined selection into the R2 sample. The set of

predictors we compiled includes demographic characteristics such as age, gender, number of

children in the household, number of children living outside the United States, marital status, and

race/ethnicity. Indicators of human capital used in the model include years of education, English

ability, foreign language skills, and ratings of current and prior health. Geographic effects were

assessed using dummy variables to indicate country or region of birth and place of interview.

The model also included a battery of labor market indicators, such as current

employment, hours worked, hourly wage, union membership, occupation, whether the job was

obtained before receiving permanent residence, total household income; and as a control for

potential racial discrimination we added an interviewer-assessed skin color rating. Wealth was

measured using dummy variables to indicate categories of net worth and home ownership.

Immigration-related variables include immigrant visa, months of U.S. experience prior to

permanent residence, whether the respondent reported prior undocumented experience, and

stated intention to live permanently in the United States. Finally, the model included religious

affiliation and frequency of religious service attendance.

We began by estimating a logistic regression model that used all available R1 data to

predict whether respondents successfully completed an R2 interview. Using the full set of

independent variables, however, we discovered that the sample size dropped from 8,573 to 6,435

owing to the list-wise deletion of cases with missing values. The drop in sample size obviously

creates a problem if our goal is to generate nonresponse weights for all cases, so we inspected the

distribution of missing values across variables and found unusually high frequencies for seven

variables: whether the interview was in English, whether the respondent had ever spoken another

language, whether another language was spoken at home, whether the respondent belonged to a

labor union, hours worked per week on the current job, hourly wage, and months of prior U.S.

experience. We then inspected the logistic regression coefficients for these variables

(the full model is presented in Appendix A) and determined that none was significant in

predicting the likelihood of response. We then eliminated these variables from the model and

re-estimated the logistic regression model to generate the equation estimates for the complete R1

sample, shown in Table 1. Much of the missing data was due not to item nonresponse but rather to

aspects of the study design, as when certain questions were only asked of a randomly selected

portion of the respondents or when the interview was done by phone, thus precluding the

interviewer assessment of skin tone.

TABLE 1 ABOUT HERE

Rarely do researchers have such a wealth of data to estimate a model predicting the

likelihood of survey response. Even after we eliminated variables characterized by high rates of

missing data on R1 the breadth of information is impressive. Given the large number and

diversity of predictors in the model what is perhaps most surprising is how few were actually

significant in determining the likelihood of response on R2. In addition to the variables already

eliminated, the estimates in Table 1 indicate that inclusion in R2 was not significantly related to

marital status, current health status, occupation, whether the current job was obtained before or

after achieving permanent residence, skin color, net worth, prior undocumented experience, or

frequency of religious attendance.

The strongest predictors pertained to demographic background, years of education, and

intentions for future U.S. residence. Among demographic characteristics, females were

significantly more likely than males to respond to the R2 survey, with the odds being 22%

greater for women [determined by taking the exponent of the logistic regression coefficient to

derive the associated odds ratio: exp(0.201)=1.22]. The likelihood of response varied in

curvilinear fashion with respect to age. With each additional year the odds of response rose by

3% but declined by 0.03% with respect to age squared, yielding a curve that rises from age 18 to

50 and then declines into older age.
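
The conversions used in this paragraph can be checked with a short Python sketch. The coefficient 0.201 is taken from the text; the two age coefficients are an assumption, back-computed from the rounded percentages reported above rather than read from Table 1, and the function names are hypothetical.

```python
import math

def pct_change_in_odds(coef):
    """Percentage change in the odds implied by a logit coefficient."""
    return (math.exp(coef) - 1.0) * 100.0

def quadratic_peak(b_linear, b_squared):
    """Value of x that maximizes b_linear * x + b_squared * x**2."""
    if b_squared >= 0:
        raise ValueError("no interior maximum unless the squared term is negative")
    return -b_linear / (2.0 * b_squared)

# Coefficient for females as reported in the text.
female_pct = pct_change_in_odds(0.201)

# Assumed age coefficients, recovered from the rounded figures in the text:
# +3% in the odds per year of age, -0.03% per unit of age squared.
b_age = math.log(1.03)
b_age_sq = math.log(1.0 - 0.0003)
peak_age = quadratic_peak(b_age, b_age_sq)
```

With these inputs the odds for women come out roughly 22% higher and the implied turning point falls close to age 50, in line with the description above. The exact turning point depends on the unrounded coefficients.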

The odds of response also increase by 4.8% for each additional child present in the

household, and by 10% for each child living abroad. With respect to education, the odds of

response rise steadily as years of schooling go beyond six. Compared to those with a primary

education or less, the odds of inclusion are 23% greater for those with 6-11 years of education,

33% greater for high school graduates, 42% greater for those with some college, and 53% greater

for college graduates. Finally with respect to settlement intentions, the odds of response were

paradoxically 23% lower for those intending to remain in the U.S. for the rest of their lives (and

40% lower for those who didn’t answer this question).

Although not as strongly or systematically related to the likelihood of response as the

foregoing factors, race/ethnicity, English ability, health compared to a year ago, and health

before coming to the United States were also significant in predicting the likelihood of R2

response. Only one racial/ethnic category was associated with the likelihood of response.

Hispanics were 32% more likely than all other groups to complete the R2 survey. This result,

combined with the fact that prior undocumented status and months of prior U.S. experience (see

Appendix A) had no significant effect on response probabilities, suggests that people who

reported prior undocumented experience did not self-select out of the R2 sample, despite the rise

in anti-immigrant sentiment from 2003 to 2008.

Perhaps surprisingly, the likelihood of an R2 response decreased as English ability

increased, culminating in a significant coefficient for those who understood English very well,

who displayed 25% lower odds of response than those reporting lower levels of English

comprehension. Although current health had no effect on the likelihood of an R2 response, the

odds of inclusion were 19% lower for those who reported worse health compared to a year ago

but 13% higher for those who reported better health than they experienced before coming to the

United States. The two benchmarks are not necessarily the same because many “new” permanent

residents were already living in the United States and were simply “adjusting status” to become

legal permanent residents. In any event, the results are consistent in suggesting that better prior

health yields a higher likelihood of response.

Of the 10 categories for country or region of birth, only three proved to be statistically

significant in predicting the likelihood of response, all negative in their effect. Persons from the

Middle East and North Africa were 37% less likely to respond to the R2 survey, which is perhaps

not surprising given the tenor of the social climate in the wake of 9/11 and the rise of

anti-Muslim sentiment associated with the “War on Terror.” The odds of response were likewise 31%

lower among immigrants from South Asia and the Pacific, a region that also includes many

Muslims. Although East Asia contains very few Muslims, immigrants from that region

nonetheless displayed 60% lower odds of responding to the R2 survey. The fact that immigrants

from Mexico were neither more nor less likely to be included in the follow-up than those in

English-speaking nations again suggests that persons with prior undocumented experience were

not systematically selected out of the R2 sample.

With respect to place of interview only four of the 15 geographic categories were

significant in predicting the likelihood of response. Whereas the odds of inclusion were 21%

lower for respondents interviewed in New York, they were 29% higher among those from New

England (i.e. Connecticut, Massachusetts, Maine, New Hampshire, Rhode Island, or Vermont),

28% higher among those from the South Atlantic (i.e. Georgia, North Carolina, South Carolina,

Virginia, or West Virginia), and 34% higher among those from the West North Central region

(i.e. Iowa, Minnesota, Missouri, North Dakota, South Dakota, Nebraska, or Kansas). Again, the

fact that California, Florida, Illinois, New Jersey, and Texas did not display significantly lower

probabilities of response suggests that immigrants with prior undocumented experience did not

self-select out of R2, as these states house a disproportionate share of America’s undocumented

population (see Warren and Warren 2013). This proposition is buttressed by the fact that the

South Atlantic and West North Central themselves displayed higher likelihoods of response, and

these are regions containing a large share of new immigrant destinations (Massey and Capoferro

2008).

Completion of the R2 survey was also not very selective with respect to labor force

indicators. As already noted the likelihood of response was not affected by occupation, wages,

the timing of job acquisition (before or after permanent residence), or income; and with

respect to current employment out of six categories only those temporarily laid off displayed a

statistically significant departure from the other groups, with 52% lower odds of being included

in the follow-up.

Turning to immigrant class of admission, we see that immigrants entering with

numerically-limited relative-of-U.S. citizen visas, diversity visas, and “other” visas were more

likely to be in R2 than those holding other kinds of visas. Thus the odds of inclusion were 24%

greater for those holding a numerically-limited citizen-family visa, 18% greater for those on a

diversity visa, and 28% greater for those in the residual “other” visa category. It is not

immediately clear why those on numerically-limited citizen-family visas, diversity visas, or other

visas were more likely to respond to the R2 survey. These effects certainly cannot be attributed to

differences in the intent to stay in the United States, since that variable was separately controlled

in the equation. The fact that those admitted with a legalization visa were neither more nor less

likely than others to complete the R2 survey once again suggests that there was little

self-selection out of the panel by persons with prior undocumented experience.

Finally religious affiliation does not systematically predict inclusion in R2, except that

the odds of response were 25% greater for those who professed no religion at all. Likewise, the

categories for frequency of religious service attendance displayed no significant effects. Thus

neither religion nor religiosity seems to have affected response probabilities on Round 2. The fact

that the coefficient for Muslims here is not significant suggests that if anti-Muslim sentiment had an

effect on response rates, it was expressed regionally rather than on the basis of religious belief per se,

with Muslim immigrants from the Middle East, North Africa, and South Asia being more

visible and bearing the brunt of the effect and those from the Balkans or Caucasus largely

escaping it.

THE EFFICACY OF NONRESPONSE WEIGHTING

Using the results shown in Table 1 we inserted the R1 characteristics of each respondent

observed into the estimated equation to generate a predicted probability of inclusion in R2. We

then took the inverse of this probability to derive weights to correct for nonresponse bias.

In order to assess the efficacy of the correction we undertook a counterfactual analysis using the

data from R1. First we applied design weights to the full R1 sample to derive unbiased estimates

of the population parameters. Then we eliminated R2 non-respondents from the R1 data and

derived parameter estimates from this reduced sample using design weights alone, thereby

creating point estimates uncorrected for the process of non-response observed on R2. We then

re-estimated the parameters after applying the nonresponse weights in addition to the design

weights to derive corrected estimates. Finally we subtracted the unbiased estimates from the

uncorrected and corrected estimates and compared the resulting errors to

assess the efficacy of our correction procedure.
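
The steps of the counterfactual analysis can be sketched as follows. The ten cases are invented, the field names (`design_w`, `p_resp`, `in_r2`) are hypothetical, and the response probabilities are set equal to the group response rates so that the correction works exactly; with real data the correction would be only approximate.

```python
def weighted_share(cases, weight_fn):
    """Weighted proportion of cases with y == 1."""
    total = sum(weight_fn(c) for c in cases)
    return sum(weight_fn(c) for c in cases if c["y"] == 1) / total

# Invented R1 cases. y = 1 marks the group with the higher response rate.
# design_w: design weight; p_resp: predicted probability of an R2 response;
# in_r2: whether the case was actually re-interviewed.
r1 = (
    [{"y": 1, "design_w": 1.0, "p_resp": 0.8, "in_r2": i < 4} for i in range(5)]
    + [{"y": 0, "design_w": 1.0, "p_resp": 0.4, "in_r2": i < 2} for i in range(5)]
)
r2_only = [c for c in r1 if c["in_r2"]]

# "True": full R1 sample with design weights.
true_est = weighted_share(r1, lambda c: c["design_w"])
# "Biased": R2 respondents only, design weights alone.
biased_est = weighted_share(r2_only, lambda c: c["design_w"])
# "Corrected": R2 respondents, design weight times nonresponse weight.
corrected_est = weighted_share(r2_only, lambda c: c["design_w"] / c["p_resp"])

error_biased = biased_est - true_est
error_corrected = corrected_est - true_est
```

Here the uncorrected estimate overstates the share of the high-response group, while the corrected estimate recovers the full-sample value.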

Table 2 presents the results of this exercise. The first column shows variable values

estimated using the full R1 sample with design weights (the “true” values). The second column

shows values estimated using design weights for the R1 sample after R2 non-respondents were

removed (i.e. the “biased” values: those that would be obtained if the nonresponse process

observed on R2 had occurred on R1). The third column takes the estimates of column two and

applies the nonresponse weights (yielding “corrected” values: those that would be achieved by

applying the proposed weighting scheme). The final columns show the error values for

computations based on the “biased” and “corrected” values, in column (4) subtracting column (1)

from column (2) (i.e. biased minus true) and in column (5) subtracting column (1) from column

(3) (i.e. corrected minus true).

TABLE 2 ABOUT HERE

Consider the first panel, which displays the gender distribution achieved when using the

true, biased, and corrected values. From the model in Table 1 we know that females were more

likely to respond than males, with their odds of response being 22% greater than those of men.

Hence, when the design weights are applied to the R1 sample reduced by the R2 nonresponse

process we observe an over-representation of women and an under-representation of men.

Whereas the “true” distribution derived when design weights are applied to the full sample

consists of 56.4% women and 43.6% men, the selection-reduced sample yields estimates of

58.8% women and 41.2% men, clearly overstating the presence of women.

When the nonresponse weights are applied, however, the distribution moves much closer

to the “true” distribution, yielding an estimate of 56.9% women and 43.1% men. Although still

not equal to the true value, the overestimate of women has been reduced from 2.4 to 0.5 points

and the underestimate of men of necessity simultaneously shrank from -2.4 to -0.5 points (see

columns (4) and (5)), a clear improvement in accuracy. Whereas before the correction for

nonresponse the estimated percentage of women in the reduced R1 sample was significantly

different from that computed from the full R1 sample (p<0.001) once the weights were applied

the difference was no longer close to statistical significance.

In Columns (4) and (5) those errors that simple t-tests reveal to be significant departures

from true values (p<0.05) are marked with asterisks. For any variable (e.g. education), the t-tests

are not independent of one another, since errors in one category will affect values in the other

categories. In the prior example, for instance, a 2.4-point overestimate of women necessarily implies a

2.4-point underestimate of men. This effect becomes less obvious as the number of categories

increases, but the principle is the same. Nonetheless each asterisk indicates a significant error in

the estimate of that one single parameter.

As can be seen, Column (4) is riddled with asterisks, 75 to be exact, indicating numerous

significant differences from true values in point estimates based on the selection-reduced sample

uncorrected for nonresponse. In column (5), however, we see that the number of asterisks has

been dramatically reduced by applying the nonresponse weights, falling to just ten; and four of

these are for point estimates of the interviewer-assigned skin color rating, an error-prone

subjective judgment to begin with. Even here, however, the mean skin color rating is identical

across all estimates. At the bottom of Columns (4) and (5) we compute total error by summing

the absolute values of all departures from the true value across all variables, yielding figures of

138.2 and 49.6 for the biased and corrected estimates, respectively. In other words, application

of the nonresponse weights has reduced the total error by 64% and left very few significant

departures from true parameters, suggesting the weights are indeed effective in countering biases

in the data introduced by nonresponse.
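
The error summary can be reproduced with a short Python sketch. It uses only figures quoted in the text (the gender panel and the two reported totals); the function names are hypothetical.

```python
def total_abs_error(estimates, truth):
    """Sum of absolute departures from the benchmark values."""
    return sum(abs(e - t) for e, t in zip(estimates, truth))

def pct_reduction(before, after):
    """Percentage reduction from 'before' to 'after'."""
    return (before - after) / before * 100.0

# Gender panel from the text: (percent women, percent men).
true_vals = [56.4, 43.6]
biased_vals = [58.8, 41.2]
corrected_vals = [56.9, 43.1]

gender_err_biased = total_abs_error(biased_vals, true_vals)
gender_err_corrected = total_abs_error(corrected_vals, true_vals)

# Totals across all variables, as reported in the text.
overall_reduction = pct_reduction(138.2, 49.6)
```

For the gender panel alone the summed absolute error falls from about 4.8 to about 1.0 points, and the reported totals imply an overall reduction of about 64%.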

The foregoing constituted a counterfactual analysis that examined the effect that

nonresponse would have on R1 estimates if the baseline survey had been subject to the same

process of nonresponse as observed in the follow-up. We cannot perform a comparable analysis

on R2 data since we cannot derive a benchmark of “true” R2 values. We can, however, assess

what effect weighting or not weighting for nonresponse might have on the measurement of

trends between R1 and R2. Table 3 thus shows values of variables included on both survey

rounds estimated with and without nonresponse weights to discern how different trends would be

in the absence of correcting for nonresponse. Column (1) presents values estimated circa 2003

using the full R1 survey; Column (2) presents values of the same variables estimated circa 2008

without using nonresponse weights; Column (3) repeats the foregoing estimation with

nonresponse weights; and columns (4) and (5) show the 2003-2008 trends that result from using

and not using nonresponse weights for the R2 estimates. Statistically significant differences

(p<0.05) between the latter two columns are indicated with an asterisk.

TABLE 3 ABOUT HERE

A number of trends in variable values do not seem to be significantly affected by

application or non-application of nonresponse weights. Trends over time are not statistically

different, for example, when using weighted or unweighted R2 estimates for occupational status,

overall health insurance coverage, private health insurance coverage, source of private health

insurance coverage, coverage by non-U.S. health insurance, coverage by Medicaid, or total

household income.

Although the differences are generally small, we nonetheless observe significant

differences across many other variables. Thus in the absence of nonresponse weighting we would

underestimate the increase in the percentage of respondents reporting poor health, as well as the

increase in the percentage registered for Medicare. In contrast, we would overestimate the increase

in the percentage married as well as the percentage aged 35-44, 45-54, and 65+, the percentage

Hispanic, and the percentage of homeowners. Likewise we would overestimate the decrease in

the percentage Buddhist, the percentage aged <25 and 25-34 as well as the percentage Asian.

More seriously, in the absence of weighting for nonresponse we would mistakenly report a

decrease in the percentage of respondents living in New York, an increase in the percentage of

Catholics, a decrease in the percentage of Muslims, an increase in the percentage with no religious

affiliation, and a decrease in the percentage retired. It is thus clear that reliance on unweighted

estimates would in many cases lead to incorrect conclusions. We have therefore made

nonresponse weights available on the NIS website and recommend their use in deriving

parameter estimates using R2 data.
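
The kinds of distortion described above can be made explicit with a small classification rule. The following is a sketch under our own assumptions: the function name and labels are hypothetical, the change estimated with design and nonresponse weights is treated as the reference, and the four example rows are copied from Columns (4) and (5) of Table 3.

```python
def classify_trend_bias(change_design_only, change_with_nonresponse):
    """Describe how a trend estimated with design weights alone departs
    from the same trend estimated with design and nonresponse weights."""
    design_only, reference = change_design_only, change_with_nonresponse
    if design_only == 0 or reference == 0:
        return "zero change under one scheme"
    if design_only * reference < 0:
        # The two estimates disagree about the sign of the trend.
        return "wrong direction"
    direction = "increase" if reference > 0 else "decrease"
    if abs(design_only) > abs(reference):
        return f"overestimated {direction}"
    if abs(design_only) < abs(reference):
        return f"underestimated {direction}"
    return "no difference"


# (variable, Column (4) change, Column (5) change) from Table 3.
examples = [
    ("Poor health", 2.3, 2.7),
    ("Married or in union", 2.5, 1.0),
    ("Buddhist", -0.7, -0.2),
    ("New York", -1.4, 0.6),
]
for name, col4, col5 in examples:
    print(f"{name}: {classify_trend_bias(col4, col5)}")
```

The rule says nothing about statistical significance; in Table 3 that judgment is conveyed separately by the asterisks.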

The use of weights necessarily decreases the efficiency of estimation by introducing an

additional source of variation beyond sampling error. The loss of efficiency associated with

the use of weights is indicated by computing the design effect, which is the ratio of the variance

of a weighted estimate to that which would have been achieved using simple random sampling

(and hence no weights). Table 4 presents design effects associated with the three weighting

schemes employed in Table 3: the “true” R1 parameter estimates achieved using design weights

alone; the “biased” R2 estimates achieved by using design weights but no correction for

nonresponse; and the “corrected” R2 estimates achieved by applying weights for both design and

nonresponse.

TABLE 4 ABOUT HERE

Comparing the design effects in the third column with those in the first and second

columns we see that applying nonresponse weights has a very modest effect. Well-designed

surveys generally have design effects in the range of 1.0 to 3.0. When weights to correct for the

stratified sampling design of the NIS are applied, the average design effect across all variables in

the table is 1.33, and when weights for nonresponse are added to the weighting scheme the

design effect rises to just 1.43, a relatively small effect. In other words, weighting the data to

correct for nonresponse entails little loss of efficiency in estimation, again underscoring the

efficacy of the proposed correction.
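
For readers who wish to gauge the efficiency cost of a set of weights themselves, one common shortcut is Kish's approximation, which depends only on the variability of the weights. This is not necessarily how the design effects in Table 4 were computed; the sketch below assumes Kish's formula, uses hypothetical toy weights, and takes the two average design effects (1.33 and 1.43) from the text.

```python
def kish_design_effect(weights):
    """Kish's approximate design effect from unequal weighting:
    deff = n * sum(w^2) / (sum(w))^2, which equals 1 + CV(w)^2."""
    n = len(weights)
    total = sum(weights)
    total_of_squares = sum(w * w for w in weights)
    return n * total_of_squares / (total * total)


def effective_sample_size(weights):
    """Nominal sample size deflated by the approximate design effect."""
    return len(weights) / kish_design_effect(weights)


# Hypothetical toy weights, for illustration only.
equal_weights = [1.0, 1.0, 1.0, 1.0]
unequal_weights = [1.0, 1.0, 2.0, 4.0]
print(kish_design_effect(equal_weights))    # 1.0
print(kish_design_effect(unequal_weights))  # 1.375

# Average design effects reported in the text: 1.33 with design weights
# alone and 1.43 with design and nonresponse weights. At a fixed sample
# size, standard errors scale with the square root of the design effect.
se_inflation = (1.43 / 1.33) ** 0.5
print(f"Relative increase in standard errors: {se_inflation - 1:.1%}")
```

Under these figures the added nonresponse adjustment inflates standard errors by roughly 4 percent, which is consistent with the conclusion that the loss of efficiency is small.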

CONCLUSION

Between the baseline (R1) and follow-up (R2) samples of the New Immigrant Survey the

response rate dropped from 69% to 46%, resulting in the loss of 54% of respondents from the

longitudinal database. Here we conducted a detailed analysis to assess the implications of this

loss to follow-up for the reliability and validity of estimates derived from the R2 data. Very

clearly the reduction of sample size means that parameters estimated using R2 data will be less

precise and reliable, thus increasing the likelihood of Type II errors (failing to confirm

hypotheses that are, in fact, true but undetectable because of a lack of statistical power). The

decline in sample size normally will not increase the likelihood of making a Type I error

(mistakenly concluding that a hypothesis is true when it is not), however, and in this sense the

effect of nonresponse is conservative.

An elevated rate of nonresponse, however, also increases the potential for bias in

estimates based on R2 data. Prior work has shown that the degree of bias is not a given, however.

The size and direction of the bias introduced by nonresponse is inversely related to the response

rate and directly related to the size of the correlation between the response probability and

variables of interest. As a consequence, there is no universal level of bias that can be assigned to

a dataset owing to nonresponse. In practice, the degree of bias will vary from topic to topic

depending on the variables under analysis and the degree of their correlation with the likelihood

of response.
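
The relationship just described is often summarized by a standard approximation from the response-propensity literature (see, for example, the discussion in Bethlehem 2002). The notation below is ours rather than the survey's: $\bar{y}_r$ is the respondent mean of a variable $y$, $p$ is the individual response probability with mean $\bar{p}$, and $\rho_{py}$ is the correlation between $p$ and $y$.

```latex
\operatorname{Bias}(\bar{y}_r)
  \;\approx\; \frac{\operatorname{Cov}(p, y)}{\bar{p}}
  \;=\; \frac{\rho_{py}\,\sigma_p\,\sigma_y}{\bar{p}}
```

The denominator shows why bias tends to shrink as the average response rate rises, and the numerator shows why a variable that is uncorrelated with the probability of response incurs essentially no bias even when the response rate is low.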

In our analysis, we took advantage of the wealth of data available from the R1 survey to

identify which variables observed in the baseline sample were, indeed, associated with the

probability of successfully completing an R2 interview. Our estimated model predicting response

is reassuring in that many variables likely to be of interest to immigration researchers were

unrelated to the likelihood of response, including age, marital status, race/ethnicity, English

ability, foreign language skills, current health status, health status before coming to the United

States, current employment status, hours worked, hourly wages, union membership, whether a

job was obtained before achieving permanent residence, skin color, months of prior U.S.

experience, prior undocumented experience, or frequency of religious attendance.

The likelihood of completing an R2 interview was not random, however. According to

our logistic regression estimates, the odds of inclusion proved to be greater for women,

professionals, homeowners, Catholics, persons holding numerically-limited relative-of-U.S.-

citizen visas, diversity visas, legalization, and “other” visas, those professing no religious

affiliation, persons interviewed in the South Atlantic region, and those from households reporting

a negative net worth and incomes between $53,000 and $95,000. The odds were lower for persons

reporting their health to be worse than a year ago, born in the Middle East and North Africa or

Southeast Asia and the Pacific, those interviewed in New York state, and respondents intending

to live in the U.S. for the rest of their lives.

Using the logistic regression model, we inserted variable values observed for each R1

respondent to generate predicted probabilities of inclusion in the R2 survey and then computed

nonresponse weights by taking the inverse of the estimated response probability. We then

undertook a counterfactual analysis to test the efficacy of our weighting scheme by removing R2

non-respondents from the R1 data and applying weights to the remaining R1 data to observe how

close weighted estimates of variable values came to the actual values computed from the entire

R1 sample. We found that unweighted parameter estimates based on the reduced R1 sample

displayed numerous statistically significant discrepancies from the “true” values computed from

the full R1 sample. In other words, if the same selective pattern of nonresponse observed on R2

were to have affected the R1 data, many biased estimates would result.
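
The logic of the weighting and of the counterfactual test can be sketched in a few lines. Everything in the example is hypothetical: the toy sample, the two response probabilities (0.8 and 0.4), and the function names are ours, and design weights are set equal for simplicity. Only the steps themselves follow the text: convert a logistic linear predictor to a response probability, invert it to obtain a weight, and compare weighted respondent estimates with the full-sample value.

```python
import math


def response_probability(linear_predictor):
    """Inverse logit: predicted probability of responding at R2."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def nonresponse_weight(linear_predictor):
    """Inverse of the predicted response probability."""
    return 1.0 / response_probability(linear_predictor)


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical linear predictors giving response probabilities 0.8 and 0.4.
high = math.log(0.8 / 0.2)
low = math.log(0.4 / 0.6)

# Toy baseline sample of (y, linear predictor): half have y = 1, half y = 0.
full_sample = [(1, high)] * 5 + [(0, low)] * 5
# Respondents retained in proportion to their response probabilities.
respondents = [(1, high)] * 4 + [(0, low)] * 2

true_mean = sum(y for y, _ in full_sample) / len(full_sample)
biased_mean = sum(y for y, _ in respondents) / len(respondents)
corrected_mean = weighted_mean(
    [y for y, _ in respondents],
    [nonresponse_weight(xb) for _, xb in respondents],
)
print(f"true {true_mean:.3f}, biased {biased_mean:.3f}, "
      f"corrected {corrected_mean:.3f}")
```

In this stylized case the correction is exact because the response probabilities are known; with estimated probabilities, as in the survey, the weighted estimate can only approximate the full-sample value.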

We also found, however, that when nonresponse weights were applied, total error was

reduced by 64% and that statistically significant bias was eliminated from the vast majority of

point estimates. Moreover, in the few cases where significant differences persisted the absolute

value of the discrepancy was generally small and unlikely to affect overall conclusions. Finally,

when we turned to the R2 data and inspected trends in variables measured on both rounds of the

survey using weighted and unweighted estimates we found that the estimated trends were often

statistically different from one another, underestimating increases in variable values in two cases,

overestimating increases in six cases, overestimating decreases in four cases, and mistakenly

detecting nonexistent increases or decreases in five cases.

Although the size of the bias in measuring trends was in most instances small, we

nonetheless recommend using nonresponse weights for computing point estimates from R2 data

and to this end have made the weights available on the NIS website. Although the precision of

estimates based on R2 data may be lower because of the smaller sample size, the increase in the

design effect attributable to the application of nonresponse weights is quite small. In the end, we

conclude that parameter estimates computed using both design and nonresponse weights should

produce valid and unbiased inferences about the progress of new immigrants in the United

States.

REFERENCES

Bethlehem, Jelke. 2002. “Weighting Nonresponse Adjustments Based on Auxiliary

Information.” Pp. 275-88 in Survey Nonresponse, eds. R. M. Groves, D. A. Dillman, J.

L. Eltinge and R. J. A. Little. New York: John Wiley & Sons.

Czajka, John L. 2013. “Can Administrative Records Be Used to Reduce Nonresponse Bias?”

Annals of the American Academy of Political and Social Science 645: 171-184.

Lessler, Judith T., and William D. Kalsbeek. 1992. Nonsampling Error in Surveys. New York:

John Wiley & Sons.

Massey, Douglas S., and Chiara Capoferro. 2008. “The Geographic Diversification of U.S.

Immigration.” Pp. 25-50 in Douglas S. Massey, ed., New Faces in New Places: The

Changing Geography of American Immigration. New York: Russell Sage.

Massey, Douglas S., and Magaly Sánchez. 2010. Brokered Boundaries: Creating Immigrant

Identity in Anti-Immigrant Times. New York: Russell Sage Foundation.

Massey, Douglas S., and Roger Tourangeau. 2013. “Where Do We Go from Here?

Nonresponse and Social Measurement.” Annals of the American Academy of Political

and Social Science 645: 222-236.

Olson, Kristen. 2013. “Paradata for Nonresponse Adjustment.” Annals of the American

Academy of Political and Social Science 645: 142-170.

Pew Research Center. 2006. America's Immigration Quandary: No Consensus on Immigration

Problem or Proposed Fixes. Washington, DC: Pew Research Center.

Pew Hispanic Center. 2007. The 2007 National Survey of Latinos: As Illegal Immigration

Issue Heats Up, Hispanics Feel a Chill. Washington, DC: Pew Research Center.

Peytchev, Andy. 2013. “Consequences of Survey Nonresponse.” Annals of the American

Academy of Political and Social Science 645: 88-111.

Schoeni, Robert F., Frank Stafford, Katherine A. McGonagle, and Patricia Andreski. 2013.

“Response Rates in National Panel Surveys.” Annals of the American Academy of

Political and Social Science 645: 60-87.

Smith, Tom W., and Jibum Kim. 2013. “An Assessment of the Multi-level Integrated Database

Approach.” Annals of the American Academy of Political and Social Science 645: 185-

221.

Warren, Robert, and John Robert Warren. 2013. “Unauthorized Immigration to the United

States: Annual Estimates and Components of Change, 1990-2010.” International

Migration Review 47(3):296-329.

Table 1. Logistic regression model used to generate nonresponse weights for full

sample of 8,573 respondents.

__________________________________________________________________________

Independent Variables Coefficient Standard Error

Demographic Background Age 0.030*** 0.011

Age squared -0.0003** 0.0001

Female 0.201*** 0.053

No. Children in Household 0.047** 0.021

No. Children living outside of US 0.095* 0.051

Marital Status Never Married-Not in Union ---- ----

Separated-Divorced-Widowed -0.040 0.105

Married or in Union 0.095 0.069

Race/Ethnicity Non-Hispanic White ---- ----

Non-Hispanic Asian 0.222 0.140

Non-Hispanic Black 0.216 0.142

Non-Hispanic Other 0.137 0.253

Hispanic 0.276** 0.135

Years of Education <6 years ---- ----

6-11 years 0.214** 0.095

12 Years 0.283*** 0.108

13-15 years 0.353*** 0.109

16+ Years 0.426*** 0.110

English Ability Understand Not at All ---- ----

Understand Not Well 0.036 0.081

Understand Well -0.120 0.090

Understand Very Well -0.287*** 0.097

Current Health Status Poor ---- ----

Fair 0.208 0.213

Good 0.128 0.207

Very good 0.085 0.210

Excellent 0.048 0.211

Health Compared to Year Ago About the Same ---- ----

Better -0.096 0.069

Worse -0.210* 0.109

Health Before Coming to U.S. About the Same ---- ----

Better 0.123* 0.065

Worse 0.148 0.092

Country/Region of Birth English Speaking Nations ---- ----

Western Europe -0.137 0.201

Eastern Europe 0.080 0.164

Central Asia 0.420 0.293

Middle East and North Africa -0.460** 0.183

Sub-Saharan Africa -0.202 0.200

South Asia -0.305 0.204

Southeast Asia and Pacific -0.375* 0.199

East Asia -0.517** 0.204

Mexico -0.210 0.199

Other Latin America/Caribbean -0.239 0.186

Place of Interview California ---- ----

Florida 0.005 0.104

Illinois 0.130 0.111

New Jersey 0.047 0.103

New York -0.237*** 0.084

Texas 0.023 0.094

New England 0.256** 0.105

Middle Atlantic -0.004 0.106

South Atlantic 0.250** 0.104

East South Central -0.062 0.256

East North Central -0.089 0.124

West North Central 0.292* 0.157

West South Central 0.103 0.345

Mountain -0.153 0.120

Pacific -0.139 0.127

Non-Continental US territories 0.822 1.232

Current Employment Working ---- ----

Unemployed and looking -0.209 0.233

Temporarily laid off -0.645* 0.338

Disabled -0.187 0.342

Retired -0.259 0.271

Homemaker -0.250 0.239

Other 0.350 0.240

Occupation Laborers and Helpers ---- ----

Not Working 0.154 0.307

Service Workers -0.153 0.133

Operatives -0.197 0.143

Craft Workers -0.121 0.157

Administrative Support Workers -0.044 0.160

Sales Workers -0.205 0.154

Technicians 0.081 0.333

Managerial 0.168 0.171

Professionals 0.127 0.147

Other 0.147 0.280

When Job Obtained Not Working ---- ----

Job before LPR 0.044 0.219

Job after LPR 0.230 0.220

Total Household Income Zero -0.087 0.076

1 to <1800 -0.001 0.103

1800 to <6500 -0.018 0.089

6500 to <23784 ---- ----

23784 to <52734 0.028 0.077

52734 to <95000 0.155 0.095

95000 to <132000 -0.109 0.139

>=132000 -0.022 0.143

Missing cases -0.547*** 0.198

Darkness of Skin Color Skin Color Rating 0.013 0.015

Skin Color Missing 0.103 0.080

Net Worth Negative 0.173 0.118

Zero -0.064 0.081

1 to <10,000 -0.083 0.082

10,000 to <50,000 ---- ----

50,000 to 200,000 -0.005 0.088

>=200,000 0.121 0.107

Missing -0.097 0.160

Property Home Owner 0.143** 0.071

Immigrant Class of Admission Rel. of Citizen-Unlimited ---- ----

Rel. of Citizen-Limited 0.215** 0.109

Relative of LPR 0.078 0.154

Employment 0.062 0.084

Diversity 0.165* 0.092

Refugee/Asylee/Parolee -0.029 0.112

Legalization 0.193 0.119

Other 0.250*** 0.095

Prior Immigrant Experience Formerly Undocumented 0.070 0.085

Future Intentions Intends to Live in US Rest of Life 0.267** 0.111

Intends Missing 0.200* 0.108

Religious Affiliation Protestant ---- ----

Catholic 0.114 0.071

Orthodox 0.038 0.095

Muslim 0.096 0.121

Jewish 0.296 0.221

Buddhist -0.005 0.144

Hindu 0.021 0.140

No Religion 0.223** 0.101

Other Religion -0.248 0.204

Frequency of Religious Attendance Never ---- ----

Sporadically 0.005 0.085

Regularly 0.069 0.094

Frequently 0.086 0.079

Very Frequently -0.085 0.120

Constant -1.259*** 0.451

LR chi2(115) 352.480***

Log likelihood -5731.576

Pseudo R2 0.030

Observations 8,573

_____________________________________________________________________________
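
Because Table 1 reports logit coefficients, readers may find it helpful to convert them to odds ratios. The sketch below does so for four rows; the coefficient and standard error pairs are copied from Table 1, while the function names and the selection of rows are ours and purely illustrative.

```python
import math

# (coefficient, standard error) pairs copied from Table 1.
table1_rows = {
    "Female": (0.201, 0.053),
    "Home Owner": (0.143, 0.071),
    "New York": (-0.237, 0.084),
    "Middle East and North Africa": (-0.460, 0.183),
}


def odds_ratio(coefficient):
    """Multiplicative change in the odds of an R2 response."""
    return math.exp(coefficient)


def z_statistic(coefficient, standard_error):
    """Ratio of a coefficient to its standard error."""
    return coefficient / standard_error


for name, (b, se) in table1_rows.items():
    print(f"{name}: odds ratio {odds_ratio(b):.3f}, "
          f"z = {z_statistic(b, se):.2f}")
```

An odds ratio above one indicates higher odds of completing an R2 interview relative to the reference category, and a value below one indicates lower odds, holding the other variables in the model constant.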

Table 2. Estimated values of selected variables from Round 1 of the New Immigrant

Survey under three conditions: Full R1 sample with design weights, R1 sample

without missing R2 cases and design weights, and R1 sample without R2 missing

cases and weights for design and nonresponse.

_____________________________________________________________________________

Columns:

(1) “True”: full R1 sample with design weights

(2) “Biased”: R1 sample without missing R2 cases, with design weights

(3) “Corrected”: R1 sample without missing R2 cases, with weights for design and nonresponse

(4) Error before weighting for nonresponse: (2) minus (1)

(5) Error after weighting for nonresponse: (3) minus (1)

Variable (1) (2) (3) (4) (5)

Gender Female 56.4 58.8 56.9 2.4* 0.5

Male 43.6 41.2 43.1 -2.4* -0.5

Age at Interview <25 11.6 10.7 12.2 -0.9* 0.6

25 to 34 35.0 34.8 34.6 -0.2 -0.4

35 to 44 25.3 27.3 25.3 2.0* 0.0

45 to 54 13.8 14.8 14.0 1.0* 0.2

55 to 64 7.9 7.2 7.4 -0.7* -0.5

>=65 6.5 5.2 6.4 -1.3* -0.1

Education < 6 years 10.3 9.4 10.4 -0.9* 0.1

6-11 years 25.7 25.9 25.8 0.2 0.1

12 years 16.4 16.0 16.2 -0.4 -0.2

13-15 years 19.7 19.5 19.4 -0.2 -0.3

16+ years 27.8 29.1 28.2 1.3* 0.4

Children in Household No Children 48.3 46.3 48.8 -2.0* 0.5

1 Child 22.9 22.9 22.6 0.0 -0.3

2 Children 17.6 18.7 18.0 1.1* 0.4

3+ Children 11.2 12.1 10.6 0.9* -0.6

Children Outside US No Children 92.9 92.1 92.6 -0.8* -0.3

1 Child 4.6 4.9 4.7 0.3 0.1

2 Children 1.8 2.2 1.9 0.4* 0.1

3+ Children 0.7 0.8 0.7 0.1 0.0

Current Health Excellent 34.4 33.6 34.4 -0.8 0.0

Very good 28.5 28.8 28.7 0.3 0.2

Good 27.3 27.6 27.0 0.3 -0.3

Fair 8.3 8.7 8.6 0.4 0.3

Poor 1.4 1.2 1.4 -0.2 0.0

Health Compared to Last U.S. Trip Better 21.3 21.2 21.2 -0.1 -0.1

About the Same 68.8 68.2 68.4 -0.6 -0.4

Worse 9.9 10.6 10.4 0.7 0.5

Health Compared to Year Ago Better 17.0 16.2 16.9 -0.8 -0.1

About the Same 76.3 77.4 76.3 1.1* 0.0

Worse 6.6 6.4 6.8 -0.2 0.2

Religion at Interview Catholic 41.8 44.1 41.8 2.3* 0.0

Orthodox 8.8 8.5 8.4 -0.3 -0.4

Protestant 16.8 16.7 17.3 -0.1 0.5

Muslim 7.1 6.4 7.4 -0.7* 0.3

Jewish 1.3 1.3 1.2 0.0 -0.1

Buddhist 4.3 3.5 3.9 -0.8* -0.4

Hindu 5.6 5.5 6.0 -0.1 0.4

No Religion 12.5 12.6 12.3 0.1 -0.2

Other 1.8 1.3 1.7 -0.5* -0.1

Frequency of Service Attendance Never 18.2 17.9 18.4 -0.3 0.2

Sporadically 17.2 16.3 17.0 -0.9 -0.2

Regularly 13.4 13.9 13.3 0.5 -0.1

Frequently 46.6 47.9 46.6 1.3* 0.0

Very Frequently 4.6 4.0 4.7 -0.6* 0.1

Immigrant Class of Admission Relative of Citizen-Unlimited 49.5 46.8 49.5 -2.7 0.0

Relative of Citizen-Limited 6.4 6.7 6.4 0.3 0.0

Relative of LPR 2.4 2.8 2.5 0.4 0.1

Employment 9.6 9.5 9.4 -0.1 -0.2

Diversity 8.1 8.7 8.3 0.6 0.2

Refugee/Asylee/Parolee 6.6 6.3 6.5 -0.3 -0.1

Legalization 8.0 9.2 8.1 1.2* 0.1

Other 9.4 10.0 9.4 0.6 0.0

Marital Status Never Married-Not in Union 15.4 14.6 15.7 -0.8 0.3

Separated-Divorced-Widowed 8.1 7.3 8.1 -0.8* 0.0

Married or In Union 76.5 78.1 76.2 1.6* -0.3

Prior US Experience Mean Months 64.6 65.8 63.1 1.2 -1.5

Zero to 3 Months 34.5 33.6 34.8 -0.9 0.3

4 Months to 1 Year 3.7 3.8 4.1 0.1 0.4*

1 to 1.5 Years 3.5 3.5 3.4 0.0 -0.1

1.5 to 2 Years 2.4 2.4 2.3 0.0 -0.1

2 to 3 Years 6.2 6.1 6.3 -0.1 0.1

3 to 4 Years 5.4 5.1 5.3 -0.3 -0.1

4 to 5 Years 4.7 4.6 4.6 -0.1 -0.1

5 to 10 Years 17.1 16.8 16.7 -0.3 -0.4

10 to 15 Years 13.8 14.8 14.0 1.0* 0.2

>15 Years 8.9 9.2 8.5 0.3 -0.4

Undocumented Experience Formerly Undocumented 19.0 21.1 19.3 2.1* 0.3

Documented-No Prior Experience 81.0 78.9 80.7 -2.1* -0.3

Intends to Live in U.S. Rest of Life Yes 88.5 89.8 88.1 1.3* -0.4

No 11.5 10.2 11.9 -1.3* 0.4

Current Employment Working Now 55.5 55.9 54.3 0.4 -1.2

Unemployed and Looking 16.9 17.1 17.9 0.2 1.0*

Temporarily Laid Off 0.9 0.8 1.0 -0.1 0.1

Disabled 0.9 0.8 0.9 -0.1 0.0

Retired 3.8 3.0 3.6 -0.8* -0.2

Homemaker 17.6 17.6 17.6 0.0 0.0

Other 4.4 4.7 4.6 0.3 0.2

Occupational Status Not Working 43.4 43.1 44.6 -0.3 1.2

Managerial 3.3 3.0 3.1 -0.3 -0.2

Professionals 8.9 9.7 8.9 0.8* 0.0

Technicians 0.6 0.5 0.5 -0.1 -0.1

Sales Workers 5.5 5.0 5.5 -0.5 0.0

Administrative Workers 5.5 5.6 5.4 0.1 -0.1

Craft Workers 5.1 4.8 4.8 -0.3 -0.3

Operatives 8.2 8.3 8.2 0.1 0.0

Laborers and helpers 3.7 3.8 3.5 0.1 -0.2

Service Workers 15.0 15.2 14.6 0.2 -0.4

Other 0.9 0.9 1.0 0.0 0.1

Race/Ethnicity Non-Hispanic White 20.0 20.0 20.5 0.0 0.5

Non-Hispanic Asian 28.9 26.7 28.7 -2.2* -0.2

Non-Hispanic Black 11.0 10.2 10.4 -0.8* -0.6

Non-Hispanic Other 1.1 1.0 1.1 -0.1 0.0

Hispanic 39.0 42.1 39.3 3.1* 0.3

Skin Color Rating Mean 5.0 5.0 5.0 0.0 0.0

0 4.7 4.5 4.4 -0.2 -0.3

1 5.2 4.5 4.5 -0.7* -0.7*

2 11.3 12.1 12.2 0.8* 0.9*

3 17.9 17.6 17.6 -0.3 -0.3

4 16.4 17.7 18.1 -0.4* 1.7*

5 22.4 22.7 22.3 1.3 -0.1

6 8.8 7.8 7.6 -1.0* -1.2*

7 5.5 5.4 5.3 -0.1 -0.2

8 4.1 4.2 4.2 0.1 0.1

9 1.7 1.7 1.8 0.0 0.1

10 2.0 1.8 2.0 -0.2 0.0

Country of Birth English Speaking Nations 3.1 3.0 2.9 -0.1 -0.2

Western Europe 2.4 2.2 2.4 -0.2 0.0

Eastern Europe 9.0 10.0 9.1 1.0* 0.1

Central Asia 0.6 0.9 0.7 0.3* 0.1

Middle East and North Africa 5.2 4.4 5.7 -0.8* 0.5

Sub Saharan Africa 6.2 6.2 6.0 0.0 -0.2

South Asia 9.6 9.0 9.5 -0.6 -0.1

Southeast Asia and Pacific 10.5 10.0 10.6 -0.5 0.1

East Asia 9.0 7.8 8.5 -1.2* -0.5

Mexico 17.5 18.8 17.7 1.3* 0.2

Other Latin America-Caribbean 26.8 27.8 26.8 1.0 0.0

State at Time of Interview California 28.4 28.9 28.7 0.5 0.3

Florida 7.9 8.0 7.8 0.1 -0.1

Illinois 4.8 5.4 4.8 0.6* 0.0

New Jersey 5.8 5.9 6.0 0.1 0.2

New York 11.8 10.1 12.0 -1.7* 0.2

Texas 8.2 8.4 8.1 0.2 -0.1

New England 5.6 6.3 5.7 0.7* 0.1

Middle Atlantic 5.1 4.9 5.0 -0.2 -0.1

South Atlantic 5.7 6.4 5.8 0.7* 0.1

East South Central 0.8 0.7 0.6 -0.1 -0.2

East North Central 3.7 3.4 3.7 -0.3 0.0

West North Central 2.3 2.4 2.2 0.1 -0.1

West South Central 0.5 0.4 0.4 -0.1 -0.1

Mountain 5.5 5.1 5.4 -0.4 -0.1

Pacific 4.0 3.5 3.8 -0.5* -0.2

Non-Continental U.S. Territories 0.0 0.0 0.0 0.0 0.0

English Ability Understand Not at All 16.3 15.9 16.2 -0.4 -0.1

Understand Not Well 26.5 28.6 26.6 2.1* 0.1

Understand Well 25.7 26.2 26.1 0.5 0.4

Understand Very Well 31.4 29.3 31.0 -2.1* -0.4

Interview in English Yes 41.9 39.4 41.1 -2.5* -0.8

No 58.1 60.6 58.9 2.5* 0.8

Ever spoken Another language Yes 94.3 95.1 94.6 0.8* 0.3

No 5.7 4.9 5.4 -0.8* -0.3

Speaks Other Language at Home Yes 82.7 83.5 82.6 0.8 -0.1

No 17.3 16.5 17.4 -0.8 0.1

Currently Covered by Health Insurance Yes 34.3 35.7 34.7 1.4* 0.4

No 65.0 64.0 64.9 -1.0 -0.1

Don’t Know/Refused 0.6 0.3 0.4 -0.3* -0.2*

Covered by Private Health Insurance in U.S. Yes 33.5 34.7 33.7 1.2* 0.2

No 65.9 64.9 65.9 -1.0 0.0

Don’t Know/Refused 0.6 0.4 0.4 -0.2* -0.2

Source of Private U.S. Health Insurance R's current employer 51.4 50.9 50.2 -0.5* -1.2

R's former employer 0.7 0.6 0.8 -0.1 0.1

R's union 0.7 0.8 0.8 0.1 0.1

R's school 0.5 0.4 0.4 -0.1 -0.1

R's parents 1.2 1.0 1.2 -0.2 0.0

Spouse's current employer 35.4 36.4 36.3 1.0 0.9

Spouse's former employer 0.4 0.6 0.6 0.2* 0.2

Spouse's union 0.3 1.3 0.0 1.0* -0.3*

Spouse's school 0.9 1.2 1.2 0.3* 0.3*

Spouse's parents 1.2 6.1 1.1 4.9* -0.1

Someplace else 6.7 6.1 6.6 -0.6 -0.2

Don’t Know/Refused 0.7 0.8 0.9 0.1 0.2

Covered by Public Health Insurance from non-U.S. Country Yes 3.8 3.4 3.4 -0.4* -0.4

No 95.5 96.2 96.1 0.7* 0.6*

DK/RF 0.7 0.4 0.5 -0.3* -0.2*

Covered by Medicare Yes 2.4 2.3 2.5 -0.1 0.1

No 96.7 97.0 96.7 0.3 0.0

DK/RF 0.8 0.7 0.8 -0.1 0.0

Covered by Medicaid Yes 5.9 6.5 6.6 0.6 0.7*

No 92.7 92.2 91.9 -0.5 -0.8

DK/RF 1.4 1.3 1.5 -0.1 0.1

Gross Total Income by Groups

Zero 24.8 22.9 24.6 -1.9* -0.2

1 to <1,800 6.0 6.0 6.2 0.0 0.2

1,800 to <6,500 9.0 8.9 8.9 -0.1 -0.1

6,500 to <23,784 18.1 18.5 18.0 0.4 -0.1

23,784 to <52,734 21.0 22.7 21.3 1.7* 0.3

52,734 to<95,000 10.8 11.8 10.4 1.0* -0.4

95,000 to <132,000 3.0 3.1 3.1 0.1 0.1

>=132,000 2.9 2.9 2.7 0.0 -0.2

Missing 4.4 3.2 4.7 -1.2* 0.3

Net Worth <Zero 5.4 6.1 5.2 0.7* -0.2

Zero 30.9 29.7 31.2 -1.2* 0.3

1 to 10,000 18.7 17.9 18.1 -0.8 -0.6

10,000 to 50,000 15.1 15.5 14.9 0.4 -0.2

50,000 to 200,000 13.7 14.9 13.7 1.2* 0.0

>=200,000 8.1 9.1 8.0 1.0* -0.1

Missing 8.2 6.8 8.8 -1.4* 0.6

Home Owner Yes 23.4 26.3 23.1 2.9* -0.3

No 76.6 73.7 76.9 -2.9* 0.3

Total Error: 138.2 49.6

_____________________________________________________________________________

Table 3. Estimated changes in variable values from Round 1 to Round 2 with and without

weighting for nonresponse.

______________________________________________________________________________

Columns:

(1) R1 data: design weights alone

(2) R2 data: design weights alone

(3) R2 data: design and nonresponse weights

(4) R1 to R2 change: design weights alone

(5) R1 to R2 change: design and nonresponse weights

Variable (1) (2) (3) (4) (5)

Current Health Status Excellent 34.4 23.5 23.8 -10.9 -10.6

Very good 28.5 23.2 23.5 -5.3 -5.0

Good 27.3 31.9 31.1 4.6 3.8

Fair 8.3 17.7 17.5 9.4 9.2

Poor 1.4 3.7 4.1 2.3 2.7*

Marital Status Never Married or in Union 15.4 10.5 11.3 -4.9 -4.1

Separated-Div.-Widowed 8.1 10.5 11.1 2.4 3.0*

Married or in Union 76.5 79.0 77.5 2.5 1.0

Intends to stay in U.S. Permanently Yes 88.5 74.8 74.6 -13.7 -13.9

No 11.5 9.8 10.1 -1.7 -1.4

Don’t Know-Refused 0.0 15.4 15.3 15.4 15.3

State at Time of Interview California 28.4 28.4 28.2 0.0 -0.2

Florida 7.9 7.9 7.8 0.0 0.1

Illinois 4.8 4.8 4.3 0.0 -0.5

New Jersey 5.8 5.7 5.8 -0.1 0.0

New York 11.8 10.4 12.4 -1.4 0.6*

Texas 8.2 8.6 8.3 0.4 0.1

New England 5.6 6.2 5.6 0.4 0.0

Middle Atlantic 5.1 5.2 5.3 0.1 0.2

South Atlantic 5.7 6.2 5.6 0.5 -0.1

East South Central 0.8 1.1 1.0 0.3 0.2

East North Central 3.7 3.9 3.9 0.2 0.2

West North Central 2.3 2.1 1.9 -0.2 -0.4

West South Central 0.5 0.5 0.5 0.0 0.0

Mountain 5.5 5.5 5.8 0.0 0.3

Pacific 4.0 3.4 3.6 -0.6 -0.4

US Territories 0.0 0.1 0.1 0.1 0.1



Religion at Time of Interview Catholic 41.8 43.5 41.3 1.7 -0.5*

Orthodox 8.8 9.2 9.1 0.4 0.3

Protestant 16.8 16.5 16.9 -0.3 0.1

Muslim 7.1 6.2 7.2 -0.9 0.1*

Jewish 1.3 1.2 1.1 -0.1 -0.2

Buddhist 4.3 3.6 4.1 -0.7 -0.2*

Hindu 5.6 5.6 6.0 0.0 0.4

No Religion 12.5 12.0 11.7 -0.5 -0.8

Other 1.8 2.2 2.7 0.4 0.9*

Age at Time of Interview <25 11.6 2.9 3.5 -8.7 -8.0*

25 to 34 35.0 26.4 27.8 -8.6 -7.2*

35 to 44 25.3 33.8 32.0 8.5 6.7*

45 to 54 13.8 19.2 17.8 5.4 4.0*

55 to 64 7.9 9.3 9.0 1.4 1.1

>=65 6.5 8.3 9.9 1.8 3.4*

Race/Ethnicity Non-Hispanic White 20.0 18.7 18.8 -1.3 -1.2

Non-Hispanic Asian 28.9 26.9 28.7 -2.0 -0.2*

Non-Hispanic Black 11.0 9.0 9.4 -2.0 -1.6

Non-Hispanic Other 1.1 2.5 2.8 1.4 1.7

Hispanic 39.0 42.9 40.3 3.9 1.3*

Current Employment Working Now 55.5 70.3 69.2 14.8 13.7

Unemployed-Looking 16.9 7.3 7.3 -9.6 -9.6

Temporarily Laid Off 0.9 1.3 1.3 0.4 0.4

Disabled 0.9 1.2 1.3 0.3 0.4

Retired 3.8 3.3 4.0 -0.5 0.2*

Homemaker 17.6 15.4 15.8 -2.2 -1.8

Other 4.4 1.2 1.3 -3.2 -3.1

Occupational Status Not Working 43.4 27.5 28.7 -16.1 -14.9

Managerial 3.3 5.8 6.0 2.5 2.7

Professionals 8.9 12.6 12.0 3.7 3.1

Technicians 0.6 1.3 1.4 0.7 0.8

Sales Workers 5.5 6.2 6.4 0.7 0.9

Administrative Workers 5.5 7.4 7.4 1.9 1.9

Craft Workers 5.1 5.3 5.3 0.2 0.2

Operatives 8.2 10.1 9.8 1.9 1.6

Laborers and helpers 3.7 3.3 3.1 -0.4 -0.6

Service Workers 15.0 19.3 18.7 4.3 3.7

Other 0.9 1.1 1.2 0.2 0.3

Currently covered by Health Insurance

Yes 34.4 65.2 64.9 30.8 30.5

No 65.0 34.3 34.6 -30.7 -30.4

Don’t Know-Refused 0.6 0.5 0.5 -0.1 -0.1

Covered by Private Health Insurance in U.S. Yes 33.5 67.2 66.1 33.7 32.6

No 65.9 30.1 31.1 -35.8 -34.8


Source of Private U.S. Health Insurance R's Current Employer 51.4 55.0 54.5 3.6 3.1

R's Former Employer 0.7 1.5 1.6 0.8 0.9

R's Union 0.7 0.8 0.9 0.1 0.2

R's School 0.5 0.3 0.4 -0.2 -0.1

R's Parents 1.2 0.5 0.6 -0.7 -0.6

Spouse's Current Employer 35.4 29.5 29.3 -5.9 -6.1

Spouse's Former Employer 0.4 0.6 0.5 0.2 0.1

Spouse's Union 0.3 0.7 0.7 0.4 0.4

Spouse's School 0.9 0.2 0.1 -0.7 -0.8*

Spouse's Parents 1.2 0.0 0.0 -1.2 -1.2

Someplace Else 6.7 9.3 9.8 2.6 3.1


Covered by Public Health Insurance from non U.S. Country Yes 3.8 4.7 4.9 0.9 1.1

No 95.5 94.3 94.0 -1.2 -1.5


Covered by Medicare Yes 2.4 9.2 10.0 6.8 7.6*

No 96.7 88.5 87.6 -8.2 -9.1*

Don’t Know-Refused 0.8 2.3 2.5 1.5 1.7

Covered by Medicaid Yes 5.9 13.6 14.3 7.7 8.4

No 92.7 83.5 82.6 -9.2 -10.1


Gross Total Income Zero 24.8 21.0 21.5 -3.8 -3.3

1 to <1,800 6.0 3.9 3.9 -2.1 -2.1

1,800 to <6,500 9.0 4.2 4.4 -4.8 -4.6

6,500 to <23,784 18.1 12.4 12.2 -5.7 -5.9

23,784 to <52,734 21.0 15.1 14.7 -5.9 -6.3

52,734 to<95,000 10.8 10.7 10.4 -0.1 -0.4

95,000 to <132,000 3.0 5.2 5.0 2.2 2.0

>=132,000 2.9 5.9 5.6 3.0 2.7


Net Worth by Groups <Zero 5.4 7.3 7.0 1.9 1.6

Zero 30.9 16.6 17.7 -14.3 -13.2*

1 to 10,000 18.7 14.1 14.2 -4.6 -4.5

10,000 to 50,000 15.1 11.9 11.7 -3.2 -3.4

50,000 to 200,000 13.7 14.1 13.3 0.4 -0.4

>=200,000 8.1 14.1 13.5 6.0 5.4


Home Owner Yes 23.4 36.4 34.4 13.0 11.0*

No 76.6 63.6 65.6 -13.0 -11.0*

______________________________________________________________________________

Table 4. Estimated design effects for Round 2 weighting schemes.

______________________________________________________________________________

Columns:

(1) R1 data: design weights alone

(2) R2 data: design weights alone

(3) R2 data: design and nonresponse weights

Variable (1) (2) (3)

Demographic Background

Age at Interview 1.258 1.267 1.512

Female 1.342 1.329 1.448

No. Children in Household 1.326 1.330 1.308

No. Children Living Outside of US 1.273 1.297 1.222

Marital Status

Never Married and Not in Union 1.001 1.012 1.142

Separated, Divorced, Widowed 1.155 1.140 1.354

Married or in Union 1.108 1.102 1.255

Race/Ethnicity

Non-Hispanic White 1.350 1.344 1.507

Non-Hispanic Asian 1.303 1.276 1.427

Non-Hispanic Black 1.281 1.242 1.322

Non-Hispanic Other 1.421 1.597 2.073

Hispanic 1.398 1.399 1.452

Years of Education

< 6 years 1.245 1.277 1.482

6-11 years 1.366 1.374 1.470

12 years 1.407 1.416 1.492

13-15 years 1.403 1.378 1.489

16+ years 1.337 1.344 1.409

English Ability

Understand Not at All 1.257 1.264 1.348

Understand Not Well 1.351 1.349 1.375

Understand Well 1.384 1.402 1.507

Understand Very Well 1.385 1.383 1.522

Foreign Language Skills

Ever Spoke Other Language 1.457 1.455 1.675

Speaks Other Language at Home 1.565 1.599 1.756

Current Health Status

Excellent 1.372 1.375 1.486

Very good 1.378 1.384 1.474

Good 1.344 1.340 1.412

Fair 1.268 1.295 1.350

Poor 1.190 1.159 1.493

Health Compared to a Year Ago

Better 1.341 1.297 1.427

About the Same 1.334 1.325 1.463

Worse 1.361 1.423 1.592

________________


Continued

Table 4. Continued.

______________________________________________________________________________

(1) (2) (3)






Health Before Coming to the U.S.

Better 1.344 1.328 1.415

About the Same 1.358 1.362 1.461

Worse 1.452 1.490 1.591

Country/Region of Birth

English Speaking Nations 1.634 1.610 1.684

Western Europe 1.644 1.680 2.003

Eastern Europe 1.193 1.199 1.158

Central Asia 1.094 1.146 0.918

Middle East and North Africa 1.380 1.435 2.002

Sub-Saharan Africa 1.139 1.131 1.151

South Asia 1.138 1.148 1.359

Southeast Asia and Pacific 1.371 1.345 1.486

East Asia 1.317 1.229 1.419

Mexico 1.524 1.553 1.543

Other Latin America/Caribbean 1.374 1.376 1.438

Place of Interview

California 1.374 1.377 1.456

Florida 1.415 1.408 1.467

Illinois 1.285 1.360 1.297

New Jersey 1.273 1.304 1.460

New York 1.258 1.249 1.603

Texas 1.406 1.424 1.427

New England 1.305 1.264 1.244

Middle Atlantic 1.249 1.228 1.314

South Atlantic 1.311 1.337 1.293

East South Central 1.320 1.198 1.231

East North Central 1.351 1.389 1.622

West North Central 1.436 1.390 1.333

West South Central 1.528 1.297 1.150

Mountain 1.628 1.651 1.783

Pacific 1.448 1.376 1.526

Non-continental U.S. Territories 0.802 0.902 0.604

_______________

Continued


Table 4. Continued.

______________________________________________________________________________

(1) (2) (3)






Current Employment

Working Now 1.359 1.372 1.470

Unemployed and Looking 1.291 1.325 1.486

Temporarily Laid Off 1.277 1.357 1.778

Disabled 1.248 1.207 1.352

Retired 1.121 1.058 1.317

Homemaker 1.474 1.517 1.600

Other 1.369 1.426 1.453

Occupation

Not Working 1.360 1.373 1.473

Managerial 1.289 1.275 1.390

Professionals 1.172 1.201 1.164

Technicians 1.569 1.393 1.248

Sales Workers 1.389 1.383 1.633

Administrative Support Workers 1.551 1.529 1.538

Craft Workers 1.481 1.370 1.490

Operatives 1.368 1.349 1.472

Laborers and Helpers 1.362 1.274 1.244

Service Workers 1.356 1.337 1.382

Other 1.319 1.398 1.577

When Job Obtained

Not Working 1.360 1.373 1.473

Job before LPR 1.372 1.365 1.454

Job after LPR 1.284 1.259 1.285

Household Income

Zero 1.265 1.269 1.409

1 to <1,800 1.169 1.177 1.326

1,800 to <6,500 1.154 1.118 1.179

6,500 to <23,784 1.329 1.333 1.401

23,784 to <52,734 1.508 1.505 1.538

52,734 to <95,000 1.547 1.507 1.431

95,000 to <132,000 1.366 1.439 1.498

>=132,000 1.348 1.330 1.277

Missing 1.557 1.577 2.467

Darkness of Skin Color

Skin Color Rating 1.349 1.360 1.457

Skin Color Missing 1.355 1.353 1.445

______________________

Continued


Table 4. Continued.

______________________________________________________________________________

(1) (2) (3)






Net Worth

Negative 1.504 1.490 1.358

Zero 1.288 1.275 1.393

1 to 10,000 1.342 1.293 1.408

10,000 to 50,000 1.486 1.493 1.531

50,000 to 200,000 1.434 1.447 1.436

>=200,000 1.444 1.457 1.367

Missing 1.398 1.409 1.994

Property

Home Owner 1.496 1.491 1.446

Immigrant Class of Admission

Relative of Citizen-Unlimited 1.369 1.422 1.468

Relative of Citizen-Limited 1.101 1.108 1.096

Relative of LPR 1.048 1.069 1.011

Employment 0.718 0.695 0.734

Diversity 0.688 0.717 0.738

Refugee/Asylee/Parolee 1.085 1.074 1.163

Legalization 1.101 1.117 1.032

Other 1.097 1.105 1.092

Prior Immigrant Experience

Formerly Undocumented 1.407 1.425 1.419

Future Intentions

Intends to Live in U.S. for Rest of Life 1.364 1.362 1.433

Missing 1.365 1.367 1.456

Religious Affiliation

Catholic 1.385 1.384 1.449

Orthodox 1.252 1.244 1.281

Protestant 1.356 1.358 1.498

Muslim 1.277 1.323 1.719

Jewish 1.382 1.298 1.320

Buddhist 1.512 1.416 1.631

Hindu 1.057 1.104 1.400

No Religion 1.387 1.378 1.411

Other 1.319 1.168 1.575

______________

Continued

Table 4. Continued.

______________________________________________________________________________

(1) (2) (3)






Frequency of Religious Attendance

Never 1.368 1.346 1.476

Sporadically 1.408 1.393 1.513

Regularly 1.408 1.407 1.436

Frequently 1.356 1.362 1.447

Very Frequently 1.268 1.328 1.656

Average 1.331 1.330 1.430

__________________________________________________________________________
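The design effects in Table 4 can be read as variance inflation factors: a value of 1.430 implies that an estimate behaves roughly as if it came from a simple random sample of n / 1.430 cases. The table does not show how the values were estimated, so the following is only a minimal sketch under stated assumptions. It uses Kish's approximate design effect from unequal weighting, which is a swapped-in textbook formula rather than the method used for Table 4, and the function names and the sample size of 1,000 are hypothetical. Because Kish's approximation can never fall below 1, it cannot reproduce entries such as 0.718 for Employment, which must come from a variable-specific estimator.

```python
def kish_design_effect(weights):
    """Kish's approximate design effect due to unequal weights.

    deff = n * sum(w^2) / (sum(w))^2, which equals 1 when all weights are equal.
    This is an illustrative approximation, not the estimator behind Table 4.
    """
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)


def effective_sample_size(n, deff):
    """Number of simple-random-sample cases with the same precision."""
    return n / deff


# Hypothetical sample of 1,000 cases combined with the average design
# effects reported in the last row of Table 4.
for label, deff in [("R2 design weights alone", 1.330),
                    ("R2 design and nonresponse weights", 1.430)]:
    print(label, round(effective_sample_size(1000, deff), 1))
```

Under these assumptions the move from column (2) to column (3) lowers the effective sample size from about 752 to about 699 per 1,000 respondents, which is the sense in which nonresponse weighting trades some precision for reduced bias.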


Appendix A. Initial specification of a logistic regression model predicting completion of the round two interview for the New Immigrant Survey.

__________________________________________________________________________

Variable Coefficient Standard Error


Demographic Background Age 0.018 0.013

Age squared 0.000 0.000

Female 0.204*** 0.062

No. Children in Household 0.053** 0.024

No. Children living outside of US 0.180*** 0.060

Marital Status Never Married-Not in Union ---- ----

Separated-Divorced-Widowed 0.011 0.121

Married or in Union 0.116 0.081

Race/Ethnicity Non-Hispanic White ---- ----

Non-Hispanic Asian 0.194 0.172

Non-Hispanic Black 0.183 0.170

Non-Hispanic Other 0.106 0.307

Hispanic 0.077 0.169

Years of Education <6 years ---- ----

6-11 years 0.164 0.109

12 years 0.277** 0.124

13-15 years 0.317** 0.124

16+ years 0.396*** 0.128

English Ability Understand Not at All ---- ----

Understand Not Well 0.095 0.095

Understand Well -0.023 0.109

Understand Very Well -0.194 0.127

Interview in English -0.125 0.079

Foreign Language Skills Ever Spoke Other Language -0.032 0.164

Speaks Other Language at Home 0.102 0.102

Current Health Status Poor ---- ----

Fair 0.102 0.246

Good 0.033 0.241

Very good 0.068 0.245

Excellent 0.004 0.246

_________________________

Continued


Appendix A. Continued.

______________________________________________________________________________

Variable Coefficient Standard Error


Health Compared to Year Ago About the Same ---- ----

Better -0.115 0.079

Worse -0.253** 0.123

Health Before Coming to U.S. About the Same ---- ----

Better 0.109 0.073

Worse 0.159 0.104

Country/Region of Birth English Speaking Nations ---- ----

Western Europe -0.149 0.243

Eastern Europe 0.148 0.201

Central Asia 0.309 0.341

Middle East and North Africa -0.477** 0.225

Sub-Saharan Africa -0.148 0.247

South Asia -0.325 0.256

Southeast Asia and Pacific -0.389 0.245

East Asia -0.498** 0.252

Mexico -0.055 0.240

Other Latin America/Caribbean -0.173 0.223

Place of Interview California ---- ----

Florida -0.110 0.122

Illinois 0.043 0.131

New Jersey 0.006 0.120

New York -0.233** 0.099

Texas -0.115 0.111

New England 0.219* 0.121

Middle Atlantic -0.062 0.125

South Atlantic 0.317** 0.125

East South Central -0.064 0.316

East North Central 0.009 0.142

West North Central 0.331* 0.180

West South Central -0.307 0.420

Mountain -0.219 0.138

Pacific -0.158 0.143

Non-Continental US territories -0.085 1.427

__________________

Continued


Appendix A. Continued.

______________________________________________________________________________

Variable Coefficient Standard Error


Current Employment Working ---- ----

Unemployed and looking -0.146 0.312

Temporarily laid off -0.570 0.417

Disabled -0.174 0.416

Retired -0.286 0.352

Homemaker -0.178 0.318

Other 0.108 0.321

Work Situation Belongs to Labor Union -0.029 0.147

Hours Worked per Week -0.003 0.003

Hourly Wage -0.002 0.002

Occupation Laborers and Helpers ---- ----

Not Working 0.508 0.470

Service Workers -0.007 0.160

Operatives -0.166 0.171

Craft Workers -0.062 0.187

Administrative Support Workers 0.057 0.193

Sales Workers -0.019 0.185

Technicians 0.195 0.375

Managerial 0.054 0.208

Professionals 0.410** 0.181

Other 0.567 0.439

When Job Obtained Not Working ---- ----

Job before LPR 0.573 0.395

Job after LPR 0.653 0.397

Total Household Income Zero -0.129 0.092

1 to <1800 -0.056 0.118

1800 to <6500 -0.017 0.103

6500 to <23784 ---- ----

23784 to <52734 0.069 0.087

52734 to <95000 0.247** 0.110

95000 to <132000 -0.101 0.162

>=132000 0.057 0.174

Missing cases -0.620 0.936

Darkness of Skin Color Skin Color Rating 0.013 0.017

Skin Color Missing 0.125 0.091

___________________

Continued

Appendix A. Continued.


_____________________________________________________________________________

Variable Coefficient Standard Error


Net Worth Negative 0.235* 0.132

Zero -0.020 0.093

1 to <10,000 -0.045 0.092

10,000 to <50,000 ---- ----

50,000 to 200,000 0.049 0.100

>=200,000 0.157 0.122

Missing 0.150 0.929

Property Home Owner 0.147* 0.083

Immigrant Class of Admission Rel. of Citizen-Unlimited ---- ----

Rel. of Citizen-Limited 0.317** 0.130

Relative of LPR 0.174 0.172

Employment 0.023 0.099

Diversity 0.207* 0.108

Refugee/Asylee/Parolee 0.000 0.130

Legalization 0.294** 0.141

Other 0.287*** 0.111

Prior Immigrant Experience U.S. Experience in months 0.000 0.001

Formerly Undocumented 0.052 0.102

Future Intentions Intends to Live in US Rest of Life 0.252** 0.124

Intends Missing 0.205* 0.121

Religious Affiliation Protestant ---- ----

Catholic 0.176** 0.082

Orthodox -0.053 0.111

Muslim 0.034 0.141

Jewish 0.359 0.258

Buddhist -0.011 0.165

Hindu 0.039 0.170

No Religion 0.288** 0.118

Other Religion -0.322 0.237

__________________

Continued

Appendix A. Continued.


______________________________________________________________________________

Variable Coefficient Standard Error


Frequency of Religious Attendance Never ---- ----

Sporadically -0.056 0.099

Regularly 0.042 0.109

Frequently 0.049 0.093

Very Frequently -0.095 0.146

Constant -1.481** 0.619

LR chi2(115) 321.800***

Log likelihood -4282.105

Pseudo R2 0.0362

Observations 6,435

_____________________________________________________________________________
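The entries in Appendix A are on the log-odds scale, so each coefficient is easier to interpret once exponentiated into an odds ratio and divided by its standard error to give a Wald z statistic. The sketch below does this for the Female row (0.204, standard error 0.062) and then illustrates one common way to turn a fitted response model into a nonresponse adjustment, namely the inverse of the predicted response probability. The appendix does not show how the released weights were constructed, and it is labeled an initial specification, so the inverse-probability step and all function names here are assumptions for illustration only.

```python
import math


def odds_ratio(coef):
    """Convert a logit coefficient to an odds ratio."""
    return math.exp(coef)


def wald_z(coef, se):
    """Wald z statistic: coefficient divided by its standard error."""
    return coef / se


def response_probability(linear_predictor):
    """Inverse logit: predicted probability of completing the follow-up."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def nonresponse_weight(linear_predictor):
    """Assumed adjustment: inverse of the predicted response probability."""
    return 1.0 / response_probability(linear_predictor)


# Coefficient and standard error for "Female" as reported in Appendix A.
female_coef, female_se = 0.204, 0.062
print(round(odds_ratio(female_coef), 3))         # 1.226
print(round(wald_z(female_coef, female_se), 2))  # 3.29
```

Read this way, the odds of completing the round two interview were about 23 percent higher for women than for men, other variables held constant. Under the assumed inverse-probability scheme, a respondent with a predicted response probability of 0.5 would receive an adjustment factor of 2, to be multiplied by the design weight.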
