disadvantages of linguistic origin—evidence from immigrant literacy scores



Economics Letters 123 (2014) 236–239

Contents lists available at ScienceDirect

Economics Letters

journal homepage: www.elsevier.com/locate/ecolet

Disadvantages of linguistic origin—Evidence from immigrant literacy scores

Ingo E. Isphording ∗

Institute for the Study of Labor (IZA) Bonn, Germany

Highlights

• Estimation of gaps in literacy scores resulting from linguistic origin.
• Unique linguistic data on language differences.
• Cross-national design to control for origin- and destination-effects.
• Sizable disadvantages by linguistic origin, increasing with age at arrival.
• Moderate convergence of linguistically distant immigrants by time of residence.

Article info

Article history:
Received 19 November 2013
Received in revised form 12 February 2014
Accepted 14 February 2014
Available online 22 February 2014

JEL classification: F22, J15, J24, J31

Keywords: Linguistic distance, Literacy, Human capital, Immigrants

Abstract

This study quantifies the disadvantage in literacy skills that arises from the linguistic distance between immigrants' mother tongue and the host country language, combining individual cross-country data on literacy scores with unique information on the linguistic distance between languages.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

The rise of information and communication technology and the associated increase in the demand for skills in literacy and numeracy pose a particular challenge for immigrants from different linguistic backgrounds. Literacy in the destination language, defined as “the ability to understand and employ written information in daily activities, at home, at work and in the community” (OECD, 2000), constitutes a productive trait highly valued in the labor market (Dougherty, 2003), and insufficient levels of literacy lead to significant hurdles for the economic integration of immigrants (Ferrer et al., 2006; Kahn, 2004). Still, despite the importance of literacy and language skills, the literature on their formation remains surprisingly scarce.

∗ Correspondence to: IZA, Schaumburg-Lippe-Str. 5-7, 53113 Bonn, Germany.
E-mail addresses: [email protected], [email protected].

http://dx.doi.org/10.1016/j.econlet.2014.02.013
0165-1765/© 2014 Elsevier B.V. All rights reserved.

Non-native speaking immigrants face the economic decision to acquire a host-country language. The linguistic literature indicates that the costs of language acquisition are associated with the linguistic background of an immigrant. A greater linguistic dissimilarity, or distance, between the mother tongue of an immigrant and the language of the destination country decreases the potential for language transfer, i.e. the application of knowledge of the mother tongue in acquiring the destination country language. To provide an economic interpretation, the linguistic distance reflects the degree of transferability of home country language capital into the destination country, analogous to the imperfect portability of education (Friedberg, 2000).

Linguistic differences are not straightforward to measure, and the linguistic literature mainly comprises qualitative or small-scale quantitative studies. Van der Slik (2010) offers an overview and is a notable exception. A small number of studies have attempted to implement measures that condense linguistic differences into a one-dimensional summary statistic: Chiswick and Miller (1999) define such a measure using classroom assessments of American language students to explain self-reported language fluency; Lohmann (2011) uses grammatical features of languages to explain international trade flows; and Adsera and Pytlikova (2012) use language family relations to explain bilateral migration flows.

Against this background, this study aims to quantify the linguistic barriers in literacy skill formation. Data on literacy scores from the International Adult Literacy Survey (IALS) is combined with a unique measure of the linguistic distance used by Isphording and Otten (2013) to explain bilateral trade flows. It is based on differences between mother tongue and the host country language in terms of pronunciation. Drawn from linguistic research by the German Max Planck Institute for Evolutionary Anthropology, this measure offers a continuous and cardinally interpretable measurement of linguistic differences for any of the world’s languages. Regressing literacy scores on the linguistic distance yields estimates of score differentials with respect to an immigrant’s linguistic origin.

This data setup offers several key advantages that allow for the core contributions of this study. First, the cross-national design of the IALS data allows simultaneously controlling for destination and origin country specific characteristics, which have been omitted in previous studies using national datasets (Chiswick and Miller, 1999; Van der Slik, 2010; Isphording and Otten, 2013). Second, the use of objective literacy scores allows quantifying results previously obtained for subjective measures of language skills, avoiding issues of measurement error in these self-reported indicators. Third, combining this dataset with the innovative measure of linguistic distance broadens national results to an international perspective. Finally, the study specifically addresses the influence of linguistic origin over time of residence and offers additional evidence for the so-called Critical Period hypothesis, which states that the effort necessary for acquiring a language increases with the immigrant’s age at arrival.

2. Material and methods

To assess the magnitude of linguistic barriers in the language acquisition of immigrants, I combine data from two different sources: the public use file of the International Adult Literacy Survey (IALS) and the Automated Similarity Judgment Program (ASJP), a research program by the German Max Planck Institute for Evolutionary Anthropology that aims at explaining the historical development and geographical diversity of languages (Brown et al., 2008).

The IALS offers a unique data source on adults’ literacy skills and socio-economic characteristics over the period from 1994 to 1998 (OECD, 2000). After deleting observations with missing information, the dataset covers 1521 immigrants from 70 sending countries in 9 host countries.1 The dataset offers information on three dimensions of literacy: prose literacy (the knowledge to understand and use information in texts), document literacy (the skills to use information stored in documents such as forms, schedules, tables, etc.) and quantitative literacy (the skill to locate numbers found in printed materials and apply simple arithmetic operations). This direct measurement based on test booklets avoids dealing with substantial degrees of misreporting in typically used self-reported measures of language skills (Charette and Meng, 1994; Dustmann and van Soest, 2001).2

1 Immigrants are defined as individuals not born in the surveyed country. Detailed information on the country of origin is available in Switzerland, the Netherlands, Sweden, Great Britain, Italy, Slovenia, Czech Republic, Finland and Hungary.

The IALS data is augmented with a measure of linguistic distance between the mother tongue and host country language using the information on the first language of an immigrant. The ASJP method to assess language differences relies on the measurement of similarities in pronunciation by a direct comparison of word pairs with the same meaning across different languages. A list of 40 culturally independent words is transcribed in a phonetic script, e.g. the English word mountain is transcribed as maunt3n, while its Spanish counterpart is transcribed as monta5a, with each character in these transcriptions representing a common sound of human communication. Within each word pair of the same meaning between languages, the Levenshtein distance is calculated, i.e. the minimum number of sounds that have to be changed, removed or added to transform the word of one language into the same word in a different language. Table 1 summarizes some computational examples. The average minimum distance between all 40 word pairs is normalized to take into account potential similarities by chance due to shared phonetic inventories, resulting in the final measure of linguistic dissimilarities (Brown et al., 2008).
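The word-pair step described above can be illustrated with a minimal sketch. It implements the standard dynamic-programming Levenshtein distance and applies it to the five transcribed word pairs listed in Table 1. The function and variable names are illustrative and not part of the ASJP software, and the final normalization step is left out because the text does not spell out its formula.

```python
def levenshtein(source: str, target: str) -> int:
    """Minimum number of single-symbol substitutions, deletions and insertions
    needed to turn `source` into `target`."""
    # previous[j] holds the distance between the processed prefix of `source`
    # and the first j symbols of `target`.
    previous = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        current = [i]
        for j, t in enumerate(target, start=1):
            current.append(min(
                previous[j] + 1,              # remove s
                current[j - 1] + 1,           # add t
                previous[j - 1] + (s != t),   # change s into t (free if equal)
            ))
        previous = current
    return previous[-1]


# Word pairs from Table 1 as (Spanish, English) transcriptions.
TABLE_1 = {
    "you": ("tu", "yu"),
    "not": ("no", "nat"),
    "person": ("persona", "pers3n"),
    "night": ("noCe", "nEit"),
    "mountain": ("monta5a", "maunt3n"),
}

distances = {
    meaning: levenshtein(spanish, english)
    for meaning, (spanish, english) in TABLE_1.items()
}
print(distances)
# {'you': 1, 'not': 2, 'person': 2, 'night': 3, 'mountain': 5}
```

The printed values match the distance column of Table 1. Note that the comparison is case-sensitive, since upper- and lower-case characters denote different sounds in the transcription.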

The distances computed by the ASJP are in line with basic intuition on language differences. The smallest distances emerge within the same language family (Germanic languages for English and German, Romance languages for French and Slavic languages for Czech). The smallest nonzero linguistic distance in the present sample relates to Serbian-speaking immigrants in Slovenia, while the largest distance is encountered by Turkish immigrants in the Netherlands.3

To identify systematic disadvantages of linguistic origin in the literacy scores, literacy Y is estimated as a function of linguistic distance LD, years since migration YSM and an indicator for arrival at the age of 12 or older, AgeEntry12, separately for each of the three literacy dimensions:

\[
Y = \beta_0 + \beta_1 \mathit{LD} + \beta_2 \mathit{YSM} + \beta_3 \mathit{AgeEntry12} + \beta_4\, \mathit{LD} \times \mathit{AgeEntry12} + \beta_5\, \mathit{LD} \times \mathit{YSM} + X'\gamma + O'\delta + D'\lambda + \varepsilon. \tag{1}
\]

The interaction term LD × YSM accounts for a convergence over time of residence in literacy scores between native and non-native speakers. LD × AgeEntry12 accounts for an increase in the effect of the linguistic origin by age at arrival, as indicated in the psychobiological literature and referred to as the Critical Period hypothesis (Newport, 2002). Accordingly, the coefficients of the main effects of years since migration and age at entry, β2 and β3, indicate the effects for the subpopulation of native-speaking immigrants with LD = 0. Control variables X consist of gender, individual and parental education, birth cohort and the geographic distance between the origin and destination countries.
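As a reading aid, the partial effects implied by Eq. (1) can be written out. This is a sketch derived only from the specification itself, treating AgeEntry12 as a 0/1 indicator that equals one for arrival at age 12 or older, as in the results table:

```latex
\begin{align*}
\frac{\partial Y}{\partial \mathit{LD}}
  &= \beta_1 + \beta_4\,\mathit{AgeEntry12} + \beta_5\,\mathit{YSM}, \\
\frac{\partial Y}{\partial \mathit{YSM}}
  &= \beta_2 + \beta_5\,\mathit{LD}, \\
Y\big|_{\mathit{AgeEntry12}=1} - Y\big|_{\mathit{AgeEntry12}=0}
  &= \beta_3 + \beta_4\,\mathit{LD}.
\end{align*}
```

Setting LD = 0 in the last two lines leaves only β2 and β3, which is why these main effects describe native-speaking immigrants. Likewise, β1 alone is the effect of linguistic distance for childhood arrivals at the time of arrival (AgeEntry12 = 0, YSM = 0).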

The international design of the IALS allows simultaneously controlling for origin- and destination-fixed effects (O and D) capturing potentially omitted country characteristics, e.g. differences in language acquisition support or selective migration policies favoring skilled immigrants for the receiving country, and differences in media exposure to foreign languages or the quality of the education system for the sending country.4

2 Specific answers to the test booklet do not indicate a literacy level with certainty. Due to the restricted number of questions, individuals with different levels of literacy might still produce the same set of answers. To account for this uncertainty, the IALS data provide 5 different plausible values of literacy scores for every individual. To take this procedure into account, I follow the established method of using the simple average of the 5 plausible values of test scores as the outcome variable. Standard errors are subsequently computed taking into account the replicate weights offered by IALS. This method accounts for the unspecified intra-cluster correlation, yet ignores the stratification of the sampling. Brown and Micklewright (2004) show that this method might produce slightly overstated standard errors in some cases.

3 The complete matrix of linguistic distances can be found in the web appendix, Table 8.

Table 1
Linguistic distance: computational examples.

Word       Spanish    English    Distance
You        tu         yu         1
Not        no         nat        2
Person     persona    pers3n     2
Night      noCe       nEit       3
Mountain   monta5a    maunt3n    5

Source: Brown et al. (2008).
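To make the structure of Eq. (1) concrete, the following sketch builds a design matrix with the two interaction terms and with origin- and destination-country dummies, then fits it by ordinary least squares. Everything in it is an assumption for illustration: the data are synthetic and noise-free (not the IALS data), the country codes and "true" coefficients are invented, the individual controls X are omitted, and plain least squares replaces the replicate-weight procedure used for the published standard errors.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 240

# Purely synthetic covariates, generated only to show the mechanics.
ld = rng.uniform(0.0, 100.0, n)               # linguistic distance LD
ysm = rng.uniform(0.0, 40.0, n)               # years since migration YSM
age12 = rng.integers(0, 2, n).astype(float)   # 1 = arrived at age 12 or older
origin = np.arange(n) % 4                     # hypothetical origin country codes
dest = np.arange(n) % 3                       # hypothetical destination country codes


def dummies(codes: np.ndarray) -> np.ndarray:
    """One 0/1 column per category, dropping the first category as reference."""
    levels = np.unique(codes)[1:]
    return (codes[:, None] == levels[None, :]).astype(float)


# Columns: intercept, LD, YSM, AgeEntry12, LD x AgeEntry12, LD x YSM,
# three origin dummies, two destination dummies.
X = np.column_stack([
    np.ones(n), ld, ysm, age12, ld * age12, ld * ysm,
    dummies(origin), dummies(dest),
])

# Invented coefficients, used only to generate the outcome without noise.
beta_true = np.array([280.0, -0.3, 0.1, 5.0, -0.4, 0.01,
                      4.0, -2.0, 6.0, 3.0, -1.0])
y = X @ beta_true

# Ordinary least squares; with a noise-free outcome it recovers beta_true.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

names = ["const", "LD", "YSM", "AgeEntry12", "LD x AgeEntry12", "LD x YSM"]
for name, value in zip(names, beta_hat):
    print(f"{name:>16}: {value: .3f}")
```

Dropping one dummy per fixed-effect group avoids perfect collinearity with the intercept. With country fixed effects on both sides, any variable that varies only by origin or only by destination would be absorbed, whereas LD varies by origin-destination pair and remains identified.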

3. Results

The main results of the estimation of Eq. (1) are summarized in Table 2. Separately estimated for each dimension of literacy, the results confirm a strong negative influence of the linguistic background on immigrants' literacy formation in the destination language. The main effect of linguistic distance represents the initial disadvantage (at YSM = 0) for young arrivals immigrating at the age of 11 or younger. It is significant only for prose and quantitative literacy, while it remains insignificant for document literacy.

The negative effect of the linguistic distance becomes more pronounced for immigrants arriving at an age of 12 or older, as indicated by the significant coefficients of the interaction terms between age at entry of 12 or older and the linguistic distance. This supports the Critical Period Hypothesis in the linguistic literature: young children are able to acquire new languages almost effortlessly, while the linguistic background plays an increasingly important role when individuals approach adolescence.5

Regarding the relationship between years since migration and the linguistic distance, the results indicate a moderate convergence over time. The positive interaction of linguistic distance and years since migration shows that immigrants with a distant linguistic background face a steeper assimilation profile and are able to catch up over time.

The main effects of years since migration and age at entry are small in levels and insignificant in most cases. This indicates a lack of change in literacy scores for native-speaking immigrants. Nor do native-speaking immigrants face a disadvantage by arriving at older ages, as they already speak the destination language prior to migration.6

Fig. 1 illustrates the relationship between age at entry, the time of residence and the linguistic distance in terms of predicted means based on the results of Table 2. A similar pattern arises for all three dimensions of literacy in the upper panels (a), (b) and (c). Although the linguistic distance only has a small effect for childhood immigrants (the dark gray line), it distinctly reduces the test scores for late arrivals, as indicated by the much steeper negative slope of the light gray line. A more distant linguistic background increases the assimilation rate, albeit only marginally (Fig. 1, panels (d)–(f)). The convergence does not compensate for the large initial disadvantage of linguistic origin.7

4 The data setup does not allow controlling for unobserved heterogeneity on the level of bilateral origin- and destination-dyads. Robustness checks including potentially confounding factors on the bilateral level (cultural differences, migrant stock) at the expense of observational numbers indicate a very robust pattern with regard to the linguistic distance. The results are available in the web appendix, Table 4.

5 Robustness checks show that the results are not sensitive to the choice of the actual threshold. The results are available in the web appendix in Table 6.

6 Estimations excluding native speakers are available in the web appendix in Table 5. The general pattern remains robust, although the coefficients of interest become more pronounced.

Table 2
Literacy and linguistic origin.

                                        Prose       Document    Quantitative
Linguistic distance                     −0.328**    −0.128      −0.247*
                                        (0.09)      (0.10)      (0.11)
Ling. dist. × age at entry 12 or older  −0.446***   −0.518***   −0.413***
                                        (0.06)      (0.07)      (0.08)
Ling. dist. × years since migration     0.013***    0.008**     0.011***
                                        (0.00)      (0.00)      (0.00)
Age at entry 12 or older                0.397       7.211       9.333*
                                        (4.15)      (3.68)      (3.79)
Years since migration                   −0.333      0.054       0.106
                                        (0.22)      (0.22)      (0.22)
Destination-fixed effects               Yes         Yes         Yes
Origin-fixed effects                    Yes         Yes         Yes
R2                                      0.602       0.589       0.569
N                                       1521        1521        1521

Notes: standard errors in parentheses, computed using replicate weights and mean of plausible values to take sampling structure into account. Education base category: ISCED1/No schooling. Reference birth cohort: born before 1940. The dependent variable: literacy test scores (range 0–500). Control variables on the individual level include gender, individual and parental education, birth cohort and geographic distance. Control variables on the bilateral origin–destination level include migration stock, cultural distance and geographic distance. Full estimates in the web appendix in Table 3.
* Significant at 5% level.
** Significant at 1% level.
*** Significant at 0.1% level.

Fixing covariates at their sample means, the initial disadvantage of a linguistically distant immigrant (e.g. a Turk in the Netherlands, LD = 102.33) compared to a native-speaking immigrant amounts to 33.5 (13.1, 25.3) points on the prose (document, quantitative) scale. This increases to 79.2 (66.1, 67.5) points for immigrants who arrived at the age of 12 or later, and is comparable to the disadvantage of having no formal schooling or schooling of ISCED 1 (only primary schooling) compared to ISCED 5 (short-cycle tertiary education). Because convergence is only moderate, the disadvantage persists over a long period of time: the average disadvantage still amounts to 59.8 (53.9, 50.6) points after 15 years of residence.
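The point differences quoted above can be retraced from the rounded coefficients in Table 2. The sketch below does this arithmetic under two assumptions that the text does not state explicitly: that the three coefficient columns are prose, document and quantitative in that order (the order implied by which main effects are significant), and that the 15-year figures refer to immigrants who arrived at age 12 or older. All names are illustrative.

```python
LD_EXAMPLE = 102.33  # Turkish immigrant in the Netherlands, as quoted in the text

# Rounded point estimates from Table 2 (column order is an assumption).
COEFFICIENTS = {
    "prose":        {"ld": -0.328, "ld_x_age12": -0.446, "ld_x_ysm": 0.013},
    "document":     {"ld": -0.128, "ld_x_age12": -0.518, "ld_x_ysm": 0.008},
    "quantitative": {"ld": -0.247, "ld_x_age12": -0.413, "ld_x_ysm": 0.011},
}


def score_gap(coef: dict, ld: float, late_arrival: int, years: float) -> float:
    """Predicted score difference relative to an otherwise identical
    immigrant with LD = 0, holding all other covariates fixed."""
    slope = coef["ld"] + coef["ld_x_age12"] * late_arrival + coef["ld_x_ysm"] * years
    return slope * ld


gaps = {
    scale: (
        score_gap(coef, LD_EXAMPLE, late_arrival=0, years=0),   # childhood arrival
        score_gap(coef, LD_EXAMPLE, late_arrival=1, years=0),   # arrival at 12 or older
        score_gap(coef, LD_EXAMPLE, late_arrival=1, years=15),  # same, after 15 years
    )
    for scale, coef in COEFFICIENTS.items()
}

for scale, (initial, late, late_15) in gaps.items():
    print(f"{scale:>12}: {initial:7.1f} {late:7.1f} {late_15:7.1f}")
```

Because the published coefficients are rounded to three decimals, the computed gaps match the reported figures only approximately. The initial and late-arrival gaps agree to within about 0.1 points, while the prose gap after 15 years comes out roughly half a point smaller than the reported 59.8.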

4. Conclusion

Insufficient literacy skills in the destination language represent a significant hurdle for the integration and assimilation of immigrants into the labor market of destination countries. This study shows that immigrants' proneness to insufficient levels of literacy can be explained to a large extent by the barriers that arise from a more or less distant linguistic background. Linguistic barriers lead to a significant disadvantage in literacy scores, which becomes more pronounced with a later age at arrival. Although linguistically distant immigrants seem to be able to catch up over time, this convergence is only moderate and does not offset the initial hurdle.

The unique data setup of internationally comparable and objective measures of literacy, combined with a measure of linguistic distances between any of the world’s languages, allows quantifying and generalizing results for nationally assessed self-reported language proficiency while simultaneously controlling for the unobserved heterogeneity on the origin and destination level.

7 Estimations including quadratic functions of the years since migration show a negligible decrease in the effect of additional exposure. For reasons of clarity, only the linear relationship is reported in the main specifications.


Fig. 1. Interaction effects: linguistic distance, age at entry and years since migration.

The results uncover an important source of typically unobserved differences in heterogeneous immigrant populations and shed light on linguistic barriers as a factor for imperfect human capital portability. Although the sample at hand does not allow for a direct assessment of the labor market effects of the linguistic origin, comparisons with the literature on literacy and immigrant earnings (Ferrer et al., 2006; Kahn, 2004) indicate that the estimated disadvantages are likely to lead to increased hurdles in the labor market assimilation.

Acknowledgments

The author is grateful to Sebastian Otten, Marcos A. Rangel and the participants of the Symposium on Migration and Language at Princeton University, the participants of the 10th IZA Migration Week in Jerusalem, and the members of the Chair of Competition Policy, Bochum, for their helpful comments and suggestions.

Appendix. Supplementary data

Supplementary material related to this article can be found online at http://dx.doi.org/10.1016/j.econlet.2014.02.013.

References

Adsera, A., Pytlikova, M., 2012. The role of language in shaping international migration. In: CReAM Discussion Paper Series, vol. 1206. Centre for Research and Analysis of Migration (CReAM), Department of Economics, University College London. URL: http://ideas.repec.org/p/crm/wpaper/1206.html.

Brown, C.H., Holman, E.W., Wichmann, S., Velupillai, V., 2008. Automated classification of the World’s languages: a description of the method and preliminary results. In: STUF-Language Typology and Universals, vol. 61. pp. 285–308.

Brown, G., Micklewright, J., 2004. Using International Surveys of Achievement and Literacy: a View from the Outside. UNESCO Institute for Statistics.

Charette, M., Meng, R., 1994. Explaining language proficiency: objective versus self-assessed measures of literacy. Econom. Lett. 44, 313–321. URL: http://ideas.repec.org/a/eee/ecolet/v44y1994i3p313-321.html.

Chiswick, B.R., Miller, P.W., 1999. English language fluency among immigrants in the United States. In: Polachek, S.W. (Ed.), Research in Labor Economics, vol. 17. JAI Press, Oxford, pp. 151–200.

Dougherty, C., 2003. Numeracy, literacy and earnings: evidence from the national longitudinal survey of youth. Econ. Educ. Rev. 22, 511–521. URL: http://ideas.repec.org/a/eee/ecoedu/v22y2003i5p511-521.html.

Dustmann, C., van Soest, A., 2001. Language fluency and earnings: estimation with misclassified language indicators. Rev. Econ. Stat. 83, 663–674.

Ferrer, A., Green, D.A., Riddell, W.C., 2006. The effect of literacy on immigrant earnings. J. Hum. Resour. 41. URL: http://ideas.repec.org/a/uwp/jhriss/v41y2006i2p380-410.html.

Friedberg, R.M., 2000. You can’t take it with you? Immigrant assimilation and the portability of human capital. J. Labor Econom. 18, 221–251. URL: http://ideas.repec.org/a/ucp/jlabec/v18y2000i2p221-51.html.

Isphording, I.E., Otten, S., 2013. The costs of Babylon—linguistic distance in applied economics. Rev. Int. Econ. 21 (2), 354–369. URL: http://ideas.repec.org/p/rwi/repape/0337.html.

Kahn, L.M., 2004. Immigration, skills and the labor market: international evidence. J. Popul. Econ. 17, 501–534. URL: http://ideas.repec.org/a/spr/jopoec/v17y2004i3p501-534.html.

Lohmann, J., 2011. Do language barriers affect trade? Econom. Lett. 110, 159–162. http://dx.doi.org/10.1016/j.econlet.2010.10.023.

Newport, E.L., 2002. Critical periods in language development. In: Encyclopedia of Cognitive Science, Macmillan Publishers Ltd., Nature Publishing Group.

OECD, 2000. Literacy in the Information Age. Final Report of the International Adult Literacy Survey. Technical Report.

Van der Slik, F.W.P., 2010. Acquisition of Dutch as a second language. In: Studies in Second Language Acquisition, vol. 32. pp. 401–432.