
The IECM project: Integrating the European Census Microdata

IECM team*

*A. Cabré, A. Esteve, J. Garcia, T. López, M. Valls

www.iecm-project.org

PROJECT OVERVIEW | DOCUMENTATION | HARMONIZATION | MICRODATA STRENGTHS

International Statistical Institute, 56th Conference, Lisbon 2007.

Centre d’Estudis Demogràfics


IECM forms part of IPUMS-International, a global collaboratory of national statistical offices and universities that aims to:

1. Inventory the world’s census microdata
2. Preserve endangered microdata and documentation
3. Integrate census microdata
   a. use standards of UNSD, Eurostat, ISCO, ISCED, etc.
   b. facilitate comparative research in time and space
4. Anonymize census microdata to preserve statistical confidentiality, using the highest standards
5. Disseminate restricted-access custom extracts to approved researchers at no cost


Disseminating: Belarus, France, Greece, Hungary, Portugal, Romania, Spain

Harmonizing: Austria, Czech Republic, Germany, Ireland, Italy, Netherlands, Slovenia, Turkey, United Kingdom

Negotiating: Belgium, Bulgaria, Norway, Poland, Russia, Switzerland

Contacted: Finland, Iceland, Lithuania, Moldova, Slovak Republic, Ukraine


Country | Census years | Density (%) | Sample sizes | Status
Austria | 2001, 1991, 1981, 1971 | 5 | 794.657, 767.246, 747.728, 740.012 | H
Belarus | 1999, 1989 | 5 | 990.706 | D
Belgium | 2001, 1991, 1981, 1971 | 10 | 1.030.000, 1.000.000, 990.000, 970.000 | N
Bulgaria | 2001, 1992 | 5 | 395.000, 425.000 | N
Czech Rep. | 2001, 1991 | 5 | 515.000, 515.000 | H
France | 1999, 1990, 1982, 1975, 1968, 1962 | 5 | 3.005.000, 2.360.854, 2.631.713, 2.629.456, 2.487.778, 2.320.901 | D
Germany | m 2002, m 1997, m 1992, 1987, 1970 | 1 | 330.300, 795.000, 320.608, 245.824, 245.864 | H
Greece | 2001, 1991, 1981, 1971 | 10 | 1.028.899, 969.407, 923.108, 845.473 | D
Hungary | 2001, 1991, 1981, 1970 | 5 | 510.507, 518.215, 536.007, 515.128 | H
Netherlands | m 2001, m 1991, m 1981, 1971, 1960 | 1 | 189.725, 150.000, 141.000, 159.203, 143.251 | H
Poland | 2002, 1995, m 1988, m 1984, m 1978, m 1974 | 5 | 1.930.000, 1.940.000, 1.900.000, 1.850.000, 1.745.000, 1.680.000 | H
Portugal | 2001, 1991, 1981 | 5 | 500.000, 495.000, 490.000 | H
Romania | 2002, 1992 | 10 | 2.240.000, 2.280.000 | D
Russia | 2002, 1989 | 5 | 7.200.000, 7.400.000 | N
Slovenia | 2001, 1991, 1981 | 10 | 200.000, 200.000 | H
Spain | 2001, 1991, 1981 | 5 | 2.084.221, 1.931.458, 2.039.274 | D
United Kingdom | 2001, 1991 | 1 | 600.000, 574.000 | H

m: microcensus. Italics: expected sample size.

Status: H = Harmonizing, D = Disseminating, N = Negotiating

7 Countries, 22 censuses

31,200,731 Personal records and 10,267,616 Household records


Source Documentation

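Columns: Country, then the number of documents in English and in the original language for each of four document types, then the country total. Empty cells are not shown, so each row lists only its non-empty counts followed by the total.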

Austria 2 5 4 3 2 16

Belarus 1 2 1 1 2 2 9

Belgium 1 4 5

Bulgaria 3 3 6

Czech Republic 1 3 1 5

France 4 6 4 6 2 1 23

Germany 2 4 2 2 10

Greece 5 6 1 4 8 19 4 47

Hungary 3 1 5 4 3 1 17

Ireland 8 1 2 11

Italy 1 1 1 3

Netherlands 1 2 1 3 7

Poland 3 9 12

Portugal 1 4 5

Romania 4 2 2 12 1 21

Russia 1 5 6

Slovenia 2 2

Spain 4 9 1 14

United Kingdom 4 1 5

Total 47 61 9 15 29 41 17 5 224

Document types: Questionnaires, Instructions, Other Documentation, Codebooks


Integrated Documentation (sample designs)

METADATA :: Sample characteristics :: Portugal (1981, 1991, 2001)

Census characteristic | 1981 | 1991 | 2001
Title | XII Recenseamento Geral da População e II Recenseamento Geral da Habitação | XIII Recenseamento Geral da População e III Recenseamento Geral da Habitação | XIV Recenseamento Geral da População e IV Recenseamento Geral da Habitação
Census agency | Portugal Instituto Nacional de Estatística (INE) | Portugal Instituto Nacional de Estatística (INE) | Portugal Instituto Nacional de Estatística (INE)
Population | 9.833.014 | 9.867.147 | 10.356.117
Universe | 100% | 100% | 100%
De jure or de facto | De jure | De jure | De jure
Enumeration unit | Dwelling | Dwelling | Dwelling
Census day | 16 March 1981 | 15 April 1991 | 12 March 2001
Field work period | February - June | March - July | February - June

Enumeration forms used

1981: One questionnaire for each of the following statistical units: building, housing unit, household, individual, and "collective questionnaire" (for groups of individuals present but not resident).

1991: One questionnaire for each of the following statistical units: building, housing unit, private household, institutional household, individual, and "collective questionnaire" (for groups of individuals present but not resident).

2001: One questionnaire for each of the following statistical units: building, housing unit, private household, institutional household, individual, and "collective questionnaire" (for groups of individuals present but not resident).

Type of field work | Direct enumeration | Direct enumeration | Direct enumeration


Integrated Documentation (variable description)

MARST
Label: Marital status

Availability
Belarus: 1999
France: 1962, 1968, 1975, 1982, 1990
Greece: 1971, 1981, 1991, 2001
Hungary: 1970, 1980, 1990, 2001
Portugal: 1981, 1991, 2001
Romania: 1992, 2002
Spain: 1981, 1991, 2001

Universe
Belarus 1999: All persons.
France 1962-1990: All persons.
Greece 1971-2001: All persons.
Hungary 1970-2001: All persons.
Portugal 1981-2001: All persons.
Romania 1992-2002: All persons.
Spain 1981-1991: All persons.

Description
MARST describes the person's current marital status according to law or custom. Individuals who remarried should report the status relevant to their most recent marriage. Census instructions rarely explicitly limit marital status to strictly legal unions.

Note regarding universe: The lowest age at which a person can be anything but "never married" varies among samples.
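As a rough illustration of how availability metadata of this kind can be queried, the sketch below stores the MARST availability list as a Python dictionary and counts countries and samples. The census years are copied from the slide; the dictionary layout and the function names are assumptions made for this example and do not reflect how the IECM website stores its metadata.

```python
# Census years copied from the MARST availability list on the slide.
# The data structure itself is a hypothetical layout for illustration.
AVAILABILITY = {
    "Belarus": [1999],
    "France": [1962, 1968, 1975, 1982, 1990],
    "Greece": [1971, 1981, 1991, 2001],
    "Hungary": [1970, 1980, 1990, 2001],
    "Portugal": [1981, 1991, 2001],
    "Romania": [1992, 2002],
    "Spain": [1981, 1991, 2001],
}


def count_samples(availability):
    """Total number of country-year samples that carry the variable."""
    return sum(len(years) for years in availability.values())


def samples_between(availability, first_year, last_year):
    """List (country, year) pairs whose census year falls in the given range."""
    return [
        (country, year)
        for country, years in availability.items()
        for year in years
        if first_year <= year <= last_year
    ]


n_countries = len(AVAILABILITY)
n_samples = count_samples(AVAILABILITY)
print(n_countries, "countries,", n_samples, "samples")

# Samples from the census round around 1990.
around_1990 = samples_between(AVAILABILITY, 1990, 1992)
print(around_1990)
```

The totals agree with the "7 Countries, 22 censuses" figure given for the disseminated samples.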


Experts and Producers Contributions

Workshop "Integrating European Census Microdata - II", with a satellite meeting of non-European countries

7-10 June 2006

Friday, June 9 - INED, 133 bd Davout

09:30 - 10:30h Economic variables. Deborah Levison, presiding. Teresa Munzi (LIS, Luxembourg): Comments.

10:30 - 11:30h Education variables. Deborah Levison, presiding. Jürgen Hoffmeyer-Zlotnik (ZUMA, Mannheim): Comments.

12:00 - 13:00h Family and household variables. Deborah Levison, presiding. Nico Keilman (University of Oslo): Comments.

14:00 - 15:00h Country reports: microdata in planning stage. Sabine Springer, presiding. Gustavo de Santis: Italy. Marcel Heiniger: Switzerland. Meryem Demirci: Turkey.

15:00 - 15:45h The new French "rotating" census. Sabine Springer, presiding. Guy Desplanques (INSEE): Presentation.


Harmonization I

Opportunity: A vast quantity of census microdata covering Europe since the 1960s survives in machine-readable form

Challenge: Variety of data formats. Census datasets are not clean

Answer: Reformat each sample into a simple consistent hierarchical format & DOCUMENTATION
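To make the idea of a "simple consistent hierarchical format" concrete, here is a minimal sketch that turns a flat person-level file into household records, each followed by its person records. The slides do not specify the record layout, so the "H"/"P" record types, the field names (hh_id, region, age, sex) and the sample rows are all assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical flat input: one row per person, household attributes repeated.
# Field names and values are illustrative, not actual IECM variables or data.
flat_rows = [
    {"hh_id": 1, "region": "North", "age": 43, "sex": "F"},
    {"hh_id": 1, "region": "North", "age": 12, "sex": "M"},
    {"hh_id": 2, "region": "South", "age": 85, "sex": "F"},
]


def to_hierarchical(rows):
    """Emit one 'H' record per household followed by its 'P' records."""
    records = []
    # groupby needs its input sorted by the grouping key; sorted() is stable,
    # so persons keep their original order within each household.
    ordered = sorted(rows, key=itemgetter("hh_id"))
    for hh_id, members in groupby(ordered, key=itemgetter("hh_id")):
        members = list(members)
        # Household record: type, id, household-level attribute, size.
        records.append(("H", hh_id, members[0]["region"], len(members)))
        # Person records: type, household id, person number, attributes.
        for number, person in enumerate(members, start=1):
            records.append(("P", hh_id, number, person["age"], person["sex"]))
    return records


hierarchical = to_hierarchical(flat_rows)
for record in hierarchical:
    print(record)
```

Storing household attributes once per household, rather than repeating them on every person row, is what lets the same layout serve samples whose original files were organized in very different ways.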


Harmonization II

Opportunity: Censuses share characteristics: similar topics (geographic, demographic, economic, education, household, migration characteristics of persons)

Challenge: Censuses employ different definitions and concepts (de jure vs. de facto, households). Reconciling these differences is a central effort of the project

Answer: DOCUMENTATION


Harmonization III

Opportunity: For each variable, a similar set of basic codes exists, reflecting a common rationale (e.g., education: primary, secondary, tertiary)

Challenge: Censuses employ specific classifications

Answer: International Standards (ISCO, ISCED, ISIC), Composite coding schemes & DOCUMENTATION
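One way to picture a composite coding scheme is a crosswalk in which the first digit of the harmonized code carries the category shared by all samples and the second digit keeps detail that only some samples provide. The sketch below assumes that two-digit convention. The sample names, source codes and labels are invented for illustration and are not taken from any census codebook or from the IECM harmonization tables.

```python
# Hypothetical crosswalk: (sample, source code) -> composite code.
# First digit = general level shared by all samples; second digit = detail.
CROSSWALK = {
    ("sample_a", 1): 10,     # primary
    ("sample_a", 2): 20,     # secondary, no further detail available
    ("sample_a", 3): 30,     # tertiary
    ("sample_b", "P"): 10,   # primary
    ("sample_b", "S1"): 21,  # lower secondary (detail only in sample_b)
    ("sample_b", "S2"): 22,  # upper secondary (detail only in sample_b)
    ("sample_b", "U"): 30,   # tertiary
}

GENERAL_LABELS = {1: "primary", 2: "secondary", 3: "tertiary"}


def harmonize(sample, source_code):
    """Return the composite code, or None when the source code is unmapped."""
    return CROSSWALK.get((sample, source_code))


def general_level(composite_code):
    """Label of the first digit, the category comparable across samples."""
    return GENERAL_LABELS[composite_code // 10]


# Both samples are comparable at the general level...
print(general_level(harmonize("sample_a", 2)))     # secondary
print(general_level(harmonize("sample_b", "S1")))  # secondary
# ...while the detailed code keeps what only sample_b distinguishes.
print(harmonize("sample_b", "S1"), harmonize("sample_b", "S2"))  # 21 22
```

Researchers who need comparability can work with the first digit alone, while those studying a single country keep the full detail.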


Table 2 (edu). Educational attainment/qualifications questions and age limit

Country | Census | Age limit | Question
Austria | 1991, 2001 | 15 or more | Education completed
Spain | 1991 | 10 or more | Indicate the highest level of education completed
Spain | 2001 | 16 or more | Indicate the highest level of education completed
France | 1962, 1968, 1975 | 16 or more |
France | 1982 | | Of the following diplomas, indicate all that you have earned
France | 1990 | | Indicate your highest level diploma
France | 1999 | 14 or more | Indicate the last diploma acquired
Greece | 1991 | 10 or more | Education (write the highest level of studies completed by the respondent)
Greece | 2001 | All | Education (write the highest level of studies completed by the respondent)
Hungary | 1990 | All | Educational level
Hungary | 2001 | All | Type of education: grade, level or class completed
Netherlands | 2001 | 15 or more | Information derived from Labour Force Survey
Romania | 1992, 2002 | 10 or more | Level of education (name and type of the highest level school graduated)
United Kingdom | 1991 | 16 or more | Has the person obtained any qualifications after reaching the age of 18


Table 3 (eco). Compliance with UN classification of standard levels of current activity status

Samples: AT91, AT01, ES91, ES01, FR62, FR68, FR75, FR82, FR90, FR99, GR91, GR01, HU90, HU01, NL01, RO92, RO02, UK91

Economically active
Employed: X X X X X X X X X X X X X X X X X
Unemployed: X X X X X X X X X X X X X X X X X

Not economically active
Persons attending educational institutions: X X X X X X X X X X X X X X X X
Pension or capital income recipients: X X X X X X X X X X X X X X
Homemakers: X X X X X X X X X
Others: X X X X X X X X X X X X X X X X X

Total number of categories*: 8 13 10 11 10 7 6 6 6 7 9 19 7 9 9 9 10

*See annex 1 eco. for a complete list of the categories identified in each country


Table 4 (geo). Available levels of geographic detail

Country | Census | 1st level | 2nd level
Austria | 1991 | NUTS (35) |
Austria | 2001 | NUTS (35) |
Spain | 1991 | Province (52) | Municipality > 20.000 inhabitants
Spain | 2001 | Province (52) | Municipality > 20.000 inhabitants
France | 1962 | Region (22) |
France | 1968 | Region (22) |
France | 1975 | Region (22) |
France | 1982 | Region (22) |
France | 1990 | Region (22) |
France | 1999 | Region (22) |
Greece | 1991 | Department (55) |
Greece | 2001 | Department (55) |
Hungary | 1991 | |
Hungary | 2001 | |
Netherlands | 2001 | |
Romania | 1992 | Region (8) |
Romania | 2002 | Region (8) |
United Kingdom | 1991 | Region (12) | Areas (278)


“Census microdata are an invaluable resource – sample density and geographic coverage – for social science research, but international comparisons based on these data are rarely attempted, partly because, for much of the world, [they] are either unavailable or restricted, and are therefore seldom used.”

--Robert McCaa and Steven Ruggles, 2002

[Maps: % of oldest-old (85+), female and male. EU-25 average: 2.45 (female), 0.98 (male). Legend classes: < 0.5, 0.5 - 0.75, 0.75 - 1, 1 - 1.25, 1.25 - 1.5, 1.5 - 1.75, 1.75 - 2, 2 - 2.25, 2.25 - 2.5, 2.5 - 2.75, 2.75 - 3, > 3. Separate legend entry: IECM / IPUMS-Europe forthcoming countries.]

Is the % of oldest-old higher or lower than the EU average?

[Maps: females (EU-25 average = 2.45) and males (EU-25 average = 0.98). Legend: confidence interval below the EU-25 average; confidence interval includes the EU-25 average; confidence interval over the EU-25 average; IECM / IPUMS-Europe forthcoming countries.]
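The three map classes can be reproduced, at least in outline, by comparing a confidence interval for each regional percentage with the EU-25 average. The slides do not say which interval method was used, so the sketch below assumes a 95% normal-approximation (Wald) interval under simple random sampling. The regional counts are invented for illustration; only the two EU-25 averages come from the slide.

```python
import math

EU25_AVERAGE_FEMALE = 2.45  # percent, from the slide
EU25_AVERAGE_MALE = 0.98    # percent, from the slide


def proportion_ci(count, n, z=1.96):
    """Wald (normal-approximation) interval for a proportion, in percent.

    Assumes a simple random sample; real census samples may need design
    weights or a different interval method.
    """
    p = count / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return 100 * (p - half_width), 100 * (p + half_width)


def classify(count, n, reference):
    """Place a region in one of the three map classes."""
    low, high = proportion_ci(count, n)
    if high < reference:
        return "below"
    if low > reference:
        return "over"
    return "includes"


# Hypothetical regions: women aged 85+ among all sampled women.
print(classify(150, 10_000, EU25_AVERAGE_FEMALE))  # below
print(classify(250, 10_000, EU25_AVERAGE_FEMALE))  # includes
print(classify(300, 10_000, EU25_AVERAGE_FEMALE))  # over
```

Because the interval narrows as the sample grows, high-density samples allow more regions to be classified as clearly below or over the average, which is the microdata strength these maps illustrate.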


For more information, please visit:

www.iecm-project.org or

www.ipums.org

and register now!!!

Contact: [email protected]

THANKS!!!