The INPS Data Archive at fRDB - Universita’ Bocconi
Orietta Dessy
(Universita’ Bocconi, fRDB & Dondena)
26th October 2006
Structure of the presentation
• INPS archives and our sample: differences with WHIP
• Possible matches & problems
• The first release: demographic and employees’ archives.
• Variables’ description.
• Next releases and access to data
INPS archives and our sample
• The Italian National Social Security Institute (Istituto Nazionale di Previdenza Sociale – INPS) collects workers’ contributions for a number of social security benefits: pensions, unemployment benefits, family bonuses, …
• Contributions are compulsory for firms and for any of their registered (regular) employees.
INPS archives and our sample: differences with WHIP
• Our sample: individuals born on 4 dates in the year (the 10th of 4 months), for a sampling ratio of about 1:90. The same as WHIP.
• WHIP is a pre-constructed panel: a unique individual identifier has been constructed by researchers at Laboratorio Revelli according to subjective criteria (+: ready to use for researchers; -: very rigid)
• Our data stay very close to the raw data: the cleaning procedure aims to give researchers all the tools they need to construct their own panel easily (-: some work is still needed to construct the panel/matching; +: extremely flexible)
• Different conditions for accessing the data
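The birth-date sampling rule above can be sketched as follows. This is only an illustration: the slides do not say which 4 months are used, so SAMPLE_MONTHS is a hypothetical choice, and in_sample is a hypothetical helper name. The arithmetic shows why 4 dates a year give a ratio of roughly 1:90.

```python
from datetime import date

# Hypothetical choice of months: the slides only say "the 10th of 4 months".
SAMPLE_MONTHS = (3, 6, 9, 12)
SAMPLE_DAY = 10

def in_sample(birth_date):
    """Return True if the birth date falls on one of the 4 sampled days."""
    return birth_date.day == SAMPLE_DAY and birth_date.month in SAMPLE_MONTHS

# 4 sampled days out of 365: roughly one person in 91 is selected,
# assuming births are spread evenly over the year.
people_per_sampled_person = 365 / len(SAMPLE_MONTHS)
```

Because the rule depends only on the date of birth, the same individuals are selected in every archive and every year, which is what makes later matching possible.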
INPS archives
• Demographic Archive: PID
• Employees’ archive, 1985-2002: PID, FID
• Self-employed archive, 1986-2005: PID, FID
• Atypical work Archive, 2000-2004: PID, FID
• Pensions, Unemployment, Special agricultural, 1996-2002: PID
• Household archives, 2001-2006: PID, HID
• Firms Archive: FID
• Job histories: PID, FID
Possible matches & problems
Many possibilities:
• Merge files within-archive
• Merge files between-archives
• Cross sections
• Panels
Matching problems
• Each PID identifies a ‘contribution position’, not an individual. A file can contain more than 1 obs. for each PID => before merging files, either choose 1 observation or collapse all the observations for the same PID (useful command in Stata: duplicates).
• Same problem on the firms’ side: need to reconstruct the economic concept of firm
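A minimal sketch of the "compact the observations" option, written in Python rather than Stata. The field names (pid, fid, weeks_paid) and the choice to sum weeks are assumptions made for illustration; the slides do not prescribe a collapsing rule.

```python
from collections import defaultdict

def collapse_by_pid(records):
    """Collapse all records sharing a PID into one summary row per PID."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["pid"]].append(rec)

    collapsed = {}
    for pid, recs in grouped.items():
        collapsed[pid] = {
            "pid": pid,
            "n_records": len(recs),  # how many rows were merged
            "weeks_paid": sum(r["weeks_paid"] for r in recs),
        }
    return collapsed

# Hypothetical rows: PID "A1" appears twice in the same file.
records = [
    {"pid": "A1", "fid": "F1", "weeks_paid": 20},
    {"pid": "A1", "fid": "F2", "weeks_paid": 12},
    {"pid": "B2", "fid": "F1", "weeks_paid": 52},
]
one_row_per_pid = collapse_by_pid(records)
```

After this step each PID appears once, so a merge on PID with another file is one-to-one. Keeping n_records makes it easy to see which positions had multiple rows.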
The first release
• The demographic archive
• Employees’ archive 1985-2002
• User manual, with description of variables, codebook and year-by-year tables reporting n.obs., % missing.
Demographic archive
• Contains all the PIDs that appear in at least one of the archives
• Adds demographic information on individuals to each single archive, whenever not existing already in the files
• N. obs: 945.576
• Variables: year of birth, sex, province/country of birth, year of death. Year-by-year residence (residenza) since 1997.
Problems encountered in cleaning the demographic archive
• Duplication of individuals: different PIDs can belong to the same person => problem solved with deterministic (non-probabilistic) matching, using information on the old Fiscal Code and old INPS code available at INPS (the Stata routine is being generalised).
• Possible improvements with probabilistic matching, taking into account similarities and spelling errors in Name, Surname, Address (correcting mistakes requires an international vocabulary, very expensive, which INPS is buying)
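One way to picture the deterministic step is the sketch below: any two PIDs that share an old Fiscal Code or an old INPS code are treated as the same person. This is not the Stata routine mentioned above, whose details the slides do not give; the field names (old_fiscal_code, old_inps_code) and the union-find approach are assumptions.

```python
def link_pids(rows):
    """Map each PID to a person identifier (the smallest PID in its group)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller PID as the group's representative.
            parent[max(ra, rb)] = min(ra, rb)

    first_pid_for_key = {}
    for row in rows:
        find(row["pid"])  # register the PID even if it has no codes
        for field in ("old_fiscal_code", "old_inps_code"):
            value = row.get(field)
            if not value:
                continue  # a missing code cannot link anything
            key = (field, value)
            if key in first_pid_for_key:
                union(row["pid"], first_pid_for_key[key])
            else:
                first_pid_for_key[key] = row["pid"]

    return {pid: find(pid) for pid in list(parent)}

# Hypothetical rows: P1, P2 and P3 are chained together by shared codes.
rows = [
    {"pid": "P1", "old_fiscal_code": "X"},
    {"pid": "P2", "old_fiscal_code": "X", "old_inps_code": "7"},
    {"pid": "P3", "old_inps_code": "7"},
    {"pid": "P4", "old_fiscal_code": "Y"},
]
person_of = link_pids(rows)
```

Exact matching like this cannot catch misspelled names or addresses, which is where the probabilistic approach mentioned above would come in.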
Problems encountered in cleaning the demographic archive
• Estimated impact of further duplications very low. Eventually, the demographic archive will be updated.
• Important note: the demographic archive is updated to 2005. Therefore, demographic information sometimes goes further than the years covered by an archive.
• When demographic information is included in an archive, consistency checks against the demographic archive have been carried out. Inconsistencies are negligible (always 0%)
Problems encountered in cleaning the demographic archive
• It might be that a PID in a certain archive is not found in the Demographic archive.
• No clear explanation for this; it probably depends on the fact that the Demographic archive has been updated to 2005
• Suggestion: keep individuals if no additional demographic information is needed, otherwise just exclude them.
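The suggestion above amounts to a left join on PID with an explicit choice about unmatched rows. A minimal sketch, assuming rows are dictionaries with a pid key (all field names and the function name are hypothetical):

```python
def attach_demographics(archive_rows, demographic_by_pid, drop_unmatched=False):
    """Add demographic fields to archive rows; report PIDs with no match."""
    merged, unmatched = [], []
    for row in archive_rows:
        demo = demographic_by_pid.get(row["pid"])
        if demo is None:
            unmatched.append(row["pid"])
            if drop_unmatched:
                continue  # exclude when demographic information is required
            demo = {}
        # Values already present in the archive row take precedence.
        merged.append({**demo, **row})
    return merged, unmatched

# Hypothetical data: PID "C3" is missing from the demographic archive.
demographic_by_pid = {
    "A1": {"year_birth": 1960, "sex": "F"},
    "B2": {"year_birth": 1972, "sex": "M"},
}
archive_rows = [
    {"pid": "A1", "year": 1990},
    {"pid": "C3", "year": 1990},
]
kept_all, missing = attach_demographics(archive_rows, demographic_by_pid)
kept_matched, _ = attach_demographics(archive_rows, demographic_by_pid, drop_unmatched=True)
```

Returning the list of unmatched PIDs makes it possible to check how many individuals are affected before deciding whether to keep or exclude them.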
Employees’ archive
YEAR   N. OBS.   N. IND.   N. FIRMS
1985   150.542   132.020    88.243
1986   150.686   132.461    88.328
1987   153.052   131.982    91.118
1988   156.779   133.302    94.303
1989   159.849   135.166    95.753
1990   165.975   138.641    98.628
1991   166.427   140.173    98.487
1992   163.017   139.134    97.148
1993   154.316   133.602    92.001
1994   152.669   131.541    90.919
1995   156.097   132.398    92.645
1996   158.730   133.482    94.097
1997   156.717   131.425    92.444
1998   153.573   129.498    91.326
1999   160.367   130.804    96.275
2000   173.419   139.092   102.849
2001   179.758   143.659   106.649
2002   187.241   149.375   110.824
Mean   161.067   135.431    95.669
• Note: it is individuals that are sampled, not firms. Firms in the sample are those encountered by individuals in their job history.
• Not many individuals for the same firm
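To see how thinly firms are covered in a given year, one can count the distinct sampled individuals attached to each FID. A small sketch with hypothetical field names (pid, fid) and a hypothetical function name:

```python
from collections import defaultdict

def individuals_per_firm(records):
    """Count distinct PIDs observed for each FID."""
    pids_by_fid = defaultdict(set)
    for rec in records:
        pids_by_fid[rec["fid"]].add(rec["pid"])
    return {fid: len(pids) for fid, pids in pids_by_fid.items()}

# Hypothetical rows: one individual appears twice in firm F1.
rows = [
    {"pid": "A1", "fid": "F1"},
    {"pid": "A1", "fid": "F1"},
    {"pid": "B2", "fid": "F1"},
    {"pid": "B2", "fid": "F2"},
]
counts = individuals_per_firm(rows)
```

Using a set per firm means repeated rows for the same PID are counted once, which matters because a PID can have several observations in a file.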
Employees’ archive
• PID & FID
• Some qualitative variables on how data have been reported, but not reliable
• Employment: provlav, skill (white collar, blue collar, executives, CEOs, apprentices, and a few more), part-time/full-time, since 1998 also duration of contract (fixed term, permanent, seasonal).
Employees’ archive: variables
• Income: truncated to thousands when reported in Lire; serious reporting errors in 1998; since 2000 reported both in Lire and in Euro.
• N. weeks paid, n. days paid, months paid in the year (not only the number, but also a flag for each month)
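To compare incomes across the Lire and Euro years, Lire amounts can be converted at the fixed official rate of 1936.27 Lire per Euro. The sketch below assumes that the Lire variable is expressed in thousands of Lire, which is one reading of "truncated" above; the function name is hypothetical.

```python
LIRE_PER_EURO = 1936.27  # fixed official conversion rate

def lire_thousands_to_euro(income_thousands_lire):
    """Convert an income in thousands of Lire to Euro, rounded to cents."""
    # Assumption: the archive stores Lire incomes in thousands.
    return round(income_thousands_lire * 1000 / LIRE_PER_EURO, 2)

# Hypothetical income of 30 million Lire, stored as 30000.
income_euro = lire_thousands_to_euro(30000)
```

Since the Lire figures are truncated, converted values are accurate only to within about half a Euro, so small discrepancies with the Euro variable from 2000 onwards are to be expected.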
Employees’ archive: variables
• Institutional variables: csc (convertible to NACE, conversion table provided) is the best variable for determining the sector of activity; code of contract (regional, provincial, firm-level)
• Job classification (Inquadramento): available, but with many missing values, and the categories do not directly match the usual classifications of national contracts
Employees’ archive: variables
• Severance indemnity (Trattamento di fine rapporto-TFR): amount due to the employee from the end-of-service fund
• Coordinates for family-related allowances
• Up to 4 special incomes, for workers with particular job earnings: start and end of pay, compensation, n. weeks paid
• Reduced pay: illness, maternity, special lay-off pay fund (cig), others.
Next releases
• Job-histories archive
• Atypical work archive (parasubordinati)
• Pensions
• Others
• Firms’ archive
Access to data
• Free for all the affiliates to fRDB-Universita’ Bocconi
– Follow instructions on the web.
– Password & email needed because the archive will be continuously updated.
• Possible to share routines and programs for constructing most used panels.