Access to Microdata: The Australian Bureau of Statistics Approach. Teresa Dickinson...
TRANSCRIPT
This talk...
Legislation and policy
Access modes
–Confidentialised unit record files (CURFs)
–Other
Overseas access to ABS microdata
ABS Outputs
[Diagram: ABS outputs positioned by level of detail (low to high) and level of protection (low to high). Labels shown: Published tables; Specialised tables; CD-ROM access; Remote access; ABS On-site Lab access; Section 16A (assist Statistician in carrying out functions); Regulation 7A (assist performance of statistical functions); ABS analysis/consultancy. Part of the diagram is marked "Outside Census and Statistics Act".]
Australian Legislation
A number of legislative provisions, either directly or indirectly, can facilitate access to microdata
Our legislation allows release of microdata, but only
“in a manner that is not likely to enable the identification of the particular person or organisation to which it relates”
We can release information about businesses (not individuals) 'to assist the statistician perform statistical functions' - this involves collaborations to support the ABS workprogram
We can second certain individuals to the ABS to 'assist the Statistician perform statistical functions'
Why provide deeper access to microdata? The Benefits
Valuable (and high quality) data is under-utilised.
Researchers may try to collect substitute data sets in order to obtain microdata, which wastes public resources and probably yields lower quality data.
Government agencies may turn to alternative data providers for survey data for research and analysis, resulting in lower quality data (which may not be as widely accessible).
Risks of providing access
Misuse - deliberate and inadvertent
May lead respondents to believe that researchers could identify their data, and possibly even use it against them
Loss of trust in processes and work of national statistical offices, leading to reduced response rates
From risk avoidance to risk management
Production of microdata files from household collections is now routine
–well developed policies and processes exist
Beginning to explore ways of making business microdata more accessible, given that it is rare to be able to produce a confidentialised file
Communication with respondents?
Engaging with requests for overseas access on a case-by-case basis
A shift in emphasis...
Policy response - where ABS is heading
Four layers of protection
–Protection in the data
–Access method
–User education / partnership
–Audit and sanctions
Increased variety of access channels
–CD-ROM, Remote Access Datalab, ABS Datalab, collaborations
–different combinations but giving the required protection
Policy - who gets access, and how
Researchers - government or academic - with a particular statistical purpose
Undertakings - legally enforceable within Australia
–won't attempt to identify or match
–won't share access etc.
–will abide by rules in a manual
Undertakings made by the institution and individuals who will work with the data
Organisational level undertakings approved by a Deputy Australian Statistician
Pricing
Australian Government agencies must charge for some information products according to a set of guidelines
Charges recover the marginal costs of developing and disseminating CURFs
Access to a microdata file costs $A1,200 (+10% GST for Australian users)
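As a worked example of the pricing above, here is a minimal sketch in Python. Only the $A1,200 price and the 10% GST come from the slide; the function name and the `is_australian_user` flag are illustrative assumptions.

```python
from decimal import Decimal

CURF_PRICE_AUD = Decimal("1200")   # price per microdata file, from the slide
GST_RATE = Decimal("0.10")         # 10% GST, applied to Australian users only


def curf_access_cost(is_australian_user: bool) -> Decimal:
    """Return the cost in $A of access to one microdata file (hypothetical helper)."""
    if is_australian_user:
        return CURF_PRICE_AUD * (1 + GST_RATE)
    return CURF_PRICE_AUD


print(curf_access_cost(True))   # Australian user, GST included
print(curf_access_cost(False))  # overseas user, no GST
```

Under these assumptions an Australian user pays $A1,320 per file and an overseas user pays $A1,200.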
Policy - creation of files
Subject area creates files using a set of rules devised by the methodology area (e.g. standard categories for some variables)
Methodologists vet the files, making changes as necessary to 'ensure' confidentiality, and 'declare' that the risks of spontaneous identification are acceptably low
The Australian Statistician gives in-principle approval for release of the microdata file
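The first two file-creation steps above can be illustrated with a small sketch. It is not the ABS method: the age bands, the threshold of 3 records, and the function names are all assumptions, chosen only to show what "standard categories" and a vetting check for rare combinations might look like.

```python
from collections import Counter

# Hypothetical standard age bands: (lower bound, upper bound, label).
AGE_BANDS = [(0, 14, "0-14"), (15, 24, "15-24"), (25, 44, "25-44"),
             (45, 64, "45-64"), (65, 200, "65+")]


def to_age_band(age: int) -> str:
    """Map a single year of age to its (assumed) standard category."""
    for low, high, label in AGE_BANDS:
        if low <= age <= high:
            return label
    raise ValueError(f"age out of range: {age}")


def rare_combinations(records, keys, min_count=3):
    """Return key combinations shared by fewer than min_count records."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return {combo: n for combo, n in counts.items() if n < min_count}


raw = [
    {"age": 23, "region": "A"}, {"age": 19, "region": "A"},
    {"age": 24, "region": "A"}, {"age": 67, "region": "B"},
]
# Step 1: the subject area recodes detailed variables into standard categories.
coded = [{"age_band": to_age_band(r["age"]), "region": r["region"]} for r in raw]
# Step 2: vetting flags combinations that are too rare to release unchanged.
flagged = rare_combinations(coded, keys=("age_band", "region"))
print(flagged)
```

In this toy data the single record in the "65+" band in region "B" is flagged, which is the kind of case a methodologist would then change before declaring the risk acceptably low.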
What the client sees...
One stop shop - all the information about how to access microdata is on our website
One client contact point - the CURF Management Unit (CMU). Clients submit undertakings through this channel, and the CMU provides access once they have been approved
Internally, however, lots of areas are involved
–CMU
–Subject areas
–Methodology (assurance of confidentiality and auditing of output)
–Policy area
ABS CURFs
CURF types:
–BASIC: Less detailed data available for analysis
–EXPANDED: Generally more detailed data available for analysis
–SPECIALIST: May provide high level of detail for analysis
May include data for collections where previously CURFs could not be produced
May allow for integration with other datasets in a way that does not identify individuals
Access mode:
CD-ROM: Yes Yes
Remote Access Data Lab (RADL): Yes
ABS On-site data lab (ABSDL): Yes
Which CURFs?
CURFs are available from a range of ABS surveys (68 in total), including:
Aboriginal and Torres Strait Islander Social Survey
Aspects of Literacy
Australians' Employment and Unemployment Patterns
Business Longitudinal Survey
Census of Population & Housing
Child Care Survey
Disability, Ageing and Carers Survey
General Social Survey
Household Expenditure Survey
Income and Housing Costs Survey
Labour Mobility Survey
National Health Survey
Mental Health and Wellbeing of Adults Survey
Time Use Survey
Women's Safety Survey
How Researchers use CURFs
University Sector
- Ph.D. Students - increasing use
- Undergraduate Students - increasing use with the remote access system. Lecturers set course work because students can access the CURF online with their individual passwords, which is less of a security risk than on CD-ROM
Government Departments use CURFs as a basis for understanding the population in order to develop public policy
Recent increase in Government Departments using consultants to do CURF analysis for their purposes
Commercial Research Centres use CURFs to develop models for policy analysis
Examples of work arising from CURFs
Ellis, R.P. and Savage, E. (2004) Where do you run after you run for cover? A model of the demand for private health insurance in Australia, Australian Health Economics Conference, Melbourne, November 2004.
Cumpston, J. (2004) Models of the Future of Australia, 2004 Australian Population Association Conference.
Kok-Wee Ong, The Effect of Literacy on Earnings in Australia, UNSW School of Economics Honours Thesis
Richardson, S. Society's Investment in Children, National Institute of Labour Studies working paper WP151, Flinders University.
Remote Access Data Laboratory (RADL)
A remote system that allows users to undertake analyses in SAS, SPSS, or STATA on ABS CURFs
Instead of a CD-ROM, users get a username and password
There are various rules about printing records and detailed tables - but looking at a few records is permitted
Output is (electronically) audited. 94% of jobs are returned within 2 minutes
- Remaining jobs are manually audited and most are returned within 1 day
A random sample of all jobs is audited
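The routing of RADL jobs described above can be sketched as follows. This is an illustration only: the talk does not state the actual rules, so the limit of 10 printed records, the minimum table cell count of 3, and all names here are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class JobOutput:
    """Summary of what a submitted job tried to print (hypothetical structure)."""
    records_printed: int = 0
    table_cell_counts: list = field(default_factory=list)


MAX_RECORDS_PRINTED = 10   # assumed limit: "looking at a few records is permitted"
MIN_CELL_COUNT = 3         # assumed threshold for a table that is too detailed


def route_job(output: JobOutput) -> str:
    """Return 'release' if the automated checks pass, else 'manual_audit'."""
    if output.records_printed > MAX_RECORDS_PRINTED:
        return "manual_audit"
    if any(0 < n < MIN_CELL_COUNT for n in output.table_cell_counts):
        return "manual_audit"
    return "release"


print(route_job(JobOutput(records_printed=5, table_cell_counts=[40, 12])))
print(route_job(JobOutput(records_printed=500)))
print(route_job(JobOutput(table_cell_counts=[40, 1])))
```

Jobs that pass the automated checks would be returned straight away, which corresponds to the 94% returned within 2 minutes; the rest would wait for a manual audit.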
Audit
Audit is critical to monitor user behaviour
All code and output stored
Cumulative file of all unit data viewed
All jobs have a chance of being inspected
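The audit points above can be sketched as a small logging class. The class, its method names, and the 5% sampling rate are assumptions for illustration; the talk says only that all code and output are stored, that unit data viewed is accumulated, and that every job has a chance of being inspected.

```python
import random
from collections import defaultdict


class AuditLog:
    """Keeps every job plus a cumulative set of unit records each user has viewed."""

    def __init__(self, sample_rate: float, seed=None):
        self.sample_rate = sample_rate      # assumed inspection probability
        self.jobs = []                      # all code and output stored
        self.viewed = defaultdict(set)      # user -> record ids seen so far
        self._rng = random.Random(seed)

    def record_job(self, user: str, code: str, output: str, record_ids) -> bool:
        """Store the job; return True if it is drawn for manual inspection."""
        self.jobs.append({"user": user, "code": code, "output": output})
        self.viewed[user].update(record_ids)
        return self._rng.random() < self.sample_rate

    def records_viewed(self, user: str) -> int:
        """Number of distinct unit records the user has seen across all jobs."""
        return len(self.viewed[user])


log = AuditLog(sample_rate=0.05, seed=1)
log.record_job("user1", "job 1 code", "job 1 output", record_ids=[101, 102])
log.record_job("user1", "job 2 code", "job 2 output", record_ids=[102, 103])
print(log.records_viewed("user1"))
```

Tracking the cumulative set matters because a user who prints a few records per job could otherwise view a large part of the file over many jobs.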
Emerging issues
Clients require more functionality
–e.g. Output format to spreadsheet not text
–Ideally clients would like an interactive system
Clients want more detailed data
Clients want more business data
Clients want longitudinal data
Clients continue to be price sensitive
ABS On-site data lab (ABSDL)
Secure room and desktop
Locked down computer
Automatic logging of client activity
No data transmitting devices
No data or output to enter or leave the room with the client.
ABSDL (cont.)
Specialist or interactive access to Expanded CURFs
–More detailed and/or sensitive data
–Potential future economic survey data
Interactive system
–SAS, SPSS, STATA, Excel
Available in all 8 State & Territory ABS Offices on an on-demand basis
Collaborations
A way to broaden ABS workprogram by bringing in expertise to 'assist the Statistician with statistical functions'
A way of providing access, for selected partners, to business microdata that can't be produced as a CURF
Designed to be of use to both ABS and researcher
Access is akin to the on-site data lab, but data may be close to recognisable (e.g. with identifiers simply removed)
Still working out processes etc., but collaborations are proving time consuming (and therefore expensive) to establish and run
Will never be in a position to undertake a large number of collaborations
Overseas Access - ABS data to other organisations
Have a policy
Undertakings not legally valid overseas - but we can apply sanctions
Access on a project-by-project basis under these conditions
–project is of genuine benefit to Australian policy making
–organisation is known to us and trusted
–access is through RADL (almost always)
Processes to apply, pricing etc. are identical to Australian access
Overseas access - international data repositories (e.g. LIS)
Challenging!
Requires establishment of a genuinely collaborative relationship
Processes etc. worked out on a case-by-case basis, but are congruent with our overall policies
Detail of data to be released (must) be less than that of our CURFs