
Access to sensitive data in the UK: a principles-based approach

Felix Ritchie

Overview

• Design principles

• Policies developed

• Conclusion: why do principles matter?

Part 1

Design principles

The framework principle

• Data access is driven from first principles

[Diagram elements: user needs; NSI options; legal environment; technology; solution]

…which is not this model; how about…

[Diagram elements: user needs; NSI principles; legal environment; technology; principles of access; solution]

The principles in use today at ONS

• The value of microdata is well-established

• There are risks in not making full use of data

• Public bodies should be supporting research
– for the public benefit
– for their own benefit

• Not every research project needs detailed data
– data released should be consistent with need

• Access to data should be driven by cost-benefit or cost-effectiveness assessments

The interaction between law and principle

• Up to 2002: various dubious practices
– £1 contracts
– Researchers using own equipment
– Poor records of microdata use

• 2002-2008
– New recording system for applications
– Review and rationalisation of legal gateways
– But still many hurdles to cross

• 2008 onwards
– Experience led to significant provision in law for research use

The legislative model

• Statistics and Registration Service Act 2007
– single law allowing, in principle, access to all govt data via ONS
– flexible Approved Researcher scheme
– ONS given a statutory duty to support research
– but not a free-for-all

• ONS has a duty to protect confidentiality
– even for Approved Researchers
– data release has to be consistent with need

→ the data model

The data model: what is it?

• ‘Spectrum’ of access points balancing
– value of data
– ease of use
– disclosure risk

• for a given level of confidentiality, maximise data use and convenience

• no ‘one-size-fits-all’ solution
– no absolute prohibitions
– trade-off is made explicit
– users determine appropriate level of access

Use of confidential data: the access spectrum

Type of access (left to right): None; VML at ONS sites; VML at Govt sites; Secure data service; Special licences; Licensed data archive; Internet

Anonymisation: Little → Complete

SDC of inputs: None → Complete

Restrictions on users: Many → None

SDC of outputs: Complete → None

Distributed access → Distributed data
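The trade-off in the access spectrum can be sketched in code. The following is a minimal illustration, not ONS practice: the `AccessPoint` type, the `choose_access_point` function and the numeric `detail` and `convenience` scores are all hypothetical, chosen only to preserve the ordering of the spectrum. It picks the most convenient access point that still offers the level of detail a project needs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AccessPoint:
    name: str
    detail: int       # hypothetical score: higher = more detailed data available
    convenience: int  # hypothetical score: higher = fewer restrictions on users


# Ordered as in the access spectrum; the scores are illustrative only.
SPECTRUM = [
    AccessPoint("VML at ONS sites", 6, 1),
    AccessPoint("VML at Govt sites", 5, 2),
    AccessPoint("Secure data service", 4, 3),
    AccessPoint("Special licences", 3, 4),
    AccessPoint("Licensed data archive", 2, 5),
    AccessPoint("Internet", 1, 6),
]


def choose_access_point(required_detail: int) -> Optional[AccessPoint]:
    """Return the most convenient access point that meets the detail need."""
    candidates = [p for p in SPECTRUM if p.detail >= required_detail]
    if not candidates:
        return None  # no access point offers this much detail
    return max(candidates, key=lambda p: p.convenience)
```

A project that needs only moderately detailed data (say `required_detail=3` on this made-up scale) would be steered to special licences rather than to the VML, which reflects the principle that data released should be consistent with need.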

The data model: does it work?

• Options should cover most cases
– Can’t be perfect in every case
– Jumps between solutions should reflect data utility and patterns of research use

• Pretty efficient
– Fairly transparent
– Users balance their own costs/benefits
– Economies of scale delivering mass solutions

• How do we define/describe access points?

→ the security model

Part 2

Policies developed

The VML security model: Why was it needed?

• Tendency to focus on single risks, e.g. IT

• Poor understanding of complementarity of risk management measures

• New developments (e.g. output SDC, distributed access) not covered by current models

The VML security model: How does it work?

• safe projects: valid statistical purpose

• + safe people: trusted researchers (Active researcher management)

• + safe data: anonymisation of data

• + safe setting: technical controls around data

• + safe outputs: disclosure control of results (Principle-based SDC)

= safe use

Making people safe: researcher vs data management

• Traditional focus on ‘data management’
– Responsibility for security and operations rests with NSI
– Security based on ‘worst case’ scenarios

• Consequences of data management approach
– High cost of pre-release anonymisation
– Lack of communication: lack of mutual understanding of needs, priorities and working practices
– Culture of distrust: researchers do not take responsibility for data confidentiality; researchers do not understand, or see the need to understand, SDC
– Risk of researchers attempting to subvert data security

The VML model: Active Researcher Management

• Researchers will engage with NSI if given a chance

• Actively engage with researchers
– In explaining NSI goals
– In explaining disclosure control
– In understanding researcher needs, working practices
– In securing cooperation to minimise sensitive output

• Responsibility for data security shared between NSI and researcher (NSI always gets the final say)

• Certify researchers as part of the security model

ARM: A matter of perspective

Negative: researchers as risks
– “we’re doing this to protect the data” (from you)
– “you must limit your output to reduce the chance of disclosure”

Positive: researchers as collaborators
– “doing this allows us to supply you with more detailed data”
– “limit your output because we have finite resources; people who produce good output get their results back quicker”

Costs and benefits of ARM

• Better security

• More efficient management

• Easier change management

• There are costs:
– Initial training costs
– Ongoing communication costs

Statistical Disclosure Control: Why was a new model needed?

• No value in protecting data
– Protect only the results people want to take away

• But ‘traditional’ methods rely upon a finite set of outputs – not appropriate for research

Principles-based SDC

• SDC at the point of release

• trained NSI staff and researchers

• agreement on principles and purpose

• safe vs unsafe outputs, based on functional form

• No absolute restrictions
– Procedures for resolving differences are crystal-clear
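One way to picture the safe/unsafe split by functional form is the sketch below. It is an illustration under stated assumptions, not the VML's actual rule set: the names `OutputClass`, `FUNCTIONAL_FORM_CLASS` and `review_route` are hypothetical, and the example classifications (regression coefficients as safe, frequency tables and extreme values as unsafe) are assumptions made for the example rather than taken from the slides.

```python
from enum import Enum


class OutputClass(Enum):
    SAFE = "safe"
    UNSAFE = "unsafe"


# Illustrative assumption: which functional forms fall in which class.
# A real scheme would be agreed between NSI staff and researchers.
FUNCTIONAL_FORM_CLASS = {
    "regression_coefficients": OutputClass.SAFE,
    "frequency_table": OutputClass.UNSAFE,
    "minimum_or_maximum": OutputClass.UNSAFE,
}


def review_route(functional_form: str) -> str:
    """Describe how an output of the given form would be handled at release.

    Unknown forms are treated as unsafe until someone classifies them.
    Neither route is an absolute ban: an unsafe output is held for
    discussion, not refused outright.
    """
    cls = FUNCTIONAL_FORM_CLASS.get(functional_form, OutputClass.UNSAFE)
    if cls is OutputClass.SAFE:
        return "release unless the checker finds a specific reason not to"
    return "hold until the researcher shows this particular output is non-disclosive"
```

The design choice worth noting is that the classification attaches to the type of output rather than to a fixed list of permitted tables, which is what lets the approach cope with the open-ended outputs that research produces.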

Part 3

Concluding comments

What have we learnt?

• Design based on first principles…
– made design slow but robust
– helped identify failings in current approaches
– showed where new models were needed
– allowed the evaluation of new and different models

• But this is the wisdom of hindsight
– Development was a heuristic process

Next stages

• Translation of VML model in its entirety to academic partners
– First major test of the robustness of procedures

• Cost-benefit analysis of VML operations and review of strategic function
– Does the VML have a future?

• Models for international data sharing
– Can principles surmount the insurmountable?

Questions?

• Felix Ritchie
• Microdata Analysis and User Support
• [email protected]

• Virtual Microdata Laboratory (VML)
• Microdata Analysis and User Support
• [email protected]
