Managing Confidential Information – Trends and Approaches

Prepared for MIT Libraries Program on Information Research Brown Bag Talk September 2013 Managing Confidential Information – Trends and Approaches Dr. Micah Altman <[email protected]> Director of Research, MIT Libraries

Upload: micah-altman

Post on 11-May-2015


DESCRIPTION

Personal information is ubiquitous, and it is becoming increasingly easy to link information to individuals. Laws, regulations, and policies governing information privacy are complex, but most intervene through either access control or anonymization at the time of data publication. Trends in information collection and management (cloud storage, "big" data, and debates about the right to limit access to published but personal information) complicate data management and make traditional approaches to managing confidential data decreasingly effective. This session, presented as part of the Program on Information Science seminar series, examines trends in information privacy. It also discusses emerging approaches and research around managing confidential research information throughout its lifecycle.

TRANSCRIPT

Page 1: Managing Confidential Information – Trends and Approaches

Prepared for

MIT Libraries Program on Information Research Brown Bag Talk

September 2013

Managing Confidential Information – Trends and Approaches

Dr. Micah Altman <[email protected]>

Director of Research, MIT Libraries

Page 2: Managing Confidential Information – Trends and Approaches

Information Privacy Across the Research Lifecycle

Standard Disclaimer:

These opinions are my own; they are not the opinions of MIT, Brookings, any of the project funders, nor (with the exception of co-authored previously published work) my collaborators.

Secondary disclaimer:

“It’s tough to make predictions, especially about the future!”

-- Attributed to Woody Allen, Yogi Berra, Niels Bohr, Vint Cerf, Winston Churchill, Confucius, Disreali [sic], Freeman Dyson, Cecil B. Demille, Albert Einstein, Enrico Fermi, Edgar R. Fiedler, Bob Fourer, Sam Goldwyn, Allan Lamport, Groucho Marx, Dan Quayle, George Bernard Shaw, Casey Stengel, Will Rogers, M. Taub, Mark Twain, Kerr L. White, etc.

Page 3: Managing Confidential Information – Trends and Approaches

Collaborators & Co-Conspirators

• Privacy Tools for Sharing Research Data Team (Salil Vadhan, P.I.): http://privacytools.seas.harvard.edu/people
• Research Support: Supported in part by NSF grant CNS-1237235

Page 4: Managing Confidential Information – Trends and Approaches

Related Work

Main Project:
• Privacy Tools for Sharing Research Data: http://privacytools.seas.harvard.edu/

Related publications:
• Novak, K., Altman, M., Broch, E., Carroll, J. M., Clemins, P. J., Fournier, D., Laevart, C., et al. (2011). Communicating Science and Engineering Data in the Information Age. Computer Science and Telecommunications. National Academies Press.
• Vadhan, S., et al. 2010. “Re: Advance Notice of Proposed Rulemaking: Human Subjects Research Protections”. Available from: http://dataprivacylab.org/projects/irb/Vadhan.pdf
• Altman, M. (2012). “Mitigating Threats To Data Quality Throughout the Curation Lifecycle”. In G. Marciano, C. Lee, & H. Bowden (Eds.), Curating For Quality. datacuration.web.unc.edu

These slides & most reprints available from: informatics.mit.edu

Page 5: Managing Confidential Information – Trends and Approaches


Level Setting

Page 6: Managing Confidential Information – Trends and Approaches

Identifying Information Is Common

• Includes information from a variety of sources, such as…
– Research data, even if you aren’t the original collector
– Student “records” such as e-mail, grades
– Logs from web-servers, other systems
• Lots of things are potentially identifying:
– Under some federal laws: IP addresses, dates, zipcodes, …
– Birth date + zipcode + gender uniquely identify ~87% of people in the U.S. [Sweeney 2002]. Try it: http://aboutmyinfo.org/index.html
– With date and place of birth, can guess first five digits of social security number (SSN) > 60% of the time. (Can guess the whole thing in under 10 tries, for a significant minority of people.) [Acquisti & Gross 2009]
– Analysis of writing style or eclectic tastes has been used to identify individuals
• Tables, graphs and maps can also reveal identifiable information

Brownstein, et al., 2006, NEJM 355(16)
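The uniqueness claim about birth date + zipcode + gender can be checked on any tabular dataset by counting how many records share each combination. A minimal sketch, assuming records are plain dictionaries; the field names `birthdate`, `zipcode`, and `gender` and the toy rows are hypothetical and invented for illustration, not taken from any study:

```python
from collections import Counter


def fraction_unique(records, quasi_identifiers):
    """Return the share of records whose quasi-identifier combination is unique."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(keys) if keys else 0.0


# Toy records (invented for illustration only).
records = [
    {"birthdate": "1961-01-01", "zipcode": "02145", "gender": "M"},
    {"birthdate": "1961-02-02", "zipcode": "02138", "gender": "M"},
    {"birthdate": "1972-03-25", "zipcode": "94041", "gender": "F"},
    {"birthdate": "1972-03-25", "zipcode": "94041", "gender": "F"},
]

share = fraction_unique(records, ["birthdate", "zipcode", "gender"])
print(f"{share:.0%} of records are unique on birthdate + zipcode + gender")
```

Running the same function with fewer fields (for example only `gender`) shows how quickly uniqueness rises as attributes are combined.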

Page 7: Managing Confidential Information – Trends and Approaches

Some Sources of Confidentiality Restrictions for University Held Research and Education Information

• Overlapping laws
• Different laws apply to different cases
• Additional data usage agreements and license terms apply

Page 8: Managing Confidential Information – Trends and Approaches

Different Requirements and Definitions

• FERPA
– Coverage: students in educational institutions
– Identification criteria: direct; indirect; linked; bad intent (!)
– Sensitivity criteria: any non-directory information
– Management requirements: directory opt-out; [implied] good practice
• HIPAA
– Coverage: medical information in “covered entities”
– Identification criteria: direct; indirect; linked
– Sensitivity criteria: any medical information
– Management requirements: consent; specific technical safeguards; breach notification
• Common Rule
– Coverage: living persons in research by funded institutions
– Identification criteria: direct; indirect; linked
– Sensitivity criteria: private information, based on harm
– Management requirements: consent; [implied] risk minimization
• MA 201 CMR 17
– Coverage: Mass. residents
– Identification criteria: direct
– Sensitivity criteria: financial, state, federal identifiers
– Management requirements: specific technical safeguards; breach notification

Page 9: Managing Confidential Information – Trends and Approaches



Page 10: Managing Confidential Information – Trends and Approaches

Recognized Benefits of Data Sharing

• Pioneering NRC report [Fienberg, et al. 1985] on data sharing recommended:
– Sharing data should be a regular practice.
– Investigators should share their data by the time of publication of initial major results of analyses of the data, except in compelling circumstances.
– Data relevant to public policy should be shared as quickly and widely as possible.
– Plans for data sharing should be an integral part of a research plan whenever data sharing is feasible.
• Numerous subsequent reports recommend data sharing.

Page 11: Managing Confidential Information – Trends and Approaches


Private Information & Information Services

• Recommendations

• Annotations & Tagging

• Class discussion forum

• Social Highlighting

Page 12: Managing Confidential Information – Trends and Approaches

Access Control Model

[Diagram labels: Access Control; Client; Resource; Credentials; Authentication; Authorization; Request/Response; Auditing; Log; External Auditor]

Resource Control Model
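The components named in the access control diagram (credentials, authentication, authorization, request/response, auditing, log) can be sketched in a few lines. This is an illustration only: the in-memory stores `USERS`, `PERMISSIONS`, and `AUDIT_LOG`, the fixed demo salt, and all function names are hypothetical, and a real deployment would rely on a directory service, per-user salts, and tamper-resistant logging.

```python
import hashlib
import hmac
from datetime import datetime, timezone

SALT = b"demo-salt"  # assumption: fixed salt for the demo; use per-user random salts in practice


def _hash(password):
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), SALT, 100_000)


# Hypothetical in-memory stores standing in for real infrastructure.
USERS = {"alice": _hash("correct horse")}
PERMISSIONS = {("alice", "dataset-1"): {"read"}}
AUDIT_LOG = []


def authenticate(user, password):
    """Authentication: do the presented credentials match the stored ones?"""
    stored = USERS.get(user)
    return stored is not None and hmac.compare_digest(stored, _hash(password))


def authorize(user, resource, action):
    """Authorization: is this user allowed to perform this action on the resource?"""
    return action in PERMISSIONS.get((user, resource), set())


def request(user, password, resource, action):
    """Handle one request and record the decision for later external audit."""
    allowed = authenticate(user, password) and authorize(user, resource, action)
    AUDIT_LOG.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "resource": resource,
        "action": action,
        "allowed": allowed,
    })
    return allowed
```

Every request is logged whether or not it succeeds, which is what lets an external auditor review denied attempts as well as granted access.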

Page 13: Managing Confidential Information – Trends and Approaches

Disclosure Limitation Data Input/Output Model

[Diagram labels: DATA; Published Outputs]

Published Outputs:
• Public use sample microdata, e.g. “* Jones * * 1961 021*”, “* Jones * * 1972 9404*”
• Summary statistics
• Contingency table
• Information Visualization
• Statements such as “The correlation between X and Y was large and statistically significant”

Page 14: Managing Confidential Information – Trends and Approaches


Example

Page 15: Managing Confidential Information – Trends and Approaches

Exemplar: Social Media Analysis

Attribute Type | Examples
Data: Structure | network
Data: Attribute Types | continuous/discrete; scale: ratio/interval/ordinal/nominal
Data: Performance Characteristics | 10M-1B observations; sample from stream of continuously updated corpus; dozens of dimensions/measures
Measurement: Unit of Observation | individuals; interactions
Measurement: Measurement type | observational
Measurement: Performance characteristic | high volume; complex network structure; sparsity; systematic and sparse metadata
Management Constraints | license; replication
Analysis methods | bespoke algorithms (clustering); nonlinear optimization; Bayesian methods
Desired Outputs | summary scalars (model coefficients); summary table; static/interactive visualization

More Information
• Grimmer, Justin, and Gary King. "General purpose computer-assisted clustering and conceptualization." Proceedings of the National Academy of Sciences 108.7 (2011): 2643-2650.
• King, Gary, Jennifer Pan, and Molly Roberts. "How censorship in China allows government criticism but silences collective expression." APSA 2012 Annual Meeting Paper. 2012.
• Lazer, David, et al. "Life in the network: the coming age of computational social science." Science (New York, NY) 323.5915 (2009): 721.

Page 16: Managing Confidential Information – Trends and Approaches

What’s wrong with this picture?

Name            SSN    Birthdate  Zipcode  Gender  Favorite Ice Cream  # of crimes committed
A. Jones        12341  01011961   02145    M       Raspberry           0
B. Jones        12342  02021961   02138    M       Pistachio           0
C. Jones        12343  11111972   94043    M       Chocolate           0
D. Jones        12344  12121972   94043    M       Hazelnut            0
E. Jones        12345  03251972   94041    F       Lemon               0
F. Jones        12346  03251972   02127    F       Lemon               1
G. Jones        12347  08081989   02138    F       Peach               1
H. Smith        12348  01011973   63200    F       Lime                2
I. Smith        12349  02021973   63300    M       Mango               4
J. Smith        12350  02021973   63400    M       Coconut             16
K. Smith        12351  03031974   64500    M       Frog                32
L. Smith        12352  04041974   64600    M       Vanilla             64
M. Smith        12353  04041974   64700    F       Pumpkin             128
N. Smith-Jones  12354  04041974   64800    F       Allergic            256

Page 17: Managing Confidential Information – Trends and Approaches

What’s wrong with this picture?

[Same table as on the previous slide, with the last row's name shown as N. Smith, annotated with the following callouts:]
• HIPAA & MA identifier
• Identifier & sensitive
• HIPAA identifier
• HIPAA identifier
• Sensitive
• Unexpected response?
• Mass. resident
• FERPA too?
• Californian
• Twins, separated at birth?
• Indirect identifier

v. 23 (7/18/2013)

Page 18: Managing Confidential Information – Trends and Approaches

Help, help, I’m being suppressed…

Name       SSN    Birthdate  Zipcode  Gender  Favorite Ice Cream  # of crimes committed
[Name 1]   12341  *1961      021*     M       Raspberry           .1
[Name 2]   12342  *1961      021*     M       Pistachio           -.1
[Name 3]   12343  *1972      940*     M       Chocolate           0
[Name 4]   12344  *1972      940*     M       Hazelnut            0
[Name 5]   12345  *1972      940*     F       Lemon               .6
[Name 6]   12346  *1972      021*     F       Lemon               .6
[Name 7]   12347  *1989      021*     *       Peach               64.6
[Name 8]   12348  *1973      632*     F       Lime                3
[Name 9]   12349  *1973      633*     M       Mango               3
[Name 10]  12350  *1973      634*     M       Coconut             37.2
[Name 11]  12351  *1974      645*     M       *                   37.2
[Name 12]  12352  *1974      646*     M       Vanilla             37.2
[Name 13]  12353  *1974      647*     F       *                   64.4
[Name 14]  12354  *1974      648*     F       Allergic            256

Labels on the table: Row; Var; Synthetic; Global Recode; Local Suppression; Aggregation + Perturbation
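Two of the techniques labelled on this slide, global recoding and local suppression, can be sketched as simple record transformations. The recoding rule matches what the table shows (birthdate coarsened to year, zipcode to its first three digits). The suppression rule used here, blanking any value that occurs fewer than `min_count` times, is an assumption chosen for illustration and is not necessarily the rule used to build the slide. Field names are hypothetical; the three rows reuse values from the example table.

```python
from collections import Counter


def global_recode(record):
    """Coarsen birthdate (MMDDYYYY) to its year and zipcode to its first three digits."""
    out = dict(record)
    out["birthdate"] = "*" + record["birthdate"][-4:]
    out["zipcode"] = record["zipcode"][:3] + "*"
    return out


def local_suppress(records, field, min_count=2):
    """Replace values of `field` that occur fewer than `min_count` times with '*'."""
    counts = Counter(r[field] for r in records)
    return [
        {**r, field: r[field] if counts[r[field]] >= min_count else "*"}
        for r in records
    ]


rows = [
    {"birthdate": "03251972", "zipcode": "94041", "ice_cream": "Lemon"},
    {"birthdate": "03251972", "zipcode": "02127", "ice_cream": "Lemon"},
    {"birthdate": "03031974", "zipcode": "64500", "ice_cream": "Frog"},
]

recoded = [global_recode(r) for r in rows]
released = local_suppress(recoded, "ice_cream")
for r in released:
    print(r)
```

Global recoding applies the same coarsening to every record, while local suppression touches only the cells that stand out, which is why the two are usually combined.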

Page 19: Managing Confidential Information – Trends and Approaches

k-anonymous – but not protected

Name     SSN  Birthdate  Zipcode  Gender  Favorite Ice Cream  # of crimes committed
* Jones  *    *1961      021*     M       Raspberry           0
* Jones  *    *1961      021*     M       Pistachio           0
* Jones  *    *1972      9404*    *       Chocolate           0
* Jones  *    *1972      9404*    *       Hazelnut            0
* Jones  *    *1972      9404*    *       Lemon               0
* Jones  *    *          021*     F       Lemon               1
* Jones  *    *          021*     F       Peach               1
* Smith  *    *1973      63*      *       Lime                2
* Smith  *    *1973      63*      *       Mango               4
* Smith  *    *1973      63*      *       Coconut             16
* Smith  *    *1974      64*      M       Frog                32
* Smith  *    *1974      64*      M       Vanilla             64
* Smith  *    04041974   64*      F       Pumpkin             128
* Smith  *    04041974   64*      F       Allergic            256

Law, policy, ethics; Research design …; Information security; Disclosure limitation

Callouts:
• Additional background
• Homogeneity
• Sort Order/Structure
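The homogeneity problem named on this slide can be made concrete: a table may be k-anonymous on its quasi-identifiers while every record in a group shares the same sensitive value, so membership in the group reveals that value. A minimal sketch, assuming birthdate, zipcode, and gender are the quasi-identifiers and the crime count is the sensitive attribute; the function names are hypothetical and the rows are a subset of the table above.

```python
from collections import defaultdict

QI = ("birthdate", "zipcode", "gender")


def equivalence_classes(records, quasi_identifiers=QI):
    """Group records that are indistinguishable on the quasi-identifiers."""
    groups = defaultdict(list)
    for r in records:
        groups[tuple(r[q] for q in quasi_identifiers)].append(r)
    return groups


def k_anonymity(records, quasi_identifiers=QI):
    """Size of the smallest equivalence class (the k for which the table is k-anonymous)."""
    return min(len(g) for g in equivalence_classes(records, quasi_identifiers).values())


def homogeneous_classes(records, sensitive, quasi_identifiers=QI):
    """Equivalence classes in which every record has the same sensitive value."""
    return [
        key
        for key, group in equivalence_classes(records, quasi_identifiers).items()
        if len({r[sensitive] for r in group}) == 1
    ]


rows = [
    ("*1961", "021*", "M", 0), ("*1961", "021*", "M", 0),
    ("*1972", "9404*", "*", 0), ("*1972", "9404*", "*", 0), ("*1972", "9404*", "*", 0),
    ("*", "021*", "F", 1), ("*", "021*", "F", 1),
    ("*1973", "63*", "*", 2), ("*1973", "63*", "*", 4), ("*1973", "63*", "*", 16),
]
records = [dict(zip(QI + ("crimes",), row)) for row in rows]

print("k =", k_anonymity(records))
print("homogeneous classes:", homogeneous_classes(records, "crimes"))
```

On these rows every class has at least two members, yet three of the four classes are homogeneous: anyone known to be in the ("*", "021*", "F") group is revealed to have exactly one recorded crime.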

Page 20: Managing Confidential Information – Trends and Approaches


Climate

Page 21: Managing Confidential Information – Trends and Approaches


Commercial Data Breaches

• Data from 100 million individuals exposed this year…

• Only a portion of breaches are reported

• Difficult to trace impacts… but estimated 8.3M identity thefts in 2005

Source: http://www.informationisbeautiful.net/visualizations/worlds-biggest-data-breaches-hacks/

Page 22: Managing Confidential Information – Trends and Approaches

Cloud computing risks

• Cloud computing decouples physical and computing infrastructure
• Increasingly used for core-IT, research computing, data collection, storage, and analysis
• Confidentiality issues
– Auditing and compliance
– Access and commingling of data
– Location of data and services and legal jurisdiction
– Vulnerabilities of network communication using single well-known key
– Vulnerability of key storage

Page 23: Managing Confidential Information – Trends and Approaches

Legal & Cultural Challenges

• EU right to be forgotten; French “le droit à l'oubli”; California social media privacy act
• Consumer privacy bill of rights; Do not track; Privacy Icons
• Evolving case law on locational privacy
• Public records, mug shots, and revenge porn
• State-level action on privacy regulation
• Attitudes towards sharing; surveillance

Page 24: Managing Confidential Information – Trends and Approaches

New Data – New Challenges

• How to limit disclosure without completely destroying utility?
– The “Netflix Problem”: large, sparse datasets that overlap can be probabilistically linked [Narayanan and Shmatikov 2008]
– The “GIS Problem”: fine geo-spatial-temporal data is impossible to mask when correlated with external data [Zimmerman 2008]
– The “Facebook Problem”: it is possible to identify masked network data if only a few nodes are controlled [Backstrom, et al. 2007]
– The “Blog Problem”: pseudonymous communication can be linked through textual analysis [Tomkins et al. 2004]

[For more examples see Vadhan, et al. 2010]

Source: [Calabrese 2008; Real Time Rome Project 2007]
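The “Netflix Problem” can be illustrated with a toy linkage attack: when each record is a sparse set of items, even a small amount of outside knowledge about one person tends to match a single masked record. The sketch below uses plain Jaccard similarity, which is a deliberate simplification and not the scoring method of the cited paper; the record identifiers, item names, and function names are all invented for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of items."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(aux_items, masked_records):
    """Return (record_id, score) for the masked record most similar to the auxiliary items."""
    return max(
        ((rid, jaccard(aux_items, items)) for rid, items in masked_records.items()),
        key=lambda pair: pair[1],
    )


# Masked release: pseudonymous ids mapped to sparse item sets (invented data).
masked = {
    "user-17": {"film-a", "film-b", "film-q"},
    "user-42": {"film-c", "film-x", "film-y", "film-z"},
    "user-99": {"film-a", "film-d"},
}

# Auxiliary knowledge about one named person, e.g. from public reviews (invented data).
public_profile = {"film-x", "film-y", "film-z"}

print(best_match(public_profile, masked))
```

Because the item sets barely overlap across users, three known items are enough to single out one pseudonymous record, which is the property that makes sparse data hard to mask.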

Page 25: Managing Confidential Information – Trends and Approaches


Weather

Page 26: Managing Confidential Information – Trends and Approaches

Possible Legal/Regulatory Changes for 2013-15

• Likely
– New information privacy laws in selected states
– Increased open data requirements from federal funders
– Adoption of data availability requirements by increasing numbers of journals

Law, policy, ethics; Research design …; Information security; Disclosure limitation

Page 27: Managing Confidential Information – Trends and Approaches


Page 28: Managing Confidential Information – Trends and Approaches


Research

Page 29: Managing Confidential Information – Trends and Approaches

Traditional approaches are failing

• Modal traditional approach:
– removing subjects’ names
– storing descriptive information in a locked filing cabinet
– publishing summary tables
– (sometimes) releasing a public use version that suppresses and recodes descriptive information
• Problems
– law is changing: requirements are becoming more complex
– research computing is moving towards the cloud, other distributed storage
– researchers are using new forms of data that create new privacy issues
– advances in the formal analysis of disclosure risk imply the impracticality of “de-identification” as required by law

Page 30: Managing Confidential Information – Trends and Approaches

A National Science Foundation Secure and Trustworthy Cyberspace Project

Supported by award #1237235

Privacy Tools for Sharing Research Data

The Dataverse Network will Distribute and Manage Confidential Databases

Policy tools Guide Information Management Across the Research Lifecycle

Differentially Private Algorithms Shield Individuals in Databases
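To give a sense of what a differentially private algorithm looks like, the sketch below applies the standard Laplace mechanism to a counting query. It is a textbook illustration under stated assumptions, not the project's implementation: the function name `private_count` and its parameters are hypothetical, the data reuse the crime counts from the example table earlier in the deck, and the Laplace noise is drawn as the difference of two exponential variates.

```python
import random


def private_count(values, predicate, epsilon, rng=random):
    """Return a noisy count of values satisfying `predicate`.

    A counting query changes by at most 1 when one individual is added or
    removed (sensitivity 1), so Laplace noise with scale 1/epsilon is added.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for v in values if predicate(v))
    rate = epsilon  # exponential rate = 1 / scale = epsilon / sensitivity
    # The difference of two i.i.d. exponentials with rate `rate` is Laplace(0, 1/rate).
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


crimes = [0, 0, 0, 0, 0, 1, 1, 2, 4, 16, 32, 64, 128, 256]

rng = random.Random(2013)  # seeded only so the demo is repeatable
noisy = private_count(crimes, lambda c: c > 0, epsilon=0.5, rng=rng)
print(f"noisy count of people with at least one crime: {noisy:.2f}")
```

Smaller values of `epsilon` add more noise and give stronger protection; larger values give answers closer to the true count of 9.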

Page 31: Managing Confidential Information – Trends and Approaches

Approaches

• Policy
– Legal Reforms
– Information Accountability
– Economic rights
– Information transparency
    • Aboutmydata.com
– Privacy Nudges
– Privacy Icons
• Cryptography
– Multiparty computation
– Zero knowledge protocols
– Functional encryption
– Homomorphic encryption
• Statistics
– Synthetic data
– Reidentification risk
– K-anonymity; homogeneity
– Differential privacy
• Information Lifecycle & Infrastructure
– Open consent
– Metadata frameworks
– Information accountability
– Policy aware filesystems
    • iRODS
– Data Vaults
    • Project VRM
– Secure data enclave
– Standardized Data Use Agreements

Page 32: Managing Confidential Information – Trends and Approaches


Recent Work –Economics & Public Policy Research/Outreach

• March 2013 – Dwork & Vadhan lead roundtable in Differential Privacy and Law and Policy (conference), Cardozo Law School

• March 2013 – Altman provided oral comments (recorded) on Public Workshop on Revisions to the Common Rule, National Academies, on limits of HIPAA approach to privacy.

• May 2013 – Altman & Crosas submitted written testimony to Public Access to Federally-Supported Research and Development Data, National Academies; including approaches to management of privacy for data sharing.

• June 2013 – Dwork, Sweeney, & Vadhan invited & participated in Privacy Law Scholars Conference, George Washington Law School/Berkeley Law School

• June 2013 -- Yiling Chen, Stephen Chong, Ian Kash, Tal Moran, and Salil Vadhan. “Truthful Mechanisms for Agents that Value Privacy”, Proceedings of the 14th ACM Conference on Electronic Commerce (EC), June 2013.

• September - Integrating Approaches to Privacy across the Research Lifecycle Workshop

• In Progress – Rewrite and expansion of Vadhan, S., et al. 2010. “Re: Advance Notice of Proposed Rulemaking: Human Subjects Research Protections”, proposing a framework for integrating modern privacy concepts into Human Subjects protections.

Page 33: Managing Confidential Information – Trends and Approaches

Information Life Cycle Model

[Diagram: lifecycle stages]
• Creation/Collection
• Storage/Ingest
• Processing
• Internal Sharing
• Analysis
• External dissemination/publication
• Re-use: Scientometric; Education; Scientific; Policy
• Long-term access

[Diagram: surrounding frameworks]
• Research methods
• Data Management Systems
• Legal / Policy Frameworks
• Statistical / Computational Frameworks

Page 34: Managing Confidential Information – Trends and Approaches

Example: Stakeholder Concerns Across Lifecycle

Stakeholders:
• Research sources: research subjects; owners of subject material; owners of supplementary data
• Research sponsors: home institution; funding sources
• Project personnel: investigators; research staff
• Research publishers: print publishers; research archives
• Research consumers: readers; secondary researcher

Legal Issues (diagram labels, in extracted order):
• Licensing; Copyright; DMCA; Informed Consent; Privacy; Trade secrets
• Licensing; Freedom of Information; Copyright
• Copyright
• Copyright; Licensing
• Fair Use
• Information Transfer
• Privacy; Confidentiality; Intellectual Property

Stakeholder Concerns (diagram labels, in extracted order):
• Replicable Research; Policy Relevance; Accessibility of Research; Protect IP; Avoid third party IP/Privacy Issues
• Replicable Research; Publish; Promote use of Publications; Track use
• Replicable research; Promote use of their publications; Protect publisher IP; Avoid third party IP/Privacy Issues
• Replicate and extend; Secondary analysis; Link research

Page 35: Managing Confidential Information – Trends and Approaches

Modeling Features

Features | Characteristics
Data | Structure; Source; Unit of observation; Attribute types; Dimensionality; Number of observations; homogeneity; frequency of updates; quality characteristics
Analytic Results | Form of output; analysis methodology; analysis/inferential goal; utility/loss/quality
Disclosure scenario | Source of threat; areas of vulnerability; attacker objectives, background knowledge, capability; Breach criteria/disclosure concept
Stakeholders | Stakeholder types; capacities; trust relationships; budgets
Lifecycle characteristics | Lifecycle stages controlled/in scope; policies used; stakeholders involved at each stage
Current privacy management approach | Regulation/policy; legal controls; statistical/computational disclosure methods; information security controls

Page 36: Managing Confidential Information – Trends and Approaches

Legal/Policy Frameworks

Headings: Contract; Intellectual Property; Access Rights; Confidentiality

Diagram labels (in extracted order):
• Copyright
• Fair Use
• DMCA
• Database Rights
• Moral Rights
• Intellectual Attribution
• Trade Secret
• Patent
• Trademark
• Common Rule 45 CFR 46
• HIPAA
• FERPA
• EU Privacy Directive
• Privacy Torts (Invasion, Defamation)
• Rights of Publicity
• Sensitive but Unclassified
• Potentially Harmful (Archeological Sites, Endangered Species, Animal Testing, …)
• Classified
• FOIA
• CIPSEA
• State Privacy Laws
• EAR
• State FOI Laws
• Journal Replication Requirements
• Funder Open Access
• Contract
• License
• Click-Wrap TOU
• ITAR
• Export Restrictions

Page 37: Managing Confidential Information – Trends and Approaches

Risk Assessment

• [NIST 800-100, simplification of NIST 800-30]

Law, policy, ethics; Research design …; Information security; Disclosure limitation

Information Security Control Selection Process:
• System Analysis
• Threat Modeling
• Vulnerability Identification
• Analysis: likelihood; impact; mitigating controls
• Institute Selected Controls
• Testing and Auditing

Page 38: Managing Confidential Information – Trends and Approaches

Systems Policy Research questions deriving from Information Lifecycle Analysis

• Infrastructure requirements analysis
– Data acquisition, storage, dissemination
– Identification, authorization, authentication
– Metadata, protocols
• System design: potential implementation cost of interactive privacy:
– Information security: hardening
– Information security: certification & auditing
– Model server development, provisioning, maintenance, reliability, availability
• System design: information security tradeoffs of interactive privacy mechanisms:
– Availability risks: denial of service attack
– Availability/integrity risks: privacy budget exhaustion attacks
– Integrity risks: modification of delivered results (e.g. man-in-the-middle attacks)
– Secrecy/privacy: breach of authentication/authorization layer
• System design: optimizing privacy & utility across lifecycle
– When does limiting disclosive data collection dominate methods at the data analysis stage?
– When do restricted virtual data enclaves + public synthetic data dominate interactive mechanisms?
• System design: information use/reuse
– Support of scientific analysis use cases (model diagnostics, exploratory data analysis, integration of external data) within interactive privacy systems
– Align informational assumptions across stages & incorporate informative priors?
– Requirements for scientific replication/verification of results produced by model servers?
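The “privacy budget exhaustion” risk listed above follows from how an interactive mechanism has to account for cumulative privacy loss. The sketch below assumes basic sequential composition (the epsilons of answered queries add up) and a single shared budget per dataset; the class and exception names are hypothetical and this is not a description of any particular model server.

```python
class BudgetExhausted(Exception):
    """Raised when a query would exceed the dataset's total privacy budget."""


class PrivacyBudget:
    def __init__(self, total_epsilon):
        if total_epsilon <= 0:
            raise ValueError("total_epsilon must be positive")
        self.total = total_epsilon
        self.spent = 0.0

    @property
    def remaining(self):
        return max(self.total - self.spent, 0.0)

    def charge(self, epsilon):
        """Deduct one query's epsilon, or refuse if the budget would be exceeded."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        # Small tolerance so floating-point rounding does not reject an exact fit.
        if self.spent + epsilon > self.total + 1e-12:
            raise BudgetExhausted(
                f"requested {epsilon}, only {self.remaining} remaining"
            )
        self.spent += epsilon


budget = PrivacyBudget(total_epsilon=1.0)
budget.charge(0.5)   # first analyst's query
budget.charge(0.5)   # a second query uses up the rest
try:
    budget.charge(0.1)
except BudgetExhausted as err:
    print("query refused:", err)
```

Once the shared budget is spent, every later query is refused, so anyone able to submit queries can deny service to legitimate analysts. Per-user quotas are one possible mitigation, at the cost of more complex accounting.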

Page 39: Managing Confidential Information – Trends and Approaches

Legal Policy Research questions deriving from Information Lifecycle Analysis

• Legal requirements across lifecycle stages
• Legal instruments: capturing scientific privacy concepts in legal instruments consistently across lifecycle
– service level agreements
– consent terms
– deposit agreement
– data usage agreements
– regulatory language

Page 40: Managing Confidential Information – Trends and Approaches

Public Policy Research Questions

• Where does the market fail for sharing confidential research data?
– What market conditions are theoretically violated?
– What is the empirical evidence of the degree of violation?
– How does the degree of violation vary by policy context & use case?
• Policy equilibria
– What are contribution and privacy equilibria for data sharing under different privacy concepts?
• Interventions
– How do proposed interventions (e.g. advise & consent; “privacy icons”; uniform regulations; breach notification; information accountability; anonymization) correspond to sources of market failures?

Page 41: Managing Confidential Information – Trends and Approaches

Beyond Legal Research -- Market Theory

• Conditions on markets
– No political/legal distortions [See, e.g., Posner 1978]
– Common knowledge
– No barriers to entry
• Conditions on agents [See e.g. Acquisti 2010; Tsai, Egelman, Cranor & Acquisti 2010]
– Perfect rationality
– Self-interested
– Infinitely many agents
– Stable preferences
• Conditions on goods
– Consumptive goods
– Excludable goods
– Decreasing returns to scale
– Transferability
– No externalities
• Conditions on exchange [See e.g., Benisch, Kelley, Sadeh, & Cranor 2011; McDonald & Cranor 2010]
– No transaction costs
– No information asymmetries
• Conditions on equilibrium valuation
– Pareto optimality vs. economic surplus
– Ignorability of distributional concern

Types of goods:
• Private Goods: excludable; consumable; no externalities
• Commons: non-excludable; consumable; negative externalities
• Public Good: non-excludable; non-consumable; positive externalities
• Toll Good: partially non-excludable; non-consumable; positive externalities

Page 42: Managing Confidential Information – Trends and Approaches

Bibliography (Selected)

• L. Willenborg and T. D. Waal. Elements of Statistical Disclosure Control, volume 155 of Lecture Notes in Statistics. Springer Verlag, New York, NY, 2001.
• Higgins, Sarah. "The DCC curation lifecycle model." International Journal of Digital Curation 3.1 (2008): 134-140. www.dcc.ac.uk/resources/curation-lifecycle-model
• ESSNET, Handbook on Statistical Disclosure Control. 2011. neon.vb.cbs.nl/casc/SDC_Handbook.pdf
• Fung, Benjamin, et al. "Privacy-preserving data publishing: A survey of recent developments." ACM Computing Surveys (CSUR) 42.4 (2010): 14.
• Altman, M. (2012). “Mitigating Threats To Data Quality Throughout the Curation Lifecycle”. In G. Marciano, C. Lee, & H. Bowden (Eds.), Curating For Quality. datacuration.web.unc.edu

Page 43: Managing Confidential Information – Trends and Approaches

Questions?

E-mail: [email protected]
informatics.mit.edu