Why Can’t We All Just Share?

Why Can’t We All Just Share? Ken Smith, The MITRE Corporation ([email protected]). MITRE CIDR ’05 - Monterey HBP. © 2005 The MITRE Corporation. All rights reserved.


Page 1: Why Can’t We All Just Share?

MITRE CIDR ’05 - Monterey HBP

© 2005 The MITRE Corporation. All rights reserved.

Why Can’t We All Just Share?

Ken Smith

The MITRE Corporation

([email protected])

Page 2: Why Can’t We All Just Share?

Sharing Can Be Really Good!

Page 3: Why Can’t We All Just Share?

A Four Dimensional Space of Open Issues ...

1) Scope of Intended Visibility
2) Quality of Annotation
3) Sender-Receiver Homogeneity
4) Accessibility

Data that doesn’t share well: private, non-specific, customized, inaccessible

Big win: public, detailed, reconciled, available

Must Solve Problems in: policy, info extraction, integration, infrastructure

Page 4: Why Can’t We All Just Share?

A Data Sharing Story

[Figure: data held by Laboratory A and the communities it may be shared with over time. Labels: Laboratory A (25 subjects over 4 years, 400 Alzheimer’s images); Laboratory B (algorithms, year 3); Laboratory C; Laboratory D; Laboratory E; Alzheimer’s Research Community; year 5??; year 8; Internet; 10 images.]

Page 5: Why Can’t We All Just Share?

Some Reflections . . .

• How can PIA unambiguously express, communicate, and incrementally evolve his sharing intent?
  - in what language? (must be simple yet expressive)

• How can the described sharing be implemented and enforced (in new environments) without a heroic effort by PIA
  - who has other things to do with his time involving neuroscience!

• What role does a local lab database play? Public databases? Email? Web servers? P2P tools?
  - what tools are used? (must work well with what exists)

Page 6: Why Can’t We All Just Share?

What’s Needed (In Tools / Policy)

• Communities (of all sorts) should be first-class citizens
• Well-defined “channels” of information flow

Data owners want to be able to control their exposure to risks as they share.

• Incremental degrees of visibility
• Dynamic sharing coalitions (possibly many at once)
• Simple, widely-understood expressions of sharing intent
• Support for risk management

Page 7: Why Can’t We All Just Share?

Thank You For Sharing These 5 Minutes With Me.

NIH / NIMH

URL: neuroinformatics.mitre.org

Page 8: Why Can’t We All Just Share?

Backup Slides

Page 9: Why Can’t We All Just Share?

Data Sharing Sure Has Gotten A Lot of Attention Lately

• Millions of teenagers, their favorite music, and KaZaA
• Homeland security, Total Information Awareness (TIA), fighting terrorism
• Medical research records, funding agencies, finding a cure for Alzheimer’s

Societal behemoths are on a collision course over data sharing issues

• The Recording Industry Association of America (RIAA) and lawsuits
• The Electronic Frontier Foundation (EFF), US news media, individuals
• The Health Insurance Portability and Accountability Act (HIPAA), faculty concerned about getting scooped

Share Freely! No You Don’t!

Page 10: Why Can’t We All Just Share?

Reflections on this Story

• Lessons about data visibility:
  - data visibility tends to increase incrementally with time and events (e.g. publications)
  - data visibility is associated with the perception of risk
  - data visibility centers on specific communities at specific times

• Questions about realizing this scenario:
  - How can PIA unambiguously express, communicate, and incrementally evolve his sharing intent?
  - How can the described sharing occur without a heroic effort by PIA?
  - What role does the local database play? Public databases? Peer-to-peer sharing tools? In general, how is sharing intent implemented in real systems?

Data owners want to be able to control their exposure to risks as they share.
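
The lesson that visibility grows incrementally, community by community, as time passes and events occur can be made concrete with a small sketch. Everything here is an assumption for illustration: the `Grant` record, the community names, the years, and the triggering events are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    community: str  # who gains visibility
    year: int       # hypothetical project year at which the grant takes effect
    trigger: str    # event motivating the grant, e.g. a publication


# Hypothetical schedule; communities, years and triggers are illustrative only.
SCHEDULE = [
    Grant("own lab", 0, "data collected"),
    Grant("peer lab", 3, "collaboration agreed"),
    Grant("research community", 5, "first publication"),
    Grant("internet", 8, "final publication"),
]


def visible_to(year, schedule=SCHEDULE):
    """Return the communities that can see the data in the given project year."""
    return {grant.community for grant in schedule if grant.year <= year}
```

Because a grant is never withdrawn in this sketch, the visible set can only grow from year to year, which mirrors the incremental pattern described in the lessons. Revocation, a real need in practice, is deliberately left out.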

Page 11: Why Can’t We All Just Share?

Isn’t Data Sharing just a Policy Issue? (i.e. Non-informatic)

1) The data owner/shepherd’s sharing intentions (Policy)

2) Their clear expression in a language (Encoding)

3) Their execution in a computerized system capable of sharing data (Automated Enforcement)

Data sharing involves computerized systems which must “understand” the data owner’s intent
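
A minimal sketch of the three stages (policy, encoding, automated enforcement), under stated assumptions: the JSON schema, the dataset names, and the community names below are hypothetical, and a real system would need authentication, auditing, and a far richer policy language.

```python
import json

# 1) Policy: the owner's intent, stated informally (hypothetical example):
#    "Peers may see dataset-1; the public may see only the sample."

# 2) Encoding: the same intent as machine-readable data (hypothetical schema
#    mapping each dataset to the communities allowed to read it).
ENCODED_POLICY = json.dumps({
    "dataset-1": ["lab", "peer"],
    "sample": ["lab", "peer", "public"],
})


# 3) Automated enforcement: a system consults the encoded policy on each request.
def is_allowed(encoded_policy, dataset, community):
    """Return True only if the policy explicitly lists the community for the dataset."""
    policy = json.loads(encoded_policy)
    return community in policy.get(dataset, [])
```

Requests for datasets that the policy does not mention are denied by default, so the owner's exposure never exceeds what was explicitly encoded.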

Page 12: Why Can’t We All Just Share?

What is Neuroimagery?

Page 13: Why Can’t We All Just Share?

Why Share Neuroimagery?

• “Large N” results
  - key scientific results are unobtainable with the images any single lab is likely to possess

• Peer-to-peer collaboration for mutual publication
  - “A” has the data, “B” has the algorithms

• Obligation to funding source
  - funding agencies want the biggest “bang for their buck”

• Altruism
  - extend usefulness of unusual or hard-won datasets, and benefit the field as a whole, and poorer labs in particular

Page 14: Why Can’t We All Just Share?

What Is Sharing?

• Privacy
  - “Seclusion or isolation from the view of, or from contact with, others” (Webster’s)
  - A relational sphere of trust and immunity from external intrusion, encompassing people and information (personal)

• Data Sharing
  - Voluntary disclosure of privately-held information

Implication: for sharing to occur, the perceived benefits of disclosure must outweigh the perceived risks

Page 15: Why Can’t We All Just Share?

An Overview of the Risks Of Sharing Neuroimagery (Result of an Informal Survey)

• Information theft risks
  - scooped results; uncited sources; mass downloads; uncompensated commercial use; vengeful deanonymization

• Information abuse risks
  - insurance denial; shared form of data altered; data misunderstood and improperly reused; reuse for purposes opposed by the subject (e.g. racism)

• Loss of time and effort risks
  - besieging questions from colleagues; cost of learning data sharing tools; cost of compliance with complex regulations

• Subject privacy risks
  - shared data not properly anonymized; shared data is found to violate HIPAA, resulting in a financial penalty or a prison term

Each domain has its characteristic sharing risks.

Page 16: Why Can’t We All Just Share?

A Label-based Model for Data Sharing

• Each class is a COI (community of interest)
  - users sharing a common task and using common data
  - data and users are labeled with their COI, and users can access all data in their COI

• Dominance entails set membership:
  - If person P ∈ COI1 and COI1 dominates (belongs to) COI2, then P ∈ COI2
  - Thus, one can “read down”

• “Lower” classes offer more visibility (and risk)
  - information flows downward over time

[Figure: lattice of COIs. Node labels: null (top); Lab A; Lab B; Lab C; Peer A-B; Research Group 1; Research Group 2; internet.]
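
The dominance rule and the “read down” property can be sketched in a few lines. The node names come from the slide's figure, but the edges between them are not recoverable from the transcript, so the `DOMINATES` table below is an assumed hierarchy for illustration only; the function names are likewise hypothetical.

```python
# Assumed "dominates" edges: each COI maps to the COIs directly below it.
# Node names follow the slide; the edges themselves are a guess.
DOMINATES = {
    "null (top)": {"Lab A", "Lab B", "Lab C"},
    "Lab A": {"Peer A-B"},
    "Lab B": {"Peer A-B"},
    "Lab C": {"Research Group 2"},
    "Peer A-B": {"Research Group 1"},
    "Research Group 1": {"internet"},
    "Research Group 2": {"internet"},
    "internet": set(),
}


def memberships(coi):
    """Return the COI itself plus every COI it dominates, directly or transitively."""
    seen = set()
    stack = [coi]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(DOMINATES.get(current, ()))
    return seen


def can_read(user_coi, data_label):
    """A user may read data labeled with their own COI or any COI below it."""
    return data_label in memberships(user_coi)


def release(current_label, new_label):
    """Relabel data to the same or a lower COI; information only flows downward."""
    if new_label not in memberships(current_label):
        raise ValueError(f"{new_label!r} is not at or below {current_label!r}")
    return new_label
```

With these assumed edges, a member of Lab A can read data labeled Peer A-B, Research Group 1, or internet, while a member of Lab C cannot read Peer A-B data. Data labeled null (top) is readable by no lab, which matches the idea that higher labels carry less visibility and less risk.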