Laura Odwazny, Senior Attorney, Office of the General Counsel, U.S. Department of Health and Human Services. Research Using Data Mined from the Internet -- Regulatory Considerations. DOE CIRB meeting, June 14, 2012



Laura Odwazny
Senior Attorney
Office of the General Counsel
U.S. Department of Health and Human Services

Research Using Data Mined from the Internet -- Regulatory Considerations

DOE CIRB meeting
June 14, 2012


Disclaimer

This presentation does not constitute legal advice. The views expressed are the presenter’s own, and do not bind the U.S. Department of Health and Human Services or its components.


Do Note:

• OHRP has no guidance on Internet research specifically
• Many boards have separate guidelines and best practices for Internet research


Internet Research

Internet research = research which utilizes the Internet to collect information through an online tool, such as an online survey; studies about how people use the Internet, e.g., through collecting data and/or examining activities in or on any online environments; and/or, uses of online datasets, databases, databanks, repositories.

– Internet as a TOOL FOR research or…
– Internet as a MEDIUM/LOCALE OF research
  • TOOL = search engines, databases, catalogs, etc.
  • MEDIUM/LOCALE = chat rooms, newsgroups, home pages, multi-player gaming sites, blogs, Skype, tweeting, online course software, etc.


Forms of Research: Exploring Where Human Subjects Fit

• Consider Methodologies, Venues, Types of Data Generated through:

• Quantitative Research
  – Data Aggregation, Scraping, Transaction Log Analysis, Network Analysis, Statistical Analysis, etc.
• Qualitative Research
  – Ethnography, Focus Groups, Observation, Surveys, Content/Discourse Analysis, etc.


Forms of Internet Research Venues

• Email, IM, tweets
• Listservs, chat rooms
• Search engines, other archives
• Social network sites, media sharing sites
• Blogs and home pages
• Virtual worlds
• Online marketplaces, online gaming
• Databanks, repositories
• Venues other than “place-based”, e.g. mobile data collection


E-Data Raises New Ethical Challenges

• Trackability
  – “Dataveillance” = data monitoring + recording
• “Greased”
  – “When information is computerized, it is greased to slide easily and quickly to many ports of call. But legitimate concerns about privacy arise when this speed and convenience lead to the improper exposure of information. Greased information is information that moves like lightning and is hard to hold onto.”
• Malleability
  – Can be utilized in varied ways for multiple purposes
• Invisibility Factor
  – Computer operations usually invisible; can allow for abuse

James Moor, 1985


Data aggregation/scraping


Online Support Groups


Twitter

• Blurs the boundaries between public/private
• Tweeter A (private) is followed by Tweeter B (public); Tweeter B retweets A = Tweet A is now visible to Tweeter B’s (public) feed
• Track-backability is increased; consider sensitivity, reputation, risk/benefit
• Archived Tweet Data fields:
  – country code:
  – id:
  – klout score
  – link:
  – location
  – coord type:
  – location coords:
  – location displayname:
  – location type:
  – posted time:
  – real name:
  – rule match:
  – tweet url:
  – user twitter page:
  – username:
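To make the track-back concern concrete, the following is a minimal sketch of dropping the most directly identifying fields from an archived tweet record before analysis. The snake_case field names are hypothetical renderings of the fields listed above, the grouping into "direct identifiers" and "location fields" is an assumption made for illustration, and the sample record is invented.

```python
# Hypothetical snake_case versions of the archived tweet fields listed on the slide.
# Which fields count as identifying is an assumption here, not a regulatory finding.
DIRECT_IDENTIFIERS = {
    "id", "link", "real_name", "tweet_url", "user_twitter_page", "username",
}
LOCATION_FIELDS = {
    "location", "coord_type", "location_coords",
    "location_displayname", "location_type",
}


def strip_identifying_fields(record, drop_location=True):
    """Return a copy of the record without the fields flagged above."""
    to_drop = set(DIRECT_IDENTIFIERS)
    if drop_location:
        to_drop |= LOCATION_FIELDS
    return {key: value for key, value in record.items() if key not in to_drop}


# Invented sample record, for illustration only.
example_record = {
    "country_code": "US",
    "id": "tag:example:1",
    "klout_score": 41,
    "location_displayname": "Washington, DC",
    "posted_time": "2012-06-14T12:00:00Z",
    "real_name": "Example Person",
    "username": "example_user",
}

reduced = strip_identifying_fields(example_record)
print(sorted(reduced))
```

Removing fields in this way does not by itself settle whether the remaining data are identifiable: values such as posted time may still allow a tweet to be traced back to its author.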


Regulatory considerations


Big regulatory issues…

• What is “private”?
• What is “identifiable”?
• How to protect subjects’ privacy and confidentiality interests?
• Minimizing risk when using sensitive online data
  – Current sensitivity vs. future sensitivity
  – Informational risks
  – Data security


OHRP’s Analytic Framework for the Common Rule: Always Start With…

• Is the activity subject to regulation?
  – Conducted or supported by a Common Rule agency?
  – Covered under an applicable FWA?
• Is it research?
• Does it involve human subjects?
• Is it exempt?

Keep in mind regulatory flexibilities:
  – Can it be expedited?
  – Waiver of informed consent?
  – Waiver of documentation of consent?


Human subject

.102(f): “a living individual about whom an investigator conducting research obtains (1) data through intervention or interaction with the individual, or (2) identifiable private information… Private information includes information about behavior that occurs in a context in which an individual can reasonably assume that no observation or recording is taking place, and information which has been provided for specific purposes by an individual and which the individual can reasonably expect will not be made public (for example, a medical record). Private information must be individually identifiable (i.e., the identity of the subject is or may readily be ascertained by the investigator or associated with the information) in order for obtaining the information to constitute research involving human subjects.” (emphasis added)


Privacy in the Internet age

Private

• How to interpret “reasonably assume that no observation or recording is taking place” or “reasonably expect will not be made public”
  – IMs, tweets, email, FB profile, chatroom discussions, listservs
• Must information be considered either “public” or “private”?
  – Members-only forum, community standards
• Shifting norms about what information is “private”
• What is a “reasonable” expectation of privacy in grid/Internet/e-data?
  – Expectations of privacy vs. actual privacy


How should the IRB assess privacy?

• What expectations of privacy are “reasonable”?
  – Get information about the environment
  – Get information about the users
  – Review Terms of Service
  – Data security considerations


Human subjects (2)

Identifiable

• Individually identifiable = subject’s identity readily ascertainable by the investigator or associated with the information
• Structure of social network, search terms, purchase habits, movie ratings on Netflix may uniquely identify an individual
  – Zip code + sex + DOB enough for Latanya Sweeney to identify individuals
• Given demonstrated ability to reidentify individuals from anonymized or aggregated data, is this a meaningful decision point?
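One way to reason about the zip code + sex + DOB point is to count how many records in a dataset are unique on a given combination of fields. The sketch below does that on an invented toy dataset; the function name `unique_fraction`, the field names, and the records are all hypothetical, and a real assessment would also have to consider outside data sources that could be linked to the records.

```python
from collections import Counter


def unique_fraction(records, quasi_identifiers):
    """Fraction of records whose combination of the given fields appears only once."""
    keys = [tuple(record[field] for field in quasi_identifiers) for record in records]
    if not keys:
        return 0.0
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(keys)


# Invented toy records, for illustration only.
toy_records = [
    {"zip": "20850", "sex": "F", "dob": "1970-01-02"},
    {"zip": "20850", "sex": "F", "dob": "1970-01-02"},
    {"zip": "20850", "sex": "M", "dob": "1981-05-17"},
    {"zip": "20852", "sex": "F", "dob": "1990-11-30"},
]

print(unique_fraction(toy_records, ["zip", "sex", "dob"]))  # 0.5
print(unique_fraction(toy_records, ["sex"]))                # 0.25
```

In the toy data, half of the records are unique on the three fields together, while only one in four is unique on sex alone; combining fields that look harmless individually is what drives the reidentification risk.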


How should the IRB assess identifiability?

• When will the subject’s identity be “readily” ascertainable by the investigator or associated with the information?
  – Consider the investigator, e.g. Professor Latanya Sweeney vs. Professor Elizabeth Buchanan
  – Consider the potential identifiers
  – Consider likelihood of reidentification with triangulation


Exemption .101(b)(4)

• Research involving the collection or study of existing data, documents, records, pathological specimens, or diagnostic specimens, if these sources are publicly available or if the information is recorded by the investigator in such a manner that subjects cannot be identified, directly or through identifiers linked to the subjects.


Exemption .101(b)(4) applied

• When is information “recorded in an identifiable manner”?
  – Is an email address an identifier?
  – Do tweets contain identifiers?
  – Does the inclusion of IP address make information identifiable?
• When are data, documents, or records publicly available on the internet?
  – Does “publicly available” include large datasets purchased/obtained from Google or Facebook?
  – What if data are semi-restricted -- available only to ‘friends’, listserv members?
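As a practical aid to the first set of questions, a reviewer or investigator might screen collected text for strings that look like email addresses or IPv4 addresses. The sketch below is one simple way to do so; the regular expressions are deliberately loose approximations, the function name and sample text are hypothetical, and a match (or the absence of one) does not answer the regulatory question of whether the information is identifiable.

```python
import re

# Loose patterns for screening only; neither is a full validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_candidate_identifiers(text):
    """Return email-like and IPv4-like strings found in the text."""
    emails = EMAIL_RE.findall(text)
    # Keep only dotted quads whose four parts are all in the 0-255 range.
    ips = [
        candidate
        for candidate in IPV4_RE.findall(text)
        if all(int(part) <= 255 for part in candidate.split("."))
    ]
    return {"email": emails, "ipv4": ips}


# Invented sample text, for illustration only.
sample = "Posted by jane.doe@example.org from 192.0.2.44; build 999.1.1.1 is not an address."
found = find_candidate_identifiers(sample)
print(found)
```

A screen like this only flags obvious patterns; usernames, profile links, and distinctive quoted text can point to a person just as directly and would need separate review.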


Key Considerations for IRB Review

• What type of venue?
• Expectations of privacy?
• Consent procedures?
• Sensitivity of data?
• Harm/Risk?
• Age verification?
• Authentication of participants?
• Identification of participants?
• Use of encryption?
• Storage/transmission of data?


Other potential issues – international research

• “PI is proposing to collect data from publicly accessible social media sites, some of which are hosted by servers outside of the US. The PI will collect all data from his computer in the US. Is the activity international research?” (from IRB Forum)
  – Consider EU Data Protection Directive, Canadian laws, etc. if applicable!


Stay tuned


ANPRM – Implications for Internet research

• Base concept of identifiability under Common Rule on HIPAA Privacy Rule standards of identifiability?

• To protect from informational risks (inappropriate use/disclosure of information), mandatory data security measures “modeled on” HIPAA?

• Apply Common Rule to all institutions receiving support from CR agency?

• No continuing review for most minimal risk research?


ANPRM – Proposals for “excused” research

• Additional requirements for “excused” (formerly exempt) research?
  – Registration
  – Consent, oral or written, depending, with waiver contemplated
    • Oral w/o documentation for educational tests, surveys, focus groups, interviews
  – Data security standards
  – Retrospective auditing of portion of “excused” submissions


Proposal: Revised scope of existing exemption 4

• Expansion of .101(b)(4) by removing the “existing” and de-identified recording requirements?
  – Keep the requirement that data were collected for purposes other than the research


ANPRM – consent and exempt research

• Additional consent requirements for “excused” (formerly exempt) research?
  – Oral or written consent, depending, with waiver contemplated

• Oral w/o documentation for educational tests, surveys, focus groups, interviews (modifying exemption 46.101(b)(2))

• Secondary use of data (modifying exemption 46.101(b)(4))

– originally collected for research purposes, consent required whether or not the researcher obtains identifiers

– originally collected for non-research purposes, no change (no consent required unless identifiers are obtained)


Your Experiences, Comments, Questions