Data at Work: Supporting Sharing in Science and Engineering (Birnholtz & Bietz, 2003) Adam Worrall LIS 6269 Seminar in Information Science 3/30/2010

Page 1: Data at Work: Supporting Sharing in Science and Engineering

Data at Work: Supporting Sharing in Science and Engineering (Birnholtz & Bietz, 2003)

Adam Worrall
LIS 6269 Seminar in Information Science
3/30/2010

Page 2: Data at Work: Supporting Sharing in Science and Engineering

Data and data sharing

• Information science needs “a better understanding of the use of data in practice” (p. 339)
• Data fundamentally “different from documents” (p. 339)
• Data sharing important (p. 339-340)
  – “Openness” of scientific process
    • Confirm findings, replicate results
    • Build on previous work
  – Large data sets require distributed collaboration
    • Collaboratories, e-science

Page 3: Data at Work: Supporting Sharing in Science and Engineering

Data sharing problems

• Collaborating and sharing of data should be encouraged
  – But it “is not easy” to do so (p. 340)
• Why?
  – Lack of willingness to share, trust others
    • Competition for “revenue” (p. 345)
    • Restrictions imposed by commercial interests
    • Trust of sources
    • Trust of others; will they use data well? (see also Van House, 2003)

Page 4: Data at Work: Supporting Sharing in Science and Engineering

Data sharing problems

• Reasons (continued)
  – Problems with finding shared data
    • Negotiate access
  – Difficulties interpreting and using shared data
    • How collected?
    • How analyzed?
    • What format?
    • Metadata
      – Format, encoding, controlled vocabularies, etc.
    • Data quality (see also Stvilia et al., 2008; Wand & Wang, 1996)
    • “Tacit” knowledge of data (p. 340)

Page 5: Data at Work: Supporting Sharing in Science and Engineering

Methodology

• Three disciplines
  – Earthquake engineering
  – HIV / AIDS research
  – Space physics
• Observation and interviews of all three, surveys of earthquake engineers
• Inductive, grounded approach
  – Claimed they made “no assumptions about the purpose of data” (p. 340)

Page 6: Data at Work: Supporting Sharing in Science and Engineering

Data dimensions

• Two dimensions identified (p. 341)
  – “news” vs. “confirmation”
    • Confirm existing or expected results
    • Something unexpected needing further exploration
    • Something not fitting expected / prevailing model
  – “streams” vs. “events”
    • Longitudinal vs. cross-sectional
    • Context for data may change
    • Rate of data different
• Different disciplines, different data use

Page 7: Data at Work: Supporting Sharing in Science and Engineering

Data’s role in scientific communities

• Defines boundaries between communities
  – Experimental, deductive
    • More possessive of data
  – Theoretical, inductive
    • More interested in sharing data
    • More interested in using shared data
  – Increasing blurring of boundaries in some fields
• Provides gateway into communities
  – Access to data, knowledge about data is “valuable resource” (p. 343)
  – Those who control data and knowledge, and access to it, act as “gatekeepers of the field” (p. 343)

Page 8: Data at Work: Supporting Sharing in Science and Engineering

Data’s role in scientific communities

• Indicates status in community
  – Using one’s own data “seen as ‘better’” than using public data (p. 344)
    • “Analyzing somebody else’s data … arguably ‘counts’ for less” (p. 344)
  – Higher quality data means better reputation
    • For researchers, research groups, and institutions
• Enables indoctrination into community
  – Students often work with collecting, managing data
  – Degree of sharing of responsibilities differs between fields, sometimes by seniority in field

Page 9: Data at Work: Supporting Sharing in Science and Engineering

Categories of data uses (p. 345)

• Identified with an eye to “revenue” from use
  – Benefits: reputation, publications, funding, etc.

1. “A scientist’s data set is her [or his] castle”
  – Researcher wants to and is able to use data to solve a particular problem or question
  – Will increase revenue

2. “With a little help from my friends”
  – Researcher wants to use data, but needs to collaborate with others in order to do so successfully
  – Data can be shared privately
    • Limited risk (but still some risk)
  – Will increase revenue

Page 10: Data at Work: Supporting Sharing in Science and Engineering

Categories of data uses (p. 345)

3. “One scientist’s junk is another one’s treasure”
  – Researcher has no interest in using the data for a particular problem, but others do have interest
  – Sharing data will slightly increase revenue
  – May not be worth risk of losing other revenues

4. “D’oh!”
  – Researcher has not thought of a use, but it would be relevant to them and help them with a problem or question
  – Sharing data could be embarrassing, decrease revenue

Page 11: Data at Work: Supporting Sharing in Science and Engineering

Categories of data use

• Researchers will be less willing to share data unless incentives high, risks low
• Data sharing follows social networks
• Provide facilities for communication around abstractions of data sets
  – Encourage sharing and collaboration (category 2)
    • Extend researcher’s social network
  – Reduce risks of embarrassment (category 4)
    • Preliminary abstractions allow questions / comments before they are embarrassing
  – Increase incentives and benefits (categories 2 & 3)
    • Beyond boundaries of researcher’s community

Page 12: Data at Work: Supporting Sharing in Science and Engineering

Recommendations and conclusions

• Efforts to support “social interaction around data abstractions and the data themselves” should be made (p. 346)
• Metadata should be augmented through “the sharing of supplementary materials” (i.e. abstractions) (p. 346)
• Consideration of the “social and scientific roles of data” and how to support them necessary in future research (p. 346)
• Better understanding of data abstractions needed (p. 347)

Page 13: Data at Work: Supporting Sharing in Science and Engineering

Issues with study and article

• Bias towards natural sciences
  – Social scientists may use, share data differently
• Only 3 disciplines studied, others may differ further
• Generally coherent, but some parts hard to follow
  – Indoctrination examples appeared similar, despite what authors termed “critical” distinction (p. 344)
  – Promised “three aspects of the way data are used” but only discussed two dimensions (p. 341)
• Limitations only discussed briefly

Page 14: Data at Work: Supporting Sharing in Science and Engineering

Questions, comments?
