Page 1: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

The world’s libraries. Connected.

Data Reuse Experiences within Digital vs. Physical Zoological Collections

University of Michigan Museum of Zoology (UMMZ), February 20, 2014

Ixchel M. Faniel, Ph.D. OCLC Research

[email protected]

Elizabeth Yakel, Ph.D. University of Michigan

[email protected]

Page 2: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

• Institute of Museum and Library Services (IMLS)-funded project led by Drs. Ixchel Faniel (PI) & Elizabeth Yakel (co-PI)

• Studying the intersection between data reuse and digital preservation in three academic disciplines to identify how contextual information about the data that supports reuse can best be created and preserved.

• Focuses on research data produced and used by quantitative social scientists, archaeologists, and zoologists.

• The intended audiences of this project are researchers who use secondary data and the digital curators, digital repository managers, data center staff, and others who collect, manage, and store digital information.

For more information, please visit http://www.dipir.org

Page 3: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Research Motivations & Questions

1. What are the significant properties of quantitative social science, archaeological, and zoological data that facilitate reuse?

2. How can these significant properties be expressed as representation information to ensure the preservation of meaning and enable data reuse?

Faniel & Yakel 2011

Page 4: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

DIPIR Project: The Research Team

• Ixchel Faniel, OCLC Research (PI)

• Elizabeth Yakel, University of Michigan (Co-PI)

• Nancy McGovern, ICPSR/MIT

• Eric Kansa, Open Context

• William Fink, UM Museum of Zoology

Page 6: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Research Methodology

Sites: ICPSR, Open Context, UMMZ

Phase 1: Project Start-up

• Staff interviews: ICPSR 10 (Winter 2011); Open Context 4 (Winter 2011); UMMZ 10 (Spring 2011)

Phase 2: Collecting and analyzing user data

• Data consumer interviews: ICPSR 44 (Winter 2012); Open Context 22 (Winter 2012); UMMZ 27 (Fall 2012)

• Data consumer survey: over 1,600 (Summer 2012)

• Data consumer web analytics: server logs (ongoing)

• Data consumer observations: 11 (Fall 2013)

Phase 3: Mapping significant properties as representation information

Page 8: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Agenda

• Snapshot of Users

• Interviews

• Observations

• Discussion

Image: DIPIR Team

Page 9: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

A Snapshot Of 40 Data Reusers

[Figure: share of the 40 data reusers who are systematists, who study ecological trends, and who reuse data directly from colleagues, from online repositories and websites, from museums and archives, and from journal articles. Values shown: 65%, 90%, 95%, 27.5%, 35%, 20%. The pairing of values to labels did not survive in this transcript.]

Page 10: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

The Discovery Process

“I knew from prior experience which museums had large collections of material from the part of the world I was interested in.” (CAU19)

“… we started from that [author] paper and then added to it from other people’s work…So mostly from…reading other people’s papers.” (CAU22)

“I am a graduate student at [university], in Zoology and one of my committee members is an adjunct professor here, [name], so she noticed that I had genetic data for the same individuals that U of M has skull data for.” (CAU39)

“…that [aggregator repository] targets so many different collections that once you have access you know pretty much…You can identify very quickly what you need.” (CAU13)

Page 11: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Selection Criteria

Data coverage

Geographic precision

Matches another dataset

Availability of voucher specimen

Time period specimen collected

Sequence has been published

Results of pre-analysis

Relevant taxonomically

Condition of specimen

Location of repository

Availability of metadata

Physical variation of the species

Manner in which the specimen is preserved

Page 12: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Interviews

Image: DIPIR Team

Page 13: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Digital Data Selection Based On Locality

…often when it doesn’t meet my needs the most obvious reasons would be there’s just not enough data or it doesn’t cover…Like geographically it doesn’t cover the area I’m interested in well enough (CAU03).

…that’s the first filter…looking for specific species. And then for me, yeah, it’s been mostly about the geographic precision of the data, to say whether or not I can use that record for something (CAU26).

Image: Microsoft Clipart

Page 14: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Digital Data Selection Based On Other Datasets

…we decide, okay, these georeferences have an error that is probably higher than, let’s say, five kilometers but our climate data is the resolution, the pixel size,…is maybe 4.5 kilometers. So, anything that is above that size of pixel that we have, we actually cannot use. (CAU14)

I include it [the sequence] in my dataset, do the analyses I’m going to do and then based on the results of those analyses look to see how those data match with the data that I’ve collected. (CAU05)

Image: Microsoft Clipart

Page 15: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Trusting Digital Data

“I can sort of qualitatively assess what the quality of taxonomic data might be just by it being, having some mention of the museum record. I know [a] …museum worker who is often... I don't know about an expert in say, my group, but at least has access to the relevant literature to make good taxonomic decisions about those fishes from which they took the tissue.” (CAU02)

“I would go back to the literature to look at the paper it came from. I guess there is also to some degree the particular researchers that actually produced that sequence; I might actually know their reputations or what they kind of work on and trust it more or less.” (CAU12)

Page 16: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Trusting Digital Data

“A lot of times, it's just a matter of looking at what the Latin name is that they supply because I can't really make a decision based on the information that I'm given. If I had a picture, I could use that when I'm taking into account their ability to identify something. But the main way that I do it is by looking at the geography of where they claim a specimen is located.” (CAU17)

“Well, if there's a voucher specimen available then I can request that specimen from the museum where it's housed, re-examine it, confirm or deny that it is that particular species. If the voucher's there and it's the right species, then I have to go with it. If the voucher is not there, and I really question the identification…Because it's unreliable in my mind.” (CAU20)

Page 17: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Observations

Image: DIPIR Team

Page 18: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Specimen Selection Based On Condition

“It needs to be intact right? The skull needs to be intact. That isn't in the records usually, and I've gotten used to the idea that you just go and hope for the best, and figure that if they say they have 20, you might find six you could use. That would be a helpful thing to know.” (CAU34)

Image: DIPIR Team

Page 19: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Specimen Selection Based On Holotypes For Comparison

“[Many] holotypes from the past [are] deposited here in this collection. And then it's really useful to me, and important to make a comparison with those specimens that was the original description when the species already occur in the country. But to do that in the best comparisons, we need to compare morphological data with the new specimens that we already collected in the recent years.” (CAU29)

Image: DIPIR Team

Page 20: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Deciding To Visit UMMZ

“I think it’s because I was a student here so I know, I knew what was here. But I have to say, I worked on my dissertation in the same area, I worked on skull morphology, and so I learned as a graduate student that you go and find the museums that are most likely to have the specimens that you need.” (CAU34)

“And it's a good-sized collection. Especially in terms of university's collections, there are a lot of specimens here, good taxonomic diversity, and it's also close for us . . . I'm going to the Smithsonian next week, but that's a lot more expensive, a lot more time consuming.” (CAU36)

Page 21: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

How Researchers Prepare For Their Visit To UMMZ

“Well, the crucial thing there is getting a copy of the data associated with the specimens that are here…an Excel spreadsheet that gave all the information about the tissues that are held here and the morphological specimens. Using that database, I was able to then select which species we need to study.” (CAU32)

Image: DIPIR Team

Page 22: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Interaction With Repository Staff

“In this case, I was fortunate to have [UMMZ staff], who took the initiative to go through the collections and find the most well-preserved specimens that he could . . . So, actually looking through the collection that was done by [UMMZ staff] and he brought out the specimens for me to use. So, that aspect was alleviated by the fact that he gave me a lot of help.” (CAU33)

Image: DIPIR Team

Page 23: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Discussion

Image: DIPIR Team

Page 24: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Discussion

• In a global age of online databases, people still need to see the actual specimens

• Condition and depth of the collection are important

• Aggregators vs. museum website vs. inventory system

• Having data accessible online is great, but at times it just is not sufficient

Page 25: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Discussion

• The discovery processes are similar but selection criteria are specific to research objectives

• Gaining trust in data about the specimen from a distance

Page 26: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Acknowledgements

• Institute of Museum and Library Services

• Partners: Nancy McGovern, Ph.D. (MIT), Eric Kansa, Ph.D. (Open Context), William Fink, Ph.D. (University of Michigan Museum of Zoology)

• OCLC Fellow: Julianna Barrera-Gomez

• Doctoral Students: Rebecca Frank, Adam Kriesberg, Morgan Daniels, Ayoung Yoon

• Master’s Students: Jessica Schaengold, Gavin Strassel, Michele DeLia, Kathleen Fear, Mallory Hood, Annelise Doll, Monique Lowe

• Undergraduates: Molly Haig

Page 27: Data  Reuse Experiences within Digital vs. Physical Zoological  Collections

Questions?

Ixchel [email protected]

Beth [email protected]

http://www.dipir.org