Collaborative Data Management at the University of California


Upload: university-of-california-curation-center

Post on 01-Nov-2014


DESCRIPTION

Webinar presented on December 5, 2012, by Joan Starr and Perry Willett of CDL/UC3, and Lisa Federer and Claudia Horning of UCLA. Part of the ACRL Digital Curation Interest Group (DCIG) Webinar Series.

TRANSCRIPT

Page 1: Collaborative Data Management at the University of California


Page 2: Collaborative Data Management at the University of California

Photo credit: http://www.flickr.com/photos/joanet/2994421437/ by Jo@net (Joan Campderrós‐i‐Canas)


Page 3: Collaborative Data Management at the University of California


Page 4: Collaborative Data Management at the University of California

Image credit: http://www.flickr.com/photos/vixon/116447718/ by barryegan (Vitor Leite) 

Why should researchers bother with DATA CITATION? What is their motivation?

• To provide fair credit to those responsible: exposure
• To ensure scientific transparency and reasonable accountability for authors and stewards: transparency
• To aid in tracking the impact of the work: citation tracking
• To help data authors verify how their data are being used: verification
• To aid scientific reproducibility through direct, unambiguous connection to the precise data used in a particular study: scientific re‐use

Source: ESIP—Earth Science Information Partners (http://wiki.esipfed.org/index.php/Interagency_Data_Stewardship/Citations/provider_guidelines)


Page 9: Collaborative Data Management at the University of California


Page 10: Collaborative Data Management at the University of California

This is also a question of an almost perfect fit with our historic mission to preserve and protect our institution’s scholarly output.

472,000 in September

455,000 in May!

Page 11: Collaborative Data Management at the University of California

Page 12: Collaborative Data Management at the University of California

Image credit: http://www.flickr.com/photos/bleuman/6160605143/ by Yunchung Lee

As Sherry described to you, research has a life cycle.

Page 13: Collaborative Data Management at the University of California

Page 14: Collaborative Data Management at the University of California

Supporting the MANAGE and SHARE stages are the Merritt curation and preservation repository and our long‐term identifier service, EZID.

Baking data curation into the collection phase: DATA‐UP. And enhancing collection of e‐science in the web environment: the Web Archiving Service, or WAS.

To facilitate data publication, we are exploring this new Data Paper model.

We are engaged in a number of network‐level collaborations and partnerships, but these two have particular relevance to the data management space, with DataONE focused on distributed data networks and DataCite on persistent identifiers.

And lastly, we have partnered with UVA and many others to develop and launch an easy‐to‐use Data Management Plan Tool.

Let’s take a brief look at all of these things, and  then we’ll talk about what this means to you.


Page 16: Collaborative Data Management at the University of California

In its capacity as a data management tool, Merritt can function in one of several ways: it can be a “dark” or inaccessible archive for important digital assets; it can serve as a “bright” archive with direct discovery and access; it can be the preservation back‐end for existing or new discovery and content management systems; or it can integrate with distributed data grids.

Current work includes the Datashare project at UC San Francisco (UCSF). Datashare, as the name suggests, encourages researchers to share their data.  See the Datashare website at http://datashare.ucsf.edu

Page 17: Collaborative Data Management at the University of California

Page 18: Collaborative Data Management at the University of California

Page 19: Collaborative Data Management at the University of California

Page 20: Collaborative Data Management at the University of California

• Preservation: Curation microservices and Merritt
• To bake data curation into data creation: DCXL (Data Curation XL Plug‐In)
• To enhance data sharing, collecting and gathering: WAS service
• To facilitate data publication, we are exploring this new Data Paper model.
• And behind many of these steps, the EZID service.

We are engaged in a number of network‐level collaborations and partnerships, but these two have particular relevance to the data management space, with DataONE focused on distributed data networks and DataCite on persistent identifiers.

And lastly, we have partnered with UVA and many others to develop and launch an easy‐to‐use Data Management Plan Tool.

So let’s take a brief look at all of these things, and while I’m there, I’ll dive more deeply into EZID, which is the service I manage.


Page 21: Collaborative Data Management at the University of California

Nobody thinks of Excel as a preservation‐ready tool, but everybody uses it! The KEY IDEA in keeping this EASY here is: let them use the tools they are used to using. (Get out of the way of that elephant!)

The Gordon and Betty Moore Foundation and Microsoft Research are funding this.

Our part is requirements gathering; Microsoft will do the development. It will be an open source plug‐in.


Page 22: Collaborative Data Management at the University of California

WAS allows curators to collect and manage web‐published content so that scholars can use the content for private research and/or publish the content for public access.

The archives contain eScience content as well as government documents, event captures, and archives for specific research communities, such as unique data sets, collections of sites not otherwise grouped together, and the sites resulting from grant activity.  


Page 23: Collaborative Data Management at the University of California

PUBLIC OR PRIVATE

WAS provides tools for analyzing site change over time and allows keyword searching of archived sites. Publishing an archive is optional. As of this writing, there are 93 active archives, over half of which are publicly available.


Page 25: Collaborative Data Management at the University of California

We are exploring this idea with various partners and funders, potentially to encourage conventions for describing data so that it can stand alone when appropriate.


Page 27: Collaborative Data Management at the University of California

DataONE is an NSF‐funded virtual data center for biology, ecology, and environmental sciences.

DataONE has the overarching goal of building a new culture of data access and data sharing. This is an international collaboration working with scientists and librarians, as well as other stakeholders.

1. Engaging the scientist in the data curation process
2. Supporting the full data life cycle
3. Encouraging data stewardship and sharing
4. Promoting best practices
5. Engaging citizens
6. Developing domain agnostic solutions

Page 28: Collaborative Data Management at the University of California

Page 29: Collaborative Data Management at the University of California

How can EZID be in the business of issuing DataCite DOIs? The California Digital Library was one of the founding members.

DataCite was indeed formed in 2009 by 10 libraries and research centers with a mission: “Helping you find, access, and reuse data.”

The number has now grown to 16. In addition there are 3 associate members, including the Korea Institute of Science and Technology Information and BGI, so there is a presence in Asia.

DataCite’s primary methodology for achieving this mission: issuing DOIs (Digital Object Identifiers) for datasets.
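To make the idea of a dataset DOI concrete, here is a minimal Python sketch of how a DOI string can be split into its prefix and suffix and turned into a resolvable link through the public doi.org proxy. It is an illustration only, not EZID's or DataCite's code: the function names and the example DOI (10.1234/EXAMPLE-DATASET) are hypothetical and do not refer to a real dataset.

```python
from urllib.parse import quote

# Public DOI proxy; a DOI appended to this address redirects to the object's landing page.
DOI_RESOLVER = "https://doi.org/"


def split_doi(doi):
    """Split a DOI into (prefix, suffix), accepting an optional 'doi:' label."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:]
    # The prefix ends at the first slash; the suffix may itself contain slashes.
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


def doi_to_url(doi):
    """Build a resolver URL for a DOI, percent-encoding unsafe suffix characters."""
    prefix, suffix = split_doi(doi)
    return DOI_RESOLVER + prefix + "/" + quote(suffix, safe="/")


# Hypothetical identifier, used only for illustration.
EXAMPLE_DOI = "doi:10.1234/EXAMPLE-DATASET"
print(doi_to_url(EXAMPLE_DOI))
```

The point of the sketch is that the identifier, not the current web address of the dataset, is what goes into a citation; the resolver takes care of pointing readers to wherever the data live now.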


Page 30: Collaborative Data Management at the University of California

These are the factors driving the collaboration:

1. Institutions rely on soft funding… agencies have created a new demand, meet the demand or don’t get funded.

2. Approach is to work collaboratively to consolidate expertise and reduce costs

3. Libraries plus, plus

4. Provide an environment that allows researchers to focus on research


Page 31: Collaborative Data Management at the University of California

Image credit: http://content.cdlib.org/ark:/13030/kt667nc4xn/?query=service%20station&brand=calisphere, courtesy of Anaheim Public Library


Page 32: Collaborative Data Management at the University of California

Image credit: http://content.cdlib.org/ark:/28722/bk0007s853c/?query=tools&brand=calisphere, courtesy of UC Berkeley, Bancroft Library; United Aircraft Corporation: Joint War Production Drive Committee

DATA CURATION LEADS TO GOOD OUTCOMES FOR RESEARCHERS.

• They’ll be motivated routinely to deposit in stable public storage both data products (datasets and processing information) and the data papers that reward them with authorship credit.

• Data journals will spring up around disciplines, even if disciplinary data papers are scattered across geographically distributed repositories.

• Data products will be re‐used, annotated, corrected, and linked to/from traditional publications.


Page 33: Collaborative Data Management at the University of California

There is a lot of work going on with data at UCLA, but I’m going to focus on just a couple of projects.


Page 34: Collaborative Data Management at the University of California

DMPTool was developed in part at UCLA; UCLA is second among the UCs in usage. More users enrolled in DMPTool after a presentation to an administrator group than in the entire 4 months prior.

Page 35: Collaborative Data Management at the University of California

Page 36: Collaborative Data Management at the University of California

A visit from Carly Strasser of CDL established interest.

An initial survey indicated that researchers were interested in many aspects of data management, especially data management plans.


Page 37: Collaborative Data Management at the University of California

Initial results from the test indicate that researchers found the class useful.


Page 38: Collaborative Data Management at the University of California

In summer 2012, UCLA was one of 7 libraries to receive funding to add an informationist to an existing NIH‐funded research team.

Page 39: Collaborative Data Management at the University of California

Page 40: Collaborative Data Management at the University of California

Page 41: Collaborative Data Management at the University of California

Page 42: Collaborative Data Management at the University of California

Image credit: http://www.flickr.com/photos/sekihan/6100774057/ by sekihan

Page 43: Collaborative Data Management at the University of California

Page 44: Collaborative Data Management at the University of California

Image source: http://www.flickr.com/photos/ausnahmezustand/4752989186/

By ausnahmezustand
