RDA, Data Citation, and PIDs for DataONE

Unless otherwise noted, the slides in this presentation are licensed by Mark A. Parsons under a Creative Commons Attribution-Share Alike 3.0 License. Building Collaborative Bridges: Opportunities and Challenges for Data Sharing and Citation. Mark A. Parsons (0000-0002-7723-0950), Secretary General. DataONE Webinar, 10 May 2016.

Upload: research-data-alliance

Post on 15-Apr-2017


TRANSCRIPT

Page 1: RDA, Data Citation, and PIDs for DataOne

Unless otherwise noted, the slides in this presentation are licensed by Mark A. Parsons under a Creative Commons Attribution-Share Alike 3.0 License

Building Collaborative Bridges Opportunities and Challenges for Data Sharing and Citation

Mark A. Parsons
0000-0002-7723-0950
Secretary General

DataONE Webinar, 10 May 2016

Page 2: RDA, Data Citation, and PIDs for DataOne

All of society’s grand challenges require diverse (often large) data to be shared and integrated across cultures, scales, and technologies.

Page 3: RDA, Data Citation, and PIDs for DataOne

Research Data Alliance

Vision Researchers and innovators openly share data across technologies, disciplines, and countries to address the grand challenges of society.

Mission RDA builds the social and technical bridges that enable open sharing of data.

Page 4: RDA, Data Citation, and PIDs for DataOne
Page 5: RDA, Data Citation, and PIDs for DataOne
Page 6: RDA, Data Citation, and PIDs for DataOne
Page 7: RDA, Data Citation, and PIDs for DataOne
Page 8: RDA, Data Citation, and PIDs for DataOne

Infrastructure is

Relationships, interactions, and connections between people, technologies, and institutions

Page 9: RDA, Data Citation, and PIDs for DataOne

Fran Berman, Research Data Alliance

“Create - Adopt - Use” (in 12-18 months)

Systems Interoperability

Adopted Policy

Sustainable Economics

Common Types, Standards, Metadata

Traffic Image: Mike Gonzalez

Adopted Community Practice

Training, Education, Workforce

Page 10: RDA, Data Citation, and PIDs for DataOne

Shared Principles

• Openness

• Consensus

• Balance

• Harmonization

• Community Driven

• Non-profit

Page 11: RDA, Data Citation, and PIDs for DataOne

RDA membership by quarter (chart values, in order):

May-July: 392
Aug-Oct: 991
Nov-Jan: 1274
Feb-Apr: 1656
May-July: 2048
Aug-Oct: 2404
Nov-Jan: 2636
Feb-Apr: 2881
May-July: 3126
Aug-Oct: 3434
Nov-Jan: 3698
Feb-Apr: 3976

Members by region:

South America: 1%
North America: 34%
Europe: 48%
Australasia: 5%
Asia: 9%
Africa: 3%

from 110 countries

https://rd-alliance.org/about-rda/who-rda.html

The RDA Community: ~4,000+ members from 110 countries (April 2016)

70+ Working and Interest Groups

Page 12: RDA, Data Citation, and PIDs for DataOne

RDA Organisational Members

Organisational & Affiliate Members

RDA Affiliate Members

https://rd-alliance.org/organisation/rda-organisation-affiliate-members.html

Page 13: RDA, Data Citation, and PIDs for DataOne

Fran Berman, Research Data Alliance

RDA: Accelerate Data Sharing and Interoperability Across Cultures, Communities, Scales, Technologies

▪ Technical parts of the data engine:
  ▪ Data type registries reference model
  ▪ Wheat data interoperability framework

▪ Rules of the road:
  ▪ Common agreement on data citation
  ▪ Common practice for data repositories
  ▪ Principles of legal interoperability

▪ Better drivers
  • Summer schools in data science and cloud computing in the developing world (with CODATA)
  • Active data management plan development and monitoring

Policy and Practice

Systems Interoperability

Sustainable Economics

Common Types, Standards, Metadata

Training, Education, Workforce

Page 14: RDA, Data Citation, and PIDs for DataOne

Some themes amidst the difference

1. Persistent Identifiers for data, documents, people, organisations, instruments—Everything!

2. Certifying Trust in assertions, evidence, organisations, processes…

3. The value of Conversations, Relationships, and Mediation — an agile network effect.

Page 15: RDA, Data Citation, and PIDs for DataOne

An Area of Convergence and Agreement

Internet Domain

nodes with IP numbers

packages being exchanged

standardized protocols

Slide courtesy P. Wittenberg from L. Lannom from D. Clark

Page 16: RDA, Data Citation, and PIDs for DataOne

An Area of Convergence and Agreement

Internet Domain

nodes with IP numbers

packages being exchanged

standardized protocols

Data Domain

objects with PID numbers

objects being exchanged

standardized protocols

Slide courtesy P. Wittenberg from L. Lannom from D. Clark
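
The analogy above (IP numbers locate nodes, PIDs locate data objects) can be illustrated with a minimal sketch of PID resolution: a persistent identifier is handed to a resolver, which redirects to the object's current location. The sketch below only builds the resolver URL and makes no network call. The function name `resolver_url`, the `scheme:value` input convention, and the example identifiers are assumptions made for illustration; the two resolver addresses are the public DOI and Handle proxies.

```python
# Minimal sketch: turn a "scheme:value" PID string into a resolver URL.
# The input convention and the function name are assumptions of this sketch.
RESOLVERS = {
    "doi": "https://doi.org/",         # public DOI proxy
    "hdl": "https://hdl.handle.net/",  # public Handle proxy
}


def resolver_url(pid):
    """Return the URL a client would request to resolve the given PID."""
    scheme, sep, value = pid.partition(":")
    scheme = scheme.lower()
    if not sep or not value or scheme not in RESOLVERS:
        raise ValueError("unsupported or malformed PID: " + repr(pid))
    return RESOLVERS[scheme] + value


# Hypothetical identifiers, used only to show the mapping.
print(resolver_url("doi:10.1234/example"))
print(resolver_url("hdl:20.500.12345/abc"))
```

The identifier stays fixed while the location behind the resolver may change, which is the property the Data Domain column relies on.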

Page 17: RDA, Data Citation, and PIDs for DataOne

Purpose of Data Citation

• Aid scientific reproducibility through direct, unambiguous connection to the precise data used.

• Credit for data authors and stewards

• Accountability for creators and stewards

• Track impact of data set

• Help identify data use (e.g., trackbacks)

• Data authors can verify how their data are being used.

• Users can better understand the application of the data.

• A locator/reference mechanism not a discovery mechanism per se

Page 18: RDA, Data Citation, and PIDs for DataOne

Crisis of Confidence in Research Data Citation

Page 19: RDA, Data Citation, and PIDs for DataOne

The Evolution of Data Citation

• Data was part of the literature (tables, maps, monographs, etc.) and we cited accordingly. (Some data were still hoarded.)

• Digital data becomes the norm. It’s messier and we forget how to cite it routinely.

• Initial efforts to define digital data citation in the 90s - early 00s
  • Right idea, little traction
  • Partially conflated with the citing URLs issue

• A blossoming in the mid-late 00s.
  • Multiple disciplines start developing approaches and guidelines
  • DOI a big driver, especially for DataCite, but other identifiers used too (Handles, LSIDs, UNFs, ARKs and good ol’ URI/Ls)
  • A somewhat competitive atmosphere

• Finally consensus through the Joint Declaration of Data Citation Principles, 2013

Page 20: RDA, Data Citation, and PIDs for DataOne

Joint Declaration of Data Citation Principles (Overview)

The Noble Eight-Fold Path to Citing Data

1. Importance
2. Credit and attribution
3. Evidence
4. Unique Identification
5. Access
6. Persistence
7. Specificity and verifiability
8. Interoperability and flexibility

Principles are supplemented with a glossary, references and examples
http://force11.org/datacitation

Page 21: RDA, Data Citation, and PIDs for DataOne

Citing Dynamic Data

Data Citation: Data + Means-of-access

▪ Data → time-stamped & versioned (aka history)

Researcher creates working-set via some interface:

▪ Access → assign PID to QUERY, enhanced with
  − Time-stamping for re-execution against versioned DB
  − Re-writing for normalization, unique-sort, mapping to history
  − Hashing result-set: verifying identity/correctness

leading to landing page

S. Pröll, A. Rauber. Scalable Data Citation in Dynamic Large Databases: Model and Reference Implementation. In IEEE Intl. Conf. on Big Data 2013 (IEEE BigData2013), 2013. http://www.ifs.tuwien.ac.at/~andi/publications/pdf/pro_ieeebigdata13.pdf
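
The query-based approach on this slide can be sketched in a few lines. What follows is a minimal illustration under stated assumptions, not the reference implementation from the cited paper: the table layout, the function names (`upsert`, `cite`, `resolve`), the integer timestamps, the `example-pid:` prefix, and the use of SQLite and SHA-256 are all choices made here for the example. It shows the three enhancements listed above: a time-stamped query run against versioned rows, a normalized query with a unique sort order, and a hash of the result set that is checked on re-execution.

```python
import hashlib
import sqlite3
import uuid

OPEN = 2**62  # sentinel meaning "this row version is still current" (assumption)


def make_db():
    conn = sqlite3.connect(":memory:")
    # Versioned data: each row version is valid during [valid_from, valid_to).
    conn.execute("CREATE TABLE obs (id INTEGER, value REAL, "
                 "valid_from INTEGER, valid_to INTEGER)")
    # Query store: one entry per cited query.
    conn.execute("CREATE TABLE query_store (pid TEXT PRIMARY KEY, "
                 "predicate TEXT, ts INTEGER, result_hash TEXT)")
    return conn


def upsert(conn, row_id, value, ts):
    """Close the current version of a row (if any) and add a new one at ts."""
    conn.execute("UPDATE obs SET valid_to = ? WHERE id = ? AND valid_to = ?",
                 (ts, row_id, OPEN))
    conn.execute("INSERT INTO obs VALUES (?, ?, ?, ?)", (row_id, value, ts, OPEN))


def execute_as_of(conn, predicate, ts):
    """Re-write the query: restrict to versions valid at ts, with a unique sort."""
    # Sketch only: the predicate is pasted into SQL, so it must be trusted input.
    sql = ("SELECT id, value FROM obs WHERE (" + predicate + ") "
           "AND valid_from <= ? AND valid_to > ? ORDER BY id")
    return conn.execute(sql, (ts, ts)).fetchall()


def hash_rows(rows):
    return hashlib.sha256(repr(rows).encode("utf-8")).hexdigest()


def cite(conn, predicate, ts):
    """Store the normalized query, its timestamp and result hash under a new PID."""
    predicate = " ".join(predicate.split())  # normalization: collapse whitespace
    digest = hash_rows(execute_as_of(conn, predicate, ts))
    pid = "example-pid:" + uuid.uuid4().hex  # placeholder, not a registered PID
    conn.execute("INSERT INTO query_store VALUES (?, ?, ?, ?)",
                 (pid, predicate, ts, digest))
    return pid


def resolve(conn, pid):
    """Re-execute the cited query as of its timestamp and verify the hash."""
    predicate, ts, digest = conn.execute(
        "SELECT predicate, ts, result_hash FROM query_store WHERE pid = ?",
        (pid,)).fetchone()
    rows = execute_as_of(conn, predicate, ts)
    if hash_rows(rows) != digest:
        raise ValueError("result set no longer matches the cited hash")
    return rows


conn = make_db()
upsert(conn, 1, 1.5, ts=10)
upsert(conn, 2, 3.0, ts=10)
pid = cite(conn, "value   >  1", ts=20)  # cite the working set as of time 20
upsert(conn, 1, 0.5, ts=30)              # the data changes after the citation
rows = resolve(conn, pid)                # still returns the data as cited
```

Because old row versions are closed rather than overwritten, resolving the PID after the update still yields both original rows, while the same predicate run against the current state would return only row 2.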

Page 22: RDA, Data Citation, and PIDs for DataOne

Output / Results http://bit.ly/1T1HHXI

▪ 14 Recommendations grouped into 4 phases:
  - Preparing data and query store
  - Persistently identifying specific data sets
  - Resolving PIDs
  - Upon modifications to the data infrastructure
▪ Still open for comment by members
▪ See RDA Magazine for overview and adoption cases
▪ Reference implementations (SQL, CSV, XML)
▪ Pilots

Page 23: RDA, Data Citation, and PIDs for DataOne

Getting involved

Individuals
✓ Observers
✓ Contributors
✓ Drivers
• Members
• WGs-IGs-BoFs
• Requests for Comments
• Plenaries

Organisations
✓ Insight
✓ Adopt
✓ Drive
• Member
• WGs-IGs-BoFs
• RfCs
• Funded projects
• Adoption/Uptake

National level
✓ Coordination & Knowledge Exchange, Strategy &/or Implementation
• Papers & Events
• Meetings & Fora
• Training & Workshops
• Uptake pilots

https://rd-alliance.org/get-involved.html

Page 24: RDA, Data Citation, and PIDs for DataOne

12-16 September 2016 in Denver, Colorado, USA

Page 26: RDA, Data Citation, and PIDs for DataOne

RDA Interest (IG) and Working Groups (WG) by Focus 1, February 2016

Page 27: RDA, Data Citation, and PIDs for DataOne

RDA Interest (IG) and Working Groups (WG) by Focus 2, February 2016