TOWARDS OSGEO BEST PRACTICES FOR SCIENTIFIC SOFTWARE CITATION: INTEGRATION OPTIONS FOR PERSISTENT IDENTIFIERS IN OSGEO PROJECT REPOSITORIES. Peter Löwe, Markus Neteler, Jan Goebel, Marco Tullney. Boston, August 17, 2017


TRANSCRIPT

Page 1: INTEGRATION OPTIONS FOR PERSISTENT IDENTIFIERS IN OSGEO PROJECT REPOSITORIES: TOWARDS OSGEO BEST PRACTICES FOR SCIENTIFIC SOFTWARE CITATION

TOWARDS OSGEO BEST PRACTICES FOR SCIENTIFIC SOFTWARE CITATION

INTEGRATION OPTIONS FOR PERSISTENT IDENTIFIERS IN OSGEO PROJECT REPOSITORIES

Peter Löwe, Markus Neteler, Jan Goebel, Marco Tullney
Boston, August 17, 2017

Page 2

Original sin? Science + Culture of Sharing = Open Science

Löwe, Neteler, Goebel, Tullney: FOSS4G 2017, Towards OSGeo Best Practices for Scientific Software Citation

https://xkcd.com/1228/

Does Prometheus receive due credit for his creativity?

Page 3

Open Science


https://en.wikipedia.org/wiki/Open_science#/media/File:Open_Science_-_Prinzipien.png

Open Science is the movement to make scientific research and data accessible to all

Page 4

Open Science Triangle: Science-related benefits


Open Access, Open Data, Open Source

Society:
• Greater availability and accessibility of publicly funded scientific research outputs
• Greater reproducibility and transparency of scientific works

Community:
• Possibility for rigorous peer review

Individual:
• Greater impact of scientific research

Code citation: Requires standards and infrastructure

Code citation required

Page 5

Motivation for Code Citation


Understanding research fields: code is an important part of the record of research and progress in science (no "throwaway code")

Credit: Researchers on all levels (including students!) deserve credit in their coin of the realm (aka citation), especially when this work enables further research by others.

Discoverability: Citation enables finding and reuse

Reproducibility: Citation of a specific software is required, but information about the underlying software stack and configurations is also needed

Page 6

OSGeo-Infrastructure

Business

Memory Organisations

Research, Education

Data Centers, Code Repositories, Libraries

OSGeo infrastructure & best practices

Code citation?

Page 7

Software Citation Best Practices according to FORCE11


• Importance: Software matters in science
• Credit and Attribution: Get due credit for your work
• Unique Identification: Unique, persistent, interoperable
• Persistence: Identifier & metadata never expire
• Accessibility: Code & documentation, interoperability
• Specificity: Reference to specific code versions

https://www.force11.org/

Page 8

PUBLISHING RESEARCH SOFTWARE: Open Access Journals for Geospatial Research Software?


In comparison to the actual magnitude of research code being produced, only a fraction is being communicated by journals.

As a result, advances in scientific software are not being properly communicated and therefore remain inaccessible to other scientists.

Page 9

Reality Check: OSGeo Journal


• Founded in 2007
• Online journal
• ISSN
• Publishes FOSS4G proceedings
• No defined standards for software citation (yet)

http://www.osgeo.org/journal

Page 10

Journal of Open Source Software: a Role Model?


The "DOI link" points to code within a GitHub repository. Metadata is stored in Zenodo.

Page 11

Motivation for DOI links


Long-term perspective:
• Data and code will move within the WWW.
• URL links to webpages will expire over time.

Digital Object Identifiers (DOIs) are a way to ensure stable links, preventing:

Very bad

Page 12

Introducing Digital Object Identifiers (DOI)


• DOI System: ISO Standard 26324 (2012)
• International DOI Foundation (1998)
• Based on the Handle System
• Long-term persistence and accessibility of information
• Global infrastructure provider for research data and code: DataCite (non-profit, software infrastructure is FOSS)

https://www.datacite.org/

Page 13

What is a DOI?


DOI: Acronym for "digital object identifier".
A DOI identifies the object itself and not the place where it is located.

What you see: alphanumeric string (never changes)
Associated with: location (such as URL)
Accompanied by: who, what, when… (metadata)
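The "alphanumeric string" can be made concrete with a small sketch. It assumes the usual DOI name layout of a prefix starting with "10." and a suffix separated by a slash, and it uses a DOI that is cited elsewhere in this talk. The names `Doi`, `parse_doi` and `resolver_url` are illustrative, and the pattern is an approximation rather than a full validator.

```python
import re
from typing import NamedTuple


class Doi(NamedTuple):
    prefix: str  # registrant part, e.g. "10.5281"
    suffix: str  # object name chosen by the registrant, e.g. "zenodo.571808"


# Approximate pattern for illustration only: "10.<digits>/<suffix>".
_DOI_PATTERN = re.compile(r"^(10\.\d{4,9})/(\S+)$")


def parse_doi(text: str) -> Doi:
    """Split a DOI name (optionally given as a resolver URL) into prefix and suffix."""
    name = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", text.strip(),
                  flags=re.IGNORECASE)
    match = _DOI_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a DOI name: {text!r}")
    return Doi(match.group(1), match.group(2))


def resolver_url(doi: Doi) -> str:
    """Build the URL that asks the public resolver for the current location."""
    return f"https://doi.org/{doi.prefix}/{doi.suffix}"


doi = parse_doi("http://doi.org/10.5281/zenodo.571808")
print(doi.prefix, doi.suffix)
print(resolver_url(doi))
```

The string itself carries no location; only the resolver knows where the object currently lives.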

Page 14

WHAT TO USE DOIs FOR ?


DOIs can be used to reference

• Publications
• Code
• Data

Open Access, Open Data, Open Source

Page 15

DOI magic explained: Man in the middle – can be friendly…


https://image.slidesharecdn.com/doi-100203060339-phpapp01/95/doi-in-he-11-728.jpg?cb=1265177093
https://www.deepdotweb.com/wp-content/uploads/2016/10/word-image-19.png

• DOIs are resolved by a resolving entity ("man in the middle").
• The resolving entity does not host the data itself.
• It receives updates from the hosting data repository whenever the data changes location (new URL).
• A DOI will then always resolve to a valid landing page.


Code, Data
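The resolution mechanism described above can be sketched as a toy in-memory model. This is not how any real resolver is implemented. The class `ToyResolver`, the DOI `10.99999/example.module` and the example.org URLs are all hypothetical and serve only to show that the identifier stays fixed while the stored location changes.

```python
from typing import Dict


class ToyResolver:
    """In-memory stand-in for a DOI resolving entity (not a real service)."""

    def __init__(self) -> None:
        # The resolver stores locations only, never the code or data itself.
        self._locations: Dict[str, str] = {}

    def register(self, doi: str, url: str) -> None:
        """Called by the hosting repository when a DOI is first minted."""
        if doi in self._locations:
            raise ValueError(f"{doi} is already registered")
        self._locations[doi] = url

    def update_location(self, doi: str, new_url: str) -> None:
        """Called by the hosting repository whenever the object moves."""
        if doi not in self._locations:
            raise KeyError(doi)
        self._locations[doi] = new_url

    def resolve(self, doi: str) -> str:
        """Return the current landing page URL for a DOI."""
        return self._locations[doi]


resolver = ToyResolver()
resolver.register("10.99999/example.module", "https://old.example.org/module.html")
resolver.update_location("10.99999/example.module", "https://new.example.org/module.html")
print(resolver.resolve("10.99999/example.module"))
```

A citation that quotes the DOI keeps working after the move, whereas a citation that quoted the old URL would not.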

Page 17

Requirement: DOIs resolve to landing pages


• Every DOI resolves to a landing page.
• Landing pages provide metadata and further content.
• DOIs are designed to outlive their content.

OSGeo content like GRASS module manual pages already qualifies as landing pages for DOIs.

Page 18

DOI is a quality label


A digital object with a DOI has to be:

• Stable° (i.e. not going to be modified)
• Complete (i.e. not going to be updated)
• Permanent: by assigning a DOI we're committing to make the dataset available for posterity
• Good quality: by assigning a DOI it's receiving the data centre's stamp of approval, saying that it's complete and all the metadata is available

(° DOIs can handle software versioning)

Seal of Approval
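The "all the metadata is available" criterion could be checked mechanically before a DOI is requested. The sketch below assumes the mandatory properties of the DataCite metadata schema (identifier, creator, title, publisher, publication year, resource type) as the checklist. The dictionary keys, the function name and the sample record are hypothetical and not taken from any OSGeo project.

```python
from typing import Dict, List

# Assumed checklist, modelled on the mandatory DataCite properties.
# The key names are illustrative, not an official serialization.
REQUIRED_FIELDS = (
    "identifier",
    "creators",
    "title",
    "publisher",
    "publication_year",
    "resource_type",
)


def missing_metadata(record: Dict[str, object]) -> List[str]:
    """Return the required fields that are absent or empty in a record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Hypothetical record for an add-on module; all values are placeholders.
record = {
    "identifier": "10.99999/example.module",
    "creators": ["Example Author"],
    "title": "Example Module",
    "publisher": "",
    "publication_year": 2017,
}
print(missing_metadata(record))
```

A project could run such a check as part of its release routine, so that incomplete records never reach the registration step.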

Page 19

DOIs are on the rise


www.datacite.org

Page 20

Example: DOI for Journal Articles


Page 21

Example: DOIs for Data


Page 22

Example: DOIs for Code


Page 23

DOIs currently used by OSGeo: Video

Scientific-technical video is part of the research cycle

• FOSS4G presentations deserve scientific credit by citation and long-term preservation in a repository

Open Access, Open Data, Open Source

Page 24

OSGeo Videos with DOIs

• OSGeo conference recordings are hosted by the FOSS4G media partner, the German National Library of Science and Technology (TIB).

• The annual growth exceeds 100 hours of new content.

• OSGeo videos are part of the record of science

https://wiki.osgeo.org/wiki/Global_conferences_overview

http://dx.doi.org/10.5446/14749#t=39:10,39:33

DOI Timestamp

Scientific citation
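The example link above combines a DOI with a time segment in the URL fragment. A minimal sketch of how such a citation link could be taken apart is given here. The helper names are illustrative, and the reading of `t=start,end` as minutes and seconds is an assumption based on the link shown on this slide.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def to_seconds(stamp: str) -> int:
    """Convert "mm:ss" or "hh:mm:ss" to seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def parse_video_citation(url: str) -> Tuple[str, Optional[Tuple[int, Optional[int]]]]:
    """Return the DOI and the cited (start, end) segment in seconds, if any."""
    parts = urlsplit(url)
    doi = parts.path.lstrip("/")
    if not parts.fragment.startswith("t="):
        return doi, None
    start, _, end = parts.fragment[2:].partition(",")
    return doi, (to_seconds(start), to_seconds(end) if end else None)


doi, segment = parse_video_citation("http://dx.doi.org/10.5446/14749#t=39:10,39:33")
print(doi, segment)  # 10.5446/14749 (2350, 2373)
```

The DOI identifies the recording as a whole, while the fragment narrows the citation to the 23 seconds being referenced.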

Page 25

GRASS GIS


• GRASS GIS
• Over 3 decades of experience (since 1982)
• OSGeo project
• Over 350 modules
• Additional add-on modules
• Main repository: SVN

https://grass.osgeo.org/

Page 26

GRASS Code Citation


The GRASS GIS project wiki provides advice on how to cite versions of GRASS GIS in scientific publications. No coverage of DOIs (yet).
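As a rough illustration of what version-specific citation advice extended by a DOI might produce, the following sketch formats a BibTeX entry for a piece of software. It is not the wording recommended by the GRASS wiki. The function name, the field selection and every value in the example are placeholders.

```python
from typing import Optional


def software_bibtex(key: str, author: str, title: str, version: str,
                    year: int, url: str, doi: Optional[str] = None) -> str:
    """Format a BibTeX @misc entry for one specific software version."""
    fields = [
        ("author", author),
        ("title", title),
        ("version", version),  # cite the exact version that was used
        ("year", str(year)),
        ("url", url),
    ]
    if doi:
        fields.append(("doi", doi))  # only present once a DOI has been minted
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@misc{{{key},\n{body}\n}}"


# Placeholder values only; no real module or DOI is implied.
print(software_bibtex(
    key="example2017",
    author="Example Author",
    title="Example Module",
    version="1.0",
    year=2017,
    url="https://example.org/module",
    doi="10.99999/example.module",
))
```

Keeping the version as its own field matches the "Specificity" principle listed earlier: the reference points to a specific code version, not to the project in general.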

Page 27

GRASS Code Levels


1. "External code", based on GRASS repo, not shared with community, not hosted in OSGeo GRASS repository. Potentially volatile.

2. Add-on modules: Hosted and preserved in OSGeo GRASS repository, minimal quality standards, including standardized landing page (GRASS module manual page), limited peer review, discoverable by GRASS search functions

3. Core modules: Hosted and preserved in GRASS repository, manual page with links to previous code versions, demo data, reference to developers, rigorous peer review by GRASS community, discoverable by GRASS search functions

Page 28

Individual Level: Zenodo Option for external code / add-ons


[Diagram: DOI hierarchy (International DOI Foundation, Registration Agencies, Members, Managing Agent, Datacenters) with a personal GitHub repo]

Dawn of a code diaspora?

Page 29

Community Level: Zenodo Option for GRASS Repository


[Diagram: DOI hierarchy (International DOI Foundation, Registration Agencies, Members, Managing Agent, Datacenters) with the GRASS code repo; SVN repo migration (RISK); other OSGeo projects?]

Page 30

Reality check: Zenodo (and figshare) are all-purpose repositories: One size fits all?


Rueda, Laura. (2017, May). Introduction to DataCite. Zenodo. http://doi.org/10.5281/zenodo.571808

All-purpose. Good?

Page 31

Project Community Level: GRASS Project DataCenter


[Diagram: DOI hierarchy (International DOI Foundation, 9 Registration Agencies, Members, Managing Agent, Datacenters) with the GRASS SVN repo as a datacenter]

Page 32

Umbrella Option: OSGeo becomes a DOI member, unlimited DOI minting for all OSGeo projects.


[Diagram: DOI hierarchy (International DOI Foundation, Registration Agencies, Members, Managing Agent, Datacenters) with datacenter repos for all OSGeo projects, including the GRASS SVN repo; metadata guidelines]

Page 33

Opportunity: OSGeo to benefit from DataCite Services


Search.datacite.org

Page 34

Proposal for Follow-up Action


• Make code citation an OSGeo topic
  • Journal
  • Projects
  • Incubation

• Discuss DOI-/citation-related best practices within OSGeo

• Explore: Conduct tests on project level

Geo For All

Page 35

Thank you very much for your attention.

DIW Berlin (Deutsches Institut für Wirtschaftsforschung e.V.)
Mohrenstraße 58, 10117 Berlin
www.diw.de

Editor: Peter Löwe ([email protected])

Peter Löwe¹, Markus Neteler², Jan Goebel³ and Marco Tullney⁴

¹ German Institute for Economic Research, Mohrenstraße 58, 10117 Berlin, Germany
E-mail: [email protected]
ORCID: orcid.org/0000-0003-2257-0517

² Mundialis GmbH & Co. KG, Kölnstraße 99, 53111 Bonn, Germany
E-mail: [email protected]
ORCID: orcid.org/0000-0003-1916-1966

³ German Institute for Economic Research, Mohrenstraße 58, 10117 Berlin, Germany
E-mail: [email protected]
ORCID: orcid.org/0000-0002-3243-1935

⁴ Technische Informationsbibliothek, Welfengarten 1B, 30167 Hannover, Germany
E-mail: [email protected]
ORCID: orcid.org/0000-0002-5111-2788