

Open Science and Identifiers

Hideaki Takeda
National Institute of Informatics

[email protected]
ORCID: 0000-0002-2909-7163

Keynote talk, JATS-CON Asia in Tokyo, JST Tokyo headquarters, Tokyo, Japan, October 19, 2015

Internet changes our life

Law

Norm

Market

Architecture

four modalities of regulation (Lawrence Lessig)

So our society is becoming an Open Society

Globalism, Borderless, Cross-culture, Nomad life, …

Internet changes science

Law

Norm

Market

Architecture

four modalities of regulation (Lawrence Lessig)

So Science is becoming Open Science

• Open science can be discussed from a philosophical, political, methodological, or any other point of view.

• “Open Science NOW” is driven and realized by the Internet as Architecture

• So data sharing is the core of Open Science

Data sharing

Researcher before Digital Age

[Diagram labels: papers; data; research target; Survey; Paper working; Research & Writing]

Researchers now

[Diagram labels: Data use; Data publishing; Research, Writing & Data publishing; papers; data; research target; Survey; Paper working]

Researcher in Future

Data

Data use Data publishing

Integration of papers & data

Data publishing

Research = Data Supply-chain

Data sharing

Data Sharing? or

Data Publication? or

Open Data?

Data Life Cycle

• Data is created, shared, published, and archived

• But being merely “published” is not enough; data should be “openly published” (open data)

Data: Create → Share → Publish → Archive

Research phase: In Progress → Results

Open Data

• “A piece of data or content is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike.”   http://opendefinition.org/

• Open data is data publication under an open license
– An open license ensures the above condition
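
As a rough illustration of the definition above, the following Python sketch treats a dataset as open data only when it is both published and released under an open license. The `Dataset` class, its field names, and the contents of `OPEN_LICENSES` are assumptions made for this example; they do not come from the talk.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed examples of licenses that require at most attribution and/or share-alike.
OPEN_LICENSES = {"CC0-1.0", "CC-BY-4.0", "CC-BY-SA-4.0"}


@dataclass
class Dataset:
    title: str
    published: bool
    license_id: Optional[str] = None  # SPDX-style identifier (assumed convention)


def is_open_data(ds: Dataset) -> bool:
    """True only if the data is published AND carries an open license."""
    return ds.published and ds.license_id in OPEN_LICENSES
```

A dataset that is published without a license, or under a non-commercial license, would not count as open data under this check.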

Data Life Cycle

• Different tools for different stages of the life cycle
– Data sharing: generating, federating, …
– Data publishing: searching, harvesting, …
– Data archiving: migration, …

• The architecture CAN be shared

Data: Create → Share → Publish → Preserve

Research phase: In Progress → Results

Stakeholder: Researcher/R. Group, Research Institute

But still, why should research data be open?

Four reasons for openness of research data

• Demands from Society
– Knowledge sharing among society
– Accountability for public money

• Demands in Science
– Future development of Science itself
  • “Standing on the shoulders of giants” (nanos gigantum humeris insidentes)
– Reproducibility

Dimensions of Science

• Local - Global
• Open - Authorized

Dimensions of Science

[Chart axes: Local to Global; Authorized to Open]

[Plotted: Government; University; Publisher; Citizen Science/Open Science]

Various stakeholders stand on different positions on Open Science

Architecture of data sharing

Repository

Architecture of data sharing

Identifier

Data

Format

Metadata

ID

Metadata Schema

Systematic Integration across the layers

Interoperability on each layer

Metadata description language, collection and sharing, conversion

Schema description language, collection and sharing, conversion

System development, community

Management: organization, systems, ID federation

Repository

Architecture of data sharing

Identifier

Data

Format

Metadata

Metadata Schema

DOI, ORCID, FundRef

DataCite, CrossRef, JaLC, Dublin Core, DCAT, CKAN, Linked Data

Organization, Schema, System, Technology

Coordination and Competition

DSpace, Fedora, WEKO

Research Activities and Related Entities

[Diagram labels: Survey; Article Writing; Acquiring Data; Publishing Data; Digital Articles; Data; Digital objects; Research Institutions (affiliated); Funding agencies; Projects (supported); Academic Societies; Topics]

Research Activities and Related Entities (with IDs)

[Same diagram as before, with an ID attached to each entity and digital object]

Identifiers for research

• A research activity is represented with a structure of identifiers
– Planned and submitted
– Organized and executed
– Concluded and evaluated

Identifiers for research

• ID for
– Article
– Data
– Researcher
– Institutions, affiliation
– Funding agency, funded project
– Academic society
– Topic
– …
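
One way to picture "a structure of identifiers" is a record that holds nothing but IDs for the related entities listed above. The sketch below is only an illustration: the `ResearchActivity` class and its field names are hypothetical, and the identifier values in the usage note are reused from the slides purely as placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ResearchActivity:
    """A research activity described only by identifiers of related entities."""
    article_ids: List[str] = field(default_factory=list)      # e.g. DOIs
    data_ids: List[str] = field(default_factory=list)         # e.g. DOIs
    researcher_ids: List[str] = field(default_factory=list)   # e.g. ORCID iDs
    institution_ids: List[str] = field(default_factory=list)
    funder_id: Optional[str] = None
    project_id: Optional[str] = None
    society_ids: List[str] = field(default_factory=list)
    topic_ids: List[str] = field(default_factory=list)

    def missing(self) -> List[str]:
        """Names of entity kinds that have no identifier yet."""
        return [name for name, value in vars(self).items() if not value]
```

For example, `ResearchActivity(article_ids=["10.1007/978-3-642-21616-9_30"], researcher_ids=["0000-0002-2909-7163"]).missing()` lists the entity kinds, such as `data_ids`, that still lack an identifier.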

Nature of IDs for research

[Chart axes: Local to Global; Authorized to Open]

[Plotted: DOI; ORCID; Institution Member ID; URI; ResearchGate/Academia.edu/…; Grant ID; Kaken Grant ID; Kaken Researcher ID; PubMed ID; ResearchMap; Facebook]

Nature of IDs for Science

• Balance in some features
– Global vs. Local
  • Global: Unified service
  • Local: Specialized service
– Authorized vs. Open
  • Authorized: Trusted, restricted
  • Open: No restrictions
– Charged vs. Free

• Multiple IDs can co-exist in a single category
• How to manage multiple IDs
– Integration/mapping/associating/discovering
– Control/Manage/Authorize
– Private/Share/Open
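
The "mapping/associating" option for multiple IDs can be sketched as a small union-find structure that records which identifiers refer to the same entity. This is a minimal sketch under stated assumptions: the `IdMapper` class, the scheme names, and the non-ORCID values in the usage note are all hypothetical, and the talk does not prescribe any particular mechanism.

```python
from typing import Dict, Set, Tuple

IdPair = Tuple[str, str]  # (scheme, value), e.g. ("orcid", "0000-0002-2909-7163")


class IdMapper:
    """Groups identifiers that denote the same entity (union-find)."""

    def __init__(self) -> None:
        self._parent: Dict[IdPair, IdPair] = {}

    def _find(self, x: IdPair) -> IdPair:
        self._parent.setdefault(x, x)
        while self._parent[x] != x:
            # Path halving keeps later lookups short.
            self._parent[x] = self._parent[self._parent[x]]
            x = self._parent[x]
        return x

    def link(self, a: IdPair, b: IdPair) -> None:
        """Record that two identifiers refer to the same entity."""
        self._parent[self._find(a)] = self._find(b)

    def same_entity(self, a: IdPair, b: IdPair) -> bool:
        return self._find(a) == self._find(b)

    def lookup(self, known: IdPair, scheme: str) -> Set[str]:
        """All values in `scheme` mapped to the same entity as `known`."""
        root = self._find(known)
        return {
            value
            for (s, value) in list(self._parent)
            if s == scheme and self._find((s, value)) == root
        }
```

Linking `("orcid", …)` to `("kaken", …)` and then `("kaken", …)` to `("researchmap", …)` lets `lookup` discover the ORCID iD starting from the ResearchMap account. Control and authorization of such links, which the slide also raises, are outside this sketch.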

DOI

Management

Repository

DOI in Architecture of Data Sharing

Identifier

Data

Format

Metadata Schema

DOI

DataCite Metadata Schema

JaLC Metadata Schema

JaLC/DataCite Metadata

Members (data providers)

Domain-specific metadata schemata

DOI (Digital Object Identifier)

• Service to translate DOI names to URIs containing digital objects

• Service managed by International DOI Foundation (IDF)

• Initially started by STM publishers to share identifiers for digital publications

• Distributed management
– Delegation of registration tasks to Registration Agencies (RAs)

DOI (Digital Object Identifier)

• Service to translate DOI names to URIs containing digital objects

DOI: doi: 10.1007/978-3-642-21616-9_30 → URL: http://www.springerlink.com/content/xkj2386758245u85/

DOI as URL: http://doi.org/10.1007/978-3-642-21616-9_30 → URL: http://www.springerlink.com/content/xkj2386758245u85/
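
The "DOI as URL" form shown above can be produced mechanically from a DOI name. The Python sketch below only builds the resolver URL; it does not contact the resolver or follow the redirect to the landing page. The function name `doi_to_url` and the default `resolver` value (taken from the slide) are choices made for this example.

```python
from urllib.parse import quote


def doi_to_url(doi: str, resolver: str = "http://doi.org/") -> str:
    """Turn a DOI name such as 'doi: 10.1007/...' into its resolver URL."""
    name = doi.strip()
    if name.lower().startswith("doi:"):
        name = name[4:].strip()  # drop the 'doi:' label
    # Percent-encode characters that are unsafe in a URL path; keep '/'.
    return resolver + quote(name, safe="/")
```

Calling `doi_to_url("doi: 10.1007/978-3-642-21616-9_30")` gives the "DOI as URL" string from the slide.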

Management Structure of DOI

• Three layers: International DOI Foundation (IDF), Registration Agencies (RAs), members
• RAs contribute to the IDF through registration in the Registry DBs, management of the Registry DBs, and membership fees
• RAs offer DOI registration services to their members
• Members can register DOIs for their digital objects through RAs

[Diagram: IDF at the top; RAs below it (CrossRef, DataCite, JaLC); members below the RAs (CrossRef: Publishers; DataCite: University, Library, Research Institute; JaLC: Publisher, University, Academic Society)]

Roles of DOI

• Provide resolvable, persistent, interoperable links
– Resolvable: standard syntax + mapping by the Handle System
– Persistent
  • Technically: management of registry DBs
  • Socially: organizational operations and duties for members
– Interoperable: sharing a data model
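
The "standard syntax" mentioned above is a prefix beginning with `10.`, a slash, and a suffix chosen by the registrant. A small Python sketch can split a DOI name into those two parts. The regular expression is a heuristic assumed for this example (a four-to-nine digit registrant code), not the normative DOI syntax, and `parse_doi` and `DoiName` are hypothetical names.

```python
import re
from typing import NamedTuple


class DoiName(NamedTuple):
    prefix: str  # '10.' followed by the registrant code
    suffix: str  # assigned by the registrant


# Heuristic pattern: real prefixes may be subdivided further.
_DOI_RE = re.compile(r"^(10\.\d{4,9})/(\S+)$")


def parse_doi(name: str) -> DoiName:
    """Split a DOI name into prefix and suffix, or raise ValueError."""
    match = _DOI_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognizable DOI name: {name!r}")
    return DoiName(match.group(1), match.group(2))
```

For the DOI used earlier in the slides, `parse_doi("10.1007/978-3-642-21616-9_30")` yields the prefix `10.1007` and the suffix `978-3-642-21616-9_30`.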

Registration Agencies (RAs)

Airiti, Inc.

CrossRef

China National Knowledge Infrastructure (CNKI)

DataCite

EIDR (Entertainment Identifier Registry)

ISTIC (The Institute of Scientific and Technical Information of China)

JaLC (Japan Link Center)

mEDRA (Multilingual European DOI Registration Agency)

OP (Publications Office of the European Union)

CrossRef

• Ensures accessibility and citation of articles and books in STM publications
• Started in 1999
• Largest and oldest RA of the IDF
– Most registered DOIs are registered via CrossRef
– Members from over 70 countries; most are publishers

• Functions
– DOI registration
– Metadata management
  • Bibliographic metadata
  • Citation
– Services with metadata
  • Search for bibliographic metadata and citations
  • Reverse lookup

DataCite

• IDF RA for research data
• A not-for-profit organization since 1 December 2009

Japan Link Center (JaLC)

• Founded in March 2012
• Aims to register DOIs for academic content produced in Japan or in Japanese, in order to circulate information in Japan and overseas
• Controlled by four national organizations:
– Japan Science and Technology Agency (JST)
– National Institute for Materials Science (NIMS)
– National Institute of Informatics (NII)
– National Diet Library (NDL)
• Operated by JST
• Membership system (academic societies, publishers, university libraries, etc.)
• External coordination: JaLC is a member of CrossRef and DataCite (Mar. 2014)
• Over 1,300,000 DOIs registered

Content categories

• Journal articles
– Journal articles: Dec. 2012 -
– University bulletins: Sep. 2014 -
– Conference proceedings: Mar. 2012 -
• Books
– Books: Jan. 2015 -
– Doctoral theses: Mar. 2014 -
• Reports
– Technical reports: Jan. 2015 -
– Governmental reports: Jan. 2015 -
• Research data: Jan. 2015 -
• e-learning resources: Jan. 2015 -

DOI Registration Flow

[Diagram labels: JaLC members; Article; Data; DOI + article metadata; DOI + data metadata; JaLC (JaLC, CrossRef, and DataCite metadata); CrossRef (DOI + CrossRef metadata); DataCite (DOI + DataCite metadata); IDF (DOI)]

Experiment Project to register DOIs for Research Data

• Goal
− Establish operation flows to register DOIs for research data and achieve stable operation
• Objectives
− Set policies for registering DOIs for research data
− Establish operation flows to register DOIs for research data with the next version of the JaLC system, and verify them by performing registration tests
− October 2014 – October 2015

Members of the project

9 projects with 14 organizations

• National Bioscience Database Center (NBDC), Japan Science and Technology Agency (JST)
• National Institute of Polar Research (NIPR)
• National Institute of Informatics (NII)
• DIAS-P Project (National Institute of Informatics (NII))
– Japan Agency for Marine-Earth Science and Technology (JAMSTEC)
– University of Tokyo
– Kyoto University
– National Institute for Environmental Studies (NIES)
• National Institute of Advanced Industrial Science and Technology (AIST)
• National Institute of Information and Communications Technology (NICT)
– Kyoto University
– National Institute of Informatics (NII)
– InfoProto Co., Ltd.
– Japan Aerospace Exploration Agency (JAXA)
– National Institute of Polar Research (NIPR)
• Chiba University Library
• National Institute for Materials Science (NIMS)
• Neuroinformatics Japan Center, Brain Science Institute (BSI), RIKEN

Issues in Data DOI

• Flow of operations
• Persistent access
• Granularity of data in registration
• Dynamics of data
• Landing page
• Quantity of data
• Applications

Issues in Data DOI

• Flow of operations: Who, When, How
− Who registers data? Researcher/Project manager/Librarian
− When is data registered?
− How is metadata provided for data?

• Persistent access
− What persistency can we expect for data?
− Can time-limited projects participate? Who will ensure the persistency of the data?
  (e.g.)
  • The representative institute takes over all of the data
  • Register DOIs only for data managed by real organizations among the members of the project

Life cycle of data and stakeholders (in case of literature)

[Diagram labels: ID: Register; metadata: Create, Register, Modify; Data: Create, save, publish, Modify, remove; Stakeholders: Researcher, Library, Institutional Repository]

Life cycle of data and stakeholders (in case of data)

[Diagram labels: ID: Register; metadata: Create, Register, Modify (JaLC metadata, domain metadata); Data: Create, save, publish, Modify, remove; Stakeholders: Researcher, Library, Research Institution, Project]

Issues in Data DOI (cont’d)

• Granularity of data in registration
– Some aspects of the granularity of data
  • Good for citation
  • Granularity of data itself
    – Observation data/Experiment data/Simulation data
  • Easy for access
  • Easy for management
  • Quantity of data

Issues in Data DOI (cont’d)

• Dynamics of data
− Adding data after registration of a DOI
− Some options:
  − Different DOIs
    − Add relationship metadata to denote the relation to the original DOIs
  − Use the original DOI
    − Versioning: add a link to the new data while keeping the link to the original data
    − History of changes in the single DOI
    − No descriptions (e.g., data still under observation)
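
The two main options for data that changes after registration can be contrasted in a short Python sketch. Everything here is assumed for illustration: the `DataRecord` class, the function names, the `10.5555/...` DOIs, and the `example.org` URLs are hypothetical. The relation names `IsNewVersionOf` and `IsPreviousVersionOf` are borrowed from DataCite's relationType vocabulary; the talk does not say which relations JaLC would use.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DataRecord:
    doi: str
    url: str
    related: List[Tuple[str, str]] = field(default_factory=list)  # (relation, other DOI)
    history: List[str] = field(default_factory=list)  # earlier URLs kept under this DOI


def register_new_doi(registry: Dict[str, DataRecord], original: DataRecord,
                     new_doi: str, new_url: str) -> DataRecord:
    """Option 1: a different DOI, tied to the original by relationship metadata."""
    record = DataRecord(new_doi, new_url, related=[("IsNewVersionOf", original.doi)])
    original.related.append(("IsPreviousVersionOf", new_doi))
    registry[new_doi] = record
    return record


def update_same_doi(original: DataRecord, new_url: str) -> None:
    """Option 2: keep the original DOI, point it at the new data, keep the old link."""
    original.history.append(original.url)
    original.url = new_url
```

Option 1 keeps every citation pointing at exactly the data that was cited, at the cost of more DOIs to manage; option 2 keeps one stable DOI but relies on the recorded history to recover what an earlier citation referred to.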

Issues in Data DOI (cont’d)

• Landing page
− Metadata description
− For open/closed data

• Quantity of data
− Registering DOIs for a large amount of data

• Applications
− Citing DOIs for research data
− Developing other applications

Recommendations for Data DOIs

• Recognition of the variety in the nature of data
• Minimal commitment
– Persistency, interoperability, usability, manageability
• Design your own DOI registration policy

ID for Researchers

ORCID (Open Researcher and Contributor Identifier)

• ID to uniquely identify researchers and contributors to research

• Managed by ORCID, Inc. (NPO), 2011-
– Members: STM publishers, universities, funding agencies

• Service started in October 2012
• How to use ORCID
– When submitting manuscripts
– Author information in articles
– Faculty management
– …
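
An ORCID iD is a 16-character identifier whose last character is a check character computed with the ISO 7064 MOD 11-2 algorithm, so a system that accepts iDs at manuscript submission can reject most typing errors locally. The Python sketch below shows that check; the function names are chosen for this example, and it verifies only the checksum, not whether the iD is actually registered.

```python
def orcid_check_digit(base15: str) -> str:
    """ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base15:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length, digits, and check character of an ORCID iD."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15]
```

The iD on the title slide, `0000-0002-2909-7163`, passes this check, while changing its last digit makes it fail.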

Metadata Management

Linked Data

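• Network of metadata
• Sharing metadata among RAs
– CrossRef
– DataCite
– (JaLC)

[Linked Data example: a graph of metadata about a museum artwork]

Work (WorkURI)
– Title: 真夏の太陽 (Midsummer Sun)
– Date: 1989
– Description: “Viewing ‘Walking in Silence’ through ‘Midnight Sun’, which for some reason you want to peer into when you come close. You can encounter the works in a different way than usual.”
– Category
– Image
– Creator: CreatorURI
– Is_located_in: MuseumPlaceURI

Creator (CreatorURI)
– Name: Isamu [email protected]
– E-address
– wikipedia
– Image

Museum (MuseumPlaceURI)
– Label: Yokohama Museum
– Address: 3-4-1, Minato Mirai, Nishi-ku, Yokohama
– Phone: 045-221-0300
– Image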
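
The idea of a "network of metadata" can be sketched with plain subject-predicate-object triples, where a value may itself be the URI of another node. The Python example below uses only the standard library; the `example.org` URIs, the predicate names, and the assignment of values to the work and museum nodes are assumptions about how the diagram groups them.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical URIs standing in for WorkURI, CreatorURI and MuseumPlaceURI.
WORK = "http://example.org/work/1"
CREATOR = "http://example.org/creator/1"
MUSEUM = "http://example.org/museum/1"

TRIPLES: List[Triple] = [
    (WORK, "date", "1989"),
    (WORK, "creator", CREATOR),
    (WORK, "is_located_in", MUSEUM),
    (MUSEUM, "label", "Yokohama Museum"),
    (MUSEUM, "address", "3-4-1, Minato Mirai, Nishi-ku, Yokohama"),
    (MUSEUM, "phone", "045-221-0300"),
]


def objects(triples: List[Triple], subject: str, predicate: str) -> List[str]:
    """All objects stated for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def museum_labels_of(work: str) -> List[str]:
    """Follow work -> is_located_in -> label across two linked nodes."""
    return [
        label
        for museum in objects(TRIPLES, work, "is_located_in")
        for label in objects(TRIPLES, museum, "label")
    ]
```

Because the work only stores the museum's URI, the museum's label, address, and phone number are maintained in one place and can be shared by every record that links to it.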

Summary

• Open Science is backed by data sharing
• Data-sharing architecture
– Interoperability should be guaranteed
– Layers
  • ID/Metadata Schema/Metadata/Data format/Data/Repository
– Cooperation and Competition

• DOI is a promising ID for data, but its use differs from that for literature
– A DOI registration policy is needed