CRIS-OAR project presentation
DESCRIPTION
Presentation on the Knowledge Exchange project on metadata standards for exchange between Current Research Information Systems and Open Access Repositories. Presented at EUROCRIS, Aalborg, Denmark, 4 June 2010 by Mogens Sandfaer.
TRANSCRIPT
CRIS
OAR
INTER
OPERA
+
+
Knowledge Exchange
CRIS-OAR interoperability project
publication metadata
Knowledge Exchange is an international co-operative effort
that supports the use and development of e-infrastructures
for higher education and research.
Partners are:
Denmark’s Electronic Research Library (DEFF)
German Research Foundation (DFG)
Joint Information Systems Committee (JISC) in the UK
SURF foundation in the Netherlands
Motivation: Enable broad collaboration in the
information management of research publications
Current Research Information Systems
a label for research management systems of various
types, dealing with many aspects of research activities
contain metadata on research publications
Open Access Repositories
a label for open research output archives aiming at
preservation and dissemination of publications etc.
contain metadata on research publications
They share the challenge of achieving full metadata
coverage for the publications within their scope
If CRIS and OAR could easily exchange metadata about
publications, they could support each other
But CRIS and OAR have grown out of different
communities and have developed rather different
approaches to publication metadata
If a university has a CRIS and an OAR, generally a
publication must be registered twice to comply with
both systems’ requirements
Both CRIS and OAR strive to be complete in their
coverage of publications – both would benefit from
collaboration – not to mention the authors/researchers.
CRIS use a variety of formats – some use CERIF
(or variants thereof) and some use various local or
national formats
In many disciplines, publications are of global interest
and are often results of international collaboration
They are often of interest to more than one CRIS
CRIS with different formats would benefit from an easy
and precise mechanism to exchange publication
metadata
OAR use a variety of formats – some use Dublin Core
(or variants thereof), some use library formats such as
MARC and MODS, and some use various local or
national formats
In many disciplines, publications are of global interest
and are often results of international collaboration
They are often of interest to more than one OAR
OAR with different formats would benefit from an easy
and precise mechanism to exchange publication
metadata
Aim and purpose
To increase the metadata interoperability
between CRIS and OAR systems
and thus also
between CRIS and CRIS with different formats
between OAR and OAR with different formats
by defining and proposing
1. a metadata exchange format for publications
2. a set of common vocabularies for key elements
Project participants
UK - JISC
Rosemary Russell, UKOLN
Michael Day, UKOLN
Simon Lambert, Rutherford Appleton
DE - DFG
Wolfram Horstmann, Bielefeld University
Najko Jahn, Bielefeld University
Friedrich Summann, Bielefeld University
Max Stempfhuber, Aachen University
NL - SURF
Marga van Meel, KNAW
Arnoud Jippes, KNAW
Ed Simmons, Nijmegen Univ.
DK - DEFF
Adrian Price, Copenhagen Univ.
Mikael Elbaek, Technical Univ. DK
Mogens Sandfaer, Technical Univ. DK
Project manager
Project director
Building new bridges in the old world
Not designing new (and better) worlds
This metadata island knows well what it is doing. Good
reasons govern its choice of format and vocabulary.
We (simply) build a bridge
that will enable these islands to communicate
- without changing their language and lifestyle.
That will allow them to exchange publication metadata
without studying and understanding the particularities of the other party.
Challenges stemming from
different missions of formats
The different nature (and tasks) of
CRIS formats
Repository formats
The granularity challenge
The different nature of CRIS
and repository formats
Typical CRIS main entities and their relations
(many triples & many detailed fields)
The different nature of CRIS
and repository formats
Simple Dublin Core
15 fields in a single flat structure
Aimed at the description of some sort of “document”
May be enhanced to provide more granularity and specificity
But mostly isn’t
Bridging publications metadata
CRIS formats are characterized by their
broader view on research information depicting research
results as well as the actors and various environmental
factors in their own right
(often) high level of detail and specificity in describing the
various entities (very granular and precise)
ability to handle the dynamics of time, as everything except
the research publications changes over time, as do the
interrelations between entities
Bridging publications metadata
OAR (DC) formats are characterized by their
Narrow view on depicting research results – generally
publications
(mostly) low level of detail and specificity in describing the
various aspects (less granular)
absence of need to handle the dynamics of time – as they
deal with research publications tied to a specific point in
time
Bridging publications metadata
Implode the relational/network nature of the
CRIS formats to a single structure – adequate for
describing publications
Design the field/element hierarchy so that highly
granular as well as less granular metadata may be
represented, without loss of information
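A minimal sketch of the "imploding" idea in Python. The entity tables, the link triples and all field names below are hypothetical and are not taken from CERIF or from the project's exchange format. Linked entities are folded into one publication-centred record, and each person carries both a plain display name and the granular name parts, so a less granular system can ignore the detail while nothing is lost for a granular one.

```python
# Hypothetical CRIS-style data: separate entity tables plus link triples.
persons = {
    "p1": {"first_name": "Ada", "last_name": "Example"},
    "p2": {"first_name": "Ben", "last_name": "Sample"},
}
organisations = {"o1": {"name": "Example University"}}
publications = {"pub1": {"title": "An example paper", "year": 2010}}

# (entity id, role, publication id) triples linking entities to publications.
links = [
    ("p1", "author", "pub1"),
    ("p2", "editor", "pub1"),
    ("o1", "publisher", "pub1"),
]


def implode(pub_id):
    """Fold all entities linked to one publication into a single nested record."""
    record = dict(publications[pub_id])
    record["persons"] = []
    record["organisations"] = []
    for entity_id, role, target in links:
        if target != pub_id:
            continue
        if entity_id in persons:
            person = persons[entity_id]
            record["persons"].append({
                "role": role,
                # Less granular form: a single display string.
                "name": f"{person['last_name']}, {person['first_name']}",
                # Granular form kept alongside, so no information is lost.
                "first_name": person["first_name"],
                "last_name": person["last_name"],
            })
        elif entity_id in organisations:
            record["organisations"].append(
                {"role": role, "name": organisations[entity_id]["name"]}
            )
    return record


flat = implode("pub1")
```

The publication is the root of the resulting record, and every other entity appears only in relation to it.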
Project approach
[Diagram: local formats (DRIVER DC, CERIF, NARCIS MODS, DDF-MXD, METIS, ePrints default) connected through a common metadata exchange format and vocabulary]
Project approach
1. Analyze metadata practices of CRIS and OAR
Looking at formats in actual use at KE partners
Chart entities and granularities, similarities, differences
Project approach
2. Define entities/elements/attributes to be exchanged
Respecting differences in granularity
So that metadata may be exported without loss of information
So that the format may be used by very granular
environments as well as less granular
3. Define/propose common exchange vocabulary
For the identified key concepts/entities
4. Define/propose common exchange syntax
Handle differences in granularity
Some potential use cases
CRIS → OAR
OAR → CRIS
CRIS → CRIS
OAR → OAR
CRIS/OAR → OpenAIRE (EU Open Access pilot)
Publisher → CRIS/OAR
Subject repository → CRIS/OAR (institutional)
Over to Mikael
Based on ideal examples: “use cases”
Ideal example of a publication
Basic idea evolved
To carry both the highest granularity (CRIS) and the lowest
level (OAR?)
The DC elements are used as a baseline.
Title
Creator
Subject
Description
Publisher
Contributor
Date
Type
Format
Identifier
Source
Language
Relation
Coverage
Rights
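A minimal sketch in Python of how a granular record could be reduced to this baseline. The layout of the `granular` record and the rule that only authors become `creator` are assumptions made for illustration; only the 15 element names come from Simple Dublin Core.

```python
# The 15 Simple Dublin Core element names, in the order listed above.
DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]

# Hypothetical granular record, as a detailed CRIS might hold it.
granular = {
    "title": "An example paper",
    "persons": [
        {"role": "Author", "first_name": "Ada", "last_name": "Example"},
        {"role": "Editor", "first_name": "Ben", "last_name": "Sample"},
    ],
    "date": {"year": 2010, "month": 6, "day": 4},
    "type": "Conference paper",
    "language": "en",
}


def to_simple_dc(record):
    """Reduce a granular record to flat, repeatable DC string values."""
    dc = {element: [] for element in DC_ELEMENTS}
    dc["title"].append(record["title"])
    for person in record.get("persons", []):
        name = f"{person['last_name']}, {person['first_name']}"
        # Assumption: authors map to creator, every other role to contributor.
        target = "creator" if person["role"] == "Author" else "contributor"
        dc[target].append(name)
    date = record.get("date")
    if date:
        dc["date"].append(
            f"{date['year']:04d}-{date['month']:02d}-{date['day']:02d}"
        )
    for element in ("type", "language"):
        if element in record:
            dc[element].append(record[element])
    # Keep only the elements that actually received a value.
    return {element: values for element, values in dc.items() if values}


simple = to_simple_dc(granular)
```

The reduction is lossy in this direction: the editor role and the separate name parts disappear in the flat form, which is why the exchange format needs to carry the granular level as well.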
Main entities of interest
The publication is in focus and other entities are in relation to the publication
Person
Organisation
Event
Project
Publication
Person in more detail
Vocabularies
Person
Role
Description: the role of the person in
relation to the publication.
Terms:
Author
Primary Author
Corresponding Author
Editor
Publisher
Translator
Illustrator
Inventor
Supervisor
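A minimal sketch of how a system might check incoming role values against this vocabulary. The `PersonRole` enum and the `parse_role` helper are hypothetical names introduced here; only the nine terms come from the slide.

```python
from enum import Enum


class PersonRole(Enum):
    """The nine role terms listed above."""
    AUTHOR = "Author"
    PRIMARY_AUTHOR = "Primary Author"
    CORRESPONDING_AUTHOR = "Corresponding Author"
    EDITOR = "Editor"
    PUBLISHER = "Publisher"
    TRANSLATOR = "Translator"
    ILLUSTRATOR = "Illustrator"
    INVENTOR = "Inventor"
    SUPERVISOR = "Supervisor"


def parse_role(text):
    """Return the matching vocabulary term, ignoring case and extra spaces."""
    wanted = " ".join(text.split()).lower()
    for role in PersonRole:
        if role.value.lower() == wanted:
            return role
    raise ValueError(f"not a term in the role vocabulary: {text!r}")


role = parse_role("  corresponding   author ")
```

Rejecting unknown values at import time keeps local role labels from leaking into exchanged records unnoticed.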
Publication in detail – type, review
and
Publication types
Publication
Type
Description: the format provides a comprehensive list of publication
types based on the formats analysed in the project.
A mapping between the different systems and formats in the
analysis can be found on a web page.
Mapping between common vocabularies can be found at:
http://weekschild.uci.ru.nl/KE/?select=all
The formats analysed: CERIF2008, MODS/DIDL, DRIVER_DC, DDF-MXD,
EPrints, METIS, PURE
Publication types (terms)
Journal Letter
Journal comment
Journal review article
Journal book review
Book
Book chapter
Book preface
Conference paper
Conference abstract
Conference poster
Conference talk
Thesis Doctoral
Thesis PhD
Thesis Master
Working paper, preprint
Report
Report chapter
Lecture Notes
Lecture
Memorandum
Net publication
Patent
Software
Data set
Newspaper article
Radio/TV broadcast
Exhibition catalogue
Student report
Other
Vocabularies - Versions
Version
Description: This element and its vocabulary express the version of
the document, e.g. the draft or the published version. The terms
are based on the VERSIONS toolkit excluding the term “updated”.
Important! Different versions should be self-contained and constitute
individual records. This mirrors best practice for repositories but is not
always the case for CRIS.
Terms:
Draft i.e. working paper
Submitted i.e. pre print
Accepted i.e. post print
Published i.e. publisher edition
Updated i.e. reprint
VERSIONS project: http://www2.lse.ac.uk/library/versions/
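A minimal sketch of the "one record per version" rule in Python. The `cris_entry` layout and the `split_versions` helper are hypothetical; the mapping uses only the four terms on which the description and the term list agree, keyed by the informal names given next to them.

```python
# Informal name -> vocabulary term, as listed above.
VERSION_TERMS = {
    "working paper": "Draft",
    "pre print": "Submitted",
    "post print": "Accepted",
    "publisher edition": "Published",
}

# Hypothetical CRIS entry that bundles several versions of one publication.
cris_entry = {
    "title": "An example paper",
    "versions": [
        {"label": "pre print", "file": "paper-submitted.pdf"},
        {"label": "publisher edition", "file": "paper-published.pdf"},
    ],
}


def split_versions(entry):
    """Return one self-contained record per version of the publication."""
    shared = {key: value for key, value in entry.items() if key != "versions"}
    records = []
    for version in entry["versions"]:
        record = dict(shared)  # each record gets its own copy of the shared metadata
        record["version"] = VERSION_TERMS[version["label"]]
        record["file"] = version["file"]
        records.append(record)
    return records


records = split_versions(cris_entry)
```

Each resulting record stands on its own, which matches repository practice and lets a CRIS export bundled versions without changing how it stores them internally.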
Let’s test it!
The challenges for interoperability
Discussion!