M.Lautenschlager (WDCC, Hamburg) / 26.08.03 / 1
Conception of Citing Scientific Primary Data
(Result of CODATA WG, supported by DFG)
Michael Lautenschlager
WDC for Climate
Max-Planck-Institut für Meteorologie
IDF Coordination Meeting, Hannover, 04.09.2003
CODATA
Who are we?
  CODATA, the Committee on Data for Science and Technology, is an interdisciplinary Scientific Committee of the International Council for Science (ICSU). We were established over 30 years ago, and our secretariat is housed at 51, Bld de Montmorency, 75016 Paris, France.
What are our objectives?
  In short, the purpose of CODATA is to help foster and advance science and technology by developing and sharing knowledge about data and the activities that work with data.
CODATA WG
The CODATA National Committee initiated the WG, grant-aided by DFG
Working Period: September 2001 to May 2002
Result: Final Report "Konzept zur Zitierfähigkeit wissenschaftlicher Primärdaten" ("Conception of Citing Scientific Primary Data"), Hannover, 29.05.2002
Continuation: One-year project for pilot implementation, funded by DFG
WG-Members
Carola Kauhs (Head of Library, Max-Planck-Institut für Meteorologie, Hamburg)
Dr. Michael Lautenschlager (WG-Speaker and Director WDC for Climate; Gruppe Modelle und Daten am Max-Planck-Institut für Meteorologie, Hamburg)
Dr. Manfred Reinke (Scientific Information Systems, Stiftung Alfred-Wegener-Institut für Polar- und Meeresforschung, Bremerhaven)
Prof. Dr. Gerhard Schneider (Head of Computing Centre, Universität Freiburg)
Dr. Irina Sens (Deputy Head of Technische Informationsbibliothek und Universitätsbibliothek Hannover)
Dr. Uwe Ulbrich (Institute for Geophysics and Meteorology, Universität zu Köln)
Dr. Joachim Wächter (Head of Data and Computing Centre, GeoForschungsZentrum Potsdam)
Content
Problems
Concept
Scientific and Technical Data DOI
Pilot Project
Cost Model
WG Constraints
Limitation to geo-referenced data
  Primary data with a defined space-time relation, e.g. observational stations, satellites, climate models
Limitation to research data
  Especially data from time-limited projects
  Widely dispersed, not saved long-term, poorly documented
Exclusion of data from civil services and agencies
  Centrally archived and documented, but with access restrictions
  Partly subject to a scale of charges and fees for dissemination
Problem and Solution
Shortcomings in data provision and interdisciplinary use
  Rules of good scientific practice are not taken into account in all cases.
  Data sources are widely unknown.
  Data are archived without context.
Method of resolution: publication of primary data
  Persistent Identifier (PI) for long-term data referencing
  Individual scientists will be motivated to document and to customise their primary data.
Preferred dissemination by Internet
  Standard in science
  Allows for direct data access
Credits in Science
"Citation Index": Scientific efficiency is "measured" by publications.
Extra work for data publication is currently not acknowledged.
  Data processing, context documentation, quality assurance.
Recommendation: Data publications should be included in the "Citation Index".
  Motivation of the individual scientist.
  Connection between person and primary dataset.
Citable data publications
  support the rules of good scientific practice.
  encourage interdisciplinary data utilisation.
Publication of Primary Data in Journals
Scientific journals
  Restriction to original scientific work
  Only limited interest in data publication
  Copyrights on the data are transferred to the publishers
Example: Crystallography
  Measured spectral data of scientific publications are collected by the publisher in a central database
  Data access is controlled by the publishers; the sciences have only a limited say over data access
Primary data are considered as self-contained entities
  Databases and data products are the foundation for different publications
  How to reference and cite primary data entities?
Persistent Identifier
Concept developed for web publications:
Uniqueness
  Identification of units of intellectual property
Metadata kernel
  Description of the referenced entity
Immutability
  Identifiers are allocated only once; the entity is left unchanged
Stable connection
  The connection between identifier and referenced entity is stable
Central resolution
  The entity must be accessible via the identifier
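
The properties listed above can be illustrated with a toy registry. This is only a minimal sketch and not the handle system itself: the class name `IdentifierRegistry`, its methods, the identifier `pi:example-1`, and the `archive.example.org` locations are all hypothetical.

```python
class IdentifierRegistry:
    """Toy central resolver: each identifier points to exactly one entity."""

    def __init__(self):
        self._entries = {}  # identifier -> (location, metadata kernel)

    def register(self, identifier, location, metadata):
        # Uniqueness / immutability: an identifier is allocated only once.
        if identifier in self._entries:
            raise ValueError(f"identifier already allocated: {identifier}")
        # Metadata kernel: every entity needs a description.
        if not metadata:
            raise ValueError("a metadata kernel is required")
        self._entries[identifier] = (location, dict(metadata))

    def relocate(self, identifier, new_location):
        # Stable connection: the storage location may move,
        # the identifier stays the same.
        _, metadata = self._entries[identifier]
        self._entries[identifier] = (new_location, metadata)

    def resolve(self, identifier):
        # Central resolution: the entity is reachable via the identifier.
        location, _ = self._entries[identifier]
        return location


registry = IdentifierRegistry()
registry.register(
    "pi:example-1",                      # hypothetical identifier
    "https://archive.example.org/a/1",   # hypothetical location
    {"title": "Example data set"},
)
registry.relocate("pi:example-1", "https://archive.example.org/b/1")
print(registry.resolve("pi:example-1"))
```

A citation that uses the identifier keeps working after the relocation, because only the registry entry changes.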
Criteria for PI Allocation
The critical points are securing data quality and a stable connection between identifier and data entity
PI allocation is restricted to checking syntax and completeness, i.e. expert data description and long-term archiving
Scientific quality assurance is done by the author / originator.
High-quality data sets achieve good positions in the "Citation Index"
Stable connection between PI reference and data entity as well as long-term availability of the primary data are essential.
DOI-System
"Digital Object Identifier" (DOI) identifies and administers units of intellectual properties independent of the form and the granularity.
A DOI consists of an organisation-dependent prefix and the identifier, e.g. 10.1007/s102360100001
The DOI connects object identification with the URL (= storage location of the object) and with the metadata kernel (= description of the object)
The global handle system is provided by the IDF (International DOI Foundation) as a consistent entry point
Commercial application: links to publications across different publishers
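
The prefix/identifier structure and the single entry point can be sketched in a few lines. The function names `split_doi` and `resolver_url` are hypothetical, and the use of the public resolver address `https://doi.org/` is an assumption that is not stated on the slide.

```python
from urllib.parse import quote


def split_doi(doi):
    """Split a DOI into (prefix, identifier) at the first slash."""
    prefix, sep, suffix = doi.partition("/")
    # Every DOI prefix starts with the directory indicator "10.".
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


def resolver_url(doi, resolver="https://doi.org/"):
    """Build the URL that asks the central resolver for the object.

    The resolver address is an assumption, not taken from the slides.
    """
    split_doi(doi)  # reject malformed input first
    return resolver + quote(doi, safe="/")


prefix, suffix = split_doi("10.1007/s102360100001")
print(prefix)  # 10.1007
print(suffix)  # s102360100001
print(resolver_url("10.1007/s102360100001"))
```

The caller never needs to know the storage location; the resolver looks it up from the DOI.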
URN
Uniform Resource Name (URN)
  Supervision by the IETF (Internet Engineering Task Force)
  Similar structure and functionality compared with DOI
  Example: urn:nbn:de:gbv:089-33217752945
Application
  Non-commercial usage in library projects (e.g. registration of online dissertations by the DDB)
A central resolution system comparable to DOI is not yet implemented
The preferable PI for scientific primary data is presently the DOI
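
To show the structural similarity with the DOI, the URN example from this slide can be split in the same way. This is a minimal sketch based on the general URN form `urn:<namespace identifier>:<namespace-specific string>`; the function name `split_urn` is hypothetical.

```python
def split_urn(urn):
    """Split a URN into (namespace identifier, namespace-specific string)."""
    # Only the first two colons are structural; the rest belongs
    # to the namespace-specific string.
    scheme, nid, nss = urn.split(":", 2)
    if scheme.lower() != "urn" or not nid or not nss:
        raise ValueError(f"not a URN: {urn!r}")
    return nid.lower(), nss


nid, nss = split_urn("urn:nbn:de:gbv:089-33217752945")
print(nid)  # nbn
print(nss)  # de:gbv:089-33217752945
```

Unlike the DOI sketch, there is no resolver step here, which matches the statement that a comparable central resolution system is not yet in place.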
Application Profile: STD-DOI (Scientific and Technical Data DOI)
Concept: DOI for primary data under the responsibility of the sciences
  Allows for access regulations without commercial background
  Copyrights remain with the data originators
Structure: The DOI metadata kernel will be expanded by bibliographic specifications which allow for citation as for written publications
Allocation of an STD-DOI will be regarded as a data publication
A data set / entity is then citable as an independent object, like
  "Author, publication year: dataset name, STD-DOI"
The DOI system does not replace an expert data model, which is located at the expert level of long-term archiving
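
The citation pattern quoted above can be written as a small formatting function. This is a minimal sketch: the function name `cite_dataset` and all example values (author, title, and the DOI `10.9999/example`) are hypothetical and do not refer to a registered data set.

```python
def cite_dataset(author, year, title, std_doi):
    """Format a citation as 'Author, publication year: dataset name, STD-DOI'."""
    return f"{author}, {year}: {title}, {std_doi}"


# Hypothetical example values, for illustration only.
citation = cite_dataset("Doe, J.", 2003, "Example data set", "10.9999/example")
print(citation)  # Doe, J., 2003: Example data set, 10.9999/example
```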
STD-DOI Metadata
On top of the DOI metadata kernel as defined in the handbook
Additional items for citation of electronic documents are the basis:

Author                               STD-DOI
Title                                DOI-Kernel
Sub-title                            if applicable
Publication date                     STD-DOI
Institution / Publisher              STD-DOI
Data amount / no. of pages           STD-DOI
Place of publication                 STD-DOI
Identification number (DOI, ISBN)    DOI-Kernel
URL                                  DOI-Kernel
Language                             if applicable
Edition / version                    if applicable
Volume / series                      if applicable
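
The item list can be expressed as a record type with a completeness check. This is a sketch under assumptions: the class name `StdDoiRecord`, the field names, and the rule that every item not marked "if applicable" is mandatory are my reading of the slide, not a published schema. All example values are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional

# Items marked "if applicable" on the slide are treated as optional.
OPTIONAL_ITEMS = {"subtitle", "language", "edition", "series"}


@dataclass
class StdDoiRecord:
    # Items taken from the DOI metadata kernel
    title: str
    identifier: str
    url: str
    # Items added by the STD-DOI application profile
    author: str
    publication_date: str
    publisher: str
    data_amount: str
    place_of_publication: str
    # Items given only if applicable
    subtitle: Optional[str] = None
    language: Optional[str] = None
    edition: Optional[str] = None
    series: Optional[str] = None

    def missing_items(self):
        """Return the names of mandatory items that are empty."""
        return [
            f.name
            for f in fields(self)
            if f.name not in OPTIONAL_ITEMS and not getattr(self, f.name)
        ]


# Hypothetical record with the publisher left empty.
record = StdDoiRecord(
    title="Example data set",
    identifier="10.9999/example",
    url="https://archive.example.org/a/1",
    author="Doe, J.",
    publication_date="2003",
    publisher="",
    data_amount="1 GB",
    place_of_publication="Hamburg",
)
print(record.missing_items())  # ['publisher']
```

A check of this kind corresponds to the completeness control named in the allocation criteria; it says nothing about the scientific quality of the data.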
Architecture of Primary Data DOI
[Diagram; recoverable labels:]
International DOI Foundation; Global Handle System
Registration Agency; DOI-Prefix: xxx; Contract; Application Profile: STD-DOI; Securing of Compliance with Allocation Criteria
Agent Crystal, Agent Weather, Agent Cuneiform Writing; Sub-Prefix: xxx1, xxx2, xxx3
DOI-Metadata Entry; Primary Data; Data Storage, Long-term Archiving
DFG Project "Publication and Citation of Scientific Primary Data"
[Diagram; recoverable labels:]
International DOI Foundation; Global Handle System
TIB Hannover: Registration Agency
DDB: URN node
GFZ: Geophysics; M&D/MPIM: Climate Models; Marum/AWI: Observations
Data Storage, Long-term Archiving in WDC (shown twice); Data Storage, Long-term Archiving (shown once)
Cost Model
Pilot phase: project funding (DFG)
  (P1) Feasibility study including overall costs
  (P2) Pilot implementation
Operation: accounting on a workload basis
  (O1) One-time charge for DOI registration and maintenance
  or
  (O2) One-time charge for registration and annual charge for maintenance
Support by project funding agencies
  O1 "One-time charge" better fits the limitations of project funding.
  It must be permissible to include STD-DOI and long-term archiving in project grants.
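
The difference between the two operating models can be made concrete with a short calculation. This is a sketch only: the function names are hypothetical, and the amounts (100, 40, and 5 cost units) are arbitrary illustrative numbers, not figures from the report.

```python
def cost_o1(one_time_charge):
    """O1: a single charge covers registration and maintenance."""
    return one_time_charge


def cost_o2(registration_charge, annual_charge, years):
    """O2: registration charge plus an annual maintenance charge."""
    return registration_charge + annual_charge * years


def break_even_years(one_time_charge, registration_charge, annual_charge):
    """Years of maintenance after which O2 has cost as much as O1."""
    return (one_time_charge - registration_charge) / annual_charge


# Arbitrary illustrative amounts in unspecified cost units.
print(cost_o1(100))                  # 100
print(cost_o2(40, 5, years=3))       # 55
print(cost_o2(40, 5, years=12))      # 100
print(break_even_years(100, 40, 5))  # 12.0
```

Under O2 the annual charges continue after a time-limited project has ended, whereas O1 can be paid in full from the project grant.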