DINI Jahrestagung, Göttingen – 2017-10-05

PANGAEA

Data Publisher for Earth & Environmental Sciences

Michael Diepenbroek

What is PANGAEA?

• Information system for long-term archiving and publication of data from earth & environmental sciences (since 1993)

• Accredited by the World Meteorological Organization (WMO) as "World Radiation Monitoring Center" (WRMC) (since 2007)

• Accredited by the International Council for Science (ICSU) as World Data Center "Publisher for Earth & Environmental Science" (since 2001)

PANGAEA - contents

[Figure: sediment core parameters plotted against age (kyr, axis 0.0 to 200.0; max. 233.55 kyr) for cores PS1389-3, PS1390-3, PS1431-1, PS1640-1 and PS1648-1. Panels per core: IRD (grav/10 cm³), Sand (%), CaCO3 (%), TOC (%), Radio (%/sand), Smect (%/clay).]

[Figure: map, 54°0' to 55°30' latitude and 11° to 15° longitude. Legend: World vector shore line; Grain size class KOLP A, KOLP B, KOLP DIN, KOEHN, KOEHN2; Geochemistry; 20 m. Scale: 1:2695194 at Latitude 0°. Source: Baltic Sea Research Institute, Warnemünde.]

• Integral part of science – more than 160 projects, from European to international, since 1995 (https://www.pangaea.de/projects)

• Highly heterogeneous & dynamic

• Multidisciplinary: Hydrosphere, Human Dimensions, Biosphere, Cryosphere, Lithosphere, Atmosphere

Number of data sets ~360,000

Number of data items ~14 billion

Data volume <3 PB

Increase ~5% per year

[Figure: cumulative growth, axis from 1,000,000,000 to 15,000,000,000]

PANGAEA – interoperability

[Diagram: PANGAEA data management & long-term archiving (RDB, catalogues, index, XSLT, marshaller, protocols) feeding frontends, portals, harvesters and registries]

• Metadata formats: Dublin Core, STD-DOI, ISO19115, DIF, Darwin Core

• Protocols and services: OAI-PMH, OGC CSW, Geoserver (OGC), WS (SOAP/WSDL), DIGIR, gml, kml

• Harvesters and registries: Dublin Core harvester, ISO19115 harvester, DIF harvester, Darwin Core harvester, catalogues, DOI registration, DOI registry

• Frontends / portals: PANGAEA web frontend, DataCite, Google, OCLC, Thomson Reuters, Elsevier, Scopus …, EUR-OCEANS, CARBOOCEAN, OBIS, GBIF, IODP, ICSU WDS, PubMed Central, OpenAIRE, WMO-IS, INSPIRE, GEOSS, GFBio
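
OAI-PMH is one of the harvesting interfaces named on this slide. The following is a minimal sketch of how a harvester might build a `ListRecords` request and read Dublin Core titles from the response. The endpoint `https://example.org/oai`, the function names, and the sample record are assumptions made for illustration; none of them come from the slides. Only the verb, argument names, and XML namespaces follow the OAI-PMH 2.0 and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    if resumption_token:
        # A resumption token is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Return ([(identifier, title), ...], next_token_or_None) from a response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        title = rec.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    # An empty or missing resumptionToken means the list is complete.
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)


# Hand-written sample response (illustrative values, not real PANGAEA data).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:dataset-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example sediment core data</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://example.org/oai"))
    print(parse_records(SAMPLE))
```

A real harvester would fetch each URL over HTTP and keep requesting with the returned token until none is left; the sketch leaves the network call out so it runs offline.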

Cross-referencing, linking

Publications

Researchers

Samples

Organisms

Sequences

Projects

Data Publishing – Cross-referencing

Data publication - prerequisites

OECD principles and guidelines for access to research data (2007)

• Licenses & persistent identification (DOI)

• Quality

– QA/QC -> review procedures

– Harmonization of data -> ontologies

• Efficiency

– (Meta)data & interoperability standards (machine readable)

FITNESS FOR USE!

[Diagram: data sets in various file formats: DOC, PDF, CSV, NetCDF, TXT, XML, XLSX, XLS, GRIB]

Data Publishing – simplified workflow

• Data submission

• Editorial review & processing

• Archiving

• Author proofread

• Publication: registered & citable (DOI)

Fitness for Use - Initiatives

• RDA/WDS Data Publishing Workflows WG

• Certification of data centers/repositories

• FAIR principles

• GEO label facets

• ESIP Information Quality Cluster

• Literature!

Fitness for Use - Assessment & Roles

• Certification authority – Reviewers

• Data center / repository – Data editors / reviewers

• User – Downloads, social tagging

Current approaches

[Figure: example quality badges: FAIR; 2 User Reviews, 1 Archivist Assessment, 24 Downloads; TrustSeal Repository, TrustSeal Software, TrustSeal Data; 5 ★ OPEN DATA]

Assessment of Data Fitness for Use

WDS/RDA Publishing Data IG

WDS/RDA Certification of Digital Repositories IG

Helena Cousijn, Claire Austin, Jon Petters, Michael Diepenbroek

Lessons learnt

• Multidisciplinarity

• Generic & flexible technical infrastructure

• Flexible business model

• Linkage to international developments

• Moving target!

Costs

• Overall annual budget -> ~1 million euros

• Staff -> ~24, >2/3 for curation

• Open access

• Basic operation -> host institutions (AWI, marum ~15%)

• Further development -> third party funds

• Curation costs -> third party funds

– Open science policy -> EU, DFG, BMBF