Preservation of Scientific Data (in Natural Sciences)

Thomas Severiens
Institute for Science Networking at the Carl von Ossietzky University Oldenburg, Germany
[email protected]

Chinese-European Workshop on Digital Preservation, Beijing, July 14 – 16 2004
Network of Expertise in Digital Preservation

Overview

Introduction
Primary Data
Running Implementations and Developments
Requirements and Status
  Volume of Data to be preserved
  Requirements by the Users
Aspects of a Business Model for Preservation
Conclusions

Introduction

ISN – Institute for Science Networking
Survey on status of Preservation of Primary Data in Germany (and its neighbours)

Primary Data 1

Examples:
  Weather observation data
  Space observation data
  Accelerator detector data
  Surveys (-> next presentation by R. van Horik)
  Data in Medicine
  Genetic sequence data
  Data in crystallography
  (Sound and Video)

Primary Data 2

…are mostly binary-encoded streams of pure data.

…are mostly not saved in XML format, but are optimized for processing in the scientific workflow.
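
Why such streams depend on external knowledge of their layout can be shown with a minimal sketch. The record layout (a 32-bit timestamp followed by a 32-bit float temperature) and the function names are hypothetical, chosen only for illustration. They are not taken from any of the surveyed archives.

```python
import struct

# Hypothetical record layout, assumed for illustration only:
# little-endian unsigned 32-bit timestamp + 32-bit float temperature.
RECORD_FORMAT = "<If"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)  # 8 bytes per record


def encode_records(records):
    """Pack (timestamp, temperature) pairs into one compact binary stream."""
    return b"".join(struct.pack(RECORD_FORMAT, t, temp) for t, temp in records)


def decode_records(stream, record_format=RECORD_FORMAT):
    """Unpack a binary stream; only meaningful if the format is the right one."""
    return list(struct.iter_unpack(record_format, stream))


measurements = [(1089763200, 21.5), (1089766800, 22.25)]
stream = encode_records(measurements)

# With the correct layout the data round-trips.
decoded = decode_records(stream)

# With a wrong guess (two unsigned integers) the same bytes decode without
# any error, but the temperatures turn into meaningless integers.
misread = decode_records(stream, "<II")
```

The stream is compact and fast to process, but nothing in the bytes themselves says what they mean. A reader without the layout description can decode them "successfully" and still get wrong values.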

Primary Data 3

…build the basis for all scientific work and publication,
…are very expensive or even impossible to reconstruct:
  Neutrino flow during a supernova in our neighbourhood
  Weather observation data
  Measurement data from high-energy colliders that have already been dismantled
  Medical data giving information on the long-term development of the health of special groups over several centuries

Primary Data 4

…often contain information which is still undiscovered
  Example: Radius of the proton

…are the key to identifying scientific falsifications in articles
  Schön’s articles would never have been published if he had had to publish the primary data of the experiments

Status of Preservation Preparation

Volume: in Germany about 1,000 TByte every year

Format: 70% binary encoded, 20% ASCII encoded, 10% XML or similar

Self-description or Metadata: about 65% contain it within the stream or as an extra file within the archive

Media: DLT (60%), DAT, CD-ROM, …
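
As a rough worked example, the shares above can be turned into yearly volumes. This is a sketch under one assumption the slide does not state: that each percentage refers to a share of the yearly data volume rather than, say, a share of institutions. The variable and function names are illustrative only.

```python
# Figures taken from the slide. Assumption (not stated in the talk): each
# percentage is a share of the yearly data volume.
YEARLY_VOLUME_TBYTE = 1000

FORMAT_PERCENT = {"binary": 70, "ASCII": 20, "XML or similar": 10}
SELF_DESCRIBED_PERCENT = 65


def volume_by_percent(total_tbyte, percents):
    """Split a total volume into per-category volumes using integer percents."""
    return {name: total_tbyte * pct // 100 for name, pct in percents.items()}


format_volumes = volume_by_percent(YEARLY_VOLUME_TBYTE, FORMAT_PERCENT)

# Volume arriving each year without any self-description or metadata.
undescribed_tbyte = YEARLY_VOLUME_TBYTE * (100 - SELF_DESCRIBED_PERCENT) // 100
```

Under that assumption, roughly 700 TByte per year would be binary encoded, and about 350 TByte per year would arrive without any self-description or metadata.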

Status of Preservation Preparation

Access: all institutions allow access for colleagues from science; most do not allow access for commercial reasons.

Institutionalisation: all institutions still run their own archiving system.

Selection strategies: not developed at all.

Business Model: all institutions: “Preservation is of high public interest, so the government (= taxpayers) should pay.”

Status and Implementations

Weather Service (Deutscher Wetterdienst):
  Has to preserve all data by law.
  Runs distributed computer pools.
  Offers access to the raw data, earns money with computed data.

World Data Centers:
  World-wide network of (mostly) global observation institutions (52 institutions in 12 countries).
  Since 1956.
  Share data in their archives to keep it available and alive.
  Implemented standards and auditing system.

Conclusions 1

Survey showed: the process of raising awareness of the requirements of and for long-term preservation of primary data is at a very early stage in most fields (except astronomy, high-energy physics, and global observation).

Primary data are a key use case within every preservation implementation, because here preservation helps to save much more money than it will ever cost, even on a short time scale.

Conclusions 2

Experience, expertise, and standards are available in selected fields, but should be published to spread this knowledge widely and to develop open implementations.

Technical preservation work is done by many institutions in parallel, in many fields without any standards at all. Synergy effects could be used for the benefit of all.

Preservation is a job for experts in this field in co-operation with experts on the data type.

Preservation of primary data must be part of a globally co-operating network.

References

Nestor project: www.langzeitarchivierung.de
Deutscher Wetterdienst (German weather service): www.dwd.de
World Data Centers (overview site): www.ngdc.noaa.gov/wdc/wdcmain.html

Thank you for your attention!
