![Page 1: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/1.jpg)
Metadata Concepts / Use in Climate Research
Stephan Kindermann, Martina Stockhause
German Climate Computing Center (DKRZ)
Hamburg, Germany
![Page 2: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/2.jpg)
Overview
Metadata descriptions: sources, usage
data level, preservation level, model level, domain knowledge level
Metadata standards, IT-principles
![Page 3: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/3.jpg)
A) Metadata descriptions: sources, usage
(I) Data Description Level:
source: model run output
format: GRIB, netCDF3/4 container formats (including basic metadata)
metadata homogenization („Climate and Forecast Convention (CF)“ conformance, CMOR2 compliance, controlled vocabularies)
usage: analysis tools, data access scripts, data search („linked data principle“)
(II) Data Preservation Level:
target: legacy data centers (e.g. WDCC)
format: internal DB, various external formats, e.g. ISO 19139, DIF, ..
usage: long-term data storage and access, citation, e.g. using DOIs
![Page 4: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/4.jpg)
A) Metadata descriptions: sources, usage
(III) Model Description Level:
source: researcher interviews, online questionnaire
format: CIM (Common Information Model; Metafor FP7: Common Metadata for Climate Modelling Digital Repositories) (Con-CIM: UML, App-CIM: XSD + vocabs)
usage: model intercomparison, scientific portals, information space browsing / search
(IV) Semantic Annotation Level:
source: data metadata, model metadata, domain knowledge
metadata format: OWL (RDF)
usage: user navigation in portals, „faceted search“ etc.
deployments: Earth System Grid CMIP5 portal, IS-ENES portal
![Page 5: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/5.jpg)
.. Short Background Info ..
The Fifth Coupled Model Intercomparison Project (CMIP5)
– Sponsored by the WMO WGCM
– Quality Controlled Data to (eventually) appear in the IPCC Data Distribution Centre…
• World-wide data management infrastructure building effort, consistent workflow from producers to consumers...
In Numbers:
Simulations:
~90,000 years
~60 experiments
~20 modelling centres using
~30 major(*) model configurations
~2 million output datasets
~10's of petabytes of output
~2 petabytes of CMIP5 requested output
~1 petabyte of CMIP5 “replicated” output
– which will be replicated at BADC & DKRZ, to arrive in 2010/2011!
~10 TB of land-biochemistry (from the long-term experiments alone).
![Page 6: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/6.jpg)
B) Metadata standards, IT principles
(I) Data Description Level:
GRIB, netCDF data containers
10's of PBytes
Metadata
Data
File naming convention based on CVs, building uniform URIs (DRS, Data Reference Syntax)
Activity/Product/Institute/Model/Exp/frequ/realm/Variable/ensemble
Data servers
MD catalogue servers
wget http://server.org/Activity/Product/../ensemble
Enabling „linked data“
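The DRS idea on this slide (controlled-vocabulary values joined in a fixed order into a uniform URI) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the component order follows the slide, but the facet values (`EXAMPLE-INST`, `EXAMPLE-MODEL`, and the others) and the helper names `drs_path` and `drs_url` are hypothetical and not taken from the presentation or from any CMIP5 tool.

```python
from urllib.parse import quote

# Component order as listed on the slide:
# Activity/Product/Institute/Model/Exp/frequ/realm/Variable/ensemble
DRS_COMPONENTS = (
    "activity", "product", "institute", "model", "experiment",
    "frequency", "realm", "variable", "ensemble",
)


def drs_path(facets):
    """Join controlled-vocabulary values into a DRS-style path."""
    missing = [name for name in DRS_COMPONENTS if name not in facets]
    if missing:
        raise ValueError(f"missing DRS components: {missing}")
    # Percent-encode each value so that it stays a single path segment.
    return "/".join(quote(str(facets[name]), safe="") for name in DRS_COMPONENTS)


def drs_url(base_url, facets):
    """Prefix the DRS path with a data server base URL."""
    return base_url.rstrip("/") + "/" + drs_path(facets)


# Hypothetical facet values, for illustration only.
example = {
    "activity": "cmip5", "product": "output", "institute": "EXAMPLE-INST",
    "model": "EXAMPLE-MODEL", "experiment": "historical", "frequency": "mon",
    "realm": "atmos", "variable": "tas", "ensemble": "r1i1p1",
}

print(drs_url("http://server.org", example))
```

Because every dataset is addressed by the same ordered set of vocabulary terms, a client can construct a URL for `wget` from a search result without consulting a separate lookup service, which is what makes the data "linkable".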
![Page 7: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/7.jpg)
B) Metadata standards, IT principles
(II) Data Preservation Level:
CERA2 DB schema
OWL conceptual model
Tape Archive
search API
QC, DOI assignment, ..
WDCC Metadata Concept
CERA GUI, IS-ENES Portal, …
• Scalability
• Sustainability
• Flexibility
• User-friendly GUIs
Common CV
OAI-PMH
ISO 19139
…
![Page 8: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/8.jpg)
B) Metadata standards, IT principles
(III) Model Description Level:
Metafor FP7 project: Common Information Model (CIM)
Formal metadata model of the climate modelling process
It includes descriptions of the experiments being undertaken, the
simulations being run in support of these experiments, the software
models and tools being used to implement the simulations and the
data generated by the software.
CMIP5 use case: CV collection, CMIP5 questionnaire
![Page 9: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/9.jpg)
CONCIM (UML)
APPCIM (XSD)
CIM Instances (interlinked XML files)
ISO, Geography Markup Language (GML) series
Automatic translation
CMIP5 portal(s)
IS-ENES portal
Metafor catalogue
Metafor CIM overview
![Page 10: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/10.jpg)
Metadata collection
![Page 11: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/11.jpg)
![Page 12: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/12.jpg)
Automatic XML-to-RDF translation
CMIP5 gateway(s)
IS-ENES (1) portal
(1) Infrastructure for the European Network for Earth System Modelling
ESG OWL instances
![Page 13: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/13.jpg)
(CON)CIM Overview
Quality
ISO
Shared
Data
Activity: simulations in support of experiments
Software (hierarchical model components, coupled together)
Grids
![Page 14: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/14.jpg)
B) Metadata standards, IT principles
(IV) Semantic Annotation Level
CIM XML
RDF
Data object XML
Community content
Content Management System
RDF
Triple Store
Portal(s)
ESG Gateways
IS-ENES Portal
Evolving OWL model
Triple Store
OWL ontologies: http://ontologies.ucar.edu/owl
Rel. DB
![Page 15: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/15.jpg)
CMIP5 Quality Control
Files Data Metadata CIM Metadata
Data in prescribed DRS syntax
Data Quality Checks L2
MD Quality Checks L2
THREDDS Data Server
MD on data
Metafor / CIM Questionnaire
MD on model + simulation
QC DB
Quality MD
Metadata Repository
Data MD Information MD
![Page 16: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/16.jpg)
CMIP5 STD-DOI Publication
TIB: DOI Registration Agency
Data Node Metadata
THREDDS Data Server
MD on data
QC DB
Quality MD
Data MD Information MD
Filesystem
Data
Long-term Archive
Data Quality Checks L3: double check, cross checks
STD-DOI Catalogue
STD-DOI MD Information MD
WDCC: DOI Publication Agent
DOI Target Page
access to data and metadata
Metafor / CIM: MD on model + simulation + data + quality
![Page 17: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/17.jpg)
![Page 18: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/18.jpg)
IS-ENES Info Portal
![Page 19: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/19.jpg)
![Page 20: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/20.jpg)
2010-07-07 16:49:13 INFO triplestorefill.utility Adding item <ComponentModel at /test7/echam> with ID echam at http://localhost:8080/test7/echam
2010-07-07 16:49:13 INFO triplestorefill.sesameconnector Storing RDF...(1118 byte)
2010-07-07 16:49:13 INFO triplestorefill.sesameconnector RDF data:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix isenes: <http://www.enes.org/isenes#> .
isenes:echam rdf:type isenes:ComponentModel .
isenes:echam foaf:page <http://plone.dkrz.de/test7/echam> .
<http://plone.dkrz.de/test7/echam> foaf:topic isenes:echam .
isenes:echam dc:title "ECHAM" .
isenes:echam rdfs:label "ECHAM" .
isenes:echam rdfs:comment "Global circulation model" .
isenes:dkrz isenes:isResponsibleFor isenes:echam .
isenes:echam isenes:hasResponsible isenes:dkrz .
isenes:joachim-biercamp rdfs:label "Joachim Biercamp" .
isenes:joachim-biercamp rdf:type foaf:Person .
isenes:dkrz rdfs:label "DKRZ" .
isenes:dkrz rdf:type foaf:Organization .
isenes:joachim-biercamp isenes:isMemberOf isenes:dkrz .
isenes:dkrz isenes:hasMember isenes:joachim-biercamp .
isenes:dkrz dc:title "DKRZ" .
isenes:joachim-biercamp foaf:mbox "[email protected]"
„save“
Triple Store
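To make the role of the triple store more concrete, the following sketch holds a subset of the triples from the log excerpt above in a plain Python list and answers two kinds of questions: pattern matching and "what is related to this resource". It is a toy illustration only. The portal itself uses a Sesame store, whose API is not shown here, and the helper names `match` and `related` are hypothetical.

```python
# A subset of the triples from the log excerpt, as (subject, predicate, object).
TRIPLES = [
    ("isenes:echam", "rdf:type", "isenes:ComponentModel"),
    ("isenes:echam", "dc:title", "ECHAM"),
    ("isenes:echam", "rdfs:comment", "Global circulation model"),
    ("isenes:dkrz", "isenes:isResponsibleFor", "isenes:echam"),
    ("isenes:echam", "isenes:hasResponsible", "isenes:dkrz"),
    ("isenes:dkrz", "rdf:type", "foaf:Organization"),
    ("isenes:joachim-biercamp", "rdf:type", "foaf:Person"),
    ("isenes:joachim-biercamp", "isenes:isMemberOf", "isenes:dkrz"),
    ("isenes:dkrz", "isenes:hasMember", "isenes:joachim-biercamp"),
]


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def related(triples, resource):
    """Resources linked to `resource` in either direction.

    Only objects that also occur as subjects are counted, which filters
    out literals (titles, comments) and bare class names.
    """
    subjects = {s for s, _, _ in triples}
    incoming = {s for s, _, o in triples if o == resource}
    outgoing = {o for s, _, o in triples if s == resource and o in subjects}
    return sorted((incoming | outgoing) - {resource})


print(match(TRIPLES, p="rdf:type", o="foaf:Person"))
print(related(TRIPLES, "isenes:dkrz"))
```

A lookup such as `related(TRIPLES, "isenes:dkrz")` is the kind of query that could back a "related info" portlet: starting from one content item, it collects every other item that is connected to it by any property.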
![Page 21: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/21.jpg)
(B) From a user's perspective
Image: Plone page with „related info“ portlet
![Page 22: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/22.jpg)
(B) From a user's perspective
Image: Plone page after clicking the „related“ link: faceted search
![Page 23: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/23.jpg)
Summary
• The international CMIP5 / IPCC effort is the key driver for the collection / standardization of CVs, metadata, and conceptual models (ontologies)
• Metadata is mainly used for model intercomparison, uniform data search / access + data processing
Prepare for Climate Impact Community use cases !!
![Page 24: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/24.jpg)
..workshop reminder..
..therefore we would like the presenters to focus on a few points allowing all of us to draw conclusions at the end:
- Usage and quality of descriptive keyword type of metadata used in your domain to manage data.
- Types of usages of this metadata (management, retrieval, research statistics, machine processing, etc).
- The standards used for your metadata descriptions (structure, elements, vocabularies).
- Adherence to common IT principles (explicit syntax, registered semantics, use of PIDs, etc).
- Compliance with the recommendations to be found in the report of the e-IRG task force on Data Management http://www.e-irg.eu/publications/e-irg-task-force-reports.html
![Page 25: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/25.jpg)
Methodology to create CMIP5 CIM instances
![Page 26: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/26.jpg)
Motivation
Producers: providers of models, tools, model results, HPC ecosystem, Grid .., community
Consumers: ENES community, impact community
Virtual Earth System Modeling Resource Centre
Portal
E-infrastructure components
Governance
Agreements, Commitments, Sociology, ..
Ticketing
Collaboration
Metadata (CIM, ..)
Protocols
APIs
AAI
CMIP5/AR5/+data services
![Page 27: Metadata Concepts / Use in Climate Research](https://reader036.vdocuments.mx/reader036/viewer/2022081514/56814c08550346895db9095c/html5/thumbnails/27.jpg)
IS-ENES vERC Portal

| Requirement | E-Infra component | Technology used |
| --- | --- | --- |
| (A) Community info presentation (models, tools, descriptions, ..) | Content Management System (CMS, Collab. Tool) | Plone + IS-ENES „content-types“ |
| (B) Community development support | Project Management / Ticketing Tool | Redmine |
| (C) Data portal to AR5 archives | Web Framework | Zope/Plone plugin(s) |
| (D) CIM metadata | (external) Metafor service(s), (external) ESG gateway | Web service (proxies) |
| (E) External content / metadata collection | Info (XML) harvester | Python info collector using Atom, OAI-PMH, .. protocols |
| (F) Additional value provisioning | „Cross-selling“, semantic interlinking | RDF triple store (Sesame) |
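As a rough illustration of the harvesting step in row (E), the sketch below builds an OAI-PMH `ListRecords` request URL with the Python standard library. The verb and the `metadataPrefix`, `set`, and `resumptionToken` arguments come from the OAI-PMH protocol. The base URL `http://example.org/oai`, the set name `cmip5`, and the function name `list_records_url` are assumptions made for this example and do not describe the actual IS-ENES collector.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    given, no other argument except the verb may be sent with it.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec is not None:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, for illustration only.
print(list_records_url("http://example.org/oai"))
print(list_records_url("http://example.org/oai", set_spec="cmip5"))
```

A harvester would fetch the first URL, parse the returned XML records, and keep requesting with the returned resumption token until the repository signals that the list is complete.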