Page 1: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Enrico Boldrini (a), Daniela Luzi (b), Stefano Nativi (a), Fabrizio Pecoraro (b)

(a) Institute of Atmospheric Pollution Research, National Research Council (CNR-IIA), Sesto Fiorentino, Italy

(b) Institute for Research on Population and Social Policies, National Research Council (CNR-IRPPS), Rome, Italy

CRIS2014 - Rome, 13-15 May 2014

Page 2: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Index

• Aims
• Background
• Two-way crosswalk
  • From ISO 19115 INSPIRE profile to CERIF
  • From CERIF to ISO 19115
• Proposal of CERIF extension
• Proposal of a CERIF profile in ISO 19115
• Implementation in a brokering framework
• Discussion

Page 3: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Aim

From ISO to CERIF: providing a CERIF guideline for the description of datasets according to the INSPIRE profile of ISO 19115

From CERIF to ISO: proposing an ISO profile for contextual research information on the basis of CERIF concepts

Extension of the brokering approach used in environmental e-infrastructures with contextual research information based on CERIF

Two-way crosswalk: proposal of different solutions to integrate research context information with environmental datasets

Page 4: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

ISO 19115 Geographical Information metadata

• Part of the geographical information suite of standards (19100 series)
• Description of geographic information and services: identification, extent, quality, spatial and temporal schema, spatial reference and distribution of digital geographic data
• More than 400 metadata elements
• Provision of rules for valid metadata extensions

INSPIRE Metadata Implementing Rules

• EU Directive to implement ISO 19115 to create a European Union spatial data infrastructure
• Core set of mandatory and optional metadata and related constraints

[Diagram labels: INSPIRE profile, ISO 19115]

Page 5: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

CERIF

Comprehensive conceptual model of research information and related processes, suitable for different purposes: management, scientific exchange, evaluation …

E-R based, flexible model built on:
• Base entities
• Semantic layer
• Multiple relationships

Constantly maintained by the euroCRIS community

CERIF version 1.6

Page 6: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Challenges

Different domains, scopes, structures and semantics

[Figure: CERIF entities shown on the slide: Citation, CV, Prize, Qualification, ExpertiseAndSkills, Equipment, Facility, Funding, Service, ElectronicAddress, PostalAddress, Country, Currency, Language, Event, Metrics, Indicator, Measurement]

Page 7: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Mapping from INSPIRE ISO 19115 profile to CERIF

• Straightforward: INSPIRE elements have semantically correspondent elements in the CERIF data model

• Inferential: both INSPIRE and CERIF can refer to a data dictionary/vocabulary that contains semantically shared terms

• Convention: the CERIF metadata elements can be adapted to express some mandatory INSPIRE elements by convention between the parties exposing their metadata

Page 8: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Straightforward mapping

• Semantically correspondent notation with the CERIF entity cfResProd and some related elements

• Automatic discovery and interpretation of datasets exposed in RISs using the CERIF model

| INSPIRE element | INSPIRE section | ISO 19115 path | ISO card. | CERIF path | CERIF card. |
|---|---|---|---|---|---|
| Dataset title | B1.1 | MD_Metadata > MD_DataIdentification.citation > CI_Citation.title | [1..1] | cfResProd > cfResProdName | [1..*] |
| Geographic bounding box | B4.1 | MD_Metadata > MD_DataIdentification.extent > EX_Extent > EX_GeographicExtent > EX_GeographicBoundingBox | [1..*] | cfResProd > cfResProd_GeoBBox > cfGeoBBox | [0..*] |
| Abstract describing the dataset | B1.2 | MD_Metadata > MD_DataIdentification.abstract | [1..1] | cfResProd > cfResProdDescr | [1..*] |
| Dataset keyword | B3 | MD_Metadata > MD_DataIdentification.descriptiveKeywords > MD_Keywords | [1..*] | cfResProd > cfResProdKeyw | [1..*] |
| Unique resource identifier | B1.5 | MD_Metadata > MD_DataIdentification.citation > CI_Citation.identifier | [1..*] | cfResProd > cfResProdID | [1..1] |
| Resource type | B1.3 | MD_Metadata.hierarchyLevel | [1..*] | [fixed by the scope to dataset] | |
| Metadata character set | - | MD_Metadata.characterSet | [1..1] | [fixed to UTF-8] | |
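
The straightforward rows can be read as a plain lookup from an ISO 19115 path to a CERIF path. The following is a minimal Python sketch of that idea, not the GI-cat implementation; the names `STRAIGHTFORWARD` and `cerif_path_for` are hypothetical, and only the paths and cardinalities come from the slide.

```python
# Hypothetical lookup table built from the straightforward mapping rows.
# Keys are ISO 19115 paths; values are (CERIF path, CERIF cardinality).
ISO_ID = "MD_Metadata > MD_DataIdentification"

STRAIGHTFORWARD = {
    f"{ISO_ID}.citation > CI_Citation.title":
        ("cfResProd > cfResProdName", "1..*"),
    f"{ISO_ID}.extent > EX_Extent > EX_GeographicExtent > EX_GeographicBoundingBox":
        ("cfResProd > cfResProd_GeoBBox > cfGeoBBox", "0..*"),
    f"{ISO_ID}.abstract":
        ("cfResProd > cfResProdDescr", "1..*"),
    f"{ISO_ID}.descriptiveKeywords > MD_Keywords":
        ("cfResProd > cfResProdKeyw", "1..*"),
    f"{ISO_ID}.citation > CI_Citation.identifier":
        ("cfResProd > cfResProdID", "1..1"),
}


def cerif_path_for(iso_path):
    """Return the CERIF path mapped to an ISO 19115 path, or None if unmapped."""
    entry = STRAIGHTFORWARD.get(iso_path)
    return entry[0] if entry else None


print(cerif_path_for("MD_Metadata > MD_DataIdentification.abstract"))
```

Resource type and metadata character set are left out of the table because the slide fixes them to constant values rather than mapping them to a CERIF path.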

Page 9: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Inferential mapping

Information can be inferred using:
• CERIF semantic layer (cfClassId …) and link entities
• ISO CodeList dictionary

Important to express roles and topics unambiguously

| INSPIRE mandatory element | INSPIRE section | ISO 19115 path | ISO card. | CERIF path | CERIF card. | CERIF role specification |
|---|---|---|---|---|---|---|
| Dataset responsible party | B9 | MD_Metadata > MD_DataIdentification.pointOfContact > CI_ResponsibleParty | [1..*] | cfResProd > cfOrgUnit_ResProd > cfOrgUnit > cfOrgUnitName AND cfOrgUnit_EAddr [AND cfResProd > cfPers_ResProd > cfPers > cfPersName] | [1..*] | cfClassId ∈ CI_RoleCode (e.g. "custodian"); cfClassSchemeId="CI_RoleCode" |
| Metadata point of contact | B10.1 | MD_Metadata.contact > CI_ResponsibleParty | [1..*] | cfResProd > cfOrgUnit_ResProd > cfOrgUnit > cfOrgUnitName AND cfOrgUnit_EAddr [AND cfResProd > cfPers_ResProd > cfPers > cfPersName] | [1..*] | cfClassId ∈ CI_RoleCode (e.g. "pointOfContact"); cfClassSchemeId="CI_RoleCode" |
| Dataset topic category | B2.1 | MD_Metadata > MD_DataIdentification.topicCategory | [1..*] | cfResProd_Class | [1..*] | cfClassId ∈ MD_TopicCategoryCode (e.g. biota); cfClassSchemeId="MD_TopicCategoryCode" |
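
The inferential mapping depends on the classification attached to a CERIF link entity: a link between a result product and an organisation counts as an ISO responsible party only when its class belongs to the CI_RoleCode scheme. A minimal Python sketch of that filter follows. The `OrgUnitLink` class and `responsible_parties` function are hypothetical, and the sketch assumes, following the slide's notation, that cfClassId holds the role term directly (real CERIF records may use opaque identifiers that need a lookup first).

```python
from dataclasses import dataclass

# Role terms of the ISO 19115:2003 CI_RoleCode code list.
CI_ROLE_CODES = {
    "resourceProvider", "custodian", "owner", "user", "distributor",
    "originator", "pointOfContact", "principalInvestigator",
    "processor", "publisher", "author",
}


@dataclass
class OrgUnitLink:
    """Hypothetical, simplified view of a cfOrgUnit_ResProd link."""
    org_unit_name: str      # cfOrgUnitName of the linked organisation
    class_id: str           # cfClassId (assumed to hold the term itself)
    class_scheme_id: str    # cfClassSchemeId


def responsible_parties(links):
    """Return (organisation, role) pairs for links classified with CI_RoleCode."""
    return [
        (link.org_unit_name, link.class_id)
        for link in links
        if link.class_scheme_id == "CI_RoleCode" and link.class_id in CI_ROLE_CODES
    ]


# Illustrative data only; these records are not taken from the presentation.
example_links = [
    OrgUnitLink("CNR-IIA", "custodian", "CI_RoleCode"),
    OrgUnitLink("CNR-IRPPS", "pointOfContact", "CI_RoleCode"),
    OrgUnitLink("Some funder", "funder", "Local funding scheme"),
]
print(responsible_parties(example_links))
```

The third link is skipped because its scheme is not CI_RoleCode, which is why the slide stresses that roles must be expressed unambiguously.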

Page 10: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Mapping by convention

Mapping of information on:
• dataset quality and lineage
• temporal reference
• language

| INSPIRE mandatory element | INSPIRE section | ISO 19115 path | ISO card. | CERIF path | CERIF card. | CERIF role specification |
|---|---|---|---|---|---|---|
| Conformity | B7 | MD_Metadata > DQ_DataQuality.report | [1..*] | cfResProd > cfResProd_Meas > cfMeas > cfMeasName AND cfValJudgeText | [1..*] | cfMeasName='conformity' |
| Lineage | B6.1 | MD_Metadata > DQ_DataQuality.lineage > LI_Lineage | [1..1] | union(cfResProd > cfResProd_Meas > cfMeas > cfMeasDescr) | [1..*] | cfMeasName='lineage' |
| Dataset reference date | B5 | MD_Metadata > MD_DataIdentification.citation > CI_Citation.date | [1..*] | cfResProd > cfOrgUnit_ResProd > cfOrgUnit > cfStartDate/cfEndDate | | cfClassId='author institution' |
| Metadata date stamp | B10.2 | MD_Metadata.dateStamp | [1..1] | cfResProd > cfOrgUnit_ResProd > cfOrgUnit > cfStartDate/cfEndDate | [1..*] | cfClassId='publisher institution' |
| Metadata language | B10.3 | MD_Metadata.language | [1..1] | cfResProd > cfResProdName > cfLangCode | [1..*] | |
| Temporal extent | B5.1 | MD_Metadata > MD_DataIdentification.extent > EX_Extent > EX_TemporalExtent | [1..*] | From minimum to maximum (cfResProd > cfResProd_Meas > cfMeas > cfDateTime) | [1..*] | |
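
Two of the convention rules are small computations rather than direct lookups: the temporal extent runs from the minimum to the maximum cfDateTime of the linked measurements, and the lineage is the union of the cfMeasDescr values of measurements named 'lineage'. A minimal Python sketch of both follows. Representing measurements as dictionaries, and the function names, are assumptions made for illustration; joining the lineage texts with a space is one possible reading of "union".

```python
from datetime import datetime


def temporal_extent(measurements):
    """Return (begin, end) as the earliest and latest cfDateTime, or None."""
    stamps = [m["cfDateTime"] for m in measurements if m.get("cfDateTime")]
    return (min(stamps), max(stamps)) if stamps else None


def lineage(measurements):
    """Join the cfMeasDescr of every measurement whose cfMeasName is 'lineage'."""
    parts = [
        m["cfMeasDescr"]
        for m in measurements
        if m.get("cfMeasName") == "lineage" and m.get("cfMeasDescr")
    ]
    return " ".join(parts)


# Illustrative records only; these values are not taken from the presentation.
example_measurements = [
    {"cfMeasName": "lineage", "cfMeasDescr": "Digitised from paper maps.",
     "cfDateTime": datetime(2010, 1, 1)},
    {"cfMeasName": "conformity", "cfValJudgeText": "conformant",
     "cfDateTime": datetime(2011, 3, 15)},
    {"cfMeasName": "lineage", "cfMeasDescr": "Reprojected to WGS84.",
     "cfDateTime": datetime(2012, 6, 30)},
]

begin, end = temporal_extent(example_measurements)
print(begin.date(), end.date())
print(lineage(example_measurements))
```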

Page 11: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

A proposal of CERIF extension

New entities related to research products expressing:
• conditions of access and use
• limitations on public access
• dataset language
• dataset character codes
• plus optional ISO information related to the metadata used

Page 12: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Mapping from CERIF to ISO 19115 profile

Proposal of extensions according to the ISO methodology, based on CERIF:
• project entity
• publications linked to dataset

Plus an expansion of ISO concepts providing more information on Organisations and Persons

Page 13: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

GI-cat discovery broker

GI-cat enables scientific data search across different, heterogeneous data sources. Results are profiled according to the desired model.

• GI-cat broker technology powers different projects and initiatives:
  • Italian Antarctic Data Center (IADC)
  • Italian Special project NextData
  • CNR GIIDA
  • ISPRA catalog of catalogs
  • AfroMaison
  • Global Earth Observation System of Systems (GEOSS)
  • …

Page 14: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Implementation results - GI-cat extensions for CERIF

• Brokers CERIF datasets published according to the CERIF XML Schema

• Exposes the brokered resources, returning documents that conform to the CERIF XML Schema

[Diagram label: CERIF Docs]

Page 15: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Test case #1: Publishing CERIF products for INSPIRE

Aim: CERIF result products are made available according to INSPIRE

CERIF documents stored in an XML repository are brokered by GI-cat and republished according to ISO 19115 through the CSW/ISO discovery interface required by INSPIRE.

[Diagram labels: CERIF Docs, ISO profiler, CSW/ISO]
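
The republishing step in test case #1 amounts to rewriting CERIF result product fields as ISO 19115 elements. The following minimal Python sketch illustrates that direction for the title and abstract only. It is not GI-cat code: the input is a simplified dictionary rather than a real CERIF XML document, the function names are hypothetical, and the output is a partial fragment using the ISO 19139 `gmd`/`gco` element names, not a schema-valid record.

```python
import xml.etree.ElementTree as ET

# Namespaces of the ISO 19139 XML encoding of ISO 19115.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def _add_text(parent, tag, value):
    """Append <gmd:tag><gco:CharacterString>value</gco:CharacterString></gmd:tag>."""
    node = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(node, f"{{{GCO}}}CharacterString").text = value
    return node


def cerif_to_iso(product):
    """Map a simplified CERIF result product dict to a partial gmd:MD_Metadata."""
    root = ET.Element(f"{{{GMD}}}MD_Metadata")
    ident = ET.SubElement(root, f"{{{GMD}}}identificationInfo")
    data = ET.SubElement(ident, f"{{{GMD}}}MD_DataIdentification")
    citation_wrapper = ET.SubElement(data, f"{{{GMD}}}citation")
    citation = ET.SubElement(citation_wrapper, f"{{{GMD}}}CI_Citation")
    _add_text(citation, "title", product["cfResProdName"])   # Dataset title (B1.1)
    _add_text(data, "abstract", product["cfResProdDescr"])   # Abstract (B1.2)
    return root


# Illustrative input only.
example_product = {
    "cfResProdName": "Example dataset",
    "cfResProdDescr": "Example abstract",
}
iso_xml = ET.tostring(cerif_to_iso(example_product), encoding="unicode")
print(iso_xml)
```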

Page 16: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Test case #2: Porting INSPIRE information to CERIF

Aim: INSPIRE datasets are discovered and returned according to the CERIF XML Schema

INSPIRE datasets stored in a CSW/ISO catalog can be discovered and converted to CERIF XML documents.

The CERIF profiler enables discovery through an OpenSearch interface.

[Diagram labels: INSPIRE Catalog, CSW/ISO, CSW/ISO accessor]
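
Test case #2 goes the other way: fields are read from an ISO 19115 record and handed back under CERIF names. A minimal Python sketch of that extraction follows, covering only the title and abstract. The `SAMPLE_ISO` record and the `iso_to_cerif` function are hypothetical illustrations, the result is a plain dictionary keyed by CERIF element names rather than a CERIF XML document, and none of this is GI-cat code.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes of the ISO 19139 XML encoding of ISO 19115.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Illustrative, partial ISO record; not taken from a real catalog.
SAMPLE_ISO = """<gmd:MD_Metadata
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:title><gco:CharacterString>Example dataset</gco:CharacterString></gmd:title>
        </gmd:CI_Citation>
      </gmd:citation>
      <gmd:abstract><gco:CharacterString>Example abstract</gco:CharacterString></gmd:abstract>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>"""


def iso_to_cerif(xml_text):
    """Extract title and abstract from an ISO record under CERIF element names."""
    root = ET.fromstring(xml_text)
    base = "gmd:identificationInfo/gmd:MD_DataIdentification"
    title = root.findtext(
        f"{base}/gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString",
        namespaces=NS,
    )
    abstract = root.findtext(f"{base}/gmd:abstract/gco:CharacterString", namespaces=NS)
    return {"cfResProdName": title, "cfResProdDescr": abstract}


print(iso_to_cerif(SAMPLE_ISO))
```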

Page 17: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Testing results

Page 18: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Summarising some results … 1)

Data elements mapped:
• 16/20 INSPIRE mandatory elements
  • 7 straightforward
  • 3 inferential
  • 6 by convention
• 6/8 optional elements

Discovery of primary data elements based on CERIF Result Product

The CERIF semantic layer facilitates a flexible application of the model in heterogeneous environments, BUT it needs specific constraints and rules to establish consistent semantic integration

Page 19: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Summarising some results … 2)

• Proposal to introduce a CERIF profile that extends ISO concepts with contextual research information:
  • Projects
  • Datasets associated with publications

The implementation and successful testing of GI-cat allow, without additional implementation effort:
• Integration of INSPIRE datasets in RISs
• Integration of RISs with environmental dataset systems

Future work: service discovery, extending the mapping to ISO 19119

Page 20: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Discussion

Proposal of different solutions to be submitted to the euroCRIS community

Some further suggestions:

• Introduction of a specific entity to univocally identify datasets as research products

• Establishment of a set of rules/procedures to create valid CERIF metadata extensions

Page 21: Integrating CERIF entities in a multidisciplinary e-infrastructure for environmental research data

Thank you!