EUDAT-B2FIND: A FAIR-friendly and Interdisciplinary Data Catalogue
Heinrich Widmann, DKRZ
BlueBRIDGE Workshop 2017
03.04.2017
Outline
1 EUDAT and the B2 services
2 Guidelines and Concepts
3 FAIR Approach of EUDAT-B2FIND
4 Outlook and Summary
EUDAT and the B2 services
EUDAT - Motivation and Objective
The project European Data Infrastructure (EUDAT)
  is funded by the EU Horizon 2020 programme,
  started in 2011,
  is now in its second phase, EUDAT2020, which will end in 2018.
  From 2018 onwards: agreement of the EUDAT-EGI-Indigo consortium on the EOSC-Hub proposal
Motivation: Manage the rising tide of research data
Challenge: Help communities to handle Big Data management in a wide cross-disciplinary scope
Objective: Build up a Collaborative Data Infrastructure (CDI)
  based on common and generic data services
  driven by requirements of the research communities
EUDAT B2 Service Suite
For details see http://www.eudat.eu/services
EUDAT Collaborative Data Infrastructure (CDI)
For details see http://www.eudat.eu/eudat-cdi
Guidelines and Concepts
The FAIR principles → as implemented by EUDAT-B2FIND
Findability → Discovery portal with powerful search features
Accessibility → Persistent identifiers for unique resolvability of data objects
Interoperability → Interdisciplinary catalogue based on common standards
Reusability → Interoperable format used for data access by EUDAT's storage B2 services
FAIR Approach of EUDAT-B2FIND
Findability → Discovery Portal
Accessibility → Persistent Identifiers
Interoperability → Mapping onto Common Schema
MD Ingestion and Common Schema
Disciplines, Communities and MD Catalogue
Faceted Search
B2FIND provides search for
  Free text
  Geospatial coverage
  Temporal coverage
  Publication year
  Textual facets
    Tags
    Creator
    Discipline etc.
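The search options listed above can be illustrated with a small in-memory sketch. The sample records, the field names and the functions `facet_search` and `facet_counts` are assumptions made for illustration only; this is not how the B2FIND portal is implemented, and free text is matched against the title alone to keep the example short.

```python
from collections import Counter

# Hypothetical sample records; field names are assumptions for illustration.
RECORDS = [
    {"Title": "Ocean temperature profiles", "Creator": "A. Smith",
     "Discipline": "Oceanography", "PublicationYear": 2015,
     "Tags": ["ocean", "temperature"]},
    {"Title": "Dialect recordings", "Creator": "B. Jones",
     "Discipline": "Linguistics", "PublicationYear": 2012,
     "Tags": ["speech"]},
    {"Title": "Sea ice extent", "Creator": "A. Smith",
     "Discipline": "Oceanography", "PublicationYear": 2016,
     "Tags": ["ocean", "ice"]},
]


def _values(record, field):
    """Return the values of a field as a list (list fields stay as they are)."""
    value = record.get(field)
    return value if isinstance(value, list) else [value]


def facet_search(records, text=None, years=None, **facets):
    """Filter by free text in the title, a (first, last) year range and facets."""
    hits = []
    for rec in records:
        if text and text.lower() not in rec["Title"].lower():
            continue
        if years and not (years[0] <= rec["PublicationYear"] <= years[1]):
            continue
        # Every requested facet value must occur in the corresponding field.
        if all(wanted in _values(rec, field) for field, wanted in facets.items()):
            hits.append(rec)
    return hits


def facet_counts(records, field):
    """Count how many records carry each value of a facet field."""
    counts = Counter()
    for rec in records:
        counts.update(_values(rec, field))
    return counts


hits = facet_search(RECORDS, text="ice", Creator="A. Smith")
print([h["Title"] for h in hits])
print(facet_counts(RECORDS, "Discipline"))
```

The counts per facet value are what a portal would typically display next to each facet, so that a user can see how many datasets remain before narrowing the search.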
Types of Identifiers
Resolvability of Data Objects
Resolvability of Data Collection
Distribution of Data Access Identifiers
Chart values: URL 45, DOI 28, PID 27, DOI & PID 1
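A distribution like the one in this chart could be derived by classifying the identifiers of each record. The following is a minimal sketch under stated assumptions: the string patterns used to recognise DOIs and Handle PIDs are heuristics chosen for illustration, the sample identifiers are made up, and the functions `identifier_kind` and `access_category` are hypothetical names, not part of B2FIND.

```python
from collections import Counter


def identifier_kind(identifier):
    """Classify an identifier string as 'DOI', 'PID' or 'URL' (heuristic)."""
    ident = identifier.strip().lower()
    # Assumption: DOIs look like 'doi:10.x/...', '10.x/...' or use the doi.org resolver.
    if ident.startswith(("doi:", "10.")) or "doi.org/" in ident:
        return "DOI"
    # Assumption: Handle PIDs look like 'hdl:...' or use the hdl.handle.net resolver.
    if ident.startswith("hdl:") or "hdl.handle.net/" in ident:
        return "PID"
    return "URL"


def access_category(identifiers):
    """Map the identifiers of one record onto one chart category."""
    kinds = {identifier_kind(i) for i in identifiers}
    if {"DOI", "PID"} <= kinds:
        return "DOI & PID"
    for kind in ("DOI", "PID"):
        if kind in kinds:
            return kind
    # Records are assumed to carry at least one identifier, so the rest is 'URL'.
    return "URL"


# Made-up identifiers, one list per record.
records = [
    ["https://doi.org/10.1234/abc", "http://example.org/data/1"],
    ["http://hdl.handle.net/12345/xyz"],
    ["http://example.org/data/2"],
    ["doi:10.1234/def", "hdl:12345/uvw"],
]
print(Counter(access_category(r) for r in records))
```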
Levels of Interoperability -1-
Levels of Interoperability -2-
Levels of Interoperability -3-
Levels of Interoperability -4-
B2FIND Ingestion Workflow
Preconditions to join B2FIND
  MD provider service
  Specification of MD (format, schema)
  Only two mandatory fields (title and one identifier)
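The last precondition, a title plus one identifier, can be expressed as a simple check. This is a sketch only: the record is assumed to be a plain dictionary, the identifier field names `Source`, `PID` and `DOI` are taken from the B2FIND schema extract, and the function `missing_mandatory` is a hypothetical helper.

```python
# Fields that can serve as the one mandatory identifier (from the schema extract).
IDENTIFIER_FIELDS = ("Source", "PID", "DOI")


def missing_mandatory(record):
    """Return a list of problems that would block ingestion of a record."""
    problems = []
    # Empty or whitespace-only values count as missing.
    if not str(record.get("Title") or "").strip():
        problems.append("Title is missing")
    if not any(str(record.get(f) or "").strip() for f in IDENTIFIER_FIELDS):
        problems.append("no identifier (Source, PID or DOI)")
    return problems


print(missing_mandatory({"Title": "Sea ice extent", "DOI": "10.1234/abc"}))
print(missing_mandatory({"Description": "No title and no identifier"}))
```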
B2FIND MD Schema (extract)
MD Type | Field name | Semantic definition | Allowed values | Obligation | Occurrence
--- | --- | --- | --- | --- | ---
General Info | Title | A name or title by which a resource is known | Free text (Unicode) | Mandatory | 1
 | Description | Additional information about content of resource | Free text (Unicode) | Recommended | 0-1
Data Access | Source | URL that uniquely identifies a resource | Should be resolvable URL | Mandatory [1] | 0-1
 | PID | Persistent IDentifier (Handle in a Handle server) | + persistent and resolvable via handle server | | 0-1
 | DOI | Digital Object Identifier (registered at DataCite) | + citable and resolvable via DOI agencies | | 0-1
Provenance Data | Creator | Main researchers involved in data production | List of persons | Recommended | 0-1
 | Discipline | Field of research | Controlled vocabulary, see b2find_disciplines | Recommended | 0-n
 | PublicationYear | The year data are published | YYYY | Optional | 0-1
Coverage Data | TemporalCoverage | The temporal limits | Interval of UTC date-times | Optional | 0-1
 | SpatialCoverage | Spatial extent | Spatial coordinate box or point | Optional | 0-1
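The occurrence column of the schema extract lends itself to a table-driven check. The sketch below is one possible reading, not the B2FIND implementation: each record is assumed to map a field name to a list of occurrences (so a list of persons in `Creator` counts as one occurrence), and the names `OCCURRENCE` and `validate` are hypothetical.

```python
import re

# Occurrence limits per field as (minimum, maximum); None means unbounded (0-n).
OCCURRENCE = {
    "Title": (1, 1),
    "Description": (0, 1),
    "Source": (0, 1),
    "PID": (0, 1),
    "DOI": (0, 1),
    "Creator": (0, 1),
    "Discipline": (0, None),
    "PublicationYear": (0, 1),
    "TemporalCoverage": (0, 1),
    "SpatialCoverage": (0, 1),
}


def validate(record):
    """Return a list of occurrence and format errors for one record."""
    errors = []
    for field, (low, high) in OCCURRENCE.items():
        n = len(record.get(field, []))
        if n < low or (high is not None and n > high):
            expected = str(low) if low == high else f"{low}-{'n' if high is None else high}"
            errors.append(f"{field}: {n} value(s), expected {expected}")
    # Allowed value for PublicationYear is a four-digit year (YYYY).
    for year in record.get("PublicationYear", []):
        if not re.fullmatch(r"\d{4}", str(year)):
            errors.append(f"PublicationYear: {year!r} is not YYYY")
    return errors


good = {"Title": ["Sea ice extent"], "Source": ["http://example.org/data/1"],
        "Discipline": ["Oceanography", "Ecology"], "PublicationYear": ["2016"]}
bad = {"Title": [], "DOI": ["10.1234/a", "10.1234/b"], "PublicationYear": ["16"]}
print(validate(good))
print(validate(bad))
```

Keeping the limits in a single dictionary means that a change to the schema only touches the table, not the checking logic.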
The facet Discipline and its Controlled Vocabulary
Taken from the List of academic disciplines at http://en.wikipedia.org/wiki/List_of_academic_disciplines_and_sub-disciplines
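Mapping free-text discipline values onto a controlled vocabulary can be sketched as a lookup in a small discipline tree. The excerpt below is a tiny, hypothetical subset chosen for illustration (it is not the b2find_disciplines vocabulary), and in this sketch both missing and unrecognised values fall into the group "Not stated".

```python
from collections import Counter

# Hypothetical excerpt of a discipline tree: sub-discipline -> top-level group.
DISCIPLINE_GROUP = {
    "History": "Humanities",
    "Philosophy": "Humanities",
    "Sociology": "Social Sciences",
    "Economics": "Social Sciences",
    "Oceanography": "Natural Sciences",
    "Physics": "Natural Sciences",
    "Engineering": "Professions",
}

# Case-insensitive index onto the canonical spelling of each term.
_CANONICAL = {name.lower(): name for name in DISCIPLINE_GROUP}


def normalise(term):
    """Return the canonical vocabulary term, or None if the term is unknown."""
    return _CANONICAL.get(term.strip().lower())


def group_of(term):
    """Return the top-level group of a term; missing or unknown terms are 'Not stated'."""
    name = normalise(term) if term else None
    return DISCIPLINE_GROUP[name] if name else "Not stated"


def group_counts(terms):
    """Count records per top-level group, given one discipline value per record."""
    return Counter(group_of(t) for t in terms)


print(group_counts(["physics", "History ", None, "Sociology", "unknown field"]))
```

Because every term resolves to a top-level group, the same tree could also support a hierarchical search that selects all records below one group.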
Coverage of Disciplines in B2FIND
Chart values: Humanities 51, Not stated 29, Social Sciences 10, Natural Sciences 8, Professions 2
B2FIND Metadata Catalogue - Ingestion Status
Outlook and Summary
Challenges
EUDAT has to master the balancing act between providing generic, discipline-agnostic services and meeting research-specific needs
  e.g. requirements of Blue Growth communities
Integrate B2 services in/as BlueBRIDGE VRE?
Handle scalability and granularity issues
Assurance of quality of metadata
Improve usability of the (graphical) user interface
Use the potential of the Semantic Web (LoD)
Lessons learned
Less is sometimes more: a catalogue with a manageable amount of high-quality metadata instead of a mess of millions of entries
Talk more to community representatives and researchers (ideally already in the phase of generation of the metadata)
Low(er) barrier for communities to get in contact with EUDAT and to obtain documentation and support
Next steps
Provide guidelines and recommendations for data providers
'Annotation' functionality (B2NOTE): users link datasets to external reference materials (vocabularies, ontologies, etc.)
Hierarchical search, query-based taxonomies: enabling hierarchical search, e.g. in trees of disciplines
Extend and adapt validation and consistency checks, e.g.
  check of resolvability of URLs (resource identifiers)
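A resolvability check of the kind mentioned in the last item could look like the following sketch. It uses only the Python standard library; the function names `http_status` and `unresolvable`, the 10 second timeout and the rule "any status below 400 counts as resolvable" are assumptions for illustration. Some servers reject HEAD requests, so a real check might need to fall back to GET.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Send a HEAD request and return the HTTP status code, or None on failure."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, ValueError, OSError):
        # Malformed URL, DNS failure, refused connection or timeout.
        return None


def unresolvable(urls, fetch_status=http_status):
    """Return the URLs whose status is missing or 400 and above."""
    broken = []
    for url in urls:
        status = fetch_status(url)
        if status is None or status >= 400:
            broken.append(url)
    return broken


# Usage with a stubbed status lookup, so the example runs without network access.
known = {"http://example.org/ok": 200, "http://example.org/gone": 404}
print(unresolvable(list(known) + ["http://example.org/timeout"], fetch_status=known.get))
```

Passing the status lookup as a parameter keeps the check testable offline; in production use the default `http_status` would perform the actual requests.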
Conclusions
EUDAT-B2FIND
  established an operative service based on agreed standards and guidelines such as the FAIR principles,
  provides a discovery portal with powerful search functionalities and
  is based on a unique catalogue of research data, combining many heterogeneous and cross-discipline sources
Improved interoperability is achieved by homogenisation to a common metadata schema
Further efforts are made to address the demands of the communities and data projects, and to adapt the system for future challenges
Links
  Info about EUDAT: http://eudat.eu
  B2FIND portal: http://b2find.eudat.eu
Contact
  Support form: www.eudat.eu/support-request
  Email: [email protected]
Thank you for your attention !