![Page 1: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/1.jpg)
EUDAT
A cross-disciplinary data
infrastructure in Horizon
2020
Damien Lecarpentier
EUDAT Project Manager
CSC – IT Center for Science Ltd
![Page 2: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/2.jpg)
Data "Deluge"
Increasing complexity and variety
Gigabytes
Terabytes
Petabytes
Exabytes
Zettabytes
Exponential growth
• Where to store it?
• How to find it?
• How to make the most of it?
![Page 3: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/3.jpg)
Synergies
If there are hundreds of Research Infrastructures, how many different data management systems can we sustain?
![Page 4: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/4.jpg)
Trust
Data Curation
Common Data Services
Users
Data Generators
Community Support Services
Riding the Wave
Collaborative Data Infrastructure
A framework for the future?
![Page 5: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/5.jpg)
![Page 6: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/6.jpg)
Consortium
![Page 7: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/7.jpg)
Seven Research Communities on Board
• EPOS: European Plate Observing System
• CLARIN: Common Language Resources and Technology Infrastructure
• ENES: Service for Climate Modelling in Europe
• LifeWatch: Biodiversity Data and Observatories
• VPH: The Virtual Physiological Human
• INCF: International Neuroinformatics Coordinating Facility
• DRIHM: Distributed Research Infrastructure for Hydrometeorology
![Page 8: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/8.jpg)
User Forums + 25 communities
1st User Forum, 7-8 March 2012, Barcelona
![Page 9: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/9.jpg)
Service Building Process
• Takes time!
• Reusing existing technologies and expertise rather than reinventing everything!
• Infrastructure coordination (resources, security, etc.)
![Page 10: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/10.jpg)
Selected Services
• Data Staging: dynamic replication to HPC workspace for processing
• Safe Replication: data curation and access optimization
• Simple Store: researcher data store (simple upload, share and access)
• Metadata Catalogue: aggregated EUDAT metadata domain; data inventory
• AAI: network of trust among authentication and authorization actors
• PID: identity, integrity, authenticity, locations
New services to come
• EUDAT Box: dropbox-like service (easy sharing, local synching)
• Semantic Anno: checking & referencing
• Dynamic Data: immediate handling
![Page 11: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/11.jpg)
Safe Replication Service
• Robust, safe and highly available data replication service for small- and medium-sized repositories
– To guard against data loss in long-term archiving and preservation
– To optimize access for users from different regions
– To bring data closer to powerful computers for compute-intensive analysis
EUDAT CDI Domain of registered data
PIDs • Policy rules
http://eudat.eu/safe-replication
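To make the "guard against data loss" goal concrete, here is a minimal sketch of checksum-verified replication. It is not EUDAT's implementation: the function names, the record layout, and the use of local directories to stand in for data centres are all assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(source: Path, target_dirs: List[Path]) -> Dict:
    """Copy source into each target directory (hypothetical 'sites').

    Returns a record with the checksum taken at replication time and
    the list of all known locations, original included.
    """
    checksum = sha256_of(source)
    locations = [source]
    for target_dir in target_dirs:
        target_dir.mkdir(parents=True, exist_ok=True)
        copy = target_dir / source.name
        shutil.copy2(source, copy)
        locations.append(copy)
    return {"checksum": checksum, "locations": locations}


def verify(record: Dict) -> List[Path]:
    """Return the locations that are missing or no longer match the checksum."""
    return [
        path
        for path in record["locations"]
        if not path.exists() or sha256_of(path) != record["checksum"]
    ]
```

In a real deployment the record would presumably be attached to the object's PID and the check driven by policy rules. Here, calling `verify(record)` periodically and re-copying from a healthy location is the whole idea.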
![Page 12: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/12.jpg)
Data Staging Service
• Support researchers in transferring large data collections from EUDAT storage to HPC facilities
• Reliable, efficient, and easy-to-use tools to manage data transfers
• Provide the means to re-ingest computational results back into the EUDAT infrastructure
EUDAT CDI Domain of registered data
PRACE HPC
HPC
http://eudat.eu/datastaging
![Page 13: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/13.jpg)
Simple Store Service
• Allow registered users to upload "long tail" data into the EUDAT store
• Enable sharing objects and collections with other researchers
• Utilise other EUDAT services to provide reliability and data retention
EUDAT CDI Domain of registered data
Simple upload
Simple metadata
PID registration
http://eudat.eu/simplestore
![Page 14: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/14.jpg)
![Page 15: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/15.jpg)
![Page 16: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/16.jpg)
Simple Store Basic/Premium

| Properties/functionalities | Basic | Premium |
| --- | --- | --- |
| Upload Capacity | < 2GB per file/deposit | On-demand |
| Storage Capacity | Fair share | Unlimited |
| Center Selection | No | Yes |
| Replication | No | Yes |
| Customized interfaces (MD fields, logo, etc.) | Yes | Yes |
| Access management | Standard (open/not open) | Extended (restricted access to groups, etc.) |
| Duration | TBC | Based on SLAs |
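The Basic/Premium comparison can be read as a small configuration model. The sketch below encodes part of it and checks a deposit against the upload limit. The class and field names are hypothetical, and treating "2GB" as decimal gigabytes is an assumption, since the slide does not specify the unit.

```python
from dataclasses import dataclass
from typing import Optional

GB = 10**9  # assumption: decimal gigabytes; the slide does not say which is meant


@dataclass(frozen=True)
class Tier:
    name: str
    max_upload_bytes: Optional[int]  # None stands for "on-demand" (no fixed limit)
    center_selection: bool
    replication: bool
    group_access: bool  # "Extended" access management (restricted to groups)


# Values taken from the Basic/Premium comparison; field names are illustrative.
BASIC = Tier("Basic", 2 * GB, center_selection=False, replication=False, group_access=False)
PREMIUM = Tier("Premium", None, center_selection=True, replication=True, group_access=True)


def can_upload(tier: Tier, size_bytes: int) -> bool:
    """Check a deposit size against the tier's per-file/deposit limit ("< 2GB" is strict)."""
    if tier.max_upload_bytes is None:
        return True
    return size_bytes < tier.max_upload_bytes
```

For example, `can_upload(BASIC, 500 * 10**6)` accepts a 500 MB deposit, while a deposit of exactly 2 GB is rejected under Basic and accepted under Premium.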
![Page 17: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/17.jpg)
Metadata Service
• Easily find collections of scientific data, generated either by various communities or via EUDAT services
• Access those data collections through the given references in the metadata to the relevant data stores
• Europeana of scientific data
EUDAT CDI Domain of registered data
http://eudat.eu/metadata
![Page 18: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/18.jpg)
![Page 19: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/19.jpg)
Towards Horizon 2020
Synergy Sustainability
User driven services
Global collaboration
Trust
Joint e-infrastructure roadmaps
![Page 20: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/20.jpg)
A Network of Trusted Centers
• Strong and sustainable generic data centers with existing trusted relationships
• Each having specific relationship with research communities
• EUDAT is about providing solutions in a federated environment
Generic data centres
Community data sites
![Page 21: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/21.jpg)
Path to Sustainability
• Strong requirement from researchers and funders
Bridging National and European solutions
![Page 22: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/22.jpg)
![Page 23: EUDAT A cross-disciplinary data infrastructure in Horizone-irg.eu/documents/10920/208005/eudat_e-irg_5112013b.pdf · If there are hundreds of Research Infrastructures, how many different](https://reader034.vdocuments.mx/reader034/viewer/2022050219/5f64f24473bc3f72ce6c80ed/html5/thumbnails/23.jpg)
EUDAT Priorities in H2020
• Consolidation of Core Services
– Increased performance, new functionalities, AAI, etc.
– Develop tools and policies to facilitate usage: data management plans, licensing, training, etc.
– Development of new services
• Financial Sustainability
– Cost and funding models
– Framework and mechanisms for sharing resources across sites and across communities (juste retour, etc.)
• Interoperability
– E-Infrastructures: a joint roadmap?
– National initiatives: service portfolios
– RDA: EUDAT as a driver and implementer