EGI vision for Data and Distributed Computing e-Infrastructures for Open Science

Tiziana Ferrari, EGI.eu; Ladislav Hluchý, Institute of Informatics SAS

www.egi.eu
ladislav.hluchy@savba.sk
www.ui.sav.sk

10/10/2016

Challenge and scope

EINFRA-12 (A) Meeting, 13-09-2016


Impact

EINFRA-12 Challenge

Compute/Storage/Data/Security
Federation Services
Thematic Services
Training and outreach

“Integrate at European level the geographically and disciplinary dispersed resources to achieve economies of scale and efficiency gains in providing the best data and computing capacity and services to the research and education communities.”

Issues:
- Geographical and disciplinary fragmentation
- Lack of economies of scale

Impact and Excellence

• Concerned platforms and services are based on systems and technologies that have reached at least TRL 8 (“system complete and qualified”) before the start of the project.
• Quality and quantity of services in a joint service catalogue: “The extent to which the Service Activities (Trans-national and/or Virtual Access Activities) will offer access to state-of-the-art infrastructures and high quality services”
• Potential to enhance capacity for innovation and production of new knowledge

Exploitation of services for excellent science and industry/SMEs

Requirements/1

• Integration of computing, software and storage resources
– EGI services: Compute/Storage/Data management/AppDB/UMD and CMD distribution
– External services: EUDAT and other e-Infrastructures, RIs
• Exposing them through a dynamic registry and catalogue of services supporting European research and education communities in their tasks related to data- and computing-intensive science
– EGI new services: marketplace (from EGI-Engage), Cloud services for data science educational activities
• This integration should be done by means of open and flexible architectures and include institutional, regional, national and European capabilities, packaging them from the perspective of end-user needs
– Support and training for federated service management including service portfolio management
– Extend federation to regional infrastructures, e.g. funded through ESIF

Requirements/2

• Seamless operation of highly scalable and agile data and computing platforms and services dedicated to analytics, including hardware and software components, databases, compilers and analytics software, supported by easy entry points for the community of users
– EGI services: federation fabric (aka Core Infrastructure Platform) and federated service management tools and activities. Includes new core infrastructure services of general interest.
– New services: Data Hub/INDIGO-DataCloud, EOSCpilot successful PoCs
– Activities:
  • Operation of the EGI Core Infrastructure Platform and federated operations
  • Operation of EGI community platforms (for research and industry), integration of new ones at TRL 8
  • Integration with thematic data infrastructures (RIs, national data infrastructures) and generic ones (EUDAT)

Requirements/3

• Reliably address the aspects of privacy, cybersecurity and information assurance, supporting multiple compartments with private, public or industrial corpora of data, protected from unauthorized access by secure interfaces
– Security coordination and policy development
– Incident response
– Security training
– Security certification in compliance with data protection regulations
– AAI services (credential translation, attribute translation, IdP/SP proxy services: “EGI CheckIn”)

Requirements/4

• Adoption of standards-based common interfaces and open source components enabling access to and processing of underlying data collected/stored in different platforms and formats.
• Empowering users to customise applications and services, tailoring them to specific requirements, which will differ across disciplines, applications etc.
– Support to open standards
– Maintenance of critical middleware components
– Technical integration of new community platforms according to EGI participants’ priorities and EC priorities (e.g. Copernicus/DIAS) through competence centres

Requirements/5

• Work closely with user communities (from different disciplines) to foster the use of digital infrastructures
– Outreach, training
• Promote the values of open science and support their data management plans
– Distributed storage infrastructure for European research collaborations and long tail of science for depositing data
– Open access to data management planning tools available at RI/EGI participant level, cooperation with EUDAT/OpenAIRE and relevant national authorities/EIROs (e.g. Zenodo/CERN and others)
• Engage and train users (researchers, educators and students) to contribute to the dynamic registry and catalogue of services, improving quality of data, software and computing infrastructure that become available for re-use
– Promotion of marketplace with new service providers (public/private)

Requirements/6

• Foster interoperability of pan-European thematic/community-driven e-infrastructures providing cost-effective and interoperable solutions for data management. The data and computing e-infrastructure should be able to interoperate with resources based on different technologies which are operated/owned by public and/or private organisations
– Establish a commercial cloud supplier group and define technical interoperability requirements and a roadmap for the European Open Science Cloud (leveraging HNSciCloud results)
– Establish EGI as cross-border lead procurer (in collaboration with GEANT FPA initiatives)?

Requirements/7

• Support the preservation and curation of data and associated software so that the reproducibility and accuracy of the data can be verified
– EGI services: AppDB and Cloud Compute integrate with preservation infrastructures (e.g. Zenodo, B2SHARE…) for preservation of VM images and linking to data and Cloud Compute
– Liaise with EINFRA-12 (B) proposal (OpenAIRE)
– Repurpose community data preservation and data management instruments of general interest

Requirements/8

• Enable seamless transition and e-infrastructure upgrades, exploiting economies of scale and promoting interoperability with similar infrastructures across and beyond Europe, and operate user-friendly and comprehensive repositories of software components for research and education
– EGI services: UMD and CMD software distributions, AppDB, quality verification of services for marketplace
– International cooperation for data-driven research collaborations
  • HBP/Astronomy and Astrophysics/Life Science…

EGI Services/NGI-EIRO services

Compute: High-throughput, GPU, Cloud, Cloud Container, Cloud GPU
Storage: Online, Archival
Data: Data transfer, Public Data
Security: User attributes management

- Providers
  - NGIs and EIROs
  - (Commercial suppliers)
- Funding
  - 100% National funding agencies
  - Private funds
- EINFRA-12 funding
  - EC (cost for pooling and supporting new user groups) + National/Private funds establishing a distributed operations coordination and technical support team

EGI Services/federation services

Helpdesk
GGUS
Technical support
Security
Coordination, CSIRT and policies
VO and user registration
IDP and IDP Proxy
Credential translation
Attribute Management
Accounting
Repository
Portal
Scientific Applications and cloud VM library
Operations tools
Messaging infrastructure
Monitoring
Service registry (GOCDB)
Operations Portal
Collaboration tools
Unified Middleware Distribution
Quality assurance
UMD infrastructure
Coordination
Technical
Operations
User communities

- Providers
  - NGIs and EIROs
  - EGI Foundation
- Currently
  - 50% National funding agencies + 50% Fees
- EINFRA-12 funding
  - EC contribution (devops) + EGI in-kind pledged contributions

EGI Services/Thematic services

HEP, Astro and Astroparticle: WLCG, CTA, etc.
Structural biology: WeNMR services
Biomedicine and Bioinformatics: Life Science Grid Community, BILS (Sweden), CHIPSTER
Hydrology: DRIHM
Fresh water and marine resource conservation: iMARINE
Environmental Science: ESA Thematic Exploitation Platforms and Data and Info Access Service Copernicus
Art and humanities: Musicology (Peachnote), DARIAH
Scientific applications on demand: Bioinformatics, Engineering, etc.

- Providers
  - Research organizations/universities (MoUs)
  - NGIs
  - Industry
- Funding
  - 100% National funding agencies
  - Private funds
- EINFRA-12 funding
  - EC (devops, technical support) + National funding agencies or private funds

Cloud Services for Nanotechnology

Nanotechnology Requirements

• Nanotechnology can easily exhaust any available computing resources
– An increase in the detail of a computation also requires a significant increase in computing time
– Outputs are measured in hundreds of GB
– Monte Carlo methods are quite popular
• Many different software packages
– VASP (Vienna Ab-initio Simulation Package, DFT), CPMD (Car-Parrinello Molecular Dynamics, DFT), SPR-SKKR (Spin-Polarized Relativistic Screened Korringa-Kohn-Rostoker), Q-espresso (Quantum Espresso, DFT)…

Tech Summit Bratislava & GadgetEXPO - Bratislava, 11.-12.05.2016

Experiment and Simulation

• Nanotechnology can be broadly divided into experimental and theoretical (simulation-based)
• These two fields work together, share ideas, and exchange data and results

High performance or high scalability?

• Theoretical (simulation-based) nanotechnology currently relies on HPC
– Reduced availability for smaller research teams
– Often cumbersome setup of software and various constraints (security, multi-user sharing of resources…)
• A shift towards HSC (high-scalability computing) would be beneficial
– Some widely used methods (like Quantum Monte Carlo) work very well in an HSC environment
– Cloud can work as HSC -> excellent on-demand availability, price scales with requirements
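Monte Carlo methods fit an HSC environment because their random samples are independent: workers need no communication until the final tally. A minimal sketch of that pattern, assuming a toy problem (estimating π by sampling the unit square) in place of a real Quantum Monte Carlo code. The function names and seed layout are illustrative, and local threads stand in for separate cloud nodes.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(task):
    """One independent worker: count random points inside the quarter circle."""
    seed, samples = task
    rng = random.Random(seed)  # private generator, no state shared between workers
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(workers=4, samples_per_worker=50_000):
    """Run the workers and combine their counts; only the final sums are exchanged."""
    tasks = [(seed, samples_per_worker) for seed in range(workers)]
    # Threads are a stand-in here; on a cloud each task could run on its own VM.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total_hits = sum(pool.map(count_hits, tasks))
    return 4.0 * total_hits / (workers * samples_per_worker)


if __name__ == "__main__":
    print(estimate_pi())
```

Adding workers increases the sample count without changing the code of any single worker, which is the property that lets cost scale with requirements.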


Different models of large-scale computing

• Cloud can be very useful; cloud with HPC resources available is even better


Storage requirements

• Usually only processed data (graphs, pictures of atomic structures, movies of trajectories) is exchanged
– Raw data is too big to send via e-mail or to store for longer periods of time
• Availability of raw data from previous runs would free up considerable resources
– Simulations would not need to be repeated by different teams
– Often the raw data contains aspects not present in the final, processed data; these are lost and must be computed from scratch if needed
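One way to make raw data from previous runs findable is to index it by a fingerprint of the simulation inputs, so a team can check for an identical run before recomputing. A minimal sketch, assuming an in-memory dictionary stands in for a shared storage service. `RunArchive`, `fingerprint`, the parameter names and the location string are hypothetical and not part of any EGI service.

```python
import hashlib
import json


def fingerprint(params):
    """Stable SHA-256 digest of the input parameters (key order does not matter)."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class RunArchive:
    """Toy archive: maps an input fingerprint to the location of stored raw data."""

    def __init__(self):
        self._index = {}

    def register(self, params, raw_data_location):
        """Record where the raw output of a finished run is kept."""
        self._index[fingerprint(params)] = raw_data_location

    def lookup(self, params):
        """Return the stored location, or None if no identical run is archived."""
        return self._index.get(fingerprint(params))


archive = RunArchive()
run_a = {"code": "example-dft", "cutoff_ev": 400, "kpoints": [4, 4, 4]}
archive.register(run_a, "archive://runs/0001")  # made-up location string

# The same inputs in a different key order resolve to the same archived run.
same_inputs = {"kpoints": [4, 4, 4], "cutoff_ev": 400, "code": "example-dft"}
location = archive.lookup(same_inputs)
```

A real service would also need to record software versions and units in the fingerprint, since two runs with equal parameters but different code versions are not interchangeable.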


Modern Virtual Research Environment for Nanotechnology - Consortium

• Based on close cooperation with several important nanotechnology centers
– Karlsruher Institut für Technologie
– Technische Universität Wien
– Universität Regensburg
– Universität Tübingen
– King’s College London
– Justus-Liebig-Universität Giessen
– Institut Català de Nanociència i Nanotecnologia
– North Carolina State University
– Department of Applied Physics, Graduate School of Engineering, Osaka University

Modern Virtual Research Environment for Nanotechnology - Motivation and goals

• Weakest points of current technology:
– Need for frequent and cumbersome porting and moving of application codes onto and between the different HPC platforms
– Need for expert performance tuning on the platforms to which the application codes have been ported
– Slow and cumbersome data exchange/flow between the different research participants
• A new type of cloud-based VRE:
– makes increased use of the HSC computing paradigm and grants easy, comfortable, and secure access to, and use of, even larger computing facilities
– solves the need for huge data exchange/sharing among the diverse research participants (theory/simulation, experiment)
– long-term data storage
– improved convenience and quality of graphics and data animation

Modern Virtual Research Environment for Nanotechnology - Objectives

• A collaborative, nanotechnology-specific virtual hub
• Extensive semantic support for data and services
• Modern distributed infrastructure with a state-of-the-art architecture and deployed technologies
• Comprehensive integration of resources
• A platform built for real users and with their close involvement


High-level Architecture


Adopted Technology – WS-PGRADE

• WS-PGRADE/gUSE

– MTA SZTAKI (Hungarian Academy of Sciences)

• Scientific Gateway Based User Support (SCI-BUS)

– Based on WS-PGRADE

– Is widely used


EGI and the European Open Science Cloud

Open Science: a Complex Resource System

• Shared resources
– Integrated, easy and fair access
• Engaged communities
– Participating in the process
– Culture of sharing
– Collaborating in the management and stewardship
• Governance
– Rules to access
– Rules to resolve conflicts
– Rules to balance quality vs. openness
• Financial support
– For long-term availability

[Figure labels: Digital services and applications; Knowledge & Expertise; Instruments; Research Data]

Open Science Commons: When implemented…

Researchers from all disciplines have easy, integrated and open access to the advanced digital services, scientific instruments, data, knowledge and expertise they need to collaborate and achieve excellence in science, research and innovation.

They feel engaged in governing, managing and preserving these resources for everyone’s benefit, with the support of all stakeholders.

Open Science Commons adopted in the EU Council Conclusions, May 2015

A multi-stakeholder endeavor (EU perspective)

Digital services and applications
Knowledge & Expertise
Instruments
Research data
Centres of Excellence
Innovation Centres
Research Infrastructures
Virt. Research Env. providers

Commons

Institutionalised community governance of the production and/or sharing of a particular type of resource (from natural to intellectual)

Constructing Genome Commons
GÉANT: European Communications Commons
e-Infrastructure Commons
Linux
Wikipedia …
Internet


EOSC principles?


EOSC Research Objects Hub

Storage/data management
Cloud compute
Cloud container compute
HTC and HPC
Research outputs
Thematic services (data products, pipelines, software, virtual appliances...)
Hub-specific service management processes, business processes, policies

Federation services and processes

EOSC federation services and activities (examples)

Research objects libraries
Research object indexing and discovery services
Marketplaces
Standards and policies
Federation services/processes (accounting, monitoring, ...)
Business processes and channels
Knowledge and training
Federated IdP, Auth, Authz


Thank you for your attention

Questions?