Page 1: GRPgrp-workshop-2019.ucsd.edu/presentations/L_HOWARD...• A collaboration between the two national HPC facilities • National Computational Infrastructure (Canberra) • Pawsey Supercomputer

nci.org.au

GRP
APRP - ANRP Australian National Research Platform
Andrew Howard
NCI Cloud Team Manager
Co-Chair APAN Asia Pacific Research Platform WG

Overview
• NCI
  • Bringing compute and data together
• Friction Free Data Movement
• European eXtreme Data Cloud
• Problem space
• Potential Solutions
• Research Platforms
• APRP proposed design
• Australia National Research Platform
• Data Mover Challenge
  • Australia, Singapore, Korea, Japan, USA

Introduction to NCI
• The National Computational Infrastructure (NCI) is a collaboration between its foundational partners ANU, CSIRO, Bureau of Meteorology, Geoscience Australia, the major research universities, research institutes and industry. It provides a highly integrated computational environment for Australian researchers to enable national research and innovation.

Introduction to NCI
WE ENABLE AUSTRALIAN RESEARCH WITH WORLD-CLASS…
• HIGH PERFORMANCE COMPUTING
• CLOUD COMPUTING
• DATA STORAGE & SERVICES

HPC resource use at NCI
Distribution by field of research
• Not specified: 4%
• Mathematics: 3%
• Technology: 1%
• Engineering: 9%
• IT and Computing: 2%
• Biological Sciences: 4%
• Environment: 1%
• Earth and Earth System Sciences: 38%
• Chemistry: 20%
• Physical Sciences: 17%
Distribution by Research Organisation
• Other Research Centres: 17%
• Merit Flagships: 19%
• Universities: 29%
• Geoscience Australia: 3%
• Bureau of Meteorology: 15%
• CSIRO: 17%

Overview
• Containers
• Role of NRENs
• Science drivers
  • Collaborative research
  • Genomics
  • Instruments like SKA

Pawsey and NCI

Nullarbor

Australian Research Clouds
• NeCTAR
  • National eResearch Collaboration Tools and Resources
• Restructure of governance
  • Australian Research Data Commons (ARDC)
    • Combines Australian National Data Service (ANDS), Research Data Services (RDS) and NeCTAR Cloud

NeCTAR
• The National eResearch Collaboration Tools and Resources project (Nectar) provides an online infrastructure that supports researchers to connect with colleagues in Australia and around the world, allowing them to collaborate and share ideas and research outcomes, which will ultimately contribute to our collective knowledge and make a significant impact on our society.

NeCTAR Cloud Operators
• NeCTAR Cloud operators
  • University of Melbourne
  • Monash University
  • National Computational Infrastructure (NCI)
  • Queensland Cyber Infrastructure Foundation (QCIF)
  • University of Tasmania (TPAC)

Indigo and CAE-1
• AARNet
  • Spectrum ownership
  • Alien waves
  • Tri-versity
  • Shorter path to Europe via SingAREN and CAE-1

Overview
• What is a Research Platform
• Notable Research Platforms
• APRP
  • Participants
    • Australia
      • NCI
      • Pawsey Supercomputing Centre
• NRENs

Overview
• Australia National Research Platform
  • How it relates to APRP
  • Foundation capabilities
    • Data Movement
    • Federated authentication
    • Service orchestration
• Data Mover Challenge
  • APRP participants

Overview
• Data movement
  • File replication
  • Object replication
  • Scheduled and background transfers
• Service endpoints
  • Shared capabilities
  • Distributed data stores integrated into a single metadata namespace
  • Build on advanced network capabilities

Friction Free Data movement
• We need to provide our researchers with a friction free data transfer system
  • Easy to use
  • Secure using a Federated Access system
• The network and tools should have the data in the right location at the right time
  • Able to effectively use different storage tiers
    • SSD
    • Spinning Disk
    • Tape
• The researcher creates a Data Intent definition
  • Data Source
  • Data Target
  • Transfer priority (High, Medium, Low)
  • Storage performance (SSD, Disk, Tape)
  • optional Network intersection
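
The Data Intent definition listed on this slide could be modelled roughly as follows. This is a minimal sketch, not the NCI implementation: the class name, field names, example endpoints and validation rules are all assumptions for illustration, and the slide does not define what "Network intersection" contains, so it is treated here as an optional free-form string.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Priority(Enum):
    """Transfer priority levels named on the slide."""
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


class StorageTier(Enum):
    """Storage performance tiers named on the slide."""
    SSD = "SSD"
    DISK = "Disk"
    TAPE = "Tape"


@dataclass(frozen=True)
class DataIntent:
    """A researcher's statement of where data should end up (hypothetical schema)."""
    source: str
    target: str
    priority: Priority = Priority.MEDIUM
    storage: StorageTier = StorageTier.DISK
    # Assumption: the slide only says this field is optional.
    network_intersection: Optional[str] = None

    def __post_init__(self) -> None:
        # Basic sanity checks; a real service would validate far more.
        if not self.source or not self.target:
            raise ValueError("source and target are required")
        if self.source == self.target:
            raise ValueError("source and target must differ")


# Hypothetical endpoints, for illustration only.
intent = DataIntent(
    source="nci:/projects/example/run42",
    target="pawsey:/scratch/example/run42",
    priority=Priority.HIGH,
    storage=StorageTier.SSD,
)
print(intent)
```

Keeping the intent declarative (what should be where, how urgently, on which tier) leaves the choice of transfer tool and network path to the platform.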


TCP/IP
By default, TCP/IP does not perform well over high-bandwidth, high-delay circuits.
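
A short worked calculation shows why. A single TCP stream can have at most one window of data in flight per round trip, so its throughput is capped at window / RTT, while filling the link needs a window equal to the bandwidth-delay product. The 100 Gbit/s figure matches the link speed mentioned elsewhere in this talk; the 200 ms round-trip time is an assumed intercontinental value, not a measurement from the slides.

```python
def bandwidth_delay_product_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep a link of this speed full."""
    return bandwidth_bps * rtt_s / 8


def window_limited_throughput_bps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on one TCP stream's throughput for a given window size."""
    return window_bytes * 8 / rtt_s


link_bps = 100e9          # 100 Gbit/s path
rtt_s = 0.2               # assumed 200 ms round-trip time
unscaled_window = 65535   # largest TCP window without window scaling (16-bit field)

bdp = bandwidth_delay_product_bytes(link_bps, rtt_s)
limit = window_limited_throughput_bps(unscaled_window, rtt_s)

print(f"Window needed to fill the link: {bdp / 1e9:.1f} GB")
print(f"Throughput with a 65535-byte window: {limit / 1e6:.1f} Mbit/s")
```

Under these assumptions the link needs about 2.5 GB in flight, while an unscaled window allows only about 2.6 Mbit/s, which is why tuned hosts use window scaling, large socket buffers and parallel streams.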

The challenges of the Big Data era
How to…?
• … orchestrate and federate Cloud, Grid and HPC [public or private] resources?
• … avoid software and vendor lock-in?
• … overcome performance issues limiting massive adoption of virtualised Cloud resources in large data centres?
• … exploit specialised hardware, such as GPUs or low-latency interconnections?
• … manage dynamic and complex workflows for scientific data analysis?
• … combine data from multiple sources, stored in multiple locations through incompatible technologies?
• … support federated identities and provide privacy and distributed authorisation in open Cloud platforms?
• … provide APIs to exploit the above and write applications, customisable portals and mobile views?
• … move beyond static location and partitioning of both storage and computing resources in data centres?
• … distribute and deploy applications in a flexible way?
• … exploit distributed computing and storage resources through transparent network interconnections?

Capabilities and Requirements
• Regional connection
• Federated access
• Data capacitor capabilities
  • Local storage
• Container provisioning
  • Instantiate toolkit containers
• VM provisioning
  • Provide VM access on regionally connected DTN

Containers
• Containers
  • Docker in a well protected hosting environment
  • Singularity
    • V2
    • V3
• Lightweight services
• Role of NRENs

Role of NRENs
• Our National Research and Education Networks are critical
• Advanced network services
  • 100G
  • Anycast
  • IPv6
• Data sharing services (AARNet CloudStor)
• National service termination point

Open Science Data Cloud

eXtreme DataCloud
[Architecture diagram; surviving labels: INDIGO PaaS Orchestrator, INDIGO CDMI Server, FTS]

XDC components
[Component diagram; surviving labels: Storage, Federation, Orchestration, INDIGO Orchestrator, Rucio, xRootD Cache, QoS CDMI]

Research Platforms
• Pacific Research Platform (PRP)
  • US Initiative to build a network of Science DMZs with well tuned systems for data movement
• Asia Pacific Research Platform (APRP)
  • Regional initiative
  • KISTI - Korea, NCI - Australia, Perdana U - Malaysia, NSCC - Singapore, Tsinghua U & NSCC Wuxi - PRC, CSIRO - Australia, Putra U - Malaysia

National Research Platform
• A collaboration between the two national HPC facilities
  • National Computational Infrastructure (Canberra)
  • Pawsey Supercomputing Centre (Perth)
• Will be connected via a dedicated 100G network path over AARNet
• Allows for secure, high speed data transfer between the facilities
• Designed for long running services which are able to utilise HPC facilities for High Throughput

National Research Platform - projects
• Keystone integration to support the facility specific HPC authorisation domains
• Data transfer services using Fermilab Big Data Express
  • Easy to use
  • Secure using a Federated Access system combining NCI, Pawsey and AAF namespaces
  • Supports scheduled transfers and prioritisation
  • Highly multi threaded for maximum network and storage utilisation
• Highly available national services
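
To illustrate what "scheduled transfers and prioritisation" can mean in practice, here is a generic priority-queue sketch. It is not the Big Data Express API or scheduler: the class names, the three priority labels and the `not_before` start-time field are assumptions made for this example.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Optional

# Lower rank is served first. The labels are illustrative.
PRIORITY_RANK = {"High": 0, "Medium": 1, "Low": 2}


@dataclass(order=True)
class QueuedTransfer:
    rank: int
    not_before: float   # earliest start time, in seconds on any monotonic clock
    seq: int            # tie-breaker that preserves submission order
    name: str = field(compare=False)


class TransferQueue:
    """Hands out the highest-priority transfer whose start time has arrived."""

    def __init__(self) -> None:
        self._heap = []
        self._counter = itertools.count()

    def submit(self, name: str, priority: str, not_before: float = 0.0) -> None:
        item = QueuedTransfer(PRIORITY_RANK[priority], not_before,
                              next(self._counter), name)
        heapq.heappush(self._heap, item)

    def next_ready(self, now: float) -> Optional[str]:
        deferred = []
        chosen = None
        while self._heap:
            item = heapq.heappop(self._heap)
            if item.not_before <= now:
                chosen = item
                break
            deferred.append(item)   # scheduled for later, so skip it for now
        for item in deferred:
            heapq.heappush(self._heap, item)
        return chosen.name if chosen else None


queue = TransferQueue()
queue.submit("nightly-sync", "Low")
queue.submit("urgent-dataset", "High")
queue.submit("embargoed-release", "High", not_before=100.0)
print(queue.next_ready(now=0.0))
```

A transfer scheduled for the future does not block lower-priority work that is ready now, which is the behaviour background transfers need.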


National Biosciences Cloud Pathfinder
• Creating a national facility to support the management, processing, storage and sharing of “omics” data
• Three initial pathfinder projects
  • Zero Childhood Cancer
  • Oz Mammals
  • Genomes of Australian Plants

NCI Cloud expansion
• NCI tender for new HPC system completed
  • A$70M project
• Existing HPC system will be re-purposed as additional Cloud resources for NRP and Bioscience Cloud
• Challenges
  • Disassemble and reassemble from primary to secondary data centre
  • Power and cooling
  • Networking

Indigo Alien Wave 300G Pawsey/NCI

APRP proposed high level design
[Network diagram; surviving labels: Australia, Singapore, LA, Korea, DTN (×4), KREONET, SLIX, SingAREN, AARNet SX Transport, PacWave, Internet2]

National and Regional Research Platform Architecture
[Architecture diagram; surviving labels]
• Zones: Regional Availability Zone; AU, SG and NZ National Availability Zones; NCI and Pawsey Site Availability Zones
• Services: Regional Service, National Service, NCI Service, Pawsey Service
• Networks: AARNet, REANNZ, SingAREN
• Other labels: CloudStor, AAF, SAF, Tuakiri, DTS (×4)
• Legend: Network as a Service, DTS Data Transfer Service, Message Queue Service, Lambda function Service, Federated Authorisation, Object replication, File system replication

Capabilities and Requirements
• Regional connection
• Federated access
• Data capacitor capabilities
  • Local storage
• Container provisioning
  • Instantiate toolkit containers
• VM provisioning
  • Provide VM access on various National Research and Commercial clouds

High level architecture/goals
• Services may operate at a Site, National or Regional scope.
• Replication of Objects and Filesystems to support services operating in multiple Availability Zones.
• Authentication support for existing LDAP based systems and Federated identities through AAF and other identity federations (eduGAIN).
• Share common best practice and personnel in design and implementation.
• Efficiently support the rapidly growing national BioInformatics activities.
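
The Site, National and Regional scopes can be read as nested sets of availability zones. The sketch below shows one way to work out which site zones a service's objects and file systems would need to reach. The AU sites (NCI, Pawsey) and the AU, SG and NZ national zones come from the slides; the SG and NZ site names are hypothetical placeholders, and the function itself is an assumption for illustration, not part of any described system.

```python
from enum import Enum
from typing import Dict, List


class Scope(Enum):
    SITE = "Site"
    NATIONAL = "National"
    REGIONAL = "Regional"


# National zone -> site zones. Non-AU site names are hypothetical placeholders.
TOPOLOGY: Dict[str, List[str]] = {
    "AU": ["NCI", "Pawsey"],
    "SG": ["sg-site-1"],
    "NZ": ["nz-site-1"],
}


def replication_zones(scope: Scope, home_nation: str, home_site: str) -> List[str]:
    """Return the site zones that must hold a replica for a service of this scope."""
    if home_site not in TOPOLOGY.get(home_nation, []):
        raise ValueError(f"unknown site {home_site!r} in nation {home_nation!r}")
    if scope is Scope.SITE:
        return [home_site]
    if scope is Scope.NATIONAL:
        return list(TOPOLOGY[home_nation])
    # Regional scope spans every site in every national zone.
    return [site for sites in TOPOLOGY.values() for site in sites]


for scope in Scope:
    print(scope.value, replication_zones(scope, "AU", "NCI"))
```

Deriving replication targets from the declared scope keeps service definitions short: a service states its scope once and the platform decides where data must live.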

Capability required
• GPU access
• Object store replication
• File system replication
• Data transfer services
• Advanced Cloud development testbed
• Containers
• Message queues
• Functions

Conclusion
• We have started the journey
• The foundation of data movement is in progress
• Activities like the Data Mover Challenge (DMC) are building better collaboration
• We need to investigate other shared resources

Contact details
• For more information please contact me [email protected]

Questions?

Acknowledgements