THE SQUARE KILOMETER ARRAY (SKA)

ESD USE CASE

Ronald Nijboer

Head, ASTRON R&D Computing Group

ETP4HPC workshop, 23 June 2016

With material from Chris Broekema (ASTRON), John Romein (ASTRON), Nick Rees (SKA Office), Miles Deegan (SKA Office), John Taylor (U. of Cambridge), and Michael Wise (ASTRON)




ASTRON, Offices & Locations


Groningen: LOFAR CEP

Westerbork: WSRT

Borger-Odoorn, Exloo: LOFAR core

Dwingeloo: LOFAR / WSRT Operations; R&D, Science; JIVE, NOVA


Radio Astronomy


Figure labels: Doppler shift of the 21 cm line; Galaxy M81


Square Kilometer Array (SKA)


SKA: one Observatory (HQ in the UK), two sites (South Africa & Australia)


SKA: big scientific questions


Testing gravitation

Epoch of Reionisation

Cosmic Magnetism

Cradle of life

Large scale structures

Turbulent Universe


SKA Context Diagram


SDP is off-site! (Perth & Cape Town)

The Science Data Processor (SDP) transforms the signals into Science Data Products


Regional Science Centers


Regional Centers are proposed for ‘doing Science’ with the SKA Data Products


RSC Functionality


Data Discovery
• Observation database
• Associated metadata
• Quick-look data products
• Flexible catalog queries
• Integration with VO tools
• Publish data to VO

Data Processing
• Reprocessing and calibration
• High resolution imaging
• Mosaicing
• Source extraction
• Catalog re-creation
• DM searches

Data Mining
• Multi-wavelength studies
• Catalog cross-matching
• Light-curve analysis
• Transient classification
• Feature detection
• Visualization


RSC Requirements

Regional Science Centers are being discussed and planned

Requirements do not exist yet

H2020 project Aeneas submitted: design and specification of a distributed European Science Data Centre (ESDC) to support the pan-European astronomical community in achieving the scientific goals of the SKA

RSCs will likely differ from location to location

SKA SDP-type processing will be needed, as well as Data Discovery and Data Mining


SDP Key Performance Requirements


Diagram components: SDP Local Monitoring & Control, CSP, Observatory, Data Processor, Data Preservation, Delivery System

High Performance
• ~100 PetaFLOPS

Data Intensive
• ~100 PetaBytes/observation (job)

Partially real-time
• ~10 s response time

Partially iterative
• ~10 iterations/job (~6 hours)

High Volume & High Growth Rate
• ~100 PetaBytes/year

Infrequent Access
• ~few times/year max

Data Distribution
• ~100 PetaBytes/year from Cape Town & Perth to the rest of the world

Data Discovery
• Visualisation of 100k by 100k by 100k voxel cubes

Data rates shown in the diagram: ~1 Tbyte/s, ~10 Gbyte/s, ~200 Gbyte/s
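To give a feel for the Data Discovery and Data Distribution figures above, the following back-of-envelope sketch estimates the size of a single 100k by 100k by 100k voxel cube and the average link rate implied by moving ~100 PetaBytes per year. The 4 bytes per voxel, the decimal prefixes, and the assumption of a continuous transfer over a 365-day year are illustrative choices, not values from the slides.

```python
# Back-of-envelope figures for the data-discovery and data-distribution requirements.
# ASSUMPTIONS (not from the slides): 4 bytes per voxel (single-precision float),
# decimal prefixes (1 PB = 1e15 bytes), continuous transfer over a 365-day year.

BYTES_PER_VOXEL = 4                    # assumed: float32
PETABYTE = 1e15                        # bytes, decimal prefix
SECONDS_PER_YEAR = 365 * 24 * 3600


def cube_size_bytes(nx: int, ny: int, nz: int,
                    bytes_per_voxel: int = BYTES_PER_VOXEL) -> int:
    """Uncompressed size in bytes of an nx * ny * nz voxel cube."""
    return nx * ny * nz * bytes_per_voxel


def sustained_rate_gbit_per_s(bytes_per_year: float) -> float:
    """Average link rate in Gbit/s needed to move a yearly volume continuously."""
    return bytes_per_year * 8 / SECONDS_PER_YEAR / 1e9


cube_pb = cube_size_bytes(100_000, 100_000, 100_000) / PETABYTE
yearly_rate = sustained_rate_gbit_per_s(100 * PETABYTE)

print(f"100k^3 voxel cube: {cube_pb:.0f} PB uncompressed")
print(f"100 PB/year: {yearly_rate:.1f} Gbit/s sustained on average")
```

Under these assumptions a single cube is far larger than the memory of any one machine, which suggests why visualisation is listed as a requirement in its own right.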


SDP Functional Breakdown


Data Parallelism


o Data parallelism: dominated by frequency

o Provides dominant scaling

o Nothing more needed if each processing node can manage the complete processing of a frequency channel

Diagram labels: Frequency; Time & baseline; Visibility data; Processing nodes; Exploit frequency independence; Buffered UV data; Grid and de-grid; FFT

A lot of the processing is embarrassingly (data) parallel, but there will be synchronisation points where data needs to be combined
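The pattern described above, independent work per frequency channel followed by a synchronisation point, can be sketched in a few lines. This is a toy illustration only: `process_channel` and `combine` are hypothetical stand-ins for the real gridding, FFT, and combination steps, and a thread pool stands in for the processing nodes.

```python
# Toy illustration of frequency-parallel processing with one synchronisation point.
# ASSUMPTIONS: process_channel and combine are hypothetical placeholders; a real
# pipeline would grid, FFT, and de-grid visibilities. A thread pool keeps the
# sketch self-contained; a real deployment would spread channels across nodes.

from concurrent.futures import ThreadPoolExecutor


def process_channel(visibilities: list[complex]) -> float:
    """Per-channel step: total power of one channel's visibilities.
    It needs no data from any other channel."""
    return sum(v.real ** 2 + v.imag ** 2 for v in visibilities)


def combine(per_channel: list[float]) -> float:
    """Synchronisation point: all per-channel results are brought together."""
    return sum(per_channel) / len(per_channel)


def run(channels: list[list[complex]], workers: int = 4) -> float:
    """Process every channel independently, then combine the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves channel order and blocks until all channels finish.
        per_channel = list(pool.map(process_channel, channels))
    return combine(per_channel)


# Eight synthetic channels with three visibilities each.
channels = [[complex(k, 1)] * 3 for k in range(8)]
result = run(channels)
print(f"combined result over {len(channels)} channels: {result}")
```

Only `combine` requires data from more than one channel, so it is the only place where the workers have to wait for each other.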


SDP Compute Requirements

~50 PFLOPS total sustained, max

FFT and Gridding dominant

Mixed precision

Achieve 10-15% of peak now

Large fast working memory (~2 FLOP/byte)

Can exchange memory for FLOPs using facetting

Fast Storage

~3 Tb/s write, ~30 Tb/s read

~ 13000 FLOPS/byte read

~5MW per site
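The figures on this slide can be cross-checked against each other with simple arithmetic. The sketch below assumes that "Tb/s" means terabits per second and that the ~50 PFLOPS and ~5 MW figures describe the same system; neither is stated explicitly on the slide. Under the first assumption, the ~13000 FLOPS/byte read figure follows from the compute and read rates.

```python
# Consistency check of the compute-requirement figures.
# ASSUMPTIONS: "Tb/s" is read as terabits per second, and the ~50 PFLOPS and
# ~5 MW figures are taken to describe the same system.

SUSTAINED_FLOPS = 50e15           # ~50 PFLOPS sustained
READ_RATE_BITS_PER_S = 30e12      # ~30 Tb/s read, assumed to be terabits
POWER_WATTS = 5e6                 # ~5 MW
EFFICIENCY_RANGE = (0.10, 0.15)   # 10-15% of peak achieved

# FLOPs performed per byte read from fast storage.
read_bytes_per_s = READ_RATE_BITS_PER_S / 8
flops_per_byte_read = SUSTAINED_FLOPS / read_bytes_per_s

# Peak rate a machine would need in order to sustain 50 PFLOPS.
peak_needed = [SUSTAINED_FLOPS / eff for eff in EFFICIENCY_RANGE]

# Sustained energy efficiency implied by the power budget.
gflops_per_watt = SUSTAINED_FLOPS / POWER_WATTS / 1e9

print(f"FLOPS per byte read: ~{flops_per_byte_read:.0f}")
print(f"peak needed: {peak_needed[1] / 1e15:.0f} to {peak_needed[0] / 1e15:.0f} PFLOPS")
print(f"sustained efficiency: {gflops_per_watt:.0f} GFLOPS/W")
```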


SDP Compute Characteristics

Few, well known applications -> co-design

Trivially parallel workloads, baseline architecture leverages this

Low arithmetic intensity, thus I/O bound

Pseudo real-time + fast storage + batch processing

Tight budgets (energy, capital and ops)
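Why low arithmetic intensity leads to an I/O-bound workload can be illustrated with the standard roofline model, in which the attainable rate is the smaller of the peak compute rate and bandwidth times intensity. The node figures below are hypothetical and chosen only for illustration; they are not SKA or vendor specifications.

```python
# Minimal roofline-model sketch: attainable rate = min(peak, bandwidth * intensity).
# ASSUMPTION: the node figures below are hypothetical, for illustration only.

PEAK_FLOPS = 10e12        # hypothetical node peak: 10 TFLOPS
MEM_BANDWIDTH = 500e9     # hypothetical memory bandwidth: 500 GB/s


def attainable_flops(intensity: float,
                     peak: float = PEAK_FLOPS,
                     bandwidth: float = MEM_BANDWIDTH) -> float:
    """Upper bound on sustained FLOPS for a kernel with the given
    arithmetic intensity (FLOP per byte moved)."""
    return min(peak, bandwidth * intensity)


def is_bandwidth_bound(intensity: float,
                       peak: float = PEAK_FLOPS,
                       bandwidth: float = MEM_BANDWIDTH) -> bool:
    """True when the kernel sits below the roofline ridge point."""
    return intensity < peak / bandwidth


for intensity in (0.5, 2.0, 20.0, 100.0):
    rate = attainable_flops(intensity)
    bound = "bandwidth" if is_bandwidth_bound(intensity) else "compute"
    print(f"{intensity:6.1f} FLOP/byte -> {rate / PEAK_FLOPS:6.1%} of peak ({bound}-bound)")
```

On such a node, any kernel below the ridge point of peak divided by bandwidth is limited by data movement, so adding FLOPS alone would not speed it up.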


Current Timeline

2013 – 2017 SKA Pre-Construction

2018 – 2022 SKA Construction

2020 Start Early Science

2023 Start Full Operations


Conclusions

SKA is a huge computational challenge

RSCs in the process of being defined

SDP: ~50 PFLOPS (sustained), ~5 MW

Power is also a major driver.

Software complexity is also beyond what has been achieved in astronomy previously.

Traditional HPC is not a good match because the problem is bandwidth dominated.

SKA would be a perfect Use Case as a Big Data application for the EsD projects


Questions?
