

SDP Architecture > Module views

SDP System Module Decomposition and Dependency view

Contributors: P. Alexander, V. Allan, U. Badenhorst, C. Broekema, T. Cornwell, S. Gounden, F. Graser, K. Kirkham, B. Mort, R. Nijboer, B. Nikolic, R. Simmonds, J. Taylor, A. Wicenec, P. Wortmann

TABLE OF CONTENTS

1 Primary Representation
2 Element Catalogue
2.1 Elements and Their Properties
2.1.1 Execution Control
2.1.1.1 Master Controller
2.1.1.2 Processing Control
2.1.1.3 TANGO Control
2.1.1.4 Monitoring
2.1.2 SDP Services
2.1.2.1 Delivery
2.1.2.2 Long Term Storage
2.1.2.3 Model Databases
2.1.2.4 Buffer Management
2.1.3 Platform Services
2.1.3.1 Operations Interface
2.1.3.2 Configuration & Orchestration
2.1.3.3 Platform Software
2.1.3.4 SDP Dependencies
2.1.4 Science Pipeline Workflows
2.1.5 Quality Assessment
2.1.6 Workflow Libraries
2.1.7 Execution Frameworks
2.1.7.1 Execution Framework Implementation
2.1.7.2 Processing Wrappers
2.1.8 Processing Functions
2.1.8.1 Processing Components
2.1.8.2 Receive
2.1.9 Data Models
2.1.9.1 Buffer Data Models
2.1.9.2 Memory Data Models
2.1.10 System Interfaces
2.1.11 Platform Interfaces
2.2 Relations and Their Properties

Document No: SKA-TEL-SDP-0000013 Unrestricted

Revision: 06 Author: P. Wortmann et al. Release Date: 2018-10-31 Page 1 of 87


2.2.1 Execution Control Relations
2.2.1.1 Uses Platform Services
2.2.1.2 Uses SDP Services
2.2.1.3 Uses Workflow Libraries
2.2.2 SDP Services Relations
2.2.2.1 Uses Platform Services
2.2.2.2 Uses Data Models
2.2.3 Science Pipeline Workflows Relations
2.2.3.1 Uses Workflow Libraries
2.2.3.2 Uses Quality Assessment
2.2.3.3 Uses Execution Framework
2.2.4 Quality Assessment Relations
2.2.5 Execution Framework
2.2.5.1 Uses Platform Services
2.2.5.2 Uses Core Processing
2.2.5.3 Uses Data Models
2.3 Element Interfaces
2.4 Element Behaviour
3 Context Diagram
4 Variability Guide
5 Rationale
5.1 Experience
5.1.1 Existing Architectures
5.1.2 Prototyping
5.2 Constructability, Modifiability and Maintainability
5.2.1 Service/Processing Structure
5.2.2 Processing Layers
5.2.3 Context
5.3 Scalability
5.4 Performance
5.5 Reliability
5.6 Portability
6 Related Views
7 Reference Documents
8 Processing Components Modules
8.1 Primary Representation
8.2 Element Catalogue
8.2.1 Elements and Their Properties
8.2.1.1 Processing Components


8.2.1.1.1 Processing Component Interface
8.2.1.1.2 Calibration
8.2.1.1.2.1 Solution
8.2.1.1.2.2 Operations
8.2.1.1.2.3 Instrumental
8.2.1.1.2.4 Iterators
8.2.1.1.2.5 Ionospheric Monitoring
8.2.1.1.3 Visibility
8.2.1.1.3.1 Operations
8.2.1.1.3.2 Phase Rotation
8.2.1.1.3.3 Flagging
8.2.1.1.3.4 Sky transforms
8.2.1.1.3.5 Coalescence
8.2.1.1.3.6 Scatter/gather/iteration of visibilities
8.2.1.1.4 Images
8.2.1.1.4.1 Operations
8.2.1.1.4.2 Fast Fourier Transforms
8.2.1.1.4.3 Reprojection
8.2.1.1.4.4 Deconvolution
8.2.1.1.4.5 Spectral processing
8.2.1.1.4.6 Polarisation processing
8.2.1.1.4.7 Scatter/gather/iteration of images
8.2.1.1.5 GriddedData
8.2.1.1.5.1 Operations
8.2.1.1.5.2 Gridding / De-Gridding, Kernels, and Convolution Function
8.2.1.1.6 Imaging
8.2.1.1.6.1 Imaging Base
8.2.1.1.6.2 Weighting and tapering
8.2.1.1.6.3 Primary beams
8.2.1.1.6.4 Imaging for a timeslice
8.2.1.1.6.5 Imaging for a w slice
8.2.1.1.7 Science Data Model
8.2.1.1.8 Simulation
8.2.1.1.9 Sky
8.2.1.1.9.1 Operations
8.2.1.1.9.2 Finding sky components
8.2.1.1.9.3 Fitting sky components
8.2.1.1.9.4 Insertion
8.2.1.1.9.5 Skymodel
8.2.1.1.10 Non-Imaging
8.2.1.1.10.1 Pulsar Search
8.2.1.1.10.2 Pulsar Timing


8.2.1.2 Processing Libraries
8.2.1.3 Processing Wrappers
8.2.1.3.1 Processing Component Wrapper
8.2.1.3.2 Data Redistribution
8.2.1.3.3 Realtime & Queue I/O
8.2.1.3.4 Buffer I/O
8.2.1.4 Memory Data Models
8.2.1.5 Buffer Data Models
8.2.2 Relations and Their Properties
8.2.3 Element Interfaces
8.2.4 Element Behaviour
8.3 Context Diagram
8.4 Variability Guide
8.5 Rationale
8.5.1 Modifiability
8.5.2 Maintainability
8.5.3 Scalability
8.5.4 Portability
8.6 Related Views
8.7 References
9 Delivery Modules
9.1 Primary Representation
9.2 Element Catalogue
9.2.1 Elements and Their Properties
9.2.1.1 Primary Modules
9.2.1.1.1 Web Interface
9.2.1.1.2 Publish Products
9.2.1.1.3 Transfer Scheduler
9.2.1.2 Intermediate Modules
9.2.1.2.1 Transfer and Subscription DB Access
9.2.1.2.2 Location DB Access
9.2.1.2.3 Catalogue DB Access
9.2.1.2.4 Storage Access Service
9.2.1.2.5 WAN Gateway Configuration
9.2.1.3 IVOA Modules
9.2.1.3.1 SSA, SIA, DataLink Services
9.2.1.3.2 TAP Service
9.2.1.4 Base Modules
9.2.1.4.1 Database Implementation
9.2.1.4.2 HTTP Engine
9.2.1.4.3 Transfer Endpoint


9.2.1.4.4 HTTP Filter
9.2.1.4.5 Network Health Monitor
9.3 Context Diagram
9.4 Variability Guide
9.5 Rationale
9.6 Related Views
9.7 Reference Documents
10 Execution Control Modules
10.1 Primary Representation
10.2 Element Catalogue
10.2.1 Elements and Their Properties
10.2.1.1 Tango Control
10.2.1.2 Master Controller
10.2.1.3 Processing Controller
10.2.1.4 Monitoring
10.2.2 Relations and Their Properties
10.2.3 Element Interfaces
10.2.4 Element Behaviour
10.3 Context Diagram
10.4 Variability Guide
10.5 Rationale
10.6 Related Views
10.7 Reference Documents
11 Platform Modules
11.1 Primary Representation
11.2 Element Catalogue
11.2.1 Elements and Their Properties
11.2.1.1 Configuration Management
11.2.1.2 Configuration and Orchestration
11.2.1.2.1 Operations Management Interface
11.2.1.2.2 Configuration
11.2.1.2.2.1 Platform Configuration
11.2.1.2.2.2 Operational System Configuration
11.2.1.2.3 Platform Configuration Interface
11.2.1.3 Operations Interface
11.2.1.4 System Interfaces
11.2.1.4.1 Service Connection Interface
11.2.1.4.2 Operating System Interface
11.2.1.4.3 Logging and Metrics Input Interface
11.2.1.5 Platform Interfaces
11.2.1.6 Platform Software


11.2.1.7 SDP Dependencies
11.2.2 Relations and Their Properties
11.2.3 Element Interfaces
11.2.4 Element Behaviour
11.3 Context Diagram
11.4 Variability Guide
11.4.1 Services vs Science Pipeline Workflows
11.4.2 Running other Components on the Platform
11.4.3 Isolating SDP Operational System from Platform Services
11.4.3.1 Service Connection Interface
11.4.3.2 Platform Configuration Interface
11.4.3.3 Logging and Metrics Input Interface
11.4.4 Isolating different layers of Platform Services
11.4.5 Science Processing Centre and Science Regional Centre
11.5 Rationale
11.5.1 Existing Architectures
11.5.1.1 Software Defined Infrastructure
11.5.1.2 Container Orchestration
11.5.1.3 Baremetal Cloud
11.5.1.4 Operations as a Service
11.5.2 Prototyping
11.5.3 Requirements
11.6 Related Views
11.7 References
11.7.1 Applicable Documents
11.7.2 Reference Documents
12 Science Pipeline Workflows Modules
12.1 Primary Representation
12.2 Element Catalogue
12.2.1 Elements and Their Properties
12.2.1.1 Processing Controller
12.2.1.2 Science Pipeline Workflows
12.2.1.3 Control & Configuration Scripts
12.2.1.4 Execution Engine Programs
12.2.1.5 Workflow Control Libraries
12.2.1.6 Quality Assessment
12.2.1.7 Workflow Control Interface
12.2.1.8 Workflow Service Interfaces
12.2.1.9 Execution Framework Interface
12.2.2 Relations and Their Properties
12.2.3 Element Interfaces


12.2.4 Element Behaviour
12.3 Context Diagram
12.4 Variability Guide
12.5 Rationale
12.5.1 Modifiability and Maintainability
12.5.2 Testability
12.5.3 Robustness
12.6 Related Views
12.7 Reference Documents
13 Applicable Documents

© Copyright 2018 University of Cambridge

This work is licensed under a Creative Commons Attribution 4.0 International License.


LIST OF ABBREVIATIONS

AAAI Authorization, Access, Authentication and Identification
API Application Programming Interface
AZ Availability Zone
C&C Component and Connector
CPU Central Processing Unit
CSP Central Signal Processor
FPGA Field Programmable Gate Array
GPU Graphics Processing Unit
HTTP Hypertext Transfer Protocol
IPMI Intelligent Platform Management Interface
LAN Local Area Network
NVME Non-Volatile Memory Express
P3 Performance Prototype Platform
PFS Parallel File System
REST Representational State Transfer
SAFe Scaled Agile Framework
SAML Security Assertion Markup Language
SATA Serial ATA
SDP Science Data Processor
SKA Square Kilometre Array
SSD Solid State Disk
SSH Secure Shell
VLAN Virtual LAN


1 Primary Representation

Figure 1: Primary representation of SDP modules at system level

Elements of this view are modules, which are units of software. “Allowed-to-use” relationships are shown as arrows annotated with “use”; “is-part-of” relations between modules are shown by nesting; “implements” relations indicate that a module realises an interface. More detailed explanations of the elements and relations are given in the Element Catalogue.

The module view is split into three layers. Execution Control at the top is responsible for providing the TANGO Control interfaces and for Monitoring the more loosely coupled services and processing code at lower levels. Control is split between the Master Controller, which is responsible for starting and maintaining services, and the Processing Controller, which similarly schedules the execution of Science Pipeline Workflows.

Science Pipeline Workflows (or workflows for short) define the concrete processing stages to be performed in order to implement the processing associated with observations. As this is one of the modules expected to evolve the fastest, it is isolated from the rest of the architecture by the Workflow Libraries, which provide a simple and robust way to access the SDP capabilities relevant to workflow execution.

The middle layer is split into two pillars, with service modules on the left and processing modules on the right side:

● Service modules are split again vertically into two layers. SDP Services provide SDP-specific functionality such as Delivery and Model Databases. Platform Services offer more general-purpose cluster management services such as orchestration and configuration management, handling lower-level concerns such as deploying Platform Software and satisfying SDP Dependencies by configuring the underlying hardware. In fact, this is the only module that should strongly depend on the hardware platform the SDP will be deployed on, here indicated as Platform Interfaces.

● For the processing modules, considerable effort has been put into decoupling modules vertically: Science Pipeline Workflows use Workflow Libraries to orchestrate SDP Services and Execution Frameworks in order to run distributed processing. Processing Wrappers encapsulate the Processing Component and Receive modules from the next layer, which perform the actual processing.

● Processing Components and Receive are only allowed to depend on Data Models and System Interfaces. They are meant to be implemented as purely functional components with minimal implicit dependencies on other parts of the system. This ensures modifiability, testability and encapsulation of performance concerns.

● Data Models define data formats and provide their implementations, and will be used throughout the architecture. They are further split into Buffer and Memory Data Models to reflect the difference in performance guarantees.

The only interfaces available to all modules are the System Interfaces, which offer general-purpose interfaces to the Operating System, Storage, as well as Logging and Accelerator facilities. As mentioned above, Platform Interfaces are outside interfaces specific to the Platform.
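The “allowed-to-use” rules lend themselves to automated checking during development. The following sketch encodes a simplified subset of the relations shown in Figure 1; the rule table is illustrative, not the authoritative list from this view:

```python
# Illustrative dependency-rule check for "allowed-to-use" relations.
# The rule table below is a simplified subset of Figure 1, not the
# authoritative list from the architecture.
ALLOWED_TO_USE = {
    "Execution Control": {"Platform Services", "SDP Services", "Workflow Libraries"},
    "SDP Services": {"Platform Services", "Data Models"},
    "Science Pipeline Workflows": {"Workflow Libraries", "Quality Assessment",
                                   "Execution Frameworks"},
    "Execution Frameworks": {"Platform Services", "Processing Functions",
                             "Data Models"},
    "Processing Functions": {"Data Models", "System Interfaces"},
}

def violations(observed_deps):
    """Return (user, used) pairs not covered by the allowed-to-use rules."""
    return [(user, used)
            for user, used in observed_deps
            if used not in ALLOWED_TO_USE.get(user, set())]

# A direct dependency from processing code onto Execution Control
# would be flagged:
deps = [("Processing Functions", "Data Models"),
        ("Processing Functions", "Execution Control")]
print(violations(deps))  # [('Processing Functions', 'Execution Control')]
```

Such a check could run in continuous integration to keep the implemented module structure from drifting away from the documented one.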

2 Element Catalogue

This section is organised as a dictionary where each entry is an element of the Primary Representation.

2.1 Elements and Their Properties

This section explains the elements of the Primary Representation, identifies particular properties and highlights their functionality. As this view is about modularisation, we especially attempt to identify where we can encapsulate certain Expertise in order to ensure buildability. This might also open up certain Implementation options, such as leveraging off-the-shelf solutions to improve affordability.


2.1.1 Execution Control

Groups all modules related to the control and monitoring of the SDP. This especially handles both resource allocation and process (services / execution framework) orchestration at the highest level. This module is described in more detail in the Execution Control Module View.

2.1.1.1 Master Controller

Tracks the status of the SDP, especially of all services including Processing Control. Handles top-level control, such as shutting down individual services or the SDP in its entirety.

Expertise: SDP configuration management

Implementation: In-house

2.1.1.2 Processing Control

Manages Processing Block execution according to resource availability. Starts and coordinates Processing Block Controllers to execute Science Pipeline Workflows according to control commands.

The main function implemented by Processing Control is scheduling buffer and compute resources for Batch Processing Blocks, see the Execution Control Data Model. This will have to take a long-term view of the system in order to spot possible resource bottlenecks and guarantee safe workflow execution with minimal operator supervision.

Expertise: Compute/storage resource scheduling

Implementation: In-house, possibly utilising off-the-shelf scheduling and resource allocation solutions
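In its simplest form, the scheduling problem described above amounts to a capacity check before admitting each Processing Block. The sketch below uses a greedy strategy; all field names and capacities are hypothetical, not taken from the Execution Control Data Model:

```python
# Illustrative greedy scheduler for Batch Processing Blocks: a block is
# started only if enough buffer and compute capacity remains. Field
# names and units are hypothetical, not from the Execution Control
# Data Model.
def schedule(blocks, compute_free, buffer_free):
    started, deferred = [], []
    # Prefer blocks with the earliest deadline (a crude "long-term view").
    for block in sorted(blocks, key=lambda b: b["deadline"]):
        if block["compute"] <= compute_free and block["buffer"] <= buffer_free:
            compute_free -= block["compute"]
            buffer_free -= block["buffer"]
            started.append(block["id"])
        else:
            deferred.append(block["id"])  # retried when resources are released
    return started, deferred

blocks = [
    {"id": "pb-001", "compute": 40, "buffer": 100, "deadline": 2},
    {"id": "pb-002", "compute": 70, "buffer": 300, "deadline": 1},
    {"id": "pb-003", "compute": 50, "buffer": 200, "deadline": 3},
]
print(schedule(blocks, compute_free=100, buffer_free=450))
# (['pb-002'], ['pb-001', 'pb-003'])
```

A production scheduler would additionally model resource release over time to spot future bottlenecks, as the text above requires.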

2.1.1.3 TANGO Control

Implements TANGO devices to allow control and monitoring of the SDP. This means implementation of TANGO device servers offering information about the current state of the SDP system, which might include Quality Assessment and Telescope State information obtained internally from Data Queues.

Furthermore, it communicates commands and attribute updates (where allowed by the TANGO control interface) back to the rest of the system. This interface will be used by the Telescope Manager sub-system in order to control the SDP and coordinate it with the rest of the telescope. This module is meant to entirely encapsulate the control interface in such a way that it can easily be replaced to eliminate the TANGO dependency for control, for example in the case of SRC deployments.

Expertise: Telescope operation

Implementation: In-house

2.1.1.4 Monitoring

Collects information about global operational status and service health using information provided by Platform Services and SDP Services. This especially involves following logging and other system health information and aggregating information relevant to telescope management. This will include information about SDP load and capacity metrics that will feed into high-level observation planning.

Note that collection of health and logging information is mainly implemented in Platform Services, which provide their own independent monitoring and reporting using the Platform's Operations Interface. In contrast, this module specifically implements SDP-specific filtering and aggregation for the purpose of TM's monitoring interfaces (provided via TANGO).

Expertise: Telescope operation

Implementation: In-house

2.1.2 SDP Services

Groups modules that provide domain-specific SDP services to support processing. As this is a data-driven architecture, this especially concerns maintaining data items around the execution of Science Pipeline Workflows: Buffer Management maintains the primary data exchange of the SDP architecture; Model Databases, Delivery, as well as Long Term Storage manage long-term data stores which serve and/or are updated by processing.

2.1.2.1 Delivery

Delivers Data Products to the Observatory or SKA Regional Centres. Maintains the Science Data Product Catalogue using data from Scheduling Blocks and the Science Data Model produced by Model Databases.

The implemented services are Data Product discovery (both to the Observatory and SRCs), locally providing IVOA access to Data Products, and providing data transfer interfaces to SKA Regional Centres. For more details see the Delivery Module View.

Expertise: Data distribution, IVOA services, Science Data Model

Implementation: In-house integration of open source components

2.1.2.2 Long Term Storage

Provides long-term storage for Data Products. This is the software interface to an off-the-shelf HSM storage appliance.

Expertise: Data preservation

Implementation: Off-the-shelf with custom control

2.1.2.3 Model Databases

Produces the Science Data Model using the Global Sky Model database as well as Telescope State and Configuration data obtained using TANGO from other SKA sub-systems. Handles queries and updates to the Sky Model database.

Expertise: Science Data Model

Implementation: In-house, likely utilising off-the-shelf databases internally


2.1.2.4 Buffer Management

Arranges storage of input data for workflow execution as well as of Data Products. Manages the lifecycle of storage instances across the Buffer and Long Term Storage. Implements the Data Island Interface for applications to access storage.

Expertise: Data lifecycle management

Implementation: Thin layer of management services on top of platform software.
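The lifecycle management described above can be pictured as a small state machine per storage instance. The states and transitions below are an assumption for illustration only; the real lifecycle is defined by the Buffer design:

```python
# Toy lifecycle state machine for a storage instance managed by Buffer
# Management. States and transitions are assumed for illustration; the
# actual lifecycle is defined elsewhere in the SDP architecture.
TRANSITIONS = {
    ("requested", "provision"): "ready",
    ("ready", "attach"): "in-use",       # attached to a Data Island
    ("in-use", "detach"): "ready",
    ("ready", "archive"): "long-term",   # handed over to Long Term Storage
    ("ready", "release"): "deleted",
}

def step(state, action):
    """Apply one lifecycle action, rejecting illegal transitions."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"illegal transition: {action!r} in state {state!r}")

state = "requested"
for action in ["provision", "attach", "detach", "archive"]:
    state = step(state, action)
print(state)  # long-term
```

Rejecting illegal transitions centrally is one way such a thin management layer can guarantee consistency between the Buffer and Long Term Storage.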

2.1.3 Platform Services

Groups modules that provide a non-domain-specific, cloud-like high-performance computing platform environment. The platform should especially provide high-level facilities to track available resources and deploy new software instances on them. Software components should be as agnostic as possible of the concrete hardware set-up as well as of the other software instances running on the cluster.

It is expected that the Platform will be implemented mostly using off-the-shelf modules plus (possibly substantial) configuration and scripting. This module is decomposed further in the Platform Services Module View and implements the components documented in the Platform Services C&C View.

2.1.3.1 Operations Interface

Provides (user) interfaces for operators to the internal functionality of the platform, including internal monitoring and control as well as infrastructure and inventory management. In implementing a cloud-like environment, this interface will be used for initial start-up of the SDP, but will otherwise have minimal direct interaction with the Operational System.

Expertise: Cloud-style cluster operations

Implementation: Customised off-the-shelf

2.1.3.2 Configuration & Orchestration

Platform interface to the rest of the SDP. Provides the capability for configuration management of the Platform, i.e. tracking and using platform resources to provide processing and storage components. This especially includes keeping track of platform resource states, as it has to provide a consistent view of platform load and capacity in the presence of internal and external restrictions (e.g. low-power mode, see the Platform Services Module View). Furthermore, the Platform should support checking the state of deployed components, retrieving generated logs and keeping track of generated software and hardware metrics.

The Platform is also responsible for providing the means for communication with the deployed software. This might involve deployment of intermediate infrastructure, such as adjusting network configuration or deploying dedicated communication middleware. While all of these services will draw heavily on off-the-shelf software, providing such a wide array of services is expected to require a substantial amount of orchestration and coordination on the Platform's side. This should take the form of a well-maintained library of configuration scripts, interpreted by a Configuration Management system internal to the Platform; see again the Platform Services Module View.


Expertise: Platform Configuration Management

Implementation: Orchestration configuration for SDP infrastructure

2.1.3.3 Platform Software

Off-the-shelf software deployed by Configuration & Orchestration to help provide the required services to the rest of the architecture. While this software will not be used directly by components external to the platform, the complexity of operating a cloud-like environment at scale will require a significant internal infrastructure.

Platform software might especially include low-level storage and compute provisioning, software deployment facilities (some combination of bare-metal, virtualised or containerised) as well as internal logging and health tracking.

Expertise: Cloud or Distributed Computing, Data Centre Administration

Implementation: Mostly off-the-shelf - OpenStack (or customised) services, hosted inside OpenStack-managed infrastructure, or alternatives

2.1.3.4 SDP Dependencies

Like Platform Software, these dependencies will be deployed and configured by the Platform, but serve to provide visible infrastructure to the rest of the system. They will be configured to enable interaction with both the Operational System and Platform Services as required.

An important example here is the Configuration Database (see the Execution Control C&C View), which will be used for storing both Operational System and Platform configuration parameters. As illustrated in the Platform Services C&C View, this will be the main interface to the functionality of the Configuration & Orchestration module: it can be used to request deployments and perform service discovery and monitoring. It should be implemented as a reliable data store, and especially provide a scalable notification mechanism (possibly as a separate queue infrastructure) that allows components to detect and react to configuration changes. See the Operational System C&C View for a discussion of the provided “coordination” interface.

Furthermore, Data Queues will be used throughout the architecture as a mechanism to stream data in real-time between different components. The platform is responsible for deploying this infrastructure in a scalable fashion. See again the Operational System C&C View for information about the provided “queue” interfaces.

Finally, it is expected that further “standard” infrastructure modules such as databases will be provided. Parts of the storage infrastructure might also fall into this category, even though it is likely that providing storage access would be handled at the time of deployment, and therefore effectively be mediated by configuration scripts.

Expertise: COTS

Implementation: Off-the-shelf
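The notification mechanism expected of the Configuration Database can be pictured as a watchable key-value store. The API below is invented for illustration; a production deployment would likely build on an off-the-shelf store with comparable watch semantics:

```python
# Minimal watchable key-value store, illustrating the notification
# mechanism the Configuration Database should offer. The API is invented
# for illustration; real candidate systems differ in detail.
from collections import defaultdict

class ConfigDB:
    def __init__(self):
        self._data = {}
        self._watchers = defaultdict(list)

    def watch(self, key, callback):
        """Invoke callback(key, value) whenever key changes value."""
        self._watchers[key].append(callback)

    def put(self, key, value):
        changed = self._data.get(key) != value
        self._data[key] = value
        if changed:
            for cb in self._watchers[key]:
                cb(key, value)

seen = []
db = ConfigDB()
db.watch("pb-001/state", lambda k, v: seen.append(v))
db.put("pb-001/state", "deploying")
db.put("pb-001/state", "deploying")   # unchanged value: no notification
db.put("pb-001/state", "running")
print(seen)  # ['deploying', 'running']
```

The watch callbacks are how components "detect and react to configuration changes" without polling, which is what makes the mechanism scalable.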


2.1.4 Science Pipeline Workflows

Workflows are scripts describing the steps that must be taken by the SDP to execute a Processing Block. This means that the workflow will specify:

1. the resources requested for execution,
2. the storage infrastructure (Data Islands, Data Queues) required, and most importantly
3. the concrete Workflow Stages to be performed by SDP Services and Processing.

Once resources are made available, workflows are responsible for executing the stages in a fashion appropriate to resource constraints, dependencies both on external entities and other stages, as well as commands from the TANGO control interface (for real-time processing).

Workflow stages might include, in no particular order:

● Configuring the Buffer to provide required Data Islands
● Instantiating Data Queues to capture dynamic data
● Initialising Quality Assessment to aggregate appropriate metrics
● Using Model Databases to build a Science Data Model
● Instantiating Execution Engines to perform processing
● Updating the Science Data Product Catalogue in Delivery

This module is described in more detail in the Workflow Module View. See the Execution Control Data Model for how Processing Blocks and especially Workflow Stages will be represented in the SDP configuration. How workflow scripts and Execution Engine Programs are deployed to implement a workflow is illustrated in the Science Pipeline Workflow Script View. The behaviour section in the Operational System C&C View outlines how Processing Blocks are executed in general, and the Science Pipeline Workflow View describes concrete workflow types we might want to implement.

Expertise: Radio astronomy workflows

Implementation: In-house
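A workflow script might trace the stage structure described above roughly as follows. The facade and all method names are placeholders standing in for Workflow Library calls, not the real SDP interfaces:

```python
# Pseudo-workflow tracing the stages listed above. `SDP` is a toy
# stand-in for the Workflow Libraries facade; none of these method
# names are real SDP interfaces.
class SDP:
    def __init__(self):
        self.trace = []
    def __getattr__(self, name):               # record every call made
        return lambda *a, **kw: self.trace.append(name) or name

def run_workflow(sdp, pb_id):
    sdp.configure_data_islands(pb_id)          # Buffer: provide Data Islands
    sdp.create_data_queues(pb_id)              # capture dynamic data
    sdp.start_quality_assessment(pb_id)        # aggregate metrics
    sdp.build_science_data_model(pb_id)        # via Model Databases
    sdp.run_execution_engines(pb_id)           # the actual processing
    sdp.update_product_catalogue(pb_id)        # register with Delivery
    return sdp.trace

print(run_workflow(SDP(), "pb-001"))
```

A real workflow would of course interleave these stages with resource availability, stage dependencies and TANGO control commands rather than run them strictly in sequence.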

2.1.5 Quality Assessment

Workflows will emit information allowing early assessment of the quality of scientific data products. To this end, metrics will be gathered from Processing Components, emitted using Data Queues, possibly aggregated to make them more useful to users, and then pushed out using the TANGO Control interface in Execution Control. This type of data path will be shared by many workflows, therefore the Quality Assessment module provides templates to simplify this task.

As illustrated in the Processing C&C View, at runtime Quality Assessment will be managed very similarly to other Batch and Real-Time Processing.

Expertise: Workflows, quality assessment

Implementation: In-house

2.1.6 Workflow Libraries

Provides an interface to manage the interaction between Science Pipeline Workflows and the rest of

the architecture. This especially covers a way for the workflow to instantiate the Execution


Framework for a Science Pipeline Workflow using provisioned resources - such as compute and

storage, but also services such as data queues. It should allow workflow code to monitor and

influence processing, mediated via the Configuration Database or Data Queues. For more detail see

the Workflow Module View.

Expertise: Distributed Computing, HPC system support.

Implementation: In-house

2.1.7 Execution Frameworks

Used for execution of Science Pipeline Workflow stages. The SDP will support multiple

independently developed and maintained Execution Frameworks, even within the same workflow.

Execution Frameworks will be used by Execution Engine Programs from Science Pipeline Workflows

(see Workflow Module View) to instantiate Execution Engines, see Processing C&C View.

2.1.7.1 Execution Framework Implementation

Provides core functionality of an Execution Framework: handles fine-grained scheduling of assigned

compute resources, organises data movement and invokes Processing Component execution

through Processing Wrappers.

Expertise: Data flow implementation

Implementation: Either off-the-shelf or in-house

2.1.7.2 Processing Wrappers

Wraps Processing Components, Receive and SDP service interfaces for Science Pipeline Workflows

and the Execution Framework Implementation. Should allow the Execution Framework

Implementation to instantiate and execute processing graphs in a distributed way without

introducing strong coupling to the actual Processing Component implementations.

Implementing this might involve lightweight, data-model-specific transformations such as splitting or merging data, or caching of Science Data Model data, for example. The Processing Wrappers are also

expected to take care of conversion between different Data Models, such as Memory Data Models

and Buffer Data Models. Especially note that this means that processing storage I/O is expected to

be implemented by the Wrappers in coordination with the Execution Framework, not by Processing

Components.

Expertise: Data flow kernel integration

Implementation: In-house

This module is further decomposed in the Processing Component Module View.
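The responsibilities described above can be sketched as follows: the wrapper converts a Buffer Data Model record into the Memory Data Model a Processing Component expects, invokes the component, and converts the result back. All type and function names are hypothetical illustrations, not actual SDP definitions:

```python
# Hedged sketch of a Processing Wrapper. Storage I/O and data-model
# conversion live in the wrapper, not in the Processing Component.
from dataclasses import dataclass

@dataclass
class BufferVisibilities:       # on-disk representation (versioned)
    version: str
    rows: list                  # e.g. list of (u, v, w, value) tuples

@dataclass
class MemoryVisibilities:       # in-memory representation for processing
    values: list

def to_memory(buf):
    return MemoryVisibilities(values=[row[-1] for row in buf.rows])

def to_buffer(mem, version="vis-1.0"):
    return BufferVisibilities(version=version,
                              rows=[(0, 0, 0, v) for v in mem.values])

def wrap(component, buf_in):
    """Wrapper: convert in, run the component, convert back out."""
    mem_out = component(to_memory(buf_in))
    return to_buffer(mem_out, version=buf_in.version)

# A referentially transparent component: output depends only on input.
def scale_component(mem):
    return MemoryVisibilities(values=[2 * v for v in mem.values])

result = wrap(scale_component, BufferVisibilities("vis-1.0", [(0, 0, 0, 1.5)]))
print(result.rows)  # [(0, 0, 0, 3.0)]
```

Because the component only ever sees `MemoryVisibilities`, it stays decoupled from the on-disk format and from whichever Execution Framework calls the wrapper.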

2.1.8 Processing Functions

Libraries implementing core SDP-specific data processing. The main radio-astronomy and

interferometry components are implemented as Processing Components. Just-in-time handling of

incoming data from CSP and LFAA is handled by Receive libraries.

2.1.8.1 Processing Components

Library of domain-dependent radio astronomy and interferometry algorithm implementations


consuming and producing data according to Memory Data Models.

Should implement a standardised Processing Component interface to make them as agnostic as possible of how they are used in Science Pipeline Workflows or which Execution Framework is calling them.

For reproducibility, processing components should be referentially transparent, which is to say that

output data should be entirely determined by input data. This requirement excludes issues such as floating-point rounding errors, which are hard to control for due to compilation differences and non-deterministic parallel execution.

Expertise: Radio astronomy algorithms

Implementation: In-house, some third party

This module is further decomposed in the Processing Component Module View.

2.1.8.2 Receive

Handles incoming data from CSP and LFAA (for SKA1-Low transient buffer data). The received data

will be both written into the Buffer for Batch Processing as well as handed over directly to Real-time

Processing Pipelines (such as fast imaging or real-time calibration). This data hand-over will require

deploying Receive using the same workflow and likely the same execution framework as real-time

processing.

Expertise: Networking, radio astronomy algorithms

Implementation: In-house

2.1.9 Data Models

Groups the definitions and implementations of data representations, data formats, and appropriate

utility code for SDP data. It includes support for versioning and conversion between data formats

and versions. It is composed of Buffer Data Models (formats for e.g. visibility, pulsar candidates,

pulsar timing, and transient data in the Buffer) and Memory Data Models (in-memory data

representations for processing components and queues).

This module will also include any utility libraries used for global communication across the SDP

system, such as Data Queues and the Configuration Database.

2.1.9.1 Buffer Data Models

Definitions of data representations of data inputs and products in the Buffer, as well as utility code for reading, writing, and interpreting them. This may include intermediate Data Products not visible

outside pipelines. Note that this especially includes the in-buffer representation of the Science Data Model, for which this module should provide an interface similar to database queries in terms of flexibility and scalability.

Note that support for legacy Processing Components will likely require support for a large number of

Buffer data models, even for the same type of data (e.g. measurement sets). Therefore physical data

models should be uniquely identified and versioned. Conversion to and from different Buffer Data

and Memory Data Models should be possible.
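One way to realise uniquely identified, versioned models with explicit conversions is a small converter registry; the model identifiers (`"ms-v2"`, `"sdp-vis-1.0"`) and dictionary representation below are purely hypothetical:

```python
# Sketch of versioned physical data models with a conversion registry,
# as the text suggests; identifiers and formats are illustrative only.
CONVERTERS = {}

def register(src, dst):
    """Decorator registering a converter between two versioned models."""
    def deco(fn):
        CONVERTERS[(src, dst)] = fn
        return fn
    return deco

@register("ms-v2", "sdp-vis-1.0")
def ms2_to_sdpvis(data):
    # Re-label the payload under the target model identifier.
    return {"model": "sdp-vis-1.0", "payload": data["payload"]}

def convert(data, dst):
    key = (data["model"], dst)
    if key not in CONVERTERS:
        raise KeyError(f"no converter {key[0]} -> {key[1]}")
    return CONVERTERS[key](data)

legacy = {"model": "ms-v2", "payload": [1, 2, 3]}
print(convert(legacy, "sdp-vis-1.0")["model"])  # sdp-vis-1.0
```

Failing fast on a missing converter is what makes compatibility between legacy components and current Buffer Data Models checkable rather than implicit.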


Expertise: Radio astronomy data models

Implementation: In-house, possibly involving legacy libraries

This module is further decomposed in the Processing Component Module View.

2.1.9.2 Memory Data Models

Data representations used by Processing Components and Data Queues. Meant to be used as a

high-speed way to interface processing components and pipelines with each other. Might also

include utility code as appropriate.

Memory Data Models will be SDP-specific, but given that they will be used across many Processing

Components and SDP services, modifiability should be kept in mind. Extensible binary formats

should be used where possible, and as with Buffer Data Models, memory data formats should be

identified and versioned to make it possible to check Processing Components for compatibility.

Expertise: Radio astronomy algorithms and data models

Implementation: In-house

This module is further decomposed in the Processing Component Module View.

2.1.10 System Interfaces

Common interfaces in use by all SDP modules. This includes typical Operating System Interfaces,

such as file system access and other APIs typically provided to UNIX applications. For the

purpose of the SDP, this will also include access to middleware such as Platform’s Logging and

Metrics as well as Configuration Database and Data Queues of the Operational System (see

Operational System C&C view).

Implementation of the System Interface is going to be managed by Platform Services. As changes to

these modules might impact the entire system, this should be restricted to well-established standard

interfaces.

Expertise: Platform Services

Implementation: Configured off-the-shelf

2.1.11 Platform Interfaces

Interfaces used by the Platform to access hardware and SDP-external platform resources. For

example this could provide access to an existing container infrastructure for SKA regional centre

deployments. This allows the Platform to be independent of the level of existing infrastructure.

Expertise: Platform Services

Implementation: Externally provided

2.2 Relations and Their Properties

The primary presentation uses two types of relations: the “allowed-to-use” relationship and module decomposition.


A “use” relationship implies that the implementation of a module closely depends on the

implementation of another module. The “Views & Beyond” definition is that a module A “uses”

module B if it cannot be implemented correctly without B also being correct. This means that

communication between module code does not necessarily imply that they “use” each other.

Instead we can choose to “use” a common data model / protocol module that decouples them from

the concrete implementation of the other. Note that at this level of decomposition, “use”

relationships are not prescriptive, so we rather characterise them as “is allowed to use”

relationships.

Module decomposition is shown by nesting. Beyond module hierarchy, this implies some degree of

“allowed-to-use” relationship between contained modules even if not spelled out explicitly.

Implementation decisions for these are likely going to be related. “Allowed-to-use” relationships of

the top-level module are understood to propagate to lower-level modules.

The following sections go through the function of individual “use” relationships in the

primary representation.

2.2.1 Execution Control Relations

2.2.1.1 Uses Platform Services

Steering of the platform infrastructure in relation to SDP functionality:

1. Provision compute resources for services and processing (Master Controller, Processing

Controller)

2. Manage Configuration Database data, often in order to interact with other modules of the

SDP architecture

3. Set up and tear down Data Queues for Quality Assessment and calibration data

alongside processing

4. Monitor Logging as well as Health and Metrics data

2.2.1.2 Uses SDP Services

Starting and coordinating SDP services in relation to SDP processing:

1. Initiation of SKA Data Product Catalogue updates using Delivery

2. Request generation of the Science Data Model from Model Database Services, and feeding

back updated data after processing is finished.

3. Organise Buffer preparations for processing or archiving.

2.2.1.3 Uses Workflow Libraries

Provides the interface used by Processing Control for parameterising and executing Science Pipelines. See Workflow Module View. This especially includes instantiation of Execution Frameworks on assigned resources to perform processing jobs, monitoring execution of processing, and tearing down after use.

2.2.2 SDP Services Relations

2.2.2.1 Uses Platform Services

Used for coordinating work within the platform infrastructure:

1. Used for provisioning software services (such as databases).

2. Access Configuration and Coordination data


2.2.2.2 Uses Data Models

Required for being able to reason about data exchanged within the SDP system

1. Buffer Data Models required by Buffer Management Services as well as Product Preparation

and Delivery in order to work with Buffer data independently of processing.

2. Interface to Data Queues to stream data between processing and SDP services such as

Quality Assessment, Model Database Services and Data Queue Services.

3. Interface to Configuration Database to obtain configuration data

2.2.3 Science Pipeline Workflows Relations

2.2.3.1 Uses Workflow Libraries

Workflows will be implemented in a fashion that abstracts away as much detail of the SDP’s architecture as possible, which serves to reduce their complexity, simplify changing the SDP architecture, and limit the potential damage from errors in workflows. Workflow Libraries provide intermediate modules to serve this purpose.

2.2.3.2 Uses Quality Assessment

Import common workflow structures used for implementing Quality Assessment

2.2.3.3 Uses Execution Framework

Science Pipeline Workflows use the Execution Frameworks and Processing Wrappers in order to

execute Execution Engine Code, which in turn executes Processing Components.

2.2.4 Quality Assessment Relations

Depends on Workflow Libraries, as it will be written similarly to Science Pipeline Workflows,

especially using Execution Frameworks to deploy Execution Engines dedicated to Quality Assessment

(e.g. metric aggregation).

2.2.5 Execution Framework

2.2.5.1 Uses Platform Services

Platform infrastructure must be transparent to Execution Frameworks to some degree, so that they can efficiently organise data movement for processing.

1. Might provision common software services such as databases and query locality information

if available

2. Access Configuration and Coordination data

3. Data Queues are used for streaming data in and out of processing at run-time

2.2.5.2 Uses Core Processing

Processing Wrappers use standard Processing Component interfaces to ensure algorithms are as agnostic as possible of how they are used in pipelines or which Execution Framework is calling them.

2.2.5.3 Uses Data Models

The Execution Framework and Processing Wrappers are responsible for organising data movement

between Processing Components, and especially Processing Components and other parts of the SDP

system. To do so, the Data Models must be understood at least to the point where they can be

transmitted and distributed (which might involve splitting and merging operations).
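The splitting and merging operations mentioned here can be sketched minimally, with the data model reduced to a plain list for illustration (the function names are assumptions, not SDP interfaces):

```python
# Sketch of the kind of splitting and merging an Execution Framework
# might apply when distributing data between Processing Components.
def split(values, n):
    """Split into at most n contiguous chunks of near-equal size."""
    chunk = (len(values) + n - 1) // n  # ceiling division
    return [values[i:i + chunk] for i in range(0, len(values), chunk)]

def merge(parts):
    """Inverse of split: reassemble the original ordering."""
    return [v for part in parts for v in part]

vis = list(range(10))
parts = split(vis, 3)
print(len(parts), merge(parts) == vis)  # 3 True
```

The key property, checked above, is that `merge` is the exact inverse of `split`, so distribution is transparent to the components consuming the chunks.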


Especially note that Processing Wrappers will be in charge of Buffer I/O, which means translating

Buffer Data Models into Memory Data Models or vice versa as required. See also the Processing

Component Module View.

2.3 Element Interfaces

Not applicable

2.4 Element Behaviour

Not applicable

3 Context Diagram

Figure 2. Context diagram showing the relation of SDP to other SKA software.

The context diagram shows a selection of SKA modules with focus on the modules developed by SDP

(in bold): The main Science Data Processor module, which is what the primary representation shows

a decomposition for, and the SDP Resource Model module. The Science Data Processor module will

be used in a number of variants of the system, most notably the Assembly, Integration and

Verification / Commissioning SDP system that is going to get deployed in the SKA construction phase.

Similarly, SKA Regional Centre deployments might use a modified version of the SDP software.

The two external modules that SDP will “use” are the “SKA TANGO” module (SKA TANGO control

system module) and “SKA Core Software”. Specifically, SDP will use the following functionality from

SKA Core Software:

● Telescope Model: Static (code) parts of Telescope Model information

● Astronomy Library: Basic astronomy/domain specific algorithms shared between telescope

elements

● SDP Resource Model: Resource model shared with the Telescope Manager to facilitate

scheduling and resource planning. As this module will require detailed knowledge of

scheduling-level performance characteristics of SDP workflows, it will likely be developed in

close coordination with the relevant SDP modules (such as Workflows, Execution Engines,

Processing Components and Data Models)


4 Variability Guide

The scientific pipelines being run will likely continue to evolve throughout SDP’s lifetime. We plan

for three levels of variability:

1. Workflows themselves can be exchanged and re-implemented freely. Scheduling should be

set up such that it can easily be configured to run new types of workflows.

2. We would also like to avoid lock-in in terms of Execution Frameworks interpreting the

workflows. The architecture should support multiple instances and types of Execution

Frameworks within SDP. This is ensured by Workflow Libraries employing an Execution

Framework Interface to interact with them (see Workflow Module View).
3. Finally, Processing Components should be built in a way that they can easily be exchanged.

Their interface should be general enough that any processing component could be used by

any Execution Framework or Science Pipeline Workflow.
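Variability level 3 amounts to a call convention shared by all components. A sketch of such an interface, using structural typing; the signature and component names are assumptions for illustration, not the actual SDP interface:

```python
# Sketch of a standardised Processing Component interface general enough
# that any component can be driven by any Execution Framework or workflow.
from typing import Protocol

class ProcessingComponent(Protocol):
    def __call__(self, data: list, **config) -> list: ...

def predict(data, **config):
    """Hypothetical component: add a model offset to each sample."""
    return [v + config.get("offset", 0.0) for v in data]

def invert(data, **config):
    """Hypothetical component: negate each sample."""
    return [-v for v in data]

def run_stage(component: ProcessingComponent, data, **config):
    """An Execution Framework needs only the shared call convention."""
    return component(data, **config)

print(run_stage(predict, [1.0, 2.0], offset=0.5))  # [1.5, 2.5]
print(run_stage(invert, [1.0, 2.0]))               # [-1.0, -2.0]
```

Because `run_stage` depends only on the convention and never on a concrete component, components can be exchanged without touching the framework.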

This workflow variability especially allows SDP to be configured for operation in both the SKA1-Low

and SKA1-Mid Telescopes using configuration changes. We do not expect that there will be other

modules dedicated to either Telescope.

5 Rationale

General points:

1. SDP Services and Processing Components are represented as the two main poles of this

architecture. The reasoning is that processing is our central function, with non-processing

modules filling a support role.

2. Workflows, Processing Components and Data Models represent the “core” of

domain-dependent functionality. This makes them the primary focus of the architecture

within processing; therefore they are kept at top-level.

3. Long Term Storage is viewed as “opaque” for the purpose of this module view. Proven

off-the-shelf technologies exist in this space, so from an architectural standpoint this module is not enough of a risk to be considered in detail.

5.1 Experience

The architecture aims to use solutions that have been proven in practice. The two main points of

comparison here are existing systems and prototypes built specifically for the SDP. In this section we

present some evidence that aspects of the presented architecture have indeed been realised before.

5.1.1 Existing Architectures

The layered structure with Platform and System interfaces at the bottom overlaid by applications

and finally application supervisors is a standard structure used in distributed computing.

Furthermore, having a service-oriented part of the architecture (SOA) with modules loosely coupled both at compile time and at run time is a fairly typical architecture, as exemplified by MeerKAT [RD14] or

ASKAP [RD15]. At the System level this is useful for ensuring proper partitioning of the system and

ensuring robustness.

Furthermore, the structuring of processing into layers is also inspired by what we perceive as best

practice in radio astronomy. The points of comparison here would be:


1. Data Models: Having a set of modules dealing with data models separate from processing

allows de-coupling of data model development. This is common to the point where many

observatories today share data models, such as Measurement Sets using casacore [RD00].

On the other hand, what is more unusual here is that the architecture suggests a separate

memory data model for high-speed data exchange. The fact that we are identifying data

objects independent of representation can be seen as a special case of DALiuGE’s data drop

mechanism [RD11].

2. Processing Functions: Gathering functions to work on these data models in a way that they

can be run in an isolated and repeatable fashion is also fairly common. Inspiration here was

CASA’s tasks [RD12] as well as more or less specialised tools such as WSClean or calibration

solvers. The main difference for the SDP architecture is that, due to scalability concerns, we will not use the paradigm of updating measurement set files in place, instead working with processing data in memory and writing out only final and intermediate data products.

3. Execution Frameworks: Using properly encapsulated Processing Functions to parallelise

processing is also in itself not a new idea. Clearly, given programs that communicate using

file systems, a distributed file system and scheduler are enough to be able to exploit a fair

amount of parallelism. It is typical for HPC to use systems like Slurm for this purpose, but it

has been shown (through SDP prototyping) that other systems such as Spark [RD13],

DALiuGE [RD11] or Swift/T [RD16] can be used in its place.

Again, the SDP perspective is that we want to push this trend even more by giving the

Execution Frameworks more responsibility for executing processing, which is thought to

translate to more opportunities for enabling both modifiability and performance. We are

especially thinking of the potential of putting the Execution Framework in charge of memory

transfers and management, as the SDP has to solve a heavy I/O problem with large working

sets.

4. Workflows: Running through top-level workflows is also common practice in radio

astronomy. Often the steps are run manually, yet their complexity has grown to the point that they form large libraries of their own (see for example ASKAP Processing Pipelines [RD17]).

As SDP will eventually have to operate without manual intervention for most data

processing - especially involving time-critical preparation steps - this type of scripting will

have to feature even more heavily here.

5.1.2 Prototyping

The SDP consortium has built a number of prototypes to test the architectural ideas presented in this

view. Most notably:

● The SDP Integration Prototype [RD00] focuses especially on testing the interactions between

Execution Control, Platform and external interfaces. It has been developed in close

coordination with the architecture.

● The Algorithm Reference Library [RD01] has demonstrated that distributed radio astronomy

algorithms using a variety of Python-based Execution Engines and Processing Components

can be implemented within the restrictions of the SDP processing architecture.

● DALiuGE [RD02] is a prototyping effort testing some specific design ideas for previous

versions of the SDP’s processing architecture. By finding success even outside the confines of SDP, it has demonstrated that a high-level approach to workflow development can work.


● SDP Execution Frameworks Prototyping [RD03] has gathered experience about combining

Execution Engines with Processing Components, providing evidence that we can interface

them despite high variability of technologies.

● Technology choices for real-time messaging infrastructure were assessed in SDP memo 52

(Apache Kafka for an SDP log-based architecture) [RD04]. It especially illustrates how Data

Queues (and possibly the Configuration Database) could be implemented using Apache

Kafka.

5.2 Constructability, Modifiability and Maintainability

Requirements: SDP_REQ-810 (Maintainability of Software), SDP_REQ-828 (Constructability)

The SDP system needs to be practical to build within the constraints of the construction schedule.

The topology of the “module uses view” is useful for deriving construction and implementation

plans, as it suggests where implementation work might have to progress in a certain order. If, for example, an Execution Engine module “uses” a Processing Components module, then the Processing Components module needs to be completed at least to the point of a minimal/mock implementation before the Execution Engine module can be finished. In an agile development context, implementing new Execution Engine features might often involve work on Processing Components.

5.2.1 Service/Processing Structure

Communication between modules is often decoupled using Platform-provided interface modules:

Buffer (file system), Data Queue and Configuration information will be provided using standard

(often off-the-shelf) infrastructure. This strongly de-couples both Execution Engines and Services,

especially from each other.

This strongly hints that a number of possible architecture subsets can be prototyped, developed and

tested in isolation, as interactions with other modules can be “mocked” using standard

infrastructure. This especially should allow Science Pipeline Workflows to be tested outside of the

SDP before starting an associated observation. Furthermore, even though Processing Functions are

envisaged to use an SDP-specific interface, the strong isolation from the rest of the architecture

should allow us to both build and test Processing Components separately as well.

5.2.2 Processing Layers

Modules are split by how much domain expertise is required for development. This should enable

groups with different backgrounds to collaborate effectively on SDP construction. This is especially

important on the Processing side of things: on the one hand, Science Pipeline Workflow and

Processing Function development will involve radio astronomy know-how to the point where

developers will likely work directly with radio astronomers. On the other hand, Execution

Frameworks should mostly involve reasoning about processing and data movement, ideally isolated

to some degree from domain-specific concerns.

This relative isolation is also useful because different layers of Processing modules are expected to

change at different speeds: while workflows are expected to evolve quickly, Execution Frameworks

and Processing Functions will need to be developed more conservatively, and Data Models will likely

evolve yet more slowly. This suggests different ways that updates to these layers would be handled; see the Science Pipeline Management Use Case View.


Up to a point the strong isolation of Processing Functions and the likely re-use of data models will

even allow reuse of existing astronomical software, simplifying construction. Note that this will

conflict with the requirement for Processing Components to use SDP memory data models.

Introduction of “legacy” components might therefore require some modifications, such as the

addition of conversion or emulation layers. This is seen as preferable to losing performance

guarantees and restricting the design space for Execution Frameworks.

5.2.3 Context

The two external modules that SDP will “use” are the “SKA TANGO” (SKA TANGO distribution and

element interfacing module) and “SKA Core Software” modules (basic astronomy/domain specific

algorithms shared between elements). While this involves coupling, sharing these should minimise

problems relating to data exchange with other subsystems; it also reduces code duplication and

consequent maintenance and consistency issues. In particular, we expect the “SDP Resource Model”

to become part of the SKA Core Software module so that it can be shared with the SKA Telescope

Manager element for telescope planning purposes.

5.3 Scalability

Requirements: SDP_REQ-829 (Scalability)

Our architecture is data driven: all services serve processing, so that we can do as much work as possible on the

data when and where it is available. Furthermore, due to taking most of the responsibility for

communication out of the hands of processing components and encapsulating it in Execution

Frameworks, we ensure that domain-specific functionality cannot become a problem for scaling.

Having multiple Execution Framework implementations also enables us to scale both up and down

naturally: we can choose the execution engine to fit exactly the scale that we need it to run on. So

for example, this architecture makes it a viable choice to implement primarily serial workflows as a

self-contained process on a single node.

5.4 Performance

Requirements: SDP_REQ-826 (General Workflow / Algorithm Performance)

To ensure performance while maintaining workflow variety, modules have been decoupled

vertically: Execution Frameworks are modules that execute distributed Science Pipeline Workflows,

controlled via a generic Execution Framework Interface. Processing Wrappers encapsulate

Processing Component and Receive modules from the next layer, which perform the actual

processing. This means that performant components can be re-used, and when components

under-perform we can develop and test alternatives without disturbing working code. See the

Processing Component Module View for details.

5.5 Reliability

Requirements: SDP_REQ-762 (SDP Inherent Availability (Ai)), SDP_REQ-821-825 (Failure detection to

Achieve Ai, Node failures recovery, Failure Prevention, Ingest and Buffer Failure Prevention,

Monitoring to prevent critical failures)

By using standard cloud-like infrastructure - such as OpenStack - many failure scenarios could be

handled automatically at a lower level, such as by re-starting components (where supported) or

migrating resources in a way that is opaque to the Operational System. For instance the


Configuration and Orchestration Platform module should be able to check and ensure application

health independently, which means that top-level controller modules in Execution Control would

only be involved in case of catastrophic failures.

Furthermore, we have split controller functionality such that the “Master Controller” has minimal

responsibilities - in fact it might be implemented mostly using Platform Configuration Management

scripts that may get triggered by operators or TM via the TANGO interface. This minimises the

complexity of code that directly affects the availability of the entire SDP.

5.6 Portability

Requirements: SDP_REQ-812 (Portability of SDP to SRCs), SDP_REQ-816 (Portability when hardware

is refreshed)

To enable portability of Operational System (non-Platform) modules, we abstract Platform details

away by providing software execution environments configured to specification (represented in this

view as System Interfaces). This allows portability in software, as we can tailor the environment

towards the software’s requirements (such as specific versions of libraries or development

environments) largely independently of the hardware details. This will likely be implemented using

Platform technologies such as containerisation or possibly virtualisation.

SDP Services are separated from processing, so SDP may execute Science Pipeline Workflows and Execution Frameworks on dedicated infrastructure, which might be refreshed independently. Furthermore, Processing Component kernel code compatible with (or optimised for)

new architectures can be introduced seamlessly either by Processing Component support, or by

switching out Processing Component implementations used for workflows depending on the

hardware used for execution.

Furthermore, Platform Services itself is open towards the concrete hardware used, and in particular allows for pre-existing APIs native to the deployment. This will be important if, for instance, SDP needs to be deployed inside an existing platform, such as with a cloud deployment (a possibility for

SRC deployments). In this case we will implement the SDP’s Platform Services on top of existing

infrastructure instead of deploying our own.

6 Related Views

This view is decomposed further in the following views:

● Execution Control Module View

● Workflow Module View

● Processing Component Module View

● Delivery Module View

● Platform Services Module View

The modules shown at this level of decomposition implement the components shown in the

following component and connector views:

● Operational System C&C view

● Platform C&C view


7 Reference Documents

The following documents are referenced in this document. In the event of conflict between the

contents of the referenced documents and this document, this document shall take precedence.

[RD00] SKA-TEL-SDP-0000137, SKA1 SDP SIP Prototyping Report

[RD01] SKA-TEL-SDP-0000150, SKA1 SDP Algorithm Reference Library Prototyping Report

[RD02] SKA-TEL-SDP-0000153, SKA1 SDP DALiuGE Prototyping Report

[RD03] SKA-TEL-SDP-0000117, SKA1 SDP Execution Framework Prototyping Report

[RD04] SKA-TEL-SDP-0000163, SDP Memo 052: Apache Kafka for an SDP log-based architecture

[RD10] van Diepen, G. N. J. "Casacore Table Data System and its use in the MeasurementSet." Astronomy and Computing 12 (2015): 174-180.

[RD11] Wu, Chen, et al. "DALiuGE: A graph execution framework for harnessing the astronomical data deluge." Astronomy and Computing 20 (2017): 1-15.

[RD12] McMullin, J. P., et al. "CASA architecture and applications." Astronomical data analysis software and systems XVI. Vol. 376. 2007.

[RD13] "Apache Spark: Lightning-fast cluster computing." http://spark.apache.org

[RD14] Booth, R. S., et al. "MeerKAT key project science, specifications, and proposals." arXiv preprint arXiv:0910.2935 (2009).

[RD15] Guzman, Juan C., and Ben Humphreys. "The Australian SKA Pathfinder (ASKAP) Software Architecture." Software and Cyberinfrastructure for Astronomy. Vol. 7740. International Society for Optics and Photonics, 2010.

[RD16] Wozniak, Justin M., et al. "Swift/t: Large-scale application composition via distributed-memory dataflow processing." Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on. IEEE, 2013.

[RD17] "ASKAP Processing Pipelines" https://www.atnf.csiro.au/computing/software/askapsoft/sdp/docs/current/pipelines/index.html


8 Processing Components Modules

Contributors: T. Cornwell, D. Mitchell, P. Wortmann

8.1 Primary Representation

The SDP Science Processing Workflows [RD39] provide the highest level SDP processing capabilities

such as the science pipelines. These workflows are built on top of other workflows and intermediate

level Processing Components. The Processing Components are built on top of a Processing Library

that contains low-level capabilities such as coordinate systems and FFTs. In this document, we

consider primarily the intermediate level functions that are composed into higher level workflows.

The primary representation is shown in Figure 1.

Figure 1: Module decomposition showing the relation between Workflow Components, the underlying Data

Models, and the Execution Framework through Component Wrappers. Each Module may make use of Libraries

for Processing or Data Access.

8.2 Element Catalogue

This section lists the major processing components in each module. It builds on the SDP

Pipelines Design document [RD01] and its supporting documents [RD8.1], [RD8.2], [RD8.3], and


[RD8.4]. The module composition is in outline close to that used in the Algorithm Reference Library

[RD8.5] but with some additions.

8.2.1 Elements and Their Properties

The following properties are defined to explain the elements in this section:

● Variants: How many function variants should be implemented? For our purposes variants

mean sub-modules providing similar Processing Component interfaces such that they could

be used interchangeably by Workflows. This should be used to wrap different algorithmic

approaches, usage of certain optimised libraries and especially optimisations for certain

accelerator hardware (CPU, GPU, FPGA…).

● Performance: How compute intensive is the function? Is performance critical to pipelines?

When appropriate we give a rough estimate of how much computational cost this module

will likely contribute to the SDP compute budget according to the parametric model [RD41].

The value given will be the percentage of projected compute averaged over the pipelines required

for high-priority science objectives. This is not intended to be used as an authoritative

estimate; instead the system sizing document should be consulted [RD8.36].

● Dependent workflows: The workflows that rely upon these components.

● Associated data models: The Data Models associated with this functional element

● Implementation: Identification of possibilities for software reuse. Identify which

components from existing software could be adapted to implement this function. Note that

there are a number of data processing software suites from precursors that implement SDP

functionality (CASA [RD8.9], LOFARSoft [RD8.10] [RD8.11], ASKAPSoft [RD8.12]). This does

not necessarily mean that we would always reuse those software parts. Reuse is also

dependent on SKAO policy.

Processing Components are to first order organised in sub-modules according to the Data Model

that they work on; see also section 2.4. Details on the Data Model related to Workflows are

out-of-scope for this document, but will be described in a Workflows Data Model View document.

Furthermore, modules are grouped by function, not according to the ‘pipeline’ or workflow they are

used by.

8.2.1.1 Processing Components

8.2.1.1.1 Processing Component Interface

The Processing Component Interface is an interface common to all Processing Components. It is used

by Processing Wrappers to instantiate, invoke and pass data between Processing Components as

required for the Science Pipeline Workflow under execution.

Thus Processing Components are implemented independently from Execution Frameworks and Science Pipeline Workflows, which means that they are decoupled both from the work distribution and from the purpose of the workflow.

Variants: Only one, but might have to be implemented / get wrapped for a number of programming

environments depending on the Processing Wrappers


Performance: Needs to be able to pass in-memory data efficiently, ideally using raw pointers to

in-process data for bulk memory data models.

Dependent workflows: All workflows rely upon this function.

Implementation: No reuse from existing astronomy software
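For illustration, a minimal sketch of what such a common interface could look like in Python follows. All names here (ProcessingComponent, configure, execute, ScaleVisibilities) are hypothetical and ours, not the agreed SDP interface; the point is only that a Processing Wrapper can drive any component through one small contract.

```python
from typing import Any, Dict, Protocol

class ProcessingComponent(Protocol):
    """Hypothetical common interface a Processing Wrapper could program
    against; method names are illustrative, not the agreed SDP design."""

    def configure(self, parameters: Dict[str, Any]) -> None:
        """Receive workflow-level configuration before execution."""

    def execute(self, *inputs: Any) -> Any:
        """Run the component on in-memory data items and return results.
        Bulk data should be passed by reference (e.g. numpy arrays) so
        that crossing the interface does not imply a copy."""

class ScaleVisibilities:
    """Toy component: scale visibility amplitudes by a constant factor."""

    def configure(self, parameters):
        self.factor = parameters.get("factor", 1.0)

    def execute(self, vis):
        # vis is expected to be an array-like; returns a scaled copy
        return vis * self.factor
```

A wrapper would then instantiate the component, call configure with workflow parameters, and call execute with in-memory data items, without knowing anything about the component's internals.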

8.2.1.1.2 Calibration

The measured data from the telescope is corrupted by various effects. As a result an image produced

from the measured data is limited in its quality. To improve the quality of the image, a model of the

instrument and its environment is fitted to the measured data and used to correct the measured

data for all corrupting effects. The resulting corrected image will have an improved quality.

The Calibration sub-module functions operate on Visibility data items and Calibration Data items.

Calibration Solutions are worked on by the Solution Flagging, Application, Solving, Solution

Operations, and the Solution Resampling modules. We will take a closer look in the following

subsections.

8.2.1.1.2.1 Solution

Fit model parameters to the visibilities for calibration. This is a fully non-linear fit of the Jones

matrices to the full vector visibility polarisation.

There will be sub-modules for the implementation of different algorithms for fitting, including:


● Levenberg-Marquardt,

● Antsol,

● StEFCal,

● Linear Fitting.

However, these may not be sufficient for SKA. The calibration step for SKA will have many more free

parameters than is usual for existing arrays. This means that current solvers may not have sufficient

performance. For example, the Antsol/StEFCal approaches require passes through the entire data set for each iteration, which is prohibitively expensive. An alternative approach is used for ASKAP

[RD02]. Normal equations are formed and averaged over time and frequency. Estimation of the gains

thus requires solution of a linear equation of size N by N where N is the number of parameters.

Variants: There must be variants to allow for optimizations towards different platforms (CPU, GPU,

FPGA). There may be a solver appropriate for time-critical results such as in the RCAL pipeline.

Performance: Parameter fitting can be a performance bottleneck, depending on the Solving

Strategy. Various algorithms will have different computational complexity depending on number of

stations, frequencies, directions, and granularity of the data. Some might only be appropriate in the

context of Workflows. The Parametric Model [AD04] estimates 1% total compute, but the

performance impact is likely higher especially for distributed algorithms.

Dependent workflows: RCAL, ICAL

Associated Data Models: Visibility, Calibration Data

Implementation: Known implementations of calibration workflows: CASA [RD8.9], LOFAR-BBS

[RD8.11], DPPP [RD8.35], SAGECal [RD8.11], MeqTrees [RD8.15], ASKAPSoft [RD8.12]. These are

mostly equivalent to Workflows, whereas we are referring to the core solver functions.
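To give a sense of the scale of such core solver functions, a minimal unpolarised StEFCal-style iteration can be sketched as follows. This is an illustrative toy under the assumption V[p, q] = g[p] · M[p, q] · conj(g[q]); a production solver must additionally handle full polarisation (Jones matrices), flagging, weighting and distribution.

```python
import numpy as np

def stefcal(V, M, niter=200, tol=1e-12):
    """Minimal unpolarised StEFCal-style gain solver (illustrative only).

    V : (N, N) observed visibility matrix, V[p, q] = g[p] M[p, q] conj(g[q])
    M : (N, N) model visibility matrix
    Returns per-antenna complex gains g, defined up to a global phase.
    """
    N = V.shape[0]
    g = np.ones(N, dtype=complex)
    for it in range(niter):
        g_prev = g.copy()
        for q in range(N):
            # predicted column for antenna q, up to the factor conj(g[q])
            z = g_prev * M[:, q]
            # least-squares estimate of conj(g[q]) from column q
            g[q] = np.conj(np.vdot(z, V[:, q]) / np.vdot(z, z))
        if it % 2 == 1:
            # averaging step on odd iterations aids convergence (StEFCal)
            g = 0.5 * (g + g_prev)
        if np.linalg.norm(g - g_prev) < tol * np.linalg.norm(g):
            break
    return g
```

Note that each iteration traverses the full visibility matrix, which illustrates the cost concern raised above for SKA-scale parameter counts.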

8.2.1.1.2.2 Operations

Operations on calibration solutions, including applying the calibration solutions to the visibilities.

Resampling calibration solutions will be quite important for global calibration, and would include:

● Average: transform to coarser time-frequency resolution (but need to consider things like

Jones matrix ambiguities).

● Interpolate:

○ transform to denser time-frequency resolution

○ interpolate over solutions that are missing due to flagging

● Merging of solutions from different parts of SDP (e.g. frequencies or times) into a single set

of solutions.

● Combination of ionospheric phases from different frequency bands.

● Fitting of linear features to phases of neighbouring stations. (This may go elsewhere.)

When solving for Jones matrices, each independent set of solutions will have independent unitary

rotations, which can complicate solution smoothing, interpolation, averaging, etc., as discussed in

[RD8.34].

Variants: -

Performance: Estimated ~1% of total computation


Implementation: SDP will implement this without re-use.
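The simplest of these operations, averaging gain solutions to a coarser time resolution, could be sketched as below. The function name and array layout are illustrative assumptions; flag handling and the Jones-matrix phase ambiguities discussed above are deliberately ignored in this toy.

```python
import numpy as np

def average_solutions(gains, factor):
    """Average complex gain solutions to a coarser time resolution (sketch).

    gains : (ntime, nant) complex gain solutions; ntime must be a
            multiple of `factor`. A real implementation must also handle
            flagged solutions and Jones-matrix ambiguities.
    """
    ntime, nant = gains.shape
    assert ntime % factor == 0, "sketch assumes exact blocks"
    # group `factor` consecutive time samples and take their mean
    return gains.reshape(ntime // factor, factor, nant).mean(axis=1)
```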

8.2.1.1.2.3 Instrumental

Solvers specific to particular instrumental effects (see Instrumental Calibration Workflow Module

View).

Variants: Solvers for different effects:

● Global antenna location and cable delay

● Pointing measurements for Mid

● Flux scale transfer

● Bandpass

● Polarization leakage

● Parallel and cross-hand delay measurement

● Measurement of antenna voltage patterns by holography or beam scans

● Low station beam calibration

Some of these can be addressed using a common Jones matrix solver plus some manipulation of the solutions, whereas others require specialised processing.

Performance: Mostly negligible but in some cases data flow required could be substantial.

Implementation: SDP will implement these without re-use.

8.2.1.1.2.4 Iterators

For iterating through a gaintable.

Variants: -

Performance: Negligible

Implementation: SDP will implement this without re-use.

8.2.1.1.2.5 Ionospheric Monitoring

Provide a model of the ionosphere and/or metrics on the state of the ionosphere.

Variants:

● Diffractive scale estimates from temporal variations of calibrator phase shifts.

● Diffractive scale estimates from spatial variations of calibrator phase shifts.

● Metrics based on the size and isotropy of phase shift spatial derivatives across the array.

Spatial metrics will rely on modelling at some level to achieve a higher resolution than is available with antenna/station-based phase shift measurements.

Performance: Mostly negligible but for high dynamic range applications could be substantial.

Implementation: SDP will implement this without re-use.

8.2.1.1.3 Visibility


8.2.1.1.3.1 Operations

Simple arithmetic visibility operations with low computational complexity. This module collects a

variety of operations that will be developed as needed to support workflows.

There is a general-purpose mathematics function. Possible functionality implemented here might include:

● uv-subtraction: subtract one set of visibilities from another (in order to create residual

visibilities)

● uv-addition (e.g. Visibility Predict or Degridding output)

● uv-scaling (e.g. change flux scale)

● uv-ratio (e.g. XX/I, Q/U, XX/YY)

● Polarisation conversions

● Simple baseline calculations like length and azimuth angle (e.g. for weighting)

● Visibility weights arithmetic e.g. estimation, calculation after averaging

● Imaging weights arithmetic e.g. tapers, inverse tapers to down weight short baselines and

diffuse emission, etc.

● Visibility statistics for Quality Assessment

● Visibility sorting for performance optimisation

● Numerical derivatives in time or frequency as input to calibration

● Calculation of Doppler-shifts
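Two of the listed operations, residual formation (uv-subtraction) and weight propagation after averaging, could look as follows. This is an illustrative sketch on bare numpy arrays; the real module operates on the full Visibility Data Model, and the function names are ours.

```python
import numpy as np

def uv_subtract(vis, model_vis):
    """Residual visibilities: observed minus model (e.g. predicted) data."""
    return vis - model_vis

def average_weights(weights, factor):
    """Weights after time-averaging: for inverse-variance weights,
    averaging `factor` samples sums their weights (sketch)."""
    ntime = weights.shape[0]
    assert ntime % factor == 0, "sketch assumes exact blocks"
    return weights.reshape(ntime // factor, factor, *weights.shape[1:]).sum(axis=1)
```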


8.2.1.1.3.2 Phase Rotation

Rotate visibilities to a different phase center, possibly changing the projection plane in the process.

Used for working with visibilities corresponding to different fields of view (such as facets) and

reducing the complexity of de/gridding. Equivalent to Reprojection in the image domain. Also

includes associated functions for performing direction summation of components to visibility using

phase rotation (i.e. a Direct Fourier Transform), and the inverse operation.

This should support a number of component models, including:

● Point Sources

● Gaussians

● Shapelets

Variants: -

Performance: Phase Rotation needs to be performed at the full visibility resolution and for every

facet. This means that even though the cost per visibility-pixel pair is relatively minor, the total

computational performance contribution may become substantial. Estimated ~2% of total

computation.

Dependent workflows: RCAL, ICAL, all DPrep

Associated Data Models: Visibility, Sky

Implementation: Tied closely to Visibility Data Model implementation and therefore likely to be

implemented from scratch.
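A minimal scalar sketch of the direct Fourier transform (component summation) and a small-angle phase rotation described above follows. Function names and conventions are ours; a production version must handle polarisation, the full w-term and changes of projection plane.

```python
import numpy as np

def dft_predict(uvw, lmn, flux):
    """Predict visibilities from point-source components by direct
    Fourier transform (scalar/unpolarised sketch).

    uvw  : (nvis, 3) baseline coordinates in wavelengths
    lmn  : (ncomp, 3) direction cosines (l, m, n) of each component
    flux : (ncomp,) component flux densities
    """
    u, v, w = uvw[:, 0:1], uvw[:, 1:2], uvw[:, 2:3]
    l, m, n = lmn[:, 0], lmn[:, 1], lmn[:, 2]
    # phase term exp(-2*pi*i (u l + v m + w (n - 1))) per (vis, component)
    phase = -2j * np.pi * (u * l + v * m + w * (n - 1.0))
    return (flux * np.exp(phase)).sum(axis=1)

def phase_rotate(vis, uvw, dl, dm):
    """Shift the phase centre by direction cosines (dl, dm); this
    small-angle sketch neglects the w-dependent term and any change
    of projection plane."""
    return vis * np.exp(2j * np.pi * (uvw[:, 0] * dl + uvw[:, 1] * dm))
```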

8.2.1.1.3.3 Flagging

Flagging of visibilities would be implemented as three sub-modules:

1. One based on ranges that are specified by the Observatory at the start of the processing

(e.g. given stations / dishes, given baselines, given frequency channels, etc.).

2. The other based on the visibility data itself by detecting outliers from average and / or

median values of the data

3. Flagging of visibilities based on calibration solutions. Outlier Calibration Solutions are flagged

and those flags are propagated to the corresponding Visibility data samples.

Variants: The sub-module based on the visibility data itself comes in two variants: one for Batch workflows and one for Real-Time workflows. Since the time-frequency data ranges that the Flagger will work on differ between Batch and Real-Time workflows, the optimal way of determining statistics will also differ. Therefore, two optimized variants are foreseen.

Performance: Estimated ~1% of total computation

Dependent workflows: RCAL, ICAL, all DPrep

Associated Data Models: Visibility

Implementation: AOFlagger [RD8.13], CASA [RD8.9], ASKAPSoft [RD8.12]. Tied closely to Visibility

Data Model implementation and therefore likely to be implemented from scratch.
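The second, data-driven sub-module can be illustrated by a toy median-based outlier flagger. This is a stand-in of our own devising, not the SDP algorithm; real flaggers (e.g. AOFlagger) use far more sophisticated time-frequency detection.

```python
import numpy as np

def flag_outliers(vis, nsigma=5.0):
    """Flag visibilities whose amplitude deviates from the median by
    more than nsigma robust standard deviations (MAD-based sketch).

    Returns a boolean mask, True = flagged.
    """
    amp = np.abs(vis)
    med = np.median(amp)
    mad = np.median(np.abs(amp - med))
    sigma = 1.4826 * mad  # MAD -> standard deviation for Gaussian data
    if sigma == 0.0:
        # degenerate (constant) data: nothing can be flagged robustly
        return np.zeros(amp.shape, dtype=bool)
    return np.abs(amp - med) > nsigma * sigma
```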

8.2.1.1.3.4 Sky transforms

Predict visibilities from Sky Components using Direct Fourier Transforms and the inverse.


Variants: -

Performance: The Visibility Predict by means of the Direct Fourier Transform is a performance

critical component, since it scales with the number of visibilities times the number of discrete source

components to be treated in this way [AD04]. It is estimated to contribute about 60% of the SDP

computation, yet high operational intensity might be more straightforward to realise than with other

modules.

Dependent workflows: RCAL, ICAL, all DPrep

Associated Data Models: Visibility, Sky Components

Implementation: Known implementations: CASA [RD8.9], LOFAR-BBS [RD8.10], MeqTrees [RD8.15],

ASKAPSoft [RD8.12], SAGECal [RD8.11], AIPS [RD8.16]. Tied closely to Visibility Data Model

implementation and therefore likely to be implemented from scratch.

8.2.1.1.3.5 Coalescence

Visibility re-sampling operations, which change visibility data density. Low performance

requirements, but critical for keeping the size of visibility data under control.

● Average: transform the visibilities to coarser time-frequency resolution

● Interpolate: transform the visibilities to denser time-frequency resolution

● Coalesce: apply baseline dependent averaging (BDA) to visibilities

● De-coalesce: invert from baseline-dependently averaged visibilities

Variants: -

Performance: Estimated < 0.1% of total computation

Dependent workflows: ICAL, all DPrep

Associated Data Models: Visibility

Implementation: Tied closely to Visibility Data Model implementation and therefore likely to be

implemented from scratch.
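The Average operation can be sketched as weighted time-averaging on bare arrays (names and layout are illustrative assumptions; baseline-dependent averaging would simply choose a different factor per baseline):

```python
import numpy as np

def time_average(vis, weights, factor):
    """Weighted time-averaging of visibilities to a coarser resolution
    (sketch of the 'Average' operation).

    vis, weights : (ntime, ...) arrays; ntime must divide by `factor`.
    Returns (averaged visibilities, summed weights).
    """
    nt = vis.shape[0]
    assert nt % factor == 0, "sketch assumes exact blocks"
    v = vis.reshape(nt // factor, factor, *vis.shape[1:])
    w = weights.reshape(nt // factor, factor, *weights.shape[1:])
    wsum = w.sum(axis=1)
    # inverse-variance weighted mean; summed weights carry to the output
    return (v * w).sum(axis=1) / wsum, wsum
```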

8.2.1.1.3.6 Scatter/gather/iteration of visibilities

Many operations require traversal of a visibility set. These functions provide scatter/gather

into/from sub-visibility sets, and iteration through sub-visibilities of a visibility set. Broadly speaking,

the former are necessary for distributed processing, and the latter for sequential processing.

Variants: By time, frequency, w-plane, or other subset.

Performance: Highly variable, and dependent on data locality and traversal order.

Dependent workflows: RCAL, ICAL, all DPrep

Associated Data Models: Visibility

Implementation: Known implementations: Casacore is based on the CASA Tables mechanisms, and

ARL is based on numpy capabilities. Tied closely to Visibility Data Model implementation and

therefore likely to be implemented from scratch.

8.2.1.1.4 Images

● FFT of images

● Manipulation of Images: Image Arithmetic and Reprojection

● Scatter/gather/iteration of images


8.2.1.1.4.1 Operations

Pixel by pixel image mathematics.

8.2.1.1.4.2 Fast Fourier Transforms

Transform between uv-grid and image grid (and vice versa). Various optimized implementations for

the Fast Fourier Transform exist, which may give rise to different sub-modules. A key requirement is

that the coordinate system be updated correctly.

If > 95% of the pixels are zero a Sparse Fourier Transform (SFT) may be useful, but this will only be

relevant for the Fast Imaging workflow (as part of the Real-time workflows).

Variants: -

Performance: Performance is critical for the large image sizes needed for the SKA; it is estimated that ~11% of SDP computation will be spent in image/grid FFTs. The Imaging support document

[RD8.3] reports efficiencies between approximately 8% and 15% of peak.

Dependent workflows: RCAL, ICAL, all DPrep

Associated Data Models: Image

Implementation: Known implementations: CASA [RD8.9], AW-Imager [RD8.11], ASKAPSoft [RD8.12],

WS-Clean [RD8.14] using FFTW library [RD8.7]. This is a prime target for reuse of available

third-party libraries.
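The uv-grid/image transform pair can be sketched with the usual fftshift/ifftshift sandwich so that both grid and image keep their zero frequency / image centre in the middle of the array (a sketch using numpy; a real implementation would use a tuned library such as FFTW and update the coordinate system as noted above):

```python
import numpy as np

def grid_to_image(uvgrid):
    """Transform a centred uv-grid to a centred dirty image (sketch).

    ifftshift moves the grid origin to array index 0 for the FFT;
    fftshift moves the image centre back to the middle afterwards.
    """
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(uvgrid)))

def image_to_grid(image):
    """Inverse operation: centred image to centred uv-grid (sketch)."""
    return np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(image)))
```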


8.2.1.1.4.3 Reprojection

Project the image onto a new tangent plane to the celestial sphere. Equivalent to Phase Rotation in

the visibility domain. This is needed for snapshot imaging, where ‘snapshot images’ are first created

on planes tangent to the array and then combined (i.e. reprojected) onto a final tangent plane at

fixed RA, Dec.

Variants: -

Performance: Estimated ~4% of total SDP compute

Dependent workflows: RCAL, ICAL, all DPrep

Associated Data Models: Image

Implementation: -

8.2.1.1.4.4 Deconvolution

Minor cycle deconvolution algorithms, where sky components are found / fitted to the Dirty Image.

There will be sub-modules for different algorithms. Notably:

● A multi-scale, multi-frequency deconvolution algorithm is used for Continuum Images,

● A multi-scale deconvolution algorithm (such as MSClean) is used for Spectral Line Cubes

● A complex clean (Hogbom or multiscale) for Q+iU.

In addition there will be functions for calculating clean masks, supporting a number of approaches.

Variants: There will be variants to allow for optimizations towards different accelerators (CPU, GPU,

FPGA, …)

Performance: The image sizes that the Deconvolution module has to work on may be very large, see

Performance Modelling [AD04]. This means we have to consider the amount of memory that is used and the distribution across sub-images, in which case we will be forced to use distributed processing components with e.g. graph or MPI communication [RD8.19]. This is estimated to account for about

5% of SDP computation.

Dependent workflows: ICAL, all DPrep

Associated Data Models: Image

Implementation: Known implementations: CASA [RD8.9], ASKAPSoft [RD8.12], WSClean [RD8.14] ,

ARL [RD8.19].
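The simplest of these minor-cycle algorithms, Hogbom CLEAN, can be sketched as follows. This is an illustrative single-plane toy; the real module adds clean masks, multi-scale/multi-frequency terms, distribution over sub-images and careful edge handling.

```python
import numpy as np

def hogbom_clean(dirty, psf, gain=0.1, niter=100, threshold=0.0):
    """Minimal Hogbom CLEAN minor cycle (illustrative sketch).

    dirty : dirty image; psf : point spread function with its peak at
    the array centre and the same shape as `dirty`. Returns the clean
    component image and the residual.
    """
    residual = dirty.copy()
    comps = np.zeros_like(dirty)
    pny, pnx = psf.shape
    cy, cx = pny // 2, pnx // 2
    for _ in range(niter):
        # find the current peak of the residual
        y, x = np.unravel_index(np.argmax(np.abs(residual)), residual.shape)
        peak = residual[y, x]
        if np.abs(peak) <= threshold:
            break
        comps[y, x] += gain * peak
        # subtract the shifted, scaled PSF, clipping at the image edges
        y0, x0 = y - cy, x - cx
        ys = slice(max(0, y0), min(residual.shape[0], y0 + pny))
        xs = slice(max(0, x0), min(residual.shape[1], x0 + pnx))
        pys = slice(ys.start - y0, ys.stop - y0)
        pxs = slice(xs.start - x0, xs.stop - x0)
        residual[ys, xs] -= gain * peak * psf[pys, pxs]
    return comps, residual
```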

8.2.1.1.4.5 Spectral processing

These are operations involving the spectral axis, including conversion to and from frequency

moments, and removing continuum.

Variants: Moment conversion is straightforward. Continuum removal is likely to require considerable

steering and heuristics.

Performance: This requires a corner turn of a potentially large cube.

Dependent workflows: ICAL, DPrep continuum and spectral pipelines


Associated Data Models: Image

Implementation:

8.2.1.1.4.6 Polarisation processing

These are operations involving the polarisation axis, including conversion of polarisation type.

Variants: Conversion is straightforward.

Performance: This requires a modest corner turn of a potentially large cube.

Dependent workflows: ICAL, DPrep continuum and spectral pipelines

Associated Data Models: Image

Implementation:

8.2.1.1.4.7 Scatter/gather/iteration of images

Many operations require traversal of an image. These functions provide scatter/gather into/from

images, and iteration through sub-images of an image. Broadly speaking, the former are necessary

for distributed processing, and the latter for sequential processing.

Variants: Raster, both overlapped and not (see e.g. [RD8.18]), and irregular tessellation.

Performance: Highly variable, and dependent on data locality and traversal order.

Dependent workflows: Potentially all workflows

Associated Data Models: Image

Implementation: Known implementations: Casacore is based on the CASA Tables mechanisms, and

ARL is based on numpy capabilities. The implementation is likely to be closely tied to the

implementation of Image Model.

8.2.1.1.5 GriddedData

This module includes operations for gridding and degridding, FFT’ing the results, and creating

kernels.


8.2.1.1.5.1 Operations

Creation of Gridded Data for a given Image.

8.2.1.1.5.2 Gridding / De-Gridding, Kernels, and Convolution Function

We discuss these three modules together. Gridding and degridding convert between sampled

visibilities and the regular uv-grid. This is a necessary prerequisite for allowing the reconstruction of

images using FFT algorithms. The Gridded Data can be preconditioned by applying scaling to each

pixel prior to transform to an image. This is closely related to weighting of visibilities but avoids an

extra pass through the data.

Gridding requires compensation for the grid’s regularity as well as correcting for instrumental effects

such as baseline non-coplanarity, reception patterns and calibration. In practice this means that

visibilities will have to be convolved with a number of extra terms in this process: anti-aliasing,

W-term, A-term, plus one for ionospheric correction (I-term). Currently the standard algorithm is to

use oversampled convolution kernels and then a nearest-neighbour algorithm for putting visibilities

on the grid. The AW kernel can apply refined Primary Beam patterns. For the AWI kernel we would

need to construct Gridding Kernels from Calibration Data, e.g. for Ionospheric correction.

Variants: Performance optimized variants will be created depending on the particular Workflow

(Buffer-Continuum Imaging, Buffer-Spectral Line Imaging, RT-Fast Imaging for Slow Transients).

Optimizations will depend on type of hardware and data access patterns (and, hence, data

distribution schemes). A high level of co-design is expected. Various variants of the W- and

A-Projection algorithm exist, each having different performance characteristics (see below). Image

Domain Gridding may be available for further optimization, depending on the particular data access

pattern.

Performance: Gridding and FFT dominate the computational performance (see [AD04]). Optimizing

Gridding needs co-design of Data Access Pattern and hardware architecture. Various optimizations


exist for minimizing the convolution kernels that need to be applied: W-Stacking (smaller kernels at

the expense of higher memory usage), W-Snapshots (smaller kernels at the expense of reprojection

of snapshot images). Convolution Kernels may be computed on-the-fly if they need updating on

short timescales. Image Domain Gridding optimizes the computation of the Convolution Kernels.

This is estimated to contribute about 13% of the total SDP computation. It is unclear what level of

algorithmic efficiency can be achieved.

Dependent workflows: ICAL, all DPrep

Associated Data Models: Visibility, Image, GriddedData

Implementation: Known implementations: CASA [RD8.9], AW-Imager [RD8.11], WS-Clean [RD8.14],

ASKAPSoft [RD8.12], ARL. Whether these implementations are fit for re-use is TBD.
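The core gridding step can be illustrated by placing each visibility on the grid via a small convolution kernel. This toy omits kernel oversampling and all W-, A- and I-terms discussed above, and assumes samples fall well inside the grid; all names are ours.

```python
import numpy as np

def grid_visibilities(vis, uv, kernel, cell, npix):
    """Convolutional gridding sketch (no oversampling, no W/A-terms).

    vis    : (nvis,) complex visibilities
    uv     : (nvis, 2) u,v coordinates in wavelengths
    kernel : (ksize, ksize) convolution kernel, ksize odd
             (e.g. an anti-aliasing function)
    cell   : uv-grid cell size in wavelengths
    npix   : grid side length; grid centre is at npix // 2
    """
    grid = np.zeros((npix, npix), dtype=complex)
    half = kernel.shape[0] // 2
    for v_val, (u_c, v_c) in zip(vis, uv):
        # nearest grid point for this sample
        iu = int(round(u_c / cell)) + npix // 2
        iv = int(round(v_c / cell)) + npix // 2
        # spread the visibility over the kernel footprint
        grid[iv - half: iv + half + 1, iu - half: iu + half + 1] += v_val * kernel
    return grid
```

An oversampled kernel would additionally select one of several fractionally shifted kernel copies per sample, which is where much of the real implementation complexity (and the co-design discussed above) lies.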

8.2.1.1.6 Imaging

8.2.1.1.6.1 Imaging Base

The core functions are a thin interface to the GriddedData gridding and degridding functions. The interface is responsible for ensuring that the visibilities and images have a common phase centre.

Variants: Prediction of visibilities from a model, and inversion of visibilities to obtain a dirty image and point spread function. The predict and invert steps are appropriate for 2D transforms only, including gridding/degridding with prolate spheroidal wave functions or Bessel functions, and W-projection kernels. Other wide-field algorithms are implemented via workflows, where the necessary varying types of distribution (e.g. over w slice or time slice) can be performed.

Performance: This core capability is performance-critical. Both phase rotation (shift_vis_to_image) and gridding/degridding are likely to require careful attention.

Dependent workflows: ICAL, all DPrep

Associated Data Models: Visibility, Sky, Image, GriddedData

Implementation:
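To make the predict/invert interface concrete, the following sketch (function names and sign conventions are illustrative assumptions, loosely modelled on ARL-style functions) phase-rotates visibilities to the image phase centre and then inverts them using nearest-cell gridding plus an FFT. A real implementation would use convolutional gridding instead of nearest-cell assignment.

```python
import numpy as np

def shift_vis_to_image(vis, uvw, dl, dm):
    """Phase-rotate visibilities by the direction-cosine offset (dl, dm)
    so that their phase centre coincides with the image centre.
    (Illustrative sign convention; real codes may differ.)"""
    return vis * np.exp(-2j * np.pi * (uvw[:, 0] * dl + uvw[:, 1] * dm))

def invert_2d(vis, uvw, npix, cell):
    """Grid visibilities (nearest cell, no convolution kernel) and FFT
    to obtain a dirty image; 'cell' is the image cell size in radians."""
    iu = np.round(uvw[:, 0] * npix * cell).astype(int) + npix // 2
    iv = np.round(uvw[:, 1] * npix * cell).astype(int) + npix // 2
    grid = np.zeros((npix, npix), dtype=complex)
    np.add.at(grid, (iv, iu), vis)  # accumulate samples into uv cells
    dirty = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(grid)))
    return dirty.real * npix**2 / len(vis)
```

A source simulated at an offset position and then phase-shifted by that offset inverts to a dirty image peaking at the image centre, which is the invariant the thin interface must maintain.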


8.2.1.1.6.2 Weighting and tapering

Different image weighting schemes for the uv-samples trade image sensitivity (i.e. image thermal noise level) against image resolution:

● Uniform
● Briggs
● Natural

Furthermore, in order to minimize differences in the Point Spread Function (PSF) across frequency channels, the calculation of imaging weights based on frequency-integrated density is supported. In addition, the PSF can be shaped by Gaussian and/or Tukey tapering of the imaging weights. Other types of tapering may also be needed, e.g. elliptical tapering. Also, with many short baselines it may be useful to have an inverse taper that down-weights short baselines (but perhaps only for calibration).
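The sensitivity/resolution trade-off can be illustrated with a minimal density-based weighting sketch (the function and its API are hypothetical; the Briggs branch uses the standard robustness formula that interpolates between natural and uniform weighting):

```python
import numpy as np

def imaging_weights(u, v, w_nat, npix, cell, scheme="uniform", robust=0.0):
    """Compute imaging weights from the local uv-sample density.
    'natural' keeps the input weights, 'uniform' divides by the summed
    weight in each uv cell, and 'briggs' interpolates via 'robust'."""
    iu = np.round(u * npix * cell).astype(int) + npix // 2
    iv = np.round(v * npix * cell).astype(int) + npix // 2
    density = np.zeros((npix, npix))
    np.add.at(density, (iv, iu), w_nat)
    Wk = density[iv, iu]                    # summed weight in each sample's cell
    if scheme == "natural":
        return w_nat
    if scheme == "uniform":
        return w_nat / Wk
    if scheme == "briggs":                  # Briggs (1995) robust weighting
        f2 = (5.0 * 10.0 ** (-robust)) ** 2 / (np.sum(density**2) / np.sum(w_nat))
        return w_nat / (1.0 + Wk * f2)
    raise ValueError(scheme)
```

With large positive robustness Briggs weighting approaches natural weighting; with large negative robustness it approaches uniform weighting, making the trade-off an explicit tunable parameter.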

Variants: Traditional weighting schemes (except for Natural weighting) require a full pass through all UVW and FLAG data to determine the imaging weights; only then can the Gridding step start. ASKAPSoft implemented a-posteriori weighting using a Wiener filter, for which the extra pass through UVW and FLAG data is not needed.

Performance: Estimated <0.1% of total computation

Dependent workflows: ICAL, all DPrep

Associated Data Models: Visibility, Sky, GriddedData

Implementation: Known implementations: CASA [RD8.9], AW-Imager [RD8.11], WS-Clean [RD8.14], ASKAPSoft [RD8.12]. Whether these implementations are fit for re-use is TBD.

8.2.1.1.6.3 Primary beams

Accurate modelling of visibilities requires models of the primary beams as a function of sky position and frequency.

Variants: Low and Mid will need separate consideration. Low will have station beams that include the effects of the antenna layout and the antenna beam pattern. The predicted beam will vary markedly across the sky and will have complex polarisation behaviour. The primary beam for Mid should be well-behaved to a good approximation but will require a parallactic-angle-dependent model.

Performance: Critical for high-quality Low imaging, less so for Mid imaging. In conjunction with I-projection (for the highly time-variable ionospheric phase), this will be a major driver for Low. Image Domain Gridding [RD21] may be preferable to standard gridding in this regime.

Dependent workflows: ICAL, all DPrep

Associated Data Models: Visibility, Image

Implementation: OSKAR [RD8.17] is capable of predicting Low station and antenna beams.
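As a toy stand-in for the real station/dish beam models (which would come from e.g. OSKAR), the following hedged sketch evaluates an azimuthally symmetric Gaussian beam whose width scales with wavelength; all names and the 1.2 λ/D FWHM factor are illustrative assumptions, not the SKA beam model.

```python
import numpy as np

def gaussian_primary_beam(l, m, freq_hz, diameter_m=15.0):
    """Toy Gaussian primary beam with FWHM ~ 1.2 * lambda / D.
    (l, m) are direction cosines relative to the pointing centre."""
    lam = 299792458.0 / freq_hz
    fwhm = 1.2 * lam / diameter_m                       # radians
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))   # FWHM -> sigma
    return np.exp(-(l**2 + m**2) / (2.0 * sigma**2))
```

Even this toy model captures the key behaviour the text describes: the beam response at a fixed sky offset falls with increasing frequency, so beam correction is inherently frequency dependent.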


8.2.1.1.6.4 Imaging for a timeslice

For a given parallactic angle and zenith angle, the w-term is equivalent to a distortion of the image coordinate system. These functions calculate the distortion and reproject an image correspondingly, and predict and invert for a single timeslice. A workflow is responsible for the processing across different time slices. The tolerance on the spread in time can be relaxed by using a compact AW convolution kernel.

Reprojection works better for band-limited functions than for point components; hence invert performed in this way is more accurate than the predict phase.

Variants: Dependent on image reprojection capabilities.

Performance: Reprojection can be a significant fraction of the processing cost for w-snapshots.

Dependent workflows: ICAL, all DPrep

Associated Data Models: Visibility, Image

Implementation: Casacore, ASKAPSoft, and ARL all have variants.

8.2.1.1.6.5 Imaging for a w slice

For a given w plane, the w-term is equivalent to a multiplication of the sky by a complex screen. These functions predict and invert for a given w plane. A workflow is responsible for the processing across different w slices. The tolerance on the spread in w can be relaxed by using a compact AW convolution kernel.
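The complex screen mentioned above has a standard closed form. The following sketch computes it for one w plane (the function name is hypothetical, and the sign convention varies between implementations):

```python
import numpy as np

def w_screen(npix, cell, w):
    """Complex phase screen for a given w plane:
    exp(-2*pi*i * w * (sqrt(1 - l^2 - m^2) - 1)),
    evaluated on an npix x npix grid of direction cosines (l, m)
    with pixel size 'cell' in radians. Sign convention is an assumption."""
    lm = (np.arange(npix) - npix // 2) * cell
    l, m = np.meshgrid(lm, lm)
    n = np.sqrt(np.maximum(0.0, 1.0 - l**2 - m**2))  # clip beyond horizon
    return np.exp(-2j * np.pi * w * (n - 1.0))
```

Multiplying the model sky image by this screen before a 2D transform (and dividing it out after inversion) is what a w-slice workflow applies per plane; the screen is pure phase and reduces to unity at the phase centre and for w = 0.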

Variants: -

Performance: Optimisation of the calculation of the w-beam should be pursued.

Dependent workflows: ICAL, all DPrep

Associated Data Models: Visibility, Image

Implementation: Casacore, ASKAPSoft, and ARL all have variants.

8.2.1.1.7 Science Data Model

Provides functions needed for creating, copying, and selecting a Science Data Model, as well as functions for enforcing consensus across different SDMs.

Variants: -

Performance: Not expected to be performance-critical

Dependent workflows: ICAL, RCAL, DPrep pipelines, Model Partition Calibration

Associated Data Models: Visibility, Image, Calibration Data, Science Data Model

Implementation:


8.2.1.1.8 Simulation

The simulation module contains functions needed for testing. These include:

● Array configuration information
● Creation of various test images and components
● Creation of controlled simulated visibility sets for unit tests

Variants: More sophisticated variants will be required as testing improves. For example, the simulation of a gain table will eventually have to be capable of producing physically realistic gain tables.

Performance: Required for various testing scenarios, including scientific performance assessment, code regressions, and code unit tests.

Dependent workflows: All calibration and imaging workflows

Associated Data Models: Visibility, Sky, Image, Calibration Data

Implementation: Casacore, ASKAPSoft, and ARL all have variants.


8.2.1.1.9 Sky

Sky contains both Skymodel and Skycomponent functions.

8.2.1.1.9.1 Operations

Creation and copying of Skycomponents.

8.2.1.1.9.2 Finding sky components

Find the position of a source component in an image.

Variants: Source finding may use a different approach than source estimation. Note that the requirements within the ICAL pipeline are less demanding than for Scientific Analysis of images [RD8.4].

Performance: Non-critical, not estimated

Dependent workflows: RCAL, ICAL, all DPrep

Associated Data Models: Image, Sky

Implementation: Known implementations: DUCHAMP, BLOBCAT, AEGEAN, BDSF [RD8.4]. Whether these implementations are fit for re-use is TBD. Algorithms are likely to change over time.

8.2.1.1.9.3 Fitting sky components

Estimate the source flux, polarization, spectrum (e.g. direct spectrum or spectral Taylor terms for multi-frequency deconvolution) and morphology for a known component or components in an image.

Variants: There might be different approaches to source finding/estimation. Note that the requirements within the ICAL pipeline are less demanding than for Scientific Analysis of images [RD8.4].


Performance: Non-critical, not estimated

Dependent workflows: RCAL, ICAL, all DPrep

Associated Data Models: Image, Sky

Implementation: Known implementations: DUCHAMP, BLOBCAT, AEGEAN, BDSF [RD8.4]. Whether these implementations are fit for re-use is TBD.

8.2.1.1.9.4 Insertion

Insert a pixelated version of a source component into an image. This is required for processing of intermediate-brightness compact objects.

Variants: Different ways of interpolating a compact component onto a grid. There is no single optimum approach; Lanczos interpolation is one of the best.
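The Lanczos approach can be sketched as a separable kernel insertion at a fractional pixel position (illustrative code with assumed names, not the ARL implementation):

```python
import numpy as np

def lanczos(x, a=3):
    """Lanczos-a kernel: sinc(x) * sinc(x/a) for |x| < a, else 0."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) < a, np.sinc(x) * np.sinc(x / a), 0.0)

def insert_component(image, flux, x, y, a=3):
    """Add a point component at fractional pixel position (x, y) by
    separable Lanczos interpolation onto the grid (in place).
    Assumes the kernel support lies inside the image."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    xs = np.arange(x0 - a + 1, x0 + a + 1)
    ys = np.arange(y0 - a + 1, y0 + a + 1)
    kx = lanczos(xs - x, a)
    ky = lanczos(ys - y, a)
    image[np.ix_(ys, xs)] += flux * np.outer(ky, kx)
    return image
```

At integer positions the kernel reduces to a delta function, so the exact flux lands in one pixel; at fractional positions the inserted flux is conserved to within roughly a percent for a = 3, which is part of why Lanczos is a strong choice here.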

Performance: Scientific and computing performance are critical, since this allows processing of intermediate-brightness components that would otherwise require an expensive Direct Fourier Transform.

Dependent workflows: ICAL, all DPrep

Associated Data Models: Image, Sky

Implementation: ARL has versions using sinc and Lanczos interpolation.

8.2.1.1.9.5 Skymodel

A Skymodel is a collection of Images and a collection of Skycomponents. A Skymodel is created from a global sky model and can be updated during processing by, for example, fitting successively weaker components in ICAL.

8.2.1.1.10 Non-Imaging

Components for Non-Imaging Processing (NIP) applications. Only the Pulsar Search, Single-pulse Search and Pulsar Timing pipelines require (post-)processing. VLBI data and Transient Buffer data do not require processing and therefore do not appear in the element catalogue. See the SDP Pipelines Design document [RD01] for details. Note that the components for the NIP pipelines are described more fully in other SDP documents [RD21, RD22, RD23, RD24, RD25, RD26].

8.2.1.1.10.1 Pulsar Search

Processing components for the Pulsar and Single-pulse Search workflows. The components are used to filter and classify the pulsar and single-pulse candidates generated within CSP. The classification labels applied by this processing component are used to prioritise, for science analysis, the candidates most likely to yield new discoveries. There are 7 main pipeline sub-components: the Sift, Source Matching, Feature Extraction, Classify, Evaluate, Select and Alert modules.

Variants: Pulsar and single-pulse search pipelines used around the world utilise the same fundamental processing steps. However, their exact implementations can differ in ways that significantly affect science output, principally due to optimisations made to accommodate the specific target of any given search (e.g. long-period pulsars vs. millisecond pulsars, or repeating transients vs. fast radio bursts). For this reason there will likely be more than one deployable pipeline, with implementation-specific differences between key pipeline functions, though crucially the same pipeline processing steps. Pipeline components must be modular enough to allow them to be replaced (driven by advances in data science and pulsar research) without impacting pipeline accuracy and efficiency.

Performance: The majority of pulsar/single-pulse search processing is performed by the CSP; SDP only post-processes the candidates produced by the CSP. It is therefore not anticipated that pipeline execution will generate performance bottlenecks, though some aspects pose a low risk to computational performance. One is the sifting operation, which compares candidates collected during a scan to identify duplicate detections or similarity to known pulsar (or other relevant radio) sources; because sifting requires the aggregation of data from all observing beams, it poses a low risk as a potential I/O bottleneck. The other potential bottleneck is the training of machine learning models for the candidate classification step. Some machine learning models can be computationally expensive to train (e.g. Deep Neural Networks [RD8.23]); however, if a suitable model is trained off-line prior to data processing, this risk is completely mitigated. Machine learning inference (i.e. decision making) is becoming increasingly efficient, so assigning classification labels to candidates is not expected to pose a performance or scalability risk.

Associated Data Models: Non Imaging

Implementation: Prototype implementations exist and can be used to construct the SDP implementation. These include popular tools such as PRESTO [RD8.24], SIGPROC [RD8.25] and DSPSR [RD8.26], though there are many others, e.g. [RD31, RD32], and open-source codebases published via GitHub, e.g. [RD8.29].

8.2.1.1.10.2 Pulsar Timing

Processing components for the Pulsar Timing workflow. The components clean and calibrate pulsar timing data cubes, measure pulse arrival times, compute and QA the timing residuals, and update the timing model for the pulsar being observed as appropriate. The pulsar timing data cubes are generated within CSP. SDP receives these in PSRFITS format [RD34, RD35] and applies the de facto standard (linear) pulsar timing processing steps. There are 9 main pipeline sub-components. These include the Remove RFI, Calibrate, Average, Determine Pulse Time-of-Arrival (TOA), QA TOAs/Residuals, Generate Residuals, Update Pulsar Ephemeris and Alert (not mandatory) modules.

Variants: There will be a single pulsar-timing pipeline. However, the pipeline components must be modular enough to allow them to be replaced (driven by advances in data science and pulsar research) without impacting pipeline accuracy and efficiency.

Performance: The majority of the Pulsar Timing processing is performed by the CSP; SDP performs post-processing on fully detected data cubes only. We do not anticipate that these modules will generate performance bottlenecks, because there are only sixteen pulsar-timing beams and thus sixteen corresponding data products. Whilst these are large (>30Gb in size), the linear processing they require is not computationally expensive enough to pose a problem.

Associated Data Models: Non Imaging

Implementation: Prototype implementations exist and can be used to construct the SDP implementation [RD34, RD35, RD36, RD37].


8.2.1.2 Processing Libraries

Low-level algorithms will be split out into separate libraries, since these algorithms are used in multiple areas of the decomposition tree. Examples are:

● Functions that provide values in astronomical reference frames using physical units, e.g. as currently implemented in Casacore Measures [RD8.6]
● Region definitions as needed to describe regions on the sky or within an image
● Calibration solvers; some are currently implemented in Casacore Scimath [RD8.6]
○ Levenberg-Marquardt
○ StEFCal
○ Linear Least Squares
● FFT, e.g. FFTW [RD8.7], CUFFT
● Gridders
● Flaggers
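Of the solvers listed, StEFCal has a particularly compact core. The following is an illustrative NumPy sketch of a StEFCal-style alternating solver (after Salvini & Wijnholds), not the Casacore implementation; it solves R ≈ G M Gᴴ for per-antenna complex gains, where R is the observed and M the model visibility matrix:

```python
import numpy as np

def stefcal(R, M, niter=200, tol=1e-12):
    """StEFCal-style alternating gain solve: R ~ G M G^H, G = diag(g).
    Illustrative sketch only; no weighting or flagging handled."""
    n = R.shape[0]
    g = np.ones(n, dtype=complex)
    for it in range(1, niter + 1):
        g_old = g.copy()
        for p in range(n):
            z = g_old * M[:, p]                 # z_p = G_old @ M[:, p]
            # least-squares update for antenna p's gain
            g[p] = np.vdot(R[:, p], z) / np.vdot(z, z)
        if it % 2 == 0:                         # averaging step stabilises iteration
            g = 0.5 * (g + g_old)
            if np.linalg.norm(g - g_old) / np.linalg.norm(g) < tol:
                break
    return g
```

Each antenna update is an independent scalar least-squares problem, which is what makes the algorithm cheap per iteration and easy to parallelise; the solution carries the usual global phase ambiguity, removed by referencing one antenna.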

Variants: For each algorithm there may be multiple variants to allow for various optimizations towards different hardware platforms (CPU, GPU, FPGA, …).

Performance: These are low-level algorithms, including some for the performance bottlenecks of SDP (e.g. FFT, Solvers, Gridders).

Implementation: Published mature code; in-house if necessary.

8.2.1.3 Processing Wrappers

8.2.1.3.1 Processing Component Wrapper

Wrapper to make the Processing Component agnostic of the Execution Framework.

Variants: One per supported Execution Framework

Performance: No performance bottleneck

Implementation: In-house
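The wrapper idea can be sketched in a few lines (all names here are hypothetical; the real Execution Framework interfaces are defined elsewhere in the architecture): a pure processing function knows nothing about the framework, and a thin adapter maps framework-level data references onto the function's arguments.

```python
def processing_component(vis, model):
    """A pure function over data models: no framework knowledge."""
    return {"residual": [v - model for v in vis]}

def wrap_for_framework(func):
    """Adapt a pure component to a hypothetical framework task signature
    that passes a context object and keyword data references."""
    def task(context, **data_refs):
        # resolve framework data references into in-memory data models
        inputs = {name: context.load(ref) for name, ref in data_refs.items()}
        result = func(**inputs)
        # hand results back to the framework as new data references
        return {name: context.store(value) for name, value in result.items()}
    return task

class FakeContext:
    """Stand-in execution-framework context for illustration."""
    def load(self, ref):
        return ref
    def store(self, value):
        return value
```

Supporting a new Execution Framework then means writing one new `task` adapter, while the component itself is untouched, which is the point of the wrapper layer.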

8.2.1.3.2 Data Redistribution

Processing Components will work on subsets of data (e.g. sub-bands, sub-arrays or snapshots) and are not aware of the global data distribution. Therefore the data distribution task has to be handled by the Execution Framework.

Variants: One per supported Execution Framework

Performance: Data redistribution may come at a substantial performance cost. Its use should be weighed against sub-optimal performance of the Processing Components.

Implementation: In-house

8.2.1.3.3 Realtime & Queue I/O

Data can be exchanged in real time by two methods:

● Receiving it from the instrument (Ingest). This data will not only be written to the Buffer, but will also be made available to real-time processing.
● Using Data Queues to read and write data in real time. This can be used to exchange data with other running workflows (e.g. for cooperation on calibration solving), or to publish quality-assessment or calibration data to other components of the SDP or SKA.

In either case the wrappers need to manage the execution engine such that data arriving in real time can be injected.

Variants: Per Execution Framework

Performance: Receive rate up to ~1 GB/s (for a case with 400 Receive nodes); Queue I/O likely <100 MB/s per node

Implementation: In-house

8.2.1.3.4 Buffer I/O

The principal mechanism for workflows to obtain data is by reading it from Buffer storage. This will generally involve using a Data Island's File System Interface to read and write data in Buffer Data Model form and translating it to and from the appropriate Memory Data Model so that the workflow or Processing Components can handle it.

Variants: Per Execution Framework, possibly further specialised by Storage Backend type to realise the highest possible data rates.

Performance: Read rate of up to ~3 GB/s (for 1500 nodes and 10 major loops)

Implementation: In-house

8.2.1.4 Memory Data Models

The Data Models have been described in multiple views; see the System-level Data Model View [AD03]. Apart from the Raw Input Data, the Telescope State Information (which is a subset of the Science Data Model and indicated here as Science Data Model Information) is an important data item for the Workflows and their Processing Components.

Variants:

Performance:

Implementation:

Data types:

● Visibility

● Calibration Data

● GriddedData

● Image

● Sky Components

● NIP Data

● Science Data Model Information


8.2.1.5 Buffer Data Models

See the System-level Data Model View [AD03]. Apart from the Raw Input Data, the Telescope State Information (which is a subset of the Science Data Model and indicated here as Science Data Model Information) is an important data item for the Workflows and their Processing Components.

Variants:

Performance:

Implementation:

Data types:

● Visibility

● Calibration Data

● GriddedData

● Image

● Sky Components

● NIP Data

● Science Data Model Information

Data Models may use a Data Access Library. Possibilities are:

● Access to Visibilities, e.g. Casacore MS [RD8.6]; note the SKAO (SDP)–NRAO collaboration for the development of MSv3
● Access to Images, e.g. Casacore Images [RD8.6]
● Coordinates, e.g. World Coordinate System [RD8.8]

8.2.2 Relations and Their Properties

Workflows can be composed from these low-level Processing Components and from other low-level Workflows; for example, a Workflow for 'Strong Source Removal' can be used within a Workflow for 'Pre-processing'.

8.2.3 Element Interfaces

Processing Components are wrapped, and their interfaces to Data Models and the Execution Framework go through these wrappers. See Figure 3 (Trivial Execution Engine Example) of the Processing Component & Connector View [AD02].

8.2.4 Element Behavior

Processing Components only interact with Data and the Execution Framework. In principle this is flexible, allowing the Observatory to create and execute dedicated workflows from the current set of Modules. The way a workflow fits within the SDP architecture is described in the Processing Module View document [AD01]. The Data Distribution aspect of workflows is out of scope for this document and will be described in a dedicated Data Distribution View, to be written for CDR.


Figure 2: Data Transition diagram for Data Models in calibration and imaging applications. This omits some possible transitions for clarity.¹

The module decomposition (for the calibration and imaging applications) is based on the data transitions shown in Figure 2. As a guiding principle, components are grouped together if they work on the same type of data; in this way we may achieve a 'clean' split of modules. Note that the Data Transition diagram (Figure 2) is consistent with the Algorithm Reference Library (ARL) [RD8.5].

A Dependency Matrix mapping (Sub-)Modules onto Workflow behaviours (or Functions; [RD01]) is given in the mapping spreadsheet contained in the architecture documentation [SKA-TEL-SDP-0000013_06_SDPArchitecture_MappingSpreadsheet].

¹ Such as continuum removal and pre-conditioning.


8.3 Context Diagram

Figure 3: Workflow Modules Context Diagram

This view documents a section through the processing-related modules as shown above: it ignores the execution framework implementation and Receive, and instead focuses entirely on Processing Wrappers and Processing Components. This view is therefore not complete with respect to how Workflows get executed, and it ignores the interfaces used to obtain real-time measurement data from the telescope.

8.4 Variability Guide

Processing Components are expected to be a highly variable part of the SDP architecture; there are therefore a large number of variation mechanisms:

● Using the Processing Component Interface, Processing Components are decoupled from the Execution Engines and Workflows using them. Both new and modified Processing Components can therefore easily be introduced into the system.
● Through the same mechanism, a large number of Processing Components can co-exist within the same architecture. This should allow SDP to experiment with new algorithmic developments and optimisations for accelerator architectures.
● The usage of common data models is what allows Processing Components to remain composable. This means that it is essential to avoid "forks" in the data models.


● The Science Data Model information provides a flexible way to supply Processing Components with metadata, both about the observation and about the Workflow. This allows parameterization of Processing Components in a flexible manner even as requirements change.

8.5 Rationale

8.5.1 Modifiability

The key architectural decision associated with this view is that the processing components form a reusable library that can be used by all of the workflows and execution engines supported by the SDP; for this reason they are all organised together in a single top-level module.

This is a horizontally integrated pattern, i.e., different functions of the SDP all use the same processing component module. The rationale for this is savings in construction and maintenance costs, as only one, coordinated, set of modules needs to be built and maintained. This needs to be traded off against:

● Organisational difficulties in specifying and building a single widely-used module
● Potential impact on the construction timeline critical path, and
● Development of workflow-specific processing components, optimised for a small range of specific roles in workflows and execution engines, which would eliminate possible savings.

Furthermore, the reusability of the processing components between different execution engines drives the decision to separate out the data model into a high-level module of its own.

8.5.2 Maintainability

We expect that most workflows will be procedural and implemented as straightforward scripts. This means that high-level changes can be made by astronomers rather than requiring a developer.

8.5.3 Scalability

To solve the problem at scale, we translate the procedural view into a data-driven view; this is done by the workflow engine, which provides a rich environment for scripting the workflows. At the lower level we must therefore require purely functional components that can be strung together, with the data models as inputs and outputs. The processing state is thus kept in the data models and the science data model, rather than in the processing functions.
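The translation from a procedural script to a data-driven graph can be illustrated with a toy deferred-execution sketch (similar in spirit to Dask's delayed evaluation; all names here are hypothetical). Because the components are pure functions from data models to data models, a script merely records a task graph, which an engine can then execute in data-driven order:

```python
def deferred(func):
    """Record a call as a graph node instead of executing it."""
    def wrapper(*args):
        return ("task", func, args)
    return wrapper

def execute(node):
    """Walk the recorded graph depth-first; a real engine would schedule
    independent branches in parallel across the cluster."""
    if isinstance(node, tuple) and node and node[0] == "task":
        _, func, args = node
        return func(*(execute(a) for a in args))
    return node  # plain data model: already a value

@deferred
def flag(vis):
    # pure function: flagged visibilities out, nothing mutated
    return [v for v in vis if abs(v) < 10]

@deferred
def average(vis):
    return sum(vis) / len(vis)

# a "procedural" script that actually just builds a graph; nothing runs yet
graph = average(flag([1.0, 2.0, 50.0]))
```

Only `execute(graph)` triggers computation, and because `flag` and `average` hold no state, the engine is free to re-order, distribute, or re-run them, which is exactly the property the purely functional component requirement buys.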

8.5.4 Portability

Processing components can be used in different Workflows and by different Execution Frameworks. This is achieved by letting all processing components have the same Component Interface, which interfaces with the Execution Framework by means of a wrapper, and by adopting a common set of Data Models.


8.6 Related Views

The current Workflows Module View provides more detail on parts of the System-level Module Decomposition and Dependency View, in particular for the Processing Components sub-module of the Core Processing Module, and the Memory Data and Buffer Data sub-modules of the Data Models module; see Figure 1 (Primary Representation).

The SDP Science Pipeline Workflows Module View describes the pipeline workflows.

The System-level Data Model View provides context for the Memory and Buffer Data Models. Note that the relation of the Processing Components to the Science Data Model (the 'meta-data') is currently not worked out; this needs attention in future updates of this document.

The following views relate to Non-Imaging: Non-Imaging Data Model View Packet, Pulsar Search & Single Pulse Workflow View Packet, and Pulsar Timing Workflow View Packet.

8.7 References

The following documents are referenced in this document. In the event of conflict between the contents of the referenced documents and this document, this document shall take precedence.

[RD8.1] G. van Diepen et al., SDP Memo: Receive and Pre-process Visibility Data, SKA-TEL-SDP-0000028, 2016-04-06

[RD8.2] S. Salvini et al., SDP Memo: The SDP Calibration Component, SKA-TEL-SDP-0000029, 2016-04-07

[RD8.3] A. Scaife, SDP Memo: The SDP Imaging Pipeline, SKA-TEL-SDP-0000030, 2016-04-07

[RD8.4] M. Johnston-Hollitt et al., PDR.02.05.04 Science Data Analysis Pipeline Reference Document, SKA-TEL-SDP-0000031, 2015-02-09

[RD8.5] T. Cornwell, Algorithm Reference Library (ARL) documentation; http://www.mrao.cam.ac.uk/projects/jenkins/algorithm-reference-library/docs/build/html/ARL_directory.html

[RD8.6] Casacore documentation; http://casacore.github.io/casacore/

[RD8.7] FFTW documentation; http://www.fftw.org/

[RD8.8] WCSLib documentation; http://www.atnf.csiro.au/people/mcalabre/WCS/wcslib/index.html

[RD8.9] CASA software documentation; https://casa.nrao.edu/

[RD8.10] LOFAR software documentation; https://www.astron.nl/radio-observatory/lofar-data-processing/software-processing-tools/software-processing-tools


[RD8.11] LOFAR imaging cookbook; https://support.astron.nl/LOFARImagingCookbook/index.html

[RD8.12] ASKAP software documentation; https://www.atnf.csiro.au/computing/software/askapsoft/sdp/docs/current/index.html

[RD8.13] A. Offringa et al., Post-correlation filtering techniques for off-axis source and RFI removal, MNRAS, 422, 563-580, 2012.

[RD8.14] WSClean software documentation; https://sourceforge.net/p/wsclean/wiki/Home/

[RD8.15] MeqTrees software documentation; http://meqtrees.net/

[RD8.16] AIPS software documentation; http://www.aips.nrao.edu/index.shtml

[RD8.17] OSKAR: http://oskar.oerc.ox.ac.uk

[RD8.18] Sebastiaan van der Tol, Bram Veenboer, and André R. Offringa, Image Domain Gridding: a fast method for convolutional resampling of visibilities. A&A, Volume 616, August 2018

[RD8.19] SKA-TEL-SDP-0000179 SDP Memo 83, Distribution of the Rau-Cornwell MFSMS algorithm

[RD8.20] Ger van Diepen et al., 2016, "Data Models for the SDP Pipeline Components, Draft" [Output for JIRA Task-73].

[RD8.21] SKA-TEL-SDP-0000161 SDP Memo 42: Data Model Summary for Pulsar/Transient Search & Timing

[RD8.22] Lyon R. J., 2017, “CSP to SDP NIP Data Rates & Data Models (version 1.1)”, doi:10.5281/zenodo.836715.

[RD8.23] Goodfellow I., Bengio Y., Courville A., Bach F., 2017, “Deep Learning”, MIT Press.

[RD8.24] Ransom S., 2016, “Presto”, http://www.cv.nrao.edu/~sransom/presto/ , accessed 22/02/2016.

[RD8.25] Lorimer D. R., 2016, “Sigproc”, on-line, http://sigproc.sourceforge.net , accessed 22/02/2016.

[RD8.26] van Straten W. & Bailes M., 2011, “DSPSR: Digital Signal Processing Software for Pulsar Astronomy”, PASA, 28, 1. doi:10.1071/AS10021

[RD8.27] Keith M. J., 2016, “Pulsar hunter”, on-line, http://www.pulsarastronomy.net/wiki/Software/PulsarHunter , accessed 22/02/2016.

Document No: SKA-TEL-SDP-0000013 Unrestricted

Revision: 06 Author: P. Wortmann et al. Release Date: 2018-10-31 Page 54 of 87


9 Delivery Modules

Contributors: R. Simmonds, P. Wortmann

9.1 Primary Representation

Figure 1: Primary representation of the Delivery Module View

Figure 1 shows the primary representation of the SDP Delivery System Module View. The

Component and Connector view of the Delivery system is also available. This is the SDP element

responsible for making data available to SKA Regional Centres (SRCs) and setting up data transfers to

those sites.

Its main functions include creating and maintaining entries in the Science Data Product Catalogue (see

Related Views below), providing a means of managing the rules used to set up the transfer of Data

Products to SRCs, and scheduling those transfers to make best use of the network bandwidth

available on the WAN links connecting to the SDP sites.

The delivery system also provides a set of International Virtual Observatory Alliance (IVOA)

compliant services that can be accessed by Observatory staff. Note that these services and the ability

to submit new subscriptions, i.e. the rules for setting up transfers, could also be made available

to SRC staff based on policy set by the Observatory.

The Delivery System also includes a set of Wide Area Network (WAN) monitoring tools to ensure that

network health is tracked over time close to the Delivery Transfer Endpoints. This can greatly

reduce the time taken to locate and resolve networking problems when they occur.

9.2 Element Catalogue

9.2.1 Elements and Their Properties

9.2.1.1 Primary Modules

9.2.1.1.1 Web Interface

Provides control and monitoring interfaces to the delivery system. These include the interface to

insert and manage the subscriptions that define the rules for which data should be transferred to

which SRC(s). It also includes an interface to monitor the status of transfers that are scheduled or in

progress. Finally, it provides the GUI interfaces for the IVOA services.

There are a number of ways to implement this. It could use one of the Scientific Gateway toolkits,

though using a Content Management System (CMS) such as Drupal as the starting point may

provide a more sustainable solution. Code would need to be added to talk to the underlying delivery

servers. In most cases the interfaces will be simple. Interfaces to the IVOA services can be derived

from code already developed by projects such as CADC (OpenCADC) [RD9.1] and GAVO (DaCHS)

[RD9.2].

9.2.1.1.2 Publish Products

This module provides components that are used to create new entries in the Science Data Product

Catalogue. Entries refer to the output of an observation and all of the Data Products associated with

that observation.

This module will need to implement code that reads from the Configuration and Coordination module using

a provided API and writes into the Science Data Product Catalogue using an API specific to the

Catalogue schema. If the CAOM2 [RD9.4] schema is employed, this API can be adapted from what is

provided by the OpenCADC toolkit [RD9.1].
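As an illustration of what the publishing code might look like, the sketch below creates catalogue entries in an SQLite stand-in. The real Catalogue would use the Database Implementation module (PostgreSQL in testing) and possibly the CAOM2 schema; all table and column names here are invented for the example:

```python
import sqlite3

# Minimal stand-in for the Science Data Product Catalogue schema.
# The real system would use PostgreSQL and (possibly) CAOM2; the
# table and column names below are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS data_product (
    product_id   TEXT PRIMARY KEY,
    observation  TEXT NOT NULL,   -- scheduling block / observation ID
    product_type TEXT NOT NULL,   -- e.g. 'image', 'visibilities'
    location     TEXT NOT NULL    -- URI of the product in Long Term Storage
);
"""

def publish_product(conn, product_id, observation, product_type, location):
    """Create (or refresh) a catalogue entry for one Data Product."""
    conn.execute(
        "INSERT OR REPLACE INTO data_product VALUES (?, ?, ?, ?)",
        (product_id, observation, product_type, location),
    )
    conn.commit()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    publish_product(conn, "prod-001", "obs-042", "image",
                    "file:///buffer/obs-042/img.fits")
    print(conn.execute("SELECT observation FROM data_product").fetchall())
    # [('obs-042',)]
```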

9.2.1.1.3 Transfer Scheduler

Uses rules available from Transfer and Subscription DB Access to match against entries in the Science

Data Product Catalogue. This causes transfer requests for individual products to be generated. After

that the Transfer Scheduler is responsible for controlling how many transfers are in flight at any

point in time to make best use of bandwidth while also ensuring that the high priority transfers are

handled first.
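A minimal sketch of the in-flight control described above follows; the priority scheme, queue discipline and concurrency cap are all illustrative assumptions, not the actual SDP policy:

```python
import heapq

class TransferScheduler:
    """Toy scheduler: highest-priority transfers first, bounded concurrency.

    'max_in_flight' models the limit used to make best use of WAN
    bandwidth; a real system might delegate the actual transfers to a
    tool such as FTS.
    """
    def __init__(self, max_in_flight=4):
        self.max_in_flight = max_in_flight
        self.pending = []       # heap of (priority, seq, product)
        self.in_flight = set()
        self._seq = 0

    def submit(self, product, priority=10):
        # Lower number = higher priority; seq keeps FIFO order per level.
        heapq.heappush(self.pending, (priority, self._seq, product))
        self._seq += 1

    def start_next(self):
        """Start as many pending transfers as the concurrency cap allows."""
        started = []
        while self.pending and len(self.in_flight) < self.max_in_flight:
            _, _, product = heapq.heappop(self.pending)
            self.in_flight.add(product)
            started.append(product)
        return started

    def complete(self, product):
        self.in_flight.discard(product)
```

Transfers would be submitted as the subscription rules match new catalogue entries, with `start_next` called whenever a slot frees up.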

This module implements code that needs to manage the set of transfer requests for which access

has been provided by the Provision Storage Module. It is responsible for setting up the transfers and

dealing with error cases when SRCs and/or WAN connections are not available for a planned

transfer. This module can make use of tools such at CERN’s File Transfer Service (FTS) to manage the

transfers that have been planned so far. Additional code will be required to match data subscriptions

that describe the data transfer policy, making the request for the files to be made available, and

pushing the transfer requests and file-handles to the tool managing the planned transfers (such as

FTS).

9.2.1.2 Intermediate Modules

9.2.1.2.1 Transfer and Subscription DB Access

Provides and implements the schemas for the databases that describe the data subscriptions, the

pending requests for data handles from Storage Provisioning and a list of elements holding the file

handles ready to be passed to the Transfer Endpoint.

9.2.1.2.2 Location DB Access

Provides the schema for the Location database that indicates the geographical location of instances

of Science Data Products. This provides a service similar to the Globus Replication Service, though

EGI and CERN have more recent implementations of such a service.

9.2.1.2.3 Catalogue DB Access

Provides a schema for the database underlying the IVOA TAP service that the system must support. In

testing we have used the CAOM2 schema. The SSA, SIA and DataLink services can use the TAP service

for interacting with the Science Data Product Catalogue.

9.2.1.2.4 Storage Access Service

Provides a file access service responding to requests for files via URLs returned by the VO services

(SIA, SSA and DataLink). This service may be synchronous or asynchronous depending on the

availability of a given file in the Storage Provisioning module. The service will negotiate a transfer and

push the file to a location specified by the requesting client. There will be no need to stage or cache files

behind the service.

9.2.1.2.5 WAN Gateway Configuration

Provides the deployment rules and configuration files for the other WAN Gateway components. This

could be implemented with Ansible.

9.2.1.3 IVOA Modules

9.2.1.3.1 SSA, SIA, DataLink Services

All these services query the Science Data Product Catalogue with different views on the data models.

They can all use the TAP service on the back end.

9.2.1.3.2 TAP Service

The Table Access Protocol (TAP) Service provides the base search capabilities required by IVOA tools.

There are several current implementations of this, including the one in the OpenCADC toolkit [RD9.1].
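Since TAP is a standard HTTP protocol, a client query can be formed with nothing more than URL encoding. The sketch below builds a synchronous TAP query URL; the endpoint is a placeholder, while the `/sync` path and the `LANG`/`REQUEST`/`QUERY` parameters come from the IVOA TAP specification:

```python
from urllib.parse import urlencode

def tap_sync_url(base_url, adql):
    """Build a synchronous TAP query URL per the IVOA TAP protocol.

    base_url is the service root (a placeholder here); the /sync
    endpoint and the LANG, REQUEST and QUERY parameters are defined
    by the TAP standard.
    """
    params = {"LANG": "ADQL", "REQUEST": "doQuery", "QUERY": adql}
    return base_url.rstrip("/") + "/sync?" + urlencode(params)

if __name__ == "__main__":
    # Hypothetical endpoint; ivoa.obscore is the standard ObsCore table.
    print(tap_sync_url("https://example.org/tap",
                       "SELECT TOP 5 * FROM ivoa.obscore"))
```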

9.2.1.4 Base Modules

9.2.1.4.1 Database Implementation

This could be any database management system that can support the higher level modules.

PostgreSQL has been used in testing of Delivery services.

9.2.1.4.2 HTTP Engine

Module that provides web services to the higher level modules. This could be provided by a tool such

as Apache Tomcat.

9.2.1.4.3 Transfer Endpoint

This provides the software endpoint that implements the data transfer protocols optimised for use

on large-capacity Wide Area Networks with the high Round Trip Times that exist between the SDP sites

and the regional SRC sites. This could be provided by GridFTP or by a commercial offering such as

Aspera.

9.2.1.4.4 HTTP Filter

The HTTP Filter is used to implement access rules to components in the delivery system from

networks external to the SDP. For example, in some cases the Observatory may grant permission to

SRC staff to insert subscriptions for missing data or to obtain data in cases where it is less expensive to

transfer it from an SDP than from another SRC that already holds the data. This will form part of any

web server.
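One way such access rules could be expressed is a network allow-list evaluated before requests reach the delivery components. This is only a sketch; the networks shown are documentation ranges, and a real deployment would configure the rules in the web server itself:

```python
import ipaddress

# Hypothetical allow-list: networks from which SRC staff may reach the
# delivery interfaces. These are IETF documentation ranges, standing in
# for real SRC networks.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

def is_allowed(client_addr: str) -> bool:
    """Return True if the client address matches any allowed network."""
    addr = ipaddress.ip_address(client_addr)
    return any(addr in net for net in ALLOWED_NETWORKS)

if __name__ == "__main__":
    print(is_allowed("192.0.2.17"))    # True
    print(is_allowed("198.51.100.1"))  # False
```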

9.2.1.4.5 Network Health Monitor

This is provided to enable WAN network health monitoring as close to the transfer endpoint as

possible. This is preferable since many WAN network problems occur in the last mile to the

endpoints, so having this monitoring functionality in both of the Science Processing Centres could

detect problems that external monitors may miss. The monitor should include services for both

bandwidth and latency monitoring and be compatible with network monitoring systems deployed by

the WAN providers. An example implementation of this is perfSONAR.
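A trivial endpoint-local latency probe might look like the following; it only illustrates the idea of measuring close to the Transfer Endpoint, and is no substitute for a full system such as perfSONAR, which also measures bandwidth and integrates with the WAN providers' monitoring:

```python
import socket
import time

def tcp_rtt_ms(host, port, timeout=2.0):
    """Rough latency probe: time a TCP connection set-up to host:port.

    The host and port would be those of a peer SRC endpoint; both are
    left to the caller in this sketch.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; we only wanted the timing
    return (time.perf_counter() - start) * 1000.0
```

Run periodically and logged, such probes give the time series of last-mile health that the text argues for.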

9.3 Context Diagram

Figure 2: Delivery Module View Context

9.4 Variability Guide N/A

9.5 Rationale

The design has been split into layers that provide different functionalities. The base layer provides

modules consisting of COTS components. The layer above consists of IVOA services which, while

providing astronomy-domain-specific functionality, should permit the adoption of existing
implementations. The layer above this provides SKA-specific schemas and configurations for

deploying on the lower layers. Finally, the top layer provides the components offering the

functional interfaces to the delivery system.

9.6 Related Views

Science Data Product Catalogue View

9.7 Reference Documents

The following documents are referenced in this document. In the event of conflict between the

contents of the referenced documents and this document, this document shall take precedence.

[RD9.1] OpenCADC; https://github.com/opencadc/

[RD9.2] GAVO DaCHS - Data Center Helper Suite; http://soft.g-vo.org/dachs

[RD9.4] CAOM2; http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/en/doc/caom/

10 Execution Control Modules

Contributors: S. Gounden, P. Wortmann

10.1 Primary Representation

Execution Control is responsible for providing the top-level control interfaces and coordinating the

more loosely coupled services and workflows at lower levels. Execution Control is split between the

Master Controller, which is responsible for starting and maintaining services, and the Processing

Controller, which handles scheduling and execution of processing.

 

Figure 1: Execution Control Module View

10.2 Element Catalogue

10.2.1 Elements and Their Properties

10.2.1.1 Tango Control

This module is realised as a collection of TANGO devices that receive aggregated information relating

to control, logging, alarms and processing.

Expertise: Control and coordination

Implementation: TANGO

10.2.1.2 Master Controller  

The Master Controller maintains the state of the SDP. It is primarily responsible for starting, stopping

and managing SDP services, which includes the remaining elements of Execution Control.

The Master Controller has a critical role in start-up and shut-down of the system, as well as the

ultimate responsibility for determining the SDP behaviour in the case of service failures. See

behaviour documentation in Execution Control C&C View.
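The start-up and failure-handling role might be sketched as a simple state keeper; the service names, states and restart policy below are illustrative only, with the actual behaviour defined in the Execution Control C&C View:

```python
from enum import Enum

class State(Enum):
    STOPPED = "stopped"
    RUNNING = "running"
    FAILED = "failed"

class MasterController:
    """Illustrative SDP state keeper: tracks and restarts managed services.

    The managed-service list and the unconditional restart policy are
    assumptions made for this sketch.
    """
    def __init__(self, services):
        self.services = {name: State.STOPPED for name in services}

    def start_all(self):
        for name in self.services:
            self.services[name] = State.RUNNING

    def report_failure(self, name):
        self.services[name] = State.FAILED

    def reconcile(self):
        """Restart failed services; return the names that were restarted."""
        restarted = [n for n, s in self.services.items() if s is State.FAILED]
        for name in restarted:
            self.services[name] = State.RUNNING
        return restarted
```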

Expertise: System operation, control, external interaction

Implementation: In-house

10.2.1.3 Processing Controller

Manages Processing Block execution according to resource availability. Coordinates SDP services and

execution frameworks to execute Science Pipeline Workflows.

Expertise: Observation/processing planning and resource management

Implementation: In-house, possibly utilising off-the-shelf scheduling and resource

allocation solutions

Additional Functionality:

● Subarray Management handles SDP’s subarray interface, and so keeps track of information

relating to the currently defined sub-arrays. This includes propagating information and

commands to and from associated real-time processing blocks.

● Resource scheduling: Assigns buffer and compute resources to scheduling blocks and

processing blocks as defined in the Execution Control Data Model. This function will make use of

the SDP resource model, which is part of the SKA Core Software (see System-Level Module

View).

● Processing Block Controller: Manages high-level execution of Science Pipeline Workflows. As

workflows will dictate essentially all behaviour of the Processing Block Controller, this will

likely be implemented simply as a script interpreter. See the Workflow Module View and Science

Pipeline Workflow Scripts & Deployment View for more details.
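The resource-scheduling function above can be illustrated with a deliberately simple greedy allocator; the block identifiers and node counts are invented, and the real Processing Controller would draw on the SDP resource model instead:

```python
def schedule_blocks(blocks, free_nodes):
    """Greedy sketch of Processing Controller resource assignment.

    'blocks' is a list of (block_id, nodes_needed); a block is started
    only while enough free nodes remain, otherwise it is deferred.
    """
    running, deferred = [], []
    for block_id, needed in blocks:
        if needed <= free_nodes:
            free_nodes -= needed
            running.append(block_id)
        else:
            deferred.append(block_id)
    return running, deferred, free_nodes

if __name__ == "__main__":
    print(schedule_blocks([("pb-1", 8), ("pb-2", 16), ("pb-3", 4)], 16))
    # (['pb-1', 'pb-3'], ['pb-2'], 4)
```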

10.2.1.4 Monitoring

Collects information about global operational status and service health using information provided

by Platform Services and SDP Services. This especially concerns following logging and other system

health information and filtering out information relevant to the Telescope Manager.

Note that collection of health and logging information is mainly implemented in Platform Services as

the (likely off-the-shelf) Logging and Health sub-module. This module implements the SDP-specific

filtering and decision making based on that data.
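The SDP-specific filtering might be sketched as follows; the record fields and the forwarding rule are assumptions for illustration, not the agreed Telescope Manager interface:

```python
# Sketch of the SDP-specific filtering layered on top of the platform's
# (likely off-the-shelf) log aggregation. Field names are illustrative.
FORWARD_LEVELS = {"WARNING", "ERROR", "CRITICAL"}

def relevant_to_tm(record: dict) -> bool:
    """Decide whether a health/log record should reach Telescope Manager."""
    return record.get("level") in FORWARD_LEVELS

def filter_records(records):
    return [r for r in records if relevant_to_tm(r)]

if __name__ == "__main__":
    records = [
        {"level": "INFO", "msg": "heartbeat"},
        {"level": "ERROR", "msg": "buffer service unreachable"},
    ]
    print(filter_records(records))
```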

Expertise: System operation

Implementation: In-house

10.2.2 Relations and their Properties

There are very few code-level interdependencies between Execution Control modules, as they will

be running as separate components sharing little code with each other. They will however

depend on SDP Services and Platform Services interfaces (see Context Diagram) and especially the

Execution Control Data Model used for coordinating the various components.

10.2.3 Element Interfaces

Not applicable.

10.2.4 Element Behaviour

Execution Control needs to coordinate complex behaviour, both with other parts of the system and

within SDP. This is documented in the Execution Control C&C View.

10.3 Context Diagram

Figure 2: Execution Control Context

10.4 Variability Guide N/A

10.5 Rationale

The key drivers of the Execution Control modules are:

● Availability and Reliability: Modules are implemented such that control functionality is

distributed among the Master Controller, Processing Controller and Processing Block

Controller. Distributed control ensures the needs of availability and reliability are met by

Execution Control.

● Modifiability and Maintainability: As they implement independent services, modules are

extremely loosely coupled. For the purpose of Execution Control they basically only share

the Execution Control Data Model.

10.6 Related Views

This is a decomposition of the System-Level Module View. The modules shown here implement the

components shown in the Execution Control C&C View.

10.7 Reference Documents

Currently no referenced documents.

11 Platform Modules

Contributors: J. Garbutt, J. Taylor, P. Harding, A. Ensor, V. Allan, P. Wortmann

11.1 Primary Representation

Figure 1: Platform Services Module View Primary Representation

Elements are units of implementation referred to as modules. “Allowed-to-use” relationships are

shown as arrows annotated with “use”; “Is-part-of” relations between modules are shown by

nesting. More detailed explanations of the meaning of elements and relations are described in the

Element Catalogue.

Reading guide

There are two main entry points to Platform Services: the Operations Interface and the Platform

Configuration Interface. Both make changes to the system via a shared set of Operations

Management Scripts. Those scripts feed the appropriate configuration and options into the

Configuration Management system to perform the requested operational changes.

The majority of Platform Services involves configuring and integrating off-the-shelf software with the

chosen Hardware (see SDP Hardware Decomposition View). There are two main groups of

off-the-shelf software shown: Platform Software and SDP Dependencies.

The exact details of which off-the-shelf software is used by Platform Services are hidden behind the

Platform Interfaces and System Interfaces. For example, the Service Connection Interface expresses

how access to a particular type of database is gained; the exact details of how this is provided may

differ between running at SKA1-Low and running at an SKA Regional Centre, and those details are

hidden from the SDP Operational System that needs to access that type of database.

11.2 Element Catalogue

11.2.1 Elements and Their Properties

This section is a dictionary where each entry is an element of the Primary Representation. Each

section includes details of:

● Implementation (custom or off-the-shelf component)

● Maintainability (how to sustain the implementation over the expected 50 year lifespan).

11.2.1.1 Configuration Management

At the core of Platform Services is the definition of the desired state of the system in Configuration, which

is interpreted by the chosen Configuration Management System. The Configuration Management System

allows us to treat the infrastructure as code, including tracking changes to the desired state of

components under Platform Services’ control in source code. The Configuration Management

solution provides an abstraction layer over the underlying Operating System and hardware selection,

providing the necessary flexibility for managing hardware and software refresh cycles.

The choice of Configuration Management System will dictate how the Configuration is expressed.

Note how the Configuration Management uses the Platform Interfaces, allowing for some variability

in the exact implementation.
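The desired-state idea can be illustrated with a miniature "diff" between declared and observed state. Real tools such as Ansible express this declaratively rather than in code, so the component-to-version mapping below is purely illustrative:

```python
def plan_changes(desired: dict, actual: dict):
    """Infrastructure-as-code in miniature: diff desired vs observed state.

    'desired' and 'actual' map component name to version; the names and
    versions used in the example are invented.
    """
    to_install = {k: v for k, v in desired.items() if actual.get(k) != v}
    to_remove = [k for k in actual if k not in desired]
    return to_install, to_remove

if __name__ == "__main__":
    desired = {"openstack": "rocky", "kubernetes": "1.12"}
    actual = {"openstack": "queens", "legacy-agent": "0.9"}
    print(plan_changes(desired, actual))
    # ({'openstack': 'rocky', 'kubernetes': '1.12'}, ['legacy-agent'])
```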

Implementation: Prototype uses Ansible. Chef, Puppet are other options that could be

considered.

Maintainability: Most tools in this space are Open Source. Should the tool no longer

have a community to maintain it, there are many options open. It

should be possible to move to a tool that is maintained; alternatively, if

few changes are expected, it may be possible to do

minimal maintenance on the tooling to keep it working across

Operating System and Hardware refresh cycles.

11.2.1.2 Configuration and Orchestration

The Platform makes use of Configuration Management for three key activities:

● Running Platform Services

● Running the SDP Operational System

● Allowing the SDP Operational System to run Science Pipeline Workflows

The following sub-modules deal with different parts of this story.

11.2.1.2.1 Operations Management Interface

This module provides a common interface for both the Operations Interface and the Platform

Configuration Interface to execute operations using Configuration Management. This is typically

implemented using quality assured shell scripts that invoke the Configuration Management tooling

with the appropriate inputs (i.e. a reference to the appropriate Configuration). This component helps

isolate consumers of the Platform from its choice of Configuration Management solution and the

specific implementations used to deliver all the Platform Services interfaces.
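An operations script of this kind essentially just assembles a Configuration Management invocation. The sketch below builds an ansible-playbook command line; the playbook and inventory paths and the extra variable are placeholders:

```python
import shlex

def build_ansible_command(playbook, inventory, extra_vars=None):
    """Assemble the ansible-playbook invocation an operations script runs.

    An Operations Management Platform (e.g. RunDeck) would execute the
    resulting command; the arguments here are hypothetical.
    """
    cmd = ["ansible-playbook", "-i", inventory, playbook]
    for key, value in (extra_vars or {}).items():
        cmd += ["-e", f"{key}={value}"]
    return cmd

if __name__ == "__main__":
    cmd = build_ansible_command(
        "playbooks/deploy_platform.yml",  # hypothetical playbook
        "inventories/production",         # hypothetical inventory
        {"power_mode": "low"},
    )
    print(shlex.join(cmd))
```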

The P3 platform prototyped the Operations Management Interface using an API provided by an

Operations Management Platform, in this case RunDeck. RunDeck provides an API and User

Interface that invoke the Configuration Management tooling (prototype uses Ansible) with the

appropriate inputs. For Ansible based Configuration, the example inputs include the playbooks and

inventory files that reference roles (both external and internally provided roles) contained in the

Configuration.

Implementation: Custom integration code. Prototype uses RunDeck as the Operations

Management Platform to execute shell scripts that trigger Ansible.

Jenkins has also been used in similar ways.

Maintainability: The operational scripts are custom and heavily coupled to the

Configuration, and therefore to the chosen Configuration Management

platform.

11.2.1.2.2 Configuration

The Configuration defines the desired state of the parts of the running system under its control. It is

specific to the chosen Configuration Management system.

What each part of the configuration is responsible for is discussed in the following two sections, but

they all share the following attributes:

Implementation: Platform Services specific configuration makes use of externally

provided configuration. External sources include both the SDP

Operational System and public open source configuration

repositories outside the SKA.

Maintainability: Appropriate re-use of external Configuration ensures that Platform

Services is primarily responsible for integrating together all the

externally maintained configuration scripts. Note that parts of the

configuration can be heavily coupled to the chosen technologies and

operating systems. Although it is common to support

multiple execution environments (i.e. multiple operating systems),

extra development effort is required to support new execution

environments.

11.2.1.2.2.1 Platform Configuration

Platform Configuration describes the desired state for all the services that make up Platform

Services, in particular this includes the software modules listed in Platform Software. The

configuration is necessarily tightly bound to the chosen implementations of each service it is

deploying, and in some cases the particular combination of services being deployed together.

For example, the prototype makes use of the following sets of Ansible Configuration:

● Ansible to install RunDeck and its Operations Management Scripts, such that all other

operations are accessible

● OpenStack Kayobe packages up Ansible configuration used to install and configure

OpenStack, that implements the Core Infrastructure Interface

● OpenStack Manila, Monasca and Magnum are also installed by Ansible via OpenStack Kayobe

to provide the Remote Storage Provisioning, Logging and Metrics, and Container Orchestration

Provisioning Interfaces, respectively

● P3 Appliances [RD10.4] uses Ansible scripts to provision Container Orchestration Engine

clusters using OpenStack Magnum, integrating them with OpenStack Manila and

OpenStack Monasca.

11.2.1.2.2.2 Operational System Configuration

Operational System Configuration is responsible for both configuring the Science Pipeline Workflows

the SDP Operational System requests at runtime, and configuring and running the SDP Operational

System itself, including all the services it depends on.

It is expected that the SDP Operational System will provide the majority of the Configuration scripts

required, and the platform is responsible for the integration between all the different components.

Importantly it will also inject the configuration required to access any Platform Services interfaces

each component may need, such as the Compute and Storage Interface needed to deploy Science

Pipeline Workflows.

11.2.1.2.3 Platform Configuration Interface

The Platform Configuration Interface is the main interface between the Platform and the SDP

Operational System. This module has the following key responsibilities:

● Allowing the SDP Operational System to discover and access the services it needs that are

being run by Platform Services

● Interface used to query the state of Platform Services, including current Storage and

Compute Capacity

● Interface through which to trigger a move to low power mode, restore to normal power

mode or perform complete shutdown of the platform and all the hardware it controls

● Creating Storage Backends, as described in the SDP Buffer C&C View

● Executing Science Pipeline Workflows and attaching the above Storage, as described in the

SDP Execution Control C&C View

Further details on the behaviours associated with this interface are detailed in the SDP Platform

Services C&C View. Some of those responsibilities involve a daemon reporting state to the SDP

Operational System, and some of those involve triggering appropriate operational actions via the

Operations Management Interface. It is expected that at least some of this interface will make use of

the same Configuration Database service that the SDP Operational System uses.

Implementation: A custom API that integrates with Platform Service Interfaces,

Operations Management Interface, and the Configuration Database.

Maintainability: Should be minimal custom code that integrates the above

components.

11.2.1.3 Operations Interface

This is the Operator focused counterpart to the Platform Configuration Interface module, providing

Operators access to Platform Services, including access to the Configuration via the Operations

Management Interface. The majority of the Operations Interface component is defined by its

Platform Configuration. The remaining piece is the initial bootstrap script that is

able to take the seed node and bring up the initial system.

The configuration of the system includes:

● Configuration of ssh access, including integration with the SKA-provided AAAI system

● Protecting HTTPS access via integration with the SKA-provided AAAI. It is envisaged that

endpoints are simply restricted to an operator-specific role that is communicated from the

AAAI system, rather than more complicated, tighter integration with individual platform

services such as the Container Orchestration Engine. This is because the primary user of the

system is the Telescope Manager.

Implementation: Shell script for initial bootstrap of Ansible configuration. The web

proxy, AAAI protection and ssh server all use off-the-shelf

components.

Maintainability: Very minimal integration of off-the-shelf components.

11.2.1.4 System Interfaces

All components in the SDP Operational System and Science Pipeline Workflows are able to use these

Interfaces. They can all assume that they have access to these interfaces without needing to know

how Platform Services implements those interfaces. The details of each interface are covered in the

following sections.

Implementation: Off-the-shelf existing interfaces

Maintainability: Need to ensure the interface is always used in a similar consistent

way.

11.2.1.4.1 Service Connection Interface

The platform is responsible for running a variety of services (such as databases and queues) that are

needed by the SDP Operational System. The Service Connection Interface is the way the information

needed to access those services is presented to the components that need that information.

It is the responsibility of the SDP Operational System configuration to present the connection
information only to the components that need it. One possible implementation is using
environment variables; another is injecting the information into the filesystem. Both mechanisms
are used by Kubernetes secrets.
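As an illustrative sketch of how a component might consume this interface (the variable naming scheme, file path and service URL below are hypothetical, not part of the architecture):

```python
import os

def service_connection(name: str) -> str:
    """Resolve connection details for a platform-run service.

    Checks an environment variable first, then falls back to a file
    mounted into the container -- both mechanisms Kubernetes secrets
    can use to expose information to a process.
    """
    env_key = f"SDP_{name.upper()}_URL"
    if env_key in os.environ:
        return os.environ[env_key]
    # Fallback: secret injected into the filesystem by the orchestrator.
    with open(f"/etc/sdp/connections/{name}") as f:
        return f.read().strip()

# Simulate the platform injecting the connection information; the
# component itself never learns how or where the service is run.
os.environ["SDP_CONFIG_DB_URL"] = "etcd://10.0.0.5:2379"
print(service_connection("config_db"))  # etcd://10.0.0.5:2379
```

Either mechanism can be swapped for the other without any change to the component, which is the point of the interface.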

11.2.1.4.2 Operating System Interface

All Workflows and Components are expected to be executing on an Operating System of their

choice. Container based deployment means the Container Image defines the userspace details of the

Operating System, while the kernel is defined by the operating system that is executing the

Container Orchestration Engine (i.e. defined by the Operating System image deployed on the server

hardware by Core Infrastructure Services).

11.2.1.4.3 Logging and Metrics Input Interface

For containers it is expected that all logs will be sent to standard output. That output will then,
with the cooperation of the Container Orchestration Engine, be aggregated as specified in the
appropriate Configuration. Note that if containers are not used, systemd unit files could be used
to route standard output to journald, which can in turn be aggregated in much the same way as
extracting the logs from the Container Orchestration Engine. No components need to be aware of
how the logs are aggregated.

Metric collectors are expected to be set up as defined by the Configuration. This includes
monitoring each process's use of key system resources, including CPU, memory, networking and
filesystem usage.

An interface will need to be decided for components to report application-specific metrics. The
prototype used a Service Connection Interface to statsd to allow for optionally collecting
application-generated metrics in a way that keeps all components isolated from needing to
understand how the metrics are aggregated. An alternative would be a more Prometheus-native
approach, where each component would expose an API from which its metrics can be scraped.
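The statsd wire format itself is a simple UDP text protocol ("name:value|type"), which is what makes this decoupling cheap. The following sketch sends one counter to an aggregator address that would, in practice, come from the Service Connection Interface; the metric name is made up for illustration.

```python
import socket

def emit_metric(sock: socket.socket, addr, name: str, value, kind: str = "c") -> None:
    """Fire-and-forget one statsd datagram: "c"=counter, "g"=gauge, "ms"=timing."""
    sock.sendto(f"{name}:{value}|{kind}".encode(), addr)

# Loopback demonstration standing in for a real statsd aggregator, whose
# address a component would obtain via the Service Connection Interface.
aggregator = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
aggregator.bind(("127.0.0.1", 0))

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
emit_metric(sender, aggregator.getsockname(), "sdp.visibilities_received", 1)
print(aggregator.recvfrom(1024)[0].decode())  # sdp.visibilities_received:1|c
```

Because the component only speaks this protocol, either Monasca or Prometheus (via an exporter) can sit on the other side without code changes, as observed in the prototype.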


11.2.1.5 Platform Interfaces

This represents the APIs that are exported by software deployed as part of Platform Services.
There is no code associated with these interfaces, although most components have client
libraries that can be used to access these APIs; these are generally already integrated into the
Configuration Management system.

Here is a brief summary of each interface, and whether it is internal or not:

● Logging and Metrics Query Interface

○ Exposed to SDP Operational System, to send alerts to Telescope Manager

○ Likely to be the Elasticsearch interface provided by the Logging and Metrics Service

● Core Infrastructure Interface

○ Internal

○ OpenStack APIs to provision baremetal nodes with the correct networking

● Container Registry Interface

○ Exposed to SKA Common, list of images exposed via Platform Configuration Interface

○ Provided by the Artifact Repository

● Container Orchestration Interface

○ Internal

○ Interface to create the Container Orchestration Engine clusters (e.g. OpenStack

Magnum)

○ Interface to create containers on the Container Orchestration Engine (e.g. Docker

Swarm, Kubernetes)

● Storage Provisioning Interface

○ Exposed via Platform Configuration Interface

○ Interface used to create a share

○ Information to attach to share used by Container Orchestration Engine

For details on the implementations of the APIs see the Deployed Software Component.

Implementation: Native interface provided by each off-the-shelf component

Maintainability: Should the need arise for adapters between multiple technologies

that are not themselves off-the-shelf components, there may be

significant work required to maintain them.

11.2.1.6 Platform Software

This module includes all the off-the-shelf software modules that are referenced in the Platform

Configuration and used to deliver one or more Platform Services components.

Here is a list of which Platform Services component each module helps deliver, along with the
off-the-shelf component that was used in the prototype:

● Logging and Metrics

○ Platform Services Component: Logging and Metrics Services

○ Prototype: OpenStack Manila, Elasticsearch (Full ELK stack), Prometheus

● Core Infrastructure Services

○ Prototype: OpenStack Ironic, Nova, Neutron (and other services they depend on)


● Remote Storage Provisioning

○ Prototype: OpenStack Manila, OpenStack Cinder, and Ansible, including various

backends: CephFS, GlusterFS, BeeGFS

● Artifact Repository:

○ Prototype report notes related experience with JFrog

● Container Orchestration Engine:

○ Prototype: OpenStack Magnum, backends: Docker Swarm and Kubernetes

● Operations Management Platform:

○ RunDeck executing Ansible

● Operating System

○ Prototype: various Linux distributions including CentOS, Ubuntu, CoreOS, Alpine and
others

Please see the P3 Prototyping Report for a discussion of each component in detail.

Implementation: Off-the-shelf

Maintainability: Open Source components were used for the prototype to allow both

for the opportunity to share and sustain any SKA specific innovations

in the appropriate upstream projects, and allow for any required

changes to keep the system working with the chosen SKA

configurations.

11.2.1.7 SDP Dependencies

The expectation is that off-the-shelf components are run by the Platform and the Operational

System Configuration will use the Service Connection Interface to link the service being run by the

platform to the appropriate SDP components.

Several examples are listed: Configuration Database, Data Queues and Other Databases. It is
expected that the Platform is in the best position to decide the most performant and reliable way
to run each of these services on behalf of the SDP Operational System. Operationally it makes
sense to share the best practices for running these services.

Implementation: Off the shelf software, such as MySQL, MariaDB, Kafka, Redis, etcd

Maintainability: No modifications to this off-the-shelf software are expected. The
Platform has to keep it upgraded so the version deployed is supported,
or migrate to a supported alternative.

11.2.2 Relations and Their Properties

The primary representation uses two types of relations: The “allowed-to-use” relationship and

module decomposition. These are defined in the SDP Platform Services C&C View.

Module decomposition is shown by nesting. Beyond module hierarchy, this implies some degree of

“allowed-to-use” relationship between contained modules even if not spelled out explicitly.

Implementation decisions for these are likely going to be related. “Allowed-to-use” relationships of

the top-level module are understood to propagate to lower-level modules.


All relations are shown in the primary representation. See the element catalogue for further
details on the nature of those relationships.

11.2.3 Element Interfaces

Not Applicable.

11.2.4 Element Behaviour

Not Applicable.

11.3 Context Diagram

Figure 2. Context diagram showing the relation of Platform Services to the rest of the SDP

Operational System

The Context Diagram shows how all components are able to use the System Interfaces provided by

the Platform. As an example, this includes the Logging and Metrics Input Interface.


It shows Execution Control using the Platform; in particular, it uses Platform Services to provision
Science Pipeline Workflows. In a similar way the Buffer Management component of SDP Services
provisions the storage needed by the Science Pipeline Workflows. The Platform Configuration
Interface is responsible for all the communication between the SDP Operational System and the
Platform that doesn't go via the System Interfaces.

11.4 Variability Guide

We now discuss the ways in which the architecture supports particular variations.

11.4.1 Services vs Science Pipeline Workflows

There are two different kinds of workloads supported by Platform Services.

Firstly, there are long-lived services. These could be considered static workloads: they do not
directly interact with the underlying platform, they are simply run on resources managed by the
Platform, and their lifetimes are dictated by the Platform.

Secondly, there are more dynamic workloads that trigger operational changes in the platform
that is hosting them. Specifically, we are talking about how the SDP Operational System starts
and stops Science Pipeline Workflows running on Container Orchestration Engines provisioned
by Platform Services.

From a general cloud-usage perspective, the more static workloads are more common, although
auto-scaling is not uncommon. Platform Services, however, is not constantly expanded to give
the impression of infinite resources; it is sized to match the expected average load of the system.
As such, it does not auto-scale; it schedules work to make best use of the currently available
resources.

11.4.2 Running other Components on the Platform

The current focus of Platform Services is on the SDP and the SDP's Workflows. However,
following on from the discussion above, it would be very easy for Platform Services to host other
SKA Components, such as Telescope Manager. The main constraint is the amount of resources
that are available.

11.4.3 Isolating SDP Operational System from Platform Services

The SDP Operational System may be running in various places outside of Platform Services, such
as on a developer laptop, inside a continuous integration system's test infrastructure, or on a
commercial non-OpenStack public cloud. Where possible, we want to keep things flexible, but
also keep development environments as close as possible to how things are run in the production
Platform Services environment.

11.4.3.1 Service Connection Interface

The Service Connection Interface hides from the SDP Operational System the details of how
Platform Services provides the services it needs. The connection details are specified to the SDP
Operational System when it is started, either via static configuration or via the more dynamic
Configuration Database.


This means that on a developer laptop, similar configuration scripts can be used that simply point
to services running locally in containers. In a similar way, when running in a Science Regional
Centre, it is possible that a database service offered locally at the Science Regional Centre is used
instead of the custom-installed services that are used in production.

11.4.3.2 Platform Configuration Interface

The SDP Operational System should be able to deploy its workflows in the same way no matter
where they happen to be deployed. The fact that containers are used is expected to be hidden by
this interface. The key part is that a process is started using a pre-built binary and a particular set
of resources.

Prototyping of SIP and AlaSKA [RD10.1] has shown that the use of containers greatly helps the
ability to deploy in multiple locations. The prototype work made use of Docker Swarm with the
buffer attached via a bind mount, to help keep both prototype efforts moving forward in parallel.

The exact implementation of the Storage Backends provisioned by the Storage Provisioning
Interface is also expected to be abstracted away by the Platform Configuration Interface. Each
storage share provisioned by the Storage Provisioning Interface has a UUID; the share is then
attached to the appropriate compute resources, at which point it should be mounted into the
container at the expected location.
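To make the contract concrete, the record a workflow might see from such a provisioning call could look like the following sketch. The field names and mount point are hypothetical; only the uuid-plus-mount-point idea comes from the text above.

```python
import uuid
from dataclasses import dataclass

@dataclass
class StorageShare:
    share_id: str     # UUID identifying the provisioned share
    size_gb: int      # requested capacity
    mount_point: str  # agreed location inside the container

def provision_share(size_gb: int) -> StorageShare:
    # A real implementation would call the platform (e.g. OpenStack Manila)
    # and attach the backend export to the compute resources; the workflow
    # never sees those details, only the id and the mount point.
    return StorageShare(str(uuid.uuid4()), size_gb, "/buffer")

share = provision_share(512)
print(share.mount_point)  # /buffer
```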

11.4.3.3 Logging and Metrics Input Interface

A common pattern in containers is to output all logging to standard output, and let the container

infrastructure deal with log aggregation. This worked well in the prototyping efforts, and is widely

adopted by all Platform and SIP services.

In a similar way, system metric collection can make use of the container metadata to work out
which component is consuming which resources. However, there are currently fewer clear
standard approaches to collecting application-generated metrics. Statsd was explored as a
possible Metrics Interface. This interface allowed the platform to use both OpenStack Monasca
and Prometheus to collect metrics with zero changes to the executing SIP prototype code.

11.4.4 Isolating different layers of Platform Services

The Platform Interfaces component is there to show that the platform services that communicate
with each other generally don't care how their peers are provisioned, only that the expected
interface is available.

The specific hardware vendor is abstracted away: an operating system with the appropriate
drivers installed is provided by Core Infrastructure Services. While baremetal provisioning is used
to provide that environment, composable hardware or virtualisation could be used instead
should the trade-offs between the different approaches change in the future.

Once the details of various differences between the Science Regional Centre and the Science

Processing Center become clear, these interfaces could be used to abstract the differences between

services that are available in the different locations. In a similar way, this could be used to help

migrate between different service implementations as things evolve over time.


11.4.5 Science Processing Centre and Science Regional Centre

There are two Science Processing Centres, one for SKA1-Low and one for SKA1-Mid, both
expected to run the same Platform Services. However, it is expected there will be some
configuration that is specific to each of those sites. Most simply, they will have different amounts
and types of hardware to match the different data rates associated with each telescope.
Similarly, it is expected there will be a relatively small pre-production environment used to test
versions of the Platform and SDP Operational System before rollout to either of the two
production systems.

In the two Science Processing Centres the platform will be used to manage and control the
hardware directly. When running in a Science Regional Centre there will likely be substantial
changes to Platform Services to adapt to working on top of a pre-existing API used to share the
hardware between multiple tenants, expected to be a cloud-like API and not a job submission
system. It is also possible a Science Regional Centre may directly provide container orchestration
engine instances, on which the SDP Operational System, its support services and Science Pipeline
Workflows can all execute. This mode of execution is also popular for development, as it keeps
developer laptops, test pipelines and production looking as similar as possible.

11.5 Rationale

11.5.1 Existing Architectures

The construction of Platform Services is greatly influenced by a collection of existing architectural

approaches described below.

11.5.1.1 Software Defined Infrastructure

Platform Services follows a now very common pattern of using configuration management tools to

treat Infrastructure as Code. This approach is enabled by the use of cloud technologies that allow the

manipulation of infrastructure via APIs, such as OpenStack.

Ansible was used extensively in prototyping, making considerable reuse of existing community
Ansible roles. In the same way that the Platform uses roles provided by OpenStack kolla-ansible
to install OpenStack, we expect the platform to use roles provided by the SDP Operational
System to install the SDP Operational System.

11.5.1.2 Container Orchestration

OpenStack Magnum is one of several certified Kubernetes installers [RD10.3]. It is becoming
more common to use a service to provision Kubernetes on demand. This is a particularly
common way of quickly creating a Kubernetes cluster within commercial public clouds, without
needing to know the exact infrastructure that clusters are deployed on.

The Kubernetes OpenStack cloud provider [RD10.8] provides a way to expand a cluster onto new

resources provisioned by OpenStack. However, this doesn’t meet the use case of the SDP

Operational System executing workflows. Here we have a relatively fixed amount of total capacity

that over time is shared differently between Real-time and Batch Processing, depending on the

needs of particular scheduling blocks.


Using containers helps support Science Pipeline Workflows evolving over time: updating image
versions allows software with competing dependencies to co-exist, thanks to encapsulation.

11.5.1.3 Baremetal Cloud

OpenStack Ironic is just one example of how physical machines can be deployed via an API in a very

similar way to VMs. The key advantage is the ability to have zero performance overheads, as you get

direct access to the physical server just as you would if it were deployed using more manual

methods.

The trade-off is that you no longer get features like live-migration and, in many cases,
snapshotting. You also don't get the security protections gained from disallowing direct access to
the physical server.

The baremetal cloud is particularly powerful when combined with container orchestration.

OpenStack Magnum can be combined with OpenStack Ironic to provide containers running directly

on the physical servers.

11.5.1.4 Operations as a Service

The Operations Management Interface makes use of the Operations as a Service approach,
perhaps the least widely adopted of all the approaches described here. It provides a way to
trigger high-level operations, such as entering or restoring from low-power mode, without the
user needing to know how that operation is implemented. The prototype used RunDeck [RD10.2]
to explore this approach.

The key advantage of this approach is to provide a defined set of operations that can be executed
when the consumer of those operations chooses. This is all done in a way that allows the tracking
of all changes to the system, no matter who triggered them.

11.5.2 Prototyping

The Performance Prototype Platform (P3-Alaska) [RD10.4] report details the prototyping work that

has helped inform and validate the approach described in this document. That document captures

more detailed information in a series of memos, namely:

● Core Infrastructure Services:

○ P3-Alaska OpenStack Prototyping [RD10.5]

● Container Orchestration Engine and Compute/Storage Provisioning:

○ P3-AlaSKA Container Orchestration and Compute Provisioning Interface [RD10.6]

○ Cloud Native Applications on the SDP Architecture [RD10.9]

● Logging and Metrics:

○ P3-AlaSKA Monitoring and Logging [RD10.7]

○ Monitoring and Logging for the SDP [RD10.10]

○ Apache Kafka for an SDP Log Based Architecture [RD10.11]

These memos provide significant evidence to demonstrate the use of standard off-the-shelf
software components, such as OpenStack, to provide the necessary functionality.


The SDP Platform Services C&C View discusses how the prototype work has informed both the

expected behaviour of the system at run time and the split of responsibilities between the various

runtime components of both Platform Services and the SDP Operational System.

11.5.3 Requirements

There are no requirements that dictate the code structure of Platform Services. For a detailed
look at the requirements that have influenced the design of Platform Services, please see the SDP
Platform Services C&C View.

11.6 Related Views

This view is a decomposition of the SDP System-Level Module Decomposition and Dependency View.

This view refers to other views:

● SDP Execution Control C&C View

● SDP Platform Services C&C View

● SDP Hardware Decomposition View

● SDP Buffer C&C View


11.7 References

11.7.1 Applicable Documents

The following documents are applicable to the extent stated herein. In the event of conflict between

the contents of the applicable documents and this document, the applicable documents shall take

precedence.

11.7.2 Reference Documents

The following documents are referenced in this document. In the event of conflict between the

contents of the referenced documents and this document, this document shall take precedence.

[RD10.1] SKA-TEL-SDP-0000137 SKA1 SDP Integration Prototype (SIP) Report

[RD10.2] https://www.rundeck.com/open-source

[RD10.3] https://landscape.cncf.io/grouping=landscape&landscape=certified-kubernetes-installer

[RD10.4] SKA-TEL-SDP-0000151 P3-Alaska Prototyping Report

[RD10.5] SKA-TEL-SDP-0000166 SDP Memo 069 P3-Alaska OpenStack Prototyping

[RD10.6] SKA-TEL-SDP-0000167 SDP Memo 070 P3-AlaSKA Container Orchestration and Compute Provisioning Interface

[RD10.7] SKA-TEL-SDP-0000165 SDP Memo 068 P3-AlaSKA Monitoring and Logging


[RD10.8] https://github.com/kubernetes/cloud-provider-openstack

[RD10.9] SKA-TEL-SDP-0000131 SDP Memo 051 - Cloud Native Applications on the SDP Architecture

[RD10.10] SKA-TEL-SDP-0000163 SDP Memo 052 - Apache Kafka for an SDP Log Based Architecture

[RD10.11] SKA-TEL-SDP-0000132 SDP Memo 053 - Monitoring and Logging for the SDP


12 Science Pipeline Workflows Modules

Contributors: P. Alexander, D. Fenech, P. Wortmann

12.1 Primary Representation

Figure 1: Primary representation of the Science Pipeline Workflows Module View

This is a decomposition of the System Module Decomposition View, so elements are units of
software. "Allowed-to-use" relationships are shown as arrows annotated with "use"; "is-part-of"
relations between modules are shown by nesting; "implements" indicates that a module realises
an interface module. This view overlaps with the Science Pipeline Workflow Scripts &
Deployment View, which provides more documentation on the workflow modules and illustrates
how they will be deployed at run time.

Science Pipeline Workflows (or workflows for short) describe the work that needs to be done to
execute all processing associated with a Processing Block. Workflows need to be easy to maintain
for people not well versed in the SDP architecture, and efficient to modify without impacting the
stability of the system (see Science Pipeline Management Use Case View). They are built on top of a


deeply layered set of modules that are meant to be decoupled as much as possible from the details

of workflow execution in the SDP architecture.

At the highest level, the workflow implements the Workflow Control Interface by providing
Control & Configuration Scripts that determine the resources to allocate, how to use them in
order to execute all required processing steps, and how to perform any required clean-up steps
afterwards. This information will be used by Processing Control code to make resource
assignment decisions, as well as to initiate and finish processing at a high level. See also the
description of real-time and off-line processing activity in the Operational C&C View, as well as
behaviour in the Workflow Scripts C&C View.

Workflow Control & Configuration scripts can share code using Workflow Control Libraries. This

should especially include helpers to produce typical setup and cleanup scripts, as these are likely

going to be very similar between most workflows. Quality Assessment provides similar high-level

shared workflow building blocks geared specifically towards assessing scientific performance of the

workflow.

Workflow Interfaces isolate workflow control scripts from the rest of the SDP architecture, both
to simplify development and to make the architecture robust against mistakes in workflow
scripts. Apart from the Workflow Control Interface, this also covers Workflow Service Interfaces,
which provide access to, for example, Buffer Services, Delivery and Platform Services. Via the
dependency on Data Models, this also covers interaction with standard SDP infrastructure, such
as Data Queues or the Configuration Database.

Furthermore, deployment of actual processing is going to use the Execution Framework Interface

implemented by Execution Frameworks. This should allow the workflow to parametrise and deploy

Execution Engine Programs on resources assigned to the Processing Block by Processing Control.

12.2 Element Catalogue

12.2.1 Elements and Their Properties

12.2.1.1 Processing Controller

Execution Control module responsible for managing Scheduling and Processing Blocks at the highest

level. Associated with Processing Blocks will be Processing Block Controllers, which execute the

workflow Control & Configuration scripts, see Execution Control C&C.

12.2.1.2 Science Pipeline Workflows

Library of workflows that can be executed on the SDP architecture. Represented as a series of

Workflow Control & Configuration Scripts.

This will include a way to obtain a list of workflows, to support workflow selection for the
purpose of observation planning. To that end, it should contain information about:

● Workflow characteristics - such as name, category, author, version, description.

● Any custom parameters a workflow will require - for example, the requested number of
major self-calibration loop iterations. This should include documentation and sensible
defaults where appropriate.


● Information about how the SDP resource model can be used to derive resource estimates
for the execution of the workflow in question. This will likely take the form of a piece of
code or a formula using primitives from the resource model.

● Testing and deployment information for Continuous Integration and Deployment (see Code

Management C&C View). Needs to declare the Execution Engine Program artefacts that need

to be provided in order to execute the workflow.
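A sketch of what one such registry entry could look like; every field name, value and the toy resource formula below are illustrative, not a defined SDP schema:

```python
# Hypothetical registry entry for an ICAL-style workflow.
ICAL_WORKFLOW = {
    "name": "ical",
    "category": "batch",
    "author": "SDP consortium",
    "version": "0.1.0",
    "description": "Iterative self-calibration imaging pipeline",
    # Custom parameters, with documentation and sensible defaults:
    "parameters": {
        "major_loops": {"default": 10,
                        "doc": "Number of major self-calibration loop iterations"},
    },
    # Resource estimation expressed as code over resource-model primitives:
    "estimate_resources": lambda visibilities: {
        "compute_nodes": max(1, visibilities // 10**9),
    },
    # Execution Engine Program artefacts CI/CD must provide before execution:
    "artefacts": ["ical-execution-engine:0.1.0"],
}

print(ICAL_WORKFLOW["estimate_resources"](5 * 10**9))  # {'compute_nodes': 5}
```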

12.2.1.3 Control & Configuration Scripts

Run as the Processing Block Controller to prepare for, execute and clean up the workflow
associated with a Processing Block. Its primary way of interacting with the SDP architecture is by
updating the SDP configuration information associated with its Processing Block; see the
Execution Control Data Model. This will especially include:

● Making resource requests to the Processing Controller

● Using assigned resources to deploy storage and processing infrastructure

● Coordinating services in preparation for and completion of processing

See Science Pipeline Workflow Scripts & Deployment View for more detail.
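The steps above can be sketched as follows; the key layout and the in-memory stand-in for the Configuration Database are invented for illustration and do not reflect a defined SDP schema:

```python
def run_processing_block(config_db, pb_id: str) -> None:
    """Hypothetical Control & Configuration script: all interaction with the
    rest of the SDP goes through the entry for this Processing Block."""
    # 1. Request resources from the Processing Controller.
    config_db.put(f"/pb/{pb_id}/resource_request", {"nodes": 4, "buffer_gb": 200})
    # 2. Use the assignment to deploy storage and processing infrastructure.
    assignment = config_db.get(f"/pb/{pb_id}/resources")
    config_db.put(f"/pb/{pb_id}/deployments/engine",
                  {"image": "ee:latest", **assignment})
    # 3. Coordinate finishing: publish products, then mark the block done.
    config_db.put(f"/pb/{pb_id}/state", "finished")

class FakeConfigDB(dict):
    """Stand-in for the Configuration Database, granting resources instantly."""
    def put(self, key, value): self[key] = value
    def get(self, key): return self.setdefault(key, {"nodes": 4})

db = FakeConfigDB()
run_processing_block(db, "pb-001")
print(db["/pb/pb-001/state"])  # finished
```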

12.2.1.4 Execution Engine Programs

Code associated with workflows that is going to get instantiated using Execution Frameworks to
execute processing. Depending on the type of processing required, this might be anything from a
simple bash script to a complex MPI or DALiuGE pipeline.

Again, see Science Pipeline Workflow Scripts & Deployment View for more discussion.

12.2.1.5 Workflow Control Libraries

Collects code shared between workflow Control & Configuration Scripts. The idea here is that
many workflows will likely have common patterns that can be maintained in a central location.
In particular, the set-up and clean-up phases of a workflow will likely involve running through
the same steps no matter the concrete pipelines (prepare inputs, then publish data products).
These steps are also often the most prone to error, so it makes sense to maintain them centrally.

Furthermore, many radio astronomy pipelines share similar overall themes - such as that of

self-calibration loops (see ICAL Workflow View) or distributed calibration consensus (see Model

Partition Workflow View). Where possible we might be able to capture such patterns as workflow

“skeletons” and streamline the development of future workflows utilising them.

12.2.1.6 Quality Assessment

Code repositories for workflow Control & Configuration Scripts to support runtime assessment of
the scientific performance of the workflow. Similarly to Workflow Control Libraries, collecting
this type of information from Processing Components and aggregating it will lead to
characteristic workflow patterns that will be to some degree independent from the rest of the
workflow's functionality - typically involving setting up Data Queues and some simple Execution
Engines to aggregate the data in a way that is useful to scientific operators (see Processing C&C
View).

Especially note that this will generate metrics that will be pushed out via Data Queues to the TANGO interface (see Execution Control C&C). This code will therefore need to correctly implement a part of the expected monitoring interface, which again strongly suggests lifting the details of metric aggregation out of the workflow code.
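A minimal sketch of what such lifted aggregation code might look like (names hypothetical; a Python `queue.Queue` stands in for a Data Queue): processing components push named metric samples, and a small aggregation step reduces them into a summary suitable for the monitoring interface.

```python
import queue

class MetricAggregator:
    # Hypothetical sketch of metric aggregation lifted out of workflow code:
    # processing components push (name, value) pairs onto a data queue, and
    # the aggregation step reduces them for the monitoring interface.
    def __init__(self):
        self.q = queue.Queue()

    def push(self, name, value):
        # Called from processing components as metrics become available.
        self.q.put((name, value))

    def aggregate(self):
        # Drain the queue and reduce samples to per-metric averages, as
        # might be pushed out towards the TANGO interface.
        samples = {}
        while not self.q.empty():
            name, value = self.q.get()
            samples.setdefault(name, []).append(value)
        return {name: sum(vs) / len(vs) for name, vs in samples.items()}
```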

12.2.1.7 Workflow Control Interface

Interface provided to Processing Control by workflows. Very light-weight, as the Science Pipeline Workflows module should already contain most information about the workflow (such as name, parameters and resource usage, see above). Furthermore, any parameters should be discovered by the workflow script using the Configuration Database, therefore the offered interface would just be that of running a script. We would expect the script interpreter to be deployed inside a container with access to the appropriate infrastructure, which then forms the Processing Block Controller (see Execution Control C&C).
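The "interface is just running a script" idea can be sketched as follows (a hypothetical illustration; dictionaries stand in for the Configuration Database and the workflow registry):

```python
def run_processing_block(config_db, pb_id, workflows):
    # Hypothetical Processing Block Controller logic: the only interface a
    # workflow offers is "run a script" -- the workflow name and its
    # parameters are discovered from the configuration database entry.
    pb = config_db[pb_id]                  # processing block entry
    script = workflows[pb["workflow"]]     # look up the workflow script
    return script(pb["parameters"])        # run it with its parameters
```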

12.2.1.8 Workflow Service Interfaces

Provides workflows with the ability to interact with SDP services and platform services. Most of this interaction will use the Configuration Database and Data Queues at run-time. This will include:

● Model Databases, e.g. to generate Science Data Models

● Buffer Management, e.g. to manage Data Islands

● Delivery, e.g. to publish Data Products

● Platform Services, e.g. to deploy Execution Engines

● Data Models, e.g. to access and interpret data from Data Queues

Note that the SDP architecture might already have modules that facilitate control of these parts of the SDP. In that case, the Workflow Service Interfaces might just be relatively thin wrapper code. However, note that part of the responsibility of this interface is to act as a (soft) filter for what we want to allow workflows to do within the architecture. Therefore internal interfaces should never be exposed entirely without careful consideration.
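One way such a "thin wrapper acting as a soft filter" could be realised is sketched below (hypothetical names; `BufferInterface` and its operation set are illustrative only): only a vetted subset of the underlying service's operations is forwarded to workflow code.

```python
class BufferInterface:
    # Hypothetical thin wrapper illustrating the "soft filter" role of the
    # Workflow Service Interfaces: only a vetted subset of the underlying
    # buffer-management operations is exposed to workflow code.
    ALLOWED = {"create_island", "free_island"}

    def __init__(self, buffer_service):
        self._svc = buffer_service

    def __getattr__(self, name):
        # Called only for attributes not defined on the wrapper itself;
        # forward allowed operations, reject everything else.
        if name not in self.ALLOWED:
            raise AttributeError(f"operation {name!r} not exposed to workflows")
        return getattr(self._svc, name)
```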

12.2.1.9 Execution Framework Interface

Used by workflows to instantiate Execution Engine Programs. Must define how the workflow can start, stop and monitor Execution Engine programs. Starting an Execution Framework might mean identifying and deploying the necessary containers, or submitting the program to existing infrastructure (see Deployment in Execution Control Data Model). Might be implemented as small pieces of workflow script specific to the Execution Framework.
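The start/stop/monitor contract could be captured in an abstract interface along the following lines (a hypothetical sketch; method names are illustrative, and the trivial local implementation runs programs synchronously purely for demonstration):

```python
from abc import ABC, abstractmethod

class ExecutionFramework(ABC):
    # Hypothetical sketch of the contract each Execution Framework adapter
    # would implement on behalf of workflows.
    @abstractmethod
    def start(self, program, deployment):
        """Deploy containers or submit the program to existing infrastructure."""

    @abstractmethod
    def stop(self, handle):
        """Tear down a running Execution Engine program."""

    @abstractmethod
    def status(self, handle):
        """Report whether the program is e.g. running, finished or unknown."""

class LocalFramework(ExecutionFramework):
    # Trivial in-process implementation, useful as a testing stand-in.
    def __init__(self):
        self._results = {}

    def start(self, program, deployment=None):
        handle = len(self._results)
        self._results[handle] = program()   # run synchronously for simplicity
        return handle

    def stop(self, handle):
        self._results.pop(handle, None)

    def status(self, handle):
        return "finished" if handle in self._results else "unknown"
```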

12.2.2 Relations and Their Properties

N/A

12.2.3 Element Interfaces

N/A

12.2.4 Element Behavior

N/A


12.3 Context Diagram

Figure 2: Workflow Module View Context

The context of this view is the System Module Decomposition View.

12.4 Variability Guide

Describing workflows as scripts leaves open a significant number of options for how the workflow could be represented in practice. With the help of workflow libraries, this could well be developed into a high-level domain-specific language (DSL), or an interpreter/adapter for a third-party workflow language such as the Common Workflow Language or Apache Airflow. This could especially open the door to using existing third-party tools for creating, testing and visualising workflows.


12.5 Rationale

12.5.1 Modifiability and Maintainability

The main rationale for the structure of the Workflow Modules is clearly modifiability and maintainability: it should be easy to add and change workflows, and SDP will need to maintain a large collection of workflows to implement all of the foreseen scientific use cases.

This leads to the deeply layered structure shown, with workflow Control & Configuration Scripts in particular heavily de-coupled from the rest of the SDP architecture: the Workflow Interfaces act as an intermediary for essentially all interactions.

12.5.2 Testability

Workflows are heavily de-coupled from the rest of the architecture, and at run time interact with it only via the Configuration Database and Data Queues. This means that they can easily be sand-boxed in a testing environment to establish correctness. This should especially allow testing failure scenarios.
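Because the only run-time touch points are the Configuration Database and Data Queues, both can be replaced by plain in-memory fakes in a test. A hypothetical sketch (the workflow, fake database and fake queue are all illustrative):

```python
def test_workflow_in_sandbox():
    # Hypothetical sand-boxed workflow test: a dict stands in for the
    # configuration database and a list for a data queue.
    fake_config = {"parameters": {"scale": 2}}
    fake_queue = []

    def workflow(config, out_queue):
        # A trivial workflow under test: read its parameters from the
        # (fake) configuration database and emit a metric to the queue.
        result = config["parameters"]["scale"] * 21
        out_queue.append(("result", result))
        return result

    assert workflow(fake_config, fake_queue) == 42
    assert fake_queue == [("result", 42)]
```

Failure scenarios can be exercised the same way, e.g. by handing the workflow a fake configuration entry that is missing expected parameters.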

12.5.3 Robustness

Isolating workflows from the rest of the SDP system also allows us to limit the impact of a faulty workflow on the rest of the system. As a workflow can generally only interact via declarative configuration changes (see Execution Control Data Model), it should always be possible to return to a workable state simply by rolling back deployments and starting the workflow again from scratch.

12.6 Related Views

This view is a decomposition of the System Module Decomposition View. Further information about scripts and deployments is provided in the Science Pipeline Workflow Scripts & Deployment View. The runtime behaviour and context of workflows are documented in the Execution Control C&C and Workflow Scripts C&C Views.

12.7 Reference Documents

The following documents are referenced in this document. In the event of conflict between the contents of the referenced documents and this document, this document shall take precedence.

(no reference documents so far)


13 Applicable Documents

The following documents are applicable to the extent stated herein. In the event of conflict between the contents of the applicable documents and this document, the applicable documents shall take precedence.

This list of applicable documents applies to the whole of the SDP Architecture.

[AD01] SKA-TEL-SKO-0000002 SKA1 System Baseline Design V2, Rev 03

[AD02] SKA-TEL-SKO-0000008 SKA1 Phase 1 System Requirement Specification, Rev 11

[AD03] SKA-TEL-SDP-0000033 SDP Requirements Specification and Compliance Matrix, Rev 02C

[AD04] SKA-TEL-SKO-0000307 SKA1 Operational Concept Documents, Rev 02

[AD05] 000-000000-010 SKA1 Control System Guidelines, Rev 01

[AD06] 100-000000-002 SKA1 LOW SDP to CSP ICD, Rev 04A

[AD07] 100-000000-025 SKA1 LOW SDP to SaDT ICD, Rev 04

[AD08] 100-000000-029 SKA1 LOW SDP to TM ICD, Rev 03B

[AD09] 100-000000-033 SKA1 LOW SDP to LFAA Interface Control Document (ICD), Rev 01

[AD10] 300-000000-002 SKA1 MID SDP to CSP ICD, Rev 04A

[AD11] 300-000000-025 SKA1 MID SDP to SaDT ICD, Rev 04

[AD12] 300-000000-029 SKA1 MID SDP to TM ICD, Rev 03B

[AD13] SKA-TEL-SKO-0000484 SKA1 SDP to INFRA-AUS and SKA SA Interface Control Document, Rev 02

[AD14] SKA-TEL-SKO-0000661 Fundamental SKA Software and Hardware Description Language Standards

[AD15] http://www.ivoa.net/documents/TAP/

[AD16] http://www.ivoa.net/documents/latest/SIA.html

[AD17] http://www.ivoa.net/documents/DataLink/

[AD18] http://www.ivoa.net/documents/SSA/

[AD19] Memorandum of Understanding between the SKA organisation and National Radio Astronomy Observatory relating to a work package for the study and design of a new data model for the CASA software package

[AD20] MeasurementSet definition version 3.0. MSv3 team, eds. 2018. http://casacore.github.io/casacore-notes/264

[AD22] Shibboleth Authentication Service from Internet2 https://www.internet2.edu/products-services/trust-identity/shibboleth/


[AD23] COmanage Authorization Service from Internet2 https://www.internet2.edu/products-services/trust-identity/comanage/

[AD24] SKA-TEL-SKO-0000990 SKA Software Verification and Testing Plan
