Document No: SKA-TEL-SDP-0000038 Unrestricted Revision: 01 Author: R. Bolton Release Date: 2015-02-09 Page 1 of 19

PDR 03.04 System sizing derived from outputs of the Parametric Model for the SDP.

Document number: SKA-TEL-SDP-0000038
Context: PROT.SS
Revision: 01
Author: Rosie Bolton
Release Date: 2015-02-09
Document Classification: Unrestricted
Status: Draft

| Name | Designation | Affiliation | Signature & Date |
|---|---|---|---|
| Rosie Bolton | SDP Project Scientist | University of Cambridge | Rosie Bolton (Feb 9, 2015) |
| Paul Alexander | SDP Project Lead | University of Cambridge | Paul Alexander (Feb 10, 2015) |

| Version | Date of Issue | Prepared by | Comments |
|---|---|---|---|
| 0.1 | | | |

ORGANISATION DETAILS

Name: Science Data Processor Consortium

1 Table of Contents

1 Table of Contents
2 List of Figures
3 List of Tables
4 Introduction
5 References
5.1 Applicable Documents
5.2 Reference Documents
6 Estimates of overall Compute Load
6.1 The Maximal Discovery cases
7 System performance to deliver High Priority Science Objectives
8 Limitations of the iPython implementation of the parametric model.
8.1 Known omissions:
8.2 Known Errors or incomplete sections of the iPython model:

2 List of Figures

There are no figures in this document.

3 List of Tables

Table 1 Computational requirements summary table.
Table 2: The Continuum Cases, specific parameters and overall Memory, I/O and FLOP needs.
Table 3: The Spectral Line Cases, specific parameters and overall Memory, I/O and FLOP needs.
Table 4: Overall compute load, buffer size and IO rate values as calculated by the iPython implementation of the parametric model for the SDP.
Table 5: High Priority Science Objective experiments: Compute LOAD on LOW
Table 6: High Priority Science Objective experiments: Compute load on MID
Table 7: High Priority Science Objective experiments: Compute LOAD on SURVEY
Table 8: List of areas of processing that are currently NOT INCLUDED in the iPython model.

4 Introduction

Here we present two separate estimates for the overall performance requirements of the SKA1 Science Data Processor. The first of these is a maximal discovery case, based on a system imaging incoming data at full spatial and spectral resolution and capable of keeping up with incoming data. We use this to size and cost the system.

The second estimate is sized to be able to deliver the maximum performance required of the High Priority Science Objectives (HPSOs) as presented in RD1.

These HPSOs are a list of key projects that typically require several thousand hours of telescope time each to complete; they include experiments to measure the power spectrum and to image the Epoch of Reionisation (using LOW); pulsar search and timing experiments using both MID and LOW; major continuum (full polarisation) surveys at around 1 and 10 GHz using MID and SURVEY, including covering the whole hemisphere to obtain magnetic field information via rotation measures; and a search for rapid single pulse transient sources using MID. There are several other experiments, but the previous list gives an idea of the variety. The shortest experiment takes 800 hrs to complete, whilst the longest requires 17,500 hrs.

The HPSOs were born out of the “Science Use Cases” presented in RD4. We have been provided with two reference documents: RD2 and RD3. The most relevant one to the HPSO analysis is RD3, the spreadsheet detailing the specifics of each of the experiments in terms of time per pointing, hours spent in total, frequency range, number of channels required at output and, for the imaging experiments, the required spatial resolution (which we use to determine the maximum baseline used) and the total observed area for the experiment (essential for derivation of archive requirements; see RD5). These parameters enable us to re-run the performance model of AD1 & AD2 and obtain compute load estimates for the imaging experiments.

For all imaging experiments we assume that each field is imaged out to the second null of the primary beam during the continuum pipeline, regardless of what the final field of view will be.

In our costings for this PDR submission we have chosen to use the maximal discovery case for each of the three telescopes to estimate hardware costs. The specification of each telescope is that of the baseline design prior to rebaselining: the maximal case will change as a consequence of rebaselining. In contrast, the HPSO experiment numbers do not push the system to its maximum limits in multiple dimensions, and so it may be expected that the HPSO performance requirements are more robust to the rebaselining process. We include them as a useful measure of the performance required of the SKA1 SDP in order to be capable of delivering these world-class science products.

This paper presents the results as calculated by the model described in PDR.05 (AD1) and implemented in the iPython notebook (see AD2).

5 References

5.1 Applicable Documents

The following documents are applicable to the extent stated herein. In the event of conflict between the contents of the applicable documents and this document, the applicable documents shall take precedence.

Reference Number Reference

AD1 PDR.05 Parametric Models of SDP Compute Requirements

AD2 PDR05.01 The SDP Performance Model implemented in IPython

AD3 PDR02 Pipelines sub-element design document

5.2 Reference Documents

The following documents are referenced in this document. In the event of conflict between the contents of the referenced documents and this document, this document shall take precedence.

Reference Number Reference

RD1 SKAO “Baseline design” document

RD2 SKAO Five years in the life document

RD3 Excel spreadsheet of HPSO experiments

RD4 Science Use Cases for SKA1

RD5 PDR01.03 “SDP archive size estimates”

6 Estimates of overall Compute Load

In our parametric model document, PDR05 (AD1), we have described our parametric model for the overall imaging compute load, i.e. the sustained number of floating point operations per second that the system must actually deliver. The loads quoted do not include any (in)efficiency factors, so the system built will need to be larger than this.
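As a minimal sketch of what the missing (in)efficiency factor implies for sizing, the peak capacity to be built can be taken as the sustained load divided by the fraction of peak that the hardware actually achieves. The function name and the efficiency values below are hypothetical and purely illustrative; this document does not quote an efficiency.

```python
def required_peak_pflops(sustained_pflops, efficiency):
    """Peak capacity needed to deliver a given sustained load.

    `efficiency` is the fraction of the peak FLOP rate actually achieved
    (0 < efficiency <= 1). Its value is an illustrative assumption here.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    return sustained_pflops / efficiency


# Illustrative only: a 10 PFLOP sustained load at assumed efficiencies.
for eff in (1.0, 0.5, 0.25):
    print(f"efficiency {eff:.2f}: peak {required_peak_pflops(10.0, eff):.1f} PFLOPs")
```

The point of the sketch is only that the built system scales inversely with achieved efficiency; a realistic value would have to come from benchmarking of the actual architecture.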

Here we take the iPython implementation of this model and apply the parameters for each instrument to estimate the required system sizing for the SDP. Firstly we size the system to the full “baseline design” (see RD1); secondly we apply parameters relevant for each of the imaging experiments in the High Priority Science Objectives (HPSOs) as laid out in RD2 and RD3.

The model itself has limitations of a scientific nature: these are described in PDR05 (AD1), but there are also some shortcomings of the iPython implementation, where sections of the PDR05 model have evolved at pace and the iPython model has yet to catch up. We describe these in Section 8.

6.1 The Maximal Discovery cases

The following tables provide the input parameters from which the baseline requirements for costing are determined for the maximal discovery cases for each instrument. Please refer to RD1 for a full description of the parameters (we refer to the relevant section of RD1 in the table).

| Parameter | Symbol | SKA1-Low | SKA1-Mid (B1) | SKA1-Survey (B1) | Justification (refer to PDR05 document section / L1 req) |
|---|---|---|---|---|---|
| Telescope parameters | | | | | |
| Maximum Baseline (km) | Bmax | 100 | 200 | 50 | L1 requirements / RD1 |
| Number of Frequency Channels from Correlator | Nf,corr | 256000 | 256000 | 256000 | L1 requirements / RD1 |
| Number of stations/antennas | Na | 1024 | 256 | 96 | L1 requirements / RD1 |
| Number of beams | Nbeam | 1 | 1 | 36 | (NB only 1 beam for LOW still) |
| Antenna/Station Diameter (m) | Ds | 35 | 15¹ | 15² | L1 requirements / RD1 |
| Correlator Dump Time (s) | Tdump | 0.6 | 0.08 | 0.3 | L1 requirements / RD1 |
| Number of polarisation products | Npp | 4 | 4 | 4 | L1 requirements / RD1 |
| Parameters for both continuum and spectroscopic case | | | | | |
| Image-plane oversampling factor (in units of Nyquist sampling) | Qpix | 2.5 | 2.5 | 2.5 | 5 pixels per synthesized beam. PDR05 Section 12.11 |
| Amplitude level of w-kernels to include | | 0.01 | 0.01 | 0.01 | PDR05 section 12.5 |
| Linear size of A-kernel (pixels) | Naa | 9 | 9 | 9 | C.f. PDR05 section 12.5.2, Eq 27. These equations might typically give 10 pixels rather than 9. |
| Ionospheric timescale (s) | Tion | 60 | 60 | 60 | This is the assumed update rate of the gridding kernel – i.e. the maximum length of time that it can be re-used for. See Tupdate in PDR05 Section 12.8. |
| Oversampling factor of w-convolution kernels | QGCF | 8 | 8 | 8 | PDR05 Section 12.5. |
| Number of adjacent frequency channels for which baseline dependent kernels can be reused | Nfcv | 10 | 10 | 10 | Baseline-dependent kernels are reused for 10 adjacent frequency channels. PDR05 section 12.8. (Note that this value lacks formal justification.) |
| Full Mueller Matrix gridding factor | Nmm | 4 | 4 | 4 | Nmm=4 is the maximum possible. (PDR05 section 12.8) |
| Maximum observing wavelength in band (m) | λmax | 6 | 0.857 | 0.86 | L1 requirements, RD1, Band definitions |
| Minimum observing wavelength (m) | λmin | 0.857 | 0.286 | 0.33 | L1 requirements, RD1, Band definitions |
| Size of visibility datum in the buffer (Bytes) | Mvis | 12 | 12 | 12 | See PDR Section 12.15.1 |

Table 1 Computational requirements summary table.

¹ Some of the dishes comprising SKA1-Mid will be from MeerKAT with diameter 13.5m. This smaller diameter is not taken into account in the analysis presented here and all dishes are assumed to have 15m diameter.

² Some of the dishes comprising SKA1-Survey will be from ASKAP with diameter 12m. This smaller diameter is not taken into account in the analysis presented here and all dishes are assumed to have 15m diameter.

Table 2: The Continuum Cases, specific parameters and overall Memory, I/O and FLOP needs.

| Parameter | Symbol | SKA1-Low | SKA1-Mid (B1) | SKA1-Survey (B1) | Justification |
|---|---|---|---|---|---|
| Continuum specific Parameters | | | | | |
| Number of Major Cycles | Nmajor | 10 | 10 | 10 | Previous estimates were based on this value. Difficult to justify before commissioning of the telescopes. See PDR05 sections 12 and 15.1 |
| Diameter of imaging field of view (in units of diameter to first zero) | QFoV | 1.8 | 1.8 | 1.8 | See PDR05 section 9.2 |
| Number of output frequency channels | Nf,out | 500 | 500 | 500 | See PDR05 section 12.8 |
| Continuum Results | | | | | |
| Total FLOP Requirement (PetaFLOPs) | RcontFLOP | 2.8 | 4.3 | 3.9 | |
| Snapshot Duration (s) | Tsnap | 87 | 1132 | 1907 | See PDR section 12.8 |
| Nfacet | Nfacet | 3 | 13 | 7 | See PDR05 12.12; RFlop is minimized by allowing Nfacet and Tsnap to vary |
| Target grid size (1D pixels) | | 13,933 | 15,005 | 6,966 | Refer to section of PDR05; redo numbers to account for faceting |
| Visibility Buffer Size (PetaByte) | Mbufvis | 0.9 | 0.12 | 0.19 | This is after averaging visibilities in time and frequency in a baseline dependent fashion using a binned baseline distribution as described in PDR05.01. |
| Visibility Buffer Bandwidth (TeraByte/s) | Rio | 1.9 | 4.5 | 2.1 | |

| Parameter | Symbol | SKA1-Low | SKA1-Mid (B1) | SKA1-Survey (B1) | Justification |
|---|---|---|---|---|---|
| Spectral Line Specific Parameters | | | | | |
| Number of Major Cycles | Nmajor | 1 | 1 | 1 | In PDR05 section 12.9, Equation 36 Nmaj is effectively set at 1.5. This is a new development but an important underestimate in the modeled output. |
| Diameter of imaging field of view (in units of diameter to first zero) | QFoV | 1 | 1 | 1 | FoV to extend to the first zero of Airy function (PDR05 Sec 9.2, Equation 6) |
| Number of output channels | Nf,out | 256,000 | 256,000 | 256,000 | L1 requirements / RD1 |
| Spectral Line Results | | | | | |
| Total FLOP Requirement (PetaFLOPs) | RspecFLOP | 21.6 | 40.6 | 57.2 | |
| Snapshot Duration (s) | Tsnap | 118 | 1491 | 1817 | |
| Number of facets (1D) | Nfacet | 1 | 5 | 3 | RFlop is minimized by allowing Nfacet and Tsnap to vary |
| Target grid size (1D pixels) | | 23,221 | 21,673 | 9,031 | |
| Visibility Buffer Size (PetaByte) | Mbufvis | 256 | 29.7 | 87.5 | This is after averaging visibilities in time in a baseline dependent fashion using a binned baseline distribution as described in PDR05.01. |
| Visibility Buffer Bandwidth (TeraByte/s) | Rio | 5.4 | 17 | 18 | NB. Since there is only one major cycle these are significantly changed. |

Table 3: The Spectral Line Cases, specific parameters and overall Memory, I/O and FLOP needs.

| Parameter | Symbol | SKA1-Low | SKA1-Mid (B1) | SKA1-Survey (B1) | Justification |
|---|---|---|---|---|---|
| Fast Imaging Parameters | | | | | |
| Number of output channels | Nf,out | 500 | 500 | 500 | To achieve 1MHz output frequency resolution |
| Image-plane oversampling factor (in units of Nyquist sampling) | Qpix | 1.5 | 1.5 | 1.5 | PDR05 Section 9.2, Equation 7. Qpix=1.5 is consistent with the CASA Imager task. |
| Diameter of imaging field of view (in units of diameter to first zero) | QFoV | 0.9 | 0.9 | 0.9 | Image only the primary beam. PDR05 Section 9.2 |
| Timescale for images (after discussion with Transients SWG) | T(snap) (and Tobs) | 1.2s | 1.2s | 1.2s | Transients SWG suggested a 1s timescale for “standard” observations; we choose 1.2s as it is 2xTdump for SKA1 LOW |
| Fast Imaging Results | | | | | |
| Number of facets (1D) | Nfacet | 2 | 27 | 18 | RFlop is minimized by allowing Nfacet to vary |
| Total FLOP Requirement (PetaFLOPs) | RfastFLOP | 0.35 | 6.7 | 10.7 | |

The total compute load and overall buffer-to-processors I/O rate for each instrument are given by the sum of values for the three pipelines (slow transients, continuum, and spectral line). The uv grid memory and convolution kernel cache sizes are taken to be the maximum from the three pipelines, whilst the total buffer size required to store the visibility data after ingest (for these maximal cases) is defined by the spectral line mode.

Table 4: Overall compute load, buffer size and IO rate values as calculated by the iPython implementation of the parametric model for the SDP. Numbers are for pseudo-real time processing at the full spatial and spectral resolution, generating a continuum cube and a spectral line cube once per 6 hour observation and additionally a fast-imaging cube every 1.2 seconds. Faceting and snapshot timescale have been optimised independently for each pipeline (where appropriate) to minimise the total FLOP rate.

| Parameter | Symbol | SKA1-Low | SKA1-Mid (B1) | SKA1-Survey (B1) | Notes |
|---|---|---|---|---|---|
| Total Performance requirements, maximal imaging case | | | | | |
| Total Compute load (PetaFLOPs) | RTOTALFLOP | 25 | 52 | 72 | |
| Maximum target grid size (1D pixels) | | 23,221 | 21,673 | 9,031 | |
| Visibility Buffer Size (PetaByte) | Mbufvis | 256 | 29.7 | 87.5 | Set by spectral line case. |
| Visibility Buffer Bandwidth (TeraByte/s) | Rio | 7.4 | 23 | 22 | Sum of all pipelines |
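The summation behind the totals can be checked with a short sketch. It uses only the per-pipeline FLOP requirements and visibility buffer sizes quoted in the continuum, spectral line and fast imaging results; the dictionary layout and variable names are illustrative, and taking the maximum buffer is our reading of "set by spectral line case" rather than code from the iPython model.

```python
# Per-pipeline FLOP requirements (PetaFLOPs) quoted in the results tables,
# in the order (SKA1-Low, SKA1-Mid, SKA1-Survey).
flop = {
    "continuum": (2.8, 4.3, 3.9),
    "spectral_line": (21.6, 40.6, 57.2),
    "fast_imaging": (0.35, 6.7, 10.7),
}
# Visibility buffer sizes (PetaByte) for the two pipelines that quote one.
buffer_pb = {
    "continuum": (0.9, 0.12, 0.19),
    "spectral_line": (256, 29.7, 87.5),
}
telescopes = ("SKA1-Low", "SKA1-Mid", "SKA1-Survey")

# Total compute load: sum over the three pipelines for each telescope.
total_flop = [sum(values) for values in zip(*flop.values())]
# Buffer size: the largest single requirement (the spectral line case).
total_buffer = [max(values) for values in zip(*buffer_pb.values())]

for name, f, b in zip(telescopes, total_flop, total_buffer):
    print(f"{name}: {f:.2f} PFLOPs (rounds to {round(f)}), buffer {b} PB")
```

The rounded sums reproduce the quoted totals of 25, 52 and 72 PetaFLOPs. The I/O totals cannot be checked the same way from this document, since no fast imaging bandwidth is listed.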

7 System performance to deliver High Priority Science Objectives

We have split the 13 HPSOs described in RD2 and RD3 into 21 different observational programmes to be undertaken with SKA1. We assume that the imaging experiments here all require a continuum pipeline to be run first; then, for some experiments, a spectral line pipeline is also required to produce the final cube. Parameters driving these two pipelines are taken from the tables above, the only differences being the frequency range, the number of output channels required (for continuum and spectral line modes) and the maximum baseline needed (derived from the stated required spatial resolution).

We have not included the NIP experiments: the bulk of the NIP processing will take place in the Central Signal Processor (CSP); the SDP load is expected to be much lower for the NIP experiments than it will be for a typical imaging experiment (though some capacity to handle real-time calibration will also be required). As such, since we are interested in the highest compute load required to deliver HPSO experiments, we are essentially ignoring the NIP cases.

None of the experiments in the HPSO list is a fast imaging experiment, and we do not assume that the fast imaging pipeline is also run on the data; thus the derived compute loads are specific only to the HPSO experiments and not any other use that the scientific community might like to make of the same data.

| LOW | Notes | Total Time (hr) | Mode | Fmin-Fmax | Nchan, out | Bmax | Tpoint | Rflop |
|---|---|---|---|---|---|---|---|---|
| 1 | EoR Imaging | 2500 | Imaging | 50-200MHz | 1500 | 100km | 1000hr | 2.5 PFLOP (1 beam) (x2 with 2 beams) |
| 2a | EoR Imaging and power spectrum | 2500 | Imaging | 50-200MHz | 1500 | 100km | 100hr | 2.5 PFLOP (1 beam) (x2 with 2 beams) |
| 2b | EoR Imaging and power spectrum | 2500 | Imaging | 50-200MHz | 1500 | 100km | 10hr | 2.5 PFLOP (1 beam) (x2 with 2 beams) |
| 4c | Pulsar Search | 12800 | NIP | | | | | <<1 PFLOP assumed |
| 5c | Pulsar Timing | 4300 | NIP | | | | | <<1 PFLOP assumed |
| HPSO maximum FLOP requirement | | | | | | | | 2.5 PFLOP |

Table 5: High Priority Science Objective experiments: Compute LOAD on LOW

As can be seen from Table 5, the HPSOs using SKA1 LOW do not push the system to its limits (though note that the compute load estimated for the EoR does not include any power spectrum analysis). The three EoR experiments are currently only required to produce 1500 channels in the output, so the compute loads derived are roughly what is required for a simple continuum observation. The HPSOs suggest using two beams rather than one for this experiment, and we note that if multiple beams are enabled on SKA1 LOW then the FLOP rate required for these experiments would double.

In the HPSO scheduling (RD2, RD3), SKA1 LOW is assigned 17,100 hours doing NIP experiments and 7,500 hours doing EoR experiments (and no other imaging experiments on LOW are included in the HPSOs). Thus there is a potential trade-off to make: the average compute load is much lower than the EoR experiment load, so if a larger buffer were built, an SDP with a smaller total compute capacity could be used, which would reduce and smooth out power loading.
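A rough sketch of this trade-off is to compare the peak load with the time-weighted average over the scheduled hours. The hours and the 2.5 PFLOP EoR load are taken from the text and Table 5; treating the NIP load as exactly zero is an assumption made here (the table only states "<<1 PFLOP assumed"), and the sketch ignores buffer sizing and scheduling order entirely.

```python
# Hours assigned on SKA1 LOW in the HPSO schedule, as quoted in the text.
nip_hours = 17_100      # non-imaging (pulsar) experiments
eor_hours = 7_500       # EoR imaging experiments

eor_load_pflop = 2.5    # Table 5 value for one beam
nip_load_pflop = 0.0    # ASSUMPTION: "<<1 PFLOP" approximated as zero

total_hours = nip_hours + eor_hours
average_load = (
    nip_hours * nip_load_pflop + eor_hours * eor_load_pflop
) / total_hours

print(f"peak load    : {eor_load_pflop:.2f} PFLOP")
print(f"average load : {average_load:.2f} PFLOP")
print(f"EoR share of scheduled time: {eor_hours / total_hours:.0%}")
```

Under these assumptions the average is below a third of the peak, which is the headroom a larger buffer would allow the compute capacity to be reduced towards.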

| MID | Notes | Total Time (hr) | Mode | Fmin-Fmax | Nchan, out | Bmax | Tpoint | Rflop |
|---|---|---|---|---|---|---|---|---|
| 4a | Pulsar Search a | 800 | NIP | 650-950 MHz | | | | <<1 PFLOP assumed |
| 4b | Pulsar Search b | 800 | NIP | 1250-1550 MHz | | | | <<1 PFLOP assumed |
| 5a | Pulsar Timing | 1600 | NIP | 950-1760 MHz | | | | <<1 PFLOP assumed |
| 5b | Pulsar Timing | 1600 | NIP | 1650-3050 MHz | | | | <<1 PFLOP assumed |
| 14 | HI | 2000 | Spectral line Imaging | 1300-1400 MHz | 5000 (+500 continuum) | 24km according to resolution requirement. 200km assumed here | 10hrs | 3.0 PFLOP (assuming image full band in continuum out to 200km) |
| 19 | Transients | 10,000 | NIP | 650-950 MHz | | | | <<1 PFLOP assumed |
| 22 | Cradle of Life | 6000 | Imaging | 10-12 GHz | 5000 (as continuum) | 200km | 600hrs | 2.5 PFLOP |
| 37a | Continuum | 2000 | Imaging | 1.0-1.7 GHz | 700 (as continuum) | 200km | 95hrs | 2.5 PFLOP |
| 37b | Continuum | 2000 | Imaging | 1.0-1.7 GHz | 700 (as continuum) | 200km | 2000hrs | 2.5 PFLOP |
| 38a | Continuum | 1000 | Imaging | 7-11 GHz | 1000 (as continuum) | 200km | 16.4hrs | 2 PFLOP |
| 38b | Continuum | 1000 | Imaging | 7-11 GHz | 1000 (as continuum) | 200km | 1000hrs | 2 PFLOP |
| HPSO maximum FLOP requirement | | | | | | | | 3.0 PFLOP |

Table 6: High Priority Science Objective experiments: Compute load on MID

Table 6 shows that the worst-case experiment with SKA1 MID proposed in the HPSOs requires only 3 PFLOPs to process in (pseudo) real time, far lower than the maximal discovery compute load. This is essentially because none of the HPSO experiments needs the full 256,000 channel output (the largest number of output channels is 5,500 here); the numbers are very simply scaled down by the ratio of the numbers of channels, so using only O(5,000) channels will require only 5000/256000 = 2% of the spectral line compute load³. The total load is therefore dominated by the continuum pipeline, and we have been conservative here and assumed that the continuum pipeline considers the full bandwidth.
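The channel scaling can be written out as a short sketch. It applies the linear ratio described above to the maximal SKA1-Mid spectral line requirement of 40.6 PetaFLOPs; the function name is hypothetical, and the linear scaling is only the simple approximation stated in the text, valid while output channels stay narrow enough to avoid bandwidth smearing.

```python
def scaled_spectral_load(maximal_load_pflops, n_chan_out, n_chan_max=256_000):
    """Scale a maximal spectral line load linearly with output channels."""
    return maximal_load_pflops * n_chan_out / n_chan_max


# Maximal SKA1-Mid spectral line requirement quoted earlier (PetaFLOPs).
mid_spectral_maximal = 40.6

fraction = 5_000 / 256_000
load = scaled_spectral_load(mid_spectral_maximal, 5_000)

print(f"channel fraction          : {fraction:.1%}")
print(f"scaled spectral line load : {load:.2f} PFLOPs")
```

The scaled spectral line contribution comes out below 1 PetaFLOP, which is consistent with the statement that the continuum pipeline dominates the HPSO totals on MID.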

| SURVEY | Notes | Total Time (hr) | Mode | Fmin-Fmax | Nchan, out | Bmax | Tpoint | Rflop |
|---|---|---|---|---|---|---|---|---|
| 13 | HI, limited BW | 5000 | Imaging | 790-950 MHz | 3200 (+500 continuum) | 40km | 2500 | 3.6 PFLOP |
| 15 | HI, nearby, low spatial resolution | 5000 | Imaging | 1415-1425 MHz | 2500 | 13km | 2500 | 1 PFLOP |
| 27 + 33⁴ | RM survey, with continuum survey | 17500 | Imaging | 1000-1500 MHz | 500 | 50km | 10 | 3.0 PFLOP |
| 35 | Autocorrelation “HI Intensity Mapping survey of 30,000 square degrees covering redshift z=0.2 to 3 (nu=350 - 1200 MHz).” | 5500 | Autocorrelation | 650-1150 MHz | | | 3.3 | We have been given no guidance as to how to proceed with estimations of the compute load for this ambitious experiment. |
| 37c | Band 2 Continuum | 5300 | Imaging | 1000-1500 MHz | 500 | 50km | 95 | 3.0 PFLOP |
| HPSO maximum FLOP requirement | | | | | | | | 3.6 PFLOP⁵ |

Table 7: High Priority Science Objective experiments: Compute LOAD on SURVEY

³ Note that this scaling only applies if the output channels remain narrow enough to avoid bandwidth smearing; this is the case for the spectral line HPSOs using MID and SURVEY because they search only a very narrow redshift (and hence frequency) range.

⁴ Note that these two experiments are listed separately in the HPSO schedule document but have identical observing parameters apart from the fact that the polarisation survey for cosmic magnetism requires much more time on the sky. The continuum (total intensity) survey (HPSO33) comes “for free” at no additional cost whilst conducting the polarisation survey (HPSO27). HPSO33 requires 2500 hours to complete whilst HPSO27 needs 17500 hours.

The same is true of the HPSO experiments proposing to use the SURVEY instrument (Table 7): the highest FLOP requirement here is 3.6 PFLOPS (notwithstanding that we do not have an estimate of the compute load for the autocorrelation experiments), very much lower than the O(50) PFLOPS maximal case.

In conclusion, then, the HPSO experiments need a maximum of around 3-5 PFLOPS to process, very much lower than the O(25, 50, 75) PFLOPS needed to cope with the maximal imaging loads of LOW, MID and SURVEY.

8 Limitations of the iPython implementation of the parametric model.

The iPython model is introduced on a technical basis in PDR05.01, and the theoretical framework it seeks to implement is described in PDR05. However, this is on-going work and there are some areas where the assumptions (made in order to produce a compute load estimate in time for PDR) have been made without a view of the full system in mind. There are other areas where we acknowledge that the iPython model is not correct, i.e. the theoretical model has developed and the iPython implementation has yet to catch up.

8.1 Known omissions:

In the model we do include FLOP estimates for Gridding and De-gridding (see the relevant sections and equations of PDR05), FFT, re-projection and phase rotation. We do not include the things listed in the table below.

| Pipeline | Component | PDR document reference | Estimated impact of omission |
|---|---|---|---|
| Non Imaging | Single Pulse Search | PDR02 Pipelines sub-element design document (AD3) section 9.10; PDR05 (AD1) Section 14 | Zero – will not drive limiting case |
| Non Imaging | Periodic Search | AD3 section 9.8; PDR05 Section 14 | Zero – will not drive limiting case |
| Non Imaging | Pulsar Timing | AD3 section 9.9; PDR05 Section 14 | Zero – will not drive limiting case |
| Ingest | All (e.g. RFI flagging, averaging, spatial filtering, demixing) | AD3 section 8.1 | Unsure |
| EoR Power Spectrum | Power Spectrum estimation | (Not described) | Unsure. Could be an additional load alongside EoR imaging pipelines |
| Imaging | Drift Scan Imaging | Part of A term description; see AD3, section 8.2 | This could have a significant impact (raising the load) where used |
| Imaging | Minor Cycle Clean | AD3, section 8.2 and PDR05 Section 12.11 | Expected to be small. |
| Calibration | | AD3, section 9.4, PDR05 Section 11 | Could be very significant, possibly of same order as imaging cost. |
| Co-addition of multiple image cubes | | (not described in detail) | Small. Image based weighting and adding will not be done many times and should not be a significant compute load. |

Table 8: List of areas of processing that are currently NOT INCLUDED in the iPython model.

8.2 Known Errors or incomplete sections of the iPython model:

Areas where effects are included but where the implementation requires updating or further analysis and checking:

1. Calculation of Npix: In the iPython model we calculate the number of pixels on a side as the ratio of the field of view to the pixel size. However, both the field of view and the pixel size are calculated at the same fiducial wavelength. If pixels are to be matched on the sky across a range of wavelengths, then the field of view should be defined at the maximum wavelength but the pixel size at the minimum. This can have a large impact on the total grid size (a factor of 7 in the number of pixels on a side for SKA1 LOW spanning 50-350 MHz), unless sub-bands are used.
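The effect described in item 1 can be sketched numerically. This is a minimal illustration, not the iPython model itself: it assumes the field of view scales as wavelength / station diameter and the pixel size as wavelength / (oversampling x longest baseline), and the station diameter, longest baseline and oversampling factor are hypothetical placeholder values that cancel out of the ratio.

```python
C = 299_792_458.0  # speed of light in m/s


def npix_side(freq_fov_hz, freq_pix_hz, d_station_m=35.0, b_max_m=100e3, oversample=3.0):
    """Pixels on a side = field of view / pixel size.

    The field of view and the pixel size are each evaluated at their own
    frequency. The default geometry values are illustrative assumptions only.
    """
    fov_rad = (C / freq_fov_hz) / d_station_m              # FoV ~ lambda / D
    pix_rad = (C / freq_pix_hz) / (oversample * b_max_m)   # pixel ~ lambda / (q * Bmax)
    return fov_rad / pix_rad


f_min, f_max = 50e6, 350e6  # band edges quoted for SKA1 LOW

# Both quantities at one fiducial wavelength (what the model currently does).
fiducial = npix_side(f_min, f_min)

# Field of view at the maximum wavelength (f_min), pixel at the minimum (f_max).
matched = npix_side(f_min, f_max)

ratio_side = matched / fiducial   # growth in pixels on a side
ratio_area = ratio_side ** 2      # growth in pixel count of a square grid

print(f"pixels on a side grow by {ratio_side:.1f}x, pixel count by {ratio_area:.1f}x")
```

Because both angular scales are proportional to wavelength, the fiducial result does not depend on which single wavelength is chosen; the growth comes entirely from the ratio of the band edges.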

2. Nmaj for spectral line pipeline: In PDR05 section 12.9, Equation 36, Nmaj is effectively set at 1.5. This is a new development, but it implies an important underestimate in the modelled output. Since the overall results for the maximal case are dominated by the spectral line pipeline, this amounts to our assumed compute load numbers being low by a factor of 2/3, i.e. we should be reporting numbers 50% higher. Whilst we acknowledge that this error is quite severe, given the other uncertainties around the system design it is not important in the context of this PDR submission, since an upscaling by 50% could not give rise to any fundamental changes in architecture.
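The arithmetic behind the "factor of 2/3" and "50% higher" statements in item 2 can be checked in a few lines. This sketch rests on two assumptions that are not stated explicitly in the text: that the modelled results correspond to an effective Nmaj of 1 (inferred from the quoted factor), and that the spectral line compute load scales linearly with Nmaj.

```python
# Assumption: effective Nmaj behind the reported numbers (inferred, not stated).
nmaj_model = 1.0
# Effective value from PDR05 section 12.9, Equation 36, as quoted in the text.
nmaj_pdr05 = 1.5

# Assumption: compute load is proportional to Nmaj.
reported_fraction = nmaj_model / nmaj_pdr05      # reported load / corrected load
required_uplift = nmaj_pdr05 / nmaj_model - 1.0  # fractional increase needed

print(f"reported numbers are {reported_fraction:.3f} of the corrected load")
print(f"corrected load is {required_uplift:.0%} higher than reported")
```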


3. IO Rate: In the iPython model we use:

Rio = Mvis * Nmajor * Nbeam * Npp * Nvis * Nfacet**2

whereas PDR05 suggests a factor of (Nmajor+1) and does not include the number of beams or the effect of faceting (though we believe that our implementation of Nfacet is correct).
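To make the difference in item 3 concrete, the sketch below evaluates the model's expression next to one reading of the PDR05 form. The second function is an inference from the description above (a factor of Nmajor+1, no beam or facet terms), not a transcription of PDR05, and every parameter value is a hypothetical placeholder.

```python
def rio_model(mvis, nmajor, nbeam, npp, nvis, nfacet):
    """IO rate as written in the iPython model."""
    return mvis * nmajor * nbeam * npp * nvis * nfacet**2


def rio_pdr05_reading(mvis, nmajor, npp, nvis):
    """One reading of the PDR05 form: (Nmajor + 1), no Nbeam or Nfacet terms.

    Assumption: inferred from the text, not copied from PDR05 itself.
    """
    return mvis * (nmajor + 1) * npp * nvis


# Hypothetical placeholder values, chosen only to show the scaling.
params = dict(mvis=12.0, nmajor=10, npp=4, nvis=1.0e9)

single = rio_model(nbeam=1, nfacet=1, **params)
faceted = rio_model(nbeam=1, nfacet=3, **params)
pdr05 = rio_pdr05_reading(**params)

ratio_single = single / pdr05    # Nmajor / (Nmajor + 1) for one beam, one facet
ratio_faceted = faceted / pdr05  # additionally scaled by Nfacet**2

print(f"model / PDR05 reading: {ratio_single:.3f} (1 facet), {ratio_faceted:.3f} (3x3 facets)")
```

For a single beam and a single facet the two forms differ only by Nmajor / (Nmajor + 1), so the discrepancy matters most when Nmajor is small or when beams and facets are in use.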

4. Naa: The size (in pixels) assumed for the A-term in the convolution kernel is slightly

discrepant between the model and the description in PDR05 Section 12.5.2. Our output

numbers assume Naa=9, whilst the value should be derived from an equation (PDR05

12.5.2, Eq 27). We believe that the “correct” value to use is about 10, which is very close

to what we have included, but this is an area to improve on in future work.

5. Baseline dependence: this has been assumed for all appropriate steps in the imaging pipeline after ingest, with baseline-dependent time averaging in the buffer and baseline-dependent frequency averaging at gridding for continuum.

a. Time averaging: We only estimate the averaging time as the ratio of the longest

baseline in the array to the baseline under consideration, multiplied by the

reference correlator dump time (e.g. 0.6s for LOW etc). This is somewhat crude

as it should depend on the field of view being imaged.

b. Frequency averaging: For the continuum case we assume that the number of

frequency channels in each baseline bin is set to avoid frequency smearing at the

field of view being considered, or at the number of output channels, if this is

larger. There are subtleties here since the output channels are assumed to be

linearly spaced in frequency, so this approach may be incorrect on the shortest

baselines. (The impact is likely to be small.)

We have used binned baseline distributions as described in PDR05.01, which

contain O(10) logarithmically spaced bins. The caveat here is that this is a new

feature and has not been extensively tested, and that the baseline distributions

themselves depend on the source sky position (especially elevation of course).
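The time-averaging rule in item 5a, applied to logarithmically spaced baseline bins, can be sketched as follows. The longest baseline, the range of the bins and the choice of each bin's upper edge as its representative baseline are assumptions made for illustration; only the 0.6 s dump time for LOW and the rule itself come from the text.

```python
import numpy as np


def averaging_time_s(baseline_m, b_max_m, t_dump_s=0.6):
    """Averaging time = (longest baseline / this baseline) * correlator dump time."""
    return (b_max_m / np.asarray(baseline_m, dtype=float)) * t_dump_s


b_max_m = 80e3  # hypothetical longest baseline in metres

# Ten logarithmically spaced bins spanning three decades in baseline length
# (the span is an assumption; the text only says O(10) bins).
bin_edges = np.logspace(np.log10(b_max_m / 1000.0), np.log10(b_max_m), 11)

# Assumption: use the longest baseline in each bin, the conservative choice.
bin_upper = bin_edges[1:]

t_avg = averaging_time_s(bin_upper, b_max_m)

for b, t in zip(bin_upper, t_avg):
    print(f"baselines up to {b:9.1f} m: average over {t:7.2f} s")
```

The longest-baseline bin recovers the dump time itself, and the rule takes no account of the field of view being imaged, which is the crudeness noted in item 5a.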

6. Faceting: We have implemented a faceting approach in which we divide the field of view up into a number of facets. This reduces the gridding load as the w kernels are smaller,

and time averaging can be increased, but potentially increases the IO rate. We have

optimised the number of facets by considering only the total compute load and not other

system drivers such as IO rate. We have also not adjusted the buffer sizes with faceting

to reflect the longer integration time that a smaller field of view would enable.
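The caveat in item 6, that the number of facets is chosen on compute load alone, can be illustrated with a toy optimisation. Both cost functions below are hypothetical shapes invented for this sketch and are not taken from the parametric model; the only feature borrowed from the document is the Nfacet**2 dependence of the IO rate.

```python
def toy_compute_load(nfacet):
    """Placeholder compute cost (arbitrary units, not from the model).

    One term falls with faceting (smaller w kernels), one grows with the
    number of facets that must be processed.
    """
    return 100.0 / nfacet**2 + 2.0 * nfacet**2


def toy_io_rate(nfacet):
    """Placeholder IO rate (arbitrary units), growing as Nfacet**2."""
    return 1.0 * nfacet**2


candidates = range(1, 8)

# What the model currently does: minimise compute load only.
best_compute_only = min(candidates, key=toy_compute_load)

# Alternative: minimise compute load subject to a hypothetical IO ceiling.
io_limit = 5.0
feasible = [n for n in candidates if toy_io_rate(n) <= io_limit]
best_with_io_limit = min(feasible, key=toy_compute_load)

print(f"compute-only optimum: Nfacet = {best_compute_only}")
print(f"optimum under the IO ceiling: Nfacet = {best_with_io_limit}")
```

With these placeholder costs the two criteria select different facet counts, which is the kind of shift that could occur once IO rate and buffer size are treated as system drivers alongside compute load.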

