Document No: SKA-TEL-SDP-0000038 Unrestricted Revision: 01 Author: R. Bolton Release Date: 2015-02-09 Page 1 of 19
PDR 03.04 System sizing derived from outputs of
the Parametric Model for the SDP.
Document number: SKA-TEL-SDP-0000038
Context: PROT.SS
Revision: 01
Author: Rosie Bolton
Release Date: 2015-02-09
Document Classification: Unrestricted
Status: Draft
Name Designation Affiliation
Rosie Bolton SDP Project Scientist University of Cambridge
Signature & Date:
Name Designation Affiliation
Paul Alexander SDP Project Lead University of Cambridge
Signature & Date:
Version Date of Issue Prepared by Comments
0.1
ORGANISATION DETAILS
Name Science Data Processor Consortium
Rosie Bolton (Feb 9, 2015)Rosie Bolton
Paul Alexander (Feb 10, 2015)Paul Alexander
1 Table of Contents
1 Table of Contents
2 List of Figures
3 List of Tables
4 Introduction
5 References
5.1 Applicable Documents
5.2 Reference Documents
6 Estimates of overall Compute Load
6.1 The Maximal Discovery cases
7 System performance to deliver High Priority Science Objectives
8 Limitations of the iPython implementation of the parametric model
8.1 Known omissions
8.2 Known Errors or incomplete sections of the iPython model
2 List of Figures
There are no figures in this document.
3 List of Tables
Table 1: Computational requirements summary table.
Table 2: The Continuum Cases, specific parameters and overall Memory, I/O and FLOP needs.
Table 3: The Spectral Line Cases, specific parameters and overall Memory, I/O and FLOP needs.
Table 4: Overall compute load, buffer size and IO rate values as calculated by the iPython implementation of the parametric model for the SDP.
Table 5: High Priority Science Objective experiments: Compute load on LOW
Table 6: High Priority Science Objective experiments: Compute load on MID
Table 7: High Priority Science Objective experiments: Compute load on SURVEY
Table 8: List of areas of processing that are currently NOT INCLUDED in the iPython model.
4 Introduction
Here we present two separate estimates for the overall performance requirements of the SKA1
Science Data Processor. The first of these is a maximal discovery case, based on a system
imaging incoming data at full spatial and spectral resolution and capable of keeping up with
incoming data. We use this to size and cost the system.
The second estimate is sized to be able to deliver the maximum performance required of the
High Priority Science Objectives (HPSOs) as presented in RD1.
These HPSOs are a list of key projects that typically require several thousand hours of
telescope time each to complete; they include experiments to measure the power spectrum and
to image the Epoch of Reionisation (using LOW); Pulsar search and timing experiments using
both MID and LOW; major continuum (full polarisation) surveys at around 1 and 10 GHz using
MID and SURVEY, including covering the whole hemisphere to obtain magnetic field
information via rotation measures, and a search for rapid single pulse transient sources using
MID. There are several other experiments, but the previous list gives an idea of the variety.
The shortest experiment takes 800 hours to complete, whilst the longest requires 17,500 hours.
The HPSOs were born out of the “Science Use Cases” presented in RD4. We have been
provided with two reference documents: RD2 and RD3. The most relevant one to the HPSO
analysis is RD3, the spreadsheet detailing the specifics of each of the experiments in terms of
time per pointing, hours spent in total, frequency range, number of channels required at output
and, for the imaging experiments, the required spatial resolution (which we use to determine the
maximum baseline used) and the total observed area for the experiment (essential for derivation
of archive requirements; see RD5). These parameters enable us to re-run the performance
model of AD1 & AD2 and obtain compute load estimates for the imaging experiments.
For all imaging experiments we assume that each field is imaged out to the second null of the
primary beam during the continuum pipeline, regardless of what the final field of view will be.
In our costings for this PDR submission we have chosen to use the maximal discovery case for
each of the three telescopes to estimate hardware costs. The specification of each telescope is
that of the baseline design prior to rebaselining: the maximal case will change as a
consequence of rebaselining. In contrast, the HPSO experiment numbers do not push the
system to its maximum limits in multiple dimensions and so it may be expected that the HPSO
performance requirements are more robust to the rebaselining process. We include them as a
useful measure of the performance required of the SKA1 SDP in order to be capable of
delivering these world-class science products.
This paper presents the results as calculated by the model described in PDR.05 (AD1) and
implemented in the iPython notebook (see AD2).
5 References
5.1 Applicable Documents
The following documents are applicable to the extent stated herein. In the event of conflict
between the contents of the applicable documents and this document, the applicable
documents shall take precedence.
Reference Number Reference
AD1 PDR.05 Parametric Models of SDP Compute Requirements
AD2 PDR05.01 The SDP Performance Model implemented in IPython
AD3 PDR02 Pipelines sub-element design document
5.2 Reference Documents
The following documents are referenced in this document. In the event of conflict between the
contents of the referenced documents and this document, this document shall take
precedence.
Reference Number Reference
RD1 SKAO “Baseline design” document
RD2 SKAO Five years in the life document
RD3 Excel spreadsheet of HPSO experiments
RD4 Science Use Cases for SKA1
RD5 PDR01.03 “SDP archive size estimates”
6 Estimates of overall Compute Load
In our parametric model document, PDR05 (AD1), we have described our parametric model for
the overall imaging compute load, i.e. the sustained number of floating point operations per
second that the system must actually deliver. These figures do not include any (in)efficiency
factors, so the system built will need to be larger than this.
Here we take the iPython implementation of this model and apply the parameters for each
instrument to estimate the required system sizing for the SDP. Firstly we size the system to the full
“baseline design” (see RD1); secondly we apply parameters relevant for each of the imaging
experiments in the High Priority Science Objectives (HPSOs) as laid out in RD2 and RD3.
The model itself has limitations of a scientific nature: these are described in PDR05 (AD1), but
there are also some shortcomings of the iPython implementation, where sections of the PDR05
model have evolved at pace and the iPython model has yet to catch up. We describe these in
section 8.
6.1 The Maximal Discovery cases
The following tables provide the input parameters from which the baseline requirements for
costing are determined for the maximal discovery cases for each instrument. Please refer to
RD1 for a full description of the parameters (we refer to the relevant section of RD1 in the table).
Telescope parameters

| Parameter | Symbol | SKA1-Low | SKA1-Mid (B1) | SKA1-Survey (B1) | Justification (PDR05 document section / L1 requirement) |
| --- | --- | --- | --- | --- | --- |
| Maximum Baseline (km) | Bmax | 100 | 200 | 50 | L1 requirements / RD1 |
| Number of Frequency Channels from Correlator | Nf,corr | 256000 | 256000 | 256000 | L1 requirements / RD1 |
| Number of stations/antennas | Na | 1024 | 256 | 96 | L1 requirements / RD1 |
| Number of beams | Nbeam | 1 | 1 | 36 | (NB only 1 beam for LOW still) |
| Antenna/Station Diameter (m) | Ds | 35 | 15 (see footnote 1) | 15 (see footnote 2) | L1 requirements / RD1 |
| Correlator Dump Time (s) | Tdump | 0.6 | 0.08 | 0.3 | L1 requirements / RD1 |
| Number of polarisation products | Npp | 4 | 4 | 4 | L1 requirements / RD1 |

Parameters for both continuum and spectroscopic case

| Parameter | Symbol | SKA1-Low | SKA1-Mid (B1) | SKA1-Survey (B1) | Justification (PDR05 document section / L1 requirement) |
| --- | --- | --- | --- | --- | --- |
| Image-plane oversampling factor (in units of Nyquist sampling) | Qpix | 2.5 | 2.5 | 2.5 | 5 pixels per synthesized beam. PDR05 Section 12.11 |
| Amplitude level of w-kernels to include | εw | 0.01 | 0.01 | 0.01 | PDR05 section 12.5 |
| Linear size of A-kernel (pixels) | Naa | 9 | 9 | 9 | C.f. PDR05 section 12.5.2, Eq 27. These equations might typically give 10 pixels rather than 9. |
| Ionospheric timescale (s) | Tion | 60 | 60 | 60 | This is the assumed update rate of the gridding kernel, i.e. the maximum length of time that it can be re-used for. See Tupdate in PDR05 Section 12.8. |
| Oversampling factor of w-convolution kernels | QGCF | 8 | 8 | 8 | PDR05 Section 12.5. |
| Number of adjacent frequency channels for which baseline dependent kernels can be reused | Qfcv | 10 | 10 | 10 | Baseline-dependent kernels are reused for 10 adjacent frequency channels. PDR05 section 12.8. (Note that this value lacks formal justification.) |
| Full Mueller Matrix gridding factor | Nmm | 4 | 4 | 4 | Nmm=4 is the maximum possible. (PDR05 section 12.8) |
| Maximum observing wavelength in band (m) | λmax | 6 | 0.857 | 0.86 | L1 requirements, RD1, Band definitions |
| Minimum observing wavelength (m) | λmin | 0.857 | 0.286 | 0.33 | L1 requirements, RD1, Band definitions |
| Size of visibility datum in the buffer (Bytes) | Mvis | 12 | 12 | 12 | See PDR Section 12.15.1 |

Table 1: Computational requirements summary table.

Footnote 1: Some of the dishes comprising SKA1-Mid will be from MeerKAT with diameter 13.5m. This smaller diameter is not taken into account in the analysis presented here and all dishes are assumed to have 15m diameter.
Footnote 2: Some of the dishes comprising SKA1-Survey will be from ASKAP with diameter 12m. This smaller diameter is not taken into account in the analysis presented here and all dishes are assumed to have 15m diameter.
Table 2: The Continuum Cases, specific parameters and overall Memory, I/O and FLOP needs.

Continuum specific parameters

| Parameter | Symbol | SKA1-Low | SKA1-Mid (B1) | SKA1-Survey (B1) | Notes |
| --- | --- | --- | --- | --- | --- |
| Number of Major Cycles | Nmajor | 10 | 10 | 10 | Previous estimates were based on this value. Difficult to justify before commissioning of the telescopes. See PDR05 sections 12 and 15.1 |
| Diameter of imaging field of view (in units of diameter to first zero) | QFoV | 1.8 | 1.8 | 1.8 | See PDR05 section 9.2 |
| Number of output frequency channels | Nf,out | 500 | 500 | 500 | See PDR05 section 12.8 |

Continuum results

| Parameter | Symbol | SKA1-Low | SKA1-Mid (B1) | SKA1-Survey (B1) | Notes |
| --- | --- | --- | --- | --- | --- |
| Total FLOP Requirement (PetaFLOPs) | RcontFLOP | 2.8 | 4.3 | 3.9 | |
| Snapshot Duration (s) | Tsnap | 87 | 1132 | 1907 | See PDR section 12.8 |
| Nfacet | Nfacet | 3 | 13 | 7 | See PDR05 12.12; RFlop is minimized by allowing Nfacet and Tsnap to vary |
| Target grid size (1D pixels) | | 13,933 | 15,005 | 6,966 | Refer to section of PDR05; redo numbers to account for faceting |
| Visibility Buffer Size (PetaByte) | Mbufvis | 0.9 | 0.12 | 0.19 | This is after averaging visibilities in time and frequency in a baseline dependent fashion using a binned baseline distribution as described in PDR05.01. |
| Visibility Buffer Bandwidth (TeraByte/s) | Rio | 1.9 | 4.5 | 2.1 | |
Spectral line specific parameters

| Parameter | Symbol | SKA1-Low | SKA1-Mid (B1) | SKA1-Survey (B1) | Notes |
| --- | --- | --- | --- | --- | --- |
| Number of Major Cycles | Nmajor | 1 | 1 | 1 | In PDR05 section 12.9, Equation 36, Nmaj is effectively set at 1.5. This is a new development but an important underestimate in the modelled output. |
| Diameter of imaging field of view (in units of diameter to first zero) | QFoV | 1 | 1 | 1 | FoV to extend to the first zero of Airy function (PDR05 Sec 9.2, Equation 6) |
| Number of output channels | Nf,out | 256,000 | 256,000 | 256,000 | L1 requirements / RD1 |

Spectral line results

| Parameter | Symbol | SKA1-Low | SKA1-Mid (B1) | SKA1-Survey (B1) | Notes |
| --- | --- | --- | --- | --- | --- |
| Total FLOP Requirement (PetaFLOPs) | RspecFLOP | 21.6 | 40.6 | 57.2 | |
| Snapshot Duration (s) | Tsnap | 118 | 1491 | 1817 | |
| Number of facets (1D) | Nfacet | 1 | 5 | 3 | RFlop is minimized by allowing Nfacet and Tsnap to vary |
| Target grid size (1D pixels) | | 23,221 | 21,673 | 9,031 | |
| Visibility Buffer Size (PetaByte) | Mbufvis | 256 | 29.7 | 87.5 | This is after averaging visibilities in time in a baseline dependent fashion using a binned baseline distribution as described in PDR05.01. |
| Visibility Buffer Bandwidth (TeraByte/s) | Rio | 5.4 | 17 | 18 | NB. Since there is only one major cycle these are significantly changed. |

Table 3: The Spectral Line Cases, specific parameters and overall Memory, I/O and FLOP needs.
Fast imaging parameters

| Parameter | Symbol | SKA1-Low | SKA1-Mid (B1) | SKA1-Survey (B1) | Notes |
| --- | --- | --- | --- | --- | --- |
| Number of output channels | Nf,out | 500 | 500 | 500 | To achieve 1MHz output frequency resolution |
| Image-plane oversampling factor (in units of Nyquist sampling) | Qpix | 1.5 | 1.5 | 1.5 | PDR05 Section 9.2, Equation 7. Qpix=1.5 is consistent with the CASA Imager task. |
| Diameter of imaging field of view (in units of diameter to first zero) | QFoV | 0.9 | 0.9 | 0.9 | Image only the primary beam. PDR05 Section 9.2 |
| Timescale for images (after discussion with Transients SWG) | Tsnap (and Tobs) | 1.2s | 1.2s | 1.2s | Transients SWG suggested a 1s timescale for “standard” observations; we choose 1.2s as it is 2xTdump for SKA1 LOW |

Fast imaging results

| Parameter | Symbol | SKA1-Low | SKA1-Mid (B1) | SKA1-Survey (B1) | Notes |
| --- | --- | --- | --- | --- | --- |
| Number of facets (1D) | Nfacet | 2 | 27 | 18 | RFlop is minimized by allowing Nfacet to vary |
| Total FLOP Requirement (PetaFLOPs) | RfastFLOP | 0.35 | 6.7 | 10.7 | |
The total compute load and overall buffer-to-processors I/O rate for each instrument are given
by the sum of values for the three pipelines (slow transients, continuum, and spectral line). The
uv grid memory and convolution kernel cache sizes are taken to be the maximum from the three
pipelines, whilst the total buffer size required to store the visibility data after ingest (for these
maximal cases) is defined by the spectral line mode.
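The summation rule described above can be checked against the per-pipeline values quoted in the continuum, spectral line and fast imaging tables. The following is a minimal sketch, not the iPython model itself: the dictionary layout and function names are illustrative assumptions, and the buffer is taken as the maximum over the pipelines that report one (which, for these maximal cases, is the spectral line value).

```python
# Per-pipeline compute loads in PetaFLOPs, copied from the tables in this document.
FLOPS_PFLOPS = {
    "SKA1-Low": {"continuum": 2.8, "spectral": 21.6, "fast": 0.35},
    "SKA1-Mid": {"continuum": 4.3, "spectral": 40.6, "fast": 6.7},
    "SKA1-Survey": {"continuum": 3.9, "spectral": 57.2, "fast": 10.7},
}

# Visibility buffer sizes in PetaBytes (no value is quoted for fast imaging).
BUFFER_PBYTES = {
    "SKA1-Low": {"continuum": 0.9, "spectral": 256.0},
    "SKA1-Mid": {"continuum": 0.12, "spectral": 29.7},
    "SKA1-Survey": {"continuum": 0.19, "spectral": 87.5},
}


def total_compute_load(per_pipeline):
    """Total load is the sum over the pipelines run on the same data."""
    return sum(per_pipeline.values())


def required_buffer(per_pipeline):
    """Buffer is sized by the most demanding pipeline (illustrative rule)."""
    return max(per_pipeline.values())


totals = {name: total_compute_load(v) for name, v in FLOPS_PFLOPS.items()}
buffers = {name: required_buffer(v) for name, v in BUFFER_PBYTES.items()}

for name in FLOPS_PFLOPS:
    print(f"{name}: {totals[name]:.1f} PFLOPs, buffer {buffers[name]} PB")
```

Rounded to the nearest PetaFLOP, these sums agree with the total compute loads of 25, 52 and 72 PetaFLOPs reported for LOW, MID and SURVEY.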
Table 4: Overall compute load, buffer size and IO rate values as calculated by the iPython implementation of the parametric model for the SDP. Numbers are for pseudo-real time processing at the full spatial and spectral resolution, generating a continuum cube and a spectral line cube once per 6 hour observation and additionally a fast-imaging cube every 1.2 seconds. Faceting and snapshot timescale have been optimised independently for each pipeline (where appropriate) to minimise the total FLOP rate.

Total performance requirements, maximal imaging case

| Parameter | Symbol | SKA1-Low | SKA1-Mid (B1) | SKA1-Survey (B1) | Notes |
| --- | --- | --- | --- | --- | --- |
| Total Compute load (PetaFLOPs) | RTOTALFLOP | 25 | 52 | 72 | |
| Maximum target grid size (1D pixels) | | 23,221 | 21,673 | 9,031 | |
| Visibility Buffer Size (PetaByte) | Mbufvis | 256 | 29.7 | 87.5 | Set by spectral line case. |
| Visibility Buffer Bandwidth (TeraByte/s) | Rio | 7.4 | 23 | 22 | Sum of all pipelines |
7 System performance to deliver High Priority Science Objectives
We have split the 13 HPSOs described in RD2 and RD3 into 21 different observational
programmes to be undertaken with SKA1. We assume that the imaging experiments here all
require a continuum pipeline to be run first, then, for some experiments, a spectral line pipeline
is also required to produce the final cube. Parameters driving these two pipelines are taken from
the tables above, the only differences being the frequency range, number of output channels
required (for continuum and spectral line modes) and maximum baseline needed (derived from
the stated required spatial resolution).
We have not included the NIP experiments: the bulk of the NIP processing will take place in the
Central Signal Processor (CSP); the SDP load is expected to be much lower for the NIP
experiments than it will be for a typical imaging experiment (though some capacity to handle
real-time calibration will also be required). As such, since we are interested in the highest
compute load required to deliver HPSO experiments, we are essentially ignoring the NIP cases.
None of the experiments in the HPSO list is a fast imaging experiment, and we do not assume
that the fast imaging pipeline is also run on the data, thus the derived compute loads are
specific only to the HPSO experiments and not any other use that the scientific community
might like to make of the same data.
| HPSO (LOW) | Notes | Total Time (hr) | Mode | Fmin-Fmax | Nchan,out | Bmax | Tpoint | Rflop |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | EoR Imaging | 2500 | Imaging | 50-200MHz | 1500 | 100km | 1000hr | 2.5 PFLOP (1 beam) (x2 with 2 beams) |
| 2a | EoR Imaging and power spectrum | 2500 | Imaging | 50-200MHz | 1500 | 100km | 100hr | 2.5 PFLOP (1 beam) (x2 with 2 beams) |
| 2b | EoR Imaging and power spectrum | 2500 | Imaging | 50-200MHz | 1500 | 100km | 10hr | 2.5 PFLOP (1 beam) (x2 with 2 beams) |
| 4c | Pulsar Search | 12800 | NIP | | | | | <<1 PFLOP assumed |
| 5c | Pulsar Timing | 4300 | NIP | | | | | <<1 PFLOP assumed |

HPSO maximum FLOP requirement: 2.5 PFLOP

Table 5: High Priority Science Objective experiments: Compute load on LOW
As can be seen from Table 5, the HPSOs using SKA1 LOW do not push the system to its limits
(though note that the compute load estimated for the EoR does not include any power spectrum
analysis). The three EoR experiments are currently only required to produce 1500 channels in
the output, so the compute loads derived are roughly what is required for a simple continuum
observation. The HPSOs suggest using two beams rather than one for this experiment, and we
note that if multiple beams are enabled on SKA1 LOW then the FLOP rate required for these
experiments would double.
In the HPSO scheduling (RD2, RD3), SKA1 LOW is assigned 17,100 hours doing NIP
experiments and 7500 hours doing EoR experiments (and no other imaging experiments on
LOW are included in the HPSOs). Thus there is a potential trade-off to make: the average
compute load is much lower than the EoR experiment load, so if a larger buffer were built, an SDP
with a smaller total compute capacity could be used, which would reduce and smooth out power
loading.
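To illustrate the size of the trade-off described above, the time-weighted average compute load for SKA1 LOW can be estimated from the scheduled hours. This is a rough sketch under a stated assumption: the NIP experiments are only given as "<<1 PFLOP assumed", so their load is set to zero here, which makes the result a lower bound. The function and variable names are illustrative.

```python
def time_weighted_load(programmes):
    """Average load over a schedule given (hours, load in PFLOP) pairs."""
    total_hours = sum(hours for hours, _ in programmes)
    return sum(hours * load for hours, load in programmes) / total_hours


# Assumption: NIP load is negligible ("<<1 PFLOP assumed"), taken as zero.
NIP_LOAD_PFLOP = 0.0

low_schedule = [
    (7500, 2.5),              # three EoR imaging experiments, 2500 hr each
    (17100, NIP_LOAD_PFLOP),  # pulsar search (12800 hr) + pulsar timing (4300 hr)
]

average_load = time_weighted_load(low_schedule)
print(f"Average LOW load: {average_load:.2f} PFLOP (peak 2.5 PFLOP)")
```

Under this assumption the average is roughly 0.76 PFLOP against a peak of 2.5 PFLOP. How much of that gap could actually be exploited depends on the buffer size and scheduling, neither of which is modelled in this sketch.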
| HPSO (MID) | Notes | Total Time (hr) | Mode | Fmin-Fmax | Nchan,out | Bmax | Tpoint | Rflop |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 4a | Pulsar Search a | 800 | NIP | 650-950 MHz | | | | <<1 PFLOP assumed |
| 4b | Pulsar Search b | 800 | NIP | 1250-1550 MHz | | | | <<1 PFLOP assumed |
| 5a | Pulsar Timing | 1600 | NIP | 950-1760 MHz | | | | <<1 PFLOP assumed |
| 5b | Pulsar Timing | 1600 | NIP | 1650-3050 MHz | | | | <<1 PFLOP assumed |
| 14 | HI | 2000 | Spectral line Imaging | 1300-1400 MHz | 5000 (+500 continuum) | 24km according to resolution requirement. 200km assumed here | 10hrs | 3.0 PFLOP (assuming image full band in continuum out to 200km) |
| 19 | Transients | 10,000 | NIP | 650-950 MHz | | | | <<1 PFLOP assumed |
| 22 | Cradle of Life | 6000 | Imaging | 10-12 GHz | 5000 (as continuum) | 200km | 600hrs | 2.5 PFLOP |
| 37a | Continuum | 2000 | Imaging | 1.0-1.7 GHz | 700 (as continuum) | 200km | 95hrs | 2.5 PFLOP |
| 37b | Continuum | 2000 | Imaging | 1.0-1.7 GHz | 700 (as continuum) | 200km | 2000hrs | 2.5 PFLOP |
| 38a | Continuum | 1000 | Imaging | 7-11 GHz | 1000 (as continuum) | 200km | 16.4hrs | 2 PFLOP |
| 38b | Continuum | 1000 | Imaging | 7-11 GHz | 1000 (as continuum) | 200km | 1000hrs | 2 PFLOP |

HPSO maximum FLOP requirement: 3.0 PFLOP

Table 6: High Priority Science Objective experiments: Compute load on MID
Table 6 shows that the worst-case experiment with SKA1 MID proposed in the HPSOs only
requires 3 PFLOPs to process in (pseudo) real time, far lower than the maximal discovery
compute load. This is essentially because none of the HPSO experiments need the full 256,000
channel output (the largest number of output channels is 5,500 here). The numbers simply
scale down by the ratio of the numbers of channels, so using only O(5,000) channels
will require only 5000/256000 = 2% of the spectral line compute load (see footnote 3). The total
load is therefore dominated by the continuum pipeline, and we have been conservative here and
assumed that the continuum pipeline considers the full bandwidth.
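The channel-ratio scaling used above can be written out as a short calculation. This is a minimal sketch assuming the spectral line load scales linearly with the number of output channels, as the text states; the function name is illustrative, and the 40.6 PetaFLOPs figure is the maximal spectral line load quoted for SKA1-Mid earlier in this document.

```python
def scaled_spectral_load(maximal_load_pflops, n_channels_out, n_channels_max=256_000):
    """Scale the maximal spectral line load by the fraction of channels produced."""
    return maximal_load_pflops * n_channels_out / n_channels_max


MID_SPECTRAL_MAXIMAL_PFLOPS = 40.6  # maximal spectral line case for SKA1-Mid

channel_fraction = 5000 / 256_000
scaled_load = scaled_spectral_load(MID_SPECTRAL_MAXIMAL_PFLOPS, 5000)

print(f"Channel fraction: {channel_fraction:.1%}")
print(f"Scaled spectral line load: {scaled_load:.2f} PFLOPs")
```

With 5,000 output channels the spectral line contribution falls below 1 PetaFLOP, which is why the continuum pipeline dominates the HPSO totals for MID.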
| HPSO (SURVEY) | Notes | Total Time (hr) | Mode | Fmin-Fmax | Nchan,out | Bmax | Tpoint | Rflop |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 13 | HI, limited BW | 5000 | Imaging | 790-950 MHz | 3200 (+500 continuum) | 40km | 2500 | 3.6 PFLOP |
| 15 | HI, nearby, low spatial resolution | 5000 | Imaging | 1415-1425 MHz | 2500 | 13km | 2500 | 1 PFLOP |
| 27 + 33 (see footnote 4) | RM survey, with continuum survey | 17500 | Imaging | 1000-1500 MHz | 500 | 50km | 10 | 3.0 PFLOP |
| 35 | Autocorrelation “HI Intensity Mapping survey of 30,000 square degrees covering redshift z=0.2 to 3 (nu=350 - 1200 MHz).” | 5500 | Autocorrelation | 650-1150 MHz | | | 3.3 | We have been given no guidance as to how to proceed with estimations of the compute load for this ambitious experiment. |
| 37c | Band 2 Continuum | 5300 | Imaging | 1000-1500 MHz | 500 | 50km | 95 | 3.0 PFLOP |

HPSO maximum FLOP requirement: 3.6 PFLOP (see footnote 5)

Table 7: High Priority Science Objective experiments: Compute load on SURVEY

Footnote 3: Note that this scaling only applies if the output channels remain narrow enough to avoid bandwidth smearing; this is the case for the spectral line HPSOs using MID and SURVEY because they search only a very narrow redshift (and hence frequency) range.
Footnote 4: Note that these two experiments are listed separately in the HPSO schedule document but have identical observing parameters apart from the fact that the polarisation survey for cosmic magnetism requires much more time on the sky. The continuum (total intensity) survey (HPSO33) comes “for free” at no additional cost whilst conducting the polarisation survey (HPSO27). HPSO33 requires 2500 hours to complete whilst HPSO27 needs 17500 hours.
The same is true of the HPSO experiments proposing to use the SURVEY instrument (Table 7):
the highest FLOP requirement here is 3.6 PFLOPS (although we do not yet have an
estimate of the compute load for the autocorrelation experiments), very much lower than the
O(50) PFLOPS maximal case.
In conclusion, the HPSO experiments need a maximum of around 3-5 PFLOPS to process,
very much lower than the O(25, 50, 75) PFLOPS needed to cope with the maximal imaging
loads of LOW, MID and SURVEY.
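As a quick illustration of this conclusion, the HPSO maxima can be expressed as fractions of the maximal imaging loads. The sketch below only restates numbers already given in this document (the HPSO maxima of 2.5, 3.0 and 3.6 PFLOP and the maximal totals of 25, 52 and 72 PetaFLOPs); the variable names are illustrative.

```python
# HPSO maximum compute load per telescope (PFLOP), from the HPSO tables.
hpso_max = {"LOW": 2.5, "MID": 3.0, "SURVEY": 3.6}

# Maximal discovery case total compute load per telescope (PFLOP).
maximal = {"LOW": 25.0, "MID": 52.0, "SURVEY": 72.0}

fractions = {name: hpso_max[name] / maximal[name] for name in hpso_max}

for name, fraction in fractions.items():
    print(f"{name}: HPSO maximum is {fraction:.1%} of the maximal case")
```

On these numbers the HPSO experiments need between about 5% and 10% of the maximal-case capacity, excluding the autocorrelation experiment for which no estimate is available.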
8 Limitations of the iPython implementation of the parametric model.
The iPython model is introduced on a technical basis in PDR05.01, and the theoretical
framework it seeks to implement is described in PDR05. However, this is on-going work and
there are some areas where the assumptions (made in order to produce a compute load
estimate in time for PDR) have been made without a view of the full system in mind. There are
other areas where we acknowledge that the iPython model is not correct, i.e. the theoretical
model has developed and the iPython model has yet to catch up.
8.1 Known omissions:
In the model we do include FLOP estimates for Gridding and De-gridding (ref PDR05 sections
and equations), FFT, re-projection and phase rotation. We do not include the things listed in the
table below.
| Pipeline and component | PDR document reference | Estimated impact of omission |
| --- | --- | --- |
| Non Imaging: Single Pulse Search | PDR02 Pipelines sub-element design document (AD3) section 9.10; PDR05 (AD1) Section 14 | Zero: will not drive limiting case |
| Non Imaging: Periodic Search | AD3 section 9.8; PDR05 Section 14 | Zero: will not drive limiting case |
| Non Imaging: Pulsar Timing | AD3 section 9.9; PDR05 Section 14 | Zero: will not drive limiting case |
| Ingest: All (e.g. RFI flagging, averaging, spatial filtering, demixing) | AD3 section 8.1 | Unsure |
| EoR Power Spectrum: Power Spectrum estimation | (Not described) | Unsure. Could be an additional load alongside EoR imaging pipelines |
| Imaging: Drift Scan Imaging | Part of A term description; see AD3, section 8.2 | This could have a significant impact (raising the load) where used |
| Imaging: Minor Cycle Clean | AD3, section 8.2 and PDR05 Section 12.11 | Expected to be small. |
| Calibration | AD3, section 9.4, PDR05 Section 11 | Could be very significant, possibly of same order as imaging cost. |
| Co-addition of multiple image cubes | (not described in detail) | Small. Image based weighting and adding will not be done many times and should not be a significant compute load. |

Table 8: List of areas of processing that are currently NOT INCLUDED in the iPython model.
8.2 Known Errors or incomplete sections of the iPython model:
Areas where effects are included but where the implementation requires updating or further
analysis and checking:
1. Calculation of Npix: In the iPython model we calculate the number of pixels on a side as
the ratio of the field of view to the pixel size. However, both the field of view and the pixel
size are calculated at the same fiducial wavelength. If pixels are to be matched on the
sky across a range of wavelengths then the field of view should be defined at the
maximum wavelength but the pixel size defined at the minimum. This can have a large
impact on the total grid size (a factor of 7 for SKA1 LOW spanning 50-350MHz), unless
sub-bands are used.
2. Nmaj for spectral line pipeline: In PDR05 section 12.9, Equation 36, Nmaj is effectively
set at 1.5, whereas the iPython model uses Nmaj=1 for the spectral line pipeline. This is a
new development but an important underestimate in the modelled output: since the overall
results for the maximal case are dominated by the spectral line pipeline, our reported
compute load numbers are about 2/3 of what they should be, i.e. we should be reporting
numbers roughly 50% higher. Whilst we acknowledge that this
error is quite severe, given the other uncertainties around the system design it is not
important in the context of this PDR submission, since an upscaling by 50% could not
give rise to any fundamental changes in architecture.
3. IO Rate: In the iPython model we use:
Rio = Mvis * Nmajor * Nbeam * Npp * Nvis * Nfacet**2
whereas PDR05 suggests a factor of (Nmajor+1), and does not include the number of
beams or the effect of faceting (though we believe that our implementation of Nfacet is
correct).
4. Naa: The size (in pixels) assumed for the A-term in the convolution kernel is slightly
discrepant between the model and the description in PDR05 Section 12.5.2. Our output
numbers assume Naa=9, whilst the value should be derived from an equation (PDR05
12.5.2, Eq 27). We believe that the “correct” value to use is about 10, which is very close
to what we have included, but this is an area to improve on in future work.
5. Baseline dependence has been assumed for all appropriate steps in the imaging
pipeline after ingest, with baseline-dependent time averaging in the buffer and
baseline-dependent frequency averaging at gridding for continuum.
a. Time averaging: We only estimate the averaging time as the ratio of the longest
baseline in the array to the baseline under consideration, multiplied by the
reference correlator dump time (e.g. 0.6s for LOW etc). This is somewhat crude
as it should depend on the field of view being imaged.
b. Frequency averaging: For the continuum case we assume that the number of
frequency channels in each baseline bin is set to avoid frequency smearing at the
field of view being considered, or at the number of output channels, if this is
larger. There are subtleties here since the output channels are assumed to be
linearly spaced in frequency, so this approach may be incorrect on the shortest
baselines. (The impact is likely to be small.)
We have used binned baseline distributions as described in PDR05.01, which
contain O(10) logarithmically spaced bins. The caveat here is that this is a new
feature and has not been extensively tested, and that the baseline distributions
themselves depend on the source sky position (especially elevation of course).
6. Faceting: We have implemented a faceting approach where we divide the field of view
into a number of facets. This reduces the gridding load as the w kernels are smaller,
and time averaging can be increased, but potentially increases the IO rate. We have
optimised the number of facets by considering only the total compute load and not other
system drivers such as IO rate. We have also not adjusted the buffer sizes with faceting
to reflect the longer integration time that a smaller field of view would enable.
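The grid-size effect described in item 1 can be illustrated with a short calculation. This is a sketch under a simplifying assumption that is not spelled out in this document: both the field of view and the pixel size are taken to scale linearly with wavelength, so that defining the field of view at the maximum wavelength and the pixel size at the minimum inflates the pixel count along a side by the ratio of the two wavelengths. The function names are illustrative.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def wavelength_m(frequency_hz):
    """Convert an observing frequency to a wavelength in metres."""
    return SPEED_OF_LIGHT_M_S / frequency_hz


def npix_inflation_factor(wavelength_max_m, wavelength_min_m):
    """Ratio of matched-pixel Npix to Npix computed at one fiducial wavelength.

    Assumes field of view and pixel size both scale linearly with wavelength.
    """
    return wavelength_max_m / wavelength_min_m


# SKA1 LOW spanning 50-350 MHz, as quoted in item 1.
factor = npix_inflation_factor(wavelength_m(50e6), wavelength_m(350e6))
print(f"Npix inflation factor for 50-350 MHz: {factor:.1f}")
```

The ratio reproduces the factor of 7 quoted in item 1. Splitting the band into sub-bands would reduce the ratio within each sub-band accordingly.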
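The size of the Nmaj correction described in item 2 can be estimated from the per-pipeline loads quoted earlier in this document. The sketch below assumes, for illustration only, that the spectral line FLOP load scales linearly with the number of major cycles and that the continuum and fast imaging loads are unaffected; the names are illustrative.

```python
NMAJ_CORRECTION = 1.5  # effective Nmaj in PDR05 divided by the modelled Nmaj of 1

# Per-pipeline compute loads in PetaFLOPs, from the tables in this document.
PIPELINE_PFLOPS = {
    "SKA1-Low": {"continuum": 2.8, "spectral": 21.6, "fast": 0.35},
    "SKA1-Mid": {"continuum": 4.3, "spectral": 40.6, "fast": 6.7},
    "SKA1-Survey": {"continuum": 3.9, "spectral": 57.2, "fast": 10.7},
}


def reported_total(loads):
    """Total as reported: plain sum over the three pipelines."""
    return loads["continuum"] + loads["spectral"] + loads["fast"]


def corrected_total(loads, factor=NMAJ_CORRECTION):
    """Total with only the spectral line load scaled by the Nmaj correction."""
    return loads["continuum"] + factor * loads["spectral"] + loads["fast"]


corrected = {name: corrected_total(v) for name, v in PIPELINE_PFLOPS.items()}
increase = {
    name: corrected[name] / reported_total(v) for name, v in PIPELINE_PFLOPS.items()
}

for name in PIPELINE_PFLOPS:
    print(f"{name}: {corrected[name]:.1f} PFLOPs ({increase[name] - 1:.0%} higher)")
```

Because the continuum and fast imaging pipelines are left unchanged in this sketch, the totals rise by somewhat less than the full 50%, consistent with the spectral line pipeline dominating but not being the only contribution.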
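The two IO rate expressions contrasted in item 3 can be written side by side. The first function is the iPython expression as quoted in item 3. The second is only one reading of the PDR05 description given there (a factor of Nmajor+1, with no beam or facet terms) and should be treated as an assumption rather than the PDR05 formula itself. Nvis is not defined in this document, so the value used in the example is a placeholder.

```python
def rio_ipython(Mvis, Nmajor, Nbeam, Npp, Nvis, Nfacet):
    """IO rate as used in the iPython model (expression quoted in item 3)."""
    return Mvis * Nmajor * Nbeam * Npp * Nvis * Nfacet**2


def rio_pdr05_reading(Mvis, Nmajor, Npp, Nvis):
    """Assumed reading of PDR05: (Nmajor + 1) factor, no beams, no faceting."""
    return Mvis * (Nmajor + 1) * Npp * Nvis


# Continuum-like parameters for SKA1-Low taken from the tables in this document;
# NVIS_PLACEHOLDER is hypothetical and only sets the overall scale.
NVIS_PLACEHOLDER = 1.0
ipython_rate = rio_ipython(Mvis=12, Nmajor=10, Nbeam=1, Npp=4,
                           Nvis=NVIS_PLACEHOLDER, Nfacet=3)
pdr05_rate = rio_pdr05_reading(Mvis=12, Nmajor=10, Npp=4, Nvis=NVIS_PLACEHOLDER)

print(f"iPython / PDR05 reading ratio: {ipython_rate / pdr05_rate:.2f}")
```

The ratio is independent of the placeholder Nvis, so it shows how strongly the Nfacet**2 term can outweigh the difference between Nmajor and Nmajor+1 when faceting is used.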
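The baseline-dependent time averaging rule in item 5a, together with the logarithmically spaced baseline bins mentioned in item 5, can be sketched as follows. The averaging-time function follows the rule as stated (ratio of longest baseline to the baseline under consideration, multiplied by the correlator dump time). The bin construction, the minimum baseline of 0.1 km and the use of each bin's upper edge are illustrative assumptions, not details taken from PDR05.01.

```python
import math


def averaging_time_s(baseline_km, max_baseline_km, dump_time_s):
    """Averaging time: (longest baseline / this baseline) * dump time."""
    return (max_baseline_km / baseline_km) * dump_time_s


def log_spaced_bin_edges(min_km, max_km, n_bins):
    """Edges of n_bins logarithmically spaced baseline bins."""
    log_min, log_max = math.log10(min_km), math.log10(max_km)
    step = (log_max - log_min) / n_bins
    return [10 ** (log_min + i * step) for i in range(n_bins + 1)]


# SKA1-Low values from this document: Bmax = 100 km, Tdump = 0.6 s.
BMAX_KM, TDUMP_S = 100.0, 0.6
MIN_BASELINE_KM = 0.1  # hypothetical shortest baseline, for illustration only

edges = log_spaced_bin_edges(MIN_BASELINE_KM, BMAX_KM, n_bins=10)

# Use the upper edge of each bin so that no baseline in the bin is over-averaged.
for upper_edge in edges[1:]:
    t_avg = averaging_time_s(upper_edge, BMAX_KM, TDUMP_S)
    print(f"baselines up to {upper_edge:8.2f} km: average over {t_avg:7.1f} s")
```

For example, a 10 km baseline on LOW would be averaged over about 6 s under this rule. As item 5a notes, the rule is crude because it ignores the field of view being imaged.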