Document No: SKA-TEL-SDP-0000046 Unrestricted
Revision: 2 Author: F. Graser et al.
Release Date: 2015-02-09 Page 1 of 50
PDR.07.01 COSTING BASIS OF ESTIMATE
Document number: SKA-TEL-SDP-0000046
Context: MGT
Revision: 2
Author: Ferdl Graser, John Taylor
Release Date: 2015-02-09
Document Classification: Unrestricted
Status: Draft
Name Designation Affiliation
Ferdl Graser SDP Systems Engineer Space Advisory Company
Signature & Date:
Name Designation Affiliation
Paul Alexander SDP Project Lead University of Cambridge
Signature & Date:
Version Date of Issue Prepared by Comments
0.1
ORGANISATION DETAILS
Name Science Data Processor Consortium
Signature: Ferdl Graser (Feb 10, 2015)
Signature: Paul Alexander (Feb 10, 2015)
1 CONTENTS
1 Contents................................................................................................................................... 3
2 List of Figures ........................................................................................................................... 7
3 List of Tables ............................................................................................................................ 7
4 References ............................................................................................................................... 8
4.1 Applicable Documents ..................................................................................................... 8
4.2 Reference Documents ...................................................................................................... 8
5 Introduction ........................................................................................................................... 10
5.1 Scope .............................................................................................................................. 11
5.2 Assumptions ................................................................................................................... 11
6 Hardware Compute Platform ................................................................................................ 12
6.1 Hardware Cost Estimation Methodology ....................................................................... 12
Hardware Model ..................................................................................................... 12
Data and Processing Requirements ........................................................................ 12
Estimate Costs for Build, Ship and Test .................................................................. 12
Overview of the SDP ............................................................................................... 12
6.2 Compute Island .............................................................................................................. 13
6.3 Buffer .............................................................................................................................. 15
6.4 Compute Node ............................................................................................................... 16
6.5 Interconnect System - Bulk Data Transport ................................................................... 17
Third Stage – Implicit to Compute Island Configuration ........................................ 17
Second Stage Networking ....................................................................................... 18
First Stage Networking ............................................................................................ 18
Network Redundancy ............................................................................................. 20
6.6 Low-Latency Networking................................................................................................ 20
6.7 Management and Archive .............................................................................................. 21
7 Hierarchical Storage (was called Science Archive) ................................................................ 22
7.1 Long Term Storage & Medium Performance Buffer ...................................................... 22
7.2 Delivery Platform Hardware .......................................................................................... 24
7.3 LMC Hardware ................................................................................................................ 24
8 SDP Software Cost Estimation ............................................................................................... 24
8.1 Estimation Methodology ................................................................................................ 24
8.2 Effort Estimation ............................................................................................................ 24
8.3 Labour Rates ................................................................................................................... 24
8.4 Software Estimation Assumptions ................................................................................. 26
AIV effort ................................................................................................................. 26
Documentation effort ............................................................................................. 26
8.5 Software Compute Platform .......................................................................................... 27
Compute Operating System .................................................................................... 28
Middleware ............................................................................................................. 28
Hierarchical Storage Management ......................................................................... 28
Application Development Environment and SDK ................................................... 29
Scheduler ................................................................................................................ 29
8.6 Data Layer....................................................................................................................... 29
Data Manager ......................................................................................................... 30
Data Lifecycle Manager ........................................................................................... 30
Science Archive Software ........................................................................................ 30
Local Database Services .......................................................................................... 30
Ingest Data from CSP into Data Layer ..................................................................... 31
8.7 Pipeline Components ..................................................................................................... 31
Processing Library ................................................................................................... 31
Science Analysis Pipeline Software ......................................................................... 32
Non-Imaging Software ............................................................................................ 32
Imaging Pipeline Software ...................................................................................... 33
Ingest Pipeline Software ......................................................................................... 36
Calibration Pipeline Software ................................................................................. 36
Image Space Search Engine .................................................................................... 36
Algorithmic Software .............................................................................................. 36
Sky Model Use and Creation ................................................................................... 37
9 Data Delivery Platform .......................................................................................................... 37
9.1 Tiered Data Transfer Service .......................................................................................... 37
9.2 User Portal ...................................................................................................................... 37
9.3 Data Discovery Service ................................................................................................... 37
9.4 Data Visualisation Service .............................................................................................. 38
9.5 Regional Centre Interface .............................................................................................. 38
10 Local Monitoring and Control ............................................................................................ 38
10.1 Local Telescope Model ................................................................................................... 39
10.2 Data Flow Manager (LMC) ............................................................................................. 39
10.3 QA Monitoring ................................................................................................................ 39
10.4 User Interfaces ............................................................................................................... 39
10.5 Master Controller and Error Handling ........................................................................... 39
10.6 Event Monitoring and Logging ....................................................................................... 39
10.7 Data Flow Models ........................................................................................................... 39
10.8 Task Management and Control ...................................................................................... 40
11 System Level Tasks ............................................................................................................. 40
12 SDP Early operations (Not changed SINCE M7) ................................................................. 41
12.1 Early operations costs scope .......................................................................................... 41
12.2 General principles applied ............................................................................................. 41
12.3 Hardware Compute Platform ......................................................................................... 41
Compute Island ....................................................................................................... 41
Buffer ...................................................................................................................... 42
SDP Infrastructure ................................................................................................... 42
Hierarchical Storage ................................................................................................ 42
Interconnect System ............................................................................................... 43
Compute node OS development ............................................................................ 43
Hardware support ................................................................................................... 43
Hardware maintenance .......................................................................................... 43
Archive storage ....................................................................................................... 43
Archive buffer storage ......................................................................................... 43
Archive media ...................................................................................................... 44
Archival network core switches .......................................................................... 44
Interconnect system ............................................................................................ 44
Non-Domain Software ......................................................................................... 45
Domain Software ................................................................................................. 45
System Level tasks (SDP overall) ......................................................................... 45
13 SDP Operations costs (NO CHANGE SINCE M7) ................................................................. 46
13.1 General principles applied ............................................................................................. 46
13.2 Hardware ........................................................................................................................ 46
Compute System Hardware .................................................................................... 46
13.3 Infrastructure ................................................................................................................. 48
Hardware maintenance .......................................................................................... 48
13.4 Long Term Storage (Archive) .......................................................................................... 48
Hardware support ................................................................................................... 48
Hardware maintenance .......................................................................................... 48
13.5 Interconnect system ....................................................................................................... 49
Hardware support ................................................................................................... 49
13.6 Non-Domain & Domain Software .................................................................................. 49
Maintenance of developed software ..................................................................... 49
Documentation ....................................................................................................... 49
COTS software maintenance .................................................................................. 50
Archive HSM licenses .............................................................................................. 50
System Level tasks (SDP overall) ............................................................................. 50
2 LIST OF FIGURES
Figure 1: US outline plans [RD02] for Exascale ............................................................................ 10
Figure 2: Schematic representation of the SDP Costed Hardware Concept showing the
unidirectional Ethernet Bulk Data Network (BDN) supporting Ingest to a High Performance
Buffer located on the Compute Islands. Data exchange between Compute Islands is supported
by an orthogonal bidirectional Low-Latency Network (LLN) currently costed as Infiniband.
Science products are delivered to Storage Pods for intermediate and long-term storage over a
bi-directional Ethernet network for onward user delivery. ....................................................... 13
Figure 3: IEEE Prediction for 40GbE availability in commodity x86 Servers .............................. 19
Figure 4: Infiniband Roadmap [RD11] SDR - Single Data Rate, DDR - Double Data Rate, QDR -
Quad Data Rate, FDR - Fourteen Data Rate, EDR - Enhanced Data Rate, HDR - High Data Rate,
NDR - Next Data Rate ................................................................................................................... 21
Figure 5: ASTC Technology Roadmap .......................................................................................... 23
Figure 6: The Science Data Processor System stack, showing the relationship between the level
2 elements of the product tree. Boxes are arranged so that each box is allowed to use only the
boxes below it. Furthermore, the horizontal partitioning of boxes into columns is
approximately arranged so that boxes in vertical alignment tend to be used together. The boxes
are colour coded to reflect the L1 elements of the product tree that they come from: Computer
Hardware (orange) Computer Software (gray), Pipelines (green), Data (yellow), Deliver
Platform (black), LMC (blue). ....................................................................................................... 27
3 LIST OF TABLES
Table 1: Configuration of a Compute Island ................................................................................ 15
Table 2: Cost trajectory for non-volatile and volatile storage No discount has been applied to
this pricing. We have assumed a ratio of 9:1 SATA:SSD in the costing estimate ...................... 15
Table 3: Compute node configuration * depending on working memory set [AD05] .............. 17
Table 4: Cost trajectory for High Speed Ethernet No discount has been applied to this pricing.
....................................................................................................................................................... 19
Table 5: Estimated costs (Euro) for various over-subscribed FDR Infiniband networks for various
telescopes ..................................................................................................................................... 21
Table 6: Estimated Labour Rates ................................................................................................. 26
Table 7: Estimated effort based on Meerkat............................................................................... 32
Table 8: Line of Code (LOC) production rate estimates ................................................................ 34
Table 9: AWImager LOC analysis .................................................................................................. 34
4 REFERENCES
4.1 APPLICABLE DOCUMENTS
The following documents are applicable to the extent stated herein. In the event of conflict
between the contents of the applicable documents and this document, the applicable
documents shall take precedence.
Reference Number Reference
[AD01] PDR.01 SKA.TEL.SDP-0000013 - SKA SDP Architecture
[AD02] PDR.02.01 Sub-element design: COMP
[AD03] PDR.02.02 Sub-element design: Data Delivery
[AD04] PDR.02.04 Sub-element design document: LMC
[AD05] PDR.05 SKA-TEL-SDP-0000040 Parametric models of SDP Compute Requirements
[AD06] PDR.08 Preliminary Plan for Construction
[AD07] PDR.02.05 Sub-element design document: PIP
[AD10] PDR.07A Cost Spreadsheet Revisions
[AD08] PDR.11 Preliminary Integrated Logistics Support Plan
4.2 REFERENCE DOCUMENTS
The following documents are referenced in this document. In the event of conflict between the
contents of the referenced documents and this document, this document shall take precedence.
Reference Number Reference
RD01 https://asc.llnl.gov/fastforward/
RD02 http://www.exascale.org/bdec/sites/www.exascale.org.bdec/files/talk4-Harrod.pdf
RD03 http://insidehpc.com/2014/11/slidecast-nvidiaibm-build-two-coral-100-petaflop-supercomputers-2017/
RD04 http://ark.intel.com/products/64595/Intel-Xeon-Processor-E5-2670-20M-Cache-2_60-GHz-8_00-GTs-Intel-QPI
RD05 http://www.anandtech.com/show/6446/nvidia-launches-tesla-k20-k20x-gk110-arrives-at-last
RD06 http://www.storagereview.com/ssd_vs_hdd
RD07 http://www.storagereview.com/ssd_vs_hdd
RD08 http://www.colfaxdirect.com/store/pc/home.asp
RD09 http://www.anandtech.com/show/8729/nvidia-launches-tesla-k80-gk210-gpu
RD10 http://content.yudu.com/A2097a/SCWDEC12JAN13/resources/14.htm
RD11 https://cw.infinibandta.org/document/dl/7580
RD12 http://mellanox.com/configurator
RD13 https://www.backblaze.com/blog/backblaze-storage-pod-4/
RD14 http://www.idema.org/?page_id=5868
RD15 http://www.45drives.com/products/order/dw-redundant.php
RD16 https://jira.ska-sdp.org/secure/attachment/13501/OracleLicenses.xls
RD17 https://jira.ska-sdp.org/browse/PDR-134?focusedCommentId=30502&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-30502
RD18 http://www.atnf.csiro.au/research/pulsar/tempo2/
RD19 http://psrchive.sourceforge.net/
RD20 http://stackoverflow.com/questions/966800/mythical-man-month-10-lines-per-developer-day-how-close-on-large-projects
RD21 http://www.green500.org/lists/green201311&green500from=1&green500to=100
RD22 https://www.tacc.utexas.edu/stampede/
5 INTRODUCTION
This section provides an estimate of the costs for the SDP for a phased delivery of system
components from 2018 onwards. The starting point for the cost estimate of hardware
components is commercial pricing available today; best estimates are presented for potential
discounts given the large volumes; and performance is extrapolated based on potential benefits
through technological improvements (e.g. Moore's Law, evolution of standards, etc.).
The SDP is designed to satisfy the computational, data management and archival aspects of the
SKA. Unlike traditional HPC environments, the SDP is designed with an emphasis on being
data-driven. The SDP needs to provide Exascale-class performance for in-situ scientific analysis
and derived science products. It is difficult to predict which IT technology components will be
available to deliver such a system in the 2018+ time-frame while operating within the envisaged
power budget. To set the scene, the USA is predicting that a prototype Exaflop system will be
available in 2021 (see Figure 1), with a power usage of order 20-30 MW and potentially a budget
of $200 million. To meet this challenge, the USA is establishing plans for prototype compute
node implementations during 2014 [RD01], which will be developed over the next 2 years and act
as demonstration points towards a Petascale compute node prototype, an overall Petascale
prototype system and thence the subsequent Exascale system. In addition to this development
programme, the US has also invested in 2 multi-PFlops (c. 100 PFlops) systems [RD03] costing
some $300M for installation in 2017/8.
Figure 1: US outline plans [RD02] for Exascale
5.1 SCOPE
The basis of the cost model assumes technology components as described in [AD02] to deliver
buffer, network, management and processing requirements allied to potential developments in
the IT industry and commodity components. As such, we envisage the basic elements of the
system will be compatible with current parlance: commodity compute nodes, delivering an
overall efficiency based on the arithmetic intensity for a particular application/algorithm; storage
in terms of working memory for compute nodes and implementation of the buffer; networking
in terms of the necessary data transport mechanisms from ingest to the compute nodes; and
potential inter-compute node communication and subsequent archival [AD01].
The design equations [AD05] were used as input to the cost model. This cost model represents a
single point in the solution space and a thorough trade-off analysis will be performed through to
CDR to find an optimum, and possibly improved, point in the solution space.
5.2 ASSUMPTIONS
A number of the assumptions and conclusions used in this document should be viewed from a
perspective of potential disruption in the next 2-3 years as mentioned above. However, in the
absence of any definitive information that can be disclosed at this present time, the information
provided here offers a means to speculate on potential future systems. The assumptions that
have been made in the cost analysis are summarised below:
● Computational Performance – An approximation to Moore's Law has been applied to
price/performance. We assume a doubling of performance at constant price every 24 months.
● Storage – Moore's Law has been applied to assume a halving in the unit cost of storage
(DRAM, SSD, SAS) every 24 months up to 2020 and every 36 months thereafter.
● Storage Performance – estimated as:
    ● 50 GBytes/sec [RD04] - DRAM
    ● 250-500 GBytes/sec [RD05] - GDRAM
    ● 140 MBytes/sec [RD06] - SAS/SATA
    ● 500 MBytes/sec [RD07] - SSD
● Low-Latency network – The costing of the network supporting inter-compute node
communication has been based on [RD08].
● Network – The price/port for 10GbE and 40GbE switches will reduce under market
pressure as “Big Data” drives large-scale data centres.
● Cables – Cable costs will remain constant for 10GbE and 40GbE connections. Current R-NICs
supporting RDMA do not use CAT 6 cabling. SMF transceiver costs will decrease.
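The two scaling rules above (performance doubling at constant price every 24 months; unit storage cost halving every 24 months up to 2020 and every 36 months after that) can be written as simple exponential factors. The following Python fragment is a minimal sketch only. The base year of 2014, the function names and the treatment of 2020 as a hard switch-over point are assumptions made for illustration and are not taken from the cost spreadsheet. It is a literal reading of the stated rules, not an attempt to reproduce the tabulated trajectories in this document.

```python
BASE_YEAR = 2014  # assumption: reference year for "today's" pricing


def performance_factor(year, base_year=BASE_YEAR, doubling_months=24):
    """Performance at constant price, relative to base_year.

    Implements "doubling of performance every 24 months".
    """
    months = (year - base_year) * 12
    return 2.0 ** (months / doubling_months)


def storage_cost_factor(year, base_year=BASE_YEAR, switch_year=2020):
    """Unit storage cost relative to base_year.

    Halves every 24 months up to switch_year and every 36 months afterwards.
    Treating switch_year as a hard boundary is an illustrative assumption.
    """
    months_fast = (min(year, switch_year) - base_year) * 12
    months_slow = max(year - switch_year, 0) * 12
    halvings = months_fast / 24 + months_slow / 36
    return 0.5 ** halvings


if __name__ == "__main__":
    for year in (2014, 2018, 2020, 2023):
        print(year, performance_factor(year), storage_cost_factor(year))
```

Under these assumptions a component bought in 2018 delivers four times the 2014 performance for the same price, and a unit of storage costs one quarter of its 2014 price.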
Not covered:
● Computational Performance – Use of application-specific accelerators, FPGA.
● Network – Connections from CSP may be 100GbE but this is not factored in.
● Novel proprietary networking and I/O technologies that may become available as CPUs
and I/O become more tightly integrated, even though these will undoubtedly have a big impact.
● Advances in non-volatile memory technology.
Where appropriate these are discussed in [AD02] and supporting documents.
6 HARDWARE COMPUTE PLATFORM
6.1 HARDWARE COST ESTIMATION METHODOLOGY
Hardware Model
The model is based on current technology components in order to provide an initial Costed
Hardware Concept. The applicability of this model for 2018+ is therefore evolutionary. No
consideration of co-design or the use of application specific hardware is offered in terms of
packaging.
Data and Processing Requirements
The dominant factors affecting the cost of the SDP are the requirement to process the large
amounts of data emanating from the CSP and the subsequent processing in real and pseudo
real-time. The input data rates and performance requirements are modelled for a
"double-buffered" 12-hour period (viz. 6 hours of telescope operation with buffering, and 6
hours of processing of the previous buffer).
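As an illustration of what the double-buffered 12-hour period implies for sizing, the sketch below computes the storage needed to hold two 6-hour observations at a given sustained ingest rate. It is a sketch under stated assumptions: the ingest rate used in the example is hypothetical (the modelled rates come from [AD05]), decimal units are used (1 TByte = 1000 GBytes), and the function name is invented for this example.

```python
SECONDS_PER_HOUR = 3600


def double_buffer_capacity_tbytes(ingest_gbytes_per_s, observe_hours=6.0, buffers=2):
    """Capacity needed to hold `buffers` observations of `observe_hours` each.

    One buffer fills during the 6-hour observation while the previous
    buffer is being processed, hence two buffers in total.
    """
    per_buffer_gbytes = ingest_gbytes_per_s * observe_hours * SECONDS_PER_HOUR
    return buffers * per_buffer_gbytes / 1000.0  # GBytes -> TBytes (decimal)


if __name__ == "__main__":
    # Hypothetical ingest rate of 1000 GBytes/s, for illustration only.
    print(double_buffer_capacity_tbytes(1000.0), "TBytes")
```

Because the processing window is the same length as the observing window, the buffer must also be read back at least as fast as it is written if the pipeline is to keep up.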
Estimate Costs for Build, Ship and Test
A cost per Compute Island has been estimated but it is not shown here.
Overview of the SDP
A schematic representation of the SDP is shown in Figure 2. Raw data can be viewed as being
pushed down (in the context of the diagram) from the Central Signal Processor (CSP) through a
uni-directional, multi-stage bulk data processing network, culminating at a series of Top-of-Rack
switches that feed processing and storage elements housed in compute islands. The compute
islands are interconnected by a secondary network, which therefore does not congest the main
bulk data network, and are finally connected to an archive network for science data products.
Figure 2: Schematic representation of the SDP Costed Hardware Concept showing the unidirectional Ethernet Bulk Data Network (BDN) supporting Ingest to a High Performance Buffer located on the Compute Islands. Data exchange between Compute Islands is supported by an orthogonal bidirectional Low-Latency Network (LLN) currently costed as Infiniband. Science products are delivered to Storage Pods for intermediate and long-term storage over a bi-directional Ethernet network for onward user delivery.
6.2 COMPUTE ISLAND
A Compute Island consists of several interconnected compute nodes (see below). Each compute
island has associated infrastructure and facilities such as shared file systems, management
network and master and control node(s). This makes each compute island largely independent
of the rest of the system. The size of the SDP will be expressed by the number of compute islands
it contains - a parameter that will be freely scalable due to the compute islands’ independent
nature although factored by the interconnectivity of the network. Most of the infrastructure will
be similar between the three SDPs, but it is conceivable that the size of an island (e.g. the number
of compute nodes within an island), or the compute node design itself differs between
telescopes. This may be due to the balance between compute / IO ratios for the different
telescopes although there is a strong preference to maintain a high degree of commonality.
While the total useful capacity of the Science Data Processor depends on many components, we
identify three major defining characteristics that we will use to scale the system:
● Total capacity
● Capacity per Compute Island
● Characteristics per node
The total capacity is defined by the number of compute islands that are available. This top-level
number, the aggregate peak performance (Rpeak) [AD05] expressed in PFlops, is defined by the
number of compute islands that make up the Science Data Processor and the capacity per
compute island. While this number is a useful way to express the size of the system, its usefulness
is limited since it does not take computational efficiency into account. Ideally total capacity of
the system would be defined by the science or system requirements, but considering the
constraints introduced above, it is more likely that total capacity will be defined by the available
budgets (energy, capital or operational).
The capacity of a compute island is defined by the number of nodes per island and the
characteristics of these nodes. This capacity is expressed in computational capacity, i.e. TFlops,
but it is likely that computational capacity will not drive the sizing of the compute islands. Island
capacity is defined by the most demanding application, in terms of required memory, network
bandwidth, or compute capacity that requires a high capacity interconnect.
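The relationship between node, island and total peak capacity described in the two paragraphs above can be sketched as follows. This is illustrative only: the per-node peak of 10 TFlops is a hypothetical figure, the class and function names are invented for this example, and, as noted above, Rpeak takes no account of computational efficiency.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class IslandSpec:
    """Illustrative description of one compute island."""
    nodes_per_island: int
    node_peak_tflops: float  # hypothetical per-node peak

    @property
    def island_peak_tflops(self) -> float:
        # Island capacity = number of nodes x per-node peak.
        return self.nodes_per_island * self.node_peak_tflops


def total_rpeak_pflops(spec: IslandSpec, n_islands: int) -> float:
    """Aggregate peak (Rpeak) in PFlops for n_islands identical islands."""
    return n_islands * spec.island_peak_tflops / 1000.0


def islands_for_rpeak(spec: IslandSpec, target_pflops: float) -> int:
    """Smallest whole number of islands whose Rpeak meets the target."""
    return math.ceil(target_pflops * 1000.0 / spec.island_peak_tflops)


if __name__ == "__main__":
    spec = IslandSpec(nodes_per_island=56, node_peak_tflops=10.0)
    print(total_rpeak_pflops(spec, 100))   # Rpeak of 100 islands
    print(islands_for_rpeak(spec, 100.0))  # islands needed for 100 PFlops
```

In practice the island size is more likely to be fixed by memory, network bandwidth or interconnect requirements than by this peak figure, so a calculation of this kind gives only an upper bound on useful capacity.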
The basic building block of a compute island is the compute node. The characteristics of these
nodes are defined by design equations [AD05], but within these bounds a vast number of valid
node designs can be identified, especially when taking into consideration vendor roadmaps. The
SDP parametric model defines a number of ratio rules that describe suitable node designs. Within
the bounds of these rules, cost, energy efficiency and maintainability are considerations that may
be used to select optimal node implementations. Operational costs, in particular energy versus
deployment and maintenance cost, will play a key role in this decision. It is clear that this decision
cannot be made until more information is available on the likely technology options available for
nodes.
At this point in time, we are not in a position to consider future technologies and as such an
estimate of a potential compute island would comprise:
Component                    Quantity  Notes
Compute Nodes                56        Based on ToR switch size with uplinks to the next stage
Archive Gateway Nodes        2         Maintaining archival capabilities
Login/Service/Master Nodes   2         H/A Cluster and Resource Management
Remote Service Node          1         Managing boot for diskless nodes
Rack Infrastructure          1         Based on 42U current capability
Build and Integration        1         Cost for Scalable Unit (not included in hardware cost)
Scalable Unit Connectivity             Discussed below in networking
Buffer                                 Discussed below in Buffer
Table 1: Configuration of a Compute Island
As yet no allowance in the cost has been made for additional rack-cooling to accommodate heat
density and power requirements [AD06]. In addition, further packaging may be required to
support multiple accelerators and buffers although this needs to be assessed in relation to the
evolution of the compute node model. It should be noted that while in this scenario a Compute
Island maps into a single rack, this may not be the case if the node characteristics or design
changes, which is highly likely.
6.3 BUFFER
Following the model of a highly data-driven architecture, the buffer requirement was modelled
as being distributed locally across compute nodes within a Data Island and not shared globally.
Based on the number of compute nodes this is likely to be many TBytes/node and as such
packaging will be critical to provide not only the necessary bandwidth but also the capacity per
node - this will undoubtedly require a refinement to the compute node architecture assumed
here.
The amount of buffer space has been taken from [AD05]. To cost this we have used the pricing
for both volatile and non-volatile storage media using the following principles:
Storage Pricing/Power
        €/TB                            Watt/TByte
        2014    2018    2020    2022    (assumed constant)
SATA      30       8       5       4    0.625
SAS       80      20      13      11    0.75
SSD      600     150     100      86    1
DRAM    5000    1250     833     714    250
Table 2: Cost trajectory for non-volatile and volatile storage
No discount has been applied to this pricing. We have assumed a ratio of 9:1 SATA:SSD in the
costing estimate.
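The effect of the 9:1 SATA:SSD assumption on the buffer cost can be sketched as a capacity-weighted blend of the Table 2 figures. The sketch below assumes that the ratio applies to capacity (9 parts SATA to 1 part SSD); the function names and the example buffer size are hypothetical and are not taken from [AD05].

```python
# Prices (Euro/TB) and power (Watt/TB) copied from Table 2.
PRICE_EUR_PER_TB = {
    "SATA": {2014: 30, 2018: 8, 2020: 5, 2022: 4},
    "SSD": {2014: 600, 2018: 150, 2020: 100, 2022: 86},
}
POWER_W_PER_TB = {"SATA": 0.625, "SSD": 1.0}

# Assumption: the 9:1 SATA:SSD ratio is a ratio of capacities.
CAPACITY_MIX = {"SATA": 0.9, "SSD": 0.1}


def blended_price_eur_per_tb(year):
    """Capacity-weighted price per TB for the assumed media mix."""
    return sum(share * PRICE_EUR_PER_TB[medium][year]
               for medium, share in CAPACITY_MIX.items())


def blended_power_w_per_tb():
    """Capacity-weighted power per TB for the assumed media mix."""
    return sum(share * POWER_W_PER_TB[medium]
               for medium, share in CAPACITY_MIX.items())


def buffer_cost_eur(capacity_tb, year):
    """Undiscounted media cost of a buffer of the given capacity."""
    return capacity_tb * blended_price_eur_per_tb(year)


# Hypothetical buffer size, for illustration only.
example_tb = 1000
for year in (2014, 2018, 2020, 2022):
    print(year, round(blended_price_eur_per_tb(year), 2),
          round(buffer_cost_eur(example_tb, year)))
```

Under this reading the blended price falls from 87 Euro/TB in 2014 to about 12 Euro/TB in 2022, with a blended power of about 0.66 Watt/TB.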
6.4 COMPUTE NODE
The characteristics of an SDP compute node, in this estimate, have been defined by maximising
the computational capability for a set of imaging techniques [AD05]. This has led to a compute
node design which must deliver
a. A relatively high degree of efficiency on Accelerator-assisted CPUs.
b. A substantial amount of buffer/node (10s to 100s TBytes)
c. High I/O capability to deliver connections to multiple networks.
The actual amount of independent processing performed by each compute node will be largely
defined by the input dataset and type (as defined by data object), processing performance and
dataset size.
For cost estimates we assumed each compute node to be similar to those widely used in
moderately sized HPC cluster systems in research and academia. This comprises a single or dual
socket Intel Xeon CPU (Host) + GPU accelerator(s). Our current model assumes that the
"majority" of processing will be performed on the accelerator(s) and modelled as such. There will
be tasks within the processing pipeline not suited to GPU implementation and assumed to run
locally on the Host CPU. This is not explicitly factored into our estimates. The form-factor for the
compute node will be highly dependent on the requirements for I/O connectivity and accelerator
hosting, especially given the constraints of current Xeon architectures and PCI connectivity
through the chip-set. This limits the number of PCI devices achievable within a particular form
factor and may also imply different types of node for different telescopes.
Current nodes of the assumed type (e.g. using Nvidia Tesla K20 GPUs), typically yield peak double-
precision performance of 1-1.5 TFlops (some competing devices by AMD already offer 2 TFlops).
We assume Moore's Law to hold for GPU computing power until 2018. As contingency we
conservatively assume a mere further doubling of GPU compute power between 2018 and 2022.
These assumptions are combined with the performance requirements [AD05] to estimate the
number of required accelerators to the 2018+ timescale. List prices at launch together with
power ratings for the Tesla range are provided in [RD09].
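The performance trajectory described above can be sketched as follows. The sketch makes several assumptions that are not stated in the text: a Moore's Law doubling period of two years, a 2014 baseline of 1.5 TFlops per accelerator, and a geometric interpolation of the single further doubling between 2018 and 2022. The efficiency and the required performance in the example are hypothetical placeholders; the actual requirement comes from [AD05].

```python
import math


def projected_peak_tflops(year, base_year=2014, base_tflops=1.5, doubling_years=2.0):
    """Projected peak double-precision TFlops per accelerator.

    Assumptions: doubling every `doubling_years` up to 2018, then one
    further doubling spread geometrically over 2018-2022, flat afterwards.
    """
    if year <= 2018:
        return base_tflops * 2 ** ((year - base_year) / doubling_years)
    peak_2018 = base_tflops * 2 ** ((2018 - base_year) / doubling_years)
    fraction_of_doubling = (min(year, 2022) - 2018) / 4
    return peak_2018 * 2 ** fraction_of_doubling


def accelerators_required(required_pflops, year, efficiency):
    """Number of accelerators needed to sustain `required_pflops`."""
    sustained_tflops = projected_peak_tflops(year) * efficiency
    return math.ceil(required_pflops * 1000 / sustained_tflops)


# Hypothetical requirement and efficiency, for illustration only.
for year in (2014, 2018, 2020, 2022):
    print(year, round(projected_peak_tflops(year), 2),
          accelerators_required(required_pflops=100, year=year, efficiency=0.25))
```

Changing the doubling period or the efficiency shifts the accelerator count roughly in proportion, which is why both need to be validated by prototyping.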
We have applied an efficiency value for the node performance based on this peak performance
trajectory, and need to continually monitor and validate this by prototyping. How the memory
bandwidth for these devices will scale with processing performance is highly speculative. For the
purposes of this exercise we have assumed that the overall efficiency will be constant. We
assume that the power per GPU will remain constant at around 300W (although experiments
with the Wilkes cluster in Cambridge suggest that efficiency can be maintained even where GPUs
are de-clocked, thus improving energy consumption). Based on the number of compute nodes,
we can now begin to construct a network supporting many thousands of these units.
The price performance of a node has been based on components available today and in particular
a COTS server consisting of the following:
Compute Node        Qty
Xeon Ivy Bridge     2
Accelerator         2
DRAM GBytes*        64-1024 GBytes
10GbE               2
IB FDR/40GbE        1
IB/10GbE Cable      3
Power (W)           ~800
Table 3: Compute node configuration
* depending on working memory set [AD05]
The price in Euro is based on a 40% discount off list assuming the large volumes that will be
necessary. (This will need to be reviewed and is highly dependent on procurement strategy).
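Two simple consequences of the node configuration can be sketched in Python: the discounted node price, and the compute-node power drawn by a single-rack Compute Island of 56 nodes. The list price used below is a hypothetical placeholder (no list price is given in this section), and the power figure covers the compute nodes only, excluding switches, gateway and service nodes.

```python
def discounted_price_eur(list_price_eur, discount=0.40):
    """Price after applying the assumed volume discount off list."""
    return list_price_eur * (1.0 - discount)


def island_compute_power_kw(nodes=56, node_power_w=800):
    """Power drawn by the compute nodes of one island, in kW."""
    return nodes * node_power_w / 1000.0


# Hypothetical list price, for illustration only.
print(discounted_price_eur(10_000))
print(island_compute_power_kw())
```

With roughly 800 W per node, the compute nodes alone draw about 45 kW per rack, which is the heat density concern raised for rack cooling [AD06].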
6.5 INTERCONNECT SYSTEM - BULK DATA TRANSPORT
Assuming that Ethernet will be the network of choice for constructing the multi-stage bulk data
transport network from ingest ports to each of the compute islands, we can size an Ethernet
network based on a number of links down to compute nodes and a level of over-subscription up
to the next stage, while encompassing a certain degree of redundancy.
To estimate the network cost, we have assumed that a number of compute nodes will be
aggregated by a ToR (Top-of-Rack) switch and these ToR switches, in-turn, aggregated into a
multi-stage network from the CSP->SaDT->SDP with an intermediate stage providing routing
through a Software Defined Networking (SDN) as shown in Figure 2.
A first level of approximation is that the network will be constructed by a 3-stage network in
which a degree of over-subscription can be tolerated. The tolerance will be a function of the anti-
congestion behaviour of the network as a whole and in an over-subscribed network this will be
mitigated by a SDN sensitive to the traffic pattern across the network. The design of the switch-
stack is such that we deploy a 1st stage switch layer to accommodate the direct CSP links; a 2nd
switch stage to fan-out these links to afford a level of redundancy and a 3rd stage (implicit to the
Compute Island) to provide aggregation of final egress of ingest into the buffer, ready for
processing. Network architectures of this form are found in large-scale IT systems, although they
typically follow the nomenclature of core (1st stage), aggregation (2nd stage) and access (3rd stage).
Third Stage – Implicit to Compute Island Configuration
As described above, the 3rd stage of networking has helped define the physical instantiation of a
Compute Island, a useful unit of scalability, albeit the concept of a ToR switch may well become
redundant as models for disaggregation and microserver packaging become mainstream. The
level of aggregation is driven by the size (arity) of the ToR switch – today approximately 48-64
ports of 10GbE each [RD08].
The costing exercise has been based on the cheapest high-port count High Speed Ethernet
(10GbE) switch available which supports a 56D16U (3.5x over-subscription). Currently each port
is an SFP+ connection as Cat6 switches are not yet freely available and have lower port counts as
they tend to consume more power. It is anticipated that Cat6 capability will evolve in the near
future reflecting LOM (LAN on motherboard) 10GbE capability in future motherboards.
Whether such NICs will support RDMA (e.g. iWarp or RoCE [RD10]) will depend on adoption,
although low-latency is not the prime consideration here. For cabling, this cost has been
estimated based on 10GbE SFP+ Cu cables within a rack. It should also be noted that the recently
announced 25GbE standard, driving higher port count devices, may have a significant effect on
pricing.
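The over-subscription figure quoted for the ToR switch is the ratio of aggregate downlink bandwidth to aggregate uplink bandwidth. A minimal Python sketch (the function name is illustrative) reproduces the 3.5x value for the 56D16U configuration, and shows that the same ratio results if the uplink bandwidth is provided as 4 x 40GbE ports:

```python
def oversubscription(down_ports, down_gbps, up_ports, up_gbps):
    """Ratio of total downlink bandwidth to total uplink bandwidth."""
    return (down_ports * down_gbps) / (up_ports * up_gbps)


# 56 downlinks and 16 uplinks, all at 10GbE (56D16U).
print(oversubscription(56, 10, 16, 10))
# The same uplink bandwidth expressed as 4 x 40GbE.
print(oversubscription(56, 10, 4, 40))
```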
Second Stage Networking
To construct the second stage of networking we consider the over-subscription (cost reduction
factor) based on the number of uplinks available from each ToR switch in the 3rd stage of the network.
For the purposes of initial cost estimates the number of 40GbE uplinks is chosen as 4. Thus the
total number of 2nd stage switches will be based on the total number of uplinks from the 3rd
stage and the number of downlinks from the CSP. This stage has again been assumed to be based on
the highest port count 40GbE spine (ToR) switch which presents 36x40GbE (to reiterate, for
simplicity we have not assumed more expensive core switches for this stage of the estimate,
although this may change as the solution is refined). This means that the amount of over-
subscription may vary between the Low, Mid and Survey telescopes depending on ingest rate.
The amount to which this network can be over-subscribed will require validation based on
management of flows and minimization of congestion. Cable costs, at this time, are difficult to
estimate both in terms of cost and in the lengths required. For the purposes of this exercise an
estimate for a QSFP connector with an average price of 150 Euro is made.
First Stage Networking
A first stage of networking, conforming to the CSP input data rates and number of
channels [1], will be constructed as defined by the ingest data rate of the visibility buffer. On the
basis of this, the number of 40GbE or 100GbE channels can be calculated. At the time of writing
it is unclear which interface speed will be available for CSP connection and therefore the model
assumes 40GbE. The actual choice of this, similarly to the 2nd stage network, is again based on
40GbE spine switches as opposed to core switches. Furthermore the model for oversubscription
assumes a non-congesting network and that the proposed SDN environment manages flows in
such a manner to provide this balance. Cable costs for this stage have been estimated from a number
of Single Mode Fibre optic modules forming a “Patch Panel” to the 1st stage network. In summary, the
following per-port pricing has been assumed for 2018+, based on a 2014 port price of around
169 Euro for 10GbE and 600 Euro for 40GbE (excluding cables). This is based on
SFP+ and QSFP physical connections.
        Price/Port for ToR (Euro)       Watt/Port
Year    2014    2018    2020    2022    (assumed constant)
10GbE    180      57      57      57    3
40GbE   1000     200     150     150    3.5
Table 4: Cost trajectory for High Speed Ethernet
No discount has been applied to this pricing.
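To show how the per-port figures in Table 4 translate into a per-island switch cost, the following Python sketch prices a ToR switch with 56 x 10GbE downlinks and 4 x 40GbE uplinks. Costing a switch purely as the sum of its port prices is an assumption of this sketch, cables are excluded, and the function names are illustrative.

```python
# Per-port prices (Euro) and power (Watt) copied from Table 4.
PORT_PRICE_EUR = {
    "10GbE": {2014: 180, 2018: 57, 2020: 57, 2022: 57},
    "40GbE": {2014: 1000, 2018: 200, 2020: 150, 2022: 150},
}
PORT_POWER_W = {"10GbE": 3.0, "40GbE": 3.5}


def tor_switch_cost_eur(year, downlinks_10g=56, uplinks_40g=4):
    """Switch cost approximated as the sum of its port prices."""
    return (downlinks_10g * PORT_PRICE_EUR["10GbE"][year]
            + uplinks_40g * PORT_PRICE_EUR["40GbE"][year])


def tor_switch_power_w(downlinks_10g=56, uplinks_40g=4):
    """Switch power approximated as the sum of its port power ratings."""
    return (downlinks_10g * PORT_POWER_W["10GbE"]
            + uplinks_40g * PORT_POWER_W["40GbE"])


for year in (2014, 2018, 2020, 2022):
    print(year, tor_switch_cost_eur(year))
print(tor_switch_power_w(), "W")
```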
While this model of ToR switch may well be appropriate for the next 5 years, it may well evolve
in the next 5-10 years to a model of network disaggregation or micro-server in which the notion
of ToR becomes moot. It is also not clear whether the density of a ToR switch will also increase
(c.f. the recently announced 25 GbE standard). Furthermore the actual performance and cost of
the network will be driven by the end-to-end connectivity and bandwidth. Currently 10GbE (or
56 Gbps IB) is state-of-the-art; however, we may expect 40GbE to be more cost-efficient in 2020,
as shown in the diagram below taken from the IEEE, although this will have consequences in terms
of the configuration of the Compute Island. If 10 or 40GbE becomes the de-facto on-board
connectivity we would expect significant price drops.
Figure 3: IEEE Prediction for 40GbE availability in commodity x86 Servers
Network Redundancy
Depending on the ultimate strategy taken on the network, higher degrees of redundancy may be
an appropriate tactic to employ. This has not been modelled as yet and will be addressed through
CDR and form part of the analysis conducted within the ILS activities [AD08].
6.6 LOW-LATENCY NETWORKING
Re-ordering of data within the buffer will necessitate the means to exchange data across Data
Islands, and hence a mechanism to cater for inter Compute Island communication. Both for this
purpose and to mitigate any potential congestion of the bulk data transport for intra Compute
Island communication, a separate application network has been accommodated. This is currently
based on Infiniband FDR technology, although this may well change to high-speed Ethernet
supporting RDMA in the timescales under discussion here, as these technologies merge. This
network is termed the low-latency network to differentiate it from the bulk data transport, which
is latency tolerant and bandwidth driven and which lacks the necessary flow-control methods and QoS
(notwithstanding any SDN capabilities) offered by traditional Ethernet switch stacks. Although
full bisectional bandwidth is easily supported within a compute island, similar support between
islands becomes more difficult as cable management and cost estimation become a burden,
similar to that for the bulk data transport. In addition at very large node counts multiple stages,
particularly within a fat-tree topology, impose increasing amounts of memory on the compute
node in support of collective operations although there are a number of initiatives to ameliorate
this problem for particular message passing schema. Topologies, other than fat-tree, may well be
appropriate although these may increase the uncertainty on cable costs and management. For
completeness the Infiniband roadmap is shown here as a reference although other technologies,
in addition to RDMA Ethernet, are planned by various vendors.
Figure 4: Infiniband Roadmap [RD11] SDR - Single Data Rate, DDR - Double Data Rate, QDR - Quad Data Rate, FDR - Fourteen Data Rate, EDR - Enhanced Data Rate, HDR - High Data Rate, NDR - Next Data Rate
The cost of this network has been estimated based on an 8:1 oversubscribed network using the
cluster configuration tool [RD12] and the number of compute nodes defined in [AD09]. The costs
(available today) for either a 3:1 or 1:1 network are also estimated here:
OVER-SUBSCRIPTION LOW MID SUR
1:1 2,039,175 5,796,533 9,996,203
3:1 917,044 2,149,628 3,014,756
8:1 647,269 1,491,761 2,078,085
Table 5: Estimated costs (Euro) for various over-subscribed FDR Infiniband networks for various telescopes
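The cost premium of a less over-subscribed low-latency network can be read from Table 5 by normalising each entry to the 8:1 baseline used in the estimate. The Python sketch below does this; the figures are copied from Table 5 and the function name is illustrative.

```python
# Estimated costs (Euro) copied from Table 5.
COST_EUR = {
    "1:1": {"LOW": 2_039_175, "MID": 5_796_533, "SUR": 9_996_203},
    "3:1": {"LOW": 917_044, "MID": 2_149_628, "SUR": 3_014_756},
    "8:1": {"LOW": 647_269, "MID": 1_491_761, "SUR": 2_078_085},
}


def cost_ratio(oversubscription, telescope, baseline="8:1"):
    """Cost of a network relative to the baseline over-subscription."""
    return COST_EUR[oversubscription][telescope] / COST_EUR[baseline][telescope]


for telescope in ("LOW", "MID", "SUR"):
    ratios = {key: round(cost_ratio(key, telescope), 2) for key in COST_EUR}
    print(telescope, ratios)
```

On these figures a full 1:1 network costs roughly three to five times the 8:1 baseline, depending on the telescope.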
6.7 MANAGEMENT AND ARCHIVE
A separate (10GbE) network has been provisioned for support of the archive. Please note that
this is shown in relation to the full archive for 2022.
The management network is provisioned in terms of standard 1GbE network switches and
included in the price of a compute island.
7 HIERARCHICAL STORAGE (WAS CALLED SCIENCE ARCHIVE)
The science archive consists of the following hardware components: medium performance buffer
and long term storage.
The science data archive operational model and the data life cycle of the archived science data
are currently unknown. Therefore a generic storage design (disk based) was adopted for the long
term storage similar to what is being used by online backup and cloud storage vendors. Once the
operational model and data life cycle is understood the archive design can be optimized and
could potentially include tape storage.
The need for a third storage tier, namely that of a slow or intermediate buffer (medium
performance buffer), arises from SUC4, which is understood to require stacking data taken over
many days, weeks or months. Storing test data and support of commissioning activities are other
potential uses of the slow buffer. Since there is no need for double buffering in the slow buffer,
it is half the size of the fast buffer. The size of the fast buffer is taken from AD05. The archive will
therefore consist of tiered storage and this will be optimized in the future to meet the operational
requirements and minimize the total cost of ownership. In order to manage the tiered storage
archive Hierarchical Storage Management software will be required as discussed in 7.4.8.1.
Since the operational model for the archive is unclear, no assumptions have been made on the
level of data redundancy that is required apart from providing a second copy of the data (as per
AD03). All storage volumes are indicated as raw storage volumes.
7.1 LONG TERM STORAGE & MEDIUM PERFORMANCE BUFFER
Both the medium performance buffer and the long term archive storage are based on [RD13].
This is a vendor independent storage pod design based on commodity components. To project
the cost to 2022 the storage density was scaled according to industry roadmaps, see Figure 5. The storage
pod consists of a 4U enclosure which takes 45 3.5” consumer SATA hard disk drives. The standard
connectivity is 1GbE and thus we add a 10GbE Ethernet card for connectivity to the archive
network. To scale the costs from the 2014 design to 2022 we scale the storage density per hard
disk according to the ASTC (Advanced Storage Technology Consortium) technology roadmap
[RD14]:
Figure 5: ASTC Technology Roadmap
As per the ASTC technology roadmap a Compound Annual Growth Rate (CAGR) of 30% is being
used to extrapolate storage density to the 2022 timeframe.
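The extrapolation amounts to compound growth of the per-drive capacity. A minimal Python sketch, using the 4 TB drive and 45-drive pod described in this section as the 2014 baseline (function names are illustrative):

```python
def projected_drive_capacity_tb(base_tb, base_year, target_year, cagr=0.30):
    """Per-drive capacity after compound annual growth at rate `cagr`."""
    return base_tb * (1.0 + cagr) ** (target_year - base_year)


def pod_capacity_tb(drive_tb, drives_per_pod=45):
    """Raw capacity of one storage pod."""
    return drive_tb * drives_per_pod


drive_2022 = projected_drive_capacity_tb(4, 2014, 2022)
print(f"2022 drive: {drive_2022:.1f} TB, pod: {pod_capacity_tb(drive_2022):.0f} TB")
```

At 30% per year over eight years the capacity grows by a factor of about 8.2, so a 4 TB drive in 2014 corresponds to roughly 33 TB in 2022.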
For the medium performance buffer we use the standard storage pod configuration and for the
long term storage we use the storage pod in a MAID (massive array of idle drives) configuration
to reduce operational costs (power consumption). There is potential for further reduction of
operational costs due to reduced costs for the included, constant overheads for networking per
port and server costs. In addition it is likely that the disk failure rate for MAID configurations is
lower than the calculated 4% per year. Neither of these factors has been included, and thus the
calculated estimates can be regarded as conservative.
The estimated cost of the hardware (excl hard disks) of the storage pod is based on a recent
procurement by ICRAR from [RD15]. The cost of a storage pod is €4219 (incl shipping to Australia).
A cost of €500 was added to the storage pod to account for the assembly, test and burn-in of the
storage pods. The cost of the rack has been included in the capital cost of the pod (assuming 10
4U pods per 42U rack). A typical price (2014) of €144 per 4 TB hard drive was used. This is a
conservative estimate as currently 4 TB internal SATA hard drives are readily available for €120.
Volume discounts can reduce the cost of hard drives further and therefore a conservative 20%
discount was used. The cost of the storage pod including media is thus €57888 per PB (excluding
network switches and cables).
7.2 DELIVERY PLATFORM HARDWARE
We are planning on 6 servers to host DELIV services and provide resiliency at each telescope.
These will run all of the user access servers, the data discovery database and the tools to schedule
the data movement. We are also costing out storage that would be used to support the database
back-ends and data caching for data that is being visualised or moved to a site. The services that
are deployed need to be configured in a resilient manner, therefore it is important that we have
redundant servers available to run the different services that are foreseen within the DELIV
activity as well as their possible replacements. Therefore we see a 50% replacement rate as being
suitable, i.e., there are four main services which will each have their own hosting system and two
spare resources able to host any of the other services. Less resilience than this could leave the
system in a state, due to multiple failures, that would result in inaccessibility.
7.3 LMC HARDWARE
These have not been specifically priced, although an allowance has been made for a separate
1GbE management network connecting all components in the system and for a "redundant"
Management Unit rack consisting of 56 commodity Xeon units. This should provide sufficient
management processing resource for the SDP. The exact method for connectivity has not been
discussed.
8 SDP SOFTWARE COST ESTIMATION
8.1 ESTIMATION METHODOLOGY
8.2 EFFORT ESTIMATION
For software development several techniques or methodologies can be used to estimate effort.
So far, a mixture of effort estimation techniques has been used depending on the current
understanding of a particular software component. In areas where the design has not been
defined yet, the Wideband Delphi method was used. In areas that map closely to existing
software and the existing software is well understood and available, source lines of code
estimation is used. In future when the design has been further defined (and where applicable),
function point analysis may be used to estimate effort.
8.3 LABOUR RATES
For domain specific software development (and other domain specific tasks), an average labour
rate has been calculated based on information from these organisations: SKA-SA, UWA, UCAM,
UK National Lab, Astron, NRC.
For non-domain specific software development typical commercial labour rates are used.
Labour Resources                       SDP rate (avg)  Min rate  Max rate
Senior Project Manager                 €669            €467      €770
Intermediate Project Manager           €434            €287      €625
Senior Architect                       €690            €530      €770
Project administrator                  €200            €200      €200
Project Engineer                       €598            €467      €742
Senior Engineer                        €598            €467      €742
Intermediate Engineer                  €434            €287      €625
Junior Engineer                        €334            €200      €488
Senior Scientist                       €574            €344      €742
Intermediate Scientist                 €447            €250      €625
Contracts Specialist                   €352            €352      €352
Administrative Support**               €241            €241      €241
Senior Server Support Engineer         €337            €337      €337
Intermediate Server Support Engineer   €250            €250      €250
Junior Server Support Engineer         €189            €189      €189
Consultant Senior Architect            €1,200          N/A       N/A
Consultant Senior Engineer             €1,000          N/A       N/A
Consultant Intermediate Engineer       €800            N/A       N/A
Consultant Junior Engineer             €600            N/A       N/A
Senior Network Support                 €337            €337      €337
Intermediate Network Support           €250            €250      €250
Junior Network Support                 €189            €189      €189
Table 6: Estimated Labour Rates
8.4 SOFTWARE ESTIMATION ASSUMPTIONS
For software development effort we have shown and costed the development effort separately
from the AIV and documentation effort that forms part of the overall software delivery.
The costs of Project Management and Systems Engineering have been excluded from this cost
estimation.
AIV effort
As the SDP software development project is likely to be a broadly distributed project, with small
components being developed in isolation from each other and the telescope site, a significant effort
will be needed (at all levels) to integrate these into a coherent whole. We have therefore included
an additional 30% effort (based on the development effort) to account for these activities. This
30% overhead should allow a formal program of integration and QA in a tree-like approach for
the SDP element.
Documentation effort
Due to the complexity and scope of the SDP software development project and to comply with the
v3 Level 1 requirements, an additional 15% effort is included to produce documentation. This
effort excludes source code documentation in the form of comments within the source code, as
this forms part of the development effort. This additional 15% effort for documentation is
assumed to be spent after completion of the development tasks.
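The two overheads combine additively on top of the development effort. A minimal Python sketch (the function name, the unit of effort and the example figure are illustrative assumptions):

```python
def total_software_effort(development_effort, aiv_fraction=0.30, doc_fraction=0.15):
    """Break down total effort into development, AIV and documentation.

    Both overheads are calculated on the development effort, as stated
    in the assumptions above. Any consistent unit (e.g. person-days) works.
    """
    aiv = development_effort * aiv_fraction
    documentation = development_effort * doc_fraction
    return {
        "development": development_effort,
        "aiv": aiv,
        "documentation": documentation,
        "total": development_effort + aiv + documentation,
    }


# Hypothetical development effort of 1000 person-days.
print(total_software_effort(1000))
```

The total is therefore 1.45 times the development effort, before Project Management and Systems Engineering, which are excluded from this estimate.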
8.5 SOFTWARE COMPUTE PLATFORM
[AD01] defined the SDP Software Stack as:
Figure 6: The Science Data Processor System stack, showing the relationship between the level 2 elements of the product tree. Boxes are arranged so that each box is allowed to use only the boxes below it. Furthermore, the horizontal partitioning of boxes into columns is approximately arranged so that boxes in vertical alignment tend to be used together. The boxes are colour coded to reflect the L1 elements of the product tree that they come from: Computer Hardware (orange), Computer Software (gray), Pipelines (green), Data (yellow), Delivery Platform (black), LMC (blue).
Compute Operating System
The operating system forms the basis of the software stack and is the interface with the hardware
compute platform. The operating system needs to support all hardware conceivably deployed in
the Science Data Processor, be extremely scalable and, as experience with precursor and
pathfinder experiments has shown, highly tunable. In this respect appropriate levels of resource
will be attributed to the maintenance and support of the O/S ensuring patch maintenance,
release consistency, fault diagnosis and repair, application integrity and upgrade schedules.
Middleware
Middleware components will be developed from existing Open Source software environments.
These consist of the following:
1. Reliable Communication Channels - based on 0MQ or equivalent
2. Event Handling and Logging
3. Cluster and Platform Management
4. Development Environment
5. System Optimization - including behavioural modelling, performance analytics
The effort for 1) and 2) has been estimated from experience with ALMA in respect of 0MQ or
equivalent, and will also include other environments such as MPI (or equivalent) and OFED.
Event handling and logging will include the collection and analysis of compute island metrics;
ingest, bulk-data and archive data transport metrics and flow-rates, buffer metrics and the fusion
of this data in order to provide a system-wide state-of-health for the SDP. Such metrics will
include environmental parameters, performance counters, power consumption and soft and
hard errors. Additionally, methods for the early prediction of failures will be developed.
The effort for 3) is based on the exploitation of open-source frameworks, such as OpenStack, and
will address SDP specific enhancements specifically in terms of, for example, scale; pipeline
instantiation and scheduling; provisioning, management and control and in particular integration
with LMC.
The development environment (4) will include all aspects of code development, instrumentation
and optimization, coupled with source-code control mechanisms and the ability to test applications
within a secure and stable environment. Mechanisms for production roll-out will also be
developed.
System optimization (5) will seek to develop tools for system-wide instrumentation and
modelling of the SDP. This task will initially explore open-source simulation and behavioural
modelling tools and their suitability for SDP. These environments will be refined during operation.
Hierarchical Storage Management
Due to the simplicity of the current design of the hierarchical storage (only disk tiers) there is no
need to provision for Hierarchical Storage Management software like IBM Tivoli. The Data Layer
provides basic hierarchical storage management functionality inherently and therefore no
additional software is required based on the current design.
In the event that the hierarchical storage design becomes more complex and/or storage tiers
using other types of media are included the cost of a Hierarchical Storage Management system
will have to be included.
Application Development Environment and SDK
The cost for this item is covered in the System Level Tasks section and therefore not costed here.
Scheduler
The estimate of effort required for the scheduler depends on two main assumptions:
1. We can modify an existing Open Source scheduler sufficiently to be suitable for our
application.
2. The scheduler functionality is largely shared between it and the LMC component it
interfaces with.
Both of these assumptions mean that only limited effort is needed for the scheduler. We adopted
a slightly front-loaded approach, having two people working on the scheduler from the beginning
to get the architecture settled, with the senior developer effort reduced halfway through the
project when only implementation remains.
We do see some technology risk in this work package and therefore use an existing scheduler
with added functionality. Modern batch schedulers, like SLURM, have plugin support, which
should allow for this. We do require however that the scheduler interfaces directly with LMC to
estimate the available hardware resources days before an observation (to make a rough
schedule), and immediately beforehand (to help LMC create the physical deployment graph). This
functionality is not usually available, which incurs some risk. This is accommodated for in the
relatively high contingency for this line-item.
8.6 DATA LAYER
Costs are largely based on existing projects, in particular ALMA, LHC/CERN, LSST, SDSS, the
precursor MWA and their archives. A detailed cost discussion is part of a SKA Science Book
chapter (.pdf, p.14f) with co-authors from all above mentioned projects. For non-domain
software the costing is split between licenses for commercial solutions and labour for adaptation.
The Compute Island topology is expected to require customisations in close collaboration with
vendors. Since licensing schemes are convoluted and often bundled with hardware, the archive
cost sheet (.xlsx) segregates total cost of ownership into those for storage media, power etc. It
introduces a 100 % fair comparison factor on hardware prices/petabyte to accommodate a
backup copy which is provided by the AWS Cloud option as part of its service.
Data Manager
The Data Manager is a process running on each data locality and implements the physical
deployment graph in response to the execution graph defined by the LMC Data Flow Manager. It
initiates the physical movement of data and uses the pipeline interface to invoke the processing
components.
Data Lifecycle Manager
DLM is a policy-based approach to managing the flow of data throughout its lifecycle. It provides
basic functions for the archive and the backup system. It requires the adaptation of a suitable
hierarchical storage management system (HSM). A study on HSM capabilities and cost is ongoing.
As with most data layer components it is uncertain where a build or buy analysis will lead. At one
extreme, one can envisage a Cloud solution not requiring any software development for large
parts of the data layer. Another possible outcome is a mix of tightly integrated storage managers
and object stores specifically tailored for a parallel HPC environment requiring a significant
amount of vendor support for adaptation and testing. A custom solution developed within the
project is considered too demanding and hence undesirable.
Science Archive Software
This includes data product ingest, indexing, replication, and access level control governed by the
archive policy. Excluded are backup, recovery, the archive portal, and batch access which are
covered by resources allocated under Data Lifecycle Management as well as the delivery
infrastructure, which explains the relative balance of FTEs.
Precursor projects generally do not have a specific Data Layer element to compare with. This is a
result of the SDP performance requirements and the need to focus on data locality. The ALMA
DB license sheet covers all related systems in the archive as well as the operational domain.
● Oracle License Sheet [RD16]
● General discussion of DB cost models [RD17]
Local Database Services
This is an overarching SDP database service infrastructure. It includes the respective needs of the
Data Layer, Data Delivery, LMC TM caching, scheduler infrastructure, pipelines (catalogues, sky
models), auxiliary metadata, local logging and monitoring, and can support the HPC
infrastructure as needed.
ALMA had one DBA and one DB application developer during the whole construction phase. The
latter worked mainly on the data layer specific DB applications. Even in the very conservative ESO
operational environment development never stopped, with new operational DBs regularly going
online. Also the ASKAP Science Data Archive Team has a DB specialist for the more limited scope
equivalent to the SDP archive and delivery mechanism.
Ingest Data from CSP into Data Layer
Short Description: Data Ingest is the process of taking the bulk data stream from CSP,
aggregating it, synchronising it with the metadata stream from TM and mapping this aggregation
onto the SDP fast buffer. This functionality is intertwined with LMC.
SDNs are a relatively novel technology, hence a slightly more elaborate description is: Data
packages from the CSP/SaDT network transport layer are routed to SDP Compute Islands,
aggregated and eventually instantiated as data objects on a fast local buffer. Independent of that,
a metadata stream is broadcast via LMC to all Compute Islands. The metadata establishes the
context of a data object. Once a data object is instantiated it becomes available to science
pipeline components and the context changes from the hardware centric Compute Island to a
processing oriented Data Island.
8.7 PIPELINE COMPONENTS
Processing Library
Most if not all the cost estimates for the processing library are based on pathfinder and precursor
information.
LOFAR spent 71 FTE on SDP related s/w (Ingest, Calibration, Imaging, Pipeline framework). 3 FTE
were spent on the GSM (rough estimate) and 3.5 FTE were spent on Source Finding (rough
estimate), the latter 2 being done at the Dutch Universities. Of the 71 FTE, 29 FTE were spent in
Pre-Construction (i.e. pre-CDR) and 42 FTE were spent in Construction (i.e. post CDR).
Please note the following: LOFAR had its CDR in April 2007 and was opened for the General
Astronomer in 2012. Hence, the years 2001 - 2006 are considered pre-construction, and the years
2007 - 2013Q1 are considered construction. Post 2012 LOFAR s/w development continues, both
in optimising existing code and in adding new features. This is not included in the current figures.
Post 2012, a team of approx. 5 FTE spends 20% of their time on SDP related s/w maintenance;
i.e. 1 FTE / yr. The total LOFAR software development includes software for TM, CSP, and SDP
related tasks. We have performed a rough extraction of SDP related FTEs. However, sometimes
people work on multiple components making it hard to split up the effort. Effort for the archive
is not included, since those numbers are not available. Effort for the GSM is a rough estimate.
Effort for the Source Finding is an even rougher estimate. The effort for commissioning is not
included. Part of the effort is in research related tasks. The Pre-Construction figures especially
include a lot of research and prototyping. The effort on NIP software is not included in the 71
FTE.
So far MWA spent the following effort on its Processing Library s/w. Imaging and Calibration: 6.5
FTE; Transient Detection: 2.75 FTE* (realistically 3.25 FTE); Source Detection: 2.25 FTE*
(realistically 3.25 FTE); EoR pipelines (there are two totally distinct pipelines for EoR on the
MWA): 31 FTE (one cost 20 FTE, the other 11 FTE). Note that these pipelines represent a full EoR
analysis path from point source and foreground removal through to production of final EoR data
products.
ASKAP spent 26.5 FTE on software development in the period Jul 2006 - Jul 2015. The effort spans
both design and construction and includes Ingest, Calibration, Imaging, Source Finding, Local
Monitoring & Control, and Management. It excludes efforts for the Archive.
Meerkat spent 49 FTE including design / 35 FTE excluding design on software development in the
period 2009 - 2016. The effort can approximately be broken down as follows:
Infrastructure and architecture, incl. data transport, excl. archiving   25%
Calibration and Imaging                                                   38%
Ingest                                                                    13%
Archiving and storage                                                     13%
LMC & User interfaces                                                     13%
Table 7: Estimated effort based on Meerkat
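As an illustration only, the percentages in Table 7 can be turned into approximate FTE figures. The sketch below (Python) applies them to the 49 FTE total that includes design; whether the breakdown refers to that total or to the 35 FTE figure is an assumption, and the shortened category names are ours. The shares as printed sum to 102%, which is consistent with rounding, so the sketch renormalises them.

```python
# Shares from Table 7 (percent); category names are abbreviated for this sketch.
SHARES_PERCENT = {
    "Infrastructure and architecture": 25,
    "Calibration and Imaging": 38,
    "Ingest": 13,
    "Archiving and storage": 13,
    "LMC & User interfaces": 13,
}


def split_effort(total_fte, shares=SHARES_PERCENT):
    """Distribute total_fte in proportion to the renormalised shares."""
    total_share = sum(shares.values())  # 102 as printed, not 100
    return {name: total_fte * share / total_share for name, share in shares.items()}


# Assumption: the breakdown applies to the 49 FTE total including design.
breakdown = split_effort(49.0)
for name, fte in breakdown.items():
    print(f"{name:32s} {fte:5.1f} FTE")
```

Passing 35.0 instead gives the corresponding split for the figure excluding design.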
Science Analysis Pipeline Software
The LOFAR Source Finding software (PyBDSM) took approximately 3.5 FTE to develop (rough
estimate), which is comparable to the MWA effort (3.25 FTE). We use this as estimate for the
Postage Stamp Source Detection.
To date no precursor instrument has developed a Rotation Measure Synthesis pipeline. The
POSSUM team for ASKAP has made some progress towards such a pipeline for ASKAP but it is not
yet mature or complete. The estimate of the cost associated with production of such a pipeline
is therefore based on taking the effort required for the well understood case of a Postage Stamp
Source Detection pipeline (3.5 FTE) and adding a multiplier to account for the fact that an RM
pipeline requires understanding of complex Faraday spectral products and is an entirely new
modality, giving a total of 4.5 FTE.
The Transient Source Detection pipeline on the MWA has taken 2.75 FTE and requires another
~0.5 FTE for completion, giving a total of 3.25 FTE for the detection part alone. The imaging
aspects are not included in these estimates. We have used 3.5 FTE for the SKA to account for the
fact that the work must be ported to all 3 instruments.
We have not included the cost of the EoR pipeline because the full EoR pipeline is not part of
the requirements.
Non-Imaging Software
The non-imaging software consists of the development of 3 real-time pipelines: pulsar timing,
pulsar search and transients. Effort is therefore needed for all three, although overlap between
aspects of pulsar search and pulsar timing saves about 20% of the effort. In all cases we have
based the estimates on existing code trees. The
precursor/pathfinders don't provide us with much of a basis because none of them attempt to
operate in real time, and they generally all use the same publicly available codes and string them
together in a pipeline. For LOFAR generating such a pipeline script cost more than 3 FTE alone.
The only area where this isn't the case is the transients where real time systems have been
developed from scratch for LOFAR single stations and Parkes. It is also important to note that the
codes currently used in pulsar timing and pulsar search have been built up over many years by
astronomers in the main. They haven't been developed as real-time, robust, well-documented
and unit tested codes (this is not to denigrate those codes at all!) and so are used here as a guide.
For pulsar timing our estimate is based on the TEMPO2 [RD18] software suite and the calibration,
template matching and data manipulation and graphics code from PSRCHIVE [RD19]. Assessing
these codes we believe that approximately 50,000 lines of code are required and we used a code
writing rate of 16 lines-of-code/day, which is based on professional code development estimates
and includes some design work based on the CDR documentation and detailed testing.
For the pulsar search the estimates are based on the requirement to develop a code to do the
cross-beam sifting of the candidates to find unique and real candidates and also for the
development of a machine learning code which identifies the real pulsars in the data stream. We
also need data manipulation tools but those can be the same as those developed for the timing
above. The sifting code is well established in a number of existing search codes and so our
estimate is robust here. The machine learning code is one where we (and others) are currently
doing lots of design work and so we have a reasonable idea, but there is a larger error bar on this
estimate. We have a total code requirement of about 10,000 lines.
The single pulse processing software needs to filter, present and be able to trigger on signals of
interest. As mentioned before our team has been involved in developing such code already and
our estimate is that 5,000 lines of code are required here.
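The line counts above can be converted into person-days with the quoted code writing rate. The following Python sketch is illustrative only: the text states the 16 lines-of-code/day rate for pulsar timing, so applying it to the search and single pulse codes is an assumption, the 20% overlap saving is not applied, and the dictionary keys are our own labels.

```python
# Rate quoted for pulsar timing; using it for all three pipelines is an assumption.
LOC_PER_DAY = 16

# Estimated code sizes quoted in the text (lines of code).
PIPELINE_LOC = {
    "pulsar timing": 50_000,
    "pulsar search": 10_000,
    "single pulse": 5_000,
}


def person_days(loc, loc_per_day=LOC_PER_DAY):
    """Person-days needed to write `loc` lines at `loc_per_day` lines per day."""
    return loc / loc_per_day


days = {name: person_days(loc) for name, loc in PIPELINE_LOC.items()}
total_days = sum(days.values())

for name, d in days.items():
    print(f"{name:14s} {d:8.1f} person-days")
print(f"{'total':14s} {total_days:8.1f} person-days")
```

Converting person-days into FTE years requires a working-days-per-year figure, which this section does not state.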
Imaging Pipeline Software
It is likely that the domain specific SKA software will be more complex than the LOFAR s/w and
that effort is needed for further investigation and optimisation. On the other hand we also have
working software suites like CASA, and the ASKAP / Meerkat / LOFAR s/w. As a rough estimate it
is therefore likely that the SKA Processing Library s/w will take as much effort as the total LOFAR
effort. Therefore, the total effort for the Ingest Pipeline, the Calibration Pipeline, and the
Continuum Imaging Pipeline is currently estimated to be 70 FTE, of which 10 FTE is in the area of
Ingest, and 30 FTE each for Calibration and for Continuum Imaging. An additional 15 FTE is then
anticipated for Spectral Line Imaging and another 15 FTE for low-latency Slow Transients Imaging.
This brings the total estimate for the Imaging effort to 60 FTE.
The Imaging effort is then further split up into 30 FTE for Deconvolution and 30 FTE for Gridding
/ FFT. These efforts are then further broken down into 15 FTE for general Imaging development
and 5 FTE specific effort for each of the telescopes: LOW, MID, and SURVEY. Current Imaging
effort is costed based on a combination of ALMA & ASKAP software effort. This effort is divided
up into implementation of generic algorithms, which is expected to be common to all three
instruments, plus optimisation of these generic algorithms for specific instruments. It is expected
that optimisation of more universal algorithms such as the FFT will be undertaken by industrial
partners.
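The imaging effort figures quoted above can be checked for internal consistency. The Python sketch below simply re-adds the numbers from the text; the variable and function names are ours and carry no meaning beyond this illustration.

```python
# FTE figures quoted in the text.
PIPELINES_FTE = {"Ingest": 10, "Calibration": 30, "Continuum Imaging": 30}
EXTRA_IMAGING_FTE = {"Spectral Line Imaging": 15, "Slow Transients Imaging": 15}
TELESCOPES = ("LOW", "MID", "SURVEY")

# Ingest + Calibration + Continuum Imaging.
core_total = sum(PIPELINES_FTE.values())

# Continuum, spectral line and slow transients imaging together.
imaging_total = PIPELINES_FTE["Continuum Imaging"] + sum(EXTRA_IMAGING_FTE.values())


def component_effort(general_fte=15, per_telescope_fte=5):
    """General development effort plus telescope-specific effort for one component."""
    return general_fte + per_telescope_fte * len(TELESCOPES)


deconvolution = component_effort()
gridding_fft = component_effort()

print(core_total, imaging_total, deconvolution, gridding_fft)
```

Under these figures the two imaging components (Deconvolution and Gridding / FFT) together account for the full imaging total.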
Line of Code estimates:
LOC/hr        Mach.   Source             Notes
5.6 (±10%)    GPU     Cobalt (Nijboer)   Significant domain specific expertise
7.0‡          CPU     NIP (Stappers)     Source only
5.3‡          GPU     NIP (Stappers)     Source only
3.5‡          FPGA    NIP (Stappers)     Source only
‡Calculated assuming 1 day = 5.7 hours.
Table 8: Line of Code (LOC) production rate estimates
AWImager example:
Type Code† Comment† Total
Header 1995 716 4177
Implementation 5093 5076 14292
Test Routines 782 67 1088
†A code or comment line must contain an alphanumeric character, thus a single brace or so does
not count as code.
Table 9: AWImager LOC analysis
It is widely accepted that purely LOC driven software costs provide an under-estimate of required
effort for any software project; however, they can be used as a starting point. We use the
development of the COBALT correlator as a guideline LOC estimate here. This recently completed
project involved re-writing existing software from the LOFAR BlueGene correlator for the new
GPU-based COBALT correlator. For comparison we also include generic numbers obtained by
PIP.NIP for different platforms; see Table 8. Taking as a guideline 1 year = 1400 project hours,
i.e. 27 hrs per week on average, these numbers give a range of 20.0 - 30.2 LOC/day;
94.5 - 151.2 LOC/wk; 4900 - 7862 LOC/yr.
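The conversion from hourly to daily, weekly and yearly rates can be sketched as follows (Python, illustrative only). The working-time constants are those quoted in the text and in the Table 8 footnote; the dictionary labels are our shorthand. The quoted per-day range appears to correspond to the 3.5 and 5.3 LOC/hr entries and the per-week range to 3.5 and 5.6 LOC/hr; the upper yearly figure of 7862 matches 151.2 LOC/wk over 52 weeks rather than 5.6 x 1400 = 7840.

```python
# Working-time assumptions quoted in the text.
HOURS_PER_DAY = 5.7    # Table 8 footnote
HOURS_PER_WEEK = 27    # "27 hrs per week on average"
HOURS_PER_YEAR = 1400  # "1 year = 1400 project hours"

# LOC/hr figures from Table 8 (labels are shorthand for this sketch).
LOC_PER_HOUR = {
    "COBALT GPU": 5.6,
    "NIP CPU": 7.0,
    "NIP GPU": 5.3,
    "NIP FPGA": 3.5,
}


def convert(loc_per_hour):
    """Return (LOC/day, LOC/week, LOC/year) for one hourly rate."""
    return (
        loc_per_hour * HOURS_PER_DAY,
        loc_per_hour * HOURS_PER_WEEK,
        loc_per_hour * HOURS_PER_YEAR,
    )


rates = {name: convert(rate) for name, rate in LOC_PER_HOUR.items()}

for name, (per_day, per_week, per_year) in rates.items():
    print(f"{name:11s} {per_day:6.2f} LOC/day {per_week:7.1f} LOC/wk {per_year:7.0f} LOC/yr")
```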
Online sources, such as a discussion on the popular developer website Stack Overflow [RD20],
suggest that 10 LOC/day is a typical number taken over the full duration of a project, but that the
average LOC per day over the actual development period of the project is significantly higher.
This suggests that the LOC/hr numbers here are substantially higher than full duration numbers,
but are consistent with development period numbers for waterfall style projects where
development typically comprises only 20-30% of the total project time.
For AWImager, the source alone comprises 5093 LOC, see Table 9. This would suggest that
re-writing AWImager would take approximately 0.65-1.04 FTE years. Including inline documentation
and header files would increase this figure to approximately 2.5 FTE years and including the
BEAM calculation would add a further 0.5 FTE. This results in approx. 3 FTE years in total. We
note that test routines are not considered here.
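The 0.65-1.04 FTE year range follows from dividing the AWImager source line count by the yearly LOC rates quoted earlier. A minimal Python sketch of that division, with names chosen for this illustration:

```python
AWIMAGER_SOURCE_LOC = 5093  # Table 9, implementation code lines
LOC_PER_YEAR_LOW = 4900     # lower yearly rate quoted in the text
LOC_PER_YEAR_HIGH = 7862    # upper yearly rate quoted in the text


def fte_years(loc, loc_per_year):
    """FTE years needed to write `loc` lines at `loc_per_year` lines per FTE year."""
    return loc / loc_per_year


fastest = fte_years(AWIMAGER_SOURCE_LOC, LOC_PER_YEAR_HIGH)
slowest = fte_years(AWIMAGER_SOURCE_LOC, LOC_PER_YEAR_LOW)

print(f"{fastest:.2f} - {slowest:.2f} FTE years")
```

The figures that include documentation, header files and the BEAM calculation are not derived here, since the text gives them only as approximate totals.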
AWImager relies heavily on routines from the CASA package. The components of CASA for
calibration and imaging comprise approx. 100 kLOC in total. It is difficult to divide these cleanly
between the two, but scaling from AWImager, it would require approx. 20 FTE years to
reproduce and twice this to include inline documentation. Making the rough assumption that the
CASA LOCs are divided 2:1 between IMG:CAL, this implies that the supporting routines from CASA
would take a further 24 FTE years to reproduce (without double counting for routines common
to CASA & AWImager). This code would then need to be assembled into the individual pipelines
(continuum, spectral, slow). Experience from existing instruments suggests that assembling a
pipeline from existing code requires 1.5-2.0 FTE years of effort. Here we assume a representative
value for the 3 pipelines of 5 FTE years.
Following from the ASKAPsoft software development model we assume a 33% overhead for
instrument specific optimisation and scaling of individual software components.
We note that the numbers for development of the generic software components can be justified
almost completely (29 FTE years rounded to 30 FTE years) on LOC considerations. This suggests
that a certain amount of risk is incorporated here, as such numbers possess an intrinsically large
uncertainty (+/- 10% in the COBALT numbers alone) as well as being expected to represent a
minimum amount of required effort.
Note: The Fast Imaging Pipeline (Slow Transients) differs from the Continuum Imaging Pipeline in
the sense that FFT costs dominate the processing budget. However in principle, there are no
additional components in the pipeline compared to the other imaging pipelines. In terms of
processing speed it may be necessary to implement a different form of FFT (sparse FFTs on GPU
are looking good at the moment; 2 memos in preparation) and a different form of source
detection (since sFFT routines only output significant components, not images). Additional costs
may be required at the LMC Control level, but these should not be substantial.
Ingest Pipeline Software
It is likely that the SKA Ingest Pipeline is more complex than the LOFAR equivalent and that
additional effort is needed for further investigation and optimisation. On the other hand we also
have working software suites like CASA, and the ASKAP / Meerkat / LOFAR software. As a rough
estimate it is therefore likely that the Ingest Pipeline software will take as much effort as the
equivalent LOFAR effort, 10 FTE.
Calibration Pipeline Software
It is likely that the Calibration Pipeline is more complex than the LOFAR equivalent and that
additional effort is needed for further investigation and optimisation. On the other hand we also
have working software suites like CASA, and the ASKAP / Meerkat / LOFAR software. As a rough
estimate it is therefore likely that the Calibration Pipeline software will take as much effort as the
equivalent LOFAR effort, 30 FTE.
The Calibration Pipeline effort is then further split up into 15 FTE for general Calibration
development and 5 FTE specific effort for each of the telescopes: LOW, MID, and SURVEY.
Image Space Search Engine
This FTE effort is associated with the source detection & characterisation required as part of the
calibration loop in order to produce the global sky model. It is based on the s/w expense derived
from the MSSS survey on LOFAR and the global sky model work for MWA, both of which are of
order 0.5 FTE per annum, totalling 2.5 FTE over the 5 year period.
Algorithmic Software
This effort is to ensure that software components are reused rather than redeveloped
independently across different portions of SDP. Such components may originate from external
libraries, for example FFTs, Linear Algebra, etc. or from development activities within the SDP
Library. It aims to ensure that optimal components are used at all times throughout SDP, to
reduce development costs and particularly maintenance costs (including porting). Obviously, this
activity interacts closely with a number of other software development activities in PIP (and,
possibly, other portions of SDP) and the amount of resources required would reflect the location
of the boundary with other activities. It is currently envisaged that 1 FTE per annum would be
required, but portions of the activities may be subsumed within other parts of SDP.
Sky Model Use and Creation
The LOFAR Global Sky Model database took approximately 3 FTE to develop (rough estimate).
We estimate that the SKA GSM will be more complex due to the fact that a larger variety of source
models need to be supported (more science cases and different source models as compared to
LOFAR). Therefore, the SKA GSM development effort is estimated to be 5 FTE.
9 DATA DELIVERY PLATFORM
9.1 TIERED DATA TRANSFER SERVICE
This is the service that will allow data to be moved out of SDP sites to Regional Centre sites and
between all sites to allow for data backup and data recovery. We are currently evaluating
different tools that could be used for parts of this service, such as FTS and NGAS. While there will
be significant use of COTS and existing Open Source tools, there will be additional development
work in this package to create a data transfer scheduling environment appropriate for the SKA
Use Cases. This will include interfaces necessary to allow for the integration of the scheduler
interface with other management services within the SDP, as well as present information to LMC
such that the health of the system is known and monitored. Due to this additional development
work we have assigned more time from an Eng:Sr in the design work and additional time from an
Eng:Int in the implementation.
9.2 USER PORTAL
This provides the hosting environment for all of the DELIV tools. This could be built by adapting
an existing web portal, such as CyberSKA, or by creating a new system using one of the
community web portal frameworks (Elgg, Drupal, etc.) depending on the state of the art at the
time of implementation. While the implementation of this can largely be performed by an Eng:Jr,
it will be important for them to have sufficient supervision. Also the integration of the Data
Delivery service will require significant design thought and implementation oversight. The design
and creation of specific user interfaces for different types of users is also costed.
9.3 DATA DISCOVERY SERVICE
This is used to find data with particular attributes. We expect this will build on an existing IVOA
services implementation, such as the one developed by CADC. For this we see roughly equal
amounts of time for design from Arch:Sr and Eng:Sr; 0.2 FTE of each. Following the design
delivered at the end of the first year the implementers will then have to build the service. For the
implementation we need 0.2 of an Eng:Sr and 1.1 FTE of an Eng:Jr (for 2 years). This is because
much of the work should be able to be performed by an Eng:Jr with some supervision. In addition
to the services there is a need for data converters and interfaces into the object metadata
provided by the DATA package. For this whole activity we are costing an additional 0.2 Arch:Sr
and 0.3 Eng:Sr for design and 0.25 Eng:Sr and 0.75 Eng:Jr for the implementation. Note that we
are currently planning for this to use a commercial quality database management system, which is
an additional software cost. Commercial databases are costed separately in the Cost Model.
9.4 DATA VISUALISATION SERVICE
This allows remote visualisation of data that is stored at a SDP site or at a Regional Centre. This
could build on an existing astronomical remote visualisation system such as the CyberSKA viewer,
or could combine other desktop tools with a remote desktop system. We believe this will need
similar design effort as the Data Discovery Service. The implementation of the core visualisation
tools needs additional work from an Eng:Sr, so we have assigned 0.5 FTE, plus an additional 0.75
FTE of an Eng:Jr to perform the more routine parts of the work.
9.5 REGIONAL CENTRE INTERFACE
We are planning around performing one year of work to design the various services and
interfaces, followed by two years to implement them. For the design of all of these we need part
of the time of a senior architect and part of the time of a senior engineer. For the development
work we need part of the time of a senior engineer, plus some time from an intermediate
engineer and/or some time from a junior engineer. There will be key requirements around
scalability, so the implementation team must be led by someone with experience of delivering
reliable, high-impact software services, and who is able to adapt and bring new ideas into the
implementation process when changes in approach are needed.
10 LOCAL MONITORING AND CONTROL
LMC is scheduled to be developed in two distinct epochs. The first will span 2 years and use the
majority of the development effort. This is due to the fundamental role the LMC plays in the SDP,
particularly from an integration point of view. By having LMC in place early on, testing and
validation of other components is greatly simplified. The remainder of the 5 year construction
period will see a reduced LMC team in place to carry forward the existing work and assist with
integration with other components.
The estimate of the work is based in part on the experiences gained in similar software for the
MeerKAT radio telescope. This is an area that has received considerable attention within the
MeerKAT software team, and thus many of the components (particularly control, monitoring and
logging) have well understood design spaces and costing.
10.1 LOCAL TELESCOPE MODEL
A local representation of various telescope parameters used internally by SDP for processing. This
will include items such as static configuration, sensor data and programmatic models required as
a service by SDP components. Existing designs cover much of the scope of this effort.
10.2 DATA FLOW MANAGER (LMC)
The highest risk item in the LMC, this is responsible for the generation of logical graphs from the
data model and pipeline descriptions, and then the instantiation of a physical model based on
available compute resources at run time. This physical model is then handed over to the Data
layer for execution. The graphs are potentially huge (100's of millions of nodes), and this
approach has not been tested in the radio astronomy community before, hence the technological
risks are high.
10.3 QA MONITORING
Aggregating metrics provided by internal SDP components that relate to the scientific
performance of the telescope. This also involves preparation of these for a variety of end users,
including statistical analysis to enable easy visualisation. Minor modifications to existing designs.
10.4 USER INTERFACES
There are potentially a number of SDP user interfaces, including debugging, commissioning,
pipeline, and QA. At this stage the allocation of this work is relatively uncertain, but the risk factor
is low and existing designs could be used without massive modification.
10.5 MASTER CONTROLLER AND ERROR HANDLING
The master controller provides a single point of contact for all TM communication. It handles
control commands, and forwards these to relevant SDP components. A major component of this
effort relates to error handling, particularly the way in which errors are reported to TM so as to
allow operators to make schedule decisions in the presence of errors. This component of the
master controller is likely to be based on existing technology, but will probably be a new design.
10.6 EVENT MONITORING AND LOGGING
Handles collation and reporting of internal health monitoring data, and provides a framework to
action events and other alarms from these. Also provides a distributed logging framework for use
by SDP components. This is a low risk item with existing codebases that can be used.
10.7 DATA FLOW MODELS
The generation of Data Flow Models includes several steps. The first of these is the construction of the Logical Data Graphs, which encode the functional steps needed to achieve a particular
scientific capability. These will be developed as part of the design phase, and largely reside within the scope of the pipeline tasks. To convert a Logical Data Graph to a Physical Data Graph requires interaction with the COMP and DATA sub-elements to determine resource availability, which is then used by the LMC Data Flow Manager to map the LDG to a PDG. In addition to the resource information, LMC also requires benchmarking of each component to determine resource usage and estimated runtimes. This benchmarking will be done by COMP as particular components are developed by the pipeline teams.
10.8 TASK MANAGEMENT AND CONTROL
A single intermediate project manager is required to handle the day to day aspects of running a
distributed software development team.
11 SYSTEM LEVEL TASKS
The system level integration activities are explicitly enumerated here. Although additional effort
has been added to each individual element to allow for integration (and AIV support), these
resources represent the dedicated, telescope level team to handle SDP integration on a per site
basis.
Integration and QA are the primary FTE components. The integration team of 4 full time members
is sized sufficiently to allow dedicated on-site support during periods of integration, and to assist
the AIV efforts. This team should be put together as early as possible, to allow their capability to
grow and mature with the development effort. This will ensure that expert assistance is always
available during integration efforts.
Likewise the QA team will have a single intermediate level resource, and junior members
dedicated to each telescope. This will allow telescope specific quality assurance and testing runs
to be produced.
On the architecture level, provision is made for a single senior architect in the software and
hardware aspects of the SDP. This project level resource will provide architectural analysis and
support to the elements during development and testing. Given that the software effort is likely
to be more distributed than the hardware, a junior resource to shadow the software architect
and provide continuity across time zones, is included.
12 SDP EARLY OPERATIONS (NOT CHANGED SINCE M7)
12.1 EARLY OPERATIONS COSTS SCOPE
The cost calculation for support during early operations includes the support and maintenance
costs that are needed during the Construction Phase over and above the teams involved with the
build and deployment of the infrastructure and development and rollout of the software. It
covers the support needed for the functioning and operation of systems deployed over the
construction term.
12.2 GENERAL PRINCIPLES APPLIED
● The costs for resources are based on 2014 terms and no provision has been made
for adjustment due to inflation over the term.
● The costs for spares and maintenance are based on the assumed hardware
procurement costs taking Moore’s law into account.
● It is assumed that the first deployment will commence in month 6, the second
deployment in month 30 and the final deployment in month 54 of the construction phase.
● Support and maintenance costs are included from the month following the
commencement of the respective deployments.
● The cost of maintenance may be adjusted pending a full analysis of the
maintenance requirements.
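The deployment and support timing in the principles above can be sketched as follows (Python, illustrative only). The 60-month construction phase is an assumption based on the 5 year construction period mentioned elsewhere in this document, and counting the last month inclusively is also an assumption; the function name is ours.

```python
# Assumption: a 5-year (60-month) construction phase.
CONSTRUCTION_MONTHS = 60

# Month in which each deployment commences, as stated in the list above.
DEPLOYMENT_START_MONTH = {"first": 6, "second": 30, "final": 54}


def support_months(start_month, end_month=CONSTRUCTION_MONTHS):
    """Months of support, starting the month after deployment commences."""
    first_supported = start_month + 1
    return max(0, end_month - first_supported + 1)


supported = {name: support_months(m) for name, m in DEPLOYMENT_START_MONTH.items()}
print(supported)
```

Under these assumptions the first deployment is supported for 54 months, the second for 30 and the final one for 6.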
12.3 HARDWARE COMPUTE PLATFORM
Compute Island
The hardware support of the Compute Islands is based on a combination of the industry best
practice for the support of Linux server hardware in a virtual environment up to the operating
system level at a service level of 95% availability and experience of current deployments for other
systems such as ASKAP and LOFAR.
The support ratio was adjusted to accommodate the requirement for 24x7 support at the two
sites with standby provided by a senior system engineer to ensure the availability of the system.
A minimum of 2 resources (50:50 split between senior and junior) is required per site to ensure
continuity in support.
The support ratio for the initial two deployments is based on 100 compute islands per support
engineer with a ratio of 1 senior system engineer for every 6 junior server support engineers.
The support ratio for the final deployment was adjusted to 50 compute islands per support
engineer with a ratio of 1 senior system engineer for every 6 junior server support engineers. The
adjustment is required due to the volumes deployed and to cater for the availability requirement
of 95%. A minimum of 1 Senior System engineer is required per site at all times.
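One possible reading of the support ratios above is sketched below in Python. It is a sketch under stated assumptions, not the costing method itself: it assumes that senior engineers count towards the "support engineer" headcount, that headcounts are rounded up, and the island count of 700 in the example is purely hypothetical.

```python
import math

# Compute islands per support engineer, as quoted in the text.
ISLANDS_PER_ENGINEER = {"initial": 100, "final": 50}
JUNIORS_PER_SENIOR = 6       # 1 senior for every 6 junior engineers
MIN_ENGINEERS_PER_SITE = 2   # minimum of 2 resources per site
MIN_SENIORS_PER_SITE = 1     # minimum of 1 senior per site


def site_staffing(compute_islands, deployment):
    """Senior and junior headcount for one site (assumes seniors are part of the total)."""
    engineers = max(
        MIN_ENGINEERS_PER_SITE,
        math.ceil(compute_islands / ISLANDS_PER_ENGINEER[deployment]),
    )
    seniors = max(MIN_SENIORS_PER_SITE, math.ceil(engineers / (JUNIORS_PER_SENIOR + 1)))
    return {"senior": seniors, "junior": engineers - seniors}


# Hypothetical site with 700 compute islands.
print(site_staffing(700, "initial"))
print(site_staffing(700, "final"))
```

For a small site the minimums dominate: `site_staffing(50, "initial")` gives one senior and one junior engineer, matching the 50:50 split described above.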
Provision was also made for travelling costs during early operations at 6 trips of between 3 and
7 days per trip for 4 resources for the first two deployments and 2 trips of between 3 and 7 days
for 4 resources for the final deployment.
Provision was made for the procurement of hardware maintenance from the hardware
manufacturer of the compute islands for the initial deployment at 15% of the purchase price of
the compute islands. The percentage is based on the availability requirement for the compute
islands (the higher the availability the higher the maintenance costs).
Due to the volumes and the requirement for the system to be available for the processing of data
with minimal interruption, it was decided to replace the hardware maintenance with onsite
spares from the second deployment onwards. Provision was therefore made for onsite spares at
10% of the purchase price of the compute islands. The onsite server support engineers will
perform the repair of the hardware using the onsite spares provided.
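The switch from manufacturer maintenance to onsite spares can be illustrated as follows. This is a sketch under stated assumptions: the purchase price in the example is hypothetical, and deployments are numbered 1 to 3 purely for illustration.

```python
MAINTENANCE_RATE = 0.15  # manufacturer maintenance, initial deployment
SPARES_RATE = 0.10       # onsite spares, second deployment onwards


def compute_island_hw_provision(purchase_price, deployment):
    """Return the hardware provision for the compute islands.

    deployment: 1 for the initial deployment, 2 or 3 for later ones
    (numbering is an assumption made for this example).
    """
    rate = MAINTENANCE_RATE if deployment == 1 else SPARES_RATE
    return purchase_price * rate


# Hypothetical purchase price of EUR 1,000,000 per deployment.
for deployment in (1, 2, 3):
    print(deployment, round(compute_island_hw_provision(1_000_000, deployment), 2))
```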
Buffer
The support of the buffer storage is based on industry best practice for the support of SSD type
storage, taking into account the volumes and availability requirement. It consists of the following
resource combination:
0.4 Senior storage engineer per 1000PB
0.5 Intermediate storage engineer per 1000PB
3 Junior storage engineers per 1000PB
The support of the buffer storage will only commence with the final deployment.
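The resource combination above scales linearly with buffer capacity. A minimal sketch, assuming fractional full-time equivalents are kept unrounded and using a hypothetical capacity in the example:

```python
# Engineers per 1000 PB of buffer storage, as listed above.
RATIOS_PER_1000_PB = {"senior": 0.4, "intermediate": 0.5, "junior": 3.0}


def buffer_storage_fte(capacity_pb):
    """Return full-time equivalents per role for a given buffer capacity in PB."""
    scale = capacity_pb / 1000
    return {role: ratio * scale for role, ratio in RATIOS_PER_1000_PB.items()}


# Hypothetical buffer of 2000 PB.
print(buffer_storage_fte(2000))
```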
Provision was also made for 2 trips of between 3 and 7 days for 1 resource for the final
deployment.
SDP Infrastructure
Provision was made for maintenance to be procured from the hardware supplier at 15% of the
procurement cost of the server racks to cater for any failures. This was applied for all three
deployments.
Hierarchical Storage
Provision was made for support of the archive storage at €325.20 per PB for the initial
deployment, €307.37 per PB for the second deployment and €287.76 per PB for the final
deployment.
The support costs were calculated taking into account a failure rate percentage of 4%.
Provision was also made for travelling costs during early operations at 6 trips of between 3 and
7 days per trip for 1 resource for the first two deployments and 2 trips of between 3 and 7 days
for 2 resources for the final deployment.
Archive buffer storage (originally under the heading ‘buffer archive’)
Provision was made for support of the archive buffer storage at €4,442.40 per PB for the initial
deployment, €3,173.14 per PB for the second deployment and €1,776.96 per PB for the final
deployment.
The support costs were calculated taking into account a failure rate percentage of 4%.
Provision was also made for travelling costs during early operations at 6 trips of between 3 and
7 days per trip for 1 resource for the first two deployments and 2 trips of between 3 and 7 days
for 2 resources for the final deployment.
Interconnect System
(was Low Latency Core Switches)
Provision was made for the procurement of hardware maintenance at 25% of the procurement
cost of the switches from the first deployment onwards. This is based on the standard
maintenance percentage for network equipment deployments.
No provision was made for additional spares.
Compute node OS development
No provision was made during early operations for additional maintenance of the developed
software as the development team catered for in the construction cost calculation is assumed to
be sufficient to deal with any operational support requirement during construction.
Hardware support
Hardware maintenance
Archive storage
Provision was made for maintenance to be procured from the hardware supplier at 15% of the
procurement cost of the archive storage to cater for any failures. This was applied for all three
deployments.
Archive buffer storage
Provision was made for maintenance to be procured from the hardware supplier at 15% of the
procurement cost of the archive buffer storage to cater for any failures. This was applied for all
three deployments.
Archive media
Provision was made for the storage media needed for the archive storage and archive buffer
based on the volumes derived from the performance use case for the three deployments.
The costs associated with the media are as follows:
First deployment € 14,444.44 per PB media
Second deployment € 10,317.46 per PB media
Final deployment € 5,777.78 per PB media
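The media cost for a deployment is the archived volume multiplied by the per-PB rate listed above. The sketch below uses a hypothetical volume; the actual volumes come from the performance use case and are not reproduced here.

```python
# Media cost in EUR per PB, taken from the list above.
MEDIA_COST_PER_PB = {
    "first": 14_444.44,
    "second": 10_317.46,
    "final": 5_777.78,
}


def media_cost(volume_pb, deployment):
    """Return the media cost in EUR for a volume in PB and a deployment name."""
    return round(volume_pb * MEDIA_COST_PER_PB[deployment], 2)


# Hypothetical volume of 10 PB in each deployment.
for name in MEDIA_COST_PER_PB:
    print(name, media_cost(10, name))
```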
Archival network core switches
Provision was made for the procurement of hardware maintenance at 25% of the procurement
cost of the switches from the first deployment onwards. This is based on the standard
maintenance percentage for network equipment deployments.
No provision was made for additional spares.
Interconnect system
12.3.13.1 Hardware support
12.3.13.2 All network equipment (incl Low Latency core switches, Archive Switches and Data
Transport network)
Provision was made for 1 senior network support engineer per site for all three deployments.
The senior network support engineer is to oversee and ensure knowledge transfer to the server
support engineers to ensure that they can assist with the support of the network switches.
Provision was also made for travelling costs during early operations at 6 trips of between 3 and
7 days per trip for 2 resources for the first two deployments and 2 trips of between 3 and 7 days
for 2 resources for the final deployment.
12.3.13.3 Data Transport network
Provision was made for the procurement of hardware maintenance at 25% of the procurement
cost of the switches from the first deployment onwards. This is based on the standard
maintenance percentage for network equipment deployments.
No provision was made for additional spares.
Non-Domain Software
12.3.14.1 Maintenance of developed software
No provision was made during early operations for additional maintenance of the developed
software as the development team catered for in the construction cost calculation is assumed to
be sufficient to deal with any operational support requirement during construction.
12.3.14.2 COTS software maintenance
Provision was made for software maintenance on the following COTS items at 25% of the
procurement cost:
· System Operating Systems
· Data Transfer OTS
· Database software
· Cloud Management Framework
· Scheduler to drive data transfers according to priority and policy
No maintenance provision was made for the Oracle software as the initial procurement costs
include maintenance for a period of 5 years (construction phase).
12.3.14.3 Archive HSM licenses
Provision was made for renewing licenses and replacing obsolete items at 33% of the original
procurement costs of the HSM licenses.
Domain Software
12.3.15.1 Maintenance of developed software
No provision was made during early operations for additional maintenance of the developed
software as the development team catered for in the construction cost calculation is assumed to
be sufficient to deal with any operational support requirement during construction.
System Level tasks (SDP overall)
No provision was made during early operations, as the team to oversee the tasks on a holistic
level for SDP is provided for as part of the construction costs.
13 SDP OPERATIONS COSTS (NO CHANGE SINCE M7)
The costs for support during operations are based on the support required for the full
deployment for a period of 12 months after the conclusion of the construction phase (currently
assumed to be 1 January 2023 to 31 December 2023).
13.1 GENERAL PRINCIPLES APPLIED
The costs for resources are based on 2014 terms and no provision has been made for adjustment
due to inflation over the term.
The costs for spares and maintenance are based on the assumed hardware procurement costs
taking Moore’s law into account.
The cost of maintenance may be adjusted pending a full analysis of the maintenance
requirements.
13.2 HARDWARE
Compute System Hardware
13.2.1.1 Hardware support
13.2.1.1.1 Compute islands (including management compute islands)
13.2.1.1.2 Estimated Costs for Operational Budget per Annum
As the majority of the power is consumed by compute islands, individual power for the elements
of the Compute Island is shown in the cost spreadsheet. Taking into account the contribution
from accelerators alone, the Compute Island is projected to consume around 11GF/W compared
to the [RD21] system of 3.5GF/W which is number 2 on the Green Top500 and the [RD22] system
of 1.64GF/W which is number 7 on the standard Top500. It is anticipated that this is achievable
by 2022 as de-clocking of cores separately from memory block will maintain memory bandwidth
and hence efficiency.
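The efficiency figures quoted above translate directly into power draw: a system delivering P PFLOPS at E GF/W consumes P/E MW. The following sketch illustrates the relationship; the 11 PFLOPS load in the example is hypothetical and is not the SDP compute requirement.

```python
PROJECTED_GF_PER_WATT = 11.0  # projected Compute Island efficiency (accelerators only)


def power_mw(pflops, gf_per_watt):
    """Return the power in MW needed to deliver `pflops` at `gf_per_watt`."""
    gflops = pflops * 1e6          # 1 PFLOPS = 1e6 GFLOPS
    watts = gflops / gf_per_watt
    return watts / 1e6             # W -> MW


def efficiency_gain(reference_gf_per_watt):
    """Return how many times more efficient the projection is than a reference system."""
    return PROJECTED_GF_PER_WATT / reference_gf_per_watt


# Hypothetical 11 PFLOPS load at each quoted efficiency.
for gf_per_watt in (11.0, 3.5, 1.64):
    print(gf_per_watt, round(power_mw(11, gf_per_watt), 2), "MW")

print(round(efficiency_gain(3.5), 2))   # versus the [RD21] system
print(round(efficiency_gain(1.64), 2))  # versus the [RD22] system
```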
The hardware support of the Compute Islands is based on a combination of the industry best
practice for the support of Linux server hardware up to the operating system level at a service
level of 95% availability and experience of current deployments for other systems such as ASKAP
and LOFAR.
The support ratio was adjusted to accommodate the requirement for 24x7 support at the two
sites with standby provided by a senior system engineer to ensure the availability of the system.
A minimum of 2 resources (50:50 split between senior and junior) is required per site to ensure
continuity in support.
The support ratio for the ongoing operational support is based on 50 compute islands per support
engineer with a ratio of 1 senior system engineer for every 6 junior server support engineers. This
is required to cater for the availability requirement of 95%. A minimum of 1 Senior System
engineer is required per site at all times.
Provision was also made for travelling costs at 3 trips per annum of between 3 and 7 days per
trip for 4 resources.
13.2.1.1.3 Buffer storage
The support for the buffer storage is based on industry best practice for the support of SSD type
storage, taking into account the volumes and availability requirement. It consists of the following
resource combination:
0.4 Senior storage engineer per 1000PB
0.5 Intermediate storage engineer per 1000PB
3 Junior storage engineers per 1000PB
Provision was also made for travelling costs at 3 trips per annum of between 3 and 7 days per
trip for 1 resource.
13.2.1.1.4 Hardware maintenance
13.2.1.1.4.1.1 Compute islands (including management compute islands)
Due to the volumes and the requirement for the system to be available for the processing of data
with minimal interruption it was decided to make provision for onsite spares. Provision was made
for onsite spares at 10% of the purchase price of the compute islands. The onsite server support
engineers will perform hardware repair using the onsite spares provided.
13.2.1.1.4.1.2 Low Latency core switches
Provision was made for the procurement of hardware maintenance at 25% of the procurement
cost of the switches. This is based on the standard maintenance percentage for network
equipment deployments.
No provision was made for additional spares.
13.2.1.1.4.2 Maintenance of developed software
13.2.1.1.4.2.1 Compute node OS development
Provision was made for the maintenance of the software developed at 40% of the original cost
of development. This is aligned to industry practice and caters for the complexity of the
environment.
13.2.1.1.4.2.2 Documentation
Provision was made for the documentation of any changes made as a result of the software
maintenance and is based on 15% of the software maintenance costs.
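Because the documentation provision is a percentage of the maintenance provision, the two combine to 46% of the original development cost (40% plus 15% of 40%). A minimal sketch, with a hypothetical development cost:

```python
SOFTWARE_MAINTENANCE_RATE = 0.40  # of the original development cost
DOCUMENTATION_RATE = 0.15         # of the software maintenance cost


def software_upkeep(development_cost):
    """Return (maintenance, documentation, total) provisions in the input currency."""
    maintenance = development_cost * SOFTWARE_MAINTENANCE_RATE
    documentation = maintenance * DOCUMENTATION_RATE
    return maintenance, documentation, maintenance + documentation


# Hypothetical development cost of EUR 1,000,000.
maintenance, documentation, total = software_upkeep(1_000_000)
print(round(maintenance, 2), round(documentation, 2), round(total, 2))
```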
13.3 INFRASTRUCTURE
Hardware maintenance
Provision was made for maintenance to be procured from the hardware supplier at 15% of the
procurement cost of the server racks to cater for any failures.
13.4 LONG TERM STORAGE (ARCHIVE)
Hardware support
13.4.1.1 Archive storage
Provision was made for support of the archive storage at €287.76 per PB.
The support costs were calculated taking into account a failure rate percentage of 4%.
Provision was also made for travelling costs at 3 trips per annum of between 3 and 7 days per
trip for 2 resources.
13.4.1.2 Archive buffer storage
Provision was made for support of the archive buffer storage at €1,776.96 per PB.
The support costs were calculated taking into account a failure rate percentage of 4%.
Provision was also made for travelling costs at 3 trips per annum of between 3 and 7 days per
trip for 2 resources.
Hardware maintenance
13.4.2.1 Archive storage
Provision was made for maintenance to be procured from the hardware supplier at 15% of the
procurement cost of the archive storage to cater for any failures.
13.4.2.2 Archive buffer storage
Provision was made for maintenance to be procured from the hardware supplier at 15% of the
procurement cost of the archive buffer storage to cater for any failures.
13.4.2.3 Archive growth
Provision was made for 100% growth in the Archive storage and Archive Buffer per annum as per
AD05.
The costs associated with the growth are as follows:
Media € 5,777.78 per PB media
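The growth provision can be projected year by year. The sketch below assumes that 100% growth per annum compounds, so that the archive doubles each year; the text does not say whether growth is measured against the current or the initial volume. The starting volume is hypothetical.

```python
MEDIA_COST_PER_PB = 5_777.78  # EUR per PB of media, from the text above


def archive_growth_costs(initial_pb, years, growth_rate=1.0):
    """Return a list of (year, added_pb, media_cost_eur) tuples.

    Assumption: growth compounds on the current archive volume.
    """
    rows = []
    current = initial_pb
    for year in range(1, years + 1):
        added = current * growth_rate
        rows.append((year, added, round(added * MEDIA_COST_PER_PB, 2)))
        current += added
    return rows


# Hypothetical starting archive of 100 PB over two years.
for row in archive_growth_costs(100, 2):
    print(row)
```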
13.4.2.4 Archival network core switches
Provision was made for the procurement of hardware maintenance at 25% of the procurement
cost of the switches. This is based on the standard maintenance percentage for network
equipment deployments.
No provision was made for additional spares.
13.5 INTERCONNECT SYSTEM
Hardware support
13.5.1.1 All network equipment (incl Low Latency core switches, Archive Switches and Data
Transport network)
Provision was made for 1 senior network support engineer per site.
The senior network support engineer is to oversee and ensure knowledge transfer to the server
support engineers to ensure that they can assist with the support of the network switches.
Provision was also made for travelling costs at 3 trips per annum of between 3 and 7 days per
trip for 2 resources.
13.5.1.2 Data Transport network
Provision was made for the procurement of hardware maintenance at 25% of the procurement
cost of the switches. This is based on the standard maintenance percentage for network
equipment deployments.
No provision was made for additional spares.
13.6 NON-DOMAIN & DOMAIN SOFTWARE
Maintenance of developed software
Provision was made for the maintenance of the software developed at 40% of the original cost
of development. This is aligned to industry practice and caters for the complexity of the
environment.
Provision was also made for travelling costs at 3 trips per annum of between 3 and 7 days per
trip for 4 resources.
Documentation
Provision was made for the documentation of any changes made as a result of the software
maintenance and is based on 15% of the software maintenance costs.
COTS software maintenance
Provision was made for software maintenance on the following COTS items at 25% of the
procurement cost:
· System Operating systems
· Data Transfer OTS
· Database software
· Cloud Management Framework
· Scheduler to drive data transfers according to priority and policy
A maintenance provision of 22.5% per annum was made for the Oracle software.
Archive HSM licenses
Provision was made for the renewing of licenses and replacing of obsolete items at 33% of the
original procurement costs of the HSM licenses.
System Level tasks (SDP overall)
Provision was made for 40% of the original cost incurred during the construction phase to ensure
continuity and alignment on an ongoing basis.
Provision was also made for travelling costs at 3 trips per annum of between 3 and 7 days per
trip for 6 resources.