
Document No: SKA-TEL-SDP-0000046 Unrestricted

Revision: 2 Author: F. Graser et al.

Release Date: 2015-02-09

PDR.07.01 COSTING BASIS OF ESTIMATE

Document number: SKA-TEL-SDP-0000046

Context: MGT

Revision: 2

Author: Ferdl Graser, John Taylor

Release Date: 2015-02-09

Document Classification: Unrestricted

Status: Draft




Name             Designation            Affiliation
Ferdl Graser     SDP Systems Engineer   Space Advisory Company
Signature & Date: Ferdl Graser (Feb 10, 2015)

Name             Designation            Affiliation
Paul Alexander   SDP Project Lead       University of Cambridge
Signature & Date: Paul Alexander (Feb 10, 2015)

Version   Date of Issue   Prepared by   Comments
0.1

ORGANISATION DETAILS

Name: Science Data Processor Consortium




1 CONTENTS

1 Contents
2 List of Figures
3 List of Tables
4 References
    4.1 Applicable Documents
    4.2 Reference Documents
5 Introduction
    5.1 Scope
    5.2 Assumptions
6 Hardware Compute Platform
    6.1 Hardware Cost Estimation Methodology
        Hardware Model
        Data and Processing Requirements
        Estimate Costs for Build, Ship and Test
        Overview of the SDP
    6.2 Compute Island
    6.3 Buffer
    6.4 Compute Node
    6.5 Interconnect System - Bulk Data Transport
        Third Stage – Implicit to Compute Island Configuration
        Second Stage Networking
        First Stage Networking
        Network Redundancy
    6.6 Low-Latency Networking
    6.7 Management and Archive
7 Hierarchical Storage (was called Science Archive)
    7.1 Long Term Storage & Medium Performance Buffer
    7.2 Delivery Platform Hardware
    7.3 LMC Hardware
8 SDP Software Cost Estimation
    8.1 Estimation Methodology
    8.2 Effort Estimation
    8.3 Labour Rates
    8.4 Software Estimation Assumptions
        AIV effort
        Documentation effort
    8.5 Software Compute Platform
        Compute Operating System
        Middleware
        Hierarchical Storage Management
        Application Development Environment and SDK
        Scheduler
    8.6 Data Layer
        Data Manager
        Data Lifecycle Manager
        Science Archive Software
        Local Database Services
        Ingest Data from CSP into Data Layer
    8.7 Pipeline Components
        Processing Library
        Science Analysis Pipeline Software
        Non-Imaging Software
        Imaging Pipeline Software
        Ingest Pipeline Software
        Calibration Pipeline Software
        Image Space Search Engine
        Algorithmic Software
        Sky Model Use and Creation
9 Data Delivery Platform
    9.1 Tiered Data Transfer Service
    9.2 User Portal
    9.3 Data Discovery Service
    9.4 Data Visualisation Service
    9.5 Regional Centre Interface
10 Local Monitoring and Control
    10.1 Local Telescope Model
    10.2 Data Flow Manager (LMC)
    10.3 QA Monitoring
    10.4 User Interfaces
    10.5 Master Controller and Error Handling
    10.6 Event Monitoring and Logging
    10.7 Data Flow Models
    10.8 Task Management and Control
11 System Level Tasks
12 SDP Early operations (Not changed SINCE M7)
    12.1 Early operations costs scope
    12.2 General principles applied
    12.3 Hardware Compute Platform
        Compute Island
        Buffer
        SDP Infrastructure
        Hierarchical Storage
        Interconnect System
        Compute node OS development
        Hardware support
        Hardware maintenance
        Archive storage
        Archive buffer storage
        Archive media
        Archival network core switches
        Interconnect system
        Non-Domain Software
        Domain Software
        System Level tasks (SDP overall)
13 SDP Operations costs (NO CHANGE SINCE M7)
    13.1 General principles applied
    13.2 Hardware
        Compute System Hardware
    13.3 Infrastructure
    13.4 Long Term Storage (Archive)
    13.5 Interconnect system
    13.6 Non-Domain & Domain Software
        Maintenance of developed software
        Documentation
        COTS software maintenance
        Archive HSM licenses
        System Level tasks (SDP overall)



2 LIST OF FIGURES

Figure 1: US outline plans [RD02] for Exascale

Figure 2: Schematic representation of the SDP Costed Hardware Concept showing the unidirectional Ethernet Bulk Data Network (BDN) supporting Ingest to a High Performance Buffer located on the Compute Islands. Data exchange between Compute Islands is supported by an orthogonal bidirectional Low-Latency Network (LLN) currently costed as Infiniband. Science products are delivered to Storage Pods for intermediate and long-term storage over a bi-directional Ethernet network for onward user delivery.

Figure 3: IEEE Prediction for 40GbE availability in commodity x86 Servers

Figure 4: Infiniband Roadmap [RD11]. SDR - Single Data Rate, DDR - Double Data Rate, QDR - Quad Data Rate, FDR - Fourteen Data Rate, EDR - Enhanced Data Rate, HDR - High Data Rate, NDR - Next Data Rate

Figure 5: ASTC Technology Roadmap

Figure 6: The Science Data Processor System stack, showing the relationship between the level 2 elements of the product tree. Boxes are arranged so that each box is allowed to use only the boxes below it. Furthermore, the horizontal partitioning of boxes into columns is approximately arranged so that boxes in vertical alignment tend to be used together. The boxes are colour coded to reflect the L1 elements of the product tree that they come from: Computer Hardware (orange), Computer Software (gray), Pipelines (green), Data (yellow), Delivery Platform (black), LMC (blue).

3 LIST OF TABLES

Table 1: Configuration of a Compute Island

Table 2: Cost trajectory for non-volatile and volatile storage. No discount has been applied to this pricing. We have assumed a ratio of 9:1 SATA:SSD in the costing estimate.

Table 3: Compute node configuration (* depending on working memory set [AD05])

Table 4: Cost trajectory for High Speed Ethernet. No discount has been applied to this pricing.

Table 5: Estimated costs (Euro) for various over-subscribed FDR Infiniband networks for various telescopes

Table 6: Estimated Labour Rates

Table 7: Estimated effort based on Meerkat

Table 8: Line of Code (LOC) production rate estimates

Table 9: AWImager LOC analysis




4 REFERENCES

4.1 APPLICABLE DOCUMENTS

The following documents are applicable to the extent stated herein. In the event of conflict between the contents of the applicable documents and this document, the applicable documents shall take precedence.

Reference Number Reference

[AD01] PDR.01 SKA.TEL.SDP-0000013 - SKA SDP Architecture

[AD02] PDR.02.01 Sub-element design: COMP

[AD03] PDR.02.02 Sub-element design: Data Delivery

[AD04] PDR.02.04 Sub-element design document: LMC

[AD05] PDR.05 SKA-TEL-SDP-0000040 Parametric models of SDP Compute Requirements

[AD06] PDR.08 Preliminary Plan for Construction

[AD07] PDR.02.05 Sub-element design document: PIP

[AD08] PDR.11 Preliminary Integrated Logistics Support Plan

[AD10] PDR.07A Cost Spreadsheet Revisions

4.2 REFERENCE DOCUMENTS

The following documents are referenced in this document. In the event of conflict between the contents of the referenced documents and this document, this document shall take precedence.

Reference Number Reference

[RD01] https://asc.llnl.gov/fastforward/

[RD02] http://www.exascale.org/bdec/sites/www.exascale.org.bdec/files/talk4-Harrod.pdf

[RD03] http://insidehpc.com/2014/11/slidecast-nvidiaibm-build-two-coral-100-petaflop-supercomputers-2017/

[RD04] http://ark.intel.com/products/64595/Intel-Xeon-Processor-E5-2670-20M-Cache-2_60-GHz-8_00-GTs-Intel-QPI

[RD05] http://www.anandtech.com/show/6446/nvidia-launches-tesla-k20-k20x-gk110-arrives-at-last

[RD06] http://www.storagereview.com/ssd_vs_hdd

[RD07] http://www.storagereview.com/ssd_vs_hdd

[RD08] http://www.colfaxdirect.com/store/pc/home.asp

[RD09] http://www.anandtech.com/show/8729/nvidia-launches-tesla-k80-gk210-gpu

[RD10] http://content.yudu.com/A2097a/SCWDEC12JAN13/resources/14.htm

[RD11] https://cw.infinibandta.org/document/dl/7580

[RD12] http://mellanox.com/configurator

[RD13] https://www.backblaze.com/blog/backblaze-storage-pod-4/

[RD14] http://www.idema.org/?page_id=5868

[RD15] http://www.45drives.com/products/order/dw-redundant.php

[RD16] https://jira.ska-sdp.org/secure/attachment/13501/OracleLicenses.xls

[RD17] https://jira.ska-sdp.org/browse/PDR-134?focusedCommentId=30502&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-30502

[RD18] http://www.atnf.csiro.au/research/pulsar/tempo2/

[RD19] http://psrchive.sourceforge.net/

[RD20] http://stackoverflow.com/questions/966800/mythical-man-month-10-lines-per-developer-day-how-close-on-large-projects

[RD21] http://www.green500.org/lists/green201311&green500from=1&green500to=100

[RD22] https://www.tacc.utexas.edu/stampede/





5 INTRODUCTION

This section provides an estimate of the costs for the SDP for a phased delivery of system components from 2018 onwards. The starting point for the cost estimate of hardware components is commercial pricing available today; best estimates are presented for potential discounts given the large volumes; and performance is extrapolated based on potential benefits through technological improvements (e.g. Moore's Law, evolution of standards, etc.).

The SDP is designed to satisfy the computational, data management and archival aspects of the SKA. Unlike traditional HPC environments, the SDP is designed with an emphasis on being data-driven. The SDP needs to provide Exascale-class performance for in-situ scientific analysis and derived science products. It is difficult to predict which IT technology components will be available to deliver such a system in the 2018+ time-frame while operating within the envisaged power budget. To set the scene, the USA is predicting that an Exaflop system will be available as a prototype in 2021 (see Figure 1), with a power usage of order 20-30 MW and potentially with a budget of $200 million. In order to meet this challenge, the USA is in the process of establishing plans for prototype compute node implementations during 2014 [RD01], which will be developed over the next 2 years and act as demonstration points towards a Petascale compute node prototype, an overall Petascale prototype system and thence the subsequent Exascale system. In addition to this development programme, the US has also invested in two multi-PFlops (c. 100 PFlops) systems [RD03] costing some $300M for installation in 2017/8.

Figure 1: US outline plans [RD02] for Exascale




5.1 SCOPE

The cost model assumes technology components as described in [AD02] to deliver the buffer, network, management and processing requirements, allied to potential developments in the IT industry and commodity components. As such, we envisage that the basic elements of the system will be recognisable in current terms: commodity compute nodes, delivering an overall efficiency based on the arithmetic intensity of a particular application/algorithm; storage, in terms of working memory for compute nodes and implementation of the buffer; networking, in terms of the necessary data transport mechanisms from ingest to the compute nodes; and potential inter-compute node communication and subsequent archival [AD01].

The design equations [AD05] were used as input to the cost model. This cost model represents a single point in the solution space. A thorough trade-off analysis will be performed through to CDR to find a better, and ideally optimal, point in the solution space.

5.2 ASSUMPTIONS

A number of the assumptions and conclusions used in this document should be viewed from a perspective of potential disruption in the next 2-3 years as mentioned above. However, in the absence of any definitive information that can be disclosed at this present time, the information provided here offers a means to speculate on potential future systems. The assumptions that have been made in the cost analysis are summarised below:

● Computational Performance – An approximation to Moore's Law has been applied to price/performance. We assume a doubling of performance at constant price every 24 months.

● Storage – A Moore's Law-like trend has been applied, assuming a halving in the unit cost of storage (DRAM, SSD, SAS) every 24 months up to 2020 and every 36 months beyond.

● Storage Performance estimated as –

    ● 50 GBytes/sec [RD04] - DRAM

    ● 250-500 GBytes/sec [RD05] - GDRAM

    ● 140 MBytes/sec [RD06] - SAS/SATA

    ● 500 MBytes/sec [RD07] - SSD

● Low-Latency Network – The costing of the network supporting inter-compute node communication has been based on [RD08].

● Network – The price/port for 10GbE and 40GbE switches will reduce under market pressure as “Big Data” drives large-scale data centres.

● Cables – Cable costs will remain constant for 10GbE and 40GbE connections. Current R-NICs supporting RDMA will not be CAT 6. SMF transceiver costs will decrease.
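The price/performance and storage-cost assumptions above can be expressed as simple scaling factors. The following is a minimal sketch of the stated rules only, not the cost spreadsheet itself; the function names and the choice of 2014 as the base year in the example are assumptions made for illustration.

```python
def performance_factor(base_year: float, year: float,
                       doubling_months: float = 24.0) -> float:
    """Performance at constant price relative to base_year.

    Implements the stated assumption of a doubling every 24 months.
    """
    return 2.0 ** ((year - base_year) * 12.0 / doubling_months)


def storage_cost_factor(base_year: float, year: float,
                        knee_year: float = 2020.0) -> float:
    """Unit storage cost relative to base_year.

    Implements the stated assumption: cost halves every 24 months up to
    knee_year (2020) and every 36 months beyond it.
    """
    fast_years = max(0.0, min(year, knee_year) - base_year)
    slow_years = max(0.0, year - max(base_year, knee_year))
    halvings = fast_years * 12.0 / 24.0 + slow_years * 12.0 / 36.0
    return 0.5 ** halvings


if __name__ == "__main__":
    # 2014 is used as the base year purely for illustration.
    for year in (2014, 2016, 2018, 2020, 2023):
        print(year,
              performance_factor(2014, year),
              storage_cost_factor(2014, year))
```

Under these rules a component bought in 2018 would offer four times the 2014 performance at the same price, and storage would cost a quarter of its 2014 unit price. Tabulated prices elsewhere in the document may be rounded or derived differently, so the factors should be read as the rule of thumb rather than as exact reproductions.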





Not covered:

● Computational Performance – Use of application-specific accelerators, FPGA.

● Network – Connections from CSP may be 100GbE but this is not factored in.

● Novel proprietary networking and I/O technologies that may become available as CPUs and I/O become more tightly integrated, even though this undoubtedly will have a big impact.

● Advances in non-volatile memory technology.

Where appropriate these are discussed in [AD02] and supporting documents.

6 HARDWARE COMPUTE PLATFORM

6.1 HARDWARE COST ESTIMATION METHODOLOGY

Hardware Model

The model is based on current technology components in order to provide an initial Costed Hardware Concept. The applicability of this model for 2018+ is therefore evolutionary. No consideration of co-design or the use of application specific hardware is offered in terms of packaging.

Data and Processing Requirements

The dominant factors affecting the cost of the SDP are the requirements to ingest the large amounts of data emanating from the CSP and to process them in real and pseudo real-time. The input data rates and performance requirements are modelled for a "double-buffered" 12-hour period (viz. 6 hours of telescope operation with buffering and 6 hours of processing of the previous buffer).
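To illustrate what the double-buffered model implies for sizing, the sketch below assumes that the buffer must hold one 6-hour observation being written plus the previous one being processed. The ingest rate in the example is a hypothetical placeholder, not an SKA figure, and the function names are illustrative only.

```python
SECONDS_PER_HOUR = 3600.0


def double_buffer_bytes(ingest_bytes_per_s: float,
                        observe_hours: float = 6.0) -> float:
    """Capacity for two observation windows: one filling, one being processed."""
    one_observation = ingest_bytes_per_s * observe_hours * SECONDS_PER_HOUR
    return 2.0 * one_observation


def mean_read_rate(ingest_bytes_per_s: float,
                   observe_hours: float = 6.0,
                   process_hours: float = 6.0,
                   passes: int = 1) -> float:
    """Average read bandwidth needed to work through the previous buffer.

    `passes` is the number of times the data are read during processing;
    a value of 1 is an assumption, not a statement about the pipelines.
    """
    return ingest_bytes_per_s * observe_hours * passes / process_hours


if __name__ == "__main__":
    rate = 1.0e12  # hypothetical ingest rate of 1 TByte/s
    print(double_buffer_bytes(rate) / 1e15, "PBytes of buffer")
    print(mean_read_rate(rate) / 1e12, "TBytes/s mean read rate")
```

With equal observing and processing windows and a single pass over the data, the mean read rate equals the ingest rate; multiple passes scale it linearly.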

Estimate Costs for Build, Ship and Test

A cost per Compute Island has been estimated but it is not shown here.

Overview of the SDP

A schematic representation of the SDP is shown in Figure 2. Raw data can be viewed as being pushed down (in the context of the diagram) from the Central Signal Processor (CSP) through a uni-directional, multi-stage bulk data processing network, culminating at a series of Top-of-Rack switches that feed processing and storage elements housed in compute islands. The compute islands are inter-connected by a secondary network, which therefore does not congest the main bulk data network, and are finally connected to an archive network for science data products.




Figure 2: Schematic representation of the SDP Costed Hardware Concept showing the unidirectional Ethernet Bulk Data Network (BDN) supporting Ingest to a High Performance Buffer located on the Compute Islands. Data exchange between Compute Islands is supported by an orthogonal bidirectional Low-Latency Network (LLN) currently costed as Infiniband. Science products are delivered to Storage Pods for intermediate and long-term storage over a bi-directional Ethernet network for onward user delivery.

6.2 COMPUTE ISLAND

A Compute Island consists of several interconnected compute nodes (see below). Each compute island has associated infrastructure and facilities such as shared file systems, management network and master and control node(s). This makes each compute island largely independent of the rest of the system. The size of the SDP will be expressed by the number of compute islands it contains - a parameter that will be freely scalable due to the compute islands’ independent nature, although factored by the interconnectivity of the network. Most of the infrastructure will be similar between the three SDPs, but it is conceivable that the size of an island (e.g. the number of compute nodes within an island), or the compute node design itself, differs between telescopes. This may be due to the balance between compute / IO ratios for the different telescopes, although there is a strong preference to maintain a high degree of commonality.

While the total useful capacity of the Science Data Processor depends on many components, we identify three major defining characteristics that we will use to scale the system:

● Total capacity

● Capacity per Compute Island

● Characteristics per node

The total capacity is defined by the number of compute islands that are available. This top-level number, the aggregate peak performance (Rpeak) [AD05] expressed in PFlops, is defined by the number of compute islands that make up the Science Data Processor and the capacity per compute island. While this number is a useful way to express the size of the system, its usefulness is limited since it does not take computational efficiency into account. Ideally total capacity of the system would be defined by the science or system requirements, but considering the constraints introduced above, it is more likely that total capacity will be defined by the available budgets (energy, capital or operational).
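The relationship between island count, island capacity and aggregate peak performance can be sketched as follows. All numbers in the example are hypothetical placeholders, and the `efficiency` parameter is an illustrative way of showing why Rpeak alone overstates useful capacity; it is not a value taken from [AD05].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SdpSizing:
    """Illustrative sizing parameters (all values are assumptions)."""
    islands: int
    nodes_per_island: int
    node_peak_tflops: float

    @property
    def island_peak_tflops(self) -> float:
        # Capacity per compute island: nodes x per-node peak.
        return self.nodes_per_island * self.node_peak_tflops

    @property
    def rpeak_pflops(self) -> float:
        # Aggregate peak performance: islands x capacity per island.
        return self.islands * self.island_peak_tflops / 1000.0

    def sustained_pflops(self, efficiency: float) -> float:
        """Peak scaled by a computational efficiency in [0, 1]."""
        if not 0.0 <= efficiency <= 1.0:
            raise ValueError("efficiency must lie between 0 and 1")
        return self.rpeak_pflops * efficiency


if __name__ == "__main__":
    sizing = SdpSizing(islands=100, nodes_per_island=50, node_peak_tflops=10.0)
    print(sizing.island_peak_tflops, "TFlops per island")
    print(sizing.rpeak_pflops, "PFlops Rpeak")
    print(sizing.sustained_pflops(0.2), "PFlops at 20% efficiency")
```

Two systems with the same Rpeak can therefore differ widely in useful throughput, which is why the text treats Rpeak as a size label rather than a performance guarantee.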

The capacity of a compute island is defined by the number of nodes per island and the characteristics of these nodes. This capacity is expressed in computational capacity, i.e. TFlops, but it is likely that computational capacity will not drive the sizing of the compute islands. Island capacity is defined by the most demanding application, in terms of required memory, network bandwidth, or compute capacity that requires a high capacity interconnect.

The basic building block of a compute island is the compute node. The characteristics of these nodes are defined by design equations [AD05], but within these bounds a vast number of valid node designs can be identified, especially when taking into consideration vendor roadmaps. The SDP parametric model defines a number of ratio rules that describe suitable node designs. Within the bounds of these rules, cost, energy efficiency and maintainability are considerations that may be used to select optimal node implementations. Operational costs, in particular energy versus deployment and maintenance cost, will play a key role in this decision. It is clear that this decision cannot be made until more information is available on the likely technology options available for nodes.

At this point in time, we are not in a position to consider future technologies and as such an estimate of a potential compute island would comprise:

Component                    Quantity  Notes
Compute Nodes                56        Based on ToR switch size with uplinks to the next stage
Archive Gateway Nodes        2         Maintaining archival capabilities
Login/Service/Master Nodes   2         H/A Cluster and Resource Management
Remote Service Node          1         Managing boot for diskless nodes
Rack Infrastructure          1         Based on 42U current capability
Build and Integration        1         Cost for Scalable Unit (not included in hardware cost)
Scalable Unit Connectivity   -         Discussed below in networking
Buffer                       -         Discussed below in Buffer

Table 1: Configuration of a Compute Island

As yet no allowance in the cost has been made for additional rack-cooling to accommodate heat

density and power requirements [AD06]. In addition, further packaging may be required to

support multiple accelerators and buffers although this needs to be assessed in relation to the

evolution of the compute node model. It should be noted that while in this scenario a Compute

Island maps into a single rack, this may not be the case if the node characteristics or design

changes, which is highly likely.

6.3 BUFFER

Following the model of a highly data-driven architecture, the buffer requirement was modelled

as being distributed locally across compute nodes within a Data Island and not shared globally.

Based on the number of compute nodes this is likely to be many TBytes/node and as such

packaging will be critical to provide not only the necessary bandwidth but also the capacity per

node - this will undoubtedly require a refinement to the compute node architecture assumed

here.

The amount of buffer space has been taken from [AD05]. To cost this we have used the pricing

for both volatile and non-volatile storage media using the following principles:

Storage Pricing/Power

         €/TB                              Watt/TByte
Media    2014    2018    2020    2022      (assumed constant)
SATA       30       8       5       4      0.625
SAS        80      20      13      11      0.75
SSD       600     150     100      86      1
DRAM     5000    1250     833     714      250

Table 2: Cost trajectory for non-volatile and volatile storage

No discount has been applied to this pricing. We have assumed a ratio of 9:1 SATA:SSD in the costing estimate.
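As an illustration of how the 9:1 SATA:SSD assumption combines with the Table 2 trajectory, the following sketch computes a blended price and power figure per TB. The function names and the simple capacity-weighted average are assumptions made for illustration only; the actual costing model may weight the media differently.

```python
# Price (EUR/TB) and power (W/TB) values copied from Table 2.
PRICE_EUR_PER_TB = {
    "SATA": {2014: 30, 2018: 8, 2020: 5, 2022: 4},
    "SSD": {2014: 600, 2018: 150, 2020: 100, 2022: 86},
}
POWER_W_PER_TB = {"SATA": 0.625, "SSD": 1.0}

# Assumed reading of "9:1 SATA:SSD": nine parts SATA capacity to one part SSD.
MIX = {"SATA": 9, "SSD": 1}


def blended_price(year, mix=MIX):
    """Capacity-weighted average price in EUR/TB for the given year."""
    total_parts = sum(mix.values())
    weighted = sum(PRICE_EUR_PER_TB[m][year] * parts for m, parts in mix.items())
    return weighted / total_parts


def blended_power(mix=MIX):
    """Capacity-weighted average power in W/TB (constant over time)."""
    total_parts = sum(mix.values())
    weighted = sum(POWER_W_PER_TB[m] * parts for m, parts in mix.items())
    return weighted / total_parts


if __name__ == "__main__":
    for year in (2014, 2018, 2020, 2022):
        print(year, round(blended_price(year), 2), "EUR/TB")
    print(round(blended_power(), 4), "W/TB")
```

Under these assumptions the blended 2018 price is 22.2 €/TB, dominated by the SSD share even though SSD is only a tenth of the capacity.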




6.4 COMPUTE NODE

The characteristics of an SDP compute node, in this estimate, have been defined by maximising

the computational capability for a set of imaging techniques [AD05]. This has led to a compute

node design which must deliver

a. A relatively high degree of efficiency on Accelerator-assisted CPUs.

b. A substantial amount of buffer/node (10s to 100s TBytes)

c. High I/O capability to deliver connections to multiple networks.

The actual amount of independent processing performed by each compute node will be largely

defined by the input dataset and type (as defined by data object), processing performance and

dataset size.

For cost estimates we assumed each compute node to be similar to that which is widely used in

moderately sized HPC cluster systems in research and academia. This comprises a single or dual

socket Intel Xeon CPU (Host) + GPU accelerator(s). Our current model assumes that the

"majority" of processing will be performed on the accelerator(s) and modelled as such. There will

be tasks within the processing pipeline not suited to GPU implementation and assumed to run

locally on the Host CPU. This is not explicitly factored into our estimates. The form-factor for the

compute node will be highly dependent on the requirements for I/O connectivity and accelerator

hosting, especially given the constraints of current Xeon architectures and PCI connectivity

through the chip-set. This limits the number of PCI devices achievable within a particular form

factor and may also imply different types of node for different telescopes.

Current nodes of the assumed type (e.g. using Nvidia Tesla K20 GPUs), typically yield peak double-

precision performance of 1-1.5 TFlops (some competing devices by AMD already offer 2 TFlops).

We assume Moore's Law to hold for GPU computing power until 2018. As contingency we

conservatively assume a mere further doubling of GPU compute power between 2018 and 2022.

These assumptions are combined with the performance requirements [AD05] to estimate the

number of required accelerators to the 2018+ timescale. List prices at launch together with

power ratings for the Tesla range are provided in [RD09].
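The performance assumptions above can be sketched as a simple projection. The base year (2014), the base value of 1.5 TFlops, the two-year doubling period used to represent Moore's Law, and the flat behaviour after 2022 are all assumptions chosen for illustration; the text only states that Moore's Law holds to 2018 and that a single further doubling occurs between 2018 and 2022. The `required_tflops` and `efficiency` arguments are hypothetical inputs, not values from [AD05].

```python
import math


def projected_peak_tflops(year, base_tflops=1.5, base_year=2014,
                          doubling_years=2.0):
    """Projected peak double-precision TFlops per accelerator.

    Assumed: base year, base value and a 2-year doubling period to 2018.
    From the text: one further doubling between 2018 and 2022.
    """
    if year < base_year:
        raise ValueError("year precedes the base year")
    if year <= 2018:
        return base_tflops * 2 ** ((year - base_year) / doubling_years)
    at_2018 = base_tflops * 2 ** ((2018 - base_year) / doubling_years)
    capped = min(year, 2022)  # assumed flat beyond 2022
    # One doubling spread geometrically over the four years 2018-2022.
    return at_2018 * 2 ** ((capped - 2018) / 4.0)


def accelerators_needed(required_tflops, year, efficiency):
    """Accelerators needed to sustain required_tflops at a given efficiency."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    sustained = projected_peak_tflops(year) * efficiency
    return math.ceil(required_tflops / sustained)
```

With these assumed inputs, a device delivering 1.5 TFlops in 2014 would be projected at 6 TFlops in 2018 and 12 TFlops in 2022. The accelerator count is very sensitive to the efficiency value, which is why the text stresses validation by prototyping.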

We have applied an efficiency value for the node performance based on this peak performance

trajectory, and need to continually monitor and validate this by prototyping. How the memory

bandwidth for these devices will scale with processing performance is highly speculative. For the

purposes of this exercise we have assumed that the overall efficiency will be constant. We

assume that the power per GPU will remain constant at around 300W (although experiments

with the Wilkes cluster in Cambridge suggest that efficiency can be maintained even where GPUs

are de-clocked, thus improving energy consumption). Based on the number of compute nodes,

we can now begin to construct a network supporting many thousands of these units.

The price performance of a node has been based on components available today and in particular

a COTS server consisting of the following:




Compute Node       Qty
Xeon Ivy Bridge    2
Accelerator        2
DRAM (GBytes)*     64-1024
10GbE              2
IB FDR/40GbE       1
IB/10GbE Cable     3
Power (W)          ~800

Table 3: Compute node configuration

* depending on working memory set [AD05]

The price in Euro is based on a 40% discount off list assuming the large volumes that will be

necessary. (This will need to be reviewed and is highly dependent on procurement strategy).
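To give a feel for the heat density referred to in the Compute Island discussion, the sketch below multiplies the approximate node power from Table 3 by the compute node count from Table 1. It deliberately covers compute nodes only: the text gives no power figures for gateway or service nodes, so excluding them is an assumption of this sketch.

```python
COMPUTE_NODES_PER_ISLAND = 56  # Table 1
NODE_POWER_W = 800             # Table 3, approximate


def island_compute_power_kw(nodes=COMPUTE_NODES_PER_ISLAND,
                            node_power_w=NODE_POWER_W):
    """Power drawn by the compute nodes of one island, in kW.

    Gateway, service and switch power are not included because the
    source gives no per-node figures for them.
    """
    return nodes * node_power_w / 1000.0


if __name__ == "__main__":
    print(island_compute_power_kw(), "kW per island (compute nodes only)")
```

At roughly 45 kW for a single 42U rack, this supports the earlier remark that additional rack cooling may need to be costed.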

6.5 INTERCONNECT SYSTEM - BULK DATA TRANSPORT

Assuming that Ethernet will be the network of choice for constructing the multi-stage bulk data

transport network from ingest ports to each of the compute islands, we can size an Ethernet

network based on a number of links down to compute nodes and a level of over-subscription up

to the next stage, while encompassing a certain degree of redundancy.

To estimate the network cost, we have assumed that a number of compute nodes will be

aggregated by a ToR (Top-of-Rack) switch and these ToR switches, in-turn, aggregated into a

multi-stage network from the CSP->SaDT->SDP with an intermediate stage providing routing

through a Software Defined Network (SDN), as shown in Figure 2.

A first level of approximation is that the network will be constructed by a 3-stage network in

which a degree of over-subscription can be tolerated. The tolerance will be a function of the anti-

congestion behaviour of the network as a whole and in an over-subscribed network this will be

mitigated by a SDN sensitive to the traffic pattern across the network. The design of the switch-

stack is such that we deploy a 1st stage switch layer to accommodate the direct CSP links; a 2nd

switch stage to fan-out these links to afford a level of redundancy and a 3rd stage (implicit to the

Compute Island) to provide aggregation of final egress of ingest into the buffer, ready for

processing. Network architectures of this form are found in large-scale IT systems, although they

typically follow the nomenclature of core (1st-stage), aggregation (2nd) and access (3rd stage).

Third Stage – Implicit to Compute Island Configuration

As described above, the 3rd stage of networking has helped define the physical instantiation of a

Compute Island, a useful unit of scalability, albeit the concept of a ToR switch may well become

redundant as models for disaggregation and microserver packaging become mainstream. The

level of aggregation is driven by the size (arity) of the ToR switch – today approximately 48-64

ports of 10GbE each [RD08].




The costing exercise has been based on the cheapest high-port count High Speed Ethernet

(10GbE) switch available which supports a 56D16U (3.5x over-subscription). Currently each port

is an SFP+ connection as Cat6 switches are not yet freely available and have lower port counts as

they tend to consume more power. It is anticipated that Cat6 capability will evolve in the near

future reflecting LOM (LAN on Motherboard) 10GbE capability in future motherboards.

Whether such NICs will support RDMA (e.g. iWarp or RoCE [RD10]) will depend on adoption,

although low-latency is not the prime consideration here. For cabling, this cost has been

estimated based on 10GbE SFP+ Cu cables within a rack. It should also be noted that the recently

announced 25GbE standard, driving higher port count devices, may have a significant effect on

pricing.
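The over-subscription figure quoted for the ToR switch can be checked with a short calculation. The sketch reads "56D16U" as 56 downlinks and 16 uplink-equivalents at 10GbE, which is an interpretation rather than something the text spells out; the function name is hypothetical.

```python
def oversubscription(downlinks, down_gbps, uplinks, up_gbps):
    """Ratio of aggregate downlink bandwidth to aggregate uplink bandwidth."""
    if uplinks <= 0 or up_gbps <= 0:
        raise ValueError("need at least one uplink with positive speed")
    return (downlinks * down_gbps) / (uplinks * up_gbps)


if __name__ == "__main__":
    # 56 x 10GbE down, 16 x 10GbE-equivalent up.
    print(oversubscription(56, 10, 16, 10))
    # The same uplink capacity expressed as 4 x 40GbE.
    print(oversubscription(56, 10, 4, 40))
```

Both calls give 3.5, matching the quoted 3.5x over-subscription.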

Second Stage Networking

To construct the second stage of networking we consider the over-subscription (cost reduction

factor) based on the number of uplinks available from each ToR switch in the 3rd stage of the network.

For the purposes of initial cost estimates the number of 40GbE uplinks is chosen as 4. Thus the

number of 2nd stage switches will be based on the total number of uplinks from the 3rd

stage and the number of downlinks from the CSP. The switch is again assumed to be

the highest port count 40GbE spine (ToR) switch, which presents 36x40GbE (to reiterate, for

simplicity we have not assumed more expensive core switches for this stage of the estimate,

although this may change as the solution is refined). This means that the amount of over-

subscription may vary between the Low, Mid and Survey telescopes depending on ingest rate.

The amount to which this network can be over-subscribed will require validation based on

management of flows and minimization of congestion. Cabling, at this time, is difficult to

estimate both in terms of unit cost and of the lengths required. For the purposes of this exercise an

average price of 150 Euro per QSFP connector is assumed.

First Stage Networking

A first stage of networking, conforming to the CSP input data rates and number of

channels [1], will be constructed as defined by the ingest data rate of the visibility buffer. On the

basis of this, the number of 40GbE or 100GbE channels can be calculated. At the time of writing

it is unclear which interface speed will be available for CSP connection and therefore the model

assumes 40GbE. The actual choice of this, similarly to the 2nd stage network, is again based on

40GbE spine switches as opposed to core switches. Furthermore the model for oversubscription

assumes a non-congesting network and that the proposed SDN environment manages flows in

such a manner to provide this balance. Cables for this have been estimated based on a number of Single

Mode Fibre optic modules forming a “Patch Panel” to the 1st stage network. In summary, the

following per-port pricing has been assumed for 2018+, based on a 2014 port

price of around 169 Euro for 10GbE and 600 Euro for 40GbE (excluding cables). This is based on

SFP+ and QSFP physical connections.




          Price/Port for ToR (Euro)         Watt/Port
Year      2014    2018    2020    2022      (assumed constant)
10GbE      180      57      57      57      3
40GbE     1000     200     150     150      3.5

Table 4: Cost trajectory for High Speed Ethernet

No discount has been applied to this pricing.
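As a worked illustration of Table 4, the sketch below prices a single ToR switch with the port mix assumed earlier (56 x 10GbE downlinks and 4 x 40GbE uplinks). Treating a switch as the plain sum of its per-port prices, with no chassis or cable cost, is a simplifying assumption of this sketch, and the function name is hypothetical.

```python
# Per-port price (EUR) and power (W) copied from Table 4.
PORT_PRICE_EUR = {
    "10GbE": {2014: 180, 2018: 57, 2020: 57, 2022: 57},
    "40GbE": {2014: 1000, 2018: 200, 2020: 150, 2022: 150},
}
PORT_POWER_W = {"10GbE": 3.0, "40GbE": 3.5}

# Port mix of one ToR switch as assumed in the text.
TOR_PORTS = {"10GbE": 56, "40GbE": 4}


def tor_switch_estimate(year, ports=TOR_PORTS):
    """Return (cost in EUR, power in W) for one switch, excluding cables."""
    cost = sum(PORT_PRICE_EUR[speed][year] * n for speed, n in ports.items())
    power = sum(PORT_POWER_W[speed] * n for speed, n in ports.items())
    return cost, power


if __name__ == "__main__":
    for year in (2014, 2018, 2020, 2022):
        cost, power = tor_switch_estimate(year)
        print(year, cost, "EUR", power, "W")
```

On these figures the per-switch cost falls from 14,080 Euro at 2014 prices to 3,992 Euro at 2018 prices, while the assumed power stays at 182 W.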

While this model of ToR switch may well be appropriate for the next 5 years, it may well evolve

in the next 5-10 years to a model of network disaggregation or micro-server in which the notion

of ToR becomes moot. It is also not clear whether the density of a ToR switch will also increase

(c.f. the recently announced 25 GbE standard). Furthermore the actual performance and cost of

the network will be driven by the end-to-end connectivity and bandwidth. Currently 10GbE (or

56 Gbps IB) is state-of-the-art; however, we may expect 40GbE to be more cost-efficient in 2020,

as shown in the diagram below taken from the IEEE, although this will have consequences in terms

of the configuration of the Compute Island. If 10 or 40GbE becomes the de-facto on-board

connectivity we would expect significant price drops.

Figure 3: IEEE Prediction for 40GbE availability in commodity x86 Servers




Network Redundancy

Depending on the ultimate strategy taken on the network, higher degrees of redundancy may be

an appropriate tactic to employ. This has not been modelled as yet and will be addressed through

CDR and form part of the analysis conducted within the ILS activities [AD08].

6.6 LOW-LATENCY NETWORKING

Re-ordering of data within the buffer will necessitate the means to exchange data across Data

Islands, and hence a mechanism to cater for inter Compute Island communication. To provide this,

and to avoid intra Compute Island communication congesting the bulk data transport, a

separate application network has been accommodated. This is currently based on Infiniband FDR

technology, although this may well change in the timescales under discussion here to high-speed

Ethernet supporting RDMA as these technologies merge. This network is termed the low-latency

network to differentiate it from the bulk data transport, which is latency tolerant and

bandwidth driven and which lacks the necessary flow-control methods and QoS

(notwithstanding any SDN capabilities) offered by traditional Ethernet switch stacks. Although

full bisectional bandwidth is easily supported within a compute island, similar support between

islands becomes more difficult as cable management and cost estimation become a burden,

similar to that for the bulk data transport. In addition, at very large node counts multiple stages,

particularly within a fat-tree topology, impose increasing memory requirements on the compute

node in support of collective operations, although there are a number of initiatives to ameliorate

this problem for particular message passing schema. Topologies, other than fat-tree, may well be

appropriate although these may increase the uncertainty on cable costs and management. For

completeness the Infiniband roadmap is shown here as a reference although other technologies,

in addition to RDMA Ethernet, are planned by various vendors.




Figure 4: Infiniband Roadmap [RD11] SDR - Single Data Rate, DDR - Double Data Rate, QDR - Quad Data Rate, FDR - Fourteen Data Rate, EDR - Enhanced Data Rate, HDR - High Data Rate, NDR - Next Data Rate

The cost of this network has been estimated based on an 8:1 oversubscribed network using the

cluster configuration tool [RD12] and the number of compute nodes defined in [AD09]. The costs

(available today) for either a 3:1 or 1:1 network are also estimated here:

OVER-SUBSCRIPTION         LOW         MID         SUR
1:1                 2,039,175   5,796,533   9,996,203
3:1                   917,044   2,149,628   3,014,756
8:1                   647,269   1,491,761   2,078,085

Table 5: Estimated costs (Euro) for various over-subscribed FDR Infiniband networks for various telescopes

6.7 MANAGEMENT AND ARCHIVE

A separate (10GbE) network has been provisioned for support of the archive. Please note that

this is shown in relation to the full archive for 2022.

The management network is provisioned in terms of standard 1GbE network switches and

included in the price of a compute island.




7 HIERARCHICAL STORAGE (WAS CALLED SCIENCE ARCHIVE)

The science archive consists of the following hardware components: medium performance buffer

and long term storage.

The science data archive operational model and the data life cycle of the archived science data

are currently unknown. Therefore a generic storage design (disk based) was adopted for the long

term storage similar to what is being used by online backup and cloud storage vendors. Once the

operational model and data life cycle is understood the archive design can be optimized and

could potentially include tape storage.

The need for a third storage tier, namely that of a slow or intermediate buffer (medium

performance buffer), arises from SUC4, which is understood to require stacking data taken over

many days, weeks or months. Storing test data and support of commissioning activities are other

potential uses of the slow buffer. Since there is no need for double buffering in the slow buffer,

it is half the size of the fast buffer. The size of the fast buffer is taken from AD05. The archive will

therefore consist of tiered storage and this will be optimized in the future to meet the operational

requirements and minimize the total cost of ownership. In order to manage the tiered storage

archive Hierarchical Storage Management software will be required as discussed in 7.4.8.1.

Since the operational model for the archive is unclear, no assumptions have been made on the

level of data redundancy that is required apart from providing a second copy of the data (as per

AD03). All storage volumes are indicated as raw storage volumes.

7.1 LONG TERM STORAGE & MEDIUM PERFORMANCE BUFFER

Both the medium performance buffer and the long term archive storage are based on the design in [RD13].

This is a vendor independent storage pod design based on commodity components. To project

the cost to 2022 the storage density was scaled according to industry roadmaps (see Figure 5). The storage

pod consists of a 4U enclosure which takes 45 3.5” consumer SATA hard disk drives. The standard

connectivity is 1GbE and thus we add a 10GbE Ethernet card for connectivity to the archive

network. To scale the costs from the 2014 design to 2022 we scale the storage density per hard

disk according to the ASTC (Advanced Storage Technology Consortium) technology roadmap

[RD14]:




Figure 5: ASTC Technology Roadmap

As per the ASTC technology roadmap a Compound Annual Growth Rate (CAGR) of 30% is being

used to extrapolate storage density to the 2022 timeframe.
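The extrapolation can be written as a one-line compound growth formula. The sketch below assumes that capacity per drive scales directly with areal density, and uses the 4 TB drive size quoted later in this section as the 2014 starting point; the function name is hypothetical.

```python
def extrapolate_capacity(base_tb, base_year, target_year, cagr=0.30):
    """Compound a per-drive capacity forward at a fixed annual growth rate."""
    if target_year < base_year:
        raise ValueError("target year precedes base year")
    return base_tb * (1.0 + cagr) ** (target_year - base_year)


if __name__ == "__main__":
    drive_2022 = extrapolate_capacity(4.0, 2014, 2022)
    print(round(drive_2022, 1), "TB per drive in 2022")
    print(round(45 * drive_2022), "TB raw per 45-drive pod in 2022")
```

Eight years at 30% per year gives a growth factor of about 8.2, so a 4 TB drive in 2014 corresponds to roughly 33 TB in 2022 under this assumption.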

For the medium performance buffer we use the standard storage pod configuration and for the

long term storage we use the storage pod in a MAID (massive array of idle drives) configuration

to reduce operational costs (power consumption). There is potential for further reduction of

operational costs due to reduced costs for the included, constant overheads for networking per

port and server costs. In addition it is likely that the disk failure rate for MAID configurations is

lower than the calculated 4% per year. Neither of these factors has been included, and thus the

calculated estimates can be regarded as conservative.

The estimated cost of the hardware (excl hard disks) of the storage pod is based on a recent

procurement by ICRAR from [RD15]. The cost of a storage pod is €4219 (incl shipping to Australia).

A cost of €500 was added to the storage pod to account for the assembly, test and burn-in of the

storage pods. The cost of the rack has been included in the capital cost of the pod (assuming 10

4U pods per 42U rack). A typical price (2014) of €144 per 4 TB hard drive was used. This is a

conservative estimate as currently 4 TB internal SATA hard drives are readily available for €120.

Volume discounts can reduce the cost of hard drives further and therefore a conservative 20%

discount was used. The cost of the storage pod including media is thus €57888 per PB (excluding

network switches and cables).
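The structure of the per-petabyte figure can be sketched as follows. The text does not make clear whether the €144 drive price is before or after the 20% discount, nor how the rack cost is apportioned, so this sketch does not attempt to reproduce the quoted €57888 exactly. The function name, the decimal TB-to-PB conversion and the way the discount is applied are assumptions.

```python
def pod_cost_per_pb(enclosure_eur=4219.0, assembly_eur=500.0,
                    drives=45, drive_tb=4.0, drive_eur=144.0,
                    drive_discount=0.20):
    """Cost in EUR per raw PB for one storage pod (2014 prices).

    Assumed: the discount applies to drive_eur, 1 PB = 1000 TB, and no
    separate rack, switch or cable cost is added.
    """
    media_eur = drives * drive_eur * (1.0 - drive_discount)
    pod_eur = enclosure_eur + assembly_eur + media_eur
    pod_pb = drives * drive_tb / 1000.0
    return pod_eur / pod_pb


if __name__ == "__main__":
    print(round(pod_cost_per_pb()), "EUR/PB with the discount applied")
    print(round(pod_cost_per_pb(drive_discount=0.0)), "EUR/PB without it")
```

The two readings give roughly €55,000 and €62,000 per PB, which bracket the quoted figure.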




7.2 DELIVERY PLATFORM HARDWARE

We are planning on 6 servers to host DELIV services and provide resiliency at each telescope.

These will run all of the user access servers, the data discovery database and the tools to schedule

the data movement. We are also costing out storage that would be used to support the database

back-ends and data caching for data that is being visualised or moved to a site. The services that

are deployed need to be configured in a resilient manner, therefore it is important that we have

redundant servers available to run the different services that are foreseen within the DELIV

activity as well as their possible replacements. Therefore we see a 50% replacement rate as being

suitable, i.e., there are four main services which will each have their own hosting system and two

spare resources able to host any of the other services. Less resilience than this could leave the

system in a state, due to multiple failures, that would result in inaccessibility.

7.3 LMC HARDWARE

These have not been specifically priced, although an allowance has been made for a separate

1GbE management network connecting all components in the system and for a "redundant"

Management Unit rack consisting of 56 commodity Xeon units. This should provide sufficient

management processing resource for the SDP. The exact method for connectivity has not been

discussed.

8 SDP SOFTWARE COST ESTIMATION

8.1 ESTIMATION METHODOLOGY

8.2 EFFORT ESTIMATION

For software development several techniques or methodologies can be used to estimate effort.

So far, a mixture of effort estimation techniques has been used depending on the current

understanding of a particular software component. In areas where the design has not been

defined yet, the Wideband Delphi method was used. In areas that map closely to existing

software and the existing software is well understood and available, source lines of code

estimation is used. In future when the design has been further defined (and where applicable),

function point analysis may be used to estimate effort.

8.3 LABOUR RATES

For domain specific software development (and other domain specific tasks), an average labour

rate has been calculated based on information from these organisations: SKA-SA, UWA, UCAM,

UK National Lab, Astron, NRC.




For non-domain specific software development typical commercial labour rates are used.

Labour Resources                        SDP rate (avg)   Min rate   Max rate
Senior Project Manager                  €669             €467       €770
Intermediate Project Manager            €434             €287       €625
Senior Architect                        €690             €530       €770
Project administrator                   €200             €200       €200
Project Engineer                        €598             €467       €742
Senior Engineer                         €598             €467       €742
Intermediate Engineer                   €434             €287       €625
Junior Engineer                         €334             €200       €488
Senior Scientist                        €574             €344       €742
Intermediate Scientist                  €447             €250       €625
Contracts Specialist                    €352             €352       €352
Administrative Support**                €241             €241       €241
Senior Server Support Engineer          €337             €337       €337
Intermediate Server Support Engineer    €250             €250       €250
Junior Server Support Engineer          €189             €189       €189
Consultant Senior Architect             €1,200           N/A        N/A
Consultant Senior Engineer              €1,000           N/A        N/A
Consultant Intermediate Engineer        €800             N/A        N/A
Consultant Junior Engineer              €600             N/A        N/A
Senior Network Support                  €337             €337       €337
Intermediate Network Support            €250             €250       €250
Junior Network Support                  €189             €189       €189

Table 6: Estimated Labour Rates

8.4 SOFTWARE ESTIMATION ASSUMPTIONS

For software development effort we have shown and costed the development effort separately

from the AIV and documentation effort that forms part of the overall software delivery.

The costs of Project Management and Systems Engineering have been excluded from this cost

estimation.

AIV effort

As the SDP software development project is likely to be a broadly distributed project, with small

components being developed in isolation of each other and the telescope site, a significant effort

will be needed (at all levels) to integrate these into a coherent whole. We have therefore included

an additional 30% effort (based on the development effort) to account for these activities. This

30% overhead should allow a formal program of integration and QA in a tree-like approach for

the SDP element.

Documentation effort

Due to the complexity and scope of the SDP software development project and to comply with the

v3 Level 1 requirements, an additional 15% effort is included to produce documentation. This

effort excludes source code documentation in the form of comments within the source code, as

this forms part of the development effort. This additional 15% effort for documentation is

assumed to be spent after completion of the development tasks.
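The two overheads described above can be combined as in the sketch below. It assumes that the 15% documentation effort, like the 30% AIV effort, is calculated on the development effort alone; the text states this explicitly only for AIV. The function name and the effort unit are hypothetical.

```python
AIV_FRACTION = 0.30  # integration and verification, based on development effort
DOC_FRACTION = 0.15  # documentation, assumed to use the same base


def total_software_effort(development_effort):
    """Break a development estimate into development, AIV and documentation.

    The unit is whatever the input uses (for example FTE-years).
    """
    if development_effort < 0:
        raise ValueError("effort must be non-negative")
    aiv = development_effort * AIV_FRACTION
    docs = development_effort * DOC_FRACTION
    return {
        "development": development_effort,
        "aiv": aiv,
        "documentation": docs,
        "total": development_effort + aiv + docs,
    }


if __name__ == "__main__":
    print(total_software_effort(100.0))
```

Under this reading every 100 units of development effort imply 145 units of delivered software effort, before Project Management and Systems Engineering, which are excluded from this estimate.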




8.5 SOFTWARE COMPUTE PLATFORM

[AD01] defined the SDP Software Stack as:

Figure 6: The Science Data Processor System stack, showing the relationship between the level 2 elements of the product tree. Boxes are arranged so that each box is allowed to use only the boxes below it. Furthermore, the horizontal partitioning of boxes into columns is approximately arranged so that boxes in vertical alignment tend to be used together. The boxes are colour coded to reflect the L1 elements of the product tree that they come from: Computer Hardware (orange), Computer Software (gray), Pipelines (green), Data (yellow), Delivery Platform (black), LMC (blue).




Compute Operating System

The operating system forms the basis of the software stack and is the interface with the hardware

compute platform. The operating system needs to support all hardware conceivably deployed in

the Science Data Processor, be extremely scalable and, as experience with precursor and

pathfinder experiments has shown, highly tunable. In this respect appropriate levels of resource

will be attributed to the maintenance and support of the O/S ensuring patch maintenance,

release consistency, fault diagnosis and repair, application integrity and upgrade schedules.

Middleware

Middleware components will be developed from existing Open Source software environments.

These consist of the following:

1. Reliable Communication Channels - based on 0MQ or equivalent

2. Event Handling and Logging

3. Cluster and Platform Management

4. Development Environment

5. System Optimization - including behavioural modelling, performance analytics

This has been estimated for 1) and 2) based on experience with ALMA in respect of 0MQ or equivalent

and will also include other environments such as MPI (or equivalent) and OFED.

Event handling and logging will include the collection and analysis of compute island metrics;

ingest, bulk-data and archive data transport metrics and flow-rates, buffer metrics and the fusion

of this data in order to provide a system-wide state-of-health for the SDP. Such metrics will

include environmental parameters, performance counters, power consumption and soft and

hard errors. Additionally, methods for the early prediction of failures will be developed.

The effort for 3) is based on the exploitation of open-source frameworks, such as OpenStack, and

will address SDP specific enhancements specifically in terms of, for example, scale; pipeline

instantiation and scheduling; provisioning, management and control and in particular integration

with LMC.

The development environment (4) will include all aspects of code development, instrumentation

and optimization, coupled with source-code control mechanisms and the ability to test applications

within a secure and stable environment. Mechanisms for production roll-out will also be

developed.

System optimization (5) will seek to develop tools for system-wide instrumentation and

modelling of the SDP. This task will initially explore open-source simulation and behavioural

modelling tools and their suitability for SDP. These environments will be refined during operation.

Hierarchical Storage Management

Due to the simplicity of the current design of the hierarchical storage (only disk tiers) there is no

need to provision for Hierarchical Storage Management software like IBM Tivoli. The Data Layer




provides basic hierarchical storage management functionality inherently and therefore no

additional software is required based on the current design.

In the event that the hierarchical storage design becomes more complex and/or storage tiers

using other types of media are included the cost of a Hierarchical Storage Management system

will have to be included.

Application Development Environment and SDK

The cost for this item is covered in the System Level Tasks section and therefore not costed here.

Scheduler

The estimate of effort required for the scheduler depends on two main assumptions:

1. We can modify an existing Open Source scheduler sufficiently to be suitable for our

application.

2. The scheduler functionality is largely shared between it and the LMC component it

interfaces with.

Both of these assumptions mean that only limited effort is needed for the scheduler. We adopted

a slightly front-loaded approach, having two people working on the scheduler from the beginning

to get the architecture settled, with the senior developer effort reduced halfway through the

project when only implementation remains.

We do see some technology risk in this work package and therefore use an existing scheduler

with added functionality. Modern batch schedulers, like SLURM, have plugin support, which

should allow for this. We do require however that the scheduler interfaces directly with LMC to

estimate the available hardware resources days before an observation (to make a rough

schedule), and immediately beforehand (to help LMC create the physical deployment graph). This

functionality is not usually available, which incurs some risk. This is accommodated for in the

relatively high contingency for this line-item.

8.6 DATA LAYER

Costs are largely based on existing projects, in particular ALMA, LHC/CERN, LSST, SDSS, the

precursor MWA and their archives. A detailed cost discussion is part of a SKA Science Book

chapter (.pdf, p.14f) with co-authors from all above mentioned projects. For non-domain

software the costing is split between licenses for commercial solutions and labour for adaptation.

The Compute Island topology is expected to require customisations in close collaboration with

vendors. Since licensing schemes are convoluted and often bundled with hardware, the archive

cost sheet (.xlsx) segregates total cost of ownership into those for storage media, power etc. It

introduces a 100% fair comparison factor on hardware prices/petabyte to accommodate a

backup copy which is provided by the AWS Cloud option as part of its service.

https://jira.ska-sdp.org/secure/attachment/13503/Delivering%20SKA%20Science%20V2.1.pdf

https://jira.ska-sdp.org/secure/attachment/13502/ArchivePlusSUCcostEstimate.xlsx




Data Manager

The Data Manager is a process running on each data locality and implements the physical

deployment graph in response to the execution graph defined by the LMC Data Flow Manager. It

initiates the physical movement of data and uses the pipeline interface to invoke the processing

components.

Data Lifecycle Manager

DLM is a policy-based approach to managing the flow of data throughout its lifecycle. It provides

basic functions for the archive and the backup system. It requires the adaptation of a suitable

hierarchical storage management system (HSM). A study on HSM capabilities and cost is ongoing.

As with most data layer components it is uncertain where a build or buy analysis will lead. At one

extreme, one can envisage a Cloud solution not requiring any software development for large

parts of the data layer. Another possible outcome is a mix of tightly integrated storage managers

and object stores specifically tailored for a parallel HPC environment requiring a significant

amount of vendor support for adaptation and testing. A custom solution developed within the

project is considered too demanding and hence undesirable.

Science Archive Software

This includes data product ingest, indexing, replication, and access level control governed by the

archive policy. Excluded are backup, recovery, the archive portal, and batch access which are

covered by resources allocated under Data Lifecycle Management as well as the delivery

infrastructure, which explains the relative balance of FTEs.

Precursor projects generally do not have a specific Data Layer element to compare with. This is a

result of the SDP performance requirements and the need to focus on data locality. The ALMA

DB license sheet covers all related systems in the archive as well as the operational domain.

● Oracle License Sheet [RD16]

● General discussion of DB cost models [RD17]

Local Database Services

This is an overarching SDP database service infrastructure. It includes the respective needs of the

Data Layer, Data Delivery, LMC TM caching, scheduler infrastructure, pipelines (catalogues, sky

models), auxiliary metadata, local logging and monitoring, and can support the HPC

infrastructure as needed.

ALMA had one DBA and one DB application developer during the whole construction phase. The

latter worked mainly on the data layer specific DB applications. Even in the very conservative ESO

operational environment development never stopped, with new operational DBs regularly going

online. Also the ASKAP Science Data Archive Team has a DB specialist for the more limited scope

equivalent to the SDP archive and delivery mechanism.




Ingest Data from CSP into Data Layer

Short Description: Data Ingest is the process of taking the bulk data stream from CSP,

aggregating it, synchronising it with the metadata stream from TM, and mapping the aggregated

data onto the SDP fast buffer. This functionality is intertwined with LMC.

SDNs are a relatively novel technology, hence a slightly more elaborate description is: Data

packages from the CSP/SaDT network transport layer are routed to SDP Compute Islands,

aggregated and eventually instantiated as data objects on a fast local buffer. Independent of that,

a metadata stream is broadcast via LMC to all Compute Islands. The metadata establishes the

context of a data object. Once a data object is instantiated it becomes available to science

pipeline components and the context changes from the hardware centric Compute Island to a

processing oriented Data Island.

8.7 PIPELINE COMPONENTS

Processing Library

Most, if not all, of the cost estimates for the processing library are based on pathfinder and precursor

information.

LOFAR spent 71 FTE on SDP related s/w (Ingest, Calibration, Imaging, Pipeline framework). 3 FTE

were spent on the GSM (rough estimate) and 3.5 FTE were spent on Source Finding (rough

estimate), the latter 2 being done at the Dutch Universities. Of the 71 FTE, 29 FTE were spent in

Pre-Construction (i.e. pre-CDR) and 42 FTE were spent in Construction (i.e. post CDR).

Please note the following: LOFAR had its CDR in April 2007 and was opened for the General

Astronomer in 2012. Hence, the years 2001 - 2006 are considered pre-construction, and the years

2007 - 2013Q1 are considered construction. Post 2012, LOFAR s/w development continues, both

in optimising existing code and in adding new features. This is not included in the current figures.

Post 2012, a team of approx. 5 FTE spends 20% of their time on SDP related s/w maintenance;

i.e. 1 FTE / yr. The total LOFAR software development includes software for TM, CSP, and SDP

related tasks. We have performed a rough extraction of SDP related FTEs. However, sometimes

people work on multiple components making it hard to split up the effort. Effort for the archive

is not included, since those numbers are not available. Effort for the GSM is a rough estimate.

Effort for the Source Finding is an even rougher estimate. The effort for commissioning is not

included. Part of the effort is in research related tasks. The Pre-Construction figures especially

include a lot of research and prototyping. The effort on NIP software is not included in the 71

FTE.
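As a rough cross-check, the LOFAR figures quoted above can be tallied as follows. This is an illustrative sketch only; the variable names are ours and are not part of any LOFAR or SDP costing tool.

```python
# LOFAR SDP-related software effort as quoted in the text (FTE).
pre_construction_fte = 29.0  # 2001 - 2006, pre-CDR
construction_fte = 42.0      # 2007 - 2013Q1, post-CDR
total_fte = pre_construction_fte + construction_fte

# Post-2012 maintenance: a team of approx. 5 FTE spending 20% of their time.
maintenance_team_fte = 5.0
maintenance_fraction = 0.20
maintenance_fte_per_year = maintenance_team_fte * maintenance_fraction

print(f"Total development effort: {total_fte:.0f} FTE")
print(f"Maintenance effort: {maintenance_fte_per_year:.1f} FTE / yr")
```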

So far MWA spent the following effort on its Processing Library s/w. Imaging and Calibration: 6.5

FTE; Transient Detection: 2.75 FTE* (realistically 3.25 FTE); Source Detection: 2.25 FTE*

(realistically 3.25 FTE); EoR pipelines (there are two totally distinct pipelines for EoR on the

MWA): 31 FTE (one cost 20 FTE, the other 11 FTE). Note that these pipelines represent a full EoR




analysis path from point source and foreground removal through to production of final EoR data

products.

ASKAP spent 26.5 FTE on software development in the period Jul 2006 - Jul 2015. The effort spans

both design and construction and includes Ingest, Calibration, Imaging, Source Finding, Local

Monitoring & Control, and Management. It excludes efforts for the Archive.

Meerkat spent 49 FTE including design / 35 FTE excluding design on software development in the

period 2009 - 2016. The effort can approximately be broken down as follows:

Component                                                               Share of effort
Infrastructure and architecture, incl data transport, excl. archiving   25%
Calibration and Imaging                                                 38%
Ingest                                                                  13%
Archiving and storage                                                   13%
LMC & User interfaces                                                   13%

Table 7: Estimated effort based on Meerkat
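For illustration, the percentages in Table 7 can be converted into approximate FTE figures. The text does not state whether the percentages apply to the 49 FTE total (including design) or the 35 FTE total (excluding design), so the sketch below evaluates both. The function name is hypothetical.

```python
# Percentage shares from Table 7. They sum to 102%, presumably due to rounding.
shares_percent = {
    "Infrastructure and architecture": 25,
    "Calibration and Imaging": 38,
    "Ingest": 13,
    "Archiving and storage": 13,
    "LMC & User interfaces": 13,
}

def breakdown(total_fte):
    """Split a total effort (FTE) according to the Table 7 percentages."""
    return {name: round(total_fte * pct / 100, 2)
            for name, pct in shares_percent.items()}

for total in (49.0, 35.0):
    print(f"Total {total:.0f} FTE:")
    for name, fte in breakdown(total).items():
        print(f"  {name}: {fte} FTE")
```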

Science Analysis Pipeline Software

The LOFAR Source Finding software (PyBDSM) took approximately 3.5 FTE to develop (rough

estimate), which is comparable to the MWA effort (3.25 FTE). We use this as estimate for the

Postage Stamp Source Detection.

To date no precursor instrument has developed a Rotation Measure Synthesis pipeline. The

POSSUM team for ASKAP has made some progress towards such a pipeline for ASKAP but it is not

yet mature or complete. The estimate of the cost associated with production of such a pipeline

is therefore based on taking the effort required for the well understood case of a Postage Stamp

Source Detection pipeline (3.5 FTE) and adding a multiplier to account for the fact an RM pipeline

requires understanding of complex Faraday spectral products and is an entirely new modality,

giving rise to a total FTE of 4.5.

The Transient Source Detection pipeline on the MWA has taken 2.75 FTE and requires another

~0.5 FTE for completion being a total of 3.25 FTE for just the detection part alone. The imaging

aspects are not included in these estimates. We have used 3.5 FTE for the SKA to account for the

fact the work must be ported to all 3 instruments.

We have not included the cost of the EoR pipeline because the full EoR pipeline is not part of

the requirements.

Non-Imaging Software

The non-imaging software consists of the development of 3 real-time pipelines: pulsar timing,

pulsar search and transients. Effort is therefore needed for all three. There is some overlap in

the effort required for aspects of pulsar search and pulsar timing, which saves about 20%

of the effort. In all cases we have based the estimates on existing code trees. The




precursor/pathfinders don't provide us with much of a basis because none of them attempt to

operate in real time, and they generally all use the same publicly available codes and string them

together in a pipeline. For LOFAR generating such a pipeline script cost more than 3 FTE alone.

The only area where this isn't the case is the transients where real time systems have been

developed from scratch for LOFAR single stations and Parkes. It is also important to note that the

codes currently used in pulsar timing and pulsar search have been built up over many years by

astronomers in the main. They haven't been developed as real-time, robust, well-documented

and unit tested codes (this is not to denigrate those codes at all!) and so are used here as a guide.

For pulsar timing our estimate is based on the TEMPO2 [RD18] software suite and the calibration,

template matching and data manipulation and graphics code from PSRCHIVE [RD19]. Assessing

these codes we believe that approximately 50,000 lines of code are required and we used a code

writing rate of 16 lines-of-code/day which is based on professional code development estimates

and includes some design work based on the CDR documentation and detailed testing.

For the pulsar search the estimates are based on the requirement to develop a code to do the

cross-beam sifting of the candidates to find unique and real candidates and also for the

development of a machine learning code which identifies the real pulsars in the data stream. We

also need data manipulation tools but those can be the same as those developed for the timing

above. The sifting code is well established in a number of existing search codes and so our

estimate is robust here. The machine learning code is one where we (and others) are currently

doing lots of design work and so we have a reasonable idea, but there is a larger error bar on this

estimate. We have a total code requirement of about 10,000 lines.

The single pulse processing software needs to filter, present and be able to trigger on signals of

interest. As mentioned before our team has been involved in developing such code already and

our estimate is that 5,000 lines of code are required here.
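The line-count estimates above can be turned into coding days with a simple division. The sketch below is illustrative only. The 16 lines-of-code/day rate is quoted in the text for pulsar timing; applying the same rate to pulsar search and single pulse processing is our assumption, and the roughly 20% saving from overlap is not applied because the text does not say which totals it reduces.

```python
# Line-count estimates quoted in the text for the three non-imaging pipelines.
estimated_loc = {
    "pulsar timing": 50_000,
    "pulsar search": 10_000,
    "single pulse": 5_000,
}

# Rate quoted for pulsar timing; using it for the other two is an assumption.
LOC_PER_DAY = 16

def coding_days(loc, loc_per_day=LOC_PER_DAY):
    """Number of working days needed to write `loc` lines at the given rate."""
    return loc / loc_per_day

days = {name: coding_days(loc) for name, loc in estimated_loc.items()}
for name, d in days.items():
    print(f"{name}: {d:.1f} days")
```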

Imaging Pipeline Software

It is likely that the domain specific SKA software will be more complex than the LOFAR s/w and

that effort is needed for further investigation and optimisation. On the other hand we also have

working software suites like CASA, and the ASKAP / Meerkat / LOFAR s/w. As a rough estimate it

is therefore likely that the SKA Processing Library s/w will take as much effort as the total LOFAR

effort. Therefore, the total effort for the Ingest Pipeline, the Calibration Pipeline, and the

Continuum Imaging Pipeline is currently estimated to be 70 FTE, of which 10 FTE is in the area of

Ingest, and 30 FTE each for Calibration and Continuum Imaging. An additional 15 FTE is then

anticipated for Spectral Line Imaging and another 15 FTE for low-latency Slow Transients Imaging.

This brings the total estimate for the Imaging effort to 60 FTE.

The Imaging effort is then further split up into 30 FTE for Deconvolution and 30 FTE for Gridding

/ FFT. These efforts are then further broken down into 15 FTE for general Imaging development

and 5 FTE specific effort for each of the telescopes: LOW, MID, and SURVEY. Current Imaging




effort is costed based on a combination of ALMA & ASKAP software effort. This effort is divided

up into implementation of generic algorithms, which is expected to be common to all three

instruments, plus optimisation of these generic algorithms for specific instruments. It is expected

that optimisation of more universal algorithms such as the FFT will be undertaken by industrial

partners.
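The breakdown described above can be checked with a short tally. The sketch is illustrative; the dictionary and function names are ours.

```python
TELESCOPES = ("LOW", "MID", "SURVEY")

# Ingest, Calibration and Continuum Imaging pipelines (FTE).
processing_library_fte = {"Ingest": 10, "Calibration": 30, "Continuum Imaging": 30}

# Imaging effort by pipeline (FTE).
imaging_by_pipeline_fte = {
    "Continuum Imaging": 30,
    "Spectral Line Imaging": 15,
    "Slow Transients Imaging": 15,
}

def area_total(general_fte=15, per_telescope_fte=5):
    """General development plus telescope-specific effort for one functional area."""
    return general_fte + per_telescope_fte * len(TELESCOPES)

# Imaging effort by functional area (FTE).
imaging_by_function_fte = {
    "Deconvolution": area_total(),
    "Gridding / FFT": area_total(),
}

print(sum(processing_library_fte.values()))   # Ingest + Calibration + Continuum
print(sum(imaging_by_pipeline_fte.values()))  # Imaging total by pipeline
print(sum(imaging_by_function_fte.values()))  # Imaging total by functional area
```

Both views of the Imaging effort add up to the same total, which is the consistency the text relies on.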

Line of Code estimates:

LOC/hr       Mach.   Source             Notes
5.6 (±10%)   GPU     Cobalt (Nijboer)   Significant domain specific expertise
7.0‡         CPU     NIP (Stappers)     Source only
5.3‡         GPU     NIP (Stappers)     Source only
3.5‡         FPGA    NIP (Stappers)     Source only

‡Calculated assuming 1 day = 5.7 hours.

Table 8: Line of Code (LOC) production rate estimates

AWImager example:

Type             Code†   Comment†   Total
Header           1995    716        4177
Implementation   5093    5076       14292
Test Routines    782     67         1088

†A code or comment line must contain an alphanumeric character, thus a single brace or so does not count as code.

Table 9: AWImager LOC analysis




It is widely accepted that purely LOC driven software costs provide an under-estimate of required

effort for any software project; however, they can be used as a starting point. We use the

development of the COBALT correlator as a guideline LOC estimate here. This recently completed

project involved re-writing existing software from the LOFAR BlueGene correlator for the new

GPU-based COBALT correlator. For comparison we also include generic numbers obtained by

PIP.NIP for different platforms, see Table 8 Line of Code (LOC) production rate estimates. Taking

as a guideline, 1 year = 1400 project hours, i.e. 27 hrs per week on average, these numbers give

a range of 20.0 - 30.2 LOC/day; 94.5 - 151.2 LOC/wk ; 4900 - 7862 LOC/yr.
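The unit conversions behind these ranges can be sketched as follows, using the working-time conventions quoted in the text (5.7 hours per day, 27 hours per week, 1400 hours per year). The text does not state which Table 8 entry underlies each end of each quoted range, so the sketch simply converts every LOC/hr value; the function name is ours.

```python
HOURS_PER_DAY = 5.7    # from the Table 8 footnote
HOURS_PER_WEEK = 27    # "27 hrs per week on average"
HOURS_PER_YEAR = 1400  # "1 year = 1400 project hours"

def rates(loc_per_hour):
    """Convert an hourly LOC production rate into daily, weekly and yearly rates."""
    return {
        "day": loc_per_hour * HOURS_PER_DAY,
        "week": loc_per_hour * HOURS_PER_WEEK,
        "year": loc_per_hour * HOURS_PER_YEAR,
    }

for loc_per_hour in (3.5, 5.3, 5.6, 7.0):
    r = rates(loc_per_hour)
    print(f"{loc_per_hour} LOC/hr -> {r['day']:.1f} LOC/day, "
          f"{r['week']:.1f} LOC/wk, {r['year']:.0f} LOC/yr")
```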

Online sources, such as this discussion on the popular developer website stackoverflow [RD20]

suggest that 10 LOC/day is a typical number taken over the full duration of a project, but that the

average LOC per day over the actual development period of the project is significantly higher.

This suggests that the LOC/hr numbers here are substantially higher than full duration numbers,

but are consistent with development period numbers for waterfall style projects where

development typically comprises only 20-30% of the total project time.

For AWImager, the source alone comprises 5093 LOC, see Table 9. This would suggest that

re-writing AWImager would take approximately 0.65-1.04 FTE years. Including inline documentation

and header files would increase this figure to approximately 2.5 FTE years and including the

BEAM calculation would add a further 0.5 FTE. This results in approx. 3 FTE years in total. We

note that test routines are not considered here.
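The 0.65-1.04 FTE year range follows from dividing the AWImager source line count by the two ends of the yearly production rate quoted earlier (4900 - 7862 LOC/yr). A minimal sketch, with names of our own choosing:

```python
AWIMAGER_SOURCE_LOC = 5093  # implementation code lines, Table 9
LOC_PER_YEAR_LOW = 4900     # slower end of the quoted yearly rate
LOC_PER_YEAR_HIGH = 7862    # faster end of the quoted yearly rate

def fte_years(loc, loc_per_year):
    """Effort in FTE years to write `loc` lines at a given yearly rate."""
    return loc / loc_per_year

fastest = fte_years(AWIMAGER_SOURCE_LOC, LOC_PER_YEAR_HIGH)
slowest = fte_years(AWIMAGER_SOURCE_LOC, LOC_PER_YEAR_LOW)
print(f"{fastest:.2f} - {slowest:.2f} FTE years")
```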

AWImager relies heavily on routines from the CASA package. The components of CASA for

calibration and imaging comprise approx. 100 kLOC in total. It is difficult to divide these cleanly

between the two, but scaling from AWImager, it would require approx. 20 FTE years to

reproduce and twice this to include inline documentation. Making the rough assumption that the

CASA LOCs are divided 2:1 between IMG:CAL, this implies that the supporting routines from CASA

would take a further 24 FTE years to reproduce (without double counting for routines common

to CASA & AWImager). This code would then need to be assembled into the individual pipelines

(continuum, spectral, slow). Experience from existing instruments suggests that assembling a

pipeline from existing code requires 1.5-2.0 FTE years of effort. Here we assume a representative

value for the 3 pipelines of 5 FTE years.
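Parts of this estimate can be checked numerically. The sketch below is illustrative: it shows that 100 kLOC at the slower quoted rate of 4900 LOC/yr is consistent with the "approx. 20 FTE years" figure, and it adds the two quoted components of the generic imaging effort. It does not attempt to re-derive the 24 FTE year figure, since the text does not spell out those steps.

```python
CASA_CAL_IMG_LOC = 100_000  # approx. size of the CASA calibration and imaging components
LOC_PER_YEAR_LOW = 4900     # slower end of the quoted yearly rate (our choice of rate)

casa_source_fte_years = CASA_CAL_IMG_LOC / LOC_PER_YEAR_LOW
casa_with_docs_fte_years = 2 * casa_source_fte_years  # "twice this" for inline documentation

# Components quoted directly in the text (FTE years).
supporting_routines_fte_years = 24
pipeline_assembly_fte_years = 5
generic_total_fte_years = supporting_routines_fte_years + pipeline_assembly_fte_years

print(f"CASA source only: {casa_source_fte_years:.1f} FTE years")
print(f"CASA incl. documentation: {casa_with_docs_fte_years:.1f} FTE years")
print(f"Generic imaging software: {generic_total_fte_years} FTE years")
```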

Following from the ASKAPsoft software development model we assume a 33% overhead for

instrument specific optimisation and scaling of individual software components.

We note that the numbers for development of the generic software components can be justified

almost completely (29 FTE years rounded to 30 FTE years) on LOC considerations. This suggests

that a certain amount of risk is incorporated here, as such numbers possess an intrinsically large

uncertainty (+/- 10% in the COBALT numbers alone) as well as being expected to represent a

minimum amount of required effort.




Note: The Fast Imaging Pipeline (Slow Transients) differs from the Continuum Imaging Pipeline in

the sense that FFT costs dominate the processing budget. However in principle, there are no

additional components in the pipeline compared to the other imaging pipelines. In terms of

processing speed it may be necessary to implement a different form of FFT (sparse FFTs on GPU

are looking good at the moment; 2 memos in preparation) and a different form of source

detection (since sFFT routines only output significant components, not images). Additional costs

may be required at the LMC Control level, but these should not be substantial.

Ingest Pipeline Software

It is likely that the SKA Ingest Pipeline is more complex than the LOFAR equivalent and that

additional effort is needed for further investigation and optimisation. On the other hand we also

have working software suites like CASA, and the ASKAP / Meerkat / LOFAR software. As a rough

estimate it is therefore likely that the Ingest Pipeline software will take as much effort as the

equivalent LOFAR effort, 10 FTE.

Calibration Pipeline Software

It is likely that the Calibration Pipeline is more complex than the LOFAR equivalent and that

additional effort is needed for further investigation and optimisation. On the other hand we also

have working software suites like CASA, and the ASKAP / Meerkat / LOFAR software. As a rough

estimate it is therefore likely that the Calibration Pipeline software will take as much effort as the

equivalent LOFAR effort, 30 FTE.

The Calibration Pipeline effort is then further split up into 15 FTE for general Calibration

development and 5 FTE specific effort for each of the telescopes: LOW, MID, and SURVEY.

Image Space Search Engine

This FTE effort is associated with the source detection & characterisation required as part of the

calibration loop in order to produce the global sky model. It is based on the s/w expense derived

from the MSSS survey on LOFAR and the global sky model work for MWA, both of which are of order

0.5 FTE per annum, totalling 2.5 FTE over the 5 year period.

Algorithmic Software

This effort is to ensure that software components are reused rather than redeveloped

independently across different portions of SDP. Such components may originate from external

libraries, for example FFTs, Linear Algebra, etc. or from development activities within the SDP

Library. It aims to ensure that optimal components are used at all times throughout SDP, to

reduce development costs and particularly maintenance costs (including porting). Obviously, this

activity interacts closely with a number of other software development activities in PIP (and,

possibly, other portions of SDP) and the amount of resources required would reflect the location

of the boundary with other activities. It is currently envisaged that 1 FTE per annum would be

required, but portions of the activities may be subsumed within other parts of SDP.




Sky Model Use and Creation

The LOFAR Global Sky Model database took approximately 3 FTE to develop (rough estimate).

We estimate that the SKA GSM will be more complex due to the fact that a larger variety of source

models need to be supported (more science cases and different source models as compared to

LOFAR). Therefore, the SKA GSM development effort is estimated to be 5 FTE.

9 DATA DELIVERY PLATFORM

9.1 TIERED DATA TRANSFER SERVICE

This is the service that will allow data to be moved out of SDP sites to Regional Centre sites and

between all sites to allow for data backup and data recovery. We are currently evaluating

different tools that could be used for parts of this service, such as FTS and NGAS. While there will

be significant use of COTS and existing Open Source tools, there will be additional development

work in this package to create a data transfer scheduling environment appropriate for the SKA

Use Cases. This will include interfaces necessary to allow for the integration of the scheduler

interface with other management services within the SDP, as well as present information to LMC

such that the health of the system is known and monitored. Due to this additional development

work we have assigned more time from an Eng:Sr in the design work and additional time from an

Eng:Int in the implementation.

9.2 USER PORTAL

This provides the hosting environment for all of the DELIV tools. This could be built by adapting

an existing web portal, such as CyberSKA, or by creating a new system using one of the

community web portal frameworks (Elgg, Drupal, etc.) depending on the state of the art at the

time of implementation. While the implementation of this can largely be performed by an Eng:Jr,

it will be important for them to have sufficient supervision. Also the integration of the Data

Delivery service will require significant design thought and implementation oversight. The design

and creation of specific user interfaces for different types of users is also costed.

9.3 DATA DISCOVERY SERVICE

This is used to find data with particular attributes. We expect this will build on an existing IVOA

services implementation, such as the one developed by CADC. For this we see roughly equal

amounts of time for design from Arch:Sr and Eng:Sr; 0.2 FTE of each. Following the design

delivered at the end of the first year the implementers will then have to build the service. For the

implementation we need 0.2 of an Eng:Sr and 1.1 FTE of an Eng:Jr (for 2 years). This is because

much of the work should be able to be performed by an Eng:Jr with some supervision. In addition

to the services there is a need for data converters and interfaces into the object metadata

provided by the DATA package. For this whole activity we are costing an additional 0.2 Arch:Sr




and 0.3 Eng:Sr for design and 0.25 Eng:Sr and 0.75 Eng:Jr for the implementation. Note that we

are currently planning for this to use a commercial quality database management system that is

an additional software cost. Commercial databases are costed separately in the Cost Model.
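The allocations quoted for the Data Discovery Service can be tallied per role as in the sketch below. This is illustrative only: we read the second set of figures as applying to the data converters and interfaces, and the values are summed as plain FTE fractions without multiplying by the duration of each phase.

```python
from collections import defaultdict

# (activity, phase, role, FTE) as quoted in the text.
allocations = [
    ("service", "design", "Arch:Sr", 0.2),
    ("service", "design", "Eng:Sr", 0.2),
    ("service", "implementation", "Eng:Sr", 0.2),
    ("service", "implementation", "Eng:Jr", 1.1),
    ("converters", "design", "Arch:Sr", 0.2),
    ("converters", "design", "Eng:Sr", 0.3),
    ("converters", "implementation", "Eng:Sr", 0.25),
    ("converters", "implementation", "Eng:Jr", 0.75),
]

totals = defaultdict(float)
for _activity, _phase, role, fte in allocations:
    totals[role] += fte

# Round to suppress floating point noise in the sums.
fte_by_role = {role: round(fte, 2) for role, fte in totals.items()}
print(fte_by_role)
```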

9.4 DATA VISUALISATION SERVICE

This allows remote visualisation of data that is stored at an SDP site or at a Regional Centre. This

could build on an existing astronomical remote visualisation system such as the CyberSKA viewer,

or could combine other desktop tools with a remote desktop system. We believe this will need

similar design effort to the Data Discovery Service. The implementation of the core visualisation

tools needs additional work from an Eng:Sr, so we have assigned 0.5 FTE of an Eng:Sr and an additional 0.75

FTE of an Eng:Jr to perform the more routine parts of the work.

9.5 REGIONAL CENTRE INTERFACE

We are planning around performing one year of work to design the various services and

interfaces, followed by two years to implement them. For the design of all of these we need part

of the time of a senior architect and part of the time of a senior engineer. For the development

work we need part of the time of a senior engineer, plus some time from an intermediate

engineer and/or some time from a junior engineer. There will be key requirements around

scalability, and as such it is necessary that the implementation team is led by someone with

experience of delivering reliable, high impact software services, and who is also

able to adapt and bring new ideas into the implementation process when

changes in approach are needed.

10 LOCAL MONITORING AND CONTROL

LMC is scheduled to be developed in two distinct epochs. The first will span 2 years and use the

majority of the development effort. This is due to the fundamental role the LMC plays in the SDP,

particularly from an integration point of view. By having LMC in place early on, testing and

validation of other components is greatly simplified. The remainder of the 5 year construction

period will see a reduced LMC team in place to carry forward the existing work and assist with

integration with other components.

The estimate of the work is based in part on the experiences gained in similar software for the

MeerKAT radio telescope. This is an area that has received considerable attention within the

MeerKAT software team, and thus many of the components (particularly control, monitoring and

logging) have well understood design spaces and costing.




10.1 LOCAL TELESCOPE MODEL

A local representation of various telescope parameters used internally by SDP for processing. This

will include items such as static configuration, sensor data and programmatic models required as

a service by SDP components. Existing designs cover much of the scope of this effort.

10.2 DATA FLOW MANAGER (LMC)

The highest risk item in the LMC, this is responsible for the generation of logical graphs from the

data model and pipeline descriptions, and then the instantiation of a physical model based on

available compute resources at run time. This physical model is then handed over to the Data

layer for execution. The graphs are potentially huge (hundreds of millions of nodes), and this

approach has not been tested in the radio astronomy community before, hence the technological

risks are high.

10.3 QA MONITORING

Aggregating metrics provided by internal SDP components that relate to the scientific

performance of the telescope. This also involves preparation of these for a variety of end users,

including statistical analysis to enable easy visualisation. Minor modifications to existing designs.

10.4 USER INTERFACES

There are potentially a number of SDP user interfaces, including debugging, commissioning,

pipeline, and QA. At this stage the allocation of this work is relatively uncertain, but the risk factor

is low and existing designs could be used without massive modification.

10.5 MASTER CONTROLLER AND ERROR HANDLING

The master controller provides a single point of contact for all TM communication. It handles

control commands, and forwards these to relevant SDP components. A major component of this

effort relates to error handling, particularly the way in which errors are reported to TM so as to

allow operators to make schedule decisions in the presence of errors. This component of the

master controller is likely to be based on existing technology, but will probably be a new design.

10.6 EVENT MONITORING AND LOGGING

Handles collation and reporting of internal health monitoring data, and provides a framework to

action events and other alarms from these. Also provides a distributed logging framework for use

by SDP components. This is a low risk item with existing codebases that can be used.

10.7 DATA FLOW MODELS

The generation of Data Flow Models includes several steps. The first of these is the construction of the Logical Data Graphs, which encode the functional steps needed to achieve a particular scientific capability. These will be developed as part of the design phase, and largely reside within the scope of the pipeline tasks. To convert a Logical Data Graph (LDG) to a Physical Data Graph (PDG) requires interaction with the COMP and DATA sub-elements to determine resource availability, which is then used by the LMC Data Flow Manager to map the LDG to a PDG. In addition to the resource information, LMC also requires benchmarking of each component to determine resource usage and estimated runtimes. This benchmarking will be done by COMP as particular components are developed by the pipeline teams.
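To make the role of benchmarking concrete, the toy sketch below assigns logical tasks to compute resources using their estimated runtimes. It is not the SDP design: it uses a simple greedy longest-task-first assignment of independent tasks, ignores the dependencies that a real Logical Data Graph would carry, and all task names, node names and runtimes are hypothetical.

```python
import heapq

def map_logical_to_physical(task_runtimes, node_names):
    """Greedily place each task on the currently least-loaded node.

    task_runtimes: dict of task name -> benchmarked runtime estimate.
    node_names: list of available compute resources.
    Returns (placement dict, makespan). Task dependencies are ignored.
    """
    heap = [(0.0, name) for name in node_names]  # (accumulated load, node)
    heapq.heapify(heap)
    placement = {}
    # Longest tasks first; ties are broken by task name for determinism.
    ordered = sorted(task_runtimes.items(), key=lambda kv: (-kv[1], kv[0]))
    for task, runtime in ordered:
        load, node = heapq.heappop(heap)
        placement[task] = node
        heapq.heappush(heap, (load + runtime, node))
    makespan = max(load for load, _ in heap)
    return placement, makespan

# Hypothetical benchmark results (arbitrary time units) and resources.
benchmarks = {"grid": 4.0, "fft": 3.0, "deconvolve": 2.0, "calibrate": 1.0}
placement, makespan = map_logical_to_physical(benchmarks, ["island-0", "island-1"])
print(placement, makespan)
```

Even in this simplified form, the quality of the mapping depends directly on the runtime estimates, which is why per-component benchmarking is an input to the mapping step.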

10.8 TASK MANAGEMENT AND CONTROL

A single intermediate project manager is required to handle the day to day aspects of running a

distributed software development team.

11 SYSTEM LEVEL TASKS

The system level integration activities are explicitly enumerated here. Although additional effort

has been added to each individual element to allow for integration (and AIV support), these

resources represent the dedicated, telescope level team to handle SDP integration on a per site

basis.

Integration and QA are the primary FTE components. The integration team of 4 full time members

is sized sufficiently to allow dedicated on-site support during periods of integration, and to assist

the AIV efforts. This team should be put together as early as possible, to allow their capability to

grow and mature with the development effort. This will ensure that expert assistance is always

available during integration efforts.

Likewise the QA team will have a single intermediate level resource, and junior members

dedicated to each telescope. This will allow telescope specific quality assurance and testing runs

to be produced.

On the architecture level, provision is made for a single senior architect in the software and

hardware aspects of the SDP. This project level resource will provide architectural analysis and

support to the elements during development and testing. Given that the software effort is likely

to be more distributed than the hardware, a junior resource to shadow the software architect

and provide continuity across time zones, is included.




12 SDP EARLY OPERATIONS (NOT CHANGED SINCE M7)

12.1 EARLY OPERATIONS COSTS SCOPE

The cost calculation for support during early operations includes the support and maintenance

costs that are needed during the Construction Phase over and above the teams involved with the

build and deployment of the infrastructure and development and rollout of the software. It

covers the support needed for the functioning and operation of systems deployed over the

construction term.

12.2 GENERAL PRINCIPLES APPLIED

● The costs for resources are based on 2014 terms and no provision has been made

for adjustment due to inflation over the term.

● The cost for spares and maintenance are based on the assumed hardware

procurement costs taking Moore’s law into account.

● It is assumed that the first deployment will commence in month 6, the second

deployment in month 30 and the final deployment in month 54 of the construction phase.

● Support and maintenance costs are included from the month following the

commencement of the respective deployments.

● The cost of maintenance may be adjusted pending a full analysis of the

maintenance requirements.
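The deployment schedule above determines how many months of support each deployment attracts during construction. The sketch below assumes a 60 month construction phase, based on the 5 year construction period mentioned elsewhere in this document; the function name is ours.

```python
CONSTRUCTION_MONTHS = 60  # assumption: 5 year construction phase

# Month in which each deployment commences, as stated in the text.
deployment_start_month = {"first": 6, "second": 30, "final": 54}

def support_months(start_month, total_months=CONSTRUCTION_MONTHS):
    """Months of support, counted from the month following the deployment start."""
    first_supported_month = start_month + 1
    return max(0, total_months - first_supported_month + 1)

supported = {name: support_months(m) for name, m in deployment_start_month.items()}
print(supported)
```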

12.3 HARDWARE COMPUTE PLATFORM

Compute Island

The hardware support of the Compute Islands is based on a combination of the industry best

practice for the support of Linux server hardware in a virtual environment up to the operating

system level at a service level of 95% availability and experience of current deployments for other

systems such as ASKAP and LOFAR.

The support ratio was adjusted to accommodate the requirement for 24x7 support at the two

sites with standby provided by a senior system engineer to ensure the availability of the system.

A minimum of 2 resources (50:50 split between senior and junior) is required per site to ensure

continuity in support.

The support ratio for the initial two deployments is based on 100 compute islands per support

engineer with a ratio of 1 senior system engineer for every 6 junior server support engineers.

The support ratio for the final deployment was adjusted to 50 compute islands per support

engineer with a ratio of 1 senior system engineer for every 6 junior server support engineers. The




adjustment is required due to the volumes deployed and to cater for the availability requirement

of 95%. A minimum of 1 Senior System engineer is required per site at all times.
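One possible reading of these support ratios is sketched below. The interpretation is our assumption: the number of support engineers is the island count divided by the islands-per-engineer ratio, one in every seven engineers is senior (1 senior for every 6 juniors), and each site has at least one senior and at least two engineers in total. The island counts in the example are hypothetical.

```python
import math

def site_support_staff(compute_islands, islands_per_engineer,
                       juniors_per_senior=6, min_seniors=1, min_total=2):
    """Estimate senior and junior support engineers for one site."""
    engineers = math.ceil(compute_islands / islands_per_engineer)
    seniors = max(min_seniors, math.ceil(engineers / (juniors_per_senior + 1)))
    # Keep at least `min_total` engineers on site for continuity of support.
    juniors = max(engineers - seniors, min_total - seniors, 0)
    return {"senior": seniors, "junior": juniors}

# Hypothetical site with 700 compute islands.
print(site_support_staff(700, islands_per_engineer=100))  # initial two deployments
print(site_support_staff(700, islands_per_engineer=50))   # final deployment
```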

Provision was also made for travelling costs during early operations at 6 trips of between 3 and

7 days per trip for 4 resources for the first two deployments and 2 trips of between 3 and 7 days

for 4 resources for the final deployment.

Provision was made for the procurement of hardware maintenance from the hardware

manufacturer of the compute islands for the initial deployment at 15% of the purchase price of

the compute islands. The percentage is based on the availability requirement for the compute

islands (the higher the availability the higher the maintenance costs).

Due to the volumes and the requirement for the system to be available for the processing of data

with minimal interruption, it was decided to replace the hardware maintenance with onsite

spares from the second deployment onwards. Provision was therefore made for onsite spares at

10% of the purchase price of the compute islands. The onsite server support engineers will

perform the repair of the hardware using the onsite spares provided.
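The switch from vendor maintenance to onsite spares can be expressed as a simple rule per deployment. The sketch below is illustrative; the purchase price used in the example is hypothetical and the function name is ours.

```python
VENDOR_MAINTENANCE_RATE = 0.15  # initial deployment: maintenance from the manufacturer
ONSITE_SPARES_RATE = 0.10       # second deployment onwards: onsite spares

def compute_island_provision_eur(purchase_price_eur, deployment_number):
    """Provision for a deployment (1, 2 or 3) as a fraction of the purchase price."""
    if deployment_number == 1:
        rate = VENDOR_MAINTENANCE_RATE
    else:
        rate = ONSITE_SPARES_RATE
    return purchase_price_eur * rate

# Hypothetical purchase price of 1,000,000 EUR per deployment.
for n in (1, 2, 3):
    print(n, compute_island_provision_eur(1_000_000, n))
```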

Buffer

The support of the buffer storage is based on industry best practice for the support of SSD type

storage, taking into account the volumes and availability requirement. It consists of the following

resource combination:

0.4 Senior storage engineer per 1000PB

0.5 Intermediate storage engineer per 1000PB

3 Junior storage engineers per 1000PB

The support of the buffer storage will only commence with the final deployment.

Provision was also made for 2 trips of between 3 and 7 days for 1 resource for the final

deployment.
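The buffer support ratios can be scaled to a given buffer capacity as sketched below. Linear scaling with capacity is our assumption, and the capacity in the example is hypothetical.

```python
# Support engineers per 1000 PB of buffer storage, as listed in the text.
FTE_PER_1000_PB = {"senior": 0.4, "intermediate": 0.5, "junior": 3.0}

def buffer_support_fte(buffer_capacity_pb):
    """Scale the per-1000 PB ratios linearly to the given capacity (assumption)."""
    scale = buffer_capacity_pb / 1000.0
    return {role: fte * scale for role, fte in FTE_PER_1000_PB.items()}

staff = buffer_support_fte(2000)  # hypothetical 2000 PB buffer
print(staff, sum(staff.values()))
```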

SDP Infrastructure

Provision was made for maintenance to be procured from the hardware supplier at 15% of the

procurement cost of the server racks to cater for any failures. This was applied for all three

deployments.

Hierarchical Storage

Provision was made for support of the archive storage at €325.20 per PB for the initial

deployment, €307.37 per PB for the second deployment and €287.76 per PB for the final

deployment.

The support costs were calculated taking into account a failure rate percentage of 4%.





7 days per trip for 1 resource for the first two deployments and 2 trips of between 3 and 7 days


This below was originally under the heading ‘buffer archive’

Provision was made for support of the archive storage at €4,442.40 per PB for the initial

deployment, €3,173.14 per PB for the second deployment and €1,776.96 per PB for the final

deployment.



7 days per trip for 1 resource for the first two deployments and 2 trips of between 3 and 7 days


Interconnect System

(was Low Latency Core Switches)

Provision was made for the procurement of hardware maintenance at 25% of the procurement

cost of the switches from the first deployment onwards. This is based on the standard

maintenance percentage for network equipment deployments.

No provision was made for additional spares.

Compute node OS development

No provision was made during early operations for additional maintenance of the developed

software as the development team catered for in the construction cost calculation is assumed to

be sufficient to deal with any operational support requirement during construction.

Hardware support

Hardware maintenance

Archive storage


procurement cost of the archive storage to cater for any failures. This was applied for all three

deployments.

Archive buffer storage


procurement cost of the archive buffer storage to cater for any failures. This was applied for all

three deployments.




Archive media

Provision was made for the storage media needed for the archive storage and archive buffer

based on the volumes derived from the performance use case for the three deployments.

The costs associated with the media are as follows:

First deployment: €14,444.44 per PB media

Second deployment: €10,317.46 per PB media

Final deployment: €5,777.78 per PB media
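As an illustration only, the per-PB media rates above can be applied to a deployment volume as in the Python sketch below. The rates are those quoted in the text; the function name and the example volume are hypothetical, since the actual volumes come from the performance use case and are not given here.

```python
# Per-PB media rates quoted in the text (EUR per PB of media).
MEDIA_RATE_EUR_PER_PB = {
    "first": 14_444.44,
    "second": 10_317.46,
    "final": 5_777.78,
}


def media_cost_eur(deployment: str, volume_pb: float) -> float:
    """Media cost for one deployment: quoted rate times media volume in PB."""
    return MEDIA_RATE_EUR_PER_PB[deployment] * volume_pb


# Hypothetical volume, for illustration only.
example_volume_pb = 10.0
for name in MEDIA_RATE_EUR_PER_PB:
    print(name, round(media_cost_eur(name, example_volume_pb), 2))
```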

Archival network core switches





Interconnect system

12.3.13.1 Hardware support

12.3.13.2 All network equipment (incl. Low Latency core switches, Archive Switches and Data Transport network)

Provision was made for 1 senior network support engineer per site for all three deployments.

The senior network support engineer is to oversee knowledge transfer to the server support

engineers so that they can assist with the support of the network switches.


7 days per trip for 2 resources for the first two deployments and 2 trips of between 3 and 7 days


12.3.13.3 Data Transport network








Non-Domain Software

12.3.14.1 Maintenance of developed software




12.3.14.2 COTS software maintenance

Provision was made for software maintenance on the following COTS items at 25% of the

procurement cost:

· System Operating Systems

· Data Transfer OTS

· Database software

· Cloud Management Framework

· Scheduler to drive data transfers according to priority and policy

No maintenance provision was made for the Oracle software as the initial procurement costs

include maintenance for a period of 5 years (construction phase).

12.3.14.3 Archive HSM licenses

Provision was made for renewing licenses and replacing obsolete items at 33% of the original

procurement costs of the HSM licenses.

Domain Software

12.3.15.1 Maintenance of developed software




System Level tasks (SDP overall)

No provision was made during early maintenance, as the team to oversee the tasks on a holistic

level for SDP is provided for as part of the construction costs.




13 SDP OPERATIONS COSTS (NO CHANGE SINCE M7)

The costs for support during operations are based on the support required for the full

deployment for a period of 12 months after the conclusion of the construction phase (currently

assumed to be 1 January 2023 to 31 December 2023).

13.1 GENERAL PRINCIPLES APPLIED

The costs for resources are based on 2014 terms and no provision has been made for adjustment

due to inflation over the term.

The costs for spares and maintenance are based on the assumed hardware procurement costs

taking Moore’s law into account.

The cost of maintenance may be adjusted pending a full analysis of the maintenance

requirements.

13.2 HARDWARE

Compute System Hardware

13.2.1.1 Hardware support

13.2.1.1.1 Compute islands (including management compute islands)

13.2.1.1.2 Estimated Costs for Operational Budget per Annum

As the majority of the power is consumed by the compute islands, the individual power for the

elements of the Compute Island is shown in the cost spreadsheet. Taking into account the

contribution from accelerators alone, the Compute Island is projected to deliver around 11 GF/W,

compared to 3.5 GF/W for the [RD21] system, which is number 2 on the Green Top500, and

1.64 GF/W for the [RD22] system, which is number 7 on the standard Top500. It is anticipated that

this is achievable by 2022, as de-clocking the cores separately from the memory block will

maintain memory bandwidth and hence efficiency.
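To put the efficiency figures above in proportion, the short Python sketch below computes the ratio of the projected 11 GF/W to the two reference systems (roughly 3.1 and 6.7 times respectively). The helper that converts a sustained compute rate into power is added for illustration, and the function names are hypothetical.

```python
PROJECTED_GF_PER_W = 11.0  # Compute Island projection quoted in the text
REFERENCE_GF_PER_W = {
    "RD21 (Green Top500 no. 2)": 3.5,
    "RD22 (Top500 no. 7)": 1.64,
}


def efficiency_gain(projected: float, reference: float) -> float:
    """How many times more GF/W the projected system delivers."""
    return projected / reference


def power_watts(sustained_gflops: float, gf_per_w: float) -> float:
    """Power drawn (W) at a given sustained rate (GF) and efficiency (GF/W)."""
    return sustained_gflops / gf_per_w


for name, ref in REFERENCE_GF_PER_W.items():
    print(f"{name}: {efficiency_gain(PROJECTED_GF_PER_W, ref):.2f}x")
```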

The hardware support of the Compute Islands is based on a combination of the industry best

practice for the support of Linux server hardware up to the operating system level at a service

level of 95% availability and experience of current deployments for other systems such as ASKAP

and LOFAR.

The support ratio was adjusted to accommodate the requirement for 24x7 support at the two

sites with standby provided by a senior system engineer to ensure the availability of the system.

A minimum of 2 resources (50:50 split between senior and junior) is required per site to ensure

continuity in support.




The support ratio for the ongoing operational support is based on 50 compute islands per support

engineer with a ratio of 1 senior system engineer for every 6 junior server support engineers. This

is required to cater for the availability requirement of 95%. A minimum of 1 Senior System

engineer is required per site at all times.
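The support ratio above can be sketched as a small staffing calculation. This is a minimal illustration under stated assumptions: the 50:1 ratio is read as counting junior server support engineers, head counts are rounded up to whole engineers, and the function name is hypothetical. None of these choices is confirmed by the text.

```python
import math

ISLANDS_PER_ENGINEER = 50   # compute islands per support engineer (from the text)
JUNIORS_PER_SENIOR = 6      # 1 senior for every 6 juniors (from the text)
MIN_SENIORS_PER_SITE = 1    # minimum senior cover per site (from the text)


def site_support_staff(compute_islands: int) -> dict:
    """Estimate junior and senior head count for one site.

    Assumption: rounding up to whole engineers; the text does not say
    whether fractional resources are allowed.
    """
    juniors = math.ceil(compute_islands / ISLANDS_PER_ENGINEER)
    seniors = max(MIN_SENIORS_PER_SITE, math.ceil(juniors / JUNIORS_PER_SENIOR))
    return {"junior": juniors, "senior": seniors}


# Hypothetical site sizes, for illustration only.
for islands in (10, 300, 350):
    print(islands, site_support_staff(islands))
```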

Provision was also made for travelling costs at 3 trips per annum of between 3 and 7 days per

trip for 4 resources.

13.2.1.1.3 Buffer storage

The support for the buffer storage is based on industry best practice for the support of SSD type

storage, taking into account the volumes and availability requirement. It consists of the following

resource combination:

0.4 Senior storage engineer per 1000PB

0.5 Intermediate storage engineer per 1000PB

3 Junior storage engineers per 1000PB
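As a rough illustration, the resource combination above can be scaled to a given buffer volume. The sketch assumes the ratios scale linearly with volume and allows fractional resources; both are assumptions, and the function name is hypothetical.

```python
# Resources per 1000 PB of buffer storage, as listed in the text.
FTE_PER_1000_PB = {"senior": 0.4, "intermediate": 0.5, "junior": 3.0}


def buffer_storage_fte(volume_pb: float) -> dict:
    """Scale the per-1000 PB resource mix linearly to the given volume."""
    scale = volume_pb / 1000.0
    return {role: fte * scale for role, fte in FTE_PER_1000_PB.items()}


# Hypothetical volume, for illustration only.
print(buffer_storage_fte(500.0))
```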


trip for 1 resource.

13.2.1.1.4 Hardware maintenance

13.2.1.1.4.1.1 Compute islands (including management compute islands)

Due to the volumes and the requirement for the system to be available for the processing of data

with minimal interruption it was decided to make provision for onsite spares. Provision was made

for onsite spares at 10% of the purchase price of the compute islands. The onsite server support

engineers will perform hardware repair using the onsite spares provided.

13.2.1.1.4.1.2 Low Latency core switches


cost of the switches. This is based on the standard maintenance percentage for network

equipment deployments.


13.2.1.1.4.2 Maintenance of developed software

13.2.1.1.4.2.1 Compute node OS development

Provision was made for the maintenance of the software developed at 40% of the original cost

of development. This is aligned to industry practice and caters for the complexity of the

environment.




13.2.1.1.4.2.2 Documentation

Provision was made for the documentation of any changes made as a result of the software

maintenance and is based on 15% of the software maintenance costs.
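The two provisions above compound: documentation is 15% of the maintenance cost, which is itself 40% of the development cost, so the two together come to 46% of the original development cost. A minimal Python sketch, with a hypothetical function name and an illustrative development cost, is given below.

```python
MAINTENANCE_FRACTION = 0.40    # of the original development cost (from the text)
DOCUMENTATION_FRACTION = 0.15  # of the software maintenance cost (from the text)


def software_support_costs(development_cost_eur: float) -> dict:
    """Maintenance and documentation provision for developed software."""
    maintenance = MAINTENANCE_FRACTION * development_cost_eur
    documentation = DOCUMENTATION_FRACTION * maintenance
    return {
        "maintenance": maintenance,
        "documentation": documentation,
        "total": maintenance + documentation,
    }


# Hypothetical development cost, for illustration only.
print(software_support_costs(1_000_000.0))
```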

13.3 INFRASTRUCTURE



procurement cost of the server racks to cater for any failures.

13.4 LONG TERM STORAGE (ARCHIVE)

Hardware support

13.4.1.1 Archive storage

Provision was made for support of the archive storage at €287.76 per PB.




13.4.1.2 Archive buffer storage

Provision was made for support of the archive buffer storage at €1,776.96 per PB.





13.4.2.1 Archive storage


procurement cost of the archive storage to cater for any failures.

13.4.2.2 Archive buffer storage


procurement cost of the archive buffer storage to cater for any failures.

13.4.2.3 Archive growth

Provision was made for 100% growth in the Archive storage and Archive Buffer per annum as per

AD05.

The costs associated with the growth are as follows:




Media € 5,777.78 per PB media
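The growth provision can be sketched as follows. Only the media rate is available in this section, so the sketch covers the media cost alone. The function name and the example volume are hypothetical, and treating 100% growth as one additional copy of the current volume per annum is an assumption.

```python
GROWTH_RATE_PER_ANNUM = 1.00        # 100% growth per annum (from the text)
MEDIA_RATE_EUR_PER_PB = 5_777.78    # media rate quoted in the text


def annual_growth_media_cost(current_volume_pb: float) -> tuple:
    """Return (added volume in PB, media cost in EUR) for one year of growth."""
    added_pb = GROWTH_RATE_PER_ANNUM * current_volume_pb
    return added_pb, added_pb * MEDIA_RATE_EUR_PER_PB


# Hypothetical current volume, for illustration only.
added, cost = annual_growth_media_cost(100.0)
print(added, round(cost, 2))
```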

13.4.2.4 Archival network core switches





13.5 INTERCONNECT SYSTEM

Hardware support

13.5.1.1 All network equipment (incl. Low Latency core switches, Archive Switches and Data Transport network)

Provision was made for 1 senior network support engineer per site.

The senior network support engineer is to oversee knowledge transfer to the server support

engineers so that they can assist with the support of the network switches.



13.5.1.2 Data Transport network





13.6 NON-DOMAIN & DOMAIN SOFTWARE

Maintenance of developed software

Provision was made for the maintenance of the software developed at 40% of the original cost

of development. This is aligned to industry practice and caters for the complexity of the

environment.



Documentation

Provision was made for the documentation of any changes made as a result of the software

maintenance and is based on 15% of the software maintenance costs.




COTS software maintenance

Provision was made for software maintenance on the following COTS items at 25% of the

procurement cost:

· System Operating Systems

· Data Transfer OTS

· Database software

· Cloud Management Framework

· Scheduler to drive data transfers according to priority and policy

A maintenance provision of 22.5% per annum was made for the Oracle software.

Archive HSM licenses

Provision was made for renewing licenses and replacing obsolete items at 33% of the

original procurement costs of the HSM licenses.
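The percentage-based provisions for purchased software can be collected in one small calculation, sketched below. The 25% and 33% rates apply to procurement cost as stated in the text; applying the 22.5% Oracle rate to procurement cost as well is an assumption, since the text does not state its base. The function name and the example costs are hypothetical.

```python
# Provision rates as fractions of procurement cost.
MAINTENANCE_RATE = {
    "cots": 0.25,          # COTS software items (from the text)
    "oracle": 0.225,       # Oracle software; base cost is an assumption
    "hsm_licenses": 0.33,  # archive HSM licenses (from the text)
}


def maintenance_provision(procurement_cost_eur: dict) -> dict:
    """Apply each item's rate to its procurement cost."""
    return {
        item: MAINTENANCE_RATE[item] * cost
        for item, cost in procurement_cost_eur.items()
    }


# Hypothetical procurement costs, for illustration only.
print(maintenance_provision({"cots": 1000.0, "oracle": 1000.0, "hsm_licenses": 1000.0}))
```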

System Level tasks (SDP overall)

Provision was made at 40% of the original cost during the construction phase to ensure

continuity and alignment on an ongoing basis.



PDR07-01CostsBasisofEstimatev2-2 final-1 (1)

EchoSign Document History

February 10, 2015

Created: February 09, 2015

By: Verity Allan ([email protected])

Status: SIGNED

Transaction ID: XJEUFNC54XI3C4V

“PDR07-01CostsBasisofEstimate v2-2 final-1 (1)” History

Document created by Verity Allan ([email protected])

February 09, 2015 - 6:15 PM GMT - IP address: 131.111.185.15

Document emailed to Ferdl Graser ([email protected]) for signature

February 09, 2015 - 6:15 PM GMT

Document viewed by Ferdl Graser ([email protected])

February 10, 2015 - 5:21 AM GMT - IP address: 105.184.40.35

Document e-signed by Ferdl Graser ([email protected])

Signature Date: February 10, 2015 - 5:22 AM GMT - Time Source: server - IP address: 105.184.40.35

Document emailed to Paul Alexander ([email protected]) for signature

February 10, 2015 - 5:22 AM GMT

Document viewed by Paul Alexander ([email protected])

February 10, 2015 - 8:59 AM GMT - IP address: 131.111.185.15

Document e-signed by Paul Alexander ([email protected])

Signature Date: February 10, 2015 - 8:59 AM GMT - Time Source: server - IP address: 131.111.185.15

Signed document emailed to Paul Alexander ([email protected]), Ferdl Graser ([email protected]) and Verity Allan ([email protected])

February 10, 2015 - 8:59 AM GMT
