
Future Asset Management Architecture SAMBA WP4 report


Executive summary

This report defines and describes the existing and future architecture for asset management in Statnett. The TOGAF methodology has been used for assessing, analyzing and documenting the architecture. The architecture is described using multiple layers and viewpoints of the ArchiMate 3.0 modelling language, including the strategy and motivation, application and technology layers.

The report builds on the results and conclusions of other Big Data and Analytics related projects at Statnett, in particular the Finbeck and Fia projects as well as the AutoDig 2.0 project.

The report also describes the future Big Data and Analytics platform and defines a number of capabilities that are required from such a platform. The asset management solution itself is expected to be a hybrid solution based on the Big Data and Analytics platform, combined with functionality implemented in several existing internal systems as well as new components.

Finally, the report describes several areas that are not yet addressed in the platform currently being introduced at Statnett and that need to be explored further. The most important of these are cloud integration, advanced PaaS and SaaS cloud services offering AI capabilities such as natural language comprehension, data exchange APIs and gateways towards third parties, and improved infrastructure for ingestion of sensor data.


Contents

Abbreviations
1 Introduction
1.1 Underlying idea of the SAMBA-project
2 Methodology
3 Big Data and Analytics technology
3.1 Main On-premise Big Data distributions
3.2 Other on premise solutions
3.3 Cloud solutions
3.4 Other solutions
4 Current solution – BASELINE ARCHITECTURE
5 Big Data lake – TRANSITION ARCHITECTURE
5.1 AutoDig 2.0 project
5.2 ArcGIS environment
6 Reference Architecture – TARGET ARCHITECTURE
6.1 Overall Reference Architecture
6.2 Strategy and motivation layers
6.3 Capabilities and information needs
6.4 Business architecture
6.5 Overall Strategy and Motivation layer
6.6 Application Architecture
6.7 Capability to application component mapping
6.8 Technology Architecture
6.9 Technology to application component mapping
6.10 Governance Principles
6.11 Principles for Big Data and Analytics platform
6.12 APIs for ingestion and integration
7 Concluding remarks
8 References
V1 EA Diagrams
8.1 Strategy and Motivation


Abbreviations

AWS Amazon Web Services – cloud platform from Amazon

ADM Architecture Development Method

APM Asset Performance Management

BPMN Business process model and notation

CEN European Committee for Standardization

CENELEC European Committee for Electrotechnical Standardization

CIM Common Information Model

COTS Commercial Off The Shelf

CPU Central Processing Unit (processor)

DL Deep Learning

DSO Distribution System Operator

ETSI European Telecommunications Standards Institute

EA Sparx Enterprise Architect

ETL Extract, Transform, Load

GCP Google Cloud Platform

GPU Graphics Processing Unit

HDF Hortonworks Dataflow

HDFS Hadoop Distributed File System

HDP Hortonworks Data Platform

MapRFS MapR Filesystem

ML Machine Learning

NIST National Institute of Standards and Technology

PaaS Platform as a Service

SaaS Software as a Service

SGAM Smart Grids Architecture Model

TOGAF The Open Group Architecture Framework

TSO Transmission System Operator


1 Introduction

The main objective of WP 4 is to design and develop a reference ICT architecture that utilizes a common integration environment and the “common data models” developed in WP3. The architecture must facilitate openness, security and safety, in addition to big data analytics and business intelligence based on rule-based filtering techniques. Open interfaces, a message bus and message queuing, and standardized models and protocols are important.
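
The combination of message queuing and rule-based filtering mentioned above can be illustrated with a minimal sketch. It is not taken from the SAMBA architecture: the `Measurement` type, the quantity name `oil_temperature_c`, the 90 degree threshold and the use of Python's in-process `queue.Queue` as a stand-in for a message bus are all assumptions made for illustration.

```python
import queue
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Measurement:
    """Hypothetical sensor message; field names are illustrative only."""
    asset_id: str
    quantity: str
    value: float


# A rule is any predicate over a message.
Rule = Callable[[Measurement], bool]


def drain_and_filter(inbox: "queue.Queue[Measurement]", rules: List[Rule]) -> List[Measurement]:
    """Read all queued messages and keep those that satisfy every rule."""
    kept = []
    while True:
        try:
            message = inbox.get_nowait()
        except queue.Empty:
            break
        if all(rule(message) for rule in rules):
            kept.append(message)
    return kept


# Assumed rules: only oil temperature readings above an assumed limit pass.
rules = [
    lambda m: m.quantity == "oil_temperature_c",
    lambda m: m.value > 90.0,
]

inbox = queue.Queue()
for m in [
    Measurement("T1", "oil_temperature_c", 72.5),
    Measurement("T1", "oil_temperature_c", 95.1),
    Measurement("T2", "load_current_a", 410.0),
    Measurement("T2", "oil_temperature_c", 101.3),
]:
    inbox.put(m)

alerts = drain_and_filter(inbox, rules)
for alert in alerts:
    print(alert.asset_id, alert.value)
```

Keeping the rules as plain predicates means that new filters can be added without changing the code that reads from the queue, which is one way to keep the interface open.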

In particular, WP 4 provides:

- An overall description of different stakeholders, drivers, outcomes and tactics to address the needs related to Big Data and asset management at Statnett
- Available data sources and their requirements for the data harvesting services and suggestions for improvements with regard to data ingestion
- Specification of critical capabilities of the future asset management system and how the different use cases identified in WP 2 and WP 3 map to these capabilities
- Comparison of the reference architecture with industry standards and international suppliers as well as different best practice implementations

This report provides an overview of the results of WP 4 and describes the ICT architecture that could in the future support the needs for data collection and analysis, primarily within the asset management domain in Statnett. In particular, this report describes the assessment and analysis of the architecture with input from the WP 1 report [1], the use cases from the WP 2-WP 3 report [2] and the conclusions from the WP 6 report [3].

The report is organized as follows. Chapter two contains information about the methodology for assessing, analyzing, developing and documenting the architecture. Chapter three gives an overview of the technology landscape in the Big Data and Analytics field.

Chapter four provides information about the baseline architecture. Chapter five describes the transitional architecture, which is being implemented as a part of the AutoDig 2.0 project.

Chapter six describes the reference model as defined by the Finbeck project as well as the implications for asset management and the SAMBA project. Chapters seven and eight contain concluding remarks and references, respectively. The appendix includes full-size versions of the diagrams of the architecture models described in this report.

1.1 Underlying idea of the SAMBA-project

Asset management in Statnett can be improved by utilizing new developments in ICT, such as big data

technology, data fusion and business intelligence. The underlying idea of the project is to use these

generic ICT-developments together with existing domain research results (such as models on ageing

and lifetime of power system components) to establish a reference architecture for data collection,

communication and handling. This can optimize maintenance and reinvestments through facilitating a

more efficient analysis of incipient failures, ageing mechanisms and remaining lifetime of power

system components.

The amounts of sensor data related to asset management can be overwhelming. Big Data and Analytics technology is an important prerequisite for gathering, processing, distributing and visualising this data, and for providing open access so that the data can be integrated and reused in other systems.


2 Methodology

TOGAF (The Open Group Architecture Framework) has been used as the main framework and process for creating, assessing and documenting the architecture in the SAMBA project. TOGAF is a framework for enterprise architecture that provides an approach for designing, planning, implementing, and governing an enterprise information technology architecture [4]. In particular, the following phases of the Architecture Development Method (ADM) have been used:

- Preliminary
- A. Architecture Vision
- B. Business Architecture
- C. Information Systems Architectures
- D. Technology Architecture

Figure 1 TOGAF – Architecture Development Method

The remaining phases (E-H) are not relevant for projects like SAMBA, which focus on research and do not directly intend to implement the architecture.

For documentation purposes, ArchiMate 3.0 has been used. ArchiMate defines three main layers:

Business, Application and Technology [5]:


- Business layer describes business processes, services, functions and events. It describes the products and services offered to the customers and users
- Application layer describes application services and components
- Technology layer describes hardware, communication infrastructure and system software

These three layers provide a structured way of bridging the different perspectives from business to technology and infrastructure. The full ArchiMate 3.0 model also introduces or enhances another three very useful layers:

- Strategy and Motivation layer – introduced in 2016 in ArchiMate 3.0 for modeling the capabilities of an organization and helping to explain the impact of changes on the business (gives a better connection between strategic and tactical planning)
- Implementation and Migration layer – supports modeling related to project, portfolio or program management
- Physical layer – for modeling physical assets like factories

In SAMBA, models have primarily been created for the Business, Application and Technology layers, in addition to the Strategy and Motivation layer.

Moreover, as explained in the WP1 report [1], the complex research challenges in this project are specific to the transition towards the smart grid. The SAMBA project uses the Smart Grids Architecture Model (SGAM, see Figure 2) from the CEN-CENELEC-ETSI Smart Grid Coordination Group to describe the project's central R&D challenges and scientific methods [6].

The SGAM framework consists of five interoperability layers representing business objectives and

processes, functions, information exchange and models, communication protocols and components.

Interoperability is an important issue and research challenge in smart grids.

The component layer covers the physical infrastructure; electrical components, sensors, networks,

routers, computers and so on that form the basis for any form of communication and information

gathering. Gaining an overview of this layer will be a starting point for the project. In the

communication layer, different protocols are used to send and receive data between components.

However, just enabling better communication does not guarantee that useful information is

exchanged.


Figure 2 SGAM framework

The information layer describes the data models and information objects included in use cases, so that the information can be interpreted correctly when testing use cases. A data model using open standards (i.e. CIM) is an important prerequisite for the SAMBA project.

In the function layer in Figure 2, functions and services are represented as use cases independent of their physical realization in systems and components. This layer ensures that the right information enters the right process and reaches the right actor. This represents a large research challenge, as information must enter the asset management of Statnett, a high-level process in any company.
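
As a small illustration of the five SGAM interoperability layers described above, the sketch below models them as an enumeration and assigns a few elements named in the text to their layers. The names `SgamLayer`, `EXAMPLES` and `elements_in` are hypothetical and not part of SGAM or of the SAMBA models.

```python
from enum import Enum


class SgamLayer(Enum):
    """The five SGAM interoperability layers, as summarized in the text."""
    BUSINESS = "business objectives and processes"
    FUNCTION = "functions"
    INFORMATION = "information exchange and models"
    COMMUNICATION = "communication protocols"
    COMPONENT = "components"


# Illustrative assignment of elements mentioned in the text to layers.
EXAMPLES = {
    "sensor": SgamLayer.COMPONENT,
    "router": SgamLayer.COMPONENT,
    "CIM data model": SgamLayer.INFORMATION,
    "use case": SgamLayer.FUNCTION,
}


def elements_in(layer: SgamLayer) -> list:
    """Return the example elements assigned to the given layer, sorted by name."""
    return sorted(name for name, assigned in EXAMPLES.items() if assigned is layer)


print(elements_in(SgamLayer.COMPONENT))
```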


3 Big Data and Analytics technology

This chapter describes the different Big Data technologies and architectures, both on-premise and in the cloud, that have been assessed in the course of the project. This includes the technology already selected and acquired as the basis for the initial Big Data and Analytics solution in Statnett.

The overall Big Data technology landscape is extensive, spanning infrastructure, storage, analytics, data sources and APIs, tools and applications. It is summarized in the overview by Matt Turck [7] (see Figure 3).

Figure 3 Big Data Landscape

3.1 Main On-premise Big Data distributions

Several Big Data technology suppliers developed their own software suites for Big Data containing

Hadoop and other components. These software suites are called Hadoop distributions. These

distributions package multiple tools / technologies into a technology stack ready for customers to use.

Suppliers often offer technical support as well as a comprehensive product with several

complementary tools that can be customized for specific tasks.

Hortonworks Data Platform

Hortonworks was established in 2011 and is the only distribution that uses pure Apache Hadoop

without any proprietary tools and components. Hortonworks Data Platform is also the only pure


Open Source project of all three distributions. Hortonworks is now also an integral part of the IBM

BigInsights [8].

In 2016 Hortonworks created a separate product line for processing streaming data. Hortonworks Dataflow (HDF) is optimized to ingest, curate and handle data in flow and contains several additional tools to facilitate that, e.g. NiFi, MiNiFi and Schema Registry [9].

Cloudera

Cloudera, established in 2008, was one of the first Hadoop distributions. Cloudera is based to a large extent on Open Source components, but not as much as Hortonworks. Cloudera is easier to install and use than Hortonworks. The most important difference from Hortonworks is the proprietary management stack [8].

MapR

MapR does not use the HDFS file system, but replaces it with the proprietary MapRFS, because MapRFS gives better robustness and redundancy and is considerably simpler to use. MapR is most likely the on-premise distribution that offers the best performance, redundancy and user friendliness. MapR also improves the performance of other components, including HBase (called MapR DB). MapR also offers extensive documentation, courses and other materials [8].

3.2 Other on premise solutions

Oracle Cloudera

Oracle Cloudera is a joint solution from Oracle/Cloudera. Oracle based their Big Data platform on a

Cloudera distribution. This distribution offers some additional and useful tools and solutions that give

increased performance, in particular Oracle Big Data Appliance, Oracle Big Data Discovery, Oracle

NoSQL database and Oracle R Enterprise.

Oracle Big Data appliance is an integrated HW and SW Big Data solution running on a platform based

on Engineered Systems (like ExaData). Oracle adds Big Data Discovery visualization tools on top

of Cloudera/Hadoop while Oracle R Enterprise includes R – an open source, advanced statistical

analysis tool [8].

IBM BigInsights

IBM BigInsights for Apache Hadoop is a solution from IBM that also builds on top of Hadoop. In addition to Hadoop, BigInsights offers some proprietary tools for analysis, such as BigSQL, BigSheets and BigInsights Data Scientist, which includes BigR.

IBM BigInsights for Hadoop also offers BigInsights Enterprise Management solution and IBM Spectrum

Scale-FPO file system as an alternative to HDFS [8].

SAP HANA and Vora

SAP HANA is an in-memory, column-oriented, relational database management system developed and

marketed by SAP SE. Its primary function as a database server is to store and retrieve data as requested

by the applications. In addition, it performs advanced analytics (predictive analytics, spatial data

processing, text analytics, text search, streaming analytics, graph data processing) and includes ETL

capabilities as well as an application server [10].

SAP HANA Vora is an in-memory computing engine designed to make big data from Hadoop more

accessible and usable for enterprises. SAP developed Vora out of SAP HANA as a way to address specific

business cases involving big data. Hadoop offers lower-cost storage for vast amounts of data, but

adoption initially lagged in the enterprise because the data in a data lake is unstructured and can be


hard to deal with. SAP HANA Vora builds structured data hierarchies for the Hadoop data and

integrates it with data from HANA to enable OLAP-style in-memory analysis on the combined data

through an Apache Spark structured query language (SQL) interface [11].

OSIsoft PI

OSIsoft PI is a suite of software products that are used for data collection, historicizing, finding,

analyzing, delivering, and visualizing. It is marketed as an enterprise infrastructure for management of

real-time data and events. The term PI System is often used to refer to the PI Server but the two are

not the same. The PI System refers to all OSIsoft software products whereas the PI Server is the core

product of the PI System [12].

The following table gives a quick overview of main on-premise Hadoop distributions and their features

[8] [13].

Table 1 Comparison of most important Hadoop distributions (based on: “Hadoop buyers guide”) [8] [13]

Category Feature Hortonworks Cloudera MapR

Data access

SQL Hive Impala MapR-DB

Hive

Impala

Drill

SparkSQL

NoSQL HBase

Accumulo

Phoenix

HBase HBase

Scripting Pig Pig Pig

Batch MapReduce Spark

Hive

MapReduce

Spark

Pig

MapReduce

Search Solr Solr Solr

Graph/ML

GraphX

MLib

Mahout

RDBMS

Kudu MySQL

File system access Limited, not

standard NFS

Limited, not

standard NFS

HDFS, read/write NFS

(Posix)

Authentication Kerberos Kerberos Kerberos and native

Streaming Storm Spark Storm

Spark

MapR-Streams

Ingestion Ingestion Sqoop

Flume

Kafka

Sqoop

Flume

Kafka

Sqoop

Flume

Operations Scheduling Oozie

Oozie


Category Feature Hortonworks Cloudera MapR

Data lifecycle Falcon

Atlas

Cloudera Navigator

Resource

management

YARN YARN

Coordination ZooKeeper

ZooKeeper

Sahara

Myriad

Security Security

Sentry

Record Service

Sentry

Record Service

Category | Feature | Hortonworks | Cloudera | MapR
Performance | Data ingestion | Batch | Batch | Batch and streaming (write)
 | Metadata architecture | Centralized | Centralized | Distributed
Redundancy | HA | Survives single fault | Survives single fault | Survives multiple faults (self-healing)
 | MapReduce HA | Restart of jobs | Restart of jobs | Continuous without restart
 | Upgrades | With planned downtime | Rolling upgrades | Rolling upgrades
 | Replication | Data only | Data only | Data and metadata
 | Snapshots | Consistent for closed files | Consistent for closed files | Consistent for all files and tables
 | Disaster recovery | None | Scheduled file copy | Data mirroring
Management | Tools | Ambari, Cloudbreak | Cloudera Manager | MapR Control System
 | Heat map, alarms | Supported | Supported | Supported
 | ReST API | Supported | Supported | Supported
 | Data and job placement | None | None | Yes

3.3 Cloud solutions

IBM Cloud – Watson Data Platform

IBM provides a comprehensive cloud-based data platform, Watson Data Platform, for data ingestion, data storage and analytics [14].

Amazon EMR

Amazon EMR (Elastic MapReduce) is a Hadoop distribution put together by Amazon and running in the Amazon cloud. Amazon EMR is easier to start using than on-premise Hadoop. Amazon is clearly the biggest cloud provider, but when it comes to Big Data its solution is relatively new compared to Google [8].

Microsoft Azure

Microsoft offers three different cloud solutions based on Azure: Hadoop based HDInsights, HDP for

Windows and Microsoft Analytics Platform System.

Google Cloud Platform

Google also offers Big Data cloud services. The most popular services in GCP (Google Cloud Platform) are BigQuery (an SQL-like database), Cloud Dataflow (a processing framework) and Cloud Dataproc (Spark and Hadoop services). Google has been working on Big Data technologies for a long time, which gives it a good starting point when it comes to advanced Big Data tools. GCP offers analysis and visualization tools as well as an advanced platform for testing solutions (known as Cloud Datalab) [8].

The following table gives a quick overview of the main cloud-based Hadoop distributions and their features [8] [13].

Table 2 Comparison of most important Big Data cloud solutions [8] [13]

Category | Feature | Amazon Web Services | Azure (HDInsights) | IBM Cloud Watson Data Platform | Google Cloud Platform

Data access

File system

storage

Hadoop Cloud Object

Storage

Cloud Storage

NoSQL HBase HBase Cloudant Cloud Bigtable

SQL Hive

Hue

Presto

Hive DB2 on Cloud BigQuery

Cloud SQL

RDBMS Phoenix Compose Cloud SQL

Batch Pig

Spark

Map Reduce

Pig

Spark

Cloud Dataflow

Streaming Spark Storm

Spark

Streaming

Analytics

Google Cloud

Pub/Sub

Script

Pig

Search

Solr

Ingestion Ingestion Sqoop Streaming

Analytics

Cloud Dataflow

Visualization Visualization

Data Science

Experience

CloudData lab


Category | Feature | Amazon Web Services | Azure (HDInsights) | IBM Cloud Watson Data Platform | Google Cloud Platform

Analytics Machine

Learning

Mahout R Server

Azure Machine

Learning

Streaming

Analytics

DSX

Analytics Engine

Google Cloud

Machine Learning

Speech API

Natural Language

API

Translate API

Vision API

Operations

Logging

Logging

Error reporting

Trace

Coordination ZooKeeper

Scheduling Oozie

Resource

Management

HCatalog

Tez

Cloud Console

Cloud Resource

Manager

Monitoring Ganglia

Monitoring

3.4 Other solutions

Predix

Predix is General Electric's software platform for the collection and analysis of data from industrial

machines. General Electric plans to support the growing industrial IoT with cloud servers and app store.

Predix as a cloud-based PaaS (Platform as a Service) is claimed to enable industrial-scale analytics for

asset performance management (APM) and operations optimization by providing a standard way to

connect machines, data, and people. Predix provides a microservices based delivery model with a

distributed architecture (cloud, and on premise) [15].


Figure 4 GE Predix platform

Insights Foundation for Energy

IBM® Insights Foundation for Energy is an energy analytics, data management and visualization

software solution for utility and energy companies. It provides a single energy analytics platform to

support various analytic applications. This includes situational awareness visualizing patterns,

predicting actions and connecting data points to derive insights, predictive maintenance using

historical data to determine asset repair or replacement, and asset health and risk analytics to measure

asset status and assess risk and consequences in near real-time. It is available through IBM software-

as-a-service (SaaS) subscription services or as an on premise solution [16].

IFE (Insights Foundation for Energy) creates operational insights based on energy analytics to optimize

business outcomes, provides a single energy analytics platform that can expand over time to meet

evolving analytics needs and unifies systems and business processes for more innovative, effective

business procedures [16].


Figure 5 IBM IFE

ABB Asset Health Center

ABB also offers an energy analytics, data management and visualization software solution based on

the Azure cloud platform and Cortana. ABB Asset Health Center uses predictive and prescriptive

analytics, as well as customized models incorporating industry expertise, to identify and prioritize

emerging maintenance needs based on probability of failure and asset criticality. ABB Asset Health

Center offers ingestion of asset and sensor data in the Azure BLOB Storage as well as Azure SQL

Database, Azure Machine Learning and Power BI visualization [17].
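
Prioritizing maintenance needs by probability of failure and asset criticality can be sketched as a simple risk ranking. The product of the two factors used below is a common textbook simplification and an assumption here; the source does not describe how ABB Asset Health Center actually combines them. All asset names and numbers are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AssetRisk:
    """Hypothetical per-asset inputs to a risk ranking."""
    asset_id: str
    failure_probability: float  # assumed scale: 0.0 to 1.0
    criticality: float          # assumed scale: 1 (low) to 5 (high)

    @property
    def risk_score(self) -> float:
        # Assumed scoring rule: risk = probability of failure x criticality.
        return self.failure_probability * self.criticality


def prioritize(assets: List[AssetRisk]) -> List[AssetRisk]:
    """Return assets ordered from highest to lowest risk score."""
    return sorted(assets, key=lambda a: a.risk_score, reverse=True)


fleet = [
    AssetRisk("transformer-A", 0.10, 5.0),
    AssetRisk("breaker-B", 0.40, 2.0),
    AssetRisk("line-C", 0.05, 3.0),
]

ranked = prioritize(fleet)
for asset in ranked:
    print(f"{asset.asset_id}: {asset.risk_score:.2f}")
```

With these invented numbers, the breaker ranks above the transformer even though the transformer is more critical, because its assumed probability of failure is four times higher.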

Cognite

Cognite is a Norwegian company specializing in customized Big Data, Analytics and IoT solutions mainly

for the Energy sector (offshore), in particular Aker BP and Kværner. It is based on several components

from the Google Cloud platform.

Kongsberg Digital

Kongsberg Digital has also built a similar platform for the energy sector (offshore). The platform from Kongsberg Digital is based on the Microsoft Azure cloud platform.


4 Current solution – BASELINE ARCHITECTURE

The current status of asset management has already been documented in SAMBA through the reports from WP1 [1], WP2 and WP3 [2]. The most important findings are summarized here.

Statnett’s ICT support for asset management has been developed over time, in the form of different information systems, and often based on a per-need approach. See Figure 6 for the asset management ICT landscape.

Figure 6 Asset management system landscape

Table 3 describes most important components of AS-IS architecture for asset management.

Table 3 Most important components in AS-IS architecture for asset management

Component | Layer | Comment
AutoDig | Visualization, Data Store | Fault analysis tool which collects and presents data from various sensors and systems in an efficient way
Innsikt / HIS web | Visualization | Visualization / analytics platform at Statnett
DDK-GUI | Visualization | Visualization of asset data from various sources in a tabular way. Front-end to SYSBAS.
ArcGIS | Visualization, Data Store | Map visualization tool at Statnett
IFS | Visualization, Data Store | ERP system
TPV-T/TPV-T | Visualization | Total planning tool with visualization of all stations and switchgear
TKP | Visualization | Project module for overview of activities
BiCycle | Visualization | A maintenance DWH solution serving as visualization and planning tool for asset management
FASIT | Visualization, Data Store | System for handling the fault reports from Statnett and Norwegian DSOs
SYSBAS | Data Store | Data hub for asset data from various sources
FOS common | Data Store | Landing area for asset data provided by the Norwegian DSOs
Innsikt DWH | Data Store | Storage part of Innsikt DWH

[Figure 6 labels. Tiers: Analysis / visualisation; Data store / hub; Data sources and sensors. Components shown: Innsikt / HIS web, BiCycle, DDK-GUI, Anleggsguiden, TPV-T/P, FOS web, TKP, PDC (PMU), DFR, Spider/EMS, Relays, Lightning, Power quality meters, OIS, Innsikt/DWH, SYSBAS, ArcGIS, IFS, FOS common, AutoDIG, FASIT]

The most important systems for asset management are IFS, SYSBAS and FOS. In addition, there are a couple of fault analysis systems, AutoDig and FASIT, which are also important for asset management. Innsikt, as a common analysis platform, naturally plays an important role for asset management. HIS web stores historical data.

Among all the systems, it is important to mention TPV, TKP and BiCycle. The TPV-T ("Total Planning Tool") database is practically a "mirror" of IFS, showing data for all Statnett stations, with switchgear at all voltage levels and all components/equipment with technical data and age. It holds equivalent data for overhead lines and cables. The tool generates proposals for "equipment replacement measures" based on the age of the different types of components. TPV-P (project module) gives an overview of activities and is used to group activities together manually. BiCycle, on the other hand, is a specialized analytics solution for RCM (Reliability Centered Maintenance).
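
The age-based proposal logic described for TPV-T can be sketched as follows. This is only an illustration under stated assumptions: the component types, the expected lifetimes in `EXPECTED_LIFETIME_YEARS` and the rule "propose replacement when age reaches the expected lifetime" are hypothetical and are not taken from the tool itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical expected lifetimes per component type, in years.
EXPECTED_LIFETIME_YEARS: Dict[str, int] = {
    "transformer": 50,
    "circuit_breaker": 40,
    "disconnector": 40,
}


@dataclass
class Component:
    component_id: str
    component_type: str
    commissioned_year: int


def propose_replacements(
    components: List[Component],
    current_year: int,
    lifetimes: Dict[str, int] = EXPECTED_LIFETIME_YEARS,
) -> List[Tuple[str, int, int]]:
    """Return (component_id, age, expected_lifetime) for components at or past their lifetime."""
    proposals = []
    for component in components:
        limit = lifetimes.get(component.component_type)
        if limit is None:
            # No lifetime is known for this type, so no proposal is made.
            continue
        age = current_year - component.commissioned_year
        if age >= limit:
            proposals.append((component.component_id, age, limit))
    return proposals


station = [
    Component("T1", "transformer", 1965),
    Component("CB7", "circuit_breaker", 1995),
    Component("D3", "disconnector", 1975),
    Component("X9", "unknown_type", 1950),
]

proposals = propose_replacements(station, current_year=2018)
for component_id, age, limit in proposals:
    print(f"{component_id}: age {age} years, expected lifetime {limit} years")
```

A purely age-based rule like this ignores actual condition, which is one reason why the report looks towards sensor data and analytics as a complement.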

Statnett's ERP system, IFS, is the kernel of the asset database and asset management functionality. However, most of the analyses are performed in a series of additional tools which, combined, solve most current user needs. The analyses are fragmented and mostly have different logic for data collection and storage. The current architecture is not a good basis for growth. The largest data storage is a traditional data warehouse with a BI tool on top.

Statnett has still not realized the possibilities that big data concepts can provide. The main reasons for this are:

- Data is not easily available for access, integration and sharing, and is often locked in proprietary systems
- There is no common data store / data hub which makes it possible to access and assemble data from various sources
- There is no uniform way of collecting data from the sensors, and the distribution systems for collecting these data are often unreliable, not monitored and not properly maintained. Data is often of poor quality, delayed or missing
- Organizational silos make it difficult and time consuming to integrate the systems
- Use of obsolete integration paradigms (i.e. SOA – Service Oriented Architecture), which mandate exchange of data and restrict sharing of data
- The current analytics platform (i.e. Innsikt) does not provide sufficient capacity and performance to implement the asset management use cases efficiently
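
One of the reasons listed above is that sensor data is often delayed or missing. A minimal sketch of how such gaps could be detected in a time series is shown below. The function name `find_gaps`, the one-second expected interval and the tolerance factor of 1.5 are assumptions for illustration and do not describe any existing Statnett system.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def find_gaps(
    timestamps: List[datetime],
    expected_interval: timedelta,
    tolerance: float = 1.5,
) -> List[Tuple[datetime, datetime]]:
    """Return (previous, next) timestamp pairs that are further apart than expected.

    A gap is reported when the spacing between two consecutive readings
    exceeds expected_interval multiplied by the tolerance factor.
    """
    ordered = sorted(timestamps)
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > expected_interval * tolerance:
            gaps.append((previous, current))
    return gaps


start = datetime(2018, 1, 1, 0, 0, 0)
# Readings expected every second; the readings at 3 s and 4 s are missing.
readings = [start + timedelta(seconds=s) for s in (0, 1, 2, 5, 6)]

gaps = find_gaps(readings, expected_interval=timedelta(seconds=1))
for previous, current in gaps:
    print(f"gap of {(current - previous).total_seconds():.0f} s after {previous.isoformat()}")
```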


5 Big Data lake – TRANSITION ARCHITECTURE

This chapter describes the Big Data and Analytics platforms at Statnett: the platform to be developed within the AutoDig 2.0 project and the ArcGIS environment.

The AutoDig 2.0 project is an important first step on the road to implement the future Big Data and

Analytics platform at Statnett. The platform will be further developed in another project, the Finbeck

project, which defines the long term roadmap and high level reference architecture of the Big Data,

Analytics, IoT and adjacent areas. The Finbeck project will extend the architecture and the platform

with additional software components and features as suggested in the long-term strategy for Big Data

and Analytics.

5.1 AutoDig 2.0 project

Statnett is in the process of implementing a platform for Big Data and Analytics as part of the AutoDig 2.0 project. In this project we test a solution that will be crucial for the future development of the asset management platform, supporting the needs described in the SAMBA use cases and beyond.

AutoDig is a system for acquisition, sorting, presentation and analysis of information regarding power system disturbances [18]. The software in use today is a prototype developed within an R&D project. The prototype has been successfully taken into use, but there is a need to develop an improved, more stable and efficient tool to help perform this analysis work. Statnett has initiated a project that will deliver a new and improved operational solution in close integration with Statnett's ICT infrastructure.

The AutoDig 2.0 system will gather, store and analyze large amounts of data collected from multiple

sources and sensors in the network (See Table 4 and Figure 7).

Table 4 AutoDig 2.0 data sources

Data source | Description
PMU data | Multiple time series (1 kHz sampling)
DFR (Digital Fault Recorder) | Time series
Power Quality Measurements / Elspec | Several time series containing aggregated parameters (50 Hz sampling) and raw data time series (50 kHz sampling)
Power Quality Measurements / Metrum | Several time series containing aggregated parameters
Distance Relay Protection | Comtrade
Operation and Maintenance Database | Events, breaker positions, network configuration and operational measurements (P, Q, I, U, f)
Network Repository | Power grid model
Time variable data / met.no | Weather and lightning data
ERP | Asset data
Operation Management Support system | Operation and fault reports
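To give a feel for the volumes implied by the sampling rates in Table 4, the following minimal sketch computes the number of samples one channel produces per day. The storage size of 8 bytes per sample is an assumption made for illustration only; it is not taken from the AutoDig 2.0 specification, and real storage would normally be compressed.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Sampling rates in Hz as listed in Table 4.
SAMPLING_RATES_HZ = {
    "PMU": 1_000,
    "Elspec aggregated": 50,
    "Elspec raw": 50_000,
}

# Assumption: 8 bytes per stored sample (one 64-bit float), uncompressed.
BYTES_PER_SAMPLE = 8


def samples_per_day(rate_hz: int) -> int:
    """Number of samples one channel produces in 24 hours."""
    return rate_hz * SECONDS_PER_DAY


def gigabytes_per_day(rate_hz: int) -> float:
    """Uncompressed size of one channel-day in gigabytes (10**9 bytes)."""
    return samples_per_day(rate_hz) * BYTES_PER_SAMPLE / 1e9


for source, rate in SAMPLING_RATES_HZ.items():
    print(f"{source}: {samples_per_day(rate):,} samples/day, "
          f"{gigabytes_per_day(rate):.2f} GB/day per channel")
```

Under these assumptions a single raw 50 kHz channel yields about 4.3 billion samples per day, which illustrates why long-term retention of raw data is a significant design concern.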

The AutoDig 2.0 solution that Statnett is implementing is based on the use of a Big Data lake / Data

Lake architecture (see Figure 7) which includes the following elements:

- Data collected from multiple sensors and data sources, after initial processing (ELT¹ / ingestion). This data is stored in a Big Data lake
- All relevant data is stored in the Big Data lake in a structured format, either in CIM (Common Information Model) format or as time series
- Data stored in the Big Data lake must be available for reuse in new / future applications and solutions at Statnett

The solution consists of analysis and visualization components, as detailed in Table 5.

Figure 7 AutoDig 2.0 incl. Big Data lake

1 ELT – Extract Load Transform

In AutoDig 2.0, collected data will be retained for a long time and be available for use in future analyses. Statnett aims to be able to retain the raw data for up to 10 years and processed/aggregated data for at least 60 years or the lifetime of the assets.

For the analyses performed with the help of AutoDig 2.0, it is crucial that the time elapsed from the moment the data becomes available, through ingestion, until the results are presented is kept as low as possible (preferably below one minute).
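The latency target above could be monitored with a check along the following lines. This is only a sketch: the function names and the choice of timestamps are assumptions for illustration and do not describe how AutoDig 2.0 measures latency.

```python
from datetime import datetime, timedelta

# Target from the text: results presented preferably within one minute.
LATENCY_BUDGET = timedelta(minutes=1)


def end_to_end_latency(available_at: datetime, presented_at: datetime) -> timedelta:
    """Time from data becoming available at the source to the result being shown."""
    if presented_at < available_at:
        raise ValueError("presented_at must not precede available_at")
    return presented_at - available_at


def within_budget(available_at: datetime, presented_at: datetime) -> bool:
    """True when the end-to-end latency does not exceed the budget."""
    return end_to_end_latency(available_at, presented_at) <= LATENCY_BUDGET


t0 = datetime(2018, 1, 1, 12, 0, 0)
print(within_budget(t0, t0 + timedelta(seconds=42)))  # within the one-minute budget
print(within_budget(t0, t0 + timedelta(seconds=75)))  # exceeds the budget
```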

Table 5 AutoDig 2.0 application components

Component | Description
AutoDig Dashboard | Tailored web app providing a consolidated and configurable work surface and integrating data visualization components, analysis components (Advanced Analytics and Self-service Analytics) as well as visualization of data stored in the Big Data lake. AutoDig Dashboard will also be used to configure and select a set of triggers/criteria for performing analysis as well as to perform analyses when these criteria are satisfied
Advanced Analytics | COTS² component for visualization and self-service analytics
AutoDig Analysis Engine | Component that will analyze collected data. Currently implemented in MATLAB along with a number of MATLAB algorithms
AutoDig AI/Rule Engine | Component that detects patterns in data (model-based analyses as well as machine learning and pattern recognition)
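The trigger mechanism described for the AutoDig Dashboard (configure criteria, then run an analysis when a criterion is satisfied) can be illustrated with a small sketch. The `Trigger` class, the measurement dictionary and the 0.1 Hz frequency threshold are hypothetical and chosen only for illustration; they are not part of the AutoDig 2.0 design.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Measurement = Dict[str, float]


@dataclass
class Trigger:
    """A named criterion plus the analysis to run when it is satisfied."""
    name: str
    criterion: Callable[[Measurement], bool]
    analysis: Callable[[Measurement], str]


def run_triggers(triggers: List[Trigger], measurement: Measurement) -> List[str]:
    """Evaluate every trigger and return the results of the analyses that fired."""
    return [t.analysis(measurement) for t in triggers if t.criterion(measurement)]


# Hypothetical criterion: frequency deviates more than 0.1 Hz from 50 Hz.
frequency_trigger = Trigger(
    name="frequency deviation",
    criterion=lambda m: abs(m["f"] - 50.0) > 0.1,
    analysis=lambda m: f"frequency deviation analysis at f={m['f']} Hz",
)

print(run_triggers([frequency_trigger], {"f": 49.8}))
print(run_triggers([frequency_trigger], {"f": 50.02}))
```

Keeping the criterion and the analysis as separate callables means new triggers can be configured without changing the evaluation loop.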

Figure 8 presents high-level design of the Big Data lake platform and its components.

The Big Data lake will expose APIs for data access and data ingestion. The storage and processing infrastructure will primarily support structured storage of time series (used to store sensor data) and measurements, as well as structured storage of files used for storing raw data. The Big Data lake will support real time processing, batch processing and analytics functions.

The initial Big Data and Analytics platform currently being introduced at Statnett consists of the

following main software components:

- IBM BigInsights and Hortonworks: the acquired platform consists of the IBM BigInsights component; however, as IBM is in the process of restructuring it, in practice the platform will consist of Hortonworks 2.6 as the main component
- IBM BigSQL
- IBM Streams
- Tableau Server and Desktop
- IBM SPSS Modeler
- IBM BigR

These software components are described in the subchapters below and also presented on Figure 25.

2 COTS – Commercial Off The Shelf

Figure 8 High-level design of the Big Data Lake platform

[Figure 8 diagram labels. Main blocks: Information sources; Big Data Lake (API, storage and processing infrastructure); Information consumers. Storage and processing: structured time series, structured file storage, CIM objects, batch processing, real-time processing, analytics. Systems and data shown: PMU, Digital Fault Recorder, SCADA, Distance Relay Protection, Power quality, Oscillation registration, Lightning, Met.no, Video, Asset data, ERP, GIS, FOSWeb, Fault Management, Operation Management Support, Market and Settlement systems, AutoDig, Data Science Tools]

Hortonworks Data Platform

Hortonworks Data Platform (HDP) is a scalable open source Hadoop distribution and platform for

storing, processing and analyzing large amounts of data [19]. See also Chapter 3.1 for more details

about on premise Hadoop distributions.

Figure 9 Hortonworks platform [19]

IBM BigSQL

IBM BigSQL is an SQL layer on top of Hadoop/HDFS, which makes it possible to create tables and query data using SQL syntax. The SQL query engine supports joins, unions, grouping, common table expressions, windowing functions, and other familiar SQL expressions.

Depending on the nature of the query, the data volumes, and other factors, Big SQL can use Hadoop's MapReduce framework to process various query tasks in parallel or execute the query locally within the Big SQL server on a single node. [20]
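The kind of query such an SQL layer is meant to support can be illustrated with a small, self-contained sketch. Note the assumptions: the example runs against an in-memory SQLite database from the Python standard library as a stand-in, not against Big SQL or Hadoop, and the table and column names are hypothetical. It shows a common table expression combined with a join and grouping.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for an SQL-on-Hadoop layer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE asset (asset_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE measurement (asset_id INTEGER, value REAL);
    INSERT INTO asset VALUES (1, 'Transformer A'), (2, 'Breaker B');
    INSERT INTO measurement VALUES (1, 10.0), (1, 14.0), (2, 3.0);
""")

# A common table expression, a join and a grouping in one query.
QUERY = """
    WITH per_asset AS (
        SELECT asset_id, AVG(value) AS mean_value, COUNT(*) AS n
        FROM measurement
        GROUP BY asset_id
    )
    SELECT a.name, p.mean_value, p.n
    FROM asset AS a
    JOIN per_asset AS p ON p.asset_id = a.asset_id
    ORDER BY a.name
"""

rows = conn.execute(QUERY).fetchall()
for name, mean_value, n in rows:
    print(name, mean_value, n)
```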

Figure 10 IBM BigSQL

IBM Streams

IBM Streams is an advanced computing platform from IBM that allows user-developed applications to ingest, analyze, and correlate information as it arrives from real-time sources. The solution can handle very high data throughput rates, up to millions of events or messages per second. [21]

Tableau Server and Desktop

Tableau is an advanced, high-performance visualization tool. It is an industry-leading BI tool that focuses on data visualization, dashboarding and data discovery [22].

IBM SPSS Modeler

SPSS³ is a statistical tool from IBM used for both batch and non-batch statistical analysis [23].

IBM SPSS Modeler is a part of the SPSS suite, which provides a set of data mining tools to develop

predictive models using business expertise and deploy them into operations to improve decision-

making. IBM SPSS Modeler supports a variety of modeling methods taken from machine learning,

artificial intelligence, and statistics. [24]

5.2 ArcGIS environment

The ArcGIS environment at Statnett is also a Big Data and Analytics platform, with the following main components:

- The GeoAnalytics Server

- The GeoEvent Server

- Image Server

3 SPSS was originally named Statistical Package for Social Sciences

- Insights for ArcGIS

GeoAnalytics Server and GeoEvent Server are a powerful combination.

Statnett uses GeoEvent as a development and production environment to streamline and analyze

lightning data and ship data in real time. GeoEvent has, among other things, great potential for use

with real-time sensor data.

GeoAnalytics can be used in combination with scripting in Python and can use the archived data from GeoEvent, which Statnett stores in spatiotemporal big data stores, as well as data from various forms of file shares (Hadoop, AWS / Azure Cloud, etc.).

Figure 11 ArcGIS Big Data and Analytics landscape

ArcGIS GeoAnalytics Server

ArcGIS GeoAnalytics Server is designed to handle the analysis of massive datasets. GeoAnalytics tools

are a subset of Esri geoprocessing tools that use distributed and parallelized computing to run space-

time analyses on extremely large datasets. These tools can be executed using the Portal for ArcGIS

map viewer, ArcGIS Pro, the ArcGIS Server REST API, or from the new ArcGIS API for Python. ArcGIS

GeoAnalytics Server can connect to data from the Hadoop Distributed File System (HDFS), Hive, local

file shares, and data from within ArcGIS Enterprise, including using the archived spatiotemporal output

from ArcGIS GeoEvent Server as input. Because ArcGIS GeoAnalytics Server uses the base ArcGIS

Enterprise deployment to write and store analytical output, it is easy to use and share the resultant

layers and data [25] [26].

ArcGIS GeoEvent Server

ArcGIS GeoEvent Server is designed to handle high-volume, high-velocity real-time and streaming data.

It provides solutions through on-the-fly analysis and dynamic aggregation of large datasets, which

makes data visualization simple. When connected to the base ArcGIS Enterprise deployment, ArcGIS

GeoEvent Server can archive data to the spatiotemporal data store for further data analyses. [27] [28]

ArcGIS Image Server

ArcGIS Image Server supports serving, processing and analyzing massive collections of imagery, rasters, and remotely sensed data, and extracting value from them. [29] [30]

Insights for ArcGIS

Insights for ArcGIS is a web-based data analytics workbench where you can explore spatial and non-spatial data.

Insights for ArcGIS is somewhat similar to Tableau and can be used, for example, against real-time data stored in our internal Spatiotemporal Big Data Store via GeoEvent Server. The features in GeoAnalytics Server can also be used from Insights. [31] [32]

6 Reference Architecture – TARGET ARCHITECTURE

6.1 Overall Reference Architecture

The architecture for smarter asset management is aligned with the overall conceptual model for the

reference architecture at Statnett developed in the Finbeck project.

The Finbeck project has assessed several reference models defined by international institutions and third parties. The most relevant reference architecture for Statnett to adopt is the one defined by the National Institute of Standards and Technology (NIST) in 2015. The NIST reference model is a supplier-neutral, technology- and infrastructure-independent conceptual model for Big Data architecture.

Figure 12 NIST Reference Model

The most important elements of a reference model as defined by NIST are:

- System Orchestrator - ensures that the system requirements are met. This applies to business, architecture, management, policy and resource requirements. In addition, the system orchestrator must monitor the system's compliance with these requirements. The role is typically filled by one or more actors, which can be human, software, or a combination of the two
- Data Provider - the different data providers that supply data to the system. An important characteristic of a Big Data system is the ability to import and use data from a variety of sources in different formats. Examples of sources: internal and public documents, images, audio files, video, sensor data and logs. Asset management and asset health management systems are also examples of systems that can be sources of data
- Big Data Application Provider - ensures execution of the data life cycle in accordance with the security requirements and the requirements set by the system orchestrator. The life cycle of the data consists of five main activities that are relatively similar to those found in traditional data processing systems. The difference is that the data characteristics in Big Data systems (volume, velocity, variety, etc.) require a radical change in the data processing mechanisms. These must be customized and optimized to, for example, meet response time requirements in a world of ever-increasing data volumes. The five main activities in the Big Data Application Provider are Collection, Preparation/Curation, Analytics, Visualization and Access
- Big Data Framework Provider - most of the progress made in recent years has been on frameworks that scale performance even when the data sets being processed have Big Data characteristics (volume, velocity, variety, etc.)
- Data Consumer - the end user, which can be either a person or another system that consumes data. Data from the analysis and visualization activities is accessed through the service interface offered by the Big Data Application Provider. The communication can either be pull-based, where the Big Data Application Provider responds to Data Consumer requests, or push-based, where the Data Consumer listens for automated output from the Big Data Application Provider. All decision levels within asset management are examples of consumers of data.
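The difference between the pull-based and push-based communication described for the Data Consumer can be sketched as follows. The `ApplicationProvider` class and its method names are hypothetical and serve only to illustrate the two interaction styles; they are not part of the NIST model or of any Statnett interface.

```python
from typing import Callable, List


class ApplicationProvider:
    """Toy stand-in for the Big Data Application Provider's service interface."""

    def __init__(self) -> None:
        self._latest: str = "no result yet"
        self._subscribers: List[Callable[[str], None]] = []

    # Pull: the consumer asks, the provider answers.
    def get_latest(self) -> str:
        return self._latest

    # Push: the consumer registers once and is notified on every new result.
    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, result: str) -> None:
        """Store the newest result and notify all push-based consumers."""
        self._latest = result
        for callback in self._subscribers:
            callback(result)


provider = ApplicationProvider()
received: List[str] = []
provider.subscribe(received.append)      # push-based consumer

provider.publish("asset health index updated")

print(provider.get_latest())             # pull-based consumer
print(received)
```

In the pull style the consumer decides when to fetch data, whereas in the push style the provider decides when the consumer is informed, which suits alarms and other time-critical output.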

Another important framework that has been used as a basis for the architecture of the Big Data Lake in the AutoDig project is the IBM Reference Model for Big Data and Analytics, presented in Figure 13.

Figure 13 IBM Reference Model for Big Data and Analytics⁴

4 The IBM Reference Model has been created and provided to Statnett as a part of the AutoDig 2.0 project

[Figure 13 diagram labels: Data Sources (new sources, traditional sources); Data acquisition & application access; Ingestion & Integration; Analytics In-Motion; Analytical Data Lake Storage; Data Access; Discovery & Exploration; Actionable Insight; Enhanced Applications; Analytics Operating System; Information Management & Governance; Security; Platform]

IBM has divided their model into 12 different areas with an increased focus on the Analytics (Analytics

in-motion and Analytical Data Lake Storage) and the consumer side (Discovery and Exploration,

Actionable Insights and Enhanced Applications).

In the course of multiple workshops and discussions in the Finbeck project, and with input from the NIST and IBM models, Statnett has defined its own reference architecture (Figure 14). Statnett's overall reference model is divided into four main areas: data provider, Big Data and Analytics platform, data consumer, and security and governance.

The Big Data and Analytics platform consists of several high-level components, including ingestion, distribution, analysis, storage and access (Figure 14). The high-level architecture defined by Finbeck matches the NIST reference architecture except for the visualization component. Visualization components can exist both inside and outside the reference architecture. At Statnett, visualization has been defined outside the platform. In practice, a few technical software components for visualization will also be implemented as part of the Big Data and Analytics platform⁵.

Figure 14 Overall Statnett reference architecture model for Big Data & Analytics

The descriptions of components in the high-level reference architecture and their relation to asset

management are explained in the following table.

Table 6 Descriptions of components in the High Level Reference model as defined by Finbeck project

Component | Description
Data provider | Considered a component outside the reference architecture. The detailed architecture will still describe which data sources the data platform handles at any given time. Data sources can be systems and sensors. Asset management is an example of a system that can also be a data provider
Ingestion | Data can be retrieved from several different data sources and must be collected and integrated with the data platform for further handling. The components that handle this are described under the Ingestion component
Distribution | Data needs to be distributed from source to consumer using one or more distribution mechanisms. For Statnett, distribution of data consists mainly of data processing, in addition to handling historical data using various storage technologies
Analysis | The data needs to be processed in different ways, i.e. in real time (such as data streams) and in batches. Parts of the data processing will also handle data storage. The analysis component also covers the platform's ability to ensure that data supports advanced analysis such as machine learning, deep learning, etc.
Batch | Processing data in batches, i.e. handling the data periodically. Data is collected over time before it is distributed in the system. Data that does not need to be visualized or analyzed in real time will normally be handled in batches
Real time | Statnett has large amounts of data handled in real time. For these data to be distributed to more consumers, the platform must be able to handle streaming data to meet new needs and analyses. Data ingested from the data sources should be able to flow as fast as it is produced by the sensors, source systems or external parties
Storage | The data platform must contain several different storage components to ensure access to historical information, traceability and access to real time information. The data platform must handle storage such as relational databases, distributed storage, graph databases and time series. Some storage will also be handled in processing (intermediate storage of data)
Access | Data must be made available to different consumers, and the architecture must support several different ways of making the data available, consisting of API / HMI and search
API/HMI | APIs (Application Programming Interfaces) and Human Machine Interfaces (HMI) are components that make data on the platform available to persons / systems outside the data platform. This also includes APIs that ensure the exchange of data with external actors
Search | The data platform will provide fast, secure and easy access to the data you need. This requires a form of search function, or Data Catalog, containing metadata about what is stored within the architecture
Security and governance | Security and governance describes the mechanisms for access control, monitoring and safe handling of data stored in the solution, including the data exchange with external actors
Visualization | The architecture must support visualization of data and / or analyses. Applications for visualization can be seen as consumers of the data contained in the data platform. These are key applications for realizing the business needs of Statnett, and among the key consumers. Certain technical components of the visualization will still need to be provided as part of the platform
Data consumer | Consumers are stakeholders of the architecture and are described as the people or systems that will need access to data stored in the data platform. Asset management and asset health management systems are examples of systems that can be consumers of data

5 I.e. Tableau or Cognos
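The distinction between the Batch and Real time components in Table 6 can be illustrated with a minimal sketch that computes the same statistic in both styles. The function names and the sample readings are illustrative assumptions and are not taken from the Finbeck reference architecture.

```python
from typing import Iterable, Iterator, List


def batch_mean(values: List[float]) -> float:
    """Batch: collect everything first, then compute once."""
    return sum(values) / len(values)


def running_mean(stream: Iterable[float]) -> Iterator[float]:
    """Real time: update the result as each value arrives."""
    count = 0
    mean = 0.0
    for value in stream:
        count += 1
        mean += (value - mean) / count  # incremental update, no history kept
        yield mean


readings = [10.0, 12.0, 14.0]
print(batch_mean(readings))
print(list(running_mean(readings)))
```

Both approaches arrive at the same final value; the streaming variant simply makes an up-to-date result available after every reading, at the cost of maintaining state between readings.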

6.2 Strategy and motivation layers

As explained in chapter 2, ArchiMate 3.0 defines different layers to document the architecture. This chapter focuses on the Strategy and Motivation layers and sums up the requirements and expectations that a Big Data platform for SAMBA has to meet, including:

- explicit requirements from the projects
- implicit requirements gathered from various sources, e.g. the eSmart report [33] and other reports.

The following subchapters explain the link between drivers, goals, tactics and capabilities that a future

SAMBA platform will support.

Strategy layer

The big data and analytics reference model analyzed and described in the Finbeck project is based on the TOGAF methodology and described using ArchiMate. The Finbeck reference model builds on the outcome of the analysis of the strategic aspects of the architecture using the ArchiMate 3.0 Strategy layer. The strategy layer explains the impact of technology changes on the business. In our case the strategy layer also explains how the capabilities of the Big Data and Analytics platform relate to the overall strategic drivers, goals and outcomes, and how they support the expectations of the stakeholders. The strategy layer has been created based on interviews with several stakeholders in the organization and with input from earlier phases of SAMBA. Figure 15 presents one of the early SAMBA models from WP1, which shows different elements, including roles and stakeholders, that are important for asset management.

Figure 15 Elements in asset management – SAMBA model for asset management introduced in WP1

The Fia⁶ project has also provided important input to the assessment process, which has been used to identify and align different roles and stakeholders in the Strategy layer. Figure 16 presents the main segments and information categories in Statnett as defined by Fia. It is apparent that the Fia project has identified asset management as one of the most central segments.

Figure 16 Segments and information categories identified in the Fia project

[Figure 16 labels: Power grid models; Market and Settlement; Long-term planning; Asset management; Operations short-term planning; Operations Actors; Observations and measurements]

During the assessment and analysis of the strategy layer, eight main stakeholders/roles in Statnett were identified for which Big Data and Analytics is relevant:

- Grid Owner - one of the three main responsibilities that the authorities have assigned to Statnett, in which Statnett acts as the owner of the Norwegian transmission grid and the cable connections to other countries. The grid owner role is also the one where asset management plays a central part.
- Grid Development - another of the three main responsibilities of Statnett. Grid development is about planning the future grid to meet future needs, not only for Statnett but for the complete Norwegian power system.
- System Operation - the last of the three main responsibilities of Statnett. System operation is about operating the transmission system, ensuring balance in the system and ensuring fair and equal treatment of all market actors.
- NVE⁷ - the Norwegian regulator
- CFO, CIO, CEO and CISO⁸ - internal Statnett stakeholders
- External stakeholders, i.e. other DSOs, TSOs, research institutions, universities, consultants and so on, performing analysis on Statnett data or sending data to Statnett.

6 The Fia project focuses on Information Architecture at Statnett
7 NVE stands for The Norwegian Water Resources and Energy Directorate

At the lower level, there are a number of stakeholders and roles that support the main roles. However, according to the evaluation performed in the Finbeck and SAMBA projects, only a limited set of the roles and stakeholders are currently affected by the adoption of Big Data and Analytics technology. The most affected roles are fault analysis, asset management, system operation and long term planning. No direct relations have been identified for e.g. the CFO or short term planning. Nor were market and settlement identified as major users of Big Data and Analytics technology at the time this report was written.

The strategy model presented here covers all areas that require the use of Big Data and Analytics. For the purpose of the SAMBA project, the focus is mainly on asset management; however, because WP 6 [3] focuses on risk monitoring, the other roles, in particular fault analysis and system operation, are also of interest.

Figure 17 presents the strategy layer of the Big Data & Analytics reference architecture. A full size

Enterprise Architect (EA) diagram is also attached in Appendix V1.

8 CFO - Chief Financial Officer, CIO – Chief Information Officer, CEO – Chief Executive Officer, CISO – Chief Information Security Officer

Figure 17 Reference Model - Strategy Layer

[Figure 17 diagram (ArchiMate 3.0 strategy layer). Stakeholders shown: Grid Owner, Grid Development, System Operation, Asset Management, Construction, Long Term Planning, Short Term Planning, Market and Settlement, Fault Analysis Engineer, Fault Analysis Coordinator, NVE, CEO, CFO, CIO, CISO. Other labelled elements (drivers, goals and outcomes): Incident costs; Strategic Use of Modern Technology; Network balance; Grid costs; Analysis costs; Increase efficiency and safety; Optimize maintenance costs; More efficient problem management; Reduce number and consequence of incidents; Quicker fault analysis; Improved real time monitoring; Automated and autonomous inspection; Improved more dynamic visualisation; Improved inspections - remote or virtual; More efficient / quicker fault management; Optimize investment costs; Optimized condition based maintenance; Asset health based reinvestments; More frequent and accurate inspection; Increase automation of data quality control; Better data quality; Adequate security of sensitive and important data; Power Supply Reliability; Secure sensitive data; Improved configurable personalized visualisation; Sufficient security of power sensitive and personal data; Reduce reporting costs; More accurate fault analysis; Increased precision of imbalance predictions; Optimize short term imbalance; System Operation costs; Improved insight and business understanding; HMS; Increased capacity; Reduce bottlenecks; Add new customers; Improved socio-economic cost benefit analysis; Administration costs; Predictive maintenance; More efficient access to information]

The following table summarizes the drivers identified during the assessment phase and explains their relation to the stakeholders:

Table 7 Overview strategy layer – stakeholders and drivers

Stakeholder | Driver | Meaning / Rationale
Grid Owner / Fault Analysis Engineer | Grid costs / Analysis costs | The overall costs related to fault analysis / problem management
Grid Owner / Fault Analysis Engineer | Grid costs / Incident costs | The overall costs related to actual incidents
Grid Owner / Asset management; Grid Owner / Construction; Grid Development / Planning | Grid costs | The overall costs of the grid related to asset management, construction and planning. Also includes analysis costs and incident costs.
Grid Development / Planning; System Operation; System Operation / Fault Analysis Coordination; NVE | Power Supply Reliability | The reliability of the power supply as mandated by NVE and OED.
System Operation; System Operation / Market and Settlement; NVE | Network Balance | Keeping the network in balance as a system.
Grid Development / Long Term Planning | Increased Capacity | Increasing the capacity to meet future demand
CEO | Strategic Use of Modern Technology | Use of technology that will increase efficiency and safety in the future, comprising an increased level of automation, machine learning and real-time processing.
NVE | Administration costs | Optimizing costs of the administration (i.e. reporting)
CISO | Secure sensitive data | Securing data

Finally, as part of our Strategy layer model, a number of goals and outcomes related to Big Data and Analytics at Statnett have been identified. The goals and outcomes and their relation to the drivers are summarized in the following table:

Table 8 Overview strategy layer – drivers, goals and outcomes

Driver | Goal | Outcome | Meaning
Analysis costs | More efficient problem management | Improved configurable personalized visualization | Improvements in visualization support, which will be more configurable and can be adapted to individual needs
Analysis costs | More efficient problem management | Quicker fault analysis | More efficient and quicker fault analysis
Incident costs; Power Supply Reliability; Network Balance | Reduce number and consequence of the incidents | Improved real time monitoring | Reduced latency, increased data quality and better reliability are the most important examples of improvements in real time monitoring.
Incident costs; Power Supply Reliability; Network Balance | Reduce number and consequence of the incidents | More accurate fault analysis | More accurate findings in fault analysis
Grid costs; Power Supply Reliability; Network Balance | More efficient / quicker fault analysis | Improved more dynamic visualization | Visualization that dynamically shifts the focus to issues/faults in the grid
Grid costs; Power Supply Reliability; Network Balance | More efficient / quicker fault analysis | Improved real time monitoring | See above
Grid costs; Power Supply Reliability; Network Balance | More efficient / quicker fault analysis | Automated and autonomous inspection | Predefined inspections that are initiated by an operator but performed by drones and robots, as well as inspections that are initiated and performed autonomously
Grid costs; Power Supply Reliability; Network Balance | More efficient / quicker fault analysis | Improved inspection remote or virtual | Inspections that are performed by the operator remotely/virtually
Network Balance | Optimize short term imbalance | Improved more dynamic visualization | See above
Network Balance | Optimize short term imbalance | Improved real time monitoring | See above
Network Balance | Optimize short term imbalance | Increased precision of imbalance predictions | Improvement of precision of imbalance prediction down to 5 minutes
Grid costs | Optimize maintenance costs | Automated and autonomous inspection | See above
Grid costs | Optimize maintenance costs | Predictive maintenance | Predictive ML algorithms designed to help determine the condition of in-service equipment in order to predict when maintenance should be performed

Page 40: Future Asset Management Architecture - Statnett · AWS Amazon Web Services – cloud platform from Amazon ... The amounts of sensor data related to asset management can be overwhelming

40

Driver Goal Outcome Meaning

Optimized condition based

maintenance

More optimal condition based

maintenance based on analysis of

high volumes of asset data, sensor

data, both batch and real time data

More frequent and accurate

inspection

Automated inspections that can be

performed more frequently to

support more traditional condition

based maintenance and reliability

based maintenance

Improved inspection virtual

remote or virtual

See above

Improved configurable

personalized visualization

See above

Grid costs Optimize

investments costs

Asset health based

reinvestments

Investments and reinvestments

based on the actual asset health (i.e.

asset health index derived from the

sensor and asset management data).

Improved socio-economic

benefit analysis

Quicker and more accurate socio-

economic analysis due to more

performant tools and platforms

Increased capacity Reduce

bottlenecks

Improved socio-economic

benefit analysis

Improved socio-economic benefit

analysis is the most important

outcome.

Increased capacity Add new

customers

Improved socio-economic

benefit analysis

See above

Strategic use of

modern

technology

Increase

efficiency and

safety

All dependent goals The goal of increased efficiency and

safety relates to several dependent

goals (in practice all of them) and

supports the driver of strategic use

of modern technology

Grid costs

Network balance

Increase

automation of

data quality

control

Better data quality Improve data quality in all involved

systems and sensors. This includes

the improvements in the

infrastructure for collecting and

transporting the sensor data as well

as improvements in the data

consistency.

Administration

costs

Reduce reporting

costs

More accurate fault analysis See above


Motivation layer

The motivation layer (Figure 18) links the strategy with the actual capabilities in the Big Data and
Analytics platform. This provides a more complete connection between the strategic and tactical
planning levels and explains why the different capabilities are necessary and how they support and
affect the strategy. A full-size EA diagram is also attached in Appendix V1.


Figure 18 Reference Model – motivation layer

[Diagram: ArchiMate 3 motivation layer, connecting outcomes and courses of action to the capabilities of the Big Data and Analytics platform.]


The following table describes how the different tactics (courses of action) support the outcomes from
the strategy layer and how they are realized by different capabilities of the Big Data and Analytics
platform. Since the assessment covers the whole Finbeck project and all initiatives at Statnett, the
rightmost column explains how the different capabilities relate to the SAMBA project. This
assessment is based on the analysis of the strategy and motivation layer models.

Table 9 Overview motivation layer – outcome, course of action and capability

Outcome | Course of action | Capability | Meaning | Relation to SAMBA use cases and expectations
Improved configurable and personalized visualization | Introduce configurable and personalized visualization | Configurable and personalized visualization | Self-service, personalized visualization, which is highly configurable, is crucial to achieve sufficient flexibility to be able to explore the data without delay and without involving the IT department. | Yes, for analysis purposes.
Quicker fault analysis; Improved real time monitoring | Reduce latency of data acquisition | Low latency IoT data transport; Handling of real time information and streaming analysis | Low latency data transport and support for handling of real time information and streaming analysis are important capabilities to reduce the latency of data acquisition and to achieve quicker fault analysis and improved real time monitoring. | Yes, as an important basic prerequisite
Improved more dynamic visualization | Introduce condition based visualization | User friendly visualization; Handling of real time information and streaming analysis; Rule Engine support (Classic, ML, DL, High CPU/GPU) | Introduce condition based visualization that automatically shifts focus to issues/faults in the grid. Condition based visualization requires handling of real time information and streaming analysis, user friendly visualization and rule engine support capabilities. | WP6
Improved real time monitoring; Improved more dynamic visualization; Predictive maintenance; Optimized condition based maintenance; Asset health based reinvestments | Introduce smart event processing | Handling of real time information and streaming analysis; Event notification, filtering and distribution; Processing of batch data | Introduce smart event processing, in particular streaming analytics that can predict the deviations and faults before they occur. Smart event processing and streaming analytics require handling of real time information, streaming analysis, event notification, filtering and distribution as well as processing of batch data capabilities. | WP2
More accurate fault analysis | Introduce fault classification rules | Machine Learning support | Introduce the fault classification rules, which require machine learning support capability. | Weak
More accurate fault analysis; Optimized condition based maintenance | Introduce fault detection rules | Machine Learning support | Introduce fault detection rules, which require machine learning support capability. | WP 2
Automated and autonomous inspection; Predictive maintenance; Optimized condition based maintenance | Introduce rule based analysis | Rule Engine support (Classic, ML, DL, High CPU/GPU) | Introduce classic rule based analysis support, which requires machine learning support capability. | WP 2 and WP 6
More accurate fault analysis | Allow new high frequency sensors | High throughput IoT transport; Sensor time synchronization support; High volume data storage | Allow new high frequency sensors, which require high throughput IoT transport, sensor time synchronization support and high volume data storage. Note that the existing infrastructure and sensors also require improvements with respect to these capabilities. | Related
More accurate fault analysis; Improved insights and business understanding | Increase storage capacity | High volume data storage | Increase storage capacity, which requires high volume data storage platform capability | WP 2
Improved inspections - remote and virtual; Predictive maintenance; More frequent and accurate inspection | Introduce drones and robotics | Video and picture analysis; Drone fleet management and data capture | Introduce drones and robotics, which requires video and picture analysis capability as well as drone fleet management and data capture capability. | WP 6
Asset health based reinvestments; Increased precision of imbalance predictions; Predictive maintenance | Introduce predictive analytics | Data Science Tools (DL, ML, High CPU/GPU) | Introduce predictive analytics, which requires data science tools including deep learning, machine learning and high CPU/GPU capabilities. | WP 2
Improved insights and business understanding | Introduce Data Science | Data Science Tools (DL, ML, High CPU/GPU) | Introduce predictive analytics, which requires data science tools including deep learning, machine learning and high CPU/GPU capabilities. | Related
Improved socio-economic benefit analysis | Introduce probabilistic reliability assessment | Data Science Tools (DL, ML, High CPU/GPU) | Introduce probabilistic reliability assessment, which requires data science tools including deep learning, machine learning and high CPU/GPU capabilities. | WP2 and WP6
Improved inspections - remote and virtual | Introduce digital twin concept | High CPU and GPU power; Actor framework; Triple store and graph storage; Aligning and harmonizing facts from various sources (Data Catalogue) | Introduce digital twin concept, which enables real time 3D visualization and control of the assets as well as means to model and reproduce the condition of the grid and assets at a given point in time. Digital twin requires high CPU and GPU power, actor framework, triple store and graph storage as well as aligning and harmonizing facts from various sources including data catalogue capabilities. | WP 6
Improved inspections - remote and virtual | Introduce augmented/virtual reality | High CPU and GPU power | Introduce augmented/virtual reality, which requires high CPU and GPU power capability in the platform. | WP 6
Optimized condition based maintenance; Asset health based reinvestments; Improved insights and business understanding; Increased precision of imbalance predictions | Introduce common data nav/lake | Triple store and graph storage; Open access to data and data sharing; Aligning and harmonizing facts from various sources (Data Catalogue) | Introduce common data nav/lake, which requires triple store and graph storage capability, open access to data and data sharing, aligning and harmonizing facts from various sources including data catalogue. | WP 2
Better data quality | Improve data quality | Data quality and consistency check | Improve data quality in all involved systems and sensors. This includes the improvements in the infrastructure for collecting and transporting the sensor data as well as improvements in the data consistency. This requires data quality check and consistency check capabilities in the platform. | Related
More efficient access to information | Introduce interactive information access | Natural language understanding; Document storage; Chatbot conversation support | Introduce interactive information access, which requires natural language understanding, document storage and chatbot conversation support capabilities. | Related
Adequate security of sensitive and important data | Implement measures to secure the data | Fine grained access control and perimeter security/AAA; Redundancy and disaster recovery | Implement measures to secure the data, which requires fine-grained access control and perimeter security/AAA capabilities as well as redundancy and disaster recovery capability. | Related

Abbreviations used in the table: ML – Machine Learning; DL – Deep Learning; CPU – Central Processing Unit; GPU – Graphics Processing Unit.
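
The chain from outcome through course of action to capability can also be expressed as a small data structure, which makes it possible to ask which capabilities a given outcome depends on. The following is a minimal sketch only: the class and function names (CourseOfAction, capabilities_for) are hypothetical and not part of the Statnett model, and only three rows of Table 9 are transcribed for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CourseOfAction:
    """One row of Table 9: a course of action, the outcomes it supports
    and the platform capabilities that realize it (hypothetical structure)."""
    name: str
    outcomes: tuple[str, ...]
    capabilities: tuple[str, ...]


# Three rows transcribed from Table 9; the remaining rows are omitted.
COURSES = [
    CourseOfAction(
        name="Introduce fault classification rules",
        outcomes=("More accurate fault analysis",),
        capabilities=("Machine Learning support",),
    ),
    CourseOfAction(
        name="Introduce fault detection rules",
        outcomes=("More accurate fault analysis",
                  "Optimized condition based maintenance"),
        capabilities=("Machine Learning support",),
    ),
    CourseOfAction(
        name="Increase storage capacity",
        outcomes=("More accurate fault analysis",
                  "Improved insights and business understanding"),
        capabilities=("High volume data storage",),
    ),
]


def capabilities_for(outcome: str, courses=COURSES) -> list[str]:
    """Return the sorted, de-duplicated capabilities of every course of
    action that supports the given outcome."""
    return sorted({
        capability
        for course in courses
        if outcome in course.outcomes
        for capability in course.capabilities
    })


print(capabilities_for("More accurate fault analysis"))
```

With the three rows above, the outcome "More accurate fault analysis" resolves to high volume data storage and machine learning support. Extending the list with the remaining rows would give the full traceability described in the table.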

6.3 Capabilities and information needs

This chapter summarizes all the platform capabilities identified in the Finbeck project. As concluded in
chapter 6.2, most of these capabilities are also required for asset management and the SAMBA project.
Figure 19 presents all the capabilities identified by the Finbeck project [34].

Figure 19 Big Data and Analytics platform capabilities [34]

The table below explains each of the capabilities in detail.

[Diagram: capability map of the Big Data and Analytics platform.]


Table 10 Big Data and Analytics platform capabilities

Capability name | Related / required for asset management? | Comment
Triple Store and Graph storage | Yes | Ability to store and process triples and graph data.
Video and picture analysis | Yes | Ability to analyze and match patterns in video and photo files.
Open Access to Data and Data Sharing | Yes | Open API and data access. The system provides unconstrained access to data in a number of different ways.
Processing of batch data | Yes | Processing the data (often static data) in a periodic / batch way.
Data Quality and Consistency Check | Yes | Ability to assess, control and rate the quality and accuracy of the information stored in the data lake. In addition, mechanisms allowing consistency checking of the information stored in the lake.
Configurable and personalized visualization | Yes | Visualization tools that provide a high level of customization and configurability on a personal level.
Redundancy and disaster recovery | Yes | Ability to continue operation of the system despite losing some of the computational power and storage.
Actor Framework | Yes | Framework allowing implementation of a concurrent computation model with actors as universal primitives of concurrency.
Deep Learning support | Yes | Ability to simulate / run deep neural networks in order to analyze / train and run the predictive models.
Drone Fleet Management & Data Capture | Yes | Feature allowing steering / controlling and managing a fleet of drones and acquiring captured data.
High CPU and GPU power | Yes | High computational power, both CPU and GPU (graphical).
Low latency IoT data transport | Less relevant | Ability to transport the data with low delay. The sensor data are important for asset management; however, it is less relevant that the data has very low latency. This capability is important for operations and fault analysis, but less important for asset management.
Self Service Analytics | Yes | Analytics and visualization tools and views that can be tailored to meet the needs of each individual and can be adapted individually by each user.
Granular access control and perimeter security/AAA | Yes | Basic security features of the system allowing sufficient control of authentication, authorization and audit.
Rule Engine support | Yes | Feature allowing creation of configurable rules that alter the business logic of the application. Comprises use of AI, Machine Learning and Deep Learning.
Data Catalogue | Yes | Metadata store providing information that enables finding the right information in the data lake.
Data Science Tools | Yes | Various tools used by the data scientist, including Jupyter/Zeppelin notebooks, RStudio, SPSS, SparkML and Python/scikit.
Event Notification, Filtering and Distribution | Yes | Ability to receive, filter and distribute events.
Handling of real time information / streaming analysis | Yes | Ability to process streams of data, detect patterns and generate events based on that.
High volume data storage | Yes | Data storage capable of storing amounts of data not practical to store or process in traditional databases (i.e. relational databases).
Machine Learning support | Yes | Libraries allowing use of statistical methods to analyze and predict the output based on given parameters, e.g. libraries like SparkML.
User friendly visualization | Yes | User-friendly visualization.
Map visualization | Yes | Integrated support for map visualization.
High throughput IoT data transport | Yes | Data transport capability that allows sending high volumes of data in a short time. The sensor data are important for asset management; however, it is less relevant that the data transport has very high throughput. This capability is important for operations and fault analysis, but less important for asset management.
Aligning and harmonizing facts from various sources | Yes | Ability to relate, combine and align the information from multiple sources/silos.
Audio analysis support | Yes | Ability to analyze and match patterns in audio files.
Natural language understanding | Yes | Ability to comprehend the meaning of natural language, e.g. in documents/documentation.
Chatbot conversation support | Yes | Support for interaction using chatbot conversations.
Document storage | Yes | Support for storing documents.
Classic Rule Engine support | Yes | Classic rule engine support without AI, ML/DL.
Sensor Time Synchronization support | Less relevant | Capability to ensure time synchronization and time alignment of data from various sources, as well as preserving the time delay, time gap and jitter in the stored data at microsecond level. Very precise (sub-second) time synchronization is less relevant for asset management. This capability is important for operations and fault analysis, but less important for asset management.
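
As an illustration of what time alignment of data from various sources can mean in practice, the sketch below pairs each sample of one sensor series with the latest sample of a second series taken at or before the same time. It is a minimal sketch under stated assumptions: both series are assumed to be sorted lists of (timestamp, value) pairs with timestamps on a common clock, and the function name align_latest and the sample values are hypothetical.

```python
from bisect import bisect_right


def align_latest(reference, other):
    """For each (timestamp, value) in `reference`, attach the most recent
    value from `other` recorded at or before that timestamp.

    Both inputs are assumed to be sorted by timestamp. If `other` has no
    sample yet at a given reference time, None is attached instead.
    """
    other_times = [timestamp for timestamp, _ in other]
    aligned = []
    for timestamp, value in reference:
        # Index of the last sample in `other` with time <= timestamp.
        index = bisect_right(other_times, timestamp) - 1
        partner = other[index][1] if index >= 0 else None
        aligned.append((timestamp, value, partner))
    return aligned


# Illustrative values only: timestamps in seconds on a shared clock.
temperature = [(0, 61.0), (10, 62.5), (20, 64.0)]
load = [(5, 0.80), (18, 0.95)]
print(align_latest(temperature, load))
```

This kind of "latest known value" join tolerates different sampling rates, which matches the statement above that sub-second synchronization matters less for asset management than for operations and fault analysis.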

6.4 Business architecture

The following diagram (Figure 20) shows, at a more detailed level, the dependencies between some
SAMBA WP2 use cases and a limited set of the identified capabilities. Such a detailed assessment has
only been done for a limited number of WP2 use cases:


Figure 20 Business Architecture – WP2 use case mapped to Big Data & Analytics platform capabilities

The following table explains the relationships in more detail; these are also documented in
Enterprise Architect at Statnett.

Area / Function | Use Case | Capability | Meaning / Rationale
Transformer | T2.1 Online gas data analysis | Event notification, filtering and distribution; Rule Engine support; Machine Learning support; User friendly visualization; Handling of real time information / streaming analysis; Aligning and harmonizing facts from various sources; Processing of batch data | Online gas data analysis use case requires a set of platform capabilities, in particular event notification, filtering and distribution, rule engine support, machine learning, user friendly visualization, handling of real time information / streaming analysis, aligning and harmonizing facts from various sources as well as processing of batch data. These capabilities are necessary to ensure efficient real time data collection, flexible analysis of online gas data as well as providing notification of detected deviations.
Transformer | T3.1 Thermal winding aging; T3.5 Periodic oil and gas analysis | Rule engine support; Aligning and harmonizing facts from various sources; Processing of batch data | Thermal winding aging use case and periodic oil and gas analysis use case require a set of platform capabilities, in particular rule engine support, aligning and harmonizing facts from various sources as well as processing of batch data. These capabilities are necessary to ensure efficient data collection and flexible analysis of the data and pattern detection.
Transformer | T3.6 Health index | Rule engine support; Video and picture analysis; Processing of batch data | Health index use case requires a set of platform capabilities, in particular rule engine support, video and picture analysis as well as processing of batch data. These capabilities are necessary to ensure efficient data collection and flexible analysis of the data and collected video and pictures.
Cable | C2.4 DTS | Event notification, filtering and distribution; Rule Engine support; High volume data storage; Aligning and harmonizing facts from various sources; Processing of batch data | DTS use case requires a set of platform capabilities, in particular event notification, filtering and distribution, rule engine support, high volume data storage, aligning and harmonizing facts from various sources as well as processing of batch data. These capabilities are necessary to ensure efficient collection of high volume of data, flexible analysis of DTS data and data from other sources as well as providing notification of detected deviations.
Cable | C2.3 Oil filled termination | Event notification, filtering and distribution; Rule Engine support; Machine Learning support; Aligning and harmonizing facts from various sources; Processing of batch data | Oil filled termination use case requires a set of platform capabilities, in particular event notification, filtering and distribution, rule engine support, machine learning support, aligning and harmonizing facts from various sources as well as processing of batch data. These capabilities are necessary to ensure efficient collection of data, flexible analysis of data from multiple sources as well as providing notification of detected deviations.
Breaker | Reignition monitor of reactor breakers | Event notification, filtering and distribution; Rule Engine support; Machine Learning support; Handling of real time information / streaming analysis; High volume data storage; Aligning and harmonizing facts from various sources | Reignition monitoring of reactor breakers use case requires a set of platform capabilities, in particular event notification, filtering and distribution, rule engine support, machine learning support, handling of real time information / streaming analysis, high volume data storage as well as aligning and harmonizing facts from various sources. These capabilities are necessary to ensure efficient real time collection of high volume of data, flexible analysis of data from multiple sources as well as providing notification of detected deviations.
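
To make the interplay between streaming analysis, rule engine support and event notification more concrete, the sketch below runs a stream of gas readings through a configurable rule and notifies a subscriber about each detected deviation. This is a simplified sketch, not the platform design: the types Reading and Event, the functions threshold_rule and process_stream, and the 300 ppm limit are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Reading:
    asset_id: str
    timestamp: int   # seconds on a shared clock (assumption)
    gas_ppm: float   # illustrative gas concentration value


@dataclass(frozen=True)
class Event:
    asset_id: str
    timestamp: int
    message: str


# A rule inspects one reading and returns an event if it detects a deviation.
Rule = Callable[[Reading], Optional[Event]]


def threshold_rule(limit_ppm: float) -> Rule:
    """Build a classic (non-ML) rule that fires when a reading exceeds the limit."""
    def rule(reading: Reading) -> Optional[Event]:
        if reading.gas_ppm > limit_ppm:
            message = f"gas level {reading.gas_ppm} ppm exceeds {limit_ppm} ppm"
            return Event(reading.asset_id, reading.timestamp, message)
        return None
    return rule


def process_stream(readings: Iterable[Reading],
                   rules: List[Rule],
                   notify: Callable[[Event], None]) -> int:
    """Apply every rule to every reading, pass each resulting event to
    `notify`, and return the number of events raised."""
    raised = 0
    for reading in readings:
        for rule in rules:
            event = rule(reading)
            if event is not None:
                notify(event)
                raised += 1
    return raised


readings = [
    Reading("T1", 0, 120.0),
    Reading("T1", 60, 310.0),
    Reading("T2", 60, 90.0),
]
received: List[Event] = []
count = process_stream(readings, [threshold_rule(300.0)], received.append)
print(count, [event.asset_id for event in received])
```

Because rules are plain callables, a machine learning model could be wrapped in the same interface as the threshold rule, which reflects the distinction made earlier between classic rule engine support and rule engine support that comprises ML.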

As explained in the previous chapter, in addition to the WP2-related capabilities there is a much larger
set, which relates to smarter asset management and the SAMBA project indirectly through the
assessment of the outcomes of the WP6 report.

6.5 Overall Strategy and Motivation layer

The following diagram (Figure 21) shows the complete strategy and motivation model layer as

described above in this chapter. A full size Enterprise Architect (EA) diagram is also attached in

Appendix V1.


Figure 21 Reference model - strategy and motivation - summary

[Diagram: ArchiMate 3 strategy and motivation layer, combining stakeholders, drivers, goals, outcomes, courses of action and platform capabilities. Diagram metadata: Name: Archimate3 Strategy&Motivation Layer; Author: Leslaw Lopacki; Version: 1.0; Created: 11.10.2017 13:34:11; Updated: 03.12.2017 20:51:52.]


6.6 Application Architecture

This chapter presents the long-term application architecture of the Big Data and Analytics platform in

Statnett. As explained earlier this is the overall platform as defined in the Finbeck project. Asset

management is an important user of this platform.

The platform supports the principles of context mapping and anti-corruption layer. It provides a

number of APIs and interfaces to access the data including:

API - HTTPS, JSON and programmatic - this is the main API that the internal systems can use to access the data

Notification/streaming API - this API is used to provide notifications and stream data from the platform to other internal systems

Data Exchange API - this API is used to offer the data for public access as well as to third parties, for instance other TSOs, DSOs and regulators

Cloud GW - this component represents the gateway to the cloud; it is also crucial for the integration of any cloud-based services

Data Science Tools - this is a set of tools required by the data scientists

Self Service visualization - this represents the generic visualization component which is provided as a part of the platform; there will also be visualization components implemented within each client

Classic Visualization - visualization components which were used or introduced prior to establishing the Big Data and Analytics platform

API - ingestion - this is the API and set of tools for ingestion of the data, both streaming and batch, from internal and external sources

The platform itself provides a number of components supporting the capabilities described in earlier

chapters. The following components are defined as a part of the platform:

Storage components - include multiple types of storage including file store, graph/RDF store, time series data store, metadata store, relational data store to store various, structured and unstructured data

Analysis components - include Analytics engine as well as Deep learning and Machine learning engines where Statnett will implement and execute advanced AI algorithms

Processing components - batch and real time - includes Batch processing engine and Streaming processing engine which will detect the patterns in the real time / streaming data

Access components - include the Data Catalogue, which is important for structuring the data stored in the platform and for being able to find the information
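The role of the Data Catalogue among the access components can be illustrated with a small in-memory sketch. It is not the catalogue product used in the platform; the class names, fields and example data sets are hypothetical and only show how registering data sets with metadata makes them findable.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetEntry:
    """Catalogue record for one data set (all field names are illustrative)."""
    name: str
    store: str          # e.g. "time series", "file", "graph", "relational"
    owner: str
    description: str
    tags: frozenset = field(default_factory=frozenset)


class DataCatalogue:
    """In-memory stand-in for a catalogue service (assumption, not a real product API)."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Refuse duplicates so every data set name stays unique in the catalogue.
        if entry.name in self._entries:
            raise ValueError(f"data set already registered: {entry.name}")
        self._entries[entry.name] = entry

    def find(self, keyword):
        """Return sorted names of entries whose name, description or tags mention keyword."""
        needle = keyword.lower()
        return sorted(
            e.name for e in self._entries.values()
            if needle in e.name.lower()
            or needle in e.description.lower()
            or any(needle in t.lower() for t in e.tags)
        )


# Hypothetical example entries.
catalogue = DataCatalogue()
catalogue.register(DatasetEntry("pmu_measurements", "time series", "System Operation",
                                "Phasor measurements from PMUs", frozenset({"sensor", "pmu"})))
catalogue.register(DatasetEntry("inspection_images", "file", "Asset Management",
                                "Images captured during line inspections", frozenset({"image"})))
```

A call such as `catalogue.find("sensor")` then returns the names of the matching data sets, independent of which storage component holds them.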

There are a number of internal systems and sources which will communicate with the platform. These

systems can act as both providers and consumers of data. The notification and streaming API

provides a means for more complex interactions, e.g. when an internal system needs to be notified

about a pattern implemented and detected within the platform.
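The interaction described above, where an internal system is notified about a pattern detected within the platform, can be sketched as a minimal publish/subscribe hub. This is an illustration only: the class, the pattern name "overtemperature", the threshold and the readings are all hypothetical and do not represent the platform's actual notification API.

```python
from collections import defaultdict


class NotificationHub:
    """Minimal publish/subscribe hub; a stand-in for a notification API."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, pattern_name, callback):
        # An internal system registers interest in one named pattern.
        self._subscribers[pattern_name].append(callback)

    def publish(self, pattern_name, event):
        # Deliver the event to every subscriber and report how many were notified.
        for callback in self._subscribers[pattern_name]:
            callback(event)
        return len(self._subscribers[pattern_name])


def detect_overtemperature(readings, limit_c):
    """Yield an event for every reading above the limit (hypothetical pattern)."""
    for reading in readings:
        if reading["temperature_c"] > limit_c:
            yield {"asset": reading["asset"], "temperature_c": reading["temperature_c"]}


hub = NotificationHub()
received = []                                   # plays the role of an internal system
hub.subscribe("overtemperature", received.append)

readings = [
    {"asset": "T1", "temperature_c": 71.0},
    {"asset": "T2", "temperature_c": 96.5},
]
for event in detect_overtemperature(readings, limit_c=90.0):
    hub.publish("overtemperature", event)
```

After the loop, `received` holds only the event for the asset that exceeded the limit; the subscriber never has to poll the platform for it.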

The architecture for asset management functionality is planned as a hybrid architecture. Some of the

functionality will be placed in the Big Data and Analytics platform, and some outside it. It is clear that certain

algorithms in the area of asset management will require specialized systems. The Big Data and

Analytics platform cannot meet all these needs. Therefore, there will still be a need for other internal

components such as the asset health management component and Bi-Cycle.


Figure 22 describes the application layer of the Big Data & Analytics Reference Model.


Figure 22 Reference Model - application layer

[Diagram: ApplicationPlatformViewpoint. Groups shown: Cloud, External Consumers, Third Party, Sensors, Internal Consumers, BigData&Analytics Platform, Internal Systems and Sources, External Sensors and Sources, Internal Sensors. Labels in the diagram include the platform components described above (APIs, data stores, processing and analysis engines, Data Catalogue, Data Science Tools, visualisation), the consumers TSO, DSO, Regulator and Public Access, the functions Fault Analysis, Maintenance Management, Asset Management and Operations, and systems and sources such as EMS, HIS, LYN, PDC (PMU-data), AutoDIG, FASIT2018, Elspec - PQScada, IFS, Digital Fault Recorder, Asset Health Management, Drone Fleet Management, Metrum, OIS, Fifty MMS, Fifty HVDC, NOIS, Innsikt, IMPALA, met.no, Distance Relay Protection, ArcGIS, BI-Cycle, Cloud Analytics and MACE - AFRR-MFRR.]


It is important to point out that several of the components above are not yet supported by the current

Big Data and Analytics platform implemented as a part of the AutoDig 2.0 project. In particular, the

Cloud GW integration and the Data Exchange API are not yet implemented.

6.7 Capability to application component mapping

The diagram in Figure 23 explains how the different application components relate to the identified

capabilities. It is important to observe that this mapping reflects the long-term target model, beyond

the currently implemented Big Data lake platform (AutoDig 2.0 project).

The current platform does not support several of the necessary components. The current state

assessment identified the following key areas, with their existing gaps and the opportunities for

future platform improvement.

| Area / Component | Gap | Opportunity |
|---|---|---|
| Data Exchange API component | Insufficient means of exchanging the data with other parties in secure and reliable way, i.e. lacking the gateway functionality to isolate the data exposed to external users | Support for data exchange both for public access as well as access for third party companies, i.e. DSOs, TSOs and regulators |
| Cloud GW and cloud support component | Insufficient means of integrating the platform with the cloud services, i.e. IaaS and PaaS | New complex services that can be provided as PaaS or SaaS services in the cloud, i.e. natural language understanding and chatbot conversation support |
| API – ingestion component | Limited functionality as implemented in the AutoDig project and there are several data sources with significant delays w.r.t. data transport causing delays in detection of events, the data is of poor quality and often missing. This in turn limits the value delivered to the Fault Analysis. As a result, the “Low latency IoT transport” capability is poorly supported. | More accurate sensor data. Less delay in data collection and quicker analysis. Important prerequisite for low latency real time data processing (see footnote 13) |
| Drone Fleet Management & Data Capture capability | Current application architecture does not include this capability | Automated capturing and processing of data collected by the drones as well as quicker, automated and unassisted drone deployment |

13 See also Figure 28 for a suggested detailed architecture for streaming sensor data


Figure 23 Reference Model - capability to component mapping

[Diagram: CapabilityMap - application mapping. The diagram relates the application components of the BigData&Analytics Platform and the Cloud (APIs, data stores, processing and analysis engines, Rule Engine, Data Catalogue, Data Science Tools, visualisation, Innsikt, Data Exchange API, Cloud GW, Cloud Analytics) to the capabilities: Aligning and harmonizing facts from various sources; Actor Framework; Deep Learning support; Audio analysis support; Chatbot conversation support; Classic Rule Engine Support; Configurable and personalized visualization; Data Catalogue; Data Quality and Consistency Check; Data Science Tools; High CPU and GPU power; Document Storage; Drone Fleet Management & Data Capture; Event Notification, Filtering and Distribution; Fine-grained access control and perimeter security/AAA; Handling of real time information and streaming analysis; Processing of batch data; High throughput IoT data transport; High volume data storage; Low latency IoT data transport; Machine Learning support; Natural language understanding; Open Access to Data and Data Sharing; Video and picture analysis; Redundancy and disaster recovery; Rule Engine support; Self Service Analytics; Sensor Time Synchronization support; Triple Store and Graph storage; User friendly visualization; Map visualization.]


6.8 Technology Architecture

This chapter presents the current technology architecture of the Big Data and Analytics platform in

Statnett. It focuses only on the architecture as it looked at the time of writing of this report,

based on the Big Data platform acquired within the AutoDig 2.0 project. As explained earlier in

chapter 5 and chapter 6.1, this is the same overall platform defined in the Finbeck and AutoDig 2.0

projects. Asset management is expected to become one of the most important users of this platform.

The most important components within this platform are:

Hortonworks platform consisting of several standard components. Note that only the components that are in use or planned to be used are presented, not all of the Hortonworks components.

The Hortonworks platform also runs IBM-specific components such as BigIntegrate, an ETL (see footnote 14) tool used for ingestion of data into the data lake, and IBM Big SQL, an SQL interface to query data stored in Hive or HBase.

IBM Streams component used for processing of streaming data and streaming analytics

Tableau visualization component

IBM SPSS server for designing analytics and machine learning functions

Information Governance Catalogue, which provides means of structuring the data in the data lake

Cognos and Oracle RDBMS, which are part of the current Innsikt data warehouse portfolio at Statnett

Table 11 describes the components in the platform in detail:

Table 11 Technology components

| Area | Component | SW Environment | Description (see footnote 15) |
|---|---|---|---|
| Access | Hive | Apache Hadoop Hortonworks (HDP) | Apache Hive is an access tool for providing data summarization, query, and analysis. It offers an SQL-like interface to query data stored in various databases and file systems that integrate with Hadoop [19]. |
| | Phoenix | HDP | Apache Phoenix is a massively parallel, relational database engine supporting OLTP for Hadoop using Apache HBase as its backing store. Phoenix hides the intricacies of the NoSQL store, enabling users to create, delete, and alter SQL tables, views, indexes, and sequences; insert and delete rows singly and in bulk; and query data through SQL [19] |
| | Pig | HDP | Apache Pig is a high-level platform for creating programs that run on Apache Hadoop [19] |
| | BigSQL | Hadoop IBM | IBM-provided BigSQL is a software layer for creating tables and querying data in BigInsights using SQL, similar to Phoenix and based on Hive [20] |
| | Zeppelin | HDP | Apache Zeppelin is a data science tool. It is a multi-purpose web-based notebook that brings data ingestion, data exploration, visualization, sharing and collaboration features to Hadoop and Spark [19] |
| | Solr | HDP | Apache Solr is a highly scalable full-text search engine [19] |
| | Kafka | HDP | Apache Kafka is a distributed streaming platform developed by the Apache Software Foundation, written in Scala and Java [19] |
| Storage | JanusGraph/Titan (Atlas) | The Linux Foundation | JanusGraph, provided by The Linux Foundation, is a distributed graph database [35]. |
| | Oracle | Innsikt/DWH | Oracle-provided Relational Database Management System |
| | Accumulo | HDP | Apache Accumulo is a distributed key-value store based on Google's Bigtable [19] |
| | HBase | HDP | Apache HBase is a NoSQL/non-relational, distributed database modeled after Google's Bigtable and written in Java [19] |
| | OpenTSDB | Open Source LGPL | OpenTSDB is a scalable time series database built on top of Hadoop and HBase. It simplifies the process of storing and analyzing large amounts of time-series data generated by endpoints like sensors or servers [36] |
| | Storm | HDP | Apache Storm is a distributed platform for processing streaming data in real time [19] |
| Ingestion | Sqoop | HDP | Apache Sqoop is a command-line interface application for transferring data between relational databases and Hadoop [19] |
| | Flume | HDP | Apache Flume is a distributed, reliable, and highly available service for efficiently collecting, aggregating, and moving/streaming large amounts of log data [19] |
| | Kafka | HDP | Apache Kafka is a distributed streaming platform developed by the Apache Software Foundation, written in Scala and Java [19] |
| | Big Integrate | IBM | IBM Big Integrate is an advanced ETL tool, which is a flavor of IBM DataStage |
| | Streams | IBM | IBM Streams is an advanced streaming platform that can ingest large amounts of continuous data streams [21] |
| | Big SQL | IBM | IBM-provided BigSQL is a software layer for creating tables and querying data in BigInsights using SQL, similar to Phoenix and based on Hive [20] |
| Operations | Oozie | HDP | Apache Oozie is a server-based workflow scheduling system to manage Hadoop jobs [19] |
| | Ambari | HDP | Apache Ambari is a system for provisioning, managing, and monitoring Apache Hadoop clusters [19] |
| | YARN | HDP | Apache YARN is one of the key features in Hadoop. YARN is now characterized as a large-scale, distributed operating system for big data applications [19] |
| | ZooKeeper | HDP | Apache ZooKeeper is a distributed configuration service, synchronization service, and naming registry for Hadoop [19] |
| Security | Ranger | HDP | Apache Ranger is a centralized platform to define, administer and manage security policies consistently across Hadoop components [19] |
| | Knox | HDP | Apache Knox is a perimeter security gateway system, which authenticates user credentials (mostly against AD/LDAP). Only successfully authenticated users are allowed access to the Hadoop cluster [19] |
| Visualization | Tableau Server | Tableau | The Tableau Server component is high performance data visualization software capable of processing data from various sources incl. Hadoop/BigSQL, enabling self-service analytics [22] |
| | Tableau Desktop | Tableau | Non-server/desktop version of Tableau [22] |
| | Cognos | IBM | IBM Cognos is a web-based, integrated business intelligence suite by IBM [37] |
| | IBM SPSS | IBM | IBM-provided tool for modelling of predictive algorithms using data from Hadoop distributions and Spark applications [23] |
| | Insights for ArcGIS | Esri | Esri-provided Insights for ArcGIS is a data analytics visualization tool for spatial and non-spatial data [32] |
| Data catalogue | IBM Information Governance Catalogue | IBM | IBM-provided catalogue service for storing the metadata and making it possible to structure the data in the Big data lake [38] |
| | Falcon | HDP | Apache Falcon is a framework for managing data life cycle in Hadoop clusters [19] |
| | Atlas | HDP | Apache Atlas is a scalable and extensible set of core governance services. Catalogue service for storing the metadata and making it possible to structure the data in the Big data lake, similar to IBM Information Governance Catalogue [19] |
| Processing | Spark 2 | HDP | Apache Spark is a fast and general engine for large-scale data processing [19] |
| | MapReduce | HDP | MapReduce is the original framework for writing applications that process large amounts of structured and unstructured data stored in the Hadoop Distributed File System [19] |
| | IBM BigR | Hadoop IBM | IBM-provided Big R is a library of functions that provide end-to-end integration with the R language and BigInsights [39] |
| | IBM Streams | IBM | IBM-provided Streams is an advanced stream processing platform that can ingest, filter, analyze and correlate massive volumes of continuous data streams [21] |

14 ETL – Extract, Transform, Load

15 Supplier's description of the SW component
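To make the time series storage row more concrete, the sketch below builds one sensor reading in the JSON shape that OpenTSDB's HTTP `/api/put` endpoint accepts (`metric`, `timestamp`, `value`, `tags`). The metric name, tag names and values are hypothetical and are not taken from the Statnett platform; the sketch only prepares the payload and does not send it anywhere.

```python
import json


def to_opentsdb_point(metric, timestamp_s, value, tags):
    """Build one data point in the JSON shape accepted by OpenTSDB's /api/put."""
    # OpenTSDB expects at least one tag pair on every data point.
    if not tags:
        raise ValueError("OpenTSDB requires at least one tag per data point")
    return {
        "metric": metric,
        "timestamp": int(timestamp_s),   # Unix epoch, here in seconds
        "value": value,
        "tags": dict(tags),
    }


point = to_opentsdb_point(
    metric="transformer.oil.temperature",            # hypothetical metric name
    timestamp_s=1512334312,
    value=64.2,
    tags={"substation": "example", "asset": "T1"},   # hypothetical tags
)

# Several points can be sent in one request body as a JSON array.
payload = json.dumps([point])
```

Tags such as the asset identifier are what later allow the same metric to be queried per asset or aggregated across assets.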


Figure 24 describes the technology layer of the Big Data and Analytics Reference Model.


Figure 24 Reference Model – technology layer

[Diagram: TechnologyPlatformViewpoint. Nodes shown: BigData&Analytics platform, Hadoop/Hortonworks cluster, Streams cluster, IGC cluster, Cognos cluster, Oracle DBMS cluster, Tableau cluster, SPSS cluster, Insights for ArcGIS node, User PC and DataScience PC. Software shown: Spark 2, Zeppelin, Ambari, YARN, IBM Streams, HBase, HDFS, Solr, Information Governance Catalogue, JanusGraph, OpenTSDB, BigSQL, Kafka, Knox, Ranger, Cognos, Oracle DBMS, Tableau server, Tableau desktop, SPSS server, SPSS client, ZooKeeper, BigIntegrate, Sqoop, Flume, Hive, BigR, Insights for ArcGIS, Atlas, Browser (DS) and Browser (User). Interfaces shown: Ingestion Batch & Streaming, API HTTP and Programmatic, Notification and Streaming API.]


An alternative simplified model based on the IBM Reference Architecture for Analytics has also been

created within the AutoDig project. This model is presented in Figure 25.

Figure 25 Alternative model for technology layer based on IBM Reference Architecture for Analytics

6.9 Technology to application component mapping

The following diagram explains how the different technology components relate to the application

components described in previous chapters.

It is important to observe that this mapping primarily reflects the transition model as currently

implemented in the Big Data lake platform (AutoDig 2.0 project) and not the long term model. As

explained in chapter 6.7 the current platform does not support several of the necessary components,

in particular:

Data Exchange API

Cloud GW and cloud support

API - ingestion, which currently has limited functionality w.r.t. data transport latency

The following table describes how the application components are mapped to corresponding components in the technology architecture and what technologies are used to implement the architecture.

[Figure 25 diagram: IBM Reference Architecture for Analytics. Areas shown: Data Sources (new sources and traditional sources, including machine and sensor data, image and video, weather data, third-party data, transactional data and system of record data), Ingestion & Integration, Analytical Data Lake Storage, Analytics In-Motion, Discovery & Exploration, Actionable Insight, Enhanced Applications, Data Access, Information Management & Governance, Security, Platform and Analytics Operating System. Technologies placed in the diagram include DataStage, BigIntegrate, BigSQL, Cognos Analytics, SPSS, TM1, OpenPages, IBM Streams, Spark, Spark Streaming, Governance Catalog, Tableau, Kafka, HBase, Hive, YARN, OpenTSDB, Zeppelin, JanusGraph, Solr, Oracle DBMS (Innsikt), Ambari, HDFS, Flume, Sqoop, Knox, Ranger and Oozie.]


Table 12 Mapping of application components to corresponding technology components

| Application Component | Technology Component |
|---|---|
| API HTTP/JSON and programmatic | IBM Big SQL, HBase, OpenTSDB, Hive, Phoenix, Spark 2 |
| Notification and Streaming API | Kafka, IBM Streams |
| Topology Graph store | JanusGraph/Titan (Atlas) |
| Relational Data store | Oracle |
| Time series data store | HBase, OpenTSDB |
| Data Science Tools | Zeppelin, Tableau Server, Tableau Desktop, IBM SPSS |
| Self-service visualization | Tableau Server, Tableau Desktop, Insights for ArcGIS |
| Classic Visualization | Cognos |
| Data Catalogue | IBM Information Governance Catalogue |
| Analytics Engine | Spark 2, BigR |
| Machine Learning | IBM Streams, Spark 2 |
| Rule Engine | IBM Streams, Spark 2 |
| Batch Processing Engine | Spark 2 |
| Deep Learning Engine | Spark 2 |
| Streaming processing Engine | Kafka, Spark 2, IBM Streams |
| Data Exchange API | Not mapped |
| Cloud GW | Not mapped |
| Public Data store | Not mapped |
| Private Data store | Not mapped |
| Cloud Analytics | Not mapped |
| Other / platform | Oozie, Ambari, YARN, ZooKeeper |
| Security | Ranger, Knox |
| Not in use | Falcon, MapReduce, Accumulo, Pig, Solr, Storm, Sqoop, Flume, NiFi, MiNiFi, Schema Registry (see footnote 16 for the last three) |
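A mapping like Table 12 can also be kept in machine-readable form so that gaps are found by query rather than by reading. The sketch below transcribes a subset of the table by hand into a Python dictionary; the function names are hypothetical, and an empty list stands for "Not mapped".

```python
# Subset of Table 12, transcribed by hand; an empty list means "Not mapped".
APP_TO_TECH = {
    "Notification and Streaming API": ["Kafka", "IBM Streams"],
    "Time series data store": ["HBase", "OpenTSDB"],
    "Batch Processing Engine": ["Spark 2"],
    "Streaming processing Engine": ["Kafka", "Spark 2", "IBM Streams"],
    "Data Exchange API": [],
    "Cloud GW": [],
}


def unmapped_components(mapping):
    """Application components with no technology component behind them."""
    return sorted(name for name, techs in mapping.items() if not techs)


def components_using(mapping, technology):
    """Application components that depend on the given technology component."""
    return sorted(name for name, techs in mapping.items() if technology in techs)


gaps = unmapped_components(APP_TO_TECH)
kafka_users = components_using(APP_TO_TECH, "Kafka")
```

The second query is the more useful one in practice: it shows which application components would be affected if a technology component were replaced or retired.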

Although the technology platform as currently defined seems to cover most of the current needs,

there are a number of new technologies which might be of interest to better cover deficiencies in the

platform. One important supplement that is relevant for inclusion is the other part of the

Hortonworks platform, Hortonworks DataFlow (HDF), with technologies like NiFi/MiNiFi and Schema

Registry, which provide support for collecting, curating, analyzing and acting on the data in flow [9].

16 NiFi, MiNiFi and Schema Registry are not part of the current platform. These components are included in Hortonworks DataFlow (HDF)


Figure 26 Reference Model – technology to application component mapping

[Diagram: TechnologyPlatformViewpoint - application mapping. The diagram relates the technology nodes and software of Figure 24 (Hadoop/Hortonworks cluster, Streams cluster, IGC cluster, Cognos cluster, Oracle DBMS cluster, Tableau cluster, SPSS cluster, Insights for ArcGIS node and the software they run) to the application components: API - HTTPS, JSON & Programmatic; API - ingestion - batch & streaming; Notification/Streaming API; Data Exchange API; Cloud GW; Cloud Analytics; Analytics Engine; Batch Processing Engine; Streaming Processing Engine; Machine Learning Engine; Deep Learning Engine; Rule Engine; Data Catalogue; Data Science Tools; Self Service Visualisation; Classic Visualisation; File Store; Metadata Store; Relational Datastore; Timeseries Datastore; Topology - Graph - RDF Datastore; Private Datastore; Public Datastore; Security.]


6.10 Governance Principles

Statnett has established a number of Architecture Governance principles, which also apply to systems

and platforms at Statnett. Table 13 describes the most important principles that affect the Big Data and

Analytics platform and adjacent systems.

Table 13 Architecture Governance Principles affecting the Big Data and Analytics platform

| ID | Principle Name | Explanation |
|---|---|---|
| O5 | Information Management | Information shall be handled in a comprehensive manner as a common asset for Statnett. Data and metadata should be uniquely identifiable using common keys across systems. |
| D1 | Comprehensive information architecture | Statnett's information has a value of its own, independent of the ICT system, and will be linked to a common structure and management. |
| D2 | Information security and business criticality | All information shall be used, stored and shared according to confidentiality, integrity, availability and preservation requirements. There must be control of which information is mission critical, and it must be stored and protected to meet accessibility requirements. |
| D3 | Data Quality | All information must have a known quality state and similar information should have similar quality tests. Known data quality ensures proper use and composition of data. |
| D4 | Master data management and life cycle | Statnett will have a comprehensive and unified management of the information's origin (source) and ownership (master database / system), even when this changes over the life of the information. |
| D5 | Storage and sharing of information | Data storage and integration must be done according to common rules and architecture. The business should always know where data is generated, where it flows, and where it is shared, changed and saved. |
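Principle D3 asks for a known quality state and for similar information to be tested in a similar way. A minimal sketch of such a shared test for time series records is shown below. The state names ("good", "degraded", "bad"), the 5 % threshold and the record layout are illustrative assumptions, not Statnett definitions.

```python
from datetime import datetime


def quality_state(records, max_missing_ratio=0.05):
    """Classify a list of {'timestamp', 'value'} records as 'good', 'degraded' or 'bad'.

    The thresholds and state names are illustrative assumptions.
    """
    if not records:
        return "bad"                      # nothing to assess
    timestamps = [r["timestamp"] for r in records]
    # Records must arrive in non-decreasing time order.
    if any(a > b for a, b in zip(timestamps, timestamps[1:])):
        return "bad"
    missing = sum(1 for r in records if r["value"] is None)
    if missing / len(records) > max_missing_ratio:
        return "degraded"                 # too many missing values
    return "good"


sample = [
    {"timestamp": datetime(2017, 12, 3, 20, 0), "value": 1.0},
    {"timestamp": datetime(2017, 12, 3, 20, 1), "value": None},
    {"timestamp": datetime(2017, 12, 3, 20, 2), "value": 1.2},
]
state = quality_state(sample)
```

Because the same function is applied to every series of this kind, the resulting state can be stored with the data set and compared across sources.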

6.11 Principles for Big Data and Analytics platform

Looking at the data platform framework, the Finbeck project also defined several principles to

address different concerns at Statnett:

Table 14 Principles for Big Data and Analytics platform

| Principle | Description | Rationale |
|---|---|---|
| The data platform and the data sets in it should be a common resource for the business | The data platform should be developed in line with the business needs and be a common resource for the business. Data sets that are collected, structured and stored in the data platform should be used across different purposes, and must be organized and managed based on this principle. | Complies with: O5 – Information Management |
| Data sets shall be described and classified | All data sets handled through the data platform should have a description (metadata) describing at least the content, ownership, origin and valuation. | Complies with: O5 – Information Management; D1 – Comprehensive information architecture; D4 – Master data management and life cycle |
| Data sets shall have ownership and management | All data sets that are collected and structured in the data platform should have a defined ownership, and have well-functioning governance and management. | Complies with: D5 - Storage and sharing of information |
| Data sets shall be subject to access control and tracking | Access to datasets shall be restricted to identified users explicitly authorized to create, modify, read and delete. If necessary, it should be possible to apply access control to a subset within a data set, for example columns and / or rows in tabular datasets. The data platform will ensure that all access to data sets is traceable. | Complies with: D2 - Information security and business criticality |
| Use of data sets comes with responsibility | The use of datasets must be in line with policies, and according to the interests of the business. Access to data sets must be protected, and further processing of data sets beyond data platform control should be in accordance with the instructions on the use of data sets agreed with information owners. Responsibility also includes understanding of the data sets used, and whether their quality meets the quality requirements that are the basis for the use. | The data platform will eventually contain a collection of large amounts of data sets. The fact that these are collected and easily accessible will in itself constitute both an opportunity and a risk. The person given access to parts of these data sets must understand and exercise accountability when using them. It is equally important that the data user understands which data sets are used and if they are suitable for the particular application. |
| Data sets shall be processed and have a retention period in accordance with guidelines for information processing | Datasets must be processed in accordance with established guidelines. The individual datasets should have a defined storage time set by information owners in accordance with established guidelines. | General instructions at Statnett |

Data quality is a common

responsibility

Data sets should have ownership and

management, and the main responsibility for

data quality lies in this dimension. However,

anyone using the data set is responsible for

ensuring that data quality is reported back to

the owner / manager and corrected in the

source.

It is not cost effective if the individual data

users individually work to fix data quality

challenges for a data set. It should be a

shared responsibility to return and ensure

that we develop good and correct datasets

that can be used by several individuals. Data

must be corrected in the source and the

work processes associated with this.

Data must be stored in a

cost-effective manner

Data should be saved cost-effectively, and

this is achieved through a well-defined

information architecture that follows given

standards and best practice. This

architecture is defined by the FIA project.

Complies with:

O5 – Information Management
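
As a minimal sketch of how some of the principles in Table 14 could be enforced in software, the Python example below models a data set with the required metadata fields (content, ownership, origin, valuation), a retention period, column-level read access and an access log. The class, field names and sample values are illustrative assumptions, not part of the Finbeck principles or of any Statnett system.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Hypothetical list of mandatory metadata fields, taken from the wording of
# the "described and classified" principle.
REQUIRED_METADATA = ("content", "ownership", "origin", "valuation")


@dataclass
class DataSet:
    name: str
    metadata: dict
    retention_days: int   # storage time set by the information owner
    created: date
    # user -> set of columns that user may read; a missing user has no access
    read_grants: dict = field(default_factory=dict)
    access_log: list = field(default_factory=list)

    def missing_metadata(self):
        """Return the required metadata fields that are absent or empty."""
        return [k for k in REQUIRED_METADATA if not self.metadata.get(k)]

    def is_expired(self, today):
        """True once the retention period has passed."""
        return today > self.created + timedelta(days=self.retention_days)

    def read(self, user, row, columns):
        """Return the requested columns of a row and log every attempt."""
        allowed = self.read_grants.get(user, set())
        granted = set(columns) <= allowed
        self.access_log.append((user, tuple(columns), granted))
        if not granted:
            denied = sorted(set(columns) - allowed)
            raise PermissionError(f"{user} may not read {denied}")
        return {c: row[c] for c in columns}


# Example values are made up for illustration.
oil_temps = DataSet(
    name="transformer_oil_temperature",
    metadata={"content": "Oil temperature readings",
              "ownership": "Asset management",
              "origin": "Substation sensors",
              "valuation": "High"},
    retention_days=3650,
    created=date(2018, 1, 1),
    read_grants={"analyst": {"asset_id", "temperature_c"}},
)
```

Both granted and denied reads end up in `access_log`, which is one simple way to make all access traceable; a real platform would delegate this to its security and governance components.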

6.12 APIs for ingestion and integration

Integration architecture is a central aspect of a Big Data and Analytics platform, particularly in a hybrid architecture where business logic related to asset management is implemented in specialized asset management systems outside the platform. Several types of integration are needed: interfaces for accessing the data in the Big Data lake, and interfaces for streaming data and sending notifications.

The technology platform provides a number of tools and integration technologies to satisfy the integration needs covered by the application components “API – HTTPS, JSON and programmatic” and “Notification/Streaming API”. In addition, the integration platforms already available at Statnett can be used to integrate the platform with other Statnett systems.

Table 15 Integration components to integrate Big Data and Analytics platform with specialized asset management systems

| Application component | Integration/technology component | Type of integration |
| --- | --- | --- |
| API HTTP/JSON and programmatic | IBM Big SQL, Hive | SQL |
| API HTTP/JSON and programmatic | HBase, OpenTSDB | Web services/REST |
| API HTTP/JSON and programmatic | Spark 2 | Programmatic: Java, Scala, Python |
| Notification and Streaming API | Kafka, IBM Streams | Streaming, notifications |
| Ingestion | IBM Big Integrate | ETL |
| Other (available at Statnett) | RedHat JBoss Fuse | Notifications, web services |

Moreover, Statnett should consider adopting emerging technologies, in particular Hortonworks DataFlow and NiFi.

In practice, future solutions will use a combination of these methods. Figure 27 presents an example of a possible future integration of a Big Data and Analytics platform with an asset health management system and with visualization of alerts related to asset management.

Figure 27 Example of possible integration of asset health management system with Big Data and Analytics platform
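
Table 15 names Kafka as a component for the Notification and Streaming API. The sketch below illustrates the underlying publish-subscribe pattern with a small in-memory broker, showing how one alert could reach both an asset health management system and a visualization component. It is a stand-in for a topic-based system such as Kafka and does not use a Kafka client; the topic name, message fields and consumer names are hypothetical.

```python
import json
from collections import defaultdict


class InMemoryBroker:
    """Stand-in for a topic-based broker such as Kafka (not a Kafka client)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a callable that receives every message on the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        """Deliver the message to all subscribers; return the delivery count."""
        payload = json.dumps(message)        # messages travel as JSON text
        for handler in self._subscribers[topic]:
            handler(json.loads(payload))     # each consumer gets its own copy
        return len(self._subscribers[topic])


broker = InMemoryBroker()
health_system_inbox = []   # hypothetical asset health management consumer
dashboard_inbox = []       # hypothetical alert visualization consumer
broker.subscribe("alerts", health_system_inbox.append)
broker.subscribe("alerts", dashboard_inbox.append)

alert = {"asset_id": "T-001", "type": "oil_temperature_high", "value_c": 95.0}
delivered = broker.publish("alerts", alert)
```

The producer does not need to know which systems consume the alert, which is the property that makes this pattern attractive when several specialized systems sit outside the platform.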

In relation to the “API – ingestion” component and the gap related to the quality of sensor data collection, Figure 28 presents a more detailed view of the future architecture for ingestion and distribution of sensor data. This architecture is intended to better address the issues of sensor data quality and infrastructure reliability.

Figure 28 A detailed view on the future architecture of the ingestion and distribution of sensor data

[Diagram labels recovered from Figures 27 and 28; grouping inferred from the label names.
Sensors: Qualitrol Fault Recorders, PMU, PQ Elspec/Metrum, PROT, Oil/Gas, Other Sensors.
Adapters: IEC 61850 Adapter, IEEE C37.118 Adapter, PQScada/Metrum Adapter, Adapter, Other Adapters.
Layers: AutoDig, Data Provider, Ingestion (microservice/container), Distribution (Pub-Sub/Kafka), Data Usage/Consumer (Monitoring/Operations, Data Science, Predictive Maintenance, Machine Learning).
Systems: Sensors Adapters, Monitoring System, Sensor Historian, ERP, Asset Health Management, Big Data and Analytics platform.
Flows: Sensor data (61850), Sensor data (Kafka), Alerts (Kafka), Notifications (Kafka/WebService), Health Score (ETL/WebService), Asset data (ETL), (ETL).]
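
The adapter and ingestion layers of Figure 28 can be illustrated with a short sketch: each adapter maps a protocol-specific message to a common record, and the ingestion step applies simple quality checks before the data is distributed. The record schema, the input dictionaries and the checks (non-numeric values, duplicates, timestamps out of range) are assumptions made for illustration; they do not reproduce the IEC 61850 or IEEE C37.118 data models.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Reading:
    """Common record produced by every adapter (hypothetical schema)."""
    sensor_id: str
    quantity: str
    value: float
    timestamp: datetime
    source: str


def adapt_iec61850(raw):
    """Map a made-up IEC 61850-style dict to the common record."""
    return Reading(raw["ref"], raw["attr"], float(raw["val"]),
                   datetime.fromtimestamp(raw["t"], tz=timezone.utc), "IEC61850")


def adapt_c37118(raw):
    """Map a made-up IEEE C37.118-style dict (PMU frame) to the common record."""
    return Reading(f"pmu-{raw['idcode']}", "frequency_hz", float(raw["freq"]),
                   datetime.fromtimestamp(raw["soc"], tz=timezone.utc), "C37.118")


def ingest(readings, now, max_age_s=3600):
    """Split readings into accepted and rejected using simple quality checks."""
    accepted, rejected, seen = [], [], set()
    for r in readings:
        key = (r.sensor_id, r.quantity, r.timestamp)
        age = (now - r.timestamp).total_seconds()
        if r.value != r.value:                 # NaN is never equal to itself
            rejected.append((r, "not a number"))
        elif key in seen:
            rejected.append((r, "duplicate"))
        elif age < 0 or age > max_age_s:
            rejected.append((r, "timestamp out of range"))
        else:
            seen.add(key)
            accepted.append(r)
    return accepted, rejected


# Example input; all identifiers and values are invented.
now = datetime.fromtimestamp(1_500_000_600, tz=timezone.utc)
raw_61850 = {"ref": "TR1/MMXU1", "attr": "oil_temp_c", "val": "61.5", "t": 1_500_000_000}
raw_pmu = {"idcode": 7, "freq": 50.01, "soc": 1_500_000_300}
readings = [adapt_iec61850(raw_61850), adapt_iec61850(raw_61850), adapt_c37118(raw_pmu)]
accepted, rejected = ingest(readings, now)
```

Keeping protocol handling in the adapters and quality rules in the ingestion step means a new sensor type only requires a new adapter, while the downstream consumers continue to see one record format.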

7 Concluding remarks

The WP4 report defines and describes the future architecture for asset management in Statnett. The TOGAF methodology has been used for assessing, analyzing and documenting the architecture. The architecture has been described using multiple layers and viewpoints of the ArchiMate 3.0 modelling language, including strategy and motivation, the application layer and the technology layer.

The WP4 architecture is based on conclusions made in the Finbeck project and is anchored in the long-term Statnett Reference Model for Big Data and Analytics. The report defines and describes a number of capabilities that are required from such a platform. Although the focus in WP4 was on the Big Data and Analytics architecture, the asset management solution itself is a hybrid solution that combines the Big Data and Analytics platform with functionality implemented in several existing internal systems as well as new components.

Although a Big Data and Analytics platform is being introduced within the AutoDig 2.0 project at Statnett, this platform does not yet cover all future needs. Several areas need to be explored: cloud integration; advanced PaaS and SaaS cloud services offering advanced AI services such as natural language comprehension; data exchange APIs and gateways with third parties; and improved infrastructure for ingestion of sensor data.

8 References

[1] Statnett, "Status and further work - Results from WP1 in the SAMBA project," Statnett, Oslo, 2016.

[2] Statnett, "Use case collection - SAMBA WP2 and WP3 report," Statnett, Oslo, 2018.

[3] Statnett, "Risk monitoring in Statnett - SAMBA WP6 report," Statnett, Oslo, 2018.

[4] Wikipedia, "The Open Group Architecture Framework," [Online]. Available: https://en.wikipedia.org/wiki/The_Open_Group_Architecture_Framework. [Accessed 22 December 2017].

[5] NimbleMind, "ArchiMate 3.0 – a modern modeling language for digital age," [Online]. Available: http://www.nimblemind.no/2017/09/05/archimate-3-0-a-modern/. [Accessed 22 December 2017].

[6] Smart Grids Coordination Group, "Reference Architecture for the Smart Grid," CEN/CENELEC/ETSI, 2012.

[7] M. Turck, "Matt Turck," [Online]. Available: http://mattturck.com/bigdata2017/. [Accessed 22 December 2017].

[8] NimbleMind, "Big Data - quick overview," [Online]. Available: http://www.nimblemind.no/2016/09/21/big-data-quick-overview/. [Accessed 22 December 2017].

[9] Hortonworks, "Hortonworks Dataflow," [Online]. Available: https://hortonworks.com/products/data-platforms/hdf/. [Accessed 29 December 2017].

[10] Wikipedia, "SAP Hana," [Online]. Available: https://en.wikipedia.org/wiki/SAP_HANA. [Accessed 22 December 2017].

[11] SAP, "SAP Vora," [Online]. Available: https://www.sap.com/products/hana-vora-hadoop.html. [Accessed 22 December 2017].

[12] Wikipedia, "OSIsoft," [Online]. Available: https://en.wikipedia.org/wiki/OSIsoft. [Accessed 22 December 2017].

[13] Ubuntu/Canonical, "Ubuntu," [Online]. Available: https://insights.ubuntu.com/wp-content/uploads/HadoopBuyersGuide_sm.pdf. [Accessed 22 December 2017].

[14] IBM, "Watson Data Platform," [Online]. Available: https://www.ibm.com/analytics/us/en/watson-data-platform/. [Accessed 26 February 2018].

[15] GE, "Predix," [Online]. Available: https://www.ge.com/digital/predix. [Accessed 22 December 2017].

[16] Engerati, "IBM Insights Foundation for Energy," [Online]. Available: https://www.engerati.com/sites/default/files/Day2-1640-Etienne%2520Pelletier-IBM.compressed.pdf. [Accessed 22 December 2017].

[17] ABB, "ABB launches next-generation asset management solution to improve efficiency and optimize costs," [Online]. Available: http://www.abb.com/cawp/seitp202/ed4ee9084a2f169fc12580ba0039beaa.aspx. [Accessed 22 December 2017].

[18] Sintef, "NEF Teknisk Møte 2014," 2014. [Online]. Available: http://www.sintef.no/projectweb/nef-tm/presentasjoner/. [Accessed 28 December 2017].

[19] Hortonworks, "Maximize the value of data-at-rest to deliver Big Data Analytics," [Online]. Available: https://hortonworks.com/products/data-platforms/hdp/. [Accessed 28 December 2017].

[20] IBM, "What's the big deal about Big SQL?," [Online]. Available: https://www.ibm.com/developerworks/library/bd-bigsql/index.html. [Accessed 28 December 2017].

[21] IBM, "IBM Streams," [Online]. Available: https://www.ibm.com/support/knowledgecenter/en/SSCRJU/SSCRJU_welcome.html. [Accessed 28 December 2017].

[22] Tableau, "2017 Gartner Magic Quadrant," [Online]. Available: https://www.tableau.com/resource/2017-gartner-magic-quadrant. [Accessed 28 December 2017].

[23] IBM, "SPSS statistical software," [Online]. Available: https://www.ibm.com/analytics/data-science/predictive-analytics/spss-statistical-software. [Accessed 28 December 2017].

[24] IBM, "About IBM SPSS Modeler," [Online]. Available: https://www.ibm.com/support/knowledgecenter/en/SS3RA7_18.1.1/modeler_mainhelp_client_ddita/clementine/entities/clem_family_overview.html. [Accessed 28 December 2017].

[25] Esri, "Geoanalytics Server," [Online]. Available: https://www.esri.com/arcgis/products/geoanalytics-server.

[26] Esri, "What is ArcGIS GeoAnalytics Server?," [Online]. Available: http://server.arcgis.com/en/server/latest/get-started/windows/what-is-arcgis-geoanalytics-server-.htm. [Accessed 22 December 2017].

[27] Esri, "ArcGIS GeoEvent Server," [Online]. Available: http://www.esri.com/arcgis/products/geoevent-server. [Accessed 22 December 2017].

[28] Esri, "GeoEvent Server," [Online]. Available: https://server.arcgis.com/en/geoevent/. [Accessed 22 December 2017].

[29] Esri, "ArcGIS Image Server," [Online]. Available: https://www.esri.com/arcgis/products/image-server. [Accessed 22 December 2017].

[30] Esri, "What is ArcGIS Image Server?," [Online]. Available: http://server.arcgis.com/en/server/latest/get-started/windows/what-is-arcgis-image-server-.htm. [Accessed 22 December 2017].

[31] Esri, "Insights for ArcGIS," [Online]. Available: http://www.esri.com/products/arcgis-capabilities/insights. [Accessed 22 December 2017].

[32] Esri, "Insights for ArcGIS," [Online]. Available: https://server.arcgis.com/en/insights/. [Accessed 22 December 2017].

[33] eSmart Systems, "Strategiske innspill på hvordan ny teknologi kan brukes til smartere anleggsforvaltning," Statnett, Halden, 2016.

[34] Statnett, "Finbeck – Roadmap for IKT-arkitektur, fremtidig analyseplattform - Sluttraport fase 1," Statnett, Oslo, 2018.

[35] Linux Foundation, "Janus Graph," [Online]. Available: http://janusgraph.org/. [Accessed 29 December 2017].

[36] The Open TSDB, [Online]. Available: http://opentsdb.net/. [Accessed 29 December 2017].

[37] IBM, "Cognos Analytics," [Online]. Available: https://www.ibm.com/products/cognos-analytics. [Accessed 29 December 2017].

[38] IBM, "IBM Infosphere Information Governance Catalog," [Online]. Available: https://www.ibm.com/us-en/marketplace/information-governance-catalog. [Accessed 29 December 2017].

[39] IBM, "Overview of IBM BigInsights Big R," [Online]. Available: https://www.ibm.com/support/knowledgecenter/. [Accessed 29 December 2017].

V1 EA Diagrams

8.1 Strategy and Motivation


Statnett SF

Nydalen allé 33, Oslo

PB 4904 Nydalen, 0423 Oslo

Phone: 23 90 30 00

Fax: 23 90 30 01

E-mail: [email protected]

Website: www.statnett.no