Virtual Data Container (VDC)
The DITAS Project
• What is it trying to solve?
  • Current applications are eager to acquire and consume ever-larger amounts of data
  • Data is coming from distributed, heterogeneous devices (IoT and mobile applications)
  • Need to deal with data in an effective, fast, agile and secure manner
• What does it offer?
  • Optimal combination of edge nodes and cloud services
  • Abstraction from data management
  • Tools that enable developers to easily create and deploy data-intensive applications
DITAS Toolset
• DITAS SDK
  • simplifies the development of data-intensive applications, hiding the complexity of the underlying infrastructure
  • provides a homogeneous view of a set of resources that could belong to both cloud and fog platforms
• Virtual Data Container (VDC)
  • provides an abstraction layer for developers, so that they can focus only on data, what they want to use and why, forgetting about implementation details
• DITAS Execution Environment
  • based on a powerful execution engine, capable of managing a distributed architecture and taking care of data and computation movement, maintaining coordination with other resources involved in the same application
Example of a Cloud-Deployed Application
Virtual Data Container (VDC)
[Figure: VDC architecture diagram. Labels: Data-Intensive Application, VDC instance, Exposed API (CAF), Data Processing, Data Access Layer, Data Source]
Exposed API (CAF)
• The application knows only the Common Accessibility Framework (CAF), which hides the complexity behind the VDC
• The programming model is REST-oriented
• The data owner provides the API, which contains a set of well-described methods that make available some of the data in the data sources to which the VDC is connected
• This API is specified in the abstract VDC Blueprint and is based on the OpenAPI Specification, originally known as the Swagger Specification
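As a rough illustration of how an application might see the CAF, the sketch below keeps a tiny OpenAPI-style description as a Python dictionary, lists the methods it exposes and builds a request URL. The server URL, the path `/patients/{patient_id}/blood-tests` and the operation id `getBloodTests` are hypothetical; they are not taken from an actual DITAS blueprint.

```python
# Minimal sketch: an OpenAPI-style fragment as a plain dict.
# Every name, path and URL below is a hypothetical placeholder.
caf_spec = {
    "openapi": "3.0.0",
    "info": {"title": "Example VDC CAF", "version": "0.1"},
    "servers": [{"url": "http://vdc.example.org/caf"}],
    "paths": {
        "/patients/{patient_id}/blood-tests": {
            "get": {
                "operationId": "getBloodTests",
                "summary": "Return blood test results for one patient",
            }
        }
    },
}

HTTP_VERBS = {"get", "put", "post", "delete", "patch", "head", "options"}


def list_methods(spec):
    """Return (HTTP verb, path, operationId) for every operation in the spec."""
    methods = []
    for path, item in spec["paths"].items():
        for verb, operation in item.items():
            if verb in HTTP_VERBS:
                methods.append((verb.upper(), path, operation["operationId"]))
    return methods


def build_url(spec, path, **params):
    """Fill the path parameters and prepend the first server URL."""
    return spec["servers"][0]["url"] + path.format(**params)
```

The application only needs this description to discover what it can call; nothing about the data sources behind the VDC appears in it.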
Data Processing (1/4)
• Principles
  • VDC embeds a set of data processing techniques able to transform data (e.g., encryption, compression)
  • VDC allows these processing techniques to be composed into pipelines
  • Node-RED programming model
• Adopted Approach
  • From a monolithic application architecture to a microservices architecture
  • Each microservice can be described as a module that:
    • serves a specific business goal
    • uses a simple language with a well-defined API to communicate with other (micro)services
  • An application is converted into a collection of microservices
  • Serverless or Function as a Service (FaaS) architectures
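The idea of composing processing techniques into a pipeline can be sketched in a few lines of Python. This is only an illustration of the composition principle, not the VDC's implementation: the step names are hypothetical, and Base64 encoding stands in for an encryption step (it is not encryption).

```python
import base64
import json
import zlib
from functools import reduce


def compress(data: bytes) -> bytes:
    """Compression step."""
    return zlib.compress(data)


def encode(data: bytes) -> bytes:
    """Stand-in for an encryption step; Base64 is NOT encryption."""
    return base64.b64encode(data)


def compose(*steps):
    """Chain steps left to right into a single callable."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)


# Forward pipeline and its inverse (steps undone in reverse order).
pipeline = compose(compress, encode)
inverse = compose(base64.b64decode, zlib.decompress)

record = json.dumps({"patient": "p-001", "glucose": 5.4}).encode("utf-8")
processed = pipeline(record)
```

Because each step has the same bytes-in, bytes-out shape, steps can be added, removed or reordered without touching the others, which is the property a microservice or FaaS decomposition aims for.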
Data Processing (2/4)
Similar approach: Amazon Web Services (AWS) Step Functions
• Description: “web service that enables you to coordinate the components of distributed applications and microservices using visual workflows”
• Goal: focus only on the business logic rather than spending time on coding, testing or debugging
• Benefits:
Data Processing (3/4)
Node-RED therefore comes into the equation to:
• visualize the VDC structure as a flow of separate, distinct tasks that follow a logical sequence
• break a task into a logical flow of steps (nodes) with specific functions
• offer an abstract layer for building a solution
• provide a very modular way of implementing this solution
Data Processing (4/4)
Node-RED Implementation Details
• Each task could be implemented via one or more flows
  • Alternative paths
• Each flow could consist of one or more nodes such as:
  • REST calls to components implemented as a service
  • JavaScript snippets (function nodes) to perform data filtering, refinement, transformation etc.
  • control nodes (e.g. switch) that direct the flow to different paths
• Each component should be exposed as a web service and be available through a RESTful API
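The node types listed above can be mimicked in plain Python to show how a message moves through a flow. This is a conceptual sketch only: real Node-RED function nodes are written in JavaScript, and the field names, the 7.0 threshold and the path names here are hypothetical.

```python
def filter_node(msg):
    """Function node: keep only the fields the consumer is allowed to see."""
    allowed = {"patient_id", "glucose"}
    payload = {k: v for k, v in msg["payload"].items() if k in allowed}
    return {**msg, "payload": payload}


def switch_node(msg):
    """Control node: pick the name of the next path from the payload."""
    return "alert" if msg["payload"]["glucose"] > 7.0 else "normal"


def alert_path(msg):
    """Alternative path taken for out-of-range values."""
    return {**msg, "topic": "alert"}


def normal_path(msg):
    """Default path."""
    return {**msg, "topic": "normal"}


PATHS = {"alert": alert_path, "normal": normal_path}


def run_flow(msg):
    """Run the message through the filter, then along the selected path."""
    msg = filter_node(msg)
    return PATHS[switch_node(msg)](msg)


result = run_flow({"payload": {"patient_id": "p-001", "glucose": 8.2, "name": "Jane"}})
```

In an actual flow, a step such as `filter_node` could equally be a REST call to a component exposed as a web service; the message-passing shape stays the same.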
Abstract VDC Blueprint (1/2)
• The VDC descriptor, written in JSON format
• Captures all the properties of the VDC
• Consists of 5 distinct sections
Abstract VDC Blueprint (2/2)
Internal Structure (section 1)
High-level textual description of the VDC to characterize it as a product, focusing on business characteristics
Data Management (section 2)
Specifies the attributes of the methods offered by the VDC and, for each method, the guaranteed levels of data quality, security and privacy
Abstract Properties (section 3)
Contains all the rules in the form of goal trees to construct the Service Level Agreement (SLA) contract between the data owner and the data consumer
Cookbook Appendix (section 4)
Describes the deployment information for a VDC
Exposed API (section 5)
Technical section to enable the application developer to fully understand how the VDC exposed methods work
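To make the five-section layout concrete, here is a minimal Python sketch that builds an empty blueprint skeleton as JSON and checks that all sections are present. The key names simply mirror the section titles above; their exact spelling in a real blueprint, and the `Overview` field, are assumptions.

```python
import json

# Section keys mirror the five section titles; the exact key spelling
# used in a real blueprint is an assumption.
SECTIONS = [
    "INTERNAL_STRUCTURE",
    "DATA_MANAGEMENT",
    "ABSTRACT_PROPERTIES",
    "COOKBOOK_APPENDIX",
    "EXPOSED_API",
]


def empty_blueprint():
    """Return a skeleton with one empty object per section."""
    return {name: {} for name in SECTIONS}


def missing_sections(blueprint):
    """List the section keys that the given blueprint lacks."""
    return [name for name in SECTIONS if name not in blueprint]


blueprint = empty_blueprint()
# Hypothetical field, for illustration only.
blueprint["INTERNAL_STRUCTURE"]["Overview"] = {"description": "Example VDC"}
text = json.dumps(blueprint, indent=2)
```

A check like `missing_sections` is the kind of lightweight validation a tool could run before a blueprint is accepted.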
E-Health DITAS Use Case – Exposed API (CAF)
E-Health DITAS Use Case – Data Processing (Node-RED Flow)