Highlights of the Telecommunications Event Data Analytics toolkit

Upload: lisanl

Post on 22-Feb-2017

Page 1: Highlights of the Telecommunications Event Data Analytics toolkit

© 2016 IBM Corporation

Highlights of the Telecommunications Event Data Analytics toolkit

IBM Streams Version 4.2

Paul Zollna

Senior Software Developer and TEDA Architect

For questions about this presentation contact Paul Zollna

[email protected]

Page 2: Highlights of the Telecommunications Event Data Analytics toolkit

Important Disclaimer

THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY.

WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED.

IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE.

IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.

NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:

• CREATING ANY WARRANTY OR REPRESENTATION FROM IBM (OR ITS AFFILIATES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS); OR

• ALTERING THE TERMS AND CONDITIONS OF THE APPLICABLE LICENSE AGREEMENT GOVERNING THE USE OF IBM SOFTWARE.

IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.

Page 3: Highlights of the Telecommunications Event Data Analytics toolkit

Agenda

Highlights of the com.ibm.streams.teda application framework (TEDA)

What’s new about the operators and functions

Tutorial & references

DEMO: TEDA & the Secure Application Configuration

DEMO: TEDA & Plug-In for External Applications

Page 4: Highlights of the Telecommunications Event Data Analytics toolkit

Highlights of the application framework

New Context Container composite operator

– The new <namespace>.context.custom::ContextContainer operator lets you implement multi-level context logic or contexts with different algorithms

Improved configuration of the Lookup Manager application

– Simplified XML description format

– More flexible referencing of the database configuration

– The database source can now delete lookup data

– The database settings are configurable with Secure Application Configuration

Integration of the partitioned BloomFilter feature

– The configuration, functions, and output statements of the partitioned BloomFilter are fully integrated in the ITE application framework

Enhanced handling of CSV files with enrichment data

– Configurable handling of header lines and empty lines

– Configurable handling of quoted attribute values

– Configurable separator, delimiter, and end-of-line marker

New shared memory segment naming

– The unique segment naming simplifies sharing of host resources

New tuple data export plug-in interface

– External applications can import the tuple data from the ITE application without side effects on the performance of the ITE application

Page 5: Highlights of the Telecommunications Event Data Analytics toolkit

What’s new about the operators and functions

New DirectoryWatch operator

– The new DirectoryWatch operator uses the system's inotify functionality to watch directories and report file changes, using less CPU than the standard spl.adapter::DirectoryScan operator

Enhanced error reporting in the CSVParse operator

– The CSVParse operator provides new custom output functions that return error descriptions when parsing fails

– On failure, detailed fault information about the position in the record is provided

New functions in the com.ibm.streams.teda.file and com.ibm.streams.teda.file.path namespaces

– symlink - creates symbolic links in the file system.

– space - determines the total, free, and available disk space capacity for a mounted file system.

– dirname - extracts the part of the provided path that specifies the parent directory.

– filename - extracts the part of the provided path that specifies the file name.

– stem - extracts the part of the provided path that specifies the file name without the extension.

– extension - extracts the part of the provided path that specifies the extension of the file name.
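For readers who want a feel for what the path and file helpers listed above return, the following is a rough Python analogue built only on the standard library. It is not the TEDA API: the SPL functions live in com.ibm.streams.teda.file and com.ibm.streams.teda.file.path, and their exact signatures and edge-case behavior (for example, whether extension includes the leading dot) should be taken from the toolkit reference. The helper names describe_path and disk_space and the sample file name are made up for this sketch.

```python
import os
import tempfile
from pathlib import PurePosixPath


def describe_path(path: str) -> dict:
    """Split a path into the four parts named on the slide (Python analogue only)."""
    p = PurePosixPath(path)
    return {
        "dirname": str(p.parent),  # parent directory
        "filename": p.name,        # file name including the extension
        "stem": p.stem,            # file name without the last extension
        "extension": p.suffix,     # last extension; Python keeps the leading dot
    }


def disk_space(mount_point: str) -> dict:
    """Total, free, and available bytes of the file system holding mount_point."""
    st = os.statvfs(mount_point)  # POSIX only
    return {
        "total": st.f_blocks * st.f_frsize,
        "free": st.f_bfree * st.f_frsize,        # free blocks, including reserved ones
        "available": st.f_bavail * st.f_frsize,  # free blocks for unprivileged users
    }


if __name__ == "__main__":
    # Hypothetical input file name, not taken from the presentation.
    print(describe_path("/data/in/cdr_20170222.csv"))
    print(disk_space("/"))

    # Symbolic link creation, comparable in spirit to the symlink function.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "cdr_20170222.csv")
        link = os.path.join(tmp, "latest.csv")
        open(target, "w").close()
        os.symlink(target, link)
        print(os.path.islink(link))
```

The distinction between free and available space in disk_space mirrors the wording of the space bullet; whether the TEDA function reports the same three quantities in the same way is an assumption here.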

Page 6: Highlights of the Telecommunications Event Data Analytics toolkit

Additional Resources

Reference in IBM Knowledge Center - IBM Streams 4.2 com.ibm.streams.teda:

– http://www.ibm.com/support/knowledgecenter/SSCRJU_4.2.0/com.ibm.streams.toolkits.doc/spldoc/dita/tk$com.ibm.streams.teda/tk$com.ibm.streams.teda.html

TEDA Tutorial for versions 1.0.2 & 2.0.0:

– http://ibmstreams.github.io/streamsx.tutorial.teda

A TEDA demoapp sample on https://demo.ibmcloud.com

– Available for demos on request

An Introduction to Streaming Telecommunications Event Data Analytics:

– https://developer.ibm.com/streamsdev/docs/introduction-streaming-telecommunications-event-data-analytics-teda

Getting Started with Streaming Telecommunications Event Data Analytics:

– https://developer.ibm.com/streamsdev/docs/getting-started-streaming-telecommunications-event-data-analytics-teda

Page 7: Highlights of the Telecommunications Event Data Analytics toolkit

TEDA & Application Configuration

The Secure Application Configuration feature

The database as the source that provides the enrichment data

Demo: Lookup Manager application using Streams Console to configure database credentials

Page 8: Highlights of the Telecommunications Event Data Analytics toolkit

Secure Application Configuration

Application-specific set of properties in secure storage

API implemented in SPL, C++ and Java

Based on JMX communication

Implemented in Streams Console

JMX API to manage the Secure Application Configuration store on the instance or domain level

Page 9: Highlights of the Telecommunications Event Data Analytics toolkit

TEDA framework with database configuration

Sensitive database credentials stored in secure storage

– The configuration files do not include credentials in plain text

• Lookup Manager configuration file for default settings: config.cfg

• The database configuration file is not involved: connections.xml

Changeable configuration at runtime

– Update the password without canceling and resubmitting the application job

Page 10: Highlights of the Telecommunications Event Data Analytics toolkit

Demo

Specify the database properties in the Secure Application Configuration

Specify the Application Configuration in the TEDA framework

Load the enrichment data from the database with the Lookup Manager application and process files in the ITE application

Page 11: Highlights of the Telecommunications Event Data Analytics toolkit

Details

Properties for Secure Application Configuration

– lm.db.name: DEMOAPP

– lm.db.user: db2inst1

– lm.db.password: <password>

Configuration of the Lookup Manager application

– lm.applicationConfiguration=MyApplConfig

– lm.db.connectionName=DEMOAPP

– lm.db=on

– lm.file=off (optional)

Page 12: Highlights of the Telecommunications Event Data Analytics toolkit

Additional Resources

IBM Knowledge Center reference:

– http://www.ibm.com/support/knowledgecenter/SSCRJU_4.2.0/com.ibm.streams.toolkits.doc/spldoc/dita/tk$com.ibm.streams.teda/tk$com.ibm.streams.teda$101.html

TEDA Tutorial in Module 11:

– http://ibmstreams.github.io/streamsx.tutorial.teda/docs/2.0.0/Module-11

Page 13: Highlights of the Telecommunications Event Data Analytics toolkit

Plug-In for TEDA & External Applications

Plug-in interfaces in the ITE application

Demo: Export of deduplication data to an external application using the plug-in interface in the ITE application

Page 14: Highlights of the Telecommunications Event Data Analytics toolkit

TEDA toolkit Plug-in

The TEDA application framework provides plug-in interfaces to export tuple data to an external application

One or more external applications can connect to 4 different plug-in points:

– Reader

– Transformer

– Writer

– Dedup

Poor performance of an external application does not affect the performance of the ITE file processing

No back pressure from external applications

The connections are monitored by metrics

Page 15: Highlights of the Telecommunications Event Data Analytics toolkit

Plug-in interfaces in the ITE application

[Diagram: block diagram of the ITE application. Labeled components include Statistics, Control, IngestFiles, ChainDirScan, FileType Validator, ApplCtrl Scheduler, LogWriter, Filename Dedup, ChainProcessorReader, ChainProcessorTransformer, ChainSink, ChainControl, Context, Dedup, and Checkpoint Control. Taps mark the four plug-in points: reader, transformer, writer, and dedup. The legend distinguishes Common, Custom, optional Custom, and Common or Custom components, plus Variant B and Variant C.]

Page 16: Highlights of the Telecommunications Event Data Analytics toolkit

Demo

Specify the ITE configuration to export deduplication data

Specify the parameter of the Export operator to import the data from the ITE application

Connect a 'fast' and a 'slow' importer to the ITE application and compare the performance of both jobs

Page 17: Highlights of the Telecommunications Event Data Analytics toolkit

Plug-in interface in ITE application framework

The ITE application framework provides 4 plug-in configurations

– The ITE application provides 4 unique export properties

– The <namespace>.streams::TypesCommon composite provides the exported stream schema specification for each plug-in configuration

New congestionPolicy parameter in spl.adapter::Export operator

– Specifies the congestion policy of the stream that is exported

– Applicable values:

• dropConnection: The connection is dropped when a downstream importer is not keeping up. The nBrokenConnections metric indicates the connection drop count at the output port.

• wait: The output port causes back pressure when congested.

Value | Export property | Exported SPL Schema
reader | ite="<namespace>.chainprocessor.reader_output_RecordValidator" | TypesCommon.ReaderOutStreamType
transformer | ite="<namespace>.chainprocessor.transformer_output_DataProcessor" | TypesCommon.TransformerOutType
writer | ite="<namespace>.chainsink_input_Writer" | TypesCommon.ChainSinkStreamType
dedup | ite="<namespace>.context_output_Dedup" | TypesCommon.TransformerOutType
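The difference between the two congestionPolicy values described on this slide can be illustrated with a toy simulation. The Python sketch below is not Streams code and makes no claim about how the Export operator is implemented: the tick-based model, the buffer capacity, the function name simulate, and the assumption that a dropped importer reconnects immediately with an empty buffer are all invented for illustration. The broken_connections counter merely stands in for the nBrokenConnections metric.

```python
from collections import deque


def simulate(policy: str, ticks: int, consume_every: int, capacity: int = 4) -> dict:
    """Toy model of one exported stream feeding one importer.

    The producer tries to emit one tuple per tick; the importer takes one
    tuple every `consume_every` ticks. All numbers are illustrative.
    """
    if policy not in ("wait", "dropConnection"):
        raise ValueError(f"unknown policy: {policy}")

    buffer = deque()
    emitted = 0             # tuples the producer managed to send
    stalled_ticks = 0       # ticks the producer spent blocked (back pressure)
    broken_connections = 0  # stands in for the nBrokenConnections metric

    for tick in range(1, ticks + 1):
        # Importer side: drain one tuple on its (possibly slow) schedule.
        if tick % consume_every == 0 and buffer:
            buffer.popleft()

        # Producer side: react to congestion according to the policy.
        if len(buffer) >= capacity:
            if policy == "wait":
                stalled_ticks += 1
                continue  # producer is blocked for this tick
            # dropConnection: discard the connection and its queued tuples.
            # Assumption: the importer reconnects at once with an empty buffer.
            buffer.clear()
            broken_connections += 1

        buffer.append(tick)
        emitted += 1

    return {
        "emitted": emitted,
        "stalled_ticks": stalled_ticks,
        "broken_connections": broken_connections,
    }


if __name__ == "__main__":
    for policy in ("wait", "dropConnection"):
        print(policy, "fast importer:", simulate(policy, ticks=20, consume_every=1))
        print(policy, "slow importer:", simulate(policy, ticks=20, consume_every=5))
```

In this model a fast importer never triggers either policy. With a slow importer, wait throttles the producer to the importer's pace, while dropConnection lets the producer keep its full rate at the cost of broken connections and lost tuples on the importer side, which matches the slide's claim that a slow external application need not affect ITE file processing.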

Page 18: Highlights of the Telecommunications Event Data Analytics toolkit

Details

Importer settings

– Import of ITE stream schema types

• use demoapp.streams::*;

– Output stream type of the Importer

• stream<TypesCommon.TransformerOutType> In = Import()

– Set subscription

• param subscription : ite=="demoapp.context_output_Dedup";

Configuration of the ITE application

– Specify the list of exporters, here Dedup only

• ite.export.streams=dedup

Page 19: Highlights of the Telecommunications Event Data Analytics toolkit

Additional Resources

IBM Knowledge Center reference

– Description of the ite.export.streams configuration parameter:

• http://www.ibm.com/support/knowledgecenter/SSCRJU_4.2.0/com.ibm.streams.toolkits.doc/spldoc/dita/tk$com.ibm.streams.teda/tk$com.ibm.streams.teda$184.html

– Description of the congestionPolicy parameter in the spl.adapter::Export operator:

• http://www.ibm.com/support/knowledgecenter/en/SSCRJU_4.2.0/com.ibm.streams.toolkits.doc/spldoc/dita/tk$spl/op$spl.adapter$Export.html#spldoc_operator__parameter__congestionPolicy

TEDA Tutorial in Module 12:

– http://ibmstreams.github.io/streamsx.tutorial.teda/docs/2.0.0/Module-12

Page 20: Highlights of the Telecommunications Event Data Analytics toolkit

Thank YOU!!!