BY FRANK LEYMANN
EVERYTHING YOU NEED TO KNOW ABOUT THE WSO2 ANALYTICS PLATFORM
BY SAJITH RAVINDRA, SENIOR SOFTWARE ENGINEER
TABLE OF CONTENTS
1. Introduction
2. A Comprehensive Analytics Platform to Meet Enterprise Requirements
2.1 Batch and Real-Time
2.2 Predictive
2.3 Interactive
3. Supporting The Key Stages of Analytics Needs
3.1 Data Collection
3.2 Data Analysis and Result Streams
3.3 Visualizing and KPI Monitoring
4. Analytics for WSO2 Products via The Analytics Platform
4.1 API Analytics for WSO2 API Manager
4.2 Security Analytics for WSO2 Identity Server
4.3 IoT Analytics for WSO2 Enterprise Mobility Manager and WSO2 IoT Server
4.4 Enterprise Integration Analytics for WSO2 ESB
5. Using WSO2 DAS to Analyze Data from Your System
6. The WSO2 Platform Advantage
6.1 Comprehensive
6.2 Complete
6.3 Simple
6.4 Incremental
6.5 Pluggable
6.6 Agile
6.7 Clean Design
7. Conclusion
EVERYTHING YOU NEED TO KNOW ABOUT THE WSO2 ANALYTICS PLATFORM ©2016 WSO2
1. INTRODUCTION
Let’s assume John is the CEO of a large enterprise. As with any modern enterprise, John’s
organization generates a significant amount of data that could translate into useful
information about the business, information that may not be apparent at first. Converting this
data into useful information will help John’s company be more productive and compete better.
To enable this, John hires Mark, a data scientist. Mark’s primary role is to collect, clean, and
analyze this data and present the results to John and other stakeholders to help them make
better business decisions.
Data can be produced by multiple sources in varied formats. Mark has to capture this
data, store it in a single place, and prepare it for analysis. Mark can use batch
analytics to combine, compare, and contrast different data sets and
produce summarized data to extract relevant information that will be useful to the CEO.
Mark will also need to ensure that John and his colleagues are up to date on live events and
current trends in the market so they can make timely and proactive business decisions.
For this, Mark needs to analyze incoming data as it becomes available; he can use
real-time analytics to instantly notify John by e-mail or SMS, or to provide updates on
a dashboard so this information can be accessed on a mobile device or laptop.
Mark will also need to extract certain information on a regular basis and present it
as interactive visuals so that John and other stakeholders can gain better
insights by interactively drilling down into the data. John’s organization can stay
one step ahead if it can foresee the future. To enable this, Mark can leverage predictive
analysis; he can use current and historical data to create and fine-tune machine learning
models for this purpose.
2. A COMPREHENSIVE ANALYTICS PLATFORM TO MEET ENTERPRISE REQUIREMENTS
To understand what Mark produces by analyzing data, John needs dashboards. A
dashboard will provide John with a concise view of the business by showing performance,
real-time alerts, and forecasts. John will need to access this information from anywhere, at
any time.
The WSO2 analytics platform is an ideal solution that would meet all of John’s enterprise
requirements. WSO2’s open source analytics platform is seamlessly integrated, combining
batch, interactive, real-time, and predictive analytics. It allows an organization to:
1. Collect data from multiple sources in different formats
2. Analyze data using various techniques
3. Communicate results effectively
The analytics platform allows users to extract data from any data source that could be in
any format. It also has a pluggable datastore architecture to allow organizations to use
datastores that match their needs.
2.1 BATCH AND REAL-TIME
The batch analytics capabilities of the WSO2 analytics platform enable the use of different
operations and techniques to summarize and aggregate data collected over a
period of time in order to derive a broader view of the data. The platform can also
analyze data on the move, in real time. Real-time analytics makes it possible to correlate data
from multiple sources to detect patterns, anomalies, etc. and to generate alerts or visualize
results through views such as dashboards. In real-time analytics, users are interested in the
results as and when they become available; examples include traffic monitoring, smart order
routing, compliance monitoring, and fraud detection.
2.2 PREDICTIVE
In addition to generating instant updates in real time and analyzing data in batch mode,
it is useful to be able to predict what the future holds based on past performance.
The WSO2 analytics platform comes with predictive analytics capabilities that enable an
organization to identify problems based on past data and make predictions about the
future. The results produced by real-time and batch analytics can be used to create machine
learning models, which are then fine-tuned to predict future events, e.g. predicting the next
value in a stream of sensor readings by learning from past values, or
detecting whether a given e-mail is spam by learning from past classifications. It should be
noted, however, that this mode of analytics requires expertise in creating correct models
and in tuning algorithms for accurate and efficient predictions.
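As a rough illustration of predicting the next value of a stream of sensor readings from past values, the sketch below fits a straight line to recent readings by least squares and extrapolates one step ahead. It is plain Python rather than WSO2 ML; the function names and the sample readings are hypothetical, and a linear trend is only one simple modeling assumption.

```python
from statistics import mean


def fit_line(values):
    """Least-squares fit of value = slope * index + intercept."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two readings to fit a line")
    xs = list(range(n))
    x_mean = mean(xs)
    y_mean = mean(values)
    # Sums of squared deviations and cross-deviations around the means.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def predict_next(values):
    """Extrapolate the fitted line one step past the last reading."""
    slope, intercept = fit_line(values)
    return slope * len(values) + intercept


# Hypothetical temperature readings that rise by 0.5 per step.
readings = [20.0, 20.5, 21.0, 21.5, 22.0]
next_value = predict_next(readings)
```

A model this simple only captures a steady trend; noisy or seasonal data would need a different model and the tuning effort mentioned above.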
2.3 INTERACTIVE
Another form of analytics offered by the WSO2 analytics platform is interactive analytics,
where users can ‘interactively’ analyze data that was collected and processed over a period
of time using queries. In a typical interactive analytics scenario, users first need to see
the data in context and then drill down into details to get a better understanding of the
situation. For example, an anomaly in a series of credit card transactions may be detected
using real-time analytics or by looking at dashboards; interactive analytics can then be used
to dig deeper and verify whether it is an act of fraud, by obtaining information such as other
transactions carried out around the same time, historical transactions, etc.
The WSO2 analytics platform also provides capabilities for visualizing analyzed information
in multiple ways. The results can be published onto dashboards (examples shown in Figure
1) that contain various types of gadgets. Here, the interest of the stakeholders will be to
monitor key performance indicators (KPIs). Some examples of KPIs include the number of
unique visitors to a website on a given day, the number of customers who make it through
to a purchase in a sales funnel, and per-user data utilization on a network. KPIs can be
monitored in both real-time and batch mode. Real-time monitoring needs
an alerting model, so dashboards may not be the way to go.
Figure 1
With visualization, users can make decisions based on what they see on screen, which
in turn is based on the analyzed data. Those decisions lead to actions, such as changing
process parameters and fine-tuning the process. Some of these actions could also be
automated to some extent.
3. SUPPORTING THE KEY STAGES OF ANALYTICS NEEDS
The WSO2 analytics platform (illustrated in Figure 2) supports the key stages of analytics
requirements: data collection, data analysis, and communication. First, a user needs to
define data streams to describe the data. Thereafter, the user can write SQL-like queries
using Spark SQL and the Siddhi Event Query Language to analyze the streams defined
for publishing events, and/or use machine learning models to make forecasts. Finally, the
outputs can be communicated to the end user as alerts, as visualizations on dashboards, or
through APIs from which users can obtain data.
Figure 2
3.1 DATA COLLECTION
Data can be collected from any data source over various protocols. The WSO2 platform
defines the concept of ‘data agents’ to collect data, as shown in Figure 3.
For WSO2 products, such as WSO2 API Manager and WSO2 Enterprise Service Bus (ESB),
there are pre-built data agents that publish information, such as statistics for service
monitoring, usage monitoring, and message mediation monitoring.
For your own custom data sources, you can implement custom agents with ease using the
APIs provided. Moreover, you can use WSO2 ESB with its 150+ connectors in conjunction with the
Business Activity Monitor (BAM) mediator to collect data feeds from sources such as Twitter
and Facebook.
[Figure 2 diagram labels: Collect Data (systems); Analyze & Make Decisions (batch analytics, real-time analytics, interactive analytics, predictive analytics, queries); Communicate (alerts, visualizations, APIs)]
Figure 3
3.2 DATA ANALYSIS AND RESULT STREAMS
The data agents publish streams of data into WSO2 Data Analytics Server (DAS). WSO2
DAS can persist these data streams in data stores and then analyze the
stored data in batch mode using its analysis engine. A simple Spark query is shown
below:
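The query listing referred to above is not included in this text. As a stand-in, and explicitly not the original query or Spark itself, the sketch below uses Python’s built-in sqlite3 module to show the kind of SQL-like batch summarization described here: events are stored first, then aggregated per API. The table name, column names, and sample rows are assumptions made for illustration.

```python
import sqlite3


def summarize_requests(events):
    """Store (api, response_ms) events, then summarize them per API in one batch query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE api_requests (api TEXT, response_ms INTEGER)")
        conn.executemany("INSERT INTO api_requests VALUES (?, ?)", events)
        cursor = conn.execute(
            "SELECT api, COUNT(*) AS request_count, AVG(response_ms) AS avg_response_ms "
            "FROM api_requests GROUP BY api ORDER BY api"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Hypothetical stored events: (API name, response time in milliseconds).
sample_events = [("orders", 120), ("orders", 80), ("users", 40)]
summary = summarize_requests(sample_events)
```

The same GROUP BY style of aggregation is what a batch script would typically express; only the engine and storage differ.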
WSO2 DAS can also act on incoming data streams as and when the data is received (in real
time), without storing it. In addition, WSO2 DAS allows users to drill down into collected data
interactively using queries. A sample Siddhi query to recognize a pattern in a stream of
events is shown below:
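The Siddhi listing referred to above is likewise not included in this text. The sketch below is not Siddhi; it is a plain-Python illustration of what recognizing a pattern in an event stream can mean: a small transaction followed by a large one on the same card within a time window. The event fields, thresholds, and function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    card: str
    amount: float
    ts: int  # event time in seconds


def detect_small_then_large(events, small=10.0, large=1000.0, within=60):
    """Return (card, small_ts, large_ts) for each small txn followed by a large one in time."""
    last_small = {}  # card -> timestamp of the most recent small transaction
    matches = []
    for event in events:  # events are assumed to arrive ordered by timestamp
        if event.amount <= small:
            last_small[event.card] = event.ts
        elif event.amount >= large:
            small_ts = last_small.get(event.card)
            if small_ts is not None and event.ts - small_ts <= within:
                matches.append((event.card, small_ts, event.ts))
                del last_small[event.card]  # each small txn matches at most once
    return matches


stream = [
    Txn("A", 5.0, 0),
    Txn("B", 2000.0, 10),   # no preceding small txn on card B
    Txn("A", 1500.0, 30),   # matches the small txn at ts=0
    Txn("A", 3.0, 100),
    Txn("A", 1200.0, 200),  # too late: 100 seconds after the small txn
]
alerts = detect_small_then_large(stream)
```

Events are processed one at a time and only a small amount of state is kept per card, which mirrors the idea of acting on data as it arrives without storing it.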
[Figure 3 diagram labels: WSO2 products (BAM Mediator, WSO2 Data Publisher, Log Publisher); third-party Java systems (JMX Publisher, JMX, custom WSO2 Data Publisher); third-party systems; WSO2 ESB connectors; event receivers; transports: SOAP, HTTP, JMS, SMTP (e-mail), SMS, Thrift, Kafka, WebSocket, MQTT; formats: XML / JSON / Text / Map and Thrift / Binary]
Furthermore, WSO2 Machine Learner (ML) can be used to build and fine-tune machine
learning models by feeding it collected data. The ML wizard helps you create models that
classify data; these models can then be run in WSO2 DAS for predictive
analysis.
WSO2 DAS can generate result streams and store the results and summarized data in
a relational database (RDBMS). Users can listen to these result streams or obtain data from the
database via the provided API to act on the results.
3.3 VISUALIZING AND KPI MONITORING
Summarized data and results can be used by visualization tools, such as WSO2
Dashboard Server (DS), to build dashboards to monitor KPIs. Moreover, WSO2 DAS itself
has a feature to create dashboards and gadgets. In the case of real-time analytics with
WSO2 DAS, in addition to KPI dashboards, what is more interesting is generating alerts when
matching events are detected, such as sending e-mails or SMSs. WSO2 DAS result
streams can also be recorded into storage to support delayed processing rather than
real-time monitoring.
4. ANALYTICS FOR WSO2 PRODUCTS VIA THE ANALYTICS PLATFORM
WSO2 products, such as WSO2 ESB, WSO2 API Manager, and WSO2 Identity Server, have
built-in data agents to publish their data to WSO2 DAS. This data is analyzed using WSO2’s
batch, real-time, predictive, and interactive analytics capabilities to provide valuable
insights into the usage of those servers.
4.1 API ANALYTICS FOR WSO2 API MANAGER
In an API management solution, analytics plays a vital role, as many factors
need to be considered to maintain the high availability and security of APIs. Some of
the key areas in API analytics are API health monitoring, API usage monitoring, and
suspicious activity monitoring. For example, if an API suddenly starts to fail, the origin
IPs of requests change, or the pattern of API resource usage changes abnormally,
the administrators should be alerted so they can take prompt action to handle possible
failures/threats.
The real-time and batch analytics capabilities of the WSO2 analytics platform (Figure 4)
are leveraged to monitor the status of each API in order to generate customizable alerts on
conditions that require attention. The alerts are communicated to responsible parties
as e-mails/SMSs and also displayed in a dashboard so they can be monitored by the system
administrator.
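To make the idea of alerting on API health concrete, the sketch below raises an alert when the failure rate over the most recent requests crosses a threshold. It is a minimal plain-Python illustration, not how WSO2 API Manager analytics is implemented; the class name, window size, and threshold are hypothetical.

```python
from collections import deque


class FailureRateMonitor:
    """Tracks the last `window_size` request outcomes for one API."""

    def __init__(self, window_size=4, threshold=0.5):
        self.window = deque(maxlen=window_size)  # oldest outcomes drop off automatically
        self.threshold = threshold

    def record(self, success):
        """Record one outcome; return True if an alert should be raised."""
        self.window.append(success)
        if len(self.window) < self.window.maxlen:
            return False  # not enough data yet to judge the failure rate
        failures = sum(1 for ok in self.window if not ok)
        return failures / len(self.window) >= self.threshold


# Hypothetical sequence of request outcomes (True = success, False = failure).
monitor = FailureRateMonitor(window_size=4, threshold=0.5)
outcomes = [True, True, False, True, False, False]
alert_flags = [monitor.record(ok) for ok in outcomes]
```

In a real deployment the alert would be routed to e-mail, SMS, or a dashboard rather than returned as a flag.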
Figure 4
4.2 SECURITY ANALYTICS FOR WSO2 IDENTITY SERVER
Security analytics deals with the application of big data analytics to all data related to
identity security to provide meaningful information; this information will help security admins
to further optimize their identity platforms and detect any fraudulent activity at an early
stage.
This ranges from basic statistics and graphs generated through real-time queries and
batch analytics (Figure 5) that summarize session usage, the evolution of login attempts, the
usage of different identity providers to log in to different service providers, etc. to much more
complex, investigative analytics, such as the ability to identify a security breach or
unauthorized access to resources using correlations and pattern detection. The WSO2
analytics platform provides clear views of identity-related analytics that make it possible
to proactively handle security breaches and to ensure optimal resource
utilization and maintenance of the identity platform.
Figure 5
4.3 IOT ANALYTICS FOR WSO2 ENTERPRISE MOBILITY MANAGER AND WSO2 IOT SERVER
The WSO2 analytics platform offers customizable IoT device analytics that include
predictive analytics using machine learning capabilities. It supports edge computing
devices and policy-based edge analytics as well as pre-built instant visualization for sensor
readings using live data streams gathered from devices.
In a typical IoT scenario, a device will send events containing a timestamp, location/
proximity data, and some readings (e.g. temperature, power, etc.). In general, with this data,
we can monitor each device as a single unit as well as look at the devices as part of a larger
system.
IoT analytics (Figure 6) provides the capability to look into the details of each device for
information such as active/inactive status and last update time. System administrators
can get a comprehensive and concise view of all devices; information such as the number
of different device types, how many devices of each type are connected, and the
policy compliance ratio of devices is shown on a dashboard to enable efficient
management of the system. A geo dashboard is provided to monitor the locations of
devices connected to the system, which gives a clear view of the dynamics of the
connected devices.
Figure 6
4.4 ENTERPRISE INTEGRATION ANALYTICS FOR WSO2 ESB
Enterprise integration scenarios involve various message flows that can be long and
complex. Therefore, ad hoc tuning might not be sufficient to find bottlenecks, troubleshoot,
and achieve optimal performance. Enterprise integration analytics allows users to monitor
statistics to identify hotspots in the message flow and to fine-tune configurations. In
addition, it allows the user to obtain overall statistics on the flows to monitor performance.
Users can trace messages through the mediation flow (Figure 7) and find out what the
message content was at each mediator. A dashboard is provided with useful mediation
statistics, such as processing time per mediator and request/response time. Furthermore, the
user is given a view that shows which parts of the mediation flow are invoked most
often, so the user can click on the relevant area, drill down, and get a historical view of the
performance of that section. From there, the user can drill down to find a specific list of
problematic messages and ultimately view a few of them to inspect their contents.
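A statistic such as processing time per mediator can be pictured as a simple aggregation over per-message timing records. The sketch below is a plain-Python illustration under that assumption; the record layout, mediator names, and function names are hypothetical and do not reflect the actual WSO2 ESB statistics format.

```python
from collections import defaultdict


def average_time_per_mediator(records):
    """Average processing time in milliseconds for each mediator name."""
    totals = defaultdict(lambda: [0.0, 0])  # mediator -> [total ms, sample count]
    for mediator, elapsed_ms in records:
        entry = totals[mediator]
        entry[0] += elapsed_ms
        entry[1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


def slowest_mediator(records):
    """Name of the mediator with the highest average processing time (the hotspot)."""
    averages = average_time_per_mediator(records)
    return max(averages, key=averages.get)


# Hypothetical timing records: (mediator name, processing time in ms).
timings = [
    ("log", 2.0),
    ("transform", 30.0),
    ("log", 4.0),
    ("send", 12.0),
    ("transform", 50.0),
]
averages = average_time_per_mediator(timings)
hotspot = slowest_mediator(timings)
```

Drilling down from the hotspot would then mean filtering the underlying records for that mediator and inspecting individual messages.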
Figure 7
5. USING WSO2 DAS TO ANALYZE DATA FROM YOUR SYSTEM
WSO2 DAS is a self-contained product that can be used to perform real-time, batch, and
interactive analytics. You can run machine learning models created using WSO2 ML inside
DAS to perform predictive analytics as well. In addition, dashboards can be created using
DAS to visualize the analyzed data.
Before initiating an analysis, data has to be sent to WSO2 DAS. If you’re using WSO2
products, such as WSO2 ESB and WSO2 API Manager, they have data agents built in to
publish data to DAS. All you need to do is configure the WSO2 server to point to your
DAS instance. If you need to publish events from your own custom data source, you can
easily write a data agent using a well-defined API. You can also use WSO2 ESB with ESB
connectors to push data streams from data sources like Twitter and Facebook. Various
protocols, such as Apache Thrift, HTTP, JMS, and MQTT, are supported by DAS to receive/
publish data.
When sending data, you first need to define what your data streams look like. You
define your data model as ‘stream definitions’. A stream definition defines the set of fields,
with their data types, that describes the structure of messages received/sent via a stream.
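Conceptually, a stream definition is a named, ordered list of typed fields that every event on the stream must match. The sketch below models that idea in plain Python; the dictionary layout, stream name, and field names are assumptions for illustration and are not the WSO2 DAS stream definition format.

```python
# Hypothetical stream definition: a stream name plus an ordered list of
# (field name, Python type) pairs. This layout is an assumption, not DAS syntax.
SENSOR_STREAM = {
    "name": "sensor_readings",
    "fields": [("device_id", str), ("temperature", float), ("timestamp", int)],
}


def to_record(definition, values):
    """Check an event's values against the definition and return them keyed by field name."""
    fields = definition["fields"]
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} values, got {len(values)}")
    record = {}
    for (name, expected_type), value in zip(fields, values):
        if not isinstance(value, expected_type):
            raise TypeError(
                f"field {name!r} expects {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        record[name] = value
    return record


# A well-formed event for the hypothetical stream.
record = to_record(SENSOR_STREAM, ("device-1", 21.5, 1460000000))
```

Agreeing on the structure up front is what lets the sender and the receiver interpret each event on the stream consistently.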
Once the data streams are defined, you can write ‘execution plans’ using the Siddhi query
language to analyze your data stream in real time, and the results can be pushed to a result
stream. The result stream can be communicated out of DAS as alerts using e-mail or SMS,
or simply sent out as events using various protocols. You can also choose to persist
the incoming data streams and use Apache Spark queries to analyze the data in batch mode.
Spark scripts can be scheduled to run at regular intervals or triggered manually, and the
summarized data can be written to a relational database (RDBMS). You can also use the
interactive console to execute Spark queries against the data.
The persisted data stream can be interactively analyzed and drilled into via the ‘data
explorer’ UI provided in DAS. You can define different search criteria as queries or define
a time span to obtain data for your analysis. The ‘activity explorer’ provides the capability to
correlate data across different streams received in a given time period.
WSO2 DAS also provides the capability to create dashboards. You can create gadgets with
different chart types to visualize streams defined in DAS and combine them as required to
create a dashboard. DAS also has industry/domain-specific toolboxes and extensions to
support business use cases, such as fraud detection and GIS data monitoring.
DAS can be deployed very easily as a single instance to start analyzing your system
initially. Its highly scalable design allows you to easily scale the analytics solutions you build
on top of DAS as your system grows. Therefore, you can start analyzing your system with
minimal resources and less effort, and scale up gradually as your system and requirements
evolve.
6. THE WSO2 PLATFORM ADVANTAGE
6.1 COMPREHENSIVE
The WSO2 analytics platform is a comprehensive platform that’s built from the ground up to
meet the needs of big data analytics. It addresses all components of an analytics solution
required for an enterprise, i.e. data collection, analytics, and communication (Figure 8).
6.2 COMPLETE
The platform comprises all the pieces required to achieve comprehensive analytics from
end to end. It can monitor both WSO2 platform products and third-party products and
systems using an agent-based architecture. It has easy-to-use tools to access the volumes
of data that are collected and provides real-time, batch, interactive, and predictive analytics
capabilities with WSO2 DAS and WSO2 ML. It has a complete toolset to build gadgets
and dashboards for visualizing results. In addition, it can be easily integrated with WSO2
ESB and WSO2 Business Process Server to take further action based on results that might
require human intervention.
6.3 SIMPLE
The platform is easy to install and get started with. Comprehensive documentation on the
features and a good set of samples are available for reference. WSO2 DAS and WSO2 ML
products are self-contained, with all the required tools and technologies, such as analysis
engines (e.g. Spark, Lucene, and Siddhi). Therefore, it is simple to get going with the
embedded analysis tools when you start your analytics projects.
Figure 8
6.4 INCREMENTAL
With the WSO2 analytics platform, you can start small and expand and grow at your own
pace. Since both WSO2 DAS and WSO2 ML are self-contained for the requirements of
analytics (collection, analysis, and communication), you can use a single instance of these
products when you start your projects, for both proof-of-concept phases and initial
production phases. As your analytics platform requirements expand and your data volume
and analysis requirements grow, you can gradually expand the scale of the platform. For
example, you can first scale up the data storage, then the analysis engine, etc. This enables
the enterprise to initiate analytics solutions with smaller budgets, without
having to make heavy investments up front, and to expand the projects as deemed necessary,
as justified by the return on investment of the initial phases.
6.5 PLUGGABLE
The WSO2 analytics platform allows you to integrate any existing platform that you already
have in your enterprise. You can use custom data agents to publish the events/data you
want and receive them via custom data receivers. You can easily use WSO2 ESB with its 150+
out-of-the-box connectors to get data feeds from sources such as Twitter and Facebook.
Moreover, once you have the analytics platform in place, you can plug in any new systems
that you introduce into the enterprise using the same technique of implementing a
custom data agent/receiver pair. The plug-in cost is low and the techniques are simple to
implement thanks to the well-defined APIs, comprehensive samples in multiple domains,
and available documentation.
[Figure 8 diagram labels: Collecting Data (data agents, event receivers); Analyzing Data (Siddhi event processors, Spark analytics, data indexing, data store: Cassandra, HBase, RDBMS, etc.); Communicating Results (data sinks, event publishers, analytics REST API, dashboard)]
6.6 AGILE
The analytics platform is flexible and easy to adapt to your needs. Rather than having
to mold the kinds and forms of your data to fit the product’s requirements, you can use any data
schema you want and get the platform to capture and monitor that data. This is also true
for data analysis and visualization. For data analysis, you have the flexibility of defining the
analysis logic to suit your needs. At the same time, the platform helps you schedule analyses
and lets you deal with the time dimension of the analysis with ease. For visualization,
you have the flexibility of writing your own gadgets and laying them out the way you want
on your dashboards. Alternatively, the platform can automatically
generate gadgets with graphs and tables based on the summarized data streams.
6.7 CLEAN DESIGN
The WSO2 analytics platform is built from the ground up with a clean design, with no
acquired products integrated. The key advantage of this is that the platform is tuned to fit
your needs in the analytics space. In addition, WSO2 keeps fine-tuning and evolving this
platform to keep up with state-of-the-art trends in this domain.
7. CONCLUSION
Today, the analytics needs of an enterprise are often demanding and sometimes complex.
With data produced from multiple sources in varied formats, the primary requirement is
to capture all of this valuable data and store it in a single place to prepare for analysis.
Thereafter, batch analytics can be used to combine, compare, and contrast different data
sets and produce summarized data to extract relevant information that will
be useful to management. It doesn’t end there: management also needs to stay
on top of market developments and current trends to make proactive business decisions.
To enable this, the enterprise needs to analyze incoming data and use real-time
analytics to send alerts or provide instant updates to dashboards that can be accessed on
any device. Furthermore, to remain competitive and a step ahead, the enterprise can use
predictive analysis, where current and historical data is used to make predictions.
The WSO2 analytics platform is an ideal solution that would meet an enterprise’s
requirements; the completely open-source platform is seamlessly integrated, combining
batch, interactive, real-time and predictive analytics. These capabilities enable an
organization to collect data from multiple sources in different formats, analyze data using
various techniques, and communicate results effectively.
ABOUT THE AUTHOR
Sajith Ravindra
Senior Software Engineer, WSO2
Sajith has over six years of industry experience serving in multiple roles. At WSO2, he has
been working closely with the WSO2 Complex Event Processor team and has contributed
towards the development, distribution, and scalability efforts of the product. In addition
to his product development work, he has provided technology consulting on several
customer engagements as well. He holds a bachelor’s degree in information technology
from the Sri Lanka Institute of Information Technology and is currently reading for his master’s
degree in computer science at the University of Moratuwa, Sri Lanka.
ABOUT WSO2
WSO2 is the only company that provides a completely integrated enterprise application
platform for enabling a business to build and connect APIs, applications, web services,
iPaaS, PaaS, software as a service, and legacy connections without having to write code;
using big data and mobile; and fostering reuse through a social enterprise store. Only with
WSO2 can enterprises use a family of governed secure solutions built on the same code
base to extend their ecosystems across the cloud and on mobile devices to employees,
customers, and partners in any way they like. Hundreds of leading enterprise customers
across every sector (health, financial, retail, logistics, manufacturing, travel, technology,
telecom, and more) in every region of the world rely on WSO2’s award-winning, 100%
open source platform for their mission-critical applications. To learn more, visit
http://wso2.com or check out the WSO2 community on the WSO2 Blog, Twitter, LinkedIn, and
Facebook.
Check out more WSO2 White Papers and WSO2 Case Studies.
For more information about WSO2 products and services,
please visit http://wso2.com or email [email protected]