Denodo DataFest 2017: Modern Data Architectures Need Real-Time Data Delivery
TRANSCRIPT
Exploiting Fast And Operational Data For Competitive
Advantage Using Data Virtualisation
Mike Ferguson
Managing Director
Intelligent Business Strategies
Denodo Datafest,
London, October 2017
Copyright © Intelligent Business Strategies 1992-2017
Topics
▪ Digital transformation
▪ The impact of digital transformation on data
• Operational challenges in a digital world
• Analytical challenges in a digital world
▪ Enabling exploitation of fast data in a digital enterprise using
data virtualisation
What Is Digitalization?
Digitalization is the process of moving to a digital
business by making use of digital technologies to
change ways of operating and to create new insights
that provide new revenue and value-producing
opportunities
What is Digital Transformation?
- A Programme Of Transition To A Digital Enterprise
At the very least this transition includes:
1. Digital transformation of operational systems and processes
2. Digital transformation of analytical systems
3. Rapid closing of the loop between analytics and operations
4. Digitisation of content
New Digital Channels Are Becoming The Focal Point For
Customer Interactions In The Digital Front-Office
[Diagram: customers reach front-office operations through digital channels (e-commerce application, m-commerce mobile apps, social commerce applications) as well as through sales force automation apps, customer facing bricks & mortar apps and customer service apps. Aim: improve customer engagement.]
Digital channels are generating big data
Customer interaction – the old way
customer → employee → UI → computer system → data
Customer interaction – the new way
customer → app → computer system → data
- Need access to low latency operational data
- Need customer intelligent computer systems
Companies Also Need To Respond More Rapidly In A Market
Where Power Has Firmly Shifted To The Customer
Also customers are much
more informed before they
buy and can churn on a
single click.
This means loyalty is cheap
The “Device Generation”
Prospects and customers are now
interacting with applications and not
people and so there is very little time to
engage them
Challenge – Processes Now Span Cloud & On-Premises Making
Transaction Data Hard To Access To Manage Operations
Order-to-Cash Process
[Diagram: a customer app feeds a process that spans cloud and on-premises systems. Steps: order, credit check, schedule, fulfil, package, ship, invoice, payment. Systems: order entry system, credit control system, production planning & scheduling, CAM system, inventory system, distribution system, billing, general ledger. Data: orders data, customer data, product data.]
Questions that are hard to answer
▪ What orders changed in the last 10 mins?
▪ What shipments are impacted by the changes e.g. lack of inventory or shipping capacity?
▪ Which customers are affected?
Business impact
▪ Operational reporting is not timely
▪ Inability to respond quickly to problems
▪ Problems not seen until long after they happen e.g. incorrect shipments
▪ Operational oversights cause processing errors & unplanned operational cost
▪ Inability to see across multiple instances of a system can cause errors & duplication of effort
Challenge - Many Companies Have Organised Business Units,
Processes And Systems Around Products & Services
[Diagram: XYZ Corp. enterprise. Customers/prospects reach the enterprise through channels/outlets. Product/service lines 1, 2 and 3 each run their own process (order, credit check, fulfill, package, ship, invoice, payment) and take their own orders: Order (product line 1), Order (product line 2), Order (product line 3).]
New Data - Much Of The New Data Captured Is Fast Data That
Can Provide New Customer and Operational Insights
▪ Machine data
• Clickstream data, e-commerce logs
• IVR logs, App Server logs, DBMS logs
▪ Connected things (Sensor data, IoT)
• Product usage behaviour data, product performance data
• Location, temperature, light, vibration, liquid flow, pressure, RFIDs
▪ Self-service transactions
▪ Semi-structured data e.g., JSON, BSON, XML
▪ Social networks data (often unstructured e.g. Text)
▪ Open government data
[Diagram label: Fast data]
Challenges – New Data Is Being Ingested Into Multiple Types Of
Data Store Making It Harder To Access And Analyse
[Diagram: data ingest into multiple enterprise data stores, including cloud storage, NoSQL DBMS, DW and MDM (CRUD on prod, cust and asset master data). Data.Gov also appears in the diagram.]
Challenge – Improving Profitability And Agility Is Proving Difficult
When Captured Data Is Becoming More Fractured
▪ Data in different locations
▪ Data in different data storage technologies
▪ Different APIs and query languages needed
to access data
▪ Data in different data structures
▪ Different data definitions for the same data in
different data stores
▪ Some data too big to move
▪ Excessive use of ETL to copy data
• Expensive and not agile
▪ Synchronization nightmare
[Diagram: “Where is all the Customer Data?” Data is spread across XML, text, digital media, RDBMSs, web content, flat files, packaged applications, office documents, legacy applications, DW/BI systems, Big Data applications, cloud based applications and ECMS.]
Accessing, governing and managing data is becoming increasingly complex
as it becomes more distributed
Business Implications Of Product Orientation and Fractured
Customer Data In A World Where Customer Is Now King
▪ Different marketing campaigns from different divisions aimed at the same customer
▪ Different sales teams from different divisions selling to the same customer
▪ Customer service is hard
• e.g. “What is my order status for all products ordered?”
▪ Cost of operating is much higher due to duplicate processes across product lines
▪ Can’t see customer / product ownership
▪ Can’t see customer risk and customer profitability
▪ Hard to access and take advantage of new digital data about customers when it is
captured in yet another data store
▪ Higher chance of poor data quality
▪ Difficult to maintain customer data fractured across multiple applications
Digitalisation - The Requirement Now Is To Capture, Integrate And
Analyse More Data For Deeper Customer Insights AND Do It Quickly
OMNI channel analysis – analyse all customer interactions across all channels
Customer “DNA”
▪ identity data
▪ behavioural data (on-line, location, product usage)
▪ social data
▪ transactional activity
Needs to be integrated in near real-time to maximise competitive advantage
Enabling Exploitation of Fast Data in a Digital
Enterprise using Data Virtualisation
Data Virtualization Makes It Easy To Access And Report on Data
Across Processes To Manage Business Operations
[Diagram: data virtualization and virtual data services sit across the Order-to-Cash Process (order, credit check, schedule, fulfil, package, ship, invoice, payment) and the customer app. Labels: cost, agility.]
Benefits
▪ Simplified access
▪ Access to real-time data across the process
▪ Agile and responsive
▪ Avoid unplanned operational costs
▪ See across multiple instances of apps
▪ See across on-premises & cloud apps
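The benefits above can be illustrated with a minimal sketch. It assumes three hypothetical source systems (order entry, inventory, distribution) held as in-memory Python structures; the function and field names are invented for illustration and are not Denodo APIs. It shows data being combined at query time to answer the operational questions raised earlier: which orders changed in the last 10 minutes, which shipments are impacted, and which customers are affected.

```python
from datetime import datetime, timedelta

# Hypothetical source systems. In practice each would be a separate
# application or database, on-premises or in the cloud.
NOW = datetime(2017, 10, 1, 12, 0)

ORDER_ENTRY = [  # order entry system
    {"order_id": 1, "customer": "Acme", "product": "P1", "qty": 5,
     "changed_at": NOW - timedelta(minutes=3)},
    {"order_id": 2, "customer": "Bolt", "product": "P2", "qty": 2,
     "changed_at": NOW - timedelta(minutes=45)},
    {"order_id": 3, "customer": "Core", "product": "P1", "qty": 1,
     "changed_at": NOW - timedelta(minutes=8)},
]
INVENTORY = {"P1": 4, "P2": 10}  # inventory system: units in stock
SHIPMENTS = [  # distribution system
    {"shipment_id": "S1", "order_id": 1},
    {"shipment_id": "S2", "order_id": 2},
    {"shipment_id": "S3", "order_id": 3},
]


def recent_order_changes(now, minutes=10):
    """Orders changed within the last `minutes` minutes."""
    cutoff = now - timedelta(minutes=minutes)
    return [o for o in ORDER_ENTRY if o["changed_at"] >= cutoff]


def impacted_shipments(now, minutes=10):
    """Join recent order changes to inventory and shipments at query time.
    A shipment is impacted when the order needs more stock than exists."""
    shipment_by_order = {s["order_id"]: s["shipment_id"] for s in SHIPMENTS}
    impacted = []
    for order in recent_order_changes(now, minutes):
        if order["qty"] > INVENTORY.get(order["product"], 0):
            impacted.append({
                "shipment_id": shipment_by_order.get(order["order_id"]),
                "order_id": order["order_id"],
                "customer": order["customer"],
            })
    return impacted


print(impacted_shipments(NOW))
```

No data is copied between the systems ahead of time; the join happens when the question is asked, which is the property the slide attributes to virtual data services.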
Data Virtualisation - See Views Of Orders, Shipments And
Payments Across All Lines Of Business
[Diagram: XYZ Corp. enterprise. Customers/prospects reach the enterprise through channels/outlets. Product/service lines 1, 2 and 3 each run their own process (order, credit check, fulfill, package, ship, invoice, payment) and take their own orders: Order (product line 1), Order (product line 2), Order (product line 3). Data virtualization spans the product lines.]
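A minimal sketch of a view across lines of business, assuming three hypothetical product-line order systems that each name the same fields differently. The record layouts, mappings and function names are illustrative assumptions, not part of any product. It shows how a customer service question such as “What is my order status for all products ordered?” can be answered from one unified view.

```python
# Hypothetical order records from three product-line systems. Each system
# uses its own field names for the same data.
LINE1_ORDERS = [{"ord_no": "A-1", "cust": "Acme", "status": "shipped"}]
LINE2_ORDERS = [{"order_id": "B-7", "customer_name": "Acme", "state": "invoiced"}]
LINE3_ORDERS = [{"id": "C-3", "client": "Bolt", "order_status": "ordered"}]

# For each product line: its rows plus a mapping from source field names
# to one common definition (order_id, customer, status).
SOURCES = {
    "line 1": (LINE1_ORDERS,
               {"ord_no": "order_id", "cust": "customer", "status": "status"}),
    "line 2": (LINE2_ORDERS,
               {"order_id": "order_id", "customer_name": "customer", "state": "status"}),
    "line 3": (LINE3_ORDERS,
               {"id": "order_id", "client": "customer", "order_status": "status"}),
}


def all_orders():
    """Virtual view: union of every product line's orders in a common shape.
    Rows are mapped when the view is read; nothing is copied in advance."""
    for product_line, (rows, mapping) in SOURCES.items():
        for row in rows:
            unified = {target: row[source] for source, target in mapping.items()}
            unified["product_line"] = product_line
            yield unified


def order_status_for(customer):
    """All orders for one customer, whichever product line took them."""
    return [order for order in all_orders() if order["customer"] == customer]


for order in order_status_for("Acme"):
    print(order)
```

Keeping the field mappings in one place is a design choice for the sketch: adding a fourth product line means adding one entry to `SOURCES`, and callers of `all_orders()` do not change.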
Performance - Need Parallelism In Data Virtualisation to Speed Up
Data Access And Integration Across Hybrid Operational Processes
[Diagram: a BI tool and an application send SQL to the data virtualisation server. Inside the server a DV master with a cost based optimizer coordinates several DV slaves, which run in memory. Each slave issues a pushdown (SQL or REST) to the systems behind the process steps (order, credit check, schedule, fulfil, package, ship, invoice, payment) and the customer app.]
In memory caching PLUS in-memory parallel processing of aggregations
DV needs parallel pushdown and MPP in-memory processing of cached and aggregate data
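A minimal sketch of parallel pushdown, assuming hypothetical sources that can each compute an aggregate locally. A thread pool stands in for the DV slave nodes and the source names and data are invented; this is not how any particular product implements it. It shows the pattern the slide describes: push the aggregation to each source, run the pushdowns in parallel, then merge the small partial results in memory.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sources; each holds its own rows of (customer, amount).
SOURCES = {
    "order_entry": [("Acme", 100), ("Bolt", 50), ("Acme", 25)],
    "billing": [("Acme", 10), ("Core", 70)],
    "cloud_crm": [("Bolt", 5)],
}


def pushdown_sum(source_name):
    """Stand-in for an aggregation pushed down to a source: the source
    returns one total per customer instead of every row."""
    totals = Counter()
    for customer, amount in SOURCES[source_name]:
        totals[customer] += amount
    return totals


def total_by_customer():
    """Send the pushed-down query to every source in parallel, then
    merge the partial results in memory."""
    merged = Counter()
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        for partial in pool.map(pushdown_sum, SOURCES):
            merged.update(partial)  # Counter.update adds the amounts
    return dict(merged)


print(total_by_customer())
```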
Data Virtualisation - Integrated Customer insight
Data Virtualisation Can Integrate Customer Insight AND Make It
Available As Services To Integrate Into All Front Office Channels
[Diagram: data virtualisation sits between analytical data sources and the front-office channels.]
Data sources
▪ EDW, DW & marts
▪ NoSQL DB e.g. graph DB
▪ DW appliance, advanced analytics (structured data)
▪ Advanced analytics
▪ Streaming data, RT analytics
▪ Master data (CRUD on prod, cust, asset)
Customer insight integrated
▪ Customer sentiment, interactions, online behaviour, & new data
▪ Customer relationships*, social network influencers
▪ Customer real-time location, product usage & on-line behaviour
▪ Customer master data
▪ Customer purchase activity & transaction history
▪ Customer predictive analytical model development
Front-office operations consuming the services
▪ Sales force automation apps
▪ Customer facing bricks & mortar apps e.g. in-store apps, in-branch apps
▪ Customer service apps
▪ E-commerce application
▪ M-commerce mobile apps
▪ Social commerce applications
Digital channels are generating big data
Improve customer engagement
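A minimal sketch of integrated customer insight offered as a service, assuming four hypothetical stores keyed by customer id. The store names, fields and the `customer_insight` function are illustrative assumptions. It shows one call assembling a single record from several sources at request time, which any front-office channel could then consume.

```python
# Hypothetical stores keyed by customer id; each stands in for a
# separate system from the slide above.
MASTER_DATA = {"c1": {"name": "Acme Ltd"}}            # customer master data
DATA_WAREHOUSE = {"c1": {"orders_last_year": 14}}     # purchase history
GRAPH_DB = {"c1": {"influencers": ["c7", "c9"]}}      # relationships
STREAMING = {"c1": {"last_location": "London"}}       # real-time analytics

STORES = {
    "master_data": MASTER_DATA,
    "purchase_history": DATA_WAREHOUSE,
    "relationships": GRAPH_DB,
    "real_time": STREAMING,
}


def customer_insight(customer_id):
    """Assemble one integrated record at request time. A store with no
    entry for the customer contributes an empty section."""
    record = {"customer_id": customer_id}
    for section, store in STORES.items():
        record[section] = store.get(customer_id, {})
    return record


print(customer_insight("c1"))
```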
Performance - Parallel Processing In Data Virtualisation Speeds
Up Integration Of Customer Insights From Analytical Systems
DV = data virtualisation
[Diagram: a BI tool and an application send SQL to the data virtualisation server. Inside the server a DV master with a cost based optimizer coordinates several DV slaves, which run in memory. Each slave issues a pushdown (SQL or REST) to the data sources, with parallel processing in the source. Data sources: EDW, DW & marts; NoSQL DB e.g. graph DB; DW appliance with advanced analytics (structured data); advanced analytics; streaming data with RT analytics; master data (CRUD on prod, cust, asset).]
In memory caching PLUS in-memory parallel processing of aggregations
DV needs parallel pushdown and MPP in-memory processing of cached and aggregate data
Product Example – Denodo 7 In-Memory MPP Query Processing
With Query Pushdown Optimisation
Query Optimization: Denodo 7
Denodo 7: In-memory fabric + Rules engine (aggregation pushdown) + Cost based optimizer
Example: Obtain Total Sales By Customer Country in the Last Two Years
Sources
▪ Customer (2M rows), cached
▪ Current Sales (100 million rows)
▪ Historical Sales (1 billion rows)
Operators in the execution tree
▪ group by customer ID at Current Sales, returning 2M rows (sales by customer this year)
▪ group by customer ID at Historical Sales, returning 2M rows (sales by customer previous year)
▪ union
▪ join
▪ group by year
Features
▪ Partial Aggregation push down: already available in Denodo 6. Maximizes source processing. Reduces network traffic.
▪ On-demand Parquet generation: generation of Parquet file in the cluster, in streaming mode
▪ Integration with pre-cached data: cached data already stored in the cluster in a Parquet file
▪ Fast parallel execution: support for Spark, Presto and Impala for fast analytical processing in inexpensive Hadoop-based solutions
▪ Integrated with Cost Based Optimizer: based on data volume estimation and the cost of these particular operations, the CBO can decide to move all or part of the execution tree to the MPP
Query Acceleration:
In-memory + Rules engine (aggregation pushdown) + Cost based optimizer
• Optimizer can decide to move data on the fly to the fabric during query execution for any part of the execution pipeline
• Uses pushdown to minimize network traffic with in-memory, parallel processing
• Partitioned data caching and MPP of post pushdown query processing operations
• Can combine both and leverage big data technologies (like Spark, Presto, Impala, etc.) for high
performance access to fast data volumes in Big Data platforms
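A minimal sketch of partial aggregation pushdown, following the query in the slide's title (total sales by customer country). The tiny tables and function names are illustrative assumptions, and plain Python stands in for the sources and the optimizer; it is not Denodo code. It shows why the technique cuts network traffic: each sales source returns at most one row per customer, yet the final totals match a plan that moves every sales row.

```python
from collections import defaultdict

# Hypothetical miniature versions of the three sources on the slide.
CUSTOMER = {1: "UK", 2: "UK", 3: "FR"}                   # customer_id -> country (cached)
CURRENT_SALES = [(1, 100), (2, 40), (1, 10), (3, 5)]     # (customer_id, amount)
HISTORICAL_SALES = [(1, 20), (3, 30), (3, 15), (2, 1)]   # (customer_id, amount)


def group_by_customer(rows):
    """Partial aggregation, as it would run inside a sales source."""
    totals = defaultdict(int)
    for customer_id, amount in rows:
        totals[customer_id] += amount
    return dict(totals)


def sales_by_country_with_pushdown():
    """Union the per-customer subtotals, join to Customer, then do the
    final group by country."""
    partials = [group_by_customer(CURRENT_SALES),
                group_by_customer(HISTORICAL_SALES)]
    by_country = defaultdict(int)
    for partial in partials:
        for customer_id, subtotal in partial.items():
            by_country[CUSTOMER[customer_id]] += subtotal
    return dict(by_country)


def sales_by_country_naive():
    """Baseline: move every sales row, then join and aggregate."""
    by_country = defaultdict(int)
    for customer_id, amount in CURRENT_SALES + HISTORICAL_SALES:
        by_country[CUSTOMER[customer_id]] += amount
    return dict(by_country)


print(sales_by_country_with_pushdown())
print(sales_by_country_naive())
```

The two plans agree because a sum can be computed in stages. Scaled to the slide's volumes, the sources would return about 2M rows each instead of 100 million and 1 billion.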
Benefits Of Parallel Processing In The Data Virtualisation Server
▪ Rapid integration of operational data across hybrid processes
▪ Rapid integration of insights across big data, fast data and data warehouse
data stores
▪ Smart customer facing applications able to access in-memory information
services that integrate data in parallel
▪ High performance operational and analytical processing in a modern digital
enterprise through parallel processing of
• In-memory aggregate data retrieved from sources
• Cached data in the data virtualisation server
• Data in some sources after pushdown
Thank You!
www.intelligentbusiness.biz
@mikeferguson1
(+44)1625 520700
Mike Ferguson is Managing Director of Intelligent Business Strategies Limited. As an
independent analyst and consultant he specializes in business intelligence, analytics, data
management and big data. With over 35 years of IT experience, Mike has consulted for dozens
of companies, spoken at events all over the world and written numerous articles. Formerly he
was a principal and co-founder of Codd and Date Europe Limited – the inventors of the
Relational Model, a Chief Architect at Teradata on the Teradata DBMS and European Managing
Director of DataBase Associates.