Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Exploiting Fast And Operational Data For Competitive Advantage Using Data Virtualisation Mike Ferguson Managing Director Intelligent Business Strategies Denodo Datafest, London, October 2017

Upload: denodo

Post on 21-Jan-2018

Page 1: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Exploiting Fast And Operational Data For Competitive Advantage Using Data Virtualisation

Mike Ferguson

Managing Director

Intelligent Business Strategies

Denodo Datafest, London, October 2017

Page 2: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Copyright © Intelligent Business Strategies 1992-2017

Topics

▪ Digital transformation

▪ The impact of digital transformation on data

• Operational challenges in a digital world

• Analytical challenges in a digital world

▪ Enabling exploitation of fast data in a digital enterprise using data virtualisation

Page 3: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

What Is Digitalization?

Digitalization is the process of moving to a digital business by making use of digital technologies to change ways of operating and to create new insights that provide new revenue and value-producing opportunities.

Page 4: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

What is Digital Transformation?

- A Programme Of Transition To A Digital Enterprise

At the very least this transition includes:

1. Digital transformation of operational systems and processes

2. Digital transformation of analytical systems

3. Rapid closing of the loop between analytics and operations

4. Digitisation of content

Page 5: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

New Digital Channels Are Becoming The Focal Point For Customer Interactions In The Digital Front-Office

[Diagram: customers reach front-office operations through sales force automation apps, customer facing bricks & mortar apps and customer service apps, and increasingly through digital channels: e-commerce application, m-commerce mobile apps and social commerce applications. Labels: "Improve customer engagement"; "Digital channels are generating big data".]

Customer interaction – the old way: customer, employee, UI, computer system, data

Customer interaction – the new way: customer, app, computer system, data

- Need access to low latency operational data

- Need customer intelligent computer systems

Page 6: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Companies Also Need To Respond More Rapidly In A Market Where Power Has Firmly Shifted To The Customer

Also customers are much more informed before they buy and can churn on a single click.

This means loyalty is cheap

The “Device Generation”

Prospects and customers are now interacting with applications and not people and so there is very little time to engage them

Page 7: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Challenge – Processes Now Span Cloud & On-Premises Making Transaction Data Hard To Access To Manage Operations

[Diagram: Order-to-Cash Process with the steps order, credit check, schedule, fulfil, package, ship, invoice and payment, plus a customer app. Supporting systems: order entry system, credit control system, production planning & scheduling, CAM system, inventory system, distribution system, billing, general ledger. Data shared across the process: orders data, customer data, product data.]

What order changes in the last 10 mins?

What shipments are impacted by the changes e.g. lack of inventory or shipping capacity?

Which customers are affected?

Business impact

Operational reporting is not timely

Inability to respond quickly to problems

Problems not seen until long after they happen e.g. incorrect shipments

Operational oversights cause processing errors & unplanned operational cost

Inability to see across multiple instances of a system can cause errors & duplication of effort

Page 8: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Challenge - Many Companies Have Organised Business Units Processes And Systems Around Products & Services

[Diagram: XYZ Corp. enterprise. Customers/prospects come in through channels/outlets. Product/service line 1, product/service line 2 and product/service line 3 each run their own process (order, credit check, fulfill, package, ship, invoice, payment), with a separate order per line: Order (product line 1), Order (product line 2), Order (product line 3).]

Page 9: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

New Data - Much Of The New Data Captured Is Fast Data That Can Provide New Customer and Operational Insights

▪ Machine data

• Clickstream data, e-commerce logs

• IVR logs, App Server logs, DBMS logs

▪ Connected things (Sensor data, IoT)

• Product usage behaviour data, product performance data

• Location, temperature, light, vibration, liquid flow, pressure, RFIDs

▪ Self-service transactions

▪ Semi-structured data e.g., JSON, BSON, XML

▪ Social networks data (often unstructured e.g. Text)

▪ Open government data

[Diagram label: Fast data]

Page 10: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

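Challenges – New Data Is Being Ingested Into Multiple Types Of Data Store Making It Harder To Access And Analyse

[Diagram: within the enterprise, data ingest feeds several types of data store: cloud storage, MDM (prod, cust and asset data, with CRUD access), NoSQL DBMS and DW. Data.Gov appears as an external source.]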

Page 11: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Challenge – Improving Profitability And Agility Is Proving Difficult When Captured Data Is Becoming More Fractured

▪ Data in different locations

▪ Data in different data storage technologies

▪ Different APIs and query languages needed to access data

▪ Data in different data structures

▪ Different data definitions for the same data in different data stores

▪ Some data too big to move

▪ Excessive use of ETL to copy data

• Expensive and not agile

▪ Synchronization nightmare

[Diagram: “Where is all the Customer Data?” surrounded by XML, text, digital media, RDBMSs, web content, e-mail, flat files, packaged applications, office documents, legacy applications, DW/BI systems, Big Data applications, cloud based applications and ECMS.]

Accessing, governing and managing data is becoming increasingly complex as it becomes more distributed

Page 12: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Business Implications Of Product Orientation and Fractured Customer Data In A World Where Customer Is Now King

▪ Different marketing campaigns from different divisions aimed at the same customer

▪ Different sales teams from different divisions selling to the same customer

▪ Customer service is hard

• e.g. “What is my order status for all products ordered?”

▪ Cost of operating is much higher due to duplicate processes across product lines

▪ Can’t see customer / product ownership

▪ Can’t see customer risk and customer profitability

▪ Hard to access and take advantage of new digital data about customers when it is captured in yet another data store

▪ Higher chance of poor data quality

▪ Difficult to maintain customer data fractured across multiple applications

Page 13: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Digitalisation - The Requirement Now Is To Capture, Integrate And Analyse More Data For Deeper Customer Insights AND Do It Quickly

OMNI channel analysis – analyse all customer interactions across all channels

[Diagram: Customer “DNA” combining identity data, behavioural data (on-line, location, product usage), social data and transactional activity.]

Needs to be integrated in near real-time to maximise competitive advantage

Page 14: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Enabling Exploitation of Fast Data in a Digital Enterprise using Data Virtualisation

Page 15: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Data Virtualization Makes It Easy To Access And Report on Data Across Processes To Manage Business Operations

[Diagram: data virtualization and virtual data services span the Order-to-Cash Process (order, credit check, schedule, fulfil, package, ship, invoice, payment) and the customer app. Side labels: cost, agility.]

Benefits

Simplified access

Access to real-time data across the process

Agile and responsive

Avoid unplanned operational costs

See across multiple instances of apps

See across on-premises & cloud apps
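The operational questions raised on the earlier order-to-cash slide (what changed in the last 10 minutes, which shipments are impacted, which customers are affected) can be sketched as a query-time join across systems. This is a minimal illustration only: the two in-memory lists stand in for an order entry system and a distribution system, and every field name, record and function name is an assumption made for the example, not something taken from the talk or from any product.

```python
from datetime import datetime, timedelta

# Hypothetical records standing in for two separate operational systems.
NOW = datetime(2017, 10, 1, 12, 0)

order_entry_system = [
    {"order_id": 1, "customer": "Acme", "changed_at": NOW - timedelta(minutes=3)},
    {"order_id": 2, "customer": "Birch", "changed_at": NOW - timedelta(minutes=45)},
    {"order_id": 3, "customer": "Cedar", "changed_at": NOW - timedelta(minutes=8)},
]

distribution_system = [
    {"shipment_id": "S-10", "order_id": 1, "status": "scheduled"},
    {"shipment_id": "S-11", "order_id": 2, "status": "scheduled"},
    # Order 3 has no shipment yet.
]


def recently_changed_orders(orders, now, window_minutes=10):
    """Return orders whose last change falls inside the look-back window."""
    cutoff = now - timedelta(minutes=window_minutes)
    return [o for o in orders if o["changed_at"] >= cutoff]


def impacted_shipments(changed_orders, shipments):
    """Join changed orders to shipments on order_id at query time (no copy is kept)."""
    by_order = {o["order_id"]: o for o in changed_orders}
    return [
        {**s, "customer": by_order[s["order_id"]]["customer"]}
        for s in shipments
        if s["order_id"] in by_order
    ]


changed = recently_changed_orders(order_entry_system, NOW)
impacted = impacted_shipments(changed, distribution_system)
affected_customers = sorted({o["customer"] for o in changed})
```

The point of the sketch is that the join happens when the question is asked, against whatever the source systems hold at that moment, rather than against a copy loaded earlier by ETL.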

Page 16: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Data Virtualisation - See Views Of Orders, Shipments And Payments Across All Lines Of Business

[Diagram: XYZ Corp. enterprise. Customers/prospects come in through channels/outlets and place Order (product line 1), Order (product line 2) and Order (product line 3). Product/service line 1, product/service line 2 and product/service line 3 each run their own process (order, credit check, fulfill, package, ship, invoice, payment). The label "Data virtualization" appears three times, written vertically across the product lines.]
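A cross-line view of this kind can be sketched as a lazy union over per-line order stores that use different field names for the same data. This is a rough illustration under stated assumptions: the three lists, their field names and the helper functions are all hypothetical, chosen only to show how a customer service question such as “What is my order status for all products ordered?” could be answered from one place.

```python
from itertools import chain

# Hypothetical order stores, one per product/service line, each with its own field names.
line1_orders = [{"cust": "C1", "order_no": 101, "state": "shipped"}]
line2_orders = [
    {"customer_id": "C1", "id": 202, "status": "invoiced"},
    {"customer_id": "C2", "id": 203, "status": "ordered"},
]
line3_orders = [{"customerRef": "C1", "orderRef": 303, "stage": "packaged"}]


def normalise(rows, line, cust_key, id_key, status_key):
    """Map one source's rows onto a common shape, lazily, one row at a time."""
    for row in rows:
        yield {
            "line": line,
            "customer": row[cust_key],
            "order": row[id_key],
            "status": row[status_key],
        }


def all_orders():
    """Virtual view: the union of all three lines, built on demand for each query."""
    return chain(
        normalise(line1_orders, 1, "cust", "order_no", "state"),
        normalise(line2_orders, 2, "customer_id", "id", "status"),
        normalise(line3_orders, 3, "customerRef", "orderRef", "stage"),
    )


def order_status(customer):
    """Answer the customer's status question across every product line."""
    return [
        (o["line"], o["order"], o["status"])
        for o in all_orders()
        if o["customer"] == customer
    ]
```

Calling `order_status("C1")` walks all three sources each time, so a change in any one line is visible on the next call without a synchronisation step.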

Page 17: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Performance - Need Parallelism In Data Virtualisation to Speed Up Data Access And Integration Across Hybrid Operational Processes

[Diagram: a BI tool or application sends SQL or REST requests to the data virtualisation server. Inside the server, a DV master with a cost based optimizer works with several DV slaves in memory. Each DV slave issues a pushdown to the systems behind the process (order, credit check, schedule, fulfil, package, ship, invoice, payment) and the customer app. Label: in memory caching PLUS in-memory parallel processing of aggregations.]

DV needs parallel pushdown and MPP in-memory processing of cached and aggregate data
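The parallel pushdown idea can be sketched with a thread pool: a coordinator sends one aggregating sub-query to each source at the same time, then merges the small partial results in memory. This is an illustration only. The source names, the data and the function names are assumptions, and the per-source function merely simulates work that a real source would do itself; it does not represent how any particular data virtualisation product is implemented.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sources: (country, amount) rows held in three separate systems.
SOURCES = {
    "on_prem_orders": [("UK", 120.0), ("FR", 80.0), ("UK", 30.0)],
    "cloud_orders": [("UK", 50.0), ("DE", 70.0)],
    "partner_orders": [("FR", 20.0), ("DE", 10.0)],
}


def pushed_down_aggregate(source_name):
    """Simulate a sub-query pushed down to one source: return one total per country."""
    totals = {}
    for country, amount in SOURCES[source_name]:
        totals[country] = totals.get(country, 0.0) + amount
    return totals


def federated_totals(source_names, max_workers=4):
    """Coordinator: run the per-source sub-queries in parallel, then merge the partials."""
    merged = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for partial in pool.map(pushed_down_aggregate, source_names):
            for country, subtotal in partial.items():
                merged[country] = merged.get(country, 0.0) + subtotal
    return merged


result = federated_totals(list(SOURCES))
```

Threads suit this sketch because a real coordinator would spend most of its time waiting on remote sources; the merge step only ever sees one row per group per source.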

Page 18: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Data Virtualisation - Integrated Customer Insight

Data Virtualisation Can Integrate Customer Insight AND Make It Available As Services To Integrate Into All Front Office Channels

[Diagram: data virtualisation sits between analytical data stores and the front-office channels.

Analytical data stores: EDW, DW & marts; DW appliance for advanced analytics (structured data); advanced analytics; NoSQL DB e.g. graph DB; streaming data with RT analytics; master data (prod, cust, asset, with CRUD access).

Customer insight integrated: customer sentiment, interactions, online behaviour & new data; customer relationships*, social network influencers; customer real-time location, product usage & on-line behaviour; customer master data; customer purchase activity & transaction history; customer predictive analytical model development.

Consumers: front-office operations (sales force automation apps; customer facing bricks & mortar apps, e.g. in-store apps and in-branch apps; customer service apps) and digital channels (e-commerce application, m-commerce mobile apps, social commerce applications).

Labels: "Improve customer engagement"; "Digital channels are generating big data".]

Page 19: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Performance - Parallel Processing In Data Virtualisation Speeds Up Integration Of Customer Insights From Analytical Systems

[Diagram: a BI tool or application sends SQL or REST requests to the data virtualisation server. Inside the server, a DV master with a cost based optimizer works with several DV slaves in memory, and each DV slave issues a pushdown to a data source. Data sources: EDW, DW & marts; DW appliance for advanced analytics (structured data); advanced analytics; NoSQL DB e.g. graph DB; streaming data with RT analytics; master data (prod, cust, asset, with CRUD access). Label: in memory caching PLUS in-memory parallel processing of aggregations. Legend: parallel processing in the source; DV = data virtualisation.]

DV needs parallel pushdown and MPP in-memory processing of cached and aggregate data

Page 20: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Product Example – Denodo 7 In-Memory MPP Query Processing With Query Pushdown Optimisation

Query Optimization: Denodo 7

Denodo 7: In-memory fabric + Rules engine (aggregation pushdown) + Cost based optimizer

Obtain Total Sales By Customer Country in the Last Two Years

[Diagram: Current Sales (100 million rows) and Historical Sales (1 billion rows) are each reduced by a "group by customer ID" step to 2M rows (sales by customer this year) and 2M rows (sales by customer previous year). The two results are combined by a union, joined with the cached Customer table (2M rows), and passed to a final step labelled "Group by year".]

Partial Aggregation push down

Already available in Denodo 6

Maximizes source processing

Reduces network traffic

On-demand Parquet generation

Generation of Parquet file in the cluster, in streaming mode

Integration with pre-cached data

Cached data already stored in the cluster in a Parquet file

Fast parallel execution

Support for Spark, Presto and Impala

For fast analytical processing in inexpensive Hadoop-based solutions

Integrated with Cost Based Optimizer

Based on data volume estimation and the cost of these particular operations, the CBO can decide to move all or part of the execution tree to the MPP

Query Acceleration:

In-memory + Rules engine (aggregation pushdown) + Cost based optimizer

• Optimizer can decide to move data on the fly to the fabric during query execution for any part of the execution pipeline

• Uses pushdown to minimize network traffic with in-memory, parallel processing

• Partitioned data caching and MPP of post pushdown query processing operations

• Can combine both and leverage big data technologies (like Spark, Presto, Impala, etc.) for high performance to access fast data volumes in Big Data platforms
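The effect of partial aggregation pushdown on the "total sales by customer country" query can be sketched on toy data. This is a minimal illustration, not Denodo's implementation: the tables, row values and function names are assumptions, and `partial_at_source` only simulates the "group by customer ID" step that a real source would run itself. The sketch compares how many rows cross the network with and without the pushdown while checking that both plans return the same totals.

```python
from collections import Counter

# Hypothetical data: customer_id -> country (the cached dimension), plus two fact tables
# of (customer_id, amount) rows standing in for current and historical sales.
customer = {1: "UK", 2: "FR", 3: "UK"}
current_sales = [(1, 10.0), (2, 5.0), (1, 2.5), (3, 4.0)]
historical_sales = [(1, 1.0), (2, 2.0), (3, 3.0), (3, 1.5)]


def naive(customer, *fact_tables):
    """Move every fact row to the integration layer, then join and group by country."""
    totals = Counter()
    rows_moved = 0
    for table in fact_tables:
        for customer_id, amount in table:
            rows_moved += 1
            totals[customer[customer_id]] += amount
    return dict(totals), rows_moved


def partial_at_source(table):
    """Simulate the pushed-down step: the source groups its own rows by customer ID."""
    partial = Counter()
    for customer_id, amount in table:
        partial[customer_id] += amount
    return partial


def with_pushdown(customer, *fact_tables):
    """Move one row per customer per source, then join and finish the aggregation."""
    totals = Counter()
    rows_moved = 0
    for table in fact_tables:
        partial = partial_at_source(table)
        rows_moved += len(partial)
        for customer_id, subtotal in partial.items():
            totals[customer[customer_id]] += subtotal
    return dict(totals), rows_moved


naive_totals, naive_rows = naive(customer, current_sales, historical_sales)
pushed_totals, pushed_rows = with_pushdown(customer, current_sales, historical_sales)
```

On this toy input the saving is small (8 rows moved versus 6). In the slide's example the same step reduces 100 million and 1 billion fact rows to 2M rows each before anything is transferred.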

Page 21: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Benefits Of Parallel Processing In The Data Virtualisation Server

▪ Rapid integration of operational data across hybrid processes

▪ Rapid integration of insights across big data, fast data and data warehouse data stores

▪ Smart customer facing applications able to access in-memory information services that integrate data in parallel

▪ High performance operational and analytical processing in a modern digital enterprise through parallel processing of

• In-memory aggregate data retrieved from sources

• Cached data in the data virtualisation server

• Data in some sources after pushdown

Page 22: Denodo DataFest 2017: Modern Data Architectures Need Real-time Data Delivery

Thank You!

www.intelligentbusiness.biz

[email protected]

@mikeferguson1

(+44)1625 520700

Mike Ferguson is Managing Director of Intelligent Business Strategies Limited. As an independent analyst and consultant he specializes in business intelligence, analytics, data management and big data. With over 35 years of IT experience, Mike has consulted for dozens of companies, spoken at events all over the world and written numerous articles. Formerly he was a principal and co-founder of Codd and Date Europe Limited – the inventors of the Relational Model, a Chief Architect at Teradata on the Teradata DBMS and European Managing Director of DataBase Associates.