The new dominant companies are running on data

The new dominant companies are running on data. Take your company to the next level of value and efficiency. Rich Dill, Enterprise Solutions Architect, [email protected]

Upload: snaplogic

Post on 21-Jan-2018


Category:

Data & Analytics



TRANSCRIPT

The new dominant companies are running on data

Take your company to the next level of value and efficiency

Rich Dill – Enterprise Solutions Architect – [email protected]

©2017 SnapLogic, Inc. All Rights Reserved Confidential Content


What problem do we want to solve?

How do we get value from all this data?

What is the solution?


Sometimes it is not obvious to everyone involved


Decisions made without facts are opinions

◦ “What are the facts? Again and again and again – what are the facts? Shun wishful thinking, ignore divine revelation, forget what “the stars foretell,” avoid opinion, care not what the neighbors think, never mind the unguessable “verdict of history” – what are the facts, and to how many decimal places? You pilot always into an unknown future; facts are your single clue. Get the facts!” Robert A. Heinlein

Turn your latent assets into liquid assets to realize their value

- No longer latent but now liquid

◦ Data has to be on the move

- It must be leveraged by the masses

The business goal

◦ Actually deliver on the promise of transforming data into actionable information

◦ Predictive analytics improve forecasting

◦ Prescriptive analytics can guide business behaviour

◦ Geolocation analytics can improve resource utilization and inventory turns

What are the results?

- Delivering insights to executives yields direction

- Delivering insights to line workers yields results

corporate overview

Not everyone has the same problem

Use cases are variations on a common theme


Sampling of Industry Focused Use Cases

Use cases (table columns), each with its deployment pattern:

◦ Fraud Detection: Data Refinery or Data Lake Population
◦ Upsell & Cross-sell: Hub-and-Spoke
◦ Customer360: Hub-and-Spoke
◦ Fault Prediction: Data Refinery
◦ Sentiment Analysis: Data Refinery or Data Lake Population
◦ Personalization: Data Refinery or Data Lake Population
◦ M & A: Common Data Modeling
◦ Management Consulting: Common Data Modeling

Umbrella industries (table rows), with the number of use cases marked for each:

◦ Manufacturing: 2
◦ Retail: 6
◦ Healthcare: 2
◦ Financial Services: 5
◦ Energy: 2
◦ Logistics & Transportation: 3
◦ Services: 2
◦ CPG: 3
◦ Computer Software: 2
◦ Telecom: 7

Data Lake Population

Diagram: Source Systems 1 to N (Streaming, Database, SaaS App, File) feed the Data Lake through Ingestion (Pull, Push, Stream), Processing/Transformation, and Storage (S3, HDFS).
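The three ingestion modes named in the Data Lake Population pattern can be sketched as follows. This is a minimal illustration, not SnapLogic's implementation: the `DataLake` class is an in-memory stand-in for S3 or HDFS, and every function and source name here is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class DataLake:
    """In-memory stand-in for raw storage such as S3 or HDFS (hypothetical)."""
    zones: Dict[str, List[dict]] = field(default_factory=dict)

    def land(self, source: str, record: dict) -> None:
        # Store the record untouched, grouped by the system it came from.
        self.zones.setdefault(source, []).append(record)


def ingest_pull(lake: DataLake, source: str, fetch: Callable[[], Iterable[dict]]) -> int:
    """Pull: the pipeline asks the source for its records."""
    count = 0
    for record in fetch():
        lake.land(source, record)
        count += 1
    return count


def ingest_push(lake: DataLake, source: str, payload: dict) -> None:
    """Push: the source hands a record to the pipeline."""
    lake.land(source, payload)


def ingest_stream(lake: DataLake, source: str, events: Iterable[dict]) -> int:
    """Stream: records arrive one at a time and are landed as they come."""
    count = 0
    for event in events:
        lake.land(source, event)
        count += 1
    return count


lake = DataLake()
ingest_pull(lake, "database", lambda: [{"id": 1}, {"id": 2}])
ingest_push(lake, "saas_app", {"id": 3})
ingest_stream(lake, "streaming", iter([{"id": 4}]))
```

All three modes end in the same `land` call, which reflects the idea that the lake stores raw data first and leaves transformation to a later step.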


Data Refinery

Diagram: Source Systems 1 to N (Streaming, Database, SaaS App, File) feed the Data Lake through Ingestion (Pull, Push, Stream), Processing/Transformation, and Storage (S3, HDFS); the Data Lake then pushes to OLAP.
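As a rough sketch of the Data Refinery idea, raw rows from the lake are cleaned and then aggregated before being pushed to the OLAP side. The field names (`region`, `amount`) and both functions are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def refine(raw_records: Iterable[dict]) -> List[dict]:
    """Drop incomplete rows and normalize types and casing."""
    refined = []
    for row in raw_records:
        if row.get("region") is None or row.get("amount") is None:
            continue  # Incomplete rows are skipped, not published.
        refined.append({
            "region": str(row["region"]).strip().upper(),
            "amount": float(row["amount"]),
        })
    return refined


def to_cube(refined: Iterable[dict]) -> Dict[str, float]:
    """Aggregate refined rows into per-region totals for the OLAP side."""
    totals: Dict[str, float] = defaultdict(float)
    for row in refined:
        totals[row["region"]] += row["amount"]
    return dict(totals)


raw = [
    {"region": " west ", "amount": "10.5"},
    {"region": "West", "amount": 4},
    {"region": None, "amount": 7},
    {"region": "east", "amount": "3"},
]
cube = to_cube(refine(raw))
```

Keeping `refine` separate from `to_cube` means the cleaned rows can be reused by other consumers without repeating the cleanup.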


Common Data Model

Diagram: several groups of Source Systems 1 to N (Streaming, Database, SaaS App, File) feed the Data Lake through Ingestion (Pull, Push, Stream), Processing/Transformation, and Storage (S3, HDFS), with staging on HDFS, S3, or Blob; the Data Lake pushes to Downstream Apps.
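One way to picture a Common Data Model is a per-source field mapping that renames each system's fields to a shared schema. The source names (`crm`, `billing`) and all field names below are hypothetical examples, not taken from the deck.

```python
from typing import Dict

# Hypothetical mappings: source field name -> common model field name.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "crm": {"cust_id": "customer_id", "full_name": "name", "mail": "email"},
    "billing": {"CustomerNumber": "customer_id", "CustomerName": "name", "Email": "email"},
}


def to_common_model(source: str, record: dict) -> dict:
    """Rename a source record's fields to the common model and tag its origin."""
    mapping = FIELD_MAPS[source]
    common = {target: record[src] for src, target in mapping.items() if src in record}
    common["source_system"] = source
    return common


crm_row = to_common_model("crm", {"cust_id": 7, "full_name": "Ada", "mail": "ada@example.org"})
billing_row = to_common_model("billing", {"CustomerNumber": 7, "CustomerName": "Ada"})
```

Once every source is expressed in the same field names, downstream applications only need to understand one schema.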


Hub-and-Spoke

Diagram: Source Systems 1 to N (Streaming, Database, SaaS App, File) feed the Data Lake through Ingestion (Pull, Push, Stream), Processing/Transformation, and Storage (S3, HDFS); the Data Lake pushes to the EDW, which serves three Data Marts, alongside a Data Science Workbench.
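The hub-and-spoke idea can be sketched as one central set of rows (the hub, standing in for the EDW) fanned out to department-specific subsets (the spokes, standing in for data marts). The `build_marts` helper, the mart names, and the `dept` field are assumptions for illustration.

```python
from typing import Callable, Dict, List

Row = dict


def build_marts(edw_rows: List[Row], spokes: Dict[str, Callable[[Row], bool]]) -> Dict[str, List[Row]]:
    """Fan rows out from the hub (EDW) to each spoke (data mart)."""
    return {
        name: [row for row in edw_rows if keep(row)]
        for name, keep in spokes.items()
    }


edw = [
    {"dept": "sales", "amount": 120},
    {"dept": "finance", "amount": 75},
    {"dept": "sales", "amount": 30},
]
marts = build_marts(edw, {
    "sales_mart": lambda r: r["dept"] == "sales",
    "finance_mart": lambda r: r["dept"] == "finance",
})
```

Each mart is derived from the hub rather than from the source systems directly, so a change in a source only has to be handled once.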


The first solution: custom built

Michelangelo@Uber


Welcome my son to the machine…


The problem

◦ “There were no systems in place to build reliable, uniform, and reproducible pipelines for creating and managing training and prediction data at scale.”

The solution: Machine Learning as a Service

◦ ML-as-a-service platform that democratizes machine learning and makes scaling AI to meet the needs of business as easy as requesting a ride.

Michelangelo consists of a mix of open source systems and components built in-house. The primary open sourced components used are HDFS, Spark, Samza, Cassandra, MLLib, XGBoost, and TensorFlow.

Cost

◦ Two years

◦ $60 million

Results

◦ A Wall Street Journal report claims SoftBank has been in touch with Uber with the apparent goal of buying a “multi-billion dollar stake” in the company. To date, Uber has raised close to $12 billion from investors, with its most recent valuation reportedly above $60 billion. July 25, 2017


A Model feature report

Building on success


Both the systems and staff continue to learn and evolve


“As the platform layers mature, we plan to invest in higher level tools and services to drive democratization of machine learning and better support the needs of our business”

For more information

◦ https://eng.uber.com/michelangelo/


The second solution: custom integration

The five year plan


Rome was not built in a day


The problem

◦ A large multinational corporation grew in part by acquisition

◦ Technology stacks and silos as far as the eye can see

◦ They had one or more of every kind of technology

◦ They had hundreds of data warehouses and data marts

The cost

◦ Implementing any new business process was blindingly expensive, took too long, and did not deliver what the user was expecting or needed

The solution

◦ Simplify, standardize, consolidate, and adopt a cloud strategy

◦ Insert a Data Lake into the data lifecycle

◦ Adopt a Citizen Integrator model wherever possible

The business result

◦ The combination of migration from a perpetual software license model to SaaS and the reduced labor costs of the Citizen Integrator model resulted in savings in the millions

The evolving data lifecycle


Diagram: two data lifecycles built from Source Systems 1 to N, a Data Lake, an EDW, Data Marts, and a Data Science Workbench.

Two stages: OLTP to DW and Data marts

Three stages: OLTP to Data Lake, then to on shore Data marts and DW

Results


Happy productive business users


Faster time to market for new programs with agility and LOB alignment

Over 500 users from almost all business units

Savings in the millions

A more agile business environment


The third is a solution

The solution approach


Business goals drive the architectural requirements


The problem/business goal

◦ Obtain a customer 360 view by removing the constraints of an on-premises environment and move to a cloud-first environment where multiple departments/constituents can access data and obtain insights.
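A customer 360 view, in its simplest form, merges what each department knows about a customer into one profile keyed by a shared identifier. The sketch below assumes every record carries a `customer_id` field; the department names and other fields are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable


def customer_360(*sources: Iterable[dict]) -> Dict[str, dict]:
    """Merge per-department records into one profile per customer_id."""
    profiles: Dict[str, dict] = defaultdict(dict)
    for source in sources:
        for record in source:
            key = record["customer_id"]
            # Later sources fill in or overwrite fields from earlier ones.
            profiles[key].update(record)
    return dict(profiles)


sales = [{"customer_id": "c1", "last_order": "2017-06-01"}]
support = [
    {"customer_id": "c1", "open_tickets": 2},
    {"customer_id": "c2", "open_tickets": 0},
]
view = customer_360(sales, support)
```

A real deployment would also need identity matching across systems that do not share a key, which this sketch deliberately leaves out.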

Key Characteristics of a cloud-first enterprise stack:

◦ Scalable

◦ Collaborative

◦ Promotes easy data sharing

◦ Reduces on-premises maintenance overhead with auto updates

The process

◦ Upgrade the cloud data warehouse

◦ Move legacy BI to a modern tool like Tableau or Power BI, for greater data fluency

◦ Create a foundation for an AI/ML workbench for predictive analytics

◦ Use an ML framework like TensorFlow from Google, whose trained models are portable and can be loaded from Java and other languages


Proposed Enterprise Stack

Diagram: Sources & Targets connect through SnapLogic (AWS Deployed) using Pull, Push, and Stream, with Amazon S3 and Amazon EMR, a Push to Tableau, and a Machine Learning Integration Point.

Sources & Targets

◦ Streaming: Kafka, JMS
◦ Database: Hbase, Hive, Dynamo, Mongo, Redshift, SQLServer, AzureSQL, Aurora, MySQL
◦ Webservices: REST, SOAP
◦ File: Flat Files, XML, JSON, Excel, Word doc, PDF, S3, FTP/SFTP, ORC, Parquet
◦ Analytics: SAS, Cognos
◦ Social Media: Facebook, LinkedIn, Twitter

Key Benefits of Proposed Architecture


Enables migration in phases rather than all at once

Promotes data re-use and reduces time to insight across the organization

Scalable and flexible to accommodate company’s changing needs

Reduced maintenance costs to enable IT to stay focused on enabling the business

Complete view of the customer with real-time data updates

Better focused marketing programs (less waste, higher performance)

Greater customer loyalty due to more relevant customer engagement

Observations from the field


Some observations and a few of Rich’s rules of technology


Technology is a tool; use the right one for the job

◦ It amazes me how some engineers have almost religious beliefs in their favorite technology

- If the only tool you have is a hammer…

Software evolves like a funnel

◦ Early releases have limitations that are fixed with later releases

We work in an industry where change is constant

◦ Absolute truths can change every 5-10 years

◦ The rate of change can make you old, or keep you young. As the Iron Giant said, choose!

Different technologies require different approaches and techniques

◦ I don’t code Scala like C or Cobol

◦ “A mind is like a parachute; it only functions when it is open.” Thomas Dewar

The adoption curve entails risk… and costs

◦ There is a reason we call it the bleeding edge

Open source is not free

◦ The money you save on license cost, you will spend on additional labor, plus 25%
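The rule of thumb above can be worked through with a small calculation. It assumes "plus 25%" means the extra labor equals the license savings multiplied by 1.25; the function name and the 100,000 figure are illustrative, not from the deck.

```python
def open_source_labor_cost(license_savings: float, overhead: float = 0.25) -> float:
    """Additional labor implied by the rule: the license savings plus 25%."""
    return license_savings * (1 + overhead)


license_savings = 100_000.0
extra_labor = open_source_labor_cost(license_savings)
net_effect = license_savings - extra_labor  # Negative means a net cost.
```

Under that reading, saving 100,000 on licenses implies 125,000 in extra labor, a net cost of 25,000.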


Q & A

Thank You

San Mateo, CA

Boulder, CO

New York, NY

London, UK

Melbourne, AUS

Hyderabad, India

www.snaplogic.com

Rich Dill – Enterprise Solutions Architect

[email protected]