learnings do the math - cio summits · big data’s recent enterprise popularity can be attributed...



Mu Sigma Confidential

Chicago, IL

Bangalore, India

www.mu-sigma.com

Proprietary Information

"This document and its attachments are confidential. Any unauthorized copying, disclosure or distribution of the material is strictly forbidden"


Do The Math

Embarking on Decision Sciences & Analytics

2013

Learnings

[email protected]

Twitter: @crunchdata


Analytics Trends & Big Data

The Decision Sciences Journey: Key Learnings & Scale

Agenda


The information explosion is leading to an unprecedented ability to store, manage and analyze data; we need to run ahead.

Information Ecosystem

INFORMATION AGE
– Digital Data in 2011: ~1750 Exabytes
– Facebook: 100+ million users per day
– Mobile Phone Subscription: 4.6 billion
– 15 Million Bing recommendation records
– RFID Tags: 40 billion

Data Source Explosion
– Click-stream data
– Purchase intent data
– Social Media, Blogs
– Email Archives
– RFID & Geospatial Information
– Events

Expanding Applications
– Opinion Mining
– Social Graph Targeting
– Influencer Marketing
– Offer Optimization
– Clinical Data Analysis
– Fraud Detection
– SKU Rationalization

Technology Evolution
– Massively large databases
– Search Technologies
– Complex Event Processing
– Cloud Computing
– Software as a service
– Interactive Visualization
– In Memory Analytics

Advanced Analytical Techniques
– CART
– Adaptive Regression Splines
– Machine Learning
– Radial Basis Functions
– Support Vector Machines
– Neural Networks
– Geospatial Predictive Modeling


Three Key Macro Analytics Trends Targeting Consumption

Macro Trends: Agenda in All Three

Analytics Trends

Computational Enablement

– Scalable Analytics, Big Data & NoSQL technologies, GP-GPU, In-Memory & High Performance Computing

– Master Data Management, Cloud, and Federated Data – the end of the central EDW?

– Analytics as a Service and the Cloud

– Open Source Maturity

– Rise of Agile Self Service

– Data Model Disruption: ETL vs. ELT

Usability and Visualization

– Dashboards will Evolve into Cockpits

– Improved Interactive Data Visualization & Aesthetics

– Augmented / Virtual Reality / Natural User Interfaces

– Graph Analytics sparked by Social Data

– Discovery with Computational Geometry

– Mobile BI and the Integration of Consumer & Enterprise Devices

– Gamification: Do you want to play a game?

Intelligent Systems (“Anticipation Denotes Intelligence”)

– Operational Smart Systems are Back propagating – Pervasive Analytics

– Real Time Analytics and Event Streams

– Artificial Intelligence (AI) is not what we expected

– Don’t forget the 4th tier it’s Juicy

– Metaheuristics and the Ants

– Operational Analytics, You are the Exception

– Experiments in the Enterprise and the Quasi

– Just Say No to OLS

– Energy Informatics is Radiating

– Intelligent Search


“Big Data”, the term, is about applying new technologies to meet unfulfilled needs: ANALYTICS AT SCALE…

What is Big Data?

Message Passing Interface Map Reduce

BIG Computation & Big Data Scaling Economically

Intensive Computational Analysis not Supported by the Traditional Data Warehouse Stack and Architecture
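The Map Reduce model named on this slide can be illustrated with a minimal single-process sketch. It is not Hadoop code and not taken from the deck: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical, and the loop over documents only stands in for the parallel execution a real cluster would provide.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(document: str) -> List[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce step: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    # Assumption: on a cluster each document (or block) would be mapped on a
    # different machine; this sequential loop only imitates that step.
    all_pairs: List[Tuple[str, int]] = []
    for doc in documents:
        all_pairs.extend(map_phase(doc))
    return reduce_phase(shuffle_phase(all_pairs))
```

The point of the split is that the map and reduce steps have no shared state, which is what lets the work scale out economically on commodity hardware.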


Big Data’s recent enterprise popularity can be attributed to two key factors

Big Data Technology Evolution

Timeline: 2003 to 2012

Phases: Early Research -> Open source dev momentum -> Initial success stories -> Commercialization & Adoption

Events on the timeline:
– Google Trigger: Releases “Map Reduce” & “GFS” paper
– Google: Bigtable paper
– Lucene subproject
– Apache: Hadoop project
– Yahoo: 10K core cluster
– Cloudera named most promising startup funded in 2009
– elastic map-reduce
– IBM Acquires
– Teradata buys Aster (map-reduce DB)
– HP Acquires Vertica
– EMC Acquires Greenplum
– 2012: Oracle, MSOFT, SAS

Technology Trigger

– Robust Open Source Technology for Parallel Computation on Commodity Hardware (Hadoop / NoSQL)

Frustration Trigger

– Tyranny of the Data Model; cost and complexity of data integration; agility; impact; bureaucracy; difficulty and cost of scaling existing EDW technology

Faced with storing massive data sets from indexing the web, Google searches for & finds a way to store & retrieve the data on commodity hardware


Why should you care? Is this phenomenon all marketing hype, or is it business critical?

Data is an asset and growing by leaps and bounds

– Structured and unstructured (DARK MATTER)

– Within and Beyond the firewall

A competitive differentiator is the use of data to improve decision making thru analytics

– “Executives are Professional Gamblers”

– Need an Edge..

» We are in the information age…

Consumption of analytics differentiates

» Embedded analytics vs. ppt culture (deckware?)

The idea is linked to the concept of a digital age or digital revolution, and carries the ramifications of a shift from traditional industry that the industrial revolution brought through industrialization, to an economy based on the manipulation of information, i.e., an information society (computation)

It’s all about analytics done at “SCALE”, enabling scale cost effectively.

Who Cares?


A Mindset for Enabling Decision Sciences At Scale

Dataset -> Toolset -> Skillset -> Mindset

Big Data – What is it?

Skillset

– Computer Scientist

» Map-Reduce, HDFS, HIVE, R, PIG, NoSQL, Unstructured Data

» Parallel computation & Stream Processing

– Data & Visual Explorer

» Data “Artisans” / Design Thinking

– Math / Stats / Data Mining / Econometrics / Machine Learning

Mindset

– Horizontal thinking

– Automation and Scale

» Higher utilization of machines for decisions

– Experimentation & Agility

– Learning vs Knowing

– Consumption Focused

– Algorithmic and Heuristic (Man & Machine)

– Intrapreneur, Curious, Persuasive, Soft Skills


The Big Data ecosystem is a challenge to keep up with. Mindset: Knowing vs. Learning

The Big Data Ecosystem

“Formulating a right question is always hard, but with big data, it is an order of magnitude harder, because you are blazing the trail (not grazing on the green field).” Source: Gartner

Source: The Big Data Group llc


Big Data Tech Moves Toward the Center, Not the Periphery

Source: cloudera


Other Big Data Examples of Monetization

Large insurance companies spend hundreds of millions a year on “trip charges” for new homeowner quotes

– Significantly reduced spend with image analysis of roof types

Recommender algorithms implemented on a large e-commerce retail site

– Support cross-sell, new revenue stream and improved customer delight

Reduction in annual license and software fees, storage, and server expenses

Operational analytics improvements

– Improved customer experience thru scoring & tagging more rapidly

– Improved inventory management and customer satisfaction thru SKU level prediction and control

– Implemented test-and-learn methods at scale for store and SKU optimization

– Ad monetization, improved conversion due to accelerated elasticity estimation and segment tuning

Real time social media “intervention” generating sales leads

New revenue streams created by monetization of client’s transaction data

Data Hub / Lake / Reservoir / 360 / Fabric / Integration: Finally the EDW Lives?


Quantified Self – Get ready for a Wave of Sensor Data & Analytics. Tricorder? THINK BIGGER…

http://en.wikipedia.org/wiki/Quantified_Self

– “a collaboration of users and tool makers who share an interest in self knowledge through self-tracking”

– The primary methodology of self-quantification is data collection, followed by visualization, cross-referencing and the discovery of correlations.

Wearable Computing

Star Trek: Computer, Communicator, Replicator…


Trend: Dashboards will Evolve to be More Actionable and Forward Looking

Dashboredom?

http://www.information-management.com/issues/20_7/dashboards_data_management_quality_business_intelligence-10019101-1.html

“As we move from the information age into the intelligence age”

Usability and Visualization: Evolution


Trend: Interactive Visualization Improvements and the Importance of Aesthetics & Design

Advanced Visualization will continue to increase in depth and relevance to broader audiences. Visionary vendors like Tableau, QlikTech, SAS JMP/Visual Analytics, Panopticon, and Spotfire (now Tibco) made their mark by providing significantly differentiated visualization capabilities compared with the trite bar and pie charts of most BI players' reporting tools.

The latest advances in state-of-the-art UI technologies such as Microsoft’s Silverlight, Adobe Flex, R, Processing, and JavaScript/HTML5 via frameworks like Ext JS and D3 (http://d3js.org/), along with multi-touch devices, augur a revolution in visualization capabilities…

http://bardoli.blogspot.com/2009/12/top-10-trends-for-2010-in-analytics.html

Movie: Avengers & Iron Man

http://selection.datavisualization.ch/

Usability and Visualization: Evolution


Analytics Trends & Big Data

The Decision Sciences Journey: Key Learnings & Scale

Agenda

Analytics Trends


Reference Design

Decision Support Stack for Operationalizing Analytics at Scale

Stack layers, as labeled in the figure:
– Math + Business + Technology ++ Design
– Math + Technology
– Technology
– Dataset

Weak Point for Enterprises Today in Scaling Decision Sciences


Innovation thrives at the intersections of disciplines

Disciplines: Business, Math, Technology

– Consulting: ++ Applied Math; -- High Cost
– IT: ++ Business; ++ Applied Math

One has to play in the continuum…

Culture Traits
– Learning over Knowing
– Agility
– Scale & Convergence
– Multi-disciplinary skills
– Innovativeness
– Cost Effectiveness
– Experimentation


The dynamic nature of the business creates the need for organizations to solve ill-defined shifting problems

Muddy Problem -> Fuzzy Problem -> Clear Problem -> Solution

Change / Learn the business -> Run the business

Approach at each stage, from muddy to clear:
– Agility experiments; discovery based; experience with business
– Consultative approach; semi-automated; socializing with smaller sub groups
– Solid understanding of what all happened to reach this stage; need to design for scale; tech based

Heuristic end of the spectrum:
– Right brained; creative; judgment based; experimental; more complex; multi-input, multi-output

Algorithmic end of the spectrum:
– Left brained; routine; process oriented; more defined; more mature; amenable to automation

Mystery -> Heuristic -> Algorithmic

Codification -> Toolification -> Systematization

Segments of Problems


Two different paradigms are emerging in analytics and decision sciences

Problem-Driven Approach

Business Problem -> Hypothesis -> Data Available? / Data Sufficient? (N: Acquire Data; Y: proceed) -> Analysis -> Insights & Recommendations

Discovery-Driven Approach

Agenda-less Observation -> Data Exploration -> Pattern Anticipation -> Business Opportunity -> Empirical Validation -> Continuous Improvement

REQUIRE HARMONY: NOT JUST BALANCE AND NOT ONLY CONFLICT


The information age requires one to think in terms of the decision supply chain: Rinse and Repeat

Manufacturing Supply Chain

Source -> Produce -> Distribute -> Consume

Decision Sciences Supply Chain

Define Problem -> Gather Data -> Perform Analysis -> Get Results -> Generate Insights -> Give Recommendations -> Communicate Action -> Perform Action -> Measure Success

Example: Pipeline Analytics


…and the ability to quickly translate and consume decision sciences to drive strategic imperatives

Align Incentives

Creation

– Creation of insights using data and application of Business, Math and Technology to solve business problems

– Translation of business problems to analytics problems and translation back from analytics solutions to business solutions

– Evangelization and consumption of insights and recommendations to make, monitor and measure decisions

Translation & consumption of decision sciences will be core to providing competitive advantage

Institutionalization of Decision Sciences


Decision Sciences is NOT an Evolution from BI & Analytics

Vertical axis: Need for Creativity (Low to High)

Horizontal axis: Need for Rigor & Process Discipline (Low to High)

– Type I: Data Management
– Type II: Reporting & MIS (Descriptive)
– Type III: Data Analysis (Inquisitive)
– Type IV: Modeling & Forecasting (Predictive)
– Type V: Insight Generation (Prescriptive)


It is a SYMPHONY of multiple notes working together

– Descriptive Analytics (D)
– Inquisitive Analytics (I)
– Predictive Analytics (P)
– Prescriptive Analytics (P)


We are in the ‘Information Age’ and it’s time that data and human analytics capabilities are augmented with machine intelligence: Superman vs. Iron Man vs. Robocop

Intelligent Systems

Intelligent Systems: Next Wave of Innovation and Change (Iron Man Suit)

Data Mining Finally Meets the Operation.. Powered by Big Data Scale & Value

Convergence of the global industrial system with the power of advanced computing, analytics, low-cost sensing and new levels of connectivity permitted by the Internet

* Source: GE (Industrial Internet – Pushing the Boundaries of Minds and Machines)

– Intelligent Machines
– Advanced Analytics
– People At Work


The current business scenario demands automating and injecting intelligence into various enterprise touch points

Business Situation

Trigger: Yearly Bonus

Touch points: Search (“Buy laptop”), Webpage, Add to Cart, Checkout, Email

Customer thoughts along the way:
– “Let me buy a new laptop”
– “Hey! I have some good deals being provided here”
– “Wow! Great offers. Since I have some more money, let me have a look”
– “Yeah, I think I need these headphones too”
– “Yippee! 5% discount”

Behind the Scenes
– Marketing Mix
– CRM
– Revenue Management
– Customer Segmentation
– Propensity Score
– Attribution Methods to Optimize Media Spend
– Pricing Elasticity
– Taxonomy, Associations, Nearest Neighbor
– Popup Optimization, Need State Personalization
– Collaborative Filtering
– Recommender Systems
– Offer Optimization

Add Intelligent Analytics Everywhere You Can to Unlock The Value In Information
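As a rough illustration of the “Collaborative Filtering” and “Nearest Neighbor” items behind the laptop-and-headphones scenario, the sketch below scores items by cosine similarity of co-purchase counts. It is an assumption about how such a recommender could work, not the system shown on the slide; the function names and the basket data are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from typing import Dict, List, Set


def item_similarity(baskets: List[Set[str]]) -> Dict[str, Dict[str, float]]:
    """Cosine similarity between items, based on how often they are co-purchased."""
    item_count: Dict[str, int] = defaultdict(int)
    pair_count: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for basket in baskets:
        for item in basket:
            item_count[item] += 1
            for other in basket:
                if other != item:
                    pair_count[item][other] += 1
    return {
        item: {
            other: n / sqrt(item_count[item] * item_count[other])
            for other, n in others.items()
        }
        for item, others in pair_count.items()
    }


def recommend(cart: Set[str], baskets: List[Set[str]], top_n: int = 1) -> List[str]:
    """Rank items not yet in the cart by summed similarity to the cart's items."""
    sims = item_similarity(baskets)
    scores: Dict[str, float] = defaultdict(float)
    for item in cart:
        for other, sim in sims.get(item, {}).items():
            if other not in cart:
                scores[other] += sim
    # Sort by descending score, then by name so ties are deterministic.
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return ranked[:top_n]
```

With a purchase history in which laptops are most often bought with headphones, a cart containing only a laptop yields headphones as the first suggestion, which is the “I think I need these headphones too” moment in the figure.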


Current News on Intelligent Systems

CEP Vendors Acquired June 2013: Streambase (Tibco) and Progress Apama (Software AG)

USA Today, September 16, 2012

– http://www.usatoday.com/money/business/story/2012/09/16/review-automate-this-probes-math-in-global-trading/57782600/1

Two Second Advantage: Published 2011

– Wayne Gretzky’s edge: he predicted, a little better than anyone else, where the hockey puck would be.

– Enterprise 2.0: Every transaction became a bit of digital data

– Enterprise 3.0: Every EVENT can become a bit of digital data

– There will always be value in making projections days, months, or years ahead with predictive analytics, data mining, and analytics systems built on historical data: batch systems

– But in today’s world, enterprises need something more: an instantaneous, proactive, predictive capability (real time systems)
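The batch versus real time contrast above can be made concrete with a small sketch. A batch system recomputes statistics over stored history; a real time system has to update them as each event arrives. The class below uses Welford's online algorithm for that incremental update. It is an illustrative assumption, not something described in the deck, and the name `RunningStats` is hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class RunningStats:
    """Welford's online algorithm: update mean and variance one event at a time."""
    count: int = 0
    mean: float = 0.0
    _m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        """Fold a single new event into the statistics without storing history."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def std(self) -> float:
        """Population standard deviation of the events seen so far."""
        return sqrt(self._m2 / self.count) if self.count else 0.0
```

Because each update touches only three numbers, the same object can sit inside an event stream and feed a prediction the moment an event lands, rather than waiting for the next batch run.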

GE Mind and Machines: November 2012

» Industrial Internet, Intelligent Systems, and Intelligent Machines

» http://www.ge.com/docs/chapters/Industrial_Internet.pdf

» http://www.youtube.com/watch?v=SvI3Pmv-DhE



Typical Analytics Workflow

Tools involved: Teradata, SAS, VBA Excel, CPLEX, Excel

Share of the process, as labeled in the figure: 15%, 70%, 15%

Steps and artifacts shown in the figure (the figure numbers its steps 1 to 11):
– Email Request (Client – Analytics)
– Time Period Selection
– Sister Store Selection
– Basic data pull
– Processing of codes
– Set inputs in the code
– Run the optimization process
– Optmodel optimization
– CPLEX optimization
– Output as excels
– Client Report Generation
– Report

Example of Analytics Batch Process Mgmt


Success of an Analytics Shop leads to the Golden “Handcuffs” that must be broken to scale..

Business Situation

Utilizing Analytics Process Orchestration Can Help Organizations Tackle These Specific Problem Areas at Scale

– Creation and consumption challenges due to a lack of seamless technology integration

– Scaling issues due to lack of a process metaphor

– Insufficient documentation by developers leading to knowledge management challenges

– Lack of automation for hidden decisions

– Analysts unable to spend time with high value analytics due to operational issues: data pull, technology integration

– KPIs to be monitored over time: difficult to detect exceptions automatically


Utilize “Analytics BPM”: Workflow and Business Process Management Platform to Manage Man & Machine

Batch Analytics Automation

Performance Gains Across Client Engagements

Industry: Retail, Technology, Healthcare

Engagement:
– Automates reporting framework
– Optimizes search engine marketing spends
– Helps create marketing mix solution

Business Impact:
– Improvement in analyst productivity
– Increased collaboration and utilization of better process management strategies
– Improved model granularity
– Flexible bidding process
– Increased click through rates by 8%
– Easier tool deployment
– Reduced development time by 15%
– Support creation of additional reports rapidly

Analytics BPM allows one to build human and machine workflows and embed them within your enterprise.

• Nexus BPM v1.5 (open source LGPL; based on JBoss jBPM)

• muFLOW v2.1 (based on Activiti.org), a Mu Sigma product

Key Features

– A modeling environment that “glues” together reusable components from multiple technologies and enables automation

– Extensible and Self Documenting

– Ability to create rich analytics vehicles that can be customized by use case

– Provision to leverage past knowledge

– Facilitates efficient process management such as modeling, automation, monitoring, and analysis
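The idea of gluing reusable components from multiple technologies into one automated, self-documenting flow can be sketched in a few lines. This is not the Nexus BPM or muFLOW API; `Step`, `Workflow`, and the step names are hypothetical, and each step is a plain Python callable standing in for a call out to another tool.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]


@dataclass
class Step:
    name: str
    technology: str  # label only; a real step would invoke this tool
    run: Callable[[Context], Context]


@dataclass
class Workflow:
    name: str
    steps: List[Step] = field(default_factory=list)
    log: List[str] = field(default_factory=list)

    def add(self, step: Step) -> "Workflow":
        """Append a reusable component and return self so calls can be chained."""
        self.steps.append(step)
        return self

    def execute(self, context: Context) -> Context:
        """Run every step in order, passing the context along and logging each run."""
        for step in self.steps:
            context = step.run(dict(context))
            self.log.append(f"{step.name} [{step.technology}]: ok")
        return context

    def describe(self) -> str:
        """Self-documentation: the flow is readable straight from its definition."""
        return " -> ".join(f"{s.name} ({s.technology})" for s in self.steps)
```

A three-step flow (data pull, optimization, report) built this way carries its own description and run log, which speaks to the documentation and knowledge management problems the earlier slide lists.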


When should an organization think about Analytics BPM?

muFlow™ - Objective

Visual Metaphor + Component Model = Extensible & Agile

Analytics BPM

We were having trouble integrating different technology solutions within our organization. We had automated individual technologies to a significant extent, but we were facing roadblocks connecting certain disparate application systems with each other. After some research, we moved our control layer to a BPM and everything has been considerably smooth ever since.

– a banking client


Create intelligent business workflows; linking People, Systems, Information and Analytics on one platform

Example: Propensity to Click


Trend: Operational Analytics, You are the Exception

Intelligent Systems

Source: Tibco

Tip: Utilize Three Interacting Modules to Create an Intelligent Exception System

– Signal Generator
– Filter / Rules
– Root Cause Learner
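The slide names the three modules but not their internals, so the sketch below is one possible reading rather than the design shown in the source figure. It assumes the signal generator flags KPI values far from a trailing-window mean, the filter keeps only signals that pass every business rule, and the root cause learner remembers analyst-confirmed causes. All names and thresholds are hypothetical.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Callable, Dict, List, Optional


@dataclass
class Signal:
    kpi: str
    index: int
    value: float
    z_score: float


def generate_signals(kpi: str, values: List[float], window: int = 5,
                     threshold: float = 3.0) -> List[Signal]:
    """Signal generator: flag points far from the trailing-window mean."""
    signals = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), pstdev(history)
        if sigma == 0:
            continue  # flat history: z-score undefined, so no signal is raised
        z = (values[i] - mu) / sigma
        if abs(z) >= threshold:
            signals.append(Signal(kpi, i, values[i], z))
    return signals


Rule = Callable[[Signal], bool]


def filter_signals(signals: List[Signal], rules: List[Rule]) -> List[Signal]:
    """Filter / rules: keep a signal only if every rule accepts it."""
    return [s for s in signals if all(rule(s) for rule in rules)]


class RootCauseLearner:
    """Root cause learner: tally confirmed causes and suggest the most frequent."""

    def __init__(self) -> None:
        self._causes: Dict[str, Counter] = defaultdict(Counter)

    def record(self, kpi: str, cause: str) -> None:
        self._causes[kpi][cause] += 1

    def suggest(self, kpi: str) -> Optional[str]:
        if not self._causes[kpi]:
            return None
        return self._causes[kpi].most_common(1)[0][0]
```

The modules interact in a loop: surviving signals go to an analyst, the confirmed cause is recorded, and the next exception on the same KPI arrives with a suggested cause attached.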


Future is Now: “Top Shops” & the Realization of Intelligent Systems: The Algorithm Becomes a First Class Citizen..

Source: SAS Model Manager

Intelligent Systems

Analytic Lifecycle Management

• “Software” Parallels

• Dev, Test, Stage, Production, & Versioning

• Algorithms are first class

• Re-Use

• API Economy

Continuous Analytics performance management

• Champion/Challenger

• Live Update and Optimization & Simulation

Model Management

Algo Wars
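Champion/Challenger can be sketched as a simple promotion rule: score the production model (champion) and a candidate (challenger) on the same recent data, and promote the challenger only if it is better by a margin. This is a minimal illustration under that assumption and does not reflect SAS Model Manager's actual behavior; the function names, the error metric, and the 5% margin are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Model = Callable[[float], float]
Observation = Tuple[float, float]  # (input, actual outcome)


def mean_abs_error(model: Model, data: Sequence[Observation]) -> float:
    """Average absolute gap between the model's prediction and the outcome."""
    return sum(abs(model(x) - y) for x, y in data) / len(data)


def pick_champion(champion: Model, challenger: Model,
                  recent_data: Sequence[Observation],
                  min_improvement: float = 0.05) -> str:
    """Return which model should serve production after this comparison.

    The challenger is promoted only if its error beats the champion's by at
    least `min_improvement` (relative), which avoids swapping models on noise.
    """
    champion_err = mean_abs_error(champion, recent_data)
    challenger_err = mean_abs_error(challenger, recent_data)
    if challenger_err < champion_err * (1.0 - min_improvement):
        return "challenger"
    return "champion"
```

Run on a schedule against fresh outcomes, a rule like this is one way to treat the algorithm as a first class, versioned asset with its own promotion path from test to production.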


Questions?
