Why Hadoop for 360-Degree Customer Insight? - A Technical Primer

Mark Rittman, CTO, Rittman Mead
November 2015

T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India)
E : [email protected] W : www.rittmanmead.com



What is Customer 360-Degree Analysis?

•Gather together all meaningful information about the customer (“360-degree view”)
•Organizing, matching, profiling & storing every interaction in real time
•Matched and combined; factual, interpreted, learned
‣Across all channels, and on public forums and social media
•Captures interactions across all touch-points and all channels
‣Including activity on social networks, forums, blogs
•Typically stored and processed in a Hadoop “data reservoir”
•Dynamic customer profiles with segmentation, behavioural analysis “at scale”
•Downstream feeds into DW, CRM and other systems


Adding “Who” and “Why” to Customer & Transaction Data

•Detailed events and transactions now combined with granular behavioural & attitudinal data

[Diagram: four kinds of data feed a Single Customer View / Enriched Customer Profile through correlating, modeling, machine learning and scoring]
•“Who” Descriptive Data: attributes, segments, relationships, demographics
•“What” Behavioural Data: transaction history, retail activity, payment history, basket analysis
•“How” Interaction Data: voice + chat transcripts, in-person dialogs, web server logs
•“Why” Attitudinal Data: blogs, surveys, social media


Connect the Silos, Understand Customers, Drive Decisions

[Diagram: siloed data sources feed an Enriched Customer Profile, which in turn drives decisions]
•Listen better: consumption logs, clickstream & devices; demographic, user and credit data; customer contacts and service cases; transactions and subscriptions; content metadata, ratings, comments; marketing campaign response; social media activity
•Learn faster: Enriched Customer Profile (micro-segments, history, preferences) built by correlating, modeling and scoring
•Execute smarter: programmatic advertising; audience acquisition, retention; multi-channel marketing; targeted promotions; next best offer; personalized content; product & service strategy; content acquisition


But Wait … Isn’t This Just Data Warehousing & Data Mining?

•Data warehouses were conceived as a single source of reporting truth
•Formally accept, model and integrate data to provide an analytical reporting platform
•Well-established design patterns for long-term data storage
•Stored in structured, indexed, optimised “schema on write” storage
•Data moved through layers via formal ETL
•Extreme Performance, Highly Secure
•Analytic SQL, In-Database Analytics
‣So why not use them for this Customer 360 data?


Data Warehouse Loading Requires Formal ETL and Modeling

[Diagram: an ETL developer and a data modeller load curated data, via ETL and a data model, into a single analytic DBMS node costing $1m]


Traditional DW Databases Started as Single-Node Systems

[Diagram: an ETL developer and a data modeller load curated data, via ETL and a data model, into a single $1m analytic DBMS node]
•ETL development takes time, is fragile, but results in well-curated data
‣But what about data whose schema is not known?
‣Or whose final use has not yet been determined?
•Dimensional data modelling gives structure to the data for business users
‣But also restricts how that data can be analysed
‣What if the end-user is better placed to apply that schema?


Traditional DW Databases Started as Single-Node Systems

[Diagram: curated data in a single $1m analytic DBMS node, loaded via ETL and a data model, feeding tools labelled CRM, R and Data Mining/Stats]
•Analytic workloads typically originate with tabular, structured data
•Well-suited to dashboards and reporting, dimensional analysis
•Most also support data mining, advanced analytics
•But limited in terms of support for flexible-schema datasets
•And limited support for unstructured and semi-structured data


Traditional DW Databases Started as Single-Node Systems

[Diagram: a single $1m analytic DBMS node holding one DB instance and its compute, loaded via ETL and a data model by an ETL developer and a data modeller]
•Databases such as Oracle were originally designed for a single server node
•Scalability achieved by vertical scaling (“scale-up”), i.e. buy a bigger server
•But servers have limits in terms of how powerful just one can get
•And cost rises exponentially as RAM, CPU etc. increase


Shared-Everything DWs Can Scale to 5-10 Nodes - At Cost

[Diagram: four analytic DBMS nodes at $1m each, each with its own compute, sharing a single DB instance; loaded via ETL and a data model by an ETL developer and a data modeller]
•But limits on how far this can go
•Maximum size of cluster around 5-10 nodes
•And cost - each node typically costs $1m


Shared-Nothing DWs Scale Further - But Require Sharding

[Diagram: three analytic DBMS nodes at $1m each, each with its own compute and DB shard (key ranges A-H, I-M, N-Z), under one data model and loaded by complex shard-aware ETL]
•Shared-nothing databases can potentially scale further
•No need to maintain a single database instance
•But scaling achieved through “sharding” the dataset
•ETL and other processes need to consider data locality
•Leads to more complex ETL than single-instance
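As a rough illustration of why shard-aware ETL is more involved, the sketch below routes each row of a load batch to the shard that owns its key. The A-H / I-M / N-Z ranges come from the diagram; the shard names, the `name` field and the helper functions are hypothetical and not taken from any particular database product.

```python
# Hypothetical shard map mirroring the A-H / I-M / N-Z split in the diagram.
SHARD_RANGES = [("A", "H", "shard_1"), ("I", "M", "shard_2"), ("N", "Z", "shard_3")]


def shard_for(customer_name: str) -> str:
    """Return the shard that owns a customer, keyed on the first letter of the name."""
    first = customer_name[:1].upper()
    for low, high, shard in SHARD_RANGES:
        if low <= first <= high:
            return shard
    raise ValueError(f"no shard owns key {customer_name!r}")


def route_batch(rows):
    """Split one ETL batch into per-shard batches so each row is loaded where it lives."""
    batches = {}
    for row in rows:
        batches.setdefault(shard_for(row["name"]), []).append(row)
    return batches


# Illustrative rows only; not real customer data.
rows = [
    {"name": "Adams", "spend": 120},
    {"name": "Jones", "spend": 80},
    {"name": "Smith", "spend": 45},
]
batches = route_batch(rows)
```

A single-instance load would simply insert all three rows; here every loader, join and lookup has to go through something like `shard_for` first.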


Shared-Nothing DWs Scale Further - But Require Sharding

[Diagram: seven analytic DBMS nodes at $1m each, each with its own compute and DB shard (key ranges A-F, G-J, K-N, O-R, S-T, U-W, X-Z), under one data model and loaded by complex shard-aware ETL]
•..and adding more nodes means re-sharding the dataset
•Also rules out mixed-workload DBs with OLTP
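To get a feel for the cost of re-sharding, the sketch below compares the three-range layout with the seven-range layout and lists which first-letter keys change owner. The key ranges are the ones in the diagrams; which node ends up with which range is an assumption made for illustration (the original three nodes are assumed to take A-F, O-R and S-T).

```python
import string

# (low, high, node) ranges. Node names and range-to-node assignment are assumptions.
BEFORE = [("A", "H", "node1"), ("I", "M", "node2"), ("N", "Z", "node3")]
AFTER = [
    ("A", "F", "node1"), ("O", "R", "node2"), ("S", "T", "node3"),
    ("G", "J", "node4"), ("K", "N", "node5"), ("U", "W", "node6"), ("X", "Z", "node7"),
]


def owner(letter, layout):
    """Return the node whose key range contains the given first letter."""
    for low, high, node in layout:
        if low <= letter <= high:
            return node
    raise ValueError(f"no node owns {letter!r}")


# Keys whose owning node differs between the two layouts must be physically moved.
moved = [c for c in string.ascii_uppercase if owner(c, BEFORE) != owner(c, AFTER)]
stayed = [c for c in string.ascii_uppercase if c not in moved]
```

Under these assumptions most key ranges change owner, so growing the cluster means a large data movement rather than simply plugging in more nodes.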


Introducing Hadoop - Cheap, Flexible Storage + Compute

•A new approach to data processing and data storage
•Rather than a small number of large, powerful servers, it spreads processing over large numbers of small, cheap, redundant servers
•Spreads the data you’re processing over lots of distributed nodes
•Has scheduling/workload process that sends parts of a job to each of the nodes
•And does the processing where the data sits
•Shared-nothing architecture
•Low-cost and highly horizontally scalable

[Diagram: a Job Tracker above a row of Task Trackers and Data Nodes]
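The idea of sending parts of a job to the nodes that hold the data can be sketched in plain Python. This is a single-process simulation only, not Hadoop's API: the node names, the page-hit records and the `map_on_node` / `run_job` helpers are all made up for illustration.

```python
from collections import Counter

# Hypothetical cluster: each node stores its own block of web-log records.
NODE_BLOCKS = {
    "node1": ["home", "basket", "home"],
    "node2": ["checkout", "home"],
    "node3": ["basket", "basket"],
}


def map_on_node(records):
    """Task that runs on the node storing the block: count page hits locally."""
    return Counter(records)


def run_job(node_blocks):
    """Scheduler stand-in: one task per node, then merge the small partial results."""
    partials = {node: map_on_node(block) for node, block in node_blocks.items()}
    total = Counter()
    for partial in partials.values():
        total.update(partial)
    return partials, total


partials, total = run_job(NODE_BLOCKS)
```

Only the small per-node counts travel across the network in this model; the raw records stay on the node that stores them.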


Introducing Hadoop - Cheap, Flexible Storage + Compute

•Hadoop & NoSQL better suited to exploratory analysis of newly-arrived data
‣Flexible schema - applied by user rather than ETL
‣Cheap expandable storage for detail-level data
‣Better native support for machine-learning and data discovery tools and processes
‣Potentially a great fit for our new and emerging customer 360 datasets, and great platform for analysis
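A flexible schema applied by the user can be sketched as "schema on read": the raw records are stored untouched, and each analysis projects only the fields it needs when it reads them. The records, field names and `read_with_schema` helper below are hypothetical.

```python
import json

# Raw interaction events stored as-is; note the records do not share one layout.
RAW_LINES = [
    '{"customer_id": 1, "channel": "web", "page": "/offers"}',
    '{"customer_id": 2, "channel": "chat", "transcript": "Where is my order?"}',
    '{"customer_id": 1, "channel": "web", "page": "/basket", "device": "mobile"}',
]


def read_with_schema(lines, fields):
    """Parse raw JSON lines and project only the fields this analysis cares about."""
    for line in lines:
        record = json.loads(line)
        # Fields missing from a record come back as None instead of failing the load.
        yield {field: record.get(field) for field in fields}


# Two users apply two different schemas to the same raw data.
web_view = [r for r in read_with_schema(RAW_LINES, ["customer_id", "page"]) if r["page"]]
chat_view = [
    r for r in read_with_schema(RAW_LINES, ["customer_id", "transcript"]) if r["transcript"]
]
```

Nothing had to be modelled up front: the new `device` field in the third record is stored without any change to a loader, and can be picked up later by whoever needs it.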


Hadoop Designed for Real-Time Storage of Raw Data Feeds

[Diagram: real-time feeds (voice + chat transcripts, call center logs, chat logs, iBeacon logs, website logs) landing as raw data on a single $50k Hadoop node]


Supplement with Batch + API Loads of ERP + 3rd Party Data

[Diagram: raw data on a single $50k Hadoop node, fed by real-time feeds (voice + chat transcripts, call center logs, chat logs, iBeacon logs, website logs) and by batch loads, APIs and web service calls (CRM data, transactions, social feeds, demographics)]


Supplement with Batch + API Loads of ERP + 3rd Party Data

[Diagram: raw data on a single $50k Hadoop node, fed by real-time feeds, batch and API (voice + chat transcripts, call center logs, chat logs, iBeacon logs, website logs, CRM data, transactions, social feeds, demographics), serving Customer 360 apps, predictive models, SQL-on-Hadoop and business analytics]


Supplement with Batch + API Loads of ERP + 3rd Party Data

[Diagram: real-time feeds, batch and API (voice + chat transcripts, call center logs, chat logs, iBeacon logs, website logs, CRM data, transactions, social feeds, demographics) landing as raw data across a cluster filesystem spanning seven Hadoop nodes, each with its own compute; price labels shown: $5k, $50k]


Hadoop-Based Storage & Compute : A Better Logical Fit

[Diagram: real-time feeds, batch and API (voice + chat transcripts, call center logs, chat logs, iBeacon logs, website logs, CRM data, transactions, social feeds, demographics) land in a cluster of seven Hadoop nodes at $50k each]
•Hadoop Data Reservoir
‣Raw customer data stored at detail
‣Enriched and processed for insights
•Enriched Customer Profile: modeling, scoring
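The per-node price labels in the diagrams ($1m per analytic DBMS node, $50k per Hadoop node) allow a back-of-envelope comparison. This is hardware-label arithmetic only: it assumes the two kinds of node are counted one for one and ignores licensing, per-node capability and operating costs.

```python
DW_NODE_COST = 1_000_000    # per analytic DBMS node, as labelled in the diagrams
HADOOP_NODE_COST = 50_000   # per Hadoop node, as labelled in the diagrams


def cluster_cost(nodes, cost_per_node):
    """Total label price of a cluster with the given number of identical nodes."""
    return nodes * cost_per_node


# Seven nodes matches the largest cluster drawn in the diagrams.
seven_node_dw = cluster_cost(7, DW_NODE_COST)
seven_node_hadoop = cluster_cost(7, HADOOP_NODE_COST)

# How many Hadoop nodes the label price of one DW node would buy.
hadoop_nodes_per_dw_node = DW_NODE_COST // HADOOP_NODE_COST
```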


Typically Stored on Flexible, Scalable Hadoop + NoSQL

[Diagram: real-time feeds, batch and API (voice + chat transcripts, call center logs, chat logs, iBeacon logs, website logs, CRM data, transactions, social feeds, demographics) land in a cluster of seven Hadoop nodes at $50k each]
•Hadoop Data Reservoir
‣Raw customer data stored at detail
‣Enriched and processed for insights
•Enriched Customer Profile: modeling, scoring


Deployed on Oracle Big Data Appliance Engineered System

•Oracle Engineered system for big data processing and analysis
•Start with Oracle Big Data Appliance Starter Rack - expand up to 18 nodes per rack
•Cluster racks together for horizontal scale-out using enterprise-quality infrastructure

Oracle Big Data Appliance Starter Rack + Expansion
•Cloudera CDH + Oracle software
•18 High-spec Hadoop Nodes with InfiniBand switches for internal Hadoop traffic, optimised for network throughput
•1 Cisco Management Switch
•Single place for support for H/W + S/W

[Diagram: Enriched Customer Profile (modeling, scoring) on appliance racks linked by InfiniBand]


Architected using “Data Reservoir” Design Pattern

•Data for customer 360 system typically landed into a Hadoop & NoSQL-based data reservoir
•Applies aggregation, joining and machine-learning processes to extract insights

[Diagram: “Data Reservoir” design pattern on a Hadoop platform, with areas labelled Data Transfer, Data Factory, Data Reservoir, Data Access, and Discovery & Development Labs]
•Inputs: operational data (transactions, customer master data), unstructured data (voice + chat transcripts), data streams
•Integration: file-based, stream-based, ETL-based
•Raw Customer Data: data stored in the original format (usually files) such as SS7, ASN.1, JSON etc.
•Mapped Customer Data: datasets produced by mapping and transforming raw data
•Models, machine learning, segments
•Consumers: business intelligence tools, marketing/sales applications
•Discovery & Development Labs: safe & secure discovery and development environment; datasets and samples; models and programs
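The raw-to-mapped step, followed by joining and aggregation, might look like the sketch below. Everything in it is illustrative: the field names, the sample records and the `map_raw_events` / `build_profiles` helpers are assumptions, and a real reservoir would run this kind of logic on the cluster rather than in a single Python process.

```python
import json
from collections import defaultdict

# Raw customer data kept in its original format (JSON lines here).
RAW_WEB_EVENTS = [
    '{"cust": "C1", "page": "/offers"}',
    '{"cust": "C2", "page": "/home"}',
    '{"cust": "C1", "page": "/basket"}',
]
# Operational data arriving from a transactional system.
TRANSACTIONS = [
    {"customer_id": "C1", "amount": 40.0},
    {"customer_id": "C1", "amount": 15.5},
    {"customer_id": "C2", "amount": 9.0},
]


def map_raw_events(lines):
    """Raw -> mapped: parse each raw line and rename fields to a common layout."""
    return [{"customer_id": e["cust"], "page": e["page"]} for e in map(json.loads, lines)]


def build_profiles(mapped_events, transactions):
    """Join both datasets on customer_id and aggregate into one profile per customer."""
    profiles = defaultdict(lambda: {"page_views": 0, "total_spend": 0.0})
    for event in mapped_events:
        profiles[event["customer_id"]]["page_views"] += 1
    for txn in transactions:
        profiles[txn["customer_id"]]["total_spend"] += txn["amount"]
    return dict(profiles)


profiles = build_profiles(map_raw_events(RAW_WEB_EVENTS), TRANSACTIONS)
```

The resulting per-customer profile is the kind of dataset that models and segments would then be built on.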
