Our Big Data Journey: A Consultant's Perspective


Upload: ed-kohlwey

Post on 05-Dec-2014

Category:

Technology


DESCRIPTION

David Douglas, CrinLogic

The Big Data headlines are unrelenting, with each passing day seemingly bringing new discoveries, products, partnerships, venture funds, you name it, into the mix. If anything, it is all a bit confusing. Listening to all this, you might conclude that Big Data will solve most of your problems, place your company miles ahead of your competition, drive your Net Promoter Scores through the roof, and fall just short of solving world hunger (ok…maybe not that far). And one can’t blame you if you think all one needs to do is install the Hadoop ecosystem of projects, conjure up some possible business use cases, throw some commodity hardware into the mix, attend some training, purchase some Big Data analytics software, and VOILA, you have arrived and can enjoy the fruits of your Big Data efforts.

All of that is said with tongue firmly planted in cheek; the reality is vastly different. This talk is partially a reality check on Big Data implementation strategies (starting with Big Data is easy, becoming proficient is hard, and fully integrating it into a broader enterprise data strategy is very hard) and partially an information-sharing session on what we’re learning as we engage with customers in various industries on Big Data. Among other things we will explore: building the business case; software and hardware requirements analysis; selection process and implementation approaches; what tends to work well, not so well, and what to avoid; and how big data is likely to affect enterprise data architecture.

David Douglas is a member of Hadoop-DC User Group and is a co-founder of CrinLogic, a Big Data consultancy based in the greater DC area. He has devoted his 17 years of professional experience to helping clients maximize the value of their strategic IT initiatives. Prior to co-founding CrinLogic, David started two other companies. The first was an angel-backed Sales Force Automation software company he sold in 2002 and the second is a consulting services company that focuses on Agile and Lean software adoption and large-scale program implementation services. He helped start the Data Warehousing practice at American Management Systems and was one of the first consultants to join IBM’s Business Intelligence practice.

TRANSCRIPT

Page 1: Our Big Data Journey: A Consultant's Perspective

CrinLogic

Our Big Data Journey

A Consultant’s Perspective

April 26, 2012

5/1/2012 1

Page 3: Our Big Data Journey: A Consultant's Perspective

A little Big Data story to start us out ;)

Page 4: Our Big Data Journey: A Consultant's Perspective

About me

David Douglas is a member of Hadoop-DC and co-founder of CrinLogic. He has over 17 years of IT consulting experience with a concentration in Business Intelligence, Agile and Lean software development, and large program implementations. He is a passionate believer in Big Data and the enormous possibilities it offers.

CrinLogic is a Big Data consulting firm. Our passion for Big Data is surpassed only by our curiosity and love of learning. We offer full-service Big Data consulting and training. Visit us at www.CrinLogic.com. We are based in DC, Chicago, Austin, and Sarajevo.

443.413.4038

Page 5: Our Big Data Journey: A Consultant's Perspective

This talk is about…

Some things I’d like you to walk away with

1. A big picture perspective on this market

2. What customers are saying

3. Thoughts on developing a business case

4. Learnings

What this talk is not about

1. A technical discussion of Big Data

Page 6: Our Big Data Journey: A Consultant's Perspective

Data for this talk came from

1. Talking with 30 plus companies from all walks of life and all stages of maturity

2. Talking with colleagues in the Big Data space (hardware and software vendors)

3. Current customer engagements

4. Research

My biggest surprise is my awareness of how little I know. No one really understands how to build a Big Data solution. We are all learning as we go.

Page 7: Our Big Data Journey: A Consultant's Perspective

My Perspective on Big Data

Enterprise Data Architecture

Post adopter syndrome

Business problem focused

Rising tide

Systems Thinking

Iterative

Failures

Page 8: Our Big Data Journey: A Consultant's Perspective

Yes it is big and growing!

Page 9: Our Big Data Journey: A Consultant's Perspective

Except when compared to Bieber

Page 10: Our Big Data Journey: A Consultant's Perspective

And me of course ;)

Though I’ve been trending down…

Page 11: Our Big Data Journey: A Consultant's Perspective

Need to expand my Network on LinkedIn

Page 12: Our Big Data Journey: A Consultant's Perspective

So what do the customers really think?

Page 13: Our Big Data Journey: A Consultant's Perspective

They are confused

and rightfully so!

Page 14: Our Big Data Journey: A Consultant's Perspective

No generally accepted definition for “Big Data”

“We don’t generate enough data for that”

“Don’t you need at least 100TBs?”

Or they simply think they are already using Big Data

And lest we forget the 3Vs…

Volume, Variety, Velocity (just a couple pointers on these)

Page 15: Our Big Data Journey: A Consultant's Perspective

So many products and choices promising so much

Where Database

Hardware

Open Source

Analytical Tools

Network

Page 16: Our Big Data Journey: A Consultant's Perspective

So many software offerings

Page 17: Our Big Data Journey: A Consultant's Perspective

No generally accepted definition for “Data Scientist”

Are they a critical success factor for Big Data Solutions? [True or False]

Were they a critical success factor to Business Intelligence solutions?

OSEMN: Obtain, Scrub, Explore, Model, iNterpret (www.dataists.com, Hilary Mason & Chris Wiggins)

Page 18: Our Big Data Journey: A Consultant's Perspective

So tell me the why please?

Page 19: Our Big Data Journey: A Consultant's Perspective

McKinsey’s 5 Value Propositions

1. Make information transparent and usable more readily

2. Expose variability and enable performance improvement

3. Better customer segmentations

4. Advanced analytics for better decision making

5. New products

Page 20: Our Big Data Journey: A Consultant's Perspective

Not seeing the Big Analytics Piece

In fact, many of the companies employing Big Data that we’ve talked to or are working with are not doing big data analytics

Page 21: Our Big Data Journey: A Consultant's Perspective

Tactical versus Strategic

Tactical solves an immediate pain point

•Batch jobs taking too long

•Reaching limit of scalability on current infrastructure

•Budget was reduced recently but still have to deliver

•New project ‘just so happens’ to need this newer technology

Strategic implies

•Seeking competitive differentiation

•Creating actual solutions with value

Big Data as strategic direction is much harder

Page 22: Our Big Data Journey: A Consultant's Perspective

Big Data Strategic Business Case Approach

Page 23: Our Big Data Journey: A Consultant's Perspective

Figure out the Business Case

Congratulations! The CEO of a large Financial Services firm has asked you and your team to map out the company’s Big Data Strategy so he can present to the board. He is known for being thorough. Now get to work!!

Of the below choices, which is the best first step?

a. Scour the Internet for Big Data use case success stories for Financial Services and then go talk to VPs in that area

b. Build a virtual cluster on your machine, open a direct link to the Twitter firehose, and show the CEO what the community is saying about him in real time

c. Phone a friend (or CrinLogic)

d. Build relationships, interview all areas of the company, research market, and consolidate the results [but time-box it to a couple weeks]

Goal is to identify the most appropriate areas to start…high reward…high visibility

Page 24: Our Big Data Journey: A Consultant's Perspective

This can be helpful…

Sample Large Financial Institution (process map; box labels listed in extraction order)

• Recover Charged Off Accounts
• Manage Delinquencies, Recoveries & Fraud
• Collect on Delinquent Accounts
• Manage Customer Relationship
• Service Customers
• Establish Strategic Imperatives
• Develop Business Strategy
• Develop Marketing Strategy
• Acquire Customers
• Develop Card Acquisition Offers
• Identify Prospects
• Decision Applications & Book Accounts
• Solicit Prospects & Promote Offers
• Fulfill on Decisions
• Develop Acquisition Campaigns
• Develop Collections Strategies
• Develop Recoveries Strategies
• Detect and Recover Fraud
• Fulfill on Offers/Changes
• Develop Account Management Offers & Policies
• Communicate Offers/Changes
• Decision Response/Request
• Identify Customers/Targets
• Design Account Management Campaigns
• Develop Fraud Strategies
• Define Customer Experience
• Maintain Accounts
• Process Credit Card Transactions
• Provide Customer Service
• Develop Servicing Strategies
• Develop Market Innovations
• Manage Information Technology
• Manage Regulatory Affairs & Compliance
• Manage Finance & Accounting
• Manage Rewards
• Manage Accounting and Reporting
• Manage Line of Business
• Manage Credit Risk
• Manage Planning & Analysis
• Manage Treasury
• Manage IT Operations
• Manage External Compliance
• Manage Correspondence
• Manage Funds Disbursements
• Manage Human Resources
• Manage HR

Page 25: Our Big Data Journey: A Consultant's Perspective

Identify Big Data Impact Areas

[Figure: the same process map as on the previous slide, repeated with an impact legend]

Legend: No Impact, Low Impact, Moderate Impact, High Impact

Page 26: Our Big Data Journey: A Consultant's Perspective

Naturally! Fraud & Recoveries

Manage Delinquencies, Recoveries & Fraud

Legend: No Impact, Low Impact, Moderate Impact, High Impact

Recover Charged Off Accounts

Collect on Delinquent Accounts

Develop Collections Strategies

Develop Recoveries Strategies

Detect and Recover Fraud

Develop Fraud Strategies

• Determine Collections Strategy

• Enter Collections

• Exit Collections Strategy

• Fulfill Collections Strategy

• Monitor Commitments

• Service Collections Account

• Charge Off Bad Debt

• Process Bankruptcies

• Process Estates

• Process Recoveries Payments

• Analyze Collections Strategies

• Maintain Collections Systems

• Detect Fraud

• Decision Identity Fraud

• Decision Transaction Fraud

• Recover Fraud

• Research Fraud Strategies

• Design/Test Fraud Strategies

• Implement Fraud Strategies

Page 27: Our Big Data Journey: A Consultant's Perspective

Be ready to answer these questions

1. Do they currently have an analytics group?

2. Do they make decisions based on data?

3. Do they have data center management skills?

4. Do they have stringent regulatory requirements?

5. What are the current sources of data?

6. What other sources of data are of interest?

7. What are their KPIs?

8. What is the maturity of their enterprise data architecture?

9. What is the maturity of their business intelligence initiative(s)?

10. Others?

Page 28: Our Big Data Journey: A Consultant's Perspective

Predictive Analytics Maturity Model

MAP 4.0 Product Features

Model courtesy of Dhiraj Rajaram, CEO Mu Sigma, and Joseph de Castelnau, SVP Engineering Nielsen

[Figure: maturity chart. Only the labels survive in this transcript; their placement on the chart is not recoverable.]

Axes: Maturity Level (1 to 5); Data/Software; Decision Sciences

Maturity stages: Analytics Laggard; Analytics Amateur (Some BI); Localized Analytics (Some Sales Drilldown); Analytical Practitioner (In-house Insights team); Analytical Master (Institutionalized Analytics); Analytics Holy Grail

Software and tool labels: MS Office Tools; Really basic Analytics; BI Tools/Reporting Engine; BI Reporting Tools; Internal attempts; Full BI Suites With Some Data Mining/Analytics (Mostly Built-In): Oracle Suite, Microsoft Suite, SAS, IBM, Spotfire, […]; Specialized/Targeted Analytics Products: Pricing and Promotions, Marketing Mix, Segmentation, Consumerization, Assortment, Churn/Attrition, Supply Chain, […]; Supply Chain Simulator, PnP Simulator, Mix Simulator, Media Buy, […]; Forward Looking DSS Global Suite (Supply Chain/PnP/Mix/Media …) with integrated workflow (ERP …)

Insight labels: Threshold based Insights; Automated Decks; Preemptive Suggestions; Insights Optimization; Automated Insights; Forecasting/Full Simulation; Actionable Implementable Insights; Full picture/context optimization

Annotation: Current market is fragmented and overlapping. Highly specialized. Many players often produced Excel-based tools.

Page 29: Our Big Data Journey: A Consultant's Perspective

Opportunity Areas

[Figure: bubble chart]

Axes: Business Strategic Value (Low to High) and IT Strategic Value (Low to High)

Size of bubble = Est. Effort

Highest benefits are most likely realized when building these products or features

Sources: “Measuring the Business Value of Information Technology”, Intel Press
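The bubble chart can be read as a simple filter and sort. The following is a minimal Python sketch, assuming each candidate has been scored from 0 to 1 on both axes and that the highlighted bubbles are the ones scoring high on both. The `Opportunity` type, the 0.5 threshold, and the example scores are hypothetical illustrations, not figures from the talk or from the Intel Press book.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Opportunity:
    """One bubble on the chart (all scores are hypothetical inputs)."""
    name: str
    business_value: float  # 0.0 (low) to 1.0 (high)
    it_value: float        # 0.0 (low) to 1.0 (high)
    effort: float          # bubble size, in any consistent unit


def shortlist(opps: Iterable[Opportunity], threshold: float = 0.5) -> List[Opportunity]:
    """Keep candidates that are high on both axes, smallest effort first."""
    high_high = [
        o for o in opps
        if o.business_value >= threshold and o.it_value >= threshold
    ]
    return sorted(high_high, key=lambda o: o.effort)


if __name__ == "__main__":
    # Illustrative scores only; the process names come from the fraud slide.
    candidates = [
        Opportunity("Detect Fraud", 0.9, 0.8, effort=5.0),
        Opportunity("Analyze Collections Strategies", 0.7, 0.6, effort=2.0),
        Opportunity("Process Estates", 0.2, 0.3, effort=1.0),
    ]
    for opp in shortlist(candidates):
        print(opp.name, opp.effort)
```

Sorting by effort reflects the "go small" advice later in the deck; a team could just as reasonably sort by value first.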

Page 30: Our Big Data Journey: A Consultant's Perspective

Opportunity Areas

[Figure: bubble chart]

Axes: Business Strategic Alignment (Low to High) and IT Strategic Alignment (Low to High)

Size of bubble = Est. Effort

So why do these get built?

Sources: “Measuring the Business Value of Information Technology”, Intel Press

Page 31: Our Big Data Journey: A Consultant's Perspective

The Things We Learned

Page 32: Our Big Data Journey: A Consultant's Perspective

Implementation Approach

• Big Data does not lend itself to a Big Bang approach (actually does anything really?)

• Proofs of concept make perfect sense to gain traction (top-down push is preferable to federated)

• As with any effort with such potential, it needs appropriate oversight by a combination of IT and business executives

Other considerations

• Form a central team with key skills in building Big Data solutions. This consulting team should help train, mentor, and provide consulting expertise to new initiatives…helps ensure consistency in approach.

• There is a price of entry…each new participating area should bring resources to the table

• Encourage building a community of analytic junkies and support them…community building … goal is information sharing…build a Big Data culture

• Preference for consolidation

Page 33: Our Big Data Journey: A Consultant's Perspective

Components of Commodity Hardware

# sockets, # cores, core memory, processor speed

SATA, SSD, # Disks

Page 34: Our Big Data Journey: A Consultant's Perspective

2009/2010 H/W Recommendations

• 4 x 1TB hard drives

• 2 x Quad-core CPUs, each 2.0-2.5GHz

• 16GB RAM

• Gigabit Ethernet

Page 35: Our Big Data Journey: A Consultant's Perspective

Commodity Hardware Today

1U (approx $4K): 4-6 x 2TB SATA drives; 1 or 2 sockets; 1 x 6 or 2 x 6 cores; 4GB core memory; 24+ GB RAM

2U (approx $6K): 12 x 2TB SATA drives; 1 or 2 sockets; 1 x 6 or 2 x 6 cores; 4-8GB core memory; 24+ GB RAM

4U (approx $12K): 20 or 36 x 2TB SATA drives, 80 x 3TB (???); 1 or 2 sockets; 1 x 6 or 2 x 6 cores; 4-8GB core memory

10GB Ethernet? ($7K)
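To make the drive counts on this slide concrete, here is a rough sizing sketch in Python. It assumes the 12 x 2TB configuration pairs with the roughly $6K price, uses HDFS's default replication factor of 3, and reserves a quarter of raw disk for intermediate job output. The `NodeSpec` type, the function names, and the 25% scratch reserve are hypothetical choices for illustration, not figures from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """One commodity server; the example values below come from the slide."""
    name: str
    drives: int
    tb_per_drive: float
    approx_price_usd: float

    @property
    def raw_tb(self) -> float:
        """Raw disk capacity of a single node in TB."""
        return self.drives * self.tb_per_drive


def usable_tb(spec: NodeSpec, nodes: int, replication: int = 3,
              scratch_fraction: float = 0.25) -> float:
    """Capacity left for data after the scratch reserve and block replication.

    replication=3 is the HDFS default; scratch_fraction is an assumed reserve.
    """
    raw = spec.raw_tb * nodes
    return raw * (1.0 - scratch_fraction) / replication


def price_per_usable_tb(spec: NodeSpec, replication: int = 3,
                        scratch_fraction: float = 0.25) -> float:
    """Hardware purchase price per usable TB (acquisition only, not TCO)."""
    return spec.approx_price_usd / usable_tb(spec, 1, replication, scratch_fraction)


# Assumed pairing of the 2U configuration with the "Approx $6K" price.
TWO_U = NodeSpec("2U", drives=12, tb_per_drive=2.0, approx_price_usd=6000.0)

if __name__ == "__main__":
    print(f"10 x {TWO_U.name}: {usable_tb(TWO_U, 10):.1f} TB usable")
    print(f"~${price_per_usable_tb(TWO_U):,.0f} per usable TB (hardware price only)")
```

The point of the sketch is that raw terabytes overstate what a cluster can hold: replication and working space each take a large share before any data lands.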

Page 36: Our Big Data Journey: A Consultant's Perspective

Thoughts on Storage TCO

• *Price != Cost, and TCA (total cost of acquisition) is < 20% of TCO (total cost of ownership)

• $ per TB is not an exact science

*David Merrill, Hitachi Data Systems Chief Economist, “Storage Economics: Four Principles for Reducing Total Cost of Ownership” July, 2011
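The "TCA is < 20% of TCO" rule of thumb implies a floor on ownership cost: divide the purchase price by 0.20. The Python sketch below works through that arithmetic; the function name and the $6,000 example price are assumptions for illustration, and the result is a lower bound, not an estimate of any particular system's TCO.

```python
def min_tco(acquisition_cost: float, max_tca_share: float = 0.20) -> float:
    """Lower bound on total cost of ownership.

    If acquisition cost (TCA) is at most `max_tca_share` of TCO, then
    TCO is at least acquisition_cost / max_tca_share.
    """
    if not 0.0 < max_tca_share <= 1.0:
        raise ValueError("max_tca_share must be in (0, 1]")
    if acquisition_cost < 0:
        raise ValueError("acquisition_cost must be non-negative")
    return acquisition_cost / max_tca_share


if __name__ == "__main__":
    price = 6000.0  # hypothetical purchase price of one server, in USD
    floor = min_tco(price)
    print(f"Purchase price ${price:,.0f} implies TCO of at least ${floor:,.0f}")
    print(f"Non-acquisition costs are at least ${floor - price:,.0f}")
```

Seen this way, a quoted hardware price per TB covers at most a fifth of what that terabyte costs to own, which is one reason "$ per TB" comparisons are not an exact science.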

Page 37: Our Big Data Journey: A Consultant's Perspective

‘New to Big Data’ learnings

Iterative process…if you go in claiming you ‘know’ the use case you want to solve you are in for a surprise

A lot of this is not intuitive…e.g. MapReduce, Columnar based DBs and we live in an RDBMS world

It takes a wealth of skills not resident in a single person… understand batch MapReduce framework, networks, grid computing, analytics, subject matter experts, and more

Open Source or ‘Free’ not a big selling point for the larger companies

Page 38: Our Big Data Journey: A Consultant's Perspective

Some key learnings from early adopters

Don’t forget operations…in 2010 Facebook had between 400-500 operations professionals…on par with entire engineering organization

Be ready to embrace emergent solutions and emergent architecture… and emergent support base within company

Vendor support still needs to catch up…not the level of support companies are used to from established technology vendors

Source: http://framethink.wordpress.com/2011/01/17/how-facebook-ships-code/

Page 39: Our Big Data Journey: A Consultant's Perspective

Thinking about workloads…latency

Low Latency (real-time)

High Latency (1 hour plus)

Start here!

Solutions are generally a mix of different paradigms

Page 40: Our Big Data Journey: A Consultant's Perspective

Other Random Learnings

• For many companies, there will likely be a cultural change required to become good in Big Data analytics

• Customers have major concerns about security and cloud

• Don’t tell a risk officer that Hadoop’s replication framework mitigates need for disaster recovery

• How about you all…any Random Learnings you want to share?

Page 42: Our Big Data Journey: A Consultant's Perspective

Final Thoughts

Page 43: Our Big Data Journey: A Consultant's Perspective

Musings

• Chief Data Officer || Chief Data Scientist

• Just because you can retain all this data, does it mean you should?

• Big Data and virtualization

Page 44: Our Big Data Journey: A Consultant's Perspective

Thinking about Starting a Big Data solution

1. Big Data strategy assessment?

2. Go small (success breeds success)

3. Let RT and near RT come to you…don’t start there

4. Ensure you have the right skills (or bring them in)

5. If only R&D focus then upside may be limited…business needs to have a seat at the table

Consider hiring a professional Big Data consulting firm to help in the transition!

Page 45: Our Big Data Journey: A Consultant's Perspective

Some good resources for you

Blogs

Databases and Data Infrastructure

http://www.dbms2.com

http://dbmsmusings.blogspot.com.

http://databeta.wordpress.com.

http://blogs.gartner.com/donald-feinberg

http://itmarketstrategy.com/

Big Data Analytics

http://hunch.net/

http://ml.typepad.com/

www.dataists.com

http://www.dataspora.com/blog/

http://blog.data-miners.com/

http://www.visualcomplexity.com/vc/blog/

Videos

http://www.youtube.com/watch?v=SS27F-hYWfU&feature=relmfu

http://www.youtube.com/watch?v=2FpO7w6X41I

http://www.youtube.com/watch?v=OmlX3IHb0JE

http://www.youtube.com/watch?src_vid=UaGINWPK068&annotation_id=annotation_65559&v=XAuwAHWpzPc&feature=iv

http://www.youtube.com/watch?v=eUcej07dGu4

http://www.youtube.com/watch?v=viPRny0nq3o

Page 46: Our Big Data Journey: A Consultant's Perspective

Q&A