big data analytics from a practitioners view

Big Data Analytics from a Practitioners view Sep 2013 Raghu Kashyap

Upload: raghu-kashyap

Post on 12-Jul-2015

Category:

Technology


TRANSCRIPT

Page 1: Big Data Analytics from a Practitioners View

Big Data Analytics from a Practitioners View

Sep 2013

Raghu Kashyap

Page 2: Big Data Analytics from a Practitioners View

About Raghu Kashyap

Areas of Responsibility

Data Insights Group (Site analytics,

Competitive Intelligence, Big Data)

Orbitz India, supporting Analytics

and BI teams

US, Europe, Australia (APAC)

Personal

Director – Data Insights Group

Strong background in technology (13 years);

passion for and experience with analytics (4 years)

and big data (3.5 years)

Masters in Computer Science

Golf, traveling, helping non-profit

organizations, spending time with my

wife and 2 boys

Twitter: @ragskashyap

Blog: http://kashyaps.com

Email: [email protected]

Page 3: Big Data Analytics from a Practitioners View

Orbitz Worldwide

Page 4: Big Data Analytics from a Practitioners View

Challenges

Lack of multi-dimensional capabilities

Heavy investment on the tools

Precision vs Accuracy

Data Governance

Page 5: Big Data Analytics from a Practitioners View

Challenges (continued)

No data unification or uniform platform

across organizations and business

units

No easy data extraction capabilities

Page 6: Big Data Analytics from a Practitioners View

Hadoop history at OWW

Page 7: Big Data Analytics from a Practitioners View

Web Analytics & Big Data

OWW generates a couple of million air and hotel

searches every day.

Massive amounts of data: over a hundred GB

of log data per day.

Expensive and difficult to store and process

this data using existing data infrastructure.

Page 8: Big Data Analytics from a Practitioners View

Love Thy Hadoop

Long term storage for

very large data sets.

Open access to

developers and analysts.

Allows for ad-hoc

querying of data and

rapid deployment of

reporting applications.

Page 9: Big Data Analytics from a Practitioners View

Hadoop Growth

Page 10: Big Data Analytics from a Practitioners View

Hadoop Cluster

Page 11: Big Data Analytics from a Practitioners View

Treemap of HDFS storage

Page 12: Big Data Analytics from a Practitioners View

Approach with Hadoop and ETL

Raw logs

Flat files

Event Model

MapReduce

ETL

External Tables

Data Warehouse (Greenplum)

GP Connector
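
The map and reduce steps in this flow can be illustrated with a minimal Python sketch. It is not the actual OWW job: the raw log layout, the sample lines, and every function name here are assumptions made for illustration. It parses raw log lines into events, counts them per day and event type, and emits tab-separated rows of the kind a warehouse could read as a flat file.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical raw log layout (an assumption, not the real OWW format):
#   <ISO timestamp> <session id> <event type> <key=value ...>
RAW_LOGS = [
    "2013-09-01T10:00:00 s1 hotel_search city=CHI",
    "2013-09-01T10:00:05 s2 air_search origin=ORD",
    "2013-09-01T10:01:00 s1 hotel_search city=NYC",
    "malformed line",
]

Key = Tuple[str, str]  # (date, event type)


def map_line(line: str) -> Iterator[Tuple[Key, int]]:
    """Map step: turn one raw log line into ((date, event_type), 1)."""
    parts = line.split()
    if len(parts) < 3 or "T" not in parts[0]:
        return  # skip lines that do not match the assumed layout
    date = parts[0].split("T")[0]
    yield (date, parts[2]), 1


def reduce_counts(pairs: Iterable[Tuple[Key, int]]) -> Dict[Key, int]:
    """Reduce step: sum the counts emitted for each key."""
    totals: Dict[Key, int] = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def to_flat_file_rows(totals: Dict[Key, int]) -> List[str]:
    """Format the reduced output as tab-separated flat-file rows."""
    return [
        f"{date}\t{event}\t{count}"
        for (date, event), count in sorted(totals.items())
    ]


def run(lines: Iterable[str]) -> List[str]:
    """Chain map, reduce, and formatting in a single local process."""
    pairs = (pair for line in lines for pair in map_line(line))
    return to_flat_file_rows(reduce_counts(pairs))


if __name__ == "__main__":
    for row in run(RAW_LOGS):
        print(row)
```

On a real cluster the map and reduce functions would run as distributed tasks over files in HDFS; here they run in one process so the data flow is easy to follow.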

Page 13: Big Data Analytics from a Practitioners View

Opportunities

Machine Learning

Site Analytics Data

PPC bidding efficiencies

Internal log analysis. Hgrep

MVT testing

Advanced Analytics

Page 14: Big Data Analytics from a Practitioners View

Show me the money

EFX – Every Friggin X

PPC bidding efficiencies

Mac vs. PC

Page 15: Big Data Analytics from a Practitioners View

Marketing Channel optimization

Orbitz.com

Direct

Paid - Brand

Paid - Non-Brand

SEO - Brand

SEO - Non-Brand

Email

Meta

Travel Research

Affiliates

Display Ads

Page 16: Big Data Analytics from a Practitioners View

Hotel Rate Cache optimization

Data is collected as part of RCDC.

Includes every live rate search (aka

burst) performed by our hotel stack.

Raw data: ~200 GB, compressed, 10^8

records.

Extraction: <40 GB compressed, 10^9

records.

Page 17: Big Data Analytics from a Practitioners View

MVT

Analyze behavioral and test data from our

MVT testing

Page 18: Big Data Analytics from a Practitioners View

DWH Log analysis

• Analysis of Greenplum DB logs within Hadoop

to analyze the data usage patterns.

• Impact analysis

• Hadoop usage for the last 30 days of DB log

analysis.
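
As a rough illustration of this kind of usage analysis, the Python sketch below counts how often each table is referenced in database query logs within a 30-day window. The pipe-delimited log layout, the sample entries, the table names, and the `table_usage` function are all hypothetical; real Greenplum logs are laid out differently and would need their own parser.

```python
import re
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable

# Naive pattern for table names after FROM or JOIN. It is an assumption
# that this is good enough; it ignores subqueries, CTEs, and quoted names.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def table_usage(
    log_lines: Iterable[str], as_of: datetime, window_days: int = 30
) -> Counter:
    """Count table references in queries logged within the window.

    Assumed (hypothetical) line layout: "<timestamp>|<user>|<sql>".
    """
    cutoff = as_of - timedelta(days=window_days)
    usage: Counter = Counter()
    for line in log_lines:
        fields = line.split("|", 2)
        if len(fields) != 3:
            continue  # not in the assumed layout
        ts_text, _user, sql = fields
        try:
            ts = datetime.strptime(ts_text, "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # unparseable timestamp
        if cutoff <= ts <= as_of:
            usage.update(name.lower() for name in TABLE_REF.findall(sql))
    return usage


# Made-up sample entries, for illustration only.
SAMPLE_LOG = [
    "2013-08-25 09:15:00|analyst1|SELECT * FROM bookings b JOIN hotels h ON b.hotel_id = h.id",
    "2013-08-28 14:02:10|etl_job|SELECT count(*) FROM bookings",
    "2013-06-01 08:00:00|analyst2|SELECT * FROM legacy_rates",
    "not a log line",
]

usage = table_usage(SAMPLE_LOG, as_of=datetime(2013, 9, 1))

if __name__ == "__main__":
    for table, hits in usage.most_common():
        print(table, hits)
```

Tables that never appear in the window (here the made-up `legacy_rates`) are natural candidates for the impact analysis mentioned above, since changing or retiring them should affect few users.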

Page 19: Big Data Analytics from a Practitioners View

HIPPO is your best friend

• Expect organizational resistance from

unanticipated directions

• You can do wonders in the analytics area if

you get buy-in.

Page 20: Big Data Analytics from a Practitioners View

Lessons Learnt

Analytics using Big Data comes with a price.

Data Governance

Senior leadership buy-in

"I can't tell you the key to success, but the key to failure is trying to please everyone." - Ed Sheeran

Page 21: Big Data Analytics from a Practitioners View

How to capitalize on Big Data?

Learn from people who have already

done this.

DO NOT reinvent the wheel

Balance buy vs. build

Build once and leverage in multiple

places.

Go where clients don't want to go or

can't go in terms of execution.

Page 22: Big Data Analytics from a Practitioners View

What matters to Practitioners?

Things change dramatically in the

world of analytics

Being Agile is very important

Dashboards and Reports can take

you only to a certain level

Buy-in from key groups is important

Grow business and impress Boss

Page 23: Big Data Analytics from a Practitioners View

Thank you