Getting Started with Big Data Analytics

The Single Step: Beginning your big data journey

Upload: rob-winters

Post on 04-Dec-2014

DESCRIPTION

Presentation at IT Tinget 2013 on what big data is, agile BI, and technology considerations when starting a big data program

TRANSCRIPT

Page 1: Getting Started with Big Data Analytics

The Single Step

Beginning your big data journey

Page 2: Getting Started with Big Data Analytics

Today’s Stops

How is “big data” different from traditional BI?

What components do we need for big data?

What magic can we work with big data?

Page 3: Getting Started with Big Data Analytics

Spil Games: A leader in online gaming

• 180 million monthly and 12 million daily players

• More than one billion gameplays monthly

• >50 websites, local in 15 languages

• Active in every country of the world (even Vatican City!)

• Platform, Publisher, Developer

Page 4: Getting Started with Big Data Analytics

What is big data?

VELOCITY

VARIETY

VERACITY

VALUE: The Only V that Matters

Page 5: Getting Started with Big Data Analytics

Traditional BI: Know before you measure

X Matters

Define Metrics

Define Requirements

Develop Data Source

Design Data Mart

Design Report

Sign Off Report

Reporting Available

Slow

IT-Centric

Inflexible

Page 6: Getting Started with Big Data Analytics

Big Data BI: Agile approach, data first

Capture

Explore

Define

Apply + Track

Open

Adaptive

Evolving Structure
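The capture, explore, define sequence on this slide can be illustrated with a small sketch. It assumes raw events are kept as JSON lines and that metrics are only defined after looking at what was captured. The event names, fields, and function names are invented for illustration and do not come from the presentation.

```python
import json
from collections import Counter

# Capture: raw events are stored as-is, one JSON document per line.
# These sample events and their fields are hypothetical.
RAW_EVENTS = [
    '{"event": "gameplay", "player": "p1", "game": "chess"}',
    '{"event": "gameplay", "player": "p2", "game": "chess", "device": "mobile"}',
    '{"event": "signup", "player": "p3"}',
    '{"event": "gameplay", "player": "p1", "game": "pool"}',
]


def explore_fields(raw_lines):
    """Explore: count how often each field occurs in the captured data."""
    fields = Counter()
    for line in raw_lines:
        fields.update(json.loads(line).keys())
    return fields


def gameplays_per_game(raw_lines):
    """Define: a metric chosen after exploring, computed from the same raw data."""
    counts = Counter()
    for line in raw_lines:
        record = json.loads(line)
        if record.get("event") == "gameplay":
            counts[record["game"]] += 1
    return dict(counts)
```

Because nothing is discarded at capture time, a new field such as the hypothetical `device` can appear in later events without any change to the intake step, and a new metric can be defined over the data already stored.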

Page 7: Getting Started with Big Data Analytics

Do we need real time analytics?

Traditional ETL

• Once a day
• Once a week
• Delayed

Real Time

• Faster than human perception
• <200 milliseconds

“In Time”

In Time: Information is available fast enough to influence decisions

• Following a product release (hours)
• While a customer is in the shop/on the site (minutes)
• While the query runs (seconds)

The Velocity Continuum

In Time: Fast enough, Cheap enough, Easy enough
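One way to read the "in time" idea is as a comparison between how long the data takes to arrive and how long the decision stays open. The sketch below encodes only that comparison. The function names and the sample windows are assumptions made for illustration; the 200 millisecond limit is the figure from the slide.

```python
REAL_TIME_LIMIT_S = 0.2  # the slide's "<200 milliseconds" threshold


def is_real_time(latency_s):
    """True if the data arrives faster than the real-time threshold."""
    return latency_s < REAL_TIME_LIMIT_S


def is_in_time(latency_s, decision_window_s):
    """True if the data arrives before the decision window closes."""
    return latency_s <= decision_window_s


# Hypothetical example: a customer stays on the site for five minutes.
SESSION_WINDOW_S = 5 * 60
DAILY_BATCH_LATENCY_S = 24 * 60 * 60
STREAMING_LATENCY_S = 90
```

With these assumed numbers, a 90 second pipeline is not real time but is still in time for a five minute session, while a daily batch load is neither.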

Page 8: Getting Started with Big Data Analytics

Parts and needs of a big data stack

Unstructured data intake

Unstructured data storage

Structured data storage

Human interface layer

Predictive analytics tools

SELECT A, B, SUM(C) FROM X GROUP BY 1, 2

• High Query Performance
• Denormalized
• Scalable; high concurrency

• Cheap
• Flexible Schema
• Easy Management

• Scalable
• Schemaless or adaptive schema
• Resilient

• Highly Flexible
• Simple to use
• In-tool metadata

• Not memory constrained
• Flexible inputs/outputs
• Easy iteration
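The query on this slide is the typical shape of work a structured storage layer has to serve: group a denormalized table by two columns and sum a third. A minimal sketch of what it computes, using Python's built-in sqlite3 module as a stand-in engine, is shown here. The table contents and the function name are invented for illustration, SQLite is not a tool named in the presentation, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical rows for table X: (A, B, C).
SAMPLE_ROWS = [
    ("nl", "web", 3),
    ("nl", "web", 4),
    ("nl", "mobile", 1),
    ("us", "web", 10),
]


def run_summary_query(rows):
    """Load rows into an in-memory table X and run the slide's aggregate query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE X (A TEXT, B TEXT, C INTEGER)")
        conn.executemany("INSERT INTO X VALUES (?, ?, ?)", rows)
        # GROUP BY 1, 2 groups by the first two output columns (A and B).
        cursor = conn.execute(
            "SELECT A, B, SUM(C) FROM X GROUP BY 1, 2 ORDER BY 1, 2"
        )
        return cursor.fetchall()
    finally:
        conn.close()
```

Calling `run_summary_query(SAMPLE_ROWS)` returns one row per distinct (A, B) pair with the summed C value.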

Page 9: Getting Started with Big Data Analytics

Spil: Harmony of open source/commercial

Unstructured data intake

Unstructured data storage

Structured data storage

Human interface layer

Predictive analytics tools

• >100x faster than based systems
• Handles tables >10B rows easily
• Excellent concurrency on load/query

• Data marts not required
• Cross-platform merging
• Anyone can develop

• Open source
• Easy development
• Integrates with rest of tools

• Industry standard
• Open source
• Ecosystem

• Existing infrastructure
• Integration with production systems

Page 10: Getting Started with Big Data Analytics

Analytical use cases

Demographic Prediction

Multivariate Testing/Site Optimization

Explore, Learn, Predict, Measure

Page 11: Getting Started with Big Data Analytics

Getting your big data off the ground

Start Fresh

Have a Problem

Be Agile

Pragmatism > Perfection

Be Flexible

Be Fast

Make Mistakes

Find Value

A tool, not a goal

Page 12: Getting Started with Big Data Analytics

Good Luck on your Journey!

Rob Winters

Director, Reporting/Analytics

Spil Games

www.robertdwinters.com