Finding out more from your advertising campaigns using data analytics

Upload: amazon-web-services

Post on 11-Apr-2017

1 instance for 100 hours

Video encoding
Customer activity analysis
Twitter parsing …

1 instance for 100 hours = 100 instances for 1 hour

Why would you need 100 instances? (or even more)

Big Data

Many definitions

Very large volume with low density of information

Three V’s: velocity, variety, volume

Social interactions and web activity data

Amongst many others…

GB TB PB

Compute, storage, big data

Unconstrained data growth

95% of the 1.2 zettabytes of data in the digital universe is unstructured

70% of this is user-generated content

Unstructured data growth is explosive, with compound annual growth rate (CAGR) estimated at 62% from 2008 to 2012.

Source: IDC

EB ZB

Where does it come from?

Web sites: blogs, reviews, emails, pictures
Social graphs: Facebook, LinkedIn, contacts
Application server logs: web sites, games
Sensor data: weather, water, smart grids
Images/videos: traffic, security cameras
Twitter: 50m tweets/day, 1,400% growth per year

Why now?

Mobile connected world (more people using, easier to collect)

More aspects of data (variety, depth, location, frequency)

Possible to understand (not just answer specific questions)

What’s different?

We can collect more

There is more

And data has gravity…

http://blog.mccrory.me/2010/12/07/data-gravity-in-the-clouds/

Data has gravity

…and inertia at volume…

…easier to move applications to the data

Cloud has the power to process

Personal

Very large dataset seeks strong & consistent compute for short-term relationship, possibly longer. GSOH a plus. aws.amazon.com

Bring compute capacity to the data

From one instance…

…to thousands

and back again…

The revolution

Have data
Can store
Can analyse, economically and fast

Who is your customer really?

What do people really like?

What is happening socially with your products?

How do people really use your products?

Lesson 1: don’t leave your Amazon account logged in at home

Lesson 2: use the data you have to drive proactive marketing

1 instance for 100 hours = 100 instances for 1 hour

Small instance = $8

Amazon Elastic MapReduce

But what is it?

A framework
Splits data into pieces
Lets processing occur
Gathers the results
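The three steps above can be illustrated in a few lines of plain Python. This is only a single-machine sketch of the split / process / gather idea, not Elastic MapReduce itself; the function names and the sample records are made up for illustration.

```python
from collections import Counter
from functools import reduce


def split_into_pieces(records, piece_size):
    """Split a list of records into consecutive pieces of at most piece_size."""
    return [records[i:i + piece_size] for i in range(0, len(records), piece_size)]


def process_piece(piece):
    """Process one piece on its own; here, count how often each record occurs."""
    return Counter(piece)


def gather(partial_results):
    """Combine the per-piece counts into one overall count."""
    return reduce(lambda total, part: total + part, partial_results, Counter())


# Hypothetical sample data standing in for a much larger input.
records = ["view", "click", "view", "buy", "click", "view"]
pieces = split_into_pieces(records, piece_size=2)
totals = gather(process_piece(piece) for piece in pieces)
print(totals["view"], totals["click"], totals["buy"])
```

Because each piece is processed without reference to any other, the middle step is the part a cluster can spread across many machines.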

[Diagram: Elastic MapReduce. Labels: Input data; Code; Name node; Elastic cluster; HDFS; Output; S3 + SimpleDB; S3 + DynamoDB; Queries + BI via JDBC, Pig, Hive]

Very large click log (e.g. TBs)

Lots of actions by John Smith

Split the log into many small pieces

Process in an EMR cluster

Aggregate the results from all the nodes

What John Smith did

Insight in a fraction of the time
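The click-log walkthrough can be mimicked on one machine as a rough sketch. The log format (`user<TAB>action`), the sample records, and the function names below are assumptions for illustration, and a thread pool stands in for the nodes of an EMR cluster; a real job would run on Hadoop rather than on this code.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical click log: one "user<TAB>action" record per line.
CLICK_LOG = [
    "john.smith\tview:home",
    "jane.doe\tview:home",
    "john.smith\tsearch:video games",
    "jane.doe\tview:cart",
    "john.smith\tbuy:sports movie",
]


def split_log(lines, piece_size):
    """Split the log into many small pieces."""
    return [lines[i:i + piece_size] for i in range(0, len(lines), piece_size)]


def actions_for_user(piece, user):
    """Return the actions in this piece that belong to the given user."""
    actions = []
    for line in piece:
        log_user, action = line.split("\t", 1)
        if log_user == user:
            actions.append(action)
    return actions


def what_user_did(lines, user, piece_size=2, workers=4):
    """Process each piece in a worker, then aggregate the partial results."""
    pieces = split_log(lines, piece_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map() yields results in the same order as the input pieces.
        partials = pool.map(lambda piece: actions_for_user(piece, user), pieces)
        return [action for partial in partials for action in partial]


print(what_user_did(CLICK_LOG, "john.smith"))
```

With these five sample records the result is the three actions logged for `john.smith`, in log order.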

1 instance for 100 hours = 100 instances for 1 hour

Small instance = $8

1 instance for 1,000 hours = 1,000 instances for 1 hour

Small instance = $80
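The arithmetic behind these figures can be checked with a short sketch. The hourly rate below is only what the slides imply ($8 for 100 small-instance hours, i.e. $0.08 per instance-hour), not a current AWS price, and the calculation assumes whole-hour billing and a job that parallelises perfectly.

```python
# Hourly price implied by the slides: $8 for 100 small-instance hours.
# Held in cents to avoid floating-point rounding; not a current AWS price.
PRICE_CENTS_PER_INSTANCE_HOUR = 8


def job_cost_dollars(instances, hours_each):
    """Cost of running `instances` machines for `hours_each` hours apiece."""
    instance_hours = instances * hours_each
    return instance_hours * PRICE_CENTS_PER_INSTANCE_HOUR / 100


for instances, hours in [(1, 100), (100, 1), (1, 1000), (1000, 1)]:
    cost = job_cost_dollars(instances, hours)
    print(f"{instances:>5} instance(s) x {hours:>5} h = ${cost:.2f}")
```

The cost depends only on the total instance-hours, so spreading the same work over more machines changes how long you wait, not what you pay.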

Features powered by Amazon Elastic MapReduce:

People Who Viewed this Also Viewed
Review highlights
Auto complete as you type on search
Search spelling suggestions
Top searches
Ads

200 Elastic MapReduce jobs per day, processing 3 TB of data

Data Analytics

3.5 billion records
71 million unique cookies
1.7 million targeted ads required per day

Batch processing of data sets ranging in size from dozens of gigabytes to terabytes

Building in-house infrastructure to analyze these click stream datasets requires investment in expensive “headroom” to handle peak demand.

“Our first client campaign experienced a 500% increase in their return on ad spend from a similar campaign a year before”

Targeted ad (1.7 million per day)

User recently purchased a sports movie and is searching for video games

Want to try some of this?