[AWS LA Media & Entertainment Event 2015]: Cloud Analytics for Audience Engagement

Posted on 22-Feb-2017

Category:

Technology

©2015, Amazon Web Services, Inc. or its affiliates. All rights reserved

Cloud Analytics for Audience Engagement

Mike Limcaco | AWS Principal Solutions Architect
Mick Bass | 47Lining CEO

http://mashable.com

Search

Watch

Listen

Play

Download

Purchase

Rate It

Review It

Sharing

Tagging

Bookmarking

GB  TB  PB  EB  ZB

65M+ Subscribers
50 Countries
1000+ Devices
10B+ Hours/Quarter

Netflix: Over 75% of what people watch comes from recommendations [1]

[1] http://bit.ly/1WIInoh

Machine Learning for Predictive Analytics

•  Branch of Artificial Intelligence and Statistics
•  Programming computers based on historical experience
•  Focuses on prediction based on known properties learned from training data

Signals → Predictions

A Few Machine Learning Tasks

Recommendations

Clustering

Classification

Why Cloud?

•  Scale
•  Adaptability
•  Agility

(Some) Big Data Services

Amazon S3

Internet scale storage

Amazon Machine Learning

Hosted predictive analytics service

Amazon EMR

Hosted Hadoop framework

Maximizing Audience Engagement

•  Recommendations
•  Ad / Marketing Campaign Targeting
   –  Sentiment analysis
   –  Automated segmentation
•  Predict audience churn

Examples

Recommendations

The Concept

✓ Capture Audience Signal Data
✓ Create history of user and item preferences
✓ Estimate similar users and items
✓ Record these in Search Engine
✓ Query Search Engine with User History
✓ Enjoy recommendations!

http://www.slideshare.net/tdunning/recommendation-techn

[Architecture diagram: users (Media platforms, Mobile) · Search, Play, Buy, Rate · Log Storage · ETL · Recommend Engine · Serving Layer · User Interface · Recommendations]

Apache Mahout on Amazon EMR

•  Library of scalable machine-learning algorithms
•  Single-node as well as distributed capabilities
   –  Hadoop Map-Reduce (BATCH | OFFLINE)
   –  Evolving to support other execution platforms (NEAR-REALTIME)
      •  Apache Spark
      •  H2O

[Diagram: Recommendation, Clustering, Classification · Math Library · execution platforms: Hadoop Map-Reduce, Spark, H2O]

Input (user, action, item):

mike,view,movie-a
mike,view,movie-b
mike,view,movie-c
mike,buy,movie-b
chris,view,movie-b
chris,buy,movie-d
…

Output: Indicators (“Items Similar To This…”):

movie-b    movie-c:2.772588722239781 movie-a:2.772588722239781
movie-d    …

% mahout spark-itemsimilarity -i input-folder/data.txt -o output-folder/ --filter1 buy -fc 1 -ic 2 --filter2 view
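The indicator output shown above is one line per item: the item, then `item:score` pairs. A minimal Python sketch for reading such a line into a structure a later stage can load is given here. The function name `parse_indicator_line` is hypothetical, and the sketch assumes whitespace between fields as it appears on the slide; the delimiters in real Mahout output files may differ.

```python
def parse_indicator_line(line):
    """Split one indicator line into (item, [(similar_item, score), ...]).

    Assumes whitespace-separated fields and "name:score" pairs, as shown
    on the slide. Returns None for an empty line.
    """
    parts = line.split()
    if not parts:
        return None
    item, pairs = parts[0], parts[1:]
    indicators = []
    for pair in pairs:
        # rpartition keeps any ':' inside the item name intact.
        name, _, score = pair.rpartition(":")
        indicators.append((name, float(score)))
    return item, indicators


sample = "movie-b movie-c:2.772588722239781 movie-a:2.772588722239781"
item, indicators = parse_indicator_line(sample)
print(item, indicators)
```

An item with no indicators (a line holding only the item name) comes back with an empty list.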

Step 1: Logs → History Matrix

Logs:

User1 Thing1
User2 Thing2
User3 Thing3
User2 Thing4
User5 Thing1
User1 Thing2
User1 Thing3

[Figure: Logs → History Matrix, with rows labeled Mike, Jon, Mary, Phil, Kris]
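As a rough illustration of Step 1, the sketch below turns the sample `user,action,item` log from the Mahout slide into a binary user-by-item history matrix for one action type. The helper names `build_history` and `to_matrix` are hypothetical, and a real pipeline would run this as a distributed ETL job rather than in memory.

```python
from collections import defaultdict

# Sample log lines copied from the Mahout slide (user, action, item).
LOG_LINES = [
    "mike,view,movie-a",
    "mike,view,movie-b",
    "mike,view,movie-c",
    "mike,buy,movie-b",
    "chris,view,movie-b",
    "chris,buy,movie-d",
]


def build_history(lines, action):
    """Collect, per user, the set of items touched with the given action."""
    history = defaultdict(set)
    for line in lines:
        user, act, item = line.strip().split(",")
        if act == action:
            history[user].add(item)
    return dict(history)


def to_matrix(history):
    """Return (users, items, rows) where rows[u][i] is 1 if user u touched item i."""
    users = sorted(history)
    items = sorted({item for seen in history.values() for item in seen})
    rows = [[1 if item in history[user] else 0 for item in items] for user in users]
    return users, items, rows


users, items, rows = to_matrix(build_history(LOG_LINES, "view"))
for user, row in zip(users, rows):
    print(user, row)
```

Each row is one user's history; each column is one item. Keeping one matrix per action (view, buy) matches the way the sample command filters on two actions.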

Step 2: Estimate Similar Things

[Figure: History Matrix → Item-Item Matrix (co-occurrence counts)]
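One common way to estimate "similar things" from a history matrix is to count how often two items appear in the same user's history, which is the matrix product of the transposed history matrix with itself. The sketch below assumes that reading of the slide; the function name `cooccurrence` and the tiny input matrix are illustrative only.

```python
def cooccurrence(rows):
    """Item-item co-occurrence counts from a binary user-by-item matrix.

    counts[a][b] is the number of users whose history contains both
    item a and item b. The diagonal holds per-item user counts.
    """
    n_items = len(rows[0]) if rows else 0
    counts = [[0] * n_items for _ in range(n_items)]
    for row in rows:
        present = [j for j, value in enumerate(row) if value]
        for a in present:
            for b in present:
                counts[a][b] += 1
    return counts


# Two users, three items: the first user saw only item 1,
# the second user saw all three.
history = [
    [0, 1, 0],
    [1, 1, 1],
]
for line in cooccurrence(history):
    print(line)
```

Raw counts favor popular items, which is why the next step filters pairs down to the statistically interesting ones.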

Step 3: Reduce to Interesting Pairs

[Figure: Item-Item Matrix → LLR → Indicators (“Items Similar To This…”)]
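The LLR filter can be sketched with the entropy form of the log-likelihood ratio test on a 2x2 contingency table. This is a minimal sketch, not the Mahout source; the helper names are hypothetical. The counts are: k11 = users with both items, k12 = users with the first item only, k21 = users with the second item only, k22 = users with neither.

```python
import math


def x_log_x(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized Shannon entropy of a list of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    """LLR score for a 2x2 co-occurrence table; larger means more surprising."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    if row + col < mat:
        # Guard against tiny negative values caused by rounding.
        return 0.0
    return 2.0 * (row + col - mat)


def contingency(users_a, users_b, n_users):
    """Build (k11, k12, k21, k22) from the user sets of two items."""
    k11 = len(users_a & users_b)
    k12 = len(users_a - users_b)
    k21 = len(users_b - users_a)
    return k11, k12, k21, n_users - k11 - k12 - k21


# Sample log: only mike bought movie-b, only mike viewed movie-a, two users total.
score = log_likelihood_ratio(*contingency({"mike"}, {"mike"}, 2))
print(score)
```

For the two-user sample log, the pair (buy movie-b, view movie-a) gives the table (1, 0, 0, 1) and a score of 4·ln 2 ≈ 2.7726, which agrees with the value in the sample indicator output, assuming that output is the buy/view cross-occurrence score. Pairs scoring near zero are dropped; the rest become indicators.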

Step 4: Store Indicators in a Search Engine (BATCH)

Index

Item          Indicators
Superman      Highlander, Dune
Star Wars     Raiders, Minority Report
Highlander    Superman
Mulan         Home Alone, Mermaid
Star Trek     …
…             …

Item ID       Indicator IDs
4587          223, 5234
748           5345, 235
12            8234
245           9543, 7673
3456          4587
…             …

Step 5: Query Search Engine w/ User History (REALTIME)

ID      Title         Indicators
748     Star Wars     45, 235
12      Highlander    8234
245     Mulan         9543, 7673
4587    Superman      12, 5234
3456    Star Trek     2458
…

Query: “12”

5345  3456  12
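Steps 4 and 5 can be illustrated with a tiny in-memory inverted index standing in for the search engine. This is a sketch under that assumption: a real deployment would index the indicator field in an actual search engine and rely on its relevance scoring, while this version simply counts matching indicators. The item IDs and indicator lists are copied from the Step 5 slide; `build_index` and `recommend` are hypothetical names.

```python
from collections import defaultdict

# item id -> (title, indicator ids), as listed on the Step 5 slide.
INDICATORS = {
    748: ("Star Wars", [45, 235]),
    12: ("Highlander", [8234]),
    245: ("Mulan", [9543, 7673]),
    4587: ("Superman", [12, 5234]),
    3456: ("Star Trek", [2458]),
}


def build_index(indicators):
    """BATCH: map each indicator id to the items that list it."""
    index = defaultdict(set)
    for item_id, (_title, indicator_ids) in indicators.items():
        for indicator in indicator_ids:
            index[indicator].add(item_id)
    return index


def recommend(index, history, top_n=3):
    """REALTIME: score items by how many of the user's history ids they list."""
    scores = defaultdict(int)
    for seen in history:
        for item_id in index.get(seen, ()):
            if item_id not in history:
                scores[item_id] += 1
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item_id for item_id, _score in ranked[:top_n]]


index = build_index(INDICATORS)
for item_id in recommend(index, [12]):
    print(item_id, INDICATORS[item_id][0])
```

Querying with a history that contains item 12 (Highlander) matches the document whose indicators include 12, which in this data is Superman.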

Sentiment Analysis

“I thought Episode 29 was not without merit :)”

Is this Positive or Negative?

The Concept

✓ Capture social media signals
   ✓ NRT streams (Tweets, comments on FB, IMDB, YouTube …)
✓ Push through Sentiment Analyzer
   ✓ Tokenize
   ✓ Classify (estimate) as Positive | Negative
✓ Provide actionable insight
   ✓ Improve sort order of recommendations
   ✓ Alert / advise Digital Marketing team

[Diagram: Training examples (Positive | Negative) · Model Training · Knowledge Base · Stream Ingest (×3) · Segment / Classify (×3) · output: Positive | Negative]
“I adored this movie”

“adore” = POSITIVE
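The tokenize-then-classify idea can be sketched with a toy lexicon lookup. This is a deliberately simple stand-in, not the trained model the talk describes: the `LEXICON` entries, the crude suffix trimming, and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical knowledge base: word stem -> polarity (+1 positive, -1 negative).
LEXICON = {"adore": 1, "love": 1, "great": 1, "hate": -1, "awful": -1, "boring": -1}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z']+", text.lower())


def polarity(token):
    """Look the token up, trimming up to three trailing letters ("adored" -> "adore")."""
    for cut in range(4):
        candidate = token[: len(token) - cut]
        if candidate in LEXICON:
            return LEXICON[candidate]
    return 0


def classify(text):
    """Sum token polarities and map the total to a sentiment label."""
    score = sum(polarity(token) for token in tokenize(text))
    if score > 0:
        return "POSITIVE"
    if score < 0:
        return "NEGATIVE"
    return "NEUTRAL"


print(classify("I adored this movie"))
```

A word-by-word lookup like this has no notion of negation, so a sentence such as “not without merit” is beyond it; that gap is one reason to train a classifier on labeled examples instead.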

[Diagram: sources (GNIP, Datasift, Other) · Amazon Kinesis (×3) · Stream Ingest · Sentiment Classification · Amazon Machine Learning · Model Training & Storage · Sentiment Training Sets (Positive | Negative) · Trending Sentiment: Neutral, Negative, Positive]

Case Study: OTT Predictive Analytics

Using Data for Strategic Advantage

Proof-of-Concept

OTT services provide the opportunity to establish a “conversation with the customer.” Big data techniques can be applied to fuse massive amounts of OTT logs, 3rd party demographics, and social media sentiment to gain insight into the audience and develop predictive capabilities. 47Lining and Amazon Web Services partnered to develop a PoC for an OTT customer to use a machine-driven approach to accurately predict user behavior within their consumer video application.

5 Data Sources*

•  100M Consumer interactions
•  Millions of Users
•  3rd Party Demographics
•  10K Titles
•  Social Media

Fuse / Visualize

Predictors

•  Per-Segment Recommendations
•  71% Accurate Churn

Results

Enhance audience engagement through relevant offers

Enablers

•  Scalability
•  Automation
•  Agility
•  Cost effectiveness
•  Machine Learning

* Order of Magnitude for PoC Scope

Why Cloud?

Speed, Agility, Scalability

AWS Machine Learning. Delivered 71% predictive accuracy for user churn during PoC.

Hardware. Cloud economics allow us to only pay for what we use.

Redshift. Petabyte-scale data warehouse. Effectively feeds Machine Learning or EMR.

Elastic MapReduce. Unparalleled speed and scale for big data recommendations.

Managed services that “just work” providing speed, agility and scale

Pre-Cloud Challenges

•  Need to procure hardware for peaks
•  Proprietary machine learning software only data scientists can use
•  Need to procure data warehouse
•  Always-on, on-premise Hadoop clusters with high management costs

PoC Architecture

[Architecture diagram: OTT App · Logs · S3 · Redshift · Transformations · Machine Learning · Predictor Services · AWS ML API · Mahout ALS Recommender · Elastic MapReduce · R Analysis Sandboxes · Business Intelligence Tools: Visualizations & Dashboards · Periodic: 3rd Party Demographic Data, Sentiment Analysis / Social Data, Title Data (S3) · AWS CloudFormation template stack]

Less than ~$1K in infrastructure

Automation: Environment can be replicated using Nucleator, Ansible and CloudFormation

The Power of Knowing your Users

[Diagram: data sources* (100M Consumer interactions, Millions of Users, 3rd Party Demographics, 10K Titles, Social Media) · Account Signature Definitions · Per-Account Signatures · Clustering Dimensions · Distance Heuristic · Cluster Analysis (hierarchical | model-based) · Profiling Dimensions · Segment Analysis · segments 1, 2, … n ⇐ Cohorts ⇒]

6 distinct user personas emerged

* Order of Magnitude for PoC Scope
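To make the signature, distance heuristic, and cluster analysis chain concrete, here is a small hierarchical (single-linkage agglomerative) clustering sketch in plain Python. Everything specific in it is an assumption for illustration: the two signature dimensions, the Euclidean distance, and the account names are hypothetical, and the slides do not state which algorithm or features the PoC actually used.

```python
import math


def euclidean(a, b):
    """Distance heuristic between two account signatures (assumed Euclidean)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(signatures, n_segments):
    """Merge the two closest clusters repeatedly until n_segments remain.

    signatures: dict of account id -> tuple of numeric clustering dimensions.
    Cluster distance is the smallest pairwise distance (single linkage).
    """
    clusters = [[account] for account in signatures]
    while len(clusters) > max(n_segments, 1):
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                distance = min(
                    euclidean(signatures[a], signatures[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or distance < best[0]:
                    best = (distance, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]


# Hypothetical signatures: (share of viewing on mobile, share of viewing on weekends).
SIGNATURES = {
    "acct-1": (0.9, 0.1),
    "acct-2": (0.8, 0.2),
    "acct-3": (0.1, 0.9),
    "acct-4": (0.2, 0.8),
}
print(agglomerate(SIGNATURES, 2))
```

The resulting segments would then be profiled on separate dimensions (for example demographics) to describe each cohort.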

Turning User Knowledge into Engagement

[Diagram: segments 1, 2, … n → Predictors: Per-Segment Recommendations, 71% Accurate Churn]

Account Churn Predictor. Using the identified Segments with Amazon Machine Learning, we developed a model to predict when an account is at risk for churn. Accurate predictions enable meaningful offers to users before this occurs.

Viewed Content Predictor. Using the identified Segments, we used Redshift, R and Amazon Elastic MapReduce / Mahout to predict content that users may find desirable that also achieves engagement objectives. Such recommendations enable proactive sculpting of user behavior.

Realizing a “Stickiness” Strategy through Data

AWS’ speed, agility and scalability are a natural fit for OTT predictive analytics.

Customer Strategy Action

With capabilities demonstrated and de-risked, customer is integrating predictors into their 2016 user experience. De-risking cost ~$1K in infrastructure

•  71% accuracy in churn prediction: Retention offers in 2016
•  Per-Persona Recommendations: Keep users glued in 2016

PoC was only possible in the Cloud

Know your customer

Stickiness = ability to steer the conversation with your customer

Summary

•  Audience demands increasingly personalized content
•  We want to understand and predict these needs and adapt discovery & delivery of media
•  Audience interactions can be analyzed to
   –  Surface patterns of common behavior
   –  Estimate or predict audience demand / churn
•  AWS enables tools & techniques for scalable machine learning
