machine learning at netflix scale
DESCRIPTION
Netflix is the world’s leading Internet television network, with over 48 million members in more than 40 countries enjoying more than one billion hours of TV shows and movies per month, including original series. Netflix uses machine learning to deliver a personalized experience to each of our 48 million members. In this talk you will hear about the machine learning algorithms that power almost every part of the Netflix experience, including some of our recent work on distributed neural networks on AWS GPUs. You will also get an insight into our approach to innovation, which includes offline experimentation and online A/B testing. Finally, you will learn about the system architectures that enable all of this at Netflix scale.
TRANSCRIPT
Machine Learning At Netflix Scale
Aish Fenton Manager - Research Engineering @aishfenton
Everything is a recommendation
Top Picks for Aish
Movies based on books
Because you watched Bob’s Burgers
Rank based on your taste
75% of plays come from homepage
Back Story…
Proxy question: ▪ Accuracy in predicted rating ▪ Improve by 10% = $1million!
What we were interested in: ▪ High quality recommendations
predicted
actual
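The slide does not name the accuracy metric. A minimal sketch, assuming root mean squared error (RMSE) between predicted and actual ratings, which is the usual measure for this kind of contest. The function names and the ratings below are made up for illustration.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(predicted))

def relative_improvement(baseline_rmse, new_rmse):
    """Fraction by which new_rmse is lower than baseline_rmse."""
    return (baseline_rmse - new_rmse) / baseline_rmse

# Hypothetical ratings, not data from the talk.
actual = [4.0, 3.0, 5.0, 2.0]
baseline_predictions = [3.0, 3.0, 4.0, 3.0]
improved_predictions = [3.5, 3.0, 4.5, 2.5]

baseline_rmse = rmse(baseline_predictions, actual)
improved_rmse = rmse(improved_predictions, actual)
gain = relative_improvement(baseline_rmse, improved_rmse)
print(f"baseline={baseline_rmse:.3f} improved={improved_rmse:.3f} gain={gain:.0%}")
```

A 10% improvement in this sense means `relative_improvement` returns at least 0.10 against the baseline system.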
SVD RBMs
Top two results still used in production!
2006 > 2013
• > 44M members
• > 40 countries
• > 5B hours in Q3 2013
• Log 100B events/day
• 31.62% of peak US downstream traffic
Data and Models
▪ > 40M subscribers ▪ Ratings: ~5M/day ▪ Searches: > 3M/day ▪ Plays: > 50M/day ▪ Streamed hours: 5B hours in Q3 2013
Geo Info
Time
Impressions
Device Info
Metadata
Social
Ratings
Demographics
Member Behavior
Plays
Aish House of Cards
Latent User Vector
Latent Item Vector
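[Figure: rating matrix R factorized as U × M, with user vectors u1, u2, u3 and item vectors m1, m2, m3; highlighted entry: 3.53]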
Aish, House of Cards
Mean Rating + My Bias + Movie Bias + Interaction
3.55 = 2.50 + (-1.5) + 1.2 + pq
My rating for House of Cards
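The breakdown above can be sketched in a few lines: a predicted rating is the mean rating plus a user bias, a movie bias, and the dot product of the latent user vector p and latent item vector q. The three scalar values come from the slide. The two-dimensional vectors are hypothetical, chosen only so that the interaction term equals 1.35 and the total matches the slide.

```python
def predict_rating(mean_rating, user_bias, item_bias, user_vec, item_vec):
    """Biased matrix factorization prediction: mean + biases + p . q."""
    interaction = sum(p * q for p, q in zip(user_vec, item_vec))
    return mean_rating + user_bias + item_bias + interaction

# Scalars from the slide.
mean_rating = 2.50
my_bias = -1.5
movie_bias = 1.2

# Hypothetical latent vectors (not from the talk); their dot product is 1.35.
p_aish = [0.5, 1.0]
q_house_of_cards = [0.7, 1.0]

rating = predict_rating(mean_rating, my_bias, movie_bias, p_aish, q_house_of_cards)
print(round(rating, 2))
```

In practice the vectors have many more dimensions and are learned from observed ratings rather than set by hand.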
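[Figure: rating tensor R factorized into user (U), item (M), and time (T) factors, with vectors u1, u2, u3, m1, m2, m3, and t1, t2, t3; example values shown for Aish and House of Cards: 3.53, 2.35, 1.34]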
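Adding a time factor T alongside U and M turns the rating matrix into a three-way tensor. The slide does not say which decomposition is used. A minimal sketch, assuming the simplest form (a CP-style model) in which each latent dimension contributes the product of its user, item, and time components. All vectors below are hypothetical.

```python
def predict_rating_at_time(user_vec, item_vec, time_vec):
    """CP-style tensor factorization: sum over k of u[k] * m[k] * t[k]."""
    return sum(u * m * t for u, m, t in zip(user_vec, item_vec, time_vec))

# Hypothetical latent vectors, not values from the talk.
u_aish = [1.0, 2.0]
m_house_of_cards = [0.5, 1.0]
t_weekend = [1.0, 0.5]
t_neutral = [1.0, 1.0]  # all ones: reduces to the plain dot product u . m

weekend_score = predict_rating_at_time(u_aish, m_house_of_cards, t_weekend)
neutral_score = predict_rating_at_time(u_aish, m_house_of_cards, t_neutral)
print(weekend_score, neutral_score)
```

With a time vector of all ones the model falls back to ordinary matrix factorization, so the time factor can be read as a per-dimension reweighting of the user-item interaction.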
▪ Matrix/Tensor Factorization ▪ Regression models (Logistic, Linear, Elastic nets) ▪ Factorization Machines ▪ Restricted Boltzmann Machines ▪ Markov Chains & other graph models ▪ Clustering / Topic Models ▪ Neural Networks ▪ Association Rules ▪ GBDT/RF ▪ …
[Chart: Improvement Over Baseline (axis 0% to 300%) for Popularity, + Ratings, + More Features & Optimized Models]
Anatomy of a Machine Learning Platform
Problem
Data
Experiment Offline
Produce Model
Test / Metrics
Near-line
Online
UI Clients
Event Distribution
Online Algs
Model Trainer
Pre-compute
AB Test Metrics
API Layer
Monitoring
Offline
Hadoop / Data Warehouse
Experimentation Platform
S3 / HDFS
Offline Metrics
Query Tools
Models
Models
▪ App Logs ▪ User Actions
▪ Ratings ▪ Plays ▪ Queue Adds
▪ Algo Actions ▪ Impressions (Presentation Bias)
▪ Context ▪ Device Info ▪ User Demographics ▪ Social ▪ Time
▪ …
Many different types of data…
Embedded
Weights
Real-time popularity of movie
Example: Neural Network Training
θ
Input, Hidden Layer, Output
Input, Hidden Layers, Output
Neural Network Training
1,536 cores
G2 Instances, $0.60 per hour
But… things can go astray
[Figure: factorization of R into U × M; item matrix M labeled Pre-compute, user vectors u1, u2, u3 labeled Online]
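One reading of the slide above is that the item matrix M is pre-computed offline, while each user vector is computed online from that member's own ratings. A minimal sketch of that split, assuming stochastic gradient descent with L2 regularization to fit the user vector while the item factors stay fixed. The fitting method, function name, hyperparameters, and data are all assumptions, not details from the talk.

```python
def fit_user_vector(item_factors, ratings, n_factors, steps=200, lr=0.05, reg=0.01):
    """Fit one user's latent vector by SGD, keeping item factors fixed.

    item_factors: dict of item id -> list of floats (pre-computed offline)
    ratings: dict of item id -> this user's rating
    """
    user_vec = [0.0] * n_factors
    for _ in range(steps):
        for item_id, rating in ratings.items():
            q = item_factors[item_id]
            error = rating - sum(p * qk for p, qk in zip(user_vec, q))
            # Gradient step on squared error plus L2 penalty on the user vector.
            user_vec = [p + lr * (error * qk - reg * p) for p, qk in zip(user_vec, q)]
    return user_vec

def score(user_vec, item_vec):
    """Predicted interaction: dot product of user and item vectors."""
    return sum(p * q for p, q in zip(user_vec, item_vec))

# Hypothetical pre-computed item factors and one member's ratings.
item_factors = {"house_of_cards": [1.0, 0.0], "bobs_burgers": [0.0, 1.0]}
aish_ratings = {"house_of_cards": 4.0, "bobs_burgers": 2.0}

aish_vec = fit_user_vector(item_factors, aish_ratings, n_factors=2)
hoc_score = score(aish_vec, item_factors["house_of_cards"])
bobs_score = score(aish_vec, item_factors["bobs_burgers"])
print(round(hoc_score, 2), round(bobs_score, 2))
```

Because only a single small vector is updated, a fit like this is cheap enough to rerun whenever a member produces a new event, while the much larger item matrix is refreshed on the offline schedule.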
Aish played HoC
Publish new model for Aish
Aish Fenton @aishfenton https://www.linkedin.com/profile/view?id=47917219