Irmak
The Anatomy of a Data Science Project
AGE 7
Oh cool.
Pretty good. Space and stuff.
AGE 14
Omigod Omigod Omigod.
Epic masterpiece is epic!!!!1!
I'm in love with Leia.
AGE 30
When you think about it, it's not that good.
Ah, who am I kidding? It's amazing.
I'm still in love with Leia.
I mean... look at her.
What determines how much I like a movie?
A personal question on something I am passionate about
How do I boost my sales?
A business question
Can I identify experts in each division of my company and bring them
together to collaborate?
Another business question
What do customers out there think and say
about my products?
Another business question
The Anatomy of a Data Science Project
Finding the right questions
Right metrics
Knowing what’s been done
Obsessed with Movies
Irmak Sirer
Start with question, not data
Iterative design process
Moving targets
Find Data
Clean Data
Manage Data
BIG DATA
Machine Learning
Statistics
Applied Math
Open source tools
Python
Pandas, scikit-learn
SQL, Mongo
JavaScript, d3, Flask
Hadoop, Spark, Hive, Mahout
Interactive Dashboards
Easy to read graphs
Explaining well, adapting to audience
Ultimate Goal & Product: Insights
What determines how much I like a movie?
Is my reaction to a movie / book / song predictable?
How much will I like The Book of Eli?
2006
Cinematch
1 billion user ratings
55,000 movies
Cinematch
I have a soulmate in taste
Irmak Frrmack
Watched the same movies
Gave the exact same ratings
Except The Book of Eli
Frrmack watched The Book of Eli
Oh man, it was… FANTASTIC!
Predict
No perfect soulmates in real life
Irmak
Almost soulmate 1: 87% soulmate
Almost soulmate 2: 74% soulmate
Almost soulmate 3: 82% soulmate
Almost soulmate 4: 95% soulmate
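The "almost soulmate" idea can be sketched as a similarity-weighted average: people whose taste is closer to mine get more say in the prediction. This is only an illustration of the idea on the slides, not Cinematch's actual algorithm. The function name `predict_rating` and the star ratings below are made up; only the similarity percentages come from the slide.

```python
def predict_rating(neighbors):
    """Similarity-weighted average of other people's ratings.

    neighbors: list of (similarity, rating) pairs, similarity in (0, 1].
    """
    total_weight = sum(similarity for similarity, _ in neighbors)
    if total_weight == 0:
        raise ValueError("need at least one neighbor with nonzero similarity")
    weighted_sum = sum(similarity * rating for similarity, rating in neighbors)
    return weighted_sum / total_weight


# Similarities are the percentages from the slide; the ratings of
# The Book of Eli are hypothetical, chosen only for illustration.
almost_soulmates = [(0.87, 4), (0.74, 3), (0.95, 5), (0.82, 4)]
prediction = predict_rating(almost_soulmates)
print(f"Predicted rating: {prediction:.2f} stars")
```

With a single perfect soulmate (similarity 1.0) the prediction is simply that person's rating, which is the special case from the earlier slides.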
Cinematch
Works well for movies that everybody rates
Quite bad with movies that only a few people rate
Some movies are especially difficult to predict
Biggest error source: popular but weird
15% of all errors from ONE movie
Trivial: Mean score of everyone
Error (RMSE): 1.0540 stars
Cinematch
Error (RMSE): 0.9525 stars
9.6% improvement
Better rankings
Better recommendations
+ 8.6% + 1200% people watch top recommendation
BigChaos Netflix Prize Report
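For reference, RMSE (root mean squared error) and the relative improvement between two error values can be computed as below. This is a minimal sketch; the function names are illustrative, and the only numbers used are the two RMSE values quoted on the slides.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_improvement(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Trivial predictor (1.0540 stars) versus Cinematch (0.9525 stars).
cinematch_gain = relative_improvement(1.0540, 0.9525)
print(f"{cinematch_gain:.1%}")  # 9.6%
```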
2006
Cinematch
Error: 0.9525 stars
$1,000,000 for a 10% improvement
Bring it down to:
Error: 0.8563 stars
BellKor’s Pragmatic Chaos
How did they do it?
Before: Solid assumptions
You have a certain taste.
Your taste dictates a hidden rating for Book of Eli.
When you watch it, this rating is revealed to you.
WRONG
After:
Your rating changes with time.
It depends on...
how many you rated that day
your average rating for the day
which movies you rated on this day
shown Netflix prediction
Y. Koren, The BellKor Solution to the Netflix Grand Prize. 2009
Trivial: Mean score of everyone
Error: 1.0540 stars
Cinematch
Error: 0.9525 stars
Your time-dependent rating tendencies
Error: 0.9278 stars
12.0% improvement
without looking at which movies you like/hate!
Y. Koren, The BellKor Solution to the Netflix Grand Prize. 2009
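To make "time-dependent rating tendencies" concrete, here is a heavily simplified sketch of a day-aware baseline: global mean, plus how a movie is rated on average, plus how generous a given user was on a given day. It uses no information about personal taste. It is not the model from Koren's paper, which fits regularized parameters; the function `fit_day_aware_baseline` and the toy data are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def fit_day_aware_baseline(ratings):
    """ratings: list of (user, movie, day, stars). Returns a predict function."""
    global_mean = mean(stars for _, _, _, stars in ratings)

    # Movie offset: how far a movie's average sits from the global mean.
    by_movie = defaultdict(list)
    for _, movie, _, stars in ratings:
        by_movie[movie].append(stars - global_mean)
    movie_bias = {movie: mean(values) for movie, values in by_movie.items()}

    # User-day offset: what is left over for that user on that rating day.
    by_user_day = defaultdict(list)
    for user, movie, day, stars in ratings:
        by_user_day[(user, day)].append(stars - global_mean - movie_bias[movie])
    user_day_bias = {key: mean(values) for key, values in by_user_day.items()}

    def predict(user, movie, day):
        # Unseen movies or user-days fall back to an offset of zero.
        return (global_mean
                + movie_bias.get(movie, 0.0)
                + user_day_bias.get((user, day), 0.0))

    return predict


# Hypothetical toy data: (user, movie, day, stars).
toy_ratings = [
    ("irmak", "A", 1, 5), ("irmak", "B", 1, 4),
    ("frrmack", "A", 2, 3), ("frrmack", "B", 2, 2),
]
predict = fit_day_aware_baseline(toy_ratings)
print(predict("irmak", "A", 1))
```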
What does this suggest?
We cannot compare a movie with all others we've seen.
We compare it to a limited set.
Liking (real time & remembered) depends on time and mood.
Other people's opinions affect our own (followers / hipsters)
An experiment
Music Lab: A website for downloading music
Same website: Music download and rating
M.J. Salganik, P.S. Dodds, D.J. Watts. Science, 311:854-856, 2006
Alternative A: Other people's ratings invisible
More or less equal ratings
Alternative B: All ratings visible
Several songs snowball in popularity
It's different songs for each trial
Social influence plays a big part in determining hits and misses
Problems with rating movies
We cannot compare a movie with all others we've seen.
We compare it to a limited set.
Liking (real time & remembered) depends on time and mood.
Other people's opinions affect our own.
Degree of liking is sensitive and vague
Tuesday 3am: Amazing!
Sunday 12pm: Total garbage
Dependent on many other environmental factors besides our taste
Difficult to describe accurately and consistently with a number
Predicting aside,
can I even reliably rate & rank movies I’ve seen in terms of enjoyment?
Irmak Frrmack
What are your top twenty movies?
Well… Ummm… I like Star Wars.
Degree of liking is sensitive and vague
Can’t we do something about this?
“Enjoyment” from a movie is very high-dimensional information
Rating means projecting this onto a single dimension
But sometimes you just want to do the best projection you can
What is my top twenty?
We cannot compare a movie with all others we've seen.
We compare it to a limited set.
Trying to rate Star Wars
1. Map enjoyment to a specific scale
2. Choose corresponding rating for this degree of liking
But we cannot keep this entire history of enjoyment in mind
We fuzzily remember a small subset
We map based on this subset
Degree of liking is sensitive and vague
Can’t we do something about this?
We can certainly handle single comparisons
less vague
little information
I can manually compare it with all others
And find exactly where it belongs
right after Indiana Jones
right before The Princess Bride
Full ranking: Compare all pairs
That’s a bit too much effort for me
1,000,000 comparisons?
We don’t need all of them
If [comparison images not recovered], I have some information about [image not recovered]
Compare a random sample of pairs
Use a ranking algorithm that utilizes all the information
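A quick sketch of why comparing all pairs is too much effort, and of drawing a random sample of pairs instead. The number of distinct pairs among n items is n(n-1)/2, so a collection of about 1,415 movies already needs just over a million comparisons. The function names and the `seed` parameter are illustrative assumptions.

```python
import itertools
import random


def count_pairs(n_items):
    """Number of distinct unordered pairs among n_items items."""
    return n_items * (n_items - 1) // 2


def sample_pairs(items, n_pairs, seed=None):
    """Draw n_pairs distinct unordered pairs uniformly at random."""
    rng = random.Random(seed)
    all_pairs = list(itertools.combinations(items, 2))
    return rng.sample(all_pairs, min(n_pairs, len(all_pairs)))


print(count_pairs(1415))  # 1000405

movies = ["Star Wars", "Indiana Jones", "The Princess Bride", "The Book of Eli"]
for first, second in sample_pairs(movies, 3, seed=0):
    print(f"{first} vs {second}")
```

Listing every pair before sampling is fine for a sketch; for a large collection one would draw random index pairs instead of materializing the full list.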
Good idea!
Elo rating system
“hotness”: 7.00
range: +1.50 / -1.50
7.00 (+1.50 / -1.50) vs 8.00 (+1.50 / -1.50)
7.12 7.68
7.00: 36% to win
8.00: 64% to win
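The standard Elo formula for the expected outcome is a logistic curve in the rating difference. Chess ratings use a divisor of 400 points; for the 0 to 10 scale on these slides, a divisor of 4.0 is assumed here. That value is an assumption, not something stated in the talk, though it does reproduce the 36% / 64% split for 7.00 vs 8.00.

```python
def win_probability(rating_a, rating_b, scale=4.0):
    """Elo expected score of A against B (logistic in the rating gap).

    scale=4.0 is an assumed stand-in for the usual 400-point divisor.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


print(round(win_probability(8.00, 7.00), 2))  # 0.64
print(round(win_probability(7.00, 8.00), 2))  # 0.36
```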
Elo rating system
How do we find out what these ranges are?
Start with the same guess for every contender
5.00 5.00 5.00 5.00 5.00 5.00
5.00 vs 5.00 → 5.12, 4.88
Update the best guesses accordingly
5.12 vs 5.00 → 5.24, 4.88
5.24 vs 5.00 → 5.14, 5.10
We don’t need all comparisons
If [comparison images not recovered], I have some information about [image not recovered]
Elo rating system
7.61 (89% to win) vs 4.02 (11% to win)
Favorite wins: 7.61 +.02, 4.02 -.02
Underdog wins: 7.61 -.53, 4.02 +.53
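A sketch of the standard Elo update rule: the winner gains, and the loser drops, an amount proportional to how surprising the result was. The step size `k=0.24` and `scale=4.0` are assumptions, picked so that two contenders at 5.00 move to 5.12 and 4.88 as on the earlier slide. The slide's other figures (such as .02 and .53) are not reproduced exactly by these values.

```python
def win_probability(rating_a, rating_b, scale=4.0):
    """Elo expected score of A against B; scale is an assumed divisor."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def elo_update(winner, loser, k=0.24, scale=4.0):
    """Return (new_winner, new_loser) after one comparison.

    An expected win moves the ratings a little; an upset moves them a lot.
    """
    surprise = 1.0 - win_probability(winner, loser, scale)
    delta = k * surprise
    return winner + delta, loser - delta


print([round(r, 2) for r in elo_update(5.00, 5.00)])  # [5.12, 4.88]

favorite_wins = elo_update(7.61, 4.02)
underdog_wins = elo_update(4.02, 7.61)
print(favorite_wins[0] - 7.61 < underdog_wins[0] - 4.02)  # True
```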
Elo rating system
We now have scores on a single scale (estimates of people’s appreciation levels)
9.07 8.42 6.40 4.88 4.20 3.03
and a ranking
1 2 3 4 5 6
Degree of liking is sensitive and vague
Can we somehow apply this to movies, then?
We can do better
Bayesian ranking algorithms
Glicko (The Elo Killer), 1999
TrueSkill™, 2007
Bayesian ranking
4.46 4.01
+- +-
Liking (real time & remembered) depends on time and mood.
Other people's opinions affect our own.
Degree of liking is sensitive and vague
82% to win
15% to win
3% to draw
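One way to get win / draw / lose probabilities from two uncertain ratings is a TrueSkill-style Gaussian model: each movie's "performance" in a comparison is its rating plus noise, and a draw is declared when the two performances are within a small margin. The sketch below uses only the standard library. The uncertainty values, `beta`, and `draw_margin` are made-up assumptions, since the slide does not give them, so it is not expected to reproduce the 82% / 15% / 3% figures.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def outcome_probabilities(mu_a, sigma_a, mu_b, sigma_b,
                          beta=0.5, draw_margin=0.05):
    """Return (P(A wins), P(draw), P(B wins)).

    The performance difference is modeled as Gaussian with mean
    mu_a - mu_b and variance sigma_a^2 + sigma_b^2 + 2 * beta^2.
    """
    spread = math.sqrt(sigma_a ** 2 + sigma_b ** 2 + 2.0 * beta ** 2)
    gap = mu_a - mu_b
    p_a_wins = 1.0 - normal_cdf((draw_margin - gap) / spread)
    p_b_wins = normal_cdf((-draw_margin - gap) / spread)
    p_draw = 1.0 - p_a_wins - p_b_wins
    return p_a_wins, p_draw, p_b_wins


# Means from the slide; the 0.2 uncertainties are hypothetical.
p_a, p_draw, p_b = outcome_probabilities(4.46, 0.2, 4.01, 0.2)
print(f"A wins {p_a:.0%}, draw {p_draw:.0%}, B wins {p_b:.0%}")
```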
Bayesian ranking
Elo: Best guess for the center: 4.3
Bayesian: It could be centered around 4.3
Bayesian: It could also be centered around 4.2
Bayesian: or centered around 4.4
Bayesian: Less likely, but even around 4.5
Bayesian ranking
[Plot: probability distribution over the rating, centered near 4.3, x-axis 3.5 to 5; its width is the uncertainty]
Few comparisons: Lots of uncertainty (anything from 2.3 to 4.5 is quite possible)
After many comparisons: Quite sure (pretty much between 4.11 and 4.18)
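The narrowing of the distribution can be illustrated with the simplest Bayesian update there is: a Gaussian belief refined by noisy readings. This is a deliberate simplification. Glicko and TrueSkill update from win/loss outcomes rather than direct readings, but the effect on the width of the belief is the same in spirit. All numbers below are hypothetical.

```python
import math


def update_belief(mean, std, observation, noise_std):
    """Conjugate normal update of a Gaussian belief with one noisy reading."""
    precision = 1.0 / std ** 2 + 1.0 / noise_std ** 2
    new_mean = (mean / std ** 2 + observation / noise_std ** 2) / precision
    return new_mean, math.sqrt(1.0 / precision)


# Start vague (center 3.0, width 1.0), then absorb six noisy readings.
mean, std = 3.0, 1.0
widths = [std]
for reading in [4.2, 4.1, 4.3, 4.0, 4.2, 4.1]:
    mean, std = update_belief(mean, std, reading, noise_std=0.5)
    widths.append(std)

print(f"center {mean:.3f}, width {std:.3f}")
print([round(w, 3) for w in widths])  # the width shrinks with every reading
```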
Bayesian ranking
[Plot: rating distributions for Star Wars and Lord of the Rings, x-axis 2.0 to 5.0]
How did they do it?
After:
Your rating changes with time.
A small, constant increase in uncertainty before each comparison
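The "small, constant increase in uncertainty" can be sketched as adding a fixed variance before each update, which is how TrueSkill's dynamics factor works (Glicko grows its rating deviation over time in a similar way). The value `tau=0.05` is an arbitrary assumption for illustration.

```python
import math


def inflate_uncertainty(sigma, tau=0.05):
    """Widen a belief slightly: add tau^2 to the variance sigma^2."""
    return math.sqrt(sigma ** 2 + tau ** 2)


# A movie we were quite sure about slowly becomes uncertain again
# if its belief keeps being widened without new comparisons.
sigma = 0.10
for _ in range(5):
    sigma = inflate_uncertainty(sigma)
print(round(sigma, 3))
```

Because the belief never becomes fully rigid, later comparisons can still move a rating when taste drifts.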
Degree of liking is sensitive and vague
Great! We have a system!
I don’t want to spend too much time on this
How many is too many?
Minimum Effort
Maximum Information
[Figure: 1 to 5 rating scale]
Not reliable by itself
Still carries a lot of information
What else can we do?
Minimum Effort
Maximum Information
I can calculate the expected amount of information from a comparison!
Certain about both movies: Won’t learn a lot
Don’t know much about either: Will learn a lot regardless of outcome
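A crude way to act on this is to pose the comparison between the two movies we are least sure about. The sketch below scores each pair by its combined variance. This is only a proxy chosen for illustration, not the expected-information calculation the slide refers to, and the beliefs are hypothetical (mean, uncertainty) values.

```python
import itertools


def most_informative_pair(beliefs):
    """Pick the pair of movies with the largest combined variance.

    beliefs: dict mapping movie title -> (mean, std).
    """
    def combined_variance(pair):
        first, second = pair
        return beliefs[first][1] ** 2 + beliefs[second][1] ** 2

    return max(itertools.combinations(beliefs, 2), key=combined_variance)


# Hypothetical beliefs: two well-known favorites, two barely compared movies.
beliefs = {
    "Star Wars": (4.5, 0.1),
    "Lord of the Rings": (4.4, 0.1),
    "The Book of Eli": (3.0, 0.9),
    "The Princess Bride": (4.0, 0.8),
}
next_pair = most_informative_pair(beliefs)
print(next_pair)
```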
Python
Trueskill
Django
Javascript
MySQL
movievsmovie.datasco.pe
Irmak Frrmack
What are your top twenty
movies?
Quantifying human reactions is hard
books
songs
food
politicians
products
celebrities
tv shows
importance of issues
what to spend ‘fun’ budget on
teams in different sports
Degree of liking is sensitive and vague
Tuesday 3am: Amazing!
Sunday 12pm: Total garbage
Quantifying human reactions is hard
Start with a rating, pose the correct comparisons
Every decision gets us closer
Many comparisons for a movie over different days average out mood and other factors
movievsmovie.datasco.pe
The Anatomy of a Data Science Project