the anatomy of a data science project

Irmak [email protected]

The Anatomy of a Data Science Project

AGE 7

Oh cool.

Pretty good. Space and stuff.

AGE 14

Omigod Omigod Omigod.

Epic masterpiece is epic!!!!1! I'm in love with Leia.

AGE 17

WTF?

AGE 30

When you think about it, it's not that good.

Ah, who am I kidding? It's amazing. I'm still in love with Leia.

I mean... look at her.

What determines how much I like a movie?


A personal question on something I am passionate about

How do I boost my sales?

A business question

Can I identify experts in each division of my company and bring them

together to collaborate?

Another business question

What do customers out there think and say

about my products?

Another business question

The Anatomy of a Data Science Project

Finding the right questions

Right metrics

Knowing what’s been done

Obsessed with Movies

Irmak Sirer

Start with question, not data

Iterative design process

Moving targets

Find Data

Clean Data

Manage Data

BIG DATA

Machine Learning

Statistics

Applied Math

Open source tools

Python

Pandas, scikit-learn

SQL, Mongo

JavaScript, d3, Flask

Hadoop, Spark, Hive, Mahout

Google

Interactive Dashboards

Easy to read graphs

Explaining well, adapting to audience

Ultimate Goal & Product: Insights


Is my reaction to a movie / book / song predictable?

How much will I like The Book of Eli?

2006

Cinematch

1 billion user ratings

55,000 movies

Cinematch

I have a soulmate in taste

Irmak

Irmak Frrmack

Watched the same movies

Gave the exact same ratings

Except The Book of Eli

Frrmack watched The Book of Eli

Oh man, it was… FANTASTIC!

Predict

No perfect soulmates in real life

Irmak

Almost soulmate 1, Almost soulmate 2, Almost soulmate 3, Almost soulmate 4

87% soulmate, 74% soulmate, 95% soulmate, 82% soulmate
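One way to picture "almost soulmates" is a similarity-weighted average: people whose taste is closer to mine get more say in the prediction. The sketch below is a generic neighborhood-style illustration, not Cinematch's actual algorithm. The similarities are the percentages from the slide; the star ratings and the function name `predict_from_soulmates` are invented for illustration.

```python
def predict_from_soulmates(neighbors):
    """Similarity-weighted average of the neighbors' ratings.

    neighbors: list of (similarity, rating) pairs, similarity in (0, 1].
    """
    total_weight = sum(sim for sim, _ in neighbors)
    if total_weight == 0:
        raise ValueError("need at least one neighbor with non-zero similarity")
    return sum(sim * rating for sim, rating in neighbors) / total_weight


# Similarities come from the slide; the star ratings are hypothetical.
almost_soulmates = [(0.87, 5), (0.74, 4), (0.95, 5), (0.82, 3)]
prediction = predict_from_soulmates(almost_soulmates)
print(f"Predicted rating for The Book of Eli: {prediction:.2f} stars")
```

The 95% soulmate pulls the prediction toward their rating more than the 74% soulmate does.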


Cinematch

Works well for movies that everybody rates

Quite bad with movies that only a few people rate

Some movies are especially difficult to predict

Biggest error source: popular but weird

15% of all errors from ONE movie

Trivial: Mean score of everyone

Error (RMSE): 1.0540 stars

Cinematch

Error (RMSE): 0.9525 stars

9.6% improvement

Better rankings

Better recommendations

+8.6%    +1200% people watch top recommendation

BigChaos Netflix Prize Report
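For reference, RMSE (root mean squared error) and the relative improvement between two error levels can be computed as below. This is a minimal sketch; the helper names `rmse` and `relative_improvement` are my own, and the two error values are the ones quoted on the slides.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Error levels quoted on the slides, in stars.
mean_of_everyone = 1.0540
cinematch = 0.9525
print(f"{100 * relative_improvement(mean_of_everyone, cinematch):.1f}%")  # 9.6%
```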

Cinematch

Error: 0.9525 stars

$1,000,000 for a 10% improvement

2006

Bring it down to:

Error: 0.8563 stars

BellKor’s Pragmatic Chaos

How did they do it?

Before: Solid assumptions

You have a certain taste.

Your taste dictates a hidden rating for Book of Eli.

When you watch it, this rating is revealed to you.

WRONG

After:

Your rating changes with time.

It depends on...

how many you rated that day

your average rating for the day

which movies you rated on this day

shown Netflix prediction

Y. Koren, The BellKor Solution to the Netflix Grand Prize. 2009
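A much-simplified sketch of a time-dependent baseline: predict from the overall mean, plus how generous this user tends to be, plus how generous they were on that particular day. It never looks at which movie is being rated. This is an illustration under my own assumptions (plain averaging of residuals, hypothetical names `fit_baseline` and `predict`, toy data), not the BellKor model itself.

```python
from collections import defaultdict
from statistics import mean


def fit_baseline(ratings):
    """ratings: list of (user, movie, day, stars). Returns (mu, user_bias, day_bias)."""
    mu = mean(stars for *_, stars in ratings)

    # How far each user's ratings sit above or below the global mean.
    user_resid = defaultdict(list)
    for user, _, _, stars in ratings:
        user_resid[user].append(stars - mu)
    user_bias = {user: mean(vals) for user, vals in user_resid.items()}

    # What is left over for each (user, day) after the user's usual tendency.
    day_resid = defaultdict(list)
    for user, _, day, stars in ratings:
        day_resid[(user, day)].append(stars - mu - user_bias[user])
    day_bias = {key: mean(vals) for key, vals in day_resid.items()}

    return mu, user_bias, day_bias


def predict(model, user, day):
    """Predicted stars for any movie rated by `user` on `day`."""
    mu, user_bias, day_bias = model
    return mu + user_bias.get(user, 0.0) + day_bias.get((user, day), 0.0)


# Toy data, invented for illustration.
toy = [("irmak", "A", 1, 5), ("irmak", "B", 1, 4),
       ("irmak", "C", 2, 2), ("frrmack", "A", 1, 3)]
model = fit_baseline(toy)
print(predict(model, "irmak", 1), predict(model, "irmak", 2))
```

An unseen user or day falls back to the terms that are known, down to the global mean.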

Trivial: Mean score of everyone

Error: 1.0540 stars

Your time dependent rating tendencies

Error: 0.9278 stars

12.0% improvement

without looking at which movies you like/hate!

What does this suggest?


We cannot compare a movie with all others we've seen.



We compare it to a limited set.




Liking (real time & remembered) depends on time and mood.





Other people's opinions affect our own (followers / hipsters)


An experiment

Music Lab: A website for downloading music

Same website: Music download and rating

M.J. Salganik, P.S. Dodds, D.J. Watts. Science, 311:854-856, 2006

Alternative A: Other people's ratings invisible

More or less equal ratings

Alternative B: All ratings visible

Several songs snowball in popularity

It's different songs for each trial

Social influence plays a big part in determining hits and misses

Problems with rating movies




Other people's opinions affect our own.

Degree of liking is sensitive and vague

Amazing! Total garbage

Tuesday 3am Sunday 12pm


Dependent on many other environmental factors besides our taste

Difficult to describe accurately and consistently with a number

Predicting aside,

can I even reliably rate & rank movies I’ve seen in terms of enjoyment?

Irmak Frrmack

What are your top twenty movies?

Well… Ummm… I like Star Wars.


Can’t we do something about this?

“Enjoyment” from a movie is very high dimensional information

Rating means projecting this onto a single dimension

But sometimes you just want to do the best projection you can

What is my top twenty?

Trying to rate Star Wars


1. Map enjoyment to a specific scale

2. Choose the corresponding rating for this degree of liking


But we cannot keep this entire history of enjoyment in mind

We fuzzily remember a small subset

We map based on this subset

SAMPLING

BIASED SAMPLING

Tuesday

Friday


Can’t we do something about this?

We can certainly handle single comparisons

?


less vague


little information

I can manually compare it with all others

And find exactly where it belongs

right after Indiana Jones

right before The Princess Bride

Full ranking: Compare all pairs

That’s a bit too much effort for me

1,000,000 comparisons?
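A full ranking by pairwise comparison needs n(n-1)/2 comparisons for n movies, which grows quickly. The slides don't say how many movies are in the collection; the sketch below only shows that a collection of roughly 1,400 movies already lands near a million pairs. The helper name `pair_count` is my own.

```python
import math


def pair_count(n_movies):
    """Number of distinct pairs among n_movies items: n * (n - 1) / 2."""
    return math.comb(n_movies, 2)


for n in (20, 100, 1415):
    print(n, pair_count(n))
# 20 190
# 100 4950
# 1415 1000405
```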

We don’t need all of them


If [image], [image], I have some information about [image]

Compare a random sample of pairs
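Drawing a random sample of pairs is a few lines in Python. This is a minimal sketch; the function name `sample_pairs`, the seed argument, and the short movie list are my own choices for illustration.

```python
import itertools
import random


def sample_pairs(items, k, seed=None):
    """Pick k distinct unordered pairs uniformly at random (all pairs if k is too large)."""
    rng = random.Random(seed)
    all_pairs = list(itertools.combinations(items, 2))
    return rng.sample(all_pairs, min(k, len(all_pairs)))


movies = ["Star Wars", "Indiana Jones", "The Princess Bride", "The Book of Eli"]
for left, right in sample_pairs(movies, 3, seed=0):
    print(f"{left}  vs  {right}")
```

Listing every pair first is fine for a personal movie list; a very large collection would call for sampling index pairs directly instead.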

Use a ranking algorithm that utilizes all the information

Good idea!

Elo rating system

7.00

“hotness”

“hotness” range: +1.50 / -1.50

7.00 (+1.50 / -1.50)    8.00 (+1.50 / -1.50)

7.12    7.68

36% to win    64% to win
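The win percentages follow from the rating gap. A minimal sketch of the standard Elo expected-score formula is below; the `scale=4.0` value is my assumption (the usual chess constant of 400 divided by 100), chosen because it reproduces the 36% / 64% split for 7.00 vs 8.00 and the 89% / 11% split for 7.61 vs 4.02 quoted in the slides.

```python
def expected_score(rating_a, rating_b, scale=4.0):
    """Probability that A beats B under the Elo logistic model.

    scale is assumed: a gap of `scale` points means 10-to-1 odds.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


print(round(expected_score(7.00, 8.00), 2))  # 0.36
print(round(expected_score(8.00, 7.00), 2))  # 0.64
print(round(expected_score(7.61, 4.02), 2))  # 0.89
```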

Elo rating system

How do we find out what these ranges are?

Start with the same guess for every contender

5.00 5.00 5.00 5.00 5.00 5.00

5.00  5.00  ?

5.12  4.88

Update the best guesses accordingly

5.12  5.00  ?

5.24  4.88

5.24  5.00  ?

5.14  5.10
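The update step can be sketched as follows: the winner gains, and the loser drops, in proportion to how surprising the result was. The values `k=0.24` and `scale=4.0` are hypothetical choices of mine; they happen to reproduce the first 5.00 vs 5.00 step (5.12 / 4.88), but not every step size shown on the slides.

```python
def expected_score(rating_a, rating_b, scale=4.0):
    """Probability that A beats B under the Elo logistic model (scale is assumed)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def elo_update(winner, loser, k=0.24, scale=4.0):
    """Return the new (winner, loser) ratings after one comparison."""
    surprise = 1.0 - expected_score(winner, loser, scale)
    delta = k * surprise  # expected wins move ratings little, upsets move them a lot
    return winner + delta, loser - delta


new_winner, new_loser = elo_update(5.00, 5.00)
print(f"{new_winner:.2f} {new_loser:.2f}")  # 5.12 4.88
```

An upset moves both ratings much further than a result the ratings already predicted.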

We don’t need all comparisons

If [image], I have some information about [image]

7.61  4.02  ?

89% to win    11% to win

7.61 (+.02)    4.02 (-.02)

7.61 (-.53)    4.02 (+.53)

We now have scores on a single scale (estimates of people’s appreciation levels)

9.07 8.42 6.40 4.88 4.20 3.03

and a ranking

1 2 3 4 5 6


Can we somehow apply this to movies, then?

We can do better

Bayesian ranking algorithms

Glicko (The Elo Killer), 1999

TrueSkill™, 2007

Bayesian ranking

4.46 +-    4.01 +-

82% to win    15% to win    3% to draw
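With Gaussian beliefs about each score, the chance that one contender beats the other depends on both the gap between the centers and the combined uncertainty. The sketch below ignores draws, so it will not reproduce the three-way split on the slide. The uncertainty values (the slide's "+-" numbers did not survive) and the `beta` noise parameter are hypothetical.

```python
import math
from statistics import NormalDist


def win_probability(mu_a, sigma_a, mu_b, sigma_b, beta=0.5):
    """P(A beats B) when each score is believed to be Normal(mu, sigma^2).

    beta is an assumed amount of per-comparison noise; draws are ignored.
    """
    spread = math.sqrt(sigma_a ** 2 + sigma_b ** 2 + 2 * beta ** 2)
    return NormalDist().cdf((mu_a - mu_b) / spread)


# Centers from the slide; sigmas are invented for illustration.
p = win_probability(4.46, 0.3, 4.01, 0.3)
print(f"{100 * p:.0f}% to win, {100 * (1 - p):.0f}% to lose")
```

The more uncertain the two beliefs are, the closer the probability moves to 50%, even if the centers stay the same.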

Bayesian ranking

Elo: Best guess for the center: 4.3

Bayesian: It could be centered around 4.3

Bayesian: It could also be centered around 4.2

Bayesian: or centered around 4.4

Bayesian: Less likely, but even around 4.5

Bayesian ranking

[Figure: probability curve over possible scores, centered at 4.3 on an axis from 3.5 to 5; the width of the curve is the uncertainty]

Few comparisons: Lots of uncertainty (anything from 2.3 to 4.5 is quite possible)

[Figure: probability curve on a score axis from 2.0 to 5]

After many comparisons: Quite sure (pretty much between 4.11 and 4.18)

[Figure: probability curve on a score axis from 2.0 to 5]

Bayesian ranking

[Figure: score distributions for Star Wars and Lord of the Rings on an axis from 2.0 to 5.0]

A small, constant increase in uncertainty before each comparison

[Figure: probability curve on a score axis from 3.5 to 5, with its width labeled as the uncertainty]
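A sketch of one Bayesian update in the style of TrueSkill's two-player, no-draw case: widen both beliefs slightly, then shift the centers and shrink the uncertainties according to how surprising the result was. The function name and the parameter values (`beta`, `tau`, and the starting beliefs) are hypothetical; the real TrueSkill library also handles draws and teams.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def bayesian_update(winner, loser, beta=0.5, tau=0.05):
    """winner and loser are (mu, sigma) tuples; returns the updated pair."""
    (mu_w, sigma_w), (mu_l, sigma_l) = winner, loser

    # Small, constant increase in uncertainty before the comparison.
    var_w = sigma_w ** 2 + tau ** 2
    var_l = sigma_l ** 2 + tau ** 2

    c = math.sqrt(2 * beta ** 2 + var_w + var_l)
    t = (mu_w - mu_l) / c
    v = STD_NORMAL.pdf(t) / STD_NORMAL.cdf(t)  # large when the result was a surprise
    w = v * (v + t)                            # fraction by which variances shrink

    new_winner = (mu_w + var_w / c * v,
                  math.sqrt(var_w * (1 - var_w / c ** 2 * w)))
    new_loser = (mu_l - var_l / c * v,
                 math.sqrt(var_l * (1 - var_l / c ** 2 * w)))
    return new_winner, new_loser


# Two movies we know little about: same center, wide uncertainty.
star_wars, eli = bayesian_update((3.0, 1.0), (3.0, 1.0))
print(star_wars, eli)
```

The `tau` term keeps the uncertainty from collapsing to zero, so a score can still drift if taste changes over time.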


Great! We have a system!

I don’t want to spend too much time on this

How many is too many?

Minimum Effort

Maximum Information

[Figure: rating scale marked 1, 3, 5]

Not reliable by itself

Still carries a lot of information

I don’t want to spend too much time on this

What else can we do?


?


?

I can calculate the expected amount of information from a comparison!


Certain about both movies: Won’t learn a lot

Don’t know much about either: Will learn a lot regardless of outcome
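One way to put a number on "expected amount of information" is the expected drop in entropy of the two beliefs, averaged over both possible outcomes. The sketch below assumes Gaussian beliefs and a TrueSkill-style no-draw update; the function names, `beta`, and the example values are hypothetical, and this is not necessarily how the movievsmovie site picks its pairs.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def _entropy_drop(var, c, t):
    """Entropy reduction (in nats) of one Gaussian belief after an outcome with margin t."""
    v = STD_NORMAL.pdf(t) / STD_NORMAL.cdf(t)
    w = v * (v + t)
    return -0.5 * math.log(1 - var / c ** 2 * w)


def expected_information(mu_a, sigma_a, mu_b, sigma_b, beta=0.5):
    """Expected total entropy reduction from comparing A and B once."""
    var_a, var_b = sigma_a ** 2, sigma_b ** 2
    c = math.sqrt(2 * beta ** 2 + var_a + var_b)
    t = (mu_a - mu_b) / c
    p_a_wins = STD_NORMAL.cdf(t)

    gain = 0.0
    # Outcome 1: A wins (margin t). Outcome 2: B wins (margin -t).
    for prob, margin in ((p_a_wins, t), (1 - p_a_wins, -t)):
        gain += prob * (_entropy_drop(var_a, c, margin) + _entropy_drop(var_b, c, margin))
    return gain


certain = expected_information(3.0, 0.1, 3.0, 0.1)
uncertain = expected_information(3.0, 1.0, 3.0, 1.0)
print(f"certain pair: {certain:.3f} nats, uncertain pair: {uncertain:.3f} nats")
```

Asking about the pair with the highest expected gain first is one way to get the most out of each comparison.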

Python

TrueSkill

Django

JavaScript

MySQL

movievsmovie.datasco.pe

Irmak Frrmack

What are your top twenty movies?

Quantifying human reactions is hard

books

songs

food

politicians

products

celebrities

tv shows

importance of issues

what to spend ‘fun’ budget on

teams in different sports


Start with a rating, pose the correct comparisons

Every decision gets us closer

Many comparisons for a movie

over different days

averages out mood and other factors

movievsmovie.datasco.pe

The Anatomy of a Data Science Project

Thanks
