How to Tell Which Algorithms Really Matter
Ted Dunning, MapR Technologies
© 2014 MapR Technologies 2
Which Algorithms are Important? (and how can you know?)
Ted Dunning, Chief Application Architect, MapR Technologies
00:01 · 1.65 TB with 298 servers
00:02 · 129K recommendations
Advertising Automation
(diagram: Sellers Cloud, Buyers Cloud)
00:03 · 63M ad auctions
00:04 · 422.2K genetic sequences
Largest Biometric Database
00:05 · 4.73M authentications
But How is This Done?
What really matters?
Topic For Today
• What is important? What is not?
• Why?
• What is the difference from academic research?
• Some examples
What is Important?
• Deployable – Clever prototypes don’t count if they can’t be standardized
• Robust – Mishandling is common
• Transparent – Will degradation be obvious?
• Skillset and mindset matched? – How long will your fancy data scientist enjoy doing standard ops tasks?
• Proportionate – Where is the highest value per minute of effort?
Academic Goals vs Pragmatics
• Academic goals
– Reproducible
– Isolate theoretically important aspects
– Work on novel problems
• Pragmatics
– Highest net value
– Available data is constantly changing
– Diligence and consistency have larger impact than cleverness
– Many systems feed themselves; exploration and exploitation are both important
– Engineering constraints on budget and schedule
Example 1: Making Recommendations Better
Recommendation Advances
• What are the most important algorithmic advances in recommendations over the last 10 years?
• Cooccurrence analysis?
• Matrix completion via factorization?
• Latent factor log-linear models?
• Temporal dynamics?
The Winner – None of the Above
1. Result dithering (random noise)
2. Anti-flood (don’t repeat yourself)
The Real Issues
• Exploration
• Diversity
• Speed
• Not the last fraction of a percent
Result Dithering
• Dithering is used to re-order recommendation results – Re-ordering is done randomly
• Dithering is guaranteed to make off-line performance worse
• Dithering also has a near perfect record of making actual performance much better
“Made more difference than any other change”
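A minimal sketch of the dithering idea. The slides only say the re-ordering is random; the particular scheme below (synthetic score = log of original rank plus Gaussian noise of scale ε) is an assumption, chosen to match the ε values on the following slides:

```python
import math
import random

def dither(ranked_items, epsilon=0.69, rng=random):
    """Randomly perturb a ranked result list.

    Each item gets a synthetic score log(rank) + Gaussian noise with
    standard deviation epsilon, then the list is re-sorted by that
    score.  Small epsilon barely moves the top results; larger epsilon
    mixes deeper results onto the first page.
    """
    scored = [(math.log(rank + 1.0) + rng.gauss(0.0, epsilon), item)
              for rank, item in enumerate(ranked_items)]
    scored.sort(key=lambda pair: pair[0])
    return [item for _, item in scored]
```

With ε = 0 the original order is preserved exactly; with ε = log 2 ≈ 0.69, an item can plausibly swap with one roughly a factor of two away in rank, which is what lets second-page results get explored.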
Example … ε = 0.5
Example … ε = log 2 = 0.69
Exploring The Second Page
Lesson 1: Exploration is good
Example 2: Bayesian Bandits
Bayesian Bandits
• Based on Thompson sampling
• Very general sequential test
• Near-optimal regret
• Trades off exploration and exploitation
• Possibly best known solution for exploration/exploitation
• Incredibly simple
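"Incredibly simple" is not an exaggeration. The slides show no code, so the sketch below is an assumed Beta-Bernoulli formulation (uniform Beta(1, 1) prior) for click/no-click style rewards:

```python
import random

def thompson_pick(successes, failures, rng=random):
    """Pick an arm by Thompson sampling with a Beta-Bernoulli model.

    Sample a plausible conversion rate from each arm's Beta posterior
    and play the arm whose sample is largest.  Uncertain arms get
    explored; good arms get exploited, automatically.
    """
    samples = [rng.betavariate(s + 1, f + 1)  # Beta(1, 1) prior
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)

def update(successes, failures, arm, reward):
    """Record one Bernoulli reward (0 or 1) for the chosen arm."""
    if reward:
        successes[arm] += 1
    else:
        failures[arm] += 1
```

Each round: pick an arm, show that variant, observe the reward, update the counts. No schedule, no tuning parameters, and exploration fades out on its own as the posteriors sharpen.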
Fast Convergence
Thompson Sampling on Ads
An Empirical Evaluation of Thompson Sampling - Chapelle and Li, 2011
Bayesian Bandits versus Result Dithering
• Many useful systems are difficult to frame in fully Bayesian form
• Thompson sampling cannot be applied without posterior sampling
• Can still do useful exploration with dithering
• But better to use Thompson sampling if possible
Lesson 2: Exploration is easy to do and pays big benefits.
Example 3: On-line Clustering
The Problem
• K-means clustering is useful for feature extraction or compression
• At scale and at high dimension, the desirable number of clusters increases
• Very large number of clusters may require more passes through the data
• Super-linear scaling is generally infeasible
The Solution
• Sketch-based algorithms produce a sketch of the data
• Streaming k-means uses adaptive dp-means to produce this sketch in the form of many weighted centroids which approximate the original distribution
• The size of the sketch grows very slowly with increasing data size
• Many operations such as clustering are well behaved on sketches
Fast and Accurate k-means For Large Datasets. Michael Shindler, Alex Wong, Adam Meyerson.
Revisiting k-means: New Algorithms via Bayesian Nonparametrics. Brian Kulis, Michael Jordan.
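A toy one-pass sketch of the dp-means idea behind the streaming step. This is a simplification, not the full Shindler et al. algorithm: the facility cost `f` is fixed here, whereas the real algorithm grows it adaptively as centroids accumulate and then clusters the sketch itself:

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def dp_means_sketch(points, f, rng=random):
    """One pass over the data, returning [centroid, weight] pairs.

    Each point either merges into its nearest centroid or, with
    probability min(d^2 / f, 1), opens a new one.  Far-away points
    almost always open a centroid, so well-separated clusters each
    end up represented in the sketch.
    """
    centroids = []  # each entry: [coords list, weight]
    for p in points:
        if centroids:
            best = min(range(len(centroids)),
                       key=lambda i: dist2(p, centroids[i][0]))
            d2 = dist2(p, centroids[best][0])
        if not centroids or rng.random() < min(d2 / f, 1.0):
            centroids.append([list(p), 1])
        else:
            c, w = centroids[best]
            for j in range(len(c)):  # weighted running mean
                c[j] = (c[j] * w + p[j]) / (w + 1)
            centroids[best][1] = w + 1
    return centroids
```

The sketch is tiny compared to the data, and because the centroids are weighted, an ordinary (ball) k-means run over the sketch recovers clusters of the original distribution.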
An Example
Streaming k-means Ideas
• By using a sketch with lots (k log N) of centroids, we avoid pathological cases
• We still get a very good result if the sketch is created – in one pass– with approximate search
• In fact, adaptive dp-means works just fine
• In the end, the sketch can be used for clustering or …
Lesson 3: Sketches make big data small.
Example 4: Search Abuse
Recommendations
Alice got an apple and a puppy
Charles got a bicycle
Bob got an apple
What else would Bob like?
Log Files
(figure of raw per-user log entries for Alice, Bob, and Charles omitted)
History Matrix: Users by Items
Alice:   ✔ ✔ ✔
Bob:     ✔ ✔
Charles: ✔ ✔
(rows are users, columns are items; ✔ marks which items each user got)
Co-occurrence Matrix: Items by Items
(matrix of co-occurrence counts omitted; each entry counts how many user histories contain both items)
How do you tell which co-occurrences are useful?
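The deck doesn't spell out the test on this slide, but the standard answer in this line of work is Dunning's log-likelihood ratio (G²): score each item pair by how surprising its co-occurrence count is given a 2×2 contingency table of user counts. A sketch of that computation:

```python
import math

def entropy_term(counts):
    """Sum of p*log(p) over nonzero counts (negative Shannon entropy)."""
    total = sum(counts)
    return sum((k / total) * math.log(k / total) for k in counts if k > 0)

def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table.

    k11 = users with both items, k12 = item A without B,
    k21 = item B without A, k22 = neither.  Large values mean the
    co-occurrence is anomalous, i.e. a useful indicator.
    """
    table = [k11, k12, k21, k22]
    n = sum(table)
    rows = [k11 + k12, k21 + k22]
    cols = [k11 + k21, k12 + k22]
    return 2.0 * n * (entropy_term(table)
                      - entropy_term(rows)
                      - entropy_term(cols))
```

Co-occurrences whose LLR clears a threshold are kept; the surviving entries form the indicator matrix described next. Counts consistent with independence score near zero, so merely popular items don't flood the indicators.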
Indicator Matrix: Anomalous Co-Occurrence
Result: The marked row will be added to the indicator field in the item document…
Indicator Matrix
id: t4
title: puppy
desc: The sweetest little puppy ever.
keywords: puppy, dog, pet
indicators: (t1)
That one row from the indicator matrix becomes the indicator field in the Solr document used to deploy the recommendation engine.
Note: data for the indicator field is added directly to the metadata for a document in the Solr index. You don’t need to create a separate index for the indicators.
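Deployment then reduces to an ordinary search query against that field: take the item ids from the user's recent history and query the indicator field with them. A hypothetical request fragment (the field name `indicators` follows the example document above; the item ids `t3` and `t7` are made up for illustration) might look like:

```
q=indicators:(t1 t3 t7)&fl=id,title,score
```

Solr's relevance scoring does the rest: items whose indicators overlap most strongly with the user's history rank highest.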
Internals of the Recommender Engine
Real-life example
Lesson 4: Recursive search abuse pays.
Search can implement recommendations, which can implement search.
How Does This Apply?
How Can I Start?
Q & A
@ted_dunning @mapr maprtech