which car fits my life? - inovex · deepstack: expert-level artificial intelligence in heads-up...
TRANSCRIPT
Which car fits my life?Mobile.de’s approach to recommendations
PyData Berlin, July 1st, 2017Florian Wilhelm, Arnab Dutta
2
Introduction
Dr. Florian Wilhelm
Data Scientist
inovex GmbH
� @FlorianWilhelm
� FlorianWilhelm
florianwilhelm.info
Dr. Arnab Dutta
Data Scientist
mobile.de GmbH
� @kopfhohen
� kraktoso
3
Outline
§Introduction§Use-cases§Theory§Our Approach§Example§Outlook
4
MOBILE.DEGERMAN MARKET LEADER
13.5 MIO UNIQUE USER PER MONTH
1.6 MIO VEHICLES
290EMPLOYEES
DREILINDEN / FRIEDRICHSHAIN BERLINHEADQUARTERS
Part ofebay Tech
5
IT-project house for digital transformation:‣ Agile Development & Management‣ Web · UI/UX · Replatforming · Microservices‣ Mobile · Apps · Smart Devices · Robotics‣ Big Data & Business Intelligence Platforms‣ Data Science · Data Products · Search · Deep Learning‣ Data Center Automation · DevOps · Cloud · Hosting‣ Trainings & Coachings
Using technology to inspire our clients. And ourselves.
inovex offices inKarlsruhe · Pforzheim · München · Köln · Hamburg.
www.inovex.de
6
Outline
§Introduction§Use-cases§Theory§Our Approach§Example§Outlook
7
Why Recommendations?Why Recommendations?
Show width of offering
Inspiration
Engagement
8
- - engagement- - inspiration- - relevance
Why Recommendations?
- - high click-through-rate - - small exit- & bounce-rates
User Benefits
Business Benefits
9
Mobile.de Conversion Funnel
WishlistHome Search Result Page
View Contact Buy
10
Recommendations on Home
Home
Recommendations based on preferencesof visiting users as an alternative entry point.
WishlistHome SRP View Contact Buy
11
Recommendations on Search Results Page
Recommendations basedon similar vehicle makeand model id to presentalternatives
WishlistHome SRP View Contact Buy
12
Recommendations on View Item Page
VIP
Recommendations based on the specific make and model a user is viewing to present alternatives
WishlistHome SRP View Contact Buy
13
Recommendations on your Wishlist
Recommendations based on the specific make and model of a deleted ad to provide almost identical recommendations
Recommendations based on theusers car preferences and the parking lot items.
WishlistHome SRP View Contact Buy
14
§Introduction§Use-cases§Theory§Collaborative Filtering§Content Based§Our Approach§Example§Outlook
15
Collaborative Filtering
items similar
16
Watched / Rated
Unwatched
Item-Item Similarity
??
??
????Item-based Recommendations
Cosine Similarity
17
Recommendation
Item-based Recommendations
P
P
P
P
P P
Wishlist
18
Summary of Collaborative Filtering✓Collective behaviour of users
✓Standard-Method (it works, it’s reliable etc.)
� Cold Start Problem: New listings need a certain number of clicks to be recommended.
� Sparsity problems: lot fewer interactiondata points than total items and users.
� Content agnostic
� Only “batch-based” learning
19
Looking For: Used Car (100%)
Prefers (Make): BMW (50%), Audi (50%)Prefers (Model): Audi A3 (25%), Audi A4 (25%),
BMW 318 (50%)Searching In: lat 52.5206, lon 13.409Search Radius: 300kmPreferred Price: 20 000€ ± 1500€ Preferred Mileage: 10 000km ± 5000km
User Preferences
Marketing
Anonymous
Content-based Filtering: User Preferences
20
Content-based Filtering
interacted
<Price: 10K, Category: small>
<Price: 6K, Category: small>
<Price: 90K, Category: sports>
<Price: 10K, Category: small>
recommend
21
Summary of Content-basedüWorks even if there are no other
users
ücontent-based preferences ofusers based on a weighted vectorof item features
� Hard to do recommendations fornew users (cold start problem)
� Non-applicable for heterogenouscontent types
� Low diversity, i.e. more of thesame
22
§Introduction§Use-cases§Theory§Our Approach§Example§Outlook
23
Find the car that perfectly fits your life
User’s Car Preferences Car Pool + Attributes(make, model, color, price, …)
Flexible(cold-start, uncertainty, real-time, ...)
Interactions of other users(views, parkings, contacts)
24
Hybrid Recommender
Collaborative Filtering
Hybrid Recommender
Contentbased
• no cold-start problem of new items
• integrate new user events in real-time
• robust and reliable concepts
• easy to tune for different use-cases
• comprehensible and debuggable
25
RecommendationEngine
User PreferenceService API
User EventTracking & Storage
RecommendationEngine
RecommendationEngine
User PreferenceComputation &
Storage
All Listings
User Preference+ Recommendation Architecture
26
Hybrid Recommender Concept
PPP P
P
Looking For: Used Car (100%)
Prefers (Make): BMW (50%), Audi (50%)Prefers (Model): Audi A3 (25%), Audi A4 (25%),
BMW 318 (50%)
Searching In: lat 52.5206, lon 13.409Search Radius: 300kmPreferred Price: 20 000€ ± 1500€Preferred Mileage: 10 000km ± 5000km
User Profile
BuyerLast Action: YesterdayFrequent User
User 12345
Likelihood to buy: 88 %
Elastic Search Query
Score0.8
0.3
1.7
3.2
1.1
0.9
……
27
§Introduction§Use-cases§Theory§Our Approach§Example§Outlook
28
Finding similar make/models
§Users are often uncertainwith their choices in orientation phase
§Help users explore similarmodels to make informeddecisions
§Exploit the collaborativeaspect in defining theconcept of similarity
29
Make/Model Recommender with LightFM
LightFM:§Matured and well documented Python package§Optimized and parallelized with Cython§Hybrid recommender based on matrix factorisation§Supports Learning-to-Rank objectives (BPR, WARP)
Audi A4 BMW 645 similar vehicles
30
Non-negative Matrix Factorisation (NMF)LF1 LF2
LF1
LF2
M (|U| x |I|) x R (|LF| x |I|)
= X
= L (|U| x |LF|)
31
NMF: Embeddings to Similarities
LF1
LF2
• car is represented as an item embedding
• Given 2 embeddings, LFitem_i and Lfitem_j_
• Compute sim(LFitem_i , LFitem_j)
• Find pairwise values for all item pairs (|I| x |I|)
item embeddings
32
NMF: Persisting Item similarities
33
Method§ best of both the worlds§ robust
Business§ higher CTR§ lesser exits rates
User§ engagement§ diversity
hybrid
34
§Introduction§Motivation & Use-cases§Theory§Implementation§Outlook
35
Deep Learning
Dermatologist-level classification of skin cancer with deep neural networks(Nature 542, 115-118, February 2017)
DeepStack: Expert-level artificial intelligence in heads-up no-limit poker(Science, March 2017)
Recent Breakthroughs in Deep Learning Reasons for Deep Learning
• captures nonlinear relations
• holistic approach, i.e. reduces number of components possibly
• less feature engineering
• possibly improved quality
36
Approach: Wide and Deep Model
mileagepricecolor
history
mileagepricecolor
views
...
0.38
0.250.79
...0
10
...0
0.20.8
...
...
enco
deen
code
enco
deen
code
...
0.35
-0.152.03
cont.
cat.
cont.
cat.us
erem
bedd
ings
item
em
bedd
ings
Outp
ut
cros
s-ite
m
Probability that user X likes vehicle Y
Deep Component
Wide Component
one-
hot
37
Recommendations support users to find the perfect vehicle based on their preferences and by collaboration
38
Any questions?lusion