learning to rank fulltext results from clicks
DESCRIPTION
TRANSCRIPT
Learning to rank fulltext results from clicks
Tomáš Kramár
@tkramar @synopsitv
Let's build a fulltext search engine.

Query → Find matches → Rank results

Find matches:
● ElasticSearch
● LIKE %%
● ...

Rank results:
● By number of hits
● By PageRank
● By Date
● ...
How do you choose relevant results?
                             Document 1      Document 2
Number of keywords in title  2               2
Number of keywords in text   2               0
Domain                       carreerjet.sk   vienna-rb.at
Category                     Job search      Programming
Language                     Slovak          English
Document feature          How much I care about it (the higher, the more I care)
# keywords in title        2.1
# keywords in text         1
Domain is carreerjet.sk   -2
Domain is vienna-rb.at     3.5
Category is Job Search    -1
Category is Programming    4.2
Language is Slovak         0.9
Language is English        1.5
Document feature          Weight   Doc 1   Doc 2
# keywords in title        2.1     2       2
# keywords in text         1       2       0
Domain is carreerjet.sk   -2       1       0
Domain is vienna-rb.at     3.5     0       1
Category is Job Search    -1       1       0
Category is Programming    4.2     0       1
Language is Slovak         0.9     1       0
Language is English        1.5     0       1

rank = d . u                       = 4.1   = 13.4
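The ranking above is just a dot product of each document's feature vector with the preference weights. A minimal sketch (variable names are mine; values are the ones from the table):

```python
def rank(d, u):
    """rank = d . u: dot product of document features and user preferences."""
    return sum(di * ui for di, ui in zip(d, u))

# Preference weights u, in the same feature order as the table.
u = [2.1, 1, -2, 3.5, -1, 4.2, 0.9, 1.5]

# Feature vectors for the two example documents.
careerjet_doc = [2, 2, 1, 0, 1, 0, 1, 0]
vienna_rb_doc = [2, 0, 0, 1, 0, 1, 0, 1]

print(round(rank(careerjet_doc, u), 1))  # 4.1
print(round(rank(vienna_rb_doc, u), 1))  # 13.4
```

The second document wins by a wide margin because the high-weight features (Programming, vienna-rb.at, English) are all set.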
Rate each result on a scale 1-5.
rating = d . u = d1 . u1 + d2 . u2 + ... + dn . un

d1,1 . u1 + d1,2 . u2 + ... + d1,n . un = 3
d2,1 . u1 + d2,2 . u2 + ... + d2,n . un = 5
d3,1 . u1 + d3,2 . u2 + ... + d3,n . un = 1
d4,1 . u1 + d4,2 . u2 + ... + d4,n . un = 3

The di,j are known; solve this system of equations and you have u. Done.
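With more rated documents than features, a system like this usually has no exact solution, so in practice you would take the least-squares fit. A quick sketch with made-up toy numbers (mine, not from the talk), using NumPy:

```python
import numpy as np

# Rows of D are document feature vectors d_i; ratings are explicit 1-5 scores.
# Toy values: 4 rated documents, 3 features.
D = np.array([
    [2.0, 1.0, 0.0],
    [1.0, 3.0, 1.0],
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
])
ratings = np.array([3.0, 5.0, 1.0, 3.0])

# No exact solution exists here, so find u minimizing ||D u - ratings||^2.
u, residuals, matrix_rank, _ = np.linalg.lstsq(D, ratings, rcond=None)
print(u)  # learned preference weights
```

This is the batch, closed-form version of the problem; the gradient-descent approach below gets to the same place incrementally.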
Except...
● You don't know the explicit ratings
● User preferences change over time
● Those equations probably don't have a solution
Clicked! Assume rating 1.
Not clicked. Assume rating 0.
Approximation function h(d): d → rank
h(d) = d1 . u1 + ... + dn . un = estimated_rank

If the function is good, it should make minimal errors:
error = (estimated_rank - real_rank)^2
Gradient descent
1. Set user preferences (u) to arbitrary values
2. Calculate the estimated rank h(d) for each document
3. Calculate the mean square error
4. Adjust preferences u in a way that minimizes the error
5. Repeat until the error converges
[Plot: the cost function, mean square error, as a function of a single preference weight u ("# of keywords in title")]
Calculate the derivative of the cost function at this point and it will give you the direction to move in.
Preference update:

ui = ui - α . ∂E/∂ui

α: the learning rate. How fast you will move. Too low: slow progress. Too high: you will overshoot.

∂E/∂ui: the partial derivative of the cost function E with respect to ui. Nothing scary; you can find these online for standard cost functions. For mean square error the update works out to:

ui = ui + α . (rank(d) - h(d)) . di
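Putting the five steps and the update rule together, a minimal sketch of the learning loop (feature vectors and 0/1 click ratings below are made-up toy data, not from the talk):

```python
import random

random.seed(0)  # deterministic toy run

def h(d, u):
    """Estimated rank: dot product of document features and preferences."""
    return sum(di * ui for di, ui in zip(d, u))

def learn_preferences(docs, ratings, alpha=0.05, epochs=500):
    """Gradient descent on mean square error.

    docs    -- list of document feature vectors d
    ratings -- observed rank per document (here: 1 = clicked, 0 = not clicked)
    """
    n = len(docs[0])
    u = [random.uniform(-0.1, 0.1) for _ in range(n)]  # step 1: arbitrary values
    for _ in range(epochs):                            # step 5: repeat
        for d, rating in zip(docs, ratings):
            error = rating - h(d, u)                   # steps 2-3: estimate and error
            for i in range(n):
                u[i] += alpha * error * d[i]           # step 4: ui += α·(rank - h(d))·di
    return u

# Toy click data: 4 documents, 3 features each.
docs = [[1, 0, 2], [0, 1, 1], [1, 1, 0], [0, 0, 3]]
clicks = [1, 0, 1, 0]  # clicked -> rating 1, not clicked -> rating 0

u = learn_preferences(docs, clicks)
mse = sum((r - h(d, u)) ** 2 for d, r in zip(docs, clicks)) / len(docs)
print(u, mse)  # mse shrinks toward zero as the loop converges
```

This is stochastic gradient descent (one update per document); a batch version would average the gradient over all documents before each update.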
Clicked! Assume rating 1.
Clicked! Assume rating 1. Or? Doesn't this mean result #1 is not relevant?
Clicked! Assume nothing.
Clicked! Assume it is better than #2 and #3.
What's changed?
We no longer have ratings, just document comparisons.
Cost function: something that considers ordering, e.g., Kendall's τ (the number of concordant and discordant pairs).
h is now a function of 2 parameters: h(d1, d2). But you can just take d2 - d1 and learn on that.
d4 > d3
d4 > d2
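A sketch of that pairwise trick: for each click-derived comparison da > db, learn on the difference vector da - db, pushing u so the better document scores higher. The documents and the simple perceptron-style update below are my own toy illustration, not the talk's exact method:

```python
import random

def score(d, u):
    """Pointwise score: dot product of features and preferences."""
    return sum(di * ui for di, ui in zip(d, u))

def learn_pairwise(preferences, n_features, alpha=0.1, epochs=100):
    """preferences: list of (better, worse) feature-vector pairs.

    For each pair we learn on x = better - worse: we want u . x > 0,
    i.e. the preferred document should get the higher score.
    """
    random.seed(0)
    u = [random.uniform(-0.1, 0.1) for _ in range(n_features)]
    for _ in range(epochs):
        for better, worse in preferences:
            x = [bi - wi for bi, wi in zip(better, worse)]
            if score(x, u) <= 0:  # pair currently ranked the wrong way round
                u = [ui + alpha * xi for ui, xi in zip(u, x)]
    return u

# Toy documents d2, d3, d4 and the click-derived comparisons d4 > d3, d4 > d2.
d2 = [1.0, 0.0, 2.0]
d3 = [0.0, 2.0, 1.0]
d4 = [2.0, 1.0, 0.0]

u = learn_pairwise([(d4, d3), (d4, d2)], n_features=3)
print(score(d4, u) > score(d3, u))  # True: d4 now outranks d3
print(score(d4, u) > score(d2, u))  # True: and d2
```

Once learned, u is still a plain preference vector, so ranking at query time stays a single dot product per document; only the training signal changed from ratings to pairs.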