learning to rank fulltext results from clicks
DESCRIPTION
TRANSCRIPT
Learning to rank fulltext results from clicks
Tomáš Kramár
@tkramar @synopsitv
Let's build a fulltext search engine.

Query → Find matches → Rank results

Find matches:
● ElasticSearch
● LIKE %%
● ...

Rank results:
● By number of hits
● By PageRank
● By Date
● ...
How do you choose relevant results?
                             Document 1      Document 2
Number of keywords in title  2               2
Number of keywords in text   2               0
Domain                       carreerjet.sk   vienna-rb.at
Category                     Job search      Programming
Language                     Slovak          English
Document feature          How much I care about it (the higher, the more I care)
# keywords in title        2.1
# keywords in text         1
Domain is carreerjet.sk   -2
Domain is vienna-rb.at     3.5
Category is Job Search    -1
Category is Programming    4.2
Language is Slovak         0.9
Language is English        1.5
Document feature          Weight   Doc 1   Doc 2
# keywords in title        2.1     2       2
# keywords in text         1       2       0
Domain is carreerjet.sk   -2       1       0
Domain is vienna-rb.at     3.5     0       1
Category is Job Search    -1       1       0
Category is Programming    4.2     0       1
Language is Slovak         0.9     1       0
Language is English        1.5     0       1

rank = d . u                       = 4.1   = 13.4
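The ranking above is just a dot product of each document's feature vector with the preference weights. A minimal sketch (variable names are mine; values are the ones from the table):

```python
def rank(d, u):
    """rank = d . u: dot product of document features and user preferences."""
    return sum(di * ui for di, ui in zip(d, u))

# Preference weights u, in the same feature order as the table.
u = [2.1, 1, -2, 3.5, -1, 4.2, 0.9, 1.5]

# Feature vectors for the two example documents.
careerjet_doc = [2, 2, 1, 0, 1, 0, 1, 0]
vienna_rb_doc = [2, 0, 0, 1, 0, 1, 0, 1]

print(round(rank(careerjet_doc, u), 1))  # 4.1
print(round(rank(vienna_rb_doc, u), 1))  # 13.4
```

The second document wins by a wide margin because the high-weight features (Programming, vienna-rb.at, English) are all set.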
Rate each result on a scale 1-5.
rating = d . u = d1 . u1 + d2 . u2 + ... + dn . un

d1,1 . u1 + d1,2 . u2 + ... + d1,n . un = 3
d2,1 . u1 + d2,2 . u2 + ... + d2,n . un = 5
d3,1 . u1 + d3,2 . u2 + ... + d3,n . un = 1
d4,1 . u1 + d4,2 . u2 + ... + d4,n . un = 3

The di,j are known; solve this system of equations and you have u. Done.
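With more rated documents than features, a system like this usually has no exact solution, so in practice you would take the least-squares fit. A quick sketch with made-up toy numbers (mine, not from the talk), using NumPy:

```python
import numpy as np

# Rows of D are document feature vectors d_i; ratings are explicit 1-5 scores.
# Toy values: 4 rated documents, 3 features.
D = np.array([
    [2.0, 1.0, 0.0],
    [1.0, 3.0, 1.0],
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
])
ratings = np.array([3.0, 5.0, 1.0, 3.0])

# No exact solution exists here, so find u minimizing ||D u - ratings||^2.
u, residuals, matrix_rank, _ = np.linalg.lstsq(D, ratings, rcond=None)
print(u)  # learned preference weights
```

This is the batch, closed-form version of the problem; the gradient-descent approach below gets to the same place incrementally.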
Except...
● You don't know the explicit ratings
● User preferences change over time
● Those equations probably don't have a solution
Clicked! Assume rating 1.
Not clicked. Assume rating 0.
Approximation function h(d): d → rank
h(d) = d1 . u1 + ... + dn . un = estimated_rank

If the function is good, it should make minimal errors:
error = (estimated_rank - real_rank)^2
Gradient descent
1. Set user preferences (u) to arbitrary values
2. Calculate the estimated rank h(d) for each document
3. Calculate the mean square error
4. Adjust preferences u in a way that minimizes the error
5. Repeat until the error converges
[Plot: the cost function, mean square error, as a function of a single preference weight u ("# of keywords in title")]
Calculate the derivative of the cost function at this point and it will give you the direction to move in.
Preference update:

ui = ui - α . ∂E/∂ui

α: the learning rate. How fast you will move. Too low: slow progress. Too high: you will overshoot.

∂E/∂ui: the partial derivative of the cost function E with respect to ui. Nothing scary; you can find these online for standard cost functions. For mean square error the update works out to:

ui = ui + α . (rank(d) - h(d)) . di
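Putting the five steps and the update rule together, a minimal sketch of the learning loop (feature vectors and 0/1 click ratings below are made-up toy data, not from the talk):

```python
import random

random.seed(0)  # deterministic toy run

def h(d, u):
    """Estimated rank: dot product of document features and preferences."""
    return sum(di * ui for di, ui in zip(d, u))

def learn_preferences(docs, ratings, alpha=0.05, epochs=500):
    """Gradient descent on mean square error.

    docs    -- list of document feature vectors d
    ratings -- observed rank per document (here: 1 = clicked, 0 = not clicked)
    """
    n = len(docs[0])
    u = [random.uniform(-0.1, 0.1) for _ in range(n)]  # step 1: arbitrary values
    for _ in range(epochs):                            # step 5: repeat
        for d, rating in zip(docs, ratings):
            error = rating - h(d, u)                   # steps 2-3: estimate and error
            for i in range(n):
                u[i] += alpha * error * d[i]           # step 4: ui += α·(rank - h(d))·di
    return u

# Toy click data: 4 documents, 3 features each.
docs = [[1, 0, 2], [0, 1, 1], [1, 1, 0], [0, 0, 3]]
clicks = [1, 0, 1, 0]  # clicked -> rating 1, not clicked -> rating 0

u = learn_preferences(docs, clicks)
mse = sum((r - h(d, u)) ** 2 for d, r in zip(docs, clicks)) / len(docs)
print(u, mse)  # mse shrinks toward zero as the loop converges
```

This is stochastic gradient descent (one update per document); a batch version would average the gradient over all documents before each update.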
Clicked! Assume rating 1.
Clicked! Assume rating 1. Or? Doesn't this mean result #1 is not relevant?
Clicked! Assume nothing.
Clicked! Assume it is better than #2 and #3.
What's changed?
We no longer have ratings, just document comparisons.
Cost function: something that considers ordering, e.g., Kendall's τ (the number of concordant and discordant pairs).
h is now a function of 2 parameters: h(d1, d2). But you can just take d2 - d1 and learn on that.
d4 > d3
d4 > d2
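A sketch of that pairwise trick: for each click-derived comparison da > db, learn on the difference vector da - db, pushing u so the better document scores higher. The documents and the simple perceptron-style update below are my own toy illustration, not the talk's exact method:

```python
import random

def score(d, u):
    """Pointwise score: dot product of features and preferences."""
    return sum(di * ui for di, ui in zip(d, u))

def learn_pairwise(preferences, n_features, alpha=0.1, epochs=100):
    """preferences: list of (better, worse) feature-vector pairs.

    For each pair we learn on x = better - worse: we want u . x > 0,
    i.e. the preferred document should get the higher score.
    """
    random.seed(0)
    u = [random.uniform(-0.1, 0.1) for _ in range(n_features)]
    for _ in range(epochs):
        for better, worse in preferences:
            x = [bi - wi for bi, wi in zip(better, worse)]
            if score(x, u) <= 0:  # pair currently ranked the wrong way round
                u = [ui + alpha * xi for ui, xi in zip(u, x)]
    return u

# Toy documents d2, d3, d4 and the click-derived comparisons d4 > d3, d4 > d2.
d2 = [1.0, 0.0, 2.0]
d3 = [0.0, 2.0, 1.0]
d4 = [2.0, 1.0, 0.0]

u = learn_pairwise([(d4, d3), (d4, d2)], n_features=3)
print(score(d4, u) > score(d3, u))  # True: d4 now outranks d3
print(score(d4, u) > score(d2, u))  # True: and d2
```

Once learned, u is still a plain preference vector, so ranking at query time stays a single dot product per document; only the training signal changed from ratings to pairs.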