learning to rank data2day 2017

Post on 21-Jan-2018

Learning to Rank

Stefan Kühn

Join me on XING

data2day Heidelberg - September 28th, 2017

Stefan Kühn (XING) Ranking 28.09.2017 1 / 30

Contents

1 Rankings and Humans

2 Ranking and Machine Learning

3 Formalizing Ranking Problems

4 Rankings and Recommender Systems

1 Rankings and Humans

Rankings in Everyday Life

- TODO lists
- Prioritized backlogs
- Top X songs/movies/...
- You get the idea...

Rankings in History

It all started with

Rankings Nowadays

German States by Employee Happiness (according to Kununu)

Rankings, Heuristics, Decisions

- Rankings are about comparisons
- Rankings are about decision-making
- Some heuristics are about both

Recognition Heuristic
If one of two objects is recognized and the other is not, then infer that the recognized object has the higher value with respect to the criterion.
Proposed by Gigerenzer and Goldstein, built upon the great work of Kahneman and Tversky.

2 Ranking and Machine Learning

Learning

Is Ranking a Machine Learning Problem?

Machine Learning Concepts

Supervised - Learning from Labels
Figure out how to generate correct labels using the given data
- Classification
- Regression

Unsupervised - Learning from Data
Identify hidden/inherent structure using the given data
- Clustering
- Dimensionality Reduction / Manifold Learning
- Outlier Detection

Supervised versus Unsupervised

Learning to Rank
Figure out how to generate a good ranking using the given data

What about Learning to Rank = Machine-Learned Ranking or MLR?
1 Supervised because ranks are like labels?
2 Unsupervised because ranks are typically based on implicit feedback, i.e. latent/hidden/inherent structure?
3 Mixed/intermediate/something else?
4 Ill-posed question?

Could you please rank these options according to whatever you think is appropriate?

And by the way, how did you do it?

Example: XING Stream

How to order News?

- By time?
- By content/topic?
- By popularity?
- By clicking probability?

Every choice changes the problem to solve while the result set is always the same - a ranked list of items. Every choice represents a different distance measure / objective function to minimize.
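To make this concrete, here is a minimal sketch in which the same three items are ranked under three of the criteria above. The `NewsItem` fields and all numbers are made up for illustration, and the click probability is assumed to come from some model that is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NewsItem:
    title: str
    age_hours: float          # time since publication
    likes: int                # popularity signal
    click_probability: float  # assumed output of some click model


# Hypothetical stream content; the numbers are invented.
items = [
    NewsItem("A", age_hours=1.0, likes=3, click_probability=0.02),
    NewsItem("B", age_hours=5.0, likes=250, click_probability=0.05),
    NewsItem("C", age_hours=30.0, likes=40, click_probability=0.11),
]


def rank(items, score):
    """Return the titles ordered by descending score."""
    return [item.title for item in sorted(items, key=score, reverse=True)]


# Same items, same output type (a ranked list), three different objectives.
by_time = rank(items, lambda item: -item.age_hours)
by_popularity = rank(items, lambda item: item.likes)
by_click_probability = rank(items, lambda item: item.click_probability)

print(by_time, by_popularity, by_click_probability)
```

Each scoring function yields a different order for the same set of items, which is the point of the slide: choosing the criterion is choosing the problem.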

3 Formalizing Ranking Problems

Ranking - Problem Formulation

Items x ∈ X

Ordered Labels or Ranks 1 > 2 > ... > k > ...

Ranking rule f that allows us to do the following:
- Input: Unordered subset {x, y, z, ...} ⊆ X
- Output: Ordered list, e.g. y > z > x > ...

Example: Text search
- Items: Set of documents
- Ranking rule f: Similarity measure for documents and search terms
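A minimal sketch of the text search example: the ranking rule `f` takes an unordered set of documents and returns an ordered list. The similarity used here (Jaccard overlap of lowercase terms) is only a toy stand-in chosen for brevity; the slides do not prescribe a particular measure.

```python
def similarity(query: str, document: str) -> float:
    """Toy similarity: Jaccard overlap between query terms and document terms."""
    query_terms = set(query.lower().split())
    document_terms = set(document.lower().split())
    if not query_terms or not document_terms:
        return 0.0
    return len(query_terms & document_terms) / len(query_terms | document_terms)


def f(query, documents):
    """Ranking rule: order an unordered collection of documents by similarity."""
    return sorted(documents, key=lambda doc: similarity(query, doc), reverse=True)


# Unordered input subset of X (a Python set has no meaningful order).
documents = {
    "learning to rank for search",
    "cooking pasta at home",
    "rank documents for a search query",
}

ranking = f("search rank", documents)
for position, document in enumerate(ranking, start=1):
    print(position, document)
```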

Ranking and Level of Measurement

Supervised Learning Problems
- Classification - Nominal Scale - Class Labels
- Ranking - Ordinal Scale - Ranks
- Regression - Interval Scale - Real Values

Ranking is the task of predicting labels on an ordinal scale.

Informally: Learn ordering from labeled training data - typically ordered lists of items - and try to predict ordering for new sets of items.

What is special about this?
Ordering is context-dependent. One additional item (or one item less) can change all other ranks. This is clearly different compared to regression and classification.
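The context dependence can be illustrated with a small sketch. The item names and scores are invented; the point is that no individual score changes, yet every rank does once a single item is added.

```python
def ranks(scores):
    """Map each item to its rank (1 = best) given a dict of item -> score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {item: position for position, item in enumerate(ordered, start=1)}


scores = {"x": 0.2, "y": 0.9, "z": 0.5}
before = ranks(scores)

# Add one item; the scores of x, y and z stay exactly the same.
scores_with_new_item = dict(scores, w=1.5)
after = ranks(scores_with_new_item)

print(before)  # ranks within {x, y, z}
print(after)   # ranks within {x, y, z, w}
```

A regression target or a class label for `x` would be unaffected by the presence of `w`; its rank is not.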

Ranking in Information Retrieval

CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=518546

Ranking - Pointwise

Approach Characteristics
- Input: Single items
- Evaluation: Scoring function evaluated for each point/item
- Optimization: Loss function derived from individual scores

Reduces Ranking Problem to either
- Regression
- Classification
- Ordinal Regression
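As a rough sketch of the regression variant: a linear scoring function is fitted to individual (features, relevance) pairs with a squared loss, and new items are then ranked by sorting their predicted scores. The two features, the relevance grades, the learning rate and the helper names are all assumptions made for this illustration.

```python
def score(weights, features):
    """Linear scoring function evaluated on a single item."""
    return sum(w * x for w, x in zip(weights, features))


def fit_pointwise(features, labels, learning_rate=0.1, epochs=500):
    """Fit the weights by stochastic gradient descent on the per-item squared loss."""
    weights = [0.0] * len(features[0])
    for _ in range(epochs):
        for x, y in zip(features, labels):
            error = score(weights, x) - y
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
    return weights


# Training data: one row per item; the list an item came from is ignored.
train_features = [[1.0, 0.0], [0.8, 0.1], [0.3, 0.5], [0.1, 0.9], [0.0, 1.0]]
train_relevance = [2.0, 2.0, 1.0, 0.0, 0.0]
weights = fit_pointwise(train_features, train_relevance)

# Ranking new items means sorting by their individual predicted scores.
candidates = {"a": [0.2, 0.8], "b": [0.9, 0.1], "c": [0.5, 0.5]}
ranking = sorted(candidates, key=lambda name: score(weights, candidates[name]), reverse=True)
print(ranking)
```

Note that the loss never looks at positions or at other items in the same list, which is where the problems of this approach come from.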

Ranking - Pointwise

Image taken from Tie-Yan Liu @ WWW 2009 Tutorial on Learning to Rank
http://wwwconference.org/www2009/pdf/T7A-LEARNING TO RANK TUTORIAL.pdf

Ranking - Pointwise

Problems with the Pointwise Approach

- Length of item lists can differ significantly
  Example: There are more websites related to the search term Online (approx. 10 billion) than to Offline (approx. 666 million)
- Position of items on list is not taken into account
  Example: Incorrect ordering of the top 10 results will have a slightly bigger impact than errors/inversions below position 123456789

Consequence
Longer lists will dominate the optimization, while actually the shorter lists are more important for humans/customers.

Advantages

If all individual scores are known, all possible Rankings are determined.
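The dominance of long lists can be seen with a few lines of arithmetic. In this sketch two hypothetical queries have exactly the same per-item error, but one has 1000 items and the other 10. Averaging the loss per list, shown at the end, is one possible remedy; it is an assumption of this example and not something the slides propose.

```python
def pointwise_loss(predictions, labels):
    """Sum of squared errors over individual items."""
    return sum((p - y) ** 2 for p, y in zip(predictions, labels))


# Two hypothetical queries; every prediction is off by 0.5.
long_labels = [1.0] * 1000
short_labels = [1.0] * 10
long_loss = pointwise_loss([y - 0.5 for y in long_labels], long_labels)
short_loss = pointwise_loss([y - 0.5 for y in short_labels], short_labels)

# Share of the total loss (and hence of the gradient signal) from the long list.
share_of_long_list = long_loss / (long_loss + short_loss)
print(f"long list contributes {share_of_long_list:.1%} of the summed loss")

# Assumed remedy: average per list so each query counts equally.
long_mean = long_loss / len(long_labels)
short_mean = short_loss / len(short_labels)
```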

Ranking - Pairwise

Approach Characteristics
- Input: Pairs of items
- Evaluation: Preference function evaluated for each pair - binary classification
- Optimization: Pairwise classification loss derived from all pairings, weighted majority voting

Reduces Ranking Problem to
- Binary (or pairwise) Classification
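A minimal sketch of the reduction to binary classification: every pair of items with different relevance becomes one training example on the feature difference, labeled +1 or -1. The learner is a plain perceptron, used here only because it is short; it is not RankNet or Ranking SVM. The final ranking counts pairwise wins, an unweighted simplification of the voting step, and all data are invented.

```python
from itertools import combinations


def make_pairs(items):
    """Turn (features, relevance) items into (difference vector, +1/-1) examples."""
    pairs = []
    for (xa, ra), (xb, rb) in combinations(items, 2):
        if ra == rb:
            continue  # no preference between equally relevant items
        difference = [a - b for a, b in zip(xa, xb)]
        pairs.append((difference, 1 if ra > rb else -1))
    return pairs


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_pairwise(pairs, n_features, epochs=50, learning_rate=0.1):
    """Perceptron updates on every misordered (or undecided) pair."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for difference, label in pairs:
            if label * dot(weights, difference) <= 0:
                weights = [w + learning_rate * label * d for w, d in zip(weights, difference)]
    return weights


def prefers(weights, xa, xb):
    """Preference function: True if item a should be ranked above item b."""
    return dot(weights, [a - b for a, b in zip(xa, xb)]) > 0


def rank_by_wins(weights, candidates):
    """Order candidates by the number of pairwise comparisons they win."""
    wins = {
        name: sum(prefers(weights, x, other)
                  for other_name, other in candidates.items() if other_name != name)
        for name, x in candidates.items()
    }
    return sorted(candidates, key=wins.get, reverse=True)


training_items = [([1.0, 0.2], 2), ([0.6, 0.4], 1), ([0.1, 0.9], 0)]
pairs = make_pairs(training_items)
weights = train_pairwise(pairs, n_features=2)

candidates = {"a": [0.2, 0.8], "b": [0.9, 0.1], "c": [0.5, 0.5]}
ranking = rank_by_wins(weights, candidates)
print(ranking)
```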

Ranking - Pairwise

Image taken from Tie-Yan Liu @ WWW 2009 Tutorial on Learning to Rank
http://wwwconference.org/www2009/pdf/T7A-LEARNING TO RANK TUTORIAL.pdf

Ranking - Pairwise

Problems with the Pairwise Approach

- Length of item lists can differ significantly
- Number of pairs depends quadratically on the length of the list
- Even bigger imbalance w.r.t. list length

Advantages

Comparing pairs of elements is a much more natural approach to Ranking than Regression or Classification.
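The quadratic growth is easy to quantify: a list of n items yields n(n-1)/2 pairs. The list lengths below are arbitrary examples.

```python
from math import comb


def number_of_pairs(list_length):
    """Number of unordered item pairs in a list: n * (n - 1) / 2."""
    return comb(list_length, 2)


for n in (10, 100, 1000):
    print(n, number_of_pairs(n))

# A list that is 100 times longer contributes about 10,000 times more pairs.
imbalance = number_of_pairs(1000) / number_of_pairs(10)
```

So an imbalance of 100:1 in list length becomes an imbalance of roughly 11,000:1 in training pairs.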

Ranking - Listwise

Approach Characteristics
- Input: Set of items
- Evaluation: Some evaluation metric
- Optimization:
  - Either: Directly optimize the evaluation metric
  - Or: Loss function defined for permutations of the given input

Reduces Ranking Problem to either
- Direct Optimization of Evaluation Metric
- Listwise Loss Optimization (Distance between lists is non-trivial)
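As one possible instance of a listwise loss, the sketch below uses the top-one probability idea known from ListNet: both the relevance labels and the predicted scores of a whole list are turned into probability distributions with a softmax, and the loss is the cross-entropy between them. ListNet is picked here only as an illustration; the slides do not single out a particular loss, and the numbers are invented.

```python
import math


def softmax(values):
    """Numerically stable softmax over one list of values."""
    largest = max(values)
    exponentials = [math.exp(v - largest) for v in values]
    total = sum(exponentials)
    return [e / total for e in exponentials]


def listwise_loss(predicted_scores, relevance):
    """Cross-entropy between top-one distributions of labels and predictions."""
    target = softmax(relevance)
    predicted = softmax(predicted_scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


relevance = [3.0, 1.0, 0.0]  # graded relevance for one list of three items

loss_good_order = listwise_loss([2.5, 0.8, 0.1], relevance)  # same order as labels
loss_bad_order = listwise_loss([0.1, 0.8, 2.5], relevance)   # reversed order
loss_perfect = listwise_loss(relevance, relevance)           # lower bound for this list

print(loss_perfect, loss_good_order, loss_bad_order)
```

Unlike the pointwise and pairwise losses, this one is computed once per list, so each list contributes a single term regardless of its length.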

Ranking - Listwise

Image taken from Tie-Yan Liu @ WWW 2009 Tutorial on Learning to Rank
http://wwwconference.org/www2009/pdf/T7A-LEARNING TO RANK TUTORIAL.pdf

Ranking - Listwise

Problems with the Listwise Approach

- Huge complexity issue
- Direct Optimization: Non-smooth functions
- Often only incomplete knowledge about ground truth for lists (only tiny subset available for learning)

Advantages

Positions on lists are visible to the algorithms.
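The non-smoothness of direct optimization can be made visible with a small experiment. The sketch below moves the score of one item continuously while keeping the other two fixed, and records NDCG. The metric only changes when the item overtakes a neighbor, so it is piecewise constant in the score. The relevance grades, the fixed scores and the linear-gain form of DCG are assumptions of this example.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with linear gain and log2 position discount."""
    return sum(rel / math.log2(pos + 1) for pos, rel in enumerate(relevances, start=1))


def ndcg(scores, relevances):
    """DCG of the order induced by the scores, normalized by the ideal DCG."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [relevances[i] for i in order]
    ideal = sorted(relevances, reverse=True)
    return dcg(ranked) / dcg(ideal)


relevances = [3, 2, 0]

# Sweep the score of the most relevant item from 0.00 to 0.99.
values = [ndcg([step / 100, 0.555, 0.255], relevances) for step in range(100)]
distinct_values = sorted(set(round(v, 6) for v in values))

print(distinct_values)
```

One hundred different scores produce only three different metric values, with jumps in between, so the gradient with respect to the score is zero almost everywhere.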

Important Contributions

Natural Language Processing
- tf-idf
- Okapi BM25
- Link to Information Theory

Interesting Nonlinear Evaluation Metrics
- P@k = Precision restricted to the best k items
- MAP = Mean Average Precision
- Discounted Cumulative Gain = DCG

Interesting Non-Standard Objective Functions
- (N)DCG as optimization objective
- non-continuous and non-smooth

Interesting Rankers
- Pointwise: Subset Ranking; McRank; PRanking (Ordinal Regression)
- Pairwise: RankNet; FRank; RankBoost; Ranking SVM
- Listwise: SoftRank; SoftNDCG; SVM-MAP; Structural SVM; AdaRank
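For reference, the evaluation metrics listed above can be sketched in a few lines. The assumptions here are binary relevance flags for P@k and MAP, average precision normalized by the number of relevant items that appear in the list, and DCG with linear gain and a log2 discount. Other conventions exist (for example an exponential gain), and the input lists are invented.

```python
import math


def precision_at_k(relevant_flags, k):
    """Fraction of relevant items among the top k positions."""
    return sum(relevant_flags[:k]) / k


def average_precision(relevant_flags):
    """Mean of P@k taken at each position k that holds a relevant item."""
    hits = 0
    total = 0.0
    for position, flag in enumerate(relevant_flags, start=1):
        if flag:
            hits += 1
            total += hits / position
    return total / hits if hits else 0.0


def mean_average_precision(ranked_lists):
    """MAP: average precision averaged over several ranked lists (queries)."""
    return sum(average_precision(flags) for flags in ranked_lists) / len(ranked_lists)


def dcg(gains):
    """Discounted cumulative gain: gain at position p is divided by log2(p + 1)."""
    return sum(gain / math.log2(pos + 1) for pos, gain in enumerate(gains, start=1))


flags = [1, 0, 1, 0]  # relevance of the ranked results for one query, top first
print(precision_at_k(flags, 2))
print(average_precision(flags))
print(mean_average_precision([flags, [0, 1, 0, 0]]))
print(dcg([3, 2, 0]))
```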

4 Rankings and Recommender Systems

Example: Personalized Ad Recommendations

Standard Approaches
- Contextual Bandits
- Policies based on classifiers for each ad
- Collaborative Filtering
- Based on Latent Features, e.g. when using Matrix Factorization

Main Problem
- Extreme sparsity of positive feedback

Example: Personalized Ad Recommendations

New Approaches
- Still Contextual Bandits
- Policies based on rankers instead of classifiers

Recent Paper by Chaudhuri et al.
- Personalized Advertisement Recommendation: A Ranking Approach to Address the Ubiquitous Click Sparsity Problem
- Works best in the case of extreme sparsity

Thank you!
