Transcript
Page 1: Axiomatic Analysis of Smoothing Methods in Language Models for Pseudo-Relevance Feedback

10/1/2015

Axiomatic Analysis of Smoothing Methods in Language Models for Pseudo-Relevance Feedback

HUSSEIN HAZIMEH AND CHENGXIANG ZHAI

UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

Page 2

Pseudo Relevance Feedback

[Diagram: the user issues a query to a retrieval engine over a document collection, which returns scored results (e.g., d1 3.5, d2 2.4, ..., dk 0.5). In relevance feedback, the user judges documents as relevant (+) or non-relevant (-); in pseudo feedback, the top 10 documents are simply assumed to be relevant. The feedback step learns from these examples to produce a new query.]

Page 3

Pseudo-Relevance Feedback

It’s blind!

Good for high recall information needs

[Image: a blind superhero. Courtesy of iStock]

Page 4

Collection-based Smoothing

Collection-based smoothing is generally used for LM-based retrieval functions and for PRF models

A commonly used collection-based smoothing scheme is Dirichlet prior smoothing:

P(w|θD) = ( c(w, D) + μ P(w|θC) ) / ( |D| + μ )

where μ is the Dirichlet prior (smoothing parameter), |D| is the document length, and c(w, D) is the count of word w in document D.
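The Dirichlet-smoothed document model can be sketched in a few lines; the function name and the default μ = 2000 are illustrative choices, not from the slides:

```python
from collections import Counter

def dirichlet_smoothed_lm(doc_tokens, collection_lm, mu=2000.0):
    """Dirichlet prior smoothing:
    P(w|theta_D) = (c(w, D) + mu * P(w|theta_C)) / (|D| + mu)."""
    counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    return {w: (counts.get(w, 0) + mu * p_c) / (doc_len + mu)
            for w, p_c in collection_lm.items()}
```

Because the collection model P(w|θC) sums to one over the vocabulary, the smoothed document model is a proper distribution as well.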

Page 5

Study of Smoothing Methods in PRF

We will establish, both analytically and empirically, that collection-based smoothing is not a good choice for PRF:
◦ It forces PRF models to select very common words

Additive smoothing will be shown to outperform its collection-based counterpart

Page 6

How Do LM PRF Models Work?

D1

Dn

… Averaging

Function: 𝑨

Scoring

Function: 𝒇

𝑃(𝑤|𝜃1)

𝑃(𝑤|𝜃𝑛)

𝑃(𝑤|𝜃𝐶)

𝑃(𝑤|𝜃𝐹)

Page 7

How Do LM PRF Models Work?

The feedback LM, θF, generally has the following form:

P(w|θF) ∝ A( f(P(w|θ1), P(w|θC)), ..., f(P(w|θn), P(w|θC)) )

𝐴:ℝ𝑛 → ℝ is an averaging function, e.g. geometric mean

𝑓:ℝ2 → ℝ is a function increasing in the first argument and decreasing in the second

Rewards common words in feedback set

Penalizes common words in collection
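The general form above can be sketched as follows, assuming f returns positive scores so the aggregate can be normalized; the function names and the example choices of f and A are illustrative, not from the slides:

```python
from statistics import geometric_mean

def feedback_lm(doc_lms, collection_lm, f, A):
    """General LM-PRF form (sketch):
    P(w|theta_F) proportional to
    A(f(P(w|theta_1), P(w|theta_C)), ..., f(P(w|theta_n), P(w|theta_C)))."""
    scores = {w: A([f(lm[w], p_c) for lm in doc_lms])
              for w, p_c in collection_lm.items()}
    z = sum(scores.values())
    return {w: s / z for w, s in scores.items()}

# Example: A = geometric mean; f(p_d, p_c) = p_d / p_c is increasing in the
# first argument (rewards words common in the feedback set) and decreasing
# in the second (penalizes words common in the collection).
theta_f = feedback_lm(
    [{'a': 0.6, 'b': 0.4}, {'a': 0.8, 'b': 0.2}],
    {'a': 0.5, 'b': 0.5},
    f=lambda p_d, p_c: p_d / p_c,
    A=geometric_mean,
)
```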

Page 8

Problem!

The first argument rewards common words in the collection while the second penalizes them. The analysis shows that the first argument usually “wins”!

Rewards common words in the feedback set and in the collection (the smoothed term is proportional to P(w|θC))

Penalizes common words in the collection
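A quick numeric check of this point: under Dirichlet prior smoothing, a word the document does not contain gets probability μ·P(w|θC)/(|D| + μ), exactly proportional to the collection probability. The numbers and the default μ below are illustrative:

```python
def dirichlet_prob(count, doc_len, p_c, mu=2000.0):
    """Dirichlet-smoothed P(w|theta_D) for a single word."""
    return (count + mu * p_c) / (doc_len + mu)

# For words absent from the document (count = 0), the smoothed probability
# is mu * P(w|theta_C) / (|D| + mu): the same proportionality constant
# mu / (|D| + mu) applies to every such word, so common collection words
# also receive large first arguments in f.
ratio_common = dirichlet_prob(0, 500, 0.01) / 0.01
ratio_rare = dirichlet_prob(0, 500, 0.0001) / 0.0001
```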

Page 9

Overview of the Analysis

We considered three PRF models in the study:
◦ Divergence Minimization Model (DMM)

◦ Relevance Model (RM)

◦ Geometric Relevance Model (GRM)

Next, we will briefly discuss how the DMM and GRM work and then give an overview of the axiomatic analysis.

The analysis of the RM is very similar to the GRM and the same results apply

Page 10

Divergence Minimization Model (Zhai and Lafferty, 2001)

The DMM solves the following optimization problem:

θF = argmin_θ [ (1/n) Σᵢ₌₁ⁿ D(θ ‖ θi) − λ D(θ ‖ θC) ]

The solution has a closed form and is given by:

P(w|θF) ∝ exp( (1/(1−λ)) · (1/n) Σᵢ₌₁ⁿ log P(w|θi) − (λ/(1−λ)) log P(w|θC) )
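A minimal sketch of the DMM closed form from Zhai and Lafferty (2001), P(w|θF) ∝ exp( (1/(1−λ))·(1/n)·Σᵢ log P(w|θi) − (λ/(1−λ))·log P(w|θC) ); the function name and the default λ are illustrative:

```python
import math

def dmm_feedback_lm(doc_lms, collection_lm, lam=0.5):
    """Closed-form Divergence Minimization Model feedback LM (sketch)."""
    n = len(doc_lms)
    scores = {}
    for w, p_c in collection_lm.items():
        # Average log-probability of w across the feedback document models.
        avg_log = sum(math.log(lm[w]) for lm in doc_lms) / n
        scores[w] = math.exp(avg_log / (1.0 - lam)
                             - (lam / (1.0 - lam)) * math.log(p_c))
    z = sum(scores.values())
    return {w: s / z for w, s in scores.items()}
```

With equal feedback-set probabilities, the term that is rarer in the collection receives the higher feedback weight.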

Page 11

Geometric Relevance Model (Seo and Croft, 2010)

An enhanced form of the Relevance Model (RM) that replaces the arithmetic mean used in the RM with the geometric mean:

P(w|θF) ∝ ( ∏ᵢ₌₁ⁿ P(w|θi) )^(1/n)

Note that the function above is not affected by P(w|θC), i.e., the model is not designed to penalize common words.
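The geometric-mean aggregation can be sketched as below; for simplicity this uses uniform document weights, whereas the full GRM of Seo and Croft weights documents by query likelihood, so treat it as an illustrative simplification:

```python
import math

def grm_feedback_lm(doc_lms):
    """GRM sketch with uniform document weights:
    P(w|theta_F) proportional to (prod_i P(w|theta_i))^(1/n)."""
    n = len(doc_lms)
    # Geometric mean computed in log space for numerical stability.
    scores = {w: math.exp(sum(math.log(lm[w]) for lm in doc_lms) / n)
              for w in doc_lms[0]}
    z = sum(scores.values())
    return {w: s / z for w, s in scores.items()}
```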

Page 12

Main Axiom: IDF Effect (Clinchant and Gaussier, 2013)

Rationale: a PRF model is expected to penalize common words in the collection in order to select high-quality, discriminative terms.

Given any two words w1 and w2 that occur with the same frequencies in the feedback set D1, ..., Dn: if P(w1|θC) < P(w2|θC), then the model should assign P(w1|θF) ≥ P(w2|θF).

Page 13

DMM with Collection-based smoothing: IDF Effect

Study the sign of P(w1|θF) − P(w2|θF) for words w1 and w2 as in the IDF-effect axiom.

Not straightforward. Strategy:
◦ Find an attainable lower bound on the expression above

◦ Study the sign of the lower bound

◦ If the lower bound is strictly positive, then DMM supports the IDF effect

Page 14

DMM: Results of Analysis

Conclusion: using collection-based smoothing, the DMM will either consistently reward common terms or select only one feedback term

Page 15

GRM with Collection-based smoothing: IDF Effect

The GRM cannot support the IDF effect:

It consistently favors common words in the collection

Page 16

Proposed Solution: Additive Smoothing

Words get additional pseudo-counts δ:

P(w|θD) = ( c(w, D) + δ ) / ( |D| + δ|V| )

where δ > 0 is the smoothing parameter and |V| is the vocabulary size.

Next, we show how additive smoothing prevents the models from rewarding common terms
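Additive smoothing can be sketched as follows (the function name and default δ are illustrative). Note that the pseudo-count δ is a constant, independent of P(w|θC), so words unseen in a document are no longer boosted in proportion to their collection frequency:

```python
from collections import Counter

def additive_smoothed_lm(doc_tokens, vocab, delta=0.1):
    """Additive (Laplace-style) smoothing:
    P(w|theta_D) = (c(w, D) + delta) / (|D| + delta * |V|)."""
    counts = Counter(doc_tokens)
    denom = len(doc_tokens) + delta * len(vocab)
    return {w: (counts.get(w, 0) + delta) / denom for w in vocab}
```

Every unseen word receives the same probability δ / (|D| + δ|V|) regardless of how common it is in the collection.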

Page 17

DMM with Additive Smoothing: IDF Effect

The DMM unconditionally supports the IDF Effect:

Now it is performing the intended objective!

Page 18

DMM: Empirical Validation

Query: “Computer”

Page 19

GRM with Additive Smoothing: IDF Effect

The IDF effect is still not supported; however, common terms are no longer being rewarded!

Page 20

GRM: Empirical Validation

Query: “Computer”

Page 21

Empirical Evaluation: Retrieval Measures

Page 22

Empirical Evaluation: Robustness of Additive Smoothing

Page 23

Measuring the Discrimination of PRF Models

In previous studies, the average IDF of the top terms was used as an indicator of how discriminative the terms selected by a PRF method are.

Such a measure might not work well in some cases.

We propose the Discrimination Measure (DM), which, up to a constant, approximates the expected document frequency of the terms selected by the model.
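The slides do not spell out the DM formula, so the following is only an illustrative expected-document-frequency proxy: it assumes each document draws avg_doc_len tokens i.i.d. from the collection model. The names and the modeling assumption are ours, not the paper's:

```python
def expected_doc_frequency(p_wc, n_docs, avg_doc_len):
    """Expected number of documents containing a word with collection
    probability p_wc, under an i.i.d. token model (illustrative only)."""
    p_in_doc = 1.0 - (1.0 - p_wc) ** avg_doc_len
    return n_docs * p_in_doc

def discrimination_proxy(top_term_probs, n_docs, avg_doc_len):
    """Average expected document frequency over the top feedback terms;
    lower values suggest more discriminative terms."""
    return sum(expected_doc_frequency(p, n_docs, avg_doc_len)
               for p in top_term_probs) / len(top_term_probs)
```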

Page 24

Empirical Evaluation: Discrimination Measure

A several-fold decrease in the expected document frequency

Page 25

Conclusion

Collection-based smoothing forces PRF models to select very common terms
◦ The same problem might exist in other applications where LMs are aggregated

Additive smoothing prevents PRF models from rewarding common terms and increases the retrieval performance significantly

A new measure for quantifying PRF Discrimination

Page 26

Future Work

Should PRF models penalize common words?

Analysis of other smoothing methods such as topic-based smoothing

Inspect areas, other than PRF, where collection-based smoothing is used in aggregating language models

Page 27

Thanks to SIGIR for the Student Travel Grant!

Thank you for Listening!
