1 Ranked-Listed or Categorized Results in IR Zheng Zhu, Ingemar J. Cox, Mark Levene Birkbeck College, University of London UCL

Post on 28-Mar-2015


Ranked-Listed or Categorized Results in IR

Zheng Zhu, Ingemar J. Cox, Mark Levene

Birkbeck College, University of London

UCL

Content

• Motivation
• Methodology
• Results
• Conclusions

The motivation

• Improve navigational experience for both normal users and users of handheld devices.

• Intuitively, we would expect grouping documents to reduce search time.

Introduction

• We quantify the benefits of grouping documents based on classification.

• We study how the benefits of grouping degrade with classification errors.

• We take into account errors that arise from both the user and the classifier.

The methodology

• Three types of simulated user model:
  1. The user knows the class.
  2. The user doesn’t know the class.
  3. The user thinks he knows the class.

• Two classification scenarios:
  1. Correct classification
  2. Misclassification

The methodology

• To measure the benefits, we define:
  – class rank
  – document rank

• For ranked-list results, scroll rank (SR) is used.

The methodology

• For categorized results, based on different user models and operation scenarios, we define:
  – In-Class Rank (ICR)
  – Scrolled-Classification Rank (SCR)
  – Out-Class/Scroll-Class Rank (OSCR)
  – Out-Class/Revert Rank (ORR)

The methodology

[Figure: a query returns a ranked list doc1 to doc7; the same seven documents are also shown grouped under Class1 and Class2.]

Example ranks from the figure:

SR = 6
ICR = 1 + 3 = 4
SCR = 1 + 3 = 4
ORR = 2 + 4 + 6 = 12
OSCR = 2 + 4 + 4 = 10
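One possible reading of the arithmetic on this slide, sketched in Python. The slide text does not say which documents belong to Class1 and Class2, so the class membership below is an assumption chosen to reproduce the slide's numbers. The function names and the general form of SCR (class rank, plus documents scanned in earlier classes, plus in-class position) are also assumptions, not definitions taken from the slides.

```python
# Hypothetical layout: target doc6 sits third in Class1; Class2 holds four documents.
RANKED = ["doc1", "doc2", "doc3", "doc4", "doc5", "doc6", "doc7"]
CLASSES = [
    ["doc2", "doc4", "doc6"],          # Class1 (class rank 1)
    ["doc1", "doc3", "doc5", "doc7"],  # Class2 (class rank 2)
]


def scroll_rank(ranked, target):
    """SR: 1-based position of the target in the flat ranked list."""
    return ranked.index(target) + 1


def locate(classes, target):
    """Return (class rank, document rank inside that class), both 1-based."""
    for class_rank, docs in enumerate(classes, start=1):
        if target in docs:
            return class_rank, docs.index(target) + 1
    raise ValueError(f"{target} is not in any class")


def in_class_rank(classes, target):
    """ICR: class rank plus document rank within the correct class."""
    class_rank, doc_rank = locate(classes, target)
    return class_rank + doc_rank


def scrolled_class_rank(classes, target):
    """SCR (assumed form): classes are scanned in order until the target is found."""
    class_rank, doc_rank = locate(classes, target)
    skipped = sum(len(docs) for docs in classes[: class_rank - 1])
    return class_rank + skipped + doc_rank


def out_class_cost(classes, wrong_class_rank):
    """Cost of opening a wrong class and scanning all of its documents."""
    return wrong_class_rank + len(classes[wrong_class_rank - 1])


def out_class_revert_rank(ranked, classes, target, wrong_class_rank):
    """ORR: wrong-class cost, then fall back to the flat ranked list."""
    return out_class_cost(classes, wrong_class_rank) + scroll_rank(ranked, target)


def out_class_scroll_class_rank(classes, target, wrong_class_rank):
    """OSCR: wrong-class cost, then scan the classes in order."""
    return out_class_cost(classes, wrong_class_rank) + scrolled_class_rank(classes, target)


sr = scroll_rank(RANKED, "doc6")
icr = in_class_rank(CLASSES, "doc6")
scr = scrolled_class_rank(CLASSES, "doc6")
orr = out_class_revert_rank(RANKED, CLASSES, "doc6", wrong_class_rank=2)
oscr = out_class_scroll_class_rank(CLASSES, "doc6", wrong_class_rank=2)
print(sr, icr, scr, orr, oscr)
```

Under these assumptions the five values agree with the slide: a user who picks the right class saves two steps over scrolling (4 versus 6), while a user who first opens the wrong class pays more than scrolling would have cost.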

The methodology

Simulated user / target    Correctly classified    Misclassified
Knows class                ICR                     OSCR or ORR
Does not know class        SCR or SR               SCR or SR
Thinks knows class         OSCR or ORR             OSCR or ORR

The methodology

• Known-Item Search (Target Testing), followed by comparison of the ranks.

• Given a document, we generate a query so that the target document appears within a designated range of scroll rank.
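The slides do not say how the queries are generated. The sketch below shows one simple way such a step could work: try small sets of terms from the target document and keep the first query that places the target inside the requested scroll-rank range. The toy corpus, the term-overlap scorer `toy_search`, and the helper `query_for_target` are all hypothetical stand-ins, not the authors' implementation or a Lucene API.

```python
from itertools import combinations


def toy_search(query_terms, corpus):
    """Rank document ids by number of matching query terms (ties broken by id)."""
    scores = {
        doc_id: sum(term in text.split() for term in query_terms)
        for doc_id, text in corpus.items()
    }
    hits = [doc_id for doc_id, score in scores.items() if score > 0]
    return sorted(hits, key=lambda doc_id: (-scores[doc_id], doc_id))


def query_for_target(target, corpus, lo, hi, max_terms=2):
    """Return (query terms, scroll rank) with lo <= rank <= hi, or None if no query fits."""
    terms = sorted(set(corpus[target].split()))
    for size in range(1, max_terms + 1):
        for query in combinations(terms, size):
            ranking = toy_search(query, corpus)
            # The target contains every query term, so it is always in the ranking.
            rank = ranking.index(target) + 1
            if lo <= rank <= hi:
                return list(query), rank
    return None


corpus = {
    "d1": "jaguar car speed",
    "d2": "jaguar cat jungle",
    "d3": "jaguar car engine",
    "d4": "jaguar os apple",
}
result = query_for_target("d3", corpus, lo=2, hi=3)
print(result)
```

Here the single-term query "car" already puts the target d3 at scroll rank 2, so the search stops there.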

The implementation

• The Open Directory Project (ODP) provides an oracle for classification, so that we can control both user and machine error.

• The search engine is built on Lucene, an open-source search library.
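A minimal sketch of how an oracle labelling could be used to inject a controlled amount of classification error. The function name `corrupt_labels`, the uniform choice among wrong classes, and the toy labels are assumptions for illustration; the slides do not describe the actual procedure.

```python
import random


def corrupt_labels(oracle, classes, error_rate, seed=0):
    """Copy the oracle labels, replacing each with a wrong class with probability error_rate."""
    rng = random.Random(seed)  # fixed seed keeps the simulation repeatable
    noisy = {}
    for doc_id, true_class in oracle.items():
        if rng.random() < error_rate:
            wrong_classes = [c for c in classes if c != true_class]
            noisy[doc_id] = rng.choice(wrong_classes)
        else:
            noisy[doc_id] = true_class
    return noisy


# Toy oracle: document id -> class label (hypothetical data).
oracle = {"doc1": "Arts", "doc2": "Science", "doc3": "Sports"}
classes = ["Arts", "Science", "Sports"]

clean = corrupt_labels(oracle, classes, error_rate=0.0)
all_wrong = corrupt_labels(oracle, classes, error_rate=1.0)
```

With an error rate of 0 the labels match the oracle exactly, and with an error rate of 1 every document is misclassified; values in between give the intermediate error levels a simulation would sweep over.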

The ideal case with an Oracle

KNN Classifier

More realistic scenario

Conclusions

• Classification-based display can improve users’ interaction with search engines (SE).

• However, this depends on the user strategy:
  – The hybrid strategy has the best performance.
  – Using a hybrid strategy, performance degrades gracefully with errors.

Thanks!
