
Improving Recommendation Lists Through Topic Diversification. Cai-Nicolas Ziegler, Sean M. McNee, Joseph A. Konstan, Georg Lausen. WWW '05. Presenter: 謝順宏

Upload: loreen-barber

Post on 13-Dec-2015



Improving Recommendation Lists Through Topic Diversification

Cai-Nicolas Ziegler, Sean M. McNee, Joseph A. Konstan, Georg Lausen. WWW '05

Presenter: 謝順宏

1


Outline

• Introduction
• On collaborative filtering
• Evaluation metrics
• Topic diversification
• Empirical analysis
• Related work
• Conclusion

2


Introduction

• Recommendation lists should reflect the user’s complete spectrum of interests.
• Doing so improves user satisfaction.
• Many recommendations seem to be “similar” with respect to content.
• Traditionally, recommender system projects have focused on optimizing accuracy using metrics such as precision/recall or mean absolute error.

3


Introduction

• Topic diversification
• Intra-list similarity metric
• Accuracy versus satisfaction
  – “accuracy does not tell the whole story”

4


On collaborative filtering (CF)

• Collaborative filtering (CF) still represents the most commonly adopted technique in crafting academic and commercial recommender systems.

• Its basic idea is to make recommendations based on ratings that users have assigned to products.

5


User-based Collaborative Filtering

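• A set of users $A = \{a_1, a_2, \ldots, a_n\}$
• A set of products $B = \{b_1, b_2, \ldots, b_m\}$
• A partial rating function $r_i : B \to [-1, +1]^{\bot}$ for each user $a_i$, where $\bot$ means "not rated"
• The set of products rated by $a_i$: $R_i = \{ b_k \in B \mid r_i(b_k) \neq \bot \}$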

6


User-based Collaborative Filtering

Two major steps:
• Neighborhood formation
  – Pearson correlation
  – Cosine distance
• Rating prediction
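
The two steps can be sketched in a few lines of Python. This is a minimal illustration, not the formulation used in the paper: the slide does not give a prediction formula, so the mean-centered weighted average below is an assumption, and the names `pearson`, `predict`, `k`, and the toy ratings are hypothetical.

```python
from math import sqrt


def pearson(ra, rb):
    """Pearson correlation over the products both users have rated."""
    common = set(ra) & set(rb)
    if len(common) < 2:
        return 0.0
    mean_a = sum(ra[b] for b in common) / len(common)
    mean_b = sum(rb[b] for b in common) / len(common)
    num = sum((ra[b] - mean_a) * (rb[b] - mean_b) for b in common)
    den = sqrt(sum((ra[b] - mean_a) ** 2 for b in common)) * sqrt(
        sum((rb[b] - mean_b) ** 2 for b in common)
    )
    return num / den if den else 0.0


def mean_rating(r):
    return sum(r.values()) / len(r)


def predict(ratings, user, product, k=2):
    """Step 1: pick the k most similar users who rated `product`.
    Step 2: predict with a mean-centered weighted average (assumed formula)."""
    scored = [
        (pearson(ratings[user], r), other)
        for other, r in ratings.items()
        if other != user and product in r
    ]
    neighbors = sorted((s for s in scored if s[0] > 0), reverse=True)[:k]
    if not neighbors:
        return None  # no positively correlated neighbor rated this product
    weighted = sum(
        w * (ratings[o][product] - mean_rating(ratings[o])) for w, o in neighbors
    )
    return mean_rating(ratings[user]) + weighted / sum(w for w, _ in neighbors)


# Toy ratings on the [-1, +1] scale from the previous slide.
ratings = {
    "alice": {"b1": 1.0, "b2": -1.0},
    "bob": {"b1": 1.0, "b2": -1.0, "b3": 1.0},
    "carol": {"b1": -1.0, "b2": 1.0, "b3": -1.0},
}
print(predict(ratings, "alice", "b3"))
```

Only positively correlated users are kept as neighbors here; that cut-off is a design choice of the sketch, not something stated on the slide.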

7


Item-based Collaborative Filtering

• Unlike user-based CF, similarity values c are computed for items rather than users.
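
As a rough illustration of item-to-item similarity, the sketch below computes a cosine similarity between two products from the ratings of users who rated both. The slide does not say which similarity measure is used for c, so cosine is an assumption here, and `item_cosine` and the toy data are hypothetical.

```python
from math import sqrt


def item_cosine(ratings, item_x, item_y):
    """Cosine similarity between two items over their co-rating users."""
    users = [u for u, r in ratings.items() if item_x in r and item_y in r]
    dot = sum(ratings[u][item_x] * ratings[u][item_y] for u in users)
    norm_x = sqrt(sum(ratings[u][item_x] ** 2 for u in users))
    norm_y = sqrt(sum(ratings[u][item_y] ** 2 for u in users))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0


ratings = {
    "alice": {"b1": 1.0, "b2": -1.0},
    "bob": {"b1": 1.0, "b2": -1.0, "b3": 1.0},
    "carol": {"b1": -1.0, "b2": 1.0, "b3": -1.0},
}
print(item_cosine(ratings, "b1", "b3"))
print(item_cosine(ratings, "b1", "b2"))
```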

8


Evaluation metrics

• Accuracy Metrics
  – Predictive Accuracy Metrics
  – Decision Support Metrics
• Beyond Accuracy
  – Coverage
  – Novelty and Serendipity

• Intra-List Similarity

9


Accuracy Metrics

• Predictive Accuracy Metrics
  – Mean absolute error (MAE)
  – Mean squared error (MSE)
• Decision Support Metrics
  – Recall
  – Precision
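
The four metrics named above have standard textbook definitions, sketched below. The function names and the sample values are illustrative only and do not come from the paper.

```python
def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def mse(predicted, actual):
    """Mean squared error; penalizes large deviations more heavily."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def precision_recall(recommended, relevant):
    """Precision: share of recommended items that are relevant.
    Recall: share of relevant items that were recommended."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


print(mae([1, 0, -1], [1, 1, 1]), mse([1, 0, -1], [1, 1, 1]))
print(precision_recall(["a", "b", "c", "d"], {"a", "c", "e"}))
```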

10


Beyond Accuracy

• Coverage
  – Coverage measures the percentage of elements of the problem domain for which predictions can be made.
• Novelty and Serendipity
  – Novelty and serendipity metrics measure the “non-obviousness” of recommendations made, avoiding “cherry-picking”.
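
Coverage can be made concrete with a small sketch. It assumes the "elements of the problem domain" are (user, product) pairs and that a predictor signals "no prediction possible" by returning `None`; both are assumptions of this example, as are the names used.

```python
def coverage(predict, users, products):
    """Fraction of (user, product) pairs for which `predict` returns a value."""
    total = len(users) * len(products)
    if total == 0:
        return 0.0
    predictable = sum(
        1 for u in users for b in products if predict(u, b) is not None
    )
    return predictable / total


# Hypothetical predictor backed by a lookup table of available predictions.
known = {("u1", "b1"): 0.5, ("u1", "b2"): 0.1, ("u2", "b2"): -0.2}
print(coverage(lambda u, b: known.get((u, b)), ["u1", "u2"], ["b1", "b2"]))
```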

11


Intra-List Similarity (ILS)

• $c_o(b_k, b_e)$ measures the similarity between products $b_k$ and $b_e$.
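
One way to turn pairwise similarity into a list-level score is to sum it over every unordered pair of products in the list, so that higher values mean a less diverse list. The sketch below does that under two assumptions that are not taken from the slide: the similarity function is symmetric, and a Jaccard overlap of hypothetical topic sets stands in for it.

```python
from itertools import combinations


def jaccard(x, y):
    """Overlap of two topic sets; a stand-in for the similarity function."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def intra_list_similarity(items, sim):
    """Sum of pairwise similarities over all unordered pairs in the list."""
    return sum(sim(x, y) for x, y in combinations(items, 2))


# Hypothetical topic assignments for four products.
topics = {
    "a": {"scifi"},
    "b": {"scifi"},
    "c": {"scifi", "history"},
    "d": {"cooking"},
}


def sim(x, y):
    return jaccard(topics[x], topics[y])


print(intra_list_similarity(["a", "b", "c"], sim))  # homogeneous list
print(intra_list_similarity(["a", "d", "c"], sim))  # more diverse list
```

Because the score is a plain sum over pairs, it does not depend on the order of the items in the list.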

12


Topic Diversification

“Law of Diminishing Marginal Returns”
• Suppose you are offered your favorite drink. Let p1 denote the price you are willing to pay for that product. If you are then offered a second glass of that particular drink, the amount p2 you are inclined to spend will be lower, i.e., p1 > p2. The same holds for p3, p4, and so forth.

13


Topic Diversification

• Taxonomy-based similarity metric
• Computes the similarity between product sets based upon their classification.

14


Topic Diversification

• Topic Diversification Algorithm
  – Re-ranks the recommendation list by applying topic diversification.

15


Topic Diversification

• $\Theta_F$ defines the impact that the dissimilarity rank $P^{\mathrm{rev}}_{c^*}$ exerts on the eventual overall output.

• Large $\Theta_F \in [0.5, 1]$ favors diversification over $a_i$’s original relevance order.

• The input lists must be considerably larger than the final top-N list.
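
The role of the diversification factor can be illustrated with a greedy re-ranking sketch. It is a simplified reading of the idea on this slide, not the paper's exact procedure: each remaining candidate gets a relevance rank and a dissimilarity rank with respect to the items already chosen, the two ranks are merged with weights (1 − ΘF) and ΘF, and the candidate with the lowest merged rank is appended. A Jaccard overlap of hypothetical topic sets stands in for the taxonomy-based similarity metric, and all names are made up for the example.

```python
def jaccard(x, y):
    """Overlap of two topic sets; a stand-in for the taxonomy-based metric."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def diversify(candidates, topics, theta_f, n):
    """Greedily re-rank `candidates` (best first, no duplicates) into a top-n list."""
    if not candidates or n <= 0:
        return []
    relevance_rank = {b: r for r, b in enumerate(candidates, start=1)}
    chosen = [candidates[0]]  # the most relevant item always stays first
    remaining = list(candidates[1:])

    def similarity_to_chosen(b):
        return sum(jaccard(topics[b], topics[c]) for c in chosen) / len(chosen)

    while remaining and len(chosen) < n:
        # Rank 1 = least similar to what has been chosen so far.
        by_dissimilarity = sorted(remaining, key=similarity_to_chosen)
        dissimilarity_rank = {
            b: r for r, b in enumerate(by_dissimilarity, start=1)
        }
        best = min(
            remaining,
            key=lambda b: (1 - theta_f) * relevance_rank[b]
            + theta_f * dissimilarity_rank[b],
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


topics = {
    "a": {"scifi"},
    "b": {"scifi"},
    "c": {"scifi", "history"},
    "d": {"cooking"},
}
for theta in (0.0, 1.0):
    print(theta, diversify(["a", "b", "c", "d"], topics, theta, n=3))
```

With ΘF = 0 the original order is kept; with ΘF = 1 only dissimilarity to the already chosen items matters. The candidate list is longer than the requested top-n, which is why the input list has to be larger than the final list: diversification can only promote items that are present in the input.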

16


Recommendation dependency

• We assume that recommended products, along with their content descriptions, exert an impact on each other. Existing approaches usually require only that relevance-weight ordering holds for recommendation list items; no other dependencies are assumed.

• An item b’s current dissimilarity rank with respect to preceding recommendations plays an important role and may influence the new ranking.

17


Empirical analysis

• Dataset
  – BookCrossing (http://www.bookcrossing.com)
  – 278,858 members
  – 1,157,112 ratings
  – 271,379 distinct ISBNs

18


Data cleaning & condensation

• Discarded all books missing taxonomic descriptions.

• Only community members with at least 5 ratings each were kept.
  – 10,339 users
  – 6,708 books
  – 316,349 ratings
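
The two filtering steps could look roughly like this. The sketch assumes ratings arrive as (user, book, rating) triples and that the steps run in the order the bullets list them (books first, then users); the slide does not state either, and `condense` is a hypothetical name.

```python
from collections import Counter


def condense(ratings, books_with_taxonomy, min_ratings=5):
    """Drop ratings for books without taxonomy data, then drop users
    who are left with fewer than `min_ratings` ratings."""
    kept = [(u, b, r) for u, b, r in ratings if b in books_with_taxonomy]
    per_user = Counter(u for u, _, _ in kept)
    return [(u, b, r) for u, b, r in kept if per_user[u] >= min_ratings]


# u1 keeps five ratings after the book filter; u2 keeps only four.
taxonomy = {"b1", "b2", "b3", "b4", "b5"}
ratings = [("u1", b, 1.0) for b in ["b1", "b2", "b3", "b4", "b5", "x9"]]
ratings += [("u2", b, 1.0) for b in ["b1", "b2", "b3", "b4", "x9"]]
print(len(condense(ratings, taxonomy)))
```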

19


Evaluation Framework Setup

• Did not compute MAE metric values
• Adopted K-folding (K=4)
• We were interested in seeing how accuracy, captured by precision and recall, behaves when increasing ΘF.
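
A K-fold split with K=4 can be sketched as follows. The slide does not say what exactly is folded; this example assumes a single user's rated items are split into four folds, each serving once as the held-out test set, and the helper name is hypothetical.

```python
def k_fold_splits(items, k=4):
    """Yield (train, test) pairs; each fold is the test set exactly once."""
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


rated = ["b1", "b2", "b3", "b4", "b5", "b6", "b7", "b8"]
for train, test in k_fold_splits(rated, k=4):
    print(test, "held out;", len(train), "items used for training")
```

Precision and recall would then be computed on each held-out fold and averaged, once per value of ΘF.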

20


Empirical analysis

• ΘF=0,

21


Empirical analysis

22


Empirical analysis

23


Conclusion

• We found that diversification appears detrimental to both user-based and item-based CF along precision and recall metrics.

• Item-based CF seems more susceptible to topic diversification than user-based CF, backed by results from precision, recall, and ILS metric analysis.

24


Empirical analysis

25


Conclusion

• Diversification factor impact
• Human perception
• Interaction with accuracy

26


Multiple Linear Regression

27


Related work

• Northern Light (http://www.northernlight.com)

• Google (http://www.google.com)

28


Conclusion

• An algorithmic framework to increase the diversity of a top-N list of recommended products.

• New intra-list similarity metric.

29