The Magic Barrier of Recommender Systems - No Magic, Just Ratings

The Magic Barrier of Recommender Systems: No Magic, Just Ratings. Alejandro Bellogín, Alan Said, Arjen de Vries (@abellogin, @alansaid, @arjenpdevries). UMAP 2014, 2014-07-08.

Upload: alan-said

Post on 23-Aug-2014


DESCRIPTION

Recommender Systems need to deal with different types of users who represent their preferences in various ways. This difference in user behaviour has a deep impact on the final performance of the recommender system, where some users may receive either better or worse recommendations depending, mostly, on the quantity and the quality of the information the system knows about the user. Specifically, the inconsistencies of the user impose a lower bound on the error the system may achieve when predicting ratings for that particular user. In this work, we analyse how the consistency of user ratings (coherence) may predict the performance of recommendation methods. More specifically, our results show that our definition of coherence is correlated with the so-called magic barrier of recommender systems, and thus, it could be used to discriminate between easy users (those with a low magic barrier) and difficult ones (those with a high magic barrier). We report experiments where the rating prediction error for the more coherent users is lower than that of the less coherent ones. We further validate these results by using a public dataset, where the magic barrier is not available, in which we obtain similar performance improvements.

TRANSCRIPT

Page 1: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

The Magic Barrier of Recommender Systems: No Magic, Just Ratings

Alejandro Bellogín, Alan Said, Arjen de Vries (@abellogin, @alansaid, @arjenpdevries)

UMAP 2014, 2014-07-08

Page 2: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

What is the magic barrier?

Page 3: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

The Magic Barrier
• The upper bound on recommendation accuracy
• Recommendation accuracy is only as good as the quality of the underlying data
• Users are noisy / incoherent in their rating behavior

Page 4: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

How to find the magic barrier?
1. Collect infinite re-ratings for each item from your users.
2. Find the standard deviation of rating/re-rating inconsistencies (noise) [Said et al. UMAP '12]

Realistically, collecting an infinite number of re-ratings is not feasible
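With only a finite number of re-ratings, the two steps above can be approximated per user. The following is a minimal sketch, not the estimator from [Said et al. UMAP '12]: it assumes each item has a short list of repeated ratings from the same user, treats the mean of that list as the user's "true" opinion, and reports the root-mean-square deviation from it as the user's noise level. The function name, input format, and toy data are hypothetical.

```python
from math import sqrt
from statistics import mean


def estimate_magic_barrier(reratings):
    """Estimate one user's rating noise from repeated ratings.

    reratings: dict mapping item id -> list of ratings the user gave
    to that item on different occasions (hypothetical input format).
    """
    squared_errors = []
    for ratings in reratings.values():
        if len(ratings) < 2:
            continue  # a single rating carries no information about noise
        item_mean = mean(ratings)
        squared_errors.extend((r - item_mean) ** 2 for r in ratings)
    if not squared_errors:
        raise ValueError("need at least one item with two or more ratings")
    return sqrt(mean(squared_errors))


# Toy data: a consistent user and a noisy user re-rating the same items.
consistent = {"m1": [4, 4], "m2": [2, 2], "m3": [5, 4]}
noisy = {"m1": [5, 2], "m2": [1, 4], "m3": [3, 5]}

print(estimate_magic_barrier(consistent))
print(estimate_magic_barrier(noisy))
```

The noisy user ends up with the larger estimate, which is the sense in which such a user has a higher magic barrier.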

Page 5: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Coherence

Page 6: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

What is coherence?

Coherent user

we have two users that have rated the exact same set of items, although their ratings are slightly different. Specifically, the user in Figure 1a gives more consistent ratings to items sharing the same features, which in this case corresponds to the movie's genres. In the long term we argue that such a user will have a more consistent (less noisy) behaviour, since her taste for each item feature seems to be well defined.

[Figure 1, two panels over the genres Action, Comedy, and Horror: (a) a coherent user (c(u) = -1.28); (b) a not coherent user (c(u) = -5.95)]

Fig. 1: Example of a coherent vs. not coherent user. Our definition of coherence takes into account the rating's deviation within each item feature, which in this example consists of three genres: action, comedy, and horror.

3.2 A Simple Definition for User Coherence

Following the rationale presented before, we define the user coherence based on a set of item features F as:

$$c(u) = -\sum_{f \in F} \sigma_f(u) \qquad (2)$$

where $\sigma_f(u) = \sqrt{\sum_{i \in I(u,f)} \left(r(u,i) - \bar{r}_f(u)\right)^2}$ corresponds to the user's rating deviation within a specific feature f, having an associated mean rating for that feature $\bar{r}_f(u)$, which simply corresponds to the average rating within the set of items rated by user u that belong to feature f, denoted here as I(u, f). We refer to this formulation as basic coherence.

With this formulation, the coherence c(u) captures the variance of a user's ratings relative to the feature space in which the items are defined. Moreover, it also incorporates a negative sign to indicate that the larger the variance, the less coherent (or more incoherent) a user should be.

The key aspect of this function, hence, is that we are accounting for the rating deviation of the user with respect to a particular feature space. Besides, such definition allows for a more general case, where the space F could be (instead of textual item features) any embedding of the items into a space F, such as an item clustering or the latent factors of the items, and also other functions apart from standard deviation to statistically summarise the user's ratings for a given feature, as will be described in the next section.

Incoherent user


Page 7: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Definition of Coherence
Rating consistency within a feature space (genres, themes, etc.)

$$c(u) = -\sum_{f \in F} \sigma_f(u)$$

$$\sigma_f(u) = \sqrt{\sum_{i \in I(u,f)} \left(r(u,i) - \bar{r}_f(u)\right)^2}$$

F: item features
σ_f(u): user's rating deviation within a feature
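The definition can be sketched in a few lines of Python. This is an illustrative sketch only: σ_f(u) is read here as the square root of the summed squared deviations from the user's mean rating within feature f, and the input formats, function name, and toy ratings are assumptions, not taken from the slides.

```python
from collections import defaultdict
from math import sqrt


def basic_coherence(ratings, item_features):
    """c(u) = -(sum over features f of sigma_f(u)).

    ratings: dict item id -> rating given by the user.
    item_features: dict item id -> iterable of features (e.g. genres).
    Both input formats are assumptions made for this sketch.
    """
    by_feature = defaultdict(list)
    for item, rating in ratings.items():
        for feature in item_features.get(item, ()):
            by_feature[feature].append(rating)

    total = 0.0
    for feature_ratings in by_feature.values():
        mean_f = sum(feature_ratings) / len(feature_ratings)
        # sigma_f(u): root of the summed squared deviations from the feature mean
        total += sqrt(sum((r - mean_f) ** 2 for r in feature_ratings))
    return -total


# Hypothetical toy data: six movies spread over three genres.
genres = {"a1": ["action"], "a2": ["action"], "c1": ["comedy"],
          "c2": ["comedy"], "h1": ["horror"], "h2": ["horror"]}
coherent_user = {"a1": 5, "a2": 5, "c1": 3, "c2": 3, "h1": 1, "h2": 2}
incoherent_user = {"a1": 5, "a2": 1, "c1": 1, "c2": 5, "h1": 2, "h2": 4}

print(basic_coherence(coherent_user, genres))
print(basic_coherence(incoherent_user, genres))
```

Both users rated the same items, but the second one disagrees with herself within every genre, so her coherence is far more negative.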

Page 8: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Rating deviation within a feature space instead of infinite re-ratings

Is this a good estimate of the magic barrier?

Page 9: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Experiments
EX1: Is rating coherence a good predictor for the Magic Barrier?
EX2: Can we use rating coherence to improve recommendation results? Especially when we do not have re-ratings/magic barrier?

Page 10: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Rating Coherence vs. Magic Barrier
Rank users according to magic barrier/rating coherence:
• High/low magic barrier
• Low/high rating coherence in the "genre" feature space

Find correlation (Spearman) between magic barrier/coherence ranking

*Dataset from moviepilot.de, contains re-ratings from 306 users [Said et al. UMAP '12]
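The rank comparison can be illustrated with a small, dependency-free Spearman computation (Pearson correlation of the rank-transformed values, with tied values sharing their average rank). This is a sketch under assumptions: the helper names are hypothetical and the per-user numbers are made up, not taken from the moviepilot.de data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-user values: one magic barrier and one coherence per user.
magic_barrier = [0.30, 0.55, 0.80, 1.10, 1.40]
coherence = [-1.3, -2.9, -2.1, -4.8, -6.0]
rho = spearman(magic_barrier, coherence)
print(rho)
```

A strongly negative rho on real data would mean that users with a high magic barrier tend to be the ones with low coherence.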

Page 11: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Magic Barrier & Rating Coherence correlation

Page 12: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Experiments
EX1: Is rating coherence a good predictor for the Magic Barrier?
EX2: Can we use rating coherence to improve recommendation results? Especially when we do not have re-ratings/magic barrier?

Page 13: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Use rating coherence for recommendation
• Divide users into coherent and incoherent (with lower and higher magic barrier, respectively)
• Movielens 1M
• Train each group separately
• Evaluate each group separately
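One way to form the two groups is to cut the users at the median coherence. The median cut-off is an assumption of this sketch (the slides report 50% user coverage per group but do not state the rule), and the function names and most of the values below are hypothetical.

```python
from statistics import median


def split_by_coherence(coherence_by_user):
    """Split users into (easy, difficult) sets around the median coherence.

    coherence_by_user: dict user id -> c(u); higher (closer to 0) means
    more coherent. The median cut-off is an assumption of this sketch.
    """
    cutoff = median(coherence_by_user.values())
    easy = {u for u, c in coherence_by_user.items() if c > cutoff}
    difficult = set(coherence_by_user) - easy
    return easy, difficult


def filter_ratings(ratings, users):
    """Keep only the (user, item, rating) triples of the given users."""
    return [row for row in ratings if row[0] in users]


# Hypothetical coherence values for four users.
coherence_by_user = {"u1": -1.28, "u2": -5.95, "u3": -2.0, "u4": -4.1}
easy, difficult = split_by_coherence(coherence_by_user)

ratings = [("u1", "m1", 5), ("u2", "m1", 2), ("u3", "m2", 4), ("u4", "m2", 1)]
easy_ratings = filter_ratings(ratings, easy)
difficult_ratings = filter_ratings(ratings, difficult)
print(sorted(easy), sorted(difficult))
```

Each filtered rating list can then be split into its own training and test sets, so that the groups are trained and evaluated separately.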

Page 14: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Recommendation and Evaluation
Algorithm: K-NN, Pearson

Training sets
• "Easy" users split
• "Difficult" users split
• All users split

Evaluation sets
• "Easy" users split
• "Difficult" users split
• All users split

[Diagram: Training Easy, Training Difficult, Test Easy, Test Difficult]
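The slides name the algorithm only as "K-NN, Pearson". As an illustration, here is a minimal sketch of one common variant: user-based k-nearest neighbours with Pearson similarity and mean-centred predictions, evaluated with RMSE. The choice of k, the restriction to positively correlated neighbours, the fallback to the user's mean, and all names and data are assumptions rather than the authors' configuration.

```python
from math import sqrt


def pearson_sim(ra, rb):
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    common = set(ra) & set(rb)
    if len(common) < 2:
        return 0.0
    ma = sum(ra[i] for i in common) / len(common)
    mb = sum(rb[i] for i in common) / len(common)
    num = sum((ra[i] - ma) * (rb[i] - mb) for i in common)
    den = sqrt(sum((ra[i] - ma) ** 2 for i in common)
               * sum((rb[i] - mb) ** 2 for i in common))
    return num / den if den else 0.0


def predict(train, user, item, k=2):
    """Mean-centred k-NN prediction of the rating of `user` for `item`."""
    user_mean = sum(train[user].values()) / len(train[user])
    neighbours = []
    for other, other_ratings in train.items():
        if other == user or item not in other_ratings:
            continue
        sim = pearson_sim(train[user], other_ratings)
        if sim > 0:  # keep positively correlated neighbours only
            neighbours.append((sim, other))
    neighbours.sort(reverse=True)
    top = neighbours[:k]
    if not top:
        return user_mean  # fall back to the user's mean rating
    num = 0.0
    for sim, other in top:
        other_mean = sum(train[other].values()) / len(train[other])
        num += sim * (train[other][item] - other_mean)
    return user_mean + num / sum(sim for sim, _ in top)


def rmse(train, test, k=2):
    """Root-mean-square error over (user, item, rating) test triples."""
    errors = [(predict(train, u, i, k) - r) ** 2 for u, i, r in test]
    return sqrt(sum(errors) / len(errors))


# Hypothetical toy data: dict user -> {item: rating}.
train = {
    "u1": {"a": 5, "b": 3, "c": 1},
    "u2": {"a": 4, "b": 3, "c": 2, "d": 5},
    "u3": {"a": 1, "b": 3, "c": 5, "d": 1},
}
test = [("u1", "d", 5)]
print(predict(train, "u1", "d"), rmse(train, test))
```

Running the same `rmse` call on an "easy", a "difficult", and an "all users" training/test pair gives the comparisons listed on the slide.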

Page 15: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Evaluation Comparisons
All - All
All - Easy
All - Difficult
Easy - Easy
Difficult - Difficult

[Diagram: Training Easy, Training Difficult, Test Easy, Test Difficult]

RMSE: 1.090
User coverage: 100%

Page 16: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Evaluation Comparisons
All - All
All - Easy
All - Difficult
Easy - Easy
Difficult - Difficult

[Diagram: Training Easy, Training Difficult, Test Easy, Test Difficult]

RMSE: 0.974
User coverage: 50%

Page 17: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Evaluation Comparisons
All - All
All - Easy
All - Difficult
Easy - Easy
Difficult - Difficult

[Diagram: Training Easy, Training Difficult, Test Easy, Test Difficult]

RMSE: 1.195
User coverage: 50%

Page 18: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Evaluation Comparisons
All - All
All - Easy
All - Difficult
Easy - Easy
Difficult - Difficult

[Diagram: Training Easy, Training Difficult, Test Easy, Test Difficult]

RMSE: 0.933
User coverage: 50%

Page 19: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Evaluation Comparisons
All - All
All - Easy
All - Difficult
Easy - Easy
Difficult - Difficult

[Diagram: Training Easy, Training Difficult, Test Easy, Test Difficult]

RMSE: 1.226
User coverage: 50%

Page 20: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Results

Subset      RMSE    User coverage
All         1.090   100%
All-Easy    0.974   50%
All-Diff    1.195   50%
Easy-Easy   0.933   50%
Diff-Diff   1.226   50%
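RMSE values from disjoint test subsets combine through their squared errors rather than through a plain mean. The sketch below shows that arithmetic for the All-Easy and All-Diff rows. It assumes that both halves contribute equally many test ratings, which the slides do not state (they report only user coverage), so the weights are hypothetical.

```python
from math import sqrt


def pooled_rmse(parts):
    """Combine per-subset RMSEs, weighting each by its number of test ratings.

    parts: list of (rmse, n_ratings) pairs.
    """
    total = sum(n for _, n in parts)
    return sqrt(sum(n * e ** 2 for e, n in parts) / total)


# RMSE values from the table; the equal weights are an assumption.
all_easy, all_diff = 0.974, 1.195
combined = pooled_rmse([(all_easy, 1), (all_diff, 1)])
print(round(combined, 3))
```

Under that equal-weight assumption the two rows pool to roughly the RMSE of the All row, which is what one would expect when the easy and difficult test users together make up the full test set.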

Page 21: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Results

Subset      RMSE    User coverage
All         1.090   100%
All-Easy    0.974   50% (easy)
All-Diff    1.195   50% (diff)
Easy-Easy   0.933   50% (easy)
Diff-Diff   1.226   50% (diff)

RMSE average: 1.084

Page 22: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Results

Subset      RMSE    User coverage
All         1.090   100%
All-Easy    0.974   50% (easy)
All-Diff    1.195   50% (diff)
Easy-Easy   0.933   50% (easy)
Diff-Diff   1.226   50% (diff)

RMSE average: 1.064

Page 23: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Experiments
EX1: Is rating coherence a good predictor for the Magic Barrier?
EX2: Can we use rating coherence to improve recommendation results? Especially when we do not have re-ratings/magic barrier?

Page 24: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Conclusions
• Rating coherence within an item's feature space is related to the magic barrier
• Rating coherence can be used to predict users' rating inconsistencies
• A user's rating inconsistencies tell us how well our system performs for that user, i.e. whether we need to optimize further

Page 25: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Thanks!

Questions?

More recommender systems: www.recsyswiki.com

Page 26: The Magic Barrier of Recommender Systems - No Magic, Just Ratings

Image credits
• Slide 1 - http://www.burdens.net.au/product/esf-barriers
• Slide 2 - http://www.freepicturesweb.com/pictures/2010-07/5881.html
• Slide 4 - http://redgannet.blogspot.nl/2010/08/moreletakloof-johannesburg-jnb.html
• Slide 5 - http://en.wikipedia.org/wiki/Standard_deviation#Interpretation_and_application
• Slide 9, 12, 23 - http://www.withautodesk.com/html/science-fair-ideas-for-physical-science-inventions.html
• Slide 13 - http://fineartamerica.com/featured/16-cogs-les-cunliffe.html
• Slide 24 - https://benchprep.com/blog/the-ap-english-exam-writing-compelling-conclusions/