media, data, context... and the holy grail of user taste prediction

MEDIA, DATA, CONTEXT... And The Holy Grail of User Taste Prediction Xavier Amatriain MAT, UCSB Santa Barbara, March '11

Upload: xavier-amatriain

Post on 26-Jan-2015


DESCRIPTION

Slides presented at UCSB on March 1st, 2011

TRANSCRIPT

Page 1: Media, data, context... and the Holy Grail of User Taste Prediction

MEDIA, DATA, CONTEXT... And The Holy Grail of User Taste Prediction

Xavier Amatriain

MAT, UCSB

Santa Barbara, March '11

Page 2: Media, data, context... and the Holy Grail of User Taste Prediction

But first...

About me and Telefonica

Page 3: Media, data, context... and the Holy Grail of User Taste Prediction

About me: up until 2005

Page 4: Media, data, context... and the Holy Grail of User Taste Prediction

About me: 2005 - 2007

Page 5: Media, data, context... and the Holy Grail of User Taste Prediction

About me: 2007 - ...

Page 6: Media, data, context... and the Holy Grail of User Taste Prediction

Telefonica is a fast-growing Telecom

1989
● Geographies: Spain
● Services: Basic telephone and data services
● Clients: About 12 million subscribers
● Staff: About 71,000 professionals
● Finances: Rev: 4,273 M€, EPS(1): 0.45 €

2000
● Geographies: Operations in 16 countries
● Services: Wireline and mobile voice, data and Internet services
● Clients: About 68 million customers
● Staff: About 149,000 professionals
● Finances: Rev: 28,485 M€, EPS(1): 0.67 €

2008
● Geographies: Operations in 25 countries
● Services: Integrated ICT solutions for all customers
● Clients: About 260 million customers
● Staff: About 257,000 professionals
● Finances: Rev: 57,946 M€, EPS: 1.63 €

(1) EPS: Earnings per share

Page 7: Media, data, context... and the Holy Grail of User Taste Prediction

Telco sector worldwide ranking by market cap (US$ bn)

Currently among the largest in the world

Source: Bloomberg, 06/12/09

Just announced 2010 results: record net earnings, first Spanish company ever to make > 10B €

Page 8: Media, data, context... and the Holy Grail of User Taste Prediction

Leader in South America

Total Accesses (as of March ‘09): 159.5 million

● Argentina: 20.9 million
● Brazil: 61.4 million
● Central America: 6.1 million
● Colombia: 12.6 million
● Chile: 10.1 million
● Ecuador: 3.3 million
● Mexico: 15.7 million
● Peru: 15.2 million
● Uruguay: 1.5 million
● Venezuela: 12.0 million

[Map: wireline and mobile market rank per country]

Notes:
- Central America includes Guatemala, Panama, El Salvador and Nicaragua
- Total accesses figure includes Narrowband Internet accesses of Terra Brasil and Terra Colombia, and Broadband Internet accesses of Terra Brasil, Telefónica de Argentina, Terra Guatemala and Terra México.

Page 9: Media, data, context... and the Holy Grail of User Taste Prediction

And a significant footprint in Europe

Total Accesses (as of March ’09): 93.8 million

● Spain: 47.2 million
● UK: 20.8 million
● Germany: 16.0 million
● Ireland: 1.7 million
● Czech Republic: 7.7 million
● Slovakia: 0.4 million

[Map: wireline and mobile market rank per country]

Page 10: Media, data, context... and the Holy Grail of User Taste Prediction

Scientific Research

Multimedia Core

Mobile and Ubicomp

DATA MINING

User Modelling & Data Mining

HCIR

Content Distribution & P2P Wireless Systems

Social Networks

Page 11: Media, data, context... and the Holy Grail of User Taste Prediction

Enough introductions...

Page 12: Media, data, context... and the Holy Grail of User Taste Prediction

Information Overload

Page 13: Media, data, context... and the Holy Grail of User Taste Prediction

More is Less

Less Decisions

Worse Decisions

Page 14: Media, data, context... and the Holy Grail of User Taste Prediction

Search engines don’t always hold the answer

Page 15: Media, data, context... and the Holy Grail of User Taste Prediction
Page 16: Media, data, context... and the Holy Grail of User Taste Prediction

What about discovery?

Page 17: Media, data, context... and the Holy Grail of User Taste Prediction

What about curiosity?

Page 18: Media, data, context... and the Holy Grail of User Taste Prediction

What about information to help make decisions?

Page 19: Media, data, context... and the Holy Grail of User Taste Prediction

The Age of Search has come to an end

● ... long live the Age of Recommendation!
● Chris Anderson in “The Long Tail”:
  ● “We are leaving the age of information and entering the age of recommendation”
● CNN Money, “The race to create a 'smart' Google”:
  ● “The Web, they say, is leaving the era of search and entering one of discovery. What's the difference? Search is what you do when you're looking for something. Discovery is when something wonderful that you didn't know existed, or didn't know how to ask for, finds you.”

Page 20: Media, data, context... and the Holy Grail of User Taste Prediction

But, what are Recommender Systems?

Read this!

Attend this conference!

Page 21: Media, data, context... and the Holy Grail of User Taste Prediction

The value of recommendations

● Netflix: 2/3 of the movies rented are recommended
● Google News: recommendations generate 38% more clickthrough
● Amazon: 35% sales from recommendations
● Choicestream: 28% of the people would buy more music if they found what they liked.

Page 22: Media, data, context... and the Holy Grail of User Taste Prediction

The “Recommender problem”

● Estimate a utility function that automatically predicts how much a user will like an item that is unknown to her, based on:
  ● Past behavior
  ● Relations to other users
  ● Item similarity
  ● Context
  ● ...

Page 23: Media, data, context... and the Holy Grail of User Taste Prediction

Data mining + all those other things

● User Interface
● System requirements (efficiency, scalability, privacy...)
● Business Logic
● Serendipity
● ...

Page 24: Media, data, context... and the Holy Grail of User Taste Prediction

The Netflix Prize

● 500K users x 17K movie titles = 100M ratings = $1M (if you “only” improve the existing system by 10%! From 0.95 to 0.85 RMSE)
● 49K contestants on 40K teams from 184 countries
● 41K valid submissions from 5K teams; 64 submissions per day
● Winning approach uses hundreds of predictors from several teams
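To make the prize target concrete, here is a minimal sketch of how RMSE and a relative improvement can be computed. The function names and the toy ratings are assumptions made for illustration; only the 0.95 and 0.85 figures come from the slide.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def relative_improvement(baseline_rmse, new_rmse):
    """Fraction by which new_rmse improves on baseline_rmse."""
    return (baseline_rmse - new_rmse) / baseline_rmse

# Toy data, invented for illustration (not taken from the Netflix dataset).
actual = [4, 3, 5, 2]
predicted = [3.5, 3.0, 4.0, 3.0]

print(rmse(predicted, actual))
# Going from 0.95 to 0.85 RMSE is a bit over a 10% relative improvement.
print(round(relative_improvement(0.95, 0.85), 3))
```

Because the errors are squared before averaging, a few large misses weigh more than many small ones, which is part of why the last fractions of a point were so hard to gain.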

Page 25: Media, data, context... and the Holy Grail of User Taste Prediction

Approaches to Recommendation

● Collaborative Filtering
  ● Recommend items based only on the users' past behavior
  ● User-based
    ● Find similar users to me and recommend what they liked
  ● Item-based
    ● Find similar items to those that I have previously liked
● Content-based
  ● Recommend based on features inherent to the items
● Social recommendations (trust-based)

Page 26: Media, data, context... and the Holy Grail of User Taste Prediction

What works

● It depends on the domain and particular problem
● As a general rule, it is usually a good idea to combine approaches: Hybrid Recommender Systems
● However, in the general case it has been demonstrated that (currently) the best isolated approach is CF.
  ● Item-based CF is in general more efficient and more accurate, but mixing CF approaches can improve results
  ● Other approaches can be hybridized to improve results in specific cases (cold-start problem...)

Page 27: Media, data, context... and the Holy Grail of User Taste Prediction


The CF Ingredients

● List of m Users and a list of n Items
● Each user has a list of items with associated opinion
  ● Explicit opinion - a rating score (numerical scale)
  ● Implicit feedback – purchase records or listening history
● Active user for whom the prediction task is performed
● A metric for measuring similarity between users
● A method for selecting a subset of neighbors
● A method for predicting a rating for items not rated by the active user.
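The ingredients above can be put together in a few lines. The following is a minimal sketch of user-based CF, assuming Pearson correlation as the similarity metric, top-k selection of positively correlated neighbors, and a mean-centered weighted average as the prediction method. These are common choices, not necessarily the ones used in the work presented here, and the user names and ratings are invented.

```python
import math

def pearson(ratings_a, ratings_b):
    """Pearson correlation over co-rated items (0.0 when undefined)."""
    common = set(ratings_a) & set(ratings_b)
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den_a = math.sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
    den_b = math.sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common))
    if den_a == 0 or den_b == 0:
        return 0.0
    return num / (den_a * den_b)

def predict(ratings, active_user, item, k=2):
    """Predict the active user's rating for an item they have not rated."""
    active = ratings[active_user]
    active_mean = sum(active.values()) / len(active)
    # Candidate neighbors: other users who rated the item and correlate positively.
    candidates = []
    for user, user_ratings in ratings.items():
        if user == active_user or item not in user_ratings:
            continue
        sim = pearson(active, user_ratings)
        if sim > 0:
            candidates.append((sim, user))
    neighbors = sorted(candidates, reverse=True)[:k]
    if not neighbors:
        return active_mean  # fall back to the user's own average
    num = 0.0
    den = 0.0
    for sim, user in neighbors:
        user_mean = sum(ratings[user].values()) / len(ratings[user])
        num += sim * (ratings[user][item] - user_mean)
        den += sim
    return active_mean + num / den

# Toy explicit ratings on a 1 to 5 scale (invented for illustration).
ratings = {
    "ann": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 5, "B": 3, "C": 4, "D": 5},
    "cat": {"A": 1, "B": 5, "C": 2, "D": 1},
}
print(predict(ratings, "ann", "D"))
```

Here "cat" correlates negatively with "ann" and is excluded, so the prediction is driven by "bob" alone: his rating of D sits 0.75 above his own average, and that offset is added to ann's average.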

Page 28: Media, data, context... and the Holy Grail of User Taste Prediction

But ...

Page 29: Media, data, context... and the Holy Grail of User Taste Prediction

User Feedback is Noisy

DID YOU HEAR WHAT I LIKE??!!

...and limits Our Prediction Accuracy

Page 30: Media, data, context... and the Holy Grail of User Taste Prediction

The Magic Barrier

● Magic Barrier = Limit on prediction accuracy due to noise in original data
● Natural Noise = involuntary noise introduced by users when giving feedback
  ● Due to (a) mistakes, and (b) lack of resolution in personal rating scale (e.g. on a 1 to 5 scale a 2 may mean the same as a 3 for some users and some items).
● Magic Barrier >= Natural Noise Threshold
  ● We cannot predict with less error than the resolution in the original data

Page 31: Media, data, context... and the Holy Grail of User Taste Prediction

Our related research questions

● Q1. Are users inconsistent when providing explicit feedback to Recommender Systems via the common Rating procedure?

● Q2. How large is the prediction error due to these inconsistencies?

● Q3. What factors affect user inconsistencies?

X. Amatriain, J.M. Pujol, N. Oliver (2009) "I like It... I like It Not: Measuring Users Ratings Noise in Recommender Systems", in UMAP 09

Page 32: Media, data, context... and the Holy Grail of User Taste Prediction

Experimental Setup

● 100 Movies selected from Netflix dataset doing a stratified random sampling on popularity

● Ratings on a 1 to 5 star scale
  ● Special “not seen” symbol.

● Trials 1 and 3 = random order; trial 2 = ordered by popularity

● 118 participants

Page 33: Media, data, context... and the Holy Grail of User Taste Prediction

User Feedback is Noisy

● Users are inconsistent
● Inconsistencies are not random and depend on many factors
  ● More inconsistencies for mild opinions
  ● More inconsistencies for negative opinions
  ● How the items are presented affects inconsistencies

Page 34: Media, data, context... and the Holy Grail of User Taste Prediction

Users' ratings are far from ground truth

In pairwise comparisons between trials, RMSE is already > 0.55 or > 0.69 in the best case (the Netflix Prize target was to get below 0.85!)
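A minimal sketch of this kind of pairwise comparison for a single participant follows. It assumes each trial is a mapping from movie to rating, that the "not seen" answer is stored as `None`, and that only movies rated in both trials are compared. The data is invented, and the study itself aggregated over all participants.

```python
import math

NOT_SEEN = None  # hypothetical marker for the "not seen" answer

def trial_rmse(trial_a, trial_b):
    """RMSE between two rating trials, over items rated in both trials."""
    common = [
        item for item in trial_a
        if item in trial_b
        and trial_a[item] is not NOT_SEEN
        and trial_b[item] is not NOT_SEEN
    ]
    if not common:
        return None  # nothing to compare
    total = sum((trial_a[item] - trial_b[item]) ** 2 for item in common)
    return math.sqrt(total / len(common))

# Toy ratings from one participant in two trials (invented for illustration).
trial_1 = {"m1": 4, "m2": 3, "m3": NOT_SEEN, "m4": 5}
trial_2 = {"m1": 4, "m2": 2, "m3": 3, "m4": 4}
print(trial_rmse(trial_1, trial_2))
```

Any error measured this way comes from the user alone, with no predictive algorithm involved, which is why it can be read as a floor on achievable prediction error.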

Page 35: Media, data, context... and the Holy Grail of User Taste Prediction

Rate it Again

● Given that users are noisy… can we benefit from asking to rate the same movie more than once?

● We propose an algorithm to allow for multiple ratings of the same <user,item> tuple.
● The algorithm is subject to two fairness conditions:
  – Algorithm should remove as few ratings as possible (i.e. only when there is some certainty that the rating is only adding noise)
  – Algorithm should not make up new ratings but decide on which of the existing ones are valid (no averaging, predicting...)

X. Amatriain, J.M. Pujol, N. Tintarev, N. Oliver (2009) "Rate it Again: Increasing Recommendation Accuracy by User re-Rating", 2009 ACM RecSys

Page 36: Media, data, context... and the Holy Grail of User Taste Prediction

Re-rating Algorithm

• One-source re-rating case:

• Given the following milding function:

Examples:

{3, 1} → Ø
{4} → 4
{3, 4} → 3
(2 source) {3, 4, 5} → 3
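The milding function itself did not survive in this transcript, so the following is only one possible reading of the one-source case that reproduces the first three examples and respects the two fairness conditions. It assumes a 1 to 5 scale with midpoint 3, assumes ratings more than one star apart are both discarded, and keeps the existing rating closest to the midpoint otherwise. The function name, `midpoint` and `max_gap` are hypothetical, and the two-source case is not covered.

```python
def denoise_one_source(ratings, midpoint=3, max_gap=1):
    """Return the rating to keep, or None if the ratings should be dropped.

    ratings: one or two ratings of the same <user, item> pair.
    midpoint and max_gap are assumptions, not values taken from the slides.
    """
    if len(ratings) == 1:
        return ratings[0]  # a single rating has nothing to disagree with
    first, second = ratings
    if abs(first - second) > max_gap:
        return None  # too inconsistent: treat both ratings as noise
    # Consistent enough: keep the existing rating closest to the midpoint.
    # No averaging, so no new rating value is ever made up.
    return min(first, second, key=lambda r: abs(r - midpoint))

print(denoise_one_source([3, 1]))  # dropped
print(denoise_one_source([4]))     # kept as is
print(denoise_one_source([3, 4]))  # milder rating kept
```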

Page 37: Media, data, context... and the Holy Grail of User Taste Prediction

Results

Page 38: Media, data, context... and the Holy Grail of User Taste Prediction

Rate it again

● By asking users to rate items again we can remove noise in the dataset
  ● Improvements of up to 14% in accuracy!
● Because we don't want all users to re-rate all items we design ways to do partial denoising
  ● Data-dependent: only denoise extreme ratings
  ● User-dependent: detect “noisy” users

Page 39: Media, data, context... and the Holy Grail of User Taste Prediction

The value of a re-rating

Adding new ratings increases performance of the CF algorithm

Page 40: Media, data, context... and the Holy Grail of User Taste Prediction

The value of a re-rating

But you are better off doing re-rating than new ratings !!

Page 41: Media, data, context... and the Holy Grail of User Taste Prediction

The value of a re-rating

And much better if you know which ratings to re-rate!!

Page 42: Media, data, context... and the Holy Grail of User Taste Prediction

Let's recap

● Users are inconsistent
● Inconsistencies can depend on many things including how the items are presented
● Inconsistencies produce natural noise
● Natural noise reduces our prediction accuracy independently of the algorithm
● By asking users to rate items again we can remove noise and improve accuracy

Page 43: Media, data, context... and the Holy Grail of User Taste Prediction

But Crowds are not always wise

● Diversity of opinion

● Independence

● Decentralization

● Aggregation

Conditions that are needed to guarantee the Wisdom in a Crowd

Page 44: Media, data, context... and the Holy Grail of User Taste Prediction

Who can we trust?

Page 45: Media, data, context... and the Holy Grail of User Taste Prediction

Crowds are not always wise

vs.

Who won?

Page 46: Media, data, context... and the Holy Grail of User Taste Prediction

“It is really only experts who can reliably account for their reactions”

Page 47: Media, data, context... and the Holy Grail of User Taste Prediction

The Wisdom of the Few

X. Amatriain et al. "The wisdom of the few: a collaborative filtering approach based on expert opinions from the web", SIGIR '09

Page 48: Media, data, context... and the Holy Grail of User Taste Prediction

Expert-based CF

● expert = individual that we can trust to have produced thoughtful, consistent and reliable evaluations (ratings) of items in a given domain

● Expert-based Collaborative Filtering
  ● Find neighbors from a reduced set of experts instead of regular users.

1. Identify domain experts with reliable ratings

2. For each user, compute “expert neighbors”

3. Compute recommendations similar to standard kNN CF
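Steps 2 and 3 can be sketched as follows, taking step 1 as given (the `experts` mapping is assumed to already hold trusted raters). The similarity measure, the thresholds `delta` (minimum similarity) and `tau` (minimum number of expert neighbors who rated the item), and all names and ratings are assumptions made for illustration rather than the exact formulation of the cited paper.

```python
def similarity(user_ratings, expert_ratings, scale_range=4.0):
    """Agreement in [0, 1] over co-rated items (an assumed, simple measure)."""
    common = set(user_ratings) & set(expert_ratings)
    if not common:
        return 0.0
    mean_abs_diff = sum(
        abs(user_ratings[i] - expert_ratings[i]) for i in common
    ) / len(common)
    return 1.0 - mean_abs_diff / scale_range

def expert_neighbors(user_ratings, experts, delta=0.7):
    """Step 2: keep the experts whose similarity to the user is at least delta."""
    scored = ((name, similarity(user_ratings, r)) for name, r in experts.items())
    return {name: sim for name, sim in scored if sim >= delta}

def predict(item, neighbors, experts, tau=2):
    """Step 3: similarity-weighted mean of expert-neighbor ratings, or None."""
    raters = [
        (sim, experts[name][item])
        for name, sim in neighbors.items()
        if item in experts[name]
    ]
    weight = sum(sim for sim, _ in raters)
    if len(raters) < tau or weight == 0:
        return None  # not enough trusted evidence for this item
    return sum(sim * rating for sim, rating in raters) / weight

# Toy data on a 1 to 5 scale (invented for illustration).
experts = {
    "critic_a": {"X": 5, "Y": 2, "Z": 4},
    "critic_b": {"X": 4, "Y": 2, "Z": 5},
    "critic_c": {"X": 1, "Y": 5, "Z": 1},
}
user = {"X": 5, "Y": 2}
neighbors = expert_neighbors(user, experts)
print(sorted(neighbors))
print(predict("Z", neighbors, experts))
```

Only the user's own ratings and the small expert dataset are needed, so this computation could run on the user's device.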

Page 49: Media, data, context... and the Holy Grail of User Taste Prediction

User Study

● 57 participants, only 14.5 ratings/participant

● 50% of the users consider Expert-based CF to be good or very good

● Expert-based CF: only algorithm with an average rating over 3 (on a 0-4 scale)

Page 50: Media, data, context... and the Holy Grail of User Taste Prediction

Advantages of the Approach

● Noise
  ● Experts introduce less natural noise
● Malicious Ratings
  ● Dataset can be monitored to avoid shilling
● Data Sparsity
  ● Reduced set of domain experts can be motivated to rate items
● Cold Start problem
  ● Experts rate items as soon as they are available
● Scalability
  ● Dataset is several orders of magnitude smaller
● Privacy
  ● Recommendations can be computed locally

Page 51: Media, data, context... and the Holy Grail of User Taste Prediction

Architecture of the approach

Page 52: Media, data, context... and the Holy Grail of User Taste Prediction

Some implementations

● A distributed Music Recommendation engine

J. Ahn and X. Amatriain et al. "Towards Fully Distributed and Privacy-preserving Recommendations via Expert Collaborative Filtering and RESTful Linked Data", Web Intelligence '10

Page 53: Media, data, context... and the Holy Grail of User Taste Prediction

Expert Music Recommendations

Powered by...

Page 54: Media, data, context... and the Holy Grail of User Taste Prediction

Some implementations (II)

● A geo-localized Mobile Movie Recommender iPhone App

J. Bachs and X. Amatriain et al. "Geolocated Movie Recommendations based on Expert Collaborative Filtering", Recsys '10

Page 55: Media, data, context... and the Holy Grail of User Taste Prediction

Geo-localized Expert Movie Recommendations

Powered by...

Page 56: Media, data, context... and the Holy Grail of User Taste Prediction

Context Overload

Page 57: Media, data, context... and the Holy Grail of User Taste Prediction

Page 58: Media, data, context... and the Holy Grail of User Taste Prediction

Mobile phones are “personal”

Page 59: Media, data, context... and the Holy Grail of User Taste Prediction

Mobile users tend to seek “fresh” content

Page 60: Media, data, context... and the Holy Grail of User Taste Prediction

Where is the nearest florist?

Page 61: Media, data, context... and the Holy Grail of User Taste Prediction

Where is that really cool cocktail bar I went to last month?

Page 62: Media, data, context... and the Holy Grail of User Taste Prediction

Interesting things close to me?

Page 63: Media, data, context... and the Holy Grail of User Taste Prediction

Events near me?

Page 64: Media, data, context... and the Holy Grail of User Taste Prediction

Lost or in an unfamiliar place?

Page 65: Media, data, context... and the Holy Grail of User Taste Prediction

Context-aware Recommendations

● A clear area of research and interest for companies: recommend me something that I like and is relevant in my current context.
● Context = any variable that adds a new dimension to the 2D user-item problem (e.g. time, geolocation, weather...)

Page 66: Media, data, context... and the Holy Grail of User Taste Prediction

User micro-profiles

● Our proposal is to represent a user by a hierarchy of micro-profiles where each micro-profile represents a class in the context variable

L. Baltrunas, X. Amatriain "Towards Time-Dependant Recommendation based on Implicit Feedback", in CARS (Context-aware Recommender Systems Workshop) Recsys '09
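As a rough illustration of the micro-profile idea, the sketch below splits one user's implicit feedback into one profile per time-of-day class. The four-way partition of the day, the event format and the item names are assumptions, and the profile is a single flat level rather than the hierarchy described above.

```python
from collections import Counter, defaultdict

def time_slot(hour):
    """Map an hour (0-23) to a coarse time-of-day class (assumed partition)."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 24:
        return "evening"
    return "night"

def build_micro_profiles(events):
    """Split (hour, item) feedback events into one play-count profile per slot."""
    profiles = defaultdict(Counter)
    for hour, item in events:
        profiles[time_slot(hour)][item] += 1
    return profiles

# Toy listening history for one user (invented for illustration).
events = [
    (8, "news podcast"), (9, "news podcast"),
    (20, "jazz"), (21, "jazz"), (22, "ambient"),
]
profiles = build_micro_profiles(events)
print(profiles["morning"].most_common(1))
print(profiles["evening"].most_common(1))
```

A recommender can then pick the micro-profile that matches the current context instead of treating the user's whole history as one undifferentiated profile.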

Page 67: Media, data, context... and the Holy Grail of User Taste Prediction

Multiverse Recommendation

● A different approach: represent the contextual recommendation problem by n-dimensional matrices (aka Tensors)

A. Karatzoglou, X. Amatriain, L. Baltrunas, N. Oliver "Multiverse Recommendation: N-dimensional Tensor Factorization for Context-aware Collaborative Filtering", 2010 ACM Recsys Conference
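To show what adding a context dimension means in practice, here is a small sketch in which each user, item and context value has a latent factor vector and a score is the sum over factors of their three-way product. This is a simple CP-style factorization chosen for brevity; it is not claimed to be the model of the cited paper, and all names and factor values are invented (in a real system they would be learned from observed ratings).

```python
# Latent factors (two per entity), invented for illustration.
user_factors = {"ann": [1.0, 0.5]}
item_factors = {"comedy": [2.0, 0.0], "thriller": [0.0, 4.0]}
context_factors = {"weekday": [1.0, 0.5], "weekend": [0.5, 2.0]}

def score(user, item, context):
    """Predicted entry of the user x item x context tensor."""
    u = user_factors[user]
    m = item_factors[item]
    c = context_factors[context]
    return sum(uf * mf * cf for uf, mf, cf in zip(u, m, c))

# The same user-item pair gets a different score in a different context.
print(score("ann", "thriller", "weekday"))
print(score("ann", "thriller", "weekend"))
print(score("ann", "comedy", "weekday"))
```

With only users and items the model could store a single score per pair; the extra dimension lets the same pair score differently depending on the context.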

Page 68: Media, data, context... and the Holy Grail of User Taste Prediction

Master Planner

Automatic and personalized tourist route recommendations, a new approach to discovering the world

Page 69: Media, data, context... and the Holy Grail of User Taste Prediction

Tourism 2.0

● Tourism is not the same since the web appeared:
  – People search online for information on where to go (reading blogs, in their social networks...)
  – People buy tickets and hotel packages online
  – People post pictures and discuss tips online

Page 70: Media, data, context... and the Holy Grail of User Taste Prediction

Tourism 3.0 – Going Mobile

● The mobile web and smartphones are introducing yet another revolution

● Tourists can now access information on the go:
  – Looking for information on a sight
  – Tips on where to go next
  – Information about the weather
  – ...

N. Tintarev, A. Flores, X. Amatriain (2010) "Off the beaten track - a mobile field study exploring the long tail of mobile tourist recommendations", 2010 Mobile HCI

Page 71: Media, data, context... and the Holy Grail of User Taste Prediction

Master Planner

● I am in SB, it's March and sunny, I have 6 hours to visit things and I am interested in music, art, literature, and sports

● I need: An automatic tourist route recommender system

Page 72: Media, data, context... and the Holy Grail of User Taste Prediction

Master Planner

● Completely automatic personalized/contextualized tourist recommender system

● Generates automatic city models using web resources

● Generates automatic user models from regular user profiles

● Personalizes/contextualizes generic city models

● Recommends optimized personalized routes taking into account constraints using AI techniques
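The last step can be illustrated with a deliberately simple stand-in: a greedy selection of points of interest under a time budget. This is not the AI technique used in Master Planner. It ignores travel time, opening hours and weather, and the places, tags and durations are invented; only the six-hour budget and the interest list echo the earlier example.

```python
def plan_route(pois, interests, budget_minutes):
    """Greedily pick points of interest by interest score per minute."""
    scored = []
    for name, minutes, tags in pois:
        matches = len(interests & tags)  # how many user interests the place covers
        if matches > 0:
            scored.append((matches / minutes, name, minutes))
    route = []
    used = 0
    # Visit the best value-per-minute places first, while they still fit.
    for _, name, minutes in sorted(scored, reverse=True):
        if used + minutes <= budget_minutes:
            route.append(name)
            used += minutes
    return route, used

# Hypothetical city model: (name, visit duration in minutes, tags).
pois = [
    ("art museum", 120, {"art"}),
    ("concert hall tour", 60, {"music", "art"}),
    ("stadium", 180, {"sports"}),
    ("botanical garden", 90, {"nature"}),
    ("bookshop walk", 60, {"literature"}),
]
interests = {"music", "art", "literature", "sports"}
route, used = plan_route(pois, interests, budget_minutes=6 * 60)
print(route, used)
```

A greedy pass like this can miss better combinations, which is one reason a real planner would treat route building as a constrained optimization problem.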

Page 73: Media, data, context... and the Holy Grail of User Taste Prediction

Summary

➢ We need to build tools and approaches to help people navigate the abundance of media and information

➢ Recommender systems can help by leveraging the wisdom of the crowds

➢ But...
  ➢ User feedback is not always our ground truth
  ➢ Crowds are not always wise and we are better off using experts
  ➢ Context is becoming part of the content itself

Page 74: Media, data, context... and the Holy Grail of User Taste Prediction

Co-authors

● Josep M. Pujol and Nuria Oliver (Telefonica) worked on Natural Noise and Wisdom of the Few projects

● Neal Lathia (UCL, London), Haewook Ahn (KAIST, Korea), Jaewook Ahn (Pittsburgh Univ.), and Josep Bachs (UPF, Barcelona) worked on Wisdom of the Few

● Linas Baltrunas (Bolzano U., Italy), Alexandros Karatzoglou, Paulo Villegas, Toni Cebrian (Telefonica) worked on contextual recommendations

● Miquel Ramirez (UPF, Barcelona) and Nava Tintarev (Telefonica) worked on Tourist Recommendations.

Page 75: Media, data, context... and the Holy Grail of User Taste Prediction

Conclusions

➢ Whether you are an engineer, an artist or a scientist (or all of the above), it is important to keep the “user” in mind
  ➢ Who are my “users”? (end-user, public, other scientists, a grant agency...)
  ➢ How will the output of my work affect users?
  ➢ How can I obtain feedback from them?
  ➢ How can I use it?
  ➢ ...

Page 76: Media, data, context... and the Holy Grail of User Taste Prediction

Thanks!

Questions?

Xavier [email protected]

http://xavier.amatriain.net

http://technocalifornia.blogspot.com

@xamat