Recommender Systems, Session B
Robin Burke
DePaul University, Chicago, IL


Page 1: Recommender Systems Session B Robin Burke DePaul University Chicago, IL

Recommender Systems
Session B

Robin Burke

DePaul University

Chicago, IL

Page 2:

Roadmap

Session A: Basic Techniques I
– Introduction
– Knowledge Sources
– Recommendation Types
– Collaborative Recommendation

Session B: Basic Techniques II
– Content-based Recommendation
– Knowledge-based Recommendation

Session C: Domains and Implementation I
– Recommendation domains
– Example Implementation
– Lab I

Session D: Evaluation I
– Evaluation

Session E: Applications
– User Interaction
– Web Personalization

Session F: Implementation II
– Lab II

Session G: Hybrid Recommendation
Session H: Robustness
Session I: Advanced Topics
– Dynamics
– Beyond accuracy

Page 3:

Content-Based Recommendation

Collaborative recommendation
– requires only ratings

Content-based recommendation
– all techniques that use properties of the items themselves
– usually refers to techniques that only use item features

Knowledge-based recommendation
– a sub-type of content-based
– in which we apply knowledge about items and how they satisfy user needs

Page 4:

Content-Based Profiling

Suppose we have no other users
– but we know about the features of the items rated by the user

We can imagine building a profile based on user preferences
– here are the kinds of things the user likes
– here are the ones he doesn't like

Usually called content-based recommendation

Page 5:

Recommendation Knowledge Sources Taxonomy

[Diagram: taxonomy of recommendation knowledge sources. Recommendation Knowledge divides into Collaborative (Opinion Profiles, Demographic Profiles), User (Opinions, Demographics, Requirements: Query, Constraints, Preferences, Context), and Content (Item Features, Domain Knowledge: Means-ends, Domain Constraints, Contextual Knowledge, Feature Ontology).]

Page 6:

Content-based Profiling

[Diagram: each item is represented by a feature vector (a1, a2, a3, a4, ..., ak). Obtain rated items, build a classifier from them, then predict Y/N for unrated items to find relevant ones, and recommend those predicted Y.]

Page 7:

Origins

Began with earliest forms of user models
– Grundy (Rich, 1979)

Elaborated in information filtering
– Selecting news articles (Dumais, 1990)

More recently spam filtering

Page 8:

Basic Idea

Record user ratings for items
Generate a model of user preferences over features
Give as recommendations other items with similar content
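The three steps above can be sketched in a few lines; the items, features, and ratings here are hypothetical, and the profile is built as a simple rating-weighted sum of feature vectors.

```python
# Content-based recommendation, a minimal sketch of the three steps above:
# record ratings, build a preference model over features, recommend
# similar items. Items and ratings are made up for illustration.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# item -> binary feature vector (e.g., genre indicators)
items = {
    "movie1": [1, 0, 1, 0],
    "movie2": [1, 1, 0, 0],
    "movie3": [0, 0, 1, 1],
    "movie4": [1, 0, 1, 1],
}

# step 1: recorded ratings (+1 liked, -1 disliked)
ratings = {"movie1": 1, "movie2": -1}

# step 2: profile = rating-weighted sum of the rated items' feature vectors
profile = [sum(ratings[i] * items[i][k] for i in ratings) for k in range(4)]

# step 3: rank unrated items by similarity of their content to the profile
unrated = [i for i in items if i not in ratings]
recs = sorted(unrated, key=lambda i: cosine(profile, items[i]), reverse=True)
print(recs)  # ['movie3', 'movie4']
```

A real system would use many more features (terms, genres, actors) and often a learned classifier instead of a single profile vector, as the following slides describe.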

Page 9:

Movie Recommendation

Predictions for unseen (target) items are computed based on their similarity (in terms of content) to items in the user profile.

E.g., user profile Pu contains … ; recommend highly: … and recommend “mildly”: …
[The slide showed example movies as images.]

Page 10:

Content-Based Recommender Systems

Page 11:

Personalized Search

How can the search engine determine the “user’s context”?

Query: “Madonna and Child”

Need to “learn” the user profile:
– User is an art historian?
– User is a pop music fan?

Page 12:

Play List Generation

Music recommendations
Configuration problem
– Must take into account other items already in the list

Example: Pandora

Page 13:

Algorithms

kNN
Naive Bayes
Neural networks
Any classification technique can be used

Page 14:

Naive Bayes

p(A) = probability of event A
p(A,B) = probability of event A and event B
– joint probability
p(A|B) = probability of event A given event B
– we know B happened
– conditional probability

Example
– A is the event of a student getting an "A" grade
  p(A) = 20%
– B is the event of a student coming to less than 50% of meetings
  p(A|B) is much less than 20%
  p(A,B) would be the probability of both things
  – how many students are in this category?

Recommender system question
– Li is the event that the user likes item i
– B is the set of features associated with item i

Estimate p(Li|B)

Page 15:

Bayes Rule

p(A|B) = p(B|A) p(A) / p(B)

We can always restate a conditional probability in terms of
– the reverse condition p(B|A)
– and two prior probabilities
  p(A)
  p(B)

Often the reverse condition is easier to know
– we can count how often a feature appears in items the user liked
– frequentist assumption

Page 16:

Naive Bayes

Probability of liking an item given its features
– p(Li|a1, a2, ..., ak)
– think of Li as the class for item i

By the theorem:
p(Li|a1, ..., ak) = p(a1, ..., ak|Li) p(Li) / p(a1, ..., ak)

Page 17:

Naive Assumption

Independence
– the features a1, a2, ..., ak are independent
– independent means p(A,B) = p(A)p(B)

Example
– two coin flips: P(heads) = 0.5
– P(heads,heads) = 0.25

Anti-example
– appearance of the words "Recommendation" and "Collaborative" in papers by Robin Burke
  P("Recommendation") = 0.6
  P("Collaborative") = 0.3
  P("Recommendation","Collaborative") = 0.3, not 0.18

In general
– this assumption is false for items and their features
– but pretending it is true works well

Page 18:

Naive Assumption

For the joint probability:
p(a1, ..., ak|Li) = Πj p(aj|Li)

For the conditional probability, by Bayes' Rule with the naive assumption:
p(Li|a1, ..., ak) = p(Li) Πj p(aj|Li) / Πj p(aj)

Page 19:

Frequency Table

Iterate through all examples
– if example is "liked"
  for each feature a
  – add one to the cell for that feature under L
– similar for ~L

      L    ~L
a1
a2
...
ak

Page 20:

Example

Total # of movies: 20
– 10 liked
– 10 not liked

Feature   Liked   P(f|L)   ~Liked   P(f|~L)   P(f)
Ford      5       0.5      2        0.2       0.35
Pitt      2       0.2      4        0.4       0.3
Willis    1       0.1      4        0.4       0.25

Page 21:

Classification MAP

Maximum a posteriori
– calculate the probabilities for each possible classification
– pick the one with the highest probability

Examples
– "12 Monkeys" = Pitt && Willis
  p(L|12 Monkeys) = 0.13
  p(~L|12 Monkeys) = 1
  not liked
– "Devil's Own" = Ford && Pitt
  p(L|Devil's Own) = 0.67
  p(~L|Devil's Own) = 0.53
  liked

Page 22:

Classification LL

Log likelihood
– for two possibilities
– calculate probabilities
– compute ln( p(Li|a1, ..., ak) / p(~Li|a1, ..., ak) )
– if > 0, then classify as liked

Examples
– "12 Monkeys" = Pitt && Willis
  ratio = 0.13
  ln = -2.1
  not liked
– "Devil's Own" = Ford && Pitt
  p(L|Devil's Own) = 0.67
  p(~L|Devil's Own) = 0.53
  ratio = 1.25
  ln = 0.22
  liked
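The log-likelihood classification can be reproduced directly from the frequency-table counts. This is a minimal sketch of the worked example (20 movies: 10 liked, 10 not liked); with equal priors, the prior ratio and the shared denominator p(a1, ..., ak) cancel out of the ratio.

```python
# Naive Bayes log-likelihood classification over the Ford / Pitt / Willis
# counts from the frequency-table slide.
import math

n_liked, n_not_liked = 10, 10

# feature -> (count among liked, count among not liked)
counts = {
    "Ford":   (5, 2),
    "Pitt":   (2, 4),
    "Willis": (1, 4),
}

def log_likelihood_ratio(features):
    """ln( p(L|a1..ak) / p(~L|a1..ak) ); equal priors cancel here."""
    ll = math.log(n_liked / n_not_liked)  # ln of the prior ratio (0 here)
    for f in features:
        liked, not_liked = counts[f]
        ll += math.log((liked / n_liked) / (not_liked / n_not_liked))
    return ll

def classify(features):
    return "liked" if log_likelihood_ratio(features) > 0 else "not liked"

print(round(log_likelihood_ratio(["Pitt", "Willis"]), 2))  # -2.08 (slide rounds to -2.1)
print(classify(["Pitt", "Willis"]))                        # not liked
print(round(log_likelihood_ratio(["Ford", "Pitt"]), 2))    # 0.22
print(classify(["Ford", "Pitt"]))                          # liked
```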

Page 23:

Smoothing

If a feature never appears in a class
– p(aj|L) = 0
– that means it will always veto the classification

Example
– a new movie director
– cannot be classified as "liked" because there are no liked instances in which he is a feature

Solution
– Laplace smoothing
  add a small constant to all counts before starting
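Laplace smoothing can be sketched as a one-line estimator: add a constant (typically 1) to every count so that an unseen feature never produces a zero probability. The counts and vocabulary size below are hypothetical.

```python
# Laplace (add-one) smoothed estimate of p(feature|class).
def smoothed_prob(feature_count, class_total, vocab_size, alpha=1.0):
    return (feature_count + alpha) / (class_total + alpha * vocab_size)

# A director appearing in 0 of 10 liked movies, with 3 features in the vocabulary:
p = smoothed_prob(0, 10, 3)
print(round(p, 3))  # 0.077 instead of 0, so the feature no longer vetoes "liked"
```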

Page 24:

Naive Bayes

Works surprisingly well
– used in spam filtering

Simple implementation
– just counting and multiplying
– requires O(F) space, where F is the size of the feature set
– easy to update the profile
– classification is very fast

Learned classifier can be hard-coded
– used in voice recognition and computer games

Try this first

Page 25:

Neural NetworksNeural Networks

Page 26:

Biological inspiration

[Diagram: a neuron with axon, dendrites, and synapses.]

The information transmission happens at the synapses.

Page 27:

How it works

Source (pre-synaptic)
– tiny voltage spikes travel along the axon
– neurotransmitter is released into the synapse

Destination (post-synaptic)
– neurotransmitter absorbed by dendrites
– causes excitation or inhibition
– signals integrated; may produce spikes in the next neuron

Connections
– synaptic connections can be strong or weak

Page 28:

Artificial neurons

Neurons work by processing information. They receive and provide information in the form of voltage spikes.

The McCulloch-Pitts model

[Diagram: inputs x1, x2, ..., xn with weights w1, w2, ..., wn feeding a single unit with output y.]

z = Σi wi xi ;  y = H(z)

where H is the threshold (step) function.

Page 29:

Artificial neurons

Nonlinear generalization of the McCulloch-Pitts neuron:

y = f(x, w)

y is the neuron’s output, x is the vector of inputs, and w is the vector of synaptic weights.

Examples:
– sigmoidal neuron: y = 1 / (1 + exp(-w·x))
– Gaussian neuron: y = exp(-||x - w||² / (2σ²))
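The sigmoidal neuron above can be sketched in a few lines; the inputs and weights here are made-up values for illustration.

```python
# A sigmoidal neuron: weighted sum of inputs passed through the logistic
# nonlinearity, the smooth generalization of the McCulloch-Pitts unit.
import math

def sigmoid_neuron(x, w, b=0.0):
    """y = 1 / (1 + exp(-(w.x + b)))"""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

y = sigmoid_neuron([1.0, 0.0, 1.0], [0.5, -0.3, 0.2])
print(round(y, 3))  # 0.668
```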

Page 30:

Artificial neural networks

[Diagram: inputs feeding a layered network of neurons that produces an output.]

An artificial neural network is composed of many artificial neurons that are linked together according to a specific network architecture. The objective of the neural network is to transform the inputs into meaningful outputs.

Page 31:

Learning with Back-Propagation

Biological systems
– seem to modify many synaptic connections simultaneously
– we still don't totally understand this

A simplification of the learning problem:
– first calculate the changes for the synaptic weights of the output neuron
– then calculate the changes backward, starting from layer p-1, propagating the local error terms backward

Still relatively complicated
– but much simpler than the original optimization problem

Page 32:

Application to Recommender Systems

Inputs
– features of products
– binary features work best
  otherwise tricky encoding is required

Output
– liked / disliked neurons

Page 33:

NN Recommender

Calculate the recommendation score as y_liked - y_disliked

[Diagram: item features feed the network; two output neurons, Liked and Disliked.]
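The scoring scheme can be sketched with a tiny two-output model. The weights below are hypothetical; a real system would learn them with back-propagation from the user's ratings.

```python
# Two output neurons ("liked", "disliked") over binary item features;
# recommendation score = y_liked - y_disliked, as on the slide.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(features, w_liked, w_disliked):
    y_liked = sigmoid(sum(w * f for w, f in zip(w_liked, features)))
    y_disliked = sigmoid(sum(w * f for w, f in zip(w_disliked, features)))
    return y_liked - y_disliked

# Features: [Ford, Pitt, Willis] as 0/1 indicators (hypothetical weights)
w_liked = [1.2, -0.4, -1.0]
w_disliked = [-0.8, 0.6, 0.9]

print(score([1, 0, 0], w_liked, w_disliked) > 0)  # True: a Ford movie scores positively
print(score([0, 1, 1], w_liked, w_disliked) > 0)  # False: Pitt + Willis scores negatively
```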

Page 34:

Issues with ANN

Often many iterations are needed
– 1000s or even millions

Overfitting can be a serious problem

No way to diagnose or debug the network
– must relearn

Designing the network is an art
– input and output coding
– layering
– often learning simply fails
  the system never converges

Stability vs plasticity
– learning is usually one-shot
– cannot easily restart learning with new data
– (actually, many learning techniques have this problem)

Page 35:

Overfitting

The problem of training a learner too much
– the learner continues to improve on the training data
– but gets worse on the real task

Page 36:

Other classification techniques

Lots of other classification techniques have been applied to this problem
– support vector machines
– fuzzy sets
– decision trees

Essentials are the same
– learn a decision rule over the item features
– apply the rule to new items

Page 37:

Content-Based Recommendation

Advantages:
– useful for large information-based sites (e.g., portals) or for domains where items have content-rich features
– can be easily integrated with “content servers”

Disadvantages:
– may miss important pragmatic relationships among items (based on usage)
  avant-garde jazz / classical
– not effective in small, specific sites or sites which are not content-oriented
– cannot achieve serendipity (novel connections)

Page 38:

Break

10 minutes

Page 39:

Roadmap

Session A: Basic Techniques I
– Introduction
– Knowledge Sources
– Recommendation Types
– Collaborative Recommendation

Session B: Basic Techniques II
– Content-based Recommendation
– Knowledge-based Recommendation

Session C: Domains and Implementation I
– Recommendation domains
– Example Implementation
– Lab I

Session D: Evaluation I
– Evaluation

Session E: Applications
– User Interaction
– Web Personalization

Session F: Implementation II
– Lab II

Session G: Hybrid Recommendation
Session H: Robustness
Session I: Advanced Topics
– Dynamics
– Beyond accuracy

Page 40:

Knowledge-Based Recommendation

Sub-type of content-based
– we use the features of the items

Covers other kinds of knowledge, too
– means-ends knowledge
  how products satisfy user needs
– ontological knowledge
  what counts as similar in the product domain
– constraints
  what is possible in the domain and why

Page 41:

Recommendation Knowledge Sources Taxonomy

[Diagram: the same taxonomy as on Page 5. Recommendation Knowledge divides into Collaborative (Opinion Profiles, Demographic Profiles), User (Opinions, Demographics, Requirements: Query, Constraints, Preferences, Context), and Content (Item Features, Domain Knowledge: Means-ends, Domain Constraints, Contextual Knowledge, Feature Ontology).]

Page 42:

Diverse Possibilities

Utility
– some systems concentrate on representing the user's constraints in the form of utility functions

Similarity
– some systems focus on detailed knowledge-based similarity calculations

Interactivity
– some systems use knowledge to enhance the collection of requirement information

For our purposes
– concentrate on case-based recommendation and constraint-based recommendation

Page 43:

Case-Based Recommendation

Based on ideas from case-based reasoning (CBR)
– an alternative to rule-based problem-solving

“A case-based reasoner solves new problems by adapting solutions used to solve old problems”
-- Riesbeck & Schank, 1987

Page 44:

CBR Solving Problems

[Diagram: the CBR cycle. A New Problem is matched against a Database of cases to Retrieve a Similar case; its Solution is Adapted, Reviewed, and Retained back into the database.]

Page 45:

CBR System Components

Case-base
– database of previous cases (experience)
– episodic memory

Retrieval of relevant cases
– index for cases in the library
– matching the most similar case(s)
– retrieving the solution(s) from these case(s)

Adaptation of solution
– alter the retrieved solution(s) to reflect differences between the new case and the retrieved case(s)

Page 46:

Retrieval knowledge

Contents
– features used to index cases
– relative importance of features
– what counts as “similar”

Issues
– “surface” vs “deep” similarity

Page 47:

Analogy to the catalog

Problem
– user need

Case
– product

Retrieval
– recommendation

Page 48:

Entree I

Page 49:

Entree II

Page 50:

Entree III

Page 51:

Critiquing Dialog

Mixed-initiative interaction
– user offers input
– system responds with possibilities
– user critiques or offers additional input

Makes preference elicitation gradual
– rather than all-at-once with a query
– can guide the user away from “empty” parts of the product space

Page 52:

CBR retrieval

Knowledge-based nearest-neighbor
– a similarity metric defines the distance between cases
– usually on an attribute-by-attribute basis

Entree
– cuisine
– quality
– price
– atmosphere

Page 53:

How do we measure similarity?

Complex multi-level comparison
Goal sensitive
– multiple goals
Retrieval strategies
– non-similarity relationships

Can be strictly numeric
– weighted sum of similarities of features
– “local similarities”

May involve inference
– reasoning about the similarity of items

Page 54:

Price metric

[Plot: similarity as a function of price, with the query value marked.]

Page 55:

Cuisine Metric

[Diagram: a cuisine ontology tree. Labels include Asian (Chinese, Japanese, Vietnamese, Thai), European (French, Nouvelle Cuisine), and Pacific New Wave.]

Page 56:

Metrics

Goal-specific comparison
– how similar is the target product to the source with respect to this goal?

Asymmetric
– directional effects

A small # of general-purpose types

Page 57:

Metrics

If they generate a true metric space
– approaches using space-partitioning techniques
  BSP trees, quad-trees, etc.

Not always the case

Hard to optimize
– storing n² distances / recalculating
– FindMe calculates similarity at retrieval time

Page 58:

Combining metrics

Global metric
– combination of attribute metrics

Hierarchical combination
– lower metrics break ties in upper ones

Benefits
– simple to acquire
– easy to understand

Somewhat inflexible
– more typical would be a weighted sum

Page 59:

Constraint-based Recommendation

Represent the user’s needs as a set of constraints
Try to satisfy those constraints with products

Page 60:

Example

User needs a car
– gas mileage > 25 mpg
– capacity >= 5 people
– price < $18,000

A solution would be a list of models satisfying these requirements

Page 61:

Configurable Products

Constraints are important where products are configurable
– computers
– travel packages
– business services
– (cars)

The relationships between configurable components need to be expressed as constraints anyway
– a GT 6800 graphics card needs a power supply >= 300 W

Page 62:

Product Space

[Plot: products plotted by weight and screen size; the region satisfying Weight < x and Screen > y contains the possible recommendations.]

Page 63:

Utility

In order to rank products
– we need a measure of utility
– can be “slack”
  how much the product exceeds the constraints
– can be another measure
  price is typical
– can be a utility calculation that is a function of product attributes
  but generally this is user-specific
  – value of weight vs screen size

Page 64:

Product Space

[Plot: the same weight / screen-size space with candidate products A, B, and C marked.]

Page 65:

Utility

Slack_A = (X - Weight_A) + (Size_A - Y)
– not really commensurate

Price_A
– ignores product differences

Utility_A = α(X - Weight_A) + β(Size_A - Y) + γ(X - Weight_A)(Size_A - Y)
– usually we ignore γ and treat the utilities as independent
– how do we know what α and β are?
  make assumptions
  infer from user behavior
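The slack-style utility can be sketched as follows, ignoring the cross-term γ as the slide suggests. The constraint bounds, α = β = 1, and the laptop data are all assumptions for illustration.

```python
# Utility = alpha*(X - weight) + beta*(size - Y): how much a feasible
# product exceeds the constraints Weight < X and Screen > Y.
def utility(weight, size, x_max, y_min, alpha=1.0, beta=1.0):
    return alpha * (x_max - weight) + beta * (size - y_min)

# Laptops (weight in kg, screen in inches); constraints: weight < 2.5, screen > 13
products = {"A": (1.2, 14), "B": (2.0, 15), "C": (2.3, 13.3)}
ranked = sorted(products,
                key=lambda p: utility(*products[p], x_max=2.5, y_min=13),
                reverse=True)
print(ranked)  # ['B', 'A', 'C'] with alpha = beta = 1
```

Note the sensitivity to α and β: weight slack is measured in kilograms and screen slack in inches, so changing the weights reorders the ranking, which is exactly why the slide asks how we know what α and β are.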

Page 66:

Knowledge-Based Recommendation

Hard to generalize

Advantages
– no cold-start issues
– great precision possible
  very important in some domains

Disadvantages
– knowledge engineering required
  can be substantial
– expert opinion may not match user preferences

Page 67:

Next

Session C, 15:00
Need laptops
Install workspace