Part 15: Knowledge-Based Recommender Systems

Francesco Ricci

Content

p  Knowledge-based recommenders: definition and examples
p  Case-Based Reasoning
p  Instance-Based Learning
p  A recommender system exploiting a “simple” case model (the product is a case)
p  A more complex CBR recommender system for travel planning

“Core” Recommendation Techniques [Burke, 2007]

U is a set of users
I is a set of items/products

Knowledge-Based Recommender

p  Suggests products based on inferences about a user’s needs and preferences
p  Functional knowledge: about how a particular item meets a particular user need
p  The user model can be any knowledge structure that supports this inference
n  A query, i.e., the set of preferred features for a product
n  A case (in a case-based reasoning system)
n  An adapted similarity metric (for matching)
n  A part of an ontology
p  There is a large use of domain knowledge, encoded in a knowledge representation language/approach.

ActiveBuyersGuide

Wizard: My Product Advisor

[Screenshots: the system decides what the wizard says; possible user requests are shown as options]

Trip.com

[Screenshots of the Trip.com interface]

Matching in TripleHop

Example: TripleHop

[Figure: a user-model case (C-UM:00341) with an activities subtree – relaxing (lying on a beach, sitting in cafes) and shopping – plus the constraints meat = beef and budget = 200, matched against a Catalogue of Destinations]

[Delgado and Davidson, 2002]

TripleHop and Content-Based RS

p  The content (destination description) is exploited in the recommendation process
p  A classical Content-Based method would have used a “simpler” content model, e.g., keywords or TF-IDF
p  Here a more complex knowledge structure – a tree of concepts – is used to model the product (and the query)
p  The query is the user model, and it is acquired every time the user asks for a new recommendation (not exactly – more details later)
n  Stress on ephemeral needs rather than on building a persistent user model
p  Typical in Knowledge-Based RS: they are more focused on ephemeral users, because Collaborative Filtering and Content-Based methods cannot cope with those users.

Learning User Profile: Query Mining

[Figure: three query trees.
Old query – user model (C-UM:00341): activities = relaxing (lying on a beach, sitting in cafes), shopping; constraints meat = beef, budget = 200; recommended destination: Crete.
New user request (C-UM:00357): activities = relaxing (lying on a beach), adventure (hiking); constraint meat = pork.
New user request as computed by the system (C-UM:00357bis): activities = relaxing (lying on a beach), adventure (hiking), plus the inferred preferences sitting in cafes and shopping (shadowed, i.e., less important); constraints meat = pork, budget = 200.]

Query Augmentation

p  Personalization in search is not only “information filtering”
p  Query augmentation: when a query is entered, it can be compared against contextual and individual information to refine the query
n  Ex1: If the user is searching for a restaurant and enters the keyword “Thai”, the query can be augmented to “Thai food” (see Part 8 – query expansion based on co-occurrence analysis in the corpus of documents)
n  Ex2: If the query “Thai food” does not retrieve any restaurant, the query can be refined to “Asian food”
n  Ex3: If the query “Asian food” retrieves too many restaurants, and the user searched in the past for “Chinese” food, the query can be refined to “Chinese food”.

Query Augmentation in TripleHop

1.  The current query is compared with previous queries of the same user
2.  Preferences expressed in past (similar) queries are identified
3.  A new query is built by combining the short-term preferences contained in the query with the “inferred” preferences extracted from the persistent user model (past queries)
4.  When the query is matched against an item (destination), if two destinations have the same degree of matching for the explicit preferences, then the “inferred” preferences are used to break the tie

p  This is another example of the cascade approach
n  the two combined RS are based on the same knowledge but with two definitions of the user model.
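The four steps above can be sketched in a few lines. All names here are hypothetical illustrations, not TripleHop's actual API: queries are represented as simple feature-to-value dicts.

```python
# Hypothetical sketch of query augmentation with tie-breaking.
# A query is a dict mapping a feature to the preferred value.

def jaccard(q1, q2):
    """Similarity between two queries: shared feature/value pairs."""
    a, b = set(q1.items()), set(q2.items())
    return len(a & b) / len(a | b) if a | b else 0.0

def augment(current, past_queries, k=3):
    """Steps 1-3: find similar past queries and collect 'inferred'
    preferences that the current query does not mention."""
    ranked = sorted(past_queries, key=lambda q: jaccard(current, q), reverse=True)
    inferred = {}
    for q in ranked[:k]:
        for feature, value in q.items():
            if feature not in current and feature not in inferred:
                inferred[feature] = value
    return inferred

def match(item, prefs):
    """Fraction of the preferences an item satisfies."""
    return sum(item.get(f) == v for f, v in prefs.items()) / len(prefs) if prefs else 0.0

def rank(items, explicit, inferred):
    """Step 4: sort by explicit match; break ties with inferred prefs."""
    return sorted(items, key=lambda it: (match(it, explicit), match(it, inferred)), reverse=True)

# Usage: two destinations tie on the explicit preference, so the
# inferred budget preference decides.
past = [{"activity": "beach", "budget": "low"}]
current = {"activity": "beach"}
inferred = augment(current, past)
a = {"activity": "beach", "budget": "low"}
b = {"activity": "beach", "budget": "high"}
ranked = rank([b, a], current, inferred)
```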

What is Case-Based Reasoning?

p  A case-based reasoner solves new problems by adapting solutions that were used to solve old problems (Riesbeck & Schank, 1989)
p  CBR problem-solving process:
n  store previous experiences (cases) in memory
n  to solve new problems:
p  Retrieve from memory experiences of similar situations
p  Reuse the experience in the context of the new situation: complete or partial reuse, or adapt according to differences
p  Store the new experience in memory (learning)

[Aamodt and Plaza, 1994]

Case-Based Reasoning

[Figure: the CBR problem-solving cycle]

[Aha, 1998]

CBR Assumption

p  A new problem can be solved by
n  retrieving similar problems
n  adapting retrieved solutions
p  Similar problems have similar solutions

[Figure: problems (P) mapped to solutions (S); the new problem (?) lies near known problems, and its solution (X) is found near their solutions]

Examples of CBR

p  Classification: “The patient’s ear problems are like this prototypical case of otitis media”
p  Compiling solutions: “Patient N’s heart symptoms can be explained in the same way as previous patient D’s”
p  Assessing values: “My house is like the one that sold down the street for $250,000 but has a better view”
p  Justifying with precedents: “This Missouri case should be decided just like Roe v. Wade, where the court held that a state’s limitations on abortion are illegal”
p  Evaluating options: “If we attack Cuban/Russian missile installations, it would be just like Pearl Harbor”

Instance-Based Learning – Lazy Learning

p  One way of solving tasks of approximating discrete- or real-valued target functions
p  We have training examples: (xn, f(xn)), n = 1, …, N
p  Key idea:
n  just store the training examples
n  when a test example is given, find the closest matches
n  use the closest matches to guess the value of the target function on the test example.

The Distance Between Examples

p  We need a measure of distance (or similarity) in order to know which examples are the neighbours
p  Assume that we have T attributes for the learning problem. Then one example point x has elements xt ∈ ℝ, t = 1, …, T
p  The distance between two points x and y is often defined as the Euclidean distance:

d(x, y) = √( Σt=1..T [xt − yt]² )

[Figure: eight example paintings, labelled one to eight; seven are classified as Mondrian or not (three “yes”, four “no”) and the eighth is the unknown test instance “?”]

Training data

Number  Lines  Line types  Rectangles  Colours  Mondrian?
1       6      1           10          4        No
2       4      2           8           5        No
3       5      2           7           4        Yes
4       5      1           8           4        Yes
5       5      1           10          5        No
6       6      1           8           6        Yes
7       7      1           14          5        No

Test instance

Number  Lines  Line types  Rectangles  Colours  Mondrian?
8       7      2           9           4        ?

        Lines  LinesT  Rect  Colors  Class  Distance to test
Train1  4      2       8     5       no     3.32
Train2  5      2       7     4       yes    2.83
Train3  5      1       8     4       yes    2.45
Train4  5      1       10    5       no     2.65
Train5  6      1       8     6       yes    2.65
Train6  7      1       14    5       no     5.20
test    7      2       9     4

Feature values are not normalized.

        Lines  LinesT  Rect  Colors  Class  Distance to test
Train1  -0.32  0.32    -0.11 0.06    no     0.80
Train2  -0.08  0.32    -0.21 -0.28   yes    0.52
Train3  -0.08  -0.16   -0.11 -0.28   yes    0.69
Train4  -0.08  -0.16   0.08  0.06    no     0.77
Train5  0.16   -0.16   -0.11 0.39    yes    0.86
Train6  0.40   -0.16   0.47  0.06    no     0.76
test    0.40   0.32    -0.02 -0.28

Feature values are normalized: x' = (x − avg(X)) / (4 · stdev(X)), where x is a value of feature X.

What is the difference between this feature-value normalization and vector normalization in IR?
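The 1-NN computation behind the first (unnormalized) table can be sketched as follows. This is a minimal illustration in plain Python, not code from any of the systems discussed:

```python
# Minimal 1-NN sketch over the Mondrian training data
# (feature values are NOT normalized here).
import math

train = [  # (Lines, LineTypes, Rectangles, Colours), class
    ((4, 2, 8, 5), "no"),   # Train1
    ((5, 2, 7, 4), "yes"),  # Train2
    ((5, 1, 8, 4), "yes"),  # Train3
    ((5, 1, 10, 5), "no"),  # Train4
    ((6, 1, 8, 6), "yes"),  # Train5
    ((7, 1, 14, 5), "no"),  # Train6
]
test = (7, 2, 9, 4)

def euclidean(x, y):
    """d(x, y) = sqrt(sum of squared per-attribute differences)."""
    return math.sqrt(sum((xt - yt) ** 2 for xt, yt in zip(x, y)))

# Rounded, these match the table: 3.32, 2.83, 2.45, 2.65, 2.65, 5.20
distances = [euclidean(x, test) for x, _ in train]

# 1-NN predicts the class of the closest training example (Train3)
nearest = min(train, key=lambda ex: euclidean(ex[0], test))
prediction = nearest[1]
```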

Example of a CBR Recommender System

p  Entree is a restaurant recommender system – it finds restaurants:
1.  matching some user goals (case features)
2.  or similar to restaurants the user knows and likes

The Product is the Case

p  In Entrée a case is a restaurant – the case is the product
p  The problem component is the description of the restaurant given by the user
p  The user will input a partial description of it – this is the only difficulty
p  The solution part of the case is the restaurant itself – i.e., the name of the restaurant
p  The assumption is that the needs of the user can be modeled as the features of the product description …

Partial Match

p  In general, only a subset of the preferences will be matched in the recommended restaurant.

Nearest Neighbor

Recommendation in Entree

p  The system first selects from the database the set of all restaurants that satisfy the largest number of logical constraints generated by considering the types and values of the input features
p  If necessary, it implicitly relaxes the least important constraints until some restaurants can be retrieved
n  Typically the relaxation of constraints will produce many restaurants in the result set
p  It sorts the retrieved cases using a similarity metric – this takes into account all the input features.
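The retrieve-then-relax loop described above can be sketched as follows. This is a hypothetical illustration, not Entrée's actual implementation; constraints are ordered predicates and the least important one is dropped whenever the result set is empty:

```python
# Hypothetical sketch of retrieval with implicit constraint relaxation.
def retrieve(items, constraints):
    """constraints: predicates ordered from most to least important."""
    active = list(constraints)
    while active:
        result = [it for it in items if all(c(it) for c in active)]
        if result:
            return result
        active.pop()  # relax the least important remaining constraint
    return list(items)  # no constraint could be satisfied

# Usage: no restaurant satisfies both constraints, so the price
# constraint (least important) is relaxed and Gabbana is retrieved.
restaurants = [{"name": "Dolce", "cuisine": "A", "price": 10},
               {"name": "Gabbana", "cuisine": "B", "price": 40}]
constraints = [lambda r: r["cuisine"] == "B",   # most important
               lambda r: r["price"] <= 20]      # least important
found = retrieve(restaurants, constraints)
```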

Similarity in Entree

p  This similarity metric assumes that the user goals, corresponding to the input features (or the features of the source case), can be sorted to reflect the importance of those goals from the user’s point of view
p  Hence the global similarity metric (algorithm) sorts the products first with respect to the most important goal and then iteratively with respect to the remaining goals (multi-level sort)
p  Attention: it does not work as a maximization of a utility/similarity defined as the sum of local utilities.

Example

p  If the user query q is: price=9 AND cuisine=B AND atmosphere=B
p  And the weights (importance) of the features are: 0.5 price, 0.3 cuisine, and 0.2 atmosphere
p  Entrée will suggest Dolce first (and then Gabbana)
p  A more traditional CBR system will suggest Gabbana because the similarities are (30 is the price range):
n  Sim(q, Dolce) = 0.5 * (1 − 1/30) + 0.3 * 0 + 0.2 * 0 = 0.48
n  Sim(q, Gabbana) = 0.5 * (1 − 3/30) + 0.3 * 1 + 0.2 * 1 = 0.45 + 0.3 + 0.2 = 0.95

Restaurant   Price   Cuisine   Atmosphere
Dolce        10      A         A
Gabbana      12      B         B
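The contrast between the two rankings can be checked with a short sketch. The names are hypothetical; the local price similarity 1 − |Δprice|/30 and the weights follow the example above:

```python
# Weighted-sum similarity vs Entree-style multi-level (lexicographic) sort.
restaurants = {"Dolce": {"price": 10, "cuisine": "A", "atmosphere": "A"},
               "Gabbana": {"price": 12, "cuisine": "B", "atmosphere": "B"}}
query = {"price": 9, "cuisine": "B", "atmosphere": "B"}
PRICE_RANGE = 30

def local_sim(feature, r):
    if feature == "price":
        return 1 - abs(r["price"] - query["price"]) / PRICE_RANGE
    return 1.0 if r[feature] == query[feature] else 0.0

# Traditional CBR: weighted sum of local similarities -> Gabbana wins
weights = {"price": 0.5, "cuisine": 0.3, "atmosphere": 0.2}
def weighted_sim(name):
    r = restaurants[name]
    return sum(w * local_sim(f, r) for f, w in weights.items())

# Entree: compare goal by goal, most important first, so the better
# price match alone decides -> Dolce wins
goal_order = ["price", "cuisine", "atmosphere"]
def sort_key(name):
    r = restaurants[name]
    return tuple(local_sim(f, r) for f in goal_order)

by_weighted = sorted(restaurants, key=weighted_sim, reverse=True)
by_entree = sorted(restaurants, key=sort_key, reverse=True)
```

The multi-level sort relies on Python's lexicographic tuple comparison: later goals are consulted only when all earlier goals tie.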

http://www.visitfinland.com/web/guest/travel-planner/home

Query Tightening

[Screenshots of the NutKing interface]

[Ricci et al., 2002]

NutKing as a CBR System

p  Problem = recommend a set of tourism-related products and build a travel plan
p  Cases = all the recommended travel plans that users have built using the system (how they were built and what they contain)
p  Retrieval = search in the memory for travel plans built during “similar” recommendation sessions
p  Reuse =
1.  extract from previous travel plans elementary components (items) and use them to build a new plan
2.  rank items found in the catalogues

Travel Plan and Interaction Session Model

[Figure: a case combines
– Collaborative Component 1, the travel wish clf, e.g. (family, bdg_medium, 7, Hotel);
– queries cnq on content attributes, e.g. (category=3 AND Health=True), (Golfing=True AND Nightlife=True);
– Collaborative Component 2, the travel bag of selected products with ratings, e.g. (Kitzbühel, True, True, …), (Hotel Schwarzer, 3, True, …)]

Item Ranking

[Figure: given the input query Q and the current case (tb, u, r, twc):
1.  Search the catalogue, retrieving locations loc1, loc2, loc3 (travel components); interactive query management may suggest changes to Q
2.  Search similar cases in the case base
3.  Output the reference set of cases
4.  Sort the locations loci by similarity to the locations in the reference cases; output: ranked items]

Two-fold Similarity

[Figure: two similarities are combined – sessions similarity, between the target user’s target session and stored sessions, and product similarity, between the products contained in those sessions (s1 … s6) and the candidate product (?)]

Rank using Two-Fold Similarity

p  Given the current session case c and a set of retrieved products R (using the interactive query management facility – IQM):
1.  retrieve 10 cases (c1, …, c10) from the repository of stored cases (recommendation sessions managed by the system) that are most similar to c with respect to the collaborative features
2.  extract products (p1, …, p10) from the cases (c1, …, c10) of the same type as those in R
3.  for each product r in R compute Score(r) as the maximum over i of the product of a) the similarity of r with pi and b) the similarity of the current case c with the retrieved case ci containing pi
4.  sort and display the products in R according to Score(r).

Example: Scoring Two Destinations

D1 and D2 are destinations matching the user’s query; C1 and C2 are similar cases in the case base, containing the destinations CD1 and CD2; CC is the current case.

Sim(CC, C1) = 0.2    Sim(CC, C2) = 0.6
Sim(D1, CD1) = 0.4   Sim(D1, CD2) = 0.7
Sim(D2, CD1) = 0.5   Sim(D2, CD2) = 0.3

Score(Di) = Maxj { Sim(CC, Cj) * Sim(Di, CDj) }

Score(D1) = Max{0.2 * 0.4, 0.6 * 0.7} = 0.42
Score(D2) = Max{0.2 * 0.5, 0.6 * 0.3} = 0.18
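The scoring rule can be reproduced in a few lines. This is a minimal sketch using the similarity values from the example; the variable names are illustrative only:

```python
# Two-fold scoring: the score of a candidate destination Di is the best
# combination of session similarity and product similarity over the
# retrieved cases Cj.
sim_cc = {"C1": 0.2, "C2": 0.6}          # Sim(CC, Cj)
sim_d = {"D1": {"C1": 0.4, "C2": 0.7},   # Sim(Di, CDj)
         "D2": {"C1": 0.5, "C2": 0.3}}

def score(di):
    return max(sim_cc[cj] * sim_d[di][cj] for cj in sim_cc)

ranking = sorted(sim_d, key=score, reverse=True)  # D1 before D2
```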

Tree-Based Case Representation

p  A case is a rooted tree and each node has a:
n  node-type: similarity between two nodes in two cases is defined only for nodes with the same node-type
n  metric-type: the node content structure – how to measure the node’s similarity with another node in a second case

[Figure: case tree c1 (nt: case, mt: vector) with children clf1, cnq1 and cart1 (nt: cart, mt: vector); cart1 contains dests1 (nt: destinations, mt: set), accs1 and acts1; dests1 contains the items dest1 and dest2 (nt: destination, mt: vector), each with features X1 (nt: location, mt: hierarchical), X2, X3, X4]

Item Representation

TRAVELDESTINATION = (X1, X2, X3, X4)

     Node Type   Metric Type                            Example: Canazei
X1   LOCATION    Set of hierarchically related symbols  Country=ITALY, Region=TRENTINO, TouristArea=FASSA, Village=CANAZEI
X2   INTERESTS   Array of Booleans                      Hiking=1, Trekking=1, Biking=1
X3   ALTITUDE    Numeric                                1400
X4   LOCTYPE     Array of Booleans                      Urban=0, Mountain=1, Riverside=0

dest1:
X1 = (Italy, Trentino, Fassa, Canazei)
X2 = (1, 1, 1)
X3 = 1400
X4 = (0, 1, 0)

Item Query Language

p  For querying purposes items are represented as simple feature vectors x = (x1, …, xn)
p  A query is a conjunction of constraints over features: q = c1 ∧ c2 ∧ … ∧ cm, where m ≤ n and

        x_i = true        if x_i is boolean
c_k =   x_i = v_i         if x_i is nominal
        l_i ≤ x_i ≤ u_i   if x_i is numerical

p  Example – dest1 with
X1 = (Italy, Trentino, Fassa, Canazei)
X2 = (1, 1, 1)
X3 = 1400
X4 = (0, 1, 0)
is flattened to the vector (Italy, Trentino, Fassa, Canazei, 1, 1, 1, 1400, 0, 1, 0)
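Evaluating such a conjunctive query against a flattened item can be sketched as follows. The feature names and the constraint encoding are hypothetical illustrations, not NutKing's actual query language:

```python
# Sketch: an item as a flat feature dict, a query as a list of
# constraints (boolean, nominal, or numerical range).
item = {"country": "Italy", "region": "Trentino", "area": "Fassa",
        "village": "Canazei", "hiking": True, "trekking": True,
        "biking": True, "altitude": 1400,
        "urban": False, "mountain": True, "riverside": False}

# Each constraint: a boolean feature must be true, a nominal feature
# must equal a value, a numerical feature must fall in [l, u].
query = [("mountain", "bool", None),
         ("region", "nominal", "Trentino"),
         ("altitude", "range", (1000, 2000))]

def satisfies(item, query):
    for feature, kind, arg in query:
        v = item[feature]
        if kind == "bool" and v is not True:
            return False
        if kind == "nominal" and v != arg:
            return False
        if kind == "range" and not (arg[0] <= v <= arg[1]):
            return False
    return True
```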

Item Similarity

If X and Y are two items with the same node-type:

d(X, Y) = (1/Σi wi)^(1/2) · [Σi wi di(Xi, Yi)²]^(1/2)

where 0 ≤ wi ≤ 1 and i = 1, …, n (the number of features), and the local distance di is:

di(Xi, Yi) = 1                     if Xi or Yi is unknown
             overlap(Xi, Yi)       if Xi is symbolic
             |Xi − Yi| / rangei    if Xi is a finite integer or real
             Jaccard(Xi, Yi)       if Xi is an array of Booleans
             Hierarchical(Xi, Yi)  if Xi is a hierarchy
             Modulo(Xi, Yi)        if Xi is a circular feature (month)
             Date(Xi, Yi)          if Xi is a date

Sim(X, Y) = 1 − d(X, Y)   or   Sim(X, Y) = exp(−d(X, Y))

Example – dest1:
X1 = (Italy, Trentino, Fassa, Canazei)
X2 = (1, 1, 1)
X3 = 1400
X4 = (0, 1, 0)

Item Similarity Example

dest1:                          dest2:
X1 = (I, TN, Fassa, Canazei)    Y1 = (I, TN, Fassa, ?)
X2 = (1, 1, 1)                  Y2 = (1, 0, 1)
X3 = 1400                       Y3 = 1200
X4 = (0, 1, 0)                  Y4 = (1, 1, 0)

With all weights wi = 1:

Sim(dest1, dest2) = exp( −√( (1/4) [d1(X1, Y1)² + … + d4(X4, Y4)²] ) )
                  = exp( −√( (1/4) [(0.3)² + (1 − 2/3)² + ((1400 − 1200)/2000)² + (1 − 1/2)²] ) )
                  = exp( −√( (1/4) · 0.461 ) ) = exp(−0.339) = 0.712

Here d1 = 0.3 because the hierarchical distance takes the values 1, 0.7, 0.5, 0.3 depending on the deepest matching level; d2 = 1 − 2/3 is the Jaccard distance with 2 common interests out of 3 in the union; d4 = 1 − 1/2 with 1 common location type out of 2 in the union.
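The computation can be reproduced with a short sketch. The hierarchical distance levels 1/0.7/0.5/0.3 and the normalizing range 2000 are taken from the example above; everything else is an illustrative assumption:

```python
# Reproducing the item-similarity example (all weights wi = 1).
import math

def jaccard_dist(a, b):
    """1 - (common true features / features true in either array)."""
    union = sum(x or y for x, y in zip(a, b))
    inter = sum(x and y for x, y in zip(a, b))
    return 1 - inter / union

def hierarchical_dist(a, b, levels=(1.0, 0.7, 0.5, 0.3)):
    """Distance shrinks as more leading levels match; '?' never matches."""
    depth = 0
    for x, y in zip(a, b):
        if x != y or "?" in (x, y):
            break
        depth += 1
    return levels[min(depth, len(levels) - 1)] if depth < len(a) else 0.0

def numeric_dist(a, b, rng):
    return abs(a - b) / rng

d = [hierarchical_dist(("I", "TN", "Fassa", "Canazei"), ("I", "TN", "Fassa", "?")),
     jaccard_dist((1, 1, 1), (1, 0, 1)),
     numeric_dist(1400, 1200, 2000),
     jaccard_dist((0, 1, 0), (1, 1, 0))]

# Root of the mean squared local distance, then exp(-d) -> about 0.712
sim = math.exp(-math.sqrt(sum(x * x for x in d) / 4))
```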

Case Distance

[Figure: two case trees compared node by node: c1 = (clf1, cnq1, cart1) with cart1 = (dests1, accs1, acts1) and dests1 = {dest1, dest2}; c2 = (clf2, cnq2, cart2) with cart2 = (dests2, accs2, acts2) and dests2 = {dest3, dest4, dest5}]

Case Distance

d(c1, c2) = √( (1/Σi=1..3 Wi) [ W1 d(cart1, cart2)² + W2 d(clf1, clf2)² + W3 d(cnq1, cnq2)² ] )

d(cart1, cart2)² = d(dests1, dests2)² + d(accs1, accs2)² + d(acts1, acts2)²

(the cart node has metric-type vector)

d(dests1, dests2) = 1/(2·3) · [ d(dest1, dest3) + d(dest1, dest4) + d(dest1, dest5) + d(dest2, dest3) + d(dest2, dest4) + d(dest2, dest5) ]

(the destinations node has metric-type set: the distance is the average of all pairwise distances between the 2 destinations in dests1 and the 3 in dests2)
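The set metric, the average of all pairwise member distances, can be sketched generically. The numeric stand-ins below are illustrative only; in NutKing the member distance would itself be the item distance:

```python
# Set metric: average pairwise distance between the members of two sets.
def set_distance(set1, set2, item_dist):
    return sum(item_dist(a, b) for a in set1 for b in set2) / (len(set1) * len(set2))

# Toy usage with numeric stand-ins for dest1..dest5 and absolute
# difference as the item distance.
dests1 = [1.0, 2.0]           # stand-ins for dest1, dest2
dests2 = [1.5, 2.5, 3.0]      # stand-ins for dest3, dest4, dest5
d = set_distance(dests1, dests2, lambda a, b: abs(a - b))
```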

CBR Knowledge Containers

p  CBR is a knowledge-based approach to problem solving
p  The knowledge is “contained” in four containers:
n  Cases: the instances belonging to our case base
n  Case representation language: the representation language that we decided to use to represent cases
n  Retrieval knowledge: the knowledge encoded in the similarity metric and in the retrieval algorithm
n  Adaptation knowledge: how to reuse a retrieved solution to solve the current problem.

Conclusions

p  Knowledge-based systems exploit knowledge to map a user to the products she likes
p  KB systems use a variety of techniques
p  Knowledge-based systems require a big effort in terms of knowledge extraction, representation and system design
p  Many KB recommender systems are rooted in Case-Based Reasoning
p  Similarity of complex data objects is often required in KB RSs
p  NutKing is a hybrid case-based recommender system
p  The case is the recommendation session.

Questions

p  What are the main differences between a CF recommender system and a KB RS (such as activebuyers.com or Entree)?
p  What is the role of query augmentation?
p  What is the basic rationale of a CBR recommender system?
p  What is a case in a CBR recommender system such as Entree?
p  How does a CBR recommender system learn to recommend?
p  What are the knowledge containers in a CBR RS?
p  What are the main differences between a “classical” CBR recommender system such as Entrée and NutKing?
p  What are the motivations for the introduction of the two-fold similarity ranking method?
p  What are the types of local similarity metrics used in NutKing?