improving explicit preference entry by visualising data similarities

kris jack and florence duclaye

13 january 2008

kris jack – p2

summary

1 the problem in context
2 background
3 proposed solution
4 user evaluation
5 results
6 discussion
7 conclusion

1 the problem in context

context

general context
• recommender systems attempt to offer users items that they will appreciate
• the quality of these recommendations is largely constrained by the data:
  • a user’s preferences
  • domain-specific ‘general knowledge’
  • the items that can be recommended
  • previous users’ opinions of items
• the acquisition of such data is of central importance

specific context
• a hybrid content-based and collaborative filtering recommendation system
• the system will be implemented in the domain of cinema
• the user can explicitly enter their preferences


the problem

entering your preferences explicitly
• can be boring
• can be difficult

if a system is too much trouble to use, then it simply won't be used

how can we improve the explicit preference entry process?

maximise the number of explicit preferences that could be elicited within a given period of time

2 background

definitions

preferences
• many types and definitions exist
• we concentrate on the elicitation of monadic preferences
• these represent a user’s like, dislike, or indifference towards an item or item attribute (e.g. “i like tim burton”, “i dislike horror movies” and “i love kill bill”)

preference acquisition strategies
• explicit: the user explicitly enters their preferences
• implicit: learning strategies that are non-invasive (e.g. user-profiling and collaborative filtering)

explicit preference entry (epe) interface
• an interface that asks users to explicitly give their preferences towards items and item attributes (e.g. “i love comedies”)


some existing epe interfaces

recommenders with epe interfaces
• minekey (www.minekey.com)
• stumbleupon (www.stumbleupon.com)
• movielens

drawbacks
• boring to use
• difficult to be inspired (when guidance is lacking)
• at times, difficult to describe yourself in their terms

[screenshots: stumbleupon, minekey]


data visualisation

perhaps we can improve epe interfaces by visualising data

data visualisation techniques have had some success in recommenders
• music plasma (www.musicplasma.com)
• amaznode (amaznode.fladdict.net/)

[screenshots: music plasma, amaznode]

3 proposed solution

visualising data similarity in an epe interface

the epe interface
• should encourage users to enter their preferences
• may guide users based upon the preferences required
• should be enjoyable to use and not boring

the epe interface creation process
• must be robust and reliable
• should ideally be automated from start to finish

creation process summary
• input a list of items or item descriptors (e.g. actors)
• using a similarity metric, find the similarities between data elements
• visualise the similarities between the data elements


instantiating the system's semantic knowledge

the system's semantic knowledge describes the similarity between descriptors in the database

the notion of similarity is necessarily subjective

how similar do you find these two actors?

(robert de niro and al pacino)


strategies for defining semantic similarities considered

• instantiation by hand
• differencing mechanism
• co-occurrence measures
• clustering algorithms
• collaborative filtering techniques

opted for one based on co-occurrences of actor names, found using google search hit estimates:

$$d(i_1, i_2) = \frac{\max\{\log f(i_1), \log f(i_2)\} - \log f(i_1, i_2)}{\log M - \min\{\log f(i_1), \log f(i_2)\}}$$

where M is the total number of pages considered, f(i_1) and f(i_2) are the number of hits for i_1 and i_2 respectively, and f(i_1, i_2) is the number of hits for the co-occurrence of i_1 and i_2.

in essence, the more often two items appear together on the same web pages, the more similar they are

the measure of semantic relatedness server provides free access


normalised google distance

actors        jackie chan         bruce lee           jane fonda
jackie chan   2,420,000 (0.0)     965,000 (0.09)      145,000 (0.26)
bruce lee     965,000 (0.09)      2,630,000 (0.0)     46,700 (0.37)
jane fonda    145,000 (0.26)      46,700 (0.37)       1,930,000 (0.0)

the number of google hits for actor pairs, with their normalised google distances given in brackets (the smaller the distance, the more similar the actors)
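as a rough illustration, the distance can be computed directly from the hit counts in the table. this is only a sketch: the function and variable names are ours, and the value of M is an assumption, since the slides do not state the total number of pages considered.

```python
import math

def normalised_google_distance(f1, f2, f12, total_pages):
    """Normalised google distance computed from hit counts.

    f1, f2      -- hits for each term on its own
    f12         -- hits for pages containing both terms
    total_pages -- M, the total number of pages considered
    """
    if f12 == 0:
        return math.inf  # the two terms never co-occur
    log_f1, log_f2 = math.log(f1), math.log(f2)
    numerator = max(log_f1, log_f2) - math.log(f12)
    denominator = math.log(total_pages) - min(log_f1, log_f2)
    return numerator / denominator

# hit counts copied from the table
HITS = {"jackie chan": 2_420_000, "bruce lee": 2_630_000, "jane fonda": 1_930_000}
PAIR_HITS = {
    ("jackie chan", "bruce lee"): 965_000,
    ("jackie chan", "jane fonda"): 145_000,
    ("bruce lee", "jane fonda"): 46_700,
}
M = 10**11  # ASSUMPTION: not given in the slides

def distance(a, b):
    """Distance between two actors in the table (an actor is at distance 0 from itself)."""
    if a == b:
        return normalised_google_distance(HITS[a], HITS[a], HITS[a], M)
    f12 = PAIR_HITS.get((a, b)) or PAIR_HITS[(b, a)]
    return normalised_google_distance(HITS[a], HITS[b], f12, M)
```

with this assumed M, the three pair distances round to 0.09, 0.26 and 0.37, in line with the bracketed values in the table; a different M would rescale the distances but keep their ordering for these three actors.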


data visualisation

use of the radial tree layout to visualise data similarities
• manageable linear complexity in laying out the tree
  • efficient even with several hundreds of nodes
• the more similar two items are, the closer their proximity in the graph
• a radial layout encourages users to explore the tree in a less hierarchical fashion, as it is unclear where the root node is
• the tree can be focussed upon any node in the tree (implementing a smooth transition animation)
  • users have previously found this form of visualisation attractive
• implementation available in the prefuse (www.prefuse.com) library

strategy
• each item (actor) is represented by a node in the tree
• connect every node with its two closest nodes (using item similarity)
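the linking strategy can be sketched as follows. this is a minimal illustration in plain python rather than the prefuse implementation; the function name and the data structure are our own assumptions, and the slides do not say how the resulting graph is handed to the radial tree layout.

```python
def nearest_neighbour_edges(distances, k=2):
    """Link every node to its k closest nodes.

    distances -- dict mapping frozenset({a, b}) to the distance between a and b
    Returns a set of undirected edges, each stored as frozenset({a, b}).
    """
    nodes = sorted({node for pair in distances for node in pair})
    edges = set()
    for node in nodes:
        others = [n for n in nodes if n != node]
        # closest first; ties are broken alphabetically so the result is deterministic
        others.sort(key=lambda n: (distances[frozenset((node, n))], n))
        for neighbour in others[:k]:
            edges.add(frozenset((node, neighbour)))
    return edges

# distances taken from the normalised google distance table
DISTANCES = {
    frozenset(("jackie chan", "bruce lee")): 0.09,
    frozenset(("jackie chan", "jane fonda")): 0.26,
    frozenset(("bruce lee", "jane fonda")): 0.37,
}

edges = nearest_neighbour_edges(DISTANCES, k=2)
```

with only three actors and k=2 every pair is linked; on the 500-node graphs used in the evaluation each actor would contribute at most two links of its own.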

radial tree

radial tree (partial)

epe interface

the visualisation is mounted in an epe interface that allows users to
• zoom in by right clicking on a graph area (in zoomed out mode) and zoom out by right clicking on a graph area that does not contain an actor (in zoomed in mode);
• pan within the graph by left clicking on a graph area and dragging in the direction to pan;
• search for an actor by typing the actor’s name in the search box. when the user starts to type a name, the mode changes to zoomed out mode and all actor nodes whose names match the string are enlarged;
• change the preference towards an actor (like, dislike, neutral, no preference) by right clicking on the actor’s node (in zoomed in mode);
• re-organise the graph to centre upon another actor by double left clicking on that actor’s node.
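the preference-changing behaviour can be pictured as a small cycle of states advanced by each right click. the sketch below is an assumption on our part: the order like, dislike, neutral follows the click counts reported by participants in the results (two clicks for a dislike, three for a neutral preference), while wrapping back to "no preference" on the fourth click is a guess.

```python
# ASSUMED cycle order; only the positions of "dislike" (two clicks) and
# "neutral" (three clicks) are stated in the slides.
PREFERENCE_CYCLE = ["no preference", "like", "dislike", "neutral"]

def next_preference(current):
    """Return the preference that follows `current` after one right click."""
    index = PREFERENCE_CYCLE.index(current)
    return PREFERENCE_CYCLE[(index + 1) % len(PREFERENCE_CYCLE)]

def after_clicks(clicks, start="no preference"):
    """Return the preference reached after `clicks` right clicks on a node."""
    state = start
    for _ in range(clicks):
        state = next_preference(state)
    return state
```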


epe interface

4 user evaluation

evaluating the epe interface

evaluated:
• choice of similarity metric in the context of actors
• type of preferences elicited
• epe interface's ease of use
• appreciation of the epe interface

materials
• epe interface mounted with 3 different graphs:
  • organised graph (node positions according to actor similarity)
  • unorganised graph (organised graph with nodes randomised)
  • demonstration graph (organised graph with nodes randomised)
• each graph contained the same 500 nodes (most frequent actors from an in-house french database)
• instruction sheet


participants and procedure

28 participants (14 male, 14 female)

procedure
• practise the functions of the epe interface using the demonstration graph (took 10 minutes on average)
• task
  • enter as many actor-based preferences as possible in 5 minutes
  • once with the organised graph and once with the unorganised graph (following a within-subjects design with 2 groups of 14 participants)
  • note that the graphs were not named here; the tasks were referred to as task 1 and task 2
• participants completed a questionnaire after finishing the tasks


hypothesis and measurements

hypothesis
• participants will find it easier to declare their preferences for actors in the organised graph task than in the unorganised graph task, within the same time period

measuring the ease of declaring preferences
• quantity of preferences entered
• subjective questioning in the questionnaire

5 results

preference elicitation

more preferences were entered using the organised graph:
• significant increase in 'like' preferences (34%)
• decrease in 'dislike' and 'neutral' preferences

[figure: bar chart "preference elicitation", comparing the number of preferences entered (all, like, dislike, neutral; axis from 0 to 60) with the organised and unorganised graphs]


perceived ease of entering preferences

participants reported that it was:
• easy to enter preferences using the organised graph
• neither easy nor difficult to enter preferences using the unorganised graph

ease of preference entry statements (mean)

statement                                                             organised   unorganised
"i found it easy to enter my preferences."                            3.96        3.36
"entering my preferences demanded too much effort" (mean reversed)    3.64        3.11

cronbach’s alpha = 0.80
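for readers unfamiliar with the statistic, the sketch below shows one common way of computing cronbach's alpha for two questionnaire statements, with the negatively worded statement reverse-scored first. the responses are hypothetical (the study's raw data are not in the slides), and a 1 to 5 response scale is assumed.

```python
from statistics import pvariance

def reverse_score(score, scale_max=5):
    """Flip a score on a 1..scale_max scale so that higher always means 'easier'."""
    return scale_max + 1 - score

def cronbach_alpha(items):
    """items: one list of scores per statement, aligned by participant."""
    k = len(items)
    item_variance_sum = sum(pvariance(scores) for scores in items)
    totals = [sum(per_person) for per_person in zip(*items)]
    return k / (k - 1) * (1 - item_variance_sum / pvariance(totals))

# HYPOTHETICAL responses from five participants, not data from the study
easy = [5, 4, 4, 3, 2]
too_much_effort = [1, 2, 3, 3, 4]

alpha = cronbach_alpha([easy, [reverse_score(s) for s in too_much_effort]])
```

an alpha close to 1 indicates that the statements vary together across participants and can reasonably be read as measuring the same thing.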


perceived differences between graphs

22 participants (79%) reported differences between the two graphs

participants commented that the organised graph
• had been hand designed so that it was easier to navigate;
• rearranged itself based on the actors that the participant said that they liked;
• arranged the participant’s favourite actors together;
• had more connections between nodes;
• arranged actors together who:
  • co-starred in the same films;
  • shared the same nationality;
  • shared the same degree of celebrity;
  • were similar to one another.

participants commented that the unorganised graph
• was less organised


appreciation of the epe interface

participants enjoyed using the interface and would be happy using it again (mean = 4.11/5.00, sd = 0.99)

suggested improvements
• preference changing. some participants did not like having to click twice to register a dislike and three times to enter a neutral preference.
• zooming in. some participants would have preferred a precise indication of the region that they would zoom into before zooming.
• zooming out. some participants felt lost when zoomed in on the actors as they were not sure of where they were with respect to the entire map.

6 discussion

discussion

participants mark more 'like' preferences with the organised graph (34% more)
• they find more actors who they like and fewer who they dislike or are neutral towards

why?
• in searching, participants tended to begin by using the search feature, then zoom in on their desired actor
• when at the zoomed in level, they would pan around to find other actors
• actors who were in close proximity tended to be similar
• actors similar to a liked actor tend to be liked too

applications that can exploit likes better than dislikes may want to introduce semantic similarity in an epe interface


discussion

how well did the notion of similarity come across?
• with only 5 minutes of exposure to each graph, the majority of participants found that one was organised and that the other was less so or not at all
• the word similar was repeatedly used by participants
• results serve to validate the use of the google distance metric in this area
• the similarity metric thus goes some way to replacing what is traditionally in the domain of human design decisions

the epe interface was very much appreciated
• participants liked it and wanted to use it again
• they commented that it was more like playing a game than entering their preferences


discussion

participants report that the organised graph is easier to use
• an organised graph is easier to navigate than an unorganised graph
• they find more actors who they like

addressing interface issues
• replace the right click to change a preference with three icons next to the node. a single click on an icon will designate the corresponding preference
• offer a 'mini-map' that is always zoomed out, with the absolute position of the main map indicated


discussion

when should the epe interface appear in the recommendation process?
• from the start, users should be able to use it
  • initial entry of preferences
• all throughout the recommendation process also
  • could be used to visualise learned or predicted preferences too, allowing the user to correct any mistakes at a visual level

what other benefits does a similarity-based epe interface bring?
• users become aware of the notion of similarity as used by the system
• the logic of the system, in this case the positioning of actor nodes, becomes learned
• imagine a system that uses this form of similarity to produce non-exact results for searches (e.g. cannot find any jackie chan films, would you like bruce lee films instead?)
• understanding the logic of the system is very important in developing trust in the system
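the non-exact search idea could look something like the sketch below: when a query returns nothing, suggest the most similar item that does have results. the function name and data structures are hypothetical; only the distances come from the normalised google distance table.

```python
def suggest_alternative(query, distances, available):
    """Return the available item most similar to `query`, or None.

    distances -- dict mapping frozenset({a, b}) to the distance between a and b
    available -- items for which the search does return results
    """
    candidates = [
        item for item in available
        if item != query and frozenset((query, item)) in distances
    ]
    if not candidates:
        return None
    # the smaller the distance, the more similar the item
    return min(candidates, key=lambda item: distances[frozenset((query, item))])

# distances taken from the normalised google distance table
DISTANCES = {
    frozenset(("jackie chan", "bruce lee")): 0.09,
    frozenset(("jackie chan", "jane fonda")): 0.26,
    frozenset(("bruce lee", "jane fonda")): 0.37,
}

suggestion = suggest_alternative("jackie chan", DISTANCES, ["bruce lee", "jane fonda"])
```

here a failed search for jackie chan films would lead to bruce lee being offered, as in the example on the slide.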

7 conclusion

conclusion

a new epe interface is introduced that takes data and organises it based on a robust similarity metric
• data similarities are visualised in a pleasing tree-based graph
• users can navigate through the graph and explicitly enter their preferences for different items

the interface favours elicitation of 'like' preferences
• users enter 34% more 'like' preferences when the graph is organised with the similarity metric compared to when it is left unorganised
• users report a reduction in cognitive effort when using the organised graph

the epe creation process is a robust and flexible solution to eliciting explicit user preferences in a recommendation system


the end

many thanks for your attention
