context relevance assessment for recommender systemsricci/papers/iui-2011-baltrunas.pdf ·...

Context Relevance Assessment for RecommenderSystems

Linas BaltrunasFree University ofBozen-Bolzano

Piazza Domenicani, 339100 Bozen-Bolzano, Italy

[email protected]

Bernd LudwigFree University ofBozen-Bolzano

Piazza Domenicani, 339100 Bozen-Bolzano, [email protected]

Francesco RicciFree University ofBozen-Bolzano

Piazza Domenicani, 339100 Bozen-Bolzano, Italy

[email protected]

ABSTRACTResearch on context aware recommender systems is takingfor granted that context matters. But, often attempts to showthe influence of context have failed. In this paper we con-sider the problem of quantitatively assessing context rele-vance. For this purpose we are assuming that users can imag-ine a situation described by a contextual feature, and judgeif this feature is relevant for their decision making task. Wehave designed a UI suited for acquiring such information ina travel planning scenario. In fact, this interface is genericand can also be used for other domains (e.g., music). Theexperimental results show that it is possible to identify thecontextual factors that are relevant for the given task and thatthe relevancy depends on the type of the place of interest tobe included in the plan.

Author KeywordsContext-aware, recommender systems, user preferences.

ACM Classification KeywordsH.3.3 Information Search and Retrieval: Information Filter-ing.

General TermsAlgorithms, Experimentation, Human Factors.

INTRODUCTIONRecommender Systems (RSs) are tools providing sugges-tions for items to be of use to a user [2]. Generating goodrecommendations is hard because they are evaluated subjec-tively and the RS’s knowledge about the user’s current pref-erences is largely uncertain. Even worse, the user’s deci-sion is mostly influenced by contextual conditions that differeach time the decision is taken. As an illustrative example,take the two recommended routes in Figure 1 for visiting thecity of Cles starting from Bolzano by car. Both of them arecorrect; however, they have different properties. For motor

Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, orrepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.IUI’11, February 13–16, 2011, Palo Alto, California, USA.Copyright 2011 ACM 978-1-4503-0419-1/11/02...$10.00.

bikers, route 1 would be a great experience as it includes thefamous Mendelpass while travelers with children in the carwould prefer route 2, a more comfortable, although longerroute on the highway.

As this example illustrates, often a recommendation can bemore relevant if its context is known. For this reason, context-aware recommender systems (CARSs) are gaining more andmore attention [3], and various approaches have been usedto incorporate context knowledge, improving performancemeasures, such as: mean absolute error [4], or recall [1], orprediction accuracy [9]. However, to adapt to the contextthe dependency of the user preferences from the contextualconditions must be modeled. This requires to record explicituser evaluations (ratings) for items in alternative contexts,e.g., the rating for a movie after it was watched with thepartner. Such data is difficult to obtain because it requiressubstantial user effort, since the user must provide ratingsin several contextual conditions. Moreover, one can acquiresuch ratings and later discover that the considered contextualconditions where actually irrelevant, i.e., the ratings are notinfluenced, and the RS is not improved [6]. Hence, a ma-jor issue for the design of CARSs is assessing the contextualfactors that are worth considering. This requires to formu-late informed conjectures about the influence of some data,before collecting the real data. It is a kind of active learningproblem, where the relevance of the data to be acquired mustbe estimated to minimize the data acquisition cost [10].

The main contribution of this paper is a methodology for thequantitative assessment of the dependency of the user pref-erences from a candidate set of contextual factors. It is basedon a interface for acquiring context relevance judgementsand a data analysis method for identifying the contextualfactors that are likely to influence the user decisions. Thisapproach can be adopted after a qualitative analysis, such asa diary study, has revealed the contextual factors that are po-tentially relevant for a recommendation task. This method-ology has been tested on a travel planning application aimedat recommending points of interests (POIs) to mobile users1.The mobile assistant we are developing is planned to offercontext-dependent recommendations for touristic POIs [5]that are updated as soon as the contextual conditions change.

1It is also being tested on a in car music recommendations scenario(not illustrated here for lack of space).

ACQUIRING CONTEXT RELEVANCEIn order to assess the influence of some contextual factorson the user decisions we collected data describing how userschange their inclination to visit a POI while they imaginethat a contextual condition holds. For that purpose, a largeset of contextual conditions (as found in the relevant litera-ture [11]) and a (relatively small) list of categories of POIs inBolzano (and other nearby cities) have been incorporated ina web form (see Figure 2). POIs were aggregated into cate-gories in order to avoid sparseness of the collected data. Wedefined eleven categories: castle; nature wonder; cycling andmountain biking; theater event; folk festival, arts and craftsevent; church or monastery; museum; spa and pampering;music event; walking path. In the web application, the userscould indicate the influence of these contextual conditionson their decision to visit POIs belonging to a randomly se-lected category. The influence is measured with three val-ues: positive, negative or neutral. Three different contextualconditions (i.e., values for contextual factors) were tested ina single page while a full questionnaire consisted of five ofsuch pages (as in Figure 2).

We observe that [8] already tried to estimate the impact ofcontextual conditions on the user evaluations by asking theuser to imagine a given contextual condition. They haveshown that this method must be used with care as users ratedifferently in real and supposed contexts. When the contextis just supposed there is a tendency of the users to exaggerateits importance. In fact, in our case we are trying to measureonly if a contextual factor has an influence (positive or neg-ative) on the user’s decisions and not the real value of theuser’s ratings. For instance, we want to understand if theproximity to a POI is influential, and not how the rating fora precise POI changes as a function of the user proximity.Moreover, as it is shown later, our statistical approach canpredict to what extend that a context factor does influencethe user. So, considering only conditions with high influ-ence we can reduce significantly the number of false posi-tives. Hence, our method is proposed as a tool for selectingpotentially relevant contextual factors; while the true evalua-tions/ratings of the items under alternative contextual condi-tions can be acquired in a classical way by asking the usersto rate items when they are really experienced in a contextualcondition (the next step of our future work).

Route 1 (50.7 km/58 min) Route 2 (66.2 km/55 min)

Figure 1. Comparison of different Routes from Bolzano to Cles.

Figure 2. Web survey tool.

33 participants (mostly from our computer science faculty)took part in the web survey. Overall, they gave 1524 re-sponses. In a single response to one of the questions shownin Figure 2, the user evaluates the influence of one contex-tual condition on his decision to visit an item of the givencategory. For the specification of the context, the factorsand conditions presented in Table 1 were applied in a ran-domized way: for each question a category is drawn at ran-dom along with a value (condition) for a context factor. Thissampling has been implemented such that a uniform distri-bution over the possible categories and context conditions isachieved. A different sampling is also applicable if a priordistribution is known.

ANALYSISWith the web survey we aimed at finding indications aboutwhich context factors influences user decisions whether tovisit or not a POI. As no information about the relationshipsbetween response variable and context was available, para-metric tests such as χ2 were not applicable. Therefore, anon-parametric statistical analysis seemed to be more appro-priate: The web survey delivered samples for the distribution

P (I|T,C1, . . . , CN ) =P (I, T, C1, . . . , CN )

P (T,C1, . . . , CN )≈

�N�

i=1

P (I|T,Ci)

P (I|T )

�· P (I|T )

where I (Influence) is the response variable, taking one ofthe three values: positive, negative, or neutral. T is the POIcategory, and the C1, . . . , CN are the context factors that(may or may not) influence the user decision. The proba-bilities P (I|T,Ci) model the influence of the context fac-tors on the user’s decision. The knowledge of P (I|T,Ci)can drive the acquisition of context-dependent ratings for thecontext factors that have a large probability to increase or de-

Context Factor Conditions Context Factor Conditions Context Factor Conditions Context Factor Conditionsbudget budget traveler crowdedness not crowded companion with girl-/boy-friend season spring

high spender crowded with family summerprice for quality empty with children autumn

time of the day morning time health care alone winterafternoon travel goal cultural experience with friends transport public transportnight time scenic/landscape weather snowing no means of transport

day of the week weekend education clear sky bicycleworking day hedonistic/fun sunny car

distance to POI near by social event rainy temperature warmfar away religion cloudy cold

knowledge new to city activity/sport mood happy hotabout about area citizen of the city visiting friends active time available half day

returning visitor business sad more than a dayone day

Table 1. Context factors and conditions used in the web survey

crease the user evaluation for the items in a given categoryT . Hence, it is interesting to understand which Ci have im-pact on I , or in other words, which Ci explain I better thanother context factors.

Statistical MethodologyThe spread of a categorical variable X = {x1, . . ., xn} canbe measured by looking at the entropy of the random vari-able [7]. If P (X = xi) = πi, the entropy of X is:

H(X) = −�

1≤i≤n

πi · log πi

This measure of the spread can be used to estimate the asso-ciation of two variables X1 and X2, i.e., how well one vari-able explains the other. In the considered tourist recommen-dation scenario, X1 is the variable Inclination of the user tovisit an item, whileX2 is a context factor of the current situa-tion which may have an influence on the user’s decision, e.g.the current weather condition. Informally, this influence isstrong if the knowledge about the weather reduces the spreadofX1, and it is weak if the spread ofX1 remains unchangedeven if one knows the weather. Therefore, the difference be-tween the spread ofX1 and the expected spread of (X1|X2)is a measure for the association ofX1 andX2. As the spreadof (X1|X2) should not be larger than that ofX1 alone we cannormalize the difference to the interval [0, 1] by:

U =H(X1)−H(X1|X2))

H(X1)

U is 1 if the spread of (X1|X2) is zero. This occurs if foreach value of X2 the value X1 is certain (i.e. X1 is a de-terministic function of X2). U is zero, however, if X2 doesnot have any influence on X1, in which case the spread of(X1|X2) is not different from that of X1. Using entropy tomeasure spread, we get the following formula:

U = −

�

1≤i≤k

�

1≤j≤l

πi,j · log�

πi,j

πi,•π•, j

�

�

1≤j≤l

π•,j · log π•,j

where, πi,j = P (X1 = xi, X2 = yj). X1 and X2 arecategorical variables with X1 = {x1, . . . , xk} and X2 ={y1, . . . , yl}. πi,• =

�1≤j≤l πi,j and π•,j =

�1≤i≤k πi,j .

We note that U is the mutual information ofX1 andX2 nor-malized to the [0, 1] interval.

ResultsGiven this definition, U can be used to measure how wellI (the influence) can be predicted if Ci (a context factor) isknown. Therefore, in order to understand which contextualfactors are more important in predicting whether the userwill change his inclination to visit a POI, we have computedU for all factors and POI categories. Ordering the factorsin descending value of U , one gets the results reported inthe Appendix of this paper. That table indicates that thereare some factors that indeed seem to be relevant for all thecategories, among them distance to the POI, time available,crowdedness, and knowledge of the surroundings. Othersoften appear to be less relevant: transport, travel goal, dayof the week. Finally some factors appear to have a differentrelevance depending on the category. For lack of space wecannot fully describe the dependencies shown in this table.In fact, there are results that may appear controversial, e.g.,that “distance” is not important for cycling. But, this is ex-plained by observing that here “distance” refers to “distanceto the POI”, which is actually less important for a cyclistthan for a pedestrian.

CONCLUSIONS AND FUTURE WORKIn this paper we have illustrated a methodology and a toolfor acquiring explicit users’ evaluations about the relevancyof contextual factors for item selection and recommenda-tion. Contextual information is known to impact on userdecision making but often the relationship between contextand decision is largely unknown and uncertain. Which con-textual factor is relevant in a specific decision making sit-uation is hard to predict and wrong assumptions may leadto unnecessary and misleading reasoning models. The pro-posed methodology tackles these problems and has been ap-plied to a travel planning scenario. It has been shown thattourists’ preferences are strongly influenced and vary signif-icantly with respect to context and item category. The pro-posed methodology provides quantitative measures of con-text relevancy, complementing other qualitative approachesand results coming from consumer behavior literature [11].The collected data are now being used in a mobile touristassistant that pushes new recommendations to tourists whencontextual conditions changes.

In conclusion, we have shown how the uncertain relation-ships between context and decision can be explored and mea-sured. We are applying the proposed approach in a differentdecision making scenario, namely music recommendationfor a group of passengers in a car, to understand to what ex-tend the approach can be generalized to other tasks.

REFERENCES1. G. Adomavicius, R. Sankaranarayanan, S. Sen, andA. Tuzhilin. Incorporating contextual information inrecommender systems using a multidimensionalapproach. ACM Trans. Inf. Syst., 23(1):103–145, 2005.

2. G. Adomavicius and A. Tuzhilin. Toward the nextgeneration of recommender systems: a survey of thestate-of- the-art and possible extensions. Knowledgeand Data Engineering, IEEE Transactions on,17:734–749, 2005.

3. G. Adomavicius and A. Tuzhilin. Context-awarerecommender systems. In F. Ricci, L. Rokach,B. Shapira, and P. Kantor, editors, RecommenderSystems Handbook, pages 217–253. Springer Verlag,2010.

4. L. Baltrunas and F. Ricci. Context-based splitting ofitem ratings in collaborative filtering. In RecSys ’09:Proceedings of the 2009 ACM conference onRecommender systems, October 22-25, New York,USA, 2009. ACM Press.

5. V. Bellotti, B. Begole, E. H. Chi, N. Ducheneaut,J. Fang, E. Isaacs, T. King, M. W. Newman,K. Partridge, B. Price, P. Rasmussen, M. Roberts, D. J.Schiano, and A. Walendowski. Activity-basedserendipitous recommendations with the magitti mobileleisure guide. In SIGCHI conference on Human factorsin computing systems, CHI ’08, pages 1157–1166, NewYork, NY, USA, 2008. ACM.

6. S. Jeong, S. Kalasapur, D. Cheng, H. Song, andH. Cho. Clustering and naive bayesian approaches forsituation-aware recommendation on mobile devices. InInternational Conference on Machine Learning andApplications, ICMLA 2009, Miami Beach, Florida,USA, December 13-15, 2009, pages 353–358, 2009.

7. C. J. Lloyd. Statistical Analysis of Categorical Data.Wiley-Interscience, 1999.

8. C. Ono, Y. Takishima, Y. Motomura, and H. Asoh.Context-aware preference model based on a study ofdifference between real and supposed situation data. InUser Modeling, Adaptation, and Personalization, 17thInternational Conference, UMAP 2009, Trento, Italy,June 22-26, 2009, pages 102–113, 2009.

9. C. Palmisano, A. Tuzhilin, and M. Gorgoglione. Usingcontext to improve predictive modeling of customers inpersonalization applications. IEEE Transactions onKnowledge and Data Engineering, 20(11):1535–1549,Nov. 2008.

10. N. Rubens, D. Kaplan, and M. Sugiyama. Activelearning in recommender systems. In F. Ricci,L. Rokach, B. Shapira, and P. Kantor, editors,Recommender Systems Handbook, pages 735–767.Springer Verlag, 2010.

11. J. Swarbrooke and S. Horner. Consumer Behaviour inTourism, Second Edition. Butterworth-Heinemann,2006.

APPENDIXRanking of context factors by their influence on the user de-cision to visit an place of interest.

castle

churchor

monastery

cyclingor

mountain

biking

folkfestival,

artsandcrafts

event

museum

musicevent

naturewonderspa

theaterevent

walking

distance

distance

budget

distance

distance

crowdedness

distance

distance

timeavailable

distance

knowledgeof

surroundings

timeavailable

timeavailable

temperature

budget

dayweek

dayweek

knowledgeof

surroundings

daytime

budget

timeavailable

mood

crowdedness

knowledgeof

surroundings

knowledgeof

surroundings

timeavailable

temperature

crowdedness

distance

temperature

season

daytime

season

weather

temperature

mood

crowdedness

season

budget

crowdedness

dayweek

transport

weather

timeavailable

timeavailable

companion

season

timeavailable

temperature

knowledgeof

surroundings

crowdedness

travelgoal

temperature

companion

companion

distance

timeavailable

weather

knowledgeof

surroundings

timeavailable

daytime

temperature

mood

season

weather

daytime

weather

temperature

dayweek

weather

mood

companion

knowledgeof

surroundings

budget

crowdedness

budget

companion

companion

mood

season

companion

weather

daytime

daytime

travelgoal

transport

daytime

travelgoal

travelgoal

daytime

travelgoal

crowdedness

transport

crowdedness

season

temperature

mood

daytime

season

mood

transport

season

companion

dayweek

dayweek

travelgoal

travelgoal

mood

crowdedness

dayweek

temperature

knowledgeof

surroundings

travelgoal

travelgoal

daytime

season

transport

budget

companion

transport

weather

budget

dayweek

transport

transport

knowledgeof

surroundings

budget

transport

weather

travelgoal

budget

dayweek

distance

mood

mood

weather

knowledgeof

surroundings

dayweek

transport

companion

context relevance assessment for recommender systemsricci/papers/iui-2011-baltrunas.pdf ·...

Documents