Page 1:
Slides: dmrsworkshop.inf.unibz.it/2014/slides/Paolo-Viappiani-DMRS2014.pdf

Preference Modelling and Learning

Paolo Viappiani

LIP6 - CNRS, Université Pierre et Marie Curie
www-desir.lip6.fr/∼viappianip/

DMRS workshop, Bolzano

18 September 2014

Page 2:

Outline

1 Languages for Preferences

2 Utility-based Representations

3 Preference Elicitation / Learning
   Standard vs Automated Elicitation
   Minimax-Regret
   Bayesian Approaches
   Discussion and Future Works

Page 3:

Preference Handling Systems are Everywhere

Not only recommender systems:
Computational advertisement
Intelligent user interfaces
Cognitive assistants
Personalized medicine
Personal robots

What does the theory have to say about preferences?

Page 4:

What are Preferences?

Preferences are “rational” desires.
Preferences are at the basis of any decision aiding activity.
There are no decisions without preferences.
Preferences, Values, Objectives, Desires, Utilities, Beliefs, ...

Page 5:

Are We Rational Decision Makers?

NO. Human decision makers are (often) irrational and inconsistent.
(work of Nobel prize winner Daniel Kahneman)
Moreover, preferences are often constructed during the “decision process” itself (considering specific examples).

Page 6:

Binary relations

Preference Relation
≽ ⊆ A × A is a reflexive binary relation
x ≽ y stands for “x is at least as good as y”
≽ can be decomposed into an asymmetric and a symmetric part

Asymmetric part

Strict preference ≻: x ≽ y ∧ ¬(y ≽ x)

Symmetric part

Indifference ∼1: x ≽ y ∧ y ≽ x

Incomparability ∼2: ¬(x ≽ y) ∧ ¬(y ≽ x)

Page 7:

Preference Statements

People will not directly provide their preference relation ≽.
Rather, they will provide statements about preferred states of affairs.
Consider sentences of the type:

I like red shoes.
I do not like brown sugar.
I prefer Obama to McCain.
I do not want tea with milk.
Cost is more important than safety.
I prefer flying to Athens to having a suite in Istanbul.

Page 8:

Representation Problem

Often impossible to state explicitly the preference relation (≽), especially when A is large → need for a compact representation!

Logical Languages

Weighted Logics

Conditional Logics

...

Graphical Languages

Conditional Preference networks (CP nets)

Conditional Preference networks with trade-offs

Generalized Additive Independence networks

Page 9:

Representation Languages

A compact representation is useful so that preferences can be formulated with statements that encompass several alternatives.
A preference statement: “I prefer red cars to blue cars”
But... what does this exactly mean? All red cars are preferred to blue cars? Some red cars are preferred to blue cars? There is at least one red car preferred to a blue car? I prefer the average red car to the average blue car?
The need for a principled “semantics” for preference statements.

Page 10:

Logical Languages: Preference Logics

Logical languages for preferences aim at giving a “semantics” to preference statements

Φ is a preference formula (for example: color red)

In logic you write x ⊨ Φ to say that x has the “feature” expressed by formula Φ

Mod(Φ) is the set of alternatives where Φ holds

Von Wright semantics

The statement “I prefer Φ to Ψ” actually means preferring the state of affairs Φ ∧ ¬Ψ to Ψ ∧ ¬Φ

Still not enough...

Several preference semantics: strong, optimistic, pessimistic, opportunistic,ceteris paribus

Page 11:

From Statements to Relations: Semantics
Preference Statements

I prefer Φ to Ψ (Φ and Ψ are propositional formulas)

Preference for Φ ∧ ¬Ψ over Ψ ∧ ¬Φ (von Wright’s interpretation)

Common case: boolean preferences, where Φ is a variable and Ψ its negation (I prefer a furnished apartment to an unfurnished one)

Different Semantics

Let ≽ be a preference relation. ≽ satisfies the preference statement if x ≽ y holds for:

every x, y ∈ A : x ⊨ Φ ∧ ¬Ψ, y ⊨ Ψ ∧ ¬Φ (Strong semantics)

x ⊨ Φ ∧ ¬Ψ, y ⊨ Ψ ∧ ¬Φ and additionally they have the same evaluation for the other variables (Ceteris Paribus)

x and y maximal elements of ≽ satisfying Φ ∧ ¬Ψ and Ψ ∧ ¬Φ respectively (Optimistic)

x and y minimal elements of ≽ satisfying Φ ∧ ¬Ψ and Ψ ∧ ¬Φ respectively (Pessimistic)

x maximal, y minimal element of ≽ satisfying Φ ∧ ¬Ψ and Ψ ∧ ¬Φ respectively (Opportunistic)

Page 12:

Ceteris Paribus: From Statements to Networks

Preferential Independence
Key notion in preference reasoning. It is analogous to probabilistic independence.
CP preference: a preferred to a′ ceteris paribus iff ab ≻ a′b ∀b
Conditional preference: a preferred to a′ given c iff abc ≻ a′bc ∀b, for a given c

CP networks
The notion of conditional preferential independence constitutes the main building block to develop graphical models for compactly representing complex preferences → CP networks

Page 13:

CP nets

Formalization

Each variable X is associated with a set of parents Pa(X) and a conditional preference table (CPT).
The CPT assigns, for each combination of values of the parents, a total order on the values that X can take.

CP-nets

Research questions: is assignment x preferred to y? How to find an undominated assignment of variables to values?

Start from nodes with no parents, assign the best value, look at the children, ...

Technique of “worsening flip” sequence

Notice: strong analogy with Bayesian networks
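As a rough illustration of that forward sweep, here is a minimal Python sketch; the variables, domains and preference tables below are invented for illustration and are not taken from the slides.

    # Sketch: finding the optimal (undominated) outcome of an acyclic CP-net by a forward sweep.
    # The network, variable names and preference orders are hypothetical.
    cpnet = {
        # variable: (parents, function mapping parent assignment -> values ordered best-first)
        "weather": ((), lambda _: ["sun", "rain"]),
        "activity": (("weather",), lambda p: ["beach", "museum"] if p == ("sun",) else ["museum", "beach"]),
        "drink": (("activity",), lambda p: ["juice", "wine"] if p == ("beach",) else ["wine", "juice"]),
    }

    def optimal_outcome(cpnet):
        """Assign the best value to each variable, parents before children (topological sweep)."""
        assignment = {}
        remaining = dict(cpnet)
        while remaining:
            for var, (parents, order) in list(remaining.items()):
                if all(p in assignment for p in parents):      # all parents already decided
                    parent_values = tuple(assignment[p] for p in parents)
                    assignment[var] = order(parent_values)[0]  # most preferred value given the parents
                    del remaining[var]
        return assignment

    print(optimal_outcome(cpnet))  # {'weather': 'sun', 'activity': 'beach', 'drink': 'juice'}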

Page 14:

Utility Representation

Utility function u : X → [0,1]

Ideal item x⊤ such that u(x⊤) = 1 and u(x⊥) = 0 (scaling)
Representing, eliciting u difficult in explicit form
Flat utility representation often unrealistic

        Color   Shape    Position   Weight   ...   Utility
item1   red     round    top        1kg      ...   0.82
item2   blue    square   top        2kg      ...   1
item3   green   square   bottom     5kg      ...   0.96
...     ...     ...      ...        ...      ...   ...

Page 15:

Additive Utility Functions

Additive representation (common in MAUT)
Sum of local utility functions u_i over attributes (or local value functions v_i multiplied by scaling weights)
Exponential reduction in the number of needed parameters

u(x) = \sum_{i=1}^{n} u_i(x_i) = \sum_{i=1}^{n} \alpha_i v_i(x_i)    (1)

Color   v1          Shape    v2
red     1.0         round    0
blue    0.7         square   0.2
green   0.0         star     1

Importance for attribute “color”: α1 = 0.2, for “shape”: α2 = 0.3.
Notice: many algorithms in the recommender system community (for example matrix factorization techniques) implicitly assume an additive model!
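A small Python sketch of the additive computation, using the value tables above and the stated weights α1 = 0.2 (color) and α2 = 0.3 (shape); the two items evaluated are illustrative.

    # Additive utility: u(x) = sum_i alpha_i * v_i(x_i), with the local value tables above.
    v_color = {"red": 1.0, "blue": 0.7, "green": 0.0}
    v_shape = {"round": 0.0, "square": 0.2, "star": 1.0}
    alpha = {"color": 0.2, "shape": 0.3}   # remaining weight would go to the other attributes

    def additive_utility(item):
        return alpha["color"] * v_color[item["color"]] + alpha["shape"] * v_shape[item["shape"]]

    print(additive_utility({"color": "red", "shape": "star"}))     # 0.2*1.0 + 0.3*1.0 = 0.5
    print(additive_utility({"color": "blue", "shape": "square"}))  # 0.2*0.7 + 0.3*0.2 ≈ 0.2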

Page 16:

Generalized Additive Utility

Sum of local utility functions u_J over sets of attributes (or local value functions v_J multiplied by scaling weights)

Higher descriptive power than strictly additive utilities, while still having a manageable number of parameters

u(x) = \sum_{i=1}^{m} u_{J_i}(x_{J_i}) = \sum_{i=1}^{m} \alpha_{J_i} v_{J_i}(x_{J_i})    (2)

where J_i is a set of indices, x_{J_i} the projection of x on J_i, and m the number of factors.

Color   Shape    v_color,shape
red     round    0.9
red     square   1.0
red     star     0.5
blue    round    0.4
...     ...      ...

Position   v_position
top        1
bottom     0

Importance for factor “color+shape”: α_{J1} = 0.2, for “position”: α_{J2} = 0.3.

Page 17:

Utility of Money

Money is a “special” attribute with monotonic preference

Expected Monetary Value (EMV) ≠ Expected Utility (note: this explains the St. Petersburg paradox)

For most people: U(100$) > 0.5U(200$)

U(money) is concave (most common case): risk-averse decision maker
U(money) is linear: risk-neutral
U(money) is convex: risk-seeker

Concept of Certainty Equivalent (CE)

Quasi-linear utility scale: utility expressed in a monetary scale

Moreover, influence of “status quo”→ Prospect Theory [Kahneman]

Page 18:

Classic Approaches for Utility Elicitation

Assessment of multi-attribute utility functions
Typically a long list of questions
Focus on high-risk decisions
Goal: learn utility parameters (weights) up to a small error

Which queries?
Local: focus on attributes in isolation
Global: compare complete outcomes

Standard Gamble Queries (SGQ)
Choose between option x0 for sure or a gamble ⟨x⊤, l; x⊥, 1−l⟩ (best option x⊤ with probability l, worst option x⊥ with probability 1−l)
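At the indifference point this pins down the utility directly (a standard expected-utility identity, using the scaling u(x⊤) = 1, u(x⊥) = 0 from the utility-representation slide):

    u(x_0) = l · u(x⊤) + (1 − l) · u(x⊥) = l · 1 + (1 − l) · 0 = l

so the probability l at which the user is indifferent is exactly the utility of x0 on the [0, 1] scale.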

Page 19:

(Standard) Elicitation of Additive Models

Consider an attribute (for example, color)
Ask for the best value (say, red)
Ask for the worst value (gray)
Ask a local standard gamble for each remaining color to assess its local utility value (value function)

Bound queries can be asked
Refine intervals on local utility values

Scaling factors
Define a reference outcome
Ask global queries in order to assess the difference in utility occurring when, starting from the reference outcome, “moving” a particular attribute to the best / worst

Page 20:

Automated Elicitation vs Classic Elicitation

Problems with the classic view

Standard gamble queries (and similar queries) are difficult to answer

Large number of parameters to assess

Unreasonable precision required

Cognitive or computational cost may outweigh benefit

Automated Elicitation and Recommendation

Important points:

Cognitively plausible forms of interaction

Incremental elicitation until a decision is possible

We can often make optimal decisions without full utility information

Generalization across users

Page 21:

Adaptive Utility Elicitation
Utility-based Interactive Recommender System:

Bel : belief about the user’s utility function u

Opt(Bel): optimal decision given incomplete beliefs about u

Algorithm: Adaptive Utility Elicitation

1 Repeat until Bel meets some termination condition
   1 Ask user some query
   2 Observe user response r
   3 Update Bel given r
2 Recommend Opt(Bel)

Types of Beliefs
• Probabilistic Uncertainty: distribution over parameters, updated using Bayes
• Strict Uncertainty: feasible region (if linear constraints: convex polytope)
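A minimal Python skeleton of this loop, under the assumption that the query-selection, belief-update, termination and recommendation routines are supplied by one of the concrete frameworks discussed below (all function names here are placeholders):

    # Generic adaptive utility elicitation loop (sketch; the helper functions are placeholders).
    def adaptive_elicitation(belief, select_query, ask_user, update, terminate, recommend):
        while not terminate(belief):
            query = select_query(belief)               # e.g. current-solution comparison, EVOI-optimal query
            response = ask_user(query)                 # observe the user's answer
            belief = update(belief, query, response)   # Bayes update, or a new linear constraint
        return recommend(belief)                       # e.g. minimax-regret or highest expected utility item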

Page 22:

Which Recommendations? Which Queries?

Several Frameworks. The questions:

How to represent the uncertainty over a utility function?

How to aggregate this uncertainty? Make recommendations ?

How to choose the next query? How to recommend a set, a ranking, ...

                             Minimax Regret        Maximin                Bayesian
Knowledge representation     constraints           constraints            prob. distribution
Which option to recommend?   minimax regret        maximin utility        expected utility
Which query to ask next?     worst-case regret     worst-case maximin     expected value
                             reduction             improvement            of information

Possible other choices: Hurwicz criterion, and others.
Hybrid models: minimax regret with expected regret reduction, ...

Page 23:

Minimax Regret
Intuition
Adversarial game: the recommender selects the item reducing the “regret” wrt the “best” item when the unknown parameters are chosen by the adversary

Robust criterion for decision making under uncertainty [Savage; Kouvelis]

Shown to be effective when used for decision making under utility uncertainty [Boutilier et al., 2006] and as a driver for elicitation

Advantages

No heavy Bayesian updates

No prior assumption required

MMR computation suggests queries to ask to the user

Limitations

No account for noisy responses

Formulation of the optimization depends on the assumption about the utility

Page 24:

Minimax Regret

Assumption: a set of feasible utility functions W is given

The pairwise max regret

PMR(x, y; W) = max_{w∈W} [u(y; w) − u(x; w)]

The max regret

MR(x; W) = max_{y∈X} PMR(x, y; W)

The minimax regret

MMR(W) of W and the minimax-optimal item x∗_W:

MMR(W) = min_{x∈X} MR(x; W)

x∗_W = arg min_{x∈X} MR(x; W)

Page 25:

Example

item   feature1   feature2
a      10         14
b      8          12
c      7          16
d      14         9
e      15         6
f      16         0

Linear utility model with normalized utility weights (w1 + w2 = 1);
u(x; w) = (1−w2) x1 + w2 x2 = (x2 − x1) w2 + x1

Notice: it is a 1 dimensional problem

Initially, we only know that w2 ∈ [0, 1]

PMR(a, f; w2) = max_{w2} u(f; w2) − u(a; w2) = max_{w2} 6(1−w2) − 14 w2 = max_{w2} 6 − 20 w2 = 6 (for w2 = 0)

PMR(a, b; w2) = max_{w2} u(b; w2) − u(a; w2) < 0
(a dominates b; there can’t be regret in choosing a instead of b!)

PMR(a, c; w2) = max_{w2} −3(1−w2) + 2 w2 = 2 (for w2 = 1) ...

Page 26:

Example (continued)

item   feature1   feature2
a      10         14
b      8          12
c      7          16
d      14         9
e      15         6
f      16         0

Linear utility model with normalized utility weights (w1 + w2 = 1);
u(x; w) = (1−w2) x1 + w2 x2 = (x2 − x1) w2 + x1

Notice: it is a 1 dimensional problem

Computation of the pairwise regret table.

PMR(·,·)    a    b    c    d    e    f    MR
a           0   -2    2    4    5    6     6
b           2    0    4    6    7    8     8
c           3    1    0    7    8    9     9
d           5    3    7    0    1    2     7
e           8    6   10    3    0    1    10
f          14   12   16    9    6    0    16

The MMR-optimal solution is a, the adversarial choice is f, and the minimax regret value is 6.

In reality there is no need to compute the full table (tree search methods) [Braziunas, PhD Thesis, 2011]

Now, we want to ask a new query to improve the decision. A very successful strategy (though generally not optimal!) is the current solution strategy: ask the user to compare a and f.
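A short Python sketch reproducing the table above; it relies on the fact that, for this linear one-dimensional model, the maximum of u(y; w2) − u(x; w2) over w2 ∈ [0, 1] is attained at an endpoint, so it only checks w2 ∈ {0, 1}.

    # Pairwise max regret for the example items under u(x; w2) = (1-w2)*x1 + w2*x2, w2 in [0, 1].
    items = {"a": (10, 14), "b": (8, 12), "c": (7, 16), "d": (14, 9), "e": (15, 6), "f": (16, 0)}

    def utility(x, w2):
        return (1 - w2) * x[0] + w2 * x[1]

    def pmr(x, y):
        # linear in w2, so the maximum over [0, 1] is reached at w2 = 0 or w2 = 1
        return max(utility(y, w2) - utility(x, w2) for w2 in (0.0, 1.0))

    max_regret = {name: max(pmr(x, y) for y in items.values()) for name, x in items.items()}
    print(max_regret)                            # {'a': 6.0, 'b': 8.0, 'c': 9.0, 'd': 7.0, 'e': 10.0, 'f': 16.0}
    print(min(max_regret, key=max_regret.get))   # 'a' (minimax regret value 6)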

Page 27:

INCREMENTAL ELICITATION WITH MINIMAX REGRET

OBSERVATIONS

If P ⊂ P′ then:

Θ_{P′} ⊂ Θ_P

PMR(x, y; Θ_{P′}) ≤ PMR(x, y; Θ_P) for all x, y ∈ X
MR(x; Θ_{P′}) ≤ MR(x; Θ_P) for all x ∈ X
MMR(Θ_{P′}) ≤ MMR(Θ_P)

→ Adding preference statements cannot increase MMR (and often decreases it); justification for interactive elicitation with questions.

WHICH STRATEGY TO ASK QUESTIONS?

Worst-case Minimax Regret: arg min_{q∈Q} max_r MMR(X; Θ_{P∪{r}})

Current Choice Strategy: x∗ ≽ y∗ or x∗ ≼ y∗? (do you prefer x∗ or y∗?)

Page 28:

A graphical illustration for linear utility model (1/3)

Page 29:

A graphical illustration for linear utility model (2/3)

Page 30:

A graphical illustration for linear utility model (3/3)

Page 31:

Minimax Regret Computation
Computation of Pairwise Max Regret as a Linear Program

Objective function: max_{w∈W} w · (y − x)

Usually W expressed by linear constraints such as w · a ≥ w · b for a ≽ b ∈ P (a set of comparisons)

Computation of Minimax Regret

Naive approach: test all n² − n combinations of choices

Better idea: implement a search problem [Braziunas, 2011]

i = choice of recommender, j = choice of adversary
UB: upper bound on minimax regret (max regret of the best solution found)
LB_i: lower bound on the max regret of option i
After testing i against j: LB_i ← max(LB_i, PMR(i, j))
Whenever LB_i ≥ UB: prune option i

Empirically, a small number of PMR checks is needed.
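A Python sketch of this pruning scheme, assuming a pmr(i, j) routine (e.g. an LP solve) is available; running it on the 4×4 table used in the next slides performs 8 PMR checks and returns option 2 with minimax regret 2.

    # Minimax regret by pruned pairwise search: skip options whose lower bound already
    # exceeds the best max regret found so far (sketch; pmr(i, j) is assumed given).
    def minimax_regret(options, pmr):
        upper_bound, best = float("inf"), None
        for i in options:
            lb_i, pruned = 0.0, False
            for j in options:
                if j is i:
                    continue
                lb_i = max(lb_i, pmr(i, j))   # after testing i against j
                if lb_i >= upper_bound:       # i cannot beat the incumbent: prune
                    pruned = True
                    break
            if not pruned:
                upper_bound, best = lb_i, i   # max regret of i is now fully known
        return best, upper_bound

    # the 4x4 example table used on the following slides
    table = {(1, 1): 0, (1, 2): 1, (1, 3): 2, (1, 4): 3, (2, 1): 2, (2, 2): 0, (2, 3): 2, (2, 4): 2,
             (3, 1): 4, (3, 2): 1, (3, 3): 0, (3, 4): 1, (4, 1): 3, (4, 2): 2, (4, 3): 3, (4, 4): 0}
    print(minimax_regret([1, 2, 3, 4], lambda i, j: table[(i, j)]))  # (2, 2): option 2, minimax regret 2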

Page 32:

Example of Minimax Regret Computation

Example
Complete “pairwise max regret” table

PMR(i,j)   j=1   j=2   j=3   j=4   MR(i)
i=1         0     1     2     3     3
i=2         2     0     2     2     2
i=3         4     1     0     1     4
i=4         3     2     3     0     3

Evaluation (0 PMR checks)
PMR(i,j)   j=1   j=2   j=3   j=4   LBi
i=1         0     ?     ?     ?     0
i=2         ?     0     ?     ?     0
i=3         ?     ?     0     ?     0
i=4         ?     ?     ?     0     0

UB = +Inf

Page 33:

Example of Minimax Regret Computation

Example
Complete “pairwise max regret” table

PMR(i,j)   j=1   j=2   j=3   j=4   MR(i)
i=1         0     1     2     3     3
i=2         2     0     2     2     2
i=3         4     1     0     1     4
i=4         3     2     3     0     3

Evaluation (1 PMR check)
PMR(i,j)   j=1   j=2   j=3   j=4   LBi
i=1         0     1     ?     ?     1
i=2         ?     0     ?     ?     0
i=3         ?     ?     0     ?     0
i=4         ?     ?     ?     0     0

UB = +Inf

Page 34:

Example of Minimax Regret Computation

Example
Complete “pairwise max regret” table

PMR(i,j)   j=1   j=2   j=3   j=4   MR(i)
i=1         0     1     2     3     3
i=2         2     0     2     2     2
i=3         4     1     0     1     4
i=4         3     2     3     0     3

Evaluation (3 PMR checks)
PMR(i,j)   j=1   j=2   j=3   j=4   LBi
i=1         0     1     2     3     3
i=2         ?     0     ?     ?     0
i=3         ?     ?     0     ?     0
i=4         ?     ?     ?     0     0

UB = 3

Page 35:

Example of Minimax Regret Computation

Example
Complete “pairwise max regret” table

PMR(i,j)   j=1   j=2   j=3   j=4   MR(i)
i=1         0     1     2     3     3
i=2         2     0     2     2     2
i=3         4     1     0     1     4
i=4         3     2     3     0     3

Evaluation (6 PMR checks)
PMR(i,j)   j=1   j=2   j=3   j=4   LBi
i=1         0     1     2     3     3
i=2         2     0     2     2     2
i=3         ?     ?     0     ?     0
i=4         ?     ?     ?     0     0

UB = 2

Page 36:

Example of Minimax Regret Computation

Example
Complete “pairwise max regret” table

PMR(i,j)   j=1   j=2   j=3   j=4   MR(i)
i=1         0     1     2     3     3
i=2         2     0     2     2     2
i=3         4     1     0     1     4
i=4         3     2     3     0     3

Evaluation (7 PMR checks)
PMR(i,j)   j=1   j=2   j=3   j=4   LBi
i=1         0     1     2     3     3
i=2         2     0     2     2     2
i=3         4     ?     0     ?     4
i=4         ?     ?     ?     0     0

UB = 2

Page 37:

Example of Minimax Regret Computation

Example
Complete “pairwise max regret” table

PMR(i,j)   j=1   j=2   j=3   j=4   MR(i)
i=1         0     1     2     3     3
i=2         2     0     2     2     2
i=3         4     1     0     1     4
i=4         3     2     3     0     3

Evaluation (8 PMR checks)
PMR(i,j)   j=1   j=2   j=3   j=4   LBi
i=1         0     1     2     3     3
i=2         2     0     2     2     2
i=3         4     ?     0     ?     4
i=4         3     ?     ?     0     3

UB = 2

Page 38:

Different Aggregation Functions
x = (x1, . . . , xn), y = (y1, . . . , yn) → x ≽ y ⇔ u(x; ω) ≥ u(y; ω)
Weighting parameters provide a control on:

the type of compromise sought in Multicriteria Decision Making

the attitude towards equity in Social Choice

Standard weighting parameters (weights attached to criteria)

• Weighted sum: u(x; ω) = \sum_{i=1}^{n} ω_i x_i

• Weighted Tchebycheff: u(x; ω) = \max_{i∈[[1;n]]} ω_i (x^*_i − x_i) / (x^*_i − x_{*i})

Rank-dependent weighting parameters (weights attached to ranks)

• OWA: u(x; w) = \sum_{i=1}^{n} w_i x_{(i)}

• Choquet: u(x; v) = \sum_{i=1}^{n} [x_{(i)} − x_{(i−1)}] v(X_{(i)})

Page 39:

Application to OWA

Positive normalized weights (general case)

u(x; w) = \sum_{i=1}^{n} w_i x_{(i)}   non-linear in x but linear in w!

u(x; w) ≥ u(y; w) ⇐⇒ w · x↑ ≥ w · y↑

where x↑ is the vector x sorted in increasing order → a preference of type x ≽ y is equivalent to a linear inequality

Positive normalized decreasing weights (fair optimization)
OWA with decreasing weights (wi > wj whenever i < j) ensures compatibility with Pigou-Dalton transfers, i.e.:
∀i, j : xj > xi, ∀ε ∈ (0, xj − xi) : (x1, . . . , xi + ε, . . . , xj − ε, . . . , xn) ≻ (x1, . . . , xn)
→ add inequalities of type wi − wi+1 ≥ δ with δ > 0
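A small Python sketch of the OWA value; the weights are illustrative (decreasing and summing to 1), and the two vectors reuse the fairness example from the Choquet slide further below.

    # OWA: u(x; w) = sum_i w_i * x_(i), with the components of x sorted in increasing order.
    def owa(x, w):
        return sum(wi * xi for wi, xi in zip(w, sorted(x)))

    w = [0.5, 0.3, 0.2]          # decreasing weights -> compatible with Pigou-Dalton transfers
    print(owa((18, 18, 0), w))   # 0.5*0 + 0.3*18 + 0.2*18 = 9.0
    print(owa((12, 12, 12), w))  # 12.0: the balanced vector gets the higher value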

Page 40:

Regret Computation for OWA

Dataset Manipulation

x↑ = (x(1), ..., x(n)) permutation of features from worst to best

PMR for OWA

max_w   w · (y↑ − x↑)                                    (3)
s.t.    0 ≤ w_i ≤ 1   ∀i                                 (4)
        \sum_i w_i = 1                                    (5)
        w_i − w_{i+1} ≥ δ                                 (6)
        w · a↑ ≥ w · b↑   ∀a, b s.t. a ≽ b ∈ P            (7)

Page 41:

Application to Choquet integrals

C_v(x) = x_{(1)} v(X_{(1)}) + \sum_{i=2}^{n} [x_{(i)} − x_{(i−1)}] v(X_{(i)})

where x_{(i)} ≤ x_{(i+1)} for all i = 1, . . . , n−1 and X_{(i)} = {j ∈ N : x_j ≥ x_{(i)}}

Example

     ∅     {1}    {2}    {3}    {1,2}   {1,3}   {2,3}   {1,2,3}
v    0     0.1    0.2    0.3    0.5     0.6     0.7     1

x = (10,6,14) and y = (10,12,8)

C_v(x) = 6 + (10 − 6) v({1,3}) + (14 − 10) v({3}) = 9.6
C_v(y) = 8 + (10 − 8) v({1,2}) + (12 − 10) v({2}) = 9.4

With Cv we observe that x is preferred to y .
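A short Python sketch reproducing this computation from the capacity table above (criteria are indexed 0, 1, 2 instead of 1, 2, 3):

    # Choquet integral C_v(x) = x_(1) v(X_(1)) + sum_{i>=2} (x_(i) - x_(i-1)) v(X_(i)).
    def choquet(x, v):
        n = len(x)
        order = sorted(range(n), key=lambda i: x[i])    # criteria sorted by increasing value
        total, prev = 0.0, 0.0
        for rank, i in enumerate(order):
            level_set = frozenset(order[rank:])         # X_(i) = {j : x_j >= x_(i)}
            total += (x[i] - prev) * v[level_set]
            prev = x[i]
        return total

    v = {frozenset(): 0, frozenset({0}): 0.1, frozenset({1}): 0.2, frozenset({2}): 0.3,
         frozenset({0, 1}): 0.5, frozenset({0, 2}): 0.6, frozenset({1, 2}): 0.7,
         frozenset({0, 1, 2}): 1.0}
    print(round(choquet((10, 6, 14), v), 10))   # 9.6
    print(round(choquet((10, 12, 8), v), 10))   # 9.4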

Page 42:

Fairness and convex capacities in the Choquet integral

v is said to be convex or supermodular when v(A ∪ B) + v(A ∩ B) ≥ v(A) + v(B) for all A, B ⊆ N

Proposition (Chateauneuf and Tallon, 1999)
When preferences are represented by a Choquet integral, then choosing v convex is equivalent to the following property:
∀x^1, x^2, . . . , x^p ∈ R^n, ∀k ∈ {1, 2, . . . , p} and ∀i ∈ {1, 2, . . . , p}, λ_i ≥ 0 such that \sum_{i=1}^{p} λ_i = 1, we have:

C_v(x^1) = C_v(x^2) = . . . = C_v(x^p) ⇒ C_v(\sum_{i=1}^{p} λ_i x^i) ≥ C_v(x^k)

Example (convex v and fairness)
x = (18, 18, 0), y = (0, 18, 18), z = (18, 0, 18)
t = (12, 12, 12) = (x + y + z)/3
v convex ⇒ [C_v(x) = C_v(y) = C_v(z) ⇒ t ≽ x, t ≽ y, t ≽ z]

Page 43:

COMPUTATION OF PMR FOR CHOQUET: GENERAL CASE

CAPACITY

The function v : 2^N → [0, 1] is a normalized capacity if:

v(∅) = 0, v(N) = 1

v(A) ≤ v(B) for all A ⊂ B ⊆ N (monotonicity)

PMR(x, y; Θ_P) = max_v   C_v(y) − C_v(x)
s.t.   v(∅) = 0
       v(N) = 1
       v(A) ≤ v(A ∪ {i})   ∀A ⊂ N, ∀i ∈ N∖A
       C_v(a) ≥ C_v(b)     ∀a, b s.t. a ≽ b ∈ P

LP with an exponential number of variables and constraints
At most 2n variables in the objective function

Page 45:

Choquet integral and Möbius masses
An alternative representation in terms of the Möbius inverse:

Möbius inverse and Möbius masses

To any set function v : 2^N → R is associated a mapping m : 2^N → R, called the Möbius inverse, defined by:

∀A ⊆ N :   m(A) = \sum_{B⊆A} (−1)^{|A∖B|} v(B),    v(A) = \sum_{B⊆A} m(B)

Coefficients m(B) for B ⊆ A are called Möbius masses.

Remark: v convex if Möbius masses positive (belief function, Shafer 1976)

Choquet integral as a function of Möbius masses

C_v(x) = \sum_{B⊆N} m(B) ∧_{i∈B} x_i   (where ∧ denotes the minimum)

2-additive capacity (m(B) = 0 if |B| > 2) → capacity completely characterized by (n² + n)/2 coefficients. Good compromise between compactness of the model and expressivity.
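A Python sketch computing Möbius masses from a capacity and re-deriving the Choquet value as Σ_B m(B) min_{i∈B} x_i, checked against the example capacity of the earlier Choquet slide:

    from itertools import combinations

    # Möbius inverse m(A) = sum_{B subset of A} (-1)^{|A\B|} v(B); C_v(x) = sum_B m(B) min_{i in B} x_i.
    def subsets(s):
        s = list(s)
        for k in range(len(s) + 1):
            for c in combinations(s, k):
                yield frozenset(c)

    def moebius(v):
        return {A: sum((-1) ** len(A - B) * v[B] for B in subsets(A)) for A in v}

    def choquet_from_masses(x, m):
        return sum(mass * min(x[i] for i in A) for A, mass in m.items() if A)

    v = dict(zip(subsets([0, 1, 2]),   # order: {}, {0}, {1}, {2}, {0,1}, {0,2}, {1,2}, {0,1,2}
                 [0, 0.1, 0.2, 0.3, 0.5, 0.6, 0.7, 1.0]))
    m = moebius(v)
    print(round(choquet_from_masses((10, 6, 14), m), 10))   # 9.6, matching the direct computation
    print(round(choquet_from_masses((10, 12, 8), m), 10))   # 9.4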

Page 46:

PMR for 2-additive Choquet
m: vector of variables (Möbius masses) encoding a capacity v:
m = (m_1, . . . , m_n, m_12, m_13, . . . , m_1n, m_23, . . . , m_2n, . . . , m_{n−1,n})
x = (x_1, . . . , x_n, x_12, x_13, . . . , x_1n, x_23, . . . , x_2n, . . . , x_{n−1,n}), with x_ij = x_i ∧ x_j

Therefore we can write C_v(x) = m · x

The case of a 2-additive capacity

max_m   m · (y − x)                                                          (8)
s.t.    m_i + \sum_{j∈J∖{i}} m_ij ≥ 0,   i = 1, . . . , n;  ∀J : {i} ⊆ J ⊆ N   (9)
        \sum_{i=1}^{n} m_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} m_ij = 1           (10)
        m · a ≥ m · b   ∀a, b s.t. a ≽ b ∈ P

Constraint (10) ensures that the capacity is normalized. Monotonicity (9) requires n·2^{n−1} constraints. Only for 2-additive capacities: efficient formulation in terms of a convex combination of extreme masses [Hullermeier 2012].

Page 47:

PMR for 2-additive Belief Functions
m: vector of variables (Möbius masses) encoding a capacity v:
m = (m_1, . . . , m_n, m_12, m_13, . . . , m_1n, m_23, . . . , m_2n, . . . , m_{n−1,n})
x = (x_1, . . . , x_n, x_12, x_13, . . . , x_1n, x_23, . . . , x_2n, . . . , x_{n−1,n}), with x_ij = x_i ∧ x_j

C_v(x) = m · x

Belief functions are capacities with non-negative Möbius masses (m_i ≥ 0, m_ij ≥ 0)

The case of (2-additive) belief functions

max_m   m · (y − x)                                                     (11)
s.t.    \sum_{i=1}^{n} m_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} m_ij = 1    (12)
        m_i ≥ 0,   i = 1, . . . , n                                      (13)
        m_ij ≥ 0,  i = 1, . . . , n,  j = i+1, . . . , n                  (14)
        m · a ≥ m · b   ∀a, b s.t. a ≽ b ∈ P

Page 48:

PREFERENCE QUERIES OF TYPE “1_A0 ≽ Λ?”
COMPARISON BETWEEN:
1_A0, the fictitious alternative such that (1_A0)_i = 1 if i ∈ A and (1_A0)_i = 0 otherwise, and
Λ, the fictitious alternative such that Λ_i = λ for all i ∈ N.

OBSERVATIONS
C_v(1_A0) = v(A) for all A ⊆ N
C_v(Λ) = λ for all λ ∈ R

CONSEQUENCES
1_A0 ≽ Λ ⇔ v(A) ≥ λ ⇔ v(B) ≥ λ for all B ⊇ A
1_A0 ≼ Λ ⇔ v(A) ≤ λ ⇔ v(B) ≤ λ for all B ⊆ A

PROPOSITION 1: COMPATIBILITY WITH PREFERENCES P
By updating [l_B, u_B] for all B ⊇ A (resp. B ⊆ A) at each insertion of a preference 1_A0 ≽ Λ (resp. 1_A0 ≼ Λ) into P, compatibility with P is ensured.

Page 50:

COMPUTATION OF PMR FOR CHOQUET

Let P be preference statements obtained by asking only 1_A0 ≽ Λ queries.

PROPOSITION 2 : EXISTENCE OF A COMPATIBLE CAPACITY

Let v|_D : 2^N → [0, 1] be a partial function defined on D ⊆ 2^N such that:
• v|_D(A) ∈ [l_A, u_A] for all A ∈ D
• v|_D(A) ≤ v|_D(B) for all A, B ∈ D such that A ⊂ B
In this case, v|_D can be completed into a capacity of Θ_P.

PMR(x, y; Θ_P) = max_v   C_v(y) − C_v(x)
s.t.   v(X_{(i+1)}) ≤ v(X_{(i)})   ∀i ∈ {1, . . . , n−1}
       v(Y_{(i+1)}) ≤ v(Y_{(i)})   ∀i ∈ {1, . . . , n−1}
       v(X_{(i)}) ≤ v(Y_{(j)})     ∀i, j ∈ N s.t. X_{(i)} ⊂ Y_{(j)} and X_{(i)} ⊄ Y_{(j+1)}
       v(Y_{(i)}) ≤ v(X_{(j)})     ∀i, j ∈ N s.t. Y_{(i)} ⊂ X_{(j)} and Y_{(i)} ⊄ X_{(j+1)}
       l_{X_{(i)}} ≤ v(X_{(i)}) ≤ u_{X_{(i)}}   ∀i ∈ N
       l_{Y_{(i)}} ≤ v(Y_{(i)}) ≤ u_{Y_{(i)}}   ∀i ∈ N

→ LP with at most 2(n−1) variables and 4(n−1) + 2n = 6n − 4 constraints.

Page 52:

COMPUTATION OF PMR FOR CHOQUET

A closer look at the Objective Function

Remember that C_v(x) = x_{(1)} v(X_{(1)}) + \sum_{i=2}^{n} [x_{(i)} − x_{(i−1)}] v(X_{(i)}).

C_v(y) − C_v(x) = \sum_{A ∈ {X_{(i)} | X_{(i)} ≠ Y_{(i)}}} −(x_{(i)} − x_{(i−1)}) v_A          (15)
                + \sum_{A ∈ {Y_{(i)} | Y_{(i)} ≠ X_{(i)}}} (y_{(i)} − y_{(i−1)}) v_A            (16)
                + \sum_{A ∈ {X_{(i)} | X_{(i)} = Y_{(i)}}} (y_{(i)} − y_{(i−1)} − x_{(i)} + x_{(i−1)}) v_A   (17)

The objective function has to be maximized: v_A will be as small as possible for all A ∈ {X_{(i)} | X_{(i)} ≠ Y_{(i)}} and as large as possible for all A ∈ {Y_{(i)} | Y_{(i)} ≠ X_{(i)}}.

Page 53:

COMPUTATION OF PMR FOR CHOQUET

Let P be preference statements obtained by asking only 1_A0 ≽ Λ queries.

PMR(x, y; Θ_P) = max_v   C_v(y) − C_v(x)
s.t.   v(X_{(i+1)}) ≤ v(X_{(i)})   ∀i ∈ {1, . . . , n−1}
       v(Y_{(i+1)}) ≤ v(Y_{(i)})   ∀i ∈ {1, . . . , n−1}
       v(X_{(i)}) ≤ v(Y_{(j)})     ∀i, j ∈ N s.t. X_{(i)} ⊂ Y_{(j)} and X_{(i)} ⊄ Y_{(j+1)}
       v(Y_{(i)}) ≤ v(X_{(j)})     ∀i, j ∈ N s.t. Y_{(i)} ⊂ X_{(j)}, Y_{(i)} ⊄ X_{(j+1)}, and Y_{(i−1)} ⊄ X_{(j)}
       l_{X_{(i)}} ≤ v(X_{(i)}) ≤ u_{X_{(i)}}   ∀i ∈ N
       l_{Y_{(i)}} ≤ v(Y_{(i)}) ≤ u_{Y_{(i)}}   ∀i ∈ N

→ LP with at most 2(n−1) variables and 3(n−1) + 2n = 5n − 3 constraints.

[Benabbou et al., 2014]

Page 54:

QUERY STRATEGIES FOR CHOQUET

SELECTION CRITERIA: WORST-CASE MINIMAX REGRET

min_{A⊆N} min_λ max{ MMR(X; Θ_{P∪{1_A0 ≽ Λ}}), MMR(X; Θ_{P∪{1_A0 ≼ Λ}}) }

(myopic measure of “value of information”)

OBSERVATIONS

For all A ⊆ N :

MMR(X; Θ_{P∪{1_A0 ≽ Λ}}) is a function of λ decreasing on [l_A, u_A]

MMR(X; Θ_{P∪{1_A0 ≼ Λ}}) is a function of λ increasing on [l_A, u_A]

These two functions have the same maximum

CONSEQUENCES
MMR(X; Θ_{P∪{1_A0 ≽ Λ}}) and MMR(X; Θ_{P∪{1_A0 ≼ Λ}}) necessarily have an intersection.
Minimization over λ can be done with dichotomic (binary) search.

Page 55:

QUERY STRATEGIES FOR CHOQUET

OBSERVATION 1: There are exactly 2^n − 2 possible subsets of criteria A ⊆ N.

HEURISTIC

Consider only the attributes involved in the objective PMR(x∗, y∗; Θ_P), for the x∗ and y∗ associated with the minimax regret.

REMARK 2: It is possible that no single query can reduce MMR

TIE-BREAKING
Choose A minimizing, for λ = (l_A + u_A)/2:

0.5 MMR(X; Θ_{P∪{1_A0 ≽ Λ}}) + 0.5 MMR(X; Θ_{P∪{1_A0 ≼ Λ}})

[Benabbou et al., 2014]

Page 56:

Experiment 1

10-dimensional knapsack problem with 1000 items. Interactive elicitation with a simulated user; minimax regret assuming a Choquet integral, when the user is answering according to different underlying models.

Page 57:

Experiment 2

Datasets of 100 alternatives evaluated on 10 criteria and characterized by a set of performance vectors X^a are randomly generated; constructed in such a way that \sum_{i=1}^{n} x_i^a = 1 for all x ∈ X^a, where a ∈ {0.5, 1, 2}, so as to obtain different types of Pareto sets.

Page 58:

Which Recommendations? Which Queries?

Several Frameworks. The questions:

How to represent the uncertainty over a utility function?

How to aggregate this uncertainty? Make recommendations ?

How to choose the next query? How to recommend a set, a ranking, ...

                             Minimax Regret        Maximin                Bayesian
Knowledge representation     constraints           constraints            prob. distribution
Which option to recommend?   minimax regret        maximin utility        expected utility
Which query to ask next?     worst-case regret     worst-case maximin     expected value
                             reduction             improvement            of information

Possible other choices: Hurwicz criterion, and others.
Hybrid models: minimax regret with expected regret reduction, ...

Page 59:

Bayesian Framework for Recommendation and Elicitation

Let’s assume utility u(x; w) parametric in w for a given structure, for example u(x; w) = w · x
P(w): probability distribution over utility functions

Expected utility of a given item x: EU(x) = ∫ u(x; w) P(w) dw

Current expected utility of the best recommendation x∗:
EU∗ = max_{x∈A} EU(x);   x∗ = arg max_{x∈A} EU(x)

When a new preference is known (for instance, the user prefers apples over oranges), the distribution is updated according to Bayes’ rule (Monte Carlo methods, Expectation Propagation)

(figures: a possible prior distribution, and the distribution updated after user feedback)

Page 60:

Response Models

Model the user’s cognitive ability to answer a preference query correctly

Noiseless responses (unrealistic but often assumed in research papers!)

Constant error (can model distraction, e.g. clicking on the wrong icon)

Logistic error (Boltzmann distribution), a commonly used probabilistic response model for comparison/choice queries: “Among options in set S, which one do you prefer?”

Probability of response “x is my preferred item in S”:

Pr(S → x) = e^{γ u(x)} / \sum_{y∈S} e^{γ u(y)}

γ is a temperature parameter (how “noisy” is the user).

For comparison queries (“Is item1 better than item2?”): P(selecting the 1st item) as a function of the difference in utility
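A minimal Python sketch of this response model; the utilities and the temperature below are illustrative:

    import math

    # Logistic (Boltzmann) response model: Pr(S -> x) = exp(gamma*u(x)) / sum_y exp(gamma*u(y)).
    def choice_probabilities(utilities, gamma):
        weights = {x: math.exp(gamma * u) for x, u in utilities.items()}
        z = sum(weights.values())
        return {x: w / z for x, w in weights.items()}

    print(choice_probabilities({"item1": 0.8, "item2": 0.6}, gamma=5.0))
    # item1 gets probability ~0.73; as gamma grows the choice becomes deterministic (noiseless user)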

Page 61:

What Query to Ask Next?

The problem can be modeled as a POMDP [Boutilier, AAAI 2002]; however, it is impractical to solve for non-trivial cases

Idea: ask query with highest “value”, a posteriori improvement in decision quality

In a Bayesian approach (Myopic) Expected Value of Information

EVOI_θ(q) = \sum_{r∈R} P_θ(r) · EU∗_{θ|r} − EU∗_θ

where R is the set of possible responses (answers); θ is the current belief distribution, θ|r the posterior, and P_θ(r) the prior probability of a given response.

Ask the query q∗ = arg max_q EVOI_θ(q) with highest EVOI

In a non-Bayesian setting, one can use non-probabilistic measures of decision improvement (for example, worst-case regret reduction, ...)
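A rough Python sketch of myopic EVOI for a pairwise comparison query, under the assumptions of a linear utility u(x; w) = w · x, an equal-weight particle approximation of the belief over w, and the logistic response model above; all numbers in the usage example are illustrative.

    import math
    import numpy as np

    def eu_star(particles, weights, items):
        # expected utility of the best single recommendation under the weighted particle belief
        return max(sum(p * float(w @ x) for p, w in zip(weights, particles)) for x in items)

    def evoi_comparison(particles, items, a, b, gamma=1.0):
        prior = np.full(len(particles), 1.0 / len(particles))
        value = -eu_star(particles, prior, items)
        for preferred, other in ((a, b), (b, a)):
            # probability that the user answers "preferred" under each particle (logistic model)
            lik = np.array([1.0 / (1.0 + math.exp(-gamma * float(w @ (preferred - other))))
                            for w in particles])
            p_response = float(lik @ prior)         # prior probability of this answer
            posterior = lik * prior / p_response    # Bayes update of the particle weights
            value += p_response * eu_star(particles, posterior, items)
        return value                                # myopic EVOI is non-negative (up to rounding)

    # toy usage: 2 features, 3 candidate items, belief approximated by 4 weight particles
    items = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.6, 0.6])]
    particles = [np.array([0.9, 0.1]), np.array([0.6, 0.4]),
                 np.array([0.4, 0.6]), np.array([0.1, 0.9])]
    print(evoi_comparison(particles, items, items[0], items[1], gamma=5.0))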

Page 62:

Efficient Computation of Queries

How to optimize VoI depends on the query type; in general there are too many possible queries to iterate over them

For comparison queries there are n(n−1)/2 possible queries

For choice queries of size k there are \binom{n}{k} candidate queries

For bound queries: continuous space

Optimal Query Sets and Recommendation Sets

Characterization of the myopically optimal choice query [Viappiani and Boutilier, 2010]

Tight connection with problem of generating a set of recommendations

The optimization problem is submodular → greedy optimization gives strong guarantees

Lazy evaluation techniques are computationally very efficient (<1 second for large datasets; naive methods several hours...)

Page 63:

Experimental Results (Bayesian Elicitation)

We know that the myopic value of information of the queries posed by the greedy strategies is close to the optimum. Empirical results show that they are also effective when evaluated in an iterative process; the recommended item converges to the “true” best item in a few cycles.

Page 64:

Application: Active Collaborative Filtering

Incremental Elicitation of Rating Profiles

Recommender systems such as Netflix ask you to rate more movies, in order to improve their preference models and provide better recommendations in the future. This can be seen as an implicit “query” asking for your preferred movies among the ones presented.

Page 65:

Other Approaches

Point-wise methods (Support Vector Machines)
Sorting / ordered classification problems
Learning preference relations (qualitative preferences)
Learning CP networks
Relational learning
Rankings: neighbour models, probabilistic approaches (Mallows, Babington-Smith), ...

Page 66:

The Road Ahead

Preference learning must deal with “biased” decision makers: more work at the intersection of behavioral decision theory and principled mathematical methods
Learning/elicitation wrt Prospect Theory
Social context of personalization
Utility-based evaluation of the “contagion” effect in social networks
Sequential decision problems (MDPs / reinforcement learning) and sequential optimization of elicitation (optimization of VOI for long horizons)

Page 67:

References 1 (Fundamentals: Preference Modeling)

L. J. Savage, The Foundations of Statistics, Wiley, New York, 1954.

P. Kouvelis and G. Yu, Robust Discrete Optimization and Its Applications, Kluwer, Dordrecht, 1997.

Roberts F.S., Measurement Theory, with Applications to Decision Making, Utility and the Social Sciences, Addison-Wesley, Boston, 1979.

Roubens M., Vincke Ph., Preference Modeling, Springer Verlag, Berlin, 1985.

Fishburn P.C., Interval Orders and Interval Graphs, J. Wiley, New York, 1985.

Fodor J., Roubens M., Fuzzy Preference Modelling and Multicriteria Decision Support, Kluwer Academic, Dordrecht, 1994.

Pirlot M., Vincke Ph., Semi Orders, Kluwer Academic, Dordrecht, 1997.

Fishburn P.C., Preference structures and their numerical representations, Theoretical Computer Science, vol. 217, 359-383, 1999.

Öztürk M., Tsoukiàs A., Vincke Ph., Preference Modelling, in M. Ehrgott, S. Greco, J. Figueira (eds.), State of the Art in Multiple Criteria Decision Analysis, Springer Verlag, Berlin, 27-72, 2005.

Page 68:

References 2

Preference Representation and Languages

Craig Boutilier, Ronen I. Brafman, Carmel Domshlak, Holger H. Hoos, David Poole. CP-nets: A Tool for Representing and Reasoning with Conditional Ceteris Paribus Preference Statements. JAIR 21: 135-191 (2004).

Souhila Kaci. Working with Preferences: Less Is More. Cognitive Technologies, Springer 2011.

Behavioral Decision Theory and Biases

Kahneman, Daniel; Paul Slovic and Amos Tversky. Judgment under Uncertainty: Heuristics and Biases, Cambridge University Press, 1982.

Gilovich, Thomas and Dale Griffin, Daniel Kahneman. Heuristics and Biases: The Psychology of Intuitive Judgment, Cambridge University Press, 2002.

John W. Payne, James R. Bettman and Eric J. Johnson. The Adaptive Decision Maker, Cambridge University Press, 1993.

Daniel Kahneman. Thinking, Fast and Slow. Farrar, Straus and Giroux, 2011.

Page 69:

References 3 (Incremental Utility Elicitation/Learning)

Urszula Chajewska, Daphne Koller, Ronald Parr. Making Rational Decisions Using Adaptive Utility Elicitation. AAAI 2000: 363-369.

Craig Boutilier. A POMDP Formulation of Preference Elicitation Problems. AAAI/IAAI 2002: 239-246.

C. Boutilier, R. Patrascu, P. Poupart, D. Schuurmans. Constraint-based optimization and utility elicitation using the minimax decision criterion. Artificial Intelligence 170(8-9): 686-713 (2006).

Paolo Viappiani, Craig Boutilier. Regret-based optimal recommendation sets in conversational recommender systems. RecSys 2009: 101-108.

Paolo Viappiani, Craig Boutilier. Optimal Bayesian Recommendation Sets and Myopically Optimal Choice Query Sets. NIPS 2010: 2352-2360.

Darius Braziunas, Craig Boutilier. Assessing regret-based preference elicitation with the UTPREF recommendation system. ACM EC 2010: 219-228.

D. Braziunas. Decision-theoretic elicitation of generalized additive utilities, Ph.D. dissertation, University of Toronto, 2011.

A. F. Tehrani, W. Cheng, K. Dembczynski, E. Hullermeier. Learning monotone nonlinear models using the Choquet integral. Machine Learning 89(1-2): 183-211 (2012).

Paolo Viappiani, Christian Kroer. Robust Optimization of Recommendation Sets with the Maximin Utility Criterion. ADT 2013: 411-424.

Nawal Benabbou, Patrice Perny, Paolo Viappiani. Incremental Elicitation of Choquet Capacities for Multicriteria Decision Making. ECAI 2014: 87-92.

Page 70:

References 4 (Applications, Intelligent User Interfaces)

Bart Peintner, Paolo Viappiani, Neil Yorke-Smith. Preferences in Interactive Systems: Technical Challenges and Case Studies. AI Magazine 29(4): 13-24 (2008).

Pearl Pu, Boi Faltings, Li Chen, Jiyong Zhang, Paolo Viappiani. Usability Guidelines for Product Recommenders Based on Example Critiquing Research. Recommender Systems Handbook 2011: 511-545.

Paolo Viappiani, Boi Faltings, Pearl Pu. Preference-based Search using Example-Critiquing with Suggestions. JAIR 27: 465-503 (2006).

Krzysztof Z. Gajos, Daniel S. Weld, Jacob O. Wobbrock. Automatically generating personalized user interfaces with Supple. Artificial Intelligence 174(12-13): 910-950 (2010).

Markus Stolze, Michael Strobel. Recommending as Personalized Teaching - Towards Credible Needs-based eCommerce Recommender Systems, in Designing Personalized User Experiences in eCommerce, pages 293-313 (2004).

Myers, K.; Berry, P.; Blythe, J.; Conley, K.; Gervasio, M.; McGuinness, D.; Morley, D.; Pfeffer, A.; Pollack, M.; and Tambe, M. 2007. An Intelligent Personal Assistant for Task and Time Management. AI Magazine 28(2): 47-61.
