Predicting Preference Reversals
via Gaussian Process Uncertainty Aversion
Rikiya Takahashi (1), Tetsuro Morimura (2)
(1) SmartNews, [email protected]
(2) IBM Research - [email protected]
May 10, 2015
AISTATS 2015 Predicting Preference Reversals via Gaussian Process Uncertainty Aversion
Discrete Choice Modelling
Goal: predict the probability of choosing each option from a choice set.
Why solve this problem?
For business: brand positioning among competitors
For business: sales promotion (though this can involve some abuse)
To deeply understand how humans make decisions
Random Utility Theory
Each human is a maximizer of random utility.
$$\text{$i$'s choice from } S_i = \arg\max_{j \in S_i} \Bigl[ \underbrace{f_i(v_j)}_{\text{mean utility}} + \underbrace{\varepsilon_{ij}}_{\text{random noise}} \Bigr]$$
$S_i$: choice set for $i$, $v_j$: vector of $j$'s attributes, $f_i$: $i$'s mean utility function
Assuming independence among every option's attractiveness
  For both mean and noise: e.g., logit (McFadden, 1980)
  For only mean: e.g., nested logit (Williams, 1977)
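As a minimal sketch of the logit case, assuming i.i.d. Gumbel noise (which gives the familiar softmax choice rule), the snippet below computes choice probabilities and checks the independence property: the odds between two options do not depend on which other options are offered. The utility values and the function name are hypothetical, chosen only for illustration.

```python
import math

def logit_choice_probabilities(mean_utilities):
    """P(j) = exp(f_j) / sum_k exp(f_k), the choice rule implied by i.i.d. Gumbel noise."""
    top = max(mean_utilities.values())  # subtract the max for numerical stability
    weights = {j: math.exp(f - top) for j, f in mean_utilities.items()}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

# Hypothetical mean utilities for three options (not taken from the slides).
utilities = {"A": 1.0, "B": 0.5, "C": 0.0}
p_abc = logit_choice_probabilities(utilities)
p_ab = logit_choice_probabilities({j: utilities[j] for j in ("A", "B")})

# Independence: the A-to-B odds are identical with or without option C.
odds_with_c = p_abc["A"] / p_abc["B"]
odds_without_c = p_ab["A"] / p_ab["B"]
```

Under this rule, adding or removing an option rescales all probabilities but can never reverse the ranking of two options, which is the restriction that context effects violate.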
Why Has Random Utility Theory Been Used?
Voices from friends (machine learners & econometricians)
1 Rationality of independence assumption
  Attributes of unchosen options are irrelevant to the chosen option's benefit.
  "I bought a diamond. This is the best. It's ridiculous to think that other dirty stones affected my final choice."
2 Computational practicality
  Without scoring each option, how would one decide the best one?
  Formalizing the data likelihood is straightforward and easy.
Complexity of Real Human Choice
An example of choosing a PC (Kivetz et al., 2004)
Each subject chooses 1 option from a choice set

Option      A    B    C    D    E
CPU [MHz]   250  300  350  400  450
Mem. [MB]   192  160  128   96   64

Choice Set   #subjects
{A, B, C}    36:176:144
{B, C, D}    56:177:115
{C, D, E}    94:181:109

Can random utility theory still explain the preference reversals?
B ≻ C or C ≻ B?
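The reversal can be read directly off the counts in the table. The short sketch below converts them to within-set shares; only the helper name `shares` is introduced here.

```python
# Subject counts per choice set, copied from the PC example (Kivetz et al., 2004).
counts = {
    ("A", "B", "C"): (36, 176, 144),
    ("B", "C", "D"): (56, 177, 115),
    ("C", "D", "E"): (94, 181, 109),
}

def shares(options, n):
    """Convert raw counts into within-choice-set shares."""
    total = sum(n)
    return {o: k / total for o, k in zip(options, n)}

table = {opts: shares(opts, n) for opts, n in counts.items()}
for opts, s in table.items():
    print(opts, {o: round(v, 3) for o, v in s.items()})
```

In {A, B, C} option B outsells C (176 vs. 144), while in {B, C, D} option C outsells B (177 vs. 56), so no fixed ranking of B and C is consistent with both sets.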
Agenda
1 Introduction of the Goal and Issues
2 Irrational Context Effects
  Similarity Effect
  Attraction Effect
  Compromise Effect
  Prior Work
3 Proposing a Bayesian Model of Mental Conflict
4 Numerical Studies
5 Conclusion
Similarity Effect (Tversky, 1972)
Top-share choice can change due to correlated utilities.
E.g., one color from {Blue, Red} or {Violet, Blue, Red}?
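A small Monte Carlo sketch of the color example, under assumptions that are not on the slide: utilities are Gaussian, Violet and Blue share a common taste component (correlation `rho`), and the two bluish colors have a slightly higher mean utility (0.2) than Red. All numbers and names are hypothetical.

```python
import random

def simulate(n_draws=100_000, rho=0.9, seed=0):
    """Estimate choice shares for {Blue, Red} and {Violet, Blue, Red}."""
    rng = random.Random(seed)
    pair_wins = {"Blue": 0, "Red": 0}
    triple_wins = {"Violet": 0, "Blue": 0, "Red": 0}
    a, c = rho ** 0.5, (1.0 - rho) ** 0.5
    for _ in range(n_draws):
        shared = rng.gauss(0.0, 1.0)  # taste component shared by the two bluish colors
        u = {
            "Violet": 0.2 + a * shared + c * rng.gauss(0.0, 1.0),
            "Blue": 0.2 + a * shared + c * rng.gauss(0.0, 1.0),
            "Red": rng.gauss(0.0, 1.0),
        }
        pair_wins[max(("Blue", "Red"), key=u.get)] += 1
        triple_wins[max(u, key=u.get)] += 1
    pair = {k: v / n_draws for k, v in pair_wins.items()}
    triple = {k: v / n_draws for k, v in triple_wins.items()}
    return pair, triple

pair, triple = simulate()
```

With these settings Blue holds the top share in the two-color set, but once Violet is added the two correlated colors split their supporters and Red takes the top share.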
Attraction Effect (Huber et al., 1982)
Introduction of an absolutely inferior option A− (= decoy) causes an irregular increase in option A's attractiveness.
This happens despite the natural guess that a decoy never affects the choice.
If D ≻ A, then D ≻ A ≻ A−.
If A�D, then A is superior to both A− and D.
Compromise Effect (Simonson, 1989)
Moderate options within each choice set are preferred.
Different from a non-linear utility function involving diminishing returns (e.g., $\sqrt{\text{inexpensiveness}} + \sqrt{\text{quality}}$).
Positioning of Our Work in Literature
Sim.: similarity, Attr.: attraction, Com.: compromise

Model     Sim.  Attr.  Com.  Mechanism                      Predict. for Test Set  Likelihood Maximization
SPM       OK    NG     NG    correlation                    OK                     MCMC
MDFT      OK    OK     OK    dominance & indifference       OK                     MCMC
PD        OK    OK     OK    nonlinear pairwise comparison  OK                     MCMC
MMLM      OK    NG     OK    none                           OK                     Non-convex
NLM       OK    NG     NG    hierarchy                      NG                     Non-convex
BSY       OK    OK     OK    Bayesian                       OK                     MCMC
LCA       OK    OK     OK    loss aversion                  OK                     MCMC
MLBA      OK    OK     OK    nonlinear accumulation         OK                     Non-convex
Proposed  OK    NG     OK    Bayesian                       OK                     Convex

MDFT: Multialternative Decision Field Theory (Roe et al., 2001)
PD: Proportional Difference Model (Gonzalez-Vallejo, 2002)
MMLM: Mixed Multinomial Logit Model (McFadden and Train, 2000)
SPM: Structured Probit Model (Yai, 1997; Dotson et al., 2009)
NLM: Nested Logit Models (Williams, 1977; Wen and Koppelman, 2001)
BSY: Bayesian Model of Shenoy and Yu (2013)
LCA: Leaky Competing Accumulator Model (Usher and McClelland, 2004)
MLBA: Multiattribute Linear Ballistic Accumulator Model (Trueblood, 2014)
Agenda
1 Introduction of the Goal and Issues
2 Irrational Context Effects
3 Proposing a Bayesian Model of Mental Conflict
  Utility Estimation as Dual Personality
  Irrationality by Bayesian Shrinkage
  Convex Optimization when using Posterior Mean
4 Numerical Studies
5 Conclusion
Utility Estimation as Dual Personality
How about regarding utilities as samples in statistics?
Assumption 1: Utility function is partially disclosed to DMS.
1 UC computes the sample value of every option's utility, and sends only these samples to DMS.
2 DMS statistically estimates the utility function.
Mental Conflict as Bayesian Shrinkage
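Assumption 2: DMS does Bayesian shrinkage estimation.
$i \in \{1, \ldots, n\}$: context, $y_i \in \{1, \ldots, m[i]\}$: final choice
$X_i \triangleq (x_{i1} \in \mathbb{R}^{d_X}, \ldots, x_{im[i]})^\top$: features of $m[i]$ options
Objective Data: values of random utilities
$$v_i \triangleq (v_{i1}, \ldots, v_{im[i]})^\top \sim N\left(\mu_i, \sigma^2 I_{m[i]}\right), \quad v_{ij} = b + w_\phi^\top \phi(x_{ij})$$
$\mu_i \in \mathbb{R}^{m[i]}$: vec. of the true mean utility, $\sigma^2$: noise level, $b$: bias term, $\phi: \mathbb{R}^{d_X} \to \mathbb{R}^{d_\phi}$: mapping function, $w_\phi$: vec. of coefficients
Subjective Prior: choice-set-dependent Gaussian process
$$\mu_i \sim N\left(0_{m[i]}, \sigma^2 K(X_i)\right) \quad \text{s.t.} \quad K(X_i) = \left(K(x_{ij}, x_{ij'})\right) \in \mathbb{R}^{m[i] \times m[i]}$$
$\mu_i \in \mathbb{R}^{m[i]}$: vec. of random utilities, $K(\cdot, \cdot)$: similarity between options
Final choice: based on (posterior mean $u^*_i$ + i.i.d. noise) as
$$u^*_i = K(X_i)\left(I_{m[i]} + K(X_i)\right)^{-1}\left(b 1_{m[i]} + \Phi_i w_\phi\right),$$
$$y_i = \arg\max_j \left(u^*_{ij} + \varepsilon_{ij}\right) \quad \text{where } \forall j \;\; \varepsilon_{ij} \sim \text{Gumbel}.$$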
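A short derivation sketch of why the noise level $\sigma^2$ drops out of the posterior mean. It assumes the standard Gaussian conjugacy result, with the utility samples $v_i$ as the observations. Reading $\Phi_i$ as the matrix whose rows are $\phi(x_{ij})^\top$ is an interpretation that the slide does not spell out.

```latex
\begin{align*}
\mathbb{E}[\mu_i \mid v_i]
  &= \sigma^2 K(X_i)\left(\sigma^2 K(X_i) + \sigma^2 I_{m[i]}\right)^{-1} v_i \\
  &= K(X_i)\left(I_{m[i]} + K(X_i)\right)^{-1} v_i ,
  \qquad v_i = b 1_{m[i]} + \Phi_i w_\phi .
\end{align*}
```

Because the prior covariance and the noise covariance share the factor $\sigma^2$, it cancels, and the shrinkage depends only on the kernel matrix $K(X_i)$ of the offered choice set.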
Irrationality by Bayesian Shrinkage
Implication of (1): similarity-dependent discounting
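$$u^*_i = \underbrace{K(X_i)\left(I_{m[i]} + K(X_i)\right)^{-1}}_{\text{shrinkage factor}} \; \underbrace{\left(b 1_{m[i]} + \Phi_i w_\phi\right)}_{\text{vec. of utility samples}}. \tag{1}$$
Under the RBF kernel $K(x, x') = \exp(-\gamma \|x - x'\|^2)$, an option dissimilar to others involves high uncertainty.
Its utility is strongly shrunk toward the prior mean 0.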
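A minimal numerical sketch of the shrinkage in (1), assuming NumPy is available. Three hypothetical options sit on the trade-off line x1 + x2 = 5 and all receive the same utility sample, so any difference in the final evaluation comes from the shrinkage factor alone. The option coordinates, `gamma = 0.5`, and the function name are illustrative assumptions.

```python
import numpy as np

def posterior_mean_utilities(X, v, gamma=1.0):
    """u* = K (I + K)^{-1} v, with K the RBF kernel on the rows of X."""
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-gamma * sq_dist)
    return K @ np.linalg.solve(np.eye(len(X)) + K, v)

# Hypothetical options on the line x1 + x2 = 5, each with utility sample 1.
X = np.array([[1.0, 4.0], [2.0, 3.0], [3.0, 2.0]])
v = np.ones(3)
u_star = posterior_mean_utilities(X, v, gamma=0.5)
print(u_star)  # roughly [0.567, 0.659, 0.567]
```

The middle option is similar to both neighbors, so it is shrunk the least and ends up with the highest final evaluation. The two extreme options are each far from one neighbor and are pulled harder toward 0, which is the compromise effect in miniature.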
Context effects as Bayesian uncertainty aversion
[Figure, two panels: Final Evaluation (y-axis) vs. X1 = (5 − X2) (x-axis). Left panel: options D, A−, A under choice sets {A, D} and {A, A−, D}. Right panel: options D, C, B, A under choice sets {A, B, C} and {B, C, D}.]
Convex Optimization when using Posterior Mean
Global fitting of the parameters using data $(X_i, y_i)_{i=1}^{n}$
Fix the mapping and similarity functions during updates.
Shrinkage factor $H_i \triangleq K(X_i)\left(I_{m[i]} + K(X_i)\right)^{-1}$ is constant!
Obtaining a MAP estimate is convex w.r.t. $(b, w_\phi)$.
$$\max_{b, w_\phi} \; \sum_{i=1}^{n} \ell\Bigl( \underbrace{b H_i 1_{m[i]} + H_i \Phi_i w_\phi}_{\text{context-specific } H_i \text{ is multiplied}}, \; y_i \Bigr) - \frac{c}{2} \|w_\phi\|^2$$
Exploiting the log-concavity of multinomial logit
$$\ell(u^*_i, y_i) \triangleq \log \frac{\exp(u^*_{i y_i})}{\sum_{j'=1}^{m[i]} \exp(u^*_{ij'})}$$
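The objective above can be sketched in a few lines, assuming NumPy, an identity mapping φ(x) = x, an RBF kernel with fixed γ, and plain gradient ascent as the optimizer. None of these choices are claimed to match the authors' implementation, and the toy data (five options on a trade-off line, middle option always chosen) is hypothetical.

```python
import numpy as np

def rbf_shrinkage(X, gamma=1.0):
    """Shrinkage factor H = K (I + K)^{-1} for an RBF kernel on the rows of X."""
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-gamma * sq_dist)
    # K and (I + K) commute, so (I + K)^{-1} K equals K (I + K)^{-1}.
    return np.linalg.solve(np.eye(len(X)) + K, K)

def objective_and_grad(b, w, contexts, c):
    """Penalized log-likelihood and its gradient w.r.t. (b, w).

    contexts: list of (H, Phi, y) with H of shape (m, m), Phi of shape (m, d),
    and y the index of the chosen option.
    """
    value = -0.5 * c * float(w @ w)
    grad_b, grad_w = 0.0, -c * w
    for H, Phi, y in contexts:
        h1 = H.sum(axis=1)              # H_i 1
        A = H @ Phi                     # H_i Phi_i
        u = b * h1 + A @ w              # posterior-mean utilities u*_i
        shifted = u - u.max()
        log_p = shifted - np.log(np.exp(shifted).sum())
        value += log_p[y]
        resid = -np.exp(log_p)          # gradient of log-softmax: onehot(y) - p
        resid[y] += 1.0
        grad_b += h1 @ resid
        grad_w = grad_w + A.T @ resid
    return value, grad_b, grad_w

def fit(contexts, dim, c=0.1, step=0.05, iters=500):
    """Plain gradient ascent; adequate for this small concave toy problem."""
    b, w = 0.0, np.zeros(dim)
    for _ in range(iters):
        _, gb, gw = objective_and_grad(b, w, contexts, c)
        b, w = b + step * gb, w + step * gw
    return b, w

# Hypothetical toy data: options A..E on the line x1 + x2 = 1; the middle
# option (index 1) is chosen in each of {A,B,C}, {B,C,D}, {C,D,E}.
line = np.array([[t, 1.0 - t] for t in (0.0, 0.25, 0.5, 0.75, 1.0)])
contexts = [(rbf_shrinkage(line[s:s + 3]), line[s:s + 3], 1) for s in range(3)]
b_hat, w_hat = fit(contexts, dim=2)
```

Since each `H` is computed once and held fixed, the utilities are linear in `(b, w)` and the objective is a regularized multinomial-logit log-likelihood, so any convex solver could replace the hand-written loop. Unlike plain logit, the bias `b` does not cancel here, because `H_i 1` differs across options within a choice set.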
Agenda
1 Introduction of the Goal and Issues
2 Irrational Context Effects
3 Proposing a Bayesian Model of Mental Conflict
4 Numerical Studies
5 Conclusion
Experimental Settings
Evaluate accuracy & log-likelihood on real choice data.
Dataset #1: PC (n = 1,088, d_X = 2)
Dataset #2: SP (n = 972, d_X = 2)
Subjects are asked to choose a speaker.

Option         A    B    C    D    E
Power [Watt]   50   75   100  125  150
Price [USD]    100  130  160  190  220

Choice Set   #subjects
{A, B, C}    45:135:145
{B, C, D}    58:137:111
{C, D, E}    95:155:91

Dataset #3: SM (n = 10,719, d_X = 23)
SwissMetro dataset (Antonini et al., 2007)
Subjects are asked to choose one mode of transportation, either from {train, car, SwissMetro} or {train, SwissMetro}.
Attributes of each option: cost, travel time, headway, seat type, and type of transportation.
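The two evaluation metrics named at the top of this slide can be sketched for aggregated counts as follows. The counts are the {A, B, C} speaker row; the predicted probabilities are hypothetical. Scoring accuracy as the share of subjects who chose the model's top-ranked option is an assumption about how the metric is defined.

```python
import math

def evaluate(predicted, observed_counts):
    """Average log-likelihood and classification accuracy for one choice set.

    predicted: model probability per option; observed_counts: #subjects per option.
    """
    n = sum(observed_counts)
    avg_ll = sum(k * math.log(p) for p, k in zip(predicted, observed_counts)) / n
    best = max(range(len(predicted)), key=lambda j: predicted[j])
    accuracy = observed_counts[best] / n  # subjects who picked the model's top option
    return avg_ll, accuracy

# Observed counts for {A, B, C} in the SP data; the probabilities are made up.
avg_ll, acc = evaluate([0.15, 0.40, 0.45], [45, 135, 145])
```

In a cross-validation setting the same two numbers would be computed on held-out subjects only.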
Cross-Validation Performances
High predictability in addition to the interpretable mechanism.
For SP, successfully detected the combination of compromise effect & prioritization of power.
1st best for PC & SP.
2nd best for higher-dimensional SM: slightly worse than the highly expressive nonparametric version of mixed multinomial logit (McFadden and Train, 2000).
[Figure, three panels. (1) Average Log-Likelihood by Dataset (PC, SP, SM) for LinLogit, NpLogit, LinMix, NpMix, and GPUA. (2) Classification Accuracy by Dataset (PC, SP, SM) for the same five methods. (3) Evaluation vs. Price [USD] for options E, D, C, B, A: Obj. Eval. and the choice sets {A, B, C}, {B, C, D}, {C, D, E}.]
Conclusion
Introduced a simple & interpretable Bayesian choice model.
Bayesian shrinkage involving mental conflict
Irrational choice-set-dependent Gaussian process prior
Uncertainty aversion as a cause of context effects
Accurate prediction when absolute preference and compromise effect are mixed.
Future Directions
More active Bayesianism for realistic human models
Integration with other Bayesian discrete choice models (e.g., Shenoy and Yu, 2013)
Explaining attraction effect
Current limitation: decoy gets high share due to symmetric similarity to target option.
Extension to time-series decision making models
E.g., emulating how humans play multi-armed bandits (Zhang and Yu, 2013)
Choice-set optimization avoiding irrational context effects
News channel = set of news articles
Diversified item recommendation (Ziegler et al., 2005)
Via linear submodular bandits (Yue and Guestrin, 2011)
References I
Antonini, G., Gioia, C., Frejinger, E., and Themans, M. (2007). Swissmetro: description of the data. http://biogeme.epfl.ch/swissmetro/examples.html.
Dotson, J. P., Lenk, P., Brazell, J., Otter, T., Maceachern, S. N., and Allenby, G. M. (2009). A probit model with structured covariance for similarity effects and source of volume calculations. http://ssrn.com/abstract=1396232.
Gonzalez-Vallejo, C. (2002). Making trade-offs: A probabilistic and context-sensitive model of choice behavior. Psychological Review, 109:137–154.
Huber, J., Payne, J. W., and Puto, C. (1982). Adding asymmetrically dominated alternatives: Violations of regularity and the similarity hypothesis. Journal of Consumer Research, 9:90–98.
References II
Kivetz, R., Netzer, O., and Srinivasan, V. S. (2004). Alternative models for capturing the compromise effect. Journal of Marketing Research, 41(3):237–257.
McFadden, D. and Train, K. (2000). Mixed MNL models for discrete response. Journal of Applied Econometrics, 15:447–470.
McFadden, D. L. (1980). Econometric models of probabilistic choice among products. Journal of Business, 53(3):13–29.
Roe, R. M., Busemeyer, J. R., and Townsend, J. T. (2001). Multialternative decision field theory: A dynamic connectionist model of decision making. Psychological Review, 108:370–392.
Shenoy, P. and Yu, A. J. (2013). A rational account of contextual effects in preference choice: What makes for a bargain? In Proceedings of the Cognitive Science Society Conference.
References III
Simonson, I. (1989). Choice based on reasons: The case of attraction and compromise effects. Journal of Consumer Research, 16:158–174.
Trueblood, J. S. (2014). The multiattribute linear ballistic accumulator model of context effects in multialternative choice. Psychological Review, 121(2):179–205.
Tversky, A. (1972). Elimination by aspects: A theory of choice. Psychological Review, 79:281–299.
Usher, M. and McClelland, J. L. (2004). Loss aversion and inhibition in dynamical models of multialternative choice. Psychological Review, 111:757–769.
Wen, C.-H. and Koppelman, F. (2001). The generalized nested logit model. Transportation Research Part B, 35:627–641.
References IV
Williams, H. (1977). On the formulation of travel demand models and economic evaluation measures of user benefit. Environment and Planning A, 9(3):285–344.
Yai, T. (1997). Multinomial probit with structured covariance for route choice behavior. Transportation Research Part B: Methodological, 31(3):195–207.
Yue, Y. and Guestrin, C. (2011). Linear submodular bandits and their application to diversified retrieval. In Shawe-Taylor, J., Zemel, R., Bartlett, P., Pereira, F., and Weinberger, K., editors, Advances in Neural Information Processing Systems 24, pages 2483–2491.
References V
Zhang, S. and Yu, A. J. (2013). Forgetful Bayes and myopic planning: Human learning and decision-making in a bandit setting. In Burges, C., Bottou, L., Welling, M., Ghahramani, Z., and Weinberger, K., editors, Advances in Neural Information Processing Systems 26, pages 2607–2615. Curran Associates, Inc.
Ziegler, C.-N., McNee, S. M., Konstan, J. A., and Lausen, G. (2005). Improving recommendation lists through topic diversification. In Proceedings of the 14th international conference on World Wide Web (WWW 2005), pages 22–32. ACM.