© Arvind Rangaswamy (2017) All Rights Reserved
January 31, 2017
Choice Models
Arvind Rangaswamy
www.arvind.info
Topics for Today
Discussion of Guadagni and Little (1983) paper
Discussion of the logit choice model, with emphasis on estimation
Brief discussion of other papers
The Logit Model (Popularized in Marketing by Guadagni and Little 1983)
The objective of the model is to predict the probabilities that an individual will choose each of several alternatives. Instead of having two stages, the first to model utility, and the second to translate utility into choice probabilities, we directly model choice probabilities in one stage.
The probabilities lie between 0 and 1, and sum to 1 across the choice alternatives.
The Logit model is consistent with the proposition that customers pick the choice alternative that offers them the highest utility on a purchase occasion, but the utility has a random component that varies from one purchase occasion to the next.
This model is subject to the independence of irrelevant alternatives (IIA) property.
The (Conditional Multinomial) Logit Model
On each choice occasion, the (unobserved) utility that customer i gets from alternative k is given by:
$$U_k^i = V_k^i + \varepsilon_k^i \quad \ldots (1)$$
where $\varepsilon_k^i$ is the random component of the customer's utility. One option is to assume that $\varepsilon_k^i$ is distributed independent Gumbel (i.e., type 1 extreme value).
Utility ($U_k^i$) is the sum of an observable term ($V_k^i = \sum_j \beta_{jk} X_{jk}^i$) and an unobservable term ($\varepsilon_k^i$), making $U_k^i$ unobservable or latent.
$V_k^i$ is the intrinsic value or "attractiveness" (view it as inferred preference or utility value) of alternative k to customer i. $X_{jk}^i$ is the observed or measured value of variable j (a characteristic of customer i such as age, or of choice alternative k) when customer i makes a choice/purchase.
$\beta_{jk}$ is the importance weight associated with variable j for alternative k. If the effect of a variable is common to all alternatives (e.g., age of customer), then we can use the notation $\beta_j$.
Potential Probability Models for $\varepsilon_k^i$
Normal
Extreme value
Gompertz
Types of Logit Models
Binary Logit model (Logistic model)
Ordinal Logit model (Choices ordered)
McFadden Conditional Logit model (Choices not ordered, differences in characteristics of alternatives influence choices)
Nested Logit model
Mixed Logit model (with random coefficients)
Mathematical Specification of the Conditional Logit Model
Customer i chooses the product which offers the highest utility, i.e., the probability of choosing alternative k is:
$$P_k^i = P\left(U_k^i \ge U_m^i;\ \text{for all } m \text{ in the choice set}\right) \quad \ldots (2)$$
Then, if the $\varepsilon_k^i$ are distributed extreme value, individual i's probability of choosing brand 1 or choice alternative 1 ($P_1^i$) is given by:
$$P_1^i = \frac{e^{V_1^i}}{\sum_{k \in C} e^{V_k^i}} \quad \ldots (3)$$
where C is the choice set. Similar equations can be specified for the probabilities that customer i will choose other alternatives. That is, the logit model is a sequence of equations, not just one equation.
In the "Aggregate Logit model," $\beta_j$ is the same for all individuals.
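As a rough numerical check of equation (3), the sketch below simulates utilities with independent Gumbel errors and compares the simulated choice shares with the closed-form logit probabilities. The utility values in `V`, the number of draws, and the helper name `logit_probs` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def logit_probs(V):
    """Closed-form logit probabilities: exp(V_k) divided by the sum over the choice set."""
    V = np.asarray(V, dtype=float)
    expV = np.exp(V - V.max())  # subtracting the max avoids overflow and leaves the ratios unchanged
    return expV / expV.sum()

rng = np.random.default_rng(0)
V = np.array([1.0, 0.5, 0.0, -0.5])   # hypothetical deterministic utilities V_k
n_draws = 200_000

# U = V + eps, with eps drawn as independent Gumbel (type 1 extreme value)
eps = rng.gumbel(loc=0.0, scale=1.0, size=(n_draws, V.size))
choices = np.argmax(V + eps, axis=1)  # each draw picks the highest-utility alternative
simulated = np.bincount(choices, minlength=V.size) / n_draws

print("closed form:", np.round(logit_probs(V), 3))
print("simulated  :", np.round(simulated, 3))
```

With enough draws the two rows should agree closely, which is the sense in which the logit formula is consistent with utility maximization under Gumbel errors.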
Example with Four Alternatives and One Independent Variable
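$$Y_k^i = \begin{cases} 1 & \text{if consumer } i \text{ chooses alternative } k \\ 0 & \text{if consumer } i \text{ does not choose alternative } k \end{cases}$$
We can think of the Logit model for this application as generating one coefficient estimate ($\beta_1$), three alternative-specific constants ($\alpha_2$, $\alpha_3$, $\alpha_4$) and four equations (Note: $\alpha_1$ is set to 0. Why?):
$$P_1^i = \text{Prob}(Y_1^i = 1) = \frac{e^{\beta_1 x_1^i}}{e^{\beta_1 x_1^i} + e^{\alpha_2 + \beta_1 x_2^i} + e^{\alpha_3 + \beta_1 x_3^i} + e^{\alpha_4 + \beta_1 x_4^i}}$$
$$P_2^i = \text{Prob}(Y_2^i = 1) = \frac{e^{\alpha_2 + \beta_1 x_2^i}}{e^{\beta_1 x_1^i} + e^{\alpha_2 + \beta_1 x_2^i} + e^{\alpha_3 + \beta_1 x_3^i} + e^{\alpha_4 + \beta_1 x_4^i}}$$
$$P_3^i = \text{Prob}(Y_3^i = 1) = \frac{e^{\alpha_3 + \beta_1 x_3^i}}{e^{\beta_1 x_1^i} + e^{\alpha_2 + \beta_1 x_2^i} + e^{\alpha_3 + \beta_1 x_3^i} + e^{\alpha_4 + \beta_1 x_4^i}}$$
$$P_4^i = \text{Prob}(Y_4^i = 1) = \frac{e^{\alpha_4 + \beta_1 x_4^i}}{e^{\beta_1 x_1^i} + e^{\alpha_2 + \beta_1 x_2^i} + e^{\alpha_3 + \beta_1 x_3^i} + e^{\alpha_4 + \beta_1 x_4^i}}$$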
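The four-alternative example can be sketched in a few lines of Python. All numbers below (the variable values `x`, the coefficient `beta1`, and the constants in `alpha`) are made-up assumptions for illustration, and `choice_probs` is a hypothetical helper name. The sketch also shifts every constant by the same amount to show what happens to the probabilities.

```python
import numpy as np

def choice_probs(alpha, beta1, x):
    """P_k = exp(alpha_k + beta1 * x_k) / sum_m exp(alpha_m + beta1 * x_m)."""
    v = np.asarray(alpha, dtype=float) + beta1 * np.asarray(x, dtype=float)
    expv = np.exp(v - v.max())               # stabilized exponentials
    return expv / expv.sum()

x = np.array([2.0, 1.5, 3.0, 2.5])           # hypothetical values of the single variable
beta1 = -0.8                                 # hypothetical coefficient (e.g., a price effect)
alpha = np.array([0.0, 0.4, -0.2, 0.1])      # alpha_1 fixed at 0; the other three are free

p = choice_probs(alpha, beta1, x)
p_shifted = choice_probs(alpha + 5.0, beta1, x)  # add the same constant to every alpha

print(np.round(p, 4), p.sum())
print(np.allclose(p, p_shifted))             # do the probabilities change?
```

Because only differences between the constants affect the probabilities, one constant has to be fixed for the others to be identified; setting $\alpha_1 = 0$ is one common normalization.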
An Important Property of the Logit Model
[Figure: marginal impact of variable j (e.g., price) associated with alternative k, $\partial P_k^i / \partial X_{jk}^i$ (vertical axis, low to high), plotted against the probability of individual i choosing alternative k, $P_k^i$ (horizontal axis, 0.0 to 1.0).]
$$\frac{\partial P_k^i}{\partial X_{jk}^i} = \beta_{jk} P_k^i \left(1 - P_k^i\right)$$
The marginal impact is highest when the customer is "sitting on the fence," i.e., when $P_k^i = 0.5$ (for linear utility function).
Question: Is this a good property to have?
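The marginal-impact formula can be checked numerically. The sketch below is a minimal example under assumed values (a two-alternative choice set, a single variable, and a made-up coefficient `beta`): it compares $\beta P(1 - P)$ with a finite-difference derivative and locates the peak of $P(1 - P)$ on a grid.

```python
import numpy as np

def probs(beta, x):
    """Logit probabilities with a single variable and a common coefficient."""
    v = beta * np.asarray(x, dtype=float)
    expv = np.exp(v - v.max())
    return expv / expv.sum()

beta = 1.2                      # hypothetical coefficient
x = np.array([0.3, 0.0])        # hypothetical attribute values for two alternatives
h = 1e-6                        # finite-difference step

p = probs(beta, x)
analytic = beta * p[0] * (1.0 - p[0])         # beta * P * (1 - P)

x_up = x.copy()
x_up[0] += h
numeric = (probs(beta, x_up)[0] - p[0]) / h   # forward finite difference of P_1 w.r.t. x_1

# The factor P(1 - P) is largest at P = 0.5
grid = np.linspace(0.01, 0.99, 99)
peak = grid[np.argmax(grid * (1.0 - grid))]

print(analytic, numeric, peak)
```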
Another Implication: Cross Elasticity
Cross elasticity is the percent change in the probability of choice alternative k for a one percent change in an observed variable j relating to another alternative h:
$$\frac{\partial P_k^i}{\partial X_{jh}^i} \cdot \frac{X_{jh}^i}{P_k^i} = -\beta_{jh} X_{jh}^i P_h^i \quad \ldots (4)$$
The value on the right is the same for any alternative k.
Question: Is this a good property to have?
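A small numerical sketch makes the implication of (4) concrete. The prices in `x`, the coefficient `beta`, and the index `h_alt` are assumptions chosen only for illustration. The code changes the variable for one alternative and computes the elasticity of every other alternative's choice probability by finite differences.

```python
import numpy as np

def probs(beta, x):
    """Logit probabilities with a single variable and a common coefficient."""
    v = beta * np.asarray(x, dtype=float)
    expv = np.exp(v - v.max())
    return expv / expv.sum()

beta = -0.9                               # hypothetical price coefficient
x = np.array([2.0, 2.5, 3.0, 1.8])        # hypothetical prices for four alternatives
h_alt = 1                                 # index of the alternative whose price changes
step = 1e-6

p = probs(beta, x)
x_up = x.copy()
x_up[h_alt] += step
dp = (probs(beta, x_up) - p) / step       # numerical dP_k / dX_h for every k

elasticity = dp * x[h_alt] / p            # (dP_k / dX_h) * (X_h / P_k)
formula = -beta * x[h_alt] * p[h_alt]     # right-hand side of (4)

others = [k for k in range(x.size) if k != h_alt]
print(np.round(elasticity[others], 5), round(formula, 5))
```

Every alternative other than h shows the same cross elasticity, so a change in h draws share proportionally from all competitors; this is the IIA property at work.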
Maximum Likelihood Estimation (MLE) of Logit Parameters
$$Y_k^i = \begin{cases} 1 & \text{if consumer } i \text{ chooses alternative } k \\ 0 & \text{if consumer } i \text{ does not choose alternative } k \end{cases}$$
Then $P(Y_k^i = 1)$ is the probability that $U_k^i \ge U_m^i$ for all $m \ne k$ in the choice set. That is, $P(Y_k^i = 1) = P(U_k^i \ge U_m^i)$.
Now consider the likelihood of any random sample of N observations (individuals). This likelihood is the product of the likelihoods of the individual observations (Why?):
$$L(\beta_1, \beta_2, \ldots, \beta_J) = \prod_{i=1}^{N} \prod_{k \in C} P(Y_k^i = 1)^{Y_k^i}$$
C is the choice set. For notational simplicity, I have dropped subscript k from $\beta$.
Substitute for $P(Y_k^i = 1)$ from (3).
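Estimating $\beta$
$$L = \prod_{i=1}^{N} \prod_{k \in C} \left( \frac{e^{\sum_j \beta_j X_{jk}^i}}{\sum_{m \in C} e^{\sum_j \beta_j X_{jm}^i}} \right)^{Y_k^i}$$
$$\ln(L) = \sum_{i=1}^{N} \sum_{k \in C} y_k^i \left( \sum_j \beta_j X_{jk}^i - \ln \sum_{m \in C} e^{\sum_j \beta_j X_{jm}^i} \right)$$
$$\frac{\partial \ln(L)}{\partial \beta_j} = \sum_{i=1}^{N} \sum_{k \in C} \left( y_k^i - P_k^i \right) X_{jk}^i, \quad \text{for } j = 1, \ldots, J$$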
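The log-likelihood and its gradient translate directly into code. The sketch below is one minimal way to estimate the weights, assuming simulated data (the sample size, the "true" weights in `beta_true`, and normally distributed variables are all made up) and plain gradient ascent rather than any particular estimation routine from the literature.

```python
import numpy as np

def probs(beta, X):
    """X has shape (N, K, J): N choice occasions, K alternatives, J variables."""
    v = X @ beta                                  # (N, K) deterministic utilities
    v = v - v.max(axis=1, keepdims=True)          # numerical stability
    expv = np.exp(v)
    return expv / expv.sum(axis=1, keepdims=True)

def log_likelihood(beta, X, Y):
    """ln L = sum_i sum_k y_ik * ln P_ik."""
    return float(np.sum(Y * np.log(probs(beta, X))))

def gradient(beta, X, Y):
    """d lnL / d beta_j = sum_i sum_k (y_ik - P_ik) * X_ijk."""
    return np.einsum("nk,nkj->j", Y - probs(beta, X), X)

rng = np.random.default_rng(1)
N, K, J = 2000, 4, 2
beta_true = np.array([1.0, -0.5])                 # hypothetical "true" weights
X = rng.normal(size=(N, K, J))                    # simulated explanatory variables
U = X @ beta_true + rng.gumbel(size=(N, K))       # random utility with Gumbel errors
Y = np.eye(K)[np.argmax(U, axis=1)]               # one-hot indicators of the chosen alternative

beta = np.zeros(J)
for _ in range(500):                              # gradient ascent on the mean log-likelihood
    beta = beta + 0.5 * gradient(beta, X, Y) / N

print("estimate:", np.round(beta, 3), "lnL:", round(log_likelihood(beta, X, Y), 1))
```

At convergence the gradient is close to zero, which is the set of first-order conditions; in practice a Newton-type optimizer would usually be preferred over a fixed step size.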
Estimating $\beta$'s
MLE maximizes the likelihood of obtaining the realized sample, as a function of model parameters.
To maximize Ln(L), set $\partial \ln(L) / \partial \beta_j = 0$ for j = 1, 2, …, J. This gives J equations in J unknowns. If a solution exists, it can be shown to be unique under fairly general conditions.
The MLE estimator of $\beta$ is consistent, asymptotically Normal, and asymptotically efficient.
$\beta$'s can be interpreted akin to regression coefficients.
What’s the Big Deal? (McFadden won a Nobel prize for this line of work)
This is a theoretically defensible model to characterize discrete choices based on a random utility framework (people’s preferences are not directly observable – at least to the modeler – and can vary from situation to situation).
Developed MLE (Maximum Likelihood Estimation) for estimating the coefficients of the model. MLE maximizes the probabilities of the actual choices occurring given the estimated parameters.
Results in well-established statistical tests for determining the adequacy of the estimated model.
Has seen an exploding number of applications in different fields.
Discussion of Guadagni and Little (1983)
Overview of Data and Model
Ground coffee purchase and store contexts from store and panel data. Four Kansas City supermarkets for 78 weeks.
5 brands × 2 sizes (small and large); eliminate 2 brand-sizes with very small market share (8 brand-sizes in total).
Variables – those with brand-specific effects (brand constants) and those with effects common to all alternatives.
Accounts for customer heterogeneity using loyalty variables.
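[Table: models estimated in Guadagni and Little (1983), each reported with a fit measure and compared against a null model. Variables $X_{01}^i$ through $X_{07}^i$ (generically $X_{0k}^i$) capture effects on utility specific to brands; the remaining variables capture effects on utility common to all brands.]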
A Closer Look at the Loyalty Variable (Accounts for Heterogeneity)
$$X_{6k}^i(n) = \alpha_b X_{6k}^i(n-1) + (1 - \alpha_b) \begin{cases} 1 & \text{if consumer } i \text{ bought alternative } k \text{ at purchase occasion } (n-1) \\ 0 & \text{otherwise} \end{cases}$$
$\alpha_b$ is a "carryover" constant, expected to be about 0.75.
$X_{6k}^i$ is normalized to sum to 1 across brands.
A considerable amount of later research has scrutinized the loyalty variable, and proposed alternatives. In an extensive simulation conducted by Abramson et al. (JMR 2000), a model with the Guadagni and Little loyalty specification, which allows for choice set effects (i.e., consideration sets), performed the best. It is possible that the loyalty effects will be overstated when consumer heterogeneity is ignored, although simulations conducted by Abramson et al. show that underspecified (discrete) heterogeneity induces bias primarily in preference coefficients (the alternative-specific constants in the model).
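The loyalty recursion is an exponentially smoothed purchase history, which the following sketch illustrates. The starting values (equal loyalty across brands), the purchase history, and the function name `update_loyalty` are assumptions made for the example; the original paper's initialization may differ.

```python
import numpy as np

def update_loyalty(loyalty, chosen, alpha_b=0.75):
    """One step of the exponentially smoothed loyalty recursion.

    loyalty: current loyalty values across alternatives (assumed to sum to 1)
    chosen:  index of the alternative bought on the previous purchase occasion
    alpha_b: carryover constant (0.75 is the approximate value mentioned above)
    """
    bought = np.zeros_like(loyalty)
    bought[chosen] = 1.0
    return alpha_b * loyalty + (1.0 - alpha_b) * bought

n_brands = 4
loyalty = np.full(n_brands, 1.0 / n_brands)   # assumed starting values: equal loyalty
history = [2, 2, 0, 2, 2]                     # hypothetical purchase history (brand indices)

for chosen in history:
    loyalty = update_loyalty(loyalty, chosen)
    print(chosen, np.round(loyalty, 3))
```

If the starting values sum to 1, every update preserves that sum, because the weights $\alpha_b$ and $(1 - \alpha_b)$ themselves sum to 1 and exactly one alternative is bought on each occasion.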
Predictive Validation of Models
Three Short-term Market Response Simulations
Segmentation in Choice Models Using Latent Class Analysis
Basic Idea: The population of customers consists of several segments, and the values of the variables of interest (e.g., Gender, Past purchases) are imperfect indicators of the segment to which a customer belongs.
Operationally, this means that the weights ($\beta_j$'s) in the Logit model differ across segments, but the segments are unknown (latent) and have to be inferred from the data.
$$\left(P_k^i \mid i \in \text{segment } s\right) = \frac{e^{\sum_j \beta_{js} X_{jk}^i}}{\sum_{m \in C} e^{\sum_j \beta_{js} X_{jm}^i}}$$
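To see how segment-specific weights combine, the sketch below computes segment-conditional choice probabilities, mixes them using segment sizes, and then applies Bayes' rule to infer segment membership from an observed choice. The segment sizes (`sizes`) are not shown on the slide and are introduced here as an assumption, as are the weights in `betas` and the attribute matrix `X`.

```python
import numpy as np

def segment_probs(beta_s, X):
    """Logit choice probabilities for one customer, given segment weights beta_s.

    X has shape (K, J): K alternatives described by J variables.
    """
    v = X @ beta_s
    expv = np.exp(v - v.max())
    return expv / expv.sum()

# Hypothetical two-segment example with J = 2 variables (say, price and promotion)
betas = np.array([[-2.0, 0.2],    # segment 1: price sensitive
                  [-0.3, 1.5]])   # segment 2: promotion sensitive
sizes = np.array([0.6, 0.4])      # assumed segment sizes (mixing weights), summing to 1

X = np.array([[1.0, 0.0],         # alternative 1: high price, no promotion
              [0.6, 0.0],         # alternative 2: low price, no promotion
              [1.0, 1.0]])        # alternative 3: high price, on promotion

cond = np.array([segment_probs(b, X) for b in betas])   # P(k | segment s), shape (S, K)
uncond = sizes @ cond                                   # P(k) = sum_s size_s * P(k | s)

chosen = 2                                              # suppose alternative 3 was chosen
posterior = sizes * cond[:, chosen] / uncond[chosen]    # Bayes' rule: P(segment s | choice)

print(np.round(cond, 3))
print(np.round(uncond, 3), np.round(posterior, 3))
```

In an actual latent class model the weights and segment sizes are estimated jointly from the data; here they are fixed only to show how the pieces fit together.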
Other Ways to Model Heterogeneity in Logit Models
Allow for the Logit parameters to be distributed over the population according to a known (i.e., specifiable, but with unknown parameters) distribution, and estimate the parameters of the distribution through MLE.
Incorporate observable characteristics of individuals, i.e., demographics, in the specification of the preference function ($V_k^i$) -- does not work too well.
Hierarchical Bayes estimation (A topic that will require a separate class session).
Some Current Topics in This Area
Estimating differentiated product demand systems with aggregate data (e.g., BLP model).
Dynamic choice models that account for time-based preferences.
Simulated maximum likelihood and Bayesian estimation.
Modeling Consideration
Roberts and Lattin model (1992): Two-stage model: (1) Consideration, and (2) Choice. Cost is used as a threshold for being in or out of consideration.
Wu and Rangaswamy (2003): Uses a two-stage model, but based on "fuzziness" in which all alternatives are considered to some degree. Incorporates decision process in understanding consideration.
Current Work on Consideration (with Daniel Ringel, Bernd Skiera, and Yifan Zhang)
Exploring factors that expand or maintain consideration sets during the decision process.
Detailed online data on consumer browsing behaviors.