
In-Store Experiments to Determine

the Impact of Price on Sales∗

Vishal Gaur†, Marshall L. Fisher‡

December 2003

(Revised July 2004)

Abstract

This paper describes an experimentation methodology to measure how demand varies with price and the results of its application at a toy retailer. The same product is assigned different price-points in different store panels and the resulting sales are used to estimate a demand curve. We use a variant of the k-median problem to form store panels that control for differences between stores and produce results that are representative of the entire chain. We use the estimated demand curve to find a price that maximizes profit. Our experiment yielded the unexpected result that demand increases with price in some cases. We present likely reasons for this finding from our discussions with retail managers. Our methodology can be used to analyze the effect of several marketing and promotional levers employed in a retail store besides pricing.

∗This paper is based on a pricing experiment conducted at Zany Brainy, Inc., a toy retailer based in King of Prussia, Pa., and owned by F.A.O. We would like to thank Gene Rosadino for facilitating the experiment, and Young-Hoon Park (Cornell University) and Robert Shoemaker (New York University) for helpful comments.

†Department of Information, Operations and Management Science, Leonard N. Stern School of Business, New York University, 8-160, 44 West 4th St., New York, NY 10012. E-mail: [email protected].

‡The Wharton School, University of Pennsylvania, Jon M. Huntsman Hall, 3730 Walnut St., Philadelphia, PA 19104-6366. E-mail: [email protected].

1 Introduction

An in-store experiment is a useful scientific tool for retailers to learn about consumer response to

the use of various marketing levers, such as pricing, promotions, store layout, presentation, etc.

However, successful design and execution of experiments involve many challenges. A retailer must

select stores for the experiment that are representative of its entire chain; otherwise, the results

of the experiment may be idiosyncratic and may not generalize to the entire chain. Further,

the retailer must choose whether to try all experimental treatments within a single store over

time, or to try different treatments across a cross-section of comparable stores. This choice depends

on the length of product lifecycle and the seasonality of demand. Finally, the retailer must execute

the experiment in a controlled environment, keeping the effect of non-experimental factors to a

minimum. These challenges can be daunting. For example, in a recent study of 32 large U.S.

retailers, we found that 90% of the retailers conduct price experiments. However, the retailers

themselves rated the effectiveness of their experiments as a median score of 6 out of 10.

This paper describes a controlled pricing experiment conducted at a toy retailer to measure

how its demand varies with price and to determine the price at which profit is maximized. The

products considered in the experiment are three different types of toys involving different buyer

behaviors (a branded toy, an unbranded but technologically complex toy, and an unbranded but

simple toy). None of the three products is a repeat purchase item. A set of stores is selected for the

experiment and the products are offered at different prices in different stores. The original prices

of the products are changed so that the customers see only the offered price and cannot tell if it

has been marked up or down. The experiment is carefully designed to ensure control and minimize

the chance that a customer may visit two stores and find the same product at different prices. We

present the experiment design methodology and then analyze the results from the experiment.

The key methodological questions addressed in the paper are which stores to select for the

experiment and how to map different price-points to each of these stores. Let n denote the number

of stores in the chain and suppose that each product in the experiment is to be tested at p

price-points, each in m stores (mp ≤ n). Thus, at each price-point, we require m stores such that they are

similar to the stores at the other price-points and, moreover, are representative of the entire chain.

We select such stores by using a clustering technique based on a measure of distance or degree of

dissimilarity between stores. Then we design a randomized block layout for the experiment wherein

p stores in each cluster are assigned one-to-one to different price-points.

Our paper contributes to the literature on pricing and field experiments in several ways. First,

while designing experiments has a long history in marketing and consumer research, in-store field

experiments are not that common. Instead, most existing research is based on interviews and

laboratory experiments involving the use of overt intervention to elicit consumer response; research

on pricing additionally uses historical time-series and panel data. An in-store experiment offers

some methodological advantages. It enables us to observe the relationship between price and unit

sales directly by changing prices systematically in a controlled environment. Thus, it allows us

to distinguish between consumer response to changes in regular price and consumer response to

markdowns. Furthermore, while prior research on pricing largely considers frequently purchased

items, we consider short lifecycle and one-time purchase items. The buyer behavior for such items

is more complex than that for frequently purchased items since consumers typically are less well

informed about such items and rely on price and other cues to make their purchase decisions (see

Roberts and Lilien (1993) for a survey of consumer behavior models). A field experiment enables

us to study buyer behavior for such items without having to assume a stylized pricing model.

Second, our experiment yields counterintuitive findings and shows that the relationship between

price and demand is not straightforward. While two of the three products used in the experiment

had downward-sloping demand curves, the third product had demand increasing significantly with

price in a part of the tested price range. Researchers in marketing have studied the role of price

in the consumer choice process as a budget constraint (allocative role) and as a signal of quality

(informational role). Our finding provides evidence to support existing theory and further shows

that the informational role of price can be so strong as to dominate the allocative role and result

in demand increasing with price. This finding is unique because we show an increase in demand

with price for the same product whereas existing empirical research typically compares price and

demand across competing products with different quality levels. We present possible explanations

for this finding from our discussions with managers at several retailers, and discuss managerial

implications on assortment selection and signals of quality other than price.

Third, we address the store selection problem in experiment design. This problem is particularly

relevant to seasonal or short-lifecycle products because there is a short time-span available for such

products, and it is not possible to change prices in the same subject store over time and observe the

corresponding change in sales.1 Instead, a matched subset of stores must be identified and different

prices must be used simultaneously in different stores. Several criteria must be met in selecting

the stores in order that the results of the experiment are accurately representative of the retailer’s

customer segment, and the cost and execution complexity of the experiment are minimized.

Our paper shows that in-store experimentation can be a powerful tool for retail managers to

learn about their customers. Once designed, the experiment can be conducted regularly with

different product groups. Further, it can be applied not only to price but also to other marketing

levers such as ‘item of the week’ promotion, shelf-space allocation, presentation, salesperson push,

etc.

1 It is also not prudent to change prices in the same subject store unless the goal is to study the effect of price promotions. This is so because customers who revisit the store may discover the changes in price.

This paper is organized as follows. Section 2 describes the relevant literature; §3 presents our

store selection model; §4 discusses the application of the methodology to the retailer; §5 presents

the results of the experiment; §6 discusses the insights obtained for the pricing decision; and §7

concludes the paper with directions for future research.

2 Literature Review

There has been a vast amount of research on pricing in the marketing literature. We focus on the

main ideas related to our paper without providing a comprehensive survey.

Tellis (1988) conducts a meta-analysis study of models estimating the price elasticity of demand

for various products. He reports 367 estimates of elasticity for 220 brands/markets obtained from

42 studies published during the period 1960-85 and yielding 424 sales models. These studies use

time-series or panel sales data for frequently purchased products, and estimate various market

share models. They differ by product category, brand, lifecycle, estimation method, functional

form, region and demographic groups. Tellis finds that the mean price elasticity of demand is

significantly negative and that it differs significantly across product categories and even across

brands in the same product category. More significantly, he finds 50 items with estimated price

elasticities greater than zero and another 40 items with price elasticities between 0 and -1. See

also Brodie and Kluyver (1984), Ghosh, Neslin and Shoemaker (1984) and Naert and Weverbergh

(1981) for price elasticity studies using different datasets and pricing models.

It is recognized that price plays two distinct roles in the consumer choice process: an allocative

role (as a budget constraint) and an informational role (as a signal of quality); see Nagle (1984),

Rao (1984, 1993), Rao and Sattler (2000) for reviews. Empirical measurement of price elasticity of

demand is complicated by the fact that only the net effect of price can be observed in practice, and

the two roles cannot be distinguished. For example, in the aforementioned studies on price elasticity

of demand, it is likely that the items with elasticity greater than -1 had a significant informational

role of price, and the ones with positive elasticity had the informational role dominating the

allocative role. While there is a lack of research that integrates the two roles of price into a single

model, several papers use experiments to identify these roles.

For example, several researchers have examined the effect of price on consumers’ quality percep-

tions and on objective quality. Objective quality is defined as an unbiased measurement of quality

based on characteristics such as design, durability, performance and safety, and is often obtained

from independent consumer reports published by the Consumers Union. Research evidence suggests

that while price is used as an indicator of quality, there is a lack of uniformity across products as

well as across individuals in both the price-quality relationship and the perceived quality-objective

quality relationship (see Etgar and Malhotra 1981, Gerstner 1985, Zeithaml 1988). Several expla-

nations have been offered for this lack of uniformity. For example, Lichtenstein and Burton (1989)

find that consumers perceive a stronger association between price and quality for durable products

than for non-durable products, even though the objective quality of products may be unrelated to

or negatively correlated with price. They offer the following explanations for this finding: (i) less

knowledge about durable goods because the consumer makes fewer and more infrequent purchases

in a durable goods category, and (ii) greater difficulty in evaluating the quality of durable goods as

they are more complex products.

Dodd, et al. (1991) investigate how the price-perceived quality relationship influences buyers’

perception of value or their purchase intentions. They conduct an experiment with two short

lifecycle products (calculators and personal stereo) to identify the effects of price, brand and store

name on the consumers’ quality perceptions, value perceptions and willingness to buy. They find

that while price is positively correlated with quality perceptions, it is negatively correlated with

willingness to buy for both products. Their methodology differs from ours in that they use a

laboratory experiment rather than in-store experiment.

The lack of uniformity in the relationship between price and sales underscores the need for

further empirical research to validate existing theory. It also implies that retailers must use pro-

prietary data to learn about their own consumers. The field experiment presented in this paper

fits these purposes. It differs in focus from the existing research in three important ways. First,

we directly measure the effect of changes in regular price on sales rather than eliciting consumer

attitudes and preferences. Second, existing empirical research focuses on estimating market share

models for competing products or identifying the drivers of buyer behavior process, while we focus

on a retailer’s pricing decision. Finally, we consider short lifecycle one-time purchase items which

have not been commonly studied in prior research using actual sales data.

We note that other researchers have used controlled experiments at retail chains to estimate the

price elasticity of demand, see for example, Nevin (1974), Curhan (1974) and Neslin and Shoemaker

(1983). Experiments have also been used to study the impact of store environmental variables such

as music, lighting, behavior of store employees and store design on consumer behavior. See, for

example, Baker, Levy and Grewal (1992) and Gagnon and Osterhaus (1985). These studies differ

from this paper in that they use aggregated data or data from a pre-selected set of stores. Thus,

they do not address the problem of store selection for a retail chain selling short lifecycle products.

Our paper is related to Fisher and Rajaram (2000), who present an experimental methodology

for testing new merchandise at a subset of stores prior to launch and demonstrate results from

application to a women’s apparel retail chain. Their paper differs from ours since they do not

consider price as a decision variable. Further, their experiment design does not involve multiple

treatment effects, and therefore, does not require a matched subset of stores to be selected.

3 Experiment Design: Store Selection Model

Let n be the number of stores in the retail chain, p the number of price-points to be tested in the

experiment, and m the number of stores at which each price-point is to be repeated (mp ≤ n). In

the terminology of experiment design, each price-point is called a ‘treatment’, so that we have p

treatments and m repetitions of each treatment. We construct a randomized block design for the

experiment. We first partition all the stores into m disjoint blocks such that each block has at least

p stores and the stores in each block are as ‘alike’ as possible. We then select p stores in each block

that are geographically far from each other and randomly assign them to the p price-points. Thus,

each price-point is tested once in each of the m blocks.
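The within-block randomization described above can be sketched in a few lines. This is an illustration only, not the procedure used at the retailer: the function name `assign_price_points`, the store identifiers, and the fixed seed are hypothetical, and the sketch assumes the p stores per block have already been selected.

```python
import random


def assign_price_points(blocks, price_points, seed=None):
    """Randomly map the selected stores in each block one-to-one to price-points.

    blocks: dict mapping a block id to the list of p stores selected in that block.
    price_points: list of the p treatments.
    Returns a dict mapping each store to its assigned price-point.
    """
    rng = random.Random(seed)
    layout = {}
    for block_id, stores in blocks.items():
        if len(stores) != len(price_points):
            raise ValueError(f"block {block_id} needs exactly {len(price_points)} stores")
        shuffled = list(price_points)
        rng.shuffle(shuffled)  # independent random permutation within each block
        for store, price in zip(stores, shuffled):
            layout[store] = price
    return layout


# Hypothetical example: m = 2 blocks, p = 3 price-points.
blocks = {1: ["A1", "A2", "A3"], 2: ["B1", "B2", "B3"]}
layout = assign_price_points(blocks, ["low", "middle", "high"], seed=0)
```

Because the permutation is drawn separately in each block, every price-point appears exactly once per block, which is the property the design relies on.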

The randomized block design is used because it satisfies three principles of experiment design:

replication, randomization, and local control. The replication of each treatment m times gives a

basis for the estimation of the experiment error. Randomization within each block is a necessary

condition for obtaining a valid estimate of the effects of the treatments on the experiment results

since it controls for unknown differences between stores that may be sources of error in the experi-

ment. Local control implies that the stores assigned to each block be chosen as alike as possible for

the comparison of treatment effects within each block. These three principles were formulated by R.

A. Fisher (1923, 1926). See Montgomery (1991), and Mason, Gunst, and Hess (1989) for an intro-

duction to experiment design, and Ghosh and Rao (1996) for survey articles on the mathematical

properties of experiment design methods.2

Our experiment has one other requirement in addition to these principles. We need to ensure

that the results of our experiment are representative, i.e., they provide an accurate forecast of the

sales in the entire chain at each treatment. Therefore, we require that the stores assigned to each

price-point should represent the sales characteristics of the entire chain.

2 When the number of experimental units per block is smaller than the number of treatments, then an incomplete block design is used. While we use a complete block design, the method of store selection we present may be used in conjunction with an incomplete block design as well.

To measure the degree of (dis)similarity between stores, we use a metric given by Fisher and

Rajaram (2000). They define the ‘distance’ or the degree of dissimilarity between two stores as

the difference between their sales distributions across product categories. Let l = 1, . . . , q index

the product categories sold by the retailer, and fsl be the fraction of sales of store s realized from

product category l. The distance between the sales distributions of stores s and t, denoted dst, is

computed using the Euclidean norm,

$$d_{st} = \sqrt{\sum_{l} (f_{sl} - f_{tl})^2}. \tag{1}$$

Fisher and Rajaram compare this distance metric with several alternatives. They find that the

partition of stores obtained using (1) provides a more accurate forecast of total chain sales than

distance measures defined solely on store size or geographical location. Our methodology remains

unchanged if an alternative measure of dissimilarity between stores is used.
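As a minimal sketch of the distance in (1), assuming each store is described by its dollar sales per product category, the computation is as follows. The helper names and the three-category sales figures are hypothetical and chosen only for illustration.

```python
import math


def sales_fractions(category_sales):
    """Convert a store's dollar sales per category into fractions f_sl that sum to 1."""
    total = sum(category_sales)
    return [value / total for value in category_sales]


def store_distance(f_s, f_t):
    """Euclidean distance between two stores' sales distributions, as in (1)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f_s, f_t)))


# Hypothetical stores with sales in q = 3 categories.
store_s = sales_fractions([50.0, 30.0, 20.0])
store_t = sales_fractions([20.0, 30.0, 50.0])
d_st = store_distance(store_s, store_t)
```

Working with fractions rather than dollar sales means that two stores of very different size but identical category mix are at distance zero.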

We now assign stores to clusters (henceforth, ‘cluster’ is used synonymously with ‘block’). We use

a variant of the k-median problem (see Nemhauser and Wolsey 1988). Each cluster is represented

by its median store. The degree of dissimilarity within each cluster is defined as the sum of

the distances of all stores in that cluster from the median store. The objective of the problem

formulation is to minimize the total sum of dissimilarities within each cluster.

Let yk be 1 if store k is chosen as the median of a cluster and 0 otherwise. Also let xsk be

1 if store s is assigned to the cluster with store k as its median and 0 otherwise. The problem

formulation with decision variables yk and xsk is as follows:

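\begin{align}
\text{minimize} \quad & \sum_{s,k} d_{sk}\, x_{sk} \tag{2} \\
\text{subject to} \quad & \sum_{k} x_{sk} = 1, && s = 1, \dots, n \tag{3} \\
& x_{sk} \le y_k, && k = 1, \dots, n,\; s = 1, \dots, n \tag{4} \\
& \sum_{k} y_k = m \tag{5} \\
& \sum_{s} x_{sk} \ge p\, y_k, && k = 1, \dots, n \tag{6} \\
& x_{sk},\, y_k \in \{0, 1\}, && k = 1, \dots, n,\; s = 1, \dots, n. \tag{7}
\end{align}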

The objective function (2) is the sum of the distance of each store from the median of its cluster.

Constraint (3) ensures that each store is assigned to exactly one cluster; (4) ensures that stores are

assigned only to the median stores of their respective clusters; (5) restricts the number of clusters

to m; finally (6) defines the additional constraint that each cluster has at least p stores. We solve

this problem using the standard branch-and-bound algorithm for integer programming using the

CPLEX solver in GAMS.
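To make the formulation (2)-(7) concrete, the sketch below solves a tiny instance by exhaustive enumeration. It is not the branch-and-bound approach used in the paper; brute force is swapped in here only because it needs no solver, and it is practical only for a handful of stores. The function name and the six-store example on a line are hypothetical.

```python
from itertools import combinations, product


def k_median_min_size(dist, m, p):
    """Exhaustively solve (2)-(7) for a tiny instance.

    dist: n x n matrix of store distances d_sk.
    m: number of clusters; p: minimum number of stores per cluster.
    Returns (cost, medians, assignment), where assignment[s] is the median of store s.
    """
    n = len(dist)
    best = (float("inf"), None, None)
    for medians in combinations(range(n), m):        # constraint (5): exactly m medians
        for choice in product(medians, repeat=n):     # (3), (4): one chosen median per store
            if any(choice.count(k) < p for k in medians):  # (6): at least p stores each
                continue
            cost = sum(dist[s][choice[s]] for s in range(n))  # objective (2)
            if cost < best[0]:
                best = (cost, medians, choice)
    return best


# Hypothetical instance: six stores positioned on a line, m = 2 clusters, p = 3.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in positions] for a in positions]
cost, medians, assignment = k_median_min_size(dist, m=2, p=3)
```

In this instance the two natural groups of three stores are recovered, with the middle store of each group as its median.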

We note that the k-median problem formulation is a non-hierarchical clustering technique.

It gives the optimal solution by solving an integer program for a fixed k, but the problem is NP-hard and thus

computationally intensive and suited to smaller datasets. Additionally, it has the nice property that

the clusters formed are convex in the Euclidean space. In contrast, hierarchical cluster analysis

uses a greedy algorithm that iteratively combines together clusters with the minimum distance

between them in each step, but does not allow reallocation of stores that may have been poorly

classified in an early iteration. Thus, it need not give the optimal solution, but runs in polynomial

time and is particularly suited to large datasets. Further, it is useful when k is not known a priori.

Hierarchical cluster analysis requires a measure of distance not only between individual stores but

also between clusters of stores. Several distance metrics have been proposed, such as the distance

between the centroids of the clusters, or as the average distance between all pairs of stores across

the two clusters. For computing distance between singleton clusters, these methods are identical to

the Euclidean norm used by us. See Everitt, et al. (2001) for a comparison of alternative clustering

techniques.

4 Application

Zany Brainy, Inc.3 is a specialty retailer of high quality educational toys for children less than 12

years old. It has 53 retail stores in the U.S. with total sales of about $200 million. It faces a pricing

problem for its products every season when it launches new products and determines their prices.

The main characteristics of Zany Brainy’s business are as follows:

1. It sells products across eleven categories, such as games and puzzles, arts and crafts, sport-

theme toys, building toys and trains, infant development toys, electronic learning aids and

science-related toys, etc. The retailer has a unique image and does not sell toys that reinforce

gender stereotypes or encourage violence.

2. Most products sold at Zany Brainy are exclusive, with only 30% of the products being common

with discounters and mass-market retailers.

3. A typical store is 10,600 square feet and carries 15,000 skus. Stores are located in suburban

shopping centers, sharing retail space with destination and lifestyle-oriented retailers.

4. Zany Brainy seeks to set prices that give value to the customer, without trying to be the

discount leader in its market. Therefore, it does not use cost-plus pricing. Instead, merchan-

disers determine the markup for each item individually using their experience and judgement.

Products vary in price from less than $1 up to $200, with the average price for a single product

being less than $10. The average gross margin across all products is close to 50%.

3 All data for Zany Brainy, Inc. are based on the year 2000.

Zany Brainy was interested in adopting a more scientific approach to pricing. As a first step

towards this, it decided to conduct an in-store experiment with a subset of products in its chain

to observe how their demand varies with price. It was also interested in exploring the long-term

usefulness of in-store experimentation to learn about consumer response to various marketing levers

employed in its stores, such as “item of the week” promotion, salesperson push, advertisements,

store layout, assortment planning, etc.

Store Selection Model: We are given data for 53 stores, and each item is to be tested at three

price-points in six stores each. Thus, n = 53, m = 6 and p = 3. To measure the distances between

stores, we classify the annual dollar sales of each store into eleven product categories used by the

firm and compute the fraction of sales in each category. The average distance between stores thus

computed is 0.5347. After solving the k-median problem, the average distance of a store from the

median of its cluster is 0.2228, a 58% reduction. Since clustering per se does not provide tests of

statistical significance, we use an analysis of variance test to determine whether the clusters formed

are representative of the chain, i.e., whether they explain a statistically significant proportion of

the variation in sales distribution of all stores. The test is statistically significant (p-value < 0.001).
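The 58% figure quoted above follows directly from the two reported averages:

```latex
\[
1 - \frac{0.2228}{0.5347} \approx 1 - 0.417 = 0.583 \approx 58\%.
\]
```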

Now, from each cluster, three stores are selected for the experiment using additional criteria

to further control for dissimilarities within clusters. The selected stores are similar in age and size

(measured by total dollar sales), and their geographical locations are relatively isolated from other

stores belonging to the chain. Table 1 lists the stores selected for the experiment, their opening

years, their total year-to-date sales and the percentage of sales coming from the five largest product

categories.

Ideally, we would like to conduct the experiment in as many stores as possible to obtain a

large dataset for analysis. However, the choice of the number of stores was limited due to the

opportunity cost of lost sales in the stores under experimentation, and the complexity of managing

the controlled experiment such that there are no execution errors. The management of the firm was

also concerned about interference between stores located close to each other. If a large number of

stores were used in the experiment, then a customer visiting two neighboring stores would discover

the difference in their prices. This would not only introduce an error in the experiment but also

result in a loss of goodwill for the retailer.

Description of Items: The experiment was conducted for the following three products: a family

game center, ‘phonics traveler’, and a headset walkie-talkie. The family game center is an unbranded

board game. It is a simple toy that customers can try out in the store. The phonics traveler is a

branded toy, produced by Leapfrog Enterprises, Inc. It is a complex electronic product to teach

spelling and reading to children through interaction. The headset walkie-talkie is also a complex

electronic toy, but is unbranded. The family game center and the headset walkie-talkie are not

carried by the competition. Each item is unique to avoid comparison with other brands in the

same category.

Table 2 lists the items, their price-points and their purchase costs. The middle price-point in

each case is the existing list price and the high and low price-points are five dollars above and

below the list price. These price-points were chosen by the firm’s merchandising managers to be

sufficiently far from each other in order that they cause observable changes in demand.

Experiment Layout: Table 3 shows the layout of the experiment. In each cluster, one store is

designated the control store and assigned each item at the middle price-point. The remaining two

stores are randomly assigned the high and low price-points for each item. For example, store 103 is

the control store in cluster 1, store 102 has the Family Game Center at its highest price-point, and

the Phonics Traveler and the Headset Walkie-Talkie at their lowest price-points, and store 204 is

assigned the remaining price-points. We note that, as an alternative to this layout, all three prices

could be randomized across all 18 stores. Doing so would eliminate a potential confound if there is

interaction in demand across the products under experiment. We did not consider this possibility

since the products in our experiment belong to different categories, and are physically located in

different parts of the store.

Data Collection: The experiment was conducted for a period of six weeks. The length of the

time period was fixed as a compromise between our desire to have a long time period to collect

data, and the managers’ keenness to avoid the experiment encroaching on their peak selling season.

The data collected at each store for each week were the unit sales of the three items, the number

of returns, the beginning-of-week inventories of the three items, and the total number of sales

transactions in the store across all items. The total number of sales transactions in the store was

used to measure customer traffic in order to control for differences in store size.

Since store managers were kept unaware of the experiment, inventory control and data quality

were managed centrally. An inventory planner in the corporate office was responsible for monitoring

beginning-of-week inventories to ensure that there was no stock-out. Various studies on retail

supply chains have demonstrated that data quality tends to be poor and that much work needs

to be done on upgrading data quality so that data could be useful in analysis (see, for example,

Raman, DeHoratius and Ton 2001). Thus, the inventory planner monitored sales data for the

experiment daily to safeguard against data quality problems.

Some precautions were observed during the experiment: (1) The price labels in each case were

changed to reflect the new prices. The labels did not show the original list price, so that customers

would not perceive that a product was marked up or marked down. (2) Sufficient inventory was kept

in the experimental stores to avoid stock-outs. (3) The store managers were not informed about

the experiment to avoid any execution differences that may arise because of managers treating the

experimental items differently, or trying to promote them based on their price-points.

5 Results

Table 4 summarizes the results of the experiment. For each item-store combination, it shows the list

price and the total number of units sold over six weeks. There were no stock-outs in the experiment

stores as sufficient inventory was provided. The last column in the table gives the average number

of sales transactions per week recorded in each store. Figure 1 shows a plot of the total sales of

each product at each price-point.

Note that the total sales of the family game center and the phonics traveler are downward sloping

in price. The family game center recorded total sales of 7 units, 5 units and 3 units at prices of

$19.99, $24.99 and $29.99, respectively. The phonics traveler recorded total sales of 33 units, 26

units and 15 units at prices of $24.99, $29.99 and $34.99, respectively. However, the headset walkie-

talkie shows a different pattern. Its sales of 74 units at the middle price-point are much higher than

the sales of 47 units at the lowest price-point and the sales of 36 units at the highest price-point.

This finding was unexpected to us as well as to the managers at the retail chain. To ascertain

whether this finding is statistically significant, we fit a demand model to the experimental data

expressing demand as a function of categorical variables for the three price-points.

Since all three products are slow-moving items, we represent weekly demand in each store with

a Poisson distribution and thus use a Poisson regression model for statistical analysis (see Greene

1997: Chapter 19 for details). We assume that mean weekly demand follows a multiplicative model

and is given by a product of cluster-specific, price-specific, and store size specific variables. Let

ykit denote random demand in the store in cluster k at price-point i in week t, and λkit denote the

mean of ykit. Here, k = 1, . . . ,m and i ∈ {L,M,H}, denoting low price, middle price and high

price, respectively. We write λkit as

$$\lambda_{kit} = a_k\, b_i\, (x_{kit})^{c}, \tag{8}$$

where ak is a cluster-specific constant to control for differences between clusters, bi is a price-specific

constant, xkit is the number of transactions in the store in cluster k at price-point i in week t, and

c is a coefficient representing the increase in mean demand with store size.

The regression model is used to estimate \(a_k\), \(b_i\) and \(c\). We represent the indices \(k\) and \(i\) by dummy variables, so that \(\log a_k\) and \(\log b_i\) are the coefficients of their respective dummy variables in the log-linear form of (8). Since the coefficient matrix must be non-singular, it is not possible to estimate all coefficients separately. Therefore, we set the value of \(b_M\) to be 1 and use two dummy variables to estimate \(b_L\) and \(b_H\) relative to \(b_M\). The model (8) is estimated by maximum likelihood estimation (MLE). The log-likelihood function for the Poisson distribution is:

\[ \log L = \sum_{k,i,t} \left[ -\lambda_{kit} + y_{kit} \left( \log a_k + \log b_i + c \log x_{kit} \right) - \log (y_{kit}!) \right]. \tag{9} \]
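The model and its log-likelihood can be sketched in a few lines of Python. This is only an illustration of the formulas above, not the estimation code used in the study; the function names, the record layout `(cluster, price_point, x, y)` and all numeric values are hypothetical.

```python
import math

def mean_demand(a_k, b_i, x, c):
    """Mean weekly demand under the multiplicative model: a_k * b_i * x**c."""
    return a_k * b_i * x ** c

def poisson_log_likelihood(observations, a, b, c):
    """Sum of Poisson log-probabilities over (cluster, price_point, x, y) records."""
    total = 0.0
    for cluster, price_point, x, y in observations:
        lam = mean_demand(a[cluster], b[price_point], x, c)
        # log P(Y = y) = -lam + y * log(lam) - log(y!); lgamma(y + 1) equals log(y!)
        total += -lam + y * math.log(lam) - math.lgamma(y + 1)
    return total

# Hypothetical illustration values, not estimates from the paper.
a = {1: 0.002}                          # cluster-specific constants
b = {"L": 1.2, "M": 1.0, "H": 0.7}      # price-specific constants, b["M"] fixed at 1
observations = [(1, "L", 1900.0, 2), (1, "M", 1750.0, 1), (1, "H", 1800.0, 0)]
log_lik = poisson_log_likelihood(observations, a, b, c=1.0)
print(log_lik)
```

A maximum likelihood fit would search over `a`, `b["L"]`, `b["H"]` and `c` for the values that make this sum as large as possible.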

We test whether demand is downward sloping with price for the first two products using the following hypothesis:

Hypothesis 1. For the phonics traveler and the family game center, \(b_L \ge b_M \ge b_H\).

For the headset walkie-talkie, we test whether demand at the middle price-point is higher than that at lower and higher price-points using the following hypothesis:

Hypothesis 2. For the headset walkie-talkie, \(b_L \le b_M\) and \(b_M \ge b_H\).

Since \(b_M\) is set to 1, Hypothesis 1 is equivalent to testing that \(\log b_L \ge 0\) and \(\log b_H \le 0\). Hypothesis 2 is equivalent to testing that \(\log b_L \le 0\) and \(\log b_H \le 0\). The Student's t test is used in each case because the estimates of \(\log b_i\) are asymptotically normally distributed.
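As a rough check on how such a test works, the sketch below converts an estimate and its standard error into a t-statistic and a one-sided p-value. It assumes the p-value is the tail probability of a standard normal distribution, which is an assumption on our part, although it appears to agree with the values printed in Table 5. The function name and dictionary layout are hypothetical; the numbers are the price coefficients reported in Table 5.

```python
import math

def one_sided_p_value(estimate, std_error):
    """Return (t, p): the t-statistic and a one-sided standard normal tail probability."""
    t = estimate / std_error
    # Tail probability beyond |t| under N(0, 1): 0.5 * erfc(|t| / sqrt(2))
    p = 0.5 * math.erfc(abs(t) / math.sqrt(2.0))
    return t, p

# (estimate of log b_i, standard error) as reported in Table 5
price_coefficients = {
    ("phonics traveler", "high"): (-0.6323, 0.3248),
    ("headset walkie-talkie", "low"): (-0.5088, 0.1868),
    ("headset walkie-talkie", "high"): (-0.7484, 0.2017),
}

for (product, level), (estimate, std_error) in price_coefficients.items():
    t, p = one_sided_p_value(estimate, std_error)
    print(f"{product}, {level} price: t = {t:.4f}, p = {p:.4f}")
```

Because the ratios are recomputed from rounded table entries, the last digit of a t-statistic may differ slightly from the printed one.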


Table 5 gives the parameter estimates obtained from MLE. For each item, the first two rows of the table give the estimates of \(\log b_L\) and \(\log b_H\), with the value for the middle price-point set to zero. The next six coefficient estimates give the values of \(\log a_k\) for the six clusters. The last estimate (labeled Scale) gives the value of \(c\), the exponent of \(x_{kit}\).

We observe that \(b_i\) decreases with price for the family game center and the phonics traveler. For example, for the family game center, the estimate of \(\log b_i\) is 0.3277 at the lower price, 0 at the middle price, and -0.5249 at the higher price. Further, computing t-statistics and p-values, we observe that the estimates of \(\log b_i\) are statistically significant (p < 0.05) at the higher price for the phonics traveler and at both the lower and higher prices for the headset walkie-talkie. Thus, Hypothesis 1 is supported for the phonics traveler at the 95% confidence level but not for the family game center. The lack of significance for the family game center may be because the quantity of sales registered at each price-point for this product is too small for statistical analysis.

For the headset walkie-talkie, Hypothesis 2 is supported at the 99% confidence level. Thus, the finding that the sales at the middle price-point exceed those at the low and high price-points is statistically validated.

6 Discussion

6.1 Reasons for upward sloping demand

To understand the reasons for the demand curve observed for the headset walkie-talkie, we discussed the results with the managers of the subject firm and several other retailers. The following explanations emerged from our discussions:

1. Price as an indicator of quality: The headset walkie-talkie is a complex electronic item. Consumers find it difficult to judge its quality and therefore use price as an indicator of quality. The managers used wine as another example where consumers might be expected to use price as an indicator of quality. This argument did not apply to the family game center because it is a board game that is easily understood by the customer, so that price need not be used as an indicator of quality. It also did not apply to the phonics traveler because it is a branded item.

This explanation is consistent with those given by Dodds et al. (1991), Gerstner (1985), Lichtenstein and Burton (1989), Tellis and Wernerfelt (1987) and Zeithaml (1988) for consumers using price as an indicator of quality. Our finding extends the insights from these articles because it is based on actual sales. Further, some of these papers are based on comparing price and quality across products in a category, while we document the increase in sales with price for the same product due to a likely increase in quality perception. For example, if a reputed and a less reputed brand sell the same toy for $30 and $25, respectively, then it is possible that the more reputed brand has a higher quality and thus a higher demand in spite of the higher price. However, our finding implies that the demand for the more reputed brand might increase if its price is increased from $30 to $35.

2. Sweet spot of pricing: The price-point $19.99 is more popular for gift purchases than $14.99.

Consumers may like the headset walkie-talkie as a gift item, so that the unit sales at $19.99 exceed

those at $14.99.

6.2 Implications of Informational Role of Price

We first note that if price plays an informational role, then it leads to a reduction in the demand and the expected profit for the subject item. To see this, consider the headset walkie-talkie. If price did not play an informational role and consumers had full information about the quality of this item from other sources, then the demand for this item would be downward sloping in price. A conservative estimate of the price elasticity of demand is obtained from the two higher price points of $19.99 and $24.99.⁴ We find that the price elasticity is -3.16 (see Appendix for the computations). With this price elasticity, the price that maximizes the profit is $16.09 and the maximum profit is $743.50. Compared to the profit of $665.26 realized at the middle price-point ($19.99), this represents an increase of 12%. In such a case, it might be advantageous for the retailer to provide additional signals of quality to increase demand and profits.
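The profit comparison in this paragraph can be retraced with simple arithmetic. The sketch below recomputes the profit realized at the middle price-point from the unit sales in Table 4 and the purchase cost in Table 2, and then the percentage gain implied by the quoted maximum profit. The maximum profit of $743.50 is taken from the text as given rather than rederived, and the variable names are our own.

```python
unit_cost = 11.00            # purchase cost of the headset walkie-talkie (Table 2)
middle_price = 19.99         # existing list price, the middle price-point
units_at_middle = 74         # total unit sales at the middle price-point (Table 4)

# Gross profit realized at the middle price-point: units * (price - cost)
realized_profit = units_at_middle * (middle_price - unit_cost)

# Maximum profit quoted in the text for the estimated elasticity of -3.16
reported_max_profit = 743.50

pct_increase = 100.0 * (reported_max_profit / realized_profit - 1.0)
print(f"realized profit = {realized_profit:.2f}, increase = {pct_increase:.1f}%")
```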

However, the informational role of price can be beneficial to the retailer due to product substitution. Suppose that the retailer offers an assortment of items at different price-points and that the consumers are willing to substitute between these items. Then using price as a signal for quality could enable the retailer to capture a larger consumer surplus by promoting substitution.

7 Conclusions

We have presented a methodology for conducting experiments in a retail store that can help retailing managers learn more about their consumers. This methodology is useful not just for finding consumer reactions to different price-points, but also for testing the effects of other marketing levers on sales. The critical aspect of the methodology is the selection of stores for the experiment. We have shown how the differences between stores may be defined using their sales characteristics and used to partition stores into a randomized block design. This technique is advantageous because an experiment with a small number of experimental stores can yield accurate results applicable to the entire chain.
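The randomized block step can be illustrated with a short sketch: within each cluster of similar stores, every store is randomly assigned to a different price-point. The cluster membership below is taken from Table 1. The function name, the seed and the use of Python's `random` module are assumptions for illustration only, since the paper does not describe how the random assignment was generated.

```python
import random

# Clusters of three stores each, as listed in Table 1
clusters = {
    1: [102, 204, 103],
    2: [108, 402, 105],
    3: [302, 303, 205],
    4: [325, 526, 401],
    5: [107, 527, 326],
    6: [110, 504, 503],
}
price_points = ["low", "medium", "high"]

def assign_price_points(clusters, price_points, seed=None):
    """Randomly assign each store in a cluster to a distinct price-point."""
    rng = random.Random(seed)             # seeded generator for a repeatable layout
    layout = {}
    for cluster, stores in clusters.items():
        shuffled = list(stores)           # copy so the input lists are not mutated
        rng.shuffle(shuffled)
        layout[cluster] = dict(zip(price_points, shuffled))
    return layout

layout = assign_price_points(clusters, price_points, seed=2004)  # hypothetical seed
for cluster, assignment in layout.items():
    print(cluster, assignment)
```

Repeating the call with a different seed for each product would give each product its own layout, as in Table 3.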

While our paper shows the usefulness of in-store experimentation, it has some limitations that may be addressed in future research. First, our paper provides a single statistically significant piece of evidence that demand can be upward-sloping in price. This finding needs to be replicated through further experimentation to demonstrate its validity. A wider range of products and price points should be considered to systematically study how consumers react to prices and extend our results. Second, the experiment may be conducted with fast-moving items or for a larger number of weeks to collect more data for statistical analysis. Finally, the informational role of price may be quantified in future research to investigate the conditions under which such a role is advantageous or disadvantageous to the retailer. Such research would be useful to retailers that use different pricing strategies and provide different levels of quality signals for their products.

⁴ This estimate is conservative since the observed sales are net of the informational and allocative roles of price.

References

Baker, J., M. Levy, D. Grewal. 1992. An experimental approach to making retail store environmental decisions. Journal of Retailing 68 445-460.

Brodie, R., C. Kluyver. 1984. Attraction Versus Linear and Multiplicative Market Share Models:

An Empirical Evaluation. Journal of Marketing Research 21 194-201.

Curhan, R. 1974. The Effects of Merchandising and Temporary Promotional Activities on the Sales of Fresh Fruits and Vegetables in Supermarkets. Journal of Marketing Research 11 286-294.

Dodds, W. B., K. B. Monroe, D. Grewal. 1991. Effects of Price, Brand and Store Information on

Buyers’ Product Evaluations. Journal of Marketing Research 28 307-319.

Etgar, M., N. Malhotra. 1981. Determinants of Price Dependency: Personal and Perceptual

Factors. Journal of Consumer Research 8 217-222.

Everitt, B. S., S. Landau, M. Leese. 2001. Cluster Analysis. 4th ed. Edward Arnold.

Fisher, M. L., K. Rajaram. 2000. Accurate Retail Testing of Fashion Merchandise: Methodology

and Application. Marketing Science 19 266-278.


Fisher, R. A. 1923. Studies in crop variation II: The manurial response of different potato varieties. Journal of Agricultural Science 13 311-320.

Fisher, R. A. 1926. The arrangement of field experiments. Journal of the Ministry of Agriculture 33 503-513.

Gagnon, J. P., J. T. Osterhaus. 1985. Effectiveness of Floor Displays on the Sales of Retail

Products. Journal of Retailing 61 104-116.

Gerstner, E. 1985. Do Higher Prices Signal Higher Quality? Journal of Marketing Research 22 209-215.

Ghosh, A., S. Neslin, R. Shoemaker. 1984. A Comparison of Market Share Models and Estimation

Procedures. Journal of Marketing Research 21 202-210.

Ghosh, S., C. R. Rao (eds.) 1996. Design and Analysis of Experiments. Handbook of Statistics,

Vol 13. Elsevier Science, Amsterdam.

Greene, W. H. 1997. Econometric Analysis. 3rd ed., Prentice Hall, New Jersey.

Lichtenstein, D. R., S. Burton. 1989. The Relationship Between Perceived and Objective Price-Quality. Journal of Marketing Research 26 429-443.

Mason, R., R. Gunst, J. Hess. 1989. Statistical Design and Analysis of Experiments. John Wiley

& Sons, New York.

Montgomery, D. 1991. Design and Analysis of Experiments. 3rd ed., John Wiley & Sons, New

York.

Naert, Ph., M. Weverbergh. 1981. On the Prediction Power of Market Share Attraction Models. Journal of Marketing Research 18 146-153.

Nagle, T. 1984. Economic Foundations for Pricing. Journal of Business 57 S3-S26.


Nemhauser, G., L. Wolsey. 1988. Integer and Combinatorial Optimization. John Wiley & Sons,

New York.

Neslin, S., R. Shoemaker. 1983. Using a Natural Experiment to Estimate Price Elasticity: The

1974 Sugar Shortage and the Ready-to-Eat Cereal Market. Journal of Marketing 47 44-57.

Nevin, J. 1974. Laboratory Experiments for Estimating Consumer Demand: A Validation Study. Journal of Marketing Research 11 261-268.

Raman, A., N. DeHoratius, Z. Ton. 2001. Execution: The Missing Link in Retail Operations. California Management Review 43 136-152.

Rao, V. R. 1984. Pricing Research in Marketing: The State of the Art. Journal of Business 57

S39-S60.

Rao, V. R. 1993. Pricing Models in Marketing. Chapter 11 in Handbooks in OR & MS. Vol. 5.

Marketing. J. Eliashberg and G. L. Lilien, eds., North-Holland, Amsterdam. 517-552.

Rao, V. R., H. Sattler. 2000. Measurement of Price Effects with Conjoint Analysis: Separating

Informational and Allocative Effects of Price. Chapter 2 in Conjoint Measurement: Methods

and Applications. Springer-Verlag, Berlin, Germany.

Roberts, J. H., G. L. Lilien. 1993. Explanatory and Predictive Models of Consumer Behavior.

Chapter 2 in Handbooks in OR & MS. Vol. 5. Marketing. J. Eliashberg and G. L. Lilien,

eds., North-Holland, Amsterdam. 27-82.

Tellis, G. J. 1988. Price Elasticity of Selective Demand: A Meta-Analysis of Econometric Models

of Sales. Journal of Marketing Research 25 331-341.


Tellis, G. J., B. Wernerfelt. 1987. Competitive Price and Quality under Asymmetric Information.

Marketing Science 6 240-253.

Zeithaml, V. 1988. Consumer Perceptions of Price, Quality and Value: A Means-End Model and Synthesis of Evidence. Journal of Marketing 52 2-22.

Appendix: Estimation of Price Elasticity of Demand

Assume that the mean demand depends on price according to a constant elasticity demand curve. Then, (8) in the Poisson regression model of §5 is modified to

\[ \lambda_{kit} = a_k \, (P_{ki})^{\eta} \, (x_{kit})^{c}, \]

where \(P_{ki}\) is the i-th price-point in cluster k, \(\eta\) is the price elasticity of demand, and the other variables have the same definition as before. The log-likelihood function is thus given by:

\[ \log L = \sum_{k,i,t} \left[ -\lambda_{kit} + y_{kit} \left( \log a_k + \eta \log P_{ki} + c \log x_{kit} \right) - \log (y_{kit}!) \right]. \]

We compute the maximum likelihood estimates of ak, η and c. The price elasticity of demand for

the phonics traveler is found to be -2.02 (standard error = 0.90, p-value = 0.01). For the family

game center, it is found to be -1.95 (standard error = 1.64, p-value = 0.12). The lower significance

level for the family game center may be because it is a very slow-moving item with a few units sold

at each price. For the headset Walkie-Talkie, the price elasticity of demand estimated from the two

higher price points of $19.99 and $24.99 is -3.16 (standard error = 1.18, p-value < 0.01).

The optimal price is found by maximizing the standard expected profit function,

\[ E[\pi] = P_{ki} \lambda_{kit} - C \lambda_{kit} = \text{constant} \times \left( P_{ki}^{1+\eta} - C P_{ki}^{\eta} \right), \]

where \(C\) denotes unit cost. This gives an optimal price of \(P_{ki} = \eta C/(1 + \eta)\). Its value is $35.65 for the phonics traveler for a cost of $18 and a profit margin of $17.65. For the family game center, it is equal to $22.58 for a cost of $11 and a profit margin of $11.58. Thus, the optimal price for the phonics traveler is found to be higher than the existing price of $29.99 and the optimal price for the family game center is found to be lower than the existing price of $24.99 at this retail chain. The increase in expected gross profit from moving to the optimal price is 3.8% for the phonics traveler and 0.9% for the family game center. The pricing problem for the headset walkie-talkie is discussed in §6.2.
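The optimal price follows from the first-order condition of the profit function: setting the derivative \((1+\eta)P^{\eta} - \eta C P^{\eta-1}\) to zero gives \(P = \eta C/(1+\eta)\), which is a maximum only when \(\eta < -1\). The sketch below applies this formula to the elasticities and purchase costs reported in the paper. The function name and data layout are our own, and the check on the elasticity is an added safeguard rather than part of the original analysis.

```python
def optimal_price(elasticity, unit_cost):
    """Profit-maximizing price eta * C / (1 + eta) for a constant elasticity eta < -1."""
    if elasticity >= -1.0:
        raise ValueError("an interior optimum requires an elasticity below -1")
    return elasticity * unit_cost / (1.0 + elasticity)

# (estimated price elasticity, unit purchase cost in $) as reported in the paper
products = {
    "phonics traveler": (-2.02, 18.0),
    "family game center": (-1.95, 11.0),
    "headset walkie-talkie": (-3.16, 11.0),
}

for name, (eta, cost) in products.items():
    price = optimal_price(eta, cost)
    print(f"{name}: optimal price = {price:.2f}, margin = {price - cost:.2f}")
```

Rounded to cents, these reproduce the prices quoted in the text for the three products.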


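Table 1: Summary of the stores used in the experiment classified into clusters

| Cluster | Store | Opening Year | Year To Date Sales ($ '000) | % Sales from category 1 | % Sales from category 2 | % Sales from category 3 | % Sales from category 4 | % Sales from category 5 |
|---|---|---|---|---|---|---|---|---|
| 1 | 102 | 92 | 680.8 | 12.1 | 5.7 | 8.4 | 8.0 | 8.3 |
| 1 | 204 | 94 | 514.1 | 12.4 | 4.6 | 7.8 | 9.4 | 8.9 |
| 1 | 103 | 93 | 610.5 | 11.2 | 5.2 | 8.8 | 8.5 | 8.2 |
| 2 | 108 | 96 | 481.7 | 9.7 | 12.0 | 7.3 | 9.9 | 10.6 |
| 2 | 402 | 94 | 477.9 | 8.7 | 15.2 | 6.7 | 9.6 | 9.8 |
| 2 | 105 | 94 | 492.2 | 8.1 | 12.3 | 6.1 | 9.9 | 9.6 |
| 3 | 302 | 94 | 432.6 | 9.6 | 6.0 | 8.4 | 10.2 | 8.6 |
| 3 | 303 | 95 | 434.1 | 10.7 | 6.3 | 7.8 | 11.4 | 8.2 |
| 3 | 205 | 95 | 466.2 | 10.6 | 5.7 | 7.7 | 10.3 | 9.7 |
| 4 | 325 | 96 | 561.8 | 10.7 | 6.0 | 7.7 | 8.6 | 9.2 |
| 4 | 526 | 96 | 556.1 | 10.9 | 6.5 | 8.3 | 8.6 | 9.9 |
| 4 | 401 | 94 | 644.1 | 11.1 | 6.7 | 7.3 | 7.6 | 8.6 |
| 5 | 107 | 94 | 523.5 | 10.2 | 4.3 | 6.1 | 11.2 | 12.0 |
| 5 | 527 | 96 | 441.8 | 11.1 | 5.8 | 6.1 | 11.1 | 13.2 |
| 5 | 326 | 96 | 438.1 | 10.3 | 5.3 | 7.3 | 10.5 | 12.6 |
| 6 | 110 | 95 | 595.8 | 11.5 | 9.0 | 7.5 | 8.2 | 7.3 |
| 6 | 504 | 96 | 553.3 | 12.3 | 9.4 | 8.3 | 8.9 | 6.3 |
| 6 | 503 | 96 | 547 | 12.9 | 9.9 | 7.8 | 8.6 | 6.3 |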

Table 2: Summary of products and price-points used in the experiment

| Product | Low price ($) | Medium price (existing list price) ($) | High price ($) | Purchase cost ($) |
|---|---|---|---|---|
| A: Family Game Center | 19.99 | 24.99 | 29.99 | 11 |
| B: Phonics Traveler | 24.99 | 29.99 | 34.99 | 18 |
| C: Headset Walkie-Talkie | 14.99 | 19.99 | 24.99 | 11 |


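Table 3: Experiment layout showing the random assignment of stores in each cluster to price-points for each product

(a) Family Game Center

| Cluster | Store at $19.99 | Store at $24.99 | Store at $29.99 |
|---|---|---|---|
| 1 | 204 | 103 | 102 |
| 2 | 108 | 105 | 402 |
| 3 | 302 | 205 | 303 |
| 4 | 325 | 401 | 526 |
| 5 | 107 | 326 | 527 |
| 6 | 504 | 503 | 110 |

(b) Phonics Traveler

| Cluster | Store at $24.99 | Store at $29.99 | Store at $34.99 |
|---|---|---|---|
| 1 | 102 | 103 | 204 |
| 2 | 108 | 105 | 402 |
| 3 | 303 | 205 | 302 |
| 4 | 526 | 401 | 325 |
| 5 | 107 | 326 | 527 |
| 6 | 504 | 503 | 110 |

(c) Headset Walkie-Talkie

| Cluster | Store at $14.99 | Store at $19.99 | Store at $24.99 |
|---|---|---|---|
| 1 | 102 | 103 | 204 |
| 2 | 108 | 105 | 402 |
| 3 | 303 | 205 | 302 |
| 4 | 325 | 401 | 526 |
| 5 | 527 | 326 | 107 |
| 6 | 504 | 503 | 110 |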


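Table 4: Total sales recorded for each product in each store

| Cluster | Store | Family Game Center: Price ($) | Family Game Center: Total Unit Sales | Phonics Traveler: Price ($) | Phonics Traveler: Total Unit Sales | Headset Walkie Talkie: Price ($) | Headset Walkie Talkie: Total Unit Sales | Average number of transactions per week |
|---|---|---|---|---|---|---|---|---|
| 1 | 102 | 29.99 | 0 | 24.99 | 6 | 14.99 | 15 | 1954.83 |
| 1 | 204 | 19.99 | 2 | 34.99 | 0 | 24.99 | 5 | 1780.67 |
| 1 | 103 | 24.99 | 1 | 29.99 | 3 | 19.99 | 16 | 1741.50 |
| 2 | 108 | 19.99 | 2 | 24.99 | 3 | 14.99 | 8 | 1485.33 |
| 2 | 402 | 29.99 | 0 | 34.99 | 6 | 24.99 | 6 | 1355.50 |
| 2 | 105 | 24.99 | 1 | 29.99 | 2 | 19.99 | 18 | 1714.00 |
| 3 | 302 | 19.99 | 0 | 34.99 | 2 | 24.99 | 12 | 1294.17 |
| 3 | 303 | 29.99 | 0 | 24.99 | 6 | 14.99 | 4 | 1263.33 |
| 3 | 205 | 24.99 | 0 | 29.99 | 2 | 19.99 | 10 | 1343.00 |
| 4 | 325 | 19.99 | 2 | 34.99 | 0 | 14.99 | 9 | 1911.33 |
| 4 | 526 | 29.99 | 0 | 24.99 | 12 | 24.99 | 6 | 1872.17 |
| 4 | 401 | 24.99 | 3 | 29.99 | 10 | 19.99 | 9 | 1916.67 |
| 5 | 107 | 19.99 | 1 | 24.99 | 5 | 24.99 | 5 | 1795.00 |
| 5 | 527 | 29.99 | 1 | 34.99 | 1 | 14.99 | 6 | 1266.17 |
| 5 | 326 | 24.99 | 0 | 29.99 | 5 | 19.99 | 8 | 1328.33 |
| 6 | 110 | 29.99 | 2 | 34.99 | 6 | 24.99 | 2 | 1694.33 |
| 6 | 504 | 19.99 | 0 | 24.99 | 1 | 14.99 | 5 | 1424.67 |
| 6 | 503 | 24.99 | 0 | 29.99 | 4 | 19.99 | 13 | 2019.17 |
| Total Sales | | 19.99 | 7 | 24.99 | 33 | 14.99 | 47 | |
| Total Sales | | 24.99 | 5 | 29.99 | 26 | 19.99 | 74 | |
| Total Sales | | 29.99 | 3 | 34.99 | 15 | 24.99 | 36 | |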


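Table 5: Maximum likelihood estimates of the effects of price, cluster, and store size on unit sales

ML Estimates for Poisson Regression Model

| Product | Coefficient | Estimate | Standard Error | t-Statistic | p-value |
|---|---|---|---|---|---|
| Family Game Center | Low price | 0.3277 | 0.5858 | 0.5594 | 0.2880 |
| Family Game Center | High price | -0.5249 | 0.7306 | -0.7184 | 0.2363 |
| Family Game Center | Cluster 1 | -0.2940 | 2.6467 | -0.1111 | 0.4558 |
| Family Game Center | Cluster 2 | -0.3317 | 2.5820 | -0.1285 | 0.4489 |
| Family Game Center | Cluster 3 | -14.1577 | 444.5660 | -0.0318 | 0.4873 |
| Family Game Center | Cluster 4 | 0.2391 | 2.6643 | 0.0897 | 0.4643 |
| Family Game Center | Cluster 5 | -0.7304 | 2.6288 | -0.2778 | 0.3906 |
| Family Game Center | Cluster 6 | -0.7086 | 2.6677 | -0.2656 | 0.3953 |
| Family Game Center | Scale | -0.2049 | 0.3525 | -0.5812 | 0.2805 |
| Phonics Traveler | Low price | 0.1858 | 0.2629 | 0.7065 | 0.2399 |
| Phonics Traveler | High price | -0.6323 | 0.3248 | -1.9465 | 0.0258 |
| Phonics Traveler | Cluster 1 | 5.4954 | 0.9818 | 5.5972 | 0.0000 |
| Phonics Traveler | Cluster 2 | 5.4885 | 0.9360 | 5.8635 | 0.0000 |
| Phonics Traveler | Cluster 3 | 5.2852 | 0.9293 | 5.6873 | 0.0000 |
| Phonics Traveler | Cluster 4 | 6.5071 | 0.9775 | 6.6569 | 0.0000 |
| Phonics Traveler | Cluster 5 | 5.5630 | 0.9570 | 5.8130 | 0.0000 |
| Phonics Traveler | Cluster 6 | 5.6776 | 0.9794 | 5.7969 | 0.0000 |
| Phonics Traveler | Scale | -0.8609 | 0.1337 | -6.4409 | 0.0000 |
| Headset Walkie-Talkie | Low price | -0.5088 | 0.1868 | -2.7238 | 0.0032 |
| Headset Walkie-Talkie | High price | -0.7484 | 0.2017 | -3.7098 | 0.0001 |
| Headset Walkie-Talkie | Cluster 1 | 6.9388 | 0.6362 | 10.9069 | 0.0000 |
| Headset Walkie-Talkie | Cluster 2 | 6.6475 | 0.6172 | 10.7704 | 0.0000 |
| Headset Walkie-Talkie | Cluster 3 | 6.3334 | 0.6152 | 10.2956 | 0.0000 |
| Headset Walkie-Talkie | Cluster 4 | 6.6708 | 0.6690 | 9.9713 | 0.0000 |
| Headset Walkie-Talkie | Cluster 5 | 6.1491 | 0.6454 | 9.5271 | 0.0000 |
| Headset Walkie-Talkie | Cluster 6 | 6.4219 | 0.6589 | 9.7464 | 0.0000 |
| Headset Walkie-Talkie | Scale | -0.8306 | 0.0894 | -9.2852 | 0.0000 |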


Figure 1: Plot of total sales of each product at each price-point showing how demand varies with price

[Line plot. Horizontal axis: Price (Low, Medium, High). Vertical axis: Unit Sales (0 to 80). Series: Family Game Center, Phonics Traveler, Headset Walkie Talkie.]
