ESSAYS IN ADVERTISING MESSAGES, MASS MEDIA, AND
PRODUCT POSITIONING
by
Jun Bum Kwon
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Management University of Toronto
© Copyright by Jun Bum Kwon 2017
Essays in Advertising Messages, Mass Media,
and Product Positioning
Jun Bum Kwon
Doctor of Philosophy
Graduate Department of Management
University of Toronto
2017
Abstract
In my dissertation, I apply emerging big data methodologies to measure the effects of
advertising content and detect potential product segments. First, I explore whether advertising
messages can change the topics reported in mass media. Specifically, I examine whether Dove’s
real beauty campaign increased the incidence of real-beauty-related topics in newspapers. Using
a topic model, I segment beauty-related topics and identify topics related to real beauty. The
number of sentences labeled as real beauty topics increases during the campaign. While the Dove
campaign’s significant impact on real beauty topics around the time of the campaign holds even
in newspapers without Unilever ads, the impact is larger in newspapers containing Unilever ads.
Overall, this evidence is consistent with both the mass media’s public service role and
advertiser pressure influencing mass media content.
In my next study, joint work with Avi Goldfarb and Trevor Snider, I introduce a method
for identifying potentially related products using topological data analysis (TDA). From both
simulated and real consumer purchase data, I show that “loopy segments” in TDA can connect
regionally separated local products through national products, while standard clustering methods
such as hierarchical clustering cannot.
Lastly, I test whether comparative advertising repositions rival brands closer together.
Using Google Trends’ aggregate consumer search data, I analyze Samsung’s U.S. television
comparative advertising campaign against Apple’s iPhone. I count co-occurrences of searches for
brand pairs (e.g., Samsung Apple) and for brand-attribute pairs (e.g., Samsung screen, Apple
screen) to map brand space and product space, respectively. I find that the advertised rival brands
move closer together in both brand and product spaces, although the change is not statistically
significant in product space. My results suggest that a lower-share brand may benefit from
comparative advertising against a market leader by prompting consumers to consider it alongside
the market leader when they search for brands.
Acknowledgments
“In their hearts humans plan their course, but the Lord establishes their steps” (Proverbs 16:9)
Thanks, God, for giving me peace during my unpredictable long journey. Each step was full of
opportunities and challenges. You have raised me up in the middle of my struggles. You have
taught me, even in very dark periods, how I can look on the bright side, say “thank you” to
everyone around me, and stand up to face the challenges with bravery.
I can’t say thank you enough to my parents. My father has the special talent of encouraging
people. He is my role model. He has always told me to share what I have learned and help others,
especially the poor. My mom has always worked more than I could imagine. She did almost
everything for my father, my sisters, me, and even her grandsons. I know well that she is always
praying for our whole family, including me. I have always felt her sincere support, even though
we have lived on different continents for the past 9 years. I would also like to express special
thanks to my two sisters, who have supported my parents while I have been studying.
My current PhD advisor, Avi, has been an excellent mentor in every step from the research idea,
empirical strategy, and data analysis to the written, visual, and verbal communications. He has
the talent of being able to look at both the big picture and the details. One of his unique teachings
is deciding when to stop. It is always painful to stop and start again from scratch. However,
saying “good-bye” has opened other opportunities, has broadened my horizons, and has made me
more objective in my own work. In the future, I will be conducting many different research
and life projects. In any project, the lessons learned from Avi will apply.
My next special thanks go to my dissertation committee, Andrew and Ron. Their comments have
been useful in improving my dissertation, my job interviews, and my job talk. Sometimes, they
were more serious about my research than I was. I also learned a lot from my dissertation exam
committee. Scott, from Dartmouth College, gave me comprehensive comments for all 3 of my
essays. By addressing his questions, I had the opportunity to think more about the underlying
fundamental issues in my thesis. Nitin and Matt have also provided me with critical feedback for
my papers.
Several other professors have also highly impacted my overall research agenda and approach
through their classes and advice. My former advisor, Purush, in Milwaukee, showed me the
excitement of empirical work in the marketing field. Pradeep, in Chicago, taught overall
quantitative marketing modeling. Greg, in Ohio, taught several ground-breaking marketing
models with emphasis on Bayesian methods. Sridhar, in Toronto, has always explained complex
theoretical problems easily and intuitively. Victor, in Toronto, taught a variety of empirical
models with micro-economic foundations. Although I did not use their toolboxes much in my
dissertation, their perspectives and methods on marketing problems will impact my current and
future research.
My PhD life has not been determined solely by my own research. Community, indeed, matters.
We had deep, intellectual interactions among faculty and PhD students in research seminars.
PhD students in Toronto always helped each other by travelling to conferences together and
sharing all kinds of information. In Milwaukee, my PhD colleagues and I participated in special
collaborations: commuting to Chicago to attend Pradeep’s class and taking Greg’s early morning
Skype class. Most of these things could not have been realized without my PhD colleagues.
Last, but not least, I am thankful to my own family. Since my departure from South Korea 9
years ago, my family has doubled. I strongly believe that my family is the best gift from God.
With my family, I have learned how to be content whatever the circumstances (Philippians
4:11). Toronto, one of the most diverse and dynamic cities, has broadened my perspectives
intellectually, socially, mentally, and spiritually. I interact with diverse people in my school, my
U of Toronto family apartment, and my church communities daily. Furthermore, Toronto has
provided my family with a variety of opportunities. My sons, Junyoung and Jaeyoung, have
many friends from many different nations. Motivated by current trends in big data and machine
learning in Toronto, my wife, Eunkyung, started learning about big data tools by building her
expertise in computer science. I believe that God will keep helping and guiding me and my
family in our next city, Sydney, Australia.
Table of Contents
Acknowledgments
Table of Contents
List of Tables
List of Figures
1.1 Introduction
1.2 Literature
1.2.1 Advertising effectiveness
1.2.2. Text Analysis
1.2.3. Social issue marketing effectiveness
1.2.4. Advertiser Pressure
1.3. Data
1.3.1. Dove Real Beauty campaigns
1.3.2. Newspapers
1.3.3. Text data pre-processing
1.3.4. Data evidence from keyword-level analysis
1.4 Estimation and Result
1.4.1 Empirical Strategy
1.4.2 Topic extraction
1.4.3 Testing
1.4.4 Mechanism
1.4.4.1 How does the advertising message affect the content of a newspaper?
1.4.4.2 Social issue advertising and the mass media’s public goal
1.4.4.3 Advertiser pressure
1.5. Conclusion
Chapter 2
Detecting potential product segments using topological data analysis
2.1 Introduction
2.2. TDA methodology
2.2.1. Vietoris-Rips Complex
2.2.2. Clustering distinctly grouped data (Cases 1 and 2)
2.2.3. Homology groups, Betti numbers, and loopy segments
2.2.4. A loopy segment in a two dimensional plane (contrasting Cases 3 and 4)
2.2.5. Interval length of a loopy segment: Persistent homology (Cases 3 and 5)
2.2.6. Connecting loopy segments (Cases 6 and 7)
2.2.7. Voids in three dimensional space (Cases 8 and 9)
2.3. Simulation study
2.3.1. Simulation study procedure
2.3.2. Simulation study results
2.4. Marketing application
2.4.1. Data and computation time
2.4.2. Potential competitors within a category
2.4.3. Potentially related products across categories
2.4.4. Relationship between a segment’s birth and its product diversity
2.5. Conclusions
3.1 Introduction
3.3.2. Main results using direct approach in brand space
Appendices
List of Tables
Table 1.1 Dove’s Real Beauty campaign roll-out across countries
Table 1.2 ProQuest queries used for extracting beauty articles by a country
Table 1.3 Summary Statistics
Table 1.3.1 Sentences on social issue (i.e. real beauty) in newspapers
Table 1.4 The number of newspaper sentences mentioning ‘Real Beauty’ increases
insignificantly relative to that without mentioning ‘Real Beauty’ in the treated countries
compared to control countries during the campaigns.
Table 1.5 Monthly top 50 words trend.
Table 1.5d Difference in mean of word frequency between during and non-during the
campaign
Table 1.6 The optimum number of topics based on log Bayes factor over the null one-topic
model
Table 1.7a Beauty topics
Table 1.8a Content words in the Dove advertising campaign for Real Beauty
Table 1.8b ‘Social change’-related words
Table 1.8c ‘Beauty service or product’-related words
Table 1.8d ‘Beauty contest’-related words
Table 1.8e Movie related words
Table 1.8f There are 1 or 2 real-beauty-related topics in each country except New Zealand.
Table 1.9a The number of sentences labeled as real beauty topics increases relative to that as
other beauty topics in the treated countries during the month(s) of the real beauty campaign.
Table 1.9b Falsification test: The number of sentences labeled as real beauty topics does not
increase relative to that as other beauty topics in the control countries during the month(s) of
the real beauty campaign.
Table 1.9c The number of sentences labeled as real beauty topics increases relative to that as
other beauty topics in the treated countries relative to control countries during and one
month after the Real Beauty campaign.
Table 1.10 The significant impact of the campaign on real beauty-related topics are not driven
by reporting the Dove campaign.
Table 1.11a Rising social or cultural change words within real beauty topics during the
campaigns
Table 1.11b Rising opposite words to physical beauty within real beauty topics during the
campaigns
Table 1.12 The significant impact of the campaign on real beauty-related topics is even in U.S.
Table 2.1 TDA cases
Table 2.2: Community detection methods in Scenario 2 in the simulation study
Table 2.3 TDA results by the top N products in each market
Table 2.4a TDA for salty snacks
Table 2.5: Community detection methods for salty snacks using IRI data
Table 2.4b TDA for beers
Table 2.4c TDA for the combined data
Table 2.6 The relationship between a segment’s “birth” filtration value and its product
diversity
Table 3.1 Google trend queries used for extracting brand search trend
Table 3.2 Google trend queries used for extracting brands co-search trend with Apple
Table 3.3 U.S. smartphone market share from March to May 2011
Table 3.4 Google trend queries used for extracting brand-product attribute trend for Apple
Table 3.5 Apple becomes closer to Samsung than the other brands during and after the
campaign in brand space.
Table 3.6 Growth rate & change of co-search between each attribute and its brand
Table 3.7 Difference in Difference for co-searches between brands and their attributes
Table 3.8 Apple becomes closer to Samsung than the other brands but insignificantly during
and after the campaign in product space.
Table A1 The number of sentences labeled as real beauty topics increases relative to that as
other beauty topics in the treated countries during the month(s) of the real beauty
campaign
Table A2 The number of sentences labeled as real beauty topics increases relative to that as
other beauty topics in the treated countries relative to control countries during and one
month after the Real Beauty campaign.
List of Figures
Figure 1.1 Billboard advertising on Dove campaign for Real Beauty
Figure 1.2 Newspapers more often use words related to social or cultural change in beauty
sentences during the Dove campaign for Real Beauty.
Figure 1.3 Trend of beauty topics
Figure 2.1a Distinctly grouped data
Figure 2.1b A loopy segment
Figure 2.2 TDA examples with two customers (Case 1-7) or three customers (Case 8 and 9)
Figure 2.2.1 Case 1 Two segments
Figure 2.2.2 Case 2 Tetragon
Figure 2.2.3 Case 3 Square loopy segment
Figure 2.2.4 Case 4 Center point within square
Figure 2.2.5 Case 5 Rectangle loopy segment
Figure 2.2.6 Case 6 Distant two loopy segments
Figure 2.2.7 Case 7 Neighboring two loopy segments with one connection
Figure 2.2.8 Case 8 Tetrahedron
Figure 2.2.9 Case 9 Octahedron with void
Figure 2.3a 5 steps for simulation study
Figure 2.3b True segments in simulation study
Figure 2.4: TDA barcode chart for simulation study
Figure 2.5 Hierarchical clustering
Figure 2.6: Potentially competing products across segments using IRI data
Figure 2.7 Hierarchical clustering for salty snacks using IRI data
Figure 2.8b: Potentially related products across segments using IRI data with order of
connection
Figure 3.1 Search volume trend by a brand
Figure 3.2 Apple’s top rival brands trend
Figure 3.3 Co-search trend by a brand pair with Apple
Figure 3.4 Market-structure map in brand space
Figure 3.5 Samsung moves closer to Apple during campaign in brand space.
Figure 3.6(a) Co-searches between brands and advertised attributes increase in the first week
of the comparative advertising campaign.
Figure 3.6(b) Co-searches between brands and unadvertised attributes do not change much in
the first week of the comparative advertising campaign.
Figure 3.7 Market-structure map in product space
Figure 3.8 Samsung moves closer to Apple during campaign in product space.
Chapter 1
Can an advertising message impact the content of mass media?
An examination of the Dove campaign for Real Beauty
1.1 Introduction
Advertising can have a society-wide impact (Kotler and Zaltman 1971; Andreasen 1995; Kotler,
Roberto, and Lee 2002; Kotler and Lee 2007). For example, De Beers changed Western marriage
culture with its “A diamond is forever” campaign in 1948; before the campaign, diamond rings
were not synonymous with marriage or engagement (Connolly 2011 in BBC News). While most
research focuses on how advertising affects brand recall (Jain and Hackleman 1978; Alba and
Amitava 1986), brand search (Joo, Wilbur, Cowgill and Zhu, 2015; Liaukonyte, Teixeira and
Wilbur, 2015; Hu, Du, and Damangir, 2014; Du, Hu, and Damangir, 2015; Srinivasan, Rutz,
Pauwels 2015), brand equity (Sriram, Balachander, and Kalwani 2007; Borkovsky, Goldfarb,
Haviv, and Moorthy forthcoming), and market outcomes (Leone 1995; Erickson and Jacobson
1992; Onishi and Manchanda 2012; Gopinath, Thomas, Krishnamurthi 2014; Pauwels, Stacey,
Lackman 2013; Dinner, Van Heerde, and Neslin 2014, Srinivasan, Rutz, Pauwels 2015), it has
been challenging to measure the impact of advertising on society in general or the media in
particular. In this study, we examine whether an advertising campaign can affect mass media
reporting on a social issue.
Social issue advertising informs the public about a social issue or influences their behavior
(Truss, French, Blair-Stevens 2010). It can be a powerful tool to impact the community by
triggering or accelerating social or cultural change (Kotler, Roberto, and Lee 2002; Kotler and
Lee 2007; Omid and Pete 2015). Governments often use public service announcements to
promote causes and activities that are generally considered socially desirable (Garbett 1981).
Public service announcements cover a variety of social problems, including racism, drug abuse,
drinking, driving, child abuse, and illiteracy (Murry, Stam, and Lastovicka 1996), often relying
on donated rather than paid media (Vingilis and Coultes 1990).
Figure 1.1 Billboard advertising on Dove campaign for Real Beauty
Figure 1.1a The first series of ads
Content words mentioned in the ads:
- oversized? outstanding? Does true beauty only squeeze into a size 6? Join the beauty debate.
- fat? fit? Does true beauty only squeeze into a size 6? Join the beauty debate.
- flat? flattering? Can you be sexy without being busty? Join the beauty debate.
- flawed? flawless? Is beautiful skin only ever spotless? Join the beauty debate.
- grey? gorgeous? Why can’t more women feel glad to be grey? Join the beauty debate.
- ugly spots? beauty spots? Does skin really have to be flawless to be beautiful? Join the beauty debate.
- wrinkled? wonderful? Will society ever accept ‘old’ can be beautiful? Join the beauty debate.
Figure 1.1b The second series of ads
Content words that describe the ads: “featuring six real women with real bodies and real curves” (Dove
website 2015)
The second picture was captured from CBS’s The Early Show, August 18, 2005, 9:59 AM.
Private companies have also started to use social issue advertising, or cause marketing, as a form
of corporate social responsibility (CSR). For example, American Express’s “Small Business
Saturday” campaign encourages consumers to shop locally to boost small businesses. Figure 1.1
shows images from the Dove campaign for Real Beauty, another example of social issue
advertising, which aims to challenge beauty stereotypes, especially in media; to widen the
definition of beauty; and thus to make beauty a source of confidence, not anxiety, especially for
girls (Kolstad 2007; Dove website 2016). Currently, such empowering ads (e.g., Dove’s “Dove Real
Beauty Sketches” and P&G Always brand’s “Always #LikeAGirl”) are popular on YouTube:
the top 10 empowering ads were two-and-a-half times less likely to be skipped than other ads in
similar categories over the three years from 2013 to 2015 (Wojcicki 2016).
Figure 1.2 Newspapers more often use words related to social or cultural change in beauty sentences
during the Dove campaign for Real Beauty.
Panels: US, Canada, UK.
The y-axis measures the ratio of the number of focal word(s) to the number of beauty sentences.
See Table 10a for the list of words on social or cultural change, including words opposite to physical beauty.
Campaign periods: U.S. (first: Sep. and Oct. 2004; second: July and Aug. 2005), U.K. (Jan. 2005),
Canada (first: Sep. and Oct. 2004; second: June and July 2005).
Given that the mass media affects and reflects modern culture and society (McQuail 2010), in
this study, we investigate whether an advertising message can change the content of mass media
and why. Specifically, we measure the change in topics related to beauty covered in newspapers
before, during, and after the Dove campaign for Real Beauty. Figure 1.2 provides some
motivating analysis. It shows that, during the campaigns, the frequency of the phrase “real beauty”
and the frequency of words related to real beauty or to social or cultural change (e.g., change,
question, traditional, culture, real, mind, brain, as identified by research assistants with graduate
training in sociology) rose substantially.
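
The keyword measure described above (the number of focal words divided by the number of beauty sentences) can be sketched as follows. This is a minimal illustration, not the code used in this study: the function name `focal_word_ratio`, the tokenization rule, and the toy sentences are assumptions, and the focal word set simply reuses the example words listed above.

```python
import re
from collections import defaultdict

# Example focal words taken from the text; the full list used in the study is longer.
FOCAL_WORDS = {"change", "question", "traditional", "culture", "real", "mind", "brain"}


def focal_word_ratio(sentences, focal_words=FOCAL_WORDS):
    """Return {month: focal-word count / number of beauty sentences}.

    `sentences` is an iterable of (month, text) pairs; every text is
    assumed to be a beauty-related sentence already.
    """
    focal_counts = defaultdict(int)
    sentence_counts = defaultdict(int)
    for month, text in sentences:
        # Simple lowercase word tokenization (an assumption for this sketch).
        tokens = re.findall(r"[a-z']+", text.lower())
        focal_counts[month] += sum(token in focal_words for token in tokens)
        sentence_counts[month] += 1
    return {m: focal_counts[m] / sentence_counts[m] for m in sentence_counts}


# Toy sentences invented for illustration only.
toy_sentences = [
    ("2004-08", "The salon offers a new beauty treatment."),
    ("2004-09", "Real beauty may change how culture defines women."),
    ("2004-09", "Beauty products sold well this month."),
]
ratios = focal_word_ratio(toy_sentences)
```

Plotting such monthly ratios by country would give a series comparable in spirit to Figure 1.2, although the exact counting rules behind that figure may differ.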
Such keyword-based analysis suggests that Dove’s real beauty campaign affected media content.
However, this analysis has two potential issues. First, the choice of real-beauty-related
words is not systematic. Second, there is no control group, so real beauty words might have
risen during the campaign for reasons unrelated to it.
To address the first point, how can we identify topics related to advertising messages in
newspaper articles? There are two challenges in collecting and analyzing newspaper content about
an advertising message. First, unlike advertising titles, advertising messages cannot be summarized
with a few keywords. For example, articles on the campaign slogan (i.e., real beauty) may not
represent all relevant content because some articles may discuss real beauty without using the
phrase “real beauty”. In order not to lose relevant content, one needs to collect articles with
somewhat broad terms (e.g., beauty), and then extract topics related to “real beauty”.
Second, the commonly used aggregate-level keyword analysis may not fully capture the message of a
relatively small or emerging topic. If there are several themes (e.g., beauty services or
products, movies, real beauty) around the search query (e.g., beauty), it is hard to detect the
change in the focal topic (e.g., real beauty) at the aggregate level. At most, only the campaign
title words (e.g., Dove, ad, campaign, real, beauty), which are more frequently reported, can be
easily found. By contrast, words related to less frequent advertising messages may not be captured
well.
To address the above challenges, we segmented beauty sentences into several groups based on
common topics. Topic models (Blei et al., 2003; Taddy 2012; Tirunillai and Tellis 2014;
Büschken and Allenby 2016) assume that the words in a text are generated from a mixture of latent
topics. Each extracted topic is defined by a collection of co-occurring words with a relatively
high probability of usage (Büschken and Allenby 2016). Such segmentation gave us the power to
detect relatively small topics. Examining the top words within a segment (i.e., topic) allowed us
to identify whether topics related to an advertising message exist in newspapers.
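
To make the idea of labeling sentences by topic concrete, the sketch below assigns each sentence to the topic under which its words are most likely and lists a topic's top words. It is a deliberately simplified stand-in, not the topic model estimated in this study: the topic-word probabilities in `TOPICS`, the `SMOOTHING` constant, and the function names are all hypothetical, and a fitted topic model would estimate the probabilities from the corpus rather than take them as given.

```python
import math
import re

# Hypothetical topic-word probabilities; a fitted topic model would supply these.
TOPICS = {
    "real_beauty": {"real": 0.3, "women": 0.3, "change": 0.2, "beauty": 0.2},
    "beauty_products": {"salon": 0.3, "cream": 0.3, "skin": 0.2, "beauty": 0.2},
}
SMOOTHING = 1e-3  # assumed probability for words that a topic does not list


def dominant_topic(sentence, topics=TOPICS):
    """Label a sentence with the topic giving its words the highest log-likelihood."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    scores = {
        name: sum(math.log(word_probs.get(tok, SMOOTHING)) for tok in tokens)
        for name, word_probs in topics.items()
    }
    return max(scores, key=scores.get)


def top_words(topic_name, n=3, topics=TOPICS):
    """Return the n most probable words of a topic, used to interpret the topic."""
    word_probs = topics[topic_name]
    return sorted(word_probs, key=word_probs.get, reverse=True)[:n]


# Toy sentences invented for illustration only.
labels = [
    dominant_topic(s)
    for s in ("Real women change beauty.", "The salon sells skin cream.")
]
```

Counting the sentences that receive each label by month and country would then yield topic-level series of the kind analyzed in the remainder of the chapter.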
To address the second issue, the lack of a control group, we exploited the rollout of the global
advertising campaign across several countries. We tested whether Dove’s real beauty
campaigns in the U.S., Canada, and the U.K. increased the incidence of real-beauty-related
topics in newspapers relative to other beauty topics, comparing the three treated countries with
two control countries (i.e., Australia and New Zealand), where the campaign started later.
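
The logic of comparing treated and control countries can be illustrated with a simple two-by-two difference-in-differences calculation on the share of real-beauty sentences. This is only a sketch under stated assumptions: the counts are invented, the function names are hypothetical, and the empirical specification used later in the chapter may differ from this simple comparison of shares.

```python
def real_beauty_share(real_count, other_count):
    """Share of beauty sentences that are labeled as real beauty topics."""
    return real_count / (real_count + other_count)


def diff_in_diff(treated_before, treated_during, control_before, control_during):
    """Change in the treated group minus the change in the control group."""
    return (treated_during - treated_before) - (control_during - control_before)


# Invented counts (real-beauty sentences, other-beauty sentences), illustration only.
treated_before = real_beauty_share(100, 900)
treated_during = real_beauty_share(180, 820)
control_before = real_beauty_share(100, 900)
control_during = real_beauty_share(120, 880)

# Treated share rises by about 0.08 and control share by about 0.02,
# so the difference-in-differences estimate is about 0.06.
effect = diff_in_diff(treated_before, treated_during, control_before, control_during)
```

Subtracting the control-country change removes any rise in real beauty coverage that would have occurred without the campaign, provided that treated and control countries would otherwise have followed similar trends.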
Utilizing the topic model proposed by Taddy (2012), we grouped all of the beauty sentences into
8 to 10 beauty topics in each analyzed country, including one or two topics related to real beauty.
The number of sentences labeled as real beauty topics increases relative to the number of
sentences labeled as other beauty topics in treated countries relative to control countries during
the campaign and one month after it. This is not driven only by reporting on the Dove campaign:
The significant impact on real beauty topics holds even after all the articles that mentioned Dove
in any sentence were excluded. Furthermore, many words related to social or cultural change in
real beauty topics were more often used during the campaigns. Overall, these results suggest that
advertising can affect the topics covered by mass media. While the Dove campaign’s significant
impact on real beauty topics around the time of the campaign holds even in newspapers without
Unilever ads, the impact is larger in newspapers containing Unilever ads.
Overall, this evidence is consistent with a mass media’s public service role as well as an
advertiser pressure role (Reuter and Zitzewitz 2006; Rinallo and Basuroy, 2009; Reuter, 2009; de
Smet and Vanormelingen 2012; Gurun and Butler 2012; Gambaro and Puglisi 2015; Focke,
Niessen-Ruenzi, and Ruenzi 2016) in media coverage of the Dove campaign. In other words,
media outlets with public as well as economic goals are willing to report and discuss the
messages of such social issue advertising.
Why can advertising messages on social issues affect mass media content? In addition to its goal
to profit, the mass media has a (non-economic) public goal to serve the public interest on desired
social or cultural change (McQuail 2010). Therefore, the mass media is likely to report the
message of social advertising actively. In fact, after interviewing 11 firms, Drumwright (1996)
finds that (1) social issue advertising tends to receive more media coverage than standard
campaigns, and (2) one social issue campaign earned media coverage valued at six times the
expenditure on paid media. Writers in the mass media industry seem to have a similar view. For
example, regarding the Dove campaign, Walker (2005) of the New York Times magazine wrote,
“the more intriguing fact is that it is a marketing campaign—not a political figure or a major
news organization or even a film—that ‘opened a dialogue’” in his essay “Social Lubricant–How
a marketing campaign became the catalyst for a societal debate.” Forbes contributor Dan also
said, “TV commercials are a culturally powerful force, shaping society and giving voice to those
outside the mainstream” (Omid and Pete 2015).
Why is this research question important? First, it is important to understand how marketing
affects society beyond firm performance. Given that mass media affects and reflects modern
culture and society (McQuail 2010), advertising messages can be another tool to change the way
that people think and talk by establishing a link between advertising message and the content of
mass media. Firms can then use advertising as a corporate social responsibility (CSR) activity, in
addition to products (e.g., innovative products, green products, recycling), employees (e.g.,
employing the disabled, providing retirement plans), transparent corporate governance (e.g.,
transparency), and charity to the community (Luo and Bhattacharya 2006; Luo and Bhattacharya
2009; Hull and Rothenberg 2008; Servaes and Tamayo 2013; Mishra and Modi 2016). In her
book The Beauty Myth, Wolf (1991) argues that our culture’s images of beauty are shaped harmfully
by mass media (e.g., TV and women’s magazines) and advertisements. More than a decade later,
in a global study on women and beauty that was commissioned by Dove, more than two-thirds
(68%) of women also strongly agreed that the media and advertising set an unrealistic standard
of beauty that most women can’t ever achieve (Etcoff, Orbach, Scott, and D’Agostino 2004).
Therefore, a change in the way the media describes beauty was one of the Dove campaign’s
main goals (Kolstad 2007).
Second, given that publicity in mass media (Chintagunta, Jiang and Jin 2009; Kalra and Zhang
2011) and the interaction between a firm’s marketing action and publicity (Ching, Clark,
Horstmann, and Lim 2016) increase demand, it is important to understand what kinds of
marketing action can attract the attention of mass media. Recently, Harald, Gijsbrechts, and
Pauwels (2015) found that deep price reductions triggered newspaper coverage of a price war.
Third, given that publicity tends to have higher credibility than advertising (Cameron 1994; Lord
and Putrevu 1993) and more credible sources are viewed as more trustworthy and generate more
attitude change (Hovland and Weiss 1951), brand positioning is likely to be more effective when
a message is delivered through mass media rather than just commercial advertising. Once again,
it is important to understand what advertising message mass media are willing to report. While
several recent studies (Rinallo and Basuroy 2009; de Smet and Vanormelingen 2012; Gambaro
and Puglisi 2015; Reuter and Zitzewitz 2006; Gurun and Butler 2012; Focke, Niessen-Ruenzi,
and Ruenzi 2016) find a relationship between advertising revenue in focal mass media and
media bias, to our knowledge no study has examined the advertising message as a driver of mass
media content.
There are several contributions in our study. First, we contribute to the advertising effectiveness
literature by examining the effect of paid media (i.e., advertising) on earned media. While recent
studies measure the impact of advertising on consumer-generated media such as blogs, social
media, and online forums (Onishi and Manchanda 2012; Gopinath, Thomas, and Krishnamurthi
2014; Pauwels, Stacey, and Lackman 2013; Fossen and Schweidel 2016), we study the mass
media. Second, we explore both the mass media’s public service goal and advertiser pressure as
the drivers of the content of mass media, while most empirical studies have focused only on
advertiser pressure (Reuter and Zitzewitz 2006; Rinallo and Basuroy, 2009; Reuter, 2009; de
Smet and Vanormelingen 2012; Gurun and Butler 2012; Gambaro and Puglisi 2015; Focke,
Niessen-Ruenzi, and Ruenzi 2016). Lastly, we provide a novel marketing application of topic
modeling, which is an unsupervised text-mining method and was recently introduced in
marketing. In our paper, we show that while the commonly used aggregate-level keyword
analysis presents difficulties, the topic model is useful in extracting topics related to advertising
messages from newspapers.
We organize the rest of this chapter as follows. In §1.2, we summarize the related literature. In
§1.3.1, we introduce the Dove campaign for Real Beauty. In §1.3.2 and §1.3.3, we describe our
newspaper data and its pre-processing, then we show some evidence of the campaign’s impact
on newspaper content from the keyword level analysis in §1.3.4. Next, we propose our empirical
strategy in §1.4.1 and show the extracted topics by country in §1.4.2, the main testing results and
robustness check in §1.4.3, and the potential mechanisms in §1.4.4. Finally, we conclude this
study in §1.5.
1.2 Literature
1.2.1 Advertising effectiveness
This paper relates advertising to earned media. In this way, it examines a particular type of
advertising effectiveness. Traditionally, scholars have focused on the direct impact of advertising on
market outcomes (Leone 1995; Erickson and Jacobson 1992) or marketing mix outcomes, such
as product differentiation (Kirmani and Zeithaml 1993) and price premiums (Ailawadi,
Lehmann, and Neslin 2003).
However, advertising may affect consumer purchase indirectly through pre-purchase consumer
behavior (Onishi and Manchanda 2012; Gopinath, Thomas, and Krishnamurthi 2014; Pauwels,
Stacey, and Lackman 2013; Dinner, Van Heerde, and Neslin 2014; Srinivasan, Rutz, and Pauwels 2015).
As consumer activities began to be recorded on websites (e.g., shopping and search), it
became possible to measure marketing effectiveness even in the pre-purchase stage. For
example, recent studies show that consumer searches for brands or products rise with their
television commercials (Joo, Wilbur, Cowgill, and Zhu 2015; Liaukonyte, Teixeira, and Wilbur
2015; Hu, Du, and Damangir 2014; Du, Hu, and Damangir 2015; Srinivasan, Rutz, and Pauwels
2015) and with online display advertising (Lewis and Nguyen 2015).
While consumer search data help understand consumers’ consideration sets, the media plays a
role in increasing awareness of brands or products in the early stage of the consumer purchase
journey. Recent studies show that advertising can increase word-of-mouth in consumer-
generated media such as blogs, social media, and online forums (Onishi and Manchanda 2012;
Gopinath, Thomas, and Krishnamurthi 2014; Pauwels, Stacey, and Lackman 2013; Fossen and
Schweidel 2016).
Although mass media data (e.g., newspaper articles) were publicly available long before either
consumer search or consumer-generated media data, there is still limited literature measuring the
effect of advertising on mass media, perhaps due to the lack of systematic analysis of media
content. Only in the literature on media bias due to advertiser pressure have a few empirical studies
linked advertising expenditure to the length of articles (Rinallo and Basuroy 2009),
frequency of news articles (de Smet and Vanormelingen 2012; Gambaro and Puglisi 2015), and
tone (i.e., sentiment) in articles (Reuter and Zitzewitz 2006; Gurun and Butler 2012; Focke,
Niessen-Ruenzi, and Ruenzi 2016) about advertising firms. Specifically, Rinallo and Basuroy
(2009) show that European and U.S. newspapers and magazines tend to write more about the
products of Italian fashion firms that spend more on advertising in the focal media. Reuter
and Zitzewitz (2006) find that three U.S. personal finance media sources (i.e., Money Magazine,
Kiplinger’s Personal Finance, and Smart Money) are more likely to mention positively mutual
funds backed by higher advertising expenditure. Gurun and Butler (2012) and
Focke, Niessen-Ruenzi, and Ruenzi (2016) show that U.S. newspapers write less critical articles
on heavier advertisers. They measure “news tone” based on the number of negative words, a
method developed in the financial context by Loughran and McDonald (2011).
However, none of the above papers studied the effect of the advertising message, rather than
advertising expenditure, on mass media content. In this paper, using a topic model, we propose a new
approach to measure whether mass media sources report or discuss more about the message or
theme that the advertising campaign intends to deliver. Our approach can be applied to
consumer-generated media as well.
1.2.2. Text Analysis
Most existing studies that use text-mining techniques are based on particular words or phrases
(Gentzkow and Shapiro 2010; Archak, Ghose, and Ipeirotis 2011, Ghose, Ipeirotis, and Li 2012;
Tang and Guo 2013; Gopinath, Thomas, and Krishnamurthi 2014; Pauwels, Stacey, and
Lackman 2013), co-occurrence of pairs of words (Netzer, Feldman, Goldenberg, and Fresko
2012), and sentiment (Sonnier et al. 2011; Tirunillai and Tellis 2012; Gurun and Butler 2012;
Ludwig et al. 2013; Focke, Niessen-Ruenzi, and Ruenzi 2016). Recently, marketing scholars
have started to use “unsupervised” text mining, which is a dimension-reduction technique in
which a large number of documents are summarized into a small number of product attribute
clusters (Lee and Bradlow 2011), principal components (Liu, Vir Singh, and Srinivasan 2016), or
latent topics (Tirunillai and Tellis 2014; Büschken and Allenby 2016). This unsupervised text-
mining has several advantages. First, it exploits full information (i.e., all the words) within each
text when both reducing dimensions and interpreting clusters, components and topics, resulting
in rich context. This is different from the traditional text categorization approach, which uses only
some words (e.g., product attributes, brand pairs, adjectives). Second, it requires much less
human intervention than the traditional approach, which often requires a researcher to decide on
predefined keywords.
There are only a few applications of unsupervised text mining in marketing. For example,
Tirunillai and Tellis (2014) extract quality dimensions (i.e., topics) from product reviews and
also show that the dimensions’ importance varies over time. Liu, Vir Singh, and Srinivasan (2016)
decompose tweets into principal components and then use them to predict demand for TV
programs (i.e., shows and NFL games). Büschken and Allenby (2016) also use topics in hotel
reviews as the predictors of overall satisfaction (i.e., review rating). Our work is similar to that of
Tirunillai and Tellis (2014) in that both studies use time-varying topic trends. They illustrate that
a new product launch or news of bad product performance is likely to affect consumer
satisfaction on the “ease of use” quality dimension but do not test it formally. In our paper, we
test whether content marketing (i.e., advertising messages) can change topics of newspapers (i.e.,
have a qualitative impact on publicity). Each topic consists of a collection of words. This rich
information allows us to identify whether topics related to advertising messages exist in the
editorial content of newspapers.
1.2.3. Social issue marketing effectiveness
Our work also contributes to the literature on social issue marketing effectiveness. The existing
literature has focused on consumer attitude or purchase intention and market outcomes in
assessing the effectiveness of social issue marketing, such as cause marketing (CM) and other
corporate social responsibility (CSR) activities.
Using laboratory experiments, many studies show that respondents have a positive attitude
toward companies that implement CM campaigns and are more willing to buy their products
(Brown and Dacin 1997; Pracejus, Olsen, and Brown 2003; Strahilevitz and Meyers 1998, Chang
2008; Folse, Niedrich, and Grau 2010; Koschate-Fischer, Stefan, and Hoyer 2012; Robinson,
Irmak, and Jayachandran 2012). Recently, Andrews, Luo, Fang and Aspara (2014) found using a
large-scale field experiment that CM increases consumer purchase and thus sales revenue. The
effect is the strongest with moderate rather than deep or absent price discounts. Several studies
also link a firm’s CSR activities (e.g., charity, green products, transparent governance) to its
financial performance (Luo and Bhattacharya 2006; Luo and Bhattacharya 2009; Hull and
Rothenberg 2008; Servaes and Tamayo 2013; Mishra and Modi 2016).
By emphasizing the importance of expanding the metrics to measure CSR effectiveness beyond
firm performance (e.g., market share and financial return), Raghubir, Roberts, Lemon, and Winer
(2010) propose to add community metrics including measures related to societal issues (e.g.,
literacy rate, birth/death rate) and media coverage (e.g., quantity and quality of press impact).
However, media metrics are rarely used as the outcome of social issue marketing in academia,
although they are often used in industry (Drumwright 1996). Interviewing 11 firms about both
their standard and social issue campaigns, she finds that firms want to enhance their image, build
brand equity, and increase sales with their standard campaigns, but they also have public
relations (i.e., media exposure) and cause-related measures (e.g., the number of people actually
engaging in the social behavior) as goals of their social issue campaigns. Informants in her study
often observe more media coverage in social issue campaigns than standard campaigns.
Our work is among the first studies to measure the impact of a social issue campaign on topics
covered in the mass media, which are a qualitative measure of press impact.
1.2.4. Advertiser Pressure
This paper also provides evidence to support theoretical papers’ finding that editorial content can
be affected by advertiser pressure. Given that (1) advertising is the major revenue source of
many media outlets (Stromberg 2004, Mantrala, Naik, Sridhar, and Thorson 2007, Pew Research
Center 2014), and (2) media content affects demand (Chintagunta, Jiang and Jin 2009; Kalra and
Zhang 2011; Ching, Clark, Horstmann, and Lim 2016), advertisers have economic incentives to
influence editorial content (Kerkhof and Münster 2015). There is a growing body of
theoretical literature linking advertiser pressure and media bias across marketing and
economics (Ellman and Germano 2009; Gal-Or, Geylani, and Yildirim 2012; Zhu and Dukes
2015; Spiteri 2015; Blasco, Pin, and Sobbiro 2016).
By estimating both viewer and advertiser demand, Wilbur (2008) shows that advertiser
preferences influence network choices about program genre more strongly than viewer
preferences. Our work also adds evidence of such advertiser pressure to complement the other
empirical papers mentioned in §1.2.1 (Reuter and Zitzewitz 2006; Rinallo and Basuroy,
2009; Reuter, 2009; de Smet and Vanormelingen 2012; Gurun and Butler 2012; Gambaro and
Puglisi 2015; Focke, Niessen-Ruenzi, and Ruenzi 2016). However, we also show that
advertiser pressure is not the only explanation.
1.3. Data
1.3.1. Dove Real Beauty campaigns
We chose the Dove campaign for this study for two reasons. First, social issue campaigns
tend to receive more media coverage than non-social ones (Drumwright 1996). The Dove
campaign is such social advertising. Before the campaign, in a global study commissioned by
Dove, Etcoff, Orbach, Scott, and D’Agostino (2004) found that only 4% of women around the
world would describe themselves as beautiful. After the study, Dove started its campaign
in order to challenge beauty stereotypes, especially in the media, and widen the definition of
beauty.
Table 1.1 Dove’s Real Beauty campaign roll-out across countries
Group       Year   Countries and the months of the campaign
Treatment   2004   Canada, U.S. (September, October)
Treatment   2005   U.K. (January), U.S. (July, August), Canada (August, September)
Control     2006   Australia, New Zealand (April)
Analyzed periods are 2004-2005.
Second, global campaign rollouts allow a natural quasi-experimental setting across countries.
Table 1.1 shows the launching periods of the Dove campaign across five countries. The first
series of ads, known as the “Tick-Box campaign” (Figure 1.1a), was launched in September 2004 in
both Canada and the U.S., and then in January 2005 in the U.K. The ads asked viewers to judge
women’s looks (“oversized or outstanding?” and “wrinkled or wonderful?”) and invited them to cast
their votes at campaignforrealbeauty.com (Dove website 2015). The voting results were updated
in real time on a billboard counter. The second, more iconic campaign was introduced in
both Canada and the U.S. in 2005. As seen in Figure 1.1b, this ad was described as “featuring six real women
with real bodies and real curves” (Dove website 2015). In this study, we analyze those early
campaigns across the three countries. For control countries, we use New Zealand and Australia,
where the Dove campaign was launched later, in 2006. The key identifying assumption is that
month-to-month changes in newspaper topics related to beauty in Australia and New Zealand are
a good control for month-to-month changes in newspaper topics related to beauty in the countries
in which the campaign occurred.
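The identifying comparison can be illustrated with a small Python sketch that computes a difference-in-differences from cell means: the change in real beauty sentences relative to other beauty sentences, in treated relative to control countries, during the campaign relative to other months. All counts below are made-up numbers, and the function and variable names are hypothetical; the sketch is not the estimation used in this chapter.

```python
from statistics import mean

# Made-up monthly sentence counts keyed by (group, topic, period); illustration only.
counts = {
    ("treated", "real", "during"): [30, 34],
    ("treated", "real", "outside"): [20, 22],
    ("treated", "other", "during"): [100, 104],
    ("treated", "other", "outside"): [98, 102],
    ("control", "real", "during"): [11, 13],
    ("control", "real", "outside"): [10, 12],
    ("control", "other", "during"): [50, 54],
    ("control", "other", "outside"): [49, 53],
}

def campaign_change(group, topic):
    """Mean count during the campaign minus the mean count outside it."""
    return mean(counts[(group, topic, "during")]) - mean(counts[(group, topic, "outside")])

def relative_change(group):
    """Change in real beauty topics relative to other beauty topics within a group."""
    return campaign_change(group, "real") - campaign_change(group, "other")

def triple_difference():
    """Relative change in treated countries net of the same change in control countries."""
    return relative_change("treated") - relative_change("control")
```

With these made-up counts, the control countries show no relative change, so the whole relative increase in the treated countries is attributed to the campaign. This attribution is valid only under the identifying assumption stated above.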
1.3.2. Newspapers
As a dependent variable, we need the number of beauty sentences by beauty topic. Our
first step was to collect newspaper articles on beauty from the ProQuest Newsstand database,
which covers many newspapers worldwide, through the libraries of the University of Toronto and
the Auckland University of Technology.
Because we analyze newspaper articles written in English during the analyzed period of 2004 to
2005, the first query we used in the ProQuest Newsstand database was beauty AND
PD(2004-2005) AND LN(English) AND STYPE(Newspapers), where PD, LN, and STYPE mean
publication date, language, and source type, respectively. In order to analyze more relevant
articles, we added the following restrictions to the above query: (1) AND FTANY(yes), (2)
AND (women OR woman OR females OR female OR girls OR girl), and (3) AB(beauty), where
FTANY and AB mean full text and abstract, respectively. We downloaded articles in full text
because we needed to analyze the content. The women-related context words helped filter
articles on women’s beauty. An article with beauty in its abstract is more likely to describe
beauty as a main topic.
Table 1.2 shows the queries we used. To collect articles by country, we used JSU (journal or
publication subject), which contains country information. For example, for the U.K., we added
AND JSU(“Great Britain”) in the above query. We also used the same query for Australia and
New Zealand. In the U.S. data, we found many articles published in Canada, perhaps due to
geographic proximity. Thus, for the U.S., we added the restriction AND CP(“United
States”), where CP means country of publication. For Canada, there is a newspaper database only
for Canada, ProQuest Canadian Newsstand Complete, and we used only JSU(“Canada”). Using
these queries, we downloaded each month’s beauty articles by country.
Table 1.2 ProQuest queries used for extracting beauty articles by country
Country       ProQuest database             Country-specific part of query
Canada        Canadian Newsstand Complete   AND JSU(Canada)
U.S.          Newsstand                     AND JSU("United States") AND CP("United States")
U.K.          Newsstand                     AND JSU("Great Britain")
Australia     Newsstand                     AND JSU(Australia)
New Zealand   Newsstand                     AND JSU("New Zealand")
Common part of query (all countries): AB(beauty) AND PD(200512)
AND (woman OR women OR girl OR girls OR female OR females)
AND STYPE(Newspapers) AND LN(English) AND FTANY(yes)
JSU: journal or publication subject, CP: country of publication, AB: abstract, PD: publication date, STYPE:
source type, LN: language, FTANY: full text.
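The monthly, country-by-country queries can be assembled mechanically from the two parts in Table 1.2. The Python sketch below shows one way to do so. The field codes and query fragments are taken from the table, while the function names, the dictionary layout, and the order in which the two parts are joined are assumptions made for illustration.

```python
# Common part of the query, as listed in Table 1.2 (PD takes a year-month such as 200512).
COMMON_TEMPLATE = (
    "AB(beauty) AND PD({yyyymm}) "
    "AND (woman OR women OR girl OR girls OR female OR females) "
    "AND STYPE(Newspapers) AND LN(English) AND FTANY(yes)"
)

# Country-specific parts, as listed in Table 1.2.
COUNTRY_PART = {
    "Canada": "AND JSU(Canada)",
    "U.S.": 'AND JSU("United States") AND CP("United States")',
    "U.K.": 'AND JSU("Great Britain")',
    "Australia": "AND JSU(Australia)",
    "New Zealand": 'AND JSU("New Zealand")',
}

def build_query(country, year, month):
    """Join the common and country-specific parts for one country and month."""
    common = COMMON_TEMPLATE.format(yyyymm=f"{year}{month:02d}")
    return f"{common} {COUNTRY_PART[country]}"

def monthly_queries(country, years=(2004, 2005)):
    """One query per month of the analyzed period (24 queries for 2004-2005)."""
    return [build_query(country, y, m) for y in years for m in range(1, 13)]
```

For example, `build_query("U.K.", 2005, 12)` yields the common part for December 2005 followed by `AND JSU("Great Britain")`.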
Next, to focus on content directly related to beauty, we selected all the sentences that mention
beauty. The detailed steps to process the text data are described in the next section. In principle, one
could analyze articles that mention beauty in any sentence. However, given that newspaper
articles are quite long compared to tweets and product reviews, some articles that mention
“beauty” may talk about very different topics. As a result, article-level analysis may be noisy,
making it hard to name topics. In contrast, each sentence tends to have just one
topic (Büschken and Allenby 2016). Therefore, finer-grained detection is possible at the sentence level.
Furthermore, sampling only sentences that mention “beauty” also reduces computing
time.
Table 1.3 Summary Statistics
                     U.S.    Canada   U.K.    Australia   New Zealand
Monthly sentences that mention “beauty”
Min                  259.0   92.0     226.0   59.0        4.0
Mean                 320.9   158.0    310.6   100.3       12.3
Median               307.0   154.5    319.5   99.5        11.5
Standard Deviation   55.6    37.3     48.7    25.4        6.0
Max                  506.0   242.0    400.0   146.0       26.0
Monthly sentences that mention “real beauty”
Min                  0.0     0.0      0.0     0.0         0.0
Mean                 2.5     2.0      1.7     1.8         2.0
Median               1.0     1.0      1.0     2.0         1.0
Standard Deviation   4.3     2.2      3.1     1.5         2.2
Max                  19.0    7.0      15.0    7.0         7.0
Table 1.3 shows the number of sentences per month mentioning “beauty” (as defined in Table
1.2) and the number of sentences containing the two-word expression “real beauty”, the
campaign slogan. There are far fewer “real beauty” than “beauty” sentences. “Real beauty”
sentences represent less than 1% of “beauty” sentences in the U.S. During some months, no
“real beauty” article was published. This may suggest that the effect of the Dove campaign on
newspapers was trivial. However, there could have been sentences about real beauty that do not
include the “real beauty” phrase. Table 1.3.1 shows an example. Although both sentences are about
real beauty, only the first mentions the “real beauty” phrase. In this sense, just counting “real beauty”
sentences is a naïve approach that is unlikely to capture all the relevant content.
Table 1.3.1 Sentences on a social issue (i.e., real beauty) in newspapers
Type: With “Real Beauty” keyword
Sentence: The "Campaign for Real Beauty" contest was part of a promotion intended to broaden the traditional definitions of beauty.
Source: The Press Democrat, 19 Aug 2005, Santa Rosa, California
Type: Without “Real Beauty” keyword
Sentence: Too many women in America suffer from eating disorders brought on, in many cases, by cultural pressures to live up to an unrealistic image of ideal beauty.
Source: Deseret News, 23 Mar 2005, Salt Lake City, Utah
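The limitation of phrase matching can be checked directly on the two sentences in Table 1.3.1. The Python sketch below uses a hypothetical helper, `mentions`, to apply a case-insensitive phrase match; it is only an illustration of keyword counting, not the procedure used to build our data.

```python
import re

# The two example sentences from Table 1.3.1.
with_keyword = (
    'The "Campaign for Real Beauty" contest was part of a promotion intended '
    "to broaden the traditional definitions of beauty."
)
without_keyword = (
    "Too many women in America suffer from eating disorders brought on, in many "
    "cases, by cultural pressures to live up to an unrealistic image of ideal beauty."
)

def mentions(phrase, sentence):
    """Case-insensitive match of the whole phrase at word boundaries."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return re.search(pattern, sentence, re.IGNORECASE) is not None
```

Both sentences match the broad term "beauty", but only the first matches "real beauty", so a slogan count would miss the second sentence even though it discusses the same theme.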
1.3.3. Text data pre-processing
Most of the text pre-processing could be done automatically using standard text-mining software.
We pre-processed the newspaper articles using the following steps:
1. Removing URLs and “Full text:”, which is located between the article title and body.
2. Splitting the text into articles using an article identifier (i.e., long line).
3. Identifying articles that mention Dove.
4. Detecting and keeping only beauty sentences that contain “beauty” after splitting articles
into sentences using the R “openNLP” and “qdap” packages.
5. Transforming capital letters into lowercase letters.
6. Collapsing compound words connected by hyphens into one word (i.e., make-up → makeup;
self-esteem, self esteem → selfesteem).
7. Removing all punctuation.
8. Removing stop words using a vocabulary of stop words reserved in the R “tm” package.
9. Collapsing words into a common root using the R “SnowballC” package, which
implements Porter’s (1997) word stemming algorithm.
10. Replacing words with similar-meaning words (i.e., woman → women, said → say,
man → men, advertise → ad, therapist → therapi).
11. Removing all words that appear less than 0.2% of the time in all the beauty sentences in
each country.
12. Removing words with only one character or more than 20 characters.
13. Removing “beauti,” which is the stem of “beauty.”
In step 1, URLs and “Full text:” were added by ProQuest rather than the newspaper publishers.
Since they are not newspaper content, we deleted them.
Since we downloaded all of the monthly articles at once, we needed to split the whole text into
separate articles in step 2. Given that our unit of analysis is the sentence, we could have split the
text into sentences directly in step 4. However, for the robustness check in the results section, which
excludes articles that mention Dove (step 3), we first needed to separate the articles.
“Beautiful” and “beautifully” have the same stem as “beauty”. Before stemming in step 9, we
replaced the two words with “beautifull,” whose stem is “beautiful.” We also found that some
words with similar meanings have different stems. Thus, we matched those similar words in step
10 after the stemming step. Since “beauti” is in all of the sentences by construction, it does not
have any discriminatory power with respect to topics, like stop words (Büschken and Allenby
2016). Thus we deleted “beauti” in step 13.
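The sentence-level steps above can be sketched in Python as follows. This is a simplified illustration rather than our R pipeline: the sentence splitter is a plain regular expression instead of “openNLP”, the stop-word list is a tiny made-up subset, `crude_stem` is a hypothetical stand-in that handles only a few suffixes and is not Porter's algorithm, and the corpus-level frequency filter (step 11) and the “beautiful” replacement are omitted.

```python
import re

# Tiny illustrative stop-word list (the chapter uses the list in the R "tm" package).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "she", "about", "was"}
# Step 10 replacements quoted in the text.
SYNONYMS = {"woman": "women", "said": "say", "man": "men",
            "advertise": "ad", "therapist": "therapi"}

def crude_stem(word):
    """Hypothetical stand-in for Porter stemming: handles only a few suffixes."""
    if word.endswith("y") and len(word) > 3:
        return word[:-1] + "i"              # beauty -> beauti
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def beauty_sentences(article):
    """Step 4 (simplified): split on end punctuation, keep sentences with 'beauty'."""
    sentences = re.split(r"(?<=[.!?])\s+", article.strip())
    return [s for s in sentences if "beauty" in s.lower()]

def preprocess(sentence):
    """Steps 5-10, 12, and 13 for one sentence; returns the remaining tokens."""
    text = sentence.lower()                                    # step 5
    text = re.sub(r"(?<=\w)-(?=\w)", "", text)                 # step 6
    text = re.sub(r"[^\w\s]", "", text)                        # step 7
    tokens = [w for w in text.split() if w not in STOP_WORDS]  # step 8
    tokens = [crude_stem(w) for w in tokens]                   # step 9 (stand-in)
    tokens = [SYNONYMS.get(w, w) for w in tokens]              # step 10
    tokens = [w for w in tokens if 1 < len(w) <= 20]           # step 12
    return [w for w in tokens if w != "beauti"]                # step 13
```

Applied to an article, `beauty_sentences` keeps only the sentences containing “beauty”, and `preprocess` reduces each kept sentence to the tokens that would enter the topic model.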
1.3.4. Data evidence from keyword-level analysis
Before we turn to the topic model, we performed a commonly used keyword-level analysis in
order to provide suggestive evidence of the relationship between the campaign and newspaper
content. First, we tested whether the campaign slogan (i.e., “Real Beauty”) was used more often
during the campaigns in Columns (1) and (2) of Table 1.4. While Column (1) uses only
sentences mentioning “real beauty” in the treated countries, Column (2) adds those in control
countries. In both columns, the key coefficient of interest, Real Beauty x During Campaign x
Treated Countries, is positive and significant, suggesting that newspapers talked
more about “real beauty” during the campaigns in treated countries than in control countries. The
coefficient on Real Beauty x During Campaign in Column (2) is positive but not significant, as
expected, suggesting that there was no significant increase in reporting on “real beauty” in control
countries. Column (3) adds other beauty sentences that mention “beauty” but not “real beauty”.
Now, the key coefficient of interest is positive but no longer significant. This result may
suggest that the Dove campaign did not have enough impact to make newspapers talk
significantly more about “real beauty” than about “other beauty” topics. As we discussed
before, however, this insignificant result may be driven by the keyword-level approach’s
inability to capture real-beauty-related sentences that do not mention “real beauty”. In other
words, one may not be able to explore “real beauty” topics comprehensively just by analyzing
only sentences that mention “real beauty”.
Table 1.4 The number of newspaper sentences mentioning “Real Beauty” increases insignificantly
relative to the number not mentioning “Real Beauty” in the treated countries compared to control countries
during the campaigns.
Columns (1) and (2): only beauty sentences with "real beauty"; (1) only treated countries, (2) add control countries.
Column (3): add beauty sentences without "real beauty".
                                                    (1)              (2)              (3)
Real Beauty x During Campaign x Treated Countries   6.47*** (1.41)   3.40*** (1.17)   13.9 (9.03)
Real Beauty x During Campaign                                        0.867 (1.32)     5.85 (10.2)
Country-specific sentence type dummies              2                4                9
Year-month dummies                                  23               23               23
R-sq                                                0.558            0.420            0.961
Observations                                        72               120              240
Treatment: sentences with "real beauty" in treated countries.
Control: sentences with "real beauty" in control countries; sentences with "beauty", but not "real beauty".
Dependent variable is the number of sentences in each country-specific sentence type, which is the unit level of analysis. Each country has two types of beauty sentences: (1) with and (2) without the "real beauty" phrase. Treated countries: the U.S., Canada, and the U.K. Control countries: Australia and New Zealand. All the results are estimated using OLS. ***p<0.01
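The observation and dummy counts reported in Table 1.4 follow from the panel structure described in the note. The short Python sketch below reproduces that arithmetic. The function names are hypothetical, and the assumption that one unit and one month serve as omitted baselines is ours, made only so that the counts can be checked.

```python
MONTHS = 24  # 2004-2005, the analyzed period

def n_observations(n_countries, types_per_country, months=MONTHS):
    """Units are country-specific sentence types, each observed once per month."""
    return n_countries * types_per_country * months

def n_unit_dummies(n_countries, types_per_country):
    """One dummy per unit, minus one omitted baseline unit (an assumption)."""
    return n_countries * types_per_country - 1

# Column (1): 3 treated countries, "real beauty" sentences only.
# Column (2): 5 countries, "real beauty" sentences only.
# Column (3): 5 countries, two sentence types each.
columns = {1: (3, 1), 2: (5, 1), 3: (5, 2)}
observations = {c: n_observations(*spec) for c, spec in columns.items()}
unit_dummies = {c: n_unit_dummies(*spec) for c, spec in columns.items()}
```

Under these assumptions the sketch gives 72, 120, and 240 observations and 2, 4, and 9 unit dummies, matching the table, and 24 months less one baseline gives the 23 year-month dummies.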
Table 1.5 Monthly top 50 words trend.
Table 1.5a Monthly top 50 words trend in U.S.
Rank 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12
1 beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti
2 say women women women say women say salon women women women say say say pageant women say say say women say say say women
3 women say say lauder women say salon say say say say one women women say say women women women say women women women one
4 men like one say pageant miss will women pageant salon one salon year pageant women look one pageant pageant real pageant one salon geisha
5 like girl year new year like women like contest product salon women natur year shop shop salon product product pageant show like year pageant
6 year can product product one pageant one new one one like year new vote salon pageant year store girl year one year life say
7 queen show salon year product one year pageant will new product pageant show like queen will product year year one will pageant new time
8 make one just one new hernando new one salon make pageant hair shop will miss salon new one like product contest salon like product
9 shop salon pageant cosmet look salon hair can love look work like see miss one like look brand one will salon product work new
10 photo pageant hair salon shop show like also shop love love natur product time first one love compani contest can like show pageant like
11 pageant contest natur busi will new look contest fashion natur year peopl life salon graci even fashion like new girl new american one contest
12 will day school contest just queen natur just year year fashion product like show bullock year time work will ad year shop show love
13 look year old use show will contest work new will new shop will product new can peopl hair look dove can look hair photo
14 can will star spend make hair get even peopl can queen show love natur hair time even limit school campaign shop hair way salon
15 salon shop make pageant school contest good year also like look contest day even contest miss get girl dove new hair world natur hair
16 work time also can live love work hair can girl magazin play young eye like make face men miss size natur come can film
17 also make play queen men peopl thing queen miss busi girl made one one will also life get natur like love new product look
18 mother peopl go cream care year age old music come school will salon love love hair girl new love look fashion girl even natur
19 show tale like este hair shop name offer show us see can school black work work school also time work just miss use way
20 contest new new look can work product us product just contest life star ad agent product two first industri peopl make fashion time work
21 first hair shop face love day can live natur hair skin men famili new life just thing old also find queen contest see just
22 one natur love show like want pageant natur celebr work young high desir hair get queen busi now find see get time now get
23 natur queen queen fashion two movi show love recent men best time work thing undercov girl will use real care thing queen real fashion
24 time life day like life can two show queen first shop girl busi busi fbi think first go design featur look cultur love movi
25 art see film natur peopl time cream miss life play hair first made american go contest book bodi market shop peopl will busi stori
26 former home role life day first shop see first see get see time fit show percent children will show just see life american men
27 girl well church industri skin now love mother meni thing way way men work can life pageant shop work old way first great face
28 fashion imag look just time art men dog whose use american us thing life natur men miss look come thing life play line beast
29 world becom can school first us peopl shop look includ famili great art also latifah school world natur care person also art celebr fresh
30 photograph product work compani world magazin old make time go show spa face world year take day love cosmet hair school natur base year
31 danc offer get becom salon part berri first just magazin can look can take look industri old miss garden natur play work look shop
32 subject help now will queen product time peopl make well time school also photo men get use see just day celebr make first life
33 joe open store hair old natur queen home get time men old two cultur art see store long make want friend men school school
34 survivor famili last also may life make film way also miss photo use imag photo fashion take around store bodi world way fashion two
35 peopl femal health world swan just world person old peopl world long us everi congeni busi good beach ad show day two old makeup
36 get societi life day contest live go walton play skin play host find queen time go colourtreatment think queen two high care danc
37 play fairi men home meni eye citi will home store store wonder imag girl world talk model show way way american name includ make
38 skin look first skin want cultur compani day offer open live new long film play import kind time world come call power film world
39 young love peopl help black style featur two includ pageant age love start citi now show percent make take editor bodi end citi thing
40 call school call great line artist design thing call world celebr work river can find first like school open contest girl cancer last come
41 made get famili model star moment friend eye photo day industri just first contest sandra way shop home can time men love bloomington art
42 famili now dee market natur look light citi age two model also even first peopl care hair includ life men eye get will includ
43 found use time york get also way someth featur old kind home skin school music black natur last meni two black even shop colour
44 back film girl youth now see play counti ideal good offic skin film care hart great just person want home place use contest becom
45 park go see swan american busi now time nation part will art great young girl grow includ lingeri line model creat eye peopl still
46 product photo home first store place us life crown name natur offer book magazin two ford eye can part sit paint help get will
47 new magazin place way much state young men panten mountain make store part state offer american well art featur love local line world young
48 hair place colour two seem garden feel get like school peopl paint pageant hand black age think offer know life vehicl surgeri home call
49 love cultur part thing inner makeov strength skin hair get two plastic hair shop stori help much meni tradit also miss studi skin want
50 life design person play wife way girl high work way high express just look start much becom want campaign first even school art eye
2004 2005
Table 1.5b Monthly top 50 words trend in Canada
Rank 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12
1 beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti
2 say women women women women women pageant women women women women women women look shop play will pageant women women women say say women
3 tale come say pageant product queen china say say say product product star girl women contest say contest say say say colour women say
4 women say editor say year pageant look year like dove colour wonder home great pageant undercov year look pageant girl real imen like sing
5 product also execut one say say women contest appear love world whole like pageant say pageant see women imag size dove women pageant natur
6 one imag fashion per day year hair pageant can peopl say world film one like product last year beautiful pageant model world one take
7 fashion queen home cent celebr miss product new one colour turn divers look like queen women women will live year product new person can
8 fairi messag magazin contest look product men one peopl feel school embrac world editor brain shop look say look dove one can someth show
9 girl surround decor show show fashion say surgeri attribut interest like creation actress women salon new use make natur will ad use product salon
10 time will depart swan one look canada also cultur look us heal hope product just salon love world use age new get sens becom
11 question real one product hair canada like old physic like former year brought show cell time real even love stori like see comfort product
12 societi role chatelain make will repres year get per time life new co year stem film may learn see north photo eye clear one
13 new like can see fashion contest new whose cent make good time jame fashion one fbi contest like standard see look skin humour men
14 use power girl call life day male miss achiev find great first pageant love time congeni natur whose seem bodi former book dri life
15 studi peopl becom men film eye salon obsess natur meni made one stun africa product guy face around mind section health love contest well
16 theron featur truth surgeri face toronto love salon intern film industri call die say new gina pageant interest one product editor everi fit tri
17 baker photograph charg transform last new way find strong photograph year artist romenc can come miss show product can peopl featur like quit like
18 sperri photo tant look garden natur will good noth soap make say kay time miss way model one queen old fashion peopl enter just
19 whose media pageant will make show first may attitud want also can virginia also spot owner live natur year featur peopl also most never
20 mother stereotyp come meni get editor intern youth agre femal real cosmet bob day contest kidnap eye miss will differ use come exact hand
21 brother year men young busi among light month get canadian even servic danni just play like good love make campaign work find smoke song
22 year natur life canada societi come strength plastic imag year thing now gregori even cosmet old mother play also plus us want privat look
23 can show photo inspir classic old one queen studi queen book includ say eye much agent island age old skin imag everyon aniston make
24 peopl men leav win pageant home natur show near show teen book new goddess grow water child now us director photograph humen includ captur
25 old hair vacanc lauder can now also world spirit fashion red accept hair glorious soul will amber canada appear won director observ black bc
26 appear see will new time femal miss use contest live gift african blond use white origin one role someth sever idea one show anderson
27 famili base contest can use illustr get take meni whose sephora pageant thousand young claim singl also care toronto harper time time peopl reluct
28 import beyond make time us art offer young model face new day nurs three sensual mom day time product carolina natur fashion hair grew
29 lauder often hair model first countri base physic dove thing natur meni now charact look bacon physic queen meni model first men year adam
30 preoccup tip get first whose foot repres think survey ad peopl whose product cover face jorg thing come just teen now whose world anybodi
31 will theron young cosmet way chatelain accord news particip illustr men win one appreci well kevin ad old eye averag femal way age film
32 find world role physic cosmet jackson peopl post year studi day audienc busi wisdom movi go feel meni stereotyp current becom offer makeup even
33 work use societi help even one come berri help pageant play prize monday new russian hit book take packag eastern think around time two
34 femal life born garden canada can use natur stun product film earth time natur graci boss still work ban come brand mean men role
35 line age like american physic take day cosmet indonesia one shop peac also hair clinic sequel photo two get miss food continu colour big
36 head way year goal stori real colour turn will us cosmet award work shop diseas men point good design real pageant car go goe
37 anoth femal look soon market us age garden queen young includ centuri even real kidnap want give size back think age will can naomi
38 son skin peopl date style want two red hair think celebr urg canada home passion got although much pictur beautiful skin girl natur watt
39 chow physic find plain actress face good look find cultur high usual includ first latifah bullock music famili despit style cultur show day contest
40 like size work fox recent thing societi peopl young compani market assist last femal therapi sandra tip found without recent size old just use
41 meni book us year swan canadian market men ad market still beat place face heather return student often soul can line salon live get
42 market ideal live love helen stori ask hair feel ideal citi dancer store appear will make time increas ultim day care meni skin whose
43 design best want find like high sex day open campaign friend process angel includ men get fashion fun enjoy find fit just last strong
44 monster exhibit cosmet take natur classic across meni narrow ask aim broke los last use take make exot war just differ model health beast
45 charliz known skin star come blond west work busi provid ottawa ceremoni show feel meni cosmet work glebova convent imag point live differ giant
46 marri realiti star tale salon despit can us design countri road wound make magazin take big even hate marriag cosmet berg becom campaign eventu
47 grimm fame appear known live produc queen now mean present extrem drum get photograph star million featur can glebova ad gotten two follow fay
48 bam compar photograph compet femal five world face word fan citizen inde day call offer menag size show world illustr will thing walk wray
49 look oscar base fairi star respect play featur box orbach tower contest find much agent concept beautiful fashion work brittani colour canadian will appl
50 think confront go plastic becom blue work thing visual girl bell queen shop set bullock winner studi peopl physic love meni turn queen gorilla
2004 2005
Table 1.5c Monthly top 50 words trend in U.K.
Rank 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12
1 beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti
2 year women women women women women salon say say women world women women product new year women say say say say product women women
3 say say year queen world say women women women say year year say look treatment one say women women look women year say new
4 treatment will love contest one year year one year one one new one say pound women year year year year product old year pound
5 women one say say product natur men salon one product women say age women one salon look look star like one look men say
6 will look treatment one salon contest will look look girl time product dove year first treatment one one look women new say look time
7 therapi like product year year look hair will time therapi say like new therapi women hair product product like one girl treatment like hair
8 salon hair model look say fashion say product men year can old real will year say salon men old will will men girl make
9 peopl year one first will top product year like salon design contest campaign first say look will will will product contest one world salon
10 new salon will also girl one therapi world queen new contest one salon can look product pound natur love men fashion new therapi like
11 hair world men miss work pound new hair product love miss world year treatment girl girl just make one time time love treatment can
12 product treatment can will queen salon treatment contest salon best look miss can one also men girl girl make just day also time look
13 old natur therapi product treatment get like star world will like can ad new product new time treatment use new year women one girl
14 work girl look new miss queen world men star time get salon product just like pound face first last girl see just new show
15 top therapi salon take shop product one now last old peopl treatment girl make hair like best two miss treatment like like old year
16 one old work play look will industri girl girl queen will just men avon work therapi like can polanski therapi miss girl love one
17 time just face girl therapi men girl first can get salon will use girl two now therapi thing time hair can shop contest day
18 make pound show treatment just time model hall show can make first way spot salon show star pound just old london rape star film
19 look take makeup men new hair young new new world girl star look men therapi industri use now wife life counter queen face product
20 just two new time like old just love miss first treatment open old world make time can queen swedish salon look natur will world
21 use best hair can spot star work miss peopl pound also look face pound day take contest peopl orlaith love make asian want therapi
22 makeup servic two thing can editor also model go life best pound will love come health last spot bodi work work will product work
23 girl star first parlour first now look hollywood also danniell thing queen like face play also young health big now last last salon includ
24 can skill make like day day time can eye look way get time get includ last first new london two good thing just beast
25 world can fashion world get use make thing pageant treatment car pageant hair natur think spot work old girl star clare week first will
26 last now black just great girl now want will like star plastic now great open can get see back face men high take old
27 age first star pound competit pageant want call now men two girl day skin will just play help night use world time life love
28 meni love blond make also beautiful real old take contest great hair life come queen first cream love anoth fashion hair can see get
29 like last now love take world old pound first face made thing first busi miss will men last second skin age therapi can natur
30 men industri world work model just even make day make new way make catherin model work old age new come want first hair even
31 now new come salon eye love can day face take hair offer skin time facial contest now skin contest industri sinc face work bodi
32 take work ft hair even also star natur even bodi pageant ladi us age point star spot live spot back just use last menag
33 offer face like show time last show peopl thing name product present young even agent part new eye model live now miss great treatment
34 place won time see pound model take way hair organ men sleep eye top love survey day top life everi face spot offer just
35 first sak get go love like busi play old work first men even home contest day peopl salon way also go skin good now
36 contest men last back show eye give long pound now queen day design enjoy star get come like well show live offer never queen
37 show still go former thing skin reveal welsh make day day use start salon show good way time men get muslim think blond last
38 best hand also realli meni put contest berri work fashion show fashion person work eye colleg health therapi can spot old much american place
39 film student use lauder busi hepburn two therapi use even well come peopl also see increas store just world great spot long show centuri
40 spot co life old tv can fashion work two health film former great peopl top model love use queen health model black get men
41 go product eye day lead best skin also life high good place industri fashion offer eye also take magazin go life salon fashion contest
42 industri day great film old come good queen see us art china femal film back still show come name play eye hair even star
43 busi show beautiful much now well home show set home love make treatment everi pageant ever see much citi busi includ make top also
44 compani peopl ad long health much get eye turn give age work world hollywood beautiful market skin around claim celebr still now go miss
45 train great spend win play live use come win just alway show therapi door feel number top week fair becom pageant get london two
46 pound come queen therapi includ organ miss well therapi star idea take just like photograph make think blond also hollywood shop fashion massag peopl
47 star call day star live treatment offer beautiful just show centuri two love old uk love back world day right turn home make age
48 day blond take two pageant two pound high love natur european life star day artist face meni work us appear friend back miss around
49 face perfect peopl life find peopl great becom get age therapi great show last massag natur femal former made win competit place model design
50 two competit age age end spot go win us meni old includ get two time fashion night littl month competit treatment feel health st
2004 2005
Table 1.5d Difference in mean word frequency between campaign and non-campaign months
U.S. / Canada / U.K. (each country block: Word, Difference in Mean, t-val, P-val)
Rising words during the campaigns
Top 1 beauti 18.40 -0.56 0.581 beauti 43.10 -2.01 0.057 beauti 90.52 -1.78 0.089
Top 2 say 11.80 -1.99 0.059 women 27.40 -4.89 0.000 women 49.09 -5.49 0.000
Top 3 women 9.75 -1.26 0.221 dove 16.30 -18.38 0.000 age 21.65 -6.13 0.000
Top 4 real 8.80 -2.99 0.007 say 15.30 -3.56 0.002 real 20.00 -7.09 0.000
Top 5 product 7.70 -1.86 0.077 size 10.70 -3.03 0.006 way 12.22 -3.58 0.002
Top 6 girl 6.40 -2.53 0.019 peopl 10.00 -5.85 0.000 say 12.00 -1.22 0.237
Top 7 ad 5.80 -2.64 0.015 ad 9.05 -8.95 0.000 use 10.70 -3.28 0.003
Top 8 come 4.55 -2.41 0.025 model 8.55 -5.72 0.000 one 9.57 -1.74 0.096
Top 9 will 4.50 -1.13 0.270 girl 7.75 -2.01 0.057 new 9.35 -2.11 0.046
Top 10 find 4.40 -2.40 0.025 cultur 7.65 -5.14 0.000 can 8.87 -2.20 0.039
Falling words during the campaigns
Bottom 10 photo -2.40 1.08 0.291 product -1.65 0.46 0.651 last -2.83 0.69 0.495
Bottom 9 queen -2.45 0.72 0.481 good -1.75 1.49 0.152 queen -3.43 0.69 0.497
Bottom 8 school -2.45 0.94 0.355 men -1.80 1.06 0.300 pound -4.04 0.79 0.437
Bottom 7 made -2.45 1.46 0.158 world -1.80 0.86 0.396 treatment -4.22 0.62 0.543
Bottom 6 star -2.45 1.20 0.242 use -2.25 1.11 0.278 contest -4.48 0.78 0.442
Bottom 5 shop -3.55 0.68 0.506 star -2.25 1.12 0.275 former -4.57 1.67 0.109
Bottom 4 like -3.65 1.19 0.245 last -2.55 2.06 0.051 makeup -4.78 1.16 0.258
Bottom 3 face -3.75 2.13 0.044 salon -2.90 0.96 0.348 top -4.87 1.08 0.292
Bottom 2 hair -4.10 1.77 0.090 contest -3.30 0.97 0.340 pageant -5.04 1.25 0.223
Bottom 1 salon -5.80 1.14 0.266 shop -4.20 0.84 0.410 miss -7.78 1.64 0.115
Note: There are 24 monthly observations; the month is the unit of analysis.
Campaign periods: U.S. (first: Sep. and Oct. 2004; second: July and Aug. 2005), U.K. (Jan. 2005),
Canada (first: Sep. and Oct. 2004; second: June and July 2005)
Next, we explore whether newspaper content changed, using a common top-word trend analysis.
Tables 1.5a to 1.5c show the monthly top 50 words for the U.S., Canada, and the U.K. The
campaign title words (“dove”, “campaign”, “real”, and “beauty”) rank high during the campaign
months but fall outside the top 50 in most non-campaign months. One exception is the first U.S.
campaign in September and October 2004, when these words do not rank high, suggesting that the
first campaign was either weaker or less effective than the second one in the U.S.
Among the top 100 words across the two years in each country, Table 1.5d shows the top 10 rising
and falling words during the campaigns. The words are ordered by the difference in mean word
frequency between campaign and non-campaign months. Most of the campaign title words are in the
top 10 list across the three countries, consistent with the findings in Tables 1.5a to 1.5c.
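As a minimal sketch of the statistic reported in Table 1.5d, the Python snippet below computes the difference in mean monthly frequency of one word between campaign and non-campaign months, together with a two-sample t statistic. The equal-variance (pooled) form, the function name `diff_in_mean`, and the monthly counts are assumptions made for illustration. The text does not state which t-test variant was used, and the sign convention here (campaign minus non-campaign) may differ from the one in the table.

```python
from math import sqrt
from statistics import mean


def diff_in_mean(campaign, non_campaign):
    """Return (difference in mean, pooled two-sample t statistic)."""
    n1, n2 = len(campaign), len(non_campaign)
    m1, m2 = mean(campaign), mean(non_campaign)
    # Sums of squared deviations around each group mean.
    ss1 = sum((x - m1) ** 2 for x in campaign)
    ss2 = sum((x - m2) ** 2 for x in non_campaign)
    # Pooled variance assumes both groups share one variance (an assumption).
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return m1 - m2, (m1 - m2) / std_err


# Hypothetical monthly counts of one stemmed word (not thesis data).
campaign_months = [12, 15, 14, 11]
other_months = [3, 5, 4, 6, 2, 4]

diff, t_stat = diff_in_mean(campaign_months, other_months)
print(f"difference in mean = {diff:.2f}, t = {t_stat:.2f}")
```

Repeating this for each of the top 100 words and sorting by the difference in mean would give rising and falling lists of the kind shown in the table.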
Across the three countries, the falling words are of similar types. First, there are several words
related to beauty services or products: “salon”, “shop”, “hair”, “face”, “makeup”, “treatment”,
“pound”, and “product”. Second, there are beauty-contest-related words: “contest”,
“pageant”, “miss”, “world”, “queen”, and “photo”. These results suggest that traditionally
popular beauty words are used less frequently during the campaigns. The falling words also
differ by country. In the U.S. and Canada, the bottom 10 words include more words related to
beauty services or products than they do in the U.K.
Overall, the keyword trend analysis suggests that traditionally popular beauty words about
beauty services or beauty contests are used less frequently during the campaigns. However, while
this analysis also captures the growth of the campaign title words (i.e., dove, campaign, real,
beauty), it does not reveal the advertising message that Dove emphasized.
Why does the keyword trend analysis reveal limited information? A similar problem occurs
when one analyzes all consumers together without dividing them into segments. Suppose
that there is one big consumer segment and one small segment. Consumers in the big segment
are highly price sensitive, while those in the small segment do not care about price. A
marketing manager who performs only aggregate-level analysis might conclude that most
consumers are very responsive to price changes, and would miss the small segment entirely.
Essentially, the keyword trend analysis looks at the aggregate-level trend across all topics. Thus,
it can identify a change in big topics well, but not a change in small or emerging topics. That is
why the decline of relatively big traditional beauty topics, such as beauty services and beauty
contests, is easy to detect, unlike the emerging topic of the Dove campaign. Similarly, the
difficulty in the U.S. and U.K. may be due to the volume of beauty news. As seen in Table 1.3,
the U.S. and U.K. had more than twice as many beauty articles as Canada. This is a limitation of
aggregate-level keyword analysis, which seems to fail to extract advertising-message-related
words from a large amount of text. In the next section, we explore whether the topic model,
with which we segment beauty sentences into several topics, can offer additional insight beyond
the aggregate-level keyword analysis.
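The masking effect of aggregate-level keyword counts can be illustrated with a toy corpus. In the Python sketch below, the topic tags, the sentences, and the cutoff of three top words are all hypothetical and are not drawn from the thesis data. The point is only that a word from a small topic can be invisible in the aggregate ranking while being prominent within its own topic.

```python
from collections import Counter, defaultdict

# Hypothetical stemmed sentences, each tagged with a topic (toy data).
sentences = [
    ("service", "salon hair treatment"),
    ("service", "salon hair shop"),
    ("service", "salon treatment shop"),
    ("service", "hair salon product"),
    ("contest", "pageant queen contest"),
    ("contest", "pageant miss contest"),
    ("contest", "pageant queen miss"),
    ("real_beauty", "real women size"),
]

aggregate = Counter()
by_topic = defaultdict(Counter)
for topic, text in sentences:
    words = text.split()
    aggregate.update(words)        # counts pooled over all topics
    by_topic[topic].update(words)  # counts kept separately per topic

# Top three words overall versus top three words within each topic.
aggregate_top = [word for word, _ in aggregate.most_common(3)]
topic_top = {t: [w for w, _ in c.most_common(3)] for t, c in by_topic.items()}

print("aggregate:", aggregate_top)
print("real_beauty:", topic_top["real_beauty"])
```

Here the pooled ranking is dominated by words from the two large topics, while the word "real" surfaces only once sentences are grouped by topic.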
1.4 Estimation and Result
1.4.1 Empirical Strategy
To see whether an advertising message affects newspaper content, we tested whether the
incidence of real-beauty-related topics increased with Dove’s real beauty campaign. For this
purpose, we used the following two stages.
First, we categorized beauty sentences in newspapers into topics. Topic extraction with an
optimum number of topics can be done automatically using recently developed topic models
(Blei, Ng and Jordan 2003; Taddy 2012). Then, to identify real-beauty-related topics, we used
(1) ‘Dove campaign for Real Beauty’-related words and (2) other relevant words chosen by
sociology experts from among the top words in each topic. At this step, we could see whether
topics related to advertising messages (i.e., real-beauty-related topics) existed. As the last step
of the first stage, we used the strength of association between each beauty sentence and the
beauty topics to allocate each sentence to the topic with which it was most strongly associated.
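The allocation step at the end of the first stage amounts to taking, for each sentence, the topic with the largest association weight. The following Python sketch shows this under stated assumptions: the function name `assign_topics` and the weight values are hypothetical, and ties are broken in favor of the lower topic index, which the text does not specify.

```python
from collections import Counter


def assign_topics(weights):
    """Return, for each sentence, the index of its highest-weight topic."""
    # max() returns the first maximal index, so ties go to the lower index.
    return [max(range(len(w)), key=w.__getitem__) for w in weights]


# Hypothetical topic weights for three sentences over three topics.
# Each row sums to one.
omega = [
    [0.70, 0.20, 0.10],
    [0.10, 0.30, 0.60],
    [0.25, 0.50, 0.25],
]

labels = assign_topics(omega)
sentences_per_topic = Counter(labels)
print(labels, dict(sentences_per_topic))
```

Counting the assigned labels by month would then give the number of sentences per topic that the second stage compares.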
In the second stage, we exploited the rollout of a global advertising campaign across countries to
test whether the number of sentences labeled as real beauty topics increased, relative to the
number labeled as other beauty topics, in the treated countries compared with the control
countries, where the campaign started later.
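The logic of this comparison can be sketched as a simple difference-in-differences on the share of real-beauty sentences. The Python snippet below is an illustration only: the counts are hypothetical, the use of shares rather than raw counts is an assumption, and it does not reproduce the regression specification used in the thesis.

```python
def real_beauty_share(real_beauty, other):
    """Share of beauty sentences assigned to real-beauty topics."""
    return real_beauty / (real_beauty + other)


# Hypothetical counts: (real-beauty sentences, other beauty sentences).
counts = {
    ("treated", "before"): (20, 180),
    ("treated", "after"): (45, 205),
    ("control", "before"): (10, 90),
    ("control", "after"): (12, 88),
}

share = {key: real_beauty_share(*value) for key, value in counts.items()}

# Change over time within each group, then the difference between groups.
change_treated = share[("treated", "after")] - share[("treated", "before")]
change_control = share[("control", "after")] - share[("control", "before")]
did = change_treated - change_control
print(f"difference-in-differences in share = {did:.3f}")
```

A positive value would indicate that real-beauty sentences gained share in the treated country beyond the change seen in the control country over the same period.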
1.4.2 Topic extraction
For the topic model, we follow the formulation of Taddy (2012), in which multivariate count data
for terms (words in our paper) in a document (a sentence in our paper) are realized from a
multinomial distribution parameterized by a weighted sum of latent topics. Given P unique terms
across all N observed beauty sentences, each beauty sentence $x_i \in \{x_1, \ldots, x_N\}$ can be
expressed as a vector of counts for the P words. The total number of words in beauty sentence $x_i$ is
$m_i = \sum_{j=1}^{P} x_{ji}$, where $x_{ji}$ is the frequency of word j in beauty sentence i. We
assume that there are K beauty topics a priori. Then, the K-topic model is

$$x_i \sim \mathrm{Multinomial}(\omega_{i1}\theta_1 + \cdots + \omega_{iK}\theta_K,\; m_i) \tag{1}$$

where each topic $\theta_k = [\theta_{k1}, \ldots, \theta_{kP}]'$ is a vector of probabilities over the P words, and the
weights $\omega_i = [\omega_{i1}, \ldots, \omega_{iK}]$ are a vector of probabilities over the K topics. One can label each
topic based on its top words from $\theta_k$, and assign sentence $x_i$ to a particular topic based on its weights $\omega_i$.
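To make equation (1) concrete, the Python sketch below evaluates the multinomial log-likelihood of a single sentence given its topic weights and the topic-word probabilities. It is not the estimation routine of the “maptpx” package. The function names, the two-topic, three-word vocabulary, and all numerical values are assumptions chosen so that the result can be checked by hand.

```python
from math import lgamma, log


def mixture_probs(omega_i, theta):
    """Word probabilities q_j = sum over k of omega_ik * theta_kj."""
    n_words = len(theta[0])
    return [sum(w * t[j] for w, t in zip(omega_i, theta)) for j in range(n_words)]


def multinomial_loglik(x_i, q):
    """Log multinomial probability of the count vector x_i given q."""
    m_i = sum(x_i)  # total number of words in the sentence
    log_coef = lgamma(m_i + 1) - sum(lgamma(x + 1) for x in x_i)
    # Words with zero count contribute nothing, so skip them to avoid log(0).
    return log_coef + sum(x * log(p) for x, p in zip(x_i, q) if x > 0)


# Hypothetical example with K = 2 topics and P = 3 words.
theta = [[0.5, 0.5, 0.0],   # topic 1 probabilities over the 3 words
         [0.0, 0.5, 0.5]]   # topic 2 probabilities over the 3 words
omega_i = [0.5, 0.5]        # topic weights for sentence i
x_i = [1, 1, 0]             # word counts for sentence i

q = mixture_probs(omega_i, theta)
loglik = multinomial_loglik(x_i, q)
print(q, loglik)
```

With these values the mixed word probabilities are 0.25, 0.5, and 0.25, and the sentence has probability 2 × 0.25 × 0.5 = 0.25, so the log-likelihood equals log 0.25.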
To implement the above topic model, we used the R “maptpx” package by Taddy (2012). Table 1.6
shows log Bayes factors, each of which is the log of the ratio of the marginal likelihood of a
K-topic model to that of a null one-topic model, for $K \in \{2, \ldots, \hat{K} + 3\}$, where $\hat{K}$ is the
optimum number of topics, that is, the one with the highest log Bayes factor. We obtained 10, 8,
10, 7, and 2 topics for the U.S., Canada, the U.K., Australia, and New Zealand, respectively.
Table 1.6 The optimum number of topics based on the log Bayes factor over the null one-topic model
No. of topics | U.S. | Canada | U.K. | Australia | New Zealand
2 | 58700.0 | 30966.3 | 56684.8 | 20017.78 | 2330.9
3 | 77644.4 | 38431.6 | 74453.1 | 24327.04 | 1959.5
4 | 96745.0 | 45036.3 | 92989.4 | 26840.61 | 1104.1
5 | 115832.9 | 51475.9 | 111829.2 | 29287.3 | 414.4
6 | 135057.9 | 58542.4 | 132001.3 | 31359.66 |
7 | 154849.9 | 63146.9 | 151371.7 | 31898.02 |
8 | 170111.8 | 68042.0 | 163902.5 | 30098.53 |
9 | 177803.3 | 66222.8 | 172030.2 | 29749.55 |
10 | 178953.7 | 57928.5 | 172836.1 | 23964.92 |
11 | 172384.0 | 48512.1 | 166570.6 | |
12 | 160635.0 | | 151039.1 | |
13 | 141955.7 | | 131328.4 | |
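The selection rule behind Table 1.6 can be sketched as choosing, for each country, the number of topics with the largest log Bayes factor. In the Python snippet below the values are copied from the table. Assigning the entries in the rows for 11 to 13 topics to the U.S., Canada, and U.K. columns is an assumption, based on each country's range ending at its optimum plus three.

```python
# Log Bayes factors over the null one-topic model, keyed by number of topics.
log_bayes_factor = {
    "U.S.": {2: 58700.0, 3: 77644.4, 4: 96745.0, 5: 115832.9,
             6: 135057.9, 7: 154849.9, 8: 170111.8, 9: 177803.3,
             10: 178953.7, 11: 172384.0, 12: 160635.0, 13: 141955.7},
    "Canada": {2: 30966.3, 3: 38431.6, 4: 45036.3, 5: 51475.9,
               6: 58542.4, 7: 63146.9, 8: 68042.0, 9: 66222.8,
               10: 57928.5, 11: 48512.1},
    "U.K.": {2: 56684.8, 3: 74453.1, 4: 92989.4, 5: 111829.2,
             6: 132001.3, 7: 151371.7, 8: 163902.5, 9: 172030.2,
             10: 172836.1, 11: 166570.6, 12: 151039.1, 13: 131328.4},
    "Australia": {2: 20017.78, 3: 24327.04, 4: 26840.61, 5: 29287.3,
                  6: 31359.66, 7: 31898.02, 8: 30098.53, 9: 29749.55,
                  10: 23964.92},
    "New Zealand": {2: 2330.9, 3: 1959.5, 4: 1104.1, 5: 414.4},
}

# For each country, pick the number of topics with the highest value.
best_k = {country: max(values, key=values.get)
          for country, values in log_bayes_factor.items()}
print(best_k)
```

The result reproduces the 10, 8, 10, 7, and 2 topics reported in the text for the U.S., Canada, the U.K., Australia, and New Zealand.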
Alternatively, one could run a common topic model across all the analyzed countries. This
approach might help intercountry comparison, but it is challenging for three reasons. First, the
spelling of particular words differs slightly across countries (e.g., behavior vs. behaviour and
color vs. colour). Second, people in different countries may use different words to discuss the
same topics. Third, the range of topics people talk about may vary across countries.
Therefore, we focus on country-specific topic models.
Once topics are generated, researchers typically name them based on their top words. In this
study, however, it is not a trivial task for researchers to distinguish real-beauty-related topics
from other beauty topics. Recall that we attempt to categorize sub-topics of beauty rather than
easily distinguished newspaper topics (e.g., politics, weather, sports). Beauty topics are likely
to share common words. Words advertised in the Dove campaign, such as “size”, “skin”, and
“spots”, can also describe beauty products or services in articles unrelated to the Dove
campaign. “Female”, “model”, “body”, “hair”, and “makeup” are often used for both beauty
services and beauty contests. Likewise, a writer who criticizes the traditional view of beauty
and calls for a new definition of beauty may still mention words such as “look”, “face”,
“makeover”, and “treatment”, which also describe beauty services. Furthermore, researchers
often lack the knowledge needed to identify words related to a social issue (e.g., real beauty).
In this case, it may be hard for researchers to distinguish real-beauty-related topics from
other beauty topics objectively. Therefore, we identify real-beauty-related topics based not only
on ad-related words, but also on the opinions of experts with relevant knowledge of the social
issues that the Dove ads address.
Table 1.7a Beauty topics in the U.S.
Rank Topic 1 Topic 2 Topic 3 Topic 4 Topic 5 Topic 6 Topic 7 Topic 8 Topic 9 Topic 10
Label "Beauty service 1" "Real Beauty 1" ? "Beauty service 2" "Real Beauty 2" "Beauty contest 1" "Beauty service 3" "Beauty service 4" "Movie" ?
1 new salon show like women pageant product say shop one
2 film work contest will store time peopl natur life queen
3 well year just look young first love men now fashion
4 cosmet girl make can good school day get play even
5 artist miss also call magazin world home care much go
6 york two way citi help thing use offer made colour
7 high old see think imag busi includ meni famili book
8 base live us long ad come eye great style stori
9 often find take want part skin black line found becom
10 role face creat makeup feel art celebr compani around whose
11 market open turn seem set year american hair may design
12 known cultur ful health former age star name friend model
13 surgeri real need alway center photo movi everi still music
14 lauder featur kind beach end place bodi never back paint
15 perfect person children inner femal last danc start state best
16 youth mother hope put countri industri white know standard daughter
17 plastic power mr local grace week sell nation park next
18 provid parlor bring sens treatment photograph america import owner move
19 recent three love garden free event big right charact second
20 believ month blond flower better wife full area today tip
21 becam ideal ms talk sinc illustr light chang small left
22 idea talent truth ugli differ night present voic someth point
23 inspir anoth run special secret town african editor enjoy earli
24 servic tradit physic fit mind student attract appreci perform fall
25 captur spa transform ladi public along hair enter moment combin
26 brand makeov wonder mean group univers focus cream among issu
27 accord competit might stage though high saturday develop visit studi
28 give organ room without cours class fill interest realli enough
29 rich grow keep strength appear colleg marri charm actress success
30 joy littl god beast cover money gift came heart told
31 job size brain less lot deep dark presid classic far
32 consid true tell church hand obsess lip form learn hit
33 die oper top origin societi almost custom done seen feminin
34 island wear yet communiti depart train head mysteri new simpli
35 site bath guy eleg male five tale six histori wrote
36 angel popular allow averag husband consult blue saw south remain
37 mountain th other instead suppli crown desir sing past fact
38 promot own experi smile self expert scene definit win surround
39 land hous parti final cloth host collect later death given
40 river compet express travel figur got view speak brought babi
Table 1.7b Beauty topics in Canada
Rank Topic 1 Topic 2 Topic 3 Topic 4 Topic 5 Topic 6 Topic 7 Topic 8
Label "Real Beauty 1" "Beauty service 1" "Real Beauty 2" "Movie 1" "Beauty contest 1" "Beauty service 2" "Beauty service 3" "Movie 2"
1 look say women play women one like product
2 can year girl make pageant show natur new
3 love world time whose contest just see men
4 day hair model first come even meni film
5 find get real now want shop live work
6 also will imag back miss good young becom
7 take colour age heart cosmet feel eye canada
8 life old dove much skin two person line
9 peopl includ featur mother care last someth role
10 thing salon think salon bodi place standard star
11 way well ad undercov canadian name sens base
12 physic great size music turn power home go
13 call magazin face movi artist seem learn design
14 ful makeup photograph agent queen citi queen toronto
15 per fashion femal stori believ point blond market
16 part editor art light inner use hope famili
17 cent busi societi dress ask univers often big
18 set servic still fashion attract moment among hand
19 everi health ideal time public made die best
20 may director us classic million charact comfort sell
21 littl book campaign style won ugli also surgeri
22 realli offer cultur soul origin anoth surround compani
23 interest china use run compet view far sing
24 us male illustr got night week earli york
25 give peopl differ food help along ever creat
26 know home brain three might street grace actress
27 garden school tale queen titl humen cours head
28 found open studi win averag spot fit self
29 strong provid celebr grow winner tree brought youth
30 noth month long special industri appear seen without
31 known local messag fbi appreci worth cultur former
32 question help american collect kidnap kind parlour appear
33 near daughter north ultim talent vancouv ladi plastic
34 histori word teen hit definit smile lip tri
35 never pictur media thought sever top clear murder
36 spirit treatment end second wear landscap friend reach
37 tradit cover exhibit truth increas despit gift theron
38 qualiti close sinc white former yet especi drama
39 achiev pretti will comedi true piec describ transform
40 organ around fairi gina figur novel owner launch
Table 1.7c Beauty topics in U.K.
Rank Topic 1 Topic 2 Topic 3 Topic 4 Topic 5 Topic 6 Topic 7 Topic 8 Topic 9 Topic 10
Label "Real Beauty 1" "Beauty service 1" ? "Beauty service 2" ? "Beauty contest 1" ? "Beauty service 3" ? ?
1 like salon one will say girl love women look year
2 thing treatment product fashion can world use just new old
3 real men first bodi make last natur also queen work
4 never hair pound young star top way see day show
5 meni therapi contest back now us home made face model
6 art take time shop get femal makeup think two age
7 centuri even miss celebr best anoth set much spot peopl
8 american skin eye week life month design still good health
9 role offer film blond come put ful long live want
10 surgeri includ well pictur call found book london turn industri
11 kind busi go hollywood help parti hous find got big
12 chang high pageant three friend name claim know idea part
13 mean open around seem play secret sex may magazin night
14 th littl ad appear person win counter tri whose becom
15 featur cosmet everi imag ask colleg mother reveal citi start
16 far perfect give tv although run without spend hand wife
17 modern alway feel ever base dress famili someth yesterday time
18 physic head realli brand heart tip creat yet final male
19 play cours today actress combin student follow colour next recent
20 attract believ compani sinc dark hotel artist four line full
21 ideal facial place black meet irish rather differ came hard
22 becam countri market power obsess pop brain though left studi
23 campaign centr music store right former west deep took increas
24 self need ladi youth true parlour talk mark classic school
25 figur street competit favourit known director husband marri stun mum
26 noth spa uk latest routin expert seen report walk job
27 almost room present british hit charm tradit bring point later
28 despit local launch experi discov number interest sure lot went
29 certain massag lead cream sun style light less co england
30 confid tan sell movi great must quit research children tell
31 grace million enter menag ex gorgeous food decid york won
32 moment rang st talent lip fan cultur welsh public stori
33 right cloth career train produc wear insid fit six near
34 mind care might five everyon beast sexual pretti north often
35 plastic area firm told describ award french success wale leav
36 univers hairdress nation regim intellig front cover great boy paint
37 charact better britain hall within travel class sarah happi brother
38 close nail organ daughter touch date lauder howev saw soon
39 wonder servic fact white red summer usual half kate south
40 death hour intern de english event search lost alreadi past
Table 1.7d Beauty topics in Australia
Rank Topic 1 Topic 2 Topic 3 Topic 4 Topic 5 Topic 6 Topic 7
Label "Beauty Contest 1" "Real Beauty" "Beauty Contest 2" "Beauty Service 1" ? "Beauty Service 2" "Beauty Service 3"
1 miss look year one come product say
2 pageant women new salon work natur hair
3 contest treatment model busi us will therapi
4 australia well old eye thing women make
5 world peopl men star go day makeup
6 univers meni industri time therapi skin first
7 queen health just former need sydney can
8 student life world also art best get
9 hawkin age last write someth see back
10 love film face set anoth way spot
11 australian ad take tri good fashion give
12 girl play show away inner girl cosmet
13 jennif use two blond know even littl
14 gold great top magazin studi design bodi
15 coast young next queen train still colour
16 crown imag now find hand centr nation
17 mari call becom shop feel australian blind
18 win like fashion melbourn menag want rang
19 corbi long women home truth market seem
20 will ful part editor believ seen like
21 bali sinc visit grace won latest guarante
22 celebr hollywood appear big heart featur artist
23 newcastl wonder real month realli skincar complet
24 yesterday role past court may includ keep
25 titl high turn might room care nail
26 bag stori chang love within friend much
27 judg end almost help compani french provid
28 pictur tradit start recent women lot base
29 ms think local award book appreci rather
30 run avail launch screen movi bring around
31 schapell mind week nicol left exhibit sleep
32 competit person talent per ms lifestyl surgeri
33 davi power beach move watch help skin
34 secret live much cate case cream wax
35 found counter male open say buy travel
36 head youth door fashion interest focus sure
37 intern director black cent alway eleg robert
38 organis told side london demend better total
39 charm money profession south pop tell brisban
40 follow american white report import week citi
Table 1.7e Beauty topics in New Zealand
Rank Topic 1 Topic 2
Label "Beauty service" ?
1 therapi say
2 say new
3 salon miss
4 year will
5 contest makeup
6 one work
7 busi pageant
8 look zealand
9 treatment world
10 also natur
11 product women
12 women want
13 old photograph
14 clinic student
15 offer servic
16 school bring
17 massag waikato
18 can hill
19 new use
20 nail home
21 peopl gift
22 industri place
23 make health
24 much like
25 take can
26 client artist
27 open experi
28 hair imag
29 give book
30 day countri
31 technolog fashion
32 tan adam
33 spa ansel
34 style antarctica
35 good last
36 shop great
37 cours week
38 facial face
39 girl young
40 perfect person
We named topics in three steps. First, examining the top 40 words in each topic across the 5
countries in Table 1.7, we found many words on (1) the ads themselves (i.e. the Dove Campaign
for Real Beauty), (2) beauty services or products, (3) beauty contests, and (4) movies. Second,
we asked sociology graduate students to pick, from the top 40 words, the words related to each
of the 4 topic groups. Third, in each topic, we counted the words that matched the words chosen
by the evaluators for each of the 4 groups, and named the topic after the group with the most
matches.
In the second step above, for the ad-related words, we collected both (1) content words that are
mentioned in the ads or that describe the ads and (2) words on the advertising messages that are
implied in the social issue advertising. Table 1.8a shows the content words in the Dove
advertising (Figure 1.1a-b). Next, we collected two sets of words on the advertising message
from sociology experts. First, given that Dove aimed to challenge traditional views on beauty
(Dove website 2015), we asked sociology graduate students to choose “social or cultural
change”-related words, such as “change”, “traditional”, “society”, and “culture”. Second, if the
Real Beauty campaign was effective, people might have talked about the opposite of
physical beauty; “real”, “true”, “mind”, “self”, and “spirit” are examples. Two evaluators
chose words independently. For words on which the two evaluators disagreed, we followed the
opinion of a third sociology graduate student. Table 1.8b shows the words chosen by the three
evaluators: 35 social words and 15 words opposite to physical beauty were agreed on among the
independent evaluators. To choose words related to beauty services (or products), beauty
contests, and movies, we also relied on 3 evaluators. Tables 1.8b to 1.8e show the words agreed
on by the evaluators. As we discussed before, several words such as “size”, “skin”, “model”,
“makeup”, “women”, and “female” were indeed included in multiple beauty topic groups by the
evaluators.
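The evaluator procedure described above can be sketched in a few lines of Python. This is only an illustration under our reading of the text: a word is kept when the first two evaluators agree on it, and the third evaluator decides when they disagree. The function name `resolve_words` and the evaluator selections below are hypothetical and are not the actual choices behind Table 1.8b.

```python
def resolve_words(candidates, eval_a, eval_b, eval_c):
    """Return the candidate words kept after the three-evaluator procedure.

    eval_a, eval_b, eval_c are sets of words picked by each evaluator.
    If A and B agree on a word, their shared decision stands;
    if they disagree, the third evaluator C decides.
    """
    chosen = []
    for word in candidates:
        picked_a = word in eval_a
        picked_b = word in eval_b
        if picked_a == picked_b:
            keep = picked_a          # both picked it, or both skipped it
        else:
            keep = word in eval_c    # disagreement: follow the third evaluator
        if keep:
            chosen.append(word)
    return chosen


# Hypothetical selections, for illustration only.
candidates = ["chang", "cultur", "salon", "societi", "pageant"]
eval_a = {"chang", "cultur", "societi"}
eval_b = {"chang", "societi", "salon"}
eval_c = {"cultur"}

print(resolve_words(candidates, eval_a, eval_b, eval_c))
# prints ['chang', 'cultur', 'societi']
```

In this toy input, “cultur” is kept because the third evaluator sides with the first, while “salon” is dropped because the third evaluator does not pick it.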
Table 1.8a Content words in the Dove advertising campaign for Real Beauty
Ad slogan
Dove, ad, campaign, real, beauty
In the ads
oversized, outstanding, fat, fit, true, squeeze, size, beauty, debate
flat, flattering, sexy, busty, beauty, debate
flawed, flawless, beautiful, skin, only, ever, spotless, beauty, debate
grey, gorgeous, more, women, feel, glad, beauty, debate
ugly, beauty, spots, skin, really, have, flawless, beautiful, beauty, debate
wrinkled, withered, wonderful, will, society, ever, accept, old, beautiful, debate
feature, real, women, curve
Table 1.8b ‘Social change’-related words
"Social or cultural change"-related words
Given as examples
change, traditional, culture, society
Chosen by evaluators
achieve, better, campaign, celebrate, change, culture, depart, differ, exhibit, grow, history, ideal, message, modern, moment, organize, people, popular, power, public, question, right, since, society, strong, think, time, traditional, use, way, will, women, young
Opposite words to physical beauty
Given as examples
inner, mind, real
Chosen by evaluators
brain, character, differ, grow, ideal, inner, kind, mind, quality, real, self, spirit, strong, talent, true
Table 1.8c ‘Beauty service or product’-related words
Given as examples
product, service, shop, sell, makeup, cosmetics, face, body, line
Chosen by evaluators
achieve, age, appear, better, blonde, body, care, classic, colour, cosmetics, cream, dove, elegant, eye, face, facial, feature, feel, female, feminine, figure, fit, girl, gorgeous, great, hair, hairdress, head, health, ideal, image, industry, inner, light, line, lip, magazine, makeover, makeup, market, model, modern, nail, perfect, physics, plastic, popular, pretty, product, quality, real, reveal, salon, sell, service, shop, skin, skincare, smile, spa, special, strength, strong, style, success, tan, therapy, touch, train, travel, ugly, use, wax, wear, worth, young, youth, true
Table 1.8d ‘Beauty contest’-related words
Given as examples
face, body, line, contest, queen, pageant, crown, competition, miss, universe
Chosen by evaluators
achieve, actress, america, american, award, best, body, character, charm, compete,
competition, confidence, contest, crown, event, face, fashion, female, figure,
gorgeous, hair, ideal, judge, lady, line, makeover, makeup, miss, model, pageant,
pretty, queen, sing, smile, talent, title, universe, wear, win, winner, women, won,
world, young, youth
Table 1.8e Movie related words
Given as examples
film, play, role, actor, actress, star, beast
Chosen by evaluators
actor, actress, agent, america, american, appear, art, artist, award, beast, career,
character, classic, comedy, director, drama, editor, end, film, hollywood, image,
industry, media, movie, mystery, picture, play, role, scene, screen, stage, star,
talent, watch
Table 1.8f There are 1 or 2 real-beauty-related topics in each country except New Zealand.
The U.S.
                            T1  T2  T3  T4  T5  T6  T7  T8  T9  T10
Real Beauty                  0  13   4   5  12   2   2   2   4   0
Beauty Product or Service    6  10   2  10   8   4   8   5   4   4
Beauty Contest               1   8   1   3   4   5   4   4   3   4
Movie                        4   1   0   2   3   2   5   3   5   1
Canada
                            T1  T2  T3  T4  T5  T6  T7  T8
Real Beauty                 11   3  19   2   6   7   4   1
Beauty Product or Service    4   9  11   5   8   8   5   7
Beauty Contest               1   5   6   3  13   3   3   5
Movie                        0   3   5   6   3   2   0   9
The U.K.
                            T1  T2  T3  T4  T5  T6  T7  T8  T9  T10
Real Beauty                 13   2   5   7   2   1   4   4   2   3
Beauty Product or Service    7  15   4   9   4   6   4   7   4   4
Beauty Contest               6   1   5   6   1   8   1   2   3   2
Movie                        5   0   5   7   2   3   1   0   1   1
Australia
                            T1  T2  T3  T4  T5  T6  T7
Real Beauty                  2  10   5   1   4   7   2
Beauty Product or Service    2   7   5   5   4   9  10
Beauty Contest              12   4   6   3   2   3   3
Movie                        1   8   4   4   3   1   1
New Zealand
                            T1  T2
Real Beauty                  3   3
Beauty Product or Service   12   8
Beauty Contest               3   8
Movie                        1   2
T represents a topic. We counted the number of matched words in each topic with those of the 4 topic
groups.
Table 1.8f shows the result of naming topics in the third step above. We find 1 or 2 real-beauty-
related topics in each country except New Zealand. Beauty products or services are the most
common beauty topics across the 5 analyzed countries. Beauty contest topics are also detected
everywhere except New Zealand. The U.S. and Canada also have movie topics. Note that we did
not label a topic if it had few matched words for its most related topic group (i.e. at most 4
words, which is 10% x 40 words in each topic) or if multiple topic groups tied for the largest
number of matched words.
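The labeling rule can be written as a short function. The sketch below is an illustration of the rule as stated above, not the code used in the analysis: the function name `label_topic` and the parameter `min_matches` are hypothetical, while the counts are copied from Table 1.8f for Canada and for topic 2 in New Zealand.

```python
GROUPS = ["Real Beauty", "Beauty Product or Service", "Beauty Contest", "Movie"]


def label_topic(matches, min_matches=5):
    """Return the topic group with the most matched words, or None if unlabeled.

    matches: matched-word counts in the order of GROUPS.
    A topic stays unlabeled when its best count is at most 4
    (10% of the 40 top words) or when two or more groups tie for the best count.
    """
    best = max(matches)
    if best < min_matches or matches.count(best) > 1:
        return None
    return GROUPS[matches.index(best)]


# Counts from Table 1.8f (Canada), one list per topic in the order of GROUPS.
canada = {
    "T1": [11, 4, 1, 0],
    "T2": [3, 9, 5, 3],
    "T3": [19, 11, 6, 5],
    "T4": [2, 5, 3, 6],
    "T5": [6, 8, 13, 3],
    "T6": [7, 8, 3, 2],
    "T7": [4, 5, 3, 0],
    "T8": [1, 7, 5, 9],
}
for topic, counts in canada.items():
    print(topic, label_topic(counts))

# Topic 2 in New Zealand: services and contests tie at 8, so it is not labeled.
print("NZ T2", label_topic([3, 8, 8, 2]))
```

Applied to the Canadian counts, the rule reproduces the labels in Table 1.7b: real beauty for topics 1 and 3, beauty services for topics 2, 6, and 7, beauty contest for topic 5, and movies for topics 4 and 8.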
To validate the selected real-beauty-related topics, we checked whether those topics
are driven by news content reporting either the Dove campaign itself or the underlying ad
messages of the campaign. Topics about the Dove campaign itself are likely to have many words
on (1) the campaign slogan and (2) the ad content in Table 1.8a. On the other hand, if newspapers
also discussed real beauty beyond the campaign itself, some real-beauty-related topics may use
the ‘social change’-related words in Table 1.8b more often than other topics.
Revisiting Tables 1.7a-c for the U.S., Canada, and the U.K., we find that all 5 topics identified
as real-beauty-related topics in the treated countries are validated by either directly ad-related
or implied ad-message-related words. For the U.S. in Table 1.7a, in topic 2, labeled as real beauty,
the campaign slogan words rank high: “real”, “dove” and “campaign” rank 13th, 45th and 49th.
These are the highest ranks across the 10 topics in the U.S. “Old”, “real”, “featur(e)”, “size”, and
“true” are content words in the ads. “Cultur(e)”, “power”, “ideal”, “tradit(ional)”, “organ(ize)”,
“grow”, and “popular” are words on social or cultural change. There are also words opposite to
physical beauty: “real”, “ideal”, “talent”, “grow”, and “true”.
Next, topic 5 seems to be about helping young women. “Help”, “young”, “women”, “femal(es)”,
“feel”, “free”, “better”, “differ(ent)”, and “societ(y)” are such keywords. With an additional
search on the Dove campaign, we learned that the campaign also aimed to help young girls have
confidence in their beauty (Dove website 2016). Note that we did not intend to include words
related to this goal in detecting real-beauty-related topics. However, we could identify this topic
because most of those words are related to social or cultural change in Table 1.8b. There are also
words opposite to physical beauty: “differ(ent)”, “mind”, and “self”. “Ad” ranks high, at 8th,
suggesting that topic 5 is also somewhat related to the ads.
However, note that “dove” and “campaign” are ranked much lower in topic 5 (571st and 527th)
than in topic 2 (45th and 49th), although “ad” ranks 8th in topic 5. This result suggests that
sentences related to real beauty in topic 5 are much less likely than those in topic 2 to come
from articles that reported on the Dove campaign itself. We will discuss this point below in
section 4.4 with stronger evidence.
In Canada (Table 1.7b), topics 1 and 3 are identified as real-beauty-related topics. In topic 3, the
campaign slogan words rank the highest among the 8 topics: “real”, “dove”, “ad” and
“campaign” rank 5th, 8th, 11th and 21st, suggesting that topic 3 is the most related to the Dove
campaign in Canada. As a result, content words in the ads, namely “featur(e)”, “celebr(ate)”,
“real”, “women”, and “size”, also rank relatively high. There are also many words related to
social and cultural change, for example “women”, “time”, “think”, “societ(y)”, “ideal”,
“campaign”, “cultur(e)”, “differ(ent)” and “messag(e)”. “Real”, “ideal”, “differ(ent)”, and
“brain” are words opposite to physical beauty.
Similarly to the U.S., in Canada, “dove” and “campaign” are included in the top 40 words in
topic 3 but not in topic 1. No ad content word is included in the top 40 words in topic 1 except
“reall(y)”. Nevertheless, there are many words related to social change: “peopl(e)”, “way”,
“strong”, “question”, “histor(y)”, “tradit(ional)”, “achiev(e)”, and “organ(ize)”. There are also
several words opposite to physical beauty: “strong”, “spirit”, and “qualit(y)”. Given that “look”
ranks first, topic 1 seems to “question” the “tradit(ional)” “way” or “physic(al)” “look”. As we
discussed before, topic 1 is not likely to be about the Dove campaign itself.
In the U.K., topic 1 is the only real-beauty-related topic. The campaign title words “real”,
“campaign”, and “dove” rank 3rd, 23rd, and 43rd. “Featur(e)” and “wonder(ful)” are ad content
words. “Chang(e)”, “modern”, “ideal”, “campaign”, “moment”, and “right” are words related to
social or cultural change. There are also several words opposite to physical beauty: “real”,
“kind”, “ideal”, “self”, “mind”, and “charact(er)”. Considering “surger(y)”, “physic(al)”, and
“plastic” at ranks 10, 18, and 35, topic 1 seems to “featur(e)” “real” beauty to “chang(e)”
“modern” “physic(al)” “thing(s)” “like” “plastic” “surger(y)”.
Among other beauty topics, beauty services (or products) were the most popular during the
analyzed periods in all 3 treated countries in terms of the number of topics. There are 4
beauty service (or product) topics in the U.S. Topic 1 talks about “cosmet(ics)”, “market”,
“surger(y)”, “lauder”, “perfect”, “youth”, “plastic”, “servic(e)”, and “brand”. Topic 4 has “look”,
“makeup”, “health”, “inner”, “ugl(y)”, “special”, “fit”, “strength”, “eleg(ant)”, “smile”, and
“travel”. Topic 7 seems to talk about “sell” or “use” of beauty “product” for “light” “eye(s)”,
“bod(y)”, “hair”, “lip” or “head”. Lastly, topic 8 also seems to deliver a similar theme with words
such as “natur(al)”, “great”, “care” or “cream” for “hair” or “line”.
In Canada, 3 topics are full of words related to beauty services (or products) in beauty salons or
shops, including “hair”, “colour”, “salon”, “great”, “magazin(e)”, “makeup”, “fashion”,
“service”, “health”, “treatment”, and “prett(y)” in topic 2; "shop", "feel", "made", "ugl(y)",
“spot”, "appear", "worth", and "smile" in topic 6; and “natur”, "young", "eye", "blond(e)", "fit"
and "lip" in topic 7.
The U.K. also has a large collection of such words: “salon”, “treatment”, “hair”, “therap(y)”,
“skin”, “cosmet(ics)”, "perfect", "head", “facial”, “spa”, “massag(e)”, “tan”, “care”, “hairdress”,
“better”, “nail” and “service” in topic 2; “fashion”, “bod(y)”, "young", "shop", "blond",
"appear", "imag", "youth", "cream", and “brand” in topic 4; and "made", "reveal", "colour",
"fit", "prett(y)", "success", and "great" in topic 8.
Next, beauty-contest-related topics existed in all the treated countries: “pageant”, “world”,
“photo”, “photograph”, “event”, "univers(e)", “crown” in topic 6 in the U.S.; "women",
“pageant”, “contest”, "miss", "bod(y)", “queen”, "won", "compet(e)", "titl(e)", "winner", "talent",
"wear", and "figur(e)" in topic 5 in Canada; and "world", “top”, "femal(e)", "win", “dress”,
"charm", "gorgeous", "wear", "award", and "event" in topic 6 in the U.K.
Lastly, we find that movies were also a frequently reported beauty topic in newspapers, perhaps
due to the connection with the beauty of actresses or movies on beauty (e.g., Beauty Shop). In
particular, the movie Beauty Shop was released on March 24, 2005. Topic 9 in the U.S. has
“play”, “style”, “charact(ers)”, “perform”, “actress”, and “classic”. Popular movie-related words
in Canada were “play”, “music”, “movie”, “agent”, “stor(y)”, “classic” and “comed(y)” in topic
4 and “product”, “film”, “canada”, “line”, “role”, “star”, “toronto”, “market”, “sing”, “actress”,
“drama” and “launch” in topic 8.
Now, we turn to topics in the control countries in Tables 1.7d and 1.7e for Australia and New
Zealand, respectively. Only topic 2 in Australia is selected as a real-beauty-related topic.
“Women”, “peopl(e)”, “use”, “young”, “sinc(e)”, “tradit(ional)”, “think”, and “power” are the
words chosen as social-change-related words. “Mind” is a word opposite to physical beauty.
Although “ad” ranks 11th, “dove” and “campaign” are not ranked in the top 40. Overall, this
result suggests that most sentences related to real beauty in topic 2 are not likely to come from
articles that reported on the Dove campaign itself. This makes sense because the Dove campaign
was not launched in Australia during the analyzed periods.
As in the treated countries, beauty services (or products) were the most popular topics in both
control countries. Three topics in Australia are about beauty services (or products), as evidenced by
“salon”, “eye”, “blond(e)”, “magazin(e)”, “shop”, and “fashion” in topic 4; “product”, “natur(al)”,
“fashion”, “skin”, “girl”, “market”, “featur(e)”, “skincar(e)”, “care”, “cream”, “eleg(ant)”, and
“better” in topic 6; and “hair”, “therap(y)”, “makeup”, “spot”, “cosmet(ics)”, “bod(y)”, “colour”,
“nail”, “surger(y)”, “skin”, “wax”, and “travel” in topic 7. Similarly, topic 1 in New Zealand is
full of beauty service (or product) words such as “therap(y)”, “salon”, “treatment”, “product”,
“clinic”, “massag(e)”, “nail”, “industr(y)”, “hair”, “tan”, “spa”, “style”, “shop”, “facial”, “girl”,
and “perfect”.
Beauty-contest-related topics were also popular in the control countries. In Australia, such words
are “miss”, “pageant”, “contest”, “australia”, “world”, “univers(e)”, “queen”, “crown”, “win”,
“titl(e)”, “judg(e)”, “competit(ion)”, and “charm” in topic 1; and “model”, “world”, “face”,
“show”, “fashion”, “women”, and “talent” in topic 3. Topic 2 in New Zealand has the same
number (8) of words about both beauty services and beauty contests. This makes sense since
the two topic groups are highly associated with each other. We did not label this topic 2.
In summary, we identified real-beauty-related topics during the analyzed periods. The themes
discovered in these topics across the three treated countries are very similar to each other and
to the message of the Dove campaign.
Now, we are ready to categorize beauty sentences into the most relevant topics. We assigned
beauty sentence $x_i$ to topic $k$ if $\omega_{ik} = \max(\omega_{i1}, \dots, \omega_{iK})$. Then, we
counted the number of sentences in each topic. Alternatively, one could use the topic weight
$\omega_{ik}$ itself without categorizing each sentence into a particular topic group. Suppose that
there are two topics and one sentence has 90% and 10% topic weights on topics 1 and 2,
respectively. At first glance, it seems natural to use the weights (90% and 10%) as the
contributions of the topics to the sentence. This is likely to be appropriate when the unit of
analysis is a newspaper article. However, one sentence is likely to talk about one topic
(Büschken & Allenby 2016). In fact, even when a sentence is almost entirely about topic 1, its
topic weight rarely reaches 100% because some words are common across topics. This is
why we categorized sentences into the topic group with the largest topic weight. We then show
the result using topic weights as a robustness check.
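The two counting schemes can be contrasted in a small example. The sketch below is illustrative only: the topic weights are made-up numbers for three sentences and two topics, and `assign_topic` is a hypothetical helper name. Hard assignment gives each sentence to its largest-weight topic, while the robustness check sums the weights themselves.

```python
from collections import Counter


def assign_topic(weights):
    """Index k of the largest topic weight for one sentence (hard assignment)."""
    return max(range(len(weights)), key=lambda k: weights[k])


# Hypothetical topic weights: one row per sentence, one column per topic.
sentence_weights = [
    [0.9, 0.1],
    [0.6, 0.4],
    [0.2, 0.8],
]
n_topics = 2

# Main measure: number of sentences assigned to each topic.
hard_counts = Counter(assign_topic(w) for w in sentence_weights)

# Robustness measure: sum of topic weights across sentences.
soft_counts = [sum(w[k] for w in sentence_weights) for k in range(n_topics)]

print([hard_counts[k] for k in range(n_topics)])  # prints [2, 1]
print([round(s, 2) for s in soft_counts])         # prints [1.7, 1.3]
```

In practice these counts would be computed separately for each country-topic and year-month, which gives the dependent variable used in the tests below.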
Figure 1.3 Trend of beauty topics
Figure 1.3a The number of sentences labeled as real beauty topics increases relative to that labeled
as other beauty topics during the Dove campaigns for Real Beauty in the U.S. and Canada, respectively.
[Panels: U.S.; Canada]
Figure 1.3b There is no systematic pattern in either (1) the number of sentences labeled as real beauty
topics or (2) that labeled as other beauty topics during the Dove campaigns in Australia.
The dependent variable is the number of sentences per topic.
Campaign periods: U.S. (first: Sep. and Oct. 2004; second: July and Aug. 2005);
Canada (first: Sep. and Oct. 2004; second: June and July 2005).
Figure 1.3a compares the monthly trends of real beauty vs. other beauty topics in the U.S. and
Canada, respectively. There are three clear patterns. First, during both the first and the
second campaigns in the U.S. and Canada, real-beauty-related sentences increase while other
beauty sentences show a decreasing or relatively stable pattern.
Second, after each campaign, the number of real-beauty-related sentences quickly decreases. This
pattern is expected since newspapers are always looking for “news” and thus move
on to other topics once the event finishes.
Third, there are also jumps in real-beauty-related topics before the second campaign. This is
likely to be due to the launch of a new movie. Before the second campaign, the movie “Beauty
Shop” was released in April 2005. Similarly, before the first campaign, several movies about girls
(“Mean Girls” and “The Girl Next Door”) were released in April 2004. As the release of new movies
seems to increase several beauty topics including real-beauty-related topics, there might be other
events that increase both real beauty and other beauty topics. This pattern suggests that we need
other beauty topics as controls.
On the other hand, there is no systematic trend in real vs. other beauty topics in Australia, as seen
in Figure 1.3b.
1.4.3 Testing
Next, by exploiting variation in the timing of the campaigns across countries, we test whether
the number of sentences labeled as real beauty topics in the previous section increases during the
month(s) of the real beauty campaign compared to those categorized as non-“real beauty” topics
across countries. We start with a difference-in-differences using only data from the treated
countries, which had Dove campaigns during the analyzed periods. We will then add a third
difference between the treated countries and the control countries. The two real beauty topics in
each of the U.S. and Canada are aggregated into one topic because we measure the aggregate
impact of advertising on the real-beauty-related topics. Because we extracted country-specific
topics rather than common topics across all the analyzed countries, owing to inter-country
differences in the words used to define each topic, we allow for country-topic and time fixed
effects.
The number of sentences labeled as topic $k$ in country $c$ in year-month $t$ is
$$\text{No. of Beauty Sentence}_{kct} = \beta \, \text{Real Beauty}_{kc} \times \text{During Campaign}_{ct} + \mu_{kc} + \tau_t + \varepsilon_{kct} \quad (2)$$
where
- $\beta$ captures the core effect in this paper: the impact of advertising on real-beauty-related
topics compared to all the other topics;
- $\text{Real Beauty}_{kc} = 1$ if $k$ is a real-beauty-related topic in country $c$, 0 otherwise;
- $\text{During Campaign}_{ct} = 1$ if $t$ is a campaign period in country $c$, 0 otherwise;
- $\mu_{kc}$ is a country-topic specific fixed effect that captures differences in the number of
sentences across countries and topics;
- $\tau_t$ is a year-month specific fixed effect;
- $\varepsilon_{kct}$ is the error term.
Equation (2) is estimated by OLS. Our identification assumption is that there is
no systematic factor that drives the firm’s decision on campaign timing and location so as to
coincide with media coverage on real beauty, nor an omitted variable that drives real
beauty reporting in the U.S., Canada, and the U.K. during the campaigns. Heteroskedasticity-
robust standard errors are clustered at the country-topic level to adjust for correlation within each
country’s topic across the analyzed months.
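To make the estimator concrete, the sketch below recovers the interaction coefficient of Equation (2) on simulated data for a single country. Everything in it is an assumption for illustration: the data are generated with a known effect of 30 sentences, the campaign months are arbitrary, and the names `twfe_did` and `simulate` are hypothetical. It relies on the fact that, in a balanced panel, OLS with topic and month dummies is equivalent to OLS on double-demeaned variables. It returns only the point estimate and does not compute the clustered standard errors used in the tables.

```python
import random


def twfe_did(y, x):
    """OLS slope on x with topic and month fixed effects (balanced panel).

    y, x: lists of lists indexed [topic][month].
    """
    K, T = len(y), len(y[0])

    def demean(m):
        # Subtract topic means and month means, then add back the grand mean.
        row = [sum(m[k]) / T for k in range(K)]
        col = [sum(m[k][t] for k in range(K)) / K for t in range(T)]
        grand = sum(row) / K
        return [[m[k][t] - row[k] - col[t] + grand for t in range(T)]
                for k in range(K)]

    yd, xd = demean(y), demean(x)
    num = sum(xd[k][t] * yd[k][t] for k in range(K) for t in range(T))
    den = sum(xd[k][t] ** 2 for k in range(K) for t in range(T))
    return num / den


def simulate(beta=30.0, sigma=1.0, K=8, T=24, campaign=(8, 9, 18, 19), seed=0):
    """Simulated sentence counts: topic 0 is 'real beauty' (hypothetical setup)."""
    rng = random.Random(seed)
    mu = [rng.uniform(20, 80) for _ in range(K)]   # topic fixed effects
    tau = [rng.uniform(-5, 5) for _ in range(T)]   # year-month fixed effects
    # x = Real Beauty x During Campaign indicator.
    x = [[1.0 if (k == 0 and t in campaign) else 0.0 for t in range(T)]
         for k in range(K)]
    y = [[beta * x[k][t] + mu[k] + tau[t] + rng.gauss(0, sigma)
          for t in range(T)] for k in range(K)]
    return y, x


y, x = simulate()
print(round(twfe_did(y, x), 2))  # close to the simulated effect of 30
```

With the noise switched off (`sigma=0.0`), the estimate equals the simulated effect up to rounding error, which is a convenient check that the fixed effects are removed correctly.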
Table 1.9a shows our main result for Equation (2). Column (1) uses only real-beauty-related
topics in the treated countries. Note that not all the Dove campaigns were run at the same time
across the treated countries. Thus, the real-beauty topics in the other treated countries serve as
the control for those in the focal country. Columns (2) through (5) add other-beauty-related topics
as controls. Columns (2), (3), and (4) are for Canada, the U.S., and the U.K., respectively. As
discussed above with respect to Figure 1.3a, other events might increase both real beauty and
other beauty topics. Thus, we control for other beauty topics. Furthermore, given that we want to
test whether the Dove campaign increased discussion of real beauty compared to the media’s
conventional beauty topics such as beauty services or beauty contests, by controlling for other
beauty topics, the difference-in-differences gives us the change in real-beauty topics relative to
other beauty topics. Column (5) uses all the data across the 3 treated countries. Across all 5
columns, the coefficients for the interaction effect are significantly positive, suggesting that
there is higher coverage of real-beauty-related topics in newspapers during the campaign than in
non-campaign periods compared to real-beauty-related topics in other countries in Column (1) or
to the other beauty topics within each country in Columns (2) to (4). Using all the data in
Column (5), the increase is about 33 sentences. As we discussed before, we also counted the
number of sentences using topic weights without assigning each sentence to a particular topic
group as a robustness check. Table 1.A1 shows that the main result still holds.
Table 1.9a The number of sentences labeled as real beauty topics increases relative to that as other
beauty topics in the treated countries during the month(s) of the real beauty campaign.
                                Only Real Beauty   Adding other beauty topics
                                                   Canada    US        UK        All
                                (1)                (2)       (3)       (4)       (5)
Real Beauty X During Campaign   29.7***            38.7***   23.9***   40.9***   33.2***
                                (3.35)             (2.80)    (1.53)    (1.92)    (5.20)
Country-Topic Dummies           2                  7         9         10        25
Year-Month Dummies              23                 23        23        23        23
R-sq                            0.775              0.720     0.790     0.615     0.671
Observations                    72                 168       216       240       624
The dependent variable is the number of sentences in each country-topic, which is the unit of analysis.
The U.S., Canada, and the U.K. have 10(2), 8(2), and 10(1) topics (real beauty ones in parentheses), respectively.
The two real beauty topics in each of the U.S. and Canada are aggregated into one topic.
Column (1) uses only real-beauty-related topics in the treated countries.
Columns (2) through (5) add other-beauty-related topics.
Columns (2), (3), and (4) use data from Canada, the U.S., and the U.K., respectively.
Column (5) uses all the data across the 3 treated countries.
Robust standard errors are clustered at the country-topic level. ***p < 0.01
As a falsification test, Table 1.9b uses the same Equation (2) but only data from the control
countries. The effect is insignificant, suggesting that there is no comparable increase in
newspaper coverage of real-beauty-related topics in the control countries, as seen in Figure 1.3b.
This is expected because the campaign started later in those two countries.
Table 1.9b Falsification test: The number of sentences labeled as real beauty topics does not increase
relative to that as other beauty topics in the control countries during the month(s) of the real beauty
campaign.

Real Beauty X During Campaign     -0.60
                                  (0.85)
Country-Topic Dummies             8
Year-Month Dummies                23
R-sq                              0.502
Observations                      216

Dependent variable is the number of sentences in each country-topic, which is the unit level of analysis.
New Zealand and Australia have 7(1) and 2(0) topics (real beauty ones), respectively.
Recall that the identifying assumption for Table 1.9a is that no factor unrelated to the Dove
campaigns affected newspaper coverage of real beauty. However, there could be time-varying
global interest in beauty as a social issue. To rule out this potential explanation, we compare
the change in the treated countries with that in the control countries during and one month
after the campaigns as follows.
\[
\begin{aligned}
\text{No. of Beauty Sentences}_{kct} ={}& \beta_1\,\text{Real Beauty}_{kc} \times \text{During Campaign}_{ct} \times \text{Treated}_{c} \\
&+ \beta_2\,\text{Real Beauty}_{kc} \times \text{One month after Campaign}_{ct} \times \text{Treated}_{c} \\
&+ \beta_3\,\text{Real Beauty}_{kc} \times \text{During Campaign}_{t} \\
&+ \beta_4\,\text{Real Beauty}_{kc} \times \text{One month after Campaign}_{t} + \mu_{kc} + \tau_{t} + \varepsilon_{kct}
\end{aligned}
\tag{3}
\]
where
- $\beta_1$ and $\beta_2$ capture the core effects in this paper: the impact of advertising on
real-beauty-related topics compared to all the other topics in treated countries relative to control
countries during and one month after the campaigns, respectively;
- $\text{Treated}_{c} = 1$ if the campaign was launched in country $c$ during the analyzed periods, 0
otherwise;
- $\text{Real Beauty}_{kc} = 1$ if $k$ is a real-beauty-related topic in country $c$, 0 otherwise;
- $\text{During Campaign}_{ct} = 1$ if $t$ is a campaign period in country $c$, 0 otherwise;
- $\text{One month after Campaign}_{ct} = 1$ if $t$ is one month after the campaign in country $c$, 0
otherwise;
- $\text{During Campaign}_{t} = 1$ if $t$ is a campaign period in any treated country, 0 otherwise;
- $\text{One month after Campaign}_{t} = 1$ if $t$ is one month after the campaign in any treated
country, 0 otherwise;
- $\mu_{kc}$ is a country-topic-specific fixed effect that captures differences in the number of
sentences across countries and topics;
- $\tau_{t}$ is a year-month-specific fixed effect;
- $\varepsilon_{kct}$ is the error term.
This “differences-in-differences-in-differences” specification combines the insights of Tables 1.9a
and 1.9b. We use the ad effect on real-beauty-related topics in the control countries to account
for changes over time (during and one month after the campaign) in the baseline of media
coverage of real-beauty-related topics, captured by the coefficients $\beta_3$ and $\beta_4$.
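The four interaction terms in Equation (3) can be made concrete with a small sketch that builds them for one country-topic-month cell. This is only an illustration: the campaign months are taken from the notes to Table 1.11a, while the reading of "one month after" as the first non-campaign month following a campaign, the month strings, and all function and key names are assumptions introduced here.

```python
# Campaign months by treated country, copied from the notes to Table 1.11a.
CAMPAIGN_MONTHS = {
    "US": {"2004-09", "2004-10", "2005-07", "2005-08"},
    "Canada": {"2004-09", "2004-10", "2005-06", "2005-07"},
    "UK": {"2005-01"},
}


def next_month(ym):
    """Return the 'YYYY-MM' string for the month following ym."""
    year, month = map(int, ym.split("-"))
    return f"{year + month // 12}-{month % 12 + 1:02d}"


# Assumption: "one month after" is the first non-campaign month after a campaign.
AFTER_MONTHS = {
    country: {next_month(m) for m in months} - months
    for country, months in CAMPAIGN_MONTHS.items()
}
ANY_DURING = set().union(*CAMPAIGN_MONTHS.values())
ANY_AFTER = set().union(*AFTER_MONTHS.values())


def equation3_regressors(country, is_real_beauty, ym):
    """Build the four interaction dummies of Equation (3) for one cell."""
    rb = int(is_real_beauty)
    treated = int(country in CAMPAIGN_MONTHS)
    during_ct = int(ym in CAMPAIGN_MONTHS.get(country, set()))
    after_ct = int(ym in AFTER_MONTHS.get(country, set()))
    return {
        "rb_x_during_x_treated": rb * during_ct * treated,  # beta_1 term
        "rb_x_after_x_treated": rb * after_ct * treated,    # beta_2 term
        "rb_x_during_any": rb * int(ym in ANY_DURING),      # beta_3 term
        "rb_x_after_any": rb * int(ym in ANY_AFTER),        # beta_4 term
    }


print(equation3_regressors("US", True, "2004-09"))
print(equation3_regressors("New Zealand", True, "2004-09"))
```

For a real beauty topic in a control country, only the baseline terms can switch on, which is how the control countries absorb any common time pattern in real beauty coverage.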
Table 1.9c shows our main results, building up to the full specification for Equation (3) in
column (2). Column (1) reflects the results of Tables 1.9a and 1.9b by comparing the ad effects on
real-beauty-related topics between the treated and the control countries only during the
campaigns. The key coefficient of interest, on
$\text{Real Beauty}_{kc} \times \text{During Campaign}_{ct} \times \text{Treated}_{c}$, is
positive and significant. The effect size is similar to that in Table 1.9a. It suggests that
newspapers in the U.S., Canada, and the U.K. reported about 70% more on real-beauty-related
topics during the campaign, compared to the other beauty topics and relative to the control
countries. The coefficient for $\text{Real Beauty}_{kc} \times \text{During Campaign}_{t}$ is
insignificant, suggesting that there is no systematic change during the campaign in media
coverage of real-beauty-related topics in the control countries.
Column (2) adds terms to capture the ad effect one month after the campaigns. The
coefficient for
$\text{Real Beauty}_{kc} \times \text{One month after Campaign}_{ct} \times \text{Treated}_{c}$
is positive and significant, although the effect size is much smaller than that for “during the
campaigns”. The corresponding baseline coefficient for
$\text{Real Beauty}_{kc} \times \text{One month after Campaign}_{t}$ is significantly negative.
Together, these coefficients suggest that newspapers showed much less interest in
real-beauty-related topics one month after the campaigns than during the campaigns across the
treated and the control countries. However, one month after the campaigns, newspapers in the
campaign countries still wrote more about real beauty, compared to the other beauty topics and
relative to those in the control countries. We also find that the ad effect had almost
disappeared by the second month after the campaigns; see Table 1.A2 in the appendix for the
result. In summary, newspapers in our data wrote more about real beauty during and one month
after the campaigns.
Table 1.9c The number of sentences labeled as real beauty topics increases relative to that as other
beauty topics in the treated countries relative to control countries during and one month after the Real
Beauty campaign.

                                       (1)            (2)
                                       Only During    During & After
Real Beauty x Treated Countries
  x During Campaign                    32.90***       33.68***
                                       (5.106)        (4.956)
  x One Month After Campaign                          5.157***
                                                      (1.965)
Real Beauty
  x During Campaign                    0.066          -0.606
                                       (0.647)        (0.735)
  x One Month After Campaign                          -4.047***
                                                      (0.901)
Country-specific topic dummies         34             34
Year-month dummies                     23             23
R-sq                                   0.738          0.738
Observations                           840            840

Dependent variable is the number of sentences in each country-topic, which is the unit level of analysis.
Treated countries, the U.S., Canada, and the U.K., have 10(2), 8(2), and 10(1) topics (real beauty ones),
respectively. Two real beauty topics in each country are aggregated into one topic.
Control countries, New Zealand and Australia, have 7(1) and 2(0) topics (real beauty ones), respectively.
Robust standard errors are clustered at the country-topic level. ***p < 0.01
1.4.4 Mechanism
1.4.4.1 How does the advertising message affect the content of a newspaper?
In the previous section, we showed that real-beauty-related content increased with Dove’s real
beauty advertising campaign. Next, we explore how the advertising message affects newspaper
content. Potential scenarios are as follows. First, newspapers may have reported the Dove
advertising campaign itself and not altered content in any other meaningful way. Second,
newspapers may have discussed real beauty in articles that are not directly related to the Dove
campaign. Third, newspapers may have talked more about real beauty while referencing the
Dove campaign within the same article.
Access to the full newspaper text, in both the title and the body, allowed us to test whether
real-beauty-related topics exist even in non-Dove articles. Among the beauty articles we
downloaded, we used the keyword “Dove” to identify articles related to the Dove campaign. After
removing either all the sentences within those articles or only the sentences that mentioned
“Dove,” we repeated the test from the previous section to see whether beauty sentences labeled
as real-beauty-related topics increased with the Dove campaign.
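The four removal rules can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the study: the `Article` structure, the rule names, the whole-word keyword match, and the example articles are all hypothetical.

```python
import re
from dataclasses import dataclass

# Assumption: a case-insensitive whole-word match on the keyword "Dove".
DOVE = re.compile(r"\bdove\b", re.IGNORECASE)


@dataclass
class Article:
    title: str
    abstract: str
    sentences: list  # beauty sentences from the article body


def has_dove(text):
    return bool(DOVE.search(text))


def kept_sentences(articles, rule):
    """Return the beauty sentences that survive a removal rule.

    rule: "all" keeps everything; "title", "abstract", and "anywhere" drop
    every sentence of an article mentioning Dove in that place; "sentence"
    drops only the individual sentences that mention Dove.
    """
    kept = []
    for art in articles:
        in_title = has_dove(art.title)
        in_abstract = has_dove(art.abstract)
        in_body = any(has_dove(s) for s in art.sentences)
        if rule == "title" and in_title:
            continue
        if rule == "abstract" and in_abstract:
            continue
        if rule == "anywhere" and (in_title or in_abstract or in_body):
            continue
        if rule == "sentence":
            kept.extend(s for s in art.sentences if not has_dove(s))
        else:
            kept.extend(art.sentences)
    return kept


# Hypothetical articles, invented for illustration only.
articles = [
    Article("Dove launches campaign for real beauty", "Dove ads feature everyday women.",
            ["Dove shows women of all sizes.", "Beauty is more than a dress size."]),
    Article("What counts as beautiful?", "Readers react to the Dove billboards.",
            ["Real beauty is about confidence.", "Self-esteem starts early."]),
    Article("Rethinking the mirror", "A column on body image.",
            ["The Dove ads started a debate.", "Women want realistic role models."]),
    Article("Spring makeup trends", "New colours for the season.",
            ["Bold lipstick is back."]),
]

for rule in ("all", "title", "abstract", "anywhere", "sentence"):
    print(rule, len(kept_sentences(articles, rule)))
```

The "anywhere" rule is the most conservative because it also discards sentences that never mention Dove but happen to sit in an article that does.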
Columns (1) to (5) in Table 1.10 show the results for Equation (3) using the new data after
deleting Dove sentences. Column (1) is the base result from all of the original beauty sentences,
identical to Column (2) in Table 1.9c. Columns (2), (3), and (4) delete all the beauty sentences
within articles that include “Dove” in the title, in the abstract, and anywhere (title, abstract,
or body), respectively. While some sentences in these articles talk about the Dove campaign,
others may discuss real beauty without mentioning “Dove”. Thus, Column (4) is a very
conservative test. We focus on the interpretation of
$\text{Real Beauty}_{kc} \times \text{During Campaign}_{ct} \times \text{Treated}_{c}$ in the
first row because the rest of the coefficients show patterns similar to those in Table 1.9c. The
effect sizes in Columns (2), (3), and (4) decrease by 18%, 61%, and 74%, respectively, from the
base effect in Column (1). This suggests two things. First, many newspaper articles indeed
reported on the Dove campaign itself, although the extent varied: while some articles whose
title mentioned Dove mainly talked about the Dove campaign, others mentioned Dove just once in
their body. Second, and more importantly, the coefficient in Column (4) is still significant,
suggesting that newspapers discussed real beauty even in non-Dove articles. Overall, these
results support the second rather than the first scenario. In other words, this is evidence that
the Dove advertising message affected how newspapers write about beauty.
On the other hand, Column (5) deletes only the sentences that mention “Dove”. Its effect size
decreases by 33% but remains much higher than that of Column (4). This result supports the third
scenario and suggests that newspapers may have talked more about real beauty in connection with
the Dove campaign within the same article. This type of newspaper article may have been the
format desired by marketing or PR managers. By reading such articles, consumers may have
associated the brand (e.g., Dove) with its desired image (e.g., real beauty, self-esteem).
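The relative declines discussed above are simple ratios of the first-row coefficients. The sketch below recomputes them from the coefficients as printed in Table 1.10; the dictionary labels are introduced here for illustration, and values computed this way may differ by a few points from the rounded figures quoted in the text.

```python
# Coefficients on Real Beauty x Treated x During Campaign, as printed in Table 1.10.
coefficients = {
    "(1) all beauty sentences": 33.68,
    "(2) drop articles with Dove in title": 27.51,
    "(3) drop articles with Dove in abstract": 13.25,
    "(4) drop articles with Dove anywhere": 9.031,
    "(5) drop only Dove sentences": 23.56,
}

base = coefficients["(1) all beauty sentences"]

# Percentage decline of each column relative to the base effect in Column (1).
declines = {
    label: 100 * (base - coef) / base
    for label, coef in coefficients.items()
}

for label, pct in declines.items():
    print(f"{label}: {pct:.1f}% below the base effect")
```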
Table 1.10 The significant impact of the campaign on real-beauty-related topics is not driven by
reporting of the Dove campaign.

                                          All beauty    Dropping articles with Dove in            Dropping only
                                          sentences     Title         Abstract      Anywhere      Dove sentences
                                          (1)           (2)           (3)           (4)           (5)
Real Beauty x Treated
  x During Campaign                       33.68***      27.51***      13.25***      9.031***      23.56***
                                          (4.956)       (3.797)       (1.744)       (1.028)       (2.864)
  x One month after Campaign              5.157***      4.876***      5.289***      5.939***      5.217***
                                          (1.965)       (1.674)       (1.744)       (1.717)       (1.892)
Real Beauty
  x During Campaign                       -0.606        -0.385        -0.695        -0.811        -1.436*
                                          (0.735)       (0.733)       (0.723)       (0.702)       (0.733)
  x One month after Campaign              -4.047***     -4.079***     -4.182***     -4.141***     -4.099***
                                          (0.901)       (0.903)       (0.910)       (0.900)       (0.906)
Country-topic dummies                     34 (all columns)
Year-month dummies                        23 (all columns)
Observations                              840 (all columns)
R-sq                                      0.738         0.735         0.720         0.716         0.728
Control countries                         New Zealand & Australia (all columns)

Dependent variable is the number of sentences in each country-topic, which is the unit level of analysis.
Treated countries, the U.S., Canada, and the U.K., have 10(2), 8(2), and 10(1) topics (real beauty ones),
respectively. Two real beauty topics in each country are aggregated into one topic.
Control countries, New Zealand and Australia, have 7(1) and 2(0) topics (real beauty ones), respectively.
Robust standard errors are clustered at the country-topic level. ***p < 0.01; *p < 0.10
1.4.4.2 Social issue advertising and the mass media’s public goal
Now, we explore why social issue advertising can affect the content of mass media. In addition
to a profit goal, the mass media has a (non-economic) public goal of serving the public interest in
desired social or cultural change (McQuail 2010). Typical societal issues in social issue
advertising (or public service announcements) are human rights, environmentalism, voting,
smoking, and donations. The Dove campaign targets boosting self-esteem. Therefore, the mass
media is likely to deliver the message of social issue advertising in order to make a positive
impact on the community. Considering the media’s interest in social issues, firms that implement
social issue advertising set “mass media coverage” goals (Drumwright 1996). Interviewing 11
firms about both the standard and social issue advertising that each firm has made, Drumwright
(1996) finds that firms observed higher media coverage for social issue campaigns than for
standard campaigns. One company reported that social issue advertising obtained earned media
(i.e., free publicity) valued at six times the expenditure on paid media.
If the media indeed reported the Dove campaign actively in order to serve the public interest
(i.e., desirable social and cultural change), one would observe new words about such social
change emerging during the campaigns. First, in the previous section (Table 1.10), we showed
that the number of “real beauty”-related sentences increased significantly during the Dove
campaigns even in newspaper articles that do not mention Dove.
Next, as discussed earlier with Figure 1.2, we also showed a suggestive pattern that
social-change-related words (in Table 1.8b) chosen by the evaluators were used more frequently
in the beauty sentences during the campaigns across the U.S., Canada, and the U.K.
Table 1.11a Rising social or cultural change words within real beauty topics during the campaigns

U.S.
Word        First campaign       Second campaign      R-sq
women       -1.1 (3.36)          13.4*** (3.36)       0.44
campaign    0.65 (0.93)          11.15*** (0.93)      0.87
tradit      0.35 (1.13)          3.85*** (1.13)       0.35
differ      -0.15 (0.68)         3.85*** (0.68)       0.61
old         1.65 (1.84)          3.65* (1.84)         0.17
sinc        2.45*** (0.69)       -0.55* (0.69)        0.40
peopl       1.35 (0.79)          2.35*** (0.79)       0.35
young       0.25 (1.07)          2.25** (1.07)        0.17
messag      0.00 (0.72)          2** (0.72)           0.27
way         0.25 (0.73)          1.75** (0.73)        0.22
organ       0.2 (0.6)            1.7*** (0.6)         0.28
chang       0.25 (0.64)          1.25* (0.64)         0.15
modern      0.95* (0.51)         -0.05* (0.51)        0.14
popular     0.65 (0.53)          1.65*** (0.53)       0.34
better      0.5*** (0.11)        1*** (0.11)          0.81
grow        -0.2 (0.43)          0.8* (0.43)          0.15

Canada
Word        First campaign       Second campaign      R-sq
women       18.55*** (4.66)      17.55*** (4.66)      0.57
peopl       4.6** (1.66)         8.6*** (1.66)        0.61
achiev      8.55*** (1.88)       -0.45 (1.88)         0.50
strong      7.15*** (1.54)       -0.35 (1.54)         0.51
old         0.8 (1.61)           6.8*** (1.61)        0.46
campaign    4.75*** (1.34)       6.75*** (1.34)       0.62
think       0.5 (1.61)           6.5*** (1.61)        0.44
differ      0.05 (0.5)           5.55*** (0.5)        0.86
chang       1.4* (0.7)           3.4*** (0.7)         0.55
ideal       3.4*** (0.71)        2.9*** (0.71)        0.63
exhibit     0.2 (1.09)           2.7** (1.09)         0.22
cultur      1.7 (1.19)           2.7** (1.19)         0.24
young       2.7*** (0.89)        -0.3 (0.89)          0.31

U.K.
Word        During campaign      R-sq
way         3.57*** (1.1)        0.32
new         2.65* (1.53)         0.12
achiev      2.48*** (0.81)       0.30
young       2.39*** (0.6)        0.42
think       2.3** (0.99)         0.20
cultur      2.26** (0.94)        0.21
old         2.09* (1.06)         0.15
use         2.09* (1.06)         0.15
power       2.04* (1.09)         0.14
grow        1.7** (0.65)         0.24
question    1.61* (0.8)          0.16
strong      0.83** (0.4)         0.17

Each word with 24 observations (months) is estimated separately within a country.
Dependent variable is monthly word frequency within real beauty topics.
Campaign periods: U.S. (first: Sep. and Oct. 2004; second: Jul. and Aug. 2005), Canada (first: Sep. and Oct. 2004; second: Jun. and Jul. 2005), U.K. (Jan. 2005)
***p < 0.01; ** p < 0.05; *p < 0.10
Table 1.11b Rising opposite words to physical beauty within real beauty topics during the campaigns

U.S.
Word        First campaign       Second campaign      R-sq
real        0.55 (1.72)          15.05*** (1.72)      0.78
talent      0.8 (0.96)           4.8*** (0.96)        0.54
differ      -0.15 (0.68)         3.85*** (0.68)       0.61
spirit      1.35** (0.53)        0.35* (0.53)         0.24
charact     1.15** (0.49)        -0.35* (0.49)        0.24
brain       0.3 (0.39)           0.8* (0.39)          0.18
true        1.9*** (0.51)        -0.1* (0.51)         0.41
grow        -0.2 (0.43)          0.8* (0.43)          0.15

Canada
Word        First campaign       Second campaign      R-sq
real        -0.4 (1.56)          10.6*** (1.56)       0.69
strong      7.15*** (1.54)       -0.35* (1.54)        0.51
spirit      6.8*** (1.63)        -0.2* (1.63)         0.46
differ      0.05 (0.5)           5.55*** (0.5)        0.86
ideal       3.4*** (0.71)        2.9*** (0.71)        0.63
brain       0.65 (0.83)          2.65*** (0.83)       0.33

U.K.
Word        During campaign      R-sq
grow        1.7** (0.65)         0.24
strong      0.83** (0.4)         0.17

Each word with 24 observations (months) is estimated separately within a country.
Dependent variable is monthly word frequency within real beauty topics.
Campaign periods: U.S. (first: Sep. and Oct. 2004; second: Jul. and Aug. 2005), Canada (first: Sep. and Oct. 2004; second: Jun. and Jul. 2005), U.K. (Jan. 2005)
***p < 0.01; ** p < 0.05; *p < 0.10
Lastly, among social-change-related words, we check which words were used more frequently in
the real-beauty-related topics during the campaigns. We ran a separate OLS regression for each
word. In Table 1.11a, 16, 13, and 12 words related to social or cultural change show positive and
significant coefficients in either the first or the second campaign in the U.S., Canada, and the
U.K., respectively. In Table 1.11b, the same holds for 8, 6, and 2 words that are opposites of
physical beauty. These results suggest that many social-change-related words were used more
frequently in real-beauty-related sentences in newspapers during the campaigns. In other words,
this is evidence for the media’s public goal mechanism as an explanation of why social issue
advertising affects mass media content.
Note that, in the U.S., most significant words occurred in the second campaign, as shown in
Table 1.11b. This result is consistent with the finding in Table 1.5a that campaign title words
rank high only in the second campaign.
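A word-level regression of this kind can be sketched in a few lines. The specification below (an intercept plus one dummy per campaign, with classical standard errors) is an assumption chosen because it is the simplest model consistent with the "First" and "Second" columns of Tables 1.11a and 1.11b; the monthly counts are hypothetical. With mutually exclusive dummies the model is saturated, so each coefficient equals the campaign-period mean minus the non-campaign mean.

```python
from math import sqrt
from statistics import mean


def campaign_ols(freq, first, second):
    """OLS of monthly frequency on an intercept and two campaign dummies.

    Returns (coefficient, standard error) per campaign and the R-squared.
    """
    base = [y for i, y in enumerate(freq) if i not in first and i not in second]
    groups = {"first": [freq[i] for i in first], "second": [freq[i] for i in second]}
    base_mean = mean(base)

    # Residual sum of squares: deviations from each period's own mean.
    ssr = sum((y - base_mean) ** 2 for y in base)
    for ys in groups.values():
        ssr += sum((y - mean(ys)) ** 2 for y in ys)
    sst = sum((y - mean(freq)) ** 2 for y in freq)
    sigma2 = ssr / (len(freq) - 3)  # three estimated parameters

    result = {"r_squared": 1 - ssr / sst}
    for name, ys in groups.items():
        coef = mean(ys) - base_mean
        se = sqrt(sigma2 * (1 / len(ys) + 1 / len(base)))
        result[name] = (coef, se)
    return result


# Hypothetical counts for one word, January 2004 to December 2005 (index 0 to 23).
freq = [1, 2] * 12
freq[8], freq[9] = 2, 2        # first U.S. campaign: Sep. and Oct. 2004
freq[18], freq[19] = 12, 14    # second U.S. campaign: Jul. and Aug. 2005

fit = campaign_ols(freq, first=(8, 9), second=(18, 19))
print(fit)
```

Because both U.S. campaigns span two months against the same 20 baseline months, the two standard errors coincide under this specification, which matches the identical standard errors reported within each row of the U.S. and Canada panels.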
1.4.4.3 Advertiser pressure
From the social-change keywords found in newspapers in the previous section, the
media’s public goal seems to be a major force. However, it is still not clear whether the articles
on real beauty topics were published voluntarily or because of advertiser pressure. Given that
advertising is a major revenue source (Stromberg 2004; Mantrala, Naik, Sridhar, and Thorson
2007; Pew Research Center 2014), many theoretical papers predict that the mass media are
affected by advertiser pressure (Ellman and Germano 2009; Gal-Or, Geylani, and Yildirim 2012;
Zhu and Dukes 2015; Spiteri 2015; Blasco, Pin, and Sobbrio 2016). There is also empirical
evidence. Newspapers write longer articles (Rinallo and Basuroy 2009), more frequently (de
Smet and Vanormelingen 2012; Gambaro and Puglisi 2015), and more favorably (Reuter and
Zitzewitz 2006; Gurun and Butler 2012; Focke, Niessen-Ruenzi, and Ruenzi 2016) about
firms that spend more on newspaper advertisements. Therefore, around the months of
the Dove campaign, newspapers that carried advertisements from Unilever, the company that
owns the Dove brand, might have reported on the Dove campaign more frequently or in more
detail.
To address this issue, we tested (1) whether the Dove campaign’s significant impact on real
beauty topics holds regardless of advertiser pressure, and (2) whether the effect is bigger in
newspapers that had any Unilever advertisement around the time of the Dove campaign.
First, we collected Unilever’s newspaper advertising data from MarketTrack, a market research
company that tracks newspaper advertising. MarketTrack monitored 356 U.S.
newspapers in 2004 and 2005. Among the 113 U.S. newspapers that we used from the ProQuest
newsstand database, we identified 28 newspapers that did not publish any Unilever
advertisements and 17 newspapers that did between January 2004 and December 2005. Among the
latter, 8 newspapers published such ads during the months of the Dove campaign.
Column (1) in Table 1.12 uses beauty sentences from newspapers without Unilever ads. In
contrast, Column (2) uses sentences from newspapers with Unilever ads during the campaign,
and Column (3) adds more newspapers with Unilever ads around the time of the campaign.
Across Columns (1) to (3), heteroskedasticity-robust standard errors are clustered at the
country-topic level to adjust for correlation within each country’s topic across analyzed months.

Table 1.12 The significant impact of the campaign on real-beauty-related topics holds even in U.S.
newspapers without Unilever ads.

                                                   U.S. newspapers   U.S. newspapers with Unilever ad    All
                                                   without           During the       Anytime            newspapers
                                                   Unilever ad       campaign         2004-2005
                                                   (1)               (2)              (3)                (4)
Real Beauty x During campaign x Treated            5.017***          6.573***         10.85***           4.552***
                                                   (0.275)           (0.273)          (0.295)            (0.199)
  x U.S. newspapers with Unilever ad                                                                     1.95***
    during campaign                                                                                      (0.139)
  x U.S. newspapers with Unilever ad                                                                     6.25***
    anytime 2004-2005                                                                                    (0.136)
Real Beauty x One month after campaign x Treated   7.094***          3.667***         3.352***           4.733***
                                                   (0.213)           (0.212)          (0.277)            (1.064)
Real Beauty x During campaign                      0.399             0.338            0.736              0.632**
                                                   (0.469)           (0.478)          (0.593)            (0.3)
Real Beauty x One month after campaign             -5.186***         -4.814***        -4.763***          -3.867***
                                                   (1.172)           (1.196)          (1.205)            (0.639)
Year-month dummies                                 23                23               23                 23
Country-specific topic dummies                     17                17               17                 NA
Country (3 types in U.S.)-specific topic dummies   NA                NA               NA                 35
Observations                                       432               432              432                864
R-sq                                               0.506             0.546            0.546              0.636
No. of newspapers                                  28                8                17                 45

Dependent variable is the number of sentences in each country-topic, which is the unit level of analysis.
Column (4) uses all observations for U.S. newspapers across Columns (1), (2), and (3) and those for New Zealand and Australia.
The 3 types in the U.S. refer to the newspaper groups in Columns (1), (2), and (3).
The treated country, the U.S., has 10 topics. Two of them are real-beauty-related topics, which are aggregated into one topic.
New Zealand and Australia have 7(1) and 2(0) topics (real beauty ones), respectively.
Robust standard errors are clustered at the country-topic level for Columns (1) to (3) and at the country-type-topic level for Column (4). ***p < 0.01; ** p < 0.05; *p < 0.10
Across Columns (1) to (3), all the coefficients for the interaction effects during the campaigns
are significantly positive, suggesting that Dove’s social campaign affected coverage of real
beauty topics during the campaigns in the U.S. regardless of potential advertiser pressure.
Interestingly, the coefficient in Column (1) for the interaction effect one month after the
campaigns is bigger than those in Columns (2) and (3), suggesting that newspapers
that did not receive economic incentives, at least during the two-year period analyzed, showed
a more lasting interest in the social message. These results in Column (1) support the public role
of the mass media. The baseline effects in control countries show patterns similar to those in
Table 1.9c.
However, the effect sizes during the campaigns in Columns (2) and (3) are bigger than that in
Column (1). Thus, in Column (4) we test whether this gap is statistically significant, using all the
newspapers used in Columns (1) to (3). The base interaction effect in the first row corresponds to
that in Column (1), and thus is positive and significant, as expected. The second and third rows
capture the gap between newspapers without and with Unilever ads. Both effects are positive and
significant. This result suggests that newspapers report topics about their advertisers more
actively, and that monetary incentives from advertisers to newspapers tend to come before or
after a campaign rather than at the same time. In other words, this result is consistent with the
“advertiser pressure” mechanism.
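One way to read the interaction rows in Column (4) is as additive gaps on top of the base effect for newspapers without Unilever ads. The short sketch below, which uses only the coefficients printed in Table 1.12, adds each gap to the base effect and compares the result with the corresponding subsample estimate; treating the rows as additive in this way is an interpretive assumption, and the variable names are introduced here.

```python
# Coefficients as printed in Table 1.12.
col4_base = 4.552          # Column (4): Real Beauty x During campaign x Treated
col4_gap_during = 1.95     # Column (4): extra effect, Unilever ad during campaign
col4_gap_anytime = 6.25    # Column (4): extra effect, Unilever ad anytime 2004-2005
col2_subsample = 6.573     # Column (2): newspapers with Unilever ad during campaign
col3_subsample = 10.85     # Column (3): newspapers with Unilever ad anytime

# Implied effects for newspapers with Unilever ads under the additive reading.
implied_during = col4_base + col4_gap_during
implied_anytime = col4_base + col4_gap_anytime

print(f"during campaign: implied {implied_during:.3f} vs subsample {col2_subsample}")
print(f"anytime 2004-2005: implied {implied_anytime:.3f} vs subsample {col3_subsample}")
```

Under this reading, the pooled specification and the separate subsample regressions tell a similar story about the size of the gap.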
1.5. Conclusion
In this paper, we have shown that an advertising message can change the content of a newspaper.
Specifically, we find that newspapers increased reporting of real beauty topics during and one
month after the Dove campaigns for real beauty. In segmenting newspaper topics, we have
also shown that the topic model, which was recently introduced in marketing, is useful for
identifying advertising-related messages in newspapers. As an underlying mechanism, we have
provided evidence to support the public role of the mass media: (1) newspapers deliver the
message on social or cultural change that social issue campaigns emphasize, and (2) newspapers
without any Unilever advertisements around the time of the campaign also wrote more about real
beauty topics during and one month after the Dove campaign. Furthermore, we also found
evidence of the advertiser pressure mechanism, in that the impact of the Dove campaign on real
beauty topics is bigger in newspapers with Unilever advertisements than in those without.
Overall, the results suggest that a marketing campaign can have an impact on mass media
content, both in terms of earned media mentions of the campaign and in terms of changing the
focus of articles about a related topic.
Chapter 2
Detecting potential product segments using topological data analysis
2.1 Introduction
Market structure analysis describes the relationships between brands and products in order to
define the market (Elrod et al. 2002). Analysis of market structure is a key step in the design and
development of new products, the repositioning of existing products, pricing, marketing
communications, and marketing strategy (Srivastava, Alpert, and Shocker 1984; Urban, Johnson,
and Hauser 1984; Kamakura and Russell 1989; Urban and Hauser 1993; DeSarbo, Manrai, and
Manrai 1993; Erdem and Keane 1996; Bergen and Peteraf 2002; Lattin, Carroll, and Green 2003;
DeSarbo, Grewal, and Wind 2006). Until very recently, the bulk of published work focused on
competitive market structure with a limited number of products (Erdem 1996; Cooper and Inoue
1996; DeSarbo and Grewal 2007; Kim, Albuquerque, and Bronnenberg 2011; Lee and Bradlow
2011).
In the last few years, new methods have arisen to identify and visualize market structure with
many products. These new methods are a response to two developments. First, the variety of
products in the marketplace has increased (Ailawadi and Keller 2004), increasing demand for
such methods. Second, faster computers and increasing digital storage capacity have broadened
the set of potential tools to make sense of this variety, enabling the supply side. This has created
renewed interest among marketing scholars in market structure and segmentation (Netzer et al.
2012; France and Ghose 2016; Ringel and Skiera 2016).
Figure 2.1a Distinctly grouped data. Source: Lesnick (2013)
Figure 2.1b A loopy segment. Source: Lesnick (2013)
In this paper, we apply a new data analysis technique, Topological Data Analysis (TDA
hereafter, Carlsson 2009), to the problem of market structure segmentation with many products.
Standard clustering methods work well for distinctly grouped data (Figure 2.1a). However, as the
number of data points rises, the data set becomes more connected. One particular example of
such connected data is a loopy segment (Figure 2.1b), where products locate closely together
with their neighboring products but are indirectly connected to, and seemingly far apart from,
some other products. TDA is particularly well-suited to identifying such segments.
Loopy segments can occur in analyzing national level market structure with customer level data
on purchases. In many cases, not all products are available in each local market, and thus there
can be no common customers among some related products. For example, suppose that a
manufacturer launched products W and M in Wisconsin and Massachusetts, respectively.
Suppose that products W and M serve similar types of consumers in the different markets. No
consumer can purchase both products due to local availability. Instead, some consumers
purchase the local one. These consumers also purchase other products that are available in both
markets. As a result, products W and M can be in the same segment, connected through the
products that are available nationally, although no consumer purchases both W and M. This same
framework can also connect products sold at different stores. The indirect connection between W
and M will be identified in topological data analysis through a loopy segment.
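The W and M example can be illustrated with a toy co-purchase graph. The sketch below is not TDA itself: it only shows, under assumed data, how two local products that no customer buys together end up linked through nationally available products. The customer baskets and the national product names N1 and N2 are hypothetical.

```python
from collections import deque
from itertools import combinations

# Hypothetical baskets: W is sold only in Wisconsin, M only in Massachusetts,
# and N1 and N2 are available in both markets.
baskets = {
    "wi_customer_1": {"W", "N1"},
    "wi_customer_2": {"W", "N2"},
    "ma_customer_1": {"M", "N1"},
    "ma_customer_2": {"M", "N2"},
}


def co_purchase_graph(baskets):
    """Link two products whenever at least one customer bought both."""
    graph = {}
    for items in baskets.values():
        for product in items:
            graph.setdefault(product, set())
        for p, q in combinations(sorted(items), 2):
            graph[p].add(q)
            graph[q].add(p)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; neighbours are visited in alphabetical order."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


graph = co_purchase_graph(baskets)
print("direct W-M link:", "M" in graph["W"])
print("indirect path:", shortest_path(graph, "W", "M"))
```

In this toy graph the four products form a closed chain (W, N1, M, N2, and back to W) with no direct link between W and M, which is the kind of shape the text calls a loopy segment.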
Why is it useful to detect loopy segments? As described in the preceding paragraph, a loopy
segment can include products that occupy the same product space in different markets, but that
no consumer purchased together. If preferences are transitive, in the sense that objects sharing a
relationship to a common object would themselves be related if they were in the same domain,
then loopy segments can help firms identify potentially competing or potentially co-purchased
products that are not currently offered in the same market. This can help manufacturers who
launch their products sequentially across regional markets. They can learn about (1) competitor
products in one market and (2) indirectly connected products in the other market, and they can
use that information to inform opportunities in both markets. This also helps retailers with
limited shelf space to detect related products. For example, Costco and Walmart strategically
keep a small number of products in each category. By identifying potentially related products,
they can make better product assortment decisions.
Standard clustering methods such as hierarchical clustering are not good at identifying indirect
connections such as those found in a loopy segment (Lesnick 2013). Recently developed
community detection methods are particularly useful for segmenting many observations
(Newman and Girvan 2004; Clauset, Newman, and Moore 2004; Pons and Latapy 2005;
Raghavan, Albert, and Kumara 2007; Blondel et al. 2008). However, our simulations suggest
that community detection methods are less effective than TDA at identifying product
connections across markets because no community detection method assigns a product into
multiple segments, unlike TDA. Thus, our results suggest that for the particular problem of
identifying connected products in unconnected markets, TDA is a useful new tool.
Topology is a mathematical discipline that studies shape. TDA, developed by computational
mathematician Gunnar Carlsson (2009), refers to the adaptation of this discipline to analyzing
highly complex data (Ayasdi 2015). TDA assumes that all data has shape and shape has
meaning, and thus tries to discover geometric relationships among data points. There are many
applications across oncology, astronomy, neuroscience, image processing, and biophysics.
Hoffman and Novak (2015) argue that TDA is useful in organizing potential applications of the
‘internet of things’. There have been some commercialization efforts by the analytics company
Ayasdi (which counts Carlsson as one of its founders). For example, TDA has helped
identify new patient groups in breast cancer treatment, distinct playing styles of National
Basketball Association players, and voting patterns of the members of the US House of
Representatives (Lum et al. 2013). Ayasdi’s website also discusses potential marketing
applications in customer segmentation, personalized marketing, churn analysis, and network
optimization (Ayasdi 2016).
We find that TDA is particularly well-suited to a specific marketing problem. We use simulated
data and the IRI marketing academic data set (Bronnenberg, Kruger, and Mela 2008) to
demonstrate that TDA can connect products in different markets through national products,
while standard hierarchical clustering methods and community detection methods have
difficulty. Our analysis of beer and salty snack buyers in Pittsfield, Massachusetts, and Eau
Claire, Wisconsin, shows that different locally popular brands appear to occupy similar product
space in the different markets. For example, in salty snacks, two national brands (Rold Gold and
Tostitos) connect local segments, which include two products that sell well in Wisconsin (Barrel
O Fun and Jays) and a product that sells well in Massachusetts (UTZ). This suggests that the
positioning of UTZ in Massachusetts is similar to the positioning of Barrel O Fun and Jays in
Wisconsin. We also find potential co-purchase behavior between certain beer and salty snack
products. While the two product categories are quite separated when using hierarchical
clustering, many TDA segments include both beers and salty snacks.
We view the core contribution of this paper as introducing TDA methods to marketing by
providing a clear marketing application. This adds a new clustering tool to the rapidly growing
literature on market structure analysis using big data (France and Ghose 2016; Ringel and Skiera
2016). We view TDA as a useful new exploratory tool, and we highlight a specific strength of
this tool. It should not be viewed as a replacement for other forms of product segmentation
because it is unlikely to outperform those methods for standard product segmentation purposes.
Because TDA is new to marketing, Section 2.2 uses several simple examples to provide
an extensive discussion of the intuition behind TDA. Section 2.3 shows the usefulness of TDA
for connecting similar products in separated markets using a simulation study, comparing TDA
to other clustering tools. Section 2.4 applies the method to the IRI data to demonstrate its
practical application in marketing. Section 2.5 concludes with a discussion of opportunities and
limitations.
2.2. TDA methodology
Computing topology based on simplicial complexes has been well understood for decades (for more
details, see Armstrong 1983; Edelsbrunner, Letscher, and Zomorodian 2002; Hatcher 2002;
Zomorodian and Carlsson 2005; Edelsbrunner and Harer 2010; and especially Carlsson 2009).
However, computing simplicial complexes is resource intensive, and so TDA had limited
application until recently (Lum et al. 2013). Below, we provide a description of the TDA
methodology.
2.2.1. Vietoris-Rips Complex
We use the most common and easily implemented TDA method, the Vietoris-Rips complex. Let
𝑑(∙ ,∙) denote the distance between two product points in customer purchase quantity space 𝑋.
The complex VR(𝑋, 𝑣) is defined as follows:
• The set of vertices (data points or 0-simplices) is 𝑋
• For vertices (data points) q and r, a line (edge or 1-simplex) [qr] is included in
VR(𝑋, 𝑣) if 𝑑(𝑞, 𝑟) ≤ 𝑣
• A higher dimensional (k>1) simplex such as a triangular face (2-simplex) or a
tetrahedron (3-simplex) is included in VR(𝑋, 𝑣) if all of the lines (1-simplices) that
make up the high dimensional simplex are in VR(𝑋, 𝑣).
All the points within a simplex are directly connected to each other. Given that our goal in this
study is to find potentially related products, the simplex itself does not include indirectly
connected products through other products. Next, because VR(𝑋, 𝑣) includes a set of k-simplices
[𝑥0, 𝑥1, … 𝑥𝑘], where 𝑥𝑖 ∈ 𝑋, at filtration value 𝑣, it is also called a filtered simplicial complex.
Note that the complex VR(𝑋, 𝑣) grows in filtration value 𝑣. In other words, data points are
connected from their nearest neighbor to more distant ones. Lesnick (2013) labels this the
“thickening” process.
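As an illustration only, the definition above can be sketched in a few lines of Python. This is not the JavaPlex implementation used in this study; the function name `vietoris_rips` is hypothetical, and the coordinates are the Case 1 purchase quantities described in Section 2.2.2.

```python
from itertools import combinations
from math import dist

def vietoris_rips(points, v, max_dim=2):
    """Return the simplices of VR(X, v) up to dimension max_dim.

    points maps a product name to its coordinates (one axis per customer).
    A simplex is included when every pair of its vertices lies within v.
    """
    names = sorted(points)
    simplices = []
    for size in range(1, max_dim + 2):  # a k-simplex has k + 1 vertices
        for candidate in combinations(names, size):
            if all(dist(points[p], points[q]) <= v
                   for p, q in combinations(candidate, 2)):
                simplices.append(candidate)
    return simplices

# Case 1 purchase quantities: (customer 1, customer 2) for each product.
case1 = {"a": (0, 1), "b": (1, 1), "c": (5, 1), "d": (5, 2)}

complex_v0 = vietoris_rips(case1, v=0)  # only the four vertices
complex_v1 = vietoris_rips(case1, v=1)  # vertices plus the lines ab and cd
print(complex_v1)
```

Raising `v` step by step and calling the function again reproduces the thickening process: simplices are only ever added, never removed.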
To enhance the formal description and explain how it works with marketing data, we provide
several example cases on how TDA creates clusters of products based on purchases by two or
three sample customers. Figure 2.2 presents the 9 different cases and Table 2.1 summarizes the
key TDA output for each: filtration values, VR complexes, and Betti numbers which we define
below.
2.2.2. Clustering distinctly grouped data (Cases 1 and 2)
Case 1 illustrates how TDA segments distinctly grouped data. The data, or vertex set, 𝑋 consists
of four products. Customer 1 purchased 0, 1, 5, and 5 units of products a, b, c, and d
respectively. Customer 2 purchased 1, 1, 1, and 2 units. The points are plotted in the graph
labeled v=0 (Data). At filtration value v=0, no product pair is connected yet, and thus
VR(𝑋, 𝑣 = 0) includes only four data points 𝑥0 = {𝑎, 𝑏, 𝑐, 𝑑}. In the thickening process, we
gradually increase the filtration value in steps of 0.01. The graph labeled v=1 (Betti0=2) shows that at v=1,
we can now connect two groups of dots within the circles to generate two lines 𝑥1 = {𝑎𝑏, 𝑐𝑑}.
These two groups are maintained until v=4, when all four dots become connected.
Figure 2.2 TDA examples with two customers (Case 1-7) or three customers (Case 8 and 9)
Figure 2.2.1 Case 1 Two segments
Case 2 provides a similar example. Product d is purchased more by both customers and has a
distinct positioning. At v=1, products a, b, and c become one body. Then, at v=3.61, product d
joins the others. This suggests that there are two product segments in Case 2. In Cases 1 and 2, the
linking process of TDA is similar to that of standard hierarchical clustering.
Figure 2.2.2 Case 2 Tetragon
2.2.3. Homology groups, Betti numbers, and loopy segments
Now, we show how TDA summarizes the shape of data. As described above in Cases 1 and 2,
TDA generates a simplex (e.g. a line or a triangular face) by connecting pairs of data points that lie
within filtration value v of each other. The filtered simplicial complex VR(𝑋, 𝑣) can be summarized by its
homology groups. The value Bettih, where ℎ ∈ ℕ, is the rank of the h-th homology
group of the topological space, which is VR(𝑋, 𝑣) here. The term "Betti numbers" was coined
by Poincaré (1894) after Enrico Betti. The meaning of Bettih, where ℎ ∈ {0,1,2}, is as follows.
• Betti0 : the number of connected components
• Betti1 : the number of holes or loops
• Betti2 : the number of voids or cavities
It is possible to define higher Bettih numbers, where ℎ > 2, but they are difficult to conceptualize
and do not appear to matter in our empirical context. Thus we use up to Betti2 in our study.
To provide examples of connected components and loops, the TDA literature often uses the
shape of upper case letters. The letters with a single loop (Betti1=1) are
{A, D, O, P, Q, R}. In contrast, B has two loops (Betti1=2). All the other upper case
letters have no loops, and can be thought of as a point if compressed.
A torus (or hollow donut shape) is an example with Betti2=1. A torus has a void inside of the donut as
well as two loops: one around the hole in the center and the other around the tube of the donut.
For each case in Figure 2.2, the bar graph shows the number of segments by Betti type for
each filtration value v. For example, for Case 1, for Betti0 (Betti dimension 0), we have four
distinct groups until v=1. After v=1, there are two groups until v=4 when there is just one group
as all the dots are joined together. Similarly, for Case 2, for Betti0 we have four distinct groups
until v=1, then two groups until v=3.61 and one group for v≥3.61. In this way, Betti0 provides
similar results to a standard clustering algorithm.
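The Betti0 counts quoted above for Case 1 can be checked with a short union-find sketch. It is an illustration under our own naming (`betti0` is hypothetical), not the software used for the figures, and it uses the Case 1 purchase quantities as coordinates.

```python
from itertools import combinations
from math import dist

def betti0(points, v):
    """Count connected components when points within distance v are linked."""
    parent = {name: name for name in points}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for p, q in combinations(points, 2):
        if dist(points[p], points[q]) <= v:
            parent[find(p)] = find(q)
    return len({find(name) for name in points})

# Case 1 purchase quantities: (customer 1, customer 2) for each product.
case1 = {"a": (0, 1), "b": (1, 1), "c": (5, 1), "d": (5, 2)}
betti0_curve = {v: betti0(case1, v) for v in (0, 1, 4)}
print(betti0_curve)  # {0: 4, 1: 2, 4: 1}
```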
In contrast to Betti0, both Betti1 and Betti2 count holes and voids, providing distinct insights into
the data structure from a standard clustering algorithm. Following the literature, we label a hole
or void as a loopy segment in our paper. Given that we aim to detect related products that are not
directly competing and that cannot be identified with standard clustering techniques, we focus on
loopy segments (Betti1 and Betti2), where each product is indirectly connected with some others.
2.2.4. A loopy segment in a two dimensional plane (contrasting Cases 3 and 4)
In Cases 1 and 2, no hole exists. Betti1 and Betti2 are zero throughout. In particular, there is no
empty space inside a simplex. Once all dots are connected in a triangle at a filtration value, the
space within the triangle is covered. For example, in Case 2, when products a and c are
connected at v=1.42, the inside of the triangle among products a, b, and c is shaded rather than
blank because distance 1.42 covers all space within the triangle. This means that at least four
products are necessary to form a hole in two-dimensional space.
When can a loopy segment emerge in a two dimensional plane? The dots cannot be on a straight
line (as in Case 1) and the diagonals should be longer than any of the four boundary lines. Case 2
does not form a loopy segment because products a and c are linked to each other before they link
with product d.
Case 3, a square, does have a loopy segment. At v=0, there are four distinct dots and Betti0=4. At
v=2, lines can be drawn that connect the dots along the outside of the square and Betti0=1.
Importantly, the diagonals {ac, bd} are not yet connected, so the interior of the square is not filled in
and Betti1=1. At v=2.83 the diagonals connect and the hole closes, and so for v≥2.83,
Betti1=0. The loopy segment suggests that the four products are indirectly related to their non-
neighboring products because a grouping of size less than 2.83 shows no direct link between b
and d or between a and c.
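The Betti numbers reported for Case 3 can be reproduced from the ranks of the boundary maps of VR(𝑋, 𝑣). The sketch below is a minimal illustration with coefficients in GF(2), which is sufficient for these small examples; the function names and the square's coordinates are assumptions chosen to match the filtration values in the text, and the code is not the JavaPlex routine used in this study.

```python
from itertools import combinations
from math import dist

def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are encoded as integer bitmasks."""
    pivots = {}  # leading bit -> reduced row
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = row
                break
            row ^= pivots[lead]
    return len(pivots)

def betti_numbers(points, v, max_betti=2):
    """Betti0..Betti_max of VR(points, v), computed with GF(2) coefficients."""
    names = sorted(points)
    # Simplices of dimension 0..max_betti+1, each a sorted tuple of names.
    by_dim = [
        [s for s in combinations(names, k + 1)
         if all(dist(points[p], points[q]) <= v for p, q in combinations(s, 2))]
        for k in range(max_betti + 2)
    ]
    ranks = [0]  # the boundary of a vertex is empty
    for k in range(1, max_betti + 2):
        column = {face: i for i, face in enumerate(by_dim[k - 1])}
        rows = [sum(1 << column[face] for face in combinations(s, k))
                for s in by_dim[k]]
        ranks.append(rank_gf2(rows))
    # Betti_k = (number of k-simplices) - rank(d_k) - rank(d_{k+1})
    return [len(by_dim[k]) - ranks[k] - ranks[k + 1] for k in range(max_betti + 1)]

# Hypothetical coordinates: a square with side length 2.
square = {"a": (0, 0), "b": (2, 0), "c": (2, 2), "d": (0, 2)}
print(betti_numbers(square, 0))     # [4, 0, 0]
print(betti_numbers(square, 2))     # [1, 1, 0]
print(betti_numbers(square, 2.83))  # [1, 0, 0]
```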
Figure 2.2.3 Case 3 Square loopy segment
In contrast, Case 4 is a square with a dot in the middle: Product e is in the center of the other four
products. Case 4 does not have a loopy segment. At v=1.42, four boundary products are
connected with product e in the center, and so Betti0=1 from this value. This linkage occurs
before the boundaries are linked with each other, and so no hole is formed because product e
makes the diagonals shorter than the boundary lines. Case 4 shows that a hub structure, where one
popular product competes with other products, is not likely to have a loopy segment.
Figure 2.2.4 Case 4 Center point within square
2.2.5. Interval length of a loopy segment: Persistent homology (Cases 3 and 5)
When does the emerged hole disappear? Namely, how long does the hole persist? This is
important to understand because more persistent holes suggest more robust connections that are
distinct from standard clusters. Cases 3 and 5 provide a useful contrast for exploring persistent
holes.
We now introduce a new concept, the Betti interval. The Betti interval describes how the
homology of VR(𝑋, 𝑣) changes with filtration value v. A Betti1 interval, with endpoints
[𝑣𝑠𝑡𝑎𝑟𝑡 , 𝑣𝑒𝑛𝑑), corresponds to a hole that appears at 𝑣𝑠𝑡𝑎𝑟𝑡, remains open for 𝑣𝑠𝑡𝑎𝑟𝑡 ≤ 𝑣 < 𝑣𝑒𝑛𝑑,
and closes at 𝑣𝑒𝑛𝑑. The filtration range or the interval length, 𝑣𝑒𝑛𝑑 − 𝑣𝑠𝑡𝑎𝑟𝑡, is the measure of
persistent homology. Longer persistence suggests more robust features. In Case 3, at filtration
value v=2.83, four simplex triangles {abc, bcd, cda, dab} arise when the additional two
diagonals ac and bd fill in the square, and thus the hole disappears. In summary, the loopy
segment is born at 𝑣𝑠𝑡𝑎𝑟𝑡 = 2 and dies at 𝑣𝑒𝑛𝑑 = 2.83, and thus its Betti interval has length (or
filtration range) 0.83. Note that 𝑣 keeps increasing until new segments do not appear any more.
In practice, the researcher sets a large enough value for 𝑣.
Case 5 presents a rectangle. At v=2, two product groups are formed and then they are maintained
until v=4, suggesting that there are two segments in this case. At v=4, a loopy segment consisting
of all four products emerges with Betti1 interval [4, 4.48). Compared to Case 3, this loopy
segment is born later (4 > 2) and is less persistent (0.48 < 0.83). The later birth suggests that,
relative to Case 3, in Case 5 the Betti0 segments are more distinct and that the indirect
connections might provide insights into potentially related products that standard cluster methods
might miss. In Case 5, standard cluster methods may conclude that there are only two separated
segments: {a, b}, {c, d}. The lower persistence suggests that the loopy (Betti1) segment in Case 5
is, however, a less robust feature of the data.
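For four points whose boundary forms a single cycle, the Betti1 interval can be computed directly: the loop is born when the longest side is linked and dies when the shorter diagonal is linked. The sketch below illustrates this for Cases 3 and 5. The coordinates are hypothetical, chosen to match the filtration values in the text, and the rounding mimics the 0.01 filtration step.

```python
from math import ceil, dist

def to_grid(x):
    """Round a distance up to the 0.01 filtration grid used in the text."""
    return ceil(round(x * 100, 6)) / 100

def loop_interval(a, b, c, d):
    """Betti1 interval of four points whose boundary runs a-b-c-d-a.

    Returns (birth, death, length), or None when a diagonal is linked
    no later than the last side, so that no loop can form.
    """
    birth = to_grid(max(dist(a, b), dist(b, c), dist(c, d), dist(d, a)))
    death = to_grid(min(dist(a, c), dist(b, d)))
    if death <= birth:
        return None
    return birth, death, round(death - birth, 2)

# Hypothetical coordinates consistent with the filtration values above.
square = loop_interval((0, 0), (2, 0), (2, 2), (0, 2))     # Case 3
rectangle = loop_interval((0, 0), (2, 0), (2, 4), (0, 4))  # Case 5
print(square)     # (2.0, 2.83, 0.83)
print(rectangle)  # (4.0, 4.48, 0.48)
```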
Figure 2.2.5 Case 5 Rectangle loopy segment
2.2.6. Connecting loopy segments (Cases 6 and 7)
Figure 2.2.6 Case 6 Distant two loopy segments
Using Cases 6 and 7, we explain how TDA connects segments and where it provides distinct
insights from standard hierarchical clustering. In Case 6, there are two clearly separated product
groups: one often purchased by two customers and the other not. At v=1, three segments are
formed and so 𝐵𝑒𝑡𝑡𝑖0 changes from 8 to 3. At v=2, the separate segments ab and dc are joined
and so 𝐵𝑒𝑡𝑡𝑖0 drops to 2. The rectangle and the square remain separated until v=2.83, when 𝐵𝑒𝑡𝑡𝑖0
becomes 1. This process is similar to the way hierarchical clustering methods group items.
In addition to identifying distinct segments, and unlike hierarchical clustering, TDA informs us
whether each segment has a loopy structure or not. There are two loopy segments in Case 6. For
the square (efgh), the loopy segment has interval length 0.42, starting at 1 and ending at 1.42. For
the rectangle (abcd), the loopy segment has interval length 0.24, starting at 2 and ending at 2.24.
This suggests a different kind of connection between the points in the rectangle and the points in
the square, as in the above comparison between Cases 3 and 5. The loopy segment is more
meaningful in the square because it recognizes that the four dots are more equally connected.
Case 7 shows two loopy segments that are connected through a common product, d. At v=2,
TDA generates one whole segment (𝐵𝑒𝑡𝑡𝑖0 = 1) with two loopy segments (𝐵𝑒𝑡𝑡𝑖1 = 2). These
segments persist until v=2.83. Product d in Case 7 serves as a gate product. TDA connects
segments by assigning such gate products into multiple segments. This connection information
helps to detect potentially related products across neighboring segments.
Products a and e, which are indirectly connected through the gate product d, do not appear to be
direct competitors. Nevertheless, the common linkage with product d suggests that a and e are
related. As we describe below, if a and e are primarily sold in different markets, the common
gate product suggests that they may serve similar needs in the different markets.
This connecting ability enables TDA to yield distinct insights relative to other clustering
methods such as hierarchical clustering, which forces full separation. In Case 7, most
hierarchical clustering algorithms, such as average and complete linkage or Ward’s method,
generate two segments: one with products a, b, c, and d, and another with products e, f, and g.
Moreover, single linkage algorithms, which at each step combine the two clusters that contain the
closest pair of elements not yet belonging to the same cluster, put all products into
just one segment because every product is at the same distance from its neighboring products.
Thus, while single linkage algorithms closely resemble TDA in terms of 𝐵𝑒𝑡𝑡𝑖0 groupings, the
single linkage algorithm misses 𝐵𝑒𝑡𝑡𝑖𝑘, (𝑘 ≥ 1) groupings. In the simulation section below, we
conduct a more comprehensive comparison across several clustering methods.
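The contrast in Case 7 can be illustrated with a small sketch. At v=2 the linked products form one connected component, which is what cutting a single linkage tree at that height returns, while the number of independent cycles, E − V + C, reveals the two loopy segments. The coordinates are an assumption (two squares of side 2 sharing the corner d) that matches the lines listed for Case 7 in Table 2.1.

```python
from itertools import combinations
from math import dist

# Hypothetical coordinates: two squares of side 2 sharing the corner d.
case7 = {"a": (0, 2), "b": (0, 0), "c": (2, 0), "d": (2, 2),
         "e": (4, 2), "f": (4, 4), "g": (2, 4)}

def components(points, edges):
    """Group points into connected components by depth-first search."""
    neighbors = {p: set() for p in points}
    for p, q in edges:
        neighbors[p].add(q)
        neighbors[q].add(p)
    seen, groups = set(), []
    for start in points:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node not in group:
                group.add(node)
                stack.extend(neighbors[node] - group)
        seen |= group
        groups.append(group)
    return groups

v = 2
edges = [(p, q) for p, q in combinations(case7, 2)
         if dist(case7[p], case7[q]) <= v]
triangles = [t for t in combinations(case7, 3)
             if all(dist(case7[p], case7[q]) <= v for p, q in combinations(t, 2))]
clusters = components(case7, edges)  # single linkage cut at height v
# With no triangles, Betti1 equals the number of independent cycles.
loops = len(edges) - len(case7) + len(clusters)
print(len(clusters), loops)  # 1 2
```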
Figure 2.2.7 Case 7 Neighboring two loopy segments with one connection
2.2.7. Voids in three dimensional space (Cases 8 and 9)
Figure 2.2.8 Case 8 Tetrahedron
Panels: v=0 (Data, Betti0=4); v=5.66 (Betti0=1)
Next, we show when voids occur in three dimensional space using examples with three
customers. Case 8 shows an example with four products. The simplex in three dimensional space
is a tetrahedron, which also has four data points. Therefore, Case 8 cannot contain a void. At
v=5.66, all four products are connected to each other, resulting in a tetrahedron as well as four
triangular faces. Because both a tetrahedron and a triangle are simplices, neither a void nor a hole
occurs.
Figure 2.2.9 Case 9 Octahedron with void
Panels: v=0 (Data, Betti0=6); v=2.83 (Betti0=1, Betti2=1); v=4 (Betti2=0)
In Case 9, there are six product points that if joined together would form an octahedron. At
v=2.83, each point is connected with four neighboring points, each of which is in the center of its
neighboring square side, and thus 𝐵𝑒𝑡𝑡𝑖0 switches from 6 to 1. For example, product a is connected
with products b, c, e, and f, but not product d on the opposite side. As a result, there are four
triangles {abc, abf, ace, aef} that include product a. There are another four triangles that include
d but not a {dbc, dbf, dec, def}. Only these 8 triangles, and no tetragons, are in each plane. Since
a triangle is a simplex, no hole is formed. As a result, 𝐵𝑒𝑡𝑡𝑖1 remains at 0.
However, there is one void (𝐵𝑒𝑡𝑡𝑖2 = 1) inside the 8 triangles starting at v=2.83. First,
intuitively, one can see that each of the six points is connected with the point on the
opposite side only indirectly, through their neighboring products. Three product pairs, ad, be, and cf,
have such indirect connections. Second, to make sure that the inside is empty, we check
whether any tetrahedrons with four data points occur. For example, product a is connected with
products b, c, e, and f. However, product b is not connected to e on the opposite side. There is also
no link between products c and f yet. Therefore, no tetrahedron occurs.
At v=4, the three product pairs {ad, be, cf} on opposite sides connect. Now, the inside is
occupied by twelve tetrahedrons {abcd, abce, abcf, abdf, abef, acde, acef, adef, bcde, bcdf, bedf,
cedf}, leading to 𝐵𝑒𝑡𝑡𝑖2 = 0. The length of the interval with this void is 1.17, and the interval is
[2.83, 4).
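The void at v=2.83 can be cross-checked with the Euler characteristic, which for a complex with no tetrahedrons equals both V − E + F and Betti0 − Betti1 + Betti2. The sketch below is illustrative only; the octahedron coordinates are hypothetical, chosen so that neighboring products are 2.83 apart and opposite products are 4 apart, as in the text.

```python
from itertools import combinations
from math import dist

# Hypothetical coordinates: the opposite pairs are ad, be, and cf.
case9 = {"a": (2, 0, 0), "b": (0, 2, 0), "c": (0, 0, 2),
         "d": (-2, 0, 0), "e": (0, -2, 0), "f": (0, 0, -2)}

def simplex_counts(points, v, max_dim=3):
    """Number of simplices of VR(points, v) in each dimension 0..max_dim."""
    return [
        sum(1 for s in combinations(points, k + 1)
            if all(dist(points[p], points[q]) <= v for p, q in combinations(s, 2)))
        for k in range(max_dim + 1)
    ]

counts = simplex_counts(case9, 2.83)  # points, lines, triangles, tetrahedrons
euler = sum((-1) ** k * n for k, n in enumerate(counts))
# Betti0 = 1 and Betti1 = 0 are taken from the argument in the text.
betti0, betti1 = 1, 0
betti2 = euler - betti0 + betti1
print(counts, euler, betti2)  # [6, 12, 8, 0] 2 1
```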
The above cases outline how TDA identifies indirect connections between products. Before we
analyze real world data, we provide simulation evidence that TDA generates a different type of
insight than other commonly used methods.
Table 2.1 TDA cases
Points, Lines, Triangles, and Tetrahedrons list the filtered simplicial complex* VR(X, v); B0, B1, and B2 are the Betti numbers.
Case | Description | Filtration value (v) | Points | Lines | Triangles | Tetrahedrons | B0 | B1 | B2
1 | Two segments | 0 | {a, b, c, d} | | | | 4 | 0 | 0
1 | | 1 | | {ab, cd} | | | 2 | 0 | 0
1 | | 4 | | {bc} | {bcd} | | 1 | 0 | 0
2 | Tetragon | 0 | {a, b, c, d} | | | | 4 | 0 | 0
2 | | 1 | | {ab, bc} | | | 2 | 0 | 0
2 | | 1.42 | | {ac} | {abc} | | 2 | 0 | 0
2 | | 3.61 | | {ad, cd} | {acd} | | 1 | 0 | 0
3 | Square loopy segment | 0 | {a, b, c, d} | | | | 4 | 0 | 0
3 | | 2 | | {ab, bc, cd, ad} | | | 1 | 1 | 0
3 | | 2.83 | | {ac, bd} | {abc, bcd, cda, dab} | | 1 | 0 | 0
4 | Center point within square | 0 | {a, b, c, d, e} | | | | 5 | 0 | 0
4 | | 1.42 | | {ae, be, ce, de} | | | 1 | 0 | 0
5 | Rectangle loopy segment | 0 | {a, b, c, d} | | | | 4 | 0 | 0
5 | | 2 | | {ab, cd} | | | 2 | 0 | 0
5 | | 4 | | {ad, bc} | | | 1 | 1 | 0
5 | | 4.48 | | {ac, bd} | {abc, bcd, cda, dab} | | 1 | 0 | 0
6 | Distant two loopy segments | 0 | {a, b, c, d, e, f, g, h} | | | | 8 | 0 | 0
6 | | 1 | | {ab, cd, ef, fg, gh, he} | | | 3 | 1 | 0
6 | | 1.42 | | | {efg, fgh, ghe, hef} | | 3 | 0 | 0
6 | | 2 | | {bc, da} | | | 2 | 1 | 0
6 | | 2.24 | | | {abc, bcd, cda, dab} | | 2 | 0 | 0
6 | | 2.83 | | {de} | | | 1 | 0 | 0
7 | Neighboring two loopy segments with one connection | 0 | {a, b, c, d, e, f, g} | | | | 7 | 0 | 0
7 | | 2 | | {ab, bc, cd, ad, de, ef, fg, gd} | | | 1 | 2 | 0
7 | | 2.83 | | {ag, ce, ac, bd, df, eg} | {abc, bcd, cda, dab, def, efg, fgd, gde, adg, cde} | | 1 | 0 | 0
8 | Tetrahedron | 0 | {a, b, c, d} | | | | 4 | 0 | 0
8 | | 5.66 | | {ab, ac, ad, bc, bd, cd} | {abc, abd, acd, bcd} | {abcd} | 1 | 0 | 0
9 | Octahedron with void | 0 | {a, b, c, d, e, f} | | | | 6 | 0 | 0
9 | | 2.83 | | {ab, ac, ae, af, bc, bd, bf, cd, ce, de, df, ef} | {abc, abf, ace, aef, dbc, dbf, dec, def} | | 1 | 0 | 1
9 | | 4 | | {ad, be, cf} | {abe, acf, adb, adc, ade, adf, bcf, bec, bed, bef, cfd, cfe} | {abcd, abce, abcf, abdf, abef, acde, acef, adef, bcde, bcdf, bedf, cedf} | 1 | 0 | 0
*Filtered simplicial complex, VR(X, v), is cumulative as v increases: The table shows the additional simplices for each filtration value v
2.3. Simulation study
Our goal is to demonstrate that TDA can identify potentially related products that have not been
sold together in the same market. Our target application is to cluster products in two local
markets which are regionally separated. In the IRI data analysis below, we examine sales across
two cities, Eau Claire, Wisconsin and Pittsfield, Massachusetts. We cluster salty snacks and beers
separately to see whether TDA can find products that occupy the same product space within a
category in the two local markets. Then, we combine both product categories in the same
analysis to see whether TDA can also connect products in different categories and different
markets that could potentially be purchased by the same customers. In other words, for this
simulation to be useful to marketers, we assume that preferences are transitive and examine
whether TDA can unpack the relationships in the data.
Before analyzing real consumer purchase data from the two local markets, we do a
simulation study to examine whether TDA can recover useful loopy segments in such a setting.
We compare results from TDA with those from hierarchical clustering methods and community
detection methods. The purpose of this section is not to demonstrate that TDA is always superior
to other methods. Instead, the purpose is to highlight a particular case in which TDA does detect
a pattern in the data when other methods do not.
2.3.1. Simulation study procedure
Our simulation study has the following 5 steps, as shown in Figure 2.3a. In the simulation, some
products appear only in Wisconsin (W), some products appear only in Massachusetts (M), and
some national products appear in both markets (N).
Figure 2.3a 5 steps for simulation study
Step 1: True segments
We simulate two scenarios, shown in Figure 2.3b. In Scenario 1, there are two loopy segments,
one in each local market. Each local segment includes one national product as well as its own
local product. This shape is called a “wedge sum” in topology. The two local segments are
connected through one national product. In other words, the national product is assigned to
both segments.
A further concern is that TDA might identify false gate products even when there is no
connection. To address this issue, we do a falsification test. In Scenario 2, we add one local-only
segment into each local market. The added local-only segments should not be connected in the
TDA results.
Figure 2.3b True segments in simulation study
Panels: Scenario 1; Scenario 2
Product type W means Wisconsin product, M means Massachusetts product, and N means national product.
Step 2: Correlation matrix
We simulate a correlation structure among products only within the same local market because
we assume that the two local markets are geographically separated and so no consumer can
purchase both groups (M and W) of local products. To generate the loopy segment, we assign
higher correlations to neighboring products. Higher correlation means shorter distance. For
example, we assign correlations of 0.5 and 0.6 between Wisconsin local product W1 and its
neighboring local and national products (W2 and N4), while we assign a correlation of 0.2
between W1 and its non-neighboring product W3.
Step 3: Simulating consumer purchases
Using the above correlation structure and assuming a marginal Poisson distribution, we simulate
10,000 consumers’ purchases in each market (20,000 consumers total). We chose the Poisson
distribution to reflect the discrete nature of purchase quantity. The quantity purchased by each
consumer of each product is therefore a draw based on correlated (across products) Poisson
random variables. To generate correlated Poisson random variables, we utilized an R
implementation by Barbiero and Ferrari (2014). We ensure that no consumer can buy both M
and W products.1
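To give a sense of how correlated Poisson purchase quantities can arise, the sketch below uses a simple common-shock construction, in which two products share one Poisson component. This is a swapped-in illustration and not the Barbiero and Ferrari (2014) procedure used in this study; the function names, rates, and seed are all hypothetical.

```python
import math
import random

def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def correlated_pair(rng, lam_x, lam_y, lam_shared):
    """Two Poisson counts that share a common Poisson component."""
    shared = poisson(rng, lam_shared)
    return poisson(rng, lam_x) + shared, poisson(rng, lam_y) + shared

def pearson(xs, ys):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
draws = [correlated_pair(rng, 1.0, 1.0, 1.0) for _ in range(10_000)]
xs, ys = zip(*draws)
sample_corr = pearson(xs, ys)
# Implied correlation:
# lam_shared / sqrt((lam_x + lam_shared) * (lam_y + lam_shared)) = 0.5
print(round(sample_corr, 2))
```

A construction of this kind only produces non-negative correlations, which is one reason a more general generator is preferable for a full correlation matrix.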
Step 4: Distance or similarity matrix
With consumer purchases for each product, we calculate Euclidean distance among products
across the 20,000 consumers. This distance matrix becomes input data for TDA and hierarchical
clustering. We also construct a similarity matrix for community detection methods that counts
each product pair’s joint purchase frequency as the number of consumers who purchase both
products among the 20,000 consumers.
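Both inputs can be sketched on a toy purchase matrix. The example below is illustrative only: it uses the two customers of Case 1 rather than the simulated data, and the variable names are our own.

```python
from itertools import combinations
from math import dist

# Rows are consumers, columns are products (Case 1 quantities).
products = ["a", "b", "c", "d"]
purchases = [
    [0, 1, 5, 5],  # customer 1
    [1, 1, 1, 2],  # customer 2
]

# One vector per product: its quantities across all consumers.
columns = {name: [row[j] for row in purchases]
           for j, name in enumerate(products)}

# Euclidean distance between products (input for TDA and hierarchical clustering).
distance = {(p, q): dist(columns[p], columns[q])
            for p, q in combinations(products, 2)}

# Joint purchase frequency: consumers who bought both products
# (input for community detection).
co_purchase = {(p, q): sum(1 for x, y in zip(columns[p], columns[q])
                           if x > 0 and y > 0)
               for p, q in combinations(products, 2)}

print(distance[("a", "b")], co_purchase[("a", "b")])  # 1.0 1
```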
Step 5: Product clustering
We create segments from this data using TDA, four different hierarchical cluster
algorithms, and five different community detection methods. Hierarchical clustering methods are
perhaps the most commonly used tool for segmenting and positioning products and brands
(Srivastava, Leone, and Shocker 1981; Punj and Stewart 1983; DeSarbo and DeSoete 1984;
Zhai, et al., 2011). The first three hierarchical clustering algorithms we use are single, average,
and complete linkage, which Johnson (1967) defines as the “standard” hierarchical clustering
algorithms. The fourth is Ward’s method (Ward 1963) which is known for working particularly
well with marketing data (Punj and Stewart 1983). Among them, the closest algorithm to TDA is
single linkage, which, at each step, combines the two clusters that contain the closest pair of
elements not yet belonging to the same cluster.
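The three standard linkage rules differ only in how the distance between two clusters is summarized. The following naive sketch makes that explicit. It is not the R "cluster" package routine used in this study, Ward's method is omitted, and the function name `agglomerate` is hypothetical; the data are the Case 1 purchase quantities.

```python
from itertools import combinations
from math import dist
from statistics import mean

# Cluster-to-cluster distance: smallest, largest, or mean pairwise distance.
LINKAGES = {"single": min, "complete": max, "average": mean}

def agglomerate(points, linkage="single"):
    """Naive agglomerative clustering; returns (merged cluster, height) pairs."""
    combine = LINKAGES[linkage]

    def gap(pair):
        left, right = pair
        return combine([dist(points[p], points[q]) for p in left for q in right])

    clusters = [(name,) for name in sorted(points)]
    merges = []
    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2), key=gap)
        height = gap((left, right))
        clusters = [c for c in clusters if c not in (left, right)] + [left + right]
        merges.append((left + right, height))
    return merges

case1 = {"a": (0, 1), "b": (1, 1), "c": (5, 1), "d": (5, 2)}
single = agglomerate(case1, "single")
complete = agglomerate(case1, "complete")
print([round(h, 2) for _, h in single])    # [1.0, 1.0, 4.0]
print([round(h, 2) for _, h in complete])  # [1.0, 1.0, 5.1]
```

The single linkage merge heights coincide with the filtration values at which Betti0 drops in Case 1, which is the sense in which single linkage is closest to TDA.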
1 To ensure the existence of a hole structure, we assign a lower mean value for the national product than for the local
products. Recall that a national product is sold in two markets, while local products are only sold in one market. A
higher mean value of the national product results by construction in a longer distance between the national and local
products. As a result, if a national product has too high of a mean value, it may not be part of a loopy segment.
Recently, community detection methods have been proposed as segmentation tools in network
analysis. The five community detection algorithms we use in this study are those developed by
Newman and Girvan (2004), Clauset, Newman and Moore (2004), Pons and Latapy (2005),
Raghavan, Albert and Kumara (2007), and Blondel et al. (2008). They differ in terms of
scalability and quality of detection.2 Netzer et al. (2012) introduced the community detection
method developed by Girvan and Newman (2002) for the first time in marketing, in order to
segment discussion of 169 car models in an online forum. Newman and Girvan (2004) extended
their previous paper by incorporating the weights of edges between vertices. Later, Clauset,
Newman, and Moore (2004), Pons and Latapy (2005), Raghavan, Albert, and Kumara (2007),
and Blondel et al. (2008) proposed new algorithms to process a large network quickly. The Louvain
method of Blondel et al. (2008) is known to show better performance in terms of speed and
accuracy. Recently, Ringel and Skiera (2016) adapted the Louvain method as one component of
their market structure map of more than 1,000 products from an online price and product
comparison site.
To run TDA, we used the JavaPlex implementation by Adams, Tausz, and Vejdemo-Johansson
(2014) through the MATLAB interface developed by Adams and Tausz (2015). For
hierarchical clustering and community detection methods, we use the R “cluster” and “igraph”
packages, respectively.
2.3.2. Simulation study results
Figure 2.4 shows the result of topological data analysis on the simulated data. In Scenario 1,
there are two intervals under Betti1 in the barcode chart, implying that TDA detects two loopy
segments. TDA also generates segment members and their connection order. Each segment
includes the appropriate local products and the national product N4, implying that the two local
segments are connected through the national product. In Scenario 2, there are four intervals
under Betti1. As expected from the generated data, Segments 3 and 4 are connected through the
national product N8, while there is no overlapping product between Segments 1 and 2. In summary,
TDA recovers the true segments well.
2 Related to these methods, Henderson, Iacobucci, and Calder (1998) and John et al. (2006) used survey-based
approaches to generate a brand-associative network.
Figure 2.4: TDA barcode chart for simulation study
Scenario 1 Scenario 2
Product type W means Wisconsin product, M means Massachusetts product, and N means national product.
Next, we turn to the results from hierarchical clustering methods in Figure 2.5. For both
Scenarios 1 and 2, the single linkage algorithm yields a different pattern than the others, putting
the national brand, N8, into its own segment. Generally, in Scenario 1, the single linkage
algorithm does successfully capture the different location groupings; however, in Scenario 2, the
single linkage algorithm groups products together that should be completely separate (segments
1 and 2). The national product also connects to segments 1 and 2 when it should be disconnected.
Although the single linkage algorithm is the most similar to TDA in terms of the intuition behind
the algorithm, it performs poorly because it does not allow for loopy segments.
Figure 2.5 Hierarchical clustering
Figure 2.5a Hierarchical clustering for Scenario 1 in simulation study
Figure 2.5b Hierarchical clustering for Scenario 2 in simulation study
Product type W means Wisconsin product, M means Massachusetts product, and N means national product.
The other three hierarchical clustering methods perform better, in the sense that they do group the
products into the appropriate two or four segments. Still, they do not capture the useful information
that the national product connects the two groups of local products (segments 1 and 2 in Scenario 1
and segments 3 and 4 in Scenario 2) because the algorithms force each product into only one segment.
While this feature of hierarchical clustering methods is often useful in marketing research analysis,
it means that connections across products in different markets are better found using TDA.
We next examine the results of a community detection method, using an R implementation of the
Louvain method (Blondel et al. 2008). Because the other four community detection methods
yielded the same results, the description that follows applies to all five methods. In Scenario 1,
the community detection methods generate two segments: {W1, W2, W3} and {N4, M5, M6,
M7}, failing to identify the gate product N4 because each product is assigned to only one
segment, as in the hierarchical clustering above. However, there is a potential way to detect the
gate product using a node betweenness centrality measure (Freeman 1977) in network analysis
with the assumption that a product (i.e. node) with high betweenness will connect local
segments. We show this potential approach with the richer example in Scenario 2.
Scenario 2 results are shown in Table 2.2. Column 1 shows the “true” segments according to the
simulation. Column 2 shows the TDA segments, and Column 3 shows the community detection
method segments. Here the community detection methods split the sample into two groups,
failing to capture the four distinct segments. This suggests that the community detection methods
we use segment products too broadly, perhaps because such network approaches use all
the given connection information when they generate clusters. This problem may be addressed by
Ringel and Skiera (2016), who extend the Louvain method (Blondel et al. 2008) by (1) adding a
“resolution” parameter and (2) combining a multilevel coarsening and refinement procedure
(Rotta and Noack 2011). However, the method of Ringel and Skiera (2016) does not
identify indirect connections either, because it does not allow a product to be allocated to
multiple segments (i.e. submarkets). We therefore do not implement their extension of the community
detection methods here, because detecting small segments is
not the strength of TDA that we emphasize in this paper. Rather, we explore
whether there is a potential way to identify indirect connections in the framework of network
analysis as a benchmark model.
Table 2.2: Community detection methods in Scenario 2 in the simulation study
Product Type | (1) True | (2) TDA | (3) Community detection | (4) Betweenness centrality | (5) Joint product purchase with national product N8 among 20,000 simulated customers
W1 | 1 | 1 | 1 | 0 | 1463
W2 | 1 | 1 | 1 | 0 | 1433
W3 | 1 | 1 | 1 | 0 | 1445
W4 | 1 | 1 | 1 | 0 | 727
W5 | 3 | 3 | 1 | 0 | 2399
W6 | 3 | 3 | 1 | 0 | 2123
W7 | 3 | 3 | 1 | 0 | 2413
N8 | 3,4 | 3,4 | 2 | 49 | N/A
M9 | 4 | 4 | 2 | 0 | 2538
M10 | 4 | 4 | 2 | 0 | 2203
M11 | 4 | 4 | 2 | 0 | 2473
M12 | 2 | 2 | 2 | 0 | 1491
M13 | 2 | 2 | 2 | 0 | 1493
M14 | 2 | 2 | 2 | 0 | 1499
M15 | 2 | 2 | 2 | 0 | 675
Columns 1 to 3 each show a different method. The numbers in each column represent the assigned segment according
to that method. Therefore the numbers are not related across columns. Only the national product has nonzero
betweenness centrality.
Next, while community detection methods do not directly identify gate products, it is possible to
take the constructed network and identify products with high node betweenness. Column 4
shows that the national product has high betweenness centrality, suggesting that, by adding this
step, the community detection methods can be used to help find the gate product. It is possible to
then look at co-purchasing patterns with this gate product in Column 5 and identify the indirectly
connected local products that TDA identifies. For example, W7 and M9 are especially likely
to be purchased with N8, correctly suggesting a linkage between them. However, as we
demonstrate in the empirical application below, this approach can be complicated if there are
multiple potential gate products. For example, if Scenario 2 is adapted so that there is another
national product N16, which is co-purchased often with W7 but rarely with M9, then it is not easy
to decide whether W7 and M9 are potentially competing. The difficulty will increase as the
number of national products increases.
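Node betweenness can be sketched with Brandes' algorithm for unweighted graphs. The network below is an assumption made for illustration: every pair of products within a market is taken to be co-purchased at least once, and N8 is linked to all products in both markets. The simulated network itself is not reproduced here, but this structure is consistent with the values in Column 4 of Table 2.2.

```python
from collections import deque
from itertools import combinations

def betweenness(adj):
    """Unnormalized node betweenness for an undirected, unweighted graph."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        order, preds = [], {v: [] for v in adj}
        paths = dict.fromkeys(adj, 0)
        depth = dict.fromkeys(adj, -1)
        paths[source], depth[source] = 1, 0
        queue = deque([source])
        while queue:  # breadth-first search counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if depth[w] < 0:
                    depth[w] = depth[v] + 1
                    queue.append(w)
                if depth[w] == depth[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):  # accumulate dependencies back to the source
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    return {v: s / 2 for v, s in score.items()}  # each pair was counted twice

# Hypothetical network: a clique per market, both containing N8.
wisconsin = [f"W{i}" for i in range(1, 8)]
massachusetts = [f"M{i}" for i in range(9, 16)]
adj = {name: set() for name in wisconsin + ["N8"] + massachusetts}
for market in (wisconsin, massachusetts):
    for p, q in combinations(market + ["N8"], 2):
        adj[p].add(q)
        adj[q].add(p)

centrality = betweenness(adj)
print(centrality["N8"])  # 49.0
```

Under this assumption N8 lies on the only shortest path between each of the 7 × 7 cross-market product pairs, which gives 49, while every local product has betweenness zero.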
In summary, TDA finds clear connections between the two local segments through the national
product. In this small product network, more familiar clustering methods can also show such a
link, but with additional effort required through manual checking of distances and values. As the
number of products grows, however, such effort becomes impractical. Thus, we interpret the
results of the simulation to suggest that TDA captures a potentially useful data pattern that is not
captured by hierarchical clustering or community detection methods.
2.4. Marketing application
2.4.1. Data and computation time
The IRI Marketing data set (Bronnenberg, Kruger & Mela 2008) has individual-level consumer
purchase data in two local cities: Eau Claire, Wisconsin and Pittsfield, Massachusetts. Consumers
in these cities have distinct tastes and there are some differences in product availability.
Therefore, this data set allows us to investigate whether TDA can detect potentially related
products across local markets. We also look for potentially related products by looking across
two categories, salty snacks and beers.
Like most other clustering methods, TDA uses the distance matrix among products as
input data. We calculate Euclidean distance among products across all the consumers’ purchase
quantities during a particular year, 2003. There are 6,352 consumers who meet IRI’s reporting
criteria (Kruger and Pagni 2011 page 16) across the two cities in salty snacks and 3,101 who
meet the reporting criteria in beer.
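The distance calculation described above can be sketched as follows. This is a minimal illustration, not the code used in the thesis: the purchase quantities are hypothetical, the product labels are borrowed from the simulation section, and the first two consumers are assumed to shop in Wisconsin and the last two in Massachusetts.

```python
from math import dist

# Hypothetical purchase quantities in one year: one vector per product,
# one entry per consumer (consumers 1-2 in Wisconsin, 3-4 in Massachusetts).
quantities = {
    "W7": [2, 1, 0, 0],  # Wisconsin local product
    "M9": [0, 0, 1, 2],  # Massachusetts local product
    "N8": [1, 1, 1, 1],  # national product
}

def distance_matrix(q):
    """Pairwise Euclidean distance between products' purchase vectors."""
    names = sorted(q)
    return {(a, b): dist(q[a], q[b]) for a in names for b in names}

d = distance_matrix(quantities)
for pair in [("W7", "N8"), ("M9", "N8"), ("W7", "M9")]:
    print(pair, round(d[pair], 3))
```

In this toy input each local product is closer to the national product than to the other local product, which is the kind of indirect connection the chapter looks for.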
As we discussed above, TDA is computationally intensive: computation time increases exponentially
as the number of products increases. To explore computational feasibility, we choose the top 10, 20,
30, 40, and 50 products in salty snacks and beer in each local market. Table 2.3 summarizes the
results. For the top 10 case, there are 15 salty snacks and 17 beers across the two local markets
because national products are available in both regions. Some products that are sold in both
regions have very low sales quantities in one local market. In this case, we classify the product as a
local product. We define a national product as a product that makes up more than 0.5% of category
sales in each market. With 32 products, TDA took just 0.5 seconds. With 119 products (top 40 in
each market, both categories), TDA took about 30 minutes. Finally, with 146 products (top 50 in each
market), our computer kept running without generating a TDA result. This demonstrates the
computational limits of TDA without a high performance computer. The 119 products cover
94% of salty snack sales and 89% of beer sales in these two markets. In most of what follows, we
show results on the 32 products (row 1 of Table 2.3) because the smaller number of products
allows for visual comparison of results with hierarchical clustering methods.
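The national/local classification rule stated above can be sketched as a small function. The 0.5% threshold comes from the text; the sales figures are hypothetical, and assigning a non-national product to the market where its share is larger is an assumption made here for illustration.

```python
THRESHOLD = 0.005  # 0.5% of category sales, as stated in the text

def classify(sales_wi, sales_ma, total_wi, total_ma, threshold=THRESHOLD):
    """Return 'N' (national), 'W' (Wisconsin), or 'M' (Massachusetts)."""
    share_wi = sales_wi / total_wi
    share_ma = sales_ma / total_ma
    # National: more than the threshold share of category sales in EACH market.
    if share_wi > threshold and share_ma > threshold:
        return "N"
    # Assumption: otherwise the product is local to the market with the larger share.
    return "W" if share_wi >= share_ma else "M"

# Hypothetical unit sales: (Wisconsin, Massachusetts), with category totals.
print(classify(500, 2, 10_000, 8_000))    # sold in both, negligible in Massachusetts
print(classify(300, 400, 10_000, 8_000))  # above threshold in both markets
print(classify(0, 100, 10_000, 8_000))    # sold only in Massachusetts
```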
Table 2.3 TDA results by the top N products in each market
2.4.2. Potential competitors within a category
Table 2.4a TDA for salty snacks
          Birth   Death   Interval Length   Salty Snacks                                  Product Type
Betti 1   466.8   469.8   3                 ROLD GOLD, BARREL O FUN, JAYS, TOSTITOS       N, W, W, N
          649.6   694.9   45.3              TOSTITOS, UTZ, ROLD GOLD, FRITOS, WAVY LAYS   N, M, N, N, N
Product type W means Wisconsin product, M means Massachusetts product, and N means national product.
Table 2.4 reports the results. We focus on loopy segments (Betti1 and Betti2) in order to highlight
the distinct results given by TDA. Table 2.4a shows the loopy segments for salty snacks. There
are two loopy segments with hole (𝐵𝑒𝑡𝑡𝑖1) structures. Figure 2.6 visually summarizes the
members of each segment. Two national products, Rold Gold and Tostitos, connect two
neighboring segments. This connection information is useful in identifying products that serve
the same role in different markets. The two national products are competing against (1)
Wisconsin local products Barrel O Fun and Jays in segment 1 and (2) Massachusetts local
product UTZ in segment 2.
Top N Products    No of Products         Market          Elapsed Time              No of Segment
in Each Market    across Two Markets     Coverage (%)*   (seconds)
                  S     B     S+B        S     B         S      B      S+B         S     B     S+B
10                15    17    32         69    61        0.4    0.4    0.5         2     10    29
20                29    33    62         85    76        1      0.7    11          12    66    163
30                41    49    90         90    84        4.8    2.9    97.9        35    189   486
40                56    63    119        94    89        15.5   5.4    1857.2      106   289   1035
50                68    78    146        96    92        Keep running
S: salty snacks; B: beer; S+B: combined.
*Market coverage is based on sales unit.
Figure 2.6: Potentially competing products across segments using IRI data
From this indirect relationship, a marketing manager learns that those local products have similar
positioning. In other words, if a marketing manager plans to launch the Midwestern (Wisconsin)
local product Barrel O Fun or Jays in Massachusetts, she can predict that it will be likely to
compete against East Coast (Massachusetts) local product UTZ, although those three local
products do not compete in the same market in our data.
We next examine whether these relationships appear using hierarchical clustering and
community detection methods. Figure 2.7 shows the results of hierarchical clustering of the salty
snack products. Massachusetts local product UTZ does not seem to be related to Wisconsin
local products Barrel O Fun and Jays. It is hard to see a connection between them in any of the
four hierarchical clustering methods. These results are driven by the fact that no consumer
purchases both Wisconsin and Massachusetts products. In summary, standard hierarchical
clustering cannot capture the pattern of indirect connection, unlike TDA. Table 2.5 shows the
results of five different community detection methods. Again, no method yields a segment that
specifically links local brands from the two regions: each method either separates the Wisconsin and
Massachusetts brands or places all products in a single segment. As in the simulation, it is possible
to use betweenness measures to try to identify connecting products. In this case, all the national
products yield similar betweenness measures, meaning that all nine national products connect all the
local products in the two regions. Then, to find potentially competing local products, one may need
to check joint product purchases with each of the nine national products, as in Column 5 in Table 2.2.
As we discussed in the simulation section, however, it is hard to see which local product in one region
is potentially competing against which local product in the other region, because each local product
has a different co-purchasing pattern with each of the national products.
Table 2.5: Community detection methods for salty snacks using IRI data
Salty Snacks     Product   Blondel et    Raghavan, Albert,    Pons and         Clauset, Newman,    Newman and
Brand            type      al. (2008)    and Kumara (2007)    Latapy (2005)    and Moore (2004)    Girvan (2004)
OLD DUTCH        W         1             1                    1                1                   1
BARREL O FUN     W         1             1                    1                1                   1
JAYS             W         1             1                    1                1                   1
LAYS             N         2             1                    1                2                   1
PRIVATE LABEL    N         2             1                    1                2                   1
DORITOS          N         2             1                    1                2                   1
WAVY LAYS        N         2             1                    1                2                   1
CHEETOS          N         2             1                    1                2                   1
PRINGLES         N         2             1                    1                1                   1
ROLD GOLD        N         2             1                    1                2                   1
TOSTITOS         N         2             1                    1                2                   1
FRITOS           N         2             1                    1                2                   1
SMART FOOD       M         2             1                    2                2                   1
UTZ              M         2             1                    2                2                   1
WISE             M         2             1                    2                2                   1
Each column shows a different method. The numbers in the column represent the assigned segment according to
that method. Therefore the numbers are not related across columns. Product type W means Wisconsin product, M
means Massachusetts product, and N means national product.
Figure 2.7 Hierarchical clustering for salty snacks using IRI data
Next, we analyze the beer category. Table 2.4b shows that TDA generates 10 loopy segments with
8 holes (𝐵𝑒𝑡𝑡𝑖1) and 2 voids (𝐵𝑒𝑡𝑡𝑖2). Here national beer products connect products from the
different local markets, even within a segment. For example, row 8 contains two national brands,
a Massachusetts brand, and two Wisconsin brands (rows 1, 6, 9, and 10 have similar diversity).
Figure 2.8a visualizes the segment in row 8 of Table 2.4b. Two national products Bud Light and
Heineken connect the only Massachusetts brand in the 32 product data set (Michelob Light) with
two Wisconsin brands (Miller Genuine Draft and Miller Genuine Draft Light).
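The "Both Locals" column of Table 2.4b can be read as a simple membership check on the product types in a segment. The sketch below assumes that reading (a segment is flagged when it contains at least one Wisconsin and one Massachusetts product); the function name is hypothetical, and the type lists are copied from rows 1, 2, and 8 of the table.

```python
def has_both_locals(product_types):
    """True when a segment contains both Wisconsin (W) and Massachusetts (M) products."""
    return "W" in product_types and "M" in product_types

# Product types copied from rows 1, 2, and 8 of Table 2.4b.
row_1 = ["N", "N", "W", "W", "M"]
row_2 = ["N", "N", "W", "N", "W"]
row_8 = ["N", "M", "N", "W", "W"]

for label, types in [("row 1", row_1), ("row 2", row_2), ("row 8", row_8)]:
    print(label, "Y" if has_both_locals(types) else "N")
```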
Table 2.4b TDA for beers
No Birth Death Interval Length Beers Product Type Both Locals
Betti 1
1 148.6 162.6 14 SMIRNOFF TWISTED V, HEINEKEN, LEINENKUGEL, MICHELOB GOLDEN DRAFT LIGHT, MICHELOB LIGHT N, N, W, W, M Y
2 129.2 175.5 46.3 SMIRNOFF TWISTED V, HEINEKEN, MILLER GENUINE DRAFT, CORONA EXTRA, LEINENKUGEL N, N, W, N, W N
3 169.5 175.5 6 SMIRNOFF TWISTED V, CORONA EXTRA, HEINEKEN, MILLER GENUINE DRAFT LIGHT N, N, N, W N
4 126.3 195.4 69.1 SMIRNOFF TWISTED V, HEINEKEN, LEINENKUGEL, COORS LIGHT, MILLER GENUINE DRAFT N, N, W, N, W N
5 191.5 201.4 9.9 MILLER GENUINE DRAFT, CORONA EXTRA, MICHELOB ULTRA, SMIRNOFF ICE W, N, N, N N
6 144.4 225.7 81.3 MILLER GENUINE DRAFT, HEINEKEN, MICHELOB LIGHT, MICHELOB GOLDEN DRAFT LIGHT W, N, M, W Y
7 213.6 246.6 33 SMIRNOFF TWISTED V, HEINEKEN, MILLER LITE, MICHELOB LIGHT N, N, N, M N
8 181.7 286.1 104.4 BUD LIGHT, MICHELOB LIGHT, HEINEKEN, MILLER GENUINE DRAFT LIGHT, MILLER GENUINE DRAFT N, M, N, W, W Y
Betti 2
9 309.8 373.6 63.8 MILLER GENUINE DRAFT, MICHELOB LIGHT, MILLER GENUINE DRAFT LIGHT, MICHELOB ULTRA, HEINEKEN, CORONA EXTRA, SMIRNOFF TWISTED V, BUD LIGHT W, M, W, N, N, N, N, N Y
10 375.5 380.4 4.9 BUDWEISER, MILLER GENUINE DRAFT, OLD MILWAUKEE, HEINEKEN, MICHELOB LIGHT, MICHELOB GOLDEN DRAFT LIGHT N, W, W, N, M, W Y
Product type W means Wisconsin product, M means Massachusetts product, and N means national product.
Furthermore, national products also connect local products across segments as described in the
salty snacks category and in the simulation: Heineken, Smirnoff Twisted V, Corona Extra and
other national brands appear in multiple segments. Table 2.4b also highlights a limitation of
looking for loopy segments using TDA: There is some repetition of products across segments.
This means that TDA is a useful starting point for identifying potentially interesting connections
between products, but further analysis is needed to assess the strength and validity of those
connections.
2.4.3. Potentially related products across categories
Next, we combine the salty snack and beer data together (32 total products) to see whether TDA
can find products that might be purchased together, if they were available in the same market.
The rightmost columns in Table 2.3 show that TDA generates many more segments from the
combined data (beer + salty snack) than separate product data. For example, in the top 10
product case, there are 2 salty snack segments, 10 beer segments, and 29 combined (salty snack
and beer) segments.
Table 2.4c shows the segment members from the combined data. Most segments (19 of 29) have
both salty snacks and beer products, providing insight into why the combined data have more
segments than the separate data. Given the underlying data, this makes sense: Even if a customer
always buys the same beer brand and the same salty snacks brand, these brands are connected in
the combined data, and this connection provides insight into which categories and products tend to
be purchased by the same customers.
Figure 2.8: Potentially related products across segments using IRI data with order of connection
Table 2.4c TDA for the combined data
Prefix b and s mean beer and salty snack, respectively. Product type W, M, and N means Wisconsin, Massachusetts, and national product, respectively.
No Birth Death Interval Length Salty Snack & Beers Product Type Both Products Potentially Complementary
Betti 1
1 237.4 259.8 22.4 bSMIRNOFF TWISTED V, bHEINEKEN, bLEINENKUGEL, bMICHELOB GOLDEN DRAFT LIGHT, bMICHELOB LIGHT bN, bN, bW, bW, bM N N
2 206.4 280.4 74 bSMIRNOFF TWISTED V, bHEINEKEN, bMILLER GENUINE DRAFT, bCORONA EXTRA, bLEINENKUGEL bN, bN, bW, bN, bW N N
3 270.7 280.4 9.7 bSMIRNOFF TWISTED V, bCORONA EXTRA, bHEINEKEN, bMILLER GENUINE DRAFT LIGHT bN, bN, bN, bW N N
4 201.7 312.2 110.5 bSMIRNOFF TWISTED V, bHEINEKEN, bLEINENKUGEL, bCOORS LIGHT, bMILLER GENUINE DRAFT bN, bN, bW, bN, bW N N
5 199.6 315.2 115.6 bMILLER HIGH LIFE, sUTZ, bMILLER GENUINE DRAFT, bHEINEKEN bN, sM, bW, bN Y Y
6 250.2 315.2 65 bMILLER GENUINE DRAFT, sSMART FOOD, bHEINEKEN, bMILLER HIGH LIFE bW, sM, bN, bN Y Y
7 305.8 321.6 15.8 bMILLER GENUINE DRAFT, bCORONA EXTRA, bMICHELOB ULTRA, bSMIRNOFF ICE bW, bN, bN, bN N N
8 230.7 360.5 129.8 bMILLER GENUINE DRAFT, bHEINEKEN, bMICHELOB LIGHT, bMICHELOB GOLDEN DRAFT LIGHT bW, bN, bM, bW N N
9 356.8 365.8 9 bMICHELOB LIGHT, sPRINGLES, bSMIRNOFF TWISTED V, bHEINEKEN bM, sN, bN, bN Y N
10 341.2 393.9 52.7 bSMIRNOFF TWISTED V, bHEINEKEN, bMILLER LITE, bMICHELOB LIGHT bN, bN, bN, bM N N
11 306.7 428.9 122.2 bSMIRNOFF TWISTED V, bCORONA EXTRA, bMICHELOB LIGHT, sBARREL O FUN, bSMIRNOFF ICE bN, bN, bM, sW, bN Y Y
12 425.1 434.9 9.8 bMICHELOB ULTRA, bMILLER GENUINE DRAFT, bSMIRNOFF ICE, sROLD GOLD bN, bW, bN, sN Y N
13 423.2 440.8 17.6 bSMIRNOFF ICE, sOLD DUTCH, bMILLER GENUINE DRAFT, sJAYS, sWAVY LAYS bN, sW, bW, sW, sN Y N
14 422.6 442.9 20.3 bSMIRNOFF TWISTED V, sJAYS, sWAVY LAYS, bSMIRNOFF ICE bN, sW, sN, bN Y N
15 290.2 456.9 166.7 bBUD LIGHT, bMICHELOB LIGHT, bHEINEKEN, bMILLER GENUINE DRAFT LIGHT, bMILLER GENUINE DRAFT bN, bM, bN, bW, bW N N
16 423.7 477.7 54 bHEINEKEN, sFRITOS, bMILLER GENUINE DRAFT, bCOORS LIGHT bN, sN, bW, bN Y N
17 430.9 511.9 81 bSMIRNOFF TWISTED V, bCORONA EXTRA, bHEINEKEN, sUTZ, bMILLER LITE bN, bN, bN, sM, bN Y N
18 407.6 544.4 136.8 bCORONA EXTRA, sCHEETOS, bMICHELOB LIGHT, bHEINEKEN bN, sN, bM, bN Y N
Betti 2
19 470.5 474.8 4.3 bSMIRNOFF TWISTED V, bSMIRNOFF ICE, sTOSTITOS, sWAVY LAYS, sOLD DUTCH, sBARREL O FUN, bMILLER GENUINE DRAFT, sJAYS bN, bN, sN, sN, sW, sW, bW, sW Y N
20 440.4 477.2 36.8 bSMIRNOFF TWISTED V, bCORONA EXTRA, sBARREL O FUN, bMILLER GENUINE DRAFT, sTOSTITOS, bMICHELOB LIGHT, bHEINEKEN bN, bN, sW, bW, sN, bM, bN Y Y
21 468.2 477.2 9 bSMIRNOFF ICE, sROLD GOLD, sBARREL O FUN, bSMIRNOFF TWISTED V, sTOSTITOS, bHEINEKEN, bCORONA EXTRA, bMILLER GENUINE DRAFT bN, sN, sW, bN, sN, bN, bN, bW Y N
22 455.4 495.9 40.5 bSMIRNOFF TWISTED V, bLEINENKUGEL, sTOSTITOS, bMICHELOB ULTRA, bSMIRNOFF ICE, sROLD GOLD, bHEINEKEN, bMILLER GENUINE DRAFT, bCORONA EXTRA bN, bW, sN, bN, bN, sN, bN, bW, bN Y N
23 466.8 511.9 45.1 bMILLER GENUINE DRAFT, bHEINEKEN, bMILLER HIGH LIFE, bSMIRNOFF TWISTED V, sUTZ, bCORONA EXTRA, bMICHELOB ULTRA bW, bN, bN, bN, sM, bN, bN Y Y
24 413.1 534.1 121 bMICHELOB ULTRA, bSMIRNOFF ICE, sFRITOS, bSMIRNOFF TWISTED V, bHEINEKEN, bCORONA EXTRA bN, bN, sN, bN, bN, bN Y N
25 494.8 596.8 102 bMILLER GENUINE DRAFT, bMICHELOB LIGHT, bMILLER GENUINE DRAFT LIGHT, bMICHELOB ULTRA, bHEINEKEN, bCORONA EXTRA, bSMIRNOFF TWISTED V, bBUD LIGHT bW, bM, bW, bN, bN, bN, bN, bN N N
26 556.6 597.1 40.5 bMILLER GENUINE DRAFT, bMICHELOB LIGHT, bMILLER HIGH LIFE, bMICHELOB ULTRA, bHEINEKEN, bCORONA EXTRA, bSMIRNOFF TWISTED V, sSMART FOOD bW, bM, bN, bN, bN, bN, bN, sM Y Y
27 599.7 607.7 8 bBUDWEISER, bMILLER GENUINE DRAFT, bOLD MILWAUKEE, bHEINEKEN, bMICHELOB LIGHT, bMICHELOB GOLDEN DRAFT LIGHT bN, bW, bW, bN, bM, bW N N
28 613 634.8 21.8 bMILLER GENUINE DRAFT, sROLD GOLD, sOLD DUTCH, bSMIRNOFF TWISTED V, bOLD MILWAUKEE, bMICHELOB ULTRA, bHEINEKEN bW, sN, sW, bN, bW, bN, bN Y N
29 652 654.2 2.2 bHEINEKEN, bCORONA EXTRA, sPRINGLES, sWAVY LAYS, bMILLER GENUINE DRAFT, sSMART FOOD bN, bN, sN, sN, bW, sM Y Y
We also find potentially related products across categories in seven segments, as marked in the
rightmost column in Table 2.4c. Figure 2.8 visualizes the segment in row 5. Massachusetts salty
snack UTZ and Wisconsin beer Miller Genuine Draft are in the same segment. Once again, this
is mainly due to their connection with national products. TDA provides the order of connection:
(1) UTZ + Miller High Life, (2) Miller Genuine Draft + Heineken, (3) Miller High Life +
Heineken, and (4) UTZ + Miller Genuine Draft. Each local product connects with a national
product first and then the Massachusetts salty snack and Wisconsin beer get connected. This
suggests that the purchase behavior of people who buy UTZ in Massachusetts is similar to the
purchase behavior of people who buy Miller Genuine Draft in Wisconsin. This information could
be used to inform product launches across markets. Alternatively, it might help generate
advertising ideas; for example, UTZ ads could borrow elements from a successful Miller Genuine
Draft campaign.
2.4.4. Relationship between a segment’s birth and its product diversity
In the above analysis, we focused on a relatively small number of products in order to facilitate
comparison with hierarchical clustering and to ease the communication of the content of the
various segments. When more products are included, TDA can generate more loopy segments. In
this section, we explore how TDA measures of birth filtration value help identify the interesting
segments. To do so, we now use the 119 total products (top 40 in each category in each market)
that make up 94% of salty snacks sales and 89% of beer sales.
Birth filtration value is a useful metric because it measures how unusual a particular grouping is
likely to be. TDA groups products that are close to each other first. Segments that emerge late
are more likely to leverage the distinct insights that the topological approach offers. In particular,
we are interested in detecting loopy segments that connect regionally distinct local products
through national products. Because the connections are indirect, those loopy segments tend to
form later.
We next correlate birth filtration value with the diversity of product members within a segment.
We focus on diversity because, as argued above, a key use of TDA is to identify connections that
other methods would not. In this paper we have emphasized separate local markets. We measure
diversity as follows. We first order all the products in the same local market by quantity. We
assign each a rank based on this ordering, and take the difference of the rank across the two local
markets. This difference is positive if the product is a Massachusetts product, negative if
Wisconsin, and close to zero if national. Finally, we calculate the standard deviation of the rank
gaps within a segment.
This gives a sense of the variation of the location of sales for the products in the segment: If the
segment has a mix of strongly Wisconsin and strongly Massachusetts products, this diversity
measure will be high. If the segment is mostly national products (or mostly from just one region),
the diversity measure will be low. If the segment contains both national products and products
from one region, the diversity measure will be in the middle.
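The diversity measure described above can be sketched in a few lines. The quantities and product names below are hypothetical. Two conventions are assumptions made here because the text does not fix them: rank 1 is the best seller in a market, and the rank gap is the Wisconsin rank minus the Massachusetts rank, so that the gap is positive for Massachusetts products. The sample standard deviation is used.

```python
from statistics import stdev

def ranks(quantity):
    """Rank products within one market; rank 1 is the best seller (assumed)."""
    ordered = sorted(quantity, key=quantity.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}

def rank_gaps(qty_wi, qty_ma):
    """Wisconsin rank minus Massachusetts rank for every product."""
    r_wi, r_ma = ranks(qty_wi), ranks(qty_ma)
    return {p: r_wi[p] - r_ma[p] for p in qty_wi}

def diversity(segment, gaps):
    """Standard deviation of the rank gaps of a segment's members."""
    return stdev([gaps[p] for p in segment])

# Hypothetical purchase quantities in each market.
qty_wi = {"NAT_A": 90, "NAT_B": 70, "WI_A": 80, "WI_B": 60, "MA_A": 1, "MA_B": 0}
qty_ma = {"NAT_A": 85, "NAT_B": 75, "WI_A": 0, "WI_B": 1, "MA_A": 80, "MA_B": 60}
gaps = rank_gaps(qty_wi, qty_ma)

mixed = diversity(["WI_A", "NAT_A", "MA_A"], gaps)       # both regions
one_region = diversity(["WI_A", "WI_B", "NAT_A"], gaps)  # one region plus national
national = diversity(["NAT_A", "NAT_B"], gaps)           # national only
print(mixed, one_region, national)
```

With these inputs the ordering matches the description in the text: the mixed-region segment scores highest, the national-only segment lowest, and the one-region segment falls in between.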
Table 2.6 The relationship between a segment’s “birth” filtration value and its product diversity
                          Betti 1                            Betti 2
                          Effect              No of          Effect              No of
                                              Segment                            Segment
Salty Snack   All         0.039 (0.027)       56             0.065** (0.020)     50
              Cut at 3    0.039 (0.027)       55             0.060*** (0.019)    41
Beer          All         0.12*** (0.043)     127            0.022** (0.01)      162
              Cut at 3    0.12*** (0.042)     124            0.022** (0.01)      153
Combined      All         0.065*** (0.019)    370            0.068*** (0.009)    665
              Cut at 3    0.061*** (0.018)    364            0.063*** (0.009)    631
Each number is the coefficient on product diversity from a regression of birth on product diversity. The
number of observations is the number of segments. ***p < 0.01; ** p< 0.05; *p<0.10
Table 2.6 shows the relationship between a segment’s birth filtration value and its product
diversity. We run separate regressions of filtration value on diversity for 𝐵𝑒𝑡𝑡𝑖1 and 𝐵𝑒𝑡𝑡𝑖2
groupings. We also show results that drop short-lived segments, which may occur due to noise in
data (Lesnick 2013). From visual inspection, we chose 3 as the cut-off value for eliminating
segments. Thus, we show twelve regressions: three product cases, two Betti groupings, and
with/without the short-lived segments. The coefficients are all positive and 10 of 12 are
significant, implying that the segments that are formed late are more likely to have mixed local
products across the two cities (i.e. product diversity), as expected. The two non-significant slopes
are for salty snacks 𝐵𝑒𝑡𝑡𝑖1, which has fewer segments than beer or the combined analysis,
suggesting that this might be an issue with statistical power.
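The regressions summarized in Table 2.6 can be illustrated with a bare-bones slope calculation. This is a sketch under stated assumptions, not the thesis code: the segments are hypothetical, a segment is treated as short-lived when its interval length (death minus birth) is below the cut-off of 3, and only the slope of birth on diversity is computed, without standard errors or significance tests.

```python
def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

def birth_on_diversity(segments, cutoff=None):
    """Regress birth on diversity, optionally dropping short-lived segments."""
    if cutoff is not None:
        # Assumption: drop segments whose interval length is below the cut-off.
        segments = [s for s in segments if s["death"] - s["birth"] >= cutoff]
    slope = ols_slope([s["diversity"] for s in segments],
                      [s["birth"] for s in segments])
    return slope, len(segments)

# Hypothetical segments (birth and death filtration values, diversity score).
segments = [
    {"birth": 100, "death": 150, "diversity": 1.0},
    {"birth": 200, "death": 201, "diversity": 5.0},  # short-lived
    {"birth": 300, "death": 340, "diversity": 3.0},
    {"birth": 400, "death": 460, "diversity": 4.0},
]

slope_all, n_all = birth_on_diversity(segments)
slope_cut, n_cut = birth_on_diversity(segments, cutoff=3)
print(slope_all, n_all)
print(slope_cut, n_cut)
```

A positive slope corresponds to the pattern reported in the table: segments that are born later tend to have higher product diversity.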
In summary, we show that TDA can detect high diversity segments that include local products in
regionally distinct markets, particularly as the filtration value increases. If there are many high
birth filtration value segments, a final step in identifying the potentially most interesting
segments is to look for those with longer filtration range as longer intervals suggest more robust
segments.
2.5. Conclusions
In this paper, we have applied Topological Data Analysis to a particular marketing application.
We have shown that TDA is effective at identifying connections between products that are not
purchased together but hold similar positioning in geographically distinct markets.
A key open question is whether the assumption of transitive preferences holds across settings. In
particular, our framework assumes that two objects that have no direct relationship with each
other, but are bought with a third object, are indirectly related. We have not directly tested this
assumption because we do not have data on several cross-market product launches and data on
pre-launch sales across locations. Furthermore, it is worth exploring whether this assumption holds
across market types. For example, it might hold in our setting for consumer non-durables, but it
might not hold for durables or in business-to-business markets. Relatedly, while the IRI data are
ideal in the sense that they have rich customer-level data in two distinct markets, the products do
not have sufficiently rich attribute information to check that they serve similar roles by clustering
products on attributes. In other words, we have shown the potential usefulness of TDA but leave
a field test for future work.
Generally, TDA is a new data mining tool and we anticipate other marketing applications. We
anticipate that those applications will primarily identify opportunities and complement
other types of analysis. In this way, TDA should not be seen as a final step for segment analysis,
but as a useful part of a more comprehensive analysis. It is exploratory and, as with all
segmentation methods, it does not yield a legitimate causal interpretation. Nevertheless, we
believe Topological Data Analysis should be seen as a useful tool in market structure and
segmentation analysis.
Chapter 3
Does comparative advertising reposition rival brands closer
together or further apart in market-structure maps in search
stage?
3.1 Introduction
Brands with a lower market share often run comparative advertising campaigns against market
leaders. One of the most famous examples is Apple’s “I’m a Mac/I’m a PC” campaign, targeting
Microsoft, which ran from 2006 to 2009. More recently, when Samsung was ranked fifth in the
smartphone market in 2011, it started a comparative advertising campaign targeting the market
leader, the Apple iPhone. There have been many more recent examples of comparative
advertising between rival brands: Domino’s Pizza vs. Subway; Dunkin’ Donuts and Caribou
Coffee vs. Starbucks; Oscar Mayer vs. Ball Park; Verizon vs. AT&T; and General Motors’
Chevy Silverado and Chrysler Ram vs. the Ford F-150.
How does comparative advertising affect the positions of rival brands in a market-structure map
in the brand search stage? Given that consumers gain information about both brands in comparative
advertising, they may do less brand co-searching. However, research suggests that the general
effect of comparative advertising is associative rather than differentiating (Johnson and Horne,
1988; Miniard, Rose, Barone, and Manning, 1993). These previous studies were completed
mainly through laboratory experiments. Potential reasons for the findings are as follows.
Comparative advertising is effective as it directly contrasts product attributes, drawing attention
to and promoting recall of both the advertising and the target (rival) brands (Muehling, Stoltman,
and Grossbart, 1990). Moreover, due to their limited cognitive capacity, consumers consider only
a few brands when making purchases, implying that some consumers may substitute non-
advertised brands with advertised brands in their consideration set. As a result, comparative
advertising is likely to decrease the distance between the advertising brand and the target brand
in the consumer mind-set.
In this study, we test the effect of comparative advertising on brand position using consumer
behaviour in a marketplace, unlike previous studies done in a laboratory. Recent studies show
that television commercials result in an increase in consumer searches (Joo, Wilbur, Cowgill and
Zhu, 2015; Liaukonyte, Teixeira and Wilbur, 2015; Hu, Du, and Damangir, 2014; Du, Hu, and
Damangir, 2015). Thus, using publicly available aggregate consumer search data from Google
Trends, we measure how distances between brands change with comparative advertising.
We draw two types of market-structure maps to locate brands in (1) pure brand space and (2)
product space. In the first approach, we measure distances between brands directly. For this
brand space map, we measure the weekly co-occurrence of brand searches (e.g., Samsung and
Apple, HTC and Apple) for all the major brand pairs at the U.S. national level. Then, we draw a
map using multidimensional scaling. The more often two brands are searched together, the closer
they are positioned on the map.
In the second approach, we measure brand distances indirectly using the relationship between
brands and attributes. For this product space map, we count weekly brand–product
attribute co-occurrences (e.g., Samsung screen, Apple screen) for all the pairs between brands
and attributes at the U.S. national level. Then, we use correspondence analysis, where each brand
is located closest to the attribute most co-searched with it. If two brands share a most co-searched
attribute, they are likely to be located close together in the product space map. For example, if
Samsung and Apple are both often searched together with “screen,” both brands are likely to be
located close to “screen.” Therefore, this second approach using product space allows us to
investigate whether two brands move closer together due to common product attributes mentioned
in comparative advertising.
Once we have a map, we can easily measure time-varying brand distances. Using the brand
distances derived from U.S. nationwide weekly co-search volume, we test whether the target brand
moves closer to the advertising brand than to non-advertised brands. We apply this
difference-in-differences (diff-in-diff) test to both approaches at the U.S. national level.
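The diff-in-diff comparison described above can be illustrated with a means-based sketch. The weekly distances below are hypothetical, and the calculation is a simplified stand-in for the regression-based test: it compares the change in Apple's distance to the advertising brand (Samsung) with the average change in Apple's distance to the non-advertised brands.

```python
from statistics import mean

# Hypothetical weekly distances between Apple and each brand on a brand space map.
distances = {
    "Samsung":    {"pre": [1.0, 1.2, 1.1], "during": [0.6, 0.5, 0.7]},
    "HTC":        {"pre": [1.0, 0.9, 1.1], "during": [1.0, 0.9, 0.8]},
    "Blackberry": {"pre": [1.4, 1.5, 1.3], "during": [1.3, 1.4, 1.5]},
}

def diff_in_diff(distances, treated, period="during"):
    """Change for the treated brand minus the mean change for control brands."""
    change = {brand: mean(d[period]) - mean(d["pre"])
              for brand, d in distances.items()}
    control = mean([v for brand, v in change.items() if brand != treated])
    return change[treated] - control

estimate = diff_in_diff(distances, treated="Samsung")
print(round(estimate, 2))
```

A negative estimate means that Apple moved closer to Samsung than to the other brands during the campaign; a value near zero means no relative repositioning.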
By analyzing Samsung’s nationwide comparative TV advertising campaign against market
leader Apple iPhone in the U.S., we find empirical evidence that supports comparative
advertising’s associative effect on brand position. First, during the TV campaign, co-search
volume on Samsung and Apple rises sharply, while co-search volume on Apple and other
non-advertised brands (i.e., Blackberry and HTC) increases only slightly. Second, reflecting the
increase in co-searches on the advertised rival brands, the brand space map shows that Apple
moves closer to Samsung than it does to the other brands during the campaign. Third, in the first
week of the ad campaign, co-search volume of each brand with the advertised product attributes,
such as screen, videos, and movies, increases, but it does not increase much during the rest of the
campaign. Fourth, as a result, the relative distance between Apple and Samsung does not decrease
significantly in the product space map during the campaign. These results suggest that comparative
advertising moves the two brands closer together mainly through direct co-searches on both rival
brands, and much less through advertised product attributes.
Our contributions are as follows. First, we test the effect of comparative advertising on brand
position in market-structure maps; our study is the first to use consumer search data, and we
show that the theory holds in a marketplace. Second, we add empirical evidence to the rapidly
growing literature on advertising content. Last but not least, we propose one practical way to
monitor real-time marketing effectiveness using publicly available search data from Google
Trends.
We organize the rest of this chapter as follows. In §3.2.1, we discuss why mobile phones are
relevant in studying the effect of advertising on consumer searches. Then, in §3.2.2, we
introduce our focal comparative advertising campaign. In §3.3.1, we explain our empirical
strategy, and then show the results using the direct approach in brand space in §3.3.2. Then, in
§3.3.3, we investigate one potential mechanism that moves the rival brands closer together using
the indirect approach in product space. We conclude in §3.4.
3.2. Data
3.2.1. Why a mobile phone?
We aim to evaluate whether comparative advertising moves rival brands closer together
or further apart, based on consumer behavior in a marketplace. In the consideration stage of the
purchase funnel, consumers begin to search actively for product information. Thus, we use
consumer searches as the measure of consumer behavior in this study. Recent studies also show
that TV advertising increases consumer search volume (Joo, Wilbur, Cowgill, and Zhu 2015;
Liaukonyte, Teixeira, and Wilbur 2015; Hu, Du, and Damangir 2014; Du, Hu, and Damangir
2015). For this purpose, we seek a good that consumers search for intensively.
While most durable goods could be used in this study, we decided to focus on mobile
phones to ensure adequate search volume. As mobile phones are a personal durable good, almost
everyone has one. Thus, many consumers search for a mobile phone. Furthermore, their life
cycle is short, which means that consumers are often in the market for a new phone and search
online for one.
Figure 3.1 Search volume trend by brand
Notes: This figure was captured from Google Trends.
Table 3.1 Google Trends queries used for extracting brand search trends
Brand        Query
Apple        Apple + iPhone
Blackberry   Blackberry
HTC          HTC + Evo
Samsung      Samsung + Galaxy
Notes: A single-term query will match all the searches containing that term. Plus signs between terms/phrases cause the query to match searches with either of the terms/phrases.
[Figure 3.1 labels: Apple; iPhone Launch]
Before we conduct our main analysis, we check whether search volumes from Google Trends
reflect the mobile phone market well. Figure 3.1 shows search volume trends by brand. As
shown in Table 3.1, we count searches for the smartphone brand as well as its family brand (e.g.,
iPhone + Apple). One can see that search volumes for Apple or iPhone have stayed high since the
iPhone launched in 2007. There are several spikes, which seem to correspond with new product
launches. Next, we conduct a competition analysis using brand co-search volumes for all the brand
pairs. If two brands each have both a family brand and a smartphone brand, there are four brand
pairs (e.g., Apple Samsung + Apple Galaxy + iPhone Samsung + iPhone Galaxy; see Table 3.2 for others).
Figure 3.2 shows that, based on brand co-searches, Apple’s top rival brand has changed from
Blackberry to HTC and Samsung. Interestingly, major spikes occurred in the co-searches
between a market leader and competing brands.
Figure 3.2 Trend in Apple’s top rival brands
Notes: This figure was captured from Google Trends.
[Figure 3.2 labels: Apple vs Blackberry; Apple vs HTC; Apple vs Samsung]
Table 3.2 Google Trends queries used for extracting brand co-search trends with Apple
Brand Pairs            Query
Blackberry and Apple   Blackberry Apple + Blackberry iPhone
HTC and Apple          HTC Apple + HTC iPhone + Evo Apple + Evo iPhone
Samsung and Apple      Samsung Apple + Samsung iPhone + Galaxy Apple + Galaxy iPhone
Notes: Terms separated by a space will match searches with all the terms. Plus signs between terms/phrases cause the query to match searches with either of the terms/phrases.
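The query strings in Tables 3.1 and 3.2 follow a regular pattern that can be generated programmatically. The sketch below only builds the strings; it does not call any Google Trends interface, and the helper names are hypothetical.

```python
from itertools import product

# Family brand and smartphone brand terms, as listed in Table 3.1.
BRAND_TERMS = {
    "Apple": ["Apple", "iPhone"],
    "Blackberry": ["Blackberry"],
    "HTC": ["HTC", "Evo"],
    "Samsung": ["Samsung", "Galaxy"],
}

def brand_query(brand):
    """Single-brand query: any of the brand's terms (joined by plus signs)."""
    return " + ".join(BRAND_TERMS[brand])

def co_search_query(brand, target="Apple"):
    """Co-search query: every pairing of the two brands' terms."""
    pairs = product(BRAND_TERMS[brand], BRAND_TERMS[target])
    return " + ".join(f"{a} {b}" for a, b in pairs)

print(brand_query("Samsung"))
for brand in ["Blackberry", "HTC", "Samsung"]:
    print(co_search_query(brand))
```

Each space-separated pair matches searches containing both terms, and the plus signs combine the pairs, so a brand with two terms paired against Apple's two terms yields four pairs.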
3.2.2. Comparative advertising campaign
We analyze Samsung’s comparative advertising campaign against the market leader,
Apple iPhone. Based on the quantity of smartphone sales from March to May 2011, Apple and
Samsung were ranked first and fifth, respectively (see Table 3.3). To catch up with Apple,
Samsung invested heavily in developing new products and advertising. As one such effort,
Samsung aired its first “The Next Big Thing is already here” TV commercial on November 24,
2011 (Thanksgiving Day). The commercial emphasized the superiority of the Galaxy S2, launched
in September 2011, touting better product attributes (e.g., bigger screen, faster speed,
and longer battery life) than the iPhone 4S unveiled in October 2011. By mocking Apple
fanboys and fangirls, who always wait for a long time in a long line, the Samsung
campaign became one of the most viral TV commercials.
Table 3.3 U.S. smartphone market share from March to May 2011
Brand Market Share
Apple 26.6
Blackberry 24.7
HTC 11.8
Motorola 11.4
Samsung 8.9
Others 16.6
source: comScore MobiLens
After receiving a strong response from the campaign, Samsung ran “The Next Big Thing”
commercial series for several years, featuring their new smartphones or tablets. In this study, we
focus on the first campaign, which was run for about 5 weeks from November 24 to December
26, 2011. We also analyze the post-campaign period to see whether the effect lasted after the TV
commercial was no longer airing. As Samsung ran another comparative advertising campaign
against Apple four weeks after the focal campaign, the three-week span between the two
campaigns is our post-campaign period.
Samsung made two types of content in the first campaign, which we analyze in this study. The
content of the full 60-second version of the first commercial includes the Galaxy S2’s big screen
and fast 4G network, and iPhone’s short battery life. Samsung also aired five 30-second versions:
one for both screen and battery, three for only screen, and one for only 4G. To emphasize its big
screen, in the ad a woman watches video on her phone. The second ad campaign features cloud
service for storing music and movies; however, it had only one 30-second version and ran for
only three weeks. Overall, during the first “Next Big Thing” campaign, Samsung emphasized its
big screen for watching movies or videos the most in terms of the amount of time aired.
3.3. Empirical strategy and results
3.3.1. Empirical strategy
In this study, we test whether comparative advertising repositions rival brands closer together
based on changes in consumer search volumes. In order to measure brand distances using
aggregate search volume, we use two types of market-structure maps based on consumers’
search strategies. Then, using the brand distances obtained from each map, we do diff-in-diff
analysis to test whether a market leader moves closer to an advertising brand compared to other
non-advertised brands during and after the campaign.
Consumers may have different search strategies. To compare alternative brands, consumers may
search two brands simultaneously or one brand each time, sequentially. For simultaneous brand
searches, the researcher can easily identify which brands are considered together. However, that
is not the case with the sequential brand search. If researchers can’t access a consumer’s search
history, a single brand search provides very limited information in identifying relationships
between brands. Unfortunately, Google Trends provides only aggregate search volume trends
rather than consumer-level search data.
For the sequential brand search, we exploit mixed searches between brands and their own
product attributes. As an example, let’s suppose that some consumers want to know which screen
size on a smartphone is optimal for both portability and watching movies or videos. These
consumers may search “Apple screen” and “Samsung screen” sequentially. Although these
consumers did not search Apple and Samsung together, they are likely to consider both brands.
Through the common attribute “screen,” researchers can get some information about the
relationship between the two brands.
Considering the above search strategies, we draw two types of market-structure maps to locate
brands (1) in pure brand space and (2) in product space. In the first approach, we measure
distances between brands directly from simultaneous brand searches. For the first brand space
map, we count the co-occurrence of searches for brands (e.g., Samsung Apple; see Table 3.2 for
queries used) weekly for all the brand pairs at the U.S. national level. This becomes a similarity
matrix, where each cell includes the number of co-searches between two brands. A high value
means that two brands are often searched together. To draw a map using multidimensional
scaling, one needs a distance matrix. Thus, we take the reciprocal of the number in each cell of
the similarity matrix. As more consumers consider two brands together for their purchase, the
number of simultaneous searches on the two brands increases, and thus the two brands locate
closer together in the consumers’ brand consideration space.
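The brand-space construction described above (co-search counts, then reciprocal distances, then a two-dimensional map) can be sketched as follows. This is an illustrative NumPy reimplementation of classical multidimensional scaling, not the code used in the study. The co-search counts are hypothetical, and the helper names `classical_mds` and `map_distance` are introduced here only for illustration.

```python
import numpy as np

def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS, also called principal coordinates analysis."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances: B = -1/2 * J D^2 J
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ (d ** 2) @ j
    # Eigen-decompose the symmetric matrix B and keep the largest eigenvalues
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Negative eigenvalues (non-Euclidean input) are clipped to zero
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

brands = ["Apple", "Samsung", "Blackberry", "HTC"]
# Hypothetical symmetric co-search counts for one week (not the thesis data)
co_search = np.array([
    [0, 40, 20, 10],
    [40, 0, 8, 16],
    [20, 8, 0, 5],
    [10, 16, 5, 0],
], dtype=float)

# Distance = reciprocal of co-search volume; the diagonal stays at zero.
# Assumes every off-diagonal count is positive (a zero count would need handling).
dist = np.zeros_like(co_search)
off_diag = ~np.eye(len(brands), dtype=bool)
dist[off_diag] = 1.0 / co_search[off_diag]

coords = classical_mds(dist)

def map_distance(a, b):
    """Euclidean distance between two brands on the 2-D map."""
    ia, ib = brands.index(a), brands.index(b)
    return float(np.linalg.norm(coords[ia] - coords[ib]))

for other in brands[1:]:
    print(f"Apple-{other}: {map_distance('Apple', other):.4f}")
```

Repeating this for each week would give a time series of map distances per brand pair. Reciprocal distances need not satisfy the triangle inequality, so distances read from the two-dimensional map can differ from the input distances.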
Table 3.4 Google Trends queries used for extracting brand-product attribute trends for Apple
Product attribute   Query
App                 App Apple + Apple Apps + iPhone App + iPhone Apps
Screen              Screen Apple + iPhone Screen
4G                  4G Apple + iPhone 4G
Videos              Videos Apple + Apple Video + iPhone Videos + iPhone Video
Voice               Voice Apple + Apple Siri + iPhone Voice + iPhone Siri
Text                Text Apple + iPhone Text
Pictures            Pictures Apple + Apple Picture + iPhone Pictures + iPhone Picture
Data                Data Apple + iPhone Data
Music               Music Apple + iPhone Music
Internet            Internet Apple + iPhone Internet
Battery             Battery Apple + iPhone Battery
Camera              Camera Apple + iPhone Camera
Map                 Map Apple + Apple Maps + iPhone Map + iPhone Maps
3D                  3D Apple + iPhone 3D
Movies              Movies Apple + Apple Movie + iPhone Movies + iPhone Movie
Cloud               Cloud Apple + Apple iCloud + iPhone Cloud + iPhone iCloud
Notes: terms separated by a space will match searches with all the terms. Plus signs between terms/phrases cause the query to match searches with either of the terms/phrases.
In the second approach, we measure brand distances indirectly by exploiting mixed searches
between brands and product attributes. For the second product space map, we count searches
including both a brand and a product attribute (e.g., Samsung screen; see Table 3.4 for queries
used for Apple) weekly for all the pairs between brands and attributes at the U.S. national level.
The result is a contingency table with brands as columns and product attributes as rows. We then
apply correspondence analysis, proposed by Hirschfeld (1935) and later developed by Benzécri (1973).
In the resulting map, each brand locates closest to the attribute most co-searched with it. If two
brands share the same most co-searched attribute, those two brands are
likely to locate closer to each other than to any other brand in a product space map.
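The product-space construction can be sketched in the same way. Below is a minimal correspondence analysis via the singular value decomposition of standardized residuals, written with NumPy for illustration only. The attribute-by-brand counts are hypothetical, and `correspondence_analysis` and `brand_distance` are names introduced here rather than functions from any package.

```python
import numpy as np

def correspondence_analysis(table):
    """Simple CA: principal coordinates for rows and columns, plus inertia shares."""
    n = np.asarray(table, dtype=float)
    p = n / n.sum()                       # correspondence matrix
    r = p.sum(axis=1)                     # row masses (attributes)
    c = p.sum(axis=0)                     # column masses (brands)
    # Standardised residuals from the independence model r c^T
    expected = np.outer(r, c)
    s = (p - expected) / np.sqrt(expected)
    u, sv, vt = np.linalg.svd(s, full_matrices=False)
    row_coords = (u * sv) / np.sqrt(r)[:, None]
    col_coords = (vt.T * sv) / np.sqrt(c)[:, None]
    inertia_share = sv ** 2 / np.sum(sv ** 2)
    return row_coords, col_coords, inertia_share

attributes = ["Screen", "4G", "App", "Videos"]
brands = ["Apple", "Samsung", "Blackberry", "HTC"]
# Hypothetical brand-attribute co-search counts (rows: attributes, columns: brands).
# Assumes every row and column has a positive total.
counts = np.array([
    [43, 13, 6, 7],
    [34, 25, 5, 20],
    [29, 17, 30, 9],
    [44, 6, 3, 4],
])

row_xy, col_xy, share = correspondence_analysis(counts)
brand_xy = col_xy[:, :2]                  # brands on the first two dimensions

def brand_distance(a, b):
    """Euclidean distance between two brands in the 2-D product-space map."""
    ia, ib = brands.index(a), brands.index(b)
    return float(np.linalg.norm(brand_xy[ia] - brand_xy[ib]))

print("Share of inertia on the first two dimensions:", round(float(share[:2].sum()), 3))
print("Apple-Samsung distance:", round(brand_distance("Apple", "Samsung"), 4))
```

Distances between brands are read from the column coordinates. The row coordinates place the attributes in the same map, which is what allows brand-to-attribute comparisons.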
Once a map is drawn, one can easily measure time-varying brand distances. Using the brand
distances derived from the U.S. nationwide weekly co-search volumes, we test whether
comparative advertising repositions both rival brands closer together. However, there is one
challenge, in that comparative advertising may reposition not only its focal brands, but also other
non-advertised brands in a map. Given that display advertising increases searches for even
competing brands that are not mentioned in the advertising (Lewis and Nguyen 2015), TV
advertising also may increase co-searches between advertised and non-advertised brands. If this
is the case, checking a distance change only between an advertising and a target (rival) brand
would not be enough to test our research question. To address such co-movement issues, we do
diff-in-diff analysis in our study. Diff-in-diff produces a natural measurement of the relative
distance change between the advertised rival brands and other non-advertised brands.
Another issue is choosing a proper counterfactual distance as a control group. There are two
candidates: (1) distances between the advertising brand and non-advertised brands and (2) distances
between the target brand (the market leader in this study) and non-advertised brands. While either
would be acceptable in other studies, we think that the second candidate is more appropriate in
our study because of the position of the target brand. As the market leader tends to be searched
together with other brands or product attributes more than any other brand, it is likely to locate
in the center of a market-structure map, so its position can serve as a base point for measuring
distances to other brands. Here is an example. Suppose that the market leader locates between the
advertising brand and a non-advertised brand, and that comparative advertising affects the
advertising brand and the market leader but not the non-advertised brand. During a campaign, when
the advertising brand moves closer to the market leader, the distance between the advertising brand
and the non-advertised brand also decreases, while the distance between the market leader and the
non-advertised brand does not change. This example favors the
second candidate as a proper control group.
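The example can be made concrete by placing the three brands on a line. The positions below are hypothetical, and the sketch reads the advertising brand's shift as a move toward the market leader, which is an interpretive assumption. It compares the difference-in-differences obtained with each candidate control distance.

```python
def did(before, after, treated_pair, control_pair):
    """Difference-in-differences of pairwise distances on a line."""
    def dist(pos, pair):
        a, b = pair
        return abs(pos[a] - pos[b])
    change_treated = dist(after, treated_pair) - dist(before, treated_pair)
    change_control = dist(after, control_pair) - dist(before, control_pair)
    return change_treated - change_control

# Hypothetical positions: the market leader sits between the other two brands
before = {"non_advertised": 0.0, "leader": 1.0, "advertiser": 3.0}
after = dict(before, advertiser=2.0)   # only the advertiser moves, toward the leader

treated = ("leader", "advertiser")
candidate_1 = did(before, after, treated, ("advertiser", "non_advertised"))
candidate_2 = did(before, after, treated, ("leader", "non_advertised"))

print("Control (1), advertiser vs non-advertised:", candidate_1)
print("Control (2), leader vs non-advertised:", candidate_2)
```

With control (1) the estimate is zero, because the control distance shrinks by the same amount as the treated distance. With control (2) the full one-unit reduction is recovered.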
Considering the above argument, we test whether a target brand locates closer to an advertising
brand compared to non-advertised brands during and after the campaign. Our identification
assumption is that no other factors affect the change in relative distance between a market leader
and an advertising brand compared to other brands except the comparative advertising campaign.
In our focal TV comparative advertising campaign, while Apple and Samsung are the target and
the advertising brands, respectively, Blackberry and HTC are non-advertised brands. Our
treatment is the distance between Apple and Samsung. Our control groups are the distances
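between (1) Apple and Blackberry and (2) Apple and HTC. A brand pair i’s distance at week t is
$$D_{it} = \beta_1\,\mathit{Apple\_Samsung} \times \mathit{DuringCampaign}_t + \beta_2\,\mathit{Apple\_Samsung} \times \mathit{AfterCampaign}_t + \mu_i + \tau_t + \varepsilon_{it} \qquad (1)$$
where Apple_Samsung = 1 for the Apple and Samsung brand pair, and \mu_i and \tau_t are brand-pair and week dummies, respectively.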
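Equation (1) can be estimated by ordinary least squares with brand-pair and week dummies. The sketch below does this with NumPy on synthetic data. The panel shape (3 brand pairs by 23 weeks) follows Table 3.5, and the campaign weeks (16 to 20) follow the note under Figure 3.6. The effect sizes, the noise level, and the helper name `fit_did` are assumptions made for illustration. Standard errors, which the study clusters at the week level, are not computed here.

```python
import numpy as np

def fit_did(dist, treated, during, after):
    """OLS for Equation (1); returns (beta1, beta2).

    dist    : array of shape (n_pairs, n_weeks) with brand-pair distances
    treated : row index of the treated pair (Apple-Samsung)
    during, after : iterables of 0-based week indices
    """
    n_pairs, n_weeks = dist.shape
    pair_idx, week_idx = np.meshgrid(
        np.arange(n_pairs), np.arange(n_weeks), indexing="ij"
    )
    pair_idx, week_idx, y = pair_idx.ravel(), week_idx.ravel(), dist.ravel()

    is_treated = (pair_idx == treated).astype(float)
    x_during = is_treated * np.isin(week_idx, list(during))
    x_after = is_treated * np.isin(week_idx, list(after))

    # Fixed effects as dummies; one pair and one week are dropped as baselines
    pair_fe = (pair_idx[:, None] == np.arange(1, n_pairs)).astype(float)
    week_fe = (week_idx[:, None] == np.arange(1, n_weeks)).astype(float)
    x = np.column_stack([x_during, x_after, np.ones_like(y), pair_fe, week_fe])

    coef, *_ = np.linalg.lstsq(x, y, rcond=None)
    return float(coef[0]), float(coef[1])

rng = np.random.default_rng(0)
n_pairs, n_weeks = 3, 23                  # 69 observations, as in Table 3.5
during = range(15, 20)                    # weeks 16 to 20 (0-based indices 15..19)
after = range(20, 23)                     # weeks 21 to 23
pair_effect = np.array([0.10, 0.18, 0.20])        # row 0 = Apple-Samsung (hypothetical)
week_effect = rng.normal(0.0, 0.01, n_weeks)      # hypothetical common weekly shocks

dist = pair_effect[:, None] + week_effect[None, :]
dist[0, list(during)] += -0.025           # hypothetical true beta1
dist[0, list(after)] += -0.020            # hypothetical true beta2
dist += rng.normal(0.0, 0.002, dist.shape)

b1, b2 = fit_did(dist, treated=0, during=during, after=after)
print(f"beta1 = {b1:.4f}, beta2 = {b2:.4f}")
```

Negative estimates of beta1 and beta2 mean that the treated pair's distance falls relative to the control pairs during and after the campaign.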
3.3.2. Main results using direct approach in brand space
Figure 3.3 shows the co-search trend for brands. Co-searches between the market leader Apple
and the advertiser Samsung jump in the first week of the campaign, and then decrease. As
Christmas season approaches, co-searches increase again. Co-searches between Apple and the
other non-advertised brands (i.e., Blackberry and HTC) also show a similar trend, except that the
increase of co-searches in the first week of a TV commercial is much smaller than that of the
advertised rival brands.
Figure 3.3 Co-search trend by a brand pair with Apple
Now, we do diff-in-diff analysis using Equation (1) with co-search volumes as outcomes.
Column (1) in Table 3.5 shows that the coefficient for effectiveness during the campaign is
positive and significant, suggesting that the increase in co-searching between Apple and
Samsung is significantly higher than that between Apple and the other non-advertised brands.
The coefficient for effectiveness after the campaign is also positive although it is not significant,
suggesting that the increased co-search gap lasts somewhat even after the TV commercial is no
longer airing.
The above pattern in the co-searching trend is reflected in the market-structure map in Figure 3.4.
We use classical metric multidimensional scaling, also known as principal coordinates analysis
(Gower, 1966), via the cmdscale function in R’s stats package. With comparative advertising, the
target brand Apple seems to be closer to the advertising brand Samsung than to the
non-advertised brands (i.e., Blackberry and HTC).
[Figure annotations: Comparative Ad. Campaign; Samsung Galaxy S2 in U.S.; Apple iPhone 4S]
Figure 3.4 Market-structure map in brand space (panels: Week -2 and Week +2)
Figure 3.5 Samsung moves closer to Apple during campaign in brand space.
Figure 3.5 shows the trend of brand distances measured from the brand space maps in Figure 3.4.
Overall, Samsung is closer to Apple than the other two brands are. Around Apple’s iPhone 4S
launch, the distances between Apple and all the other three brands decrease and then increase.
During Samsung’s comparative advertising, Samsung moves closer to Apple and then the
distance increases gradually, which is consistent with the decreasing marginal effect of
advertising exposure. In contrast, Blackberry and HTC move away from Apple from the
second week of the campaign.
Table 3.5 Apple becomes closer to Samsung than the other brands during and after the campaign in
brand space.
                                  Co-search       Direct Approach in Brand Space
                                  between Brands  Base                  Before & After Nexus  - Nexus
                                  (1)             (2)                   (3)                   (4)
Samsung_Apple x During (5 weeks)  8.83** (3.77)   -0.0250*** (0.0082)                         -0.0170* (0.00905)
  x Week 1~3                                                            -0.0242** (0.0100)
  x Week 4~5                                                            -0.0262** (0.0118)
Samsung_Apple x After             5.27 (4.05)     -0.0266* (0.0141)     -0.0266* (0.0143)     -0.0118 (0.0074)
R-sq                              0.827           0.866                 0.866                 0.873
No. of brand pair dummies         2
No. of week dummies               22
No. of observations               69
Note. The dependent variable is the distance between two brands measured in brand space, except in
Column (1), where the dependent variable is co-search volume between two brands. Robust standard
errors are clustered at the week level (23 weeks). ***p < 0.01; **p < 0.05; *p < 0.10.
Now, we test formally using Equation (1). Column (2) in Table 3.5 shows that the two
coefficients for the interaction effects are negative and significant. These results show that
the market leader Apple becomes closer to the advertiser Samsung than to the other brands during
and even after the campaign. Comparing with the result in Column (1), we find that the coefficient
for “after campaign” is significant in the brand space map, but not in co-search volume. This is
because brand positions in a map are also affected by co-search volumes of the other brand pairs,
which are not used in Column (1).
Around the fourth week of the campaign, on December 15, 2011, Samsung launched Galaxy
Nexus in the U.S., which is the third smartphone in the Google Nexus series. Because its launch
timing does not overlap with the start of its comparative advertising, we are not concerned much
with this issue. However, it could affect the effect size. To address this issue, we measure the
effects for both before and after the Nexus phone launch. Column (3) in Table 3.5 shows that
both coefficients for “during campaign” are negative and significant. In this exercise, the first
coefficient is our main interest because it suggests that the comparative campaign rather than the
Nexus phone launch indeed repositions both rival brands.
One might still have concerns about the new product effect as consumers tend to wait and search
for a new product even before its launch. To reduce this concern, we extract search volumes
without Nexus as a keyword: for example, “Apple Samsung – Nexus” as the query in Google
Trends. However, this is a conservative approach. If the comparative advertising indeed affects
brand co-searches, it could boost “iPhone Galaxy Nexus” as well as “iPhone Galaxy.” Therefore,
by eliminating the queries with Nexus, the effect is likely to be underestimated. Column (4)
shows that the coefficient for “during” is negative and significant, although its effect size is
smaller than that in Column (2), which does not exclude brand co-searches with Nexus.
However, the coefficient for “after” is no longer significant. Its effect size decreases
much more than that for “during” relative to Column (2). This pattern seems to be
driven by the Nexus phone’s success. In other words, as Nexus was getting popular, consumers
might have searched more using an “iPhone Galaxy Nexus” query. Even under this conservative test,
the results suggest that comparative advertising moves market leader Apple closer to advertiser
Samsung than to the other brands in brand space, at least during the campaign.
3.3.3. Mechanism check using indirect approach in product space
One can measure distances between brands and their own product attributes in addition to
distances among brands in a product space map drawn using the correspondence analysis. This
allows us to investigate why comparative advertising repositions rival brands closer. First, during
the comparative advertising, we check whether and how long co-searches between advertised
brands and product attributes increase by the type of attributes. Second, we test whether an
advertising brand (i.e., Samsung) and a target brand (i.e., Apple) move closer to each other in
product space.
We classify product attributes in Table 3.6 into three types: (1) advertised, (2) ad-related, and (3)
unadvertised ones. First, advertised attributes are explicitly mentioned or seen in a TV
commercial. Recall that this campaign focuses the most on Samsung Galaxy S2’s huge screen,
which is good for watching videos or movies. The second most emphasized attribute is a fast 4G
network. The last one is Apple iPhone’s short-lived battery.
Table 3.6 Growth rate & change of co-search between each attribute and its brand
                              Samsung Galaxy         Apple iPhone
Attribute Type   Attribute    Growth (%)   Change    Growth (%)   Change
Advertised       Movies       100.0        2         10.5         2
                 Battery      60.0         3         0.0          0
                 Screen       38.5         5         9.3          4
                 Videos       33.3         2         18.3         8
                 4G           20.0         5         14.7         5
Ad-related       Pictures     100.0        4         29.6         8
                 Internet     100.0        2         0.0          0
                 Camera       66.7         2         29.4         5
                 Data         66.7         2         0.0          0
                 3G           60.0         3         3.8          4
                 3D           40.0         2         -20.0        -1
                 Wifi         33.3         1         9.8          4
                 Charger      0.0          0         0.0          0
Unadvertised     Text         33.3         1         2.9          1
                 App          11.8         2         3.4          1
                 Map          0.0          0         66.7         4
                 Voice        0.0          0         12.2         7
                 Music        -42.9        -3        0.0          0
                 Cloud        NA           NA        -12.5        -3
Note. There is not enough co-search between cloud and Samsung.
Second, ad-related attributes have similar benefits to the advertised attributes. For example, the
advertised big screen for watching videos or movies offers entertainment benefits. The ability to
watch “3D” movies, a “camera” for taking better “pictures,” and even listening to “music”
provide such entertainment value. Similarly, the advertised 4G improves “internet” speed to
download “data” quickly.
Lastly, there are several unadvertised attributes: app, voice, text, map, and cloud. Among them,
in fact, cloud was advertised in the second campaign. Because we do not include the second
campaign in this product attribute analysis, the cloud is grouped into the unadvertised one.
Figure 3.6(a) shows co-searches with advertised product attributes for the advertiser
Samsung and the target Apple. Across most advertised attributes, co-search volume
increases in the first week of the campaign, as shown in Table 3.6, and then decreases. On the
other hand, co-searches for unadvertised attributes in Figure 3.6(b) show a much more stable
trend than those for advertised or ad-related ones, except Apple’s voice. In particular,
co-searches for Samsung’s unadvertised attributes, except its App, do not change in the first
week of the campaign. Co-searches for Samsung App increase only slightly.
Figure 3.6(a) Co-searches between brands and advertised attributes increase in the first week of the
comparative advertising campaign.
Note: Campaign periods: Week 16 to 20, the launch of Apple 4S: Week 8
Figure 3.6(b) Co-searches between brands and unadvertised attributes do not change much in the first
week of the comparative advertising campaign.
Note: Campaign periods: Week 16 to 20, the launch of Apple 4S: Week 8
Table 3.7 Difference in Difference for co-searches between brands and their attributes
Brand                         Attribute      No. of      Before  After   Mean of the  Paired  P-value  Mean of the
                              Type           Attributes  Ad*     Ad**    Differences  t-test           Growth Rates (%)
Advertiser: Samsung Galaxy    Advertised     5           10.20   13.60   3.00         3.40    0.007    50.36
                              Ad-related     8           3.50    5.50    2.00         4.73    0.002    58.33
                              Unadvertised   5           6.2     6.2     0            0       1.000    0.45
Target Brand: Apple iPhone    Advertised     5           30.67   34.53   3.86         2.75    0.051    10.56
(Market Leader)               Ad-related     8           32.00   34.50   3.50         2.50    0.063    6.57
                              Unadvertised   6           29.67   31.29   1.62         1.18    0.291    12.11
Note. *Before Ad: one week before the campaign, **After Ad: the first week of the campaign
Based on the above observations, we test, by each brand’s attribute type, whether co-searches
increase in the first week of the campaign compared to one week before the campaign. The results
of a paired t-test are shown in Table 3.7. Reflecting the pattern in Figure 3.6, co-searches
between both brands and both advertised and ad-related attributes increase significantly, while
those for unadvertised attributes do not change significantly. These results suggest three things.
First, advertising increases searches for an advertised attribute as well as its brand. Second, there
is spillover into ad-related attributes. Third, not only the attacking brand but also the attacked
brand gains in co-searches with its attributes. However, the attacking brand gains more. While
both brands gain a similar amount of co-searches (see Column “Mean of the Differences”), the
growth rate (see Column “Mean of the Growth Rates”) is much bigger for advertiser Samsung
because, with its lower market share, it has a much smaller search volume than market leader Apple.
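The paired t-test reduces to a one-sample t-test on the week-over-week changes. As a check on the arithmetic, the sketch below uses the changes reported in Table 3.6 for Samsung’s ad-related attributes and computes the mean difference and t statistic. The helper name `paired_t` is introduced here for illustration, and the p-value is omitted because it requires the t distribution, which the Python standard library does not provide.

```python
import math
import statistics

def paired_t(differences):
    """Mean paired difference, t statistic for H0: mean = 0, and degrees of freedom."""
    n = len(differences)
    mean_diff = statistics.mean(differences)
    # Standard error uses the sample standard deviation (n - 1 denominator)
    se = statistics.stdev(differences) / math.sqrt(n)
    return mean_diff, mean_diff / se, n - 1

# Changes in co-search for Samsung's ad-related attributes, taken from Table 3.6
samsung_ad_related = {
    "Pictures": 4, "Internet": 2, "Camera": 2, "Data": 2,
    "3G": 3, "3D": 2, "Wifi": 1, "Charger": 0,
}

mean_diff, t_stat, dof = paired_t(list(samsung_ad_related.values()))
print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.2f}, df = {dof}")
# prints "mean difference = 2.00, t = 4.73, df = 7"
```

These values agree with the Samsung ad-related row of Table 3.7 (mean difference 2.00, t statistic 4.73).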
Now, we turn to product space. To draw a market-structure map in product space, we do
correspondence analysis with the ca function in R’s ca package. In Figure 3.7, we plot brands
and attributes using the first two dimensions, which explain 94.8% of the variance. While Apple
locates close to movies and videos, Blackberry is near App. Samsung and HTC are positioned
close to 4G.
Figure 3.7 Market-structure map in product space (panels: Week -1 and Week +1)
Looking at each attribute’s growth rate in co-searches in Table 3.6 and the map in Figure 3.7, in the
first week of the campaign, we find that all the advertised or ad-related attributes that locate
between advertiser Samsung and market leader Apple increase in co-searches with either
Samsung or Apple. Moreover, the growth rate tends to be bigger for attributes that are between
both brands than those that are not. Co-searches between Samsung and movies, which is between
Samsung and Apple, have the highest growth rate. Lastly, none of the unadvertised attributes are
between the two brands in a map.
Figure 3.8 Samsung moves closer to Apple during campaign in product space.
Table 3.8 Apple becomes closer to Samsung than the other brands but insignificantly during and after
the campaign in product space.
                             Brand-attribute pair:
                             Indirect approach in product space
Samsung_Apple x During       -0.0117 (0.0277)
Samsung_Apple x After        -0.0309 (0.0335)
R-sq
No. of brand pair dummies    2
No. of week dummies          22
No. of observations          69
Note. The dependent variable is the distance between two brands measured in product space. Robust
standard errors are clustered at the week level (23 weeks). ***p < 0.01; **p < 0.05; *p < 0.10.
Figure 3.8 shows the trend of brand distances measured from the product space maps in Figure
3.7. In the first week of the campaign, Samsung moves closer to Apple, while HTC and
Blackberry move more distant from Apple. During the 5-week campaign period, Apple looks
closer to Samsung than to the other two brands. Table 3.8 shows the results of diff-in-diff
analysis using Equation (1) for brand distances in product space; both coefficients for
effectiveness during and after the campaign are negative but insignificant. This suggests that Apple
becomes closer to Samsung than to the other brands in product space, but the shift is not
statistically significant.
In summary, one possible reason that comparative advertising repositions the rival brands
Samsung and Apple closer together is that consumers co-search more between the advertised
brands and their advertised attributes (i.e., videos, movies, and screen). However, this does not
seem to be a major force. Instead, consumers directly search more for the rival brands targeted in
comparative advertising.
3.4. Conclusion
In this study, we show that comparative advertising repositions rival brands closer together using
weekly aggregate search volume from Google Trends. As a mechanism, we find that direct brand
comparison (e.g., Apple vs. Samsung) is a major force. While consumers do indirect brand
comparison through advertised attributes (e.g., Apple screen vs. Samsung screen), such indirect
co-searches between brands and their attributes only rise in the beginning of the campaign. Our
results suggest that a brand with a lower market share may benefit from comparative advertising
against a market leader by forcing itself to be considered alongside a market leader when
consumers do brand searches.
An important limitation in this research is due to the nature of aggregate data. Google Trends
provides only aggregate search volume rather than individual search history. As a result,
researchers do not observe exactly which brands each consumer searched for in Google. To
overcome this problem, we exploited co-searching (1) between brands and (2) between brands
and their own attributes. Given that we test how brands reposition before and after a campaign
rather than focus on generating an exact market-structure map, our results would be reliable
unless many consumers change their search strategy around marketing activity. Instead, if the
research goal is to visualize the exact relationship among brands, consumer-level search history
would generate a more accurate map than aggregate search volume. We leave this topic as one
for future research.
References
• Adams H, Tausz A (2015) JavaPlex tutorial.
http://www.math.colostate.edu/~adams/research/javaplex_tutorial.pdf
• Adams H, Tausz A, Vejdemo-Johansson M (2014) JavaPlex: A research software package for
persistent (Co) homology. Proceedings of ICMS 2014, H. Hong and C. Yap (Eds.), Springer-
Verlag Berlin Heidelberg 129–136.
• Ailawadi, Kusum, Donald R. Lehmann, and Scott A. Neslin (2003), “Revenue Premium as an
Outcome Measure of Brand Equity,” Journal of Marketing, 67 (October), 1–17.
• Ailawadi KL, Keller KL (2004) Understanding Retail Branding: Conceptual Insights and
Research Priorities. Journal of Retailing 80(4):331-342.
• Alba, Joseph W., and Chattopadhyay Amitava (1986) Salience Effects in Brand Recall. Journal
of Marketing Research 23(4): 363-69
• Andreasen, Alan R. (1995) Marketing Social Change: Changing Behavior to Promote Health,
Social Development, and the Environment, Jossey-Bass 1st ed.
• Archak N, Ghose A, Ipeirotis PG (2011) Deriving the pricing power of product features by
mining consumer reviews. Management Sci. 57(8):1485–1509.
• Armstrong MA (1983) Basic topology. Springer, New York, Berlin.
• Ayasdi (2015) TDA and machine learning: Better together.
http://www.ayasdi.com/resources/tda-and-machine-learning-better-together-via-intro-tda/.
• Ayasdi (2016) website, http://www.ayasdi.com/industries/communications/personalized-
marketing/, accessed on February 29, 2016.
• Barbiero A and Ferrari PA (2014) Simulation of correlated Poisson variables, Applied
Stochastic Models in Business and Industry 31(5):669–680
• Benzécri, J.-P. (1973). L'Analyse des Données. Volume II. L'Analyse des Correspondances.
Paris, France: Dunod.
• Bergen M, Peteraf MA (2002) Competitor identification and competitor analysis: A broad-
based managerial approach. Managerial and Decision Economics 23(4-5):157-169.
• Blasco, Andrea., Pin, Paolo., Sobbrio, Francesco., 2016. Paying Positive to Go Negative:
Advertisers' Competition and Media Reports. European Economic Review 83, 243–261
• Blei, David M, Andrew Y Ng, and Michael I Jordan (2003) Latent dirichlet allocation. JMLR,
3:993-1022
• Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in
large networks. Journal of Statistical Mechanics: Theory and Experiment (10):P10008.
• Borkovsky, Ron N., Avi Goldfarb, Avery Haviv, and Sridhar Moorthy (2016) Measuring and
Understanding Brand Value in a Dynamic Model of Brand Management. Marketing Science
Forthcoming
• Büschken, Joachim, and Greg M. Allenby (2016) Sentence-Based Text Analysis for Customer
Reviews. Marketing Science Forthcoming
• Bronnenberg BJ, Kruger MW, Mela CF (2008) Database paper: The IRI marketing data
Set. Marketing Science 27(4):745-748.
• Brown, Tom J. and Peter A. Dacin (1997), “The Company and the Product: Corporate
Associations and Consumer Product Responses,” Journal of Marketing, 61 (January), 68-84.
• Cameron, G. T. (1994). Does publicity outperform advertising? An experimental test of the
third-party endorsement. Journal of Public Relations Research, 6, 185–207.
• Carlsson G (2009) Topology and data. Bulletin of the American Mathematical
Society 46(2):255–308.
• Chang, Chun-Tuan (2008), “To Donate or Not to Donate? Product Characteristics and Framing
Effects of Cause-Related Marketing on Consumer Purchase Behavior,” Psychology and
Marketing (12), 1089-1110.
• Ching, Andrew T, Robert Clark, Ignatius Horstmann, Hyunwoo Lim (2016) The Effects of
Publicity on Demand: The Case of Anti-Cholesterol Drugs. Marketing Science 35(1):158-181
• Chintagunta PK, Jiang R, Jin GZ (2009) Information, learning, and drug diffusion: The case
of Cox-2 inhibitors. Quant. Marketing Econom. 7(4):399–443.
• Ciambriello, Roo (2014) How Ads That Empower Women Are Boosting Sales and Bettering
the Industry: Advertising Week panel spotlights 'fem-vertising'. Advertising Week October 3,
2014 http://www.adweek.com/news/advertising-branding/how-ads-empower-women-are-
boosting-sales-and-bettering-industry-160539
• Clauset A, Newman MEJ, Moore C (2004). Finding community structure in very large
networks. http://www.arxiv.org/abs/cond-mat/0408187.
• Conley, Timothy G. and Christopher R. Taber (2011) Inference with "Difference in
Differences" with a Small Number of Policy Changes. The Review of Economics and Statistics,
February 2011, 93(1): 113–125
• Connolly, Katie 2011 Six ads that changed the way you think. BBC News, Washington
http://www.bbc.com/news/world-us-canada-11963364
• Cooper LG, Inoue A (1996) Building market structures from consumer preferences. Journal
of Marketing Research 33(3):293–306.
• Datamonitor (2005) Dove Campaign for Real Beauty case study: Innovative marketing
strategies in the beauty industry, 2005 June
• De Smet, D., Vanormelingen, S., (2012) The Advertiser is Mentioned Twice. Media Bias in
Belgian Newspapers. HUB Research Papers 2012/05.
• DeSarbo WS, Grewal R (2007) An alternative efficient representation of demand-based
competitive asymmetry. Strategic Management Journal 28(7):755-766.
• DeSarbo WS, Grewal R, Wind J (2006) Who competes with whom? A demand-based
perspective for identifying and representing asymmetric competition. Strategic Management
Journal 27(2):101-129.
• DeSarbo WS, Manrai AK, Manrai LA (1993) Non-spatial tree models for the assessment of
comparative market structure: An integrated review of the marketing and psychometric
literature. Eliashberg J, Lilien G, eds. Handbook in operations research and marketing science,
North Holland, Amsterdam, 193-257.
• DeSarbo WS, Soete GD. 1984. On the Use of Hierarchical Clustering for the Analysis of
Nonsymmetric Proximities. Journal of Consumer Research 11(1) 601-610.
• Dove website (2015) The Dove Campaign for Real Beauty. Http://www.dove.us/Social-
Mission/campaign-for-real-beauty.aspx (Last visited on Dec. 5 2015).
• Dove website (2016) Dove Vision. http://www.dove.com/us/en/stories/about-dove/our-
vision.html (Last visited on Dec. 27 2016).
• Drumwright, Minette E. (1996) Company Advertising with a Social Dimension: The Role of
Noneconomic Criteria. Journal of Marketing 60(4):71-87
• Du, Hu & Damangir (2015) “Leveraging trends in online searches for product features in
market response modeling”, Journal of Marketing
• Edelsbrunner H, Harer J (2010) Computational topology: An introduction. American
Mathematical Society, Providence RI.
• Edelsbrunner H, Letscher D, Zomorodian A. (2002) Topological persistence and simplification.
Discrete and Computational Geometry 28:511-533.
• Ellman, M., Germano, F., 2009. What do the papers sell? A model of Advertising and Media
Bias. Econ. J. 119 (537), 680–704.
• Elrod T, Russell GJ, Shocker AD, Andrews RL, Bacon L, Bayus, Carroll JD, Johnson RM,
Kamakura WRA, Lenk P, Mazanec JA, Rao VR, Shankar V. (2002) Inferring market structure
from customer response to competing and complementary products. Marketing Letters 13(3):
221–32.
• Erdem T (1996) A dynamic analysis of market structure based on panel data. Marketing Science
15(4):359-378.
• Erdem T, Keane MP (1996) Decision-Making Under Uncertainty: Capturing Dynamic Choice
Processes in Turbulent Consumer Good Markets Marketing Science 15(1): 1–20.
• Etcoff, Nancy, Susie Orbach, Jennifer Scott, Heidi D’Agostino (2004) The real truth about
beauty: a global report, findings of the global study on women, beauty and well-being,
http://www.dove.us/docs/pdf/19_08_10_The_Truth_About_Beauty-White_Paper_2.pdf
• Focke, Florens, Alexandra Niessen-Ruenzi, and Stefan Ruenzi, 2016 A Friendly Turn:
Advertising Bias in the News Media. Tech Report, Universität Mannheim.
• Folse, Judith A.G., Ronald W. Niedrich, and Stacy L. Grau (2010), “Cause-Related Marketing:
The Effect of Purchase Quantity and Firm Donation Amount on Consumer Inferences and
Participation Intentions,” Journal of Retailing, 86 (4), 295-309.
• Fossen, Beth L. and David A. Schweidel (2016) Television Advertising and Online Word-of-
mouth: An Empirical Investigation of Social TV Activity. Conditionally accepted at Marketing
Science
• Freeman L (1977) A set of measures of centrality based on betweenness. Sociometry 40: 35–
41.
• France S, Ghose S (2016) An analysis and visualization methodology for identifying and
testing market structure. Marketing Science 35(1): 182 – 197.
• Gabszewicz, Jean J., Didier Laussel, and Nathalie Sonnac (2002), “Press Advertising and the
Political Differentiation of Newspapers,” Journal of Public Economic Theory, 4 (July), 317–
34.
• Gal-Or, Esther, Tansev Geylani, Tuba Pinar Yildirim (2012) The Impact of Advertising on
Media Bias. Journal of Marketing Research: February 2012, Vol. 49, No. 1, pp. 92-99.
• Gambaro, M., Puglisi, R., 2015. What do ads buy? Daily coverage of listed companies on the
Italian press, European Journal of Political Economy, 39, 41-57
• Garbett, Thomas F. 1981. Corporate Advertising. New York: McGraw-Hill.
• Erickson, Gary and Robert Jacobson (1992), “Gaining Comparative Advantage Through
Discretionary Expenditures: The Returns to R&D and Advertising,” Management Science, 38
(9), 1264–79.
• Gentzkow, Matthew and Jesse M. Shapiro (2010) What Drives Media Slant? Evidence from
U.S. Daily Newspapers. Econometrica 78 (1) 35-71
• Ghose A, Ipeirotis PG, Li B (2012) Designing ranking systems for hotels on travel search
engines by mining user-generated and crowdsourced content. Marketing Sci. 31(3):493–520.
• Girvan M, Newman ME (2002) Community structure in social and biological networks.
Proceedings of the National Academy of Sciences 99(12):7821-7826.
• Gopinath S, Thomas JS, Krishnamurthi L (2014) Investigating the relationship between the
content of online word of mouth, advertising, and brand performance. Marketing Sci.
33(2):241–258.
• Gower, J. C. (1966) Some distance properties of latent root and vector methods used in
multivariate analysis. Biometrika 53, 325–328.
• Gromov M (1987) Hyperbolic groups. Essays in group theory, Mathematical Sciences
Research Institute Publications 8, Springer-Verlag, 75–263.
• Gurun, Umit G. and Alexander W. Butler. 2012. "Don't Believe the Hype: Local Media Slant,
Local Advertising, and Firm Value." Journal of Finance 67 (2):561-597.
• Harald J. Van Heerde, Els Gijsbrechts, and Koen Pauwels (2015) Fanning the Flames? How
Media Coverage of a Price War Affects Retailers, Consumers, and Investors. Journal of
Marketing Research: October 2015, Vol. 52, No. 5, pp. 674-693.
• Hatcher A (2002) Algebraic topology. Cambridge University Press, Cambridge
• Hausmann JC (1995) On the Vietoris–Rips complexes and a cohomology theory for metric
spaces. Prospects in Topology: Proceedings of a conference in honour of William Browder,
Annals of Mathematics Studies 138, Princeton Univ. Press, 175–188.
• Henderson GR, Iacobucci D, Calder BJ (1998), Brand Diagnostics: Mapping Branding Effect
Using Consumer Associative Networks. European Journal of Operational Research, 111
(December), 306–327.
• Hirschfeld, H.O. (1935) "A connection between correlation and contingency", Proc.
Cambridge Philosophical Society, 31, 520–524
• Hoffman, D. Novak, T (2015) Emergent Experience and the Connected Consumer in the Smart
Home Assemblage and the Internet of Things. Working paper, George Washington University.
• Honig, Zach (2012) “Apple files German lawsuit against Samsung, targets Galaxy S II, nine
other smartphones”, Engadget, January 17th 2012,
http://www.engadget.com/2012/01/17/apple-files-another-german-lawsuit-against-samsung-
targets-gala/
• Hovland, C. I., & Weiss, W. (1951). The influence of source credibility on communication
effectiveness. Public Opinion Quarterly, 15, 635–650.
• Hu, Du & Damangir (2014) “Decomposing the Impact of Advertising: Augmenting Sales with
Online Search Data”, Journal of Marketing Research
• Hull, Clyde E. and Sandra Rothenberg (2008), “Firm Performance: The Interactions of
Corporate Social Performance with Innovation and Industry Differentiation,” Strategic
Management Journal, 29 (7), 781-89.
• Jain, Subhash C. and Edwin C. Hackleman (1978) How Effective is Comparison Advertising
for Stimulating Brand Recall? Journal of Advertising 7(3): 20-25
• John DR, Loken B, Kim K, Monga AB (2006) Brand concept maps: A methodology for
identifying brand association networks. Journal of Marketing Research 43(4):549–563.
• Johnson SC (1967) Hierarchical clustering schemes. Psychometrika 32(3):241-254.
• Joo, Mingyu, Kenneth C. Wilbur, Bo Cowgill, Yi Zhu, (2015) Television Advertising and
Online Search, Management Science,
• Kalra A, Li S, Zhang W (2011) Understanding responses to contradictory information about
products. Marketing Sci. 30(6): 1098–1114
• Kamakura WA, Russell GJ (1989) A Probabilistic Choice Model for Market Segmentation and
Elasticity Structure. Journal of Marketing Research, 26 (November), 87–96.
• Kerkhof, Anna and Johannes Münster, 2015. Quantity restrictions on advertising, commercial
media bias, and welfare. Journal of Public Economics, 131, 124-141
• Kim JB, Albuquerque P, Bronnenberg BJ (2011) Mapping online consumer search. Journal of
Marketing Research 48(1):13-27.
• Kolstad, Jonathan (2007) Unilever PLC: Campaign for Real Beauty campaign. Encyclopedia
of Major Marketing Campaigns, volume 2. Thomson Gale 1679-1683
• Koschate-Fischer, Nicole, Isabel V. Stefan, and Wayne D. Hoyer (2012), “Willingness to Pay
for Cause-Related Marketing: The Impact of Donation Amount and Moderating Effects,”
Journal of Marketing Research, 49 (December), 910-27.
• Kotler, Philip and Gerald Zaltman 1971 Social Marketing: An Approach to Planned Social
Change. Journal of Marketing, 35 (3)
• Kotler, Philip A., Ned Roberto, and Nancy R. Lee (2002) Social Marketing: Improving the
Quality of Life, Sage Publications 3rd Ed.
• Kotler, Philip and Nancy R. Lee (2007) Social Marketing: Influencing Behaviors for Good,
Sage Publications 3rd Ed.
• Kruger MW, Pagni D (2011) IRI academic data set description. Information Resources, Inc.
page 16
• Lattin JM, Carroll JD, Green PE (2003) Analyzing multivariate data. Duxbury Resource Center,
Pacific Grove.
• Lee TY, Bradlow ET (2011) Automated marketing research using online customer reviews. J.
Marketing Res. 48(5):881–894.
• Lesnick M (2013) Studying the shape of data using topology. The Institute Letter. Institute for
Advanced Study, Summer Issue, page 10-11.
• Levine, Dan (2011). "U.S. judge says Samsung tablets infringe Apple patents". Reuters.com,
October 13, 2011, http://www.reuters.com/article/2011/10/13/us-apple-samsung-lawsuit-
idUSTRE79C79C20111013?feedType=RSS&feedName=businessNews&utm_source=dlvr.it
&utm_medium=twitter&dlvrit=56943
• Leone, Robert P. (1995), “Generalizing What Is Known About Temporal Aggregation and
Advertising Carryover,” Marketing Science, 14 (3), 141–50.
• Lewis, Randall, Dan Nguyen 2015 Display advertising’s competitive spillovers to consumer
search, Quantitative Marketing and Economics 13 (2), 93-115
• Li, H., & Kannan, P. K. (2014). Attributing conversions in a multichannel online marketing
environment: an empirical model and a field experiment. Journal of Marketing Research, 51(1),
40–56.
• Liaukonyte, Teixeira & Wilbur (2015) Television Advertising and Online Shopping,
Marketing Science,
• Liu Y (2006) Word-of-mouth for movies: Its dynamics and impact on box office revenue. J.
Marketing 70(3):74–89.
• Lord, K. R., & Putrevu, S. (1993). Advertising and publicity: an information processing
perspective. Journal of Economic Psychology, 14, 57–84.
• Ludwig S, de Ruyter K, Friedman M, Brüggen EC, Wetzels M, Pfann G (2013) More than
words: The influence of affective content and linguistic style matches in online reviews on
conversion rates. J. Marketing 77(1):87–103.
• Lum PY, Singh G, Lehman A, Ishkanov T, Vejdemo-Johansson M, Alagappan M, Carlsson J,
Carlsson G (2013) Extracting insights from the shape of complex data using topology.
Scientific Reports 3, 1236.
• Luo, Xueming and C.B. Bhattacharya (2006) Corporate Social Responsibility, Customer
Satisfaction, and Market Value. Journal of Marketing 70(4):1-18
• Luo, Xueming and C.B. Bhattacharya (2009), “The Debate over Doing Good: Corporate
Social Performance, Strategic Marketing Levers, and Firm-Idiosyncratic Risk,” Journal of
Marketing, 73 (November), 198-213.
• Mantrala, Murali K., Prasad A. Naik, Shrihari Sridhar, and Esther Thorson (2007), “Uphill or
Downhill? Locating the Firm on a Profit Function,” Journal of Marketing, 71 (April), 26–44.
• McQuail (2010) McQuail’s Mass Communication Theory
• Meredith, Macleod (2005) Advertisers bank on 'real women' to sell. The Spectator 24 Aug
2005 http://search.proquest.com/docview/270232751?accountid=14771
• Michelle Andrews, Xueming Luo, Zheng Fang and Jaakko Aspara. (2014) Cause Marketing
Effectiveness and the Moderating Role of Price Discounts. Journal of Marketing 78:6, 120-
142.
• Murry, John P., Jr., Antonie Stam, and John L. Lastovicka (1996) Paid- versus Donated-Media
Strategies For Public Service Announcement Campaigns. Public Opinion Quarterly 60 (1): 1-
29.
• Netzer, Oded, Ronen Feldman, Jacob Goldenberg, Moshe Fresko, (2012) Mine Your Own
Business: Market-Structure Surveillance through Text Mining. Marketing Science 31(3):521-
543.
• Newman ME, Girvan M (2004) Finding and evaluating community structure in networks.
Physical Review E 69(2):026113.
• Omid and Pete 2015 How advertising has become an agent of social change Feb 10, 2015
https://medium.com/@moonstorming/how-advertising-has-become-an-agent-of-social-
change-148aa0ef303a#.xz6y429mm
• Onishi H, Manchanda P (2012) Marketing activity, blogging and sales. Internat. J. Res.
Marketing 29(3):221–234.
• Pauwels H, Stacey E, Lackman A (2013) Beyond likes and tweets: Marketing, online platforms
content, and store performance. MSI Report.
• Pew Research Center, 2014. The State of the News Media 2014. An Annual Report on
American Journalism (Washington DC).
• Pons P, Latapy M (2005) Computing communities in large networks using random walks.
http://arxiv.org/abs/physics/0512106
• Porter, M.F. (1997) An Algorithm for Suffix Stripping. Readings in Information Retrieval,
Karen Sparck Jones and Peter Willett, eds. San Francisco: Morgan Kaufmann Publishers, 313–
16.
• Pracejus, John W. and Norman R. Brown (2003), “On the Prevalence and Impact of Vague
Quantifiers in the Advertising of Cause-Related Marketing (CRM),” Journal of Advertising,
32 (4), 19-28.
• Punj G, Stewart DW (1983) Cluster analysis in marketing research: Review and suggestions
for application. Journal of Marketing Research 20(2):134-148.
• Raghubir, Priya, John Roberts, Katherine N Lemon and Russell S Winer. (2010) Why, When,
and How Should the Effect of Marketing Be Measured? A Stakeholder Perspective for
Corporate Social Responsibility Metrics. Journal of Public Policy & Marketing 29:1, 66-77.
• Raghavan UN, Albert R, Kumara S (2007) Near linear time algorithm to detect community
structures in large-scale networks. Physical Review E 76, 036106.
• Reuter, J., Zitzewitz, E., 2006. Do ads influence editors? Advertising and bias in the financial
media. The Quarterly Journal of Economics 121 (1): 197-227.
• Rinallo, D., Basuroy, S., 2009. Does advertising spending influence media coverage of the
advertiser? Journal of Marketing 73, 33–46.
• Ringel DM, Skiera B (2016) Visualizing asymmetric competition among more than 1,000
products using big search data. Forthcoming at Marketing Science.
• Rips E (1982) Subgroups of small cancellation groups. Bulletin of the London Mathematical
Society 14 (1):45–47.
• Robinson, Stefanie R., Caglar Irmak, and Satish Jayachandran (2012), “Choice of Cause in
Cause-Related Marketing,” Journal of Marketing, 76 (July), 126-39
• Rotta R, Noack A (2011) Multilevel local search algorithms for modularity clustering. Journal
of Experimental Algorithmics 16:2-3.
• Saurabh Mishra and Sachin B. Modi. (2016) Corporate Social Responsibility and Shareholder
Wealth: The Role of Marketing Capability. Journal of Marketing 80:1, 26-46.
• Servaes, Henri and Ane Tamayo (2013), “The Impact of Corporate Social Responsibility on
Firm Value: The Role of Customer Awareness,” Management Science, 59, 1045-61.
• Sonnier GP, McAlister L, Rutz OJ (2011) A dynamic model of the effect of online
communications on firm sales. Marketing Sci. 30(4):702–716
• Spiteri, J., 2015. When Is No News Good News? A Model of Information Disclosure and
Commercial Media Bias. Working paper.
• Srinivasan S, Rutz OJ, Pauwels K (2015) Paths to and of purchase: quantifying the impact of
traditional marketing and online consumer activity. Journal of the Academy of Marketing
Science, forthcoming.
• Srivastava RK, Leone RP, Shocker AD (1981) Market Structure Analysis: Hierarchical
Clustering of Products Based on Substitution-in-use. Journal of Marketing 45(3):38-48.
• Srivastava RK, Alpert MI, Shocker AD (1984) A Customer-Oriented Approach for
Determining Market Structures. Journal of Marketing 48 (1):32–45.
• Sriram, S., Subramanian Balachander, and Manohar U. Kalwani (2007) Monitoring the
Dynamics of Brand Equity Using Store-Level Data. Journal of Marketing 71(2), 61–78
• Strahilevitz, Michal and John G. Myers (1998), “Donations to Charity as Purchase Incentives:
How Well They Work May Depend on What You Are Trying to Sell,” Journal of Consumer
Research, 24 (4), 434-46.
• Strömberg, David (2004), “Mass Media Competition, Political Competition, and Public Policy,”
Review of Economic Studies, 71 (January), 265–84.
• Taddy, Matt (2012) On estimation and selection for topic models. In Proceedings of the
Fifteenth international Conference on Artificial Intelligence and Statistics (AISTATS-12),
1184-119
• Tang C, Guo L (2013) Digging for gold with a simple tool: Validating text mining in studying
electronic word-of-mouth (eWOM) communication. Marketing Lett. 26(1):67–80.
• Tirunillai S, Tellis GJ (2012) Does chatter really matter? Dynamics of user-generated content
and stock performance. Marketing Sci. 31(2):198–215.
• Tirunillai, Seshadri and Gerard J. Tellis (2014) Mining Marketing Meaning from Online
Chatter: Strategic Brand Analysis of Big Data Using Latent Dirichlet Allocation. Journal of
Marketing Research 51 (4) 463-479.
• Urban GL, Johnson PL, Hauser JR. (1984) Testing competitive market structures. Marketing
Science 3(2):83-112.
• Vietoris L (1927) Über den höheren Zusammenhang kompakter Räume und eine Klasse von
zusammenhangstreuen Abbildungen. Mathematische Annalen 97(1):454–472.
• Vingilis, Evelyn, and Barbara Coultes. 1990. "Mass Communications and Drinking-Driving:
Theories, Practice and Results." Alcohol, Drugs and Driving 6(2):61-81.
• Walker, Rob (2005) Social Lubricant–How a marketing campaign became the catalyst for a
societal debate. New York Times Magazine, September 4 2005,
http://www.nytimes.com/2005/09/04/magazine/social-lubricant.html?_r=0
• Ward JH (1963) Hierarchical grouping to optimize an objective function. Journal of the
American Statistical Association 58:236–244.
• Wilbur, K.C., 2008. A two-sided, empirical model of television advertising and viewing
markets. Mark. Sci. 27 (3), 356–378.
• Wojcicki, Susan (2016) Susan Wojcicki on the Effectiveness of Empowering Ads on YouTube.
Think with Google 2016 April https://think.storage.googleapis.com/docs/youtube-
empowering-ads-engage-a.pdf
• Wolf, Naomi (2002) The Beauty Myth: How Images of Beauty Are Used against Women. New
York: William Morrow, (originally published in 1991)
• Xiao, Liu, Singh Param Vir, Srinivasan Kannan (2016) A Structured Analysis of Unstructured
Big Data by Leveraging Cloud Computing. Marketing Science, forthcoming.
• Zhai Z, Liu B, Xu H, Jia P (2011) Clustering Product Features for Opinion Mining. In
Proceedings of the Fourth ACM International Conference on Web Search and Data Mining.
New York, NY. ACM, 347-354.
• Zhu, Yi and Anthony Dukes (2015) Selective Reporting of Factual Content by Commercial
Media. Journal of Marketing Research: February 2015, Vol. 52, No. 1, pp. 56-76.
• Zomorodian A, Carlsson G (2005) Computing persistent homology. Discrete and
Computational Geometry 33:249-274.
Appendices
Table A1 The number of sentences labeled as real beauty topics increases, relative to the number
labeled as other beauty topics, in the treated countries during the month(s) of the real beauty campaign.
Real Beauty x During Campaign     18.93*** (4.41)
Country-Topic Dummies             25
Year-Month Dummies                23
R-sq                              0.830
Observations                      624
Dependent variable is the number of sentences in each country-topic, which is the unit level of analysis.
The number of sentences is calculated by su
The U.S., Canada, and the U.K. have 10(2), 8(2), and 10(1) topics (real beauty ones), respectively.
Two real beauty topics in each country are aggregated into one topic.
Robust standard errors are clustered at the country-topic level. ***p < 0.01
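The notes to Table A1 suggest a fixed-effects regression of sentence counts on a Real Beauty x During Campaign interaction, with standard errors clustered by country-topic. The following is a minimal sketch of that kind of specification, not the estimation code used in this thesis. It assumes ordinary least squares with dummy variables, and it runs on synthetic data: the campaign month, the effect size, and the noise levels are hypothetical. Only the panel dimensions (26 country-topics over 24 months, giving 624 observations) follow the table.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic panel sized like Table A1: 26 country-topic units x 24 year-months.
n_units, n_months = 26, 24
unit = np.repeat(np.arange(n_units), n_months)
month = np.tile(np.arange(n_months), n_units)

# Hypothetical design: the first 3 units are real beauty topics,
# and month 12 stands in for the campaign month.
real_beauty = (unit < 3).astype(float)
during = (month == 12).astype(float)
treat = real_beauty * during  # Real Beauty x During Campaign

true_effect = 19.0  # illustrative value only, not an estimate from the thesis
y = (5.0 * rng.normal(size=n_units)[unit]      # country-topic effects
     + 3.0 * rng.normal(size=n_months)[month]  # year-month effects
     + true_effect * treat
     + rng.normal(size=unit.size))             # idiosyncratic noise


def dummies(codes, drop_first):
    """One indicator column per distinct code, optionally dropping the first."""
    d = (codes[:, None] == np.unique(codes)[None, :]).astype(float)
    return d[:, 1:] if drop_first else d


# All 26 unit dummies act as the intercept; one month dummy is dropped.
X = np.column_stack([treat, dummies(unit, False), dummies(month, True)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Cluster-robust sandwich variance, clustered by country-topic,
# without any small-sample correction.
bread = np.linalg.pinv(X.T @ X)
meat = np.zeros((X.shape[1], X.shape[1]))
for g in range(n_units):
    score = X[unit == g].T @ resid[unit == g]
    meat += np.outer(score, score)
se = np.sqrt((bread @ meat @ bread)[0, 0])

print("interaction estimate:", beta[0], "clustered s.e.:", se)
```

With only three treated clusters, as in this sketch, clustered standard errors can be unreliable, so the printed standard error is illustrative rather than a guide to inference.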
Table A2 The number of sentences labeled as real beauty topics increases, relative to the number
labeled as other beauty topics, in the treated countries compared with the control countries during
and one month after the Real Beauty campaign.
                                    (1)             (2)
                                    Only During     During & After
Real Beauty x Treated Countries
  x During Campaign                 32.90***        32.38***
                                    (5.106)         (5.047)
  x One Month After Campaign                        3.852*
                                                    (2.262)
  x Two Months After Campaign                       -6.146
                                                    (6.177)
Real Beauty
  x During Campaign                 0.066           0.765
                                    (0.647)         (0.778)
  x One Month After Campaign                        -2.670***
                                                    (0.957)
  x Two Months After Campaign                       6.864***
                                                    (1.078)
Country-specific topic dummies      34              34
Year-month dummies                  23              23
R-sq                                0.738           0.739
Observations                        840             840
Dependent variable is the number of sentences in each country-topic, which is the unit level of analysis.
Treated Countries, the U.S., Canada, and the U.K., have 10(2), 8(2), and 10(1) topics (real beauty ones),
respectively. Two real beauty topics in each country are aggregated into one topic.
Control countries, New Zealand and Australia, have 7(1) and 2(0) topics (real beauty ones), respectively.
Robust standard errors are clustered at the country-topic level. ***p < 0.01, *p < 0.10
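For reference, the rows of column (1) in Table A2 are consistent with a triple-difference specification of the following form. This is a reconstruction from the table rows and notes, not an equation stated in the thesis. The symbols are assumptions introduced here: i indexes country-topics, m indexes year-months, RB marks real beauty topics, Treated marks topics in the U.S., Canada, and the U.K., and During marks the campaign month(s).

```latex
\[
\begin{aligned}
y_{i,m} ={}& \beta_1 \,\bigl(\mathrm{RB}_i \times \mathrm{Treated}_i \times \mathrm{During}_m\bigr)
           + \beta_2 \,\bigl(\mathrm{RB}_i \times \mathrm{During}_m\bigr) \\
          & + \alpha_i + \tau_m + \varepsilon_{i,m}
\end{aligned}
\]
```

Under this reading, the 34 country-specific topic dummies correspond to the alpha terms and the 23 year-month dummies to the tau terms, and column (1) reports 32.90 for beta_1 and 0.066 for beta_2. Column (2) would add the analogous interactions for one and two months after the campaign.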