
ESSAYS IN ADVERTISING MESSAGES, MASS MEDIA, AND

PRODUCT POSITIONING

by

Jun Bum Kwon

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Graduate Department of Management University of Toronto

© Copyright by Jun Bum Kwon 2017


Essays in Advertising Messages, Mass Media,

and Product Positioning

Jun Bum Kwon

Doctor of Philosophy

Graduate Department of Management

University of Toronto

2017

Abstract

In my dissertation, I apply emerging big data methodologies to measure the effects of

advertising content and detect potential product segments. First, I explore whether advertising

messages can change the topics reported in mass media. Specifically, I examine whether Dove’s

real beauty campaign increased the incidence of real-beauty-related topics in newspapers. Using

a topic model, I segment beauty-related topics and identify topics related to real beauty. The

number of sentences labeled as real beauty topics increases during the campaign. While the Dove

campaign’s significant impact on real beauty topics around the time of the campaign holds even

in newspapers without Unilever ads, the impact is larger in newspapers containing Unilever ads.

Overall, this evidence is consistent with both the mass media’s public service role and

advertiser pressure influencing mass media content.

In my next study, joint work with Avi Goldfarb and Trevor Snider, I introduce a method

for identifying potentially related products using topological data analysis (TDA). From both

simulated and real consumer purchase data, I show that “loopy segments” in TDA can connect

regionally separated local products through national products, while standard clustering methods

such as hierarchical clustering cannot.


Lastly, I test whether comparative advertising repositions rival brands closer together.

Using Google Trends’ aggregate consumer search data, I analyze Samsung’s U.S. television

comparative advertising campaign against Apple’s iPhone. I count co-occurrences of searches for

brand pairs (e.g., Samsung Apple) and brand-product attributes (e.g., Samsung screen, Apple

screen) to map brand space and product space, respectively. I find that the advertised rival brands

move closer together in both spaces, although the change is not statistically significant in product

space. My results suggest that a lower-share brand may benefit from comparative advertising

against a market leader because consumers consider it alongside the market leader more often

when they search for brands.


Acknowledgments

“In their hearts humans plan their course, but the Lord establishes their steps” (Proverbs 16:9)

Thanks, God, for giving me peace during my unpredictable long journey. Each step was full of

opportunities and challenges. You have raised me up in the middle of my struggles. You have

taught me, even in very dark periods, how I can look on the bright side, say “thank you” to

everyone around me, and stand up to face the challenges with bravery.

I can’t say thank you enough to my parents. My father has the special talent of encouraging

people. He is my role model. He has always told me to share what I have learned and help others,

especially the poor. My mom has always worked more than I could imagine. She did almost

everything for my father, my sisters, me, and even her grandsons. I know well that she is always

praying for our whole family, including me. I have always felt her sincere support, even though

we have lived on different continents for the past 9 years. I would also like to express special

thanks to my two sisters, who have supported my parents while I have been studying.

My current PhD advisor, Avi, has been an excellent mentor in every step from the research idea,

empirical strategy, and data analysis to the written, visual, and verbal communications. He has

the talent of being able to look at both the big picture and the details. One of his unique teachings

is deciding when to stop. It is always painful to stop and start again from scratch. However,

saying “good-bye” has opened other opportunities, has broadened my horizons, and has made me

more objective in my own work. In the future, I will be conducting many different research

and life projects. In any project, the lessons learned from Avi will apply.

My next special thanks go to my dissertation committee, Andrew and Ron. Their comments have

been useful in improving my dissertation, my job interviews, and my job talk. Sometimes, they

were more serious about my research than I was. I also learned a lot from my dissertation exam

committee. Scott, from Dartmouth College, gave me comprehensive comments for all 3 of my

essays. By addressing his questions, I had the opportunity to think more about the underlying

fundamental issues in my thesis. Nitin and Matt have also provided me with critical feedback for

my papers.


Several other professors have also strongly influenced my overall research agenda and approach

through their classes and advice. My former advisor, Purush, in Milwaukee, showed me the

excitement of empirical work in the marketing field. Pradeep, in Chicago, taught overall

quantitative marketing modeling. Greg, in Ohio, taught several ground-breaking marketing

models with emphasis on Bayesian methods. Sridhar, in Toronto, has always explained complex

theoretical problems easily and intuitively. Victor, in Toronto, taught a variety of empirical

models with micro-economic foundations. Although I did not use their toolboxes much in my

dissertation, their perspectives and methods on marketing problems will impact my current and

future research.

My PhD life has not been determined solely by my own research. Community, indeed, matters.

We had deep, intellectual interactions among faculty and PhD students in research seminars.

PhD students in Toronto always helped each other by travelling to conferences together and

sharing all kinds of information. In Milwaukee, my PhD colleagues and I participated in special

collaborations: commuting to Chicago to attend Pradeep’s class and taking Greg’s early morning

Skype class. Most of these things could not have been realized without my PhD colleagues.

Last, but not least, I am thankful to my own family. Since my departure from South Korea 9

years ago, my family has doubled. I strongly believe that my family is the best gift from God.

With my family, I have learned how to be content whatever the circumstances (Philippians

4:11). Toronto, one of the most diverse and dynamic cities, has broadened my perspectives

intellectually, socially, mentally, and spiritually. I interact with diverse people in my school, my

U of Toronto family apartment, and my church communities daily. Furthermore, Toronto has

provided my family with a variety of opportunities. My sons, Junyoung and Jaeyoung, have

many friends from many different nations. Motivated by current trends in big data and machine

learning in Toronto, my wife, Eunkyung, started learning about big data tools by building her

expertise in computer science. I believe that God will keep helping and guiding me and my

family in our next city, Sydney, Australia.


Table of Contents

Acknowledgments

Table of Contents

List of Tables

List of Figures

1.1 Introduction

1.2 Literature

1.2.1 Advertising effectiveness

1.2.2 Text Analysis

1.2.3 Social issue marketing effectiveness

1.2.4 Advertiser Pressure

1.3 Data

1.3.1 Dove Real Beauty campaigns

1.3.2 Newspapers

1.3.3 Text data pre-processing

1.3.4 Data evidence from keyword-level analysis

1.4 Estimation and Result

1.4.1 Empirical Strategy

1.4.2 Topic extraction

1.4.3 Testing

1.4.4 Mechanism

1.4.4.1 How does the advertising message affect the content of a newspaper?

1.4.4.2 Social issue advertising and the mass media’s public goal

1.4.4.3 Advertiser pressure

1.5 Conclusion

Chapter 2

Detecting potential product segments using topological data analysis

2.1 Introduction

2.2 TDA methodology

2.2.1 Vietoris-Rips Complex

2.2.2 Clustering distinctly grouped data (Cases 1 and 2)

2.2.3 Homology groups, Betti numbers, and loopy segments

2.2.4 A loopy segment in a two dimensional plane (contrasting Cases 3 and 4)

2.2.5 Interval length of a loopy segment: Persistent homology (Cases 3 and 5)

2.2.6 Connecting loopy segments (Cases 6 and 7)

2.2.7 Voids in three dimensional space (Cases 8 and 9)

2.3 Simulation study

2.3.1 Simulation study procedure

2.3.2 Simulation study results

2.4 Marketing application

2.4.1 Data and computation time

2.4.2 Potential competitors within a category

2.4.3 Potentially related products across categories

2.4.4 Relationship between a segment’s birth and its product diversity

2.5 Conclusions

3.1 Introduction

3.3.2 Main results using direct approach in brand space

Appendices


List of Tables

Table 1.1 Dove’s Real Beauty campaign roll-out across countries

Table 1.2 ProQuest queries used for extracting beauty articles by a country

Table 1.3 Summary Statistics

Table 1.3.1 Sentences on social issue (i.e. real beauty) in newspapers

Table 1.4 The number of newspaper sentences mentioning ‘Real Beauty’ increases insignificantly relative to that without mentioning ‘Real Beauty’ in the treated countries compared to control countries during the campaigns.

Table 1.5 Monthly top 50 words trend.

Table 1.5d Difference in mean of word frequency between during and non-during the campaign

Table 1.6 The optimum number of topics based on log Bayes factor over the null one-topic model

Table 1.7a Beauty topics

Table 1.8a Content words in the Dove advertising campaign for Real Beauty

Table 1.8b ‘Social change’-related words

Table 1.8c ‘Beauty service or product’-related words

Table 1.8d ‘Beauty contest’-related words

Table 1.8e Movie related words

Table 1.8f There are 1 or 2 real-beauty-related topics in each country except New Zealand.

Table 1.9a The number of sentences labeled as real beauty topics increases relative to that as other beauty topics in the treated countries during the month(s) of the real beauty campaign.

Table 1.9b Falsification test: The number of sentences labeled as real beauty topics does not increase relative to that as other beauty topics in the control countries during the month(s) of the real beauty campaign.

Table 1.9c The number of sentences labeled as real beauty topics increases relative to that as other beauty topics in the treated countries relative to control countries during and one month after the Real Beauty campaign.

Table 1.10 The significant impact of the campaign on real beauty-related topics are not driven by reporting the Dove campaign.

Table 1.11a Rising social or cultural change words within real beauty topics during the campaigns

Table 1.11b Rising opposite words to physical beauty within real beauty topics during the campaigns

Table 1.12 The significant impact of the campaign on real beauty-related topics is even in U.S.

Table 2.1 TDA cases

Table 2.2: Community detection methods in Scenario 2 in the simulation study

Table 2.3 TDA results by the top N products in each market

Table 2.4a TDA for salty snacks

Table 2.5: Community detection methods for salty snacks using IRI data

Table 2.4b TDA for beers

Table 2.4c TDA for the combined data

Table 2.6 The relationship between a segment’s “birth” filtration value and its product diversity

Table 3.1 Google trend queries used for extracting brand search trend

Table 3.2 Google trend queries used for extracting brands co-search trend with Apple

Table 3.3 U.S. smartphone market share from March to May 2011

Table 3.4 Google trend queries used for extracting brand-product attribute trend for Apple

Table 3.5 Apple becomes closer to Samsung than the other brands during and after the campaign in brand space.

Table 3.6 Growth rate & change of co-search between each attribute and its brand

Table 3.7 Difference in Difference for co-searches between brands and their attributes

Table 3.8 Apple becomes closer to Samsung than the other brands but insignificantly during and after the campaign in product space.

Table A1 The number of sentences labeled as real beauty topics increases relative to that as other beauty topics in the treated countries during the month(s) of the real beauty campaign.

Table A2 The number of sentences labeled as real beauty topics increases relative to that as other beauty topics in the treated countries relative to control countries during and one month after the Real Beauty campaign.


List of Figures

Figure 1.1 Billboard advertising on Dove campaign for Real Beauty

Figure 1.2 Newspapers more often use words related to social or cultural change in beauty sentences during the Dove campaign for Real Beauty.

Figure 1.3 Trend of beauty topics

Figure 2.1a Distinctly grouped data

Figure 2.1b A loopy segment

Figure 2.2 TDA examples with two customers (Case 1-7) or three customers (Case 8 and 9)

Figure 2.2.1 Case 1 Two segments

Figure 2.2.2 Case 2 Tetragon

Figure 2.2.3 Case 3 Square loopy segment

Figure 2.2.4 Case 4 Center point within square

Figure 2.2.5 Case 5 Rectangle loopy segment

Figure 2.2.6 Case 6 Distant two loopy segments

Figure 2.2.7 Case 7 Neighboring two loopy segments with one connection

Figure 2.2.8 Case 8 Tetrahedron

Figure 2.2.9 Case 9 Octahedron with void

Figure 2.3a 5 steps for simulation study

Figure 2.3b True segments in simulation study

Figure 2.4: TDA barcode chart for simulation study

Figure 2.5 Hierarchical clustering

Figure 2.6: Potentially competing products across segments using IRI data

Figure 2.7 Hierarchical clustering for salty snacks using IRI data

Figure 2.8b: Potentially related products across segments using IRI data with order of connection

Figure 3.1 Search volume trend by a brand

Figure 3.2 Apple’s top rival brands trend

Figure 3.3 Co-search trend by a brand pair with Apple

Figure 3.4 Market-structure map in brand space

Figure 3.5 Samsung moves closer to Apple during campaign in brand space.

Figure 3.6(a) Co-searches between brands and advertised attributes increase in the first week of the comparative advertising campaign.

Figure 3.6(b) Co-searches between brands and unadvertised attributes do not change much in the first week of the comparative advertising campaign.

Figure 3.7 Market-structure map in product space

Figure 3.8 Samsung moves closer to Apple during campaign in product space.


Chapter 1

Can an advertising message impact the content of mass media?

An examination of the Dove campaign for Real Beauty

1.1 Introduction

Advertising can have a society-wide impact (Kotler and Zaltman 1971; Andreasen 1995; Kotler,

Roberto, and Lee 2002; Kotler and Lee 2007). For example, De Beers changed Western marriage

culture with its “A diamond is forever” campaign in 1948; before the campaign, diamond rings

were not synonymous with marriage or engagement (Connolly 2011 in BBC News). While most

research focuses on how advertising affects brand recall (Jain and Hackleman 1978; Alba and

Amitava 1986), brand search (Joo, Wilbur, Cowgill and Zhu, 2015; Liaukonyte, Teixeira and

Wilbur, 2015; Hu, Du, and Damangir, 2014; Du, Hu, and Damangir, 2015; Srinivasan, Rutz,

Pauwels 2015), brand equity (Sriram, Balachander, and Kalwani 2007; Borkovsky, Goldfarb,

Haviv, and Moorthy forthcoming), and market outcomes (Leone 1995; Erickson and Jacobson

1992; Onishi and Manchanda 2012; Gopinath, Thomas, Krishnamurthi 2014; Pauwels, Stacey,

Lackman 2013; Dinner, Van Heerde, and Neslin 2014; Srinivasan, Rutz, Pauwels 2015), it has

been challenging to measure the impact of advertising on society in general or the media in

particular. In this study, we examine whether an advertising campaign can affect mass media

reporting on a social issue.

Social issue advertising informs the public about a social issue or influences their behavior

(Truss, French, Blair-Stevens 2010). It can be a powerful tool to impact the community by

triggering or accelerating social or cultural change (Kotler, Roberto, and Lee 2002; Kotler and

Lee 2007; Omid and Pete 2015). Governments often use public service announcements to

promote causes and activities that are generally considered socially desirable (Garbett 1981).

Public service announcements cover a variety of social problems, including racism, drug abuse,

drinking and driving, child abuse, and illiteracy (Murry, Stam, and Lastovicka 1996), often relying

on donated rather than paid media (Vingilis and Coultes 1990).


Figure 1.1 Billboard advertising on Dove campaign for Real Beauty

Figure 1.1a The first series of ads

Content words mentioned in the ads:

- oversized? outstanding? Does true beauty only squeeze into a size 6? Join the beauty debate.

- fat? fit? Does true beauty only squeeze into a size 6? Join the beauty debate.

- flat? flattering? Can you be sexy without being busty? Join the beauty debate.

- flawed? flawless? Is beautiful skin only ever spotless? Join the beauty debate.

- grey? gorgeous? Why can’t more women feel glad to be grey? Join the beauty debate.

- ugly spots? beauty spots? Does skin really have to be flawless to be beautiful? Join the beauty debate.

- wrinkled? wonderful? Will society ever accept ‘old’ can be beautiful? Join the beauty debate.

Figure 1.1b The second series of ads

Content words that describe the ads: “featuring six real women with real bodies and real curves” (Dove website 2015)

The second picture is captured from CBS’s The Early Show, August 18, 2005, 9:59 AM.


Private companies have also started to use social issue advertising, or cause marketing, as a form

of corporate social responsibility (CSR). For example, American Express’s “Small Business

Saturday” campaign encourages consumers to shop locally to boost small businesses. Figure 1.1

shows images from the Dove campaign for Real Beauty, another example of social issue

advertising. The campaign aims to challenge beauty stereotypes, especially in the media; to widen

the definition of beauty; and thus to make beauty a source of confidence, not anxiety, especially

for girls (Kolstad 2007; Dove website 2016). Currently, such empowering ads (e.g., Dove’s “Dove

Real Beauty Sketches” and P&G Always brand’s “Always #LikeAGirl”) are popular on YouTube:

the top 10 empowering ads were two-and-a-half times less likely to be skipped than other ads in

similar categories over the three years from 2013 to 2015 (Wojcicki 2016).

Figure 1.2 Newspapers more often use words related to social or cultural change in beauty sentences during the Dove campaign for Real Beauty.

[Figure: three panels, one each for the US, Canada, and the UK.]

The Y axis measures the ratio of the number of focal words to the number of beauty sentences.

See Table 10a for the list of words on social or cultural change, including words opposite to physical beauty.

Campaign periods: US (first: Sep. and Oct. 2004; second: July and Aug. 2005), UK (Jan. 2005), Canada (first: Sep. and Oct. 2004; second: June and July 2005).

Given that the mass media affects and reflects modern culture and society (McQuail 2010), in

this study, we investigate whether an advertising message can change the content of mass media

and why. Specifically, we measure the change in topics related to beauty covered in newspapers

before, during, and after the Dove campaign for Real Beauty. Figure 1.2 provides some

motivating analysis. It shows that, during the campaigns, the frequency of the phrase “real beauty”

and the frequency of words related to real beauty or ‘social or cultural’ change (e.g. change,

question, traditional, culture, real, mind, brain, as identified by research assistants with graduate

training in sociology) rose substantially.
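
As a minimal sketch of this kind of keyword-level measure, the snippet below computes, for each month, the ratio of focal-word occurrences to the number of beauty sentences, which is how the y-axis of Figure 1.2 is described. The word list is only the handful of examples quoted above (the full list is in Table 10a), and the function name, data layout, and example sentences are assumptions made for illustration.

```python
import re

# Illustrative subset only: the examples quoted in the text, not the full Table 10a list.
FOCAL_WORDS = {"change", "question", "traditional", "culture", "real", "mind", "brain"}

def focal_word_ratio(sentences_by_month, focal_words=FOCAL_WORDS):
    """Return {month: number of focal-word tokens / number of beauty sentences}."""
    ratios = {}
    for month, sentences in sentences_by_month.items():
        if not sentences:
            continue  # ratio is undefined when a month has no beauty sentences
        hits = 0
        for sentence in sentences:
            tokens = re.findall(r"[a-z]+", sentence.lower())
            hits += sum(1 for token in tokens if token in focal_words)
        ratios[month] = hits / len(sentences)
    return ratios

# Made-up sentences for illustration; not taken from the newspaper data.
example = {
    "2004-08": ["The beauty salon opened downtown.", "A beauty pageant was held."],
    "2004-09": ["Real beauty can change how culture sees women.", "The beauty debate goes on."],
}
ratios = focal_word_ratio(example)
```

Because the ratio counts tokens rather than sentences, it can exceed one when a sentence contains several focal words.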

Such keyword-based analysis is suggestive of an impact of Dove’s real beauty campaign on

media content. However, there are two potential issues with this analysis. First, the choice of words

related to real beauty is not systematic. Second, there is no control group, so it is possible that real

beauty words would have risen during the campaign for reasons unrelated to the campaign.

To address the first point, how can we identify topics related to advertising messages in

newspaper articles? There are challenges in collecting and analyzing newspaper content about

an advertising message. First, advertising messages cannot be summarized in a few keywords,

unlike advertising titles. For example, articles on the campaign slogan (i.e., real beauty) may not

represent all relevant content because some articles may discuss real beauty without using the

phrase “real beauty”. In order not to lose relevant content, one needs to collect articles with

somewhat broad terms (e.g., beauty), and then extract topics related to “real beauty”.

Second, the commonly used aggregate-level keyword analysis may not discover the message of a

relatively small or emerging topic fully. If there are several themes (e.g., beauty services or

products, movies, real beauty) around the search query (e.g., beauty), it is hard to detect the

change in the focal topic (e.g., real beauty) at the aggregate level. At most, only the campaign

title words (e.g., Dove, ad, campaign, real, beauty), which are more frequently reported, can be

easily found. By contrast, words related to less frequent advertising messages may not be captured

well.

To address the above challenges, we segmented beauty sentences into several groups based on

common topics. Topic models (Blei et al., 2003; Taddy 2012; Tirunillai and Tellis 2014;

Büschken and Allenby 2016) assume that the words in text are generated from a mixture of latent

topics. The extracted topics are defined by a collection of co-occurring words with a relatively

high probability of usage (Büschken and Allenby 2016). Such segmentation gave us the power to

detect relatively small topics. Examining the top words within a segment (i.e., topic) allowed us

to identify whether topics related to an advertising message exist in newspapers.
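
As a hedged illustration of the “mixture of latent topics” assumption, a generic topic model can be written as follows. The notation is ours and is not necessarily that of Taddy (2012): $\theta_{dk}$ is the weight of topic $k$ in sentence $d$, $\phi_{kw}$ is the probability of word $w$ under topic $k$, $K$ is the number of topics, and $V$ is the vocabulary size.

```latex
p(w \mid d) = \sum_{k=1}^{K} \theta_{dk}\, \phi_{kw},
\qquad \sum_{k=1}^{K} \theta_{dk} = 1,
\qquad \sum_{w=1}^{V} \phi_{kw} = 1 .
```

Under this reading, the “top words” of a topic are the words $w$ with the largest $\phi_{kw}$, and a sentence can be labeled with the topic $k$ that has the largest $\theta_{dk}$.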

To address the second issue related to a control group, we exploited the rollout of the global

advertising campaign across several countries in order to test whether Dove’s real beauty

campaigns in the U.S., Canada, and the U.K. increased the incidence of real-beauty-related

topics in newspapers relative to other beauty topics across the three treated and two control

countries (i.e. Australia and New Zealand), where the campaign started later.

Utilizing the topic model proposed by Taddy (2012), we grouped all of the beauty sentences into

8 to 10 beauty topics in each analyzed country, including one or two topics related to real beauty.

The number of sentences labeled as real beauty topics increases relative to the number of

sentences labeled as other beauty topics in treated countries relative to control countries during

and one month after the campaign. This is not driven only by reporting on the Dove campaign:

The significant impact on real beauty topics holds even after all the articles that mentioned Dove

in any sentence were excluded. Furthermore, many words related to social or cultural change in

real beauty topics were more often used during the campaigns. Overall, these results suggest that

advertising can affect the topics covered by mass media. While the Dove campaign’s significant

impact on real beauty topics around the time of the campaign holds even in newspapers without

Unilever ads, the impact is larger in newspapers containing Unilever ads.

Taken together, this evidence is consistent with the mass media’s public service role as well as an

advertiser pressure role (Reuter and Zitzewitz 2006; Rinallo and Basuroy, 2009; Reuter, 2009; de

Smet and Vanormelingen 2012; Gurun and Butler 2012; Gambaro and Puglisi 2015; Focke,

Niessen-Ruenzi, and Ruenzi 2016) in media coverage of the Dove campaign. In other words,

media outlets with public as well as economic goals are willing to report and discuss the

messages of such social issue advertising.

Why can advertising messages on social issues affect mass media content? In addition to its goal

to profit, the mass media has a (non-economic) public goal to serve the public interest on desired

social or cultural change (McQuail 2010). Therefore, the mass media is likely to report the

message of social advertising actively. In fact, after interviewing 11 firms, Drumwright (1996)

finds that (1) social issue advertising tends to receive more media coverage than standard

campaigns, and (2) one social issue campaign earned media coverage valued at six times the

expenditure on paid media. Writers in the mass media industry seem to have a similar view. For

example, regarding the Dove campaign, Walker (2005) of the New York Times Magazine wrote,

“the more intriguing fact is that it is a marketing campaign—not a political figure or a major

news organization or even a film—that ‘opened a dialogue’” in his essay “Social Lubricant–How

a marketing campaign became the catalyst for a societal debate.” Forbes contributor Dan also

said, “TV commercials are a culturally powerful force, shaping society and giving voice to those

outside the mainstream” (Omid and Pete 2015).

Why is this research question important? First, it is important to understand how marketing

affects society beyond firm performance. Given that mass media affects and reflects modern

culture and society (McQuail 2010), advertising messages can be another tool to change the way

that people think and talk by establishing a link between advertising message and the content of

mass media. Firms can then use advertising as a corporate social responsibility (CSR) activity, in

addition to products (e.g., innovative products, green products, recycling), employees (e.g.,

employing the disabled, providing retirement plans), transparent corporate governance (e.g.,

transparency), and charity to the community (Luo and Bhattacharya 2006; Luo and Bhattacharya

2009; Hull and Rothenberg 2008; Servaes and Tamayo 2013; Mishra and Modi 2016). In her

book The Beauty Myth, Wolf (1991) argues that our culture’s images of beauty are shaped harmfully

by mass media (e.g., TV and women’s magazines) and advertisements. More than a decade later,

in a global study on women and beauty that was commissioned by Dove, more than two-thirds

(68%) of women also strongly agreed that the media and advertising set an unrealistic standard

of beauty that most women can’t ever achieve (Etcoff, Orbach, Scott, and D’Agostino 2004).

Therefore, a change in the way the media describes beauty was one of the Dove campaign’s

main goals (Kolstad 2007).

Second, given that publicity in mass media (Chintagunta, Jiang and Jin 2009; Kalra and Zhang

2011) and the interaction between a firm’s marketing action and publicity (Ching, Clark,

Horstmann, and Lim 2016) increase demand, it is important to understand what kinds of

marketing action can attract the attention of mass media. Recently, Van Heerde, Gijsbrechts, and

Pauwels (2015) found that deep price reductions triggered newspaper coverage of a price war.

Third, given that publicity tends to have higher credibility than advertising (Cameron 1994; Lord

and Putrevu 1993) and more credible sources are viewed as more trustworthy and generate more

attitude change (Hovland and Weiss 1951), brand positioning is likely to be more effective when

a message is delivered through mass media rather than just commercial advertising. Once again,

it is important to understand what advertising message mass media are willing to report. While

several recent studies (Rinallo and Basuroy 2009, de Smet and Vanormelingen 2012; Gambaro

and Puglisi 2015, Reuter and Zitzewitz 2006; Gurun and Butler 2012; Focke, Niessen-Ruenzi,

and Ruenzi 2016) find a relationship between advertising revenue in the focal mass media and

media bias, no study, to our knowledge, has examined an advertising message as a driver of the content

of mass media.

Our study makes several contributions. First, we contribute to the advertising effectiveness

literature by examining the effect of paid media (i.e., advertising) on earned media. While recent

studies measure the impact of advertising on consumer-generated media such as blogs, social

media, and online forums (Onishi and Manchanda 2012; Gopinath, Thomas, and Krishnamurthi

2014; Pauwels, Stacey, and Lackman 2013; Fossen and Schweidel 2016), we study the mass

media. Second, we explore both the mass media’s public service goal and advertiser pressure as

the drivers of the content of mass media, while most empirical studies have focused only on

advertiser pressure (Reuter and Zitzewitz 2006; Rinallo and Basuroy, 2009; Reuter, 2009; de

Smet and Vanormelingen 2012; Gurun and Butler 2012; Gambaro and Puglisi 2015; Focke,

Niessen-Ruenzi, and Ruenzi 2016). Lastly, we provide a novel marketing application of topic

modeling, which is an unsupervised text-mining method and was recently introduced in

marketing. In our paper, we show that while the commonly used aggregate-level keyword

analysis presents difficulties, the topic model is useful in extracting topics related to advertising

messages from newspapers.

We organize the rest of this chapter as follows. In §1.2, we summarize the related literature. In

§1.3.1, we introduce the Dove campaign for Real Beauty. In §1.3.2 and §1.3.3, we describe our

newspaper data and its pre-processing, then we show some evidence of the campaign’s impact

on newspaper content from the keyword level analysis in §1.3.4. Next, we propose our empirical

strategy in §1.4.1 and show the extracted topics by country in §1.4.2, the main testing results and

robustness check in §1.4.3, and the potential mechanisms in §1.4.4. Finally, we conclude this

study in §1.5.

1.2 Literature

1.2.1 Advertising effectiveness

This paper relates advertising to earned media. In this way, it examines a particular type of

advertising effectiveness. Traditionally, scholars have focused on the direct impact of advertising on

market outcomes (Leone 1995; Erickson and Jacobson 1992) or marketing mix outcomes, such

as product differentiation (Kirmani and Zeithaml 1993) and price premiums (Ailawadi,

Lehmann, and Neslin 2003).

However, advertising may affect consumer purchase indirectly through pre-purchase consumer

behavior (Onishi and Manchanda 2012; Gopinath, Thomas, and Krishnamurthi 2014; Pauwels,

Stacey, and Lackman 2013; Dinner, Van Heerde, and Neslin 2014; Srinivasan, Rutz, and Pauwels 2015).

As consumer activities began to be recorded in online websites (e.g., shopping and search), it

became possible to measure marketing effectiveness in even the pre-purchase stage. For

example, recent studies show that consumer searches for brands or products rise with their

television commercials (Joo, Wilbur, Cowgill and Zhu, 2015; Liaukonyte, Teixeira and Wilbur,

2015; Hu, Du, and Damangir, 2014; Du, Hu, and Damangir, 2015; Srinivasan, Rutz, and Pauwels

2015) and with online display advertising (Lewis and Nguyen 2015).

While consumer search data help understand consumers’ consideration sets, the media plays a

role in increasing awareness of brands or products in the early stage of the consumer purchase

journey. Recent studies show that advertising can increase word-of-mouth in

consumer-generated media such as blogs, social media, and online forums (Onishi and Manchanda 2012;

Gopinath, Thomas, and Krishnamurthi 2014; Pauwels, Stacey, and Lackman 2013; Fossen and

Schweidel 2016).

Although mass media data (e.g., newspaper articles) were available publicly far before either

consumer search or consumer-generated media data, there is still limited literature measuring the

effect of advertising on mass media, perhaps due to the lack of systematic analysis of media

content. Only in the literature on media bias due to advertiser pressure have a few empirical studies

linked advertising expenditure to the length of articles (Rinallo and Basuroy 2009),

frequency of news articles (de Smet and Vanormelingen 2012; Gambaro and Puglisi 2015), and

tone (i.e., sentiment) in articles (Reuter and Zitzewitz 2006; Gurun and Butler 2012; Focke,

Niessen-Ruenzi, and Ruenzi 2016) about advertising firms. Specifically, Rinallo and Basuroy

(2009) show that European and U.S. newspapers and magazines tend to write more about the

products of Italian fashion firms if they spend more for advertising in the focal media. Reuter

and Zitzewitz (2006) find that three U.S. personal finance media sources (i.e., Money Magazine,

Kiplinger’s Personal Finance, and Smart Money) are more likely to positively mention mutual

funds that are advertised with higher advertising expenditure. Gurun and Butler (2012) and

Focke, Niessen-Ruenzi, and Ruenzi (2016) show that U.S. newspapers write less critical articles

on heavier advertisers. They measure “news tone” based on the number of negative words, a

method developed in the financial context by Loughran and McDonald (2011).

However, none of the above papers studied the effect of the advertising message, rather than

expenditure, on mass media content. In this paper, using a topic model, we propose a new

approach to measure whether mass media sources report or discuss more about the message or

theme that the advertising campaign intends to deliver. Our approach can be applied to

consumer-generated media as well.

1.2.2. Text Analysis

Most existing studies that use text-mining techniques are based on particular words or phrases

(Gentzkow and Shapiro 2010; Archak, Ghose, and Ipeirotis 2011, Ghose, Ipeirotis, and Li 2012;

Tang and Guo 2013; Gopinath, Thomas, and Krishnamurthi 2014; Pauwels, Stacey, and

Lackman 2013), co-occurrence of pairs of words (Netzer, Feldman, Goldenberg, and Fresko

2012), and sentiment (Sonnier et al. 2011; Tirunillai and Tellis 2012; Gurun and Butler 2012;

Ludwig et al. 2013; Focke, Niessen-Ruenzi, and Ruenzi 2016). Recently, marketing scholars

have started to use “unsupervised” text mining, which is a dimension-reduction technique in

which a large number of documents are summarized into a small number of product attribute

clusters (Lee and Bradlow 2011), principal components (Liu, Vir Singh, and Srinivasan 2016), or

latent topics (Tirunillai and Tellis 2014; Büschken and Allenby 2016). This unsupervised text-mining

has several advantages. First, it exploits full information (i.e., all the words) within each

text when both reducing dimensions and interpreting clusters, components, and topics, resulting

in rich context. This differs from the traditional text categorization approach, which uses only

some words (e.g., product attributes, brand pairs, adjectives). Second, it requires much less

human intervention than the traditional approach, which often requires a researcher to decide on

predefined keywords.

There are only a few applications of unsupervised text-mining in marketing. For example,

Tirunillai and Tellis (2014) extract quality dimensions (i.e., topics) from product reviews and

also show that dimensions’ importance varies over time. Liu, Vir Singh, and Srinivasan (2016)

decompose tweets into principal components and then use them to predict demand for TV

programs (i.e., shows and NFL games). Büschken and Allenby (2016) also use topics in hotel

reviews as the predictors of overall satisfaction (i.e., review rating). Our work is similar to that of

Tirunillai and Tellis (2014) in that both studies use time-varying topic trends. They illustrate that

a new product launch or news of bad product performance is likely to affect consumer

satisfaction on the “ease of use” quality dimension but do not test it formally. In our paper, we

test whether content marketing (i.e., advertising messages) can change topics of newspapers (i.e.,

have a qualitative impact on publicity). Each topic consists of a collection of words. This rich

information allows us to identify whether topics related to advertising messages exist in the

editorial content of newspapers.

1.2.3. Social issue marketing effectiveness

Our work also contributes to the literature on social issue marketing effectiveness. The existing

literature has focused on consumer attitude or purchase intention and market outcomes in

assessing the effectiveness of social issue marketing, such as cause marketing (CM) and other

corporate social responsibility (CSR) activities.

Using laboratory experiments, many studies show that respondents have a positive attitude

toward companies that implement CM campaigns and are more willing to buy their products

(Brown and Dacin 1997; Pracejus, Olsen, and Brown 2003; Strahilevitz and Meyers 1998; Chang

2008; Folse, Niedrich, and Grau 2010; Koschate-Fischer, Stefan, and Hoyer 2012; Robinson,

Irmak, and Jayachandran 2012). Recently, Andrews, Luo, Fang and Aspara (2014) found using a

large-scale field experiment that CM increases consumer purchase and thus sales revenue. The

effect is the strongest with moderate rather than deep or absent price discounts. Several studies

also link a firm’s CSR activities (e.g., charity, green products, transparent governance) to its

financial performance (Luo and Bhattacharya 2006; Luo and Bhattacharya 2009; Hull and

Rothenberg 2008; Servaes and Tamayo 2013; Mishra and Modi 2016).

By emphasizing the importance of expanding the metrics to measure CSR effectiveness beyond

firm performance (e.g., market share and financial return), Raghubir, Roberts, Lemon, and Winer

(2010) propose to add community metrics including measures related to societal issues (e.g.,

literacy rate, birth/death rate) and media coverage (e.g., quantity and quality of press impact).

However, media metrics are rarely used as the outcome of social issue marketing in academia,

although they are often used in industry (Drumwright 1996). Interviewing 11 firms about both

their standard and social issue campaigns, Drumwright finds that firms want to enhance their image, build

brand equity, and increase sales with their standard campaigns, but they also have public

relations (i.e., media exposure) and cause-related measures (e.g., the number of people actually

engaging in the social behavior) as goals of their social issue campaigns. Informants in her study

often observe more media coverage in social issue campaigns than standard campaigns.

Our work is among the first studies to measure the impact of a social issue campaign on topics

covered in the mass media, which are a qualitative measure of press impact.

1.2.4. Advertiser Pressure

This paper also provides evidence to support theoretical papers’ finding that editorial content can

be affected by advertiser pressure. Given that (1) advertising is the major revenue source of

many media outlets (Stromberg 2004, Mantrala, Naik, Sridhar, and Thorson 2007, Pew Research

Center 2014), and (2) media content affects demand (Chintagunta, Jiang and Jin 2009; Kalra and

Zhang 2011; Ching, Clark, Horstmann, and Lim 2016), advertisers have economic incentives to

influence editorial content (Kerkhof and Münster 2015). There is a growing body of

theoretical literature linking advertiser pressure and media bias across marketing and

economics (Ellman and Germano 2009; Gal-Or, Geylani, and Yildirim 2012; Zhu and Dukes

2015; Spiteri 2015; Blasco, Pin, and Sobbrio 2016).

By estimating both viewer and advertiser demand, Wilbur (2008) shows that advertiser

preferences influence network choices about program genre more strongly than viewer

preferences. Our work also adds evidence of such advertiser pressure to complement the other

empirical papers mentioned in §1.2.1 (Reuter and Zitzewitz 2006; Rinallo and Basuroy,

2009; Reuter, 2009; de Smet and Vanormelingen 2012; Gurun and Butler 2012; Gambaro and

Puglisi 2015; Focke, Niessen-Ruenzi, and Ruenzi 2016). However, we also show that

advertiser pressure is not the only driver.

1.3. Data

1.3.1. Dove Real Beauty campaigns

We chose the Dove campaign for this study for two reasons. First, social issue campaigns

tend to receive more media coverage than non-social ones (Drumwright 1996). The Dove

campaign is such social advertising. Before the campaign, Etcoff, Orbach, Scott, and D’Agostino

(2004) found that only 4% of women around the world would describe themselves as beautiful in

their global study, which was commissioned by Dove. After the study, Dove started its campaign

in order to challenge beauty stereotypes, especially in the media, and widen the definition of

beauty.

Table 1.1 Dove’s Real Beauty campaign roll-out across countries

Group | Year | Countries and months of the campaign
Treatment | 2004 | Canada, U.S. (September, October)
Treatment | 2005 | U.K. (January), U.S. (July, August), Canada (August, September)
Control | 2006 | Australia, New Zealand (April)

Analyzed periods are 2004-2005.

Second, global campaign rollouts allow a natural quasi-experimental setting across countries.

Table 1.1 shows the launching periods of the Dove campaign across five countries. The first

series of ads, known as the “Tick-Box campaign” (Figure 1.1a), was launched in September 2004 in

both Canada and the U.S., and then in January 2005 in the U.K. The ads asked viewers to judge

women’s looks (oversized or outstanding? and wrinkled or wonderful?), and invited them to cast

their votes at campaignforrealbeauty.com (Dove website 2015). The voting results were updated

on a billboard counter in real time. The second, more iconic campaign was introduced in

both Canada and the U.S. in 2005. As seen in Figure 1.1b, this ad is described as “featuring six real women

with real bodies and real curves” (Dove website 2015). In this study, we analyze those early

campaigns across the three countries. For control countries, we use New Zealand and Australia,

where the Dove campaigns were launched later, in 2006. The key identifying assumption is that

month-to-month changes in newspaper topics related to beauty in Australia and New Zealand are

a good control for month-to-month changes in newspaper topics related to beauty in the countries

in which the campaign occurred.
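
To make the logic of this identifying assumption concrete, the sketch below computes a simple two-period difference-in-differences on the share of beauty sentences assigned to real beauty topics. It is an illustration under our own simplifying assumptions, not the specification estimated later in the chapter; the function names and all counts are made up.

```python
def share(real_beauty, other_beauty):
    """Share of beauty sentences assigned to real beauty topics."""
    return real_beauty / (real_beauty + other_beauty)

def diff_in_diff(treated_before, treated_during, control_before, control_during):
    """Each argument is a (real_beauty_count, other_beauty_count) pair.

    Returns the change in the real beauty share in treated countries
    minus the change in control countries.
    """
    treated_change = share(*treated_during) - share(*treated_before)
    control_change = share(*control_during) - share(*control_before)
    return treated_change - control_change

# Made-up counts for illustration only; not data from this study.
effect = diff_in_diff(
    treated_before=(10, 90),
    treated_during=(30, 70),
    control_before=(10, 90),
    control_during=(15, 85),
)
```

In this toy example the treated share rises by 20 percentage points and the control share by 5, so the difference-in-differences is 15 percentage points. The control-country change stands in for what would have happened in treated countries without the campaign.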

1.3.2. Newspapers

As a dependent variable, we need the number of beauty sentences by beauty topic. Our

first step was to collect newspaper articles on beauty from the ProQuest Newsstand database,

which collects many newspapers worldwide, through the libraries of the University of Toronto and

the Auckland University of Technology.

Because we analyze newspaper articles written in English during the analyzed periods of 2004 to

2005, the first query we used in the ProQuest Newsstand database was beauty AND

PD(2004-2005) AND LN(English) AND STYPE(Newspapers), where PD, LN, and STYPE mean

publication date, language, and source type, respectively. In order to analyze more relevant

articles, we added the following restrictions to the above query: (1) AND FTANY(yes), (2)

AND (women OR woman OR females OR female OR girls OR girl), and (3) AB(beauty), where

FTANY and AB mean full text and abstract, respectively. We downloaded articles in full text

because we needed to analyze the content. Such women-related context words helped filter

articles on women’s beauty. An article with beauty in its abstract is more likely to describe

beauty as a main topic.

Table 1.2 shows the queries we used. To collect articles by country, we used JSU (journal or

publication subject), which contains country information. For example, for the U.K., we added

AND JSU(“Great Britain”) in the above query. We also used the same query for Australia and

New Zealand. In the U.S. data, we found many articles published in Canada, perhaps due to their

regional closeness. Thus, for the U.S., we added a further restriction, AND CP(“United

States”), where CP means country of publication. For Canada, there is a newspaper database only

for Canada, ProQuest Canadian Newsstand Complete, and we used only JSU(“Canada”). Using

these queries, we downloaded each month’s beauty articles by country.
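
As a minimal sketch of how the monthly, country-specific query strings could be assembled, the snippet below concatenates the common part and the country-specific part listed in Table 1.2. It only builds strings and makes no claim about any ProQuest API. The function names are hypothetical, and the YYYYMM form of the PD field is inferred from the PD(200512) example in the table.

```python
# Common part of the query from Table 1.2, with the month left as a placeholder.
COMMON_PART = (
    "AB(beauty) AND PD({period}) "
    "AND (woman OR women OR girl OR girls OR female OR females) "
    "AND STYPE(Newspapers) AND LN(English) AND FTANY(yes)"
)

# Country-specific parts from Table 1.2.
COUNTRY_PART = {
    "Canada": "AND JSU(Canada)",
    "U.S.": 'AND JSU("United States") AND CP("United States")',
    "U.K.": 'AND JSU("Great Britain")',
    "Australia": "AND JSU(Australia)",
    "New Zealand": 'AND JSU("New Zealand")',
}

def build_query(country, year, month):
    """Return the query string for one country and one month."""
    period = f"{year:04d}{month:02d}"  # assumed YYYYMM format, as in PD(200512)
    return f"{COMMON_PART.format(period=period)} {COUNTRY_PART[country]}"

def monthly_queries(country, years=(2004, 2005)):
    """Return one query per month over the analyzed period."""
    return [build_query(country, y, m) for y in years for m in range(1, 13)]
```

For example, `build_query("U.K.", 2005, 12)` yields the common part with PD(200512) followed by AND JSU("Great Britain"), and `monthly_queries("Canada")` returns 24 strings, one per month of 2004 and 2005.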

Table 1.2 ProQuest queries used for extracting beauty articles by country

Country | ProQuest database | Country-specific part of query
Canada | Canadian Newsstand Complete | AND JSU(Canada)
U.S. | Newsstand | AND JSU("United States") AND CP("United States")
U.K. | Newsstand | AND JSU("Great Britain")
Australia | Newsstand | AND JSU(Australia)
New Zealand | Newsstand | AND JSU("New Zealand")

Common part of query (all countries): AB(beauty) AND PD(200512) AND (woman OR women OR girl OR girls OR female OR females) AND STYPE(Newspapers) AND LN(English) AND FTANY(yes)

JSU: journal or publication subject, CP: country of publication, AB: abstract, PD: publication date, STYPE:

source type, LN: language, FTANY: full text.

Next, to focus on content directly related to beauty, we selected all the sentences that mention

beauty. The detailed steps to process the text data are described in the next section. In principle, one

could analyze articles that mention beauty in any sentence. However, given that newspaper

articles are quite long compared to tweets and product reviews, some articles that mention

“beauty” may talk about very different topics. As a result, there may be high noise in article level

analysis, making it hard to name topics. On the other hand, each sentence tends to have just one

topic (Büschken and Allenby 2016). Therefore, at the sentence level, micro-detection is possible.

Furthermore, by sampling sentences that mention “beauty”, there is also a gain in computing

time.

Table 1.3 Summary Statistics

Statistic | U.S. | Canada | U.K. | Australia | New Zealand

Monthly sentences that mention “beauty”
Min | 259.0 | 92.0 | 226.0 | 59.0 | 4.0
Mean | 320.9 | 158.0 | 310.6 | 100.3 | 12.3
Median | 307.0 | 154.5 | 319.5 | 99.5 | 11.5
Standard Deviation | 55.6 | 37.3 | 48.7 | 25.4 | 6.0
Max | 506.0 | 242.0 | 400.0 | 146.0 | 26.0

Monthly sentences that mention “real beauty”
Min | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
Mean | 2.5 | 2.0 | 1.7 | 1.8 | 2.0
Median | 1.0 | 1.0 | 1.0 | 2.0 | 1.0
Standard Deviation | 4.3 | 2.2 | 3.1 | 1.5 | 2.2
Max | 19.0 | 7.0 | 15.0 | 7.0 | 7.0

Table 1.3 shows the number of sentences per month mentioning “beauty” (as defined in Table

1.2) and the number of sentences containing the two-word expression “real beauty”, the

campaign slogan. There are far fewer “real beauty” than “beauty” sentences. “Real beauty”

sentences represent less than 1% of “beauty” sentences in the U.S. During some months, no

“real beauty” article was published. This may suggest that the effect of the Dove campaign on

newspapers was trivial. However, there could have been sentences about real beauty that do not include the “real

beauty” phrase. Table 1.3.1 shows an example. Although the two sentences are about real beauty,

only the first sentence mentions the “real beauty” phrase. In this sense, just counting “real beauty”

sentences is a naïve approach, and thus it is not likely to represent all the relevant content.

Table 1.3.1 Sentences on social issue (i.e., real beauty) in newspapers

Type | Sentence | Source
With “Real Beauty” keyword | The "Campaign for Real Beauty" contest was part of a promotion intended to broaden the traditional definitions of beauty. | The Press Democrat, 19 Aug 2005, Santa Rosa, California
Without “Real Beauty” keyword | Too many women in America suffer from eating disorders brought on, in many cases, by cultural pressures to live up to an unrealistic image of ideal beauty. | Deseret News, 23 Mar 2005, Salt Lake City, Utah
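
The limitation of exact-phrase counting can be sketched with the two sentences in Table 1.3.1. The helper below is a hypothetical illustration rather than the procedure used in this study: it flags a sentence only if it contains the phrase “real beauty” as whole words.

```python
import re

def mentions_phrase(sentence, phrase="real beauty"):
    """True if the lowercased sentence contains the phrase as whole words."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return re.search(pattern, sentence.lower()) is not None

# The two example sentences quoted in Table 1.3.1.
sentences = [
    'The "Campaign for Real Beauty" contest was part of a promotion intended '
    "to broaden the traditional definitions of beauty.",
    "Too many women in America suffer from eating disorders brought on, in many "
    "cases, by cultural pressures to live up to an unrealistic image of ideal beauty.",
]
flags = [mentions_phrase(s) for s in sentences]
```

Only the first sentence is flagged, although both discuss the theme of the campaign, which is why a phrase count understates the relevant content.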

1.3.3. Text data pre-processing

Most of the text pre-processing could be done automatically using standard text-mining software.

We pre-processed the newspaper articles using the following steps:

1. Removing URLs and “Full text:”, which is located between the article title and body.

2. Splitting the text into articles using an article identifier (i.e., long line).

3. Identifying articles that mention Dove.

4. Detecting and keeping only beauty sentences that contain “beauty” after splitting articles

into sentences using the R “openNLP” and “qdap” packages.

5. Transforming capital letters into lowercase letters.

6. Collapsing compound words connected by hyphens into one word (i.e., make-up → makeup,

self-esteem, self esteem → selfesteem).

7. Removing all punctuation.

8. Removing stop words using a vocabulary of stop words reserved in the R “tm” package.

9. Collapsing words into a common root using the R “SnowballC” package, which

implements Porter’s (1997) word stemming algorithm.

10. Replacing words with words of similar meaning (e.g., woman → women, said → say, man → men, advertise → ad, therapist → therapi).

11. Removing all words that appear less than 0.2% of the time in all the beauty sentences in

each country.

12. Removing words with only one character or more than 20 characters.

13. Removing “beauti,” which is the stem of “beauty.”

In step 1, URLs and “Full text:” were added by ProQuest rather than the newspaper publishers.

Since they are not newspaper content, we deleted them.


Since we downloaded all of the monthly articles at once, we needed to split the whole text into separate articles in step 2. Given that our unit of analysis is the sentence, we could have split the text into sentences directly in step 4. However, for the robustness check in the results section, we needed the article-level separation in order to identify the articles that mention Dove in step 3.

“Beautiful” and “beautifully” have the same stem as “beauty”. To keep them distinct from “beauty”, before stemming in step 9 we replaced the two words with “beautifull,” whose stem is “beautiful.” We also found that some words with similar meanings have different stems. Thus, we matched those similar words in step 10, after the stemming step. Since “beauti” is in all of the sentences by construction, it has no discriminatory power with respect to topics, much like a stop word (Büschken and Allenby 2016). Thus, we deleted “beauti” in step 13.
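
The cleaning steps above can be sketched in a few lines of code. This is only an illustrative sketch in Python, not the R pipeline described in the text: the stop-word list is a hypothetical miniature of the one in the R “tm” package, the stemmer is replaced by a toy lookup holding only the two stems quoted above, and step 11 (dropping rare words) is omitted because it needs corpus-wide counts. The names `STOP_WORDS`, `TOY_STEMS`, `SIMILAR_WORDS`, and `preprocess` are assumptions introduced for this example.

```python
import re

# Hypothetical mini stop-word list; the thesis uses the list in the R "tm" package.
STOP_WORDS = {"a", "an", "and", "is", "of", "she", "the", "to"}

# Toy stand-in for the Porter stemmer: only the two stems quoted in the text.
TOY_STEMS = {"beauty": "beauti", "beautifull": "beautiful"}

# Step 10 replacements as listed in the text.
SIMILAR_WORDS = {"woman": "women", "said": "say", "man": "men",
                 "advertise": "ad", "therapist": "therapi"}


def preprocess(sentence):
    """Return the cleaned tokens of one beauty sentence (steps 5-10, 12, 13)."""
    text = sentence.lower()                                    # step 5
    text = re.sub(r"\bbeautiful(ly)?\b", "beautifull", text)   # done before stemming
    text = text.replace("self esteem", "selfesteem")           # step 6, spaced variant
    text = re.sub(r"(?<=\w)-(?=\w)", "", text)                 # step 6, hyphenated words
    text = re.sub(r"[^\w\s]", " ", text)                       # step 7 (underscores are kept)
    tokens = [t for t in text.split() if t not in STOP_WORDS]  # step 8
    tokens = [TOY_STEMS.get(t, t) for t in tokens]             # step 9 (toy lookup)
    tokens = [SIMILAR_WORDS.get(t, t) for t in tokens]         # step 10
    tokens = [t for t in tokens if 1 < len(t) <= 20]           # step 12
    return [t for t in tokens if t != "beauti"]                # step 13


print(preprocess("Real beauty is a beautiful make-up of self-esteem, she said."))
```

The order matters in the same way as in the text: “beautiful” is rewritten before stemming so that it survives the final removal of “beauti”.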

1.3.4. Data evidence from keyword-level analysis

Before turning to the topic model, we performed a commonly used keyword-level analysis to provide suggestive evidence of the relationship between the campaign and newspaper content. First, we tested whether the campaign slogan (i.e., “Real Beauty”) was used more often during the campaigns in Columns (1) and (2) of Table 1.4. While Column (1) uses only sentences mentioning “real beauty” in the treated countries, Column (2) adds those in the control countries. In both columns, the key coefficient of interest, Real Beauty x During Campaign x Treated Countries, is positive and significant, suggesting that newspapers talked more about “real beauty” during the campaigns in the treated countries than in the control countries. The coefficient on Real Beauty x During Campaign in Column (2) is positive but not significant, as expected, suggesting that there was no significant increase in reporting on “real beauty” in the control countries. Column (3) adds other beauty sentences that mention “beauty” but not “real beauty”. Now, the key coefficient of interest is positive but no longer significant. This result may suggest that the Dove campaign did not have enough impact to make newspapers talk significantly more about “real beauty” relative to “other beauty” topics. As we discussed before, however, this insignificant result may be driven by the keyword-level approach’s inability to capture real-beauty-related sentences that do not mention “real beauty”. In other words, one may not be able to explore “real beauty” topics comprehensively by analyzing only sentences that mention “real beauty”.
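
The logic behind the interaction terms can be illustrated with a small difference-in-differences computation. The sketch below, in Python, uses made-up counts and hypothetical names (`diff_in_diff`, `toy_rows`). In a saturated two-by-two design with no further controls, the OLS coefficient on the During Campaign x Treated interaction equals this difference of cell means; the regressions in Table 1.4 also include year-month and sentence-type dummies, so their estimates would not reduce exactly to this quantity.

```python
from statistics import mean


def diff_in_diff(rows):
    """rows: list of (treated, during_campaign, sentence_count) tuples."""
    def cell_mean(treated, during):
        return mean(n for t, d, n in rows if t == treated and d == during)

    treated_change = cell_mean(True, True) - cell_mean(True, False)
    control_change = cell_mean(False, True) - cell_mean(False, False)
    return treated_change - control_change


# Made-up monthly counts of "real beauty" sentences, for illustration only.
toy_rows = [
    (True, False, 2), (True, False, 4),    # treated, outside campaign
    (True, True, 9), (True, True, 11),     # treated, during campaign
    (False, False, 1), (False, False, 3),  # control, outside campaign
    (False, True, 2), (False, True, 4),    # control, during campaign
]
print(diff_in_diff(toy_rows))
```

Subtracting the control-country change removes any increase that would have occurred without the campaign, which is why the control countries are added in Column (2).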


Table 1.4 The number of newspaper sentences mentioning 'Real Beauty' increases insignificantly relative to that without mentioning ‘Real Beauty’ in the treated countries compared to control countries during the campaigns.

Column | (1) | (2) | (3)
Sample | Only beauty sentences with "real beauty"; only treated countries | Only beauty sentences with "real beauty"; add control countries | Add beauty sentences without "real beauty"
Real Beauty x During Campaign x Treated Countries | 6.47*** (1.41) | 3.40*** (1.17) | 13.9 (9.03)
Real Beauty x During Campaign | | 0.867 (1.32) | 5.85 (10.2)
Country-specific sentence type dummies | 2 | 4 | 9
Year-month dummies | 23 | 23 | 23
R-sq | 0.558 | 0.420 | 0.961
Observations | 72 | 120 | 240

Treatment: Sentences with "real beauty" in treated countries
Control: Sentences with "real beauty" in control countries; sentences with "beauty", but not "real beauty"

The dependent variable is the number of sentences in each country-specific sentence type, which is the unit of analysis. Each country has two types of beauty sentences: (1) with and (2) without the "real beauty" phrase. Treated countries: the U.S., Canada, and the U.K. Control countries: Australia and New Zealand. All results are estimated using OLS. ***p<0.01


Table 1.5 Monthly top 50 words trend.

Table 1.5a Monthly top 50 words trend in U.S.

Rank 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12

1 beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti

2 say women women women say women say salon women women women say say say pageant women say say say women say say say women

3 women say say lauder women say salon say say say say one women women say say women women women say women women women one

4 men like one say pageant miss will women pageant salon one salon year pageant women look one pageant pageant real pageant one salon geisha

5 like girl year new year like women like contest product salon women natur year shop shop salon product product pageant show like year pageant

6 year can product product one pageant one new one one like year new vote salon pageant year store girl year one year life say

7 queen show salon year product one year pageant will new product pageant show like queen will product year year one will pageant new time

8 make one just one new hernando new one salon make pageant hair shop will miss salon new one like product contest salon like product

9 shop salon pageant cosmet look salon hair can love look work like see miss one like look brand one will salon product work new

10 photo pageant hair salon shop show like also shop love love natur product time first one love compani contest can like show pageant like

11 pageant contest natur busi will new look contest fashion natur year peopl life salon graci even fashion like new girl new american one contest

12 will day school contest just queen natur just year year fashion product like show bullock year time work will ad year shop show love

13 look year old use show will contest work new will new shop will product new can peopl hair look dove can look hair photo

14 can will star spend make hair get even peopl can queen show love natur hair time even limit school campaign shop hair way salon

15 salon shop make pageant school contest good year also like look contest day even contest miss get girl dove new hair world natur hair

16 work time also can live love work hair can girl magazin play young eye like make face men miss size natur come can film

17 also make play queen men peopl thing queen miss busi girl made one one will also life get natur like love new product look

18 mother peopl go cream care year age old music come school will salon love love hair girl new love look fashion girl even natur

19 show tale like este hair shop name offer show us see can school black work work school also time work just miss use way

20 contest new new look can work product us product just contest life star ad agent product two first industri peopl make fashion time work

21 first hair shop face love day can live natur hair skin men famili new life just thing old also find queen contest see just

22 one natur love show like want pageant natur celebr work young high desir hair get queen busi now find see get time now get

23 natur queen queen fashion two movi show love recent men best time work thing undercov girl will use real care thing queen real fashion

24 time life day like life can two show queen first shop girl busi busi fbi think first go design featur look cultur love movi

25 art see film natur peopl time cream miss life play hair first made american go contest book bodi market shop peopl will busi stori

26 former home role life day first shop see first see get see time fit show percent children will show just see life american men

27 girl well church industri skin now love mother meni thing way way men work can life pageant shop work old way first great face

28 fashion imag look just time art men dog whose use american us thing life natur men miss look come thing life play line beast

29 world becom can school first us peopl shop look includ famili great art also latifah school world natur care person also art celebr fresh

30 photograph product work compani world magazin old make time go show spa face world year take day love cosmet hair school natur base year

31 danc offer get becom salon part berri first just magazin can look can take look industri old miss garden natur play work look shop

32 subject help now will queen product time peopl make well time school also photo men get use see just day celebr make first life

33 joe open store hair old natur queen home get time men old two cultur art see store long make want friend men school school

34 survivor famili last also may life make film way also miss photo use imag photo fashion take around store bodi world way fashion two

35 peopl femal health world swan just world person old peopl world long us everi congeni busi good beach ad show day two old makeup

36 get societi life day contest live go walton play skin play host find queen time go colour treatment think queen two high care danc

37 play fairi men home meni eye citi will home store store wonder imag girl world talk model show way way american name includ make

38 skin look first skin want cultur compani day offer open live new long film play import kind time world come call power film world

39 young love peopl help black style featur two includ pageant age love start citi now show percent make take editor bodi end citi thing

40 call school call great line artist design thing call world celebr work river can find first like school open contest girl cancer last come

41 made get famili model star moment friend eye photo day industri just first contest sandra way shop home can time men love bloomington art

42 famili now dee market natur look light citi age two model also even first peopl care hair includ life men eye get will includ

43 found use time york get also way someth featur old kind home skin school music black natur last meni two black even shop colour

44 back film girl youth now see play counti ideal good offic skin film care hart great just person want home place use contest becom

45 park go see swan american busi now time nation part will art great young girl grow includ lingeri line model creat eye peopl still

46 product photo home first store place us life crown name natur offer book magazin two ford eye can part sit paint help get will

47 new magazin place way much state young men panten mountain make store part state offer american well art featur love local line world young

48 hair place colour two seem garden feel get like school peopl paint pageant hand black age think offer know life vehicl surgeri home call

49 love cultur part thing inner makeov strength skin hair get two plastic hair shop stori help much meni tradit also miss studi skin want

50 life design person play wife way girl high work way high express just look start much becom want campaign first even school art eye

2004 2005


Table 1.5b Monthly top 50 words trend in Canada

Rank 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12

1 beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti

2 say women women women women women pageant women women women women women women look shop play will pageant women women women say say women

3 tale come say pageant product queen china say say say product product star girl women contest say contest say say say colour women say

4 women say editor say year pageant look year like dove colour wonder home great pageant undercov year look pageant girl real imen like sing

5 product also execut one say say women contest appear love world whole like pageant say pageant see women imag size dove women pageant natur

6 one imag fashion per day year hair pageant can peopl say world film one like product last year beautiful pageant model world one take

7 fashion queen home cent celebr miss product new one colour turn divers look like queen women women will live year product new person can

8 fairi messag magazin contest look product men one peopl feel school embrac world editor brain shop look say look dove one can someth show

9 girl surround decor show show fashion say surgeri attribut interest like creation actress women salon new use make natur will ad use product salon

10 time will depart swan one look canada also cultur look us heal hope product just salon love world use age new get sens becom

11 question real one product hair canada like old physic like former year brought show cell time real even love stori like see comfort product

12 societi role chatelain make will repres year get per time life new co year stem film may learn see north photo eye clear one

13 new like can see fashion contest new whose cent make good time jame fashion one fbi contest like standard see look skin humour men

14 use power girl call life day male miss achiev find great first pageant love time congeni natur whose seem bodi former book dri life

15 studi peopl becom men film eye salon obsess natur meni made one stun africa product guy face around mind section health love contest well

16 theron featur truth surgeri face toronto love salon intern film industri call die say new gina pageant interest one product editor everi fit tri

17 baker photograph charg transform last new way find strong photograph year artist romenc can come miss show product can peopl featur like quit like

18 sperri photo tant look garden natur will good noth soap make say kay time miss way model one queen old fashion peopl enter just

19 whose media pageant will make show first may attitud want also can virginia also spot owner live natur year featur peopl also most never

20 mother stereotyp come meni get editor intern youth agre femal real cosmet bob day contest kidnap eye miss will differ use come exact hand

21 brother year men young busi among light month get canadian even servic danni just play like good love make campaign work find smoke song

22 year natur life canada societi come strength plastic imag year thing now gregori even cosmet old mother play also plus us want privat look

23 can show photo inspir classic old one queen studi queen book includ say eye much agent island age old skin imag everyon aniston make

24 peopl men leav win pageant home natur show near show teen book new goddess grow water child now us director photograph humen includ captur

25 old hair vacanc lauder can now also world spirit fashion red accept hair glorious soul will amber canada appear won director observ black bc

26 appear see will new time femal miss use contest live gift african blond use white origin one role someth sever idea one show anderson

27 famili base contest can use illustr get take meni whose sephora pageant thousand young claim singl also care toronto harper time time peopl reluct

28 import beyond make time us art offer young model face new day nurs three sensual mom day time product carolina natur fashion hair grew

29 lauder often hair model first countri base physic dove thing natur meni now charact look bacon physic queen meni model first men year adam

30 preoccup tip get first whose foot repres think survey ad peopl whose product cover face jorg thing come just teen now whose world anybodi

31 will theron young cosmet way chatelain accord news particip illustr men win one appreci well kevin ad old eye averag femal way age film

32 find world role physic cosmet jackson peopl post year studi day audienc busi wisdom movi go feel meni stereotyp current becom offer makeup even

33 work use societi help even one come berri help pageant play prize monday new russian hit book take packag eastern think around time two

34 femal life born garden canada can use natur stun product film earth time natur graci boss still work ban come brand mean men role

35 line age like american physic take day cosmet indonesia one shop peac also hair clinic sequel photo two get miss food continu colour big

36 head way year goal stori real colour turn will us cosmet award work shop diseas men point good design real pageant car go goe

37 anoth femal look soon market us age garden queen young includ centuri even real kidnap want give size back think age will can naomi

38 son skin peopl date style want two red hair think celebr urg canada home passion got although much pictur beautiful skin girl natur watt

39 chow physic find plain actress face good look find cultur high usual includ first latifah bullock music famili despit style cultur show day contest

40 like size work fox recent thing societi peopl young compani market assist last femal therapi sandra tip found without recent size old just use

41 meni book us year swan canadian market men ad market still beat place face heather return student often soul can line salon live get

42 market ideal live love helen stori ask hair feel ideal citi dancer store appear will make time increas ultim day care meni skin whose

43 design best want find like high sex day open campaign friend process angel includ men get fashion fun enjoy find fit just last strong

44 monster exhibit cosmet take natur classic across meni narrow ask aim broke los last use take make exot war just differ model health beast

45 charliz known skin star come blond west work busi provid ottawa ceremoni show feel meni cosmet work glebova convent imag point live differ giant

46 marri realiti star tale salon despit can us design countri road wound make magazin take big even hate marriag cosmet berg becom campaign eventu

47 grimm fame appear known live produc queen now mean present extrem drum get photograph star million featur can glebova ad gotten two follow fay

48 bam compar photograph compet femal five world face word fan citizen inde day call offer menag size show world illustr will thing walk wray

49 look oscar base fairi star respect play featur box orbach tower contest find much agent concept beautiful fashion work brittani colour canadian will appl

50 think confront go plastic becom blue work thing visual girl bell queen shop set bullock winner studi peopl physic love meni turn queen gorilla

2004 2005


Table 1.5c Monthly top 50 words trend in U.K.

Rank 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12

1 beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti beauti

2 year women women women women women salon say say women world women women product new year women say say say say product women women

3 say say year queen world say women women women say year year say look treatment one say women women look women year say new

4 treatment will love contest one year year one year one one new one say pound women year year year year product old year pound

5 women one say say product natur men salon one product women say age women one salon look look star like one look men say

6 will look treatment one salon contest will look look girl time product dove year first treatment one one look women new say look time

7 therapi like product year year look hair will time therapi say like new therapi women hair product product like one girl treatment like hair

8 salon hair model look say fashion say product men year can old real will year say salon men old will will men girl make

9 peopl year one first will top product year like salon design contest campaign first say look will will will product contest one world salon

10 new salon will also girl one therapi world queen new contest one salon can look product pound natur love men fashion new therapi like

11 hair world men miss work pound new hair product love miss world year treatment girl girl just make one time time love treatment can

12 product treatment can will queen salon treatment contest salon best look miss can one also men girl girl make just day also time look

13 old natur therapi product treatment get like star world will like can ad new product new time treatment use new year women one girl

14 work girl look new miss queen world men star time get salon product just like pound face first last girl see just new show

15 top therapi salon take shop product one now last old peopl treatment girl make hair like best two miss treatment like like old year

16 one old work play look will industri girl girl queen will just men avon work therapi like can polanski therapi miss girl love one

17 time just face girl therapi men girl first can get salon will use girl two now therapi thing time hair can shop contest day

18 make pound show treatment just time model hall show can make first way spot salon show star pound just old london rape star film

19 look take makeup men new hair young new new world girl star look men therapi industri use now wife life counter queen face product

20 just two new time like old just love miss first treatment open old world make time can queen swedish salon look natur will world

21 use best hair can spot star work miss peopl pound also look face pound day take contest peopl orlaith love make asian want therapi

22 makeup servic two thing can editor also model go life best pound will love come health last spot bodi work work will product work

23 girl star first parlour first now look hollywood also danniell thing queen like face play also young health big now last last salon includ

24 can skill make like day day time can eye look way get time get includ last first new london two good thing just beast

25 world can fashion world get use make thing pageant treatment car pageant hair natur think spot work old girl star clare week first will

26 last now black just great girl now want will like star plastic now great open can get see back face men high take old

27 age first star pound competit pageant want call now men two girl day skin will just play help night use world time life love

28 meni love blond make also beautiful real old take contest great hair life come queen first cream love anoth fashion hair can see get

29 like last now love take world old pound first face made thing first busi miss will men last second skin age therapi can natur

30 men industri world work model just even make day make new way make catherin model work old age new come want first hair even

31 now new come salon eye love can day face take hair offer skin time facial contest now skin contest industri sinc face work bodi

32 take work ft hair even also star natur even bodi pageant ladi us age point star spot live spot back just use last menag

33 offer face like show time last show peopl thing name product present young even agent part new eye model live now miss great treatment

34 place won time see pound model take way hair organ men sleep eye top love survey day top life everi face spot offer just

35 first sak get go love like busi play old work first men even home contest day peopl salon way also go skin good now

36 contest men last back show eye give long pound now queen day design enjoy star get come like well show live offer never queen

37 show still go former thing skin reveal welsh make day day use start salon show good way time men get muslim think blond last

38 best hand also realli meni put contest berri work fashion show fashion person work eye colleg health therapi can spot old much american place

39 film student use lauder busi hepburn two therapi use even well come peopl also see increas store just world great spot long show centuri

40 spot co life old tv can fashion work two health film former great peopl top model love use queen health model black get men

41 go product eye day lead best skin also life high good place industri fashion offer eye also take magazin go life salon fashion contest

42 industri day great film old come good queen see us art china femal film back still show come name play eye hair even star

43 busi show beautiful much now well home show set home love make treatment everi pageant ever see much citi busi includ make top also

44 compani peopl ad long health much get eye turn give age work world hollywood beautiful market skin around claim celebr still now go miss

45 train great spend win play live use come win just alway show therapi door feel number top week fair becom pageant get london two

46 pound come queen therapi includ organ miss well therapi star idea take just like photograph make think blond also hollywood shop fashion massag peopl

47 star call day star live treatment offer beautiful just show centuri two love old uk love back world day right turn home make age

48 day blond take two pageant two pound high love natur european life star day artist face meni work us appear friend back miss around

49 face perfect peopl life find peopl great becom get age therapi great show last massag natur femal former made win competit place model design

50 two competit age age end spot go win us meni old includ get two time fashion night littl month competit treatment feel health st

2004 2005


Table 1.5d Difference in mean word frequency between campaign and non-campaign months

Rank | U.S. Word | Difference in Mean | t-val | P-val | Canada Word | Difference in Mean | t-val | P-val | U.K. Word | Difference in Mean | t-val | P-val

Rising words during the campaigns
Top 1 | beauti | 18.40 | -0.56 | 0.581 | beauti | 43.10 | -2.01 | 0.057 | beauti | 90.52 | -1.78 | 0.089
Top 2 | say | 11.80 | -1.99 | 0.059 | women | 27.40 | -4.89 | 0.000 | women | 49.09 | -5.49 | 0.000
Top 3 | women | 9.75 | -1.26 | 0.221 | dove | 16.30 | -18.38 | 0.000 | age | 21.65 | -6.13 | 0.000
Top 4 | real | 8.80 | -2.99 | 0.007 | say | 15.30 | -3.56 | 0.002 | real | 20.00 | -7.09 | 0.000
Top 5 | product | 7.70 | -1.86 | 0.077 | size | 10.70 | -3.03 | 0.006 | way | 12.22 | -3.58 | 0.002
Top 6 | girl | 6.40 | -2.53 | 0.019 | peopl | 10.00 | -5.85 | 0.000 | say | 12.00 | -1.22 | 0.237
Top 7 | ad | 5.80 | -2.64 | 0.015 | ad | 9.05 | -8.95 | 0.000 | use | 10.70 | -3.28 | 0.003
Top 8 | come | 4.55 | -2.41 | 0.025 | model | 8.55 | -5.72 | 0.000 | one | 9.57 | -1.74 | 0.096
Top 9 | will | 4.50 | -1.13 | 0.270 | girl | 7.75 | -2.01 | 0.057 | new | 9.35 | -2.11 | 0.046
Top 10 | find | 4.40 | -2.40 | 0.025 | cultur | 7.65 | -5.14 | 0.000 | can | 8.87 | -2.20 | 0.039

Falling words during the campaigns
Bottom 10 | photo | -2.40 | 1.08 | 0.291 | product | -1.65 | 0.46 | 0.651 | last | -2.83 | 0.69 | 0.495
Bottom 9 | queen | -2.45 | 0.72 | 0.481 | good | -1.75 | 1.49 | 0.152 | queen | -3.43 | 0.69 | 0.497
Bottom 8 | school | -2.45 | 0.94 | 0.355 | men | -1.80 | 1.06 | 0.300 | pound | -4.04 | 0.79 | 0.437
Bottom 7 | made | -2.45 | 1.46 | 0.158 | world | -1.80 | 0.86 | 0.396 | treatment | -4.22 | 0.62 | 0.543
Bottom 6 | star | -2.45 | 1.20 | 0.242 | use | -2.25 | 1.11 | 0.278 | contest | -4.48 | 0.78 | 0.442
Bottom 5 | shop | -3.55 | 0.68 | 0.506 | star | -2.25 | 1.12 | 0.275 | former | -4.57 | 1.67 | 0.109
Bottom 4 | like | -3.65 | 1.19 | 0.245 | last | -2.55 | 2.06 | 0.051 | makeup | -4.78 | 1.16 | 0.258
Bottom 3 | face | -3.75 | 2.13 | 0.044 | salon | -2.90 | 0.96 | 0.348 | top | -4.87 | 1.08 | 0.292
Bottom 2 | hair | -4.10 | 1.77 | 0.090 | contest | -3.30 | 0.97 | 0.340 | pageant | -5.04 | 1.25 | 0.223
Bottom 1 | salon | -5.80 | 1.14 | 0.266 | shop | -4.20 | 0.84 | 0.410 | miss | -7.78 | 1.64 | 0.115

There are 24 monthly observations; the month is the unit of analysis.
Campaign periods: U.S. (first: Sep. and Oct. 2004; second: July and Aug. 2005), U.K. (Jan. 2005), Canada (first: Sep. and Oct. 2004; second: June and July 2005)


Next, we explore whether there is a change in newspaper content using a commonly used top-word trend analysis. Tables 1.5a to 1.5c show the monthly top 50 word trends for the U.S., Canada, and the U.K. The campaign title words (“dove”, “campaign”, “real”, and “beauty”) rank high during the months of the campaign, but are not in the top 50 during most non-campaign months. One exception is the first U.S. campaign in September and October 2004, when these words do not stand out, suggesting that the first campaign was either weaker or less effective than the second one in the U.S.
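
A top-word trend of this kind amounts to counting tokens per month and keeping the most frequent ones. The following Python sketch shows the idea on made-up, already pre-processed sentences; the function name `top_words_by_month` and the toy data are assumptions for illustration and are not taken from the thesis.

```python
from collections import Counter


def top_words_by_month(sentences_by_month, k=50):
    """Map each month label to its k most frequent tokens."""
    result = {}
    for month, sentences in sentences_by_month.items():
        counts = Counter(tok for sent in sentences for tok in sent)
        # most_common orders equal counts by first appearance
        result[month] = [word for word, _ in counts.most_common(k)]
    return result


# Made-up tokenized sentences for two months, for illustration only.
toy = {
    "2004-09": [["women", "say", "salon"], ["women", "salon"], ["women"]],
    "2005-07": [["real", "dove", "women"], ["real", "dove"], ["real"]],
}
trend = top_words_by_month(toy, k=2)
print(trend)
```

Reading the resulting lists side by side across months is what reveals whether words such as “dove” or “real” enter the ranking only in campaign months.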

Among the top 100 words across the two years in each country, Table 1.5d shows the top 10 rising and falling words during the campaigns. The words are ordered by the difference in mean word frequency between campaign and non-campaign months. Most of the campaign title words are in the top 10 list across the three countries, consistent with the findings in Tables 1.5a to 1.5c.
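
For a single word, the comparison in Table 1.5d can be sketched as a difference in monthly mean frequency together with a t statistic. The text does not state which t-test was used or its sign convention, so the Python sketch below assumes Welch's unequal-variance statistic and defines the difference as campaign minus non-campaign; the function name and the counts are hypothetical. This form needs at least two months in each group.

```python
from math import sqrt
from statistics import mean, variance


def mean_diff_and_welch_t(campaign, non_campaign):
    """Return (difference in means, Welch t statistic) for monthly word counts."""
    diff = mean(campaign) - mean(non_campaign)
    standard_error = sqrt(variance(campaign) / len(campaign)
                          + variance(non_campaign) / len(non_campaign))
    return diff, diff / standard_error


# Made-up monthly counts of one word, for illustration only.
campaign_months = [10, 12, 14, 16]
other_months = [4, 6, 8, 6]
diff, t_stat = mean_diff_and_welch_t(campaign_months, other_months)
print(diff, t_stat)
```

A large difference with a small t statistic, as for “beauti” in the U.S., indicates a word whose monthly counts vary too much for the change to be distinguished from noise.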

Across the three countries, there are similar types of falling words. First, there are several words related to beauty services or products: “salon”, “shop”, “hair”, “face”, “makeup”, “treatment”, “pound”, and “product”. Second, there are also beauty-contest-related words: “contest”, “pageant”, “miss”, “world”, “queen”, and “photo”. These results suggest that traditionally popular beauty words are used less frequently during the campaigns. There is also a cross-country difference in falling words. In the U.S. and Canada, there are more words related to beauty services or products in the bottom 10 words than there are in the U.K.

Overall, the keyword trend analysis suggests that traditionally popular beauty words related to beauty services or beauty contests are used less frequently during the campaigns. However, while this analysis also captures the growth of the campaign title words (i.e., dove, campaign, real, beauty), it does not reveal the advertising message that Dove emphasized.

Why does the keyword trend analysis reveal limited information? A similar problem occurs when one attempts to analyze all consumers together without segmenting them. Suppose that there is one big consumer segment and one small segment. While consumers in the big segment are highly price sensitive, those in the small segment do not care about price. If the marketing manager conducts only an aggregate-level analysis, he or she might conclude that most consumers are very responsive to price changes and overlook the small price-insensitive segment.

Essentially, the keyword trend analysis looks at the aggregate-level trend across all topics. Thus,

it can identify a change in big topics well, but not a change in small or emerging topics. That’s


why the decline of relatively big traditional beauty topics such as beauty services and beauty contests is found easily, unlike the rise of the emerging topic of the Dove campaign. Similarly, the difficulty of detecting campaign-related words in the U.S. and U.K. may be due to the amount of beauty news. As seen in Table 1.3, the U.S. and U.K. had more than twice as many beauty articles as Canada. This is a limitation of aggregate-level keyword analysis, which seems to fail to extract advertising-message-related words from a large amount of text. In the next section, we explore whether the topic model, with which we segment beauty sentences into several topics, can offer additional insight beyond the aggregate-level keyword analysis.

1.4 Estimation and Result

1.4.1 Empirical Strategy

To see whether an advertising message affects newspaper content, we tested whether the

incidence of real-beauty-related topics increased with Dove’s real beauty campaign. For this

purpose, we used the following two stages.

First, we categorized beauty sentences in newspapers into topics. Topic extraction with an

optimum number of topics can be done automatically using recently developed topic models

(Blei, Ng and Jordan 2003; Taddy 2012). Then, to identify real-beauty-related topics, we used

(1) ‘Dove campaign for Real Beauty’-related words and (2) other relevant words chosen by

sociology experts among the top words in each topic. At this step, we could see whether topics

related to advertising messages (i.e., real-beauty-related topics) existed. As the last step of the
first stage, we allocated each beauty sentence to the topic with which it had the highest
association.
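The allocation rule at the end of the first stage can be sketched in a few lines. This is only an illustration, assuming the association of a sentence with each topic is available as a list of topic weights (the weights called ω in Section 1.4.2); the function name `assign_to_topics` and the numbers are hypothetical and are not taken from the thesis data.

```python
def assign_to_topics(weights):
    """Return, for each sentence, the 1-based index of its highest-weight topic."""
    labels = []
    for w in weights:
        # Index of the largest weight; the first one wins if there is a tie.
        best = max(range(len(w)), key=lambda k: w[k])
        labels.append(best + 1)
    return labels


# Hypothetical weights for three sentences over K = 3 topics (each row sums to 1).
omega = [
    [0.70, 0.20, 0.10],
    [0.10, 0.30, 0.60],
    [0.25, 0.50, 0.25],
]
assigned = assign_to_topics(omega)
print(assigned)
```

How ties should be handled is not stated in the text; taking the first maximum is an assumption of this sketch.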

In the second stage, we exploited the staggered rollout of the global advertising campaign across
countries to test whether the number of sentences labeled as real beauty topics increased
relative to the number labeled as other beauty topics in the treated countries, compared with
the control countries, where the campaign started later.
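The logic of the second-stage comparison can be illustrated with a simple two-by-two contrast of shares. This is a minimal sketch and not the estimator used in this chapter: the functions `real_beauty_share` and `diff_in_diff` are hypothetical helpers, and the sentence counts are invented purely for illustration.

```python
def real_beauty_share(real, other):
    """Share of beauty sentences that are labeled as real beauty topics."""
    return real / (real + other)


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Each argument is a pair (real_beauty_count, other_beauty_count)."""
    change_treated = real_beauty_share(*treated_post) - real_beauty_share(*treated_pre)
    change_control = real_beauty_share(*control_post) - real_beauty_share(*control_pre)
    # Change in the treated countries net of the change in the control countries.
    return change_treated - change_control


# Invented counts: treated share moves from 0.10 to 0.18, control from 0.10 to 0.11.
did = diff_in_diff((100, 900), (180, 820), (50, 450), (55, 445))
print(round(did, 4))
```

A positive value would indicate that the share of real beauty sentences grew more in the treated countries than in the control countries over the same period.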

1.4.2 Topic extraction

For the topic model, we follow the formulation by Taddy (2012), in which multivariate count data
for terms (words in our paper) in a document (a sentence in our paper) are realized from a
multinomial distribution parameterized by a weighted sum of latent topics. Given $P$ unique terms
across all $N$ observed beauty sentences, each beauty sentence $x_i \in \{x_1, \dots, x_N\}$ can be
expressed as a vector of counts for the $P$ words. The total number of words in beauty sentence $x_i$ is
$m_i = \sum_{j=1}^{P} x_{ji}$, where $x_{ji}$ is the frequency of word $j$ in beauty sentence $i$. We assume
a priori that there are $K$ beauty topics. Then, the $K$-topic model is

$$x_i \sim \mathrm{Multinomial}(\omega_{i1}\theta_1 + \cdots + \omega_{iK}\theta_K,\; m_i) \qquad (1)$$

where each topic $\theta_k = [\theta_{k1}, \dots, \theta_{kP}]'$ is a vector of probabilities over the $P$ words, and the
weights $\omega_i = [\omega_{i1}, \dots, \omega_{iK}]$ are a vector of probabilities over the $K$ topics. One can label each
topic based on its own top words from $\theta_k$, and assign sentence $x_i$ to a particular topic based on the
weights $\omega_i$.
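To make equation (1) concrete, the sketch below draws one sentence from the model for a toy vocabulary. It is an illustration only: the vocabulary, the topic vectors, the weights, and the helper names `mixture_probs` and `draw_sentence` are all hypothetical, and the code simulates from the model rather than estimating it.

```python
import random


def mixture_probs(omega_i, thetas):
    """Word probabilities for one sentence: sum over k of omega_ik * theta_k."""
    n_words = len(thetas[0])
    return [
        sum(w * theta[j] for w, theta in zip(omega_i, thetas))
        for j in range(n_words)
    ]


def draw_sentence(omega_i, thetas, m_i, rng):
    """Draw a count vector x_i ~ Multinomial(mixture probabilities, m_i)."""
    probs = mixture_probs(omega_i, thetas)
    counts = [0] * len(probs)
    # m_i independent word draws from the mixture give one multinomial draw.
    for j in rng.choices(range(len(probs)), weights=probs, k=m_i):
        counts[j] += 1
    return counts


# Hypothetical toy setup: P = 4 words and K = 2 topics.
vocab = ["salon", "hair", "real", "ideal"]
thetas = [
    [0.60, 0.30, 0.05, 0.05],  # a "beauty service"-like topic
    [0.05, 0.05, 0.50, 0.40],  # a "real beauty"-like topic
]
omega_i = [0.25, 0.75]  # this sentence leans toward the second topic

probs = mixture_probs(omega_i, thetas)
x_i = draw_sentence(omega_i, thetas, m_i=12, rng=random.Random(0))
print(dict(zip(vocab, x_i)))
```

Because both the topic vectors and the weights are probability vectors, the mixture is itself a probability vector over the P words, and the drawn counts always sum to the sentence length.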

To implement the above topic model, we used the R “maptpx” package by Taddy
(2012). Table 1.6 shows log Bayes factors, i.e., the log of the ratio of the marginal likelihood of a
$K$-topic model to that of a null one-topic model, for $K \in \{2, \dots, \hat{K} + 3\}$, where $\hat{K}$ is the
optimum number of topics with the highest value of the log Bayes factor. We obtained 10, 8, 10, 7,
and 2 topics for the U.S., Canada, the U.K., Australia, and New Zealand, respectively.

Table 1.6 The optimum number of topics based on log Bayes factor over the null one-topic model

No. of topics       U.S.     Canada       U.K.  Australia  New Zealand
2                58700.0    30966.3    56684.8   20017.78       2330.9
3                77644.4    38431.6    74453.1   24327.04       1959.5
4                96745.0    45036.3    92989.4   26840.61       1104.1
5               115832.9    51475.9   111829.2    29287.3        414.4
6               135057.9    58542.4   132001.3   31359.66            -
7               154849.9    63146.9   151371.7   31898.02            -
8               170111.8    68042.0   163902.5   30098.53            -
9               177803.3    66222.8   172030.2   29749.55            -
10              178953.7    57928.5   172836.1   23964.92            -
11              172384.0    48512.1   166570.6          -            -
12              160635.0          -   151039.1          -            -
13              141955.7          -   131328.4          -            -

A dash marks a number of topics that is not reported for that country.
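The selection rule behind Table 1.6 is simply to take the number of topics with the largest log Bayes factor. The sketch below applies that rule to the Canada and New Zealand columns as reported in the table; the function name `optimum_k` is hypothetical, and the code only reads off the maximum rather than re-estimating anything.

```python
def optimum_k(log_bayes_factors):
    """Return the number of topics K whose log Bayes factor is largest."""
    return max(log_bayes_factors, key=log_bayes_factors.get)


# Log Bayes factors copied from the Canada and New Zealand columns of Table 1.6.
canada = {
    2: 30966.3, 3: 38431.6, 4: 45036.3, 5: 51475.9, 6: 58542.4,
    7: 63146.9, 8: 68042.0, 9: 66222.8, 10: 57928.5, 11: 48512.1,
}
new_zealand = {2: 2330.9, 3: 1959.5, 4: 1104.1, 5: 414.4}

print(optimum_k(canada), optimum_k(new_zealand))
```

The result agrees with the 8 topics for Canada and 2 topics for New Zealand stated in the text.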

Alternatively, one could run a common topic model across all the analyzed countries. This
approach might help intercountry comparison, but it would be challenging for the following
reasons. First, spellings of particular words differ slightly across countries (e.g., behavior vs.
behaviour and color vs. colour). Second, people in different countries may use different words to
discuss the same topics. Third, the range of topics people talk about may vary across countries.
Therefore, we focus on country-specific topic models.

Once topics are generated, researchers typically name them based on their top words. In this
study, however, it is not trivial for researchers to distinguish real-beauty-related topics from
other beauty topics. Recall that we attempt to categorize sub-topics of beauty rather than easily
distinguished newspaper topics (e.g., politics, weather, sports). Beauty topics are likely to share
common words. Advertised words in the Dove campaign such as “size”, “skin”, and “spots” can be
used to describe beauty products or services in articles unrelated to the Dove campaign.
“Female”, “model”, “body”, “hair”, or “makeup” are often used for both beauty services and
beauty contests. Likewise, a writer who criticizes the traditional view of beauty and calls for a
new definition of beauty may also use words such as “look”, “face”, “makeover”, and “treatment”
to describe beauty services. Furthermore, researchers often do not have enough knowledge to
identify words about a social issue (e.g., real beauty). In this case, it may be hard for researchers
to distinguish real-beauty-related topics from other beauty topics objectively. Therefore, we
identify real-beauty-related topics based not only on ad-related words, but also on the opinions
of experts who have relevant knowledge of the social issues that the Dove ads deal with.

Table 1.7a Beauty topics in the U.S.

Rank Topic 1 Topic 2 Topic 3 Topic 4 Topic 5 Topic 6 Topic 7 Topic 8 Topic 9 Topic 10

Label "Beauty service 1" "Real Beauty 1" ? "Beauty service 2" "Real Beauty 2" "Beauty contest 1" "Beauty service 3" "Beauty service 4" "Movie" ?

1 new salon show like women pageant product say shop one

2 film work contest will store time peopl natur life queen

3 well year just look young first love men now fashion

4 cosmet girl make can good school day get play even

5 artist miss also call magazin world home care much go

6 york two way citi help thing use offer made colour

7 high old see think imag busi includ meni famili book

8 base live us long ad come eye great style stori

9 often find take want part skin black line found becom

10 role face creat makeup feel art celebr compani around whose

11 market open turn seem set year american hair may design

12 known cultur ful health former age star name friend model

13 surgeri real need alway center photo movi everi still music

14 lauder featur kind beach end place bodi never back paint

15 perfect person children inner femal last danc start state best

16 youth mother hope put countri industri white know standard daughter

17 plastic power mr local grace week sell nation park next

18 provid parlor bring sens treatment photograph america import owner move

19 recent three love garden free event big right charact second

20 believ month blond flower better wife full area today tip

21 becam ideal ms talk sinc illustr light chang small left

22 idea talent truth ugli differ night present voic someth point

23 inspir anoth run special secret town african editor enjoy earli

24 servic tradit physic fit mind student attract appreci perform fall

25 captur spa transform ladi public along hair enter moment combin

26 brand makeov wonder mean group univers focus cream among issu

27 accord competit might stage though high saturday develop visit studi

28 give organ room without cours class fill interest realli enough

29 rich grow keep strength appear colleg marri charm actress success

30 joy littl god beast cover money gift came heart told

31 job size brain less lot deep dark presid classic far

32 consid true tell church hand obsess lip form learn hit

33 die oper top origin societi almost custom done seen feminin

34 island wear yet communiti depart train head mysteri new simpli

35 site bath guy eleg male five tale six histori wrote

36 angel popular allow averag husband consult blue saw south remain

37 mountain th other instead suppli crown desir sing past fact

38 promot own experi smile self expert scene definit win surround

39 land hous parti final cloth host collect later death given

40 river compet express travel figur got view speak brought babi

Table 1.7b Beauty topics in Canada

Rank Topic 1 Topic 2 Topic 3 Topic 4 Topic 5 Topic 6 Topic 7 Topic 8

Label "Real Beauty 1" "Beauty service 1" "Real Beauty 2" "Movie 1" "Beauty contest 1" "Beauty service 2" "Beauty service 3" "Movie 2"

1 look say women play women one like product

2 can year girl make pageant show natur new

3 love world time whose contest just see men

4 day hair model first come even meni film

5 find get real now want shop live work

6 also will imag back miss good young becom

7 take colour age heart cosmet feel eye canada

8 life old dove much skin two person line

9 peopl includ featur mother care last someth role

10 thing salon think salon bodi place standard star

11 way well ad undercov canadian name sens base

12 physic great size music turn power home go

13 call magazin face movi artist seem learn design

14 ful makeup photograph agent queen citi queen toronto

15 per fashion femal stori believ point blond market

16 part editor art light inner use hope famili

17 cent busi societi dress ask univers often big

18 set servic still fashion attract moment among hand

19 everi health ideal time public made die best

20 may director us classic million charact comfort sell

21 littl book campaign style won ugli also surgeri

22 realli offer cultur soul origin anoth surround compani

23 interest china use run compet view far sing

24 us male illustr got night week earli york

25 give peopl differ food help along ever creat

26 know home brain three might street grace actress

27 garden school tale queen titl humen cours head

28 found open studi win averag spot fit self

29 strong provid celebr grow winner tree brought youth

30 noth month long special industri appear seen without

31 known local messag fbi appreci worth cultur former

32 question help american collect kidnap kind parlour appear

33 near daughter north ultim talent vancouv ladi plastic

34 histori word teen hit definit smile lip tri

35 never pictur media thought sever top clear murder

36 spirit treatment end second wear landscap friend reach

37 tradit cover exhibit truth increas despit gift theron

38 qualiti close sinc white former yet especi drama

39 achiev pretti will comedi true piec describ transform

40 organ around fairi gina figur novel owner launch

Table 1.7c Beauty topics in U.K.

Rank Topic 1 Topic 2 Topic 3 Topic 4 Topic 5 Topic 6 Topic 7 Topic 8 Topic 9 Topic 10

Label "Real Beauty 1" "Beauty service 1" ? "Beauty service 2" ? "Beauty contest 1" ? "Beauty service 3" ? ?

1 like salon one will say girl love women look year

2 thing treatment product fashion can world use just new old

3 real men first bodi make last natur also queen work

4 never hair pound young star top way see day show

5 meni therapi contest back now us home made face model

6 art take time shop get femal makeup think two age

7 centuri even miss celebr best anoth set much spot peopl

8 american skin eye week life month design still good health

9 role offer film blond come put ful long live want

10 surgeri includ well pictur call found book london turn industri

11 kind busi go hollywood help parti hous find got big

12 chang high pageant three friend name claim know idea part

13 mean open around seem play secret sex may magazin night

14 th littl ad appear person win counter tri whose becom

15 featur cosmet everi imag ask colleg mother reveal citi start

16 far perfect give tv although run without spend hand wife

17 modern alway feel ever base dress famili someth yesterday time

18 physic head realli brand heart tip creat yet final male

19 play cours today actress combin student follow colour next recent

20 attract believ compani sinc dark hotel artist four line full

21 ideal facial place black meet irish rather differ came hard

22 becam countri market power obsess pop brain though left studi

23 campaign centr music store right former west deep took increas

24 self need ladi youth true parlour talk mark classic school

25 figur street competit favourit known director husband marri stun mum

26 noth spa uk latest routin expert seen report walk job

27 almost room present british hit charm tradit bring point later

28 despit local launch experi discov number interest sure lot went

29 certain massag lead cream sun style light less co england

30 confid tan sell movi great must quit research children tell

31 grace million enter menag ex gorgeous food decid york won

32 moment rang st talent lip fan cultur welsh public stori

33 right cloth career train produc wear insid fit six near

34 mind care might five everyon beast sexual pretti north often

35 plastic area firm told describ award french success wale leav

36 univers hairdress nation regim intellig front cover great boy paint

37 charact better britain hall within travel class sarah happi brother

38 close nail organ daughter touch date lauder howev saw soon

39 wonder servic fact white red summer usual half kate south

40 death hour intern de english event search lost alreadi past

Table 1.7d Beauty topics in Australia

Rank Topic 1 Topic 2 Topic 3 Topic 4 Topic 5 Topic 6 Topic 7

Label "Beauty Contest 1" "Real Beauty" "Beauty Contest 2" "Beauty Service 1" ? "Beauty Service 2" "Beauty Service 3"

1 miss look year one come product say

2 pageant women new salon work natur hair

3 contest treatment model busi us will therapi

4 australia well old eye thing women make

5 world peopl men star go day makeup

6 univers meni industri time therapi skin first

7 queen health just former need sydney can

8 student life world also art best get

9 hawkin age last write someth see back

10 love film face set anoth way spot

11 australian ad take tri good fashion give

12 girl play show away inner girl cosmet

13 jennif use two blond know even littl

14 gold great top magazin studi design bodi

15 coast young next queen train still colour

16 crown imag now find hand centr nation

17 mari call becom shop feel australian blind

18 win like fashion melbourn menag want rang

19 corbi long women home truth market seem

20 will ful part editor believ seen like

21 bali sinc visit grace won latest guarante

22 celebr hollywood appear big heart featur artist

23 newcastl wonder real month realli skincar complet

24 yesterday role past court may includ keep

25 titl high turn might room care nail

26 bag stori chang love within friend much

27 judg end almost help compani french provid

28 pictur tradit start recent women lot base

29 ms think local award book appreci rather

30 run avail launch screen movi bring around

31 schapell mind week nicol left exhibit sleep

32 competit person talent per ms lifestyl surgeri

33 davi power beach move watch help skin

34 secret live much cate case cream wax

35 found counter male open say buy travel

36 head youth door fashion interest focus sure

37 intern director black cent alway eleg robert

38 organis told side london demend better total

39 charm money profession south pop tell brisban

40 follow american white report import week citi

Table 1.7e Beauty topics in New Zealand

Rank Topic 1 Topic 2

Label "Beauty service" ?

1 therapi say

2 say new

3 salon miss

4 year will

5 contest makeup

6 one work

7 busi pageant

8 look zealand

9 treatment world

10 also natur

11 product women

12 women want

13 old photograph

14 clinic student

15 offer servic

16 school bring

17 massag waikato

18 can hill

19 new use

20 nail home

21 peopl gift

22 industri place

23 make health

24 much like

25 take can

26 client artist

27 open experi

28 hair imag

29 give book

30 day countri

31 technolog fashion

32 tan adam

33 spa ansel

34 style antarctica

35 good last

36 shop great

37 cours week

38 facial face

39 girl young

40 perfect person

We named topics in the following steps. First, examining the top 40 words in each topic across
the 5 countries in Table 1.7, we found many words about (1) the ads themselves (i.e., the Dove
Campaign for Real Beauty), (2) beauty services or products, (3) beauty contests, and (4) movies.
Second, we asked sociology graduate students to pick out, from among the top 40 words, the
words related to each of the 4 topic groups. Third, for each topic, we counted the number of its
top words that matched the words chosen by the evaluators for each of the 4 groups, and named
the topic after one of the 4 topic groups.

In the second step above, for the ad-related words, we collected both (1) content words that are
mentioned in the ads or that describe the ads and (2) words about the advertising messages that
are implied in the social issue advertising. Table 1.8a shows the content words in the Dove
advertising (Figure 1.1a-b). Next, we collected two sets of words about the advertising message
from sociology experts. First, given that Dove aimed to challenge traditional views on beauty
(Dove website 2015), we asked sociology graduate students to choose “social or cultural
change”-related words, such as “change”, “traditional”, “society”, and “culture”. Second, if the
Real Beauty campaign was effective, people might have talked about the opposite side of
physical beauty; “real”, “true”, “mind”, “self”, and “spirit” are examples. Two evaluators
chose words independently. For words on which the evaluators disagreed, we followed the
opinion of a third sociology graduate student. Table 1.8b shows the words chosen by the three
evaluators. 35 social words and 15 opposite words to physical beauty were agreed on among the
independent evaluators. To choose words related to beauty services (or products), beauty
contests, and movies, we also relied on 3 evaluators. Tables 1.8c to 1.8e show the words agreed
on by the evaluators. As we discussed before, several words such as “size”, “skin”, “model”,
“makeup”, “women”, and “female” were indeed included in multiple beauty topic groups by the
evaluators.
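The reconciliation rule for the evaluators' choices can be sketched as follows. This is one possible reading of the procedure described above: a word is kept if the first two evaluators both chose it, and when they disagree the third evaluator's choice decides. The function name `reconcile` and the evaluator picks are hypothetical and do not reproduce the actual evaluations.

```python
def reconcile(candidates, eval1, eval2, eval3):
    """Keep words both evaluators chose; let a third evaluator settle disagreements."""
    chosen = []
    for word in candidates:
        picked_1 = word in eval1
        picked_2 = word in eval2
        if picked_1 and picked_2:
            chosen.append(word)  # the two evaluators agree to include the word
        elif picked_1 != picked_2 and word in eval3:
            chosen.append(word)  # disagreement resolved in favor of inclusion
    return chosen


# Hypothetical picks for a handful of candidate top words.
candidates = ["society", "culture", "salon", "spirit", "pageant"]
eval1 = {"society", "culture", "spirit"}
eval2 = {"society", "salon", "spirit"}
eval3 = {"culture"}

final_words = reconcile(candidates, eval1, eval2, eval3)
print(final_words)
```

Under this reading the rule is equivalent to a majority vote among the three evaluators, with the third evaluator consulted only when needed.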

Table 1.8a Content words in the Dove advertising campaign for Real Beauty

Ad slogan: Dove, ad, campaign, real, beauty

In the ads:
oversized, outstanding, fat, fit, true, squeeze, size, beauty, debate
flat, flattering, sexy, busty, beauty, debate
flawed, flawless, beautiful, skin, only, ever, spotless, beauty, debate
grey, gorgeous, more, women, feel, glad, beauty, debate
ugly, beauty, spots, skin, really, have, flawless, beautiful, beauty, debate
wrinkled, withered, wonderful, will, society, ever, accept, old, beautiful, debate
feature, real, women, curve

Table 1.8b ‘Social change’-related words

"Social or cultural change"-related words

Given as examples: change, traditional, culture, society

Chosen by evaluators: achieve, better, campaign, celebrate, change, culture, depart, differ, exhibit, grow, history, ideal, message, modern, moment, organize, people, popular, power, public, question, right, since, society, strong, think, time, traditional, use, way, will, women, young

Opposite words to physical beauty

Given as examples: inner, mind, real

Chosen by evaluators: brain, character, differ, grow, ideal, inner, kind, mind, quality, real, self, spirit, strong, talent, true

Table 1.8c ‘Beauty service or product’-related words

Given as examples

product, service, shop, sell, makeup, cosmetics, face, body, line

Chosen by evaluators

achieve, age, appear, better, blonde, body, care, classic, colour, cosmetics, cream, dove, elegant, eye, face, facial, feature, feel, female, feminine, figure, fit, girl, gorgeous, great, hair, hairdress, head, health, ideal, image, industry, inner, light, line, lip, magazine, makeover, makeup, market, model, modern, nail, perfect, physics, plastic, popular, pretty, product, quality, real, reveal, salon, sell, service, shop, skin, skincare, smile, spa, special, strength, strong, style, success, tan, therapy, touch, train, travel, ugly, use, wax, wear, worth, young, youth, true

Table 1.8d ‘Beauty contest’-related words

Given as examples

face, body, line, contest, queen, pageant, crown, competition, miss, universe

Chosen by evaluators

achieve, actress, america, american, award, best, body, character, charm, compete,

competition, confidence, contest, crown, event, face, fashion, female, figure,

gorgeous, hair, ideal, judge, lady, line, makeover, makeup, miss, model, pageant,

pretty, queen, sing, smile, talent, title, universe, wear, win, winner, women, won,

world, young, youth

Table 1.8e ‘Movie’-related words

Given as examples

film, play, role, actor, actress, star, beast

Chosen by evaluators

actor, actress, agent, america, american, appear, art, artist, award, beast, career,

character, classic, comedy, director, drama, editor, end, film, hollywood, image,

industry, media, movie, mystery, picture, play, role, scene, screen, stage, star,

talent, watch

Table 1.8f There are 1 or 2 real-beauty-related topics in each country except New Zealand.

The U.S.
                            T1  T2  T3  T4  T5  T6  T7  T8  T9 T10
Real Beauty                  0  13   4   5  12   2   2   2   4   0
Beauty Product or Service    6  10   2  10   8   4   8   5   4   4
Beauty Contest               1   8   1   3   4   5   4   4   3   4
Movie                        4   1   0   2   3   2   5   3   5   1

Canada
                            T1  T2  T3  T4  T5  T6  T7  T8
Real Beauty                 11   3  19   2   6   7   4   1
Beauty Product or Service    4   9  11   5   8   8   5   7
Beauty Contest               1   5   6   3  13   3   3   5
Movie                        0   3   5   6   3   2   0   9

The U.K.
                            T1  T2  T3  T4  T5  T6  T7  T8  T9 T10
Real Beauty                 13   2   5   7   2   1   4   4   2   3
Beauty Product or Service    7  15   4   9   4   6   4   7   4   4
Beauty Contest               6   1   5   6   1   8   1   2   3   2
Movie                        5   0   5   7   2   3   1   0   1   1

Australia
                            T1  T2  T3  T4  T5  T6  T7
Real Beauty                  2  10   5   1   4   7   2
Beauty Product or Service    2   7   5   5   4   9  10
Beauty Contest              12   4   6   3   2   3   3
Movie                        1   8   4   4   3   1   1

New Zealand
                            T1  T2
Real Beauty                  3   3
Beauty Product or Service   12   8
Beauty Contest               3   8
Movie                        1   2

T represents a topic. For each topic, we counted the number of its top words that match the words of each of
the 4 topic groups.

Table 1.8f shows the result of naming topics in the third step above. We find 1 or 2
real-beauty-related topics in each country except New Zealand. Beauty products or services are
the most popular beauty topics across the 5 analyzed countries. Beauty contest topics are also
detected in every country except New Zealand. The U.S. and Canada also have movie topics. Note
that we did not label a topic if its most related topic group had few matched words (i.e., at most
4 words, which is 10% of the 40 top words in each topic) or if multiple topic groups tied for the
largest number of matched words.
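The labeling rule in the note above can be written as a short function and checked against the U.S. panel of Table 1.8f. This is a sketch under the stated reading of the rule (a topic stays unlabeled when its best group has at most 4 matched words or when groups tie for the maximum); the names `label_topic` and `min_matches` are hypothetical, and the counts are copied from the table.

```python
GROUPS = ["Real Beauty", "Beauty Product or Service", "Beauty Contest", "Movie"]


def label_topic(matches, min_matches=5):
    """Label a topic from its matched-word counts, ordered as in GROUPS.

    Returns None when the best group has fewer than min_matches matched words
    (i.e., at most 4) or when several groups tie for the largest count.
    """
    top = max(matches)
    if top < min_matches or matches.count(top) > 1:
        return None
    return GROUPS[matches.index(top)]


# Matched-word counts for U.S. topics T1..T10, one row per topic group (Table 1.8f).
us_rows = [
    [0, 13, 4, 5, 12, 2, 2, 2, 4, 0],   # Real Beauty
    [6, 10, 2, 10, 8, 4, 8, 5, 4, 4],   # Beauty Product or Service
    [1, 8, 1, 3, 4, 5, 4, 4, 3, 4],     # Beauty Contest
    [4, 1, 0, 2, 3, 2, 5, 3, 5, 1],     # Movie
]

# Transpose the rows so that each topic gets its own list of four counts.
us_labels = {
    topic: label_topic(list(counts))
    for topic, counts in enumerate(zip(*us_rows), start=1)
}
for topic, label in us_labels.items():
    print(topic, label)
```

Applied to the U.S. counts, the rule leaves topics 3 and 10 unlabeled and labels topics 2 and 5 as real beauty, which matches the labels shown in Table 1.7a.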

To validate the selected real-beauty-related topics, we checked whether they are driven by news
content reporting either the Dove campaign itself or the underlying ad messages of the
campaign. Topics about the Dove campaign itself are likely to have many words from (1) the
campaign slogan and (2) the ad content in Table 1.8a. On the other hand, if newspapers also
discussed real beauty beyond the campaign itself, some real-beauty-related topics may use the
‘social change’-related words in Table 1.8b more often than other topics.

Revisiting Tables 1.7a-c for the U.S., Canada, and the U.K., we find that all 5 topics identified
as real-beauty-related topics in the treated countries are validated by words related either
directly to the ads or to the implied ad messages. For the U.S. in Table 1.7a, in topic 2, labeled
as real beauty, the campaign slogan words rank high: “real”, “dove”, and “campaign” rank 13th,
45th, and 49th. These are the highest ranks across the 10 topics in the U.S. “Old”, “real”,
“featur(e)”, “size”, and “true” are content words in the ads. “Cultur(e)”, “power”, “ideal”,
“tradit(ional)”, “organ(ize)”, “grow”, and “popular” are words about social or cultural change.
There are also opposite words to physical beauty: “real”, “ideal”, “talent”, “grow”, and “true”.

Next, topic 5 seems to talk about helping young women. “Help”, “young”, “women”, “femal(es)”,
“feel”, “free”, “better”, “differ(ent)”, and “societ(y)” are such keywords. With an additional
search about the Dove campaign, we realized that the campaign also aimed to help young girls
have confidence in their beauty (Dove website 2016). Note that we did not intend to include
words related to this goal in detecting real-beauty-related topics. However, we could identify
this topic because most of those words are related to social or cultural change in Table 1.8b.
There are also opposite words to physical beauty: “differ(ent)”, “mind”, and “self”. “Ad” ranks
high, at 8th, suggesting that topic 5 is also somewhat related to the ads.

However, note that “dove” and “campaign” are ranked much lower in topic 5 (571st and 527th)
than in topic 2 (45th and 49th), although “ad” ranks 8th in topic 5. This result suggests that
sentences related to real beauty in topic 5 are much less likely than those in topic 2 to come
from articles that reported on the Dove campaign itself. We will discuss this point below in
section 4.4 with stronger evidence.

In Canada in Table 1.7b, topics 1 and 3 are identified as real-beauty-related topics. In topic 3, the
campaign slogan words rank the highest among the 8 topics: “real”, “dove”, “ad”, and
“campaign” rank 5th, 8th, 11th, and 21st, suggesting that topic 3 is the most related to the Dove

campaign in Canada. As a result, content words in the ads, which are “featur(e)”, “celebr(ate)”,
“real”, “women”, and “size”, also rank relatively high. There are also many words related to
social and cultural change: “women”, “time”, “think”, “societ(y)”, “ideal”, “campaign”,
“cultur(e)”, “differ(ent)”, and “messag(e)” are examples. “Real”, “ideal”, “differ(ent)”, and
“brain” are opposite words to physical beauty.

As in the U.S., in Canada “dove” and “campaign” are included in the top 40 words in
topic 3 but not in topic 1. No ad content word is included in the top 40 words in topic 1 except
“reall(y)”. Nevertheless, there are many words related to social change: “peopl(e)”, “way”,
“strong”, “question”, “histor(y)”, “tradit(ional)”, “achiev(e)”, and “organ(ize)”. There are also
several opposite words to physical beauty: “strong”, “spirit”, and “qualit(y)”. Given that “look”
ranks first, topic 1 seems to “question” the “tradit(ional)” “way” or “physic(al)” “look”. As
discussed above, topic 1 is not likely to talk about the Dove campaign itself.

In the U.K., topic 1 is the only real-beauty-related topic. The campaign title words “real”,
“campaign”, and “dove” rank 3rd, 23rd, and 43rd. “Featur(e)” and “wonder(ful)” are ad content
words. “Chang(e)”, “modern”, “ideal”, “campaign”, “moment”, and “right” are words related to
social or cultural change. There are also several opposite words to physical beauty: “real”,
“kind”, “ideal”, “self”, “mind”, and “charact(er)”. Considering that “surger(y)”, “physic(al)”,
and “plastic” rank 10th, 18th, and 35th, topic 1 seems to “featur(e)” “real” beauty to “chang(e)”
“modern” “physic(al)” “thing(s)” “like” “plastic” “surger(y)”.

Among other beauty topics, beauty services (or products) were the most popular during the
analyzed periods in all 3 treated countries in terms of the number of topics. There are 4
beauty service (or product) topics in the U.S. Topic 1 talks about “cosmet(ics)”, “market”,
“surger(y)”, “lauder”, “perfect”, “youth”, “plastic”, “servic(e)”, and “brand”. Topic 4 has “look”,
“makeup”, “health”, “inner”, “ugl(y)”, “special”, “fit”, “strength”, “eleg(ant)”, “smile”, and
“travel”. Topic 7 seems to talk about how to “sell” or “use” a beauty “product” for “light”,
“eye(s)”, “bod(y)”, “hair”, “lip”, or “head”. Lastly, topic 8 seems to deliver a similar theme with
words such as “natur(al)”, “great”, “care”, or “cream” for “hair” or “line”.

In Canada, 3 topics are full of words related to beauty services (or products) in beauty salons or
shops, including “hair”, “colour”, “salon”, “great”, “magazin(e)”, “makeup”, “fashion”,
“servic(e)”, “health”, “treatment”, and “prett(y)” in topic 2; “shop”, “feel”, “made”, “ugl(y)”,
“spot”, “appear”, “worth”, and “smile” in topic 6; and “natur(al)”, “young”, “eye”, “blond(e)”,
“fit”, and “lip” in topic 7.

The U.K. also has a large collection of such words: “salon”, “treatment”, “hair”, “therap(y)”,
“skin”, “cosmet(ics)”, “perfect”, “head”, “facial”, “spa”, “massag(e)”, “tan”, “care”, “hairdress”,
“better”, “nail”, and “servic(e)” in topic 2; “fashion”, “bod(y)”, “young”, “shop”, “blond(e)”,
“appear”, “imag(e)”, “youth”, “cream”, and “brand” in topic 4; and “made”, “reveal”, “colour”,
“fit”, “prett(y)”, “success”, and “great” in topic 8.

Next, beauty-contest-related topics existed in all the treated countries: “pageant”, “world”,

“photo”, “photograph”, “event”, "univers(e)", “crown” in topic 6 in the U.S.; "women",

“pageant”, “contest”, "miss", "bod(y)", “queen”, "won", "compet(e)", "titl(e)", "winner", "talent",

"wear", and "figur(e)" in topic 5 in Canada; and "world", “top”, "femal(e)", "win", “dress”,

"charm", "gorgeous", "wear", "award", and "event" in topic 6 in the U.K.

Lastly, we find that movies were also a frequently reported beauty topic in newspapers, perhaps

due to the connection with the beauty of actresses or movies on beauty (e.g., Beauty Shop). In

particular, the movie Beauty Shop was released on March 24, 2005. Topic 9 in the U.S. has

“play”, “style”, “charact(ers)”, “perform”, “actress”, and “classic”. Popular movie-related words

in Canada were “play”, “music”, “movie”, “agent”, “stor(y)”, “classic” and “comed(y)” in topic

4 and “product”, “film”, “canada”, “line”, “role”, “star”, “toronto”, “market”, “sing”, “actress”,

“drama” and “launch” in topic 8.

Now, we turn to topics in the control countries in Tables 1.7d and 1.7e for Australia and New Zealand, respectively. Only topic 2 in Australia is selected as a real-beauty-related topic. The words chosen as social-change-related words are “women”, “peopl(e)”, “use”, “young”, “sinc(e)”, “tradit(ional)”, “think”, and “power”. “Mind” is a word opposite to physical beauty. Although “ad” ranks 11th, “dove” and “campaign” are not ranked in the top 40. Overall, this result suggests that most sentences related to real beauty in topic 2 are not likely to come from


the articles that reported on the Dove campaign itself. It makes sense because the Dove campaign

was not launched in Australia during the analyzed periods.

As in the treated countries, beauty services (or products) were the most popular topics in both control countries. Three topics in Australia are about beauty services (or products), as evidenced by “salon”, “eye”, “blond(e)”, “magazin(e)”, “shop”, and “fashion” in topic 4; “product”, “natur(al)”, “fashion”, “skin”, “girl”, “market”, “featur(e)”, “skincar(e)”, “care”, “cream”, “eleg(ant)”, and “better” in topic 6; and “hair”, “therap(y)”, “makeup”, “spot”, “cosmet(ics)”, “bod(y)”, “colour”, “nail”, “surger(y)”, “skin”, “wax”, and “travel” in topic 7. Similarly, topic 1 in New Zealand is full of beauty service (or product) words such as “therap(y)”, “salon”, “treatment”, “product”, “clinic”, “massag(e)”, “nail”, “industr(y)”, “hair”, “tan”, “spa”, “style”, “shop”, “facial”, “girl”, and “perfect”.

Beauty-contest-related topics were also popular in the control countries. In Australia, such words are “miss”, “pageant”, “contest”, “Australia”, “world”, “univers(e)”, “queen”, “crown”, “win”, “titl(e)”, “judg(e)”, “competit(ion)”, and “charm” in topic 1; and “model”, “world”, “face”, “show”, “fashion”, “women”, and “talent” in topic 3. Topic 2 in New Zealand has the same number (8) of words about beauty services and about beauty contests. This makes sense because the two topic groups are highly associated with each other. We therefore did not label topic 2 in New Zealand.

In summary, we identified real-beauty-related topics during the analyzed periods. All the themes

discovered in the topics across the three countries are very similar to each other and to the

message of the Dove campaign.

Now, we are ready to categorize beauty sentences into the most relevant topics. We assigned beauty sentence $x_i$ to topic $k$ if $\omega_{ik} = \max(\omega_{i1}, \ldots, \omega_{iK})$. Then, we counted the number of sentences in each topic. Alternatively, one could use the topic weight $\omega_{ik}$ itself without categorizing each sentence into a particular topic group. Suppose that there are two topics and that one sentence has topic weights of 90% and 10% on topics 1 and 2, respectively. At first glance, it seems natural to use the weights (90% and 10%) as the contributions of the topics to the sentence. This is likely to be appropriate when the unit of analysis is a newspaper article. However, a single sentence is likely to talk about one topic (Büschken & Allenby 2016). In fact, even when a sentence is almost entirely about topic 1, its topic weight rarely reaches 100% because some words are common across topics. This is


why we categorized sentences into the topic group with the largest topic weight. We then show the result using topic weights as a robustness check.
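The two counting rules described above can be contrasted in a minimal Python sketch. The topic weights below are hypothetical and the function names are ours, chosen for illustration; this is not the code used in the thesis.

```python
from collections import Counter


def hard_counts(weights):
    """Count sentences per topic, assigning each sentence to its largest-weight topic."""
    counts = Counter()
    for w in weights:
        # argmax over topics; ties go to the lowest topic index
        k = max(range(len(w)), key=lambda j: w[j])
        counts[k] += 1
    return [counts.get(k, 0) for k in range(len(weights[0]))]


def soft_counts(weights):
    """Sum the topic weights over sentences (the robustness-check alternative)."""
    n_topics = len(weights[0])
    return [sum(w[k] for w in weights) for k in range(n_topics)]


# Hypothetical topic weights for three sentences and two topics.
omega = [[0.90, 0.10], [0.60, 0.40], [0.55, 0.45]]
hard = hard_counts(omega)
soft = soft_counts(omega)
print(hard)
print([round(s, 2) for s in soft])
```

In this toy example every sentence leans toward topic 1, so the hard rule gives topic 2 no sentences, whereas the weight-based rule still credits topic 2 with a fractional share.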

Figure 1.3 Trend of beauty topics

Figure 1.3a (panels: U.S., Canada) The number of sentences labeled as real beauty topics increases relative to the number labeled as other beauty topics with the Dove campaigns for Real Beauty in the U.S. and Canada.

Figure 1.3b There is no systematic pattern in either (1) the number of sentences labeled as real beauty topics or (2) the number labeled as other beauty topics during the Dove campaigns in Australia.

The dependent variable is the number of sentences per topic.
Campaign periods: U.S. (first: Sep. and Oct. 2004; second: July and Aug. 2005), Canada (first: Sep. and Oct. 2004; second: June and July 2005)

Figure 1.3a compares the monthly trends of real beauty vs. other beauty topics in the U.S. and Canada. There are three clear patterns. First, during both the first and the


second campaigns in the U.S. and Canada, real-beauty-related sentences increase while other beauty sentences show a decreasing or relatively stable pattern.

Second, after the campaign, the number of real-beauty-related sentences quickly decreases. This pattern is expected since newspapers are always looking for “news” and thus move on to other topics once the event finishes.

Third, there are also jumps in real-beauty-related topics before the second campaign. This is likely due to the launch of a new movie: before the second campaign, the movie “Beauty Shop” was released in April 2005. Similarly, before the first campaign, several movies about girls (“Mean Girls” and “The Girl Next Door”) were released in April 2004. Because the release of new movies seems to increase several beauty topics, including real-beauty-related topics, there might be other events that increase both real beauty and other beauty topics. This pattern suggests that we need to have other beauty topics as controls.

On the other hand, there is no systematic trend in real vs. other beauty topics in Australia, as seen in Figure 1.3b.

1.4.3 Testing

Next, by exploiting variation in the timing of the campaigns across countries, we test whether the number of sentences labeled as real beauty topics in the previous section increases during the month(s) of the real beauty campaign compared to the number categorized as non-“real beauty” topics across countries. We start with a difference-in-differences using only data from the treated countries, which had Dove campaigns during the analyzed periods. We then add a third difference between the treated countries and the control countries. The two real beauty topics in each of the U.S. and Canada are aggregated into one topic because we measure the aggregate impact of advertising on the real-beauty-related topics. Because inter-country differences in the words used to define each topic led us to extract country-specific topics rather than topics common to all the analyzed countries, we allow for country-topic and time fixed effects.


The number of sentences labeled as topic $k$ in country $c$ in year-month $t$ is

$$\text{No. of Beauty Sentence}_{kct} = \beta \, \text{Real Beauty}_{kc} \times \text{During Campaign}_{ct} + \mu_{kc} + \tau_t + \varepsilon_{kct} \qquad (2)$$

where

- $\beta$ captures the core effect in this paper: the impact of advertising on real-beauty-related topics compared to all the other topics;
- $\text{Real Beauty}_{kc} = 1$ if $k$ is a real-beauty-related topic in country $c$, 0 otherwise;
- $\text{During Campaign}_{ct} = 1$ if $t$ is a campaign period in country $c$, 0 otherwise;
- $\mu_{kc}$ is a country-topic specific fixed effect that captures differences in the number of sentences across countries and topics;
- $\tau_t$ is a year-month specific fixed effect;
- $\varepsilon_{kct}$ is the error term.

Equation (2) is estimated by OLS. Our identification assumption is that no systematic factor drives the firm’s decision on campaign timing and location so as to coincide with media coverage of real beauty, and that no omitted variable drives real beauty reporting in the U.S., Canada, and the U.K. during the campaigns. Heteroskedasticity-robust standard errors are clustered at the country-topic level to adjust for correlation within each country’s topic across the analyzed months.
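To make the role of the two sets of fixed effects in Equation (2) concrete, the following Python sketch recovers the interaction coefficient on a small synthetic panel. It is an illustration under stated assumptions, not the estimation code behind the reported tables: the panel is balanced and noise-free, all numbers are made up, and the within (two-way demeaning) transformation is used in place of explicit dummies, which yields the same slope as dummy-variable OLS only for a balanced panel. Clustered standard errors are not computed here.

```python
def two_way_demean(panel):
    """Subtract unit and period means and add back the grand mean (balanced panel)."""
    n, T = len(panel), len(panel[0])
    unit_mean = [sum(row) / T for row in panel]
    period_mean = [sum(panel[i][t] for i in range(n)) / n for t in range(T)]
    grand_mean = sum(unit_mean) / n
    return [
        [panel[i][t] - unit_mean[i] - period_mean[t] + grand_mean for t in range(T)]
        for i in range(n)
    ]


def did_beta(y, d):
    """Slope of y on d after removing country-topic and year-month effects."""
    y_dm, d_dm = two_way_demean(y), two_way_demean(d)
    num = sum(a * b for ry, rd in zip(y_dm, d_dm) for a, b in zip(ry, rd))
    den = sum(b * b for rd in d_dm for b in rd)
    return num / den


# Synthetic, noise-free panel; every value below is hypothetical.
units = [("US", True), ("US", False), ("CA", True), ("CA", False)]  # (country, is_real_beauty)
campaign = {"US": {2}, "CA": {4}}      # hypothetical campaign months per country
T = 6
mu = [10.0, 40.0, 5.0, 25.0]           # country-topic fixed effects
tau = [0.0, 1.0, -2.0, 3.0, 0.5, 1.5]  # year-month fixed effects
true_beta = 30.0

# d is the interaction Real Beauty x During Campaign for each unit and month.
d = [[1.0 if real and t in campaign[c] else 0.0 for t in range(T)] for c, real in units]
y = [[mu[i] + tau[t] + true_beta * d[i][t] for t in range(T)] for i in range(len(units))]

beta_hat = did_beta(y, d)
print(round(beta_hat, 6))
```

Because the synthetic outcome is exactly additive in the fixed effects, the demeaning removes them completely and the sketch returns the assumed coefficient; with real data the estimate would of course carry sampling error.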

Table 1.9a shows our main result for Equation (2). The coefficient for the interaction effect is significantly positive, suggesting that, relative to non-campaign periods, coverage of real-beauty-related topics in newspapers during the campaign exceeds that of the other beauty topics by about 33 sentences.

Column (1) uses only real-beauty-related topics in the treated countries. Note that the Dove campaigns did not all run at the same time across the treated countries. Thus, the real-beauty topics in the other treated countries serve as controls for those in the focal country. Columns (2) through (5) add other-beauty-related topics as controls. Columns (2), (3), and (4) are for Canada, the U.S., and the U.K., respectively. As discussed above with


respect to Figure 1.3a, other events might increase both real beauty and other beauty topics. Thus, we control for other beauty topics. Furthermore, because we want to test whether the Dove campaign increased discussion of real beauty compared to the media’s conventional beauty topics, such as beauty services or beauty contests, controlling for other beauty topics makes the difference-in-differences estimate the change in real-beauty topics relative to other beauty topics. Column (5) uses all the data across the three treated countries. Across all five columns, the coefficients for the interaction effect are significantly positive, suggesting higher coverage of real-beauty-related topics in newspapers during the campaign than in non-campaign periods, compared to real-beauty-related topics in the other countries in Column (1) or to the other beauty topics within each country in Columns (2) to (4). As discussed above, as a robustness check we also counted sentences using topic weights without assigning each sentence to a particular topic group. Table 1.A1 shows that the main result still holds.

Table 1.9a The number of sentences labeled as real beauty topics increases relative to the number labeled as other beauty topics in the treated countries during the month(s) of the real beauty campaign.

                                 Only           Adding other beauty topics
                                 Real Beauty    Canada      U.S.        U.K.        All
                                 (1)            (2)         (3)         (4)         (5)
Real Beauty x During Campaign    29.7***        38.7***     23.9***     40.9***     33.2***
                                 (3.35)         (2.80)      (1.53)      (1.92)      (5.20)
Country-Topic Dummies            2              7           9           10          25
Year-Month Dummies               23             23          23          23          23
R-sq                             0.775          0.720       0.790       0.615       0.671
Observations                     72             168         216         240         624

The dependent variable is the number of sentences in each country-topic, which is the unit of analysis.
The U.S., Canada, and the U.K. have 10(2), 8(2), and 10(1) topics (real beauty ones), respectively.
Two real beauty topics in each country are aggregated into one topic.
Column (1) uses only real-beauty-related topics in the treated countries.
Columns (2) through (5) add other-beauty-related topics.
Columns (2), (3), and (4) use data from Canada, the U.S., and the U.K., respectively.
Column (5) uses all the data across the three treated countries.
Robust standard errors are clustered at the country-topic level. ***p < 0.01

As a falsification test, Table 1.9b uses the same Equation (2) but only data from the control countries. The effect is insignificant, suggesting that there is no comparable increase in


newspaper coverage of real-beauty-related topics in the control countries, as seen in Figure 1.3b.

This is expected because the campaign started later in those two countries.

Table 1.9b Falsification test: The number of sentences labeled as real beauty topics does not increase relative to the number labeled as other beauty topics in the control countries during the month(s) of the real beauty campaign.

Real Beauty x During Campaign    -0.60
                                 (0.85)
Country-Topic Dummies            8
Year-Month Dummies               23
R-sq                             0.502
Observations                     216

The dependent variable is the number of sentences in each country-topic, which is the unit of analysis.
Australia and New Zealand have 7(1) and 2(0) topics (real beauty ones), respectively.

Recall that the identifying assumption for Table 1.9a is that no factor unrelated to the Dove campaigns affected newspaper coverage of real beauty. However, there could be time-varying global interest in beauty as a social issue. To rule out this potential explanation, we compare the change in the treated countries to that in the control countries during and one month after the campaigns as follows.

$$
\begin{aligned}
\text{No. of Beauty Sentence}_{kct} ={}& \beta_1 \, \text{Real Beauty}_{kc} \times \text{During Campaign}_{ct} \times \text{Treated}_{c} \\
&+ \beta_2 \, \text{Real Beauty}_{kc} \times \text{One month after Campaign}_{ct} \times \text{Treated}_{c} \\
&+ \beta_3 \, \text{Real Beauty}_{kc} \times \text{During Campaign}_{t} \\
&+ \beta_4 \, \text{Real Beauty}_{kc} \times \text{One month after Campaign}_{t} + \mu_{kc} + \tau_t + \varepsilon_{kct} \qquad (3)
\end{aligned}
$$

where

- $\beta_1$ and $\beta_2$ capture the core effects in this paper: the impact of advertising on real-beauty-related topics compared to all the other topics in treated countries relative to control countries during and one month after the campaigns, respectively;
- $\text{Treated}_{c} = 1$ if the campaign was launched during the analyzed periods in country $c$, 0 otherwise;
- $\text{Real Beauty}_{kc} = 1$ if $k$ is a real-beauty-related topic in country $c$, 0 otherwise;


- $\text{During Campaign}_{ct} = 1$ if $t$ is a campaign period in country $c$, 0 otherwise;
- $\text{One month after Campaign}_{ct} = 1$ if $t$ is one month after the campaign in country $c$, 0 otherwise;
- $\text{During Campaign}_{t} = 1$ if $t$ is a campaign period in any treated country, 0 otherwise;
- $\text{One month after Campaign}_{t} = 1$ if $t$ is one month after the campaign in any treated country, 0 otherwise;
- $\mu_{kc}$ is a country-topic specific fixed effect that captures differences in the number of sentences across countries and topics;
- $\tau_t$ is a year-month specific fixed effect;
- $\varepsilon_{kct}$ is the error term.

This “differences-in-differences-in-differences” specification combines the insights of Tables 1.9a and 1.9b. With coefficients $\beta_3$ and $\beta_4$, we use the estimated effect on real-beauty-related topics in the control countries to account for changes over time (during and one month after the campaign) in the baseline of media coverage of real-beauty-related topics.
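As a sketch of how the four interaction regressors in Equation (3) could be constructed for one country-topic-month observation, consider the Python fragment below. The campaign months are taken from the table notes in this chapter; the rule for “one month after” (the calendar month directly following a campaign window) and all function and key names are our assumptions for illustration.

```python
# Campaign months as (year, month), as listed in the table notes of this chapter.
CAMPAIGN = {
    "US": {(2004, 9), (2004, 10), (2005, 7), (2005, 8)},
    "Canada": {(2004, 9), (2004, 10), (2005, 6), (2005, 7)},
    "UK": {(2005, 1)},
}


def next_month(ym):
    year, month = ym
    return (year + 1, 1) if month == 12 else (year, month + 1)


def one_month_after(months):
    """Months directly following a campaign window (assumed definition)."""
    return {next_month(ym) for ym in months} - months


AFTER = {c: one_month_after(ms) for c, ms in CAMPAIGN.items()}
ANY_DURING = set().union(*CAMPAIGN.values())  # campaign month in any treated country
ANY_AFTER = set().union(*AFTER.values())      # post-campaign month in any treated country


def ddd_regressors(is_real, country, ym):
    """Return the four interaction terms of Equation (3) for one observation."""
    treated = country in CAMPAIGN  # control countries have no campaign entry
    rb = int(is_real)
    during_ct = int(treated and ym in CAMPAIGN[country])
    after_ct = int(treated and ym in AFTER[country])
    return {
        "rb_during_treated": rb * during_ct,       # multiplies beta_1
        "rb_after_treated": rb * after_ct,         # multiplies beta_2
        "rb_during_any": rb * int(ym in ANY_DURING),  # multiplies beta_3
        "rb_after_any": rb * int(ym in ANY_AFTER),    # multiplies beta_4
    }


us_row = ddd_regressors(True, "US", (2005, 7))
au_row = ddd_regressors(True, "Australia", (2005, 7))
print(us_row)
print(au_row)
```

For the same month, a real beauty topic in a control country switches on only the baseline term, which is what lets $\beta_3$ and $\beta_4$ absorb common time variation in real beauty coverage.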

Table 1.9c shows our main results, building up to the full specification for Equation (3) in Column (2). Column (1) reflects the results of Tables 1.9a and 1.9b by comparing the ad effects on real-beauty-related topics between the treated and the control countries only during the campaigns. The key coefficient of interest, on $\text{Real Beauty}_{kc} \times \text{During Campaign}_{ct} \times \text{Treated}_{c}$, is positive and significant. The effect size is similar to that in Table 1.9a. It suggests that newspapers in the U.S., Canada, and the U.K. report about 70% more on real-beauty-related topics during the campaign, compared to the other beauty topics and relative to the control countries. The coefficient for $\text{Real Beauty}_{kc} \times \text{During Campaign}_{t}$ is insignificant, suggesting that there is no systematic change during the campaign in media coverage of real-beauty-related topics in the control countries.

Column (2) adds terms to capture the ad effect one month after the campaigns. The coefficient for $\text{Real Beauty}_{kc} \times \text{One month after Campaign}_{ct} \times \text{Treated}_{c}$ is positive and significant, although the effect size is much smaller than that for “during the campaigns”. The corresponding baseline coefficient for $\text{Real Beauty}_{kc} \times \text{One month after Campaign}_{t}$ is significantly negative. Both


coefficients suggest that newspapers showed much less interest in real-beauty-related topics one month after the campaigns than during the campaigns across the treated and the control countries. However, one month after the campaigns, newspapers in the campaign countries still talk more about real beauty compared to the other beauty topics, relative to those in the control countries. We also find that the ad effect was almost gone from the second month after the campaigns; see Table 1.A2 in the appendix for the result. In summary, newspapers in our data wrote more about real beauty during and one month after the campaigns.

Table 1.9c The number of sentences labeled as real beauty topics increases relative to the number labeled as other beauty topics in the treated countries relative to control countries during and one month after the Real Beauty campaign.

                                   (1)            (2)
                                   Only During    During & After
Real Beauty x Treated Countries
  x During Campaign                32.90***       33.68***
                                   (5.106)        (4.956)
  x One Month After Campaign                      5.157***
                                                  (1.965)
Real Beauty
  x During Campaign                0.066          -0.606
                                   (0.647)        (0.735)
  x One Month After Campaign                      -4.047***
                                                  (0.901)
Country-specific topic dummies     34             34
Year-month dummies                 23             23
R-sq                               0.738          0.738
Observations                       840            840

The dependent variable is the number of sentences in each country-topic, which is the unit of analysis.
Treated countries, the U.S., Canada, and the U.K., have 10(2), 8(2), and 10(1) topics (real beauty ones), respectively. Two real beauty topics in each country are aggregated into one topic.
Control countries, Australia and New Zealand, have 7(1) and 2(0) topics (real beauty ones), respectively.
Robust standard errors are clustered at the country-topic level. ***p < 0.01


1.4.4 Mechanism

1.4.4.1 How does the advertising message affect the content of a newspaper?

In the previous section, we showed that real-beauty-related content increased with Dove’s real

beauty advertising campaign. Next, we explore how the advertising message affects newspaper

content. Potential scenarios are as follows. First, newspapers may have reported the Dove

advertising campaign itself and not altered content in any other meaningful way. Second,

newspapers may have discussed real beauty in articles that are not directly related to the Dove

campaign. Third, newspapers may have talked more about real beauty while referencing the

Dove campaign within the same article.

Access to the full newspaper text (title and body) allowed us to test whether real-beauty-related topics exist even in non-Dove articles. Among the beauty articles we downloaded, we used the keyword “Dove” to identify the articles related to the Dove campaign. After removing either all the sentences within those articles or only the sentences that mentioned “Dove,” we repeated the test from the previous section to see whether beauty sentences labeled as real-beauty-related topics increased with the Dove campaign.
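The filtering rules described above can be sketched as follows. This is a hypothetical illustration: the article structure, field names, and toy texts are ours, and plain case-insensitive substring matching on “dove” is an assumption that would, for example, also match unrelated uses of the word.

```python
def keep_sentences(articles, rule):
    """Return the beauty sentences that survive one of the Dove filters.

    rule: "none" (keep everything); "title", "abstract", or "anywhere"
    (drop whole articles mentioning Dove in that place); or "sentence"
    (drop only the sentences that mention Dove).
    """
    kept = []
    for art in articles:
        in_title = "dove" in art["title"].lower()
        in_abstract = "dove" in art["abstract"].lower()
        in_body = any("dove" in s.lower() for s in art["sentences"])
        if rule == "title" and in_title:
            continue
        if rule == "abstract" and in_abstract:
            continue
        if rule == "anywhere" and (in_title or in_abstract or in_body):
            continue
        for s in art["sentences"]:
            if rule == "sentence" and "dove" in s.lower():
                continue
            kept.append(s)
    return kept


# Toy articles, invented for illustration only.
articles = [
    {"title": "Dove launches real beauty ads",
     "abstract": "Dove's campaign for real beauty",
     "sentences": ["Dove shows real women in its ads.", "Beauty comes in many shapes."]},
    {"title": "Rethinking beauty",
     "abstract": "A debate on self-esteem",
     "sentences": ["Real beauty is about confidence.", "The Dove campaign sparked the debate."]},
    {"title": "Salon trends",
     "abstract": "New hair colours",
     "sentences": ["The salon offers beauty treatments."]},
]

rules = ("none", "title", "abstract", "anywhere", "sentence")
counts = {rule: len(keep_sentences(articles, rule)) for rule in rules}
print(counts)
```

The “anywhere” rule discards every sentence of any article that mentions Dove even once, which is why it is the most conservative of the filters, while the “sentence” rule keeps the surrounding discussion in those articles.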

Columns (1) to (5) in Table 1.10 show the results for Equation (3) using the new data after deleting Dove sentences. Column (1) is the base result from all of the original beauty sentences and is identical to Column (2) in Table 1.9c. Columns (2), (3), and (4) delete all the beauty sentences within the articles that include “Dove” in the title, in the abstract, and anywhere (title, abstract, or body), respectively. While some sentences in these articles talk about the Dove campaign, others may discuss real beauty without mentioning “Dove”. Thus, Column (4) is a very conservative test. We focus on the interpretation of $\text{Real Beauty}_{kc} \times \text{During Campaign}_{ct} \times \text{Treated}_{c}$ in the first row because the rest of the coefficients show patterns similar to those in Table 1.9c. The effect sizes in Columns (2), (3), and (4) are about 18%, 61%, and 73% smaller, respectively, than the base effect in Column (1). This suggests two things. First, many newspaper articles indeed reported on the Dove campaign itself, although the extent varied: while some articles whose title mentioned Dove mainly talked about the Dove campaign, others mentioned Dove just once in their body. More importantly, the coefficient in Column (4) is still significant, suggesting that some newspapers discussed real beauty even in non-Dove articles. Overall, these results


support the second rather than the first scenario. In other words, this is evidence that the Dove advertising message affected how newspapers write about beauty.

On the other hand, Column (5) deletes only the “Dove” sentences. Its effect size is about 30% smaller than the base effect but much larger than that of Column (4). This result supports the third scenario and suggests the possibility that newspapers talked more about real beauty in connection with the Dove campaign within the same article. This type of newspaper article may have been the format desired by marketing or PR managers. By reading such articles, consumers may have associated the brand (e.g., Dove) with its desired image (e.g., real beauty, self-esteem).
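The relative reductions discussed above follow directly from the first row of Table 1.10. The short Python sketch below recomputes them from the reported coefficients; the dictionary labels are ours.

```python
# Coefficients on Real Beauty x Treated x During Campaign, copied from Table 1.10.
coef = {
    "(1) all beauty sentences": 33.68,
    "(2) drop articles with Dove in title": 27.51,
    "(3) drop articles with Dove in abstract": 13.25,
    "(4) drop articles with Dove anywhere": 9.031,
    "(5) drop only Dove sentences": 23.56,
}

base = coef["(1) all beauty sentences"]
# Percentage reduction of each effect relative to the base effect in Column (1).
drop_pct = {label: 100 * (1 - value / base) for label, value in coef.items()}

for label, pct in drop_pct.items():
    print(f"{label}: {pct:.1f}% smaller than Column (1)")
```

The share of the base effect that survives in Column (4) is the part attributable to articles that never mention Dove.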

Table 1.10 The significant impact of the campaign on real-beauty-related topics is not driven by reporting on the Dove campaign.

                                All beauty   Dropping articles with Dove in         Dropping only
                                sentences    Title        Abstract     Anywhere     Dove sentences
                                (1)          (2)          (3)          (4)          (5)
Real Beauty x Treated
  x During Campaign             33.68***     27.51***     13.25***     9.031***     23.56***
                                (4.956)      (3.797)      (1.744)      (1.028)      (2.864)
  x One month after Campaign    5.157***     4.876***     5.289***     5.939***     5.217***
                                (1.965)      (1.674)      (1.744)      (1.717)      (1.892)
Real Beauty
  x During Campaign             -0.606       -0.385       -0.695       -0.811       -1.436*
                                (0.735)      (0.733)      (0.723)      (0.702)      (0.733)
  x One month after Campaign    -4.047***    -4.079***    -4.182***    -4.141***    -4.099***
                                (0.901)      (0.903)      (0.910)      (0.900)      (0.906)
Country-topic dummies           34 (all columns)
Year-month dummies              23 (all columns)
Observations                    840 (all columns)
R-sq                            0.738        0.735        0.720        0.716        0.728
Control countries               New Zealand & Australia (all columns)

The dependent variable is the number of sentences in each country-topic, which is the unit of analysis.
Treated countries, the U.S., Canada, and the U.K., have 10(2), 8(2), and 10(1) topics (real beauty ones), respectively. Two real beauty topics in each country are aggregated into one topic.
Control countries, Australia and New Zealand, have 7(1) and 2(0) topics (real beauty ones), respectively.
Robust standard errors are clustered at the country-topic level. ***p < 0.01; *p < 0.10


1.4.4.2 Social issue advertising and the mass media’s public goal

Now, we explore why social issue advertising can affect the content of mass media. In addition

to a profit goal, the mass media has a (non-economic) public goal to serve the public interest on

desired social or cultural change (McQuail 2010). Typical societal issues in social issue

advertising (or public service announcements) are human rights, environmentalism, voting,

smoking, and donations. The Dove campaign targets boosting self-esteem. Therefore, the mass

media is likely to deliver the message of social issue advertising in order to make a positive

impact on the community. Considering the media’s interest in social issues, firms that implement social issue advertising set up “mass media coverage” goals (Drumwright 1996). Interviewing 11 firms about both the standard and social issue advertising that each firm has made, Drumwright (1996) finds that firms observed higher media coverage for social issue campaigns than for standard campaigns.

One company reported that social issue advertising obtained earned media (i.e., free publicity)

valued at six times the expenditure on paid media.

If media indeed reported the Dove campaign actively in order to serve the public interest (i.e.,

desirable social and cultural change), one would observe new words about such social change

emerging during the campaigns. First, in the previous section, Table 1.10 showed that the number of “real beauty”-related sentences increased significantly during the Dove campaigns even in newspaper articles that do not mention Dove.

Next, as discussed above with Figure 1.2, we also showed a suggestive pattern: the social-change-related words (in Table 1.8b) chosen by the evaluators were used more frequently in the beauty sentences during the campaigns across the U.S., Canada, and the U.K.


Table 1.11a Rising social or cultural change words within real beauty topics during the campaigns

U.S.
Word        First campaign    Second campaign   R-sq
women       -1.1 (3.36)       13.4*** (3.36)    0.44
campaign    0.65 (0.93)       11.15*** (0.93)   0.87
tradit      0.35 (1.13)       3.85*** (1.13)    0.35
differ      -0.15 (0.68)      3.85*** (0.68)    0.61
old         1.65 (1.84)       3.65* (1.84)      0.17
sinc        2.45*** (0.69)    -0.55* (0.69)     0.40
peopl       1.35 (0.79)       2.35*** (0.79)    0.35
young       0.25 (1.07)       2.25** (1.07)     0.17
messag      0.00 (0.72)       2** (0.72)        0.27
way         0.25 (0.73)       1.75** (0.73)     0.22
organ       0.2 (0.6)         1.7*** (0.6)      0.28
chang       0.25 (0.64)       1.25* (0.64)      0.15
modern      0.95* (0.51)      -0.05* (0.51)     0.14

Canada
Word        First campaign    Second campaign   R-sq
women       18.55*** (4.66)   17.55*** (4.66)   0.57
peopl       4.6** (1.66)      8.6*** (1.66)     0.61
achiev      8.55*** (1.88)    -0.45 (1.88)      0.50
strong      7.15*** (1.54)    -0.35 (1.54)      0.51
old         0.8 (1.61)        6.8*** (1.61)     0.46
campaign    4.75*** (1.34)    6.75*** (1.34)    0.62
think       0.5 (1.61)        6.5*** (1.61)     0.44
differ      0.05 (0.5)        5.55*** (0.5)     0.86
chang       1.4* (0.7)        3.4*** (0.7)      0.55
ideal       3.4*** (0.71)     2.9*** (0.71)     0.63
exhibit     0.2 (1.09)        2.7** (1.09)      0.22
cultur      1.7 (1.19)        2.7** (1.19)      0.24
young       2.7*** (0.89)     -0.3 (0.89)       0.31
popular     0.65 (0.53)       1.65*** (0.53)    0.34
better      0.5*** (0.11)     1*** (0.11)       0.81
grow        -0.2 (0.43)       0.8* (0.43)       0.15

U.K.
Word        During campaign   R-sq
way         3.57*** (1.1)     0.32
new         2.65* (1.53)      0.12
achiev      2.48*** (0.81)    0.30
young       2.39*** (0.6)     0.42
think       2.3** (0.99)      0.20
cultur      2.26** (0.94)     0.21
old         2.09* (1.06)      0.15
use         2.09* (1.06)      0.15
power       2.04* (1.09)      0.14
grow        1.7** (0.65)      0.24
question    1.61* (0.8)       0.16
strong      0.83** (0.4)      0.17

Each word with 24 observations (months) is estimated separately within a country.
Dependent variable is monthly word frequency within real beauty topics.
Campaign periods: U.S. (first: Sep. and Oct. 2004; second: July and Aug. 2005), Canada (first: Sep. and Oct. 2004; second: June and July 2005), U.K. (Jan. 2005)
***p < 0.01; **p < 0.05; *p < 0.10


Table 1.11b Rising opposite words to physical beauty within real beauty topics during the campaigns

U.S.
Word        First campaign    Second campaign   R-sq
real        0.55 (1.72)       15.05*** (1.72)   0.78
talent      0.8 (0.96)        4.8*** (0.96)     0.54
differ      -0.15 (0.68)      3.85*** (0.68)    0.61
spirit      1.35** (0.53)     0.35* (0.53)      0.24
charact     1.15** (0.49)     -0.35* (0.49)     0.24
brain       0.3 (0.39)        0.8* (0.39)       0.18

Canada
Word        First campaign    Second campaign   R-sq
real        -0.4 (1.56)       10.6*** (1.56)    0.69
strong      7.15*** (1.54)    -0.35* (1.54)     0.51
spirit      6.8*** (1.63)     -0.2* (1.63)      0.46
differ      0.05 (0.5)        5.55*** (0.5)     0.86
ideal       3.4*** (0.71)     2.9*** (0.71)     0.63
brain       0.65 (0.83)       2.65*** (0.83)    0.33
true        1.9*** (0.51)     -0.1* (0.51)      0.41
grow        -0.2 (0.43)       0.8* (0.43)       0.15

U.K.
Word        During campaign   R-sq
grow        1.7** (0.65)      0.24
strong      0.83** (0.4)      0.17

Each word with 24 observations (months) is estimated separately within a country.
Dependent variable is monthly word frequency within real beauty topics.
Campaign periods: U.S. (first: Sep. and Oct. 2004; second: July and Aug. 2005), Canada (first: Sep. and Oct. 2004; second: June and July 2005), U.K. (Jan. 2005)
***p < 0.01; **p < 0.05; *p < 0.10

Lastly, among social-change-related words, we check which words were used more frequently in the real-beauty-related topics during the campaigns. OLS was run separately for each word. In Table 1.11a, 13, 16, and 12 words related to social or cultural change show positive and significant coefficients in either the first or the second campaign in the U.S., Canada, and the U.K., respectively. In Table 1.11b, the same holds for 6, 8, and 2 words opposite to physical beauty. These results suggest that many social-change-related words were used more frequently in real-beauty-related sentences in newspapers during the campaigns. In other words, this is evidence for the media’s public goal as a mechanism through which social issue advertising affects mass media content. Note that, in the U.S., most of the significant words occurred in the second campaign, as shown in Tables 1.11a and 1.11b. This result is consistent with the finding in Table 1.5a that campaign title words rank high only in the second campaign.
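A per-word regression of this kind can be sketched in a few lines of Python. The sketch assumes, as the layout of Tables 1.11a and 1.11b suggests but the text does not state explicitly, that each regression contains only an intercept and one dummy per campaign window; under that assumption the OLS slope for a window equals the mean frequency in that window minus the mean frequency in the non-campaign months. The frequencies and month indices below are invented.

```python
def mean(xs):
    return sum(xs) / len(xs)


def campaign_effects(freq, first, second):
    """OLS slopes of monthly frequency on two mutually exclusive campaign dummies.

    With an intercept and two non-overlapping dummies the model is saturated,
    so each slope is a difference in group means relative to the baseline months.
    """
    baseline = [f for t, f in enumerate(freq) if t not in first and t not in second]
    b0 = mean(baseline)
    first_effect = mean([freq[t] for t in sorted(first)]) - b0
    second_effect = mean([freq[t] for t in sorted(second)]) - b0
    return first_effect, second_effect


# Hypothetical monthly frequencies of one word over 24 months (index 0 = Jan. 2004).
freq = [2.0] * 24
for t, value in {8: 3.0, 9: 3.0, 18: 10.0, 19: 14.0}.items():
    freq[t] = value

# Indices 8-9 stand for Sep.-Oct. 2004 and 18-19 for July-Aug. 2005.
first_effect, second_effect = campaign_effects(freq, first={8, 9}, second={18, 19})
print(first_effect, second_effect)
```

Running the function once per word and country would reproduce the structure of the two tables, one row per word.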


1.4.4.3 Advertiser pressure

Given the keywords on social change found in newspapers in the previous section, the media’s public goal seems to be a major force. However, it is still not clear whether the articles on real beauty topics are published voluntarily or because of advertiser pressure. Given that advertising is a major revenue source (Stromberg 2004, Mantrala, Naik, Sridhar, and Thorson 2007, Pew Research Center 2014), many theoretical papers predict that the mass media are affected by advertiser pressure (Ellman and Germano 2009; Gal-Or, Geylani, and Yildirim 2012; Zhu and Dukes 2015; Spiteri 2015; Blasco, Pin, and Sobbrio 2016). There is also empirical evidence. Newspapers write longer articles (Rinallo and Basuroy 2009), more frequently (de Smet and Vanormelingen 2012; Gambaro and Puglisi 2015), and more favorably (Reuter and Zitzewitz 2006; Gurun and Butler 2012; Focke, Niessen-Ruenzi, and Ruenzi 2016) about the firms that spend more on newspaper advertisements. Therefore, around the months of the Dove campaign, those newspapers that carried advertising from Unilever, the company that owns the Dove brand, might have reported on the Dove campaign more frequently or in more detail.

To address this issue, we tested (1) whether the Dove campaign’s significant impact on real

beauty topics holds regardless of advertiser pressure, and (2) whether the effect is bigger in

newspapers that had any Unilever advertisement around the time of the Dove campaign.

First, we collected Unilever’s newspaper advertising data from a market research company that

tracks newspaper advertising. The market research company MarketTrack monitored 356 U.S.

newspapers in 2004 and 2005. Among the 113 U.S. newspapers that we used from the ProQuest

newsstand database, we identified 28 newspapers that did not publish any Unilever

advertisements and 17 newspapers that did between January 2004 and December 2005. Among the latter,

8 newspapers published such ads during the months of the Dove campaign.

Table 1.12 The campaign’s significant impact on real beauty-related topics holds even in U.S. newspapers without Unilever ads

Column (1) in Table 1.12 uses beauty sentences from newspapers without Unilever ads. In

contrast, Column (2) uses sentences from newspapers with Unilever ads during the campaign,

and Column (3) adds more newspapers with Unilever ads around the time of the campaign.

Across Columns (1) to (3), heteroskedasticity-robust standard errors are clustered at the country-

topic level to adjust for correlation within each country’s topic across analyzed months.

                                                        U.S. newspapers   U.S. newspapers   U.S. newspapers   All newspapers
                                                        without Unilever  with Unilever ad  with Unilever ad
                                                        ad                during campaign   anytime 2004-2005
                                                        (1)               (2)               (3)               (4)
Real Beauty x During campaign x Treated                 5.017*** (0.275)  6.573*** (0.273)  10.85*** (0.295)  4.552*** (0.199)
  x U.S. newspapers with Unilever ad during campaign                                                          1.95*** (0.139)
  x U.S. newspapers with Unilever ad anytime 2004-2005                                                        6.25*** (0.136)
Real Beauty x One month after campaign x Treated        7.094*** (0.213)  3.667*** (0.212)  3.352*** (0.277)  4.733*** (1.064)
Real Beauty x During campaign                           0.399 (0.469)     0.338 (0.478)     0.736 (0.593)     0.632** (0.3)
Real Beauty x One month after campaign                  -5.186*** (1.172) -4.814*** (1.196) -4.763*** (1.205) -3.867*** (0.639)
Year-month dummies                                      23                23                23                23
Country-specific topic dummies                          17                17                17                NA
Country (3 types in US)-specific topic dummies          NA                NA                NA                35
Observations                                            432               432               432               864
R-sq                                                    0.506             0.546             0.546             0.636
No. of newspapers                                       28                8                 17                45

Dependent variable is the number of sentences in each country-topic, which is the unit of analysis.

Column (4) uses all observations for U.S. newspapers across Columns (1), (2), and (3) and those for New Zealand and Australia.

The 3 types in the U.S. refer to the newspaper groups in Columns (1), (2), and (3).

The treated country, the U.S., has 10 topics. Two of them are real-beauty-related topics, which are aggregated into one topic.

Australia and New Zealand have 7 and 2 topics, of which 1 and 0 are real beauty topics, respectively. Robust standard errors are clustered at the country-topic level for Columns (1) to (3) and at the country-type-topic level for Column (4). ***p < 0.01; ** p < 0.05; *p < 0.10

Across Columns (1) to (3), all the coefficients for the interaction effects during the campaigns

are significantly positive, suggesting that Dove’s social campaign impacted coverage of real

beauty topics during the campaigns in the U.S. regardless of potential advertiser pressure.

Interestingly, the coefficient in Column (1) for the interaction effect one month after the

campaigns is bigger than those in Columns (2) and (3), suggesting that newspapers

that did not receive economic incentives, at least during the analyzed two-year period, show

a more lasting interest in the social message. These results in Column (1) support the public role

of the mass media. The baseline effects in control countries show patterns similar to those in

Table 9c.

However, the effect sizes during the campaigns in Columns (2) and (3) are bigger than that in

Column (1). Thus, we test in Column (4) whether this gap is statistically significant, using all the

newspapers used in Columns (1) to (3). The base interaction effect in the first row corresponds to

that in Column (1), and thus is significantly positive, as expected. The second and third rows

capture the gap between newspapers without and with Unilever ads. Both effects are positive and

significant. This result suggests that newspapers report topics about their advertisers more

actively, and that monetary incentives from advertisers to newspapers tend to come more before or

after a campaign than during it. In other words, this result is consistent with the

“advertiser pressure” mechanism.

1.5. Conclusion

In this paper, we have shown that an advertising message can change the content of a newspaper.

Specifically, we find that newspapers increased reporting of real beauty topics during and one

month after the Dove campaigns for real beauty. In segmenting topics of newspapers, we have

also shown that the topic model, which was recently introduced in marketing, is useful to

identify advertising-related messages in newspapers. As an underlying mechanism, we have

provided evidence to support the public role of the mass media: (1) newspapers deliver the

message on social or cultural change that social issue campaigns emphasize, and (2) newspapers

without any Unilever advertisements around the time of the campaign also wrote more about real

beauty topics during and one month after the Dove campaign. Furthermore, we found

evidence of the advertiser pressure mechanism, in that the impact of the Dove campaign on real

beauty topics is bigger in newspapers with Unilever advertisements than in those without. Overall, the

results suggest that a marketing campaign can have an impact on mass media content, both in

terms of earned media mentions of the campaign and in terms of changing the focus of articles

about a related topic.

Chapter 2

Detecting potential product segments using topological data analysis

2.1 Introduction

Market structure analysis describes the relationships between brands and products in order to

define the market (Elrod et al. 2002). Analysis of market structure is a key step in the design and

development of new products, the repositioning of existing products, pricing, marketing

communications, and marketing strategy (Srivastava, Alpert, and Shocker 1984; Urban, Johnson,

and Hauser 1984; Kamakura and Russell 1989; Urban and Hauser 1993; DeSarbo, Manrai, and

Manrai 1993; Erdem and Keane 1996; Bergen and Peteraf 2002; Lattin, Carroll, and Green 2003;

DeSarbo, Grewal, and Wind 2006). Until very recently, the bulk of published work focused on

competitive market structure with a limited number of products (Erdem 1996; Cooper and Inoue

1996; DeSarbo and Grewal 2007; Kim, Albuquerque, and Bronnenberg 2011; Lee and Bradlow

2011).

In the last few years, new methods have arisen to identify and visualize market structure with

many products. These new methods are a response to two developments. First, the variety of

products in the marketplace has increased (Ailawadi and Keller 2004), increasing demand for

such methods. Second, faster computers and increasing digital storage capacity have broadened

the set of potential tools to make sense of this variety, enabling the supply side. This has created

renewed interest among marketing scholars in market structure and segmentation (Netzer et al.

2012; France and Ghose 2016; Ringel and Skiera 2016).

Figure 2.1a Distinctly grouped data. Source: Lesnick (2013)

Figure 2.1b A loopy segment. Source: Lesnick (2013)

In this paper, we apply a new data analysis technique, Topological Data Analysis (TDA

hereafter; Carlsson 2009), to the problem of market structure segmentation with many products.

Standard clustering methods work well for distinctly grouped data (Figure 2.1a). However, as the

number of data points rises, the data set becomes more connected. One particular example of

such connected data is a loopy segment (Figure 2.1b), where products are located close

to their neighboring products but are only indirectly connected to, and seemingly far apart from,

some other products. TDA is particularly well-suited to identifying such segments.

Loopy segments can occur in analyzing national level market structure with customer level data

on purchases. In many cases, not all products are available in each local market, and thus there

can be no common customers among some related products. For example, suppose that a

manufacturer launched products W and M in Wisconsin and Massachusetts, respectively.

Suppose that products W and M serve similar types of consumers in the different markets. No

consumer can purchase both products due to local availability. Instead, some consumers

purchase the local one. These consumers also purchase other products that are available in both

markets. As a result, products W and M can be in the same segment, connected through the

products that are available nationally, although no consumer purchases both W and M. This same

framework can also connect products sold at different stores. The indirect connection between W

and M will be identified in topological data analysis through a loopy segment.
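
The W and M example can be made concrete with a toy purchase table. The sketch below is illustrative only: the customer identifiers, the national product N, and the helper functions `buyers`, `common_buyers`, and `indirectly_connected` are hypothetical and are not taken from the data used in this chapter.

```python
# Hypothetical purchase records: customer -> set of products bought.
# W is sold only in Wisconsin, M only in Massachusetts, N nationally.
purchases = {
    "wi_1": {"W", "N"},
    "wi_2": {"W"},
    "ma_1": {"M", "N"},
    "ma_2": {"M"},
}

def buyers(product):
    """Customers who bought the given product."""
    return {c for c, basket in purchases.items() if product in basket}

def common_buyers(p, q):
    """Customers who bought both products p and q."""
    return buyers(p) & buyers(q)

def indirectly_connected(p, q):
    """True if p and q share no buyer but each shares a buyer with a third product."""
    others = set().union(*purchases.values()) - {p, q}
    return not common_buyers(p, q) and any(
        common_buyers(p, r) and common_buyers(q, r) for r in others
    )

print(common_buyers("W", "M"))         # no customer buys both W and M
print(indirectly_connected("W", "M"))  # W and M are linked through N
```

In this toy setting, W and M never appear in the same basket, yet both are co-purchased with N, which is the kind of indirect link a loopy segment is meant to surface.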

Why is it useful to detect loopy segments? As described in the preceding paragraph, a loopy

segment can include products that occupy the same product space in different markets, but that

no consumer purchased together. If preferences are transitive, in the sense that objects sharing a

relationship with a common object would themselves be related if they were in the same domain,

then loopy segments can help firms identify potentially competing or potentially co-purchased

products that are not currently offered in the same market. This can help manufacturers who

launch their products sequentially across regional markets. They can learn about (1) competitor

products in one market and (2) indirectly connected products in the other market, and they can

use that information to inform opportunities in both markets. This also helps retailers with

limited shelf space to detect related products. For example, Costco and Walmart strategically

keep a small number of products in each category. By identifying potentially related products,

they can make better product assortment decisions.

Standard clustering methods such as hierarchical clustering are not good at identifying indirect

connections such as those found in a loopy segment (Lesnick 2013). Recently developed

community detection methods are particularly useful for segmenting many observations

(Newman and Girvan 2004; Clauset, Newman, and Moore 2004; Pons and Latapy 2005;

Raghavan, Albert, and Kumara 2007; Blondel et al. 2008). However, our simulations suggest

that community detection methods are less effective than TDA at identifying product

connections across markets because no community detection method assigns a product into

multiple segments, unlike TDA. Thus, our results suggest that for the particular problem of

identifying connected products in unconnected markets, TDA is a useful new tool.

Topology is a mathematical discipline that studies shape. TDA, developed by computational

mathematician Gunnar Carlsson (2009), refers to the adaptation of this discipline to analyzing

highly complex data (Ayasdi 2015). TDA assumes that all data has shape and shape has

meaning, and thus tries to discover geometric relationships among data points. There are many

applications across oncology, astronomy, neuroscience, image processing, and biophysics.

Hoffman and Novak (2015) argue that TDA is useful in organizing potential applications of the

‘internet of things’. There have been some commercialization efforts by the analytics company

Ayasdi (which counts Carlsson as one of its founders). For example, TDA has helped

identify new patient groups in breast cancer treatment, distinct playing styles of National

Basketball Association players, and voting patterns of the members of the US House of

Representatives (Lum et al. 2013). Ayasdi’s website also discusses potential marketing

applications in customer segmentation, personalized marketing, churn analysis, and network

optimization (Ayasdi 2016).

We find that TDA is particularly well-suited to a specific marketing problem. We use simulated

data and the IRI marketing academic data set (Bronnenberg, Kruger, and Mela 2008) to

demonstrate that TDA can connect products in different markets through national products,

while standard hierarchical clustering methods and community detection methods have

difficulty. Our analysis of beer and salty snack buyers in Pittsfield, Massachusetts, and Eau Claire,

Wisconsin, shows that different locally popular brands appear to occupy similar product space in

the different markets. For example, in salty snacks, two national salty snacks (Rold Gold and

Tostitos) connect local segments, which include two products that sell well in Wisconsin (Barrel

O Fun and Jays) and a product that sells well in Massachusetts (UTZ). This suggests that the

positioning of UTZ in Massachusetts is similar to the positioning of Barrel O Fun and Jays in

Wisconsin. We also find potential co-purchase behavior between certain beer and salty snack

products. While the two product categories are quite separated when using hierarchical

clustering, many TDA segments include both beers and salty snacks.

We view the core contribution of this paper as introducing TDA methods to marketing by

providing a clear marketing application. This adds a new clustering tool to the rapidly growing

literature on market structure analysis using big data (France and Ghose 2016; Ringel and Skiera

2016). We view TDA as a useful new exploratory tool and we highlight a specific strength of

this tool. It should not be viewed as a replacement for other forms of product segmentation

because it is unlikely to outperform those methods for standard product segmentation purposes.

Because TDA is new to marketing, section 2.2 uses several simple examples to provide

an extensive discussion of the intuition behind TDA. Section 2.3 shows the usefulness of TDA

for connecting similar products in separated markets using a simulation study, comparing TDA

to other clustering tools. Section 2.4 applies the method to the IRI data to demonstrate its

practical application in marketing. Section 2.5 concludes with a discussion of opportunities and

limitations.

2.2. TDA methodology

Computing topology based on simplicial complexes has been well understood for decades (for more

details, see Armstrong 1983, Edelsbrunner, Letscher, and Zomorodian 2002, Hatcher 2002,

Zomorodian and Carlsson 2005, Edelsbrunner and Harer 2010, and especially Carlsson 2009).

However, computing simplicial complexes is resource intensive and so TDA had limited

application until recently (Lum et al. 2013). Below, we provide a description of the TDA

methodology.

2.2.1. Vietoris-Rips Complex

We use the most common and easily implemented TDA method, the Vietoris-Rips complex. Let

𝑑(∙ ,∙) denote the distance between two product points in customer purchase quantity space 𝑋.

The complex VR(𝑋, 𝑣) is defined as follows:

• The set of vertices (data points or 0-simplices) is 𝑋.

• For vertices (data points) q and r, a line (edge or 1-simplex) [qr] is included in

VR(𝑋, 𝑣) if 𝑑(𝑞, 𝑟) ≤ 𝑣.

• A higher dimensional (k>1) simplex such as a triangular face (2-simplex) or a

tetrahedron (3-simplex) is included in VR(𝑋, 𝑣) if all of the lines (1-simplices) that

make up the higher dimensional simplex are in VR(𝑋, 𝑣).

All the points within a simplex are directly connected to each other. Our goal in this

study is to find potentially related products, but a simplex by itself does not include products that are

connected only indirectly through other products. Next, because VR(𝑋, 𝑣) includes a set of k-simplices

[𝑥0, 𝑥1, …, 𝑥𝑘], where 𝑥𝑖 ∈ 𝑋, at filtration value 𝑣, it is also called a filtered simplicial complex.

Note that the complex VR(𝑋, 𝑣) grows with the filtration value 𝑣. In other words, data points are

connected from their nearest neighbor to more distant ones. Lesnick (2013) labels this the

“thickening” process.
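
The definition of VR(𝑋, 𝑣) can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the software used in this chapter: the function name `vietoris_rips`, the Euclidean distance, and the three products p, q, and r with their coordinates are all hypothetical.

```python
from itertools import combinations
from math import dist

def vietoris_rips(points, v, max_dim=2):
    """Return the simplices of VR(X, v) up to max_dim, grouped by dimension.

    points maps a product name to its coordinates, one coordinate per
    customer's purchase quantity (Euclidean distance is an assumption here).
    """
    names = sorted(points)
    complex_ = {}
    for k in range(max_dim + 1):
        # A k-simplex is a set of k+1 vertices whose pairwise distances are all <= v.
        complex_[k] = [
            s for s in combinations(names, k + 1)
            if all(dist(points[p], points[q]) <= v for p, q in combinations(s, 2))
        ]
    return complex_

# Hypothetical products p, q, r bought by two customers.
X = {"p": (0, 0), "q": (1, 0), "r": (0, 1)}
for v in (0.5, 1.0, 1.5):
    print(v, vietoris_rips(X, v))
```

Raising v from 0.5 to 1.0 adds the two short edges, and raising it further to 1.5 adds the remaining edge and therefore the triangle, which mirrors the thickening process described above.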

To enhance the formal description and explain how it works with marketing data, we provide

several example cases on how TDA creates clusters of products based on purchases by two or

three sample customers. Figure 2.2 presents the 9 different cases and Table 2.1 summarizes the

key TDA output for each: filtration values, VR complexes, and Betti numbers, which we define

below.

2.2.2. Clustering distinctly grouped data (Cases 1 and 2)

Case 1 illustrates how TDA segments distinctly grouped data. The data, or vertex set, 𝑋 consists

of four products. Customer 1 purchased 0, 1, 5, and 5 units of products a, b, c, and d

respectively. Customer 2 purchased 1, 1, 1, and 2 units. The points are plotted in the graph

labeled v=0 (Data). At filtration value v=0, no product pair is connected yet, and thus

VR(𝑋, 𝑣 = 0) includes only the four data points 𝑥0 = {𝑎, 𝑏, 𝑐, 𝑑}. In the thickening process, we

gradually increase the filtration value in steps of 0.01. The graph labeled v=1 (Betti0=2) shows that at v=1,

we can connect the two pairs of dots within the circles to generate two lines 𝑥1 = {𝑎𝑏, 𝑐𝑑}.

These two groups are maintained until v=4, when all four dots become connected.
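
The grouping in Case 1 can be reproduced by counting connected components at a given filtration value. The sketch below uses the purchase quantities stated above as coordinates (customer 1 units, customer 2 units); the function name `betti0`, the union-find implementation, and the use of Euclidean distance are our assumptions for illustration.

```python
from itertools import combinations
from math import dist

# Case 1 as described in the text: (customer 1 units, customer 2 units).
case1 = {"a": (0, 1), "b": (1, 1), "c": (5, 1), "d": (5, 2)}

def betti0(points, v):
    """Number of connected components of VR(X, v), via union-find."""
    parent = {name: name for name in points}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for p, q in combinations(points, 2):
        if dist(points[p], points[q]) <= v:
            parent[find(p)] = find(q)  # merge the two components
    return len({find(name) for name in points})

for v in (0, 0.99, 1, 3.99, 4):
    print(f"v={v}: Betti0={betti0(case1, v)}")
```

The count stays at four below v=1, drops to two at v=1, and drops to one at v=4, in line with the description of Case 1.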

Figure 2.2 TDA examples with two customers (Case 1-7) or three customers (Case 8 and 9)

Figure 2.2.1 Case 1 Two segments

Case 2 provides a similar example. Product d is purchased more by both customers and has a

distinct positioning. At v=1, products a, b, and c become one body. Then, at v=3.61, product d

joins the others. This suggests that there are two product segments in Case 2. In Cases 1 and 2, the

linking process of TDA is similar to that of standard hierarchical clustering.

Figure 2.2.2 Case 2 Tetragon

2.2.3. Homology groups, Betti numbers, and loopy segments

Now, we show how TDA summarizes the shape of data. As described above in Cases 1 and 2,

TDA generates a simplex (e.g. a line or a triangular face) by connecting pairs of data points that lie

within filtration value v of each other. The filtered simplicial complex VR(𝑋, 𝑣) can be summarized by what are

labeled homology groups. The value Betti_h, where ℎ ∈ ℕ, is the rank of the h-th homology

group of the topological space, which is VR(𝑋, 𝑣) here. The term "Betti numbers" was coined

by Poincaré (1894) after Enrico Betti. The meaning of Betti_h, where ℎ ∈ {0,1,2}, is as follows.

• Betti0 : the number of connected components

• Betti1 : the number of holes or loops

• Betti2 : the number of voids or cavities

It is possible to define higher Betti_h numbers, where ℎ > 2, but they are difficult to conceptualize

and do not appear to matter in our empirical context. Thus we use up to Betti2 in our study.

To provide examples of connected components and loops, the TDA literature often uses the

shape of upper case letters. The letters with Betti1 = 1, that is, with a single loop (and a single

hole), are {A, D, O, P, Q, R}. In contrast, {B} has Betti1 = 2, with two loops. All the other upper case

letters have no loops, and each can be thought of as a point if compressed.

A torus (or empty donut shape) is an example with Betti2 = 1. A torus has a void inside of the donut as

well as two loops: one around the hole in the center and the other around the hollow inside the donut.

For each case in Figure 2.2, the bar graph shows the number of segments by Betti type for

each filtration value v. For example, for Case 1, for Betti0 (Betti dimension 0), we have four

distinct groups until v=1. After v=1, there are two groups until v=4 when there is just one group

as all the dots are joined together. Similarly, for Case 2, for Betti0 we have four distinct groups

until v=1, then two groups until v=3.61 and one group for v≥3.61. In this way, Betti0 provides

similar results to a standard clustering algorithm.

In contrast to Betti0, both Betti1 and Betti2 count holes and voids, providing distinct insights into

the data structure from a standard clustering algorithm. Following the literature, we label a hole

or void as a loopy segment in our paper. Given that we aim to detect related products that are not

directly competing and that cannot be identified with standard clustering techniques, we focus on

loopy segments (Betti1 and Betti2), where each product is indirectly connected with some others.

2.2.4. A loopy segment in a two dimensional plane (contrasting Cases 3 and 4)

In Cases 1 and 2, no hole exists. Betti1 and Betti2 are zero throughout. In particular, there is no

empty space inside a simplex. Once all dots are connected in a triangle at a filtration value, the

space within the triangle is covered. For example, in Case 2, when products a and c are

connected at v=1.42, the inside of the triangle among products a, b, and c is shaded rather than

blank because distance 1.42 covers all space within the triangle. This means that at least four

products are necessary to form a hole in two-dimensional space.

When can a loopy segment emerge in a two dimensional plane? The dots cannot be on a straight

line (as in Case 1) and the diagonals should be longer than any of the four boundary lines. Case 2

does not form a loopy segment because products a and c are linked to each other before they link

with product d.

Case 3, a square, does have a loopy segment. At v=0, there are four distinct dots and Betti0=4. At

v=2, lines can be drawn that connect the dots along the outside of the square and Betti0=1.

Importantly, the diagonals {ac, bd} are not yet connected, so the four boundary lines enclose an

unfilled hole and Betti1=1. At v=2.83 the diagonals connect and the hole is filled, and so for v≥2.83,

Betti1=0. The loopy segment suggests that the four products are indirectly related to their

non-neighboring products because a grouping of size less than 2.83 shows no direct link between b

and d or between a and c.
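
The Betti numbers reported for Case 3 can be checked with standard linear algebra: Betti_h equals the number of h-simplices minus the ranks of the two adjacent boundary maps. The sketch below is one textbook way to compute this and is not the software used in this chapter. The square's coordinates (side length 2) are hypothetical, chosen to match the filtration values in the text, and the function names are ours. Ranks are taken over the rationals.

```python
from fractions import Fraction
from itertools import combinations
from math import dist

def simplices(points, v, dim):
    """All dim-simplices of VR(X, v): vertex sets with pairwise distance <= v."""
    return [s for s in combinations(sorted(points), dim + 1)
            if all(dist(points[p], points[q]) <= v for p, q in combinations(s, 2))]

def rank(matrix):
    """Matrix rank by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def boundary_rank(k_simplices, faces):
    """Rank of the boundary map from k-simplices to (k-1)-simplices."""
    index = {f: i for i, f in enumerate(faces)}
    matrix = [[0] * len(k_simplices) for _ in faces]
    for j, s in enumerate(k_simplices):
        for i in range(len(s)):
            # Dropping the i-th vertex gives a face with sign (-1)^i.
            matrix[index[s[:i] + s[i + 1:]]][j] = (-1) ** i
    return rank(matrix)

def betti(points, v, max_h=2):
    """[Betti0, ..., Betti_max_h] of VR(X, v), with rational coefficients."""
    sx = [simplices(points, v, k) for k in range(max_h + 2)]
    ranks = [0] + [boundary_rank(sx[k], sx[k - 1]) for k in range(1, max_h + 2)]
    return [len(sx[h]) - ranks[h] - ranks[h + 1] for h in range(max_h + 1)]

# Hypothetical coordinates for the square in Case 3 (side length 2).
square = {"a": (0, 0), "b": (2, 0), "c": (2, 2), "d": (0, 2)}
for v in (1, 2, 2.83):
    print(v, betti(square, v))
```

Under these assumed coordinates, the loop is present at v=2 and has been filled by v=2.83, matching the narrative for Case 3.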

Figure 2.2.3 Case 3 Square loopy segment

In contrast, Case 4 is a square with a dot in the middle: Product e is in the center of the other four

products. Case 4 does not have a loopy segment. At v=1.42, the four boundary products are

connected with product e in the center, and so Betti0=1 from this value. This linkage occurs

before the boundaries are linked with each other, and so no hole is formed because product e

makes the diagonals shorter than the boundary lines. Case 4 shows that a hub structure, where one

popular product competes with other products, is not likely to have a loopy segment.

Figure 2.2.4 Case 4 Center point within square

2.2.5. Interval length of a loopy segment: Persistent homology (Cases 3 and 5)

When does a hole that has emerged disappear? Namely, how long does the hole persist? This is

important to understand because more persistent holes suggest more robust connections that are

distinct from standard clusters. Cases 3 and 5 provide a useful contrast for exploring persistent

holes.

We now introduce a new concept, the Betti interval. The Betti interval describes how the

homology of VR(𝑋, 𝑣) changes with filtration value v. A Betti1 interval, with endpoints

[𝑣𝑠𝑡𝑎𝑟𝑡 , 𝑣𝑒𝑛𝑑), corresponds to a hole that appears at 𝑣𝑠𝑡𝑎𝑟𝑡, remains open for 𝑣𝑠𝑡𝑎𝑟𝑡 ≤ 𝑣 < 𝑣𝑒𝑛𝑑,

and closes at 𝑣𝑒𝑛𝑑. The filtration range or the interval length, 𝑣𝑒𝑛𝑑 − 𝑣𝑠𝑡𝑎𝑟𝑡, is the measure of

persistent homology. Longer persistence suggests more robust features. In Case 3, at filtration

value v=2.83, four simplex triangles {abc, bcd, cda, dab} arise when the additional two

diagonals ac and bd fill in the square, and thus the hole disappears. In summary, the loopy

segment is born at 𝑣𝑠𝑡𝑎𝑟𝑡 = 2 and dies at 𝑣𝑒𝑛𝑑 = 2.83, and thus its Betti interval has length (or

filtration range) 0.83. Note that 𝑣 keeps increasing until no new segments appear.

In practice, the researcher sets a large enough value for 𝑣.

Case 5 presents a rectangle. At v=2, two product groups are formed and then they are maintained

until v=4, suggesting that there are two segments in this case. At v=4, a loopy segment consisting

of all four products emerges with Betti1 interval [4, 4.48). Compared to Case 3, this loopy

segment is born later (4 > 2) and is less persistent (0.48 < 0.83). The later birth suggests that,

relative to Case 3, in Case 5 the Betti0 segments are more distinct and that the indirect

connections might provide insights into potentially related products that standard clustering methods

might miss. In Case 5, standard clustering methods may conclude that there are only two separated

segments: {a, b}, {c, d}. The lower persistence suggests that the loopy (Betti1) segment in Case 5

is, however, a less robust feature of the data.
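
For a segment of exactly four products, the Betti1 interval has a simple closed form: the loop is born when the longest boundary line is drawn and dies when the shortest diagonal is drawn. The sketch below applies this to Cases 3 and 5. The coordinates are hypothetical, chosen to match the side lengths implied by the text, and the helper names are ours. We also assume that the reported endpoints are the exact distances rounded up to the 0.01 filtration grid, which is our reading of values such as 2.83 and 4.48.

```python
from math import ceil, dist

def grid_up(x):
    """Round x up to the 0.01 filtration grid (assumed step size)."""
    return ceil(round(x * 100, 6)) / 100

def loop_interval(points, cycle):
    """Betti1 interval (birth, death) of a four-point cycle, or None.

    cycle lists the four products in boundary order, so cycle[0]-cycle[2]
    and cycle[1]-cycle[3] are the diagonals.
    """
    a, b, c, d = (points[name] for name in cycle)
    birth = grid_up(max(dist(a, b), dist(b, c), dist(c, d), dist(d, a)))
    death = grid_up(min(dist(a, c), dist(b, d)))
    if death <= birth:
        return None  # a diagonal is drawn before the boundary closes
    return birth, death

# Hypothetical coordinates: a square of side 2 and a 2-by-4 rectangle.
square = {"a": (0, 0), "b": (2, 0), "c": (2, 2), "d": (0, 2)}
rectangle = {"a": (0, 0), "b": (0, 2), "c": (4, 2), "d": (4, 0)}

for label, pts in (("Case 3", square), ("Case 5", rectangle)):
    birth, death = loop_interval(pts, ["a", "b", "c", "d"])
    print(label, (birth, death), "length", round(death - birth, 2))
```

The closed form only covers four-point segments; larger segments require a general persistent homology computation.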

Figure 2.2.5 Case 5 Rectangle loopy segment

2.2.6. Connecting loopy segments (Cases 6 and 7)

Figure 2.2.6 Case 6 Distant two loopy segments

Using Cases 6 and 7, we explain how TDA connects segments and where it provides distinct

insights from standard hierarchical clustering. In Case 6, there are two clearly separated product

groups: one often purchased by two customers and the other not. At v=1, three segments are

formed and so 𝐵𝑒𝑡𝑡𝑖0 changes from 8 to 3. At v=2, the separate segments ab and dc are joined

and so 𝐵𝑒𝑡𝑡𝑖0 drops to 2. The rectangle and the square remain separated until v=2.83, when 𝐵𝑒𝑡𝑡𝑖0

becomes 1. This process is similar to the way hierarchical clustering methods group items.

In addition to identifying distinct segments, and unlike hierarchical clustering, TDA informs us

whether each segment has a loopy structure or not. There are two loopy segments in Case 6. For

the square (efgh), the loopy segment has interval length 0.42, starting at 1 and ending at 1.42. For

the rectangle (abcd), the loopy segment has interval length 0.24, starting at 2 and ending at 2.24.

This suggests a different kind of connection between the points in the rectangle and the points in

the square, as in the above comparison between Cases 3 and 5. The loopy segment is more

meaningful in the square because it recognizes that the four dots are more equally connected.

Case 7 shows two loopy segments that are connected through a common product, d. At v=2,

TDA generates one whole segment (𝐵𝑒𝑡𝑡𝑖0 = 1) with two loopy segments (𝐵𝑒𝑡𝑡𝑖1 = 2). These

segments persist until v=2.83. Product d in Case 7 serves as a gate product. TDA connects

segments by assigning such gate products into multiple segments. This connection information

helps to detect potentially related products across neighboring segments.

Products a and e, which are indirectly connected through the gate product d, do not appear to be

direct competitors. Nevertheless, the common linkage with product d suggests that a and e are

related. As we describe below, if a and e are primarily sold in different markets, the common

gate product suggests that they may serve similar needs in the different markets.
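
The role of product d in Case 7 can be illustrated on the graph of lines drawn at v=2. The sketch below is an illustration under assumptions: the coordinates (two squares of side 2 sharing product d) are hypothetical, and treating a gate product as a cut vertex, that is, a product whose removal splits the segment, is our own operationalization rather than a definition from the TDA literature. When no triangles are present, Betti1 equals the cycle rank of the graph, E - V + Betti0.

```python
from itertools import combinations
from math import dist

# Hypothetical coordinates for Case 7: two squares of side 2 sharing product d.
case7 = {"a": (0, 0), "b": (2, 0), "c": (0, 2), "d": (2, 2),
         "e": (4, 2), "f": (4, 4), "g": (2, 4)}

def edges_at(points, v):
    """1-simplices of VR(X, v)."""
    return [(p, q) for p, q in combinations(sorted(points), 2)
            if dist(points[p], points[q]) <= v]

def components(nodes, edges):
    """Number of connected components of the graph restricted to nodes."""
    nodes = set(nodes)
    seen, count = set(), 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        stack = [start]
        while stack:
            x = stack.pop()
            if x in seen:
                continue
            seen.add(x)
            stack.extend(q if p == x else p for p, q in edges
                         if x in (p, q) and {p, q} <= nodes)
    return count

def gate_products(points, v):
    """Products whose removal increases the number of components."""
    edges = edges_at(points, v)
    base = components(points, edges)
    return sorted(p for p in points
                  if components(set(points) - {p}, edges) > base)

edges = edges_at(case7, 2)
betti0 = components(case7, edges)
# No triangle exists at v=2 here, so Betti1 is the cycle rank E - V + Betti0.
betti1 = len(edges) - len(case7) + betti0
print(betti0, betti1, gate_products(case7, 2))
```

With these coordinates, the segment is one connected component with two loops, and d is the only product whose removal separates {a, b, c} from {e, f, g}.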

This connecting ability enables TDA to yield distinct insights relative to other clustering

methods such as hierarchical clustering, which forces full separation. In Case 7, most

hierarchical clustering algorithms, such as average and complete linkage or Ward’s method,

generate two segments: one with products a, b, c, and d, and another with products e, f, and g.

Moreover, single linkage algorithms, which at each step combine the two clusters that contain the

closest pair of elements not yet belonging to the same cluster, put all products into

just one segment because all the products have the same distance to their neighboring products.

Thus, while single linkage algorithms closely resemble TDA in terms of 𝐵𝑒𝑡𝑡𝑖0 groupings, the

single linkage algorithm misses 𝐵𝑒𝑡𝑡𝑖𝑘, (𝑘 ≥ 1) groupings. In the simulation section below, we

conduct a more comprehensive comparison across several clustering methods.

Figure 2.2.7 Case 7 Neighboring two loopy segments with one connection

2.2.7. Voids in three dimensional space (Cases 8 and 9)

Figure 2.2.8 Case 8 Tetrahedron (panels: v=0, Data, Betti0=4; v=5.66, Betti0=1)

Next, we show when voids occur in three dimensional space using examples with three

customers. Case 8 shows an example with four products. The simplex in three dimensional space

is a tetrahedron, which also has four data points. Therefore, Case 8 cannot contain a void. At

v=5.66, all four products are connected to each other, resulting in a tetrahedron as well as four

triangular faces. Because both a tetrahedron and a triangle are simplices, neither a void nor a hole

occurs.

Figure 2.2.9 Case 9 Octahedron with void (panels: V=0, data, Betti0=6; V=2.83, Betti0=1, Betti2=1; V=4, Betti2=0)

In Case 9, there are six product points that if joined together would form an octahedron. At

v=2.83, each point is connected with four neighboring points, each of which is in the center of its

neighboring square side, thus 𝐵𝑒𝑡𝑡𝑖0 switches from 6 to 1. For example, product a is connected

with products b, c, e, and f, but not product d on the opposite side. As a result, there are four

triangles {abc, abf, ace, aef} that include product a. There are another four triangles that include

d but not a {dbc, dbf, dec, def}. Only these 8 triangles, and no tetragons, are in each plane. Since

a triangle is a simplex, no hole is formed. As a result, 𝐵𝑒𝑡𝑡𝑖1 remains at 0.

However, there is one void (𝐵𝑒𝑡𝑡𝑖2 = 1) inside the 8 triangles starting at v=2.83. First,

intuitively, one can see that each of the six points is connected with the point on the

opposite side indirectly through its neighboring products. Three product pairs ad, be, and cf

have such indirect connections. Second, to make sure that the inside is empty, we check

whether any tetrahedrons with four data points occur. For example, product a is connected with

products b, c, e, and f. However, product b is not connected to e on its opposite side. There is also

no link between products c and f yet. Therefore, no tetrahedron occurs.

At v=4, the three product pairs {ad, be, cf} on opposite sides connect. Now, the inside is

occupied by twelve tetrahedrons {abcd, abce, abcf, abdf, abef, acde, acef, adef, bcde, bcdf, bedf,

cedf}, leading to 𝐵𝑒𝑡𝑡𝑖2 = 0. The length of the interval with this void is 1.17, and the interval is

[2.83, 4).
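The Betti numbers reported for Case 9 can be checked with a small sketch that builds a Vietoris-Rips complex and computes boundary-matrix ranks. It rests on several assumptions that are not in the text: the six products are placed at distance 2 from the origin along the three axes (so neighbors are 2.83 apart and opposite products 4 apart), homology is taken with coefficients in GF(2), and all function names are hypothetical. It is not the JavaPlex routine used for the results in this thesis.

```python
from itertools import combinations
from math import dist

# Assumed coordinates: a/d, b/e and c/f are the three opposite pairs.
OCTAHEDRON = [(2, 0, 0), (0, 2, 0), (0, 0, 2),     # a, b, c
              (-2, 0, 0), (0, -2, 0), (0, 0, -2)]  # d, e, f


def rips_simplices(points, v, max_dim=3):
    """List simplices by dimension; a simplex enters once all its pairs are within v."""
    n = len(points)
    close = {pair for pair in combinations(range(n), 2)
             if dist(points[pair[0]], points[pair[1]]) <= v}
    simplices = [[(i,) for i in range(n)]]
    for k in range(1, max_dim + 1):
        simplices.append([s for s in combinations(range(n), k + 1)
                          if all(pair in close for pair in combinations(s, 2))])
    return simplices


def rank_gf2(columns):
    """Rank over GF(2) of columns stored as integer bitmasks."""
    pivots = {}  # leading bit -> reduced column
    for col in columns:
        while col:
            lead = col.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = col
                break
            col ^= pivots[lead]
    return len(pivots)


def boundary_rank(faces, cells):
    """Rank of the boundary map from k-simplices (cells) to (k-1)-simplices (faces)."""
    index = {face: row for row, face in enumerate(faces)}
    columns = []
    for cell in cells:
        col = 0
        for drop in range(len(cell)):
            col |= 1 << index[cell[:drop] + cell[drop + 1:]]
        columns.append(col)
    return rank_gf2(columns)


def betti_numbers(points, v):
    """Return (Betti0, Betti1, Betti2) of the Rips complex at filtration value v."""
    s = rips_simplices(points, v)
    r = [0] + [boundary_rank(s[k - 1], s[k]) for k in (1, 2, 3)]
    return tuple(len(s[k]) - r[k] - r[k + 1] for k in (0, 1, 2))


for v in (0.0, 2.83, 4.0):
    print(v, betti_numbers(OCTAHEDRON, v))
```

Under these assumptions the sketch returns one component and one void at v=2.83 and no void at v=4, in line with the interval [2.83, 4) described above.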

The above cases outline how TDA identifies indirect connections between products. Before we

analyze real world data, we provide simulation evidence that TDA generates a different type of

insight than other commonly used methods.

Table 2.1 TDA cases

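Columns Points, Lines, Triangles, and Tetrahedron list the filtered simplicial complex* VR(X, v); B0, B1, and B2 are the Betti numbers.

Case | Description | Filtration Value (v) | Points | Lines | Triangles | Tetrahedron | B0 | B1 | B2
1 | Two segments | 0 | {a, b, c, d} | - | - | - | 4 | 0 | 0
1 | | 1 | - | {ab, cd} | - | - | 2 | 0 | 0
1 | | 4 | - | {bc} | {bcd} | - | 1 | 0 | 0
2 | Tetragon | 0 | {a, b, c, d} | - | - | - | 4 | 0 | 0
2 | | 1 | - | {ab, bc} | - | - | 2 | 0 | 0
2 | | 1.42 | - | {ac} | {abc} | - | 2 | 0 | 0
2 | | 3.61 | - | {ad, cd} | {acd} | - | 1 | 0 | 0
3 | Square loopy segment | 0 | {a, b, c, d} | - | - | - | 4 | 0 | 0
3 | | 2 | - | {ab, bc, cd, ad} | - | - | 1 | 1 | 0
3 | | 2.83 | - | {ac, bd} | {abc, bcd, cda, dab} | - | 1 | 0 | 0
4 | Center point within square | 0 | {a, b, c, d, e} | - | - | - | 5 | 0 | 0
4 | | 1.42 | - | {ae, be, ce, de} | - | - | 1 | 0 | 0
5 | Rectangle loopy segment | 0 | {a, b, c, d} | - | - | - | 4 | 0 | 0
5 | | 2 | - | {ab, cd} | - | - | 2 | 0 | 0
5 | | 4 | - | {ad, bc} | - | - | 1 | 1 | 0
5 | | 4.48 | - | {ac, bd} | {abc, bcd, cda, dab} | - | 1 | 0 | 0
6 | Distant two loopy segments | 0 | {a, b, c, d, e, f, g, h} | - | - | - | 8 | 0 | 0
6 | | 1 | - | {ab, cd, ef, fg, gh, he} | - | - | 3 | 1 | 0
6 | | 1.42 | - | - | {efg, fgh, ghe, hef} | - | 3 | 0 | 0
6 | | 2 | - | {bc, da} | - | - | 2 | 1 | 0
6 | | 2.24 | - | - | {abc, bcd, cda, dab} | - | 2 | 0 | 0
6 | | 2.83 | - | {de} | - | - | 1 | 0 | 0
7 | Neighboring two loopy segments with one connection | 0 | {a, b, c, d, e, f, g} | - | - | - | 7 | 0 | 0
7 | | 2 | - | {ab, bc, cd, ad, de, ef, fg, gd} | - | - | 1 | 2 | 0
7 | | 2.83 | - | {ag, ce, ac, bd, df, eg} | {abc, bcd, cda, dab, def, efg, fgd, gde, adg, cde} | - | 1 | 0 | 0
8 | Tetrahedron | 0 | {a, b, c, d} | - | - | - | 4 | 0 | 0
8 | | 5.66 | - | {ab, ac, ad, bc, bd, cd} | {abc, abd, acd, bcd} | {abcd} | 1 | 0 | 0
9 | Octahedron with void | 0 | {a, b, c, d, e, f} | - | - | - | 6 | 0 | 0
9 | | 2.83 | - | {ab, ac, ae, af, bc, bd, bf, cd, ce, de, df, ef} | {abc, abf, ace, aef, dbc, dbf, dec, def} | - | 1 | 0 | 1
9 | | 4 | - | {ad, be, cf} | {abe, acf, adb, adc, ade, adf, bcf, bec, bed, bef, cfd, cfe} | {abcd, abce, abcf, abdf, abef, acde, acef, adef, bcde, bcdf, bedf, cedf} | 1 | 0 | 0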

*Filtered simplicial complex, VR(X, v), is cumulative as v increases: The table shows the additional simplices for each filtration value v

2.3. Simulation study

Our goal is to demonstrate that TDA can identify potentially related products that have not been

sold together in the same market. Our target application is to cluster products in two local

markets which are regionally separated. In the IRI data analysis below, we examine sales across

two cities, Eau Claire, Wisconsin, and Pittsfield, Massachusetts. We cluster salty snacks and beers

separately to see whether TDA can find products that occupy the same product space within a

category in the two local markets. Then, we combine both product categories in the same

analysis to see whether TDA can also connect products in different categories and different

markets that could potentially be purchased by the same customers. In other words, for this

simulation to be useful to marketers, we assume that preferences are transitive and examine

whether TDA can unpack the relationships in the data.

Before analyzing real consumer purchase data from the two local markets, we do a

simulation study to examine whether TDA can recover useful loopy segments in such a setting.

We compare results from TDA with those from hierarchical clustering methods and community

detection methods. The purpose of this section is not to demonstrate that TDA is always superior

to other methods. Instead, the purpose is to highlight a particular case in which TDA does detect

a pattern in the data when other methods do not.

2.3.1. Simulation study procedure

Our simulation study has the following 5 steps, as shown in Figure 2.3a. In the simulation, some

products appear only in Wisconsin (W), some products appear only in Massachusetts (M), and

some national products appear in both markets (N).

Figure 2.3a 5 steps for simulation study

Step 1: True segments

We simulate two scenarios, shown in Figure 2.3b. In Scenario 1, there are two loopy segments,

one in each local market. Each local segment includes one national product as well as its own

local product. This shape is called a “wedge sum” in topology. The two local segments are

connected through one national product. In other words, the national product is assigned into

both segments.

Next, one concern is that TDA might identify false gate products although there is no

connection. To address this issue, we do a falsification test. In Scenario 2, we add one local-only

segment into each local market. The added local-only segment should not be connected as the

result of TDA.

Figure 2.3b True segments in simulation study (panels: Scenario 1, Scenario 2)

Product type W means Wisconsin product, M means Massachusetts product, and N means national product.

Step 2: Correlation matrix

We simulate a correlation structure among products only within the same local market because

we assume that the two local markets are geographically separated and so no consumer can

purchase both groups (M and W) of local products. To generate the loopy segment, we assign

higher correlations between neighboring products. Higher correlation means shorter distance. For

example, we assign correlations of 0.5 and 0.6 between Wisconsin local product W1 and its

neighboring local and national products (W2 and N4), while we assign a correlation of 0.2

between W1 and its non-neighboring product W3.

Step 3: Simulating consumer purchases

Using the above correlation structure and assuming a marginal Poisson distribution, we simulate

10,000 consumers’ purchases in each market (20,000 consumers total). We chose the Poisson

distribution to reflect the discrete nature of purchase quantity. The quantity purchased by each

consumer of each product is therefore a draw based on correlated (across products) Poisson

random variables. To generate correlated Poisson random variables, we utilized an R

implementation by Barbiero and Ferrari (2014). We ensure that no consumer can buy both M

and W products.1

Step 4: Distance or similarity matrix

With consumer purchases for each product, we calculate Euclidean distance among products

across the 20,000 consumers. This distance matrix becomes input data for TDA and hierarchical

clustering. We also construct a similarity matrix for community detection methods that counts

each product pair’s joint purchase frequency as the number of consumers who purchase both

products among the 20,000 consumers.
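Steps 2 to 4 can be sketched in a few lines for a reduced version of Scenario 1. The sketch is an illustration under stated assumptions, not the procedure used in this thesis: correlated Poisson quantities are generated with a common-shock construction (each pair of neighboring products in a loop shares one Poisson component) instead of the Barbiero and Ferrari (2014) implementation, the rates and the 2,000 consumers per market are arbitrary, and all function names are hypothetical.

```python
import random
from math import dist, exp

PRODUCTS = ["W1", "W2", "W3", "N4", "M5", "M6", "M7"]
# Each market forms one loop; consecutive products (and last/first) are neighbors.
LOOPS = {"W": ["W1", "W2", "W3", "N4"], "M": ["M5", "M6", "M7", "N4"]}


def poisson(rng, lam):
    """Draw one Poisson variate with Knuth's method (adequate for small rates)."""
    limit, k, p = exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_consumer(rng, loop, own_rate=0.5, shared_rate=1.0):
    """Quantities for one consumer; products outside the consumer's market stay 0."""
    shocks = [poisson(rng, shared_rate) for _ in loop]  # shock k links loop[k], loop[k+1]
    basket = dict.fromkeys(PRODUCTS, 0)
    for k, product in enumerate(loop):
        basket[product] = poisson(rng, own_rate) + shocks[k] + shocks[k - 1]
    return basket


def simulate(n_per_market=2000, seed=0):
    rng = random.Random(seed)
    return [simulate_consumer(rng, loop)
            for loop in LOOPS.values() for _ in range(n_per_market)]


def distance_matrix(baskets):
    """Euclidean distance between products across all consumers (TDA, clustering input)."""
    cols = {p: [b[p] for b in baskets] for p in PRODUCTS}
    return {(p, q): dist(cols[p], cols[q]) for p in PRODUCTS for q in PRODUCTS}


def copurchase_matrix(baskets):
    """Number of consumers buying both products (community detection input)."""
    return {(p, q): sum(1 for b in baskets if b[p] > 0 and b[q] > 0)
            for p in PRODUCTS for q in PRODUCTS}


baskets = simulate()
distances = distance_matrix(baskets)
similarity = copurchase_matrix(baskets)
print(round(distances[("W1", "W2")], 1), round(distances[("W1", "W3")], 1))
print(similarity[("W1", "M5")])
```

In this construction neighboring products share a component and therefore end up closer than non-neighboring products, while local products from different markets are never purchased by the same consumer.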

Step 5: Product clustering

We create segments from this data using TDA, four different hierarchical cluster

algorithms, and five different community detection methods. Hierarchical clustering methods are

perhaps the most commonly used tool for segmenting and positioning products and brands

(Srivastava, Leone, and Shocker 1981; Punj and Stewart 1983; DeSarbo and DeSoete 1984;

Zhai et al. 2011). The first three hierarchical clustering algorithms we use are single, average,

and complete linkage, which Johnson (1967) defines as the “standard” hierarchical clustering

algorithms. The fourth is Ward’s method (Ward 1963) which is known for working particularly

well with marketing data (Punj and Stewart 1983). Among them, the closest algorithm to TDA is

single linkage, which, at each step, combines the two clusters that contain the closest pair of

elements not yet belonging to the same cluster.

1 To ensure the existence of a hole structure, we assign a lower mean value for the national product than for the local

products. Recall that a national product is sold in two markets, while local products are only sold in one market. A

higher mean value of the national product results by construction in a longer distance between the national and local

products. As a result, if a national product has too high of a mean value, it may not be part of a loopy segment.

Recently, community detection methods have been proposed as segmentation tools in network

analysis. The five community detection algorithms we use in this study are those developed by

Newman and Girvan (2004), Clauset, Newman and Moore (2004), Pons and Latapy (2005),

Raghavan, Albert and Kumara (2007), and Blondel et al. (2008). They differ in terms of

scalability and quality of detection.2 Netzer et al. (2012) introduced the community detection

method developed by Girvan and Newman (2002) for the first time in marketing, in order to

segment discussion of 169 car models in an online forum. Newman and Girvan (2004) extended

their previous paper by incorporating edge weights between vertices. Later, Clauset,

Newman, and Moore (2004), Pons and Latapy (2005), Raghavan, Albert, and Kumara (2007),

and Blondel et al. (2008) proposed new algorithms to process a large network quickly. The Louvain

method of Blondel et al. (2008) is known to show better performance in terms of speed and

accuracy. Recently, Ringel and Skiera (2016) adapted the Louvain method as one component of

their market structure map of more than 1,000 products from an online price and product

comparison site.

To estimate TDA, we utilized a JavaPlex implementation by Adams, Tausz, and Vejdemo-

Johansson (2014) through the MATLAB interface developed by Adams and Tausz (2015). For

hierarchical clustering and community detection methods, we use the R “cluster” and “igraph”

packages, respectively.

2.3.2. Simulation study results

Figure 2.4 shows the result of topological data analysis on the simulated data. In Scenario 1,

there are two intervals under Betti1 in the barcode chart, implying that TDA detects two loopy

segments. TDA also generates segment members and their connection order. Each segment

includes the appropriate local products and the national product N4, implying that the two local

segments are connected through the national product. In Scenario 2, there are four intervals

under Betti1. As expected from the generated data, Segments 3 and 4 are connected through the

2 Related to these methods, Henderson, Iacobucci, and Calder (1998) and John et al. (2006) used survey-based

approaches to generate a brand-associative network.

national product N8, while there is no overlapping product between Segments 1 and 2. In summary,

TDA recovers the true segments well.

Figure 2.4: TDA barcode chart for simulation study (panels: Scenario 1, Scenario 2)

Product type W means Wisconsin product, M means Massachusetts product, and N means national product.

Next, we turn to the results from hierarchical clustering methods in Figure 2.5. For both

Scenarios 1 and 2, the single linkage algorithm yields a different pattern than the others, putting

the national brand, N8, into its own segment. Generally, in Scenario 1, the single linkage

algorithm does successfully capture the different location groupings; however, in Scenario 2, the

single linkage algorithm groups products together that should be completely separate (segments

1 and 2). The national product also connects to segments 1 and 2 when it should be disconnected.

Although the single linkage algorithm is the most similar to TDA in terms of the intuition behind

the algorithm, it performs poorly because it does not allow for loopy segments.

Figure 2.5 Hierarchical clustering

Figure 2.5a Hierarchical clustering for Scenario 1 in simulation study

Figure 2.5b Hierarchical clustering for Scenario 2 in simulation study

Product type W means Wisconsin product, M means Massachusetts product, and N means national product.

The other three hierarchical clustering methods perform better, in the sense that they do group the

products into the appropriate two or four segments. Still, they do not capture the useful information

that the national product connects the two groups of local products (segments 1 and 2 in Scenario 1

and segments 3 and 4 in Scenario 2) because the algorithms force each product into only one segment.

While this feature of hierarchical clustering methods is often useful in marketing research analysis,

it means that connections across products in different markets are better found using TDA.

We next examine a community detection method result from an R implementation of the

Louvain method (Blondel et al. 2008). Because the other four community detection methods

yielded the same results, the description that follows applies to all five methods. In Scenario 1,

the community detection methods generate two segments: {W1, W2, W3} and {N4, M5, M6,

M7}, failing to identify the gate product N4 because each product is assigned into only one

segment, like the above hierarchical clustering. However, there is a potential way to detect the

gate product using a node betweenness centrality measure (Freeman 1977) in network analysis

with the assumption that a product (i.e. node) with high betweenness will connect local

segments. We show this potential approach with the richer example in Scenario 2.

Scenario 2 results are shown in Table 2.2. Column 1 shows the “true” segments according to the

simulation. Column 2 shows the TDA segments, and Column 3 shows the community detection

method segments. Here the community detection methods split the sample into two groups,

failing to capture the four distinct segments. This suggests that the community detection methods

which we use segment products too broadly, perhaps because such network approaches use all

the given connection information when they generate clusters. This problem may be solved by

Ringel and Skiera (2016), who extend the Louvain method (Blondel et al. 2008) by (1) adding a

“resolution” parameter and (2) combining a multilevel coarsening and refinement procedure

(Rotta and Noack 2011). However, the new method by Ringel and Skiera (2016) still does not

identify indirect connections because it does not allow a product to be allocated to

multiple segments (i.e. submarkets). Thus, we do not implement the extension of the community

detection methods used by Ringel and Skiera (2016) here because detecting small segments is

not the strength of TDA that we emphasize in this paper. Rather, we explore

whether there is a potential way to identify indirect connections in the framework of network

analysis as a benchmark model.

Table 2.2: Community detection methods in Scenario 2 in the simulation study

Product Type | (1) True | (2) TDA | (3) Community detection | (4) Betweenness centrality | (5) Joint product purchase with national product N8 among 20,000 simulated customers
W1 | 1 | 1 | 1 | 0 | 1463
W2 | 1 | 1 | 1 | 0 | 1433
W3 | 1 | 1 | 1 | 0 | 1445
W4 | 1 | 1 | 1 | 0 | 727
W5 | 3 | 3 | 1 | 0 | 2399
W6 | 3 | 3 | 1 | 0 | 2123
W7 | 3 | 3 | 1 | 0 | 2413
N8 | 3,4 | 3,4 | 2 | 49 | N/A
M9 | 4 | 4 | 2 | 0 | 2538
M10 | 4 | 4 | 2 | 0 | 2203
M11 | 4 | 4 | 2 | 0 | 2473
M12 | 2 | 2 | 2 | 0 | 1491
M13 | 2 | 2 | 2 | 0 | 1493
M14 | 2 | 2 | 2 | 0 | 1499
M15 | 2 | 2 | 2 | 0 | 675

Columns 1 to 3 each show a different method. The numbers in the column represent the assigned segment according

to that method. Therefore the numbers are not related across columns. Only the national product has nonzero

betweenness centrality.

Next, while community detection methods do not directly identify gate products, it is possible to

take the constructed network and identify products with high node betweenness. Column 4

shows that the national product has high betweenness centrality, suggesting that, by adding this

step, the community detection methods can be used to help find the gate product. It is possible to

then look at co-purchasing patterns with this gate product in Column 5 and identify indirectly

connected local products identified through TDA. For example, W7 and M9 are especially likely

to be purchased with N8, correctly suggesting a linkage between them. However, as we

demonstrate in the empirical application below, this approach can be complicated if there are

multiple potential gate products. For example, if Scenario 2 is adapted so that there is another

national product N16, which is co-purchased often with W7 but rarely with M9, then it is not easy

to decide whether W7 and M9 are potentially competing. The difficulty will increase as the

number of national products increases.
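The betweenness step can be illustrated with a minimal sketch of Brandes' algorithm on an unweighted co-purchase graph. The graph is an assumption made for illustration: every pair of products sold in the same market is linked, and N8 is sold in both markets. The igraph computation behind Column 4 may use weights or a different graph, so this is a sketch rather than a reproduction; the function names are hypothetical.

```python
from collections import deque

W = [f"W{i}" for i in range(1, 8)]   # W1..W7, sold only in Wisconsin
M = [f"M{i}" for i in range(9, 16)]  # M9..M15, sold only in Massachusetts


def build_graph():
    """Assumed graph: products are linked when they are sold in the same market."""
    adj = {p: set() for p in W + ["N8"] + M}
    for market in (W, M):
        sold_here = market + ["N8"]
        for p in sold_here:
            adj[p].update(q for q in sold_here if q != p)
    return adj


def betweenness(adj):
    """Brandes' algorithm for node betweenness in an unweighted, undirected graph."""
    score = dict.fromkeys(adj, 0.0)
    for source in adj:
        order, preds = [], {v: [] for v in adj}
        paths = dict.fromkeys(adj, 0)
        paths[source] = 1
        depth = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in depth:
                    depth[w] = depth[v] + 1
                    queue.append(w)
                if depth[w] == depth[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dependency = dict.fromkeys(adj, 0.0)
        while order:  # accumulate dependencies from the farthest nodes back
            w = order.pop()
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    return {v: total / 2 for v, total in score.items()}  # each pair was counted twice


centrality = betweenness(build_graph())
print({p: c for p, c in centrality.items() if c > 0})
```

In this assumed graph only N8 lies on shortest paths between other products: each of the 7 x 7 Wisconsin-Massachusetts pairs is joined solely through N8, which gives a betweenness of 49, the same value as in Column 4 of Table 2.2.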

In summary, TDA finds clear connections between the two local segments through the national

product. In this small product network, more familiar clustering methods can also show such a

link, but with additional effort required through manual checking of distances and values. As the

number of products grows, however, such effort becomes impractical. Thus, we interpret the

results of the simulation to suggest that TDA captures a potentially useful data pattern that is not

captured by hierarchical clustering or community detection methods.

2.4. Marketing application

2.4.1. Data and computation time

The IRI Marketing data set (Bronnenberg, Kruger & Mela 2008) has individual-level consumer

purchase data in two local cities: Eau Claire, Wisconsin, and Pittsfield, Massachusetts. Consumers

in these cities have distinct tastes and there are some differences in product availability.

Therefore, this data set allows us to investigate whether TDA can detect potentially related

products across local markets. We also look for potentially related products by looking across

two categories, salty snacks and beers.

Like most other clustering methods, TDA uses the distance matrix among products as

input data. We calculate Euclidean distance among products across all the consumers’ purchase

quantities during a particular year, 2003. There are 6,352 consumers who meet IRI’s reporting

criteria (Kruger and Pagni 2011, p. 16) across the two cities in salty snacks and 3,101 who

meet the reporting criteria in beer.

As we discussed above, TDA is computationally intensive, with computation time increasing exponentially as the

number of products increases. To explore computational feasibility, we choose the top 10, 20,

30, 40, and 50 products in salty snacks and beer in each local market. Table 2.3 summarizes the

results. For the top 10 case, there are 15 salty snacks and 17 beers across two local markets

because national products are available in two regions. Some products which are sold in two

regions have very low sales quantities in one local market. In this case, we classify it as a local

product. We define a national product as a product that makes up more than 0.5% of category

sales in each market. With 32 products, TDA took just 0.5 seconds. With 119 products (top 40 in

each market, both categories), TDA took 30 minutes. Finally, with 146 products (top 50 in each

market), our computer kept running without generating a TDA result. This demonstrates the

computational limits of TDA without a high performance computer. The 119 products cover

94% of salty snack sales and 89% of beer sales in these two markets. In most of what follows, we

show results on the 32 products (row 1 of Table 2.3) because the smaller number of products

allows for visual comparison of results with hierarchical clustering methods.
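One way to see why the running time grows so quickly is to count how many simplices a Vietoris-Rips complex can contain once all products are connected. The sketch below is a back-of-the-envelope bound, not a measurement of JavaPlex; the cap at tetrahedra (dimension 3, enough to compute Betti2) and the function name are assumptions.

```python
from math import comb


def max_simplices(n_products, max_dim=3):
    """Upper bound on simplices of dimension 0..max_dim in a complex on n products."""
    return sum(comb(n_products, k + 1) for k in range(max_dim + 1))


for n in (32, 62, 90, 119, 146):
    print(n, max_simplices(n))
```

With the dimension cap the bound grows with the fourth power of the number of products; without a cap, a fully connected complex on n products has 2^n - 1 simplices.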

Table 2.3 TDA results by the top N products in each market

S = salty snacks, B = beers, S+B = combined data.

Top N Products in Each Market | No of Products across Two Markets: S | B | S+B | Market Coverage (%)*: S | B | Elapsed Time (seconds): S | B | S+B | No of Segments: S | B | S+B
10 | 15 | 17 | 32 | 69 | 61 | 0.4 | 0.4 | 0.5 | 2 | 10 | 29
20 | 29 | 33 | 62 | 85 | 76 | 1 | 0.7 | 11 | 12 | 66 | 163
30 | 41 | 49 | 90 | 90 | 84 | 4.8 | 2.9 | 97.9 | 35 | 189 | 486
40 | 56 | 63 | 119 | 94 | 89 | 15.5 | 5.4 | 1857.2 | 106 | 289 | 1035
50 | 68 | 78 | 146 | 96 | 92 | Keep running

*Market coverage is based on sales unit.

2.4.2. Potential competitors within a category

Table 2.4a TDA for salty snacks

Betti | Birth | Death | Interval Length | Salty Snacks | Product Type
Betti 1 | 466.8 | 469.8 | 3 | ROLD GOLD, BARREL O FUN, JAYS, TOSTITOS | N, W, W, N
Betti 1 | 649.6 | 694.9 | 45.3 | TOSTITOS, UTZ, ROLD GOLD, FRITOS, WAVY LAYS | N, M, N, N, N

Product type W means Wisconsin product, M means Massachusetts product, and N means national product.

Table 2.4 reports the results. We focus on loopy segments (Betti1 and Betti2) in order to highlight

the distinct results given by TDA. Table 2.4a shows the loopy segments for salty snacks. There

are two loopy segments with hole (𝐵𝑒𝑡𝑡𝑖1) structures. Figure 2.6 visually summarizes the

members of each segment. Two national products, ‘Rold Gold’ and ‘Tostitos’, connect two

neighboring segments. This connection information is useful in identifying products that serve

the same role in different markets. The two national products are competing against (1)

Wisconsin local products Barrel O Fun and Jays in segment 1 and (2) Massachusetts local

product UTZ in segment 2.

Figure 2.6: Potentially competing products across segments using IRI data

From this indirect relationship, a marketing manager learns that those local products have similar

positioning. In other words, if a marketing manager plans to launch the Midwestern (Wisconsin)

local product Barrel O Fun or Jays in Massachusetts, she can predict that it will be likely to

compete against East Coast (Massachusetts) local product UTZ, although those three local

products do not compete in the same market in our data.

We next examine whether these relationships appear using hierarchical clustering and

community detection methods. Figure 2.7 shows the results of hierarchical clustering for the salty

snack products. Massachusetts local product UTZ does not seem to be related to Wisconsin

local products Barrel O Fun and Jays. It is hard to see a connection between them in any of the

four hierarchical clustering methods. These results are driven by the fact that no consumer

purchases both Wisconsin and Massachusetts products. In summary, the standard hierarchical

clustering cannot capture the pattern of indirect connection, unlike TDA. Table 2.5 shows the

results of five different community detection methods. Again, no segment includes a mix of local

brands from the two regions. As in the simulation, it is possible to use betweenness measures to

try to identify connecting products. In this case, all the national products yield similar

betweenness measures, meaning that all nine national products connect all the local products in

the two regions. Then, to find potentially competing local products, one may need to check joint

product purchases with each of the nine national products, as in Column 5 in Table 2.2. As we

discussed in the simulation section, however, it is hard to see which local product in one region

potentially competes against which product in the other region because each local product has a

different co-purchasing pattern with each national product.

Table 2.5: Community detection methods for salty snacks using IRI data

Salty Snacks Brand | Product type | Blondel et al. (2008) | Raghavan, Albert, and Kumara (2007) | Pons and Latapy (2005) | Clauset, Newman, and Moore (2004) | Newman and Girvan (2004)
OLD DUTCH | W | 1 | 1 | 1 | 1 | 1
BARREL O FUN | W | 1 | 1 | 1 | 1 | 1
JAYS | W | 1 | 1 | 1 | 1 | 1
LAYS | N | 2 | 1 | 1 | 2 | 1
PRIVATE LABEL | N | 2 | 1 | 1 | 2 | 1
DORITOS | N | 2 | 1 | 1 | 2 | 1
WAVY LAYS | N | 2 | 1 | 1 | 2 | 1
CHEETOS | N | 2 | 1 | 1 | 2 | 1
PRINGLES | N | 2 | 1 | 1 | 1 | 1
ROLD GOLD | N | 2 | 1 | 1 | 2 | 1
TOSTITOS | N | 2 | 1 | 1 | 2 | 1
FRITOS | N | 2 | 1 | 1 | 2 | 1
SMART FOOD | M | 2 | 1 | 2 | 2 | 1
UTZ | M | 2 | 1 | 2 | 2 | 1
WISE | M | 2 | 1 | 2 | 2 | 1

Each column shows a different method. The numbers in the column represent the assigned segment according to

that method. Therefore the numbers are not related across columns. Product type W means Wisconsin product, M

means Massachusetts product, and N means national product.

Figure 2.7 Hierarchical clustering for salty snacks using IRI data

Next, we analyze the beer category. Table 2.4b shows that TDA generates 10 loopy segments with

8 holes (𝐵𝑒𝑡𝑡𝑖1) and 2 voids (𝐵𝑒𝑡𝑡𝑖2). Here national beer products connect products from the

different local markets, even within a segment. For example, row 8 contains two national brands,

a Massachusetts brand, and two Wisconsin brands (rows 1, 6, 9, and 10 have similar diversity).

Figure 2.8a visualizes the segment in row 8 of Table 2.4b. Two national products Bud Light and

Heineken connect the only Massachusetts brand in the 32 product data set (Michelob Light) with

two Wisconsin brands (Miller Genuine Draft and Miller Genuine Draft Light).

Table 2.4b TDA for beers

No | Birth | Death | Interval Length | Beers | Product Type | Both Locals
Betti 1
1 | 148.6 | 162.6 | 14 | SMIRNOFF TWISTED V, HEINEKEN, LEINENKUGEL, MICHELOB GOLDEN DRAFT LIGHT, MICHELOB LIGHT | N, N, W, W, M | Y
2 | 129.2 | 175.5 | 46.3 | SMIRNOFF TWISTED V, HEINEKEN, MILLER GENUINE DRAFT, CORONA EXTRA, LEINENKUGEL | N, N, W, N, W | N
3 | 169.5 | 175.5 | 6 | SMIRNOFF TWISTED V, CORONA EXTRA, HEINEKEN, MILLER GENUINE DRAFT LIGHT | N, N, N, W | N
4 | 126.3 | 195.4 | 69.1 | SMIRNOFF TWISTED V, HEINEKEN, LEINENKUGEL, COORS LIGHT, MILLER GENUINE DRAFT | N, N, W, N, W | N
5 | 191.5 | 201.4 | 9.9 | MILLER GENUINE DRAFT, CORONA EXTRA, MICHELOB ULTRA, SMIRNOFF ICE | W, N, N, N | N
6 | 144.4 | 225.7 | 81.3 | MILLER GENUINE DRAFT, HEINEKEN, MICHELOB LIGHT, MICHELOB GOLDEN DRAFT LIGHT | W, N, M, W | Y
7 | 213.6 | 246.6 | 33 | SMIRNOFF TWISTED V, HEINEKEN, MILLER LITE, MICHELOB LIGHT | N, N, N, M | N
8 | 181.7 | 286.1 | 104.4 | BUD LIGHT, MICHELOB LIGHT, HEINEKEN, MILLER GENUINE DRAFT LIGHT, MILLER GENUINE DRAFT | N, M, N, W, W | Y
Betti 2
9 | 309.8 | 373.6 | 63.8 | MILLER GENUINE DRAFT, MICHELOB LIGHT, MILLER GENUINE DRAFT LIGHT, MICHELOB ULTRA, HEINEKEN, CORONA EXTRA, SMIRNOFF TWISTED V, BUD LIGHT | W, M, W, N, N, N, N, N | Y
10 | 375.5 | 380.4 | 4.9 | BUDWEISER, MILLER GENUINE DRAFT, OLD MILWAUKEE, HEINEKEN, MICHELOB LIGHT, MICHELOB GOLDEN DRAFT LIGHT | N, W, W, N, M, W | Y

Product type W means Wisconsin product, M means Massachusetts product, and N means national product.

Furthermore, national products also connect local products across segments as described in the

salty snacks category and in the simulation: Heineken, Smirnoff Twisted V, Corona Extra and

other national brands appear in multiple segments. Table 2.4b also highlights a limitation of

looking for loopy segments using TDA: There is some repetition of products across segments.

This means that TDA is a useful starting point for identifying potentially interesting connections

between products, but further analysis is needed to assess the strength and validity of those

connections.

2.4.3. Potentially related products across categories

Next, we combine the salty snack and beer data together (32 total products) to see whether TDA

can find products that might be purchased together, if they were available in the same market.

The rightmost columns in Table 2.3 show that TDA generates many more segments from the

combined data (beer + salty snack) than separate product data. For example, in the top 10

product case, there are 2 salty snack segments, 10 beer segments, and 29 combined (salty snack

and beer) segments.

Table 2.4c shows the segment members from the combined data. Most segments (19 of 29) have

both salty snacks and beer products, providing insight into why the combined data have more

segments than the separate data. Given the underlying data, this makes sense: Even if a customer

always buys the same beer brand and the same salty snack brand, these brands are connected in

the combined data, which provides insight into which categories and products tend to be purchased

by the same customers.

Figure 2.8: Potentially related products across segments using IRI data with order of connection


Table 2.4c TDA for the combined data

Prefixes b and s denote beer and salty snack, respectively. Product types W, M, and N denote Wisconsin, Massachusetts, and national products, respectively.

| Betti | No | Birth | Death | Interval Length | Salty Snack & Beers | Product Type | Both Products | Potentially Complementary |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 237.4 | 259.8 | 22.4 | bSMIRNOFF TWISTED V, bHEINEKEN, bLEINENKUGEL, bMICHELOB GOLDEN DRAFT LIGHT, bMICHELOB LIGHT | bN, bN, bW, bW, bM | N | N |
| 1 | 2 | 206.4 | 280.4 | 74 | bSMIRNOFF TWISTED V, bHEINEKEN, bMILLER GENUINE DRAFT, bCORONA EXTRA, bLEINENKUGEL | bN, bN, bW, bN, bW | N | N |
| 1 | 3 | 270.7 | 280.4 | 9.7 | bSMIRNOFF TWISTED V, bCORONA EXTRA, bHEINEKEN, bMILLER GENUINE DRAFT LIGHT | bN, bN, bN, bW | N | N |
| 1 | 4 | 201.7 | 312.2 | 110.5 | bSMIRNOFF TWISTED V, bHEINEKEN, bLEINENKUGEL, bCOORS LIGHT, bMILLER GENUINE DRAFT | bN, bN, bW, bN, bW | N | N |
| 1 | 5 | 199.6 | 315.2 | 115.6 | bMILLER HIGH LIFE, sUTZ, bMILLER GENUINE DRAFT, bHEINEKEN | bN, sM, bW, bN | Y | Y |
| 1 | 6 | 250.2 | 315.2 | 65 | bMILLER GENUINE DRAFT, sSMART FOOD, bHEINEKEN, bMILLER HIGH LIFE | bW, sM, bN, bN | Y | Y |
| 1 | 7 | 305.8 | 321.6 | 15.8 | bMILLER GENUINE DRAFT, bCORONA EXTRA, bMICHELOB ULTRA, bSMIRNOFF ICE | bW, bN, bN, bN | N | N |
| 1 | 8 | 230.7 | 360.5 | 129.8 | bMILLER GENUINE DRAFT, bHEINEKEN, bMICHELOB LIGHT, bMICHELOB GOLDEN DRAFT LIGHT | bW, bN, bM, bW | N | N |
| 1 | 9 | 356.8 | 365.8 | 9 | bMICHELOB LIGHT, sPRINGLES, bSMIRNOFF TWISTED V, bHEINEKEN | bM, sN, bN, bN | Y | N |
| 1 | 10 | 341.2 | 393.9 | 52.7 | bSMIRNOFF TWISTED V, bHEINEKEN, bMILLER LITE, bMICHELOB LIGHT | bN, bN, bN, bM | N | N |
| 1 | 11 | 306.7 | 428.9 | 122.2 | bSMIRNOFF TWISTED V, bCORONA EXTRA, bMICHELOB LIGHT, sBARREL O FUN, bSMIRNOFF ICE | bN, bN, bM, sW, bN | Y | Y |
| 1 | 12 | 425.1 | 434.9 | 9.8 | bMICHELOB ULTRA, bMILLER GENUINE DRAFT, bSMIRNOFF ICE, sROLD GOLD | bN, bW, bN, sN | Y | N |
| 1 | 13 | 423.2 | 440.8 | 17.6 | bSMIRNOFF ICE, sOLD DUTCH, bMILLER GENUINE DRAFT, sJAYS, sWAVY LAYS | bN, sW, bW, sW, sN | Y | N |
| 1 | 14 | 422.6 | 442.9 | 20.3 | bSMIRNOFF TWISTED V, sJAYS, sWAVY LAYS, bSMIRNOFF ICE | bN, sW, sN, bN | Y | N |
| 1 | 15 | 290.2 | 456.9 | 166.7 | bBUD LIGHT, bMICHELOB LIGHT, bHEINEKEN, bMILLER GENUINE DRAFT LIGHT, bMILLER GENUINE DRAFT | bN, bM, bN, bW, bW | N | N |
| 1 | 16 | 423.7 | 477.7 | 54 | bHEINEKEN, sFRITOS, bMILLER GENUINE DRAFT, bCOORS LIGHT | bN, sN, bW, bN | Y | N |
| 1 | 17 | 430.9 | 511.9 | 81 | bSMIRNOFF TWISTED V, bCORONA EXTRA, bHEINEKEN, sUTZ, bMILLER LITE | bN, bN, bN, sM, bN | Y | N |
| 1 | 18 | 407.6 | 544.4 | 136.8 | bCORONA EXTRA, sCHEETOS, bMICHELOB LIGHT, bHEINEKEN | bN, sN, bM, bN | Y | N |
| 2 | 19 | 470.5 | 474.8 | 4.3 | bSMIRNOFF TWISTED V, bSMIRNOFF ICE, sTOSTITOS, sWAVY LAYS, sOLD DUTCH, sBARREL O FUN, bMILLER GENUINE DRAFT, sJAYS | bN, bN, sN, sN, sW, sW, bW, sW | Y | N |
| 2 | 20 | 440.4 | 477.2 | 36.8 | bSMIRNOFF TWISTED V, bCORONA EXTRA, sBARREL O FUN, bMILLER GENUINE DRAFT, sTOSTITOS, bMICHELOB LIGHT, bHEINEKEN | bN, bN, sW, bW, sN, bM, bN | Y | Y |
| 2 | 21 | 468.2 | 477.2 | 9 | bSMIRNOFF ICE, sROLD GOLD, sBARREL O FUN, bSMIRNOFF TWISTED V, sTOSTITOS, bHEINEKEN, bCORONA EXTRA, bMILLER GENUINE DRAFT | bN, sN, sW, bN, sN, bN, bN, bW | Y | N |
| 2 | 22 | 455.4 | 495.9 | 40.5 | bSMIRNOFF TWISTED V, bLEINENKUGEL, sTOSTITOS, bMICHELOB ULTRA, bSMIRNOFF ICE, sROLD GOLD, bHEINEKEN, bMILLER GENUINE DRAFT, bCORONA EXTRA | bN, bW, sN, bN, bN, sN, bN, bW, bN | Y | N |
| 2 | 23 | 466.8 | 511.9 | 45.1 | bMILLER GENUINE DRAFT, bHEINEKEN, bMILLER HIGH LIFE, bSMIRNOFF TWISTED V, sUTZ, bCORONA EXTRA, bMICHELOB ULTRA | bW, bN, bN, bN, sM, bN, bN | Y | Y |
| 2 | 24 | 413.1 | 534.1 | 121 | bMICHELOB ULTRA, bSMIRNOFF ICE, sFRITOS, bSMIRNOFF TWISTED V, bHEINEKEN, bCORONA EXTRA | bN, bN, sN, bN, bN, bN | Y | N |
| 2 | 25 | 494.8 | 596.8 | 102 | bMILLER GENUINE DRAFT, bMICHELOB LIGHT, bMILLER GENUINE DRAFT LIGHT, bMICHELOB ULTRA, bHEINEKEN, bCORONA EXTRA, bSMIRNOFF TWISTED V, bBUD LIGHT | bW, bM, bW, bN, bN, bN, bN, bN | N | N |
| 2 | 26 | 556.6 | 597.1 | 40.5 | bMILLER GENUINE DRAFT, bMICHELOB LIGHT, bMILLER HIGH LIFE, bMICHELOB ULTRA, bHEINEKEN, bCORONA EXTRA, bSMIRNOFF TWISTED V, sSMART FOOD | bW, bM, bN, bN, bN, bN, bN, sM | Y | Y |
| 2 | 27 | 599.7 | 607.7 | 8 | bBUDWEISER, bMILLER GENUINE DRAFT, bOLD MILWAUKEE, bHEINEKEN, bMICHELOB LIGHT, bMICHELOB GOLDEN DRAFT LIGHT | bN, bW, bW, bN, bM, bW | N | N |
| 2 | 28 | 613 | 634.8 | 21.8 | bMILLER GENUINE DRAFT, sROLD GOLD, sOLD DUTCH, bSMIRNOFF TWISTED V, bOLD MILWAUKEE, bMICHELOB ULTRA, bHEINEKEN | bW, sN, sW, bN, bW, bN, bN | Y | N |
| 2 | 29 | 652 | 654.2 | 2.2 | bHEINEKEN, bCORONA EXTRA, sPRINGLES, sWAVY LAYS, bMILLER GENUINE DRAFT, sSMART FOOD | bN, bN, sN, sN, bW, sM | Y | Y |


We also find potentially related products across categories in seven segments, as marked in the

rightmost column in Table 2.4c. Figure 2.8 visualizes the segment in row 5. Massachusetts salty

snack UTZ and Wisconsin beer Miller Genuine Draft are in the same segment. Once again, this

is mainly due to their connection with national products. TDA provides the order of connection:

(1) UTZ + Miller High Life, (2) Miller Genuine Draft + Heineken, (3) Miller High Life +

Heineken, and (4) UTZ + Miller Genuine Draft. Each local product connects with a national

product first and then the Massachusetts salty snack and Wisconsin beer get connected. This

suggests that the purchase behavior of people who buy UTZ in Massachusetts is similar to the

purchase behavior of people who buy Miller Genuine Draft in Wisconsin. This information could

be used to inform product launches across markets. Alternatively, it might help generate

advertising ideas; for example, UTZ ads could borrow elements from a successful Miller Genuine

Draft campaign.

2.4.4. Relationship between a segment’s birth and its product diversity

In the above analysis, we focused on a relatively small number of products in order to facilitate

comparison with hierarchical clustering and to ease the communication of the content of the

various segments. When more products are included, TDA can generate more loopy segments. In

this section, we explore how TDA measures of birth filtration value help identify the interesting

segments. To do so, we now use the 119 total products (top 40 in each category in each market)

that make up 94% of salty snacks sales and 89% of beer sales.

Birth filtration value is a useful metric because it measures how unusual a particular grouping is

likely to be. TDA groups products that are close to each other first. Segments that emerge late

are more likely to leverage the distinct insights that the topological approach offers. In particular,

we are interested in detecting loopy segments that connect regionally distinct local products

through national products. Because the connections are indirect, those loopy segments tend to

form later.

We next correlate birth filtration value with the diversity of product members within a segment.

We focus on diversity because, as argued above, a key use of TDA is to identify connections that


other methods would not. In this paper we have emphasized separate local markets. We measure

diversity as follows. We first order all the products in the same local market by quantity. We

assign each a rank based on this ordering, and take the difference of the rank across the two local

markets. This difference is positive if the product is a Massachusetts product, negative if

Wisconsin, and close to zero if national. Finally, we calculate the standard deviation of the rank

gaps within a segment.

This gives a sense of the variation of the location of sales for the products in the segment: If the

segment has a mix of strongly Wisconsin and strongly Massachusetts products, this diversity

measure will be high. If the segment is mostly national products (or mostly from just one region),

the diversity measure will be low. If the segment contains both national products and products

from one region, the diversity measure will be in the middle.
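The diversity measure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the product names and quantities are made up, the rank gap is taken as the Wisconsin rank minus the Massachusetts rank (one convention that makes Massachusetts products positive), ties are broken by input order, and the sample standard deviation is used because the text does not say which variant was applied.

```python
from statistics import stdev


def rank_by_quantity(quantities):
    """Rank products within one market: 1 = largest quantity sold."""
    ordered = sorted(quantities, key=quantities.get, reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


def segment_diversity(segment, qty_ma, qty_wi):
    """Standard deviation of the rank gaps of the products in one segment."""
    rank_ma = rank_by_quantity(qty_ma)
    rank_wi = rank_by_quantity(qty_wi)
    # Assumed sign convention: Wisconsin rank minus Massachusetts rank, so a
    # product that sells well only in Massachusetts gets a positive gap.
    gaps = [rank_wi[name] - rank_ma[name] for name in segment]
    return stdev(gaps)  # sample standard deviation (an assumption)


# Hypothetical quantities for four illustrative products in each market.
qty_ma = {"ma_local": 90, "national_1": 80, "national_2": 70, "wi_local": 1}
qty_wi = {"wi_local": 95, "national_1": 85, "national_2": 60, "ma_local": 2}

mixed = segment_diversity(["ma_local", "wi_local", "national_1"], qty_ma, qty_wi)
one_region = segment_diversity(["ma_local", "national_1", "national_2"], qty_ma, qty_wi)
national = segment_diversity(["national_1", "national_2"], qty_ma, qty_wi)
print(mixed, one_region, national)
```

With these made-up numbers, the segment mixing both local products scores highest, the segment with national products plus one local product falls in the middle, and the all-national segment scores zero, which matches the ordering described in the text.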

Table 2.6 The relationship between a segment’s “birth” filtration value and its product diversity

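| Category | Sample | Betti 1 Effect | Betti 1 No. of Segments | Betti 2 Effect | Betti 2 No. of Segments |
|---|---|---|---|---|---|
| Salty Snack | All | 0.039 (0.027) | 56 | 0.065** (0.020) | 50 |
| Salty Snack | Cut at 3 | 0.039 (0.027) | 55 | 0.060*** (0.019) | 41 |
| Beer | All | 0.12*** (0.043) | 127 | 0.022** (0.01) | 162 |
| Beer | Cut at 3 | 0.12*** (0.042) | 124 | 0.022** (0.01) | 153 |
| Combined | All | 0.065*** (0.019) | 370 | 0.068*** (0.009) | 665 |
| Combined | Cut at 3 | 0.061*** (0.018) | 364 | 0.063*** (0.009) | 631 |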

Each number is the coefficient on product diversity from a regression of birth on product diversity. The

number of observations is the number of segments. ***p < 0.01; ** p< 0.05; *p<0.10

Table 2.6 shows the relationship between a segment’s birth filtration value and its product

diversity. We run separate regressions of filtration value on diversity for Betti 1 and Betti 2

groupings. We also show results that drop short-lived segments, which may occur due to noise in


data (Lesnick 2013). From visual inspection, we chose 3 as the cut-off value for eliminating

segments. Thus, we show twelve regressions: three product cases, two Betti groupings, and

with/without the short-lived segments. The coefficients are all positive and 10 of 12 are

significant, implying that the segments that are formed late are more likely to have mixed local

products across the two cities (i.e. product diversity), as expected. The two non-significant slopes

are for salty snacks Betti 1, which has fewer segments than beer or the combined analysis,

suggesting that this might be an issue with statistical power.
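As a rough illustration of one cell of Table 2.6, the sketch below regresses birth filtration value on product diversity by ordinary least squares, once for all segments and once after dropping short-lived ones. The segment values are invented, and the rule "keep segments whose interval length is at least 3" is an assumption about how the cut-off was applied; the function names are hypothetical.

```python
from math import sqrt
from statistics import mean


def ols_slope(x, y):
    """Slope and its standard error for the OLS fit y = a + b * x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = sqrt(rss / (n - 2) / sxx)
    return slope, std_err


def birth_on_diversity(segments, min_length=None):
    """Regress birth on diversity, optionally dropping short-lived segments."""
    if min_length is not None:
        # Assumed rule: keep segments with death - birth >= min_length.
        segments = [s for s in segments if s[1] - s[0] >= min_length]
    births = [s[0] for s in segments]
    diversity = [s[2] for s in segments]
    return ols_slope(diversity, births)


# Hypothetical segments: (birth, death, diversity).
segments = [
    (10.0, 30.0, 1.0),
    (12.0, 13.0, 5.0),  # short-lived: interval length 1
    (20.0, 45.0, 2.0),
    (30.0, 50.0, 3.0),
    (40.0, 80.0, 4.0),
]

slope_all, se_all = birth_on_diversity(segments)
slope_cut, se_cut = birth_on_diversity(segments, min_length=3)
print(slope_all, se_all, slope_cut, se_cut)
```

A positive slope means that segments with more diverse membership tend to be born at higher filtration values, which is the pattern the table reports.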

In summary, we show that TDA can detect high diversity segments that include local products in

regionally distinct markets, particularly as the filtration value increases. If there are many high

birth filtration value segments, a final step in identifying the potentially most interesting

segments is to look for those with longer filtration range as longer intervals suggest more robust

segments.

2.5. Conclusions

In this paper, we have applied Topological Data Analysis to a particular marketing application.

We have shown that TDA is effective at identifying connections between products that are not

purchased together but hold similar positioning in geographically distinct markets.

A key open question is whether the assumption of transitive preferences holds across settings. In

particular, our framework assumes that two objects that have no direct relationship with each

other, but are bought with a third object, are indirectly related. We have not directly tested this

assumption because we do not have data on several cross-market product launches and data on

pre-launch sales across locations. Furthermore, it is worth exploring whether the assumption holds

across market types. For example, it might hold in our setting for consumer non-durables, but it

might not hold for durables or in business-to-business markets. Relatedly, while the IRI data are

ideal in the sense that they have rich customer-level data in two distinct markets, the products do

not have sufficiently rich attribute information to check that they serve similar roles by clustering

products on attributes. In other words, we have shown the potential usefulness of TDA but leave

a field test for future work.


Generally, TDA is a new data mining tool and we anticipate other marketing applications. We

anticipate that those applications will primarily identify opportunities and complement

other types of analysis. In this way, TDA should not be seen as a final step for segment analysis,

but as a useful part of a more comprehensive analysis. It is exploratory and, as with all

segmentation methods, it does not support a causal interpretation. Nevertheless, we

believe Topological Data Analysis should be seen as a useful tool in market structure and

segmentation analysis.


Chapter 3

Does comparative advertising reposition rival brands closer

together or further apart in market-structure maps in search

stage?

3.1 Introduction

Brands with a lower market share often run comparative advertising campaigns against market

leaders. One of the most famous examples is Apple’s “I’m a Mac/I’m a PC” campaign, targeting

Microsoft, which ran from 2006 to 2009. More recently, when Samsung was ranked fifth in the

smartphone market in 2011, it started a comparative advertising campaign targeting the market

leader, the Apple iPhone. There have been many more recent examples of comparative

advertising between rival brands: Domino’s Pizza vs. Subway; Dunkin’ Donuts and Caribou

Coffee vs. Starbucks; Oscar Mayer vs. Ball Park; Verizon vs. AT&T; and General Motors’

Chevy Silverado and Chrysler Ram vs. the Ford F-150.

How does comparative advertising affect the positions of rival brands in a market-structure map

at the brand search stage? Given that consumers gain information about both brands in comparative

advertising, they may do less brand co-searching. However, research suggests that the general

effect of comparative advertising is associative rather than differentiating (Johnson and Horne,

1988; Miniard, Rose, Barone, and Manning, 1993). These previous studies were completed

mainly through laboratory experiments. Potential reasons for the findings are as follows.

Comparative advertising is effective as it directly contrasts product attributes, drawing attention

to and promoting recall of both the advertising and the target (rival) brands (Muehling, Stoltman,

and Grossbart, 1990). Moreover, due to their limited cognitive capacity, consumers consider only

a few brands when making purchases, implying that some consumers may substitute non-

advertised brands with advertised brands in their consideration set. As a result, comparative

advertising is likely to decrease the distance between the advertising brand and the target brand

in the consumer mind-set.


In this study, we test the effect of comparative advertising on brand position using consumer

behavior in the marketplace, unlike previous studies conducted in a laboratory. Recent studies show

that television commercials result in an increase in consumer searches (Joo, Wilbur, Cowgill and

Zhu, 2015; Liaukonyte, Teixeira and Wilbur, 2015; Hu, Du, and Damangir, 2014; Du, Hu, and

Damangir, 2015). Thus, using publicly available aggregate consumer search data from Google

Trends, we measure how distances between brands change with comparative advertising.

We draw two types of market-structure maps to locate brands in (1) pure brand space and (2)

product space. In the first approach, we measure distances between brands directly. For the first

brand space map, we measure weekly co-occurrence of brand searches (e.g., Samsung and

Apple, HTC and Apple) for all the major brand pairs at the U.S. national level. Then, we draw a

map using multidimensional scaling. The more two brands are searched together, the closer they

are positioned in a map.

In the second approach, we measure brand distances indirectly using the relationship between

brand and attributes. For the second product space map, we count weekly brand–product

attribute co-occurrences (e.g., Samsung screen, Apple screen) for all the pairs between brands

and attributes at the U.S. national level. Then, we use correspondence analysis, in which each brand

is located closest to the attribute most co-searched with it. If two brands share the same most co-searched

attribute, they are likely to be located close to each other in the product space map. For example, if

Samsung and Apple are both often searched together with “screen,” both brands are likely to be located

close to “screen.” Therefore, this second approach using product space allows us to

investigate whether two brands reposition closely due to common product attributes mentioned

in comparative advertising.

Once we have a map, we can measure time-varying brand distance easily. Using the brand

distances derived from U.S. nationwide weekly co-search volume, we test whether a target brand

moves closer to an advertising brand than to non-advertised brands. We apply this

difference-in-differences (diff-in-diff) analysis to both approaches at the U.S. national level.

By analyzing Samsung’s nationwide comparative TV advertising campaign against market

leader Apple iPhone in the U.S., we find empirical evidence that supports comparative


advertising’s associative effect on brand position. First, during the TV commercial, co-search

volume on Samsung and Apple rises sharply, while co-search volume on Apple and other non-

advertised brands (i.e., Blackberry and HTC) increases slightly. Second, reflecting the increase

in co-searches on the advertised rival brands, a brand space map approach shows that Apple

moves closer to Samsung than it does to the other brands during the campaign. Third, in the first

week of the ad campaign, co-search volume of each brand and the advertised product attributes

such as screen, videos, and movies increases, but not much during the rest of the campaign.

Fourth, as a result, the relative distance between Apple and Samsung does not decrease

significantly in a product space map during the campaign. These results suggest that comparative

advertising moves the two rival brands closer together mainly through direct co-searches of both brands,

and much less through advertised product attributes.

Our contributions are as follows. First, we test the effect of comparative advertising on brand

position in market-structure maps; our study is the first to use consumer search data, and we

show that the theory holds in a marketplace. Second, we add empirical evidence to the rapidly

growing literature on advertising content. Last but not least, we propose one practical way to

monitor real-time marketing effectiveness using publicly available search data from Google

Trends.

The rest of this chapter is organized as follows. In §3.2.1, we discuss why mobile phones are

relevant in studying the effect of advertising on consumer searches. Then, in §3.2.2, we

introduce our focal comparative advertising campaign. In §3.3.1, we explain our empirical

strategy, and then show the results using the direct approach in brand space in §3.3.2. Then, in

§3.3.3, we investigate one potential mechanism that moves the rival brands closer together using

the indirect approach in product space. We conclude in §3.4.

3.2. Data

3.2.1. Why a mobile phone?

We aim to evaluate whether comparative advertising moves rival brands closer together

or further apart, based on consumer behavior in a marketplace. In the consideration stage of the

purchase funnel, consumers begin to search actively for product information. Thus, we use


consumer searches as the measure of consumer behavior in this study. Recent studies also show

that TV advertising increases consumer search volume (Joo, Wilbur, Cowgill, and Zhu 2015;

Liaukonyte, Teixeira, and Wilbur 2015; Hu, Du, and Damangir 2014; Du, Hu, and Damangir

2015). For this purpose, we seek a search-intensive good.

While most durable goods could be used in this study, we decided to focus on mobile

phones to ensure adequate search volume. As mobile phones are a personal durable good, almost

everyone has one. Thus, many consumers search for a mobile phone. Furthermore, their life

cycle is short, which means that consumers are often in the market for a new phone and search

online for one.

Figure 3.1 Search volume trend by brand

Notes: This picture was captured from Google Trends. Labels in the figure: Apple; iPhone Launch.

Table 3.1 Google Trends queries used for extracting brand search trends

| Brand | Query |
|---|---|
| Apple | Apple + iPhone |
| Blackberry | Blackberry |
| HTC | HTC + Evo |
| Samsung | Samsung + Galaxy |

Notes: A single-term query will match all the searches containing that term. Plus signs between terms/phrases cause the query to match searches with either of the terms/phrases.


Before we conduct our main analysis, we check whether search volumes from Google Trends

reflect the mobile phone market well. Figure 3.1 shows search volume trends by brand. As

shown in Table 3.1, we count the smartphone brand as well as its family brand (e.g., iPhone +

Apple). One can see that search volumes for Apple or iPhone have stayed high since the iPhone

launched in 2007. There are several spikes, which seem to correspond with new product

launches. Next, we conduct a competition analysis using brand co-search volumes for all the brand

pairs. If two brands have both a family and a smartphone brand, there are 4 brand pairs (e.g.,

Apple Samsung + Apple Galaxy + iPhone Samsung + iPhone Galaxy; see Table 3.2 for others).

Figure 3.2 shows that Apple’s top rival brand has changed from Blackberry to HTC and

Samsung based on brand co-searches. Interestingly, major spikes occurred in the co-searches

between a market leader and competing brands.

Figure 3.2 Apple’s top rival brands trend

Notes: This picture was captured from Google Trends. Series labeled in the figure: Apple vs Blackberry, Apple vs HTC, Apple vs Samsung.


Table 3.2 Google trend queries used for extracting brands co-search trend with Apple

| Brand Pairs | Query |
|---|---|
| Blackberry and Apple | Blackberry Apple + Blackberry iPhone |
| HTC and Apple | HTC Apple + HTC iPhone + Evo Apple + Evo iPhone |
| Samsung and Apple | Samsung Apple + Samsung iPhone + Galaxy Apple + Galaxy iPhone |

Notes: terms separated by a space will match searches with all the terms. Plus signs between terms/phrases cause the query to match searches with either of the terms/phrases.
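The co-search queries in Table 3.2 follow a mechanical pattern: every term of one brand is paired with every term of the other, and the pairs are joined with plus signs. A minimal sketch of that pattern follows; the function name and the dictionary are hypothetical helpers for illustration, not part of any Google Trends API, and the term lists are taken from Table 3.1.

```python
from itertools import product

# Search terms per brand, as listed in Table 3.1.
BRAND_TERMS = {
    "Apple": ["Apple", "iPhone"],
    "Blackberry": ["Blackberry"],
    "HTC": ["HTC", "Evo"],
    "Samsung": ["Samsung", "Galaxy"],
}


def cosearch_query(brand, other):
    """Build a query string matching searches that mention both brands."""
    # A space means "all terms"; a plus sign means "either phrase".
    pairs = [f"{a} {b}" for a, b in product(BRAND_TERMS[brand], BRAND_TERMS[other])]
    return " + ".join(pairs)


for rival in ("Blackberry", "HTC", "Samsung"):
    print(rival, "and Apple:", cosearch_query(rival, "Apple"))
```

The resulting strings are meant to be pasted into the Google Trends search box; the sketch does not retrieve any data itself.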

3.2.2. Comparative advertising campaign

We analyze Samsung’s comparative advertising campaign against the market leader,

Apple iPhone. Based on quantity of smartphone sales from March to May in 2011, Apple and

Samsung are ranked first and fifth, respectively (see Table 3.3). To catch up with Apple,

Samsung invested heavily in developing new products and advertising. As part of these efforts,

Samsung aired its first “The Next Big Thing is already here” TV commercial on November 24,

2011, Thanksgiving Day. The commercial emphasized the superiority of the Galaxy S2, which had

launched in September 2011, touting better product attributes (e.g., a bigger screen, faster speed,

and longer battery life) than the iPhone 4S, unveiled in October 2011. By mocking Apple

fanboys and fangirls, who always wait for a long time in long lines, the Samsung

campaign became one of the most viral TV commercials.

Table 3.3 U.S. smartphone market share from March to May 2011

| Brand | Market Share (%) |
|---|---|
| Apple | 26.6 |
| Blackberry | 24.7 |
| HTC | 11.8 |
| Motorola | 11.4 |
| Samsung | 8.9 |
| Others | 16.6 |

Source: comScore MobiLens

After receiving a strong response from the campaign, Samsung ran “The Next Big Thing”

commercial series for several years, featuring their new smartphones or tablets. In this study, we

focus on the first campaign, which was run for about 5 weeks from November 24 to December

26, 2011. We also analyze the post-campaign period to see whether the effect lasted after the TV

commercial was no longer airing. As Samsung ran another comparative advertising campaign


against Apple four weeks after the focal campaign, the three-week span between the two

campaigns is our post-campaign period.

Samsung made two types of content in the first campaign, which we analyze in this study. The

content of the full 60-second version of the first commercial includes the Galaxy S2’s big screen

and fast 4G network, and iPhone’s short battery life. Samsung also aired five 30-second versions:

one for both screen and battery, three for only screen, and one for only 4G. To emphasize its big

screen, in the ad a woman watches video on her phone. The second ad campaign features cloud

service for storing music and movies; however, it had only one 30-second version and ran for

only three weeks. Overall, during the first “Next Big Thing” campaign, Samsung emphasized its

big screen for watching movies or videos the most in terms of the amount of time aired.

3.3. Empirical strategy and results

3.3.1. Empirical strategy

In this study, we test whether comparative advertising repositions rival brands closer together

based on changes in consumer search volumes. In order to measure brand distances using

aggregate search volume, we use two types of market-structure maps based on consumers’

search strategies. Then, using the brand distances obtained from each map, we do diff-in-diff

analysis to test whether a market leader moves closer to an advertising brand compared to other

non-advertised brands during and after the campaign.

Consumers may have different search strategies. To compare alternative brands, consumers may

search two brands simultaneously or one brand each time, sequentially. For simultaneous brand

searches, the researcher can easily identify which brands are considered together. However, that

is not the case with the sequential brand search. If researchers cannot access a consumer’s search

history, a single brand search provides very limited information in identifying relationships

between brands. Unfortunately, Google Trends provides only aggregate search volume trends

rather than consumer-level search data.

For the sequential brand search, we exploit mixed searches between brands and their own

product attributes. As an example, let’s suppose that some consumers want to know which screen

size on a smartphone is optimal for both portability and watching movies or videos. These


consumers may search “Apple screen” and “Samsung screen” sequentially. Although these

consumers did not search Apple and Samsung together, they are likely to consider both brands.

Through the common attribute “screen,” researchers can get some information about the

relationship between the two brands.

Considering the above search strategies, we draw two types of market-structure maps to locate

brands (1) in pure brand space and (2) in product space. In the first approach, we measure

distances between brands directly from simultaneous brand searches. For the first brand space

map, we count the co-occurrence of searches for brands (e.g., Samsung Apple; see Table 3.2 for

queries used) weekly for all the brand pairs at the U.S. national level. This becomes a similarity

matrix, where each cell includes the number of co-searches between two brands. A high value

means that two brands are often searched together. To draw a map using multidimensional

scaling, one needs a distance matrix. Thus, we invert the number in each cell of the similarity

matrix. As more consumers consider two brands together for their purchase, the number of

simultaneous searches on the two brands increases, and thus the two brands are located closer together

in the consumers’ brand consideration space.
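The conversion from co-search similarity to distance can be sketched as follows. The sketch assumes that inverting a cell means taking its reciprocal, uses invented co-search volumes for a single week, and sets the diagonal to zero; the resulting symmetric matrix is the kind of input a multidimensional scaling routine would take. The function name is hypothetical.

```python
from itertools import combinations


def cosearch_to_distance(brands, cosearch):
    """Turn pairwise co-search volumes into a symmetric distance matrix."""
    dist = {a: {b: 0.0 for b in brands} for a in brands}
    for a, b in combinations(brands, 2):
        volume = cosearch[frozenset((a, b))]
        if volume <= 0:
            raise ValueError(f"no co-searches for {a} and {b}")
        # Assumed conversion: distance is the reciprocal of the volume,
        # so brands searched together more often end up closer.
        dist[a][b] = dist[b][a] = 1.0 / volume
    return dist


brands = ["Apple", "Blackberry", "HTC", "Samsung"]

# Hypothetical co-search volumes for one week.
cosearch = {
    frozenset(("Apple", "Samsung")): 50,
    frozenset(("Apple", "HTC")): 20,
    frozenset(("Apple", "Blackberry")): 10,
    frozenset(("HTC", "Samsung")): 8,
    frozenset(("Blackberry", "Samsung")): 5,
    frozenset(("Blackberry", "HTC")): 4,
}

dist = cosearch_to_distance(brands, cosearch)
print(dist["Apple"]["Samsung"], dist["Apple"]["HTC"])
```

Repeating this for every week gives one distance matrix per week, from which time-varying brand distances can be read off after scaling.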

Table 3.4 Google trend queries used for extracting brand-product attribute trend for Apple

| Product Attribute | Query |
|---|---|
| App | App Apple + Apple Apps + iPhone App + iPhone Apps |
| Screen | Screen Apple + iPhone Screen |
| 4G | 4G Apple + iPhone 4G |
| Videos | Videos Apple + Apple Video + iPhone Videos + iPhone Video |
| Voice | Voice Apple + Apple Siri + iPhone Voice + iPhone Siri |
| Text | Text Apple + iPhone Text |
| Pictures | Pictures Apple + Apple Picture + iPhone Pictures + iPhone Picture |
| Data | Data Apple + iPhone Data |
| Music | Music Apple + iPhone Music |
| Internet | Internet Apple + iPhone Internet |
| Battery | Battery Apple + iPhone Battery |
| Camera | Camera Apple + iPhone Camera |
| Map | Map Apple + Apple Maps + iPhone Map + iPhone Maps |
| 3D | 3D Apple + iPhone 3D |
| Movies | Movies Apple + Apple Movie + iPhone Movies + iPhone Movie |
| Cloud | Cloud Apple + Apple iCloud + iPhone Cloud + iPhone iCloud |

Notes: terms separated by a space will match searches with all the terms. Plus signs between terms/phrases cause the query to match searches with either of the terms/phrases.


In the second approach, we measure brand distances indirectly by exploiting mixed searches

between brands and product attributes. For the second product space map, we count searches

including both a brand and a product attribute (e.g., Samsung screen; see Table 3.4 for queries

used for Apple) weekly for all the pairs between brands and attributes at the U.S. national level.

This yields a contingency table with brands as columns and product attributes as rows. Then, we use

correspondence analysis, proposed by Hirschfeld (1935) and later developed by Benzécri (1973).

In the resulting map, each brand is located closest to the attribute most co-searched with it. If two

brands share the same most co-searched attribute, those two brands are

likely to be located closer to each other than to any other brand in the product space map.

Once a map is drawn, one can easily measure time-varying brand distances. Using the brand

distances derived from the U.S. nationwide weekly co-search volumes, we test whether

comparative advertising repositions both rival brands closer together. However, there is one

challenge, in that comparative advertising may reposition not only its focal brands, but also other

non-advertised brands in a map. Given that display advertising increases searches for even

competing brands that are not mentioned in the advertising (Lewis and Nguyen 2015), TV

advertising also may increase co-searches between advertised and non-advertised brands. If this

is the case, checking a distance change only between an advertising and a target (rival) brand

would not be enough to test our research question. To address such co-movement issues, we do

diff-in-diff analysis in our study. Diff-in-diff produces a natural measurement of the relative

distance change between the advertised rival brands and other non-advertised brands.

Another issue is to decide on a proper counterfactual distance as a control group. There are two

candidates: (1) distances between an advertising brand and non-advertised ones and (2) distances

between a target brand (a market leader in this study) and non-advertised ones. While either way

would be acceptable in other studies, we think that the second candidate is more appropriate in

our study due to the position of a target brand. As the market leader tends to be searched together

with other brands or product attributes more than any other brands, it is likely to locate in the

center of a market-structure map, which means the position of a market leader can serve as a

base point in measuring distance with other brands. Here is an example. Suppose that the

market leader is located between an advertising brand and a non-advertised brand. Further, assume

that comparative advertising affects both the advertising brand and the market leader, but not the

non-advertised brand. During a campaign, when the advertising brand moves closer to the market leader, the

distance between an advertising and a non-advertised brand also decreases while the distance

between a market leader and a non-advertised brand does not change. This example favors the

second candidate as a proper control group.

Considering the above argument, we test whether a target brand locates closer to an advertising

brand compared to non-advertised brands during and after the campaign. Our identification

assumption is that no other factors affect the change in relative distance between a market leader

and an advertising brand compared to other brands except the comparative advertising campaign.

In our focal TV comparative advertising campaign, while Apple and Samsung are the target and

the advertising brands, respectively, Blackberry and HTC are non-advertised brands. Our

treatment is the distance between Apple and Samsung. Our control groups are the distances

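between (1) Apple and Blackberry and (2) Apple and HTC. The distance of brand pair $i$ at week $t$ is

$$D_{it} = \beta_1\,\text{Apple\_Samsung} \times \text{During Campaign}_t + \beta_2\,\text{Apple\_Samsung} \times \text{After Campaign}_t + \mu_i + \tau_t + \varepsilon_{it} \qquad (1)$$

where Apple_Samsung = 1 for the Apple and Samsung brand pair and 0 otherwise, and $\mu_i$ and $\tau_t$ denote brand-pair and week fixed effects.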
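As an illustration of the data layout behind Equation (1), the sketch below builds one row per brand pair and week with the two interaction terms. The week numbering is an assumption: it supposes a 23-week window in which the campaign runs from week 16 to week 20 and the remaining weeks 21 to 23 count as "after". The variable and function names are hypothetical.

```python
PAIRS = ["Apple_Samsung", "Apple_Blackberry", "Apple_HTC"]
WEEKS = range(1, 24)            # assumed 23 weekly observations per brand pair
CAMPAIGN_WEEKS = range(16, 21)  # assumed campaign window: weeks 16 to 20


def build_design_rows():
    """Return one dict per (brand pair, week) with the interaction regressors."""
    rows = []
    for pair in PAIRS:
        for week in WEEKS:
            is_treated = int(pair == "Apple_Samsung")
            during = int(week in CAMPAIGN_WEEKS)
            after = int(week > CAMPAIGN_WEEKS[-1])
            rows.append({
                "pair": pair,    # absorbed by brand-pair dummies
                "week": week,    # absorbed by week dummies
                "treated_x_during": is_treated * during,
                "treated_x_after": is_treated * after,
            })
    return rows


ROWS = build_design_rows()
```

Under these assumptions the layout has 3 x 23 = 69 rows, and only the Apple and Samsung pair has non-zero interaction terms.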

3.3.2. Main results using direct approach in brand space

Figure 3.3 shows the co-search trend for brands. Co-searches between the market leader Apple

and the advertiser Samsung jump in the first week of the campaign, and then decrease. As

Christmas season approaches, co-searches increase again. Co-searches between Apple and the

other non-advertised brands (i.e., Blackberry and HTC) also show a similar trend, except that the

increase of co-searches in the first week of a TV commercial is much smaller than that of the

advertised rival brands.

Figure 3.3 Co-search trend by a brand pair with Apple

Now, we do diff-in-diff analysis using Equation (1) with co-search volumes as outcomes.

Column (1) in Table 3.5 shows that the coefficient for effectiveness during the campaign is

positive and significant, suggesting that the increase in co-searching between Apple and

Samsung is significantly higher than that between Apple and the other non-advertised brands.

The coefficient for effectiveness after the campaign is also positive although it is not significant,

suggesting that the increased co-search gap lasts somewhat even after the TV commercial is no

longer airing.

The above pattern in the co-searching trend is reflected in a market-structure map in Figure 3.4.

We use classical metric multidimensional scaling, also known as principal coordinates analysis

(Gower, 1966), with the cmdscale function in R’s stats package. With comparative advertising, the

target brand Apple seems to be closer to the advertising brand Samsung than to the non-

advertised brands (i.e., Blackberry and HTC).
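For readers unfamiliar with classical multidimensional scaling, the following is a minimal pure-Python sketch of what the technique computes: it double-centres the squared distances and extracts the leading eigenvectors by power iteration. It is not the cmdscale implementation, it assumes the leading eigenvalues are positive (as they are for Euclidean distances), and the function names and example distances are hypothetical.

```python
from math import dist, sqrt


def classical_mds(distances, dims=2, iterations=500):
    """Return coordinates whose Euclidean distances approximate `distances`."""
    n = len(distances)
    squared = [[distances[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in squared]
    grand_mean = sum(row_mean) / n
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (squared[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        # Power iteration for the current leading eigenpair of B.
        vec = [float(i + 1) for i in range(n)]
        eigenvalue = 0.0
        for _ in range(iterations):
            prod = [sum(b[i][j] * vec[j] for j in range(n)) for i in range(n)]
            norm = sqrt(sum(x * x for x in prod))
            if norm < 1e-12:  # nothing left to extract in this dimension
                eigenvalue = 0.0
                break
            vec = [x / norm for x in prod]
            eigenvalue = norm
        for i in range(n):
            coords[i][d] = sqrt(eigenvalue) * vec[i]
        # Deflate B so that the next pass finds the next eigenpair.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * vec[i] * vec[j]
    return coords


def map_distance(coords, i, j):
    """Euclidean distance between two points of the resulting map."""
    return dist(coords[i], coords[j])


# Hypothetical dissimilarities among three brands (a 3-4-5 triangle).
BRAND_MAP = classical_mds([[0, 3, 4], [3, 0, 5], [4, 5, 0]])
```

Because the example dissimilarities are exactly Euclidean, the distances measured on the resulting two-dimensional map reproduce the inputs; with real co-search data the map is only an approximation.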

[Figure annotations: Comparative Ad. Campaign; Samsung Galaxy S2 in U.S.; Apple iPhone 4S]

Figure 3.4 Market-structure map in brand space (panels: Week -2, Week +2)

Figure 3.5 Samsung moves closer to Apple during campaign in brand space.

Figure 3.5 shows the trend of brand distances measured from the brand space maps in Figure 3.4.

Overall, Samsung is closer to Apple than the other two brands are. Around Apple’s iPhone 4S

launch, the distances between Apple and all the other three brands decrease and then increase.

During Samsung’s comparative advertising, Samsung moves closer to Apple and then the

distance increases gradually, which is consistent with the decreasing marginal effect of

advertising exposure. In contrast, Blackberry and HTC move away from Apple starting in the

second week of the campaign.

Table 3.5 Apple becomes closer to Samsung than the other brands during and after the campaign in brand space.

                                    Co-search         Direct Approach in Brand Space
                                    between Brands    Base                  Before & After Nexus  - Nexus
                                    (1)               (2)                   (3)                   (4)
Samsung_Apple x During (5 weeks)    8.83** (3.77)     -0.0250*** (0.0082)                         -0.0170* (0.00905)
  x Week 1~3                                                                -0.0242** (0.0100)
  x Week 4~5                                                                -0.0262** (0.0118)
Samsung_Apple x After               5.27 (4.05)       -0.0266* (0.0141)     -0.0266* (0.0143)     -0.0118 (0.0074)
R-sq                                0.827             0.866                 0.866                 0.873
No. of brand pair dummies           2
No. of week dummies                 22
No. of observations                 69

Note. The dependent variable is the distance between two brands measured in brand space, except in the first column, whose dependent variable is the co-search volume between two brands. Robust standard errors are clustered at the week level (23 weeks). ***p < 0.01; **p < 0.05; *p < 0.10.

Now, we test formally using Equation (1). Column (2) in Table 3.5 shows that the two

coefficients for the interaction effects are negative and significant. These results show that

the market leader Apple becomes closer to the advertiser Samsung than to the other brands during

and even after the campaign. Comparing this with the result in Column (1), we find that the coefficient

for “after campaign” is significant in the brand space map, but not in co-search volume. This is

because brand positions in a map are also affected by co-search volumes of the other brand pairs,

which are not used in Column (1).

Around the fourth week of the campaign, on December 15, 2011, Samsung launched Galaxy

Nexus in the U.S., which is the third smartphone in the Google Nexus series. Because its launch

timing does not overlap with the start of its comparative advertising, we are not concerned much

with this issue. However, it could affect the effect size. To address this issue, we measure the

effects for both before and after the Nexus phone launch. Column (3) in Table 3.5 shows that

both coefficients for “during campaign” are negative and significant. In this exercise, the first

coefficient is our main interest because it suggests that the comparative campaign rather than the

Nexus phone launch indeed repositions both rival brands.

One might still have concerns about the new product effect as consumers tend to wait and search

for a new product even before its launch. To reduce this concern, we extract search volumes

without Nexus as a keyword: for example, “Apple Samsung – Nexus” as the query in Google

Trends. However, this is a conservative approach. If the comparative advertising indeed affects

brand co-searches, it could boost “iPhone Galaxy Nexus” as well as “iPhone Galaxy.” Therefore,

by eliminating the queries with Nexus, the effect is likely to be underestimated. Column (4)

shows that the coefficient for “during” is negative and significant, although its effect size is

smaller than that in Column (2), which does not exclude brand co-searches with Nexus.

However, the coefficient for “after” is no longer significant. Its effect size decreases

much more than that for “during” compared to that in Column (2). This pattern seems to be

driven by the Nexus phone’s success. In other words, as Nexus was getting popular, consumers

might have searched more using an “iPhone Galaxy Nexus” query. In spite of such a strong test,

these results suggest that comparative advertising moves market leader Apple closer to advertiser

Samsung than to the other brands, at least during the campaign in brand space.
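The query construction used in this robustness check can be sketched as follows. The helper is hypothetical, and it assumes that the dash in the quoted query stands for the minus sign that search tools commonly use to exclude a term; it only builds the query string and does not call any search service.

```python
def co_search_query(terms, exclude=None):
    """Join search terms with spaces and prefix each excluded term with '-'."""
    query = " ".join(terms)
    for term in exclude or []:
        query += f" -{term}"
    return query


BASE_QUERY = co_search_query(["Apple", "Samsung"])
NO_NEXUS_QUERY = co_search_query(["Apple", "Samsung"], exclude=["Nexus"])
```

Here `BASE_QUERY` corresponds to the unrestricted brand co-search and `NO_NEXUS_QUERY` to the conservative variant that drops searches mentioning Nexus.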

3.3.3. Mechanism check using indirect approach in product space

One can measure distances between brands and their own product attributes in addition to

distances among brands in a product space map drawn using the correspondence analysis. This

allows us to investigate why comparative advertising repositions rival brands closer. First, during

the comparative advertising, we check whether and how long co-searches between advertised

brands and product attributes increase by the type of attributes. Second, we test whether an

advertising brand (i.e., Samsung) and a target brand (i.e., Apple) move closer to each other in

product space.

We classify product attributes in Table 3.6 into three types: (1) advertised, (2) ad-related, and (3)

unadvertised ones. First, advertised attributes are explicitly mentioned or seen in a TV

commercial. Recall that this campaign focuses the most on Samsung Galaxy S2’s huge screen,

which is good for watching videos or movies. The second most emphasized attribute is a fast 4G

network. The last one is Apple iPhone’s short-lived battery.

Table 3.6 Growth rate & change of co-search between each attribute and its brand

Attribute Type  Attribute   Samsung Galaxy          Apple iPhone
                            Growth (%)  Change      Growth (%)  Change
Advertised      Movies      100.0       2           10.5        2
                Battery     60.0        3           0.0         0
                Screen      38.5        5           9.3         4
                Videos      33.3        2           18.3        8
                4G          20.0        5           14.7        5
Ad-related      Pictures    100.0       4           29.6        8
                Internet    100.0       2           0.0         0
                Camera      66.7        2           29.4        5
                Data        66.7        2           0.0         0
                3G          60.0        3           3.8         4
                3D          40.0        2           -20.0       -1
                Wifi        33.3        1           9.8         4
                Charger     0.0         0           0.0         0
Unadvertised    Text        33.3        1           2.9         1
                App         11.8        2           3.4         1
                Map         0.0         0           66.7        4
                Voice       0.0         0           12.2        7
                Music       -42.9       -3          0.0         0
                Cloud       NA          NA          -12.5       -3

Note. There is not enough co-search between cloud and Samsung.

Second, ad-related attributes have similar benefits to the advertised attributes. For example, the

advertised big screen for watching videos or movies offers entertainment benefits. The ability to

watch “3D” movies, a “camera” for taking better “pictures,” and even listening to “music”

provide such entertainment value. Similarly, the advertised 4G improves “internet” speed to

download “data” quickly.

Lastly, there are several unadvertised attributes: app, voice, text, map, and cloud. Among them,

in fact, cloud was advertised in the second campaign. Because we do not include the second

campaign in this product attribute analysis, cloud is grouped with the unadvertised attributes.

Figure 3.6(a) shows co-searches with advertised product attributes for both the advertiser

Samsung and the target Apple, respectively. Across most advertised attributes, co-search volume

increases in the first week of the campaign, as shown in Table 3.6, and then decreases. On the

other hand, co-searches for unadvertised attributes in Figure 3.6(b) show a much more stable

trend compared to those for advertised or ad-related ones, except Apple’s voice. In particular,

Samsung’s unadvertised attributes, except its App, do not change in their co-searches in the first

week of the campaign. Co-searches for Samsung App increase only slightly.

Figure 3.6(a) Co-searches between brands and advertised attributes increase in the first week of the

comparative advertising campaign.

Note: Campaign periods: Week 16 to 20, the launch of Apple 4S: Week 8

Figure 3.6(b) Co-searches between brands and unadvertised attributes do not change much in the first

week of the comparative advertising campaign.

Note: Campaign periods: Week 16 to 20, the launch of Apple 4S: Week 8

Table 3.7 Difference in Difference for co-searches between brands and their attributes

Attribute Type  No. of      Before    After     Mean of the   Paired        P-value   Mean of the
                Attributes  Ad*       Ad**      Differences   t-test                  Growth Rates (%)
Advertiser: Samsung Galaxy
  Advertised    5           10.20     13.60     3.00          3.40          0.007     50.36
  Ad-related    8           3.50      5.50      2.00          4.73          0.002     58.33
  Unadvertised  5           6.2       6.2       0             0             1.000     0.45
Target Brand: Apple iPhone (Market Leader)
  Advertised    5           30.67     34.53     3.86          2.75          0.051     10.56
  Ad-related    8           32.00     34.50     3.50          2.50          0.063     6.57
  Unadvertised  6           29.67     31.29     1.62          1.18          0.291     12.11

Note. *Before Ad: one week before the campaign, **After Ad: the first week of the campaign

From the above observations, we test whether co-searches increase in the first week of the

campaign compared to one week before the campaign by each brand’s attribute type. The results

of a paired t-test are shown in Table 3.7. Reflecting the pattern in Figure 3.6, co-searches

between both brands and both advertised and ad-related attributes increase significantly, while

those for unadvertised attributes do not change significantly. These results suggest three things.

First, advertising increases searches for an advertised attribute as well as its brand. Second, there

is spillover into advertising-related attributes. Third, not only the attacking, but also the attacked

brand gains in co-searches with its attributes. However, an attacking brand gains more. While

both brands gain a similar amount of co-searches (see Column “Mean of the Difference”), the

growth rate (see Column “Mean of the Growth Rates”) is much larger for the advertiser Samsung

because Samsung, with its lower market share, has a much smaller search volume than the market leader Apple.
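The paired t-test reported in Table 3.7 can be illustrated from the paired differences alone, since a paired t-test is a one-sample t-test on the differences. The sketch below uses the week-over-week changes listed in Table 3.6 for Samsung's eight ad-related attributes; the function and variable names are hypothetical, and the p-value is omitted because it requires the t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Changes in co-searches for Samsung's ad-related attributes as listed in
# Table 3.6: Pictures, Internet, Camera, Data, 3G, 3D, Wifi, Charger.
SAMSUNG_AD_RELATED_CHANGES = [4, 2, 2, 2, 3, 2, 1, 0]


def paired_t_statistic(differences):
    """t statistic of a paired t-test, computed from the paired differences."""
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)  # sample SD with n - 1
    return mean(differences) / standard_error


MEAN_CHANGE = mean(SAMSUNG_AD_RELATED_CHANGES)
T_STAT = paired_t_statistic(SAMSUNG_AD_RELATED_CHANGES)
```

With these inputs the mean change is 2 and the statistic is about 4.73 with 7 degrees of freedom, in line with the values reported for Samsung's ad-related attributes in Table 3.7.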

Now, we turn to product space. To draw a market-structure map in product space, we do

correspondence analysis with the ca function in R’s ca package. In Figure 3.7, we plot brands

and attributes using the first two dimensions, which explain 94.8% of the variance. Apple is located

close to movies and videos, while Blackberry is near App. Samsung and HTC are positioned close

to 4G.

Figure 3.7 Market-structure map in product space (panels: Week -1, Week +1)

Looking at each attribute’s growth rate in co-searches in Table 3.6 together with Figure 3.7, in the

first week of the campaign, we find that all the advertised or ad-related attributes that locate

between advertiser Samsung and market leader Apple increase in co-searches with either

Samsung or Apple. Moreover, the growth rate tends to be bigger for attributes that are between

both brands than those that are not. Co-searches between Samsung and movies, which is between

Samsung and Apple, have the highest growth rate. Lastly, none of the unadvertised attributes are

between the two brands in a map.

Figure 3.8 Samsung moves closer to Apple during campaign in product space.

Table 3.8 Apple becomes closer to Samsung than the other brands but insignificantly during and after the campaign in product space.

Brand-attribute pair          Indirect approach in product space
Samsung_Apple x During        -0.0117 (0.0277)
Samsung_Apple x After         -0.0309 (0.0335)
R-sq
No. of brand pair dummies     2
No. of week dummies           22
No. of observations           69

Note. The dependent variable is the distance between two brands measured in product space. Robust standard errors are clustered at the week level (23 weeks). ***p < 0.01; **p < 0.05; *p < 0.10.

Figure 3.8 shows the trend of brand distances measured from the product space maps in Figure

3.7. In the first week of the campaign, Samsung moves closer to Apple, while HTC and

Blackberry move farther from Apple. During the 5-week campaign period, Apple looks

closer to Samsung than to the other two brands. Table 3.8 shows the results of diff-in-diff

analysis using Equation (1) for brand distances in product space; both coefficients for

effectiveness during and after the campaign are negative but insignificant. This suggests that Apple

becomes closer to Samsung than to the other brands, but only slightly.

In summary, one possible reason that comparative advertising repositions both rival brands

Samsung and Apple closer together is that consumers more often co-search the advertised

brands with their advertised attributes (i.e., videos, movies, and screen). However, this does not

seem to be a major force. Instead, consumers directly search more for the rival brands targeted in

comparative advertising.

3.4. Conclusion

In this study, we show that comparative advertising repositions rival brands closer together using

weekly aggregate search volume from Google Trends. As a mechanism, we find that direct brand

comparison (e.g., Apple vs. Samsung) is a major force. While consumers do indirect brand

comparison through advertised attributes (e.g., Apple screen vs. Samsung screen), such indirect

co-searches between brands and their attributes only rise in the beginning of the campaign. Our

results suggest that a brand with a lower market share may benefit from comparative advertising

against a market leader by forcing itself to be considered alongside a market leader when

consumers do brand searches.

An important limitation in this research is due to the nature of aggregate data. Google Trends

provides only aggregate search volume rather than individual search history. As a result,

researchers do not observe exactly which brands each consumer searched for in Google. To

overcome this problem, we exploited co-searching (1) between brands and (2) between brands

and their own attributes. Given that we test how brands reposition before and after a campaign

rather than focus on generating an exact market-structure map, our results would be reliable

unless many consumers change their search strategy around marketing activity. If, instead, the

research goal is to visualize the exact relationship among brands, consumer-level search history

would generate a more accurate map than aggregate search volume. We leave this topic as one

for future research.

References

• Adams H, Tausz A (2015) JavaPlex tutorial.

http://www.math.colostate.edu/~adams/research/javaplex_tutorial.pdf

• Adams H, Tausz A, Vejdemo-Johansson M (2014) JavaPlex: A research software package for

persistent (co)homology. Proceedings of ICMS 2014, H. Hong and C. Yap (Eds.),

Springer-Verlag Berlin Heidelberg 129–136.

• Ailawadi, Kusum, Donald R. Lehmann, and Scott A. Neslin (2003), “Revenue Premium as an

Outcome Measure of Brand Equity,” Journal of Marketing, 67 (October), 1–17.

• Ailawadi KL, Keller KL (2004) Understanding Retail Branding: Conceptual Insights and

Research Priorities. Journal of Retailing 80(4):331-342.

• Alba, Joseph W., and Amitava Chattopadhyay (1986) Salience Effects in Brand Recall. Journal

of Marketing Research 23(4): 363-69.

• Andreasen, Alan R. (1995) Marketing Social Change: Changing Behavior to Promote Health,

Social Development, and the Environment, Jossey-Bass 1st ed.

• Archak N, Ghose A, Ipeirotis PG (2011) Deriving the pricing power of product features by

mining consumer reviews. Management Sci. 57(8):1485–1509.

• Armstrong MA (1983) Basic topology. Springer, New York, Berlin.

• Ayasdi (2015) TDA and machine learning: Better together.

http://www.ayasdi.com/resources/tda-and-machine-learning-better-together-via-intro-tda/.

• Ayasdi (2016) website, http://www.ayasdi.com/industries/communications/personalized-

marketing/, accessed on February 29, 2016.

• Barbiero A and Ferrari PA (2014) Simulation of correlated Poisson variables, Applied

Stochastic Models in Business and Industry 31(5):669–680

• Benzécri, J.-P. (1973). L'Analyse des Données. Volume II. L'Analyse des Correspondances.

Paris, France: Dunod.

• Bergen M, Peteraf MA (2002) Competitor identification and competitor analysis: A broad-

based managerial approach. Managerial and Decision Economics 23(4-5):157-169.

• Blasco, Andrea., Pin, Paolo., Sobbrio, Francesco., 2016. Paying Positive to Go Negative:

Advertisers' Competition and Media Reports. European Economic Review 83, 243–261

• Blei, David M, Andrew Y Ng, and Michael I Jordan (2003) Latent dirichlet allocation. JMLR,

3:993-1022

• Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in

large networks. Journal of Statistical Mechanics: Theory and Experiment (10):P10008.

• Borkovsky, Ron N., Avi Goldfarb, Avery Haviv, and Sridhar Moorthy (2016) Measuring and

Understanding Brand Value in a Dynamic Model of Brand Management. Marketing Science

Forthcoming

• Büschken, Joachim, and Greg M. Allenby (2016) Sentence-Based Text Analysis for Customer

Reviews. Marketing Science Forthcoming

• Bronnenberg BJ, Kruger MW, Mela CF (2008) Database paper: The IRI marketing data

Set. Marketing Science 27(4):745-748.

• Brown, Tom J. and Peter A. Dacin (1997), “The Company and the Product: Corporate

Associations and Consumer Product Responses,” Journal of Marketing, 61 (January), 68-84.

• Cameron, G. T. (1994). Does publicity outperform advertising? An experimental test of the

third-party endorsement. Journal of Public Relations Research, 6, 185–207.

• Carlsson G (2009) Topology and data. Bulletin of the American Mathematical

Society 46(2):255–308.

• Chang, Chun-Tuan (2008), “To Donate or Not to Donate? Product Characteristics and Framing

Effects of Cause-Related Marketing on Consumer Purchase Behavior,” Psychology and

Marketing (12), 1089-1110.

• Ching, Andrew T, Robert Clark, Ignatius Horstmann, Hyunwoo Lim (2016) The Effects of

Publicity on Demand: The Case of Anti-Cholesterol Drugs. Marketing Science 35(1):158-181

• Chintagunta PK, Jiang R, Jin GZ (2009) Information, learning, and drug diffusion: The case

of Cox-2 inhibitors. Quant. Marketing Econom. 7(4):399–443.

• Ciambriello, Roo (2014) How Ads That Empower Women Are Boosting Sales and Bettering

the Industry: Advertising Week panel spotlights 'fem-vertising'. Advertising Week October 3,

2014 http://www.adweek.com/news/advertising-branding/how-ads-empower-women-are-

boosting-sales-and-bettering-industry-160539

• Clauset A, Newman MEJ, Moore C (2004). Finding community structure in very large

networks. http://www.arxiv.org/abs/cond-mat/0408187.

• Conley, Timothy G. and Christopher R. Taber (2011) Inference with "Difference in

Differences" with a Small Number of Policy Changes. The Review of Economics and Statistics,

February 2011, 93(1): 113–125

• Connolly, Katie 2011 Six ads that changed the way you think. BBC News, Washington

http://www.bbc.com/news/world-us-canada-11963364

• Cooper LG, Inoue A (1996) Building market structures from consumer preferences. Journal

of Marketing Research 33(3):293–306.

• Datamonitor (2005) Dove Campaign for Real Beauty case study: Innovative marketing

strategies in the beauty industry, 2005 June

• De Smet, D., Vanormelingen, S., (2012) The Advertiser is Mentioned Twice. Media Bias in

Belgian Newspapers. HUB Research Papers 2012/05.

• DeSarbo WS, Grewal R (2007) An alternative efficient representation of demand-based

competitive asymmetry. Strategic Management Journal 28(7):755-766.

• DeSarbo WS, Grewal R, Wind J (2006) Who competes with whom? A demand-based

perspective for identifying and representing asymmetric competition. Strategic Management

Journal 27(2):101-129.

• DeSarbo WS, Manrai AK, Manrai LA (1993) Non-spatial tree models for the assessment of

comparative market structure: An integrated review of the marketing and psychometric

literature. Eliashberg J, Lilien G, eds. Handbook in operations research and marketing science,

North Holland, Amsterdam, 193-257.

• DeSarbo WS, Soete GD. 1984. On the Use of Hierarchical Clustering for the Analysis of

Nonsymmetric Proximities. Journal of Consumer Research 11(1) 601-610.

• Dove website (2015) The Dove Campaign for Real Beauty. Http://www.dove.us/Social-

Mission/campaign-for-real-beauty.aspx (Last visited on Dec. 5 2015).

• Dove website (2016) Dove Vision. http://www.dove.com/us/en/stories/about-dove/our-

vision.html (Last visited on Dec. 27 2016).

• Drumwright, Minette E. (1996) Company Advertising with a Social Dimension: The Role of

Noneconomic Criteria. Journal of Marketing 60(4):71-87

• Du, Hu & Damangir (2015) “Leveraging trends in online searches for product features in

market response modeling”, Journal of Marketing

• Edelsbrunner H, Harer J (2010) Computational topology: An introduction. American

Mathematical Society, Providence RI.

• Edelsbrunner H, Letscher D, Zomorodian A. (2002) Topological persistence and simplification.

Discrete and Computational Geometry 28:511-533.

• Ellman, M., Germano, F., 2009. What do the papers sell? A model of Advertising and Media

Bias. Econ. J. 119 (537), 680–704.

• Elrod T, Russell GJ, Shocker AD, Andrews RL, Bacon L, Bayus, Carroll JD, Johnson RM,

Kamakura WRA, Lenk P, Mazanec JA, Rao VR, Shankar V. (2002) Inferring market structure

from customer response to competing and complementary products. Marketing Letters 13(3):

221–32.

• Erdem T (1996) A dynamic analysis of market structure based on panel data. Marketing Science

15(4):359-378.

• Erdem T, Keane MP (1996) Decision-Making Under Uncertainty: Capturing Dynamic Choice

Processes in Turbulent Consumer Good Markets Marketing Science 15(1): 1–20.

• Etcoff, Nancy, Susie Orbach, Jennifer Scott, Heidi D’Agostino (2004) The real truth about

beauty: a global report, findings of the global study on women, beauty and well-being,

http://www.dove.us/docs/pdf/19_08_10_The_Truth_About_Beauty-White_Paper_2.pdf

• Focke, Florens, Alexandra Niessen-Ruenzi, and Stefan Ruenzi, 2016 A Friendly Turn:

Advertising Bias in the News Media. Tech Report, Universität Mannheim.

• Folse, Judith A.G., Ronald W. Niedrich, and Stacy L. Grau (2010), “Cause-Related Marketing:

The Effect of Purchase Quantity and Firm Donation Amount on Consumer Inferences and

Participation Intentions,” Journal of Retailing, 86 (4), 295-309.

• Fossen, Beth L. and David A. Schweidel (2016) Television Advertising and Online Word-of-

mouth: An Empirical Investigation of Social TV Activity. Conditionally accepted at Marketing

Science

• Freeman L (1977) A set of measures of centrality based on betweenness. Sociometry 40: 35–

41.

• France S, Ghose S (2016) An analysis and visualization methodology for identifying and

testing market structure. Marketing Science 35(1): 182 – 197.

• Gabszewicz, Jean J., Didier Laussel, and Nathalie Sonnac (2002), “Press Advertising and the

Political Differentiation of Newspapers,” Journal of Public Economic Theory, 4 (July), 317–

34.

• Gal-Or, Esther, Tansev Geylani, Tuba Pinar Yildirim (2012) The Impact of Advertising on

Media Bias. Journal of Marketing Research: February 2012, Vol. 49, No. 1, pp. 92-99.

• Gambaro, M., Puglisi, R., 2015. What do ads buy? Daily coverage of listed companies on the

Italian press, European Journal of Political Economy, 39, 41-57

• Garbett, Thomas F. 1981. Corporate Advertising. New York: McGraw-Hill.

• Gary, Erickson and Robert Jacobson (1992), “Gaining Comparative Advantage Through

Discretionary Expenditures: The Returns to R&D and Advertising,” Management Science, 38

(9), 1264–79.

• Gentzkow, Matthew and Jesse M. Shapiro (2010) What Drives Media Slant? Evidence from

U.S. Daily Newspapers. Econometrica 78 (1) 35-71

• Ghose A, Ipeirotis PG, Li B (2012) Designing ranking systems for hotels on travel search

engines by mining user-generated and crowdsourced content. Marketing Sci. 31(3):493–520.

• Girvan M, Newman ME (2002) Community structure in social and biological networks.

Proceedings of the National Academy of Sciences 99(12):7821-7826.

• Gopinath S, Thomas JS, Krishnamurthi L (2014) Investigating the relationship between the

content of online word of mouth, advertising, and brand performance. Marketing Sci.

33(2):241–258.

• Gower, J. C. (1966) Some distance properties of latent root and vector methods used in

multivariate analysis. Biometrika 53, 325–328.

• Gromov M (1987) Hyperbolic groups. Essays in group theory, Mathematical Sciences

Research Institute Publications 8, Springer-Verlag, 75–263.

• Gurun, Umit G. and Alexander W. Butler. 2012. Don't believe the Hype: Local Media Slant,

Local Advertising, and Firm Value. Journal of Finance 67 (2):561-597.

• Harald J. Van Heerde, Els Gijsbrechts, and Koen Pauwels (2015) Fanning the Flames? How

Media Coverage of a Price War Affects Retailers, Consumers, and Investors. Journal of

Marketing Research: October 2015, Vol. 52, No. 5, pp. 674-693.

• Hatcher A (2002) Algebraic topology. Cambridge University Press, Cambridge

• Hausmann JC (1995) On the Vietoris–Rips complexes and a cohomology theory for metric

spaces. Prospects in Topology: Proceedings of a conference in honour of William Browder,

Annals of Mathematics Studies 138, Princeton Univ. Press, 175–188.

• Henderson GR, Iacobucci D, Calder BJ (1998), Brand Diagnostics: Mapping Branding Effect

Using Consumer Associative Networks. European Journal of Operational Research, 111

(December), 306–327.

• Hirschfeld, H.O. (1935) "A connection between correlation and contingency", Proc.

Cambridge Philosophical Society, 31, 520–524

• Hoffman, D. Novak, T (2015) Emergent Experience and the Connected Consumer in the Smart

Home Assemblage and the Internet of Things. Working paper, George Washington University.

• Honig, Zach (2012) “Apple files German lawsuit against Samsung, targets Galaxy S II, nine

other smartphones”, Engadget, January 17th 2012,

http://www.engadget.com/2012/01/17/apple-files-another-german-lawsuit-against-samsung-targets-gala/

• Hovland, C. I., & Weiss, W. (1951). The influence of source credibility on communication

effectiveness. Public Opinion Quarterly, 15, 635–650.

• Hu, Du & Damangir (2014) “Decomposing the Impact of Advertising: Augmenting Sales with

Online Search Data”, Journal of Marketing Research

• Hull, Clyde E. and Sandra Rothenberg (2008), “Firm Performance: The Interactions of

Corporate Social Performance with Innovation and Industry Differentiation,” Strategic

Management Journal, 29 (7), 781-89.

• Jain, Subhash C. and Edwin C. Hackleman (1978) How Effective is Comparison Advertising

for Stimulating Brand Recall? Journal of Advertising 7(3): 20-25

• John DR, Loken B, Kim K, Monga AB (2006) Brand concept maps: A methodology for

identifying brand association networks. Journal of Marketing Research 43(4):549–563.

• Johnson SC (1967) Hierarchical clustering schemes. Psychometrika 32(3):241-254.

• Joo, Mingyu, Kenneth C. Wilbur, Bo Cowgill, Yi Zhu, (2015) Television Advertising and

Online Search, Management Science,

• Kalra A, Li S, Zhang W (2011) Understanding responses to contradictory information about

products. Marketing Sci. 30(6): 1098–1114

• Kamakura WA, Russell GJ (1989) A Probabilistic Choice Model for Market Segmentation and

Elasticity Structure Journal of Marketing Research, 26 (November), 87–96.

• Kerkhof, Anna and Johannes Münster (2015) Quantity restrictions on advertising, commercial media bias,

and welfare, Journal of Public Economics, 131, 124-141

• Kim JB, Albuquerque P, Bronnenberg BJ (2011) Mapping online consumer search. Journal of

Marketing Research 48(1):13-27.

• Kolstad, Jonathan (2007) Unilever PLC: Campaign for Real Beauty campaign. Encyclopedia

of Major Marketing Campaigns, volume 2. Thomson Gale 1679-1683

• Koschate-Fischer, Nicole, Isabel V. Stefan, and Wayne D. Hoyer (2012), “Willingness to Pay

for Cause-Related Marketing: The Impact of Donation Amount and Moderating Effects,”

Journal of Marketing Research, 49 (December), 910-27.

• Kotler, Philip and Gerald Zaltman 1971 Social Marketing: An Approach to Planned Social

Change. Journal of Marketing, 35 (3)

• Kotler, Philip A., Ned Roberto, and Nancy R. Lee (2002) Social Marketing: Improving the

Quality of Life, Sage Publications 3rd Ed.

• Kotler, Philip and Nancy R. Lee (2007) Social Marketing: Influencing Behaviors for Good,

Sage Publications 3rd Ed.

• Kruger MW, Pagni D (2011) IRI academic data set description. Information Resources, Inc.

page 16

• Lattin JM, Carroll JD, Green PE (2003) Analyzing multivariate data. Duxbury Resource Center,

Pacific Grove.

• Lee TY, Bradlow ET (2011) Automated marketing research using online customer reviews. J.

Marketing Res. 48(5):881–894.

• Lesnick M (2013) Studying the shape of data using topology. The Institute Letter. Institute for

Advanced Study, Summer Issue, page 10-11.

• Levine, Dan (2011). "U.S. judge says Samsung tablets infringe Apple patents". Reuters.com,

October 13, 2011,

http://www.reuters.com/article/2011/10/13/us-apple-samsung-lawsuit-idUSTRE79C79C20111013?feedType=RSS&feedName=businessNews&utm_source=dlvr.it&utm_medium=twitter&dlvrit=56943

• Leone, Robert P. (1995), “Generalizing What Is Known About Temporal Aggregation and

Advertising Carryover,” Marketing Science, 14 (3), 141–50.

• Lewis, Randall, Dan Nguyen 2015 Display advertising’s competitive spillovers to consumer

search, Quantitative Marketing and Economics 13 (2), 93-115

• Li, H., & Kannan, P. K. (2014). Attributing conversions in a multichannel online marketing

environment: an empirical model and a field experiment. Journal of Marketing Research, 51(1),

40–56.

• Liaukonyte, Teixeira & Wilbur (2015) Television Advertising and Online Shopping,

Marketing Science,

• Liu Y (2006) Word-of-mouth for movies: Its dynamics and impact on box office revenue. J.

Marketing 70(3):74–89.

• Lord, K. R., & Putrevu, S. (1993). Advertising and publicity: an information processing

perspective. Journal of Economic Psychology, 14, 57–84.

• Ludwig S, de Ruyter K, Friedman M, Brüggen EC, Wetzels M, Pfann G (2013) More than

words: The influence of affective content and linguistic style matches in online reviews on

conversion rates. J. Marketing 77(1):87–103.

• Lum PY, Singh G, Lehman A, Ishkanov T, Vejdemo-Johansson M, Alagappan M, Carlsson J,

Carlsson G (2013) Extracting insights from the shape of complex data using topology.

Scientific Reports 3, 1236.

• Luo, Xueming and C.B. Bhattacharya (2006) Corporate Social Responsibility, Customer

Satisfaction, and Market Value. Journal of Marketing 70(4):1-18

• Luo, Xueming and C.B. Bhattacharya (2009), “The Debate over Doing Good: Corporate

Social Performance, Strategic Marketing Levers, and Firm-Idiosyncratic Risk,” Journal of

Marketing, 73 (November), 198-213.

• Mantrala, Murali K., Prasad A. Naik, Shrihari Sridhar, and Esther Thorson (2007), “Uphill or

Downhill? Locating the Firm on a Profit Function,” Journal of Marketing, 71 (April), 26–44.

• McQuail (2010) McQuail’s Mass Communication Theory

• Meredith, Macleod (2005) Advertisers bank on 'real women' to sell. The Spectator 24 Aug

2005 http://search.proquest.com/docview/270232751?accountid=14771

• Michelle Andrews, Xueming Luo, Zheng Fang and Jaakko Aspara. (2014) Cause Marketing

Effectiveness and the Moderating Role of Price Discounts. Journal of Marketing 78:6, 120-

142.

• Murry JR., John P., Antonie Stam and John L. Lastovicka (1996) Paid- versus Donated-Media

Strategies For Public Service Announcement Campaigns. Public Opinion Quarterly 60 (1): 1-

29.

• Netzer, Oded, Ronen Feldman, Jacob Goldenberg, Moshe Fresko, (2012) Mine Your Own

Business: Market-Structure Surveillance through Text Mining. Marketing Science 31(3):521-

543.

• Newman ME, Girvan M (2004) Finding and evaluating community structure in networks.

Physical Review E 69(2):026113.

• Omid and Pete 2015 How advertising has become an agent of social change Feb 10, 2015

https://medium.com/@moonstorming/how-advertising-has-become-an-agent-of-social-change-148aa0ef303a#.xz6y429mm

• Onishi H, Manchanda P (2012) Marketing activity, blogging and sales. Internat. J. Res.

Marketing 29(3):221–234.

• Pauwels H, Stacey E, Lackman A (2013) Beyond likes and tweets: Marketing, online platforms

content, and store performance. MSI Report.

• Pew Research Center, 2014. The State of the News Media 2014. An Annual Report on

American Journalism (Washington DC).

• Pons P, Latapy M (2005) Computing communities in large networks using random walks.

http://arxiv.org/abs/physics/0512106

• Porter, M.F. (1997) An Algorithm for Suffix Stripping. Readings in Information Retrieval,

Karen Sparck Jones and Peter Willett, eds. San Francisco: Morgan Kaufmann Publishers, 313–

16.

• Pracejus, John W. and Norman R. Brown (2003), “On the Prevalence and Impact of Vague

Quantifiers in the Advertising of Cause-Related Marketing (CRM),” Journal of Advertising,

32 (4), 19-28.

• Punj G, Stewart DW (1983) Cluster analysis in marketing research: Review and suggestions

for application. Journal of Marketing Research 20(2):134-148.

• Raghubir, Priya, John Roberts, Katherine N Lemon and Russell S Winer. (2010) Why, When,

and How Should the Effect of Marketing Be Measured? A Stakeholder Perspective for

Corporate Social Responsibility Metrics. Journal of Public Policy & Marketing 29:1, 66-77.

• Raghavan UN, Albert R, Kumara S (2007) Near linear time algorithm to detect community

structures in large-scale networks. Physical Review E 76, 036106.

• Reuter, J., Zitzewitz, E., 2006. Do ads influence editors? Advertising and bias in the financial

media. The Quarterly Journal of Economics 121 (1): 197-227.

• Rinallo, D., Basuroy, S., 2009. Does advertising spending influence media coverage of the

advertiser? Journal of Marketing 73, 33–46.

• Ringel DM, Skiera B (2016) Visualizing asymmetric competition among more than 1,000

products using big search data. Forthcoming at Marketing Science.

• Rips E (1982) Subgroups of small cancellation groups. Bulletin of the London Mathematical

Society 14 (1):45–47.

• Robinson, Stefanie R., Caglar Irmak, and Satish Jayachandran (2012), “Choice of Cause in

Cause-Related Marketing,” Journal of Marketing, 76 (July), 126-39

• Rotta R, Noack A (2011) Multilevel local search algorithms for modularity clustering. Journal

of Experimental Algorithmics 16:2-3.

• Saurabh Mishra and Sachin B. Modi. (2016) Corporate Social Responsibility and Shareholder

Wealth: The Role of Marketing Capability. Journal of Marketing 80:1, 26-46.

• Servaes, Henri and Ane Tamayo (2013), “The Impact of Corporate Social Responsibility on

Firm Value: The Role of Customer Awareness,” Management Science, 59, 1045-61.

• Sonnier GP, McAlister L, Rutz OJ (2011) A dynamic model of the effect of online

communications on firm sales. Marketing Sci. 30(4):702–716

• Spiteri, J., 2015. When Is No News Good News? A Model of Information Disclosure and

Commercial Media Bias. Working paper.

• Srinivasan S, Rutz OJ, Pauwels K (2015) Paths to and of purchase: quantifying the impact of

traditional marketing and online consumer activity. Journal of the Academy of Marketing Science,

forthcoming.

• Srivastava RK, Leone RP, Shocker AD (1981) Market Structure Analysis: Hierarchical

Clustering of Products Based on Substitution-in-use. Journal of Marketing 45(3):38-48.

• Srivastava RK, Alpert MI, Shocker AD (1984) A Customer-Oriented Approach for

Determining Market Structures. Journal of Marketing 48 (1):32–45.

• Sriram, S., Subramanian Balachander, and Manohar U. Kalwani (2007) Monitoring the

Dynamics of Brand Equity Using Store-Level Data. Journal of Marketing 71(2), 61–78

• Strahilevitz, Michal and John G. Myers (1998), “Donations to Charity as Purchase Incentives:

How Well They Work May Depend on What You Are Trying to Sell,” Journal of Consumer

Research, 24 (4), 434-46.

• Strömberg, David (2004), “Mass Media Competition, Political Competition, and Public Policy,”

Review of Economic Studies, 71 (January), 265–84.

• Taddy, Matt (2012) On estimation and selection for topic models. In Proceedings of the

Fifteenth international Conference on Artificial Intelligence and Statistics (AISTATS-12),

1184-119

• Tang C, Guo L (2013) Digging for gold with a simple tool: Validating text mining in studying

electronic word-of-mouth (eWOM) communication. Marketing Lett. 26(1):67–80.

• Tirunillai S, Tellis GJ (2012) Does chatter really matter? Dynamics of user-generated content

and stock performance. Marketing Sci. 31(2):198–215.

• Tirunillai, Seshadri and Gerard J. Tellis (2014) Mining Marketing Meaning from Online

Chatter: Strategic Brand Analysis of Big Data Using Latent Dirichlet Allocation. Journal of

Marketing Research 51 (4) 463-479.

• Urban GL, Johnson PL, Hauser JR. (1984) Testing competitive market structures. Marketing

Science 3(2):83-112.

• Vietoris L (1927) Über den höheren Zusammenhang kompakter Räume und eine Klasse von

zusammenhangstreuen Abbildungen. Mathematische Annalen 97(1):454–472.

• Vingilis, Evelyn, and Barbara Coultes. 1990. "Mass Communications and Drinking-Driving:

Theories, Practice and Results." Alcohol, Drugs and Driving 6(2):61-81.

• Walker, Rob (2005) Social Lubricant–How a marketing campaign became the catalyst for a

societal debate. New York Times Magazine, September 4 2005

http://www.nytimes.com/2005/09/04/magazine/social-lubricant.html?_r=0

• Ward JH (1963) Hierarchical grouping to optimize an objective function. Journal of the

American Statistical Association 58:236–244.

• Wilbur, K.C., 2008. A two-sided, empirical model of television advertising and viewing

markets. Mark. Sci. 27 (3), 356–378.

• Wojcicki, Susan (2016) Susan Wojcicki on the Effectiveness of Empowering Ads on YouTube.

Think with Google, April 2016,

https://think.storage.googleapis.com/docs/youtube-empowering-ads-engage-a.pdf

• Wolf, Naomi (2002) The Beauty Myth: How Images of Beauty Are Used against Women. New

York: William Morrow, (originally published in 1991)

• Xiao, Liu, Singh Param Vir, Srinivasan Kannan (2016) A Structured Analysis of Unstructured

Big Data by Leveraging Cloud Computing, Marketing Science, forthcoming

• Zhai Z, Liu B, Xu H, Jia P (2011) Clustering Product Features for Opinion Mining. In

Proceedings of the Fourth ACM International Conference on Web Search and Data Mining.

New York, NY. ACM, 347-354.

• Zhu, Yi and Anthony Dukes (2015) Selective Reporting of Factual Content by Commercial

Media. Journal of Marketing Research: February 2015, Vol. 52, No. 1, pp. 56-76.

• Zomorodian A, Carlsson G (2005) Computing persistent homology. Discrete and

Computational Geometry 33:249-274.

Appendices

Table A1. The number of sentences labeled as real beauty topics increases relative to the number

labeled as other beauty topics in the treated countries during the month(s) of the Real Beauty campaign.

Real Beauty x During Campaign        18.93*** (4.41)
Country-Topic Dummies                25
Year-Month Dummies                   23
R-sq                                 0.830
Observations                         624

Dependent variable is the number of sentences in each country-topic, which is the unit level of analysis.

The number of sentences is calculated by su

The U.S., Canada, and the U.K. have 10(2), 8(2), and 10(1) topics (real beauty ones), respectively.

Two real beauty topics in each country are aggregated into one topic.

Robust standard errors are clustered at the country-topic level. ***p < 0.01

Table A2. In the treated countries, relative to the control countries, the number of sentences labeled

as real beauty topics increases relative to the number labeled as other beauty topics during and one

month after the Real Beauty campaign.

                                        (1)             (2)
                                        Only During     During & After
Real Beauty x Treated Countries
  x During Campaign                     32.90***        32.38***
                                        (5.106)         (5.047)
  x One Month After Campaign                            3.852*
                                                        (2.262)
  x Two Months After Campaign                           -6.146
                                                        (6.177)
Real Beauty
  x During Campaign                     0.066           0.765
                                        (0.647)         (0.778)
  x One Month After Campaign                            -2.670***
                                                        (0.957)
  x Two Months After Campaign                           6.864***
                                                        (1.078)
Country-specific topic dummies          34              34
Year-month dummies                      23              23
R-sq                                    0.738           0.739
Observations                            840             840

Dependent variable is the number of sentences in each country-topic, which is the unit level of analysis.

Treated Countries, the U.S., Canada, and the U.K., have 10(2), 8(2), and 10(1) topics (real beauty ones),

respectively. Two real beauty topics in each country are aggregated into one topic.

Control countries, New Zealand and Australia, have 7(1) and 2(0) topics (real beauty ones), respectively.

Robust standard errors are clustered at the country-topic level. ***p < 0.01, *p < 0.10
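
The observation counts reported in Tables A1 and A2 can be cross-checked against the table notes. The sketch below is a minimal Python check, not part of the original analysis. It assumes a balanced panel of 24 monthly periods (inferred from the 23 year-month dummies, with one period omitted as the reference) and that merging a country's real beauty topics leaves exactly one real beauty unit in that country. The names `panel_units` and `N_MONTHS` are illustrative.

```python
# Topics per country as (total topics, real beauty topics), taken from the table notes.
treated = {"US": (10, 2), "Canada": (8, 2), "UK": (10, 1)}
control = {"New Zealand": (7, 1), "Australia": (2, 0)}

# Assumption: 24 monthly periods, inferred from the 23 year-month dummies.
N_MONTHS = 24


def panel_units(countries):
    """Count country-topic units after merging each country's real beauty topics."""
    units = 0
    for total, real_beauty in countries.values():
        # Merging k real beauty topics into one removes k - 1 units (none if k <= 1).
        units += total - max(real_beauty - 1, 0)
    return units


units_a1 = panel_units(treated)                  # Table A1: treated countries only
units_a2 = panel_units({**treated, **control})   # Table A2: treated plus control

obs_a1 = units_a1 * N_MONTHS
obs_a2 = units_a2 * N_MONTHS

print(units_a1, obs_a1)
print(units_a2, obs_a2)
```

Under these assumptions the counts come to 26 country-topic units (624 observations) for Table A1 and 35 units (840 observations) for Table A2. Both match the reported observations, and each unit count is one more than the number of country-topic dummies in the corresponding table.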