

JING PENG, ASHISH AGARWAL, KARTIK HOSANAGAR, and RAGHURAM IYENGAR*

Improving content sharing on social media platforms helps firms enhance the efficacy of their marketing campaigns. The authors study the impact of network overlap (the overlap in network connections between two users) on content sharing in directed social media platforms. The authors propose a hazards model that flexibly captures the impact of three measures of network overlap (i.e., common followees, common followers, and common mutual followers) on content sharing. Using data on content sharing from two directed social media platforms (Twitter and Digg), the authors establish that a receiver is more likely to share content from a sender with whom they share more common followees, common followers, or common mutual followers even after accounting for other measures. In addition, common followers have a higher effect than common mutual followers on the sharing propensity of the receiver. Finally, the effect of common followers and common mutual followers is positive when the content is novel but decreases, and may even become negative, when many others have already shared it. Collectively, these results have a bearing for marketers to more effectively target users for spreading content on social media platforms.

Keywords: social media, content sharing, network overlap, multiple senders, hazards model

Online Supplement: http://dx.doi.org/10.1509/jmr.14.0643

Network Overlap and Content Sharing on Social Media Platforms

Social media platforms hold the potential to reshape the way consumers generate, spread, and consume content because of their unique capability to connect users. Consequently, spending on social media advertising on platforms such as Facebook and Twitter has been on the rise worldwide in the last several years (Statista 2018). For example, some brands (e.g., Dell) use Twitter to offer product promotions, while others (e.g., Whole Foods Market) use it to educate customers (for details, see https://www.socialmediaexaminer.com/6-ways-to-increase-twitter-engagement). Firms spent more than $30 billion in 2016, up from $16 billion in 2014 (LePage 2018). In the United States alone, social media spending is projected to exceed $17 billion by 2019.

Marketing communication through these platforms can enable firms to reach new customers through users’ connections and drive the demand for their products (e.g., Schweidel and Moe 2014; Stephen and Toubia 2010). For instance, users on Twitter can retweet any content they receive to make their friends aware of it. Such content shared by users has been found to be more effective in acquiring new users as compared to direct communication from a firm (e.g., Gong et al. 2017). Thus, understanding the factors that affect sharing on social media platforms is important for both marketing practice and theory (e.g., Lambrecht, Tucker, and Wiertz 2018; Stephen and Lehmann 2016; Zhang, Moe, and Schweidel 2017).

*Jing Peng is Assistant Professor of Operations and Information Management, School of Business, University of Connecticut (email: [email protected]). Ashish Agarwal is Associate Professor of Information, Risk, and Operations Management, McCombs School of Business, University of Texas at Austin (email: [email protected]). Kartik Hosanagar is John C. Hower Professor of Operations, Information and Decisions, the Wharton School, University of Pennsylvania (email: [email protected]). Raghuram Iyengar is Associate Professor of Marketing, the Wharton School, University of Pennsylvania (email: [email protected]). The authors benefited from feedback from session participants at 2013 Symposium on Statistical Challenges in eCommerce Research, 2014 International Conference on Information Systems, 2015 Workshop on Information in Networks, 2015 INFORMS Annual Meeting, and 2015 Workshop on Information Systems and Economics. The authors thank Professors Christophe Van den Bulte, Paul Shaman, and Dylan Small for helpful discussions. The authors also thank the review team for their valuable feedback. This paper was made possible by financial support extended to the first author through Mack Institute Research Fellowship, President Gutmann’s Leadership Award, and Baker Retailing Center PhD Research Grant. Jacob Goldenberg served as associate editor for this article.

© 2018, American Marketing Association
ISSN: 0022-2437 (print), 1547-7193 (electronic)
Journal of Marketing Research, Vol. LV (August 2018), 571–585
DOI: 10.1509/jmr.14.0643

Extant literature has identified several factors that affect sharing. Characteristics of the content (e.g., valence) affect how much it is shared (e.g., Berger and Milkman 2012). This research essentially focuses on “what” is being shared. Other work has examined “who” is doing the sharing. Some researchers have considered the impact of behavioral characteristics of senders and receivers on sharing (e.g., Arndt 1967), whereas others have focused on their social networks (Bampo et al. 2008). The latter stream of research has primarily focused on the role of unitary network characteristics of senders and receivers (e.g., number of connections) separately on content sharing (e.g., Hinz et al. 2011). Such analysis ignores any common connections between a sender and a receiver, which is the “network overlap” in the sender–receiver dyad. Given that the extent of common connections between a sender and a receiver may represent their common interests or audience, it may also influence content sharing. For example, if Jim shares content with Mary, the number of connections they have in common can influence how likely she is, in turn, to share that content with others. Clearly, network overlap can vary across sender–receiver pairs.¹ To the best of our knowledge, there is no research on how network overlap among users will affect content sharing.

From a managerial standpoint, whether and how the sharing propensity of senders is linked to their network overlap with others can be valuable input for improving the selection of users for spreading content (Trusov, Bodapati, and Bucklin 2010). Thus, the purpose of this article is to assess the impact of network overlap across dyads on the level of content sharing in social media platforms and its implications for marketing communications.

The type of network overlap among users depends on the directionality of connections. Directionality captures the strength of ties and how well users know their audience, which, in turn, may determine the sharing propensity. In undirected networks (e.g., Facebook), connections are bidirectional and members of any user pair can share content with each other. In this case, network overlap is simply the number of common friends between two users. In directed networks (e.g., Twitter), by interpreting a connection as a followee (outgoing connection), follower (incoming connection), or mutual follower (bidirectional connection), network overlap can be characterized using three different metrics: the numbers of common followees, common followers, and common mutual followers. Each metric captures a different facet of the relationship among users as discussed subsequently, and collectively they provide a nuanced view of the link between network overlap and sharing. Table 1 summarizes the definitions of these terms. On social media platforms, users can typically obtain information regarding another user’s followees and followers as well as the common followees/followers they share. Figure 1 shows the detailed network information of a user followed by a focal user on Twitter. A focal user can see how the other user is connected to her followees and followers and determine the extent of their network overlap. Thus, users can be aware of the extent of different types of network overlap with their connections.² In this article, we examine whether and how the sharing propensity of senders is linked to their different types of network overlap with others in the context of directed networks.
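To make the three directed overlap metrics concrete, the following minimal Python sketch counts them for one sender–receiver dyad. It is an illustration only, not the procedure used in the article: the function name `network_overlap`, the dictionary-of-sets graph representation, and the toy users are all hypothetical. The sketch also assumes that the three counts are not mutually exclusive (a common mutual follower is also counted as a common followee and a common follower) and that the sender and receiver themselves are excluded from every count.

```python
from typing import Dict, Set


def network_overlap(followees: Dict[str, Set[str]], sender: str, receiver: str) -> Dict[str, int]:
    """Count common followees, followers, and mutual followers for one dyad.

    `followees[u]` is the set of users that u follows (outgoing connections).
    This representation is an assumption made for illustration.
    """
    # Invert the followee map to obtain each user's followers (incoming connections).
    followers: Dict[str, Set[str]] = {}
    for user, targets in followees.items():
        followers.setdefault(user, set())
        for target in targets:
            followers.setdefault(target, set()).add(user)

    def outgoing(u: str) -> Set[str]:
        return followees.get(u, set())

    def incoming(u: str) -> Set[str]:
        return followers.get(u, set())

    def mutual(u: str) -> Set[str]:
        # A mutual follower both follows u and is followed by u.
        return outgoing(u) & incoming(u)

    dyad = {sender, receiver}  # assumption: the dyad members are never counted
    return {
        "common_followees": len((outgoing(sender) & outgoing(receiver)) - dyad),
        "common_followers": len((incoming(sender) & incoming(receiver)) - dyad),
        "common_mutual_followers": len((mutual(sender) & mutual(receiver)) - dyad),
    }


# Hypothetical toy network: each key follows the users in its set.
toy_followees = {
    "jim": {"news", "ann", "bob"},
    "mary": {"jim", "news", "ann"},
    "ann": {"jim", "mary"},
    "bob": {"jim"},
    "cat": {"jim", "mary"},
}

overlap = network_overlap(toy_followees, "jim", "mary")
print(overlap)
```

In this toy network, Jim (sender) and Mary (receiver) both follow "news" and "ann", are both followed by "ann" and "cat", and only "ann" is a mutual follower of both, so the sketch reports two common followees, two common followers, and one common mutual follower.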

We posit that overlap in the network connections between two users can influence the sharing propensity in three different ways. First, a high number of common followees suggests that the sender and the receiver have similar interests and, in turn, may have similar propensity to share a particular piece of content. Similarly, more common followers and common mutual followers between the sender and the receiver may suggest that their followers share a similar taste or interest. In this case, a receiver may consider content to be more suitable for her audience and have a higher propensity to share it. Second, a receiver may respond differently to the taste of her audience depending on whether she shares weak (e.g., followers) or strong (e.g., mutual followers) ties with her audience, which may lead to differential effects of common followers and common mutual followers on content sharing (Dubois, Bonezzi, and De Angelis 2016). Finally, a larger common audience in terms of common followers and common mutual followers may suggest higher redundancy in the information received by the audience and deter a user from sharing the content to satisfy her desire for uniqueness (Cheema and Kaikati 2010; Ho and Dempsey 2010; Lovett, Peres, and Shachar 2013). As a result, a user may be less likely to share popular content, because many others have already shared it.

Table 1
GLOSSARY

Connections
  Friend: A user mutually connected with the focal user (undirected networks)
  Followee: A user followed by the focal user (directed networks)
  Follower: A user following the focal user (directed networks)
  Mutual follower: A user following and followed by the focal user (directed networks)

Network overlap
  Common friend: A user mutually connected to both the sender and the receiver (undirected networks)
  Common followee: A user followed by both the sender and the receiver (directed networks)
  Common follower: A user following both the sender and the receiver (directed networks)
  Common mutual follower: A user following and followed by both the sender and the receiver (directed networks)

Others
  Share: Retweet a tweet or digg an ad
  Feed: Information notifying a user about the sharing activity of one’s followees
  Cosenders: The set of followees of the focal user who have already shared the tweet/ad

¹ Aral and Walker (2014) use the term “embeddedness” to represent network overlap. However, embeddedness has been used to represent network constraints associated with an actor in a network (Granovetter 1985). To avoid confusion, we use the term “network overlap” (Easley and Kleinberg 2010). Reagans and McEvily (2003) use the term “social cohesion” to further incorporate the weight of each overlapping connection.

² Brashears and Quintane (2015) show that people tend to code network relationships as triads and subgroups in their memory. Such network recall can also help them in remembering network overlap with others. Using data from a survey we conducted on Amazon’s Mechanical Turk, we establish that active users (i.e., users who retweet at least once a day) can compare their Twitter connections in terms of overlap with approximately 78% accuracy (for details, see Web Appendix A).

We evaluate the impact of the three different types of network overlap noted previously (numbers of common followees, common followers, and common mutual followers) for content sharing within sender–receiver dyads. Our micro-level model for sharing accounts for users’ profile information and their social network. We estimate the model using a data set that contains sharing of tweets posted by Fortune 500 companies on Twitter. We show the robustness of our results using a second data set that contains the sharing of sponsored ads posted by companies on Digg. At the time of data collection, both websites (Twitter and Digg) maintained a directed social network. We analyze our data using a novel proportional hazards model that allows a receiver’s decision to be influenced by multiple senders.

We emerge from the analyses with three key findings. First, network overlap plays a significant role in content sharing on online social networks. Second, the propensity of a receiver to share content depends on all three measures of network overlap (i.e., common followees, common followers, and common mutual followers), suggesting that each measure independently contributes to the sharing propensity. Notably, sharing propensity increases more so with common followers than with common mutual followers. Third, the effects of common followers and common mutual followers are moderated by the novelty of content. Their effects are positive only when the content is relatively novel (i.e., not shared by many others). When many others have shared the content, the positive effects decrease and may even become negative, suggesting that users’ need for uniqueness is a likely mechanism at work. We use a simulation study to show how the selection of users for spreading content can be improved on the basis of the network overlap with their followers. Our results suggest that targeting senders while taking into account their network overlap saves approximately 35%–70% of time to spread content to a fixed percentage of users as compared with targeting otherwise identical senders who have no network overlap with their followers. Furthermore, the optimal set of users to target depends on the popularity of the content. Collectively, our results can help marketers more effectively target users for spreading content on social media platforms.

RELATED LITERATURE

Our work relates to the broad marketing literature on word of mouth (WOM) and, specifically, to the role of social network characteristics on content sharing. We briefly discuss each of these areas of research.

Extant literature on WOM has focused on how content and brand characteristics drive the aggregate WOM performance (e.g., Berger 2014; Lovett, Peres, and Shachar 2013). Similarly, studies pertaining to content sharing on social media platforms have focused on the role of content characteristics (Lee, Hosanagar, and Nair 2018; Zhang, Moe, and Schweidel 2017) and firm strategies (Aral and Walker 2011; Lambrecht, Tucker, and Wiertz 2018). However, this stream of literature has not considered the impact of social network structure on content sharing and related outcomes.

Users’ network characteristics can be easily observed in an online social network and can provide quantifiable metrics to managers for operationalizing their social media marketing efforts (Van den Bulte and Wuyts 2007, p. 11). To this end, several studies have focused on the role of network characteristics on content sharing. A few studies have investigated the role of senders’ unitary network characteristics on the overall extent of content diffusion in the network (e.g., Susarla, Oh, and Tan 2012; Yoganarasimhan 2012). Although these studies examine the influence of senders, they do not consider senders’ propensity to share content per se. In the context of content sharing, Shriver, Nair, and Hofstetter (2013) show that users with more network ties have a higher propensity to generate and share content and this, in turn, leads to more network ties. Hinz et al. (2011) show that highly connected users are more likely to participate in viral marketing campaigns and are the most effective seeds. However, none of these studies consider an individual receiver’s propensity to further share or rebroadcast the content and examine how it depends on the dyadic network characteristics between the sender–receiver pair.

Figure 1
THE NETWORK INFORMATION OF A FOLLOWEE AVAILABLE TO A FOCAL USER

Another stream of research has focused on establishing the role of a content receiver’s local network characteristics on his or her subsequent actions, including the adoption of an online social network (Katona, Zubcsek, and Sarvary 2011), drug adoption (Iyengar, Van den Bulte, and Valente 2011), retweeting behavior (Luo et al. 2013), and churn decisions (Nitzan and Libai 2011). However, none of the studies consider the role of shared network characteristics between the sender–receiver pair on the receiver’s propensity to take an action or share content.

More recent studies have begun to consider the role of dyadic network characteristics on a focal user’s actions. For instance, Shi, Rui, and Whinston (2014) study receivers’ content-sharing propensity and primarily focus on the role of reciprocity between senders and receivers. Using an email network, Aral and Van Alstyne (2011) show that the novelty of the information a user receives is positively associated with her network diversity (in other words, lower network overlap with her connections).³ However, these studies do not evaluate the receiver’s propensity to share the content with her followers and how it varies with her network overlap with the sender. In particular, previous research does not explicitly evaluate the effect of audience overlap (represented by network overlap measures such as common followers and common mutual followers) in rebroadcasting or content-sharing decisions. More recently, Aral and Walker (2014) examine the effect of common friends (i.e., common mutual followers) between a sender and a receiver on Facebook and find that it has a positive effect on the adoption of an application. However, their results cannot be applied to users’ actions as a function of overlap observed in directed networks. In addition, the decision to adopt in their application is private, as users do not share this information with others, and thus is driven by factors different from those driving publicly visible content sharing.⁴ For example, identity communication is more relevant for publicly consumed products compared with privately consumed products and is likely to play a role when a user shares content but may not apply when a user adopts a product in private (Cheema and Kaikati 2010).

In summary, there is much interest in understanding how users’ network characteristics affect content sharing in networks (Table 2). Extant literature, to the best of our knowledge, has not considered the role of different types of network overlap on content sharing. We fill the gap and evaluate how different types of network overlap moderate content sharing in directed social networks.

THEORETICAL BACKGROUND

Consumers typically share content to satisfy multiple goals. Next, we discuss how the three types of network overlap satisfy user goals and their potential impact on content sharing.

Common Followees

Users share content to shape others’ impressions about them (Berger 2014; Toubia and Stephen 2013). For instance, people share topics or ideas that signal that they have knowledge in a particular domain.

In a directed social network (e.g., Twitter), people follow others to keep themselves informed about their activities and posted content. Thus, the composition of one’s followees largely reflects her topical interest or taste. The more common followees two users have, the more likely they have similar interests. In this case, a receiver with more common followees with the sender is more likely to be knowledgeable about the sender’s content and more likely to share it with others as compared with a receiver with fewer common followees with the sender.

Common Followers

The composition of one’s followers represents the taste of her audience. To establish a good impression, the taste of the audience is a factor that users are likely to consider while sharing content (Berger 2014; McQuarrie, Miller, and Phillips 2013). The more common followers two users have, the more similar audience they have, and the more likely they will make similar decisions regarding whether to share a piece of content to their followers to create an impression. In addition, similarity in the interests of the audiences of two users also represents similarity in their own expertise or knowledge. As users tend to signal their identity by sharing their knowledge, this can further increase the propensity of the receiver to share content from the sender with a high number of common followers.

Common Mutual Followers

Because of the bidirectional nature of the connections with mutual followers, the number of such connections characterizes the mutual accessibility of two users through third parties. According to the bandwidth hypothesis, the existence of common mutual connections expands the bandwidth of communication among users and makes their evaluation of each other more accurate (Aral and Van Alstyne 2011; Burt 2001). In this case, a receiver may find information received from a sender with many common mutual followers to be more useful and is more likely to share it for creating an impression. In addition, because the bidirectional connection represents a strong tie, the more common mutual followers two users have, the more likely they belong to the same social group (Shi, Rui, and Whinston 2014). Thus, a user may have a higher need to interact with the sender with common mutual followers to meet the need for social bonding (Baumeister and Leary 1995). On social media platforms, one way for such interactions with the sender is to propagate the content received from the sender. Thus, the higher the number of common mutual followers two users have, the more likely they feel obligated to propagate each other’s shared content to maintain a strong social bond. Finally, more common mutual followers may also suggest more similar audience with the sender, as well as a higher similarity in taste with the sender because of homophily, even after accounting for the effect of other network overlap metrics. This would further increase the receiver’s propensity to share content from the sender.

³ Although Aral and Van Alstyne (2011) do not conduct an actual dyadic-level analysis, their measure captures the effect of average overlap of a user with her network neighbors.

⁴ Online social networks allow users to share content, as well as their actions, with their connections. Aral and Walker (2014) randomize the sharing of app usage information across users in their experiment. However, the app adoption decision is private per se. Furthermore, the visibility of the app usage information (to a small subset of friends) is passively manipulated by the app rather than actively enabled by users.

Common Mutual Followers Versus Common Followers

In a broadcasting context, users focus on their need to create an impression while sharing content (Barasch and Berger 2014). Users on a social network can achieve this need for creating a good impression by being responsive to the taste of their followers. Thus, an audience’s taste is likely to be the dominant driver for sharing content, compared with other shared attributes with the sender such as bandwidth and social bond. This is especially true for directed networks in which users can establish identity and create an impression in the presence of massive audiences (McQuarrie, Miller, and Phillips 2013). In addition, users’ responsiveness to their audience’s taste may vary with audience type. For example, the need for self-enhancement is higher with weaker ties than with stronger ties (Dubois, Bonezzi, and De Angelis 2016). Because users already know people with strong ties, they may only feel the need to impress others with whom they share weak ties. In our context, followers and mutual followers represent audiences with weak and strong ties, respectively. Thus, a user is more likely to be responsive to the taste of the audience represented by common followers than to that represented by common mutual followers.

Role of Content Popularity

Users are intrinsically motivated to achieve uniqueness; thus, being overly similar to others induces negative emotions (Snyder and Fromkin 1980). This desire to express uniqueness is stronger for publicly consumed products than privately consumed ones (Cheema and Kaikati 2010). Moreover, the need for uniqueness is stronger in online interactions than offline interactions and leads to higher WOM for differentiated brands (Lovett, Peres, and Shachar 2013). Prior work has suggested that users can satisfy their need for uniqueness by sharing novel online content (e.g., Ho and Dempsey 2010). Thus, to establish a unique identity on social media platforms and to avoid excessive similarity with a sender, a user may resist sharing popular content received from the sender if they have a common audience (common followers or common mutual followers). Note that the need for uniqueness as a driver should only come into play when there is an audience. Therefore, the need for uniqueness is unlikely to moderate the effect of common followees, as they represent sources, rather than the audience, of a focal user.

The finding that different drivers can be associated with three different metrics illustrates the nuanced role of different types of network overlaps on content sharing in directed networks. Table 3 summarizes the drivers associated with the three network overlap metrics in directed networks.

MODEL

Our objective is to evaluate the impact of network overlap on the propensity of a receiver to share content obtained from sender(s). Specifically, we model the time it takes a receiver to share the content and how it is affected by her network overlap with the sender(s) who have already shared the content. We do so using a continuous-time proportional hazards model (Cox 1972) at the dyadic (sender–receiver) level.

In social media contexts, one research challenge is that auser may receive multiple feeds from different senders sharingthe same content (or an aggregated feed frommultiple senders)and the contribution of each cosender on the decision to shareis unclear. At the consumer (receiver) level, several modelshave been proposed to deal with the impact of multiple sendersor multiple ad exposures (Braun and Moe 2013; Toubia,Goldenberg, and Garcia 2014; Trusov, Bodapati, and Bucklin2010). A key difference between the present study and thesestudies is that our unit of analysis is a dyad rather than anindividual. Individual-level analysis often comes with someform of aggregation on the sender side. For example, Katona,Zubcsek, and Sarvary (2011) accommodate multiple sendersby considering the average characteristics of senders, whichcompromises model precision. While Trusov, Bodapati, andBucklin (2010) do consider the effect of each individualsender on a user (restricted to be either 0 or 1), their modeldoes not allow statistical inference on the effects of dyadiccharacteristics such as network overlap. We address thischallenge by proposing a novel proportional hazards modelthat allows us to estimate the contribution of individualsenders when multiple cosenders collectively cause a de-cision to share content.

Dyadic Hazard

To ease model exposition, we present it in the context of sharing content generated or shared by a company over the social media platform Twitter (as it is the context of our primary data set). On Twitter, when a user (sender) retweets (shares) a piece of content (a tweet), her followers (receivers) are immediately notified about her sharing activity in the form of a feed. A receiver can have multiple senders (cosenders) if more than one of her followees retweets the same content.

Table 2
LITERATURE ON THE ROLE OF NETWORK CHARACTERISTICS ON USER ACTIONS

| Network Characteristics | Literature |
|---|---|
| Unitary network characteristics of senders | Impact of sender characteristics on content propagation (Susarla, Oh, and Tan 2012; Yoganarasimhan 2012), content generation (Shriver, Nair, and Hofstetter 2013), and participation of viral marketing campaigns (Hinz et al. 2011) |
| Unitary network characteristics of receivers | Impact of receiver characteristics on website registration (Katona, Zubcsek, and Sarvary 2011), drug adoption (Iyengar, Van den Bulte, and Valente 2011), retweet behavior (Luo et al. 2013), and churn decisions (Nitzan and Libai 2011) |
| Dyadic network characteristics | Impact of network diversity on the novelty of information received by a user (Aral and Van Alstyne 2011) |
| | Impact of reciprocity on a receiver’s propensity to share content (Shi, Rui, and Whinston 2014) |
| | Impact of common friends on app adoption (Aral and Walker 2014) |
| | Impact of common followees, common followers, and common mutual followers on a receiver’s propensity to share content (present study) |

Let i, j, and k index senders, receivers, and tweets, respectively. Let t be the time elapsed since the creation of a particular tweet. Let $X_i(t)$ and $X_j(t)$ represent the unitary attributes of sender i and receiver j, respectively (e.g., gender and activity level of a user on Twitter). Let $X_{ij}$ represent the dyadic attributes concerning sender i and receiver j (e.g., network overlap measures), $X_{ik}$ represent sender i’s attributes that are specific to tweet k (e.g., the time sender i retweets tweet k), and $X_{jk}$ represent receiver j’s attributes that are specific to tweet k (e.g., number of receiver j’s followees [i.e., senders] that have shared tweet k). Let $\lambda_{ijk}(t)$ represent the dyadic-level hazard of sender i causing receiver j to share tweet k at time t. Let $\lambda_{k0}(t)$ denote the baseline hazard for tweet k. The dyadic-level hazard, stratified on tweets, is given by

$$\lambda_{ijk}(t) = \lambda_{k0}(t)\exp\left[\beta_1 X_i(t) + \beta_2 X_j(t) + \beta_3 X_{ij}(t) + \beta_4 X_{ik}(t) + \beta_5 X_{jk}(t)\right]. \tag{1}$$

Note that this semiparametric formulation allows $\lambda_{k0}(t)$ to change arbitrarily over time and across tweets, enabling us to capture static content-specific effects such as the text of a tweet and time-varying effects such as the overall declining tendency to share a specific tweet over time. For example, $\lambda_{k0}(t) = 0$ represents a case when a tweet stops diffusing in the network. This formulation of dyadic hazard is similar to the formulations in previous work, but we allow one receiver to be exposed to the same tweet from multiple senders (Aral and Walker 2014; Lu, Jerath, and Singh 2013). To test the effect of content popularity, we also consider interactions of the popularity of a tweet with common followers and common mutual followers and include them as dyadic attributes.
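As a rough illustration of Equation 1, the following Python sketch evaluates the dyadic hazard for a single sender, receiver, and tweet at one point in time. The function name `dyadic_hazard`, the group labels, and all numeric values are hypothetical placeholders rather than estimates from this study, and the baseline hazard is passed in as a plain number because the model leaves it unspecified.

```python
import math

def dyadic_hazard(baseline, coefs, attrs):
    """Hazard of one (sender, receiver, tweet) triple at a given time.

    baseline: value of the tweet-specific baseline hazard at that time
    coefs:    dict mapping attribute-group name -> list of coefficients
    attrs:    dict mapping the same names -> list of attribute values
    """
    linear = 0.0
    for group, betas in coefs.items():
        xs = attrs[group]
        if len(xs) != len(betas):
            raise ValueError(f"length mismatch in group {group!r}")
        linear += sum(b * x for b, x in zip(betas, xs))
    return baseline * math.exp(linear)

# Hypothetical coefficients and attributes, for illustration only.
coefs = {"sender": [0.1], "receiver": [-0.2], "dyad": [0.3],
         "sender_tweet": [0.0], "receiver_tweet": [-0.1]}
attrs = {"sender": [1.0], "receiver": [2.0], "dyad": [3.0],
         "sender_tweet": [5.0], "receiver_tweet": [1.0]}
h = dyadic_hazard(0.01, coefs, attrs)
```

Because the covariates enter only through the exponential term, a baseline of zero forces the hazard to zero regardless of the attributes, which matches the case of a tweet that has stopped diffusing.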

Spontaneous Sharing

The basic specification of dyadic hazard ignores the possibility of users sharing spontaneously. For example, a user may share a brand-authored tweet received from another user in the social network or after browsing the brand’s home page on Twitter. The latter type of sharing is termed “spontaneous sharing” and occurs through a nonsocial source (e.g., the brand home page or an external site). In the context of Twitter, the sharing is spontaneous if a user shares a tweet before any of her followees do. Otherwise, the sharing is considered potentially influenced by others. To incorporate the impact of nonsocial sources in our study, we treat them as a special sender and use a dummy variable to capture their effect on the hazard rate:

$$\lambda_{ijk}(t) = \lambda_{k0}(t)\exp\left[\beta_0 s_i + \beta_1 X_i(t) + \beta_2 X_j(t) + \beta_3 X_{ij}(t) + \beta_4 X_{ik}(t) + \beta_5 X_{jk}(t)\right], \tag{2}$$

where the dummy variable $s_i$ is 1 if the sender is the special sender and 0 otherwise. For the special sender, all undefined unitary and dyadic attributes are coded as missing and set to zero (or any other default value as the selection of default only affects parameter $\beta_0$). The parameter $\beta_0$ captures the combined effect of all nonsocial sources, as compared with a sender whose attributes are zero, on the sharing of the receiver. Because all users can share spontaneously, the special sender is a cosender for every potential sharing user. Our dummy variable formulation enables us to seamlessly incorporate the effect of nonsocial sources.
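A short derivation may help show why the choice of default value affects only the intercept of the special sender. It assumes, for illustration, that the sender-side attributes $X_i$, $X_{ij}$, and $X_{ik}$ are all undefined for the special sender and are replaced by constant defaults $c_1$, $c_3$, and $c_4$ (symbols introduced here, not in the original). Setting $s_i = 1$ in Equation 2 gives

$$
\begin{aligned}
\lambda_{ijk}(t) &= \lambda_{k0}(t)\exp\left[\beta_0 + \beta_1 c_1 + \beta_2 X_j(t) + \beta_3 c_3 + \beta_4 c_4 + \beta_5 X_{jk}(t)\right] \\
&= \lambda_{k0}(t)\exp\left[\tilde{\beta}_0 + \beta_2 X_j(t) + \beta_5 X_{jk}(t)\right],
\qquad \tilde{\beta}_0 = \beta_0 + \beta_1 c_1 + \beta_3 c_3 + \beta_4 c_4 .
\end{aligned}
$$

Under this assumption, any change in the defaults is absorbed into $\tilde{\beta}_0$, while the hazards of social senders (for whom $s_i = 0$) are unchanged.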

Identification

A primary challenge for determining the impact of the network characteristics on user actions is that the results could be biased due to unobservable characteristics. For example, a sender with high popularity offline might be more influential than other senders with similar online characteristics. While such offline information might be observable to the receiver, it is often unknown to the researcher. Similarly, a receiver with stronger interest in brand-related content might be more likely to share their tweets in general, and such topical interest of individual receivers is often not available to the researcher. Missing information on either senders or receivers can bias model estimates. To address this concern, we allow for random effects at the sender level and the receiver level, which allow each sender and receiver to have a random intercept that captures the main effect of unobserved characteristics. Given that the special sender representing the effects of nonsocial sources is intrinsically different from other senders, we allow the variance of the frailty term for the special sender to be different from other senders. We also consider random effects at the dyadic level to account for dyad-specific unobservables, following previous studies in network contexts (Hoff 2005; Lu, Jerath, and Singh 2013).

Note that it is possible that the unobserved characteristics are correlated with observed characteristics. For example, a sender with high unobservable popularity may also have a lot of connections and, as a result, a larger overlap with the receiver’s connections as compared with a less popular sender. Because random effects cannot accommodate such correlations, we also estimate models with fixed effects at the sender and/or the receiver level (fixed effects allow unobserved characteristics to be correlated with observed characteristics). To account for fixed effects on the sender/receiver level, we resort to the dummy variable approach because our dyadic specification makes it infeasible to construct conditional likelihood on the sender/receiver level. Although the fixed-effects approach appears to be more flexible than the random-effects model in terms of its assumptions, it is vulnerable to incidental parameter bias when reoccurrence of senders or receivers is limited (e.g., less than 20) (Greene 2004).⁵ While this is not an issue for senders in our data sample, not all receivers in our sample have 20 observations. For robustness, we also verify our results by considering fixed effects only for receivers with more than 20 observations.

Table 3
DRIVERS ASSOCIATED WITH THE THREE NETWORK OVERLAP METRICS

| Network Overlap Metric | Positive Driver | Negative Driver |
|---|---|---|
| Common followees | Identity signaling | |
| Common followers | Taste of audience, identity signaling | Need for uniqueness |
| Common mutual followers | Bandwidth, social bonding, taste of audience, identity signaling | Need for uniqueness |

Two additional concerns for identification are spontaneous shares and endogenous communication patterns (Aral and Walker 2014). For the former, we explicitly control for the possibility of spontaneous shares by treating all nonsocial sources as a special sender. Such a control not only teases out the effect of nonsocial sources, but also alleviates, to some extent, the concern that a receiver is sharing because of her inherent propensity to share. For the latter, in our application, the platform sends a notification to all followers of a sender. Thus, there is no selection bias on who can see the content (i.e., no endogenous communication patterns). Finally, another problem with identifying content sharing across a dyad is that a receiver often sees the same content from multiple senders before sharing, and the quantitative contribution of each cosender may be unclear. We address this challenge statistically by proposing a novel proportional hazards model that determines the contribution of each cosender on the basis of her characteristics.⁶

DATA

We aim to understand how different types of network overlap between a sender and a receiver connected in a social network influence the sharing behavior of the receiver. A dyadic-level study imposes stringent requirements on the data. First, we need a sample of content generated or shared on a social media platform. Next, for each piece of content, we need complete information regarding how and when the content propagates through the network from activated users (senders) to their followers (receivers) in a given time duration. Such information includes the profile and social graph information of all involved users (both senders and receivers), as well as time-stamped sharing activities at the individual user level. Note that the latter enables us to include all senders of a receiver in the dyadic model; otherwise, our model estimates may be biased. The sample of involved users can be identified by traversing the audience of activated users progressively. Specifically, we can start from a set of seeds (e.g., the author or users who spontaneously share the content) and then treat the followers of these seeds as receivers. This process iterates when a receiver becomes activated (i.e., she shares the content) until the end of the observation time window. This progressive user-sampling approach based on ego’s network enables us to focus on users who are relevant to our analysis, namely, all the activated users (senders) and their followers (receivers). A similar approach has been employed by other researchers interested in the effects of dyadic network characteristics (Aral and Walker 2014; Shi, Rui, and Whinston 2014). Finally, the profile and social graph information on all relevant users can be collected retrospectively from historical data.
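The progressive user-sampling idea can be sketched as a breadth-first traversal. This is a minimal illustration under stated assumptions, not the authors' data collection code: `followers_of` and `share_time` are hypothetical in-memory dictionaries standing in for the social graph and the time-stamped sharing log, and the sketch does not filter dyads on the relative timing of sender and receiver shares.

```python
from collections import deque

def collect_dyads(seeds, followers_of, share_time, window_end):
    """Traverse from the seeds; followers of every activated user become receivers.

    followers_of: dict user -> set of that user's followers (hypothetical input)
    share_time:   dict user -> time at which the user shared (activated users only)
    window_end:   end of the observation window, in the same time units
    """
    senders, dyads = set(), set()
    queue = deque(s for s in seeds if share_time.get(s, float("inf")) <= window_end)
    while queue:
        sender = queue.popleft()
        if sender in senders:
            continue
        senders.add(sender)
        for receiver in followers_of.get(sender, ()):
            dyads.add((sender, receiver))
            # A receiver who shares within the window becomes a sender in turn.
            if share_time.get(receiver, float("inf")) <= window_end:
                queue.append(receiver)
    return senders, dyads

# Toy example: "c" shares only after the window closes, so it never becomes a sender.
followers_of = {"brand": {"a", "b"}, "a": {"c"}, "b": set(), "c": {"d"}}
share_time = {"brand": 0, "a": 2, "c": 50}
senders, dyads = collect_dyads(["brand"], followers_of, share_time, window_end=10)
```

The traversal touches only activated users and their followers, which is what keeps the sample restricted to users relevant to a dyadic analysis.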

We collect a data set with the desired information from Twitter. To improve the managerial relevance of our study, we focus on the sharing (retweeting) of brand-authored tweets. We concentrate on nine brands listed by Fortune magazine as the top Fortune 500 companies using social media (Bessette 2014). We first collected the tweets authored (or retweeted in some cases) by each brand in a 30-day time window around April 2016.⁷ Then, for each tweet, we collected the social graph information needed for our analysis retrospectively in two steps. As the first step, we collected the social graph information of the author and retweeters of each tweet in the ten-day period after the tweet was created. These users represent the set of senders for the focal tweets. Next, we collected the social graph information for the senders’ followers (receivers). For every sender, we considered all followers who retweet (i.e., activated followers). To reduce the size of the data,⁸ we randomly sampled 100 nonactivated followers of the sender using the risk-set sampling approach if there are more than 100 such followers (Hu and Van den Bulte 2014; Langholz and Borgan 1995). The risk-set sampling approach can produce unbiased estimates. Finally, we collected the profile information for all the identified users.
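The capping step described above can be sketched as follows. This is a simplified illustration of capped random subsampling only: it keeps every activated follower and draws at most 100 nonactivated followers per sender. A full risk-set sampling design draws controls from those still at risk at each event time, which is not shown here. The function name and the `seed` parameter are hypothetical.

```python
import random

def sample_followers(followers, activated, cap=100, seed=0):
    """Keep all activated followers; cap the nonactivated ones at `cap`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    kept = [f for f in followers if f in activated]
    non_activated = [f for f in followers if f not in activated]
    if len(non_activated) > cap:
        non_activated = rng.sample(non_activated, cap)
    return kept, non_activated

# Toy example: 500 followers, three of whom retweeted.
kept, sampled = sample_followers(list(range(500)), activated={1, 2, 3})
```

When a sender has 100 or fewer nonactivated followers, the function returns all of them unchanged.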

For each potential receiver, we generate one dyadic observation for her if one of her followees shares the tweet. Because everyone can retweet spontaneously without the influence of their followees, we add an additional dyadic observation for each receiver, in which the sender is a special sender who captures the effect of the nonsocial source (as discussed in the “Model” section). A user converts from a receiver to a sender immediately after the sharing activity. Table 4 shows the summary information of the data set. The table shows that 6.4% of shares have more than 1 cosender, excluding the special sender, and the average number of cosenders is 2.12, including the special sender. This validates the need for a model that accounts for multiple cosenders.

Table 4
SUMMARY STATISTICS

| Statistic | Value |
|---|---|
| Number of tweets | 397 |
| Number of senders | 12,565 |
| Number of receivers | 869,899 |
| Number of (sender, receiver, tweet) tuples | 2,972,026 |
| Number of spontaneous tuples | 1,483,729 (50%) |
| Number of social tuples | 1,488,297 (50%) |
| Number of observations after accounting for time-varying variables | 949,480,746 |
| Number of shares (retweets) | 18,493 |
| Number of spontaneous shares | 3,695 (20%) |
| Number of potential influenced shares | 14,798 (80%) |
| Percentage with more than one cosender (excluding the special sender) | 6.4% |

Notes: In a social tuple, the sender is an actual user who is followed by the receiver, whereas in a spontaneous tuple, the sender is a special sender who captures the effect of nonsocial sources.

⁵ Greene (2004) examined the bias of the dummy variable approach for a variety of nonlinear models using Monte Carlo methods. In general, the bias becomes fairly small when an individual had more than 20 repetitions.

⁶ Our estimation approach is included in Web Appendix B.

⁷ Because we could not collect data on all brands concurrently in a short time interval (the network structure among users may change if the data are not collected in a short time interval), we collected data on the nine brands in three different time windows. All tweets in our sample were posted during March 14–May 4, 2016.

⁸ Because the density and network size of Twitter users is very high (a Twitter user in our data set has more than 8,000 followers, on average, and the median number of followers is 741), collecting social graph information for all followers of every sender is extremely time consuming. It would take us more than six years to collect information for all receivers using our setup due to application program interface restrictions.
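The construction of dyadic observations for one receiver and one tweet can be sketched as below. The label `SPECIAL_SENDER` and both function names are hypothetical, and the sketch ignores timing (it takes the set of followees who have shared as given).

```python
SPECIAL_SENDER = "__nonsocial__"  # hypothetical label for the nonsocial source

def dyadic_observations(receiver, followees, sharers):
    """One row per followee who has shared, plus one special-sender row."""
    rows = [(sender, receiver) for sender in sorted(followees) if sender in sharers]
    rows.append((SPECIAL_SENDER, receiver))
    return rows

def count_cosenders(rows):
    """Number of social cosenders, excluding the special sender."""
    return sum(1 for sender, _ in rows if sender != SPECIAL_SENDER)

# Toy example: receiver "r" follows a, b, c; users a, c, and z have shared.
rows = dyadic_observations("r", {"a", "b", "c"}, {"a", "c", "z"})
n_social = count_cosenders(rows)
```

Every receiver gets at least the special-sender row, so a receiver with no sharing followees still contributes one (spontaneous) tuple.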

We use several control variables pertaining to the sender, the receiver, and the sender–receiver dyad. These variables, summarized in Table 5, include the unitary network attributes of the sender/receiver, the engagement level of the sender/receiver, the demographics of the sender/receiver, the timing of the sender’s sharing activity, the number of cosenders in the receiver’s network, and so forth. Table 6 reports summary statistics for the main unitary and dyadic network attributes and key control variables. Kaplan–Meier survival curves in Web Appendix C show that path lengths for diffusion are very short and underscore the need for improving content propagation.

Table 5
DESCRIPTIONS OF INDEPENDENT VARIABLES

| Group | Category | Independent Variable | Description |
|---|---|---|---|
| Xi/Xj: Attributes of sender i/receiver j | Network attributes | followees | Number of followees (out-degree) |
| | | followers | Number of followers (in-degree) |
| | | mutuals | Number of mutual followers |
| | | lists | Number of lists subscribed |
| | Engagement levels | statuses | Total number of tweets, including retweets |
| | | favorites | Total number of favorites |
| | Others | verified | Whether the Twitter account is verified |
| | | regMon | How many months the user has been registered on Twitter |
| | | isSocial (si) | 1 if sender i is a social source (i.e., followee), otherwise 0 |
| | | isAuthor | 1 if the sender is the author of the tweet, otherwise 0 |
| Xij: Attributes of a sender–receiver dyad | Dyadic network attributes | isMutual | Whether the sender and the receiver follow each other mutually |
| | | commonFollowees | Number of followees shared by the sender and the receiver |
| | | commonFollowers | Number of followers shared by the sender and the receiver |
| | | commonMutuals | Number of mutual followers shared by the sender and the receiver |
| Xik: Sender-specific attributes of a tweet | Sharing timing | wday | Day of a week when sender i retweeted tweet k |
| | | hour | Hour of a day when sender i retweeted tweet k |
| | | shareTime | Hours taken for sender i to retweet since the creation of tweet k |
| Xjk: Receiver-specific attributes of a tweet | | cosenders | Number of followees (cosenders) of the receiver who have already shared |
| Xk: Attributes of tweet k (only interactions with other variables can be identified) | | popularity | Number of retweets at a given time point |

Table 6
KEY STATISTICS OF MAIN VARIABLES

| Variable | Zeros | M | SD | Min | Mdn | Max |
|---|---|---|---|---|---|---|
| Unitary Network Attributes of All Users | | | | | | |
| Number of followees | 5,610 | 4,553.0 | 24,351.6 | 0 | 741 | 4,651,052 |
| Number of followers | 8,268 | 8,049.2 | 174,775.8 | 0 | 363 | 79,317,306 |
| Number of mutuals | 17,910 | 2,967.0 | 18,544.3 | 0 | 200 | 1,720,546 |
| Dyadic Network Attributes of Sender–Receiver Dyads | | | | | | |
| isMutual (1 – reciprocal, 0 – nonreciprocal) | 612,468 | .42 | .49 | 0 | 0 | 1 |
| Number of common followees | 189,767 | 25.7 | 79.1 | 0 | 6 | 3,947 |
| Number of common followers | 359,837 | 12.4 | 292.4 | 0 | 2 | 193,179 |
| Number of common mutual followers | 469,259 | 19.5 | 111.7 | 0 | 1 | 15,520 |
| Popularity of Content | | | | | | |
| Number of retweets | 0 | 46.6 | 114.2 | 1 | 13 | 795 |


Table 7 outlines the correlation among dyadic network characteristics. To clearly identify the effects of different overlapping connections, we exclude common mutual followers while counting the numbers of common followees and common followers. The correlations among the three metrics are not particularly high, which suggests that they are capturing different drivers. Furthermore, the estimates of the correlated variables are stable with changes in model specifications and data samples, suggesting that multicollinearity is unlikely to be an issue.
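The counting rule above can be sketched with set operations. The sketch assumes that a common mutual follower is a user who mutually follows both the sender and the receiver, and that such users are removed from the common-followee and common-follower counts; the function name and input sets are hypothetical.

```python
def overlap_counts(followees_i, followers_i, followees_j, followers_j):
    """Three overlap measures for sender i and receiver j, given sets of user IDs."""
    mutual_i = followees_i & followers_i   # users who mutually follow i
    mutual_j = followees_j & followers_j   # users who mutually follow j
    common_mutuals = mutual_i & mutual_j
    # Exclude common mutual followers from the two one-directional counts.
    common_followees = (followees_i & followees_j) - common_mutuals
    common_followers = (followers_i & followers_j) - common_mutuals
    return {
        "commonFollowees": len(common_followees),
        "commonFollowers": len(common_followers),
        "commonMutuals": len(common_mutuals),
    }

# Toy example: "m" mutually follows both users, so it counts only as a common mutual.
counts = overlap_counts(
    followees_i={"a", "b", "m"}, followers_i={"c", "m", "x"},
    followees_j={"a", "m", "y"}, followers_j={"c", "m", "z"},
)
```

Removing the shared mutual followers from the other two counts keeps the three measures from double-counting the same connection.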

To evaluate the potential relationship between network overlap and propensity to share content, we compute the average network overlap of activated and nonactivated sender–receiver dyads, respectively. A dyad is considered activated if the receiver shares the content. Figure 2 shows the difference in the magnitude of each type of network overlap between activated and nonactivated dyads. Activated dyads have higher network overlap than nonactivated dyads, suggesting that higher network overlaps in terms of common followees, common followers, and common mutual followers can be associated with higher propensity to share. To determine the effect of popularity of content, we further divide activated dyads into two groups on the basis of whether they were activated when the popularity of the tweet is above or below the average popularity of all tweets. Figure 3 shows that the average number of common followers associated with dyads activated at high popularity is lower compared with that associated with dyads activated at low popularity. The figure suggests that relatively fewer dyads with a high number of common followers are activated when the tweet popularity is high, implying that popularity may negatively moderate the effect of common followers on the propensity to share. Common mutual followers show a similar pattern except that the difference in the average numbers of common mutual followers for activated dyads at high and low tweet popularity is relatively small. To infer the true effects of network overlap, we also must control for confounding factors that may affect both network overlap and propensity to share. We achieve this by estimating the proportional hazards model discussed previously.

RESULTS

Main Results

Table 8 summarizes the results of four model specifications. Our main model of interest is Model 4, which includes interaction terms representing the moderating effects of tweet popularity on common followers and common mutual followers. We also estimate models with no interaction terms or including only one of the two interaction terms (Models 1–3, respectively). Likelihood ratio tests suggest that Model 4 is preferred over Models 2 and 3 (p < .001). The following discussion is based on the estimates from Model 4 unless otherwise specified.
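The likelihood ratio comparison can be approximately reproduced from the rounded log-likelihoods reported in Table 8. The sketch below assumes that Model 4 adds exactly one parameter relative to Model 2 and to Model 3 and that the usual chi-square reference distribution with one degree of freedom applies; both are assumptions made here for illustration, and the function name is hypothetical.

```python
import math

def lr_test_1df(loglik_restricted, loglik_full):
    """Likelihood ratio statistic and p-value for one extra parameter."""
    stat = max(0.0, 2.0 * (loglik_full - loglik_restricted))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Rounded log-likelihoods from Table 8.
stat_2, p_2 = lr_test_1df(-100123, -100113)  # Model 4 versus Model 2
stat_3, p_3 = lr_test_1df(-100136, -100113)  # Model 4 versus Model 3
```

Under these assumptions, both p-values fall well below .001, in line with the reported preference for Model 4.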

Common followees. The number of common followees has a positive effect on the sharing propensity of the receiver. The higher the number of common followees between a sender and a receiver, the higher the similarity in their interests and knowledge. Thus, the more common followees the receiver has with the sender, the more likely the receiver is knowledgeable about or interested in the sender’s content and more likely she will share the content from the sender to impress others. Note that we obtain this result after controlling for the effect of common mutual followers, which represent close friends. Thus, our result suggests that common followees can also be used to capture similarity or homophily between users (McPherson, Smith-Lovin, and Cook 2001).

Common followers. The simple effect of common followers (when the logarithm of the content popularity is zero) is positive, suggesting that the number of common followers has a positive effect on dyadic influence when the popularity of a tweet is low. Because more common followers reflect higher similarity between the audiences of the sender and the receiver, the receiver is likely to make the same decision as the sender (i.e., to share), especially when the content is relatively novel and the concern around uniqueness is not strong. The negative interaction of common followers with content popularity can be explained by users’ need for uniqueness in content sharing (Ho and Dempsey 2010). This is similar to the extant finding that consumers with a high need for uniqueness may decrease the consumption of a product if it becomes commonplace, also known as the reverse-bandwagon effect (Cheema and Kaikati 2010).

Table 7
CORRELATION AMONG DYADIC NETWORK CHARACTERISTICS

| | isMutual | logCommonFollowees | logCommonFollowers | logCommonMutuals |
|---|---|---|---|---|
| isMutual | 1.00 | .17 | .30 | .51 |
| logCommonFollowees | | 1.00 | .52 | .51 |
| logCommonFollowers | | | 1.00 | .74 |
| logCommonMutuals | | | | 1.00 |

Figure 2
THE MEAN NETWORK OVERLAP OF ACTIVATED VERSUS NONACTIVATED DYADS

[Bar chart. Vertical axis: mean network overlap (0 to 250). Horizontal axis: Common Followees, Common Followers, Common Mutuals. Series: Not Activated, Activated.]


Common mutual followers. The simple effect of common mutual followers (when the logarithm of content popularity is zero) is positive and demonstrates that, when the content is relatively novel, common mutual followers have a positive impact on sharing. A high number of common mutual followers between a sender and a receiver represents higher similarity in their interests and the taste of their audiences. In addition, such overlap suggests stronger social bonding as well as higher bandwidth to better evaluate each other’s content (Aral and Van Alstyne 2011; Burt 2001). Thus, a receiver is more likely to find content from a sender with many common mutual followers to be useful and is more likely to share it.

The negative interaction of common mutual followers with popularity shows a boundary condition for the positive effect of common friends (common mutual followers) previously reported in undirected networks (Aral and Walker 2014; Bapna et al. 2017). Specifically, the effect of common friends is positive only when the information to be communicated is relatively novel (or not as popular).

Our results show that the effect of network overlap in directed networks varies across different types of “connections.” Moreover, the impacts of common followers and common mutual followers are negatively moderated by content popularity. The negative interaction effects suggest that users are eventually going to stop sharing because of concerns regarding uniqueness. As a result, the content is likely to diffuse for short distances within a network.⁹ The negative interaction effects also confirm that common followers and common mutual followers do represent characteristics such as the similarities in audiences with weak and strong ties and are not just redundant measures representing homophily.
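To make the moderation concrete, the sketch below combines the Model 4 simple effects and interaction coefficients from Table 8 to find the popularity level at which each net effect would change sign. It assumes that logPopularity is the natural logarithm of the retweet count with no offset, which this section does not state, so the resulting thresholds are illustrative only. The function names are hypothetical.

```python
import math

def net_effect(simple, interaction, popularity):
    """Net coefficient of an overlap measure at a given popularity (retweet count)."""
    return simple + interaction * math.log(popularity)

def sign_change_popularity(simple, interaction):
    """Popularity at which the net effect equals zero (natural-log assumption)."""
    return math.exp(-simple / interaction)

# Model 4 estimates from Table 8.
followers_threshold = sign_change_popularity(0.230, -0.029)  # roughly 2,780 retweets
mutuals_threshold = sign_change_popularity(0.079, -0.017)    # roughly 104 retweets
```

Under this assumption, the net effect of common mutual followers would turn negative at a popularity level inside the observed range (the maximum in Table 6 is 795 retweets), whereas the net effect of common followers would remain positive throughout that range.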

Common mutual followers versus common followers. The coefficient of common followers is higher than the coefficient of common mutual followers (Table 8).¹⁰ Note that Model 4 shows the simple effect of common followers and common mutual followers (when the logarithm of the content popularity is zero). The difference between the two coefficients is also significant in Model 1, which captures the effects of common followers and common mutual followers averaged across all levels of content popularity. In addition, we employ an alternate model in which we constrain the coefficients of common followers and common mutual followers to be the same. We find that our current model specification provides a much better fit than this alternate model (the difference in Bayesian information criterion [BIC] is much larger than 10). These results confirm that the difference between the coefficients of common followers and common mutual followers is positive and significant. Thus, the propensity to share increases more with common followers than that with common mutual followers.

A possible explanation is that users pay much more attention to the taste of an audience with weaker ties (followers) than that for an audience with stronger ties (mutual followers). As a result, they tend to share content from the sender with whom they share more common followers. That different types of followers have differential effects confirms the importance of considering the directionality of connections in social networks. Our results show that in targeting users for content propagation, it is better to select users who share common followers with their audiences than users who share common mutual followers with their audiences.

We also validate our results using the presence of overlap between users instead of the magnitude of overlap. Specifically, we include an indicator variable, which is 1 if there is overlap between two users and 0 otherwise. The results indicate that the presence of overlap with the sender affects the sharing behavior of receivers.

In addition to sharing the findings on the three network overlap measures, we highlight the estimates for two additional variables (i.e., cosenders and shareTime),¹¹ which help us understand how each cosender contributes to a receiver’s propensity to share. First, the effect of cosenders is negative, showing that the marginal effect of a cosender on content sharing decreases with the number of cosenders (though the overall effect of all senders may increase). This echoes a previous finding on how multiple friends affect the sharing of URLs on Facebook (Bakshy et al. 2012). Second, the effect of shareTime is positive, though not significant in all model specifications, suggesting that the later a cosender shared, the stronger the cosender’s effect on the receiver. This pattern documents a recency effect for cosenders, consistent with previous findings that social effects decay over time (Nitzan and Libai 2011; Trusov, Bucklin, and Pauwels 2009).

Robustness Checks

Unobserved characteristics. A potential concern with our analysis is that the sharing of content could be driven by unobserved characteristics at the sender, receiver, and even the dyad level. In our main analysis, we consider sender-specific, receiver-specific, and dyad-specific random effects. As a robustness check, we also account for the effects of unobserved characteristics with a fixed-effects approach because it allows unobserved characteristics to be correlated with observed characteristics. We estimate fixed effects at the sender and/or receiver level and find that the results are similar. Fixed effects at the dyadic level are not a viable alternative, because the effects of dyadic network characteristics are not identified. Note that the random/fixed effects enable us to account for unobserved factors such as the possibility that some Twitter users might be bots.

Figure 3
THE MEAN COMMON FOLLOWERS AND COMMON MUTUAL FOLLOWERS FOR DYADS ACTIVATED AT HIGH VERSUS LOW POPULARITY

[Bar chart. Vertical axis: mean network overlap (20 to 100). Horizontal axis: Common Followers, Common Mutuals. Series: Activated at low popularity, Activated at high popularity.]

⁹ This result may explain the short information cascades reported in previous literature (Goel et al. 2016) and observed in our data set (Web Appendix C).

¹⁰ A Wald test suggests that the difference is significant (see Web Appendix D).

¹¹ For the coefficients of these two variables, see Web Appendix D.

Table 9 presents the results from different models with random and fixed effects at sender, receiver, and dyad levels. Overall, the estimates on the dyadic network characteristics are qualitatively similar across different model specifications.

Validation with additional data sets. To test whether our findings generalize to other types of content and other directed networks, we collected an additional data set from Digg, a large online social news aggregation website. We focus on the sharing activities of 31 ads in a month-long period. Because the Digg network is much smaller than Twitter, we were able to collect social graph information for all involved senders and receivers.¹² We estimate a model with all three levels of random effects, identical to what we do for the Twitter data set, and find a similar pattern of results (see Table 10).

SIMULATION STUDY FOR CONTENT PROPAGATION

Companies routinely target influential users on social media platforms to spread content.¹³ Given our results on the impact of network overlap on content sharing, we posit that user targeting can be improved drawing on the user’s network overlap with followers. As an example, Twitter allows advertisers to choose audiences on the basis of gender, device, interests, and even any list of users provided by advertisers (see Web Appendix G).

To quantify how network overlap can facilitate the sharing of content, we develop a simulation study that uses the social network structure in the Twitter data set together with the estimated model parameters. We assume all senders (excluding brands) are activated and then simulate how long it takes the senders to activate 1% of all receivers, both with and without considering network overlap. Because we did not estimate the baseline hazard function in our main model, we must make a parametric assumption for the baseline hazard function to simulate survival times for receivers. Without loss of generality, we assume the baseline hazard function follows a Weibull distribution with shape parameter k. Figure 4 summarizes the contribution of network overlap for accelerating content sharing under different scenarios. Compared with not considering network overlap (equivalent to assuming that all coefficients related to network overlap variables are zero), accounting for network overlap saves approximately 35% to 70% of the time needed to activate 1% of receivers across a wide range of the shape parameter k. The amount of time saved is similar for activating 5% and 10% of receivers. The results are similar when the baseline hazard function is a Gompertz distribution.
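A minimal sketch of this kind of simulation is shown below. It draws one survival time per receiver by inverse-transform sampling from a proportional hazards model with a Weibull baseline and then reads off the time at which a given fraction of receivers is activated. It is a simplification under stated assumptions: each receiver has a single linear predictor (no multiple cosenders), and the scale parameter, the coefficient 0.5, and the overlap values are hypothetical rather than taken from the study.

```python
import math
import random

def simulate_time(linear_predictor, shape, scale, rng):
    """Draw a survival time with cumulative hazard scale * t**shape * exp(lp)."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return (-math.log(u) / (scale * math.exp(linear_predictor))) ** (1.0 / shape)

def time_to_activate(linear_predictors, fraction, shape, scale=1.0, seed=0):
    """Time by which the given fraction of receivers has shared."""
    rng = random.Random(seed)
    times = sorted(simulate_time(lp, shape, scale, rng) for lp in linear_predictors)
    needed = max(1, math.ceil(fraction * len(times)))
    return times[needed - 1]

# Hypothetical overlap covariates for 10,000 receivers.
cov_rng = random.Random(1)
overlaps = [0.1 + cov_rng.random() for _ in range(10_000)]
lps_with = [0.5 * x for x in overlaps]   # overlap coefficient switched on
lps_without = [0.0] * len(overlaps)      # overlap coefficients set to zero

t_with = time_to_activate(lps_with, fraction=0.01, shape=1.5)
t_without = time_to_activate(lps_without, fraction=0.01, shape=1.5)
saving = 1.0 - t_with / t_without        # share of time saved
```

Using the same seed for both scenarios reuses the same uniform draws, so the difference between `t_with` and `t_without` reflects only the overlap term.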

To better quantify the marginal effect of network overlap, we artificially increase the network overlap between a dyad by a certain percentage and determine how much it lowers the time for activation. Figure 5 shows how the percentage of time saved increases with network overlap. For instance, a 20% increase in network overlap can reduce the activation time of a dyad by approximately 13%. These simulation results demonstrate the value of network overlap in increasing the speed of content propagation in social networks.
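For a single dyad, the link between a change in overlap and the change in activation time has a closed form under a Weibull baseline: holding the random draw fixed, the activation time is proportional to exp(-eta / k), where eta is the linear predictor. The sketch below illustrates this relationship. It assumes, purely for illustration, that an overlap count enters the model as log(1 + count) and that covariates are constant over time; the function names and the coefficient values passed to them are hypothetical and will not reproduce the figures reported in the text.

```python
import math


def dyad_time_ratio(delta_eta, shape_k):
    """Ratio of new to old activation time when eta rises by delta_eta.

    Under a Weibull baseline with shape k, t is proportional to
    exp(-eta / k), so the ratio is exp(-delta_eta / k).
    """
    return math.exp(-delta_eta / shape_k)


def pct_time_saved_for_dyad(overlap_count, pct_increase, beta, shape_k):
    """Percentage of time saved when one overlap count grows by pct_increase percent.

    Assumes the covariate enters as log(1 + count); this transform and
    `beta` are illustrative placeholders, not values from the paper.
    """
    new_count = overlap_count * (1.0 + pct_increase / 100.0)
    delta_eta = beta * (math.log1p(new_count) - math.log1p(overlap_count))
    return 100.0 * (1.0 - dyad_time_ratio(delta_eta, shape_k))


# Hypothetical example: 10 common followers, coefficient 0.2, shape k = 1.
for pct in (0, 20, 50, 100):
    saved = pct_time_saved_for_dyad(10, pct, beta=0.2, shape_k=1.0)
    print(f"+{pct}% overlap: {saved:.2f}% time saved")
```

Because the saving depends on delta_eta / k, a smaller shape parameter amplifies the time saved for the same increase in overlap.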

Our results also indicate that the effect of network overlap varies with the popularity of content. To illustrate how this finding can influence the selection of seeders, we choose two sets of seeders having high and low network overlap with their followers, respectively, and then compare the time it takes to activate 1% of all their receivers. To obtain seeders with high and low network overlap, we first select the top 200 senders with the largest number of followers in the Twitter data set. Next, we calculate the average number of common followers each user shares with her followers. Using the median of the average numbers for the 200 users as a threshold, we assign these users to high- and low-network-overlap groups. The 100 users in the high-network-overlap (i.e., common followers) set have far fewer followers than the 100 users in the low-network-overlap set, on average (for details, see Web Appendix D).
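The seeder-selection steps above (rank by follower count, average the common followers with each follower, split at the median) can be sketched as follows. This is an illustrative implementation under stated assumptions: the graph is given as a hypothetical dictionary mapping each user to the set of that user's followers, and ties at the median are assigned to the low-overlap group, a choice the text does not specify.

```python
from statistics import median


def split_seeders_by_overlap(followers, top_n=200):
    """Split the top_n most-followed users into high/low overlap groups.

    followers: dict mapping user -> set of users who follow that user.
    Returns (high, low, avg_overlap), where avg_overlap[u] is the average
    number of common followers u shares with each of u's followers.
    """
    top = sorted(followers, key=lambda u: len(followers[u]), reverse=True)[:top_n]
    avg_overlap = {}
    for u in top:
        own = followers[u]
        if not own:
            avg_overlap[u] = 0.0
            continue
        # Common followers of u and v: users who follow both u and v.
        total = sum(len(own & followers.get(v, set())) for v in own)
        avg_overlap[u] = total / len(own)
    threshold = median(avg_overlap.values())
    high = [u for u in top if avg_overlap[u] > threshold]
    low = [u for u in top if avg_overlap[u] <= threshold]  # ties go to "low"
    return high, low, avg_overlap


# Tiny hypothetical graph.
toy = {
    "a": {"b", "c"}, "b": {"c"}, "c": {"b"},
    "d": {"e", "f"}, "e": set(), "f": set(),
}
high, low, avg = split_seeders_by_overlap(toy, top_n=2)
print(high, low, avg)
```

In the toy graph, both followers of "a" follow each other, so "a" has high overlap with its followers, whereas the followers of "d" are not connected to one another.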

To facilitate the comparison, we compute the ratio of the time taken to activate 1% of the receivers by the high- and

Table 8
PARAMETER ESTIMATES OF DIFFERENT MODEL SPECIFICATIONS

                                      Model 1    Model 2    Model 3    Model 4
Network Overlap
  logCommonFollowees                  .072***    .081***    .087***    .089***
  logCommonFollowers                  .116***    .239***    .113***    .230***
  logCommonMutuals                    .063***    .063***    .224***    .079*
Interactions with Popularity
  logCommonFollowers:logPopularity               −.032***              −.029***
  logCommonMutuals:logPopularity                            −.052***   −.017*
Fit
  Log-likelihood                      −100,246   −100,123   −100,136   −100,113
  BIC                                 200,855    200,620    200,646    200,610

*p < .05.
***p < .001.
Notes: The main effect of logPopularity cannot be identified, because everyone sees the same retweet number at a given time point, the effect of which is canceled out in the likelihood. We chose Model 4 as our main model on the basis of fit. We omit the coefficients on control variables for clarity. For the complete set of parameter estimates, see Web Appendix D.

12 Web Appendix E provides more details regarding the collection of this data set. Web Appendix F shows detailed statistics.

13 For example, Adly (http://adly.com/) is a platform specialized in influencer marketing and lists AT&T, Toyota, and Walmart as its clients.


low-network-overlap seeders at different levels of content popularity. Figure 6 summarizes the results. When content popularity is low, the ratio is less than one, indicating that the high-overlap seeders activate 1% of receivers faster than the low-network-overlap seeders, even though the high-overlap seeders have far fewer followers, on average. However, when the content popularity is high, the low-network-overlap seeders activate 1% of receivers faster.

Our results illustrate that seeders with high network overlap should be selected only when the content is not popular. In practice, the popularity of content often varies across brands. For example, a tweet posted by Starbucks is usually much more popular than one posted by Allstate. Thus, different brands may want to target different sets of seeders. Specifically, it would be more effective for popular brands (e.g., Starbucks) to target users with lower network overlap (with their followers). In contrast, less popular brands (e.g., Allstate) struggling for engagement may want to choose high-network-overlap users as seeders.
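The dependence on popularity follows from the interaction terms: at the dyad level, the net coefficient on an overlap measure is its main effect plus the interaction coefficient times logPopularity. The sketch below computes the popularity level at which that net coefficient changes sign, using the Model 4 estimates reported in Table 8. It assumes, for illustration only, that logPopularity is the natural logarithm of the raw popularity count; the function names are hypothetical. The calculation concerns the sign of the dyad-level coefficient and is not a reproduction of the seeder comparison in Figure 6, which also reflects differences in follower counts.

```python
import math


def net_overlap_effect(main_effect, interaction, popularity):
    """Net coefficient on an overlap measure at a given popularity level."""
    return main_effect + interaction * math.log(popularity)


def break_even_popularity(main_effect, interaction):
    """Popularity at which the net coefficient crosses zero.

    Returns math.inf when the interaction is nonnegative, because the
    net coefficient then never decreases with popularity.
    """
    if interaction >= 0:
        return math.inf
    return math.exp(-main_effect / interaction)


# Model 4 estimates from Table 8 (Twitter data set).
followers_main, followers_inter = 0.230, -0.029
mutuals_main, mutuals_inter = 0.079, -0.017

for label, main, inter in (
    ("common followers", followers_main, followers_inter),
    ("common mutual followers", mutuals_main, mutuals_inter),
):
    threshold = break_even_popularity(main, inter)
    print(f"{label}: net effect turns negative above popularity {threshold:.0f}")
```

Under these assumptions, the coefficient on common mutual followers changes sign at a much lower popularity level than the coefficient on common followers, because its main effect is smaller relative to its interaction term.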

DISCUSSION AND CONCLUSION

Social media platforms have become an increasingly popular medium for firms to reach and engage with customers. Understanding what leads to effective content sharing at the dyadic level lies at the core of cost-effective marketing campaigns on these platforms. While the effects of unitary network attributes on content sharing have been well studied in the literature, studies on the effects of dyadic network attributes are nascent.

In this article, we study the effect of a dyad’s network overlap on content sharing in directed networks. Previous work has focused on the role of unitary characteristics of senders and receivers and dyadic characteristics such as reciprocity in content sharing. We extend previous work by demonstrating how shared network characteristics such as network overlap can also inform the sharing propensity at a dyadic level. Substantively, our results show that the effect of network overlap in directed networks varies across different types of connections. The number of common

Table 10
PARAMETER ESTIMATES ON DIGG DATA SET

                                      Model 1    Model 2    Model 3    Model 4
Network Overlap
  logCommonFollowees                  .192.      .212**     .073       .191**
  logCommonFollowers                  1.125***   2.287***   .154**     1.520***
  logCommonMutuals                    −.079      .034       2.739***   .768***
Interactions with Popularity
  logCommonFollowers:logPopularity               −.517***              −.339***
  logCommonMutuals:logPopularity                            −.589***   −.171***
Fit
  Log-likelihood                      −20,116    −19,753    −19,693    −19,675
  BIC                                 40,549     39,832     39,712     39,683

**p < .01.
***p < .001.

Table 9
PARAMETER ESTIMATES FROM DIFFERENT RANDOM-/FIXED-/MIXED-EFFECTS MODELS

                                      None       rs-rr      fs-fr      rs-rr-rd   fs-fr-rd
Network Overlap
  logCommonFollowees                  .057***    .096***    .090***    .089***    .099***
  logCommonFollowers                  .364***    .218***    .134***    .230***    .128***
  logCommonMutuals                    −.072.     .092*      .042*      .079*      .057**
Interactions with Popularity
  logCommonFollowers:logPopularity    −.034***   −.026***   −.029***   −.029***   −.029***
  logCommonMutuals:logPopularity      .008       −.023**    −.015.     −.017*     −.019*
Fit
  Log-likelihood                      −112,004   −100,386   −99,222    −100,113   −99,198
  BIC                                 224,351    201,145    445,959    200,610    445,920

*p < .05.
**p < .01.
***p < .001.
Notes: In the header row, the first letter represents whether fixed (f) or random (r) effects are used. The second letter indicates the subject (“s” for sender, “r” for receiver, and “d” for dyad) on which the specified effect is applied. For example, “rs” represents a model with random effects on sender, and “fs-fr-rd” represents a model with fixed effects on sender, fixed effects on receiver, and random effects on dyad. “rs-rr-rd” is the main model used in this research. The model “none” does not include random or fixed effects on any subject. To avoid extreme individual effects, we consider only fixed effects for senders/receivers who reoccur more than 20 times.

582 JOURNAL OF MARKETING RESEARCH, AUGUST 2018


followees is positively associated with a receiver’s propensity for sharing. Other network overlap measures, such as the numbers of common followers and common mutual followers, also have positive effects on this propensity. Furthermore, we document the moderating role of content popularity on the effect of network overlap on sharing. We show that the positive effect of network overlap on sharing propensity decreases with the popularity of the shared content. In doing so, we add to the existing literature by highlighting the role of uniqueness in social consumption (Cheema and Kaikati 2010). More specifically, we show how the uniqueness concern is revealed through differential responses to network overlap measures with the increase of content popularity. Finally, we demonstrate the importance of directionality of connections by documenting the difference in the sharing propensity depending on the type of network overlap. Specifically, we show that the sharing propensity is more likely to increase with common followers as compared to common mutual followers. To the best of our knowledge, ours is the first study to document users’ differential responses on social networks based on the directionality of their shared connections or network overlap.

Our article makes a methodological contribution by proposing a new hazard rate modeling approach to determine the contribution of individual senders on influencing a receiver when multiple senders are involved. Previous work has either made strong assumptions about how the contribution should be attributed to different senders (Braun and Moe 2013; Katona, Zubcsek, and Sarvary 2011; Toubia, Goldenberg, and Garcia 2014) or not focused on identifying the effect of shared characteristics (Trusov, Bodapati, and Bucklin 2010). Our approach makes no such assumptions and, consequently, can better tease apart the effects of shared network attributes. This approach can be applied to marketing campaigns when there are multiple sources targeting the same user.

For marketing managers, we provide insights on how to target customers in a directed network at a micro level. Many platforms support micro-level targeting to improve the efficacy of targeting (e.g., display of promoted tweets on Twitter) and prevent information overload for their members (e.g., filtering of feeds on Weibo). Our results show that platforms such as Twitter or Weibo can improve their targeting or filtering by focusing on dyads embedded in different types of connections (i.e., followees, followers, and mutual followers). For example, when deciding whether to show a promoted tweet to a given user,14 Twitter may want to consider how many common

Figure 4
THE EFFECT OF NETWORK OVERLAP IN SPEEDING UP THE SHARING OF CONTENT
[Figure omitted. Horizontal axis: Weibull shape parameter k (.6 to 1.4). Vertical axis: % time saved to activate x% of receivers (40 to 70). Series: 1%, 5%, and 10% of receivers activated.]

Figure 5
THE MARGINAL EFFECT OF NETWORK OVERLAP IN ACTIVATION TIME
[Figure omitted. Horizontal axis: % increase in network overlap (0 to 100). Vertical axis: % time saved to activate 1% of receivers (0 to 50).]

Figure 6
HOW THE RELATIVE EFFECTIVENESS OF HIGH- (VS. LOW-) NETWORK-OVERLAP SEEDERS VARIES WITH THE POPULARITY OF CONTENT
[Figure omitted. Horizontal axis: popularity of content (1 to 200). Vertical axis: ratio of activation time (.4 to 1.2). Series: k = .5, k = 1, and k = 1.5.]

14 Once a tweet is promoted, Twitter can display the tweet to any user on the platform, even if this user does not follow the author of the tweet. However, in practice, to avoid spamming users, Twitter only displays promoted tweets to selected users deemed relevant. Note that an advertiser can promote a tweet authored by a random user.


connections this user shares with the author, as well as the overall popularity of the tweet. Specifically, targeting users who have more common followees with the author can be more effective. Targeting users who have large numbers of common followers and common mutual followers can also be effective when the tweet is not that popular but might be counterproductive when the tweet is already sufficiently popular.

REFERENCES

Aral, Sinan, and Marshall Van Alstyne (2011), “The Diversity-Bandwidth Trade-Off,” American Journal of Sociology, 117 (1), 90–171.

Aral, Sinan, and Dylan Walker (2011), “Creating Social Contagion through Viral Product Design: A Randomized Trial of Peer Influence in Networks,” Management Science, 57 (9), 1623–39.

Aral, Sinan, and Dylan Walker (2014), “Tie Strength, Embeddedness, and Social Influence: A Large-Scale Networked Experiment,” Management Science, 60 (6), 1352–70.

Arndt, Johan (1967), “Role of Product-Related Conversations in the Diffusion of a New Product,” Journal of Marketing Research, 4 (3), 291–95.

Bakshy, Eytan, Itamar Rosenn, Cameron Marlow, and Lada Adamic (2012), “The Role of Social Networks in Information Diffusion,” in Proceedings of the 21st International Conference on World Wide Web. New York: Association for Computing Machinery, 519–28.

Bampo, Mauro, Michael T. Ewing, Dineli R. Mather, David Stewart, and Mark Wallace (2008), “The Effects of the Social Structure of Digital Networks on Viral Marketing Performance,” Information Systems Research, 19 (3), 273–90.

Bapna, Ravi, Alok Gupta, Sarah Rice, and Arun Sundararajan (2017), “Trust and the Strength of Ties in Online Social Networks: An Exploratory Field Experiment,” Management Information Systems Quarterly, 41 (1), 115–30.

Barasch, Alixandra, and Jonah Berger (2014), “Broadcasting and Narrowcasting: How Audience Size Affects What People Share,” Journal of Marketing Research, 51 (3), 286–99.

Baumeister, Roy F., and Mark R. Leary (1995), “The Need to Belong: Desire for Interpersonal Attachments as a Fundamental Human Motivation,” Psychological Bulletin, 117 (3), 497.

Berger, Jonah (2014), “Word of Mouth and Interpersonal Communication: A Review and Directions for Future Research,” Journal of Consumer Psychology, 24 (4), 586–607.

Berger, Jonah, and Katherine L. Milkman (2012), “What Makes Online Content Viral?” Journal of Marketing Research, 49 (2), 192–205.

Bessette, Chanelle (2014), “Fortune 500: The Top Companies Using Social Media,” Fortune (June 2), http://fortune.com/2014/06/02/500-social-media.

Brashears, Matthew E., and Eric Quintane (2015), “The Microstructures of Network Recall: How Social Networks Are Encoded and Represented in Human Memory,” Social Networks, 41, 113–26.

Braun, Michael, and Wendy W. Moe (2013), “Online Display Advertising: Modeling the Effects of Multiple Creatives and Individual Impression Histories,” Marketing Science, 32 (5), 753–67.

Burt, Ronald S. (2001), “Bandwidth and Echo: Trust, Information, and Gossip in Social Networks,” in Networks and Markets: Contributions from Economics and Sociology, James E. Rauch and Alexandra Casella, eds. New York: Russell Sage Foundation, 30–74.

Cheema, Amar, and Andrew M. Kaikati (2010), “The Effect of Need for Uniqueness on Word of Mouth,” Journal of Marketing Research, 47 (3), 553–63.

Cox, David R. (1972), “Regression Models and Life Tables,” Journal of the Royal Statistical Society. Series B. Methodological, 34 (2), 187–220.

Dubois, David, Andrea Bonezzi, and Matteo De Angelis (2016), “Sharing with Friends Versus Strangers: How Interpersonal Closeness Influences Word-of-Mouth Valence,” Journal of Marketing Research, 53 (5), 712–27.

Easley, David, and Jon Kleinberg (2010), Networks, Crowds, and Markets. Cambridge, UK: Cambridge University Press.

Goel, Sharad, Ashton Anderson, Jake Hofman, and Duncan J. Watts (2016), “The Structural Virality of Online Diffusion,” Management Science, 62 (1), 180–96.

Gong, Shiyang, Juanjuan Zhang, Ping Zhao, and Xuping Jiang (2017), “Tweeting as a Marketing Tool: A Field Experiment in the TV Industry,” Journal of Marketing Research, 54 (6), 833–50.

Granovetter, Mark (1985), “Economic Action and Social Structure: The Problem of Embeddedness,” American Journal of Sociology, 91 (3), 481–510.

Greene, William (2004), “The Behaviour of the Maximum Likelihood Estimator of Limited Dependent Variable Models in the Presence of Fixed Effects,” Econometrics Journal, 7 (1), 98–119.

Hinz, Oliver, Bernd Skiera, Christian Barrot, and Jan U. Becker (2011), “Seeding Strategies for Viral Marketing: An Empirical Comparison,” Journal of Marketing, 75 (6), 55–71.

Ho, Jason Y.C., and Melanie Dempsey (2010), “Viral Marketing: Motivations to Forward Online Content,” Journal of Business Research, 63 (9), 1000–06.

Hoff, Peter D. (2005), “Bilinear Mixed-Effects Models for Dyadic Data,” Journal of the American Statistical Association, 100 (469), 286–95.

Hu, Yansong, and Christophe Van den Bulte (2014), “Nonmonotonic Status Effects in New Product Adoption,” Marketing Science, 33 (4), 509–33.

Iyengar, Raghuram, Christophe Van den Bulte, and Thomas W. Valente (2011), “Opinion Leadership and Social Contagion in New Product Diffusion,” Marketing Science, 30 (2), 195–212.

Katona, Zsolt, Peter Pal Zubcsek, and Miklos Sarvary (2011), “Network Effects and Personal Influences: The Diffusion of an Online Social Network,” Journal of Marketing Research, 48 (3), 425–43.

Lambrecht, Anja, Catherine E. Tucker, and Caroline Wiertz (2018), “Advertising to Early Trend Propagators? Evidence from Twitter,” Marketing Science, 37 (2), 177–99.

Langholz, Bryan, and Ørnulf Borgan (1995), “Counter-Matching: A Stratified Nested Case-Control Sampling Method,” Biometrika, 82 (1), 69–79.

Lee, Dokyun, Kartik Hosanagar, and Harikesh Nair (2018), “Advertising Content and Consumer Engagement on Social Media: Evidence from Facebook,” Management Science, published online January 18, DOI: 10.1287/mnsc.2017.2902.

LePage, Evan (2018), “All the Social Media Advertising Stats You Need to Know,” HootSuite, https://blog.hootsuite.com/social-media-advertising-stats/.

Lovett, Mitchell J., Renana Peres, and Ron Shachar (2013), “On Brands and Word of Mouth,” Journal of Marketing Research, 50 (4), 427–44.

Lu, Yingda, Kinshuk Jerath, and Param Vir Singh (2013), “The Emergence of Opinion Leaders in a Networked Online Community: A Dyadic Model with Time Dynamics and a Heuristic for Fast Estimation,” Management Science, 59 (8), 1783–99.

Luo, Zhunchen, Miles Osborne, Jintao Tang, and Ting Wang (2013), “Who Will Retweet Me? Finding Retweeters in Twitter,” in Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: Association for Computing Machinery, 869–72.

McPherson, Miller, Lynn Smith-Lovin, and James M. Cook (2001), “Birds of a Feather: Homophily in Social Networks,” Annual Review of Sociology, 27, 415–44.

McQuarrie, Edward F., Jessica Miller, and Barbara J. Phillips (2013), “The Megaphone Effect: Taste and Audience in Fashion Blogging,” Journal of Consumer Research, 40 (1), 136–58.

Nitzan, Irit, and Barak Libai (2011), “Social Effects on Customer Retention,” Journal of Marketing, 75 (6), 24–38.

Reagans, Ray, and Bill McEvily (2003), “Network Structure and Knowledge Transfer: The Effects of Cohesion and Range,” Administrative Science Quarterly, 48 (2), 240–67.

Schweidel, David A., and Wendy W. Moe (2014), “Listening in on Social Media: A Joint Model of Sentiment and Venue Format Choice,” Journal of Marketing Research, 51 (4), 387–402.

Shi, Zhan, Huaxia Rui, and Andrew B. Whinston (2014), “Content Sharing in a Social Broadcasting Environment: Evidence from Twitter,” Management Information Systems Quarterly, 38 (1), 123–42.

Shriver, Scott K., Harikesh S. Nair, and Reto Hofstetter (2013), “Social Ties and User-Generated Content: Evidence from an Online Social Network,” Management Science, 59 (6), 1425–43.

Snyder, Charles Richard, and Howard L. Fromkin (1980), Uniqueness: The Human Pursuit of Difference. New York: Plenum Press.

Statista (2018), “Leading Social Media Platforms Used by Marketers Worldwide as of January 2018,” (accessed May 22, 2018), https://www.statista.com/statistics/259379/social-media-platforms-used-by-marketers-worldwide/.

Stephen, Andrew T., and Donald R. Lehmann (2016), “How Word-of-Mouth Transmission Encouragement Affects Consumers’ Transmission Decisions, Receiver Selection, and Diffusion Speed,” International Journal of Research in Marketing, 33 (4), 755–66.

Stephen, Andrew T., and Olivier Toubia (2010), “Deriving Value from Social Commerce Networks,” Journal of Marketing Research, 47 (2), 215–28.

Susarla, Anjana, Jeong-Ha Oh, and Yong Tan (2012), “Social Networks and the Diffusion of User-Generated Content: Evidence from YouTube,” Information Systems Research, 23 (1), 23–41.

Toubia, Olivier, Jacob Goldenberg, and Rosanna Garcia (2014), “Improving Penetration Forecasts Using Social Interactions Data,” Management Science, 60 (12), 3049–66.

Toubia, Olivier, and Andrew T. Stephen (2013), “Intrinsic vs. Image-Related Utility in Social Media: Why Do People Contribute Content to Twitter?” Marketing Science, 32 (3), 368–92.

Trusov, Michael, Anand V. Bodapati, and Randolph E. Bucklin (2010), “Determining Influential Users in Internet Social Networks,” Journal of Marketing Research, 47 (4), 643–58.

Trusov, Michael, Randolph E. Bucklin, and Koen Pauwels (2009), “Effects of Word-of-Mouth Versus Traditional Marketing: Findings from an Internet Social Networking Site,” Journal of Marketing, 73 (5), 90–102.

Van den Bulte, Christophe, and Stefan Wuyts (2007), Social Networks in Marketing. Cambridge, MA: Marketing Science Institute.

Yoganarasimhan, Hema (2012), “Impact of Social Network Structure on Content Propagation: A Study Using YouTube Data,” Quantitative Marketing and Economics, 10 (1), 111–50.

Zhang, Yuchi, Wendy W. Moe, and David A. Schweidel (2017), “Modeling the Role of Message Content and Influencers in Social Media Rebroadcasting,” International Journal of Research in Marketing, 34 (1), 100–19.
