Social Networks and Research Output

Download Social Networks and Research Output

Post on 09-Apr-2017




1 download


<ul><li><p>SOCIAL NETWORKS AND RESEARCH OUTPUT</p><p>Lorenzo Ductor, Marcel Fafchamps, Sanjeev Goyal, and Marco J. van der Leij*</p><p>AbstractWe study how knowledge about the social network of an individ-ual researcher, as embodied in his coauthor relations, helps us in developinga more accurate prediction of his or her future productivity. We findthat incorporating information about coauthor networks leads to a mod-est improvement in the accuracy of forecasts on individual output, over andabove what we can predict based on the knowledge of past individual out-put. Second, we find that the informativeness of networks dissipates overthe lifetime of a researchers career. This suggests that the signaling contentof the network is quantitatively more important than the flow of ideas.</p><p>I. Introduction</p><p>GOOD recruitment requires an accurate prediction of acandidates potential future performance. Sports clubs,academic departments, and business firms routinely use pastperformance as a guide to predict the potential of applicantsand to forecast their future performance. In this paper, thefocus is on researchers.</p><p>Social interaction is an important aspect of research activ-ity: researchers discuss and comment on each others work,they assess the work of others for publication and forprizes, and they join together to coauthor publications. Sci-entific collaboration involves the exchange of opinions andideas and facilitates the generation of new ideas. Accessto new and original ideas in turn may help researchers bemore productive. It follows that other things being equal,individuals who are better connected and more central intheir professional network may be more productive in thefuture.</p><p>Network connectedness and centrality arise out of linkscreated by individuals and thus reflect their individual charac-teristics: ability, sociability, and ambition, for example. Sincethe ability of a researcher is imperfectly known, the existenceof such ties may be informative.</p><p>These considerations suggest that someones collaborationnetwork is related to his or her research output in two ways:the network serves as a conduit of ideas and signals individualquality. The first channel suggests a causal relationship from</p><p>Received for publication August 27, 2011. Revision accepted for publi-cation May 3, 2013. Editor: Philippe Aghion.</p><p>* Ductor: Massey University; Fafchamps: University of Oxford and Mans-field College; Goyal: University of Cambridge and Christs College; van derLeij: CeNDEF, University of Amsterdam; De Nederlandsche Bank; and Tin-bergen Institute.</p><p>We thank the editor and two anonymous referees for a number of help-ful comments. We are also grateful to Maria Dolores Collado, MarkusMobius, and conference participants at SAEe (Vigo), Bristol, Stern (NYU),Microsoft Research, Cambridge, MIT, Alicante, Oxford, Tinbergen Insti-tute, Stockholm University, and City University London for useful com-ments. L.D. gratefully acknowledges financial support from the SpanishMinistry of Education (Programa de Formacion del Profesorado Univer-sitario). S.G. thanks the Keynes Fellowship for financial support. M.L.thanks the Spanish Ministry of Science and Innovation (project SEJ2007-62656) and the NWO Complexity program for financial support. The viewsexpressed are our own and do not necessarily reflect official positions of DeNederlandsche Bank.</p><p>A supplemental appendix is available online at</p><p>network to research output, whereas the second does not.Determining causality would clarify the importance of thetwo channels. Unfortunately, as is known in the literature onsocial interactions (Manski, 1993; Moffit, 2001), identifyingnetwork effects in a causal sense is difficult in the absence ofrandomized experiments.</p><p>In this paper, we take an alternative route: we focus onthe predictive power of social networks in terms of futureresearch output. That is, we investigate how much currentand past information on collaboration networks contributesto forecasting future research output. Causality in the sense ofprediction informativeness is known as Granger causality andis commonly analyzed in the macroeconometrics literature;for example, Stock and Watson (1999) investigate the predic-tive power of unemployment rate and other macroeconomicsvariables on forecasting inflation.1</p><p>Finding that network variables Granger-cause future out-put does not constitute conclusive evidence of causal networkeffects in the traditional sense. Nonetheless, it implies thatknowledge of a researchers network can potentially beused by an academic department in making recruitmentdecisions.</p><p>We apply this methodology to evaluate the predictivepower of collaboration networks on future research out-put, measured in terms of future publications in economics.We first ask whether social network measures help predictfuture research output beyond the information containedin individual past performance. We then investigate whichspecific network variables are informative and how theirinformativeness varies over a researchers career.</p><p>Our first set of findings is about the information value ofnetworks. We find that including information about coauthornetworks leads to an improvement in the accuracy of forecastsabout individual output over and above what we can predictbased on past individual output. The effect is significant butmodest; the root mean squared error in predicting future pro-ductivity falls from 0.773 to 0.758 and the R2 increases from0.395 to 0.417. We also observe that several network vari-ables, such as productivity of coauthors, closeness centrality,and the number of coauthors, have predictive power. Of those,the productivity of coauthors is the most informative networkstatistic among those we examine.</p><p>Second, the predictive power of network informationvaries over a researchers career: it is more powerful foryoung researchers but declines systematically with careertime. By contrast, information on recent past output remainsa strong predictor of future output over an authors entirecareer. As a result, fourteen years after the onset of a</p><p>1 A few examples of applications that have determined the appropriatenessof a model based on its ability to predict are Swanson and White (1997),Sullivan, Timmermann, and White (1999), Lettau and Ludvigson (2001),Rapach and Wohar (2002) and Hong and Lee (2003).</p><p>The Review of Economics and Statistics, December 2014, 96(5): 936948 2014 by the President and Fellows of Harvard College and the Massachusetts Institute of Technologydoi:10.1162/REST_a_00430</p></li><li><p>SOCIAL NETWORKS AND RESEARCH OUTPUT 937</p><p>researchers publishing career, networks do not have anypredictive value on future research output over and abovewhat can be predicted using recent and past output alone.</p><p>Our third set of findings is about the relation betweenauthor ability and the predictive value of networks. Wepartition individual authors in terms of past productivityand examine the extent to which network variables pre-dict their future productivity. We find that the predictivevalue of network variables is nonmonotonic with respectto past productivity. Network variables do not predict thefuture productivity of individuals with below-average initialproductivity. They are somewhat informative for individu-als in the highest past-productivity tier group. But they aremost informative about individuals in between. In fact, forthese individuals, networks contain more information abouttheir future productivity than recent research output. Takentogether, these results predict that academic recruiters wouldbenefit from gathering and analyzing information about thecoauthor network of young researchers, especially for thosewho are relatively productive.</p><p>This paper is a contribution to the empirical study ofsocial interactions. Traditionally, economists have studied thequestion of how social interactions affect behavior acrosswell-defined groups, paying special attention to the diffi-culty of empirically identifying social interaction effects.(For an overview of this work, see, e.g., Moffitt (2001) andGlaeser and Scheinkman, 2003.) In recent years, interesthas shifted to the ways by which the architecture of socialnetworks influences behavior and outcomes.2 Recent empir-ical papers on network effects include Bramoull, Djebbariand Fortin (2009), Calv-Armengol, Patacchini, and Zenou(2009), Conley and Udry (2010), and Fafchamps, Goyal, andvan der Leij (2010).</p><p>This paper is also related to a more specialized litera-ture on research productivity. Two recent papers, Azoulay,Zivin, and Wang (2010) and Waldinger (2010), both usethe unanticipated removal of individuals as a natural experi-ment to measure network effects on researchers productivity.Azoulay et al. (2010) study the effects of the unexpected deathof superstar life scientists. Their main finding is that coau-thors of these superstars experience a 5% to 8% decline intheir publication rate. Waldinger (2010) studies the dismissalof Jewish professors from Nazi Germany in 1933 to 1934. Hismain finding is that a fall in the quality of a faculty has sig-nificant and long-lasting effects on the outcomes of researchstudents. Our paper quantifies the predictive power of net-work information over and above the information containedin past output.</p><p>The rest of the paper is organized as follows. Section IIlays out the empirical framework. Section III describes thedata and defines the variables. Section IV presents our find-ings. Section V checks the robustness of our main findings.Section VI concludes.</p><p>2 For a survey of the theoretical work on social networks see Goyal (2007),Jackson (2008), and Vega-Redondo (2007).</p><p>II. Empirical Framework</p><p>It is standard practice in most organizations to look atthe past performance of job candidates as a guide to theirfuture output. This is certainly true for the recruitment andpromotion of researchers, possibly because research outputjournal articles and booksis publicly observable.</p><p>The practice of looking at past performance appears torest on two ideas. The first is that a researchers outputlargely depends on ability and effort. The second is thatindividuals are aware of the relationship between perfor-mance and reward and consequently exert effort consistentwith their career goals and ambition. This potentially cre-ates a stable relationship between ability and ambition, onthe one hand, and individual performance, on the other hand.Given this relationship, it is possible to (imperfectly) predictfuture output on the basis of past output. In this paper, westart by asking how well past performance predicts futureoutput.</p><p>We then ask if future output can be better predicted if weinclude information about an individuals research network.Social interaction among researchers takes a variety of forms,some of it more tangible than others. Our focus is on socialinteraction, reflected in the coauthorship of a published paper,a concrete and quantifiable form of interaction. Coauthorshipof academic articles in economics rarely involves more thanfour authors, so it is likely that coauthorship entails personalinteraction. Moreover, given the length of papers and theduration of the review process in economics, it is reasonableto suppose that collaboration entails communication overan extended period of time. These considerationspersonalinteraction and sustained communicationin turn suggestseveral ways by which someones coauthorship network canreveal valuable information on their future productivity. Wefocus on two: research networks as a conduit of ideas andcoauthorship as a signal about unobserved ability and careerobjectives.</p><p>Consider first the role of research networks as a conduit forideas. Communication in the course of research collaborationinvolves the exchange of ideas, so we expect that a researcherwho is collaborating with highly creative and productive peo-ple has access to more new ideas. This in turn suggests that aresearcher who is close to more productive researchers mayhave early access to new ideas. As early publication is a keyelement in the research process, early access to new ideas canlead to greater productivity. These considerations lead us toexpect that other things being equal, an individual who is inclose proximity to highly productive authors will on averagehave greater future productivity.</p><p>Proximity need not be immediate, however: if A coauthorswith B and B coauthors with C, then ideas may flow from A toC through their common collaborator B. The same argumentcan be extended to larger network neighborhoods. It followsthat authors who are more central in the research networkare expected to have earlier and better access to new researchideas.</p></li><li><p>938 THE REVIEW OF ECONOMICS AND STATISTICS</p><p>As a first step, we look at how the productivity of anindividual, say i, varies with the productivity of his or hercoauthors. We then examine whether is future productivitydepends on the past productivity of the coauthors of his orher coauthors. Finally, we generalize this idea to is central-ity in the network in terms of how close a researcher is to allother researchers (closeness) or how critical a researcher isto connections among other researchers (betweenness)theidea being that centrality gives privileged access to ideas thatcan help a researchers productivity.</p><p>Access to new ideas may open valuable opportunities, butit takes ability and effort to turn a valuable idea into a publica-tion in an academic journal. It is reasonable to suppose that theusefulness of new ideas varies with ability and effort. In par-ticular, a more able researcher is probably better able than aless able researcher to turn the ideas accessed through the net-work into publications. Since ability and industriousness arereflected in past performance, we expect the value of a socialnetwork to vary with past performance. To investigate thispossibility, we partition researchers into different-tier groupsbased on their past performance and examine whether thepredictive power of having productive coauthors and otherrelated network variables varies systematically across tiergroups.</p><p>The second way by which network information may helppredict future output is that the quantity and quality of onescoauthors is correlated with, and thus can serve as a signalfor, an individuals hidden ability and ambition. Given thecommitment of time and effort involved in a research col-laboration, it is reasonable to assume that researchers do notcasually engage in a collaborative research venture. Hencewhen a highly productive researcher forms and maintains acollaboration with another, possibly more junior, researcheri, this link reveals positive attributes of i that could not beinferred from other observable data. Over time, however,evidence on is performance accumulates, and residual uncer-tainty about is ability and industriousness decreases. Wetherefore expect the signal value of network characteristicsto be higher at the beginning of a researchers career...</p></li></ul>