
The Principles of Experimental Design and Their Application in Sociology
Michelle Jackson 1,2 and D.R. Cox 2

1 Institute for Research in the Social Sciences, Stanford University, Stanford, California 94305; email: [email protected]
2 Nuffield College, University of Oxford, Oxford OX1 1NF, United Kingdom; email: [email protected]

Annu. Rev. Sociol. 2013. 39:27–49

First published online as a Review in Advance on April 5, 2013

The Annual Review of Sociology is online at http://soc.annualreviews.org

This article's doi: 10.1146/annurev-soc-071811-145443

Copyright © 2013 by Annual Reviews. All rights reserved

Keywords

principles of design, laboratory experiments, field experiments, survey experiments, visibility of experimental design, history of experiments in the social sciences

Abstract

In light of an increasing interest in experimental work, we provide a review of some of the general issues involved in the design of experiments and illustrate their relevance to sociology and to other areas of social science of interest to sociologists. We provide both an introduction to the principles of experimental design and examples of influential applications of design for different types of social science research. Our aim is twofold: to provide a foundation in the principles of design that may be useful to those planning experiments and to provide a critical overview of the range of applications of experimental design across the social sciences.


INTRODUCTION

The social sciences are largely observational, characterized by the application of nonexperimental methods such as social surveys, interviewing, or direct observation. But many social science questions may also be addressed using experimental methods, and indeed, some may be best approached this way. Particularly appropriate for experimental investigation are questions that aim to establish causal relationships between two or more variables.

Concepts needed for the design of successful experiments have been discussed for many years, mainly in the context of the natural sciences, initially in physics, and later especially in such applied subjects as agriculture, industrial technology, and clinical trials. Over the past 80 years, the discussion of experimental design in contexts where there is appreciable haphazard (that is, irregular and nonpredictable) variation has become a major chapter in statistical theory and application. In this article, we review the relevance of such ideas for the social sciences taken in a broad sense. We draw our illustrations largely from sociology and from sociologically relevant work in experimental economics and political science. Our focus is therefore on those social science disciplines that have been primarily observational since their inception; on the whole, we do not discuss fields that have always been dominated by experimental research, such as certain subdisciplines of psychology that might be classified under the auspices of social science (subdisciplines that include studies of personality, cognition, learning, and particularly experimental psychology).

The organization of the review is as follows: We begin by discussing the early use of experiments in the social sciences before presenting an empirical analysis showing a recent increase in their use. Next, we summarize the main statistical ideas involved in the design of experiments, ideas that are an important foundation for the types of application that we discuss in the remainder of the article (laboratory, field, and survey). We conclude with some general comments. Our aim is twofold: to provide a review of the principles of design that may be useful to those planning experiments and to provide an overview of the range of applications of experimental design in sociology and other social sciences. We review neither the many and diverse findings of social science experiments nor the huge swaths of social science literature that stem from experimental work. Rather, our focus is on the design of experiments as typically understood in the statistical literature, and we review such design in the context of applications within social science.

An initial difficulty in a broad-ranging discussion is that terminology varies between specific application fields as well as between these and the general statistical literature, with disagreement even on the question of what constitutes an experiment. We deal here with investigations in which the effects of a number of alternative conditions or treatments are to be compared. Broadly, the investigation is an experiment if the investigator controls the allocation of treatments to the individuals in the study and the other main features of the work, whereas it is observational if, in particular, the allocation of treatments has already been determined by some process outside the investigator's control and detailed knowledge. The allocation of treatments to individuals is commonly labeled manipulation in the social science context.

SOME HISTORY OF EXPERIMENTAL DESIGN IN THE SOCIAL SCIENCES

Although most social science disciplines developed during a period in which experimental design was well established in the statistical and scientific literature, such designs have a checkered history in the social sciences. We briefly review this history and highlight some of the design aspects of studies that are now celebrated as classics in the field.

As mentioned above, some social sciences have always had essentially experimental components. In social psychology, for example, experimental designs have been employed throughout the history of the discipline.1 Indeed, the first empirical paper published within the field of social psychology included an experimental study of the effect of competition on individual task performance. In an article in the American Journal of Psychology, Triplett (1898) reported the results from a laboratory experiment designed to shed light on the empirical finding that racing cyclists produced faster times in races that were paced or in competition with other riders than in unpaced races, one example of a more general phenomenon later labeled "social facilitation" (Allport 1920, Aiello & Douthitt 2001). In Triplett's experiment, two groups of 20 children were asked to complete a simple physical task—winding fishing line on a reel—as quickly as possible, either alone or in the presence of competition. The experimental design is therefore relatively simple, with manipulation of a single treatment (competition, absent or present) and focus on a single outcome variable (speed of task completion). Each child completed the task six times, with a different ordering of the trials (alone or in competition) for the two groups of children. The article does not describe the process through which individuals were allocated to groups; the systematic use of randomization had not been suggested until much later. After a comparison of task completion times between the two conditions, Triplett (1898, p. 533) concludes, "From the . . . laboratory races we infer that the bodily presence of another contestant participating simultaneously in the race serves to liberate latent energy not ordinarily available."

1 At its inception, social psychology was an interdisciplinary enterprise. During the twentieth century, the subject of social psychology came to be studied in both psychology and sociology departments, with differences in focus related to the umbrella discipline. In this review we focus on social psychology as practiced within sociology.

Although the beginning of any discipline is difficult to define with precision, Strube (2005, p. 281) writes that "[t]he publication of Triplett's experiment in 1898 was a watershed event in the history of social psychology. It marked the formal beginning of the field as an empirical enterprise and launched one of its most enduring and far-reaching research lines" (see Allport 1954 and Brehm et al. 1999 for similar plaudits). Experimental designs have subsequently dominated the field of social psychology, with other notable studies including Asch's (1951, 1956) tests of conformity, Milgram's (1974) tests of obedience to authority, Bandura et al.'s (1961) investigation into aggression as social learning, and Zimbardo's field experiment testing the effect of externally imposed social roles on individual behavior, better known as the Stanford Prison experiment (Haney et al. 1973; see also http://www.prisonexp.org).

By contrast, other areas of social science have been relatively slow to embrace experimental design as a central research method. In the first issue of American Sociological Review, Chapin (1936, pp. 7–8) found it necessary, in his presidential address, "to resolve a confusion in thought that is present in many discussions of the procedures of measurement and of experiment. The opinion that experiment cannot be used in social research is an erroneous judgment." The prevailing view criticized by Chapin, that the social sciences are fundamentally nonexperimental, appears to have also been present in political science, and the president of the American Political Science Association claimed that "[w]e are limited by the impossibility of experiment. Politics is an observational, not an experimental science" (Lowell 1910, p. 7, quoted in Druckman et al. 2006, p. 627). And Alfred Marshall (1890), in Principles of Economics, wrote that "[t]he natural sciences and especially the physical group of them have this great advantage as a discipline over all studies of man's action, that in them the investigator is called on for exact conclusions which can be verified by subsequent observation or experiment" (ch. 4, §5).

Despite such epistemological statements about the essential nature of the social sciences, even the very early histories of economics, political science, sociology, and other social science disciplines are in fact not lacking for examples of highly influential experimental studies. Perhaps the most well-known early sociological experiments were carried out at Western Electric's Hawthorne Works in the 1920s and 1930s and reported in Management and the Worker (Roethlisberger & Dickson 1939; see also Mayo 1933). The aim of the experiments was to evaluate the effect on workers' productivity of a range of stimuli, including the level of light in the factory, the length and timing of breaks, the provision of food, and other environmental factors. The illumination experiments became the most notorious of the studies as a result of a curious finding, later labeled the "Hawthorne effect."

The procedure in the Hawthorne illumination studies was as follows: Over a period of almost three years, the productivity of workers in three departments of the factory was measured through a simple count of the number of units (parts inspected, telephone relays assembled, wire coils wound) produced by each worker on each day. In the first stage of the experiment, all three groups of workers were subject to the experimental conditions, whereas in later stages a small group of workers was assigned to work in a special room, where more extreme manipulations could occur. Manipulation of lighting conditions allowed the researchers to test the hypothesis that better lighting conditions would lead to increased productivity. Dickson & Roethlisberger (1966, p. 20) later summarized the results of the experiments:

[I]t was difficult to conclude just what the effect of illumination on productivity was. Some of the results of these experiments seemed to say that it was positive, some that it was zero, and some that it was screwy. For example, sometimes an improvement in illumination was accompanied by an improvement in output and then again, sometimes a positive increase in one did not produce an appreciable increase in the other. And then—this is the screwy part—quite often when the illumination intensity was decreased, output did not go down but remained the same and in some cases increased.

The finding that productivity increased even when the level of illumination was extremely poor (indeed, at times the illumination was so low as to be labeled moonlight) led the researchers to suspect that another factor was at work. From further experiments and observations, they argued that a psychological mechanism was operating to increase productivity: The workers were apparently responding positively to the increased attention to their productivity by researchers, and not to the changes in illumination. Note that Dickson & Roethlisberger's summary of the findings of the Hawthorne studies is rather more cautious than some of the later summaries and interpretations published by other social scientists [see Jones (1992, p. 453, footnote 2) for a discussion of this point].

The Hawthorne studies showed that interpersonal processes in the workplace could affect employee productivity. But the methodological contributions of the research were perhaps even more important, contributions that came about primarily because the initial experimental designs were defective and lacked sufficient control and because observer effects that should have been avoided through careful design were instead entangled with the effects of interest. The authors were commendably open about the challenges posed both by the environment and by experimental findings that were unexpected and unpredicted. It is not often appreciated that the Hawthorne studies encompassed several quite distinct designs, with substantial changes to design and procedure occurring as a result of unexpected findings from early experiments when the researchers discovered that a number of uncontrolled variables could have influenced the results (Roethlisberger & Dickson 1939, p. 19). One consequence of these changes to the experimental design was that there was insufficient independent replication across the experiments, and in retrospect a different design would have been a better starting point (a factorial design would have been most appropriate). But the recognition that the researcher may serve as a confounding factor when experiments are conducted on human subjects was profoundly influential (Levitt & List 2011; see also Adair et al. 1989 for a meta-analysis of studies testing for the presence of a Hawthorne effect in educational research). Indeed, the studies were so influential that the findings have been subjected to scrutiny since the 1930s, and reanalyses of the Hawthorne data are still being published (e.g., Franke & Kaul 1978, Gillespie 1991, Jones 1992, Levitt & List 2011). Many of these reanalyses find that the original claims for the existence of a Hawthorne effect are not supported or are only weakly supported by the experimental data. Of course, the methods of statistical analysis available to those undertaking reanalyses are better than those available to the Hawthorne researchers.

The Hawthorne studies thus have a special place in the history of experimentation in the social sciences, and no other sociological experiment has achieved the same level of renown. We do not consider additional historical examples of experimental social science studies here, but for discussion of the history of experiments in sociology, see Oakley (1998); for a history of experiments in political science, see Green & Gerber (2003); and for economics, see Roth (1993, 1995).

THE INCREASED USE OF EXPERIMENTAL DESIGN IN THE SOCIAL SCIENCES

Use of experiments in the social sciences has increased appreciably in recent years. Social science disciplines that have traditionally eschewed experimental designs have witnessed an increase in experimental research, and universities and research institutions have made infrastructural investments to support researchers collecting data using experiments.

One way to gauge the growth in experimental designs is to examine the publication frequency of research employing them. As an example, we measured the visibility of experimental design within economics, political science, and sociology through a simple content analysis of articles published in the top-ranked journals for each discipline between 1990 and 2010. We calculated the proportion of research articles that report results from experimental designs in the top-ranked journals for economics (American Economic Review, Econometrica), political science (American Journal of Political Science, American Political Science Review), and sociology (American Journal of Sociology, American Sociological Review). The top-ranked journals are defined as those with the largest number of total citations from 2008–2009, as reported in Thomson Reuters's Web of Knowledge Journal Citation Report®. We include only full research articles; research notes, replies, corrections, and similar pieces are excluded, leaving a total sample of 5,874 articles. An article was coded as employing experimental design if empirical results were reported that originated from the manipulation of human subjects. Articles that described a formal model of behavior with a hypothetical experimental test were not coded as experimental.

Figure 1 shows a three-year moving average of the proportion of all articles published in each of the top-ranked journals that employed experimental design (original journal by year coding is available from the authors on request). The figure demonstrates a clear growth over time in the number of experimental research articles. Experimental designs were present in top-ranked journals at the beginning of the period, but they represented only about 3% of the total number of research articles in the six journals. A rapid growth in the visibility of experimental designs occurred in the early 2000s, and by 2010 almost 8.5% of the total number of research articles in the top-ranked journals were experimental.
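To make the smoothing concrete, the following minimal Python sketch shows one way a three-year moving average of the proportion of experimental articles could be computed from yearly counts. It is illustrative only: the counts are invented placeholders, not the data behind Figure 1, and a centered window is one possible convention for the moving average.

```python
# Illustrative sketch: three-year moving average of the proportion of
# experimental research articles per journal. Counts below are invented
# placeholders, not the data underlying Figure 1.

yearly_counts = {
    # journal -> {year: (experimental articles, total research articles)}
    "AJS": {1990: (1, 40), 1991: (2, 42), 1992: (1, 38), 1993: (3, 41)},
}

def moving_average_proportion(counts, window=3):
    """Return {year: smoothed proportion}, averaging yearly proportions
    over a centered window (here, three years; edges use shorter windows)."""
    years = sorted(counts)
    props = {y: counts[y][0] / counts[y][1] for y in years}
    half = window // 2
    smoothed = {}
    for i, y in enumerate(years):
        neighbours = years[max(0, i - half): i + half + 1]
        smoothed[y] = sum(props[n] for n in neighbours) / len(neighbours)
    return smoothed

print(moving_average_proportion(yearly_counts["AJS"]))
```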

[Figure 1: The proportion of research articles employing experimental design in top-ranked journals in economics, political science, and sociology (three-year moving average). Abbreviations: AER, American Economic Review; ECON, Econometrica; AJPS, American Journal of Political Science; APSR, American Political Science Review; AJS, American Journal of Sociology; ASR, American Sociological Review.]

There are clear disciplinary differences. The growth of experimental designs was most pronounced in economics, and particularly in the American Economic Review, where by the end of the period 18% of research articles were experimental. Political science also witnessed a growth, albeit tempered in comparison with economics.2 Indeed, the notable exception to the general pattern is sociology, where the visibility of experimental design fluctuates in a rather trendless fashion over the period considered.

2 Druckman et al. (2006) carried out a content analysis of articles published in the American Political Science Review over a much longer period than that considered here (between 1906 and 2004) and found a sharp growth toward the end of the period in the number of published articles reporting the results of experiments.

An interesting qualitative feature of the content analysis was the extent to which nonexperimental papers increasingly referenced the language of experimental design. In particular, the term "natural experiment" appears to have gained in popularity in describing studies that would have been described as standard observational studies in the past (and were therefore coded as nonexperimental in our content analysis). For example, the implementation of a new social policy might today be described as a natural experiment, with the researcher drawing inferences about causal processes when comparing the pre– and post–policy change circumstances. The term "natural experiment" probably derives from the guidelines of Bradford Hill (1965), who specified conditions in which observational studies in epidemiology might reasonably point toward a causal interpretation; note that they were guidelines, and not criteria. A natural experiment in this context meant a large nonplanned intervention whose impact was so large that it could be separated from possible confounding effects. An extreme example is the use in radiation epidemiology of data from the survivors of the Hiroshima bomb. Such a version of the term "natural experiment" is too strong for many of the social science contexts in which it is applied.

The content analysis thus demonstrates a substantial increase in the visibility of experimental designs in the social sciences over the past 20 years. Other sources further demonstrate that studies employing experimental designs are more frequently cited than other research articles, which suggests that the increased visibility of experimental studies in the top-ranked journals will be more than matched by the impact of these studies within the social scientific literature (Druckman et al. 2006, p. 632).

PRINCIPLES OF EXPERIMENTAL DESIGN

Sociologists who are planning experiments have available to them all manner of textbooks and practical guides (e.g., Campbell & Stanley 1963, Brown & Melamed 1990), alongside existing experimental research that provides convenient templates for new studies. But when considering the details of the design of any particular experiment, it is helpful to return to the small set of basic principles that lie at the heart of strong experimental design. Detailed discussion of general principles of experimental design in the presence of substantial uncontrolled variation largely dates from Fisher (1926), who developed his ideas in a book (Fisher 1935) still in print. He in effect suggested the four main principles that are the touchstones of experimental design to this day.

First, it is important to eliminate possiblesystematic errors. This may be achieved byensuring that major predictable sources ofunwanted variability affect all treatmentsequally and, importantly, by random allocationof treatments to units, ensuring that each unitis equally likely to receive each treatment.In this context, randomization always meansuse of an impersonal device with knownprobabilistic properties. Randomization is inprinciple required at every stage of a studyat which appreciable uncontrolled variabilitymight enter and is often importantly combinedwith concealment to deal with various kinds ofsubjective preference, which in some kinds ofstudies may easily distort conclusions.
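As a concrete illustration of randomization in this sense, the sketch below (a hypothetical example, not taken from the article) allocates treatments to units with an impersonal probabilistic device so that each unit is equally likely to receive each treatment.

```python
import random

def randomize(units, treatments, seed=None):
    """Completely randomized allocation: shuffle a balanced list of
    treatment labels and pair it with the experimental units, so that
    assignment does not depend on any characteristic of the units."""
    rng = random.Random(seed)
    if len(units) % len(treatments) != 0:
        raise ValueError("units must divide evenly among treatments")
    labels = treatments * (len(units) // len(treatments))
    rng.shuffle(labels)
    return dict(zip(units, labels))

# Example: 8 hypothetical subjects, two conditions (control vs intervention).
print(randomize([f"subject_{i}" for i in range(8)],
                ["control", "intervention"], seed=1))
```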

We require conclusions of reasonable precision: that is, conclusions that are based on sufficient data to stand out above random error. Precision is to be distinguished from accuracy, which also requires freedom from systematic error. The second principle is therefore to enhance precision by comparing like with like. That is, the experimental units are grouped into sets or blocks expected to respond similarly, arranged such that each treatment occurs the same number of times in each block. The broad idea is that control through comparison against some null situation is essential to establish that a certain intervention produces a particular effect. Further, if an intervention produces a particular type of effect, we need to compare the consequences of various changes to the intervention in order to probe the nature of that effect.
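A minimal sketch of "comparing like with like" under these assumptions: units are first grouped into blocks expected to respond similarly, and every treatment is then randomized once within each block (a randomized complete block design; the block, unit, and treatment names are illustrative only).

```python
import random

def randomized_block_design(blocks, treatments, seed=None):
    """Within each block, assign each treatment exactly once, in random
    order, so treatment comparisons are made among similar units."""
    rng = random.Random(seed)
    design = {}
    for block_name, units in blocks.items():
        assert len(units) == len(treatments), "one unit per treatment per block"
        order = treatments[:]
        rng.shuffle(order)
        design[block_name] = dict(zip(units, order))
    return design

# Example: blocks defined by a baseline characteristic such as workplace.
blocks = {"site_A": ["a1", "a2"], "site_B": ["b1", "b2"]}
print(randomized_block_design(blocks, ["control", "intervention"], seed=2))
```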

Third, replication is typically important both to verify directly the repeatability of conclusions and also to enhance precision and to allow empirical estimation of the reliability of the estimated treatment effects. Replication, provided it is genuinely independent, gives a guide to the reproducibility of conclusions and ensures that confidence limits and other aspects of statistical analysis deliver a reasonable assessment of the reliability of conclusions.

The fourth notion is that of a factorial experiment. In a factorial experiment, several questions are addressed within one experiment rather than by a one-at-a-time approach, and thus the factorial route may be much more efficient in use of resources. Each question is represented in the design as a factor to be varied, and every combination of factor levels is assigned to the experimental units. One example is two different kinds of change in policy, each implemented for either a short, medium, or long period and each implemented in either a weak or strong form. This yields 2 × 3 × 2 = 12 possibilities, and a factorial experiment would involve all these, each typically implemented on a number of distinct experimental units (see Ledolter & Swersey 2007 on factorial designs). Factorial designs allow for the estimation of statistical interactions that may be crucial for interpretation. In an extension of the idea, some of the factors may refer to baseline properties of the study individuals: age, gender, and ethnicity, for example. An absence of important interactions of treatment effects with these baseline properties may be an important source of support for generalizability. Factorial experiments are frequently used in the social sciences, particularly in the context of survey experiments.
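To make the 2 × 3 × 2 example concrete, the sketch below enumerates all twelve treatment combinations of such a full factorial design. The factor and level names follow the hypothetical policy example in the text and are illustrative only.

```python
from itertools import product

# The three factors from the hypothetical policy example: 2 x 3 x 2 = 12 cells.
factors = {
    "policy_change": ["type_1", "type_2"],
    "duration": ["short", "medium", "long"],
    "strength": ["weak", "strong"],
}

# Every combination of factor levels defines one treatment in the factorial design.
treatment_combinations = [
    dict(zip(factors, levels)) for levels in product(*factors.values())
]

print(len(treatment_combinations))  # 12
for cell in treatment_combinations:
    print(cell)
```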

The Design of Experiments

There are several stages in the design of an experiment. Keeping the research objectives in mind, the researcher begins by clearly outlining the alternative treatments under investigation, treatments that often consist of all or most combinations of a number of factors (that is, variables). The researcher chooses the experimental units (or subjects) and records appropriate baseline features of the units to allow for comparison of pre- and post-treatment conditions. Subsequently, the researcher allocates one treatment to each experimental unit. In a narrow sense, the procedure for assigning treatments to experimental units is in fact what constitutes the specific choice of experimental design. The outcome(s) of interest must be identified, and procedures for measuring the outcome(s) must also be determined. As the design is implemented in the experimental setting—laboratory, field, or survey—careful monitoring will ensure both that the design is accurately implemented and that it can be adjusted where necessary.

In planning the design, the researcher aims to keep four key requirements in mind. First, the design should be capable of delivering conclusions that are free of systematic error and more broadly of ambiguities of interpretation. The importance of this requirement was highlighted in the Hawthorne experiments, where the findings of the experiments could not be unambiguously interpreted with respect to the research questions of interest (an aspect that, somewhat ironically, secured the study's place as a sociological classic). Second, the statistical errors of estimation connected with the design should be reduced to a reasonable level, and it should be possible to estimate the magnitude of such errors, for example by calculating confidence intervals for the size of any comparisons estimated. Ideally, adequate precision will be achieved as economically as is feasible, and the confidence intervals will not be too wide. Third, it should be relatively simple to implement the design reliably. Overly complex designs are to be avoided. And fourth, especially for investigations intended as a basis for practical application on a broad front, the conclusions should have a broad range of validity. In major investigations, especially those of a novel kind, small pilot investigations are often carried out to allow the design to be tested and refined before the final implementation and data collection. The extent to which the four requirements can be met is likely to differ depending on whether designs are applied in the laboratory, field, or survey context, as we discuss below. For example, it may be more difficult to satisfy the third requirement—that it should be relatively simple to implement the design reliably—in a field experiment than in a laboratory study.
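As an illustration of the second requirement, the sketch below computes an approximate 95% confidence interval for the difference in mean outcomes between two randomized groups. It uses a standard large-sample normal approximation rather than any procedure prescribed by the article, and the outcome values are invented for illustration.

```python
from statistics import mean, variance
from math import sqrt

def difference_ci(treated, control, z=1.96):
    """Approximate 95% confidence interval for the difference in means
    between two independently randomized groups (large-sample normal
    approximation; illustrative only)."""
    diff = mean(treated) - mean(control)
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return diff - z * se, diff + z * se

# Invented outcome measurements for a treated and a control group.
treated = [12.1, 14.3, 13.8, 15.0, 12.9, 14.6]
control = [11.2, 12.0, 11.7, 12.5, 11.9, 12.3]
print(difference_ci(treated, control))
```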

Choice of Experimental Units and External Validity

An experimental unit is the physical entity that is subject to treatments and upon which outcomes will be recorded; it corresponds to the smallest subdivision of the system under investigation such that any two units might receive different treatments. Sometimes it may be desirable or even essential to use the same physical entity several times as distinct experimental units, commonly known as a within-subject design in the social sciences. Thus, in a laboratory experiment, an experimental unit will often be a subject-session combination, each subject taking part in several sessions. This arrangement may be efficient in the sense that treatment contrasts can be studied free of simple intersubject systematic differences, and it is particularly appealing if the number of subjects available is limited. The classic social psychology experiment designed by Triplett (1898) included a within-subject component, with each subject completing the same task six times under two different treatments (competition, absent or present); an important feature of the analysis was a within-subject comparison of performance under the different treatments. A potential difficulty with within-subject designs is that the outcome observed on a particular unit may depend not only on the treatment experienced in that session but possibly on the treatments that the subject received in earlier sessions. Suitable spacing of sessions may alleviate the problem, and if the so-called carryover or residual effect is confined to one session, appropriate design and analysis will correct for that. The possibility of more complex dependencies on the previous experience of the subject would be a major concern. As we discuss below, a standard protocol for experimental economics laboratories is that deception is forbidden, a protocol that has arisen, in part, to ensure that the previous experience of an experimental subject does not contaminate subsequent studies.

The choice of experimental units is consequential for what social scientists refer to as external validity, known also as range of validity and, most recently, as transportability (Pearl & Bareinboim 2011). External validity concerns the extent to which conclusions of a study can be applied more generally, and it depends both on the generalizability of the sample of experimental units and on the generalizability of the context within which the experiment takes place.

For studies bearing directly on general social science theory or public policy, range of validity regarding experimental units potentially depends on three aspects. The first is the possibility of support from theory or external information, which provides the basis for a claim that the experimental findings are likely to be constant across individuals and contexts, and therefore generalizable. A well-constructed sociological theory will specify the scope conditions under which the theory will hold and to which the findings will be generalizable (on scope conditions see especially Walker & Cohen 1985, Cohen 1988). The second and most immediately important aspect in the design is the use of key characteristics of the study individuals with the hope of showing stability of the primary effects with respect to these characteristics (see, for example, Cox 1958, pp. 9–11). Thus, depending on context, verifying that the conclusions applied regardless of gender, age, and ethnicity would give some basis for extrapolation. The third is that the individuals studied are, if not a respectable sample in a formal sense, at least broadly representative of the target population. The emphasis on each of these three aspects differs between laboratory, field, and survey experiments, which we discuss in more detail below.

Range of validity regarding the experimental context depends only on whether the context of the experiment—e.g., laboratory, field, survey—provides an appropriate basis for generalizability. Highly artificial environments may be a threat to external validity even when the experimental units are perfectly representative of the population of interest.

Importantly, for the detailed investigation of a new phenomenon, external validity may be of relatively little concern. For example, to study in depth the spread of a new infection or the causes of social unrest, it would be appropriate to concentrate on areas where the infection or unrest was common, regardless of general representativeness.

Choice of Treatments

In some ways, the most crucial aspect of experimental design is the specification of treatments, the comparison of which will yield interesting conclusions free of ambiguities of interpretation. The simplest treatment structure has just two treatments, often regarded as a control or null treatment and a positive treatment or intervention. It is important that both are clearly specified, and in such a way that any difference in outcome found has a clear underlying interpretation. It is possible to have a number of qualitatively different treatments with or without a control. As described above, a common approach is to use a factorial design in which treatments are identified by several clearly defined features, each of which could define a new treatment on its own. Typically, the various treatments will be equally replicated, although exceptions may be made when some treatments are more expensive or time-consuming to apply, or when there is a control and several other treatments and the main emphasis is on comparing the other treatments individually with the control, in which case extra replication of the control is desirable.

CONTEMPORARY EXPERIMENTAL DESIGN IN THE SOCIAL SCIENCES

In this section, we consider different classes of experiments commonly applied in the social sciences and discuss the design issues associated with each. The classes of experiments that we focus on are laboratory experiments, field experiments, and survey experiments. To some extent, this division of experimental designs into different classes is arbitrary, but as the categories are well understood by social scientists, this categorization is appropriate here. For each class of experiment, we provide examples of the type of work carried out using these designs in sociology and other social sciences, as well as an assessment of the strengths and weaknesses of the class of experiments in relation to the principles of design.

Laboratory Experiments

Laboratory experiments are generally understood to come closest to the types of experiment common in some parts of the natural sciences, in that they take place in a physical location determined by the researcher, and the researcher has a high degree of control over treatments and other experimental conditions. Laboratory experiments in sociology are typically described as empirically driven or theory driven, alternatively labeled descriptive or formal (e.g., Martin & Sell 1979, Thye 2007, Walker & Willer 2007). Empirically driven experiments aim to establish empirical regularities, so that general conclusions about real-world phenomena can be drawn. In contrast, theory-driven experiments aim to isolate causal processes and test theory. Where well-formulated social scientific theories exist, theory-driven experiments allow researchers to create the conditions under which causal relationships can be identified. The use of the word theory in these contexts is different from the usage in the natural sciences, however, where the word hypothesis would be used for a proposal suggested provisionally as a basis for interpretation.

Laboratory experimentation has historically had a relatively small role in social science data collection, albeit an important one.3 The social psychology field in sociology has a long-standing experimental tradition, with significant areas of research based around testing central theories, particularly expectation states theory and exchange theory (for summaries of the former theory, see Berger et al. 1977, Berger & Wagner 1993, Berger & Webster 2006; for summaries of the latter theory, see Cook et al. 1990, Molm & Cook 1995, Molm 2007). On the former, experimental tests of expectation states theory manipulate status characteristics within and between recognized social groups (e.g., men and women, young and old) in the laboratory to examine whether social differences form the basis of status distinctions that subsequently influence behavior (e.g., Ridgeway 1991, 2001; Ridgeway et al. 1998, 2009; Ridgeway & Erickson 2000; Ridgeway & Correll 2006; Berger 2007). On the latter, social psychologists working in the exchange theory tradition argue that human interactions must be understood in the context of exchange relationships, in which individuals offer material and nonmaterial goods in the expectation that they will receive similar goods in exchange (e.g., Blau 1955, 1964; Homans 1958; Cook 1987). Laboratory experiments are designed to manipulate both the social networks within which exchanges will be undertaken and the form of social exchange (Cook & Emerson 1978; Cook et al. 1983; Molm et al. 2000, 2001; Cook & Cooper 2003). A useful summary of the role of laboratory experimentation in testing expectation states theory and exchange theory can be found in Bonacich & Light (1978).

3 We do not consider laboratory experimentation in political science here, but for an introduction, see McDermott 2007, Palfrey 2009, Morton & Williams 2010, Iyengar 2011.

One type of laboratory experiment that has come to prominence in recent years, and with influence across the social sciences, is that carried out under the auspices of experimental economics. On the whole, experimental economists are engaged in explicitly theory-driven experiments that speak to the central tenets of economic theory. Levitt & List (2007, p. 153) write that "[t]he allure of the laboratory experimental method in economics is that, in principle, it provides ceteris paribus observations of individual economic agents, which are otherwise difficult to obtain." Special areas of interest in experimental economics include auction theory, choices made under risk and uncertainty, the limits to rationality, public goods, social preferences, trust, and asset markets. See Friedman & Sunder (1994, appendix 1) for readings in experimental economics, split by subject area, and Kagel & Roth (1995) for reviews of experimental findings, again by subject area. Levitt & List (2007) also give a thoughtful summary of economics experiments on social preferences.

For economics, a discipline with some strongly embedded theory and assumptions about human behavior, the theory-driven experiment has much appeal, as the degree of control afforded by the laboratory allows for situations that mimic those specified by economic theory, ensuring internal validity. Many economics experiments (or games) are of relatively simple design. For example, in an ultimatum game, there are two paired participants. One participant (the proposer) is provided with a sum of money and asked to propose a division of the total sum between herself and the other participant (the responder). The responder is asked whether he is prepared to accept the proposed division; if he accepts, the sum is divided as proposed, but if he rejects the proposal, neither player receives any money (Eckel 2007, p. 507). An important feature of this game, and indeed of many other games carried out under the auspices of experimental economics, is that the study is not, in fact, experimental in the sense used in the present paper, because the comparative element that is crucial to the concept of an experiment is often missing. Rather, investigations of this type are careful observational studies of the behavior of subjects in a tightly defined context mimicking aspects of the real economic world. In a corresponding experimental setting, the context would be specified by a limited number of component elements capable of separate variation, and the emphasis would be on estimating the impact and relative importance of these components, with the aim of understanding how the context generates its effect. Use of the same subject for two or more different settings would often be precluded by the possibility of complex dependencies of later responses on the earlier setting, although such carryover effects may sometimes be reduced by deceiving the subjects as to the objective of the investigation. However, such deceptions are usually forbidden by the protocols of the experimental economics laboratory.
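For readers unfamiliar with the protocol, the following is a minimal sketch of the ultimatum game's payoff rule as described above. It is an illustration only: the endowment, offer, and acceptance decision are hard-coded placeholders, not a model of subject behavior or of any actual laboratory implementation.

```python
def ultimatum_payoffs(endowment, offer_to_responder, responder_accepts):
    """Payoff rule of the ultimatum game: if the responder accepts, the
    proposed division stands; if the responder rejects, both receive nothing."""
    if not 0 <= offer_to_responder <= endowment:
        raise ValueError("offer must be between 0 and the endowment")
    if responder_accepts:
        return endowment - offer_to_responder, offer_to_responder
    return 0.0, 0.0

# Example round: a 10-unit endowment, a 3-unit offer, and each possible response.
print(ultimatum_payoffs(10.0, 3.0, True))   # (7.0, 3.0)
print(ultimatum_payoffs(10.0, 3.0, False))  # (0.0, 0.0)
```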

The internal validity afforded by laboratory experiments is seen to be the greatest strength of the method, enabling precise and clear tests of social scientific theories. But a standard objection to the laboratory experiment, familiar to anyone who has ever opened a social science methods textbook, is that external validity is lacking because the environment in the laboratory is artificial, and experimental results are therefore not generalizable. As external validity is so contentious a topic within discussions of laboratory experiments, we here discuss the issues at stake in some detail. The external validity issues are in fact seen to be rather different for empirically driven and theory-driven experiments (Thye 2007).

In an empirically driven experiment, which aims to establish empirical regularities, generalizability is seen to be enhanced through the application of statistical principles: In an experiment conducted on a representative sample of individuals, we would observe differences between control and treatment groups and ask whether these differences were likely to have been observed by chance (Walker & Willer 2007). Thus, results from empirically driven experiments are purportedly generalizable in just the same way as nonexperimental results (Thye 2007). But there are two potential threats to external validity and generalizability that we see as important for empirically driven laboratory experiments. First, if empirical regularities are to be established and immediately generalized to the population, clearly the sample of experimental units within the experiment should be broadly representative of that population to which the researcher wishes to generalize (Martin & Sell 1979). The second threat to external validity is one of context and comes from the artificiality of the laboratory environment. If the aim is to generalize from a laboratory experiment to claim an empirical regularity, there must be some confidence that what happens in the environment of the laboratory is a reasonable representation of what happens in the real world. There is evidence that some real-world contexts can be reasonably approximated in a laboratory setting (e.g., Correll et al. 2007, p. 1333).

In theory-driven experiments, the artificial environment of the laboratory is embraced in order to fully isolate causal processes of interest and thus to test elements of scientific theories. The aim of such experiments is not to generalize to a wider population, but rather to hone and test a scientific theory that, if validated, could later be applied to a real-world context to explain human behavior (Webster & Kervin 1971, Zelditch 2007). As Martin & Sell summarize,

Experimental tests are artificial so their results cannot be directly generalized to natural phenomena as they operate in the real world. Instead, the results are pertinent only to the law-like statements that the experiment was designed to test. . . . The relation of the results to the real world is indirect in that the results of an experimental test allow only for the assessment of the support or nonsupport for the law-like statements. If the law-like statements are supported, then the statements may be used to explain and predict phenomena of the real world. . . . In other words, the specific experimental results are not generalized to the real world; however, when empirically supported, the law-like statements are. (Martin & Sell 1979, p. 587; see Webster & Kervin 1971, p. 269, for a similar argument)

From this perspective, the two threats to external validity described in relation to empirically driven laboratory experiments—the representativeness of the sample of individuals and the artificiality of the laboratory environment—are thus not applicable to theory-driven experiments.

The literature on laboratory experiments suggests that there is a broad level of agreement that the external validity concerns differ between empirically driven and theory-driven experiments. But it is not at all clear that, first, theory-driven experiments are immune from the requirements of external validity by virtue of their special role in testing theory or, second, that all social scientists who conduct theory-driven experiments explicitly limit themselves to making nongeneralizable, theoretically based claims. The first of these criticisms centers on the principle that a theory-driven laboratory experiment may indeed be able to demonstrate that x causes y but that it demonstrates this causal relationship only under condition l, the laboratory condition, and for sample s. To claim that a theory-driven experiment can be generalized through the law-like statements supported by the results is simply to claim that condition l and sample s are not relevant to the applicability of the law-like statements (rather than to establish this in reference to external theory and data). In the natural sciences, such assumptions may be somewhat less controversial, but in the social sciences we are bound to ask whether, say, the experimental context of the economics laboratory can simply be ignored because the results will speak to theory, particularly when the step from theory to the application of theory in explaining empirical phenomena is sometimes rather small. Only when we can find support for the assumptions that l and s do not influence our findings is external validity assured (e.g., Zelditch 2007). Our second criticism accepts, for the sake of argument, that theory-driven experiments are less susceptible to concerns about external validity, but it questions whether theory-driven experimenters are always clear that the empirical results are not generalizable in any simple way. In practice, the distinction between empirically driven and theory-driven experiments is often blurred, and the research results of experiments that are described as theory-driven are generalized to a wider population without a full discussion of the role of experimental units and context.

A final topic worthy of discussion is how far the practical arrangements of laboratories help or hinder external validity. In many institutions, the laboratory condition and perhaps even the experimental subjects will be determined by someone other than the researcher. Laboratory protocols are particularly stringent in experimental economics, where to a large degree there is consensus on how laboratories should be set up and how experiments should be conducted. One rule present in almost all economics laboratories states that participants must not be deceived, even though deception has long been a rather standard feature of laboratory experiments in other social science disciplines (and particularly in social psychology). Friedman & Sunder (1994) argue that "experimental economists require complete credibility because salience and dominance are lost if subjects doubt the announced relation between actions and rewards, or if subjects hedge against possible tricks" (p. 17; see also Eckel 2007). More practically, many laboratories outlaw deception in order that a finite subject pool is not contaminated for future research by deception in previous experiments (e.g., Holt 2007). But the strategies used to avoid deception in the laboratory may be so artificial, or so different from strategies that would be employed were deception to be permitted, that the experimental results are no longer adequate to speak to the validity of a theory. As Cook & Yamagishi (2008) state, a consequence of the nondeception protocol is that some nonrational elements of human behavior are "ruled out by design" (p. 216; see also Sell 2008). Other laboratory (and journal) protocols put in place to protect subject pools and uphold standards (including an understanding that participants should always be paid in hard cash, as cash rewards are seen to most closely mimic the conditions of a market) also have the potential to alter the context of the experiment in theoretically relevant ways (e.g., Guala 2005, ch. 11). To some extent, the laboratory protocols within experimental economics should be of concern only to economists, but in institutions where resources are pooled, common protocols may be imposed for disciplines across the social sciences, resulting in harmful consequences for sociological experimental design.

Field Experiments

The distinction between laboratory and field experiments is seemingly well understood by social scientists, who, if asked to describe the difference between the two, would perhaps state something in line with the following: Field experiments are distinguished from laboratory experiments by virtue of taking place in a real-world context. But it is, in fact, extremely difficult to pin down the distinction between laboratory and field experiments. What is a real-world context? Are the Hawthorne experiments examples of field experiments, because they were undertaken in a real-world factory? Or does the fact that a group of workers was separated from other factory workers to be observed in a special room mean that the studies should be characterized as laboratory experiments?

One important difference between field and laboratory experiments lies in the degree of control that the investigator holds. Humphreys & Weinstein (2009, p. 369) argue that in a field experiment, “Researchers typically maintain control over the assignment to treatment while forgoing a measure of control over the treatment itself. Features such as the characteristics of subjects, the information available to them, and the precise manner and context in which the treatment is applied are more likely to
take on values given by ‘nature’ rather than being set at the discretion of the investigator.” Although many researchers would provide similar definitions (e.g., de Rooij et al. 2009), there is by no means complete agreement on the classification of experiments as laboratory or field. For example, Harrison & List (2004) provide a taxonomy of experiments with four categories, and with six classification rules to determine into which category an experiment falls (pp. 1012–14). Here, however, we settle on Humphreys & Weinstein’s (2009) definition of a field experiment, as it succinctly captures the distinction that we wish to highlight. Thus, the real-world quality that social scientists intuitively recognize as the distinguishing feature of field experiments stems from the researcher’s lack of control over the experiment once the treatment has been implemented within a naturalistic setting. In this light, the Hawthorne experiments are best described as field experiments, albeit ones with a tighter degree of control than might be possible in other studies.

Concerns about the external validity of laboratory experiments have always acted to increase the appeal of field experiments to sociologists (for a discussion of influential field experiments outside of sociology, see de Rooij et al. 2009, Humphreys & Weinstein 2009, Green & John 2010). Field experiments are understood to have the capacity to test causal questions in a setting that is largely generalizable to human behavior as it occurs in the wild, so to speak. One area in which field experiments have been particularly favored is the study of discrimination, a phenomenon that is typically construed as wrong, unfair, and in many cases illegal. Given the deviant nature of discriminatory practices, almost all alternative observational research methodologies can provide only poor (and possibly misleading) evidence of the extent and nature of discrimination.

A standard design for an experimental study of discrimination is the correspondence test, and we here describe this design as it would be applied to study race discrimination in recruitment. The researcher takes the role of a potential employee, making applications for a number of jobs. Typically, two or more letters of application are sent to a single employer. The letters are substantively the same, varying only in respect to the treatment condition; thus, all characteristics of an individual that might be deemed relevant to job performance (e.g., qualifications or experience) and all characteristics of the firm are controlled, allowing the researcher to evaluate the effect of the race of the applicant alone (a matched-pairs design). In studies of race discrimination, the treatment is most commonly the name of the applicant, because names are taken to be a strong signal of race; in Bertrand & Mullainathan (2004), for example, half of the applications were attributed to Emily Walsh or Greg Baker and half to Lakisha Washington or Jamal Jones. Interest is in how far employers differ in their behavior toward the applicants: If identically qualified applicants receive systematically different responses from the employer, the difference can be attributed to the treatment, and race discrimination is established. Virtually all studies employing a correspondence test design find evidence of discrimination (e.g., Daniel 1968; McIntosh & Smith 1974; Newman 1978, 1980; Commission for Racial Equality 1980, 1996; Firth 1981; Howitt & Owusu-Bempah 1990; Esmail & Everington 1993; Noon 1993; Hoque & Noon 1999; Bertrand & Mullainathan 2004; see also Pager & Shepherd 2008). The correspondence test design has been adapted to study other forms of discrimination; for example, Correll et al. (2007) demonstrate that employers discriminate against mothers, whereas Jackson (2009) uses a factorial design to show that characteristics related to class origin are used as a basis for discriminatory judgments by employers.
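The logic of the matched-pairs design lends itself to a simple paired analysis: the employers who respond to exactly one member of a pair carry the evidence about discrimination. The sketch below is a minimal illustration of that logic on simulated data; the callback probabilities, function names, and the choice of an exact sign test are our own assumptions for exposition, not a reconstruction of any published study's analysis.

import math
import random

def simulate_correspondence_test(n_pairs=500, p_name_a=0.100, p_name_b=0.065, seed=0):
    # Each employer receives two otherwise-identical applications; the only
    # difference is the racially distinctive name. The outcome for each
    # application is a callback (True) or no callback (False).
    rng = random.Random(seed)
    return [(rng.random() < p_name_a, rng.random() < p_name_b) for _ in range(n_pairs)]

def paired_sign_test(pairs):
    # Discordant pairs are employers who called back exactly one applicant.
    # Under the null of no discrimination, a discordant callback is equally
    # likely to favor either name, so the count favoring one name is
    # Binomial(n_discordant, 0.5); compute an exact two-sided p-value.
    only_a = sum(1 for a, b in pairs if a and not b)
    only_b = sum(1 for a, b in pairs if b and not a)
    n = only_a + only_b
    k = max(only_a, only_b)
    upper_tail = sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return only_a, only_b, min(1.0, 2 * upper_tail)

pairs = simulate_correspondence_test()
rate_a = sum(a for a, _ in pairs) / len(pairs)
rate_b = sum(b for _, b in pairs) / len(pairs)
print(f"callback rate, name A: {rate_a:.3f}; name B: {rate_b:.3f}")
print("discordant pairs and two-sided p-value:", paired_sign_test(pairs))

A real correspondence study would of course analyze observed rather than simulated callbacks, and might report regression-adjusted gaps alongside a paired test of this kind.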

Audit studies have employed similar experimental designs to test whether employers discriminate when faced with job candidates in person rather than written applications; in these studies, equally matched actors of different racial backgrounds apply for the same job, and if invited, subsequently attend job interviews. Once again, results from these experiments provide robust evidence of racial
discrimination by employers (McIntosh & Smith 1974, Cross et al. 1990, Turner et al. 1991, Commission for Racial Equality 1996, Riach & Rich 2002). In an influential sociological audit study, Pager (2003) extends the standard audit study design to assess the extent of discrimination related both to race and to possession of a criminal record. She finds strong evidence of employer discrimination against African Americans and against those with a criminal record (Pager 2003, p. 958).

The studies of discrimination throw into sharp relief some of the advantages of experimental design as applied to social science questions. Most obviously, discrimination studies directly test the level of discrimination faced by potential employees by using a simple experimental design with easily interpretable results. Furthermore, the experiments have a strong claim to external validity because they so closely mimic the situation that would obtain in the real world, and thus the conclusions can be readily applied to a wider societal context. But discrimination studies also demonstrate some of the weaknesses of field experiments. A lack of control over treatment is a potential problem, particularly for audit studies, where actors play the role of job applicants and where the experimental results will therefore be influenced by the quality and consistency of the actors’ performances. By their nature, audit studies are not double-blind: actors must know which condition they have been assigned to in order to play the role, and their behavior might be influenced by this knowledge (e.g., Pager 2003, pp. 969–70; 2007). Furthermore, Heckman & Seligman (1993) argue that although, for example, two actors may be matched on characteristics related to productivity in order to isolate the effect of race on the chances of a successful job application, productivity-relevant characteristics of the actors that are unobserved by the researcher but observed by employers mean that this test of discrimination is not a pure one (see also Heckman et al. 1998, Neumark 2011). A final concern voiced in relation to discrimination studies is not a design issue but an ethical one: Is it morally acceptable to deceive the participants in the experiment, who are indeed unaware that they are participants at all? Although detailed discussion of this point may be best left to those with special expertise in ethical issues, it bears noting that field experiments relying on a degree of deception for their success face practical hurdles put in place by universities, professional bodies, and presses.

A form of field experiment that we do not consider in detail here is the trial, or policy intervention study, also labeled the social experiment (e.g., Greenberg & Shroder 2004). Trials are to all intents and purposes field experiments, but they tend to be conducted on a larger scale than most social science field experiments, and they aim to evaluate the effects of particular policy interventions on social outcomes. These experiments are particularly important for disciplines in which there is significant interest in altering human behavior and/or social processes, such as criminology and educational research. For a discussion of design issues in the context of social experiments, see Bloom (2005), as well as Michalopoulos (2005), Riecken & Boruch (1978), and Cook & Shadish (1994); for summaries of specific social experiments, see Greenberg & Shroder (2004). For a thoughtful discussion of the importance of experimentation within criminological research, see Sherman (2007, 2009); examples of experimental criminology are described in Barnes et al. (2010) and Farrington & Welsh (2005). On experiments in educational research, the influential work of Mosteller and colleagues on the Tennessee class size project demonstrates the utility of experimental designs in assessing and guiding educational policy (Mosteller 1995, Mosteller et al. 1996, Cook 2002).

Survey Experiments

The final class of experiments that we consider is the survey experiment, also known as the population-based survey experiment. A basic definition comes from Mutz’s (2011, p. 2) recent book on survey experiments: “[A]
population-based survey experiment is an experiment that is administered to a representative population sample. . . . [It] uses survey sampling methods to produce a collection of experimental subjects that is representative of the target population of interest for a particular theory.” Just as with field experiments, in a survey experiment the investigator has full control over the experimental treatments and over assignment of treatments to experimental units, but little control over the treatments or subjects once the experiment is under way. Survey experimentation has been made more feasible for social scientists with a significant improvement in the infrastructure available to conduct such experiments; computer-aided survey techniques allow for greater flexibility of design than in the past, and organizations that support data gathering through survey experiments are growing in prominence (for example, Knowledge Networks and TESS: Time-Sharing Experiments for the Social Sciences).4

In a survey experiment, the researcher embeds the experimental design within a sample survey, often alongside standard survey instruments (such as those measuring demographic information or voting intentions). Treatments are assigned randomly to individuals participating in the survey. One common design, known as the factorial survey, employs vignettes to measure attitudes toward subjects or groups (for more on the use of vignettes in survey experiments, see Alves & Rossi 1978, Rossi 1979, Rossi & Anderson 1982, Rossi & Nock 1982). Vignettes are short, hypothetical descriptions of individuals or situations, and in a factorial survey, the researcher designs a number of vignettes in order to manipulate respondents’ attitudes toward an “object of judgment” (Mutz 2011, p. 54). For example, Jasso & Opp (1997) used a factorial survey to measure norms around political protest in the city of Leipzig. They identified several potential features that could lead individuals to feel duty-bound to undertake a political protest, including the level of discontent with the system, the types of protest actions available, and the level of personal risk expected by participating in political protest. They then designed vignettes that varied these features: 504 different vignettes were constructed to represent all combinations of the desired treatments, of which 36 vignettes were excluded for reasons of plausibility. Each respondent was asked to evaluate 10 of the vignettes chosen at random, and these evaluations comprise the outcome variable of the analysis. Interest is therefore in assessing the effect of the treatments, both alone and in combination with other treatments, on evaluative judgments (or attitudes).

4 Many present-day survey experiments are conducted online, although not all online (or Web-based) experiments are survey experiments. Increasingly, researchers are implementing designs online that have been previously implemented in the laboratory, with no attempt to reach a representative sample of respondents as in a standard survey experiment. Online experiments of this type are likely to be subject to design considerations that differ somewhat from experiments implemented in other contexts, not least because of technological constraints (see, e.g., Kohavi et al. 2009). It is not yet clear whether online experiments will earn a place alongside laboratory, field, and survey experiments as a standard context for sociological data collection.
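Returning to the factorial design just described, the sketch below builds a full factorial set of vignettes from a few illustrative dimensions, drops combinations flagged as implausible, and assigns each respondent a random subset of 10 to evaluate. The dimension names, levels, and plausibility rule are hypothetical stand-ins, loosely inspired by the Jasso & Opp (1997) design rather than taken from it.

import random
from itertools import product

# Hypothetical vignette dimensions and levels (illustrative only).
factors = {
    "discontent": ["low", "medium", "high"],
    "protest_action": ["petition", "demonstration", "blockade"],
    "personal_risk": ["none", "moderate", "severe"],
}

# Full factorial design: one candidate vignette per combination of levels.
vignettes = [dict(zip(factors, combo)) for combo in product(*factors.values())]

# Exclude combinations judged implausible (placeholder rule for illustration).
vignettes = [v for v in vignettes
             if not (v["discontent"] == "low" and v["protest_action"] == "blockade")]

def assign_vignettes(vignettes, n_respondents, per_respondent=10, seed=1):
    # Each respondent evaluates a random subset of the vignette pool.
    rng = random.Random(seed)
    return {r: rng.sample(vignettes, per_respondent) for r in range(n_respondents)}

assignments = assign_vignettes(vignettes, n_respondents=200)
print(len(vignettes), "vignettes retained;", len(assignments[0]), "per respondent")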

The possibility of multifactorial designs, in which multiple treatments are identified and every combination of treatments is tested, is a significant advantage of survey experiments. The mode of administration of the survey experiment allows for much larger sample sizes than would be feasible in a laboratory or perhaps even a field experiment, and these larger sample sizes provide the statistical power to detect even modest effects. In addition, each respondent to the survey will evaluate a set of vignettes, the size of the set being determined by the researcher, allowing for tests of within- and between-subject differences in responses. In statistical analysis of the treatments, both main effects and interaction effects can be examined, which may uncover significant but previously unexpected results. Thus, survey experiments fulfill many of the criteria identified earlier as features of strong experimental design.
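Analysis of such a design typically proceeds by regressing the evaluations on the treatment factors and their products. The sketch below simulates respondent ratings with a built-in interaction and recovers main and interaction effects with an ordinary least squares model; the variable names, effect sizes, and the use of the statsmodels formula interface are illustrative assumptions rather than a prescription.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000

# Simulated vignette evaluations with two crossed treatment factors.
df = pd.DataFrame({
    "discontent": rng.choice(["low", "medium", "high"], size=n),
    "personal_risk": rng.choice(["none", "moderate", "severe"], size=n),
})

# Build in a main effect of discontent, a main effect of risk, and an interaction.
main = (df["discontent"] == "high") * 1.0 - (df["personal_risk"] == "severe") * 0.5
interaction = ((df["discontent"] == "high") & (df["personal_risk"] == "severe")) * -0.7
df["duty_rating"] = 3.0 + main + interaction + rng.normal(scale=1.0, size=n)

# 'A * B' in the formula expands to both main effects plus their interaction terms.
model = smf.ols("duty_rating ~ C(discontent) * C(personal_risk)", data=df).fit()
print(model.summary())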

But perhaps the most attractive feature of survey experiments for social scientists is that, like field experiments, they are seen to
benefit from a high degree of external validity because the experiments are carried out on a representative sample of people outside of a laboratory. This high degree of external validity is coupled with a similarly high degree of internal validity; the researcher maintains control over the experimental conditions in much the same way as in the laboratory. Consequently, the results of the survey experiment are seen to be both easily generalizable and appropriate for testing causal hypotheses.

Although statements asserting the internal and external validity of survey experiments would be seen as largely uncontroversial among social scientists, the validity issues raised by survey experiments are in fact more complicated than one would at first suppose. As discussed, external validity is influenced both by experimental units and by context, and although a representative sample speaks to the generalizability of findings from the chosen experimental units, the artificiality of the survey environment does raise questions about generalizability of context: The survey experiment is not conducted in the real world, as with field experiments, but in an environment at least partially constructed by the researcher. Some research has indicated that the survey environment might in fact be more problematically artificial than that of the laboratory: For example, Collett & Childs (2011, p. 520) show that subjects respond differently to laboratory and vignette treatments that are designed to be equivalent, with stronger (and theoretically expected) affective responses observed in the laboratory, where the subjects experience the treatments in a more emotionally intense way. Experiments conducted in the laboratory might also encourage increased engagement and involvement on the part of subjects. Therefore, although surveys are administered in what might appear to be more naturalistic settings than the laboratory, behavior may still be influenced by the mode of administration of the survey, a lack of care and attention to the questions, the necessary artificiality of the questions, or social desirability effects, as well as by other potential problems associated with standard survey designs (e.g., Finch 1987, Eifler 2010). For a detailed discussion of external validity in survey experiments, see Mutz (2011, ch. 8).

Social science studies employing survey experiment designs include Sniderman et al. (1991), Williams et al. (2008), and Hopkins & King (2010). For impressive reviews of the design of survey experiments, particularly in relation to the study of political and social attitudes, see Sniderman & Grob (1996) and Sniderman (2011), and for an up-to-date summary of the use of survey experiments in the social sciences, see Mutz (2011).

SOME GENERAL CONSIDERATIONS

Despite the primacy of observational methods within the social sciences, and sociology in particular, experimental methods have had substantial and growing influence. In this review, we have provided examples of the types of social science questions amenable to experimental investigation and the design tools available to address those questions. But before embarking on a headlong rush to experimentation, the cautious sociologist may wish to consider not just the issues of experimental design reviewed here, but also the role of experiments in the production of scientific knowledge. The experimental method is extremely powerful, particularly in largely eliminating the contaminating role of unobserved confounders and thereby making causal explanation more secure, but a good part of that power derives from the integration of experimental methods with established scientific practices. Three (related) features of scientific practices should be highlighted: the role of theory, the importance of replication, and the role of disciplinary boundaries.

Rigorous research questions and/or hypotheses linked to theory are an important foundation of scientific discovery, and where the aim is to draw theoretically relevant and meaningful conclusions from empirical research, the theory and method must be integrated. For example, experiments may be used to test hypotheses derived from
well-established theories or to establish baseline empirical regularities that will potentially form the basis of new theories. One consequence of the integration of theory and method is that results from different studies are commensurable, and thus cumulative knowledge is made possible. In the social sciences, perhaps with the exception of economics and social psychology, there are few commonly accepted theories from which hypotheses can be derived and tested. Instead, there is a proliferation of theories, and it is not uncommon to see experimental tests of theories that have been established only a few pages before the experimental results are reported. In part this may be attributed to the perceived higher status attached to theory development in contrast to the testing of established theories. But the result is a lack of coherence, an incomprehensible pattern of small points of light in a dark sky.

The career incentives to invest in the replication of existing experimental results are also low, yet independent replication of experimental findings is absolutely essential for scientific progress. Fisher identified independent replication as one of the four central principles of experimental design, and Campbell & Stanley (1963, p. 3) similarly emphasize the importance of replication in their classic volume on experimentation in the social sciences:

[W]e must increase our time perspective, and recognize that continuous, multiple experimentation is more typical of science than once-and-for-all definitive experiments. The experiments we do today, if successful, will need replication and cross-validation at other times under other conditions before they can become an established part of science, before they can be theoretically interpreted with confidence.

Independent replication does not simply entail exact repetition of the same design. For example, a factorial experiment may be set up partly to confirm previous conclusions but also to examine new features. It is largely through independent replication that the experimental method can be allowed to claim a privileged role in the testing of scientific theory in comparison with observational methods, although without some change in the rewards that replication confers, any simple injunction in support of replication will have little effect. But without independent replication, we are unable to judge the stability of the experimental conclusions or the extent to which those conclusions can be extended outside the scope conditions of the initial experiment. From the perspective of a cumulative approach to scientific knowledge, a single experiment is not likely to offer conclusive evidence in support of a general theory, and it is important that findings from both identical and distinct (but related) studies reinforce the same finding. The value of experimental research is to be found in the collectivity.

Disciplinary boundaries within the social sciences may to some extent limit the current usefulness of experimental methods. These boundaries discourage the establishment of general theories of human behavior by embedding individual theories within discipline-specific debates and terminology, consequently increasing the cost of independent replication by those outside the discipline. Disciplinary boundaries also limit the extent to which results of experiments are shared across specialized fields, so that a test of, say, cognitive dissonance reduction theory in economics may take place without reference to the extensive body of social psychological work testing the theory. Finally, even within social science disciplines, methodological and theoretical overspecialization works to limit the scientific usefulness of experiments. Social scientists are apt to describe themselves and others as x methodologists or y theorists, sometimes with an implication of all-inclusive explanation, but it is dangerous indeed if a discipline places the power to test theories in the hands of experimentalists, and the power to describe patterns of empirical regularities in the hands of observationalists.

The emphasis throughout this discussion has been on the design of individual experimental studies leading to secure conclusions as
far as possible free of worrying caveats. Nevertheless, progress typically depends on the synthesis of conclusions from various sources. Cochran (1965) remarked that he had earlier asked R.A. Fisher how he thought the conclusions from observational studies could be made more secure. The characteristically gnomic reply: “Make your theories elaborate.” Cochran reported that Fisher meant that the evidence used should be as wide-ranging in nature as feasible. This must surely apply to both observationally and experimentally based evidence and strongly encourages mixtures of approaches.

DISCLOSURE STATEMENT

The authors are not aware of any affiliations, memberships, funding, or financial holdings that might be perceived as affecting the objectivity of this review.

ACKNOWLEDGMENTS

We thank Karen Cook, Shelley Correll, David Grusky, Oscar Grusky, Colin Mills, Cecilia Ridgeway, and the anonymous reviewers for their very helpful comments on earlier drafts of this review.

LITERATURE CITED

Adair JG, Sharpe D, Huynh C-L. 1989. Hawthorne control procedures in educational experiments: a reconsideration of their use and effectiveness. Rev. Educ. Res. 59:215–28
Aiello JR, Douthitt EA. 2001. Social facilitation from Triplett to electronic performance monitoring. Group Dyn. 5:163–80
Allport FH. 1920. The influence of the group upon association and thought. J. Exp. Psychol. 3:159–82
Allport FH. 1954. The historical background of modern social psychology. In The Handbook of Social Psychology, ed. G Lindzey, pp. 3–56. Reading, MA: Addison-Wesley
Alves WM, Rossi PH. 1978. Who should get what? Fairness judgments in the distribution of earnings. Am. J. Sociol. 84:541–64
Asch SE. 1951. Effects of group pressure upon the modification and distortion of judgment. In Groups, Leadership and Men, ed. H Guetzkow, pp. 177–90. Pittsburgh, PA: Carnegie Press
Asch SE. 1956. Studies of independence and conformity: a minority of one against a unanimous majority. Psychol. Monogr. 70:416
Bandura A, Ross D, Ross SA. 1961. Transmission of aggression through imitation of aggressive models. J. Abnorm. Soc. Psychol. 63:575–82
Barnes GC, Ahlman L, Gill C, Sherman LW, Kurtz E, Malvestuto R. 2010. Low-intensity community supervision for low-risk offenders: a randomized, controlled trial. J. Exp. Criminol. 6:159–89
Berger J. 2007. The standardized experimental situation in expectation states research: Notes on history, uses, and special features. See Webster & Sell 2007, pp. 353–78
Berger J, Ficek MH, Norman RZ, Zelditch M. 1977. Status Characteristics and Social Interaction. New York: Elsevier
Berger J, Wagner D. 1993. Status characteristics theory: the growth of a program. In Theoretical Research Programs: Studies in the Growth of Theory, ed. J Berger, M Zelditch, pp. 23–63. Stanford, CA: Stanford Univ. Press
Berger J, Webster M. 2006. Expectations, status, and behaviour. In Contemporary Social Psychological Theories, ed. PJ Burke, pp. 268–300. Stanford, CA: Stanford Univ. Press
Bertrand M, Mullainathan S. 2004. Are Emily and Greg more employable than Lakisha and Jamal? A field experiment on labor market discrimination. Am. Econ. Rev. 94:991–1013
Blau PM. 1955. The Dynamics of Bureaucracy. Chicago: Univ. Chicago Press
Blau PM. 1964. Exchange and Power in Social Life. New York: Wiley
Bloom HS, ed. 2005. Learning More from Social Experiments. Evolving Analytic Approaches. New York: Russell Sage Found.
Bonacich P, Light J. 1978. Laboratory experimentation in sociology. Annu. Rev. Sociol. 4:145–70
Bradford Hill A. 1965. The environment and disease: association or causation? Proc. R. Soc. Med. 58:295–300
Brehm SS, Kassin SM, Fein S. 1999. Social Psychology. Boston, MA: Houghton Mifflin. 4th ed.
Brown SR, Melamed LE. 1990. Experimental Design and Analysis (Quantitative Applications in the Social Sciences). Newbury Park, CA: Sage
Campbell D, Stanley J. 1963. Experimental and Quasi-Experimental Designs for Research. Chicago: Rand-McNally
Chapin FS. 1936. Social theory and social action. Am. Sociol. Rev. 1:1–11
Cochran WG. 1965. The planning of observational studies of human populations. J. R. Stat. Soc. A 128:234–66
Cohen BP. 1988. Developing Sociological Knowledge: Theory and Method. Chicago: Nelson-Hall
Collett JL, Childs E. 2011. Minding the gap: meaning, affect, and the potential shortcomings of vignettes. Soc. Sci. Res. 40:513–22
Commission for Racial Equality. 1980. Half a Chance? A Report on Job Discrimination Against Young Blacks in Nottingham. Nottingham: CRE
Commission for Racial Equality. 1996. We Regret to Inform You. . . London: CRE
Cook KS. 1987. Social Exchange Theory. Beverly Hills, CA: Sage
Cook KS, Cooper RM. 2003. Experimental studies of cooperation, trust and social exchange. In Trust, Reciprocity and Gains from Association: Interdisciplinary Lessons from Experimental Research, ed. E Ostrom, J Walker, pp. 277–333. New York: Russell Sage Found.
Cook KS, Emerson RM. 1978. Power, equity and commitment in exchange networks. Am. Sociol. Rev. 43:721–39
Cook KS, Emerson RM, Gillmore MR, Yamagishi T. 1983. The distribution of power in exchange networks: theory and experimental results. Am. J. Sociol. 89:275–305
Cook KS, O’Brien J, Kollock P. 1990. Exchange theory: a blueprint for structure and process. In Frontiers of Social Theory: The New Synthesis, ed. G Ritzer, pp. 158–81. New York: Columbia Univ. Press
Cook KS, Yamagishi T. 2008. In defense of deception. Soc. Psychol. Q. 71:215–21
Cook TD. 2002. Randomized experiments in educational policy research: a critical examination of the reasons the educational evaluation community has offered for not doing them. Educ. Eval. Policy Anal. 24:175–99
Cook TD, Shadish WR. 1994. Social experiments: some developments over the past fifteen years. Annu. Rev. Psychol. 45:545–80
Correll SJ, Benard S, Paik I. 2007. Getting a job: Is there a motherhood penalty? Am. J. Sociol. 112:1297–38
Cox DR. 1958. Planning of Experiments. New York: Wiley
Cross H, Kenny G, Mell J, Zimmerman W. 1990. Differential Treatment of Hispanic and Anglo Job Seekers: Hiring Practices in Two Cities. Washington, DC: Urban Inst. Press
Daniel WW. 1968. Racial Discrimination in England. Based on the PEP Report. Harmondsworth: Penguin
de Rooij EA, Green DP, Gerber AS. 2009. Field experiments on political behavior and collective action. Annu. Rev. Polit. Sci. 12:389–95
Dickson WJ, Roethlisberger FJ. 1966. Counselling in an Organization: A Sequel to the Hawthorne Researches. Boston, MA: Harv. Bus. Sch. Press
Druckman JN, Green DP, Kuklinski JH, Lupia A. 2006. The growth and development of experimental research in political science. Am. Polit. Sci. Rev. 100:627–35
Eckel C. 2007. Economic games for social scientists. See Webster & Sell 2007, pp. 497–515
Eifler S. 2010. Validity of a factorial survey approach to the analysis of criminal behavior. Methodology 6:139–46
Esmail A, Everington S. 1993. Racial discrimination against doctors from ethnic minorities. BMJ 306:691–92
Farrington D, Welsh B. 2005. Randomized experiments in criminology: What have we learned in the last two decades? J. Exp. Criminol. 1:9–28
Finch J. 1987. The vignette technique in survey research. Sociology 21:105–14
Firth M. 1981. Racial discrimination in the British labour market. Ind. Labor Relat. Rev. 34:265–72
Fisher RA. 1926. The arrangement of field experiments. J. Minist. Agric. G. B. 33:503–13
Fisher RA. 1935. Design of Experiments. Edinburgh: Oliver & Boyd
Franke RH, Kaul JD. 1978. The Hawthorne experiments: first statistical interpretation. Am. Sociol. Rev. 43:623–43
Friedman D, Sunder S. 1994. Experimental Methods: A Primer for Economists. Cambridge, UK: Cambridge Univ. Press
Gillespie R. 1991. Manufacturing Knowledge: A History of the Hawthorne Experiments. New York: Cambridge Univ. Press
Green DP, Gerber AS. 2003. The underprovision of experiments in political science. Ann. Am. Acad. Polit. Soc. Sci. 589:94–112
Green DP, John P. 2010. Field experiments in comparative politics and policy. Ann. Am. Acad. Polit. Soc. Sci. 628:6–10
Greenberg DH, Shroder M. 2004. The Digest of Social Experiments. Washington, DC: Urban Inst. Press
Guala F. 2005. The Methodology of Experimental Economics. Cambridge, UK: Cambridge Univ. Press
Haney C, Banks C, Zimbardo P. 1973. Interpersonal dynamics in a simulated prison. Int. J. Criminol. Penol. 1:69–97
Harrison GW, List JA. 2004. Field experiments. J. Econ. Lit. 42:1009–55
Heckman J, Ichimura H, Smith J, Todd P. 1998. Characterizing selection bias using experimental data. Econometrica 6:1017–99
Heckman J, Seligman P. 1993. The Urban Institute Audit Studies: their methods and findings. In Clear and Convincing Evidence: Measurement of Discrimination in America, ed. M Fix, RJ Struyk, pp. 187–258. Washington, DC: Urban Inst. Press
Holt CA. 2007. Markets, Games, and Strategic Behavior. Boston, MA: Pearson
Homans GC. 1958. Social behavior as exchange. Am. J. Sociol. 63:597–606
Hopkins D, King G. 2010. Improving anchoring vignettes: designing surveys to correct interpersonal incomparability. Public Opin. Q. 74:201–22
Hoque K, Noon M. 1999. Racial discrimination in speculative applications: new optimism six years on? Hum. Resour. Manag. J. 9:71–82
Howitt D, Owusu-Bempah J. 1990. The pragmatics of institutional racism: beyond words. Hum. Relat. 9:885–99
Humphreys M, Weinstein JM. 2009. Field experiments and the political economy of development. Annu. Rev. Polit. Sci. 12:367–78
Iyengar S. 2011. Laboratory experiments in political science. In Cambridge Handbook of Experimental Political Science, ed. JN Druckman, DP Green, JH Kuklinski, A Lupia, pp. 73–88. New York: Cambridge Univ. Press
Jackson M. 2009. Disadvantaged through discrimination. The role of employers in social stratification. Br. J. Sociol. 60:669–92
Jasso G, Opp K-D. 1997. Probing the character of norms: a factorial survey analysis of the norms of political action. Am. Sociol. Rev. 62:947–64
Jones SRG. 1992. Was there a Hawthorne effect? Am. J. Sociol. 98:451–68
Kagel JK, Roth AE, eds. 1995. Handbook of Experimental Economics. Princeton, NJ: Princeton Univ. Press
Kohavi R, Longbotham R, Sommerfield D, Henne RM. 2009. Controlled experiments on the web: survey and practical guide. Data Min. Knowl. Discov. 18:140–81
Ledolter J, Swersey AJ. 2007. Testing 1–2–3. Stanford, CA: Stanford Univ. Press
Levitt SD, List JA. 2007. What do laboratory experiments measuring social preferences reveal about the real world? J. Econ. Perspect. 21:153–74
Levitt SD, List JA. 2011. Was there really a Hawthorne effect at the Hawthorne plant? An analysis of the original illumination experiments. Am. Econ. J. 3:224–38
Lowell AL. 1910. The physiology of politics. Am. Polit. Sci. Rev. 4:1–15
Marshall A. 1890. Principles of Economics: An Introductory Volume. London: Macmillan
Martin MW, Sell J. 1979. The role of the experiment in the social sciences. Sociol. Q. 20:581–90
Mayo E. 1933. The Human Problems of an Industrial Civilization. New York: Macmillan
McDermott R. 2007. Experimental political science. See Webster & Sell 2007, pp. 483–96
McIntosh N, Smith DJ. 1974. The Extent of Racial Discrimination. PEP, Broadsh. No. 547. London: PEP
Michalopoulos C. 2005. Precedents and prospects for randomized experiments. In Learning More from Social Experiments. Evolving Analytic Approaches, ed. HS Bloom, pp. 1–36. New York: Russell Sage Found.
Milgram S. 1974. Obedience to Authority; An Experimental View. New York: Harper & Row
Molm LD. 2007. Experiments on exchange relations and exchange networks in sociology. See Webster & Sell 2007, pp. 379–406
Molm LD, Cook KS. 1995. Social exchange theory. In Sociological Perspectives on Social Psychology, ed. KS Cook, G Fine, J House, pp. 209–35. Needham, MA: Allyn & Bacon
Molm LD, Peterson G, Takahashi N. 2001. The value of exchange. Soc. Forces 80:159–84
Molm LD, Takahashi N, Peterson G. 2000. Risk and trust in social exchange: an experimental test of a classical proposition. Am. J. Sociol. 105:1396–427
Morton RB, Williams KC. 2010. Experimental Political Science and the Study of Causality: From Nature to the Lab. New York: Cambridge Univ. Press
Mosteller F. 1995. The Tennessee study of class size in the early school grades. Crit. Issues Child. Youths 5:113–27
Mosteller F, Light RJ, Sachs JA. 1996. Sustained inquiry in education: lessons from skill grouping and class size. Harv. Educ. Rev. 66:797–842
Mutz DC. 2011. Population-Based Survey Experiments. Princeton, NJ: Princeton Univ. Press
Neumark D. 2011. Detecting discrimination in audit and correspondence studies. NBER Work. Pap. 16448, Natl. Bur. Econ. Res., Cambridge, MA. http://www.nber.org/papers/w16448
Newman J. 1978. Discrimination in recruitment: an empirical analysis. Ind. Labor Relat. Rev. 32:15–23
Newman J. 1980. Reply. Ind. Labor Relat. Rev. 33:547–50
Noon M. 1993. Racial discrimination in speculative application: evidence from the UK’s top 100 firms. Hum. Resour. Manag. J. 3:35–47
Oakley A. 1998. Experimentation and social interventions: a forgotten but important history. BMJ 317:1239–42
Pager D. 2003. The mark of a criminal record. Am. J. Sociol. 108:937–75
Pager D. 2007. The use of field experiments for studies of employment discrimination: contributions, critiques, and directions for the future. Ann. Am. Acad. Polit. Soc. Sci. 609:104–33
Pager D, Shepherd H. 2008. The sociology of discrimination: racial discrimination in employment, housing, credit, and consumer markets. Annu. Rev. Sociol. 34:181–209
Palfrey TR. 2009. Laboratory experiments in political economy. Annu. Rev. Polit. Sci. 12:379–88
Pearl J, Bareinboim E. 2011. External validity and transportability: a formal approach. 58th World Stat. Congr., Int. Stat. Inst., Aug., Dublin. The Hague, Neth.: Int. Stat. Inst.
Riach P, Rich J. 2002. Field experiments of discrimination in the market place. Econ. J. 112:480–518
Ridgeway CL. 1991. The social construction of status value: gender and other nominal characteristics. Soc. Forces 70:367–86
Ridgeway CL. 2001. Gender, status and leadership. J. Soc. Issues 57:637–55
Ridgeway CL, Backor K, Li YE, Tinkler JE, Erickson KG. 2009. How easily does a social difference become a status distinction? Gender matters. Am. Sociol. Rev. 74:44–62
Ridgeway CL, Boyle EH, Kuipers K, Robinson D. 1998. How do status beliefs develop? The role of resources and interaction. Am. Sociol. Rev. 63:331–50
Ridgeway CL, Correll SJ. 2006. Consensus and the creation of status beliefs. Soc. Forces 85:431–54
Ridgeway CL, Erickson KG. 2000. Creating and spreading status beliefs. Am. J. Sociol. 106:579–615
Riecken HW, Boruch RF. 1978. Social experiments. Annu. Rev. Sociol. 4:511–32
Roethlisberger FJ, Dickson WJ. 1939. Management and the Worker. Cambridge, MA: Harvard Univ. Press
Rossi PH. 1979. Vignette analysis: uncovering the normative structure of complex judgments. In Qualitative and Quantitative Social Research: Papers in Honor of Paul F. Lazarsfeld, ed. RK Merton, JS Coleman, PH Rossi, pp. 176–86. New York: Free Press
Rossi PH, Anderson AB. 1982. The factorial survey approach: an introduction. See Rossi & Nock 1982, pp. 15–67
Rossi PH, Nock SL, eds. 1982. Measuring Social Judgments: The Factorial Survey Approach. Beverly Hills, CA: Sage
Roth AE. 1993. On the early history of experimental economics. J. Hist. Econ. Thought 15:184–209
Roth AE. 1995. Introduction to experimental economics. See Kagel & Roth 1995, pp. 3–109
Sell J. 2008. Introduction to deception debate. Soc. Psychol. Q. 71:213–14
Sherman LW. 2007. The power few: experimental criminology and the reduction of harm. J. Exp. Criminol. 3:299–321
Sherman LW. 2009. Evidence and liberty: the promise of experimental criminology. Criminol. Crim. Justice 9:5–28
Sniderman PM. 2011. The logic and design of the survey experiment: an autobiography of a methodological innovation. In Cambridge Handbook of Experimental Political Science, ed. JN Druckman, DP Green, JH Kuklinski, A Lupia, pp. 102–14. New York: Cambridge Univ. Press
Sniderman PM, Grob DB. 1996. Innovations in experimental design in attitude surveys. Annu. Rev. Sociol. 22:377–99
Sniderman PM, Piazza T, Tetlock PE, Kendrick A. 1991. The new racism. Am. J. Polit. Sci. 35:423–47
Strube MT. 2005. What did Triplett really find? A contemporary analysis of the first experiment in social psychology. Am. J. Psychol. 118:271–86
Thye SR. 2007. Logical and philosophical foundations of experimental research in the social sciences. See Webster & Sell 2007, pp. 57–86
Triplett N. 1898. The dynamogenic factors in pacemaking and competition. Am. J. Psychol. 9:507–33
Turner M, Fix M, Struyk R. 1991. Opportunities Denied, Opportunities Diminished: Racial Discrimination in Hiring. Washington, DC: Urban Inst. Press
Walker HA, Cohen BP. 1985. Scope statements: imperatives for evaluating theory. Am. Sociol. Rev. 50:288–301
Walker HA, Willer D. 2007. Experiments and the science of sociology. See Webster & Sell 2007, pp. 25–56
Webster JR, Kervin JB. 1971. Artificiality in experimental sociology. Can. Rev. Sociol. 8:263–72
Webster JR, Sell J, eds. 2007. Laboratory Experiments in the Social Sciences. Amsterdam: Academic/Elsevier
Williams MTE, Turkheimer E, Magee E, Guterbock T. 2008. The effects of race and racial priming on self-report of contamination anxiety. Personal. Individ. Differ. 44:746–57
Zelditch M. 2007. Laboratory experiments in sociology. See Webster & Sell 2007, pp. 517–31
