
The Neural Basis of Decision Making

Joshua I. Gold (1) and Michael N. Shadlen (2)

(1) Department of Neuroscience, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6072; email: [email protected]
(2) Howard Hughes Medical Institute and Department of Physiology and Biophysics, University of Washington, Seattle, Washington 98195-7290; email: [email protected]

Annu. Rev. Neurosci. 2007. 30:535–74
The Annual Review of Neuroscience is online at neuro.annualreviews.org
This article's doi: 10.1146/annurev.neuro.29.051605.113038
Copyright © 2007 by Annual Reviews. All rights reserved
0147-006X/07/0721-0535$20.00

Key Words: psychophysics, signal detection theory, sequential analysis, motion perception, vibrotactile perception, choice, reaction time

Abstract: The study of decision making spans such varied fields as neuroscience, psychology, economics, statistics, political science, and computer science. Despite this diversity of applications, most decisions share common elements including deliberation and commitment. Here we evaluate recent progress in understanding how these basic elements of decision formation are implemented in the brain. We focus on simple decisions that can be studied in the laboratory but emphasize general principles likely to extend to other settings.



    Contents

INTRODUCTION
  Elements of a Decision
  Conceptual Framework
EXPERIMENTS
  Perceptual Tasks
  Simple Motor Latencies: Deciding When to Initiate an Action
  Value-Based Decisions
CONCLUSIONS

    INTRODUCTION

A decision is a deliberative process that results in the commitment to a categorical proposition. An apt analogy is a judge or jury that must take time to weigh evidence for alternative interpretations and/or possible ramifications before settling on a verdict. Here we evaluate progress in understanding how this process is implemented in the brain. Our scope is somewhat narrow: We consider primarily studies that relate behavior on simple sensory-motor tasks to activity measured in the brain because of the ability to precisely control sensory input, quantify motor output, and target relevant brain regions for measurement and analysis. Nevertheless, our intent is broad: We hope to identify principles that seem likely to contribute to the kinds of flexible and nuanced decisions that are a hallmark of higher cognition.

SDT: signal detection theory

SA: sequential analysis

The organization of this review is as follows. We first describe the computational elements that comprise the decision process. We then briefly review signal detection theory (SDT) and sequential analysis (SA), two related branches of statistical decision theory that represent formal, mathematical prescriptions for how to form a decision using these computational elements. We then dissect several experimental results in the context of this theoretical framework to identify neural substrates of decision making. We conclude with a discussion of the strengths and limitations of this approach for inferring principles of higher brain function.

    Elements of a Decision

The decisions required for many sensory-motor tasks can be thought of as a form of statistical inference (Kersten et al. 2004, Rao 1999, Tenenbaum & Griffiths 2001, von Helmholtz 1925): What is the (unknown) state of the world, given the noisy data provided by the sensory systems? These decisions select among competing hypotheses h1 . . . hn (often n = 2) that each represent a state of the world (e.g., a stimulus is present or absent). The elements of this decision process (see Figure 1) are described in terms of probability theory, as follows.

The probability P(hi), or prior, refers to the probability that hi is true before obtaining any evidence about it. In the courtroom analogy, priors correspond to prejudices that can bias jurors' judgments. Bayesian inference prescribes a more positive role for priors, which are necessary to convert measurable properties of evidence (the values it can attain when hi is true) to inferred ones (the probability that hi is true given a particular observation). For a sensory-motor task, a prior typically corresponds to the predicted probability of seeing a particular stimulus or receiving a particular reward on the upcoming trial, which can be instructed (e.g., Carpenter & Williams 1995, Basso & Wurtz 1998, Dorris & Munoz 1998, Platt & Glimcher 1999) or inferred from its relative frequency of occurrence on previous trials (Sugrue et al. 2004).
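The conversion this paragraph describes, from measurable likelihoods and a prior to an inferred posterior, is just Bayes' rule. The sketch below is our illustration, not part of the review; the prior and likelihood values are arbitrary.

```python
# Bayes' rule for two exhaustive hypotheses h1, h2:
# P(h1 | e) = P(e | h1) P(h1) / [ P(e | h1) P(h1) + P(e | h2) P(h2) ].
# All numbers here are made up for illustration.

def posterior(prior_h1, like_e_h1, like_e_h2):
    """Return P(h1 | e) given the prior P(h1) and the likelihoods P(e | hi)."""
    prior_h2 = 1.0 - prior_h1
    evidence = like_e_h1 * prior_h1 + like_e_h2 * prior_h2  # P(e)
    return like_e_h1 * prior_h1 / evidence

# With an unbiased prior, the posterior is set by the likelihood ratio alone:
# a 2:1 likelihood ratio yields P(h1 | e) = 2/3.
print(posterior(0.5, like_e_h1=0.8, like_e_h2=0.4))

# A prior favoring h2 shifts the same evidence toward h2.
print(posterior(0.2, like_e_h1=0.8, like_e_h2=0.4))
```

The same evidence thus supports different inferences depending on the prior, which is why priors are an explicit element of the decision process in Figure 1.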

The evidence (e) refers to information that bears on whether (and possibly when) to commit to a particular hypothesis. A strand of hair found at a crime scene can be used as evidence if it supports or opposes the hypothesis that a certain person was present at that location. For a perceptual task, neural activity that represents immediate or remembered attributes of a sensory stimulus can be used as evidence. However, like hair at a crime scene, this sensory activity is evidence only insofar as it bears on a hypothesis. Thus e is useful when it can be interpreted in the context of conditional probabilities such as P(e | hi), the "likelihood" function describing the values that e can attain when hi is true. Perceptual tasks are useful for studying decision formation in part because of the ability to control precisely the quantity and quality of the sensory evidence and measure the impact on likelihood functions obtained from relevant sensory neurons.

[Figure 1 omitted. Caption: Elements of a simple decision between two alternatives. The left side represents elements of the world. The right side represents elements of the decision process in the brain. Black elements establish context. Red elements form the decision. Blue elements evaluate and possibly update the decision process.]

Value (v) refers to the subjective costs and benefits that can be attributed to each of the potential outcomes (and associated courses of action) of a decision process. Value can be manipulated by giving explicit feedback or monetary rewards to human subjects or preferred food or drink to nonhuman subjects. Value can also reflect more implicit factors such as the costs associated with wasted time, effort, and resources. Here we make no distinction between value and utility, disregarding their technical meanings and opting instead for a more general concept that describes subjective influences on the decision process.

DV: decision variable

The decision variable (DV) represents the accrual of all sources of priors, evidence, and value into a quantity that is interpreted by the decision rule to produce a choice. It is a conceptual entity corresponding to the deliberations in a trial leading up to the verdict. Note that here "deliberations" does not imply that the DV is necessarily computed rationally and without emotion; rather, it emphasizes that the DV is capable of accounting for multiple sources of information (priors, evidence, and value) that are interpreted over time. Thus, the DV is not tied to the (possibly fleeting) appearance of stimuli but spans the time from the first pieces of relevant information to the final choice. Also, unlike the choice, which is discrete, it is often best thought of as an analog quantity. We spend much of this chapter refining the concept of a DV and describing efforts to find its neural correlates.

The decision rule determines how and when the DV is interpreted to arrive at a commitment to a particular alternative Hi (the choice associated with hypothesis hi). The rule causes the jury to declare, "we have a verdict." A conceptually simple rule is to place a criterion value on the DV. This rule requires a DV whose magnitude reflects the balance of support and opposition for a hypothesis. Such a rule allows the decision maker to achieve at least one of several appealing long-term goals, including maximizing accuracy or reward or achieving a target decision time.

The course of action that follows the commitment to an alternative is often necessary to reap the costs and benefits associated with that alternative. In these cases, the decision itself might be best thought of not as an abstract computation but rather as the explicit intention to pursue (or avoid) a particular course of action. This idea is a form of "embodiment" that places high-order cognitive capacities such as decision making in the context of behavioral planning and execution (Cisek 2007, Clark 1997, Merleau-Ponty 1962, O'Regan & Noë 2001). A key practical implication is that the parts of the brain responsible for selecting (or planning) certain behaviors may play critical roles in forming decisions that lead to those behaviors.

The goals of a decision maker are to achieve desired outcomes and avoid undesired ones. Desired outcomes include "getting it right" or maximizing the percentage of correct responses in tasks that have right and wrong answers or, more generally, maximizing expected value (Green & Swets 1966). Undesired outcomes include getting it wrong, minimizing value, and wasting time, effort, or resources. Goals are critical because the decision process is assumed to be intended, and perhaps even optimized, to achieve them. Indeed, optimality can be assessed only in the context of a goal. Thus, behavior that is "suboptimal" with respect to certain objective goals such as maximizing accuracy might in fact be optimal with respect to the idiosyncratic goal(s) of the decision maker.

Evaluation, or performance monitoring, is necessary to analyze the efficacy or optimality of a decision with respect to its particular goals. For laboratory tasks, evaluation can occur with or without explicit feedback (e.g., Carter et al. 1998, Ito et al. 2003, Ridderinkhof et al. 2004, Schall et al. 2002, Stuphorn et al. 2000). In either case it is likely to play a critical role in shaping future decisions via learning mechanisms that, in principle, can affect every aspect of the process, from incorporating the most appropriate priors, evidence, and value into the DV to establishing the most effective decision rule.

    Conceptual Framework

Signal detection theory. SDT is one of the most successful formalisms ever used to study perception. Unlike information theory and other biostatistical tools commonly used for data analysis, SDT prescribes a process to convert a single observation of noisy evidence into a categorical choice. Early applications allowed psychologists to infer from behavior properties of the underlying sensory representation (Green & Swets 1966). Later, pioneering work in retinal and somatosensory physiology established SDT as a valuable tool to relate the measured responses of sensory neurons to the limits of detection and discrimination (for reviews see Parker & Newsome 1998, Rieke et al. 1997). More recently, it has begun to shed light on decision mechanisms.

According to SDT, the decision maker obtains an observation of evidence, e. In perceptual psychophysics, e is derived from the senses and might be the spike count from a neuron or pool of neurons, or a derived quantity such as the difference between spike rates of two pools of neurons. It is caused by a stimulus (or state) controlled by the experimenter; e.g., h1 (stimulus present) or h2 (stimulus absent). If e is informative, then its magnitude differs under these states. However, e is also corrupted by noise. Thus e is a random variable described by a distribution whose parameters (e.g., the mean) are set by h1 or h2. These conditionalized distributions are the likelihoods P(e | h1) and P(e | h2). Unlike standard statistical methods, the object of SDT is not to determine whether the parameters describing these distributions are different but instead to decide which of the states gave rise to the observation e.

The decision requires the construction of a DV from e. For binary decisions, the DV is typically related to the ratio of the likelihoods of h1 and h2 given e: l12(e) ≡ P(e | h1)/P(e | h2). A simple decision rule is to apply a criterion to the DV; e.g., choose h1 if and only if l12(e) ≥ β, where β is a constant. A strength of SDT is that a variety of goals can be reached by simply using different values for the criterion. If the goal is accuracy and the two alternatives are equally likely, then β = 1. If the goal is accuracy and the prior probability favors one of the hypotheses, then β = P(h2)/P(h1). If the goal is to maximize value (where vij is the value associated with choice Hj when hypothesis hi is true), then

β = [(v22 + v21) P(h2)] / [(v11 + v12) P(h1)].

For more details, the reader should refer to the first chapter of Green & Swets (1966), where these expressions are derived.

LR: likelihood ratio

logLR: logarithm of the likelihood ratio

SDT thus provides a flexible framework to form decisions that incorporate priors, evidence, and value to achieve a variety of goals. Unfortunately, this flexibility also poses a challenge to neurobiologists. The above expressions were obtained assuming that the DV is the likelihood ratio (LR), l12(e). However, equivalent expressions (that is, those that will achieve the same goals) can be obtained (by scaling β) using any quantity that is monotonically related to the LR. In other words, these equations do not constrain the priors, e, value, the DV, or β to take on any particular form, only that they interact in a certain way. Thus it is difficult to assign a quantity measured in the brain to any one of these elements without knowing how the others are represented. One powerful approach to unraveling this conundrum is to exploit differences in the time scales of these elements in decision formation.
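The point that any monotone transform of the LR supports an equivalent decision rule, provided the criterion is rescaled accordingly, can be demonstrated directly. The sketch below is our illustration (not the authors'); the Gaussian likelihoods and the criterion value are arbitrary choices.

```python
import math
import random

# Two hypotheses: e ~ N(mu1, 1) under h1, e ~ N(mu2, 1) under h2 (assumed forms).
mu1, mu2 = 1.0, 0.0

def likelihood(e, mu):
    """Gaussian likelihood P(e | h) with unit variance."""
    return math.exp(-0.5 * (e - mu) ** 2) / math.sqrt(2 * math.pi)

beta = 1.5  # arbitrary criterion on the likelihood ratio

random.seed(0)
for _ in range(1000):
    e = random.gauss(mu1, 1.0)
    lr = likelihood(e, mu1) / likelihood(e, mu2)
    choice_lr = lr >= beta                          # rule applied to the LR
    choice_loglr = math.log(lr) >= math.log(beta)   # same rule on the logLR
    # A monotone transform of the DV plus a rescaled criterion
    # yields identical choices on every observation.
    assert choice_lr == choice_loglr
print("LR and logLR rules make identical choices")
```

This is exactly why a neural quantity cannot be identified with the LR itself merely by showing that it predicts choices: any monotone relative of the LR would do the same.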

Sequential analysis. SA is a natural extension to SDT that accommodates multiple pieces of evidence observed over time. SA assumes that the decision has two parts: the usual one between h1 and h2, and another about whether it is time to stop the process and commit (Figure 2). In its most general form, SA allows the procedure for constructing the DV and the decision rule to be adjusted with each new sample of evidence. However, many decisions can be understood by assuming fixed definitions for these elements. A simple DV constructed from multiple, independent pieces of evidence, e1, e2, . . . , en, is the logarithm of the LR (logLR, or "weight of evidence"), which is just the sum of the logLRs associated with each piece of evidence:

logLR12 ≡ log [ P(e1, e2, . . . , en | h1) / P(e1, e2, . . . , en | h2) ] = Σ(i=1..n) log [ P(ei | h1) / P(ei | h2) ].    (1)

A simple stopping rule is to update this DV with new pieces of evidence until reaching a positive or negative criterion (the bounds in Figure 2b).

[Figure 2 omitted. Caption: Sequential analysis. (a) General framework. The decision is based on a sequence of observations. After each acquisition, a DV is calculated from the evidence obtained up to that point; then more evidence can be obtained or the process can be terminated with a commitment to H1 or H2. In principle, both the fi(···)s, which convert the evidence to a DV, and the criteria can be dynamic (e.g., to incorporate the cost of elapsed time). e0 can be interpreted as the evidence bearing on the prior probability of the hypotheses. (b) In random walk models, the DV is a cumulative sum of the evidence. The bounds represent the stopping rule. If e is a logLR, then this process is the SPRT (see The Sequential Probability Ratio Test). When the evidence is sampled from a Gaussian distribution in infinitesimal time steps, the process is termed diffusion with drift μ, or bounded diffusion. (c) In the race model, two or more decision processes represent the accumulated evidence for each alternative. When there are two alternatives and the accumulations are inversely correlated, the race model is nearly identical to a symmetric random walk.]

Together, this DV and stopping rule comprise the sequential probability ratio test (SPRT) (see The Sequential Probability Ratio Test), which is the most efficient test for deciding between two hypotheses on this kind of problem: It achieves a desired error rate with the smallest number of samples, on average (Wald & Wolfowitz 1947). This procedure played a prominent role in allowing Alan Turing and colleagues to break the German Enigma cipher in World War II (Good 1979, 1983). Their success depended on not only deducing the contents of intercepted messages correctly, but also doing so in time for the information to be of strategic use.

SA in numerous guises has been a valuable tool for psychophysical analysis, particularly for studying the trade-off between speed and accuracy (Luce 1986, Smith & Ratcliff 2004). In recruitment or race models, evidence supporting the various alternatives is accumulated independently to fixed thresholds (Audley & Pike 1965, LaBerge 1962, Logan 2002, Reddi et al. 2003, Vickers 1970). In other models that are more closely related to the SPRT, a weight of evidence is accumulated to support one alternative versus another (Busemeyer & Townsend 1993, Diederich 2003, Laming 1968, Link 1992, Link & Heath 1975). These models mirror the mathematical description of a random walk or diffusion process (Ratcliff & Rouder 1998, Ratcliff & Smith 2004, Smith 2000, Smith & Ratcliff 2004): The accumulation of noisy evidence creates a virtual trajectory equivalent to the dancing movements of a tiny particle in Brownian motion (Figure 2).
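The speed-accuracy trade-off these models capture can be seen in a toy simulation of a symmetric random walk to bounds (our sketch, not an analysis from the review; the drift, noise, and bound values are arbitrary): moving the bounds apart buys accuracy at the cost of more samples.

```python
import random

def random_walk_trial(drift, bound, noise=1.0, rng=random):
    """Accumulate noisy momentary evidence until a symmetric bound is crossed.

    Returns (correct, n_steps), where correct means the walk hit the bound
    consistent with the (positive) drift.
    """
    dv, steps = 0.0, 0
    while abs(dv) < bound:
        dv += rng.gauss(drift, noise)  # one sample of momentary evidence
        steps += 1
    return dv >= bound, steps

rng = random.Random(1)
results = {}
for bound in (1.0, 3.0):
    trials = [random_walk_trial(0.2, bound, rng=rng) for _ in range(2000)]
    accuracy = sum(c for c, _ in trials) / len(trials)
    mean_steps = sum(n for _, n in trials) / len(trials)
    results[bound] = (accuracy, mean_steps)
    print(f"bound={bound}: accuracy={accuracy:.2f}, mean steps={mean_steps:.1f}")
```

With the higher bound, the simulated decision maker is slower but more often correct, which is the basic trade-off that bound placement controls in these models.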

SA promises to play an important role in the neurobiology of decision making. First, investigators continue to develop neurobiologically inspired implementations of SA that will help to identify where and how the brain carries out the underlying computations (Lo & Wang 2006, Usher & McClelland 2001, Wang 2002). Second, SA provides a means to distinguish evidence from the DV. Evidence is momentary, whereas the DV evolves in time. Changes to either can affect accuracy or decision times differently and can, in principle, be distinguished in neural recordings (Hanks et al. 2006). Third, SA includes a termination rule. Analogous mechanisms in the brain are required to make decisions and commit to alternatives on a time frame that is not governed by the immediacy of sensory input or motor output, a hallmark of cognition.

THE SEQUENTIAL PROBABILITY RATIO TEST

Consider the following toy problem. Two coins are placed in a bag. They are identical except that one is fair and the other is a trick coin, weighted so that heads appears on 60% of tosses, on average. Suppose one of the coins is drawn from the bag, and we are asked to decide whether it is the trick coin. We can base our decision on a series of any number of tosses. The SPRT works as follows. Each observation (toss) ei is converted to a weight of evidence, the logLR in favor of the trick-coin hypothesis. There are only two possible values of evidence, heads or tails, which give rise to weights (wi):

wi = log [ P(ei = heads | h1: trick coin) / P(ei = heads | h2: fair coin) ] = log(0.6/0.5) = 0.182 if heads

wi = log [ P(ei = tails | h1: trick coin) / P(ei = tails | h2: fair coin) ] = log(0.4/0.5) = −0.223 if tails

According to the SPRT, the decision variable is the running sum (accumulation) of the weights. After the nth toss, the decision variable is

yn = Σ(i=1..n) wi

We apply the following rules:

if yn ≥ log[(1 − α)/α], answer "trick"
if yn ≤ log[β/(1 − β)], answer "fair"
if log[β/(1 − β)] < yn < log[(1 − α)/α], get more evidence

where α is the probability that a fair coin will be misidentified [i.e., a type I error: P(H1 | h2)] and β is the probability that a trick coin will be misidentified [a type II error: P(H2 | h1)]. For example, if α = β = 0.05, then the process stops when |yn| ≥ log(19). The criteria can be viewed as bounds on a random walk. To achieve a lower rate of errors, the bounds must be moved further from zero, thus requiring more samples of evidence, on average, to stop the process.

SPRT: sequential probability ratio test
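The sidebar's coin-toss procedure can be written out directly. The sketch below is our implementation of that description (not the authors' code), using the sidebar's weights and the bounds for α = β = 0.05:

```python
import math
import random

P_TRICK, P_FAIR = 0.6, 0.5  # P(heads) under h1 (trick coin) and h2 (fair coin)
W_HEADS = math.log(P_TRICK / P_FAIR)              # weight of a heads toss, ~ +0.182
W_TAILS = math.log((1 - P_TRICK) / (1 - P_FAIR))  # weight of a tails toss, ~ -0.223
ALPHA = BETA = 0.05
UPPER = math.log((1 - ALPHA) / ALPHA)  # log(19): answer "trick"
LOWER = math.log(BETA / (1 - BETA))    # -log(19): answer "fair"

def sprt(p_heads, rng):
    """Toss the drawn coin until the accumulated logLR crosses a bound."""
    y, n = 0.0, 0
    while LOWER < y < UPPER:
        y += W_HEADS if rng.random() < p_heads else W_TAILS
        n += 1
    return ("trick" if y >= UPPER else "fair"), n

# Simulate many draws in which the coin really is the trick coin;
# the error rate should be at most roughly beta = 0.05.
rng = random.Random(7)
answers = [sprt(P_TRICK, rng)[0] for _ in range(1000)]
print("error rate:", answers.count("fair") / 1000)
```

Because the bounds are crossed with some overshoot on the final toss, the realized error rates are slightly below the nominal α and β, which is characteristic of Wald's test.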

    EXPERIMENTS

Below we summarize key experimental results that shed light on how the brain implements the elements of a decision. We focus first on perceptual decisions and how to distinguish sensory evidence from the DV and the decision rule. We then describe simple motor tasks that appear to engage similar decision mechanisms. Finally we discuss value-based decisions that weigh expectation and preference as opposed to sensory evidence.

    Perceptual Tasks

VTF: vibrotactile frequency

Vibrotactile frequency (VTF) discrimination. Developed in the 1960s by Mountcastle and colleagues, the VTF paradigm requires the subject, typically a monkey, to compare the frequency of vibration of two tactile stimuli, f1 and f2, separated by a time gap (Figure 3a). The range of frequencies used (∼10–50 Hz) does not activate specialized frequency detectors but instead requires the nervous system to extract the intervals or rate of skin depression (Luna et al. 2005, Mountcastle et al. 1990). This information is used to decide whether the frequency is greater in the first or second interval. The monkey communicates its answer by pressing a button with the nonstimulated hand.

[Figure 3 omitted. Caption: Neural correlates of a decision about vibrotactile frequency. (a) Testing paradigm. A test probe delivers a sinusoidal tactile stimulus to the finger at base frequency f1. After a delay period, a comparison stimulus is delivered at frequency f2. Then the monkey must decide whether f2 > f1, a decision it indicates by releasing a key (KU) and pressing a button (PB) with its free hand. (b) Psychometric function. The task is difficult when the base (20 Hz) and comparison frequencies are similar (f2 ≈ f1). (c) Response of a typical S1 neuron. Rasters show spikes from individual trials, grouped by the combination of base and comparison frequencies (left). The neuron responds to the vibration stimulus in both the base and the comparison periods. The firing rate encodes the vibration frequency similarly in the base and comparison periods (brown and purple graphs, respectively), but it does not reflect the monkey's choice (black and white lines and data points in the lower panels). The neuron is uninformative in the interval between base and comparison stimuli. (d) Response of a neuron in ventral premotor cortex. Same conventions as in (c). This neuron carries information about the base frequency during the interstimulus interval (blue). In the comparison epoch (purple), the neuron is more active when f2 < f1. Note that for all trials, the base and comparison frequencies differ by 8 Hz. Adapted with permission from Hernandez et al. (2000), Romo et al. (2004).]

Signals that encode the VTF stimulus have been traced from the periphery into the primary somatosensory cortex (S1). For most S1 neurons, the average firing rate increases monotonically with increasing stimulus frequency. For many S1 neurons, firing rate modulations also follow the periodicity of the stimulus (Figure 3c). However, the average firing rate is thought to represent the evidence used to perform the task. First, behavioral sensitivity more closely matches the discriminability of S1 responses when average rates, rather than periodic modulations, are used. Second, trial-to-trial variations in the rates, but not the periodic modulations, of S1 neurons predict to a slight but significant degree the monkeys' choices (Salinas et al. 2000). This relationship, termed the choice probability (Parker & Newsome 1998), is expected for neurons that provide noisy evidence for the decision, especially if the same trial-to-trial variations are shared by other S1 neurons (Johnson 1980a,b; Kohn & Smith 2005; Shadlen et al. 1996; Zohary et al. 1994). Third, replacing the VTF stimulus in the first and/or second interval with electrical microstimulation of S1, in some cases using aperiodic stimuli that lack regular intervals between activations, elicits nearly the same behavioral responses as does the physical stimulus (Romo et al. 1998, 2000).
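Choice probability, as used in these studies, is the area under the ROC comparing a neuron's responses sorted by the animal's choice, which is equivalent to the probability that a response drawn from one choice group exceeds one drawn from the other (a Mann-Whitney-type statistic). Here is a minimal sketch on simulated firing rates (our illustration; the rate distributions are invented, not data from these experiments):

```python
import random

def choice_probability(rates_choice1, rates_choice2):
    """Area under the ROC: the probability that a random draw from the
    choice-1 trials exceeds one from the choice-2 trials (ties count half)."""
    score = 0.0
    for a in rates_choice1:
        for b in rates_choice2:
            score += 1.0 if a > b else (0.5 if a == b else 0.0)
    return score / (len(rates_choice1) * len(rates_choice2))

rng = random.Random(3)
# Simulated rates: slightly higher on trials where the animal chose h1,
# mimicking a weak but significant choice-related signal.
rates_h1 = [rng.gauss(10.5, 2.0) for _ in range(500)]
rates_h2 = [rng.gauss(10.0, 2.0) for _ in range(500)]
print(round(choice_probability(rates_h1, rates_h2), 2))
```

A value of 0.5 indicates no relationship between the neuron's rate and the choice; the modest values reported for S1 (slightly above 0.5) are consistent with many weakly correlated neurons each contributing noisy evidence.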

The firing rate of S1 neurons thus represents the sensory evidence that underlies the decision. Why the evidence and not the DV? To make a decision, the brain must compare f2 with f1. This comparison cannot occur until f2 is applied, and it must incorporate information about f1 that has been held in working memory. S1 responses reflect the f1 stimulus during the first interval and the f2 stimulus during the second interval (Figure 3c). They therefore do not provide the comparison.

Activity in several brain areas, including the second somatosensory cortex (S2; Romo et al. 2002) and the dorsolateral prefrontal cortex (dlPFC, or Walker area 46; Brody et al. 2003, Romo et al. 1999) but especially the medial and ventral premotor cortices (MPC and VPC), more closely resembles a DV. Many neurons in these areas persist in firing through the delay period between f1 and f2 (Figure 3d, blue insert). Moreover, during the second interval, the activity of some of these neurons reflects a comparison between f2 and f1 (Figure 3d, purple insert). However, identifying the nature of this comparison can be difficult. For the example neuron in Figure 3d, it is unknown whether the activity during the comparison period reflects the difference f2 − f1 or merely the sign of the difference (which is more closely related to the decision outcome than the DV) because in every case |f2 − f1| = 8 Hz. A less direct analysis using choice probability suggests that some neurons in these brain areas might represent the difference between f2 and f1 (see Hernandez et al. 2002, Romo et al. 2004).

This extraordinary body of work provides to date the most complete picture of the diversity of brain areas that contribute to decision formation on even a simple sensory-motor task. For some areas, their role in task performance seems clear. S1 provides the sensory evidence. Primary motor cortex helps to prepare and execute the behavioral response. However, for the remainder of these brain areas (and doubtless others yet to be studied) that lie at intermediate stages between sensory input and motor output, many challenges lie in the way of an equally precise recounting of their roles in decision formation.

One challenge is to understand the apparent redundancy. For example, memory traces of f1 and f1/f2 comparisons are both found in S2, VPC, MPC, and dlPFC. Do the subtle differences in how those computations manifest in the different brain areas indicate subtly different roles in these processes? Or is there simply a continuous flow of information through these circuits, such that each performs a unique role but has continuous access to the computations performed by the other circuits?

    Another challenge lies in linking neural ac-tivity to a particular element of the decisionprocess. For example, in several brain regions,

    544 Gold · Shadlen


  • ANRV314-NE30-21 ARI 19 March 2007 22:1

    but especially in the dlPFC, delay-period activity resembles a working memory trace of the evidence provided by f1. An alternative explanation is that the delay-period activity represents the DV: a prediction of the decision, given f1. Specifically, if f1 is low, the answer is (or might seem) likely to be f2 > f1. If f1 is high, f2 < f1 might seem more likely. This DV would later be updated by the f2 evidence. Consistent with this idea, manipulating properties of the f1 stimulus can bias choices under some conditions (Luna et al. 2005). Contrary to this idea, dlPFC delay-period activity reflects f1 when using a stimulus set in which the value of f1 cannot be used to predict f2 (Romo et al. 1999). However, little is known about the task-related expectations that might be represented during the delay period.

    Random-dot motion (RDM) direction discrimination. Similar to the VTF paradigm, the RDM paradigm (Figures 4 and 5) was developed to study the relationship between sensory encoding and perception. Unlike the VTF task, the RDM task requires a single stimulus presentation and thus eliminates the need for working memory. The monkey decides between two possible (opposite) directions of motion that are known in advance. Task difficulty is controlled by varying the percentage of coherently moving dots. The direction decision is typically indicated with an eye movement.

    The evidence used to form the direction decision has been traced to neurons in the middle temporal area (MT/V5) tuned for the direction of visual motion. SDT analyses of the strength and variability of MT responses provided a foundation for understanding behavioral accuracy (Britten et al. 1992, 1993; Shadlen et al. 1996). Choice probabilities indicated that individual MT neurons weakly but significantly predict the monkey's direction decisions, including errors (Britten et al. 1996). Lesion and microstimulation studies, exploiting the systematic organization of MT with respect to motion location and direction, further established a causal link for MT activity and task performance (Ditterich et al. 2003; Newsome & Paré 1988; Salzman et al. 1990, 1992).

    RDM: random-dot motion

    Two aspects of the task facilitate the study of the neural mechanisms of decision formation. The first aspect is that the time needed to make the decision is particularly long for perceptual tasks, typically many 100s of ms. Thus researchers have characterized neural correlates of the decision process as it unfolds in time. For a version of the RDM task in which motion viewing time (t) is controlled by the experimenter, performance improves as roughly √t (Figure 4b), the relation expected if at each successive moment the brain acquired (i.e., integrated) an independent sample of noisy evidence. Performance on other perceptual tasks, including even a version of the RDM task, can show little or no improvement with prolonged viewing duration (Ludwig et al. 2005, Uchida et al. 2006, Uka & DeAngelis 2003, Watson 1986). In this sense the RDM task may be less representative of perception than of cognitive decision making, which can involve multiple sources of evidence acquired over a flexible time scale.
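The √t prediction follows directly from accumulation of independent samples: after n samples, the mean of the accumulated evidence grows as n while its standard deviation grows only as √n, so the coherence needed to reach a fixed level of discriminability falls as 1/√n. A worked sketch in arbitrary units:

```python
import numpy as np

# Each moment contributes an independent evidence sample with mean
# k*C (C: coherence) and standard deviation sigma. After n samples the
# accumulated mean is n*k*C and the standard deviation sigma*sqrt(n),
# so discriminability grows as sqrt(n) and the threshold coherence
# falls as 1/sqrt(n).
k, sigma, d_target = 1.0, 1.0, 1.0  # arbitrary illustrative units

def threshold_coherence(n):
    """Coherence at which accumulated evidence reaches d_target."""
    return d_target * sigma / (k * np.sqrt(n))

durations = np.array([100.0, 200.0, 400.0, 800.0])  # ms, one sample/ms
thresholds = threshold_coherence(durations)
slope = np.polyfit(np.log(durations), np.log(thresholds), 1)[0]
# The predicted log-log slope is exactly -1/2; the fit to the data in
# Figure 4b gives about -0.46.
```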

    The second benefit of the RDM task is the imposed link between the direction decision and a particular course of action, the eye-movement response. This link enables investigators to treat the decision as a problem of movement selection. Thus, the search for the DV has focused on parts of the brain involved in the selection and preparation of eye movements, including the lateral intraparietal area (LIP), superior colliculus (SC), frontal eye field (FEF), and dlPFC (Horwitz & Newsome 1999, 2001; Kim & Shadlen 1999; Shadlen & Newsome 1996, 2001).

    In one experiment (Figure 4), motion viewing was interrupted at a random time during decision formation by turning off the RDM stimulus and applying a brief electrical current to the frontal eye field (FEF) (Gold & Shadlen 2000, 2003). The microstimulation caused a short-latency saccade whose amplitude and direction were determined by the

    www.annualreviews.org • Decision Making 545


    Figure 4. Representation of an evolving DV by the motor system. (a) Interrupted direction discrimination task. The monkey decides the net direction of motion, here shown as up versus down. Task difficulty is governed by the fraction of dots that move coherently from one movie frame to the next (% coherence). The motion viewing is interrupted prematurely, and on a fraction of trials, a brief current is applied to the FEF to evoke a saccade. The monkey makes a second, voluntary movement to a choice target to indicate his decision. (b) Decision accuracy improves as a function of motion-viewing duration. Psychophysical threshold is defined as the motion coherence supporting 82% correct. Threshold falls by √t (slope of line fit on log-log plot = −0.46; 95% CI: −0.59 to −0.33). These data are from trials in which no stimulation occurred. Similar data were obtained on stimulated trials. (c) Examples of eye movement trajectories. Fixation point is at the origin. The two larger circles are the choice targets. The random-dot stimulus (not shown) was centered on the fixation point. The symbols mark eye position in 2-ms steps. FEF stimulation during fixation, in the absence of motion and choice targets, elicited a rightward saccade (trace marked "Fix"). Stimulation while viewing upward and downward motion induced saccades that deviated in the direction of the subsequent, voluntary eye movements. (d) The average amount of deviation depends on motion strength and viewing time. The amount of deviation toward the chosen target was estimated using the evoked saccades from 32 stimulation sites (14,972 trials). This result shows that the oculomotor system is privy to information about the evolving decision, not just the final outcome of the decision process. Adapted from Gold & Shadlen (2000, 2003) with permission.




    site of stimulation, a defining feature of the FEF (Bruce et al. 1985, Robinson & Fuchs 1969). This evoked saccade tended to deviate in the direction governed by the eye movement associated with the monkey's ultimate choice. The amount of deviation, even when measured early in the decision process, paralleled the evolution of a DV that explained accuracy as a function of motion strength and viewing time. This result is inconsistent with the notion that a central decision maker completes its operation before activating the motor structures to perform the necessary action. Instead, it implies that, at least under some conditions, information flow from sensory neurons to motor structures is more or less continuous (Spivey et al. 2005).

    To measure the correspondence more precisely in time of neural activity with elements of the decision process, a reaction-time (RT) version of the RDM task was developed. This task allowed the monkey to indicate its choice as soon as a commitment to one of the alternatives is reached. Figure 5b shows examples of choice and RT functions for a monkey. Mean RT increases as task difficulty increases. For easy stimuli, RT varies with stimulus strength even when choice accuracy is perfect. It ultimately approaches an asymptote that represents time that is not used on the decision per se. This nondecision time (328 ms in this data set) includes visual and motor latencies and possibly other processing stages that are less understood (see below). For difficult stimuli, the nondecision time is relatively short compared with the RT, implying long decision times. In contrast, this nondecision time often takes up the lion's share of RT for simpler tasks, an important caveat for interpreting many RT studies.
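This qualitative pattern — mean RT rising with difficulty toward an asymptote set by the nondecision time — falls out of a bounded-accumulation (drift-diffusion) model. A minimal simulation; the sensitivity, bound, noise, and nondecision-time parameters below are illustrative, not fits to the data discussed here:

```python
import numpy as np

def ddm_trial(coherence, rng, k=0.3, bound=30.0, sigma=1.0,
              t_nd=300.0, dt=1.0, t_max=5000.0):
    """One trial of bounded accumulation: drift k*coherence, unit-time
    noise sigma, absorbing bounds at +/-bound, plus a fixed
    nondecision time t_nd (ms). All parameter values are illustrative."""
    dv, t = 0.0, 0.0
    while abs(dv) < bound and t < t_max:
        dv += k * coherence * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return dv > 0, t + t_nd

rng = np.random.default_rng(1)
mean_rt = {}
for coherence in (0.032, 0.512):
    rts = [ddm_trial(coherence, rng)[1] for _ in range(200)]
    mean_rt[coherence] = float(np.mean(rts))
# Low coherence -> long mean RT; high coherence -> mean RT approaching
# the nondecision time plus a short decision time.
```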

    Studies of neural mechanisms underlying the decision process on the RT task have focused on area LIP. LIP is anatomically positioned midway through the sensory-motor chain, with inputs from MT and MST and outputs to the FEF and SC (Andersen et al. 1990, 1992; Asanuma et al. 1985; Blatt et al. 1990; Fries 1984; Lewis & Van Essen 2000a; Lynch et al. 1985; Paré & Wurtz 1997). This area has been implicated in other high-order processes involved in the selection of saccade targets, including working memory, allocation of attention, behavioral intention, spatial inference, and representation of bias, reward, expected value, and elapsed time (Assad & Maunsell 1995, Chafee & Goldman-Rakic 2000, Dorris & Glimcher 2004, Eskandar & Assad 1999, Friedman & Goldman-Rakic 1994, Janssen & Shadlen 2005, Leon & Shadlen 2003, Platt & Glimcher 1999, Sugrue et al. 2004). Moreover, neural activity in LIP—particularly in its ventral subdivision, termed LIPv (Blatt et al. 1990, Lewis & Van Essen 2000b)—reflects decision formation on a fixed-duration version of the RDM task (Shadlen & Newsome 1996, 2001).

    RT: reaction time
    MST: medial superior temporal area

    Figure 5c,d illustrates the responses of LIP neurons during the RT paradigm. In these experiments, one of the choice targets (Tin) is in the response field (RF) of the LIP neuron; the other target (Tout) and the RDM stimulus lie outside the neuron's RF (Figure 5a). Thus, these neurons are studied under conditions in which they are apt to signal the monkey's choice (via the associated action). What is more interesting is that their activity reflects the decision process that leads to that choice.

    Aligning the responses to stimulus onset (t = 0 on the left side of Figure 5c) provides a glimpse into the brain's activity in the epoch when the animal is forming the decision but has yet to commit overtly to a choice. Initially, there is a brief dip in the firing rate followed by a rise in activity that is independent of the direction and strength of motion or the monkey's ultimate choice. Then, after ∼220 ms, the average response begins to reveal differences in the evidence and outcome of the decision. On trials that end in a Tin choice, the firing rate rises like a ramp, on average. On trials that end in a Tout choice, the firing rate meanders or tends to decline. This dependence on choice is evident even when the stimulus is ambiguous (0% coherence).



    Aligning the responses to saccade initiation (Figure 5c, right) reveals a correlate of commitment: a threshold rate of firing before Tin choices. When separated by motion strength, the curves overlap considerably just prior to the saccade and thus make it impossible to identify a single point of convergence because each motion strength leads to a broad distribution of RTs. When these same responses are grouped by RT instead of motion strength, they achieve a common level of activity ∼70 ms before saccade initiation (arrow in Figure 5d). Thus the decision process appears to terminate when the neurons associated with the chosen target reach a critical firing rate. When the monkey chooses Tout, another set of neurons—the ones with the chosen target in their RFs—determines the termination of the decision process.

    [Figure 5, panels a–d: (a) RT task schematic (fixation, targets, motion, saccade; the RF of the recorded neuron contains one choice target). (b) Behavior: percent correct and mean RT (ms) versus motion strength (% coherence). (c) LIP firing rates (sp s–1) aligned to motion onset and to the eye movement, grouped by motion strength (0, 12.8, 51.2%) and by choice (select Tin versus select Tout), with MT responses inset. (d) LIP firing rates aligned to the saccade, grouped by RT (400–900 ms).]




    The pattern of LIP activity matches predictions of the diffusion model (Figure 2b). The coherence-dependent rise appears to reflect an accumulation of noisy evidence. This evidence comes from a difference in activity of pools of MT neurons with opposite direction preferences (Figure 5c, shaded insert), which is thought to approximate the associated logLR (Gold & Shadlen 2001). However, some caveats should be noted. The dashed curves of Figure 5c (right) do not end in a common lower bound and instead look like the average of paths of evidence for, say, right when the left choice neurons win in a race model (Mazurek et al. 2003). Also, the DV is represented only from ∼220 ms after motion onset until ∼70–80 ms before saccade initiation. Reassuringly, these times taken together nearly match the nondecision time from fits of the diffusion model to the choice-accuracy behavioral data. Whereas the initial ∼220-ms latency is long in comparison with latencies of neural responses in MT (which show direction selectivity ∼100 ms after onset of a RDM stimulus; Figure 5c, insert) and LIP (which can indicate the presence of a target in


    Figure 6. Effects of microstimulation in MT and LIP. In both areas microstimulation causes a change in both choice and RT. The schematic shows the consequences of adding a small change in spike rate to the evidence or to the DV. The graphs on the right are theoretical results obtained using the bounded diffusion model. They resemble the pattern of data in Hanks et al. (2006). (a) MT microstimulation mimics a change in stimulus strength (evidence). (b) LIP microstimulation mimics an additive offset to the DV (or, equivalently, the height of the bounds). [Panel schematics: momentary evidence in MT (rightward minus leftward pools, R − L) drives a DV in LIP that diffuses between bounds for right and left choices. Stimulating rightward MT neurons adds to the mean of the evidence, which accumulates over the trial: a relatively large effect on choice and RT, equivalent to added rightward motion. Stimulating right-choice LIP neurons adds a constant to the DV: a small effect on choice and a modest effect on RT, not equivalent to added rightward motion.]
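The distinction drawn in Figure 6 can be made concrete with the standard first-passage solution for a diffusion between two absorbing bounds: a perturbation of the momentary evidence changes the drift and is therefore integrated over the whole trial, whereas a perturbation of the DV is a one-time offset of the starting point. A sketch with made-up sensitivity and offset values:

```python
import numpy as np

def p_rightward(mu, x0=0.0, bound=1.0, sigma=1.0):
    """Probability that a diffusion with drift mu, started at x0,
    hits +bound before -bound (standard first-passage result)."""
    if abs(mu) < 1e-12:
        return (x0 + bound) / (2 * bound)
    s = 2 * mu / sigma**2
    return (1 - np.exp(-s * (x0 + bound))) / (1 - np.exp(-s * 2 * bound))

k = 5.0                           # illustrative drift per unit coherence
coh = np.linspace(-0.5, 0.5, 11)  # signed motion strength

control  = np.array([p_rightward(k * c) for c in coh])
mt_stim  = np.array([p_rightward(k * (c + 0.1)) for c in coh])  # adds to the evidence
lip_stim = np.array([p_rightward(k * c, x0=0.1) for c in coh])  # offsets the DV
```

In this parameterization the MT-style perturbation is exactly a horizontal shift of the psychometric function (a fixed equivalent coherence at every motion strength), while the LIP-style offset is not equivalent to any added motion.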




    perturbations in the background of the RDM display (Huk & Shadlen 2005). Each perturbation was a subtle boost or decrement in motion energy lasting just 100 ms. The effects on RT and choices were consistent with a process of integration. The motion pulses affected choices that occurred up to 800 ms after the pulse, and they affected RTs through a sustained effect on the DV (like MT microstimulation). LIP neurons also registered these brief motion perturbations with long-lasting changes in firing rate consistent with a process of integration.

    We do not know how or where this integral is computed. It might be computed in LIP itself, or it might be computed elsewhere and merely reflected in LIP. A model of LIP can achieve integration by mixing feedback excitation—using N-methyl-D-aspartate (NMDA) channels with relatively long conductance times—within neuronal pools that share a common RF with an inhibitory antagonism between pools representing opposite directions (Wang 2002). This hybrid of biophysical and large-scale neural modeling was originally designed to simulate working memory. That the model can also form a DV in a manner consistent with many aspects of the physiological results, including deviations from perfect integration (Wong et al. 2005, Wong & Wang 2006), suggests that this kind of persistent activity might serve multiple roles in the brain. Indeed, the question of how neurons or neural circuits can integrate is not limited to the study of decision making and working memory but extends to motor control and navigation as well (Major & Tank 2004). Progress in this area is likely to shed light on cognitive functions that operate on time scales longer than biophysical and signaling time constants in single cells.
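As a caricature of how recurrent circuitry can integrate, consider two rate pools in which self-excitation is tuned to cancel the intrinsic leak and the pools inhibit each other. This linear, rectified sketch omits the NMDA biophysics and attractor dynamics of Wang (2002) entirely, and all of its parameters are arbitrary:

```python
def two_pool_integrator(i1, i2, tau=100.0, w_self=1.0, w_inh=0.5,
                        dt=1.0, steps=1000):
    """Two rate pools with self-excitation (w_self) and mutual
    inhibition (w_inh); rates are rectified at zero. With w_self = 1
    the self-excitation cancels the leak, so the circuit integrates
    the difference between its inputs until one pool is silenced,
    after which the winner ramps linearly."""
    r1 = r2 = 0.0
    for _ in range(steps):
        d1 = (-r1 + w_self * r1 - w_inh * r2 + i1) / tau
        d2 = (-r2 + w_self * r2 - w_inh * r1 + i2) / tau
        r1 = max(0.0, r1 + d1 * dt)
        r2 = max(0.0, r2 + d2 * dt)
    return r1, r2

# Slightly stronger input to pool 1 (e.g., slightly more rightward
# motion evidence): pool 1 wins and pool 2 is suppressed to zero.
winner, loser = two_pool_integrator(1.0, 0.8)
```

Balanced inputs leave the two pools balanced, and longer stimulation produces a larger accumulated response, the signatures of integration the paragraph above describes.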

    We also know little about how the criterion is applied to the DV. Neurons that achieve a threshold level of activity in anticipation of a saccadic eye movement on RT tasks have been found not just in LIP but also in FEF and SC (Hanes & Schall 1996, Ratcliff et al. 2003). However, how this criterion is set and what happens when it is reached are unknown. Because the criterion controls the trade-off between speed and accuracy (Palmer et al. 2005), parts of the basal ganglia sensitive to both reward and movement onset have been suggested as possible substrates (Lo & Wang 2006). An alternative possibility is that a single neural circuit can represent the DV and its conversion to a binary choice, which would suggest that the criterion is an intrinsic property of LIP, FEF, or SC (Machens et al. 2005, Wang 2002, Wong & Wang 2006).
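The criterion's control of the speed-accuracy trade-off is explicit in the standard drift-diffusion expressions for accuracy and mean decision time (the forms used, e.g., by Palmer et al. 2005). With drift μ, noise σ, and bounds at ±A:

```python
import numpy as np

def accuracy(mu, A, sigma=1.0):
    """P(correct) for drift-diffusion with drift mu and bounds +/-A:
    1 / (1 + exp(-2*mu*A/sigma^2))."""
    return 1.0 / (1.0 + np.exp(-2 * mu * A / sigma**2))

def mean_decision_time(mu, A, sigma=1.0):
    """Expected time to reach either bound: (A/mu) * tanh(mu*A/sigma^2)."""
    return (A / mu) * np.tanh(mu * A / sigma**2)

mu = 0.2  # illustrative evidence quality
accs = [accuracy(mu, A) for A in (0.5, 1.0, 2.0)]
times = [mean_decision_time(mu, A) for A in (0.5, 1.0, 2.0)]
# Raising the bound (criterion) buys accuracy at the cost of time:
# the criterion sets the operating point on the trade-off.
```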

    Heading discrimination. Optic flow is the pattern of motion that occurs when we move through the environment (Gibson 1950). A natural candidate for the momentary evidence used to infer heading direction from optic flow is in the medial superior temporal cortex (area MST). MST neurons are tuned for expansion, rotation, translation, and other large-field motion patterns that comprise optic flow (Duffy & Wurtz 1991, 1995, 1997; Graziano et al. 1994; Lagae et al. 1994; Saito et al. 1986; Tanaka et al. 1986; Tanaka & Saito 1989). Indeed, for a one-interval heading direction-discrimination task using a RDM flow field, signal-to-noise measurements of MST activity correlate with behavioral sensitivity in a manner similar to results from MT using the RDM direction task (Britten & van Wezel 2002, Heuer & Britten 2004).

    However, microstimulation experiments have provided only weak evidence for a causal role of MST neurons for the heading decision (Britten & van Wezel 1998). Microstimulation can cause the monkey to bias choices for one heading direction, but often it is in the direction opposite the preferred heading of the stimulated neurons. This might be explained by a lack of a clustered organization of neurons tuned to similar optic flow patterns (Britten 1998). A more tantalizing explanation comes from the fact that many MST neurons receive visual and vestibular inputs that both contribute to heading sensitivity (Gu et al. 2006) but, for about half these neurons, have opposite direction preferences. Thus,



    microstimulation intended to bias judgments in a particular direction using visual cues might instead activate a local circuit dominated more by vestibular tuning in the opposite direction. Consistent with this idea, choice probabilities (the weak correlations between the variable discharge of MST neurons and the trial-to-trial variations in the monkey's decisions) also appear to depend on the vestibular tuning of the neuron (DeAngelis et al. 2006).

    We do not know where the DV is represented for this task. One possibility is MST itself, which contains many neurons with spiking activity that persists even in the absence of a stimulus. However, the lack of build-up activity as the decision is formed and the weak choice probabilities support the idea that MST represents only the momentary evidence (Heuer & Britten 2004). Another possibility is the ventral intraparietal area (VIP), which receives direct input from MT and MST and contains many neurons with response properties similar to MST on optic flow and heading tasks (Bremmer et al. 2002a,b; Zhang et al. 2004; Zhang & Britten 2004). However, as in MST, it is difficult to tell if VIP represents a DV or the momentary evidence used by other brain structures to decide whether, say, an object is nearing the head and needs to be avoided (Colby et al. 1993, Duhamel et al. 1998, Graziano et al. 1997). In our view, the DV for the heading task is likely to be represented in structures that provide high-level control of the behavioral (eye movement) response; e.g., area LIP. If the monkey were trained to reach for buttons, likely candidates would be parietal and prefrontal cortical areas that provide analogous control of reaching movements.

    Disparity discrimination. MT neurons are selective for not only motion direction but also the binocular disparity of images presented to the two eyes, a cue for depth (Maunsell & Van Essen 1983, Uka & DeAngelis 2006). Recent studies using a one-interval depth-discrimination task with a RDM stimulus viewed stereoscopically showed that the sensitivity of MT neurons to noisy perturbations in disparity rivaled the behavioral sensitivity of the monkey (Uka & DeAngelis 2003). Electrical microstimulation experiments, aided by the clustered organization of MT with respect to disparity, further established their causal role in providing evidence for decisions about depth (DeAngelis et al. 1998, 1999; Krug et al. 2004).

    Can MT neurons that provide evidence about depth provide, in a different context, evidence about direction? To test this, monkeys were trained on a RDM direction-discrimination task in which the dots could be presented with task-irrelevant disparities. Electrical microstimulation at sites with weak disparity tuning tended to have the largest effects on performance. Microstimulation at sites with strong disparity tuning had weaker effects on performance, even when those sites were strongly tuned for direction (DeAngelis & Newsome 2004). The results suggest that the DV tended to discount MT neurons whose responses were sensitive to disparity, possibly because that variable was irrelevant to the direction task. This kind of context-dependent read-out implies a critical role for learning in establishing the flow of information from neurons that represent the evidence to neurons that form the DV (Freedman & Assad 2006, Law & Gold 2005).

    Less is known about the role of MT neurons in decisions that require both depth and motion information. Nevertheless, a task requiring a decision about the direction of rotation of a transparent, sparsely textured cylinder has provided a striking result for MT choice probabilities. An observer viewing such a cylinder sees the texture (e.g., dots) on the front and back surfaces move in opposite directions. The opposing motion gives rise to a distinct pattern of firing rates among direction-selective MT neurons that can be unambiguously associated with transparency. However, if not for the depth cues, the cylinder can appear to rotate in a clockwise or counterclockwise direction depending



    on whether the two motion directions are associated with the front and back or vice versa (Ullman 1979).

    MT neurons that are both disparity and direction selective furnish evidence to remove this ambiguity (Bradley et al. 1998). For example, an MT neuron preferring near disparities and rightward motion responds best when the nearer plane is moving to the right. Such a neuron provides positive evidence for the interpretation that a vertical cylinder is rotating in accordance with a right-hand rule (front surface to the right) (Dodd et al. 2001). A strong trial-to-trial correlation exists between the variable discharge of these neurons and the monkeys' rotation judgments (Krug et al. 2004). In fact, this choice probability is larger than for any of the decision tasks reviewed in this article, leading us to suspect that it is a sign of feedback from elements in the brain that have rendered a decision, as opposed to feedforward variations of noisy evidence that underlie difficult decisions near psychophysical threshold.

    Face/object discrimination. Neurons selective for images of faces and other complex objects are found in the ventral stream "what" or "vision for perception" pathway (Goodale & Milner 1992, Ungerleider & Mishkin 1982). Recent studies have begun to examine how these neurons contribute to perceptual decisions by comparing brain activity and behavior (Allred et al. 2005, Baylis et al. 2003, Dolan et al. 1997, Freedman et al. 2002, 2003; Grill-Spector et al. 2000, Op de Beeck et al. 2001, Rainer et al. 2004). In one study, monkeys performed a one-interval discrimination of face versus nonface images masked by white noise (Afraz et al. 2006). Electrical microstimulation applied during stimulus viewing to clusters of neurons in the inferotemporal (IT) cortex that showed a preference for faces biased decisions toward face versus nonface. The magnitude of the bias was comparable to that found using MT microstimulation on direction and disparity tasks, equivalent to a change in stimulus strength on the order of psychophysical threshold. This finding provided the first direct evidence that face-selective IT neurons play a causal role in the perception of faces. It suggests that IT activity represents the evidence used to solve the task but does not rule out the possibility that IT represents a DV or even the outcome of the perceptual categorization (Sheinberg & Logothetis 1997, 2001).

    In a related study, measurements of blood oxygen level differences (BOLD) in fMRI were used to identify correlates of a DV that reads out object categorization evidence from IT (Heekeren et al. 2004). Human subjects were trained to perform a one-interval discrimination between faces and houses masked by noise. The two sets of stimuli were used because distinct regions of IT are activated for unmasked images from the two categories (Haxby et al. 1994, Kanwisher et al. 1997). A candidate DV was found in the dlPFC. Activity in this area was strongest when the sensory evidence was strongest and tended to covary with the magnitude of the difference in the BOLD signal measured on single trials in the "face" and "house" areas. Such fMRI experiments provide an essential link between monkey neurobiology and human brain function, although they lack the spatial and temporal resolution to characterize fully the neuronal dynamics that distinguish evidence from the DV.

    Electroencephalography (EEG), which is not burdened by the same temporal resolution problem, has been measured in human subjects for similar one-interval categorization tasks requiring a discrimination between pictures of faces and pictures of cars (Philiastides et al. 2006, Philiastides & Sajda 2006, VanRullen & Thorpe 2001). Two signal kernels could best differentiate the car and face stimuli on single trials. An early potential that appeared ∼170 ms after stimulus onset was selective for faces and only weakly predictive of errors, a possible correlate of sensory evidence. A later potential appearing ∼300 ms after stimulus onset appeared to reflect the difficulty of the decision linking the sensory




    evidence to the subject's choice, a quality of the DV. However, localization of the signals is hampered by the limitations of EEG recording and the powerful technique used in this study to combine signal across electrodes.

    Olfactory discrimination. To date, there have been few experiments on the neurobiology of decisions in animals beside the monkey. However, the availability of techniques to study molecular and cellular mechanisms in rodents hints at tremendous possibilities if they could be trained on decision-making tasks. Several recent studies have begun to make progress by using tasks that exploit two of their natural strengths: foraging behavior and olfactory processing.

    In one study, rats were trained to discriminate between a pair of odors using a one-interval task with mixtures of the two odors (Uchida & Mainen 2003). The patterns of activation in the olfactory bulb were distinctive when the stimulus was behaviorally discriminable and less distinctive otherwise, a first step toward identifying signals comprising the sensory evidence. However, the rats appeared to form their decisions in a single sniff and did not benefit from more time (Kepecs et al. 2006, Uchida et al. 2006). Such a short time scale clouds the distinction between evidence and DV and thus exposes possible limitations of this model for the study of decision making.

    However, it seems unlikely that rats cannot accrue evidence in time. A recent study demonstrated a clear trade-off between speed and accuracy in a similar odor-discrimination task (see also Abraham et al. 2004, Rinberg et al. 2006). A window of integration of ∼400 ms was identified, which rivals the integration times for monkeys performing the RDM direction-discrimination task. Combined with the recent discovery of the olfactory receptor genes (Buck 1996, Buck & Axel 1991) and subsequent progress in understanding the stable, topographic organization of the olfactory bulb (e.g., Rubin & Katz 1999, Meister & Bonhoeffer 2001, Mombaerts et al. 1996), these results suggest a promising future for combined psychophysical and physiological experiments on olfactory decision making in rodents.

    Detection. Detection experiments are the archetypical SDT paradigm, yet they present serious challenges to neurobiologists trying to understand the underlying decision process. According to SDT, detection begins with a sample of evidence generated by either a signal plus noise or noise alone. The DV based on this sample is monotonically related to the LR in favor of h1: "S present" and against h2: "S absent". The decision is H1: "Yes" if the DV exceeds a criterion, which is set to achieve some desired goal, and H2: "No" otherwise. This procedure seems straightforward but glosses over an important question: When should the DV incorporate evidence if there is temporal uncertainty about when the stimulus will appear? Obtaining samples at the wrong time might miss the signal [i.e., high P(H2 | h1)]. In contrast, accumulating samples over time will accumulate noise alone in all epochs that do not contain the signal, causing more misses in signal-present conditions and more false alarms [P(H1 | h2)] in signal-absent conditions. The consequence of these errors will be a loss of sensitivity (Lasley & Cohn 1981).
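The SDT scheme described here is easy to state quantitatively. A sketch assuming unit-variance Gaussian distributions for noise alone and signal plus noise, separated by d′ (the numbers are illustrative):

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Noise alone ~ N(0, 1); signal plus noise ~ N(d', 1). The decision is
# H1 ("Yes") when the sample exceeds the criterion c.
d_prime, c = 1.5, 0.75  # illustrative; c = d'/2 is a neutral criterion

p_hit  = 1.0 - Phi(c - d_prime)  # P(H1 | h1)
p_fa   = 1.0 - Phi(c)            # P(H1 | h2)
p_miss = 1.0 - p_hit             # P(H2 | h1)

# Integrating over n epochs when the signal occupies only one dilutes
# sensitivity: the signal's mean contribution is unchanged while the
# noise standard deviation grows as sqrt(n).
n_epochs = 4
d_effective = d_prime / sqrt(n_epochs)
```

The last two lines express the sensitivity loss the paragraph describes: blindly accumulating over four epochs when the signal occupies one halves the effective d′.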

    This problem has several possible solutions. One is to use an integrator that leaks, causing irrelevant information to affect the decision process for only a limited amount of time (Smith 1995, 1998). Other approaches include taking a time derivative of the evidence to identify changes or using knowledge of the spatial and temporal structure of the stimulus to guide a more directed search for the evidence. Some studies in the psychophysical literature hint of such mechanisms (e.g., Henning et al. 1975, Nachmias & Rogowitz 1983, Verghese et al. 1999, Schrater et al. 2000), but to date there is little understanding of their general role in detection.
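A leaky integrator of this kind weights evidence by recency: contributions decay exponentially with time constant τ, so samples older than a few τ are effectively forgotten. A minimal sketch with arbitrary pulse timing and τ:

```python
import numpy as np

def leaky_integrate(x, tau=200.0, dt=1.0):
    """Leaky accumulator dy/dt = (-y + x(t)) / tau: inputs are
    forgotten exponentially with time constant tau."""
    y = np.zeros(len(x))
    for t in range(1, len(x)):
        y[t] = y[t - 1] + (-y[t - 1] + x[t]) * dt / tau
    return y

T = 2000  # ms of "viewing"
early = np.zeros(T)
early[100:200] = 1.0    # a pulse long before the read-out time
late = np.zeros(T)
late[1800:1900] = 1.0   # the same pulse just before the read-out time

y_early = leaky_integrate(early)[-1]
y_late = leaky_integrate(late)[-1]
# Only the recent pulse still influences the accumulator at the end,
# so noise from epochs far from the signal cannot corrupt the DV.
```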

Despite these challenges, several recent studies have begun to shed light on detection mechanisms. In one study, monkeys

    554 Gold · Shadlen


Figure 7. Motion detection. (a) Detection task. The monkey views a RDM stimulus without any net direction of motion and must release a bar when the motion becomes coherent. Task difficulty is controlled by the intensity of the motion step (% coherence). (b) Probability of deciding Yes plotted as a function of stimulus intensity. (c, d) Average firing rates from neurons in MT and VIP. The responses are aligned to onset of the motion step and grouped by RT. Yes decisions are predicted by a rise in firing rate in both areas. The horizontal dashed line is a criterion derived to match the animal's performance. The missed detections (motion step, but decision = No) are explained by a failure of the firing rate to reach this level. According to SDT, the false alarms (no motion step, but decision = Yes) should reach this level, but with responses aligned to the lever release at 750 ms, they do not; a possible explanation is a variable relationship between the end of the decision process and the time of the lever release. Adapted with permission from Cook & Maunsell (2002a,b).

were trained to detect the onset of partially coherent motion in a RDM display (Figure 7) (Cook & Maunsell 2002b). Activity in both MT and VIP was correlated with trial-by-trial detection performance and RT. However, single-unit responses were far less reliable detectors of the stimulus than were the monkey subjects. This finding contrasts with results from discrimination experiments and seems likely to result, in part, from the temporal-uncertainty problem described above. Nevertheless, population analyses using a leaky accumulator model provide some insight into the decision process. For MT, population responses on hit trials deviated from baseline at a time that was closely coupled with motion onset and then rose steadily with a rate of rise that was correlated with RT. Responses


on miss trials were similar but failed to attain the same level of activation as hits. In contrast, responses on false alarm trials deviated little from baseline. These results suggest that MT provides a preliminary form of evidence. In contrast, VIP population responses were less coupled to motion onset and were more stereotypical with respect to the behavioral response, suggesting they represent either the DV or simply the outcome of the decision process.

Similar detection experiments were conducted using the VTF stimulus (Figure 8) (de Lafuente & Romo 2005). As for the discrimination task, S1 activity appears to represent the sensory evidence and not the DV or choice. Increasingly intense stimuli (deeper skin depressions) lead to high firing rates that are more easily differentiated from neural activity when no stimulus is present (Figure 8c,d). Moreover, trial-to-trial variations in the responses to a weak stimulus are correlated, albeit weakly, with the monkey's yes and no decisions. This weak choice probability is similar in magnitude to that found for MT neurons on the RDM direction-discrimination task (Britten et al. 1996).

In contrast to S1, MPC responses are modulated not only by stimulus intensity but also by the monkey's behavioral report (Figure 8e).

These results suggest that MPC represents the DV that converts the evidence into a choice. However, inspection of Figure 8e suggests that different stimulus intensities cause primarily differences in latency, the expected behavior of a neuron that simply responds stereotypically after the decision has ended. Nevertheless, even if MPC neurons represent the decision outcome and not the DV, they do not do so in a trivial manner. The responses are not simply associated with the button press. Moreover, electrical microstimulation of MPC without presentation of the tactile stimulus leads to similar perceptual reports. These results have been used to support the bold (and currently unverifiable) assertion that MPC responses provide a neural correlate of the subjective perception of the tactile stimulus (see also Ress et al. 2000, Ress & Heeger 2003 for a similar assertion about activity in early visual cortex, measured using fMRI, on a contrast-detection task).

Simple Motor Latencies: Deciding When to Initiate an Action

In this section, we consider easy decisions that do not tax perceptual processing but instead simply trigger a movement. Commonly used

Figure 8. Vibrotactile detection. (a) Task. The VTF probe is placed on the finger pad. After a random delay, a 20-Hz sinusoidal indentation is applied on half the trials. The monkey must decide whether the stimulus was present. Detection is indicated by removing a finger of the other hand from a key and pushing a button. (PD, probe down; KD, key down; PU, probe up; MT, movement to the response button). (b) Psychometric function. The monkey is more likely to decide Yes at larger vibration amplitudes. The false alarm rate is ∼8%. (c) Response of a typical S1 neuron. Red markings indicate errors. The top half of the raster shows trials in which vibration was applied. Red marks indicate missed detections. The bottom half of the raster shows trials in which no stimulus was shown. Red marks indicate false alarms. (d) Distributions of firing rates in S1 during the stimulus period. Each curve represents a frequency distribution associated with vibration at one intensity (indicated by color). The vertical line shows a possible criterion that the brain could apply to decide yes or no. (e) Response of a typical MPC neuron. Same conventions as in panel c. The neuron responds strongly when the monkey reports Yes even when there is no stimulus present (false alarms). Notice that on some trials in the top half of the graph, the neuron seems to indicate a detection decision before the stimulus is applied. These trials appear as correct detections, but they are probably lucky mistakes. (f) Average firing rate of S1 and MPC neurons as a function of stimulus intensity. The firing rates are for the epoch of stimulus presentation. Only correct responses (Yes when amplitude > 0) are included. Adapted with permission from de Lafuente & Romo (2005).




tasks require a saccadic eye movement to a visual target that might appear at an unpredictable time or location but is unambiguous once it appears. Insights into the underlying decision process have come primarily from analyzing the distributions of latencies from visual instruction to saccade initiation.

Motor latencies for these tasks average ∼200 ms but can range from ∼90 ms to >400 ms (Carpenter 1988, Sparks 2002,

[Figure 9. The LATER model. (a) A DV starts at level S0 and rises at rate r, with μ = mean(r), to threshold ST; the variability in r produces the reaction-time distribution. (b) Percent cumulative frequency of reaction time (ms) plotted on a reciprobit scale is nearly linear; the median and the infinite-time intercept are marked. (c) Changing S0 causes the reciprobit line to swivel; changing μ causes it to shift.]

Westheimer 1954). By comparison, the time it takes to register visual input in visuo-motor areas such as LIP, FEF, and SC can be


a probability scale (a reciprobit plot) is nearly linear (Figure 9b). To produce this distribution, assume that a DV begins at level S0 at t = 0 and then rises linearly with slope r. The saccade is initiated when the DV reaches ST at t = t1. Critically, r is not determined precisely but is rather a random number drawn from a Gaussian distribution with mean μ and standard deviation σ. Because on any trial ri = (ST − S0)/ti, the reciprocals of the response latencies (1/ti) obey a Gaussian distribution. The harmonic mean of the RT is the threshold height (ST − S0) divided by μ. This model is called LATER (linear accumulation to threshold with ergodic rate), which emphasizes that the DV is simply the accumulation of a constant. It makes testable predictions about how the distribution of RT should change for two different mechanisms: changes in the threshold height (ST − S0) or the mean rate of rise (μ) (Figure 9c).
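The LATER model described above can be simulated in a few lines. In this sketch (units and parameter values are arbitrary assumptions chosen for illustration), each trial draws a rate of rise r from a Gaussian and computes the RT as the time for a linear ramp from S0 to ST; the reciprocal latencies then follow a Gaussian distribution, as the text notes:

```python
import numpy as np

rng = np.random.default_rng(1)

def later_rts(n, s0=0.0, st=1.0, mu=5.0, sigma=1.0):
    """Simulate reaction times under the LATER model.

    On each trial the rate of rise r is drawn from N(mu, sigma); the RT is the
    time for a linear ramp from s0 to the threshold st: t = (st - s0) / r.
    Trials with r <= 0 (no rise) are discarded here for simplicity.
    """
    r = rng.normal(mu, sigma, n)
    r = r[r > 0]
    return (st - s0) / r

rts = later_rts(100_000)
# Because 1/RT = r / (st - s0), the reciprocal latencies are Gaussian with
# mean mu / (st - s0); the harmonic mean of RT is (st - s0) / mu.
```

Plotting the cumulative distribution of these simulated RTs against 1/RT on a probit ordinate reproduces the nearly linear reciprobit plot of Figure 9b.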

This formulation has some shortcomings. It typically deals with only one alternative and short RTs, conditions under which evidence accumulation can be reduced to a single number. Moreover, several possible sources of variability are not explored as alternative hypotheses for the resulting RT distributions, including latencies outside the decision process (e.g., sensory and motor delays), changes in the variance of the rate of rise (r), and variance in the starting point or threshold. These factors can cause real data not to conform neatly to the prescriptions for lines, swivels, and shifts illustrated in Figure 9. When data do conform, however, the overall simplicity of LATER is appealing, and it has been used to explain several interesting phenomena.

One study tested the idea that prior probability affects the threshold for initiating an action (Carpenter & Williams 1995). Priors were manipulated by changing the probability that the target would be shown to the right or left of fixation. Not surprisingly, latencies for eye movements to the target at the more probable location were shorter, on average, than those to the other location. Differences in the distributions of RT for different priors conformed to the "swivel" prediction, suggesting a change in the threshold (relative to the starting point) and not in the rate of rise of the DV. The threshold appeared to change as a linear function of the logarithm of the prior, which suggested that the DV has units of log(P). This idea is important because it implies a form of probabilistic reasoning, with the DV representing a level of certainty that the prepared movement should be executed. Further studies have shown that different decision strategies that favor urgency, certainty, or prior expectations seem to trade off in these units of log(P) (Reddi et al. 2003, Reddi & Carpenter 2000).
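The swivel-versus-shift distinction used to interpret these data can be checked numerically. In this sketch (all parameter values are invented for illustration), reducing the threshold height ST − S0 rescales both the mean and the spread of 1/RT, which swivels the reciprobit line about its infinite-time intercept, whereas raising μ alone shifts the mean of 1/RT while leaving its spread unchanged:

```python
import numpy as np

rng = np.random.default_rng(2)

def recip_rt_quantiles(s0, st, mu, sigma=1.0, n=200_000):
    """Quantiles of reciprocal RT for a LATER process (arbitrary units)."""
    r = rng.normal(mu, sigma, n)
    r = r[r > 0]
    recip = r / (st - s0)  # 1/RT is Gaussian with mean mu/(st - s0)
    return np.quantile(recip, [0.1, 0.5, 0.9])

base = recip_rt_quantiles(s0=0.0, st=1.0, mu=5.0)
raised_s0 = recip_rt_quantiles(s0=0.5, st=1.0, mu=5.0)  # smaller threshold height
faster_mu = recip_rt_quantiles(s0=0.0, st=1.0, mu=6.0)  # higher mean rate of rise
# Halving the threshold height doubles both the median and the spread of 1/RT
# (the reciprobit line swivels); increasing mu moves the median but leaves the
# spread unchanged (the line shifts).
```

In the Carpenter & Williams study, it was the swivel pattern across prior conditions that implicated a change in threshold height rather than in μ.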

Several physiological results provide qualified support for the LATER model. First, the rate of rise (r) in the activity of FEF movement cells (Bruce & Goldberg 1985) prior to a saccade is variable and predicts RT, whereas the final level of activation (ST) is relatively fixed regardless of RT (Hanes & Schall 1996). However, other studies of the FEF, the SC, and primary motor cortex have found that prestimulus activity (S0) is more predictive of trial-to-trial variability in RT than is r (Connolly et al. 2005, Dorris et al. 1997, Dorris & Munoz 1998, Everling & Munoz 2000, Lecas et al. 1986, Riehle & Requin 1993). Second, priors affect the responses of build-up cells in the SC (Basso & Wurtz 1997, 1998; Dorris & Munoz 1998). In general, the higher the probability of making a saccade to a particular target, the higher the level of activity just before the target appears (S0) and the shorter the RT. Third, behavior on a countermanding task, in which the subject must occasionally withhold a planned saccade, is consistent with a race between two independent LATER processes representing "stop" and "go" (Hanes & Carpenter 1999, Logan et al. 1984). These two processes have correlates in the activity of fixation- and saccade-related cells, respectively, in the FEF and SC (Hanes et al. 1998, Munoz & Wurtz 1993, Paré & Hanes 2003).


    Value-Based Decisions

Decisions that are based primarily on the subjective value associated with each of the possible alternatives are the focus of the nascent field of neuroeconomics (for reviews see Glimcher 2005, Sanfey et al. 2006, Sugrue et al. 2005). A multitude of approaches, including behavior and imaging in human subjects and behavior and electrophysiology in non-human primates, are being used to examine how the brain assigns, stores, retrieves, and uses value to make decisions. Here we limit our remarks to a few key concepts from perceptual decisions that seem relevant to the study of value-based decisions.

Neurobiological correlates of value have been described in orbitofrontal and cingulate cortex and the basal ganglia, areas of the brain traditionally associated with reward-seeking behavior (Kawagoe et al. 2004; Lauwereyns et al. 2002; McCoy et al. 2003; Schultz 1992, 1998; Schultz et al. 1997; Tremblay & Schultz 1999, 2000; Watanabe 1996; Watanabe et al. 2003). Some neurons in orbitofrontal cortex appear to represent value independently from evidence, choice, and action (Padoa-Schioppa & Assad 2006). Anterior cingulate cortex is thought to represent negative value (Carter et al. 1998, Gehring & Willoughby 2002, Yeung & Sanfey 2004). Recent studies have shown additional representations of reward size or probability in parietal and prefrontal association areas, in the same neurons implicated in perceptual decision making (Kobayashi et al. 2002, Leon & Shadlen 1998, Platt & Glimcher 1999). In LIP and dlPFC, the representation of value seems to be dynamic, adjusted on the basis of the recent history of choices and their consequences (Barraclough et al. 2004, Dorris & Glimcher 2004, Sugrue et al. 2004).

It is tempting to try to analyze these phenomena in the context of the elements of a decision listed above. An obvious place to start would be to try to distinguish representations of the DV from representations of its raw materials (reward/value/utility). Regions like LIP that represent the DV on perceptual tasks might represent the DV on value-based tasks as well. Indeed, according to SDT and SA, value can, in principle, be treated the same way as priors and sensory evidence in forming a decision and can be applied to either the DV or the criterion. Thus a neural circuit that represents a DV by integrating sensory evidence on a perceptual task might be equally suited to integrate value in a different context.

Of course, this line of inquiry has serious challenges. Little is known about the units in which value is represented in the brain, although some studies suggest that quantities like expected utility (the product of subjective reward value and probability) might be represented directly (Breiter et al. 2001, Dorris & Glimcher 2004, Knutson et al. 2005, Padoa-Schioppa & Assad 2006, Platt & Glimcher 1999, Schultz 2004). Moreover, unlike sensory evidence on a perceptual task, the time course of a value representation is not easily defined or manipulated, making it difficult to identify.

Scholars also debate whether the basic mechanisms we have described for perceptual decisions even apply to value-based decisions. This debate is concerned with randomness. Choices on both perceptual and value-based tasks often appear to be governed by a random process. For perceptual tasks, this randomness is typically explained by considering the evidence as a mixture of signal plus noise. The DV and decision rule are both formulated to minimize the effects of this noise in pursuit of a particular goal. However, for value-based decisions, the randomness is often assumed to be part of the decision process itself. That is, a subjective measure like utility is used to assign the relative desirability of each choice. The decision rule is then probabilistic: a random selection weighted by these relative measures (Barraclough et al. 2004, Corrado et al. 2005, Glimcher 2005, Lee et al. 2005, Sugrue et al. 2005).

One argument supporting this idea comes from the theory of games (Glimcher 2005, von Neumann & Morgenstern 1944). In competitive games, subjects outsmart their opponents


by making choices that appear to an opponent as random. Not doing so would make them vulnerable to an opponent able to exploit predictability. A second argument is that randomness facilitates exploration, which is essential for discovery of nonstationary features of the environment and is thus found in many learning algorithms (Kaelbling et al. 1996). A third argument comes from the simple observation that average behavior is apparently random under many circumstances, such as when following a matching law in foraging (Herrnstein 1961).

A recent study provided an intriguing analysis to support this idea (Corrado et al. 2005). The study examined the sequences of choices made by monkeys on a simple oculomotor foraging task (Sugrue et al. 2005). The monkeys made eye movements to one of two visual targets, each of which was rewarded on a dynamic, variable-interval schedule. Consistent with previous reports, the monkeys typically exhibited matching behavior, in which the fraction of choices to one of the targets matched the fraction of total rewards they earned for that choice. A model that successfully described the monkeys' behavior and could generate realistic choice sequences was based on a deterministic, noise-free calculation of the DV (in this case describing expected reward) based on the recent history of rewards, followed by a random (Poisson) process that generates a choice based on the DV. Critically, the values of the parameters of the model that provided the best fit to the data were very close to optimal in terms of maximizing the average reward received per trial. However, alternative models that assumed that noise was present in the DV instead of (or in addition to) the decision rule were not tested, so it is difficult to assess which model is the more likely implementation.
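The architecture of that model, a deterministic and noise-free value computation followed by a probabilistic choice rule, can be caricatured as follows. This is a simplified toy, not the fitted model of Corrado et al. (2005): it uses fixed reward probabilities rather than the variable-interval schedule of the experiment, a softmax in place of their Poisson choice process, and parameter values invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

def simulate_matching(n_trials=20_000, p_reward=(0.3, 0.1), tau=8.0, beta=4.0):
    """Deterministic value tracking followed by a probabilistic choice rule.

    Income from each target is tracked as a leaky average of recent rewards
    (time constant tau, in trials). The choice on each trial is a weighted
    random draw based on the difference in incomes (softmax with inverse
    temperature beta). The DV update itself contains no noise; all of the
    randomness lives in the decision rule.
    """
    income = np.zeros(2)
    choices = np.zeros(n_trials, dtype=int)
    for t in range(n_trials):
        p_choose_1 = 1.0 / (1.0 + np.exp(-beta * (income[1] - income[0])))
        c = int(rng.random() < p_choose_1)
        reward = float(rng.random() < p_reward[c])
        income *= 1.0 - 1.0 / tau      # deterministic, noise-free DV decay
        income[c] += reward / tau
        choices[t] = c
    return choices

choices = simulate_matching()
```

Even this toy allocates more choices to the richer target while continuing to sample the leaner one, which is the qualitative signature of matching behavior; the point of the critique in the text is that a model with a noisy DV and a deterministic rule could generate similar sequences.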

In general, it is not clear why core principles of decision making that apply to perceptual tasks should be abandoned to account for these phenomena. The key issue is whether behavior that appears, on average, to be random reflects a decision mechanism that explicitly generates randomness or instead enacts a rule to achieve a goal but is faced with noisy input (Herrnstein & Vaughan 1980, Lau & Glimcher 2005). The former is certainly possible: We can explicitly decide to try to generate random behavior. However, the latter mechanism is central to our perceptual abilities. It can, in principle, deal with the appropriate kind of input—in this case information about value, expected outcomes, and dividends/costs associated with exploration—and produce the best possible choice. Thus, a series of value-based choices might appear random for the same reason that a series of perceptual decisions appears random under conditions of uncertainty.

Reconsider the speed-accuracy trade-off. It arises because each observation, e, is equivocal; e by itself cannot be used to distinguish perfectly the alternative hypotheses. Thus, the decision maker is left in a quandary. Gathering more observations might improve accuracy, but at the cost of speed. What is the right thing to do? The correct answer is, it depends. If speed is valued, gather less evidence. If accuracy is valued, gather more evidence. If both are valued, attempt to maximize quantities like the rate of reward (Gold & Shadlen 2002). The point is that even perceptual decisions have, at their core, value judgments. Noisy input leads to imperfect output. Because no universal prescription exists for which imperfections are acceptable and which are not, their relative values must be weighed and then used to shape the decision process. It thus seems reasonable to posit that value-based decisions can exploit these same mechanisms. This remains to be seen.
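The trade-off just described can be quantified with a drift-to-bound simulation. In this sketch (the drift, noise, and overhead values are arbitrary assumptions, and "reward rate" is simply accuracy divided by total time per trial), raising the bound buys accuracy at the price of time:

```python
import numpy as np

rng = np.random.default_rng(4)

def speed_accuracy(bound, drift=0.1, sd=1.0, overhead=50, n=2000):
    """Accuracy, mean decision time, and reward rate for a drift-to-bound process.

    Evidence samples from N(drift, sd) are accumulated until |DV| exceeds the
    bound; 'overhead' stands in for sensory/motor delays plus the intertrial
    interval (all units arbitrary). Reward rate = P(correct) / (mean decision
    time + overhead).
    """
    times, correct = [], []
    for _ in range(n):
        dv, t = 0.0, 0
        while abs(dv) < bound:
            dv += rng.normal(drift, sd)
            t += 1
        times.append(t)
        correct.append(dv > 0)
    acc, mt = np.mean(correct), np.mean(times)
    return acc, mt, acc / (mt + overhead)

acc_lo, t_lo, rr_lo = speed_accuracy(bound=1.0)
acc_hi, t_hi, rr_hi = speed_accuracy(bound=5.0)
# Because accuracy saturates while decision time keeps growing, the
# reward-rate-maximizing bound is intermediate, not maximal.
```

Sweeping the bound traces out the speed-accuracy trade-off; which point on that curve is "right" depends entirely on how speed and accuracy are valued, which is the point of the paragraph above.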

    CONCLUSIONS

In this review we have evaluated progress in understanding how the brain forms decisions. Our focus was intentionally narrow, considering only decisions on simple sensory-motor tasks that are amenable to behavioral and neurophysiological studies in the laboratory. We


have presented a theoretical framework from statistical decision theory that describes how to form decisions using priors, evidence, and value to achieve certain desirable goals. We have used the elements of this framework—particularly the distinction between evidence and the DV—to analyze experimental results from tasks requiring perceptual decisions (discrimination and detection), simple motor decisions, and value-based decisions.

Even these simple sensory-motor tasks require nuanced and flexible mechanisms that seem likely to play general roles in decision making. However, one must consider several qualifying factors. First, there is a strong degree of automaticity in subjects performing these tasks over and over, in some cases for weeks or months at a time. This is in stark contrast to many of the decisions we encounter in real life, such as deciding whom to marry, which require more deliberation and often have few or no similar experiences from which to draw. In fact, the case has been made for two distinct decision-making systems: one "intuitive," which controls simple behaviors learned through repeated experience, and the other "deliberative," which is designed to achieve goals in a dynamic environment (Kahneman 2002). However, there is little evidence for these distinct mechanisms in the brain (Sugrue et al. 2005). As we have argued, even the simplest sensory-motor decisions seem to be based on deliberative elements.

A second qualifying factor is that many tasks require a selection between two alternatives. This design, long favored by psychophysicists, allows for rigorous quantification of the relationship between stimulus and response. It is also consistent with decision algorithms, such as SPRT, that are based on the value of a ratio (e.g., logLR) that is an explicit comparison of the two alternatives (see The Sequential Probability Ratio Test). However, it is unclear how or even if the mechanisms responsible for these simple comparisons can generalize to more complex decisions involving many alternatives. Recent theoretical work has begun to explore algorithms to solve these decisions [e.g., the multiple sequential probability ratio test (MSPRT); McMillen & Holmes 2006] and how they might be implemented in the brain (Gurney & Bogacz 2006, Roe et al. 2001, Usher & McClelland 2001), but much work remains to be done.
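The two-alternative SPRT referred to here is compact enough to sketch directly. In this hedged example (the Gaussian evidence model and all parameter values are illustrative assumptions), the per-sample logLR is accumulated until it crosses a symmetric bound:

```python
import numpy as np

rng = np.random.default_rng(5)

def sprt(samples, mu0=-0.5, mu1=0.5, sd=1.0, bound=3.0):
    """Sequential probability ratio test for two Gaussian hypotheses.

    Accumulates the log likelihood ratio log[p(x|h1) / p(x|h0)] sample by
    sample and stops when it crosses +bound (choose H1) or -bound (choose H0).
    Returns the choice and the number of samples used, or (None, n) if the
    stream runs out first.
    """
    log_lr = 0.0
    for n, x in enumerate(samples, start=1):
        # For equal-variance Gaussians the per-sample logLR is linear in x.
        log_lr += (mu1 - mu0) * (x - (mu0 + mu1) / 2.0) / sd**2
        if log_lr >= bound:
            return 1, n
        if log_lr <= -bound:
            return 0, n
    return None, len(samples)

choice, n_used = sprt(rng.normal(0.5, 1.0, 1000))
```

The multi-alternative extensions mentioned in the text (e.g., MSPRT) generalize this stopping rule by tracking one accumulator per alternative and comparing each against the combined evidence for the rest.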

A third qualifying factor is the explicit link between the decision and a specific course of action (e.g., an eye or hand movement) enforced in most tasks. As described above, many neurophysiological studies exploit this design by treating the decision process as a problem of movement selection. The search for neural correlates of the decision can thus focus on parts of the brain known to select and prepare the associated movement. This approach has shown that the mechanisms of movement selection appear to incorporate all the elements of a deliberative decision. However, it leaves open the question of how and where the brain forms decisions that are not used to select a particular movement. One possibility that maintains this intention-based architecture is that abstract decisions are formed by circuits involved in abstract forms of behavioral planning; e.g., flexible rules that involve future contingencies, of the sort thought to be encoded in areas of the prefrontal cortex (Wallis et al. 2001).

The path from simple decisions to complex ones may be more straightforward than it appears. Consider a simple decision about the direction of RDM that is not tied to a particular action (Gold & Shadlen 2003, Horwitz et al. 2004). For this abstract decision, the DV and decision rule are likely recognizable but are carried out in circuits linked to working memory, long-term planning, or behavioral contingencies (e.g., the context/motivation boxes in Figure 1) as opposed to specific actions. Other complex decisions are made using sources of evidence that, like p