
JOURNAL OF APPLIED BEHAVIOR ANALYSIS 2000, 33, 359–371 NUMBER 3 (FALL 2000)

DESIGNING INTERVENTIONS THAT INCLUDE DELAYED REINFORCEMENT: IMPLICATIONS OF

RECENT LABORATORY RESEARCH

ROBERT STROMER

EUNICE KENNEDY SHRIVER CENTER

JENNIFER J. MCCOMAS

UNIVERSITY OF MINNESOTA

AND

RUTH ANNE REHFELDT

TRINITY SERVICES

The search for robust and durable interventions in everyday situations typically involves the use of delayed reinforcers, sometimes delivered well after a target behavior occurs. Integrating the findings from laboratory research on delayed reinforcement can contribute to the design and analysis of those applied interventions. As illustrations, we examine articles from the Journal of the Experimental Analysis of Behavior that analyzed delayed reinforcement with respect to response allocation (A. M. Williams & Lattal, 1999), stimulus chaining (B. A. Williams, 1999), and self-control (Jackson & Hackenberg, 1996). These studies help to clarify the conditions under which delayed reinforcement (a) exercises control of behavior, (b) entails conditioned reinforcement, and (c) displaces the effects of immediate reinforcement. The research has applied implications, including the development of positive social behavior and teaching people to make adaptive choices.

DESCRIPTORS: delayed reinforcement, response allocation, stimulus chains, self-control, integration of basic and applied research

Establishing the initial instances of a behavioral repertoire typically requires the use of programmed consequences that occur immediately after the target response occurs. However, the job of the applied behavior analyst also involves the strategic use of delayed reinforcement. Behaviors that yield delayed reinforcement are highly adaptive in everyday life, but they may be difficult to establish and maintain. Such difficulties raise empirical questions about how delayed reinforcement exercises control and how intervening stimulus events might enhance its effects on behavior. The use of delayed reinforcement also may be challenging when there are favorable contingencies for behaviors that are incompatible with those that satisfy the delayed consequences, even when the reinforcers involved are relatively small or less preferred. As discussed later, exercising self-control is a matter of learning to forgo engaging in behaviors that yield immediate reinforcers in favor of behaviors that yield delayed reinforcers.

Preparation of this review was supported by the National Institute of Child Health and Human Development (Grants HD32506 and HD25995) and the Massachusetts Department of Mental Retardation (Contract 100220023SC).

Ruth Anne Rehfeldt is now affiliated with the Rehabilitation Institute, Southern Illinois University at Carbondale.

Address correspondence to Robert Stromer, Psychological Sciences Division, Eunice Kennedy Shriver Center, 200 Trapelo Road, Waltham, Massachusetts 02452 (E-mail: [email protected]).

Because of its practical significance, and because of the difficulties involved in its systematic arrangement, applied researchers could benefit from the knowledge gained in numerous laboratory studies on the topic of delayed reinforcement, to better integrate basic and applied sciences. In a previous article in this series, Hayes and Hayes (1993) emphasized that the effects of events that happen during a delay, such as delivery of tokens or teaching someone to self-instruct, are important for a behavioral analysis. The present article extends that discussion by examining three other empirical papers recently published in the Journal of the Experimental Analysis of Behavior (JEAB) concerned with delayed reinforcement. The studies examined response-reinforcer relations with respect to delayed, unsignaled reinforcement (A. M. Williams & Lattal, 1999), stimulus chaining (B. A. Williams, 1999), and self-control (Jackson & Hackenberg, 1996). To easily connect basic concepts to areas of application, we begin by describing scenarios derived from research published in the Journal of Applied Behavior Analysis that involved delayed reinforcement. We then describe some of the methods and results of the JEAB target articles, and suggest ways of integrating that information with applied behavior analysis by referring to the case scenarios.

Delayed Reinforcement in Applied Settings

The following case scenarios illustrate practical situations in which the interventions involve delayed reinforcement. We consider the applied goals of (a) promoting sharing and cooperation, (b) completing assigned activities, and (c) substituting functional requests for aggression.

Promoting sharing and cooperation. With a basic social repertoire intact, delivering reinforcement later rather than sooner may actually increase positive social behavior. Consider a preschooler with typical development who seldom shares or cooperates with other children during free play (Fowler & Baer, 1981). To increase socialization, the child is taken aside before a day's play periods to briefly rehearse the social skills desired. The child is also told to exhibit those skills during that day's free play. Then, the child is observed during two play periods, one held early in the day and one held later. After a play period, if positive social behavior occurs, the child receives a sticker that is exchanged for a toy. When such reinforcement occurs after the early play period, sharing and cooperation increase during that period but not during the late play period. However, if reinforcement is delivered at the end of the preschool day, positive social behavior increases during both the early and late play periods. This happens even though there are no exteroceptive cues or signals (such as verbal reminders, tokens, etc.) that the delayed reinforcers will be provided. As discussed below, this scenario has a lot in common with A. M. Williams and Lattal's (1999) study of delayed reinforcement with pigeons.

Completing assigned activities. An explicit learning history may be needed to establish the self-control choices that yield large delayed reinforcers rather than impulsive choices that yield small immediate reinforcers (e.g., Ainslie, 1974; Mazur, 1998; Rachlin & Green, 1972). Suppose self-control methods are to be used with an adult with mental retardation (Dixon et al., 1998). Part of the adult's day-treatment program entails sitting with a small group engaging in learning activities. The adult has difficulty doing this and is often prompted to remain seated and complete a task. Verbal instructions are tried first, and the adult is promised a large reinforcer (three crossword puzzles) for finishing the day's activity. Neither of these approaches improves performance, but the following intervention does: During an initial assessment, the adult consistently chooses a large reinforcer (three crossword puzzles) instead of a small reinforcer (one crossword puzzle). Then, the adult is offered the small reinforcer immediately (in effect, for doing no work) or the large reinforcer for doing just a little (5 min) of a day's activity. As a result, the adult opts for doing the activity for 5 min and receives the large reinforcer. Gradually, the amount of time required is increased until the objective is met: 25 min of sustained participation in the day's scheduled activities. Thus the use of signals (instructions) encourages behaviors that yield large delayed reinforcers and helps to increase tolerance for delays.
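The gradual increase from 5 min to 25 min of participation can be framed as a simple requirement-fading rule. The sketch below is hypothetical: the 5-min step size and the advance-only-on-success criterion are our assumptions for illustration, not parameters reported by Dixon et al. (1998).

```python
def fading_schedule(start_min=5, goal_min=25, step_min=5):
    """Planned sequence of work requirements (minutes of participation)
    needed to earn the large reinforcer. All values are illustrative."""
    steps = []
    current = start_min
    while current < goal_min:
        steps.append(current)
        current += step_min
    steps.append(goal_min)
    return steps

def next_requirement(current_min, met_criterion, goal_min=25, step_min=5):
    """Advance the requirement only after the current one is met;
    otherwise hold it constant (a conservative fading rule)."""
    if not met_criterion:
        return current_min
    return min(current_min + step_min, goal_min)

print(fading_schedule())  # [5, 10, 15, 20, 25]
```

In practice the step size and mastery criterion would be tuned to the individual; the point is only that the fading contingency is fully specified by a start value, a goal, and an advancement rule.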

Substituting requests for aggression. Aggression in some individuals might be conceptualized as impulsive behavior (Vollmer, Borrero, Lalli, & Daniel, 1999). During a clinical assessment, when a participant with developmental disabilities is aggressive (e.g., hitting, kicking, and scratching), a small reinforcer (e.g., one potato chip or 30 s of watching television) is delivered; but when the participant mands appropriately by handing the teacher a picture card, a large reinforcer (e.g., three potato chips or 60 s of watching television) is delivered. Under these conditions, the participant shows a preference for manding over aggression. However, aggression increases when the small reinforcer is delivered immediately following it and manding produces a delay to the large delayed reinforcer. Such impulsive choices are altered by delivering the large reinforcer at the end of a signaled delay (e.g., a reinforcer is visible during the delay; the delay is marked with a kitchen timer). Thus, the participant exhibits self-control when the longer delay is signaled, and exhibits impulsive choices when the longer delay is unsignaled. Studies by B. A. Williams (1999) and Jackson and Hackenberg (1996) described below are germane to these two scenarios.

Basic Research Related to Delayed Reinforcement

The role of signals. Delayed reinforcement can be arranged to establish and maintain responding in the absence of shaping or other training (e.g., Lattal & Williams, 1997). This phenomenon has been demonstrated across several species, including rats and pigeons (Lattal & Gleeson, 1990; Wilkenfield, Nickel, Blakely, & Poling, 1992), Siamese fighting fish (Lattal & Metzger, 1994), and human infants (Reeve, Reeve, Brown, Brown, & Poulson, 1992). The findings of such studies consistently demonstrate that unsignaled delayed reinforcement produces low but persistent rates of responding (Critchfield & Lattal, 1993; Lattal & Gleeson, 1990; Wilkenfield et al., 1992). A recent study by A. M. Williams and Lattal (1999) with pigeons contributes to the basic literature on delayed reinforcement and provides some potential avenues of research for the application of delayed reinforcement.

A. M. Williams and Lattal (1999) arranged a concurrent schedule in which a pigeon's pecks on one key did not produce any consequences (the irrelevant key) and pecks on the other key were reinforced on a tandem schedule involving variable-interval (VI) 15-s and differential-reinforcement-of-other-behavior (DRO) 10-s components (the relevant key). Thus, responding on the relevant key initiated the DRO 10-s schedule an average of once every 15 s, and responding on either key during the DRO interval reset the delay interval. Selection of the left or right key as the relevant operandum occurred before each session, based on a semirandom sequence that limited assignment of the same key as the relevant operandum to no more than three consecutive sessions. The resultant data demonstrated that, overall, the pigeons allocated more responses to the relevant than to the irrelevant key. Analysis of the data indicated that (a) unsignaled resetting delays can maintain a higher relative rate of responding, (b) control by a particular response-reinforcer relation increases with longer histories of exposure to that relation, and (c) the response-reinforcer relation, and not some other behavioral process, is primarily responsible for the sustained low response rates that occur with unsignaled resetting delayed reinforcement.
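To make the tandem VI 15-s DRO 10-s contingency concrete, here is a rough simulation of the relevant-key contingency. The Poisson model of peck times and the decision to treat every peck as a relevant-key peck are simplifying assumptions of ours, not features of the original procedure.

```python
import random

def simulate_relevant_key(session_s=3600, pecks_per_s=0.1, seed=1):
    """Sketch of the relevant-key contingency in A. M. Williams and
    Lattal (1999): a VI 15-s schedule arms a 10-s DRO (resetting delay);
    a reinforcer is delivered only after 10 s pass with no peck."""
    rng = random.Random(seed)
    t = 0.0
    vi_ready_at = t + rng.expovariate(1 / 15)  # VI 15-s interval setup
    dro_deadline = None                        # end of the current 10-s delay
    reinforcers = 0
    while t < session_s:
        t += rng.expovariate(pecks_per_s)      # wait for the next peck
        if dro_deadline is not None and t >= dro_deadline:
            reinforcers += 1                   # 10 s elapsed without a peck
            dro_deadline = None
            vi_ready_at = t + rng.expovariate(1 / 15)
        if dro_deadline is not None:
            dro_deadline = t + 10.0            # any peck resets the delay
        elif t >= vi_ready_at:
            dro_deadline = t + 10.0            # first peck after VI arms the DRO
    return reinforcers
```

Running this with a high peck rate shows the signature of resetting delays: most pecks simply reset the 10-s timer, so reinforcement rate stays low even though the schedule can sustain responding.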

Application: The role of signals and response allocation. Aberrant and adaptive behaviors may be viewed as concurrent schedules (e.g., Fisher & Mazur, 1997; Mace & Roberts, 1993; McComas & Mace, in press) that are influenced by four variables: rate, quality, and immediacy of reinforcement, as well as response effort (McDowell, 1988). To the extent that the parameters of these variables are understood, they can be arranged to optimize the effects of reinforcement for an individual in a given situation. Applications of these reinforcement and response parameters have demonstrated the effects of concurrent reinforcement contingencies on response allocation across aberrant and appropriate response alternatives (Horner & Day, 1991; Peck et al., 1996). Specifically, concurrent schedules are often arranged in which a functionally equivalent appropriate alternative response (often a mand for the identified reinforcer) produces a more favorable schedule of reinforcement relative to the aberrant behavior. Few applied studies have demonstrated the effects of delayed reinforcement in this type of concurrent-schedules arrangement (Vollmer et al., 1999). Questions remain regarding how delayed reinforcement can be arranged within concurrent schedules to generate an appropriate alternative response that will effectively compete with the aberrant behavior and maintain the appropriate alternative response at a higher rate relative to the occurrence of aberrant behavior.

Whereas delayed reinforcement can effectively establish and maintain responding, it can also reduce the occurrence of steady-state responding. B. A. Williams (1976) examined the effects of an unsignaled delayed reinforcement procedure that involved a VI schedule in which completion of the response requirement began a delay timer, with the reinforcer being delivered at the end of the interval. Delays as short as 3 s substantially reduced the pigeons' responding from its steady state. Moreover, responses could occur during the delay, such that obtained delays were often shorter than scheduled delays. These findings suggest that, in order to avoid the occurrence of behavior during the delay (e.g., the development of a response chain consisting of the appropriate behavior followed by the aberrant behavior, as reported by Wacker et al., 1990), it may be important to incorporate a DRO requirement during the delay (A. M. Williams & Lattal, 1999).

Signals may be an important variable to consider when the goal involves maintenance of appropriate alternative behavior on delayed reinforcement schedules within concurrent schedules. For example, Lattal (1984) found that signaled delays maintained a higher rate of responding in pigeons than did unsignaled delays. Similarly, Vollmer et al. (1999) demonstrated that appropriate alternative responding could be maintained by signaled delays of up to 10 min, whereas unsignaled delays failed to maintain the same behavior for more than 60 s (see discussion below). In addition, aggression occurred at much lower rates when delays were signaled than when they were not. Whether a DRO component would be a viable alternative or supplement to Vollmer et al.'s treatment package is a question worth further research.

The particular arrangement of the delay interval is another factor that appears to influence response allocation in concurrent schedules involving delayed reinforcement. Specifically, basic research on concurrent schedules has demonstrated that pigeons show greater response allocation to keys associated with mixed delays than to those associated with a constant delay (Chelonis, King, Logue, & Tobin, 1994; Cicerone, 1976). Furthermore, intermittent or partial delays appear to increase resistance to extinction, whereas constant delays appear to weaken resistance (Crum, Brown, & Bitterman, 1951; Nevin, 1974; Tombaugh, 1966). Thus, there are numerous factors to consider in the arrangement of delayed reinforcement for establishing and maintaining an appropriate alternative response in the context of aberrant behavior.

Stimulus chains. Simultaneous discrimination learning is known to be impaired when the consequences of a correct choice are delayed relative to when those consequences are immediate. The presentation of stimulus events during the delay may reverse these effects on discrimination learning. B. A. Williams (1999) examined the processes responsible for the facilitation of discrimination learning by stimuli inserted within a delay-to-reinforcement interval. He considered three hypotheses: First, choice of the delayed reinforcer would be more likely when stimuli presented during the delay reliably predicted reinforcement and thus functioned as conditioned reinforcers. Second, stimuli presented during a delay interval serve to bridge the response and the delayed outcome, such that choice of the delayed reinforcer would be more likely when the choice response and the intervening stimuli were highly correlated. Third, intervening stimuli mark the choice response and cause it to become more memorable at the time of the delayed reinforcer delivery, such that changes in the correlation between intervening stimuli and a delayed reinforcer would affect discrimination learning only minimally. B. A. Williams' study differed from previous ones that examined the effects of delayed reinforcement on discrimination learning, in which only a single stimulus is typically presented during the delay-to-reinforcement interval. This stands in contrast to studies of chain schedules of reinforcement, which have shown that the greater the number of intervening stimuli, the more poorly choice behavior is maintained (e.g., Duncan & Fantino, 1972).

In B. A. Williams' (1999) procedure, 8 rats were trained to simultaneously discriminate between two levers. Responding on one lever reliably produced reinforcement on a VI 20-s reinforcement schedule, and choosing the other lever produced no reinforcement. Two-link stimulus chains were programmed to occur for specified durations during the delay intervals following each choice response. The middle- and terminal-link stimuli of each chain terminated automatically according to a variable-time 15-s schedule or a fixed-time 20-s schedule. Once this initial discrimination was established, the outcomes for the two levers were altered in a series of contingency reversals. The correlations of the stimuli of each chain with delayed reinforcement or no reinforcement were also varied across reversals, such that either the middle-link or terminal-link stimulus of each chain could have the same or the opposite correlation with reinforcement as it had during the preceding reversal. A total of four types of reversals were presented. Each subject received one reversal for each of the four possible types. If a conditioned reinforcement hypothesis accounted for facilitated discriminations, consistent stimulus-reinforcer correlations across successive reversals were expected to facilitate learning. If bridging explained improvement in discrimination, it was expected that changing the correlations between the choice response and the intervening stimuli would disrupt learning. If enhanced learning was due to the marking of the correct choice response by the onset of the intervening stimuli, changes in the correlations between the intervening stimuli and trial outcome would be irrelevant.

B. A. Williams' (1999) results demonstrated that new discriminations were acquired more rapidly when the correlations between the stimuli and consequences were the same as they had been during each preceding contingency reversal. Moreover, the main determinant of acquisition rate was whether or not the middle-link stimulus was correlated with the same outcome as it had been during the preceding reversal. When the status of the middle-link stimulus as a predictor of reinforcement did not change relative to preceding reversals, changes in the status of the terminal-link stimulus affected the rate of discrimination only minimally. Alternatively, when the status of the middle-link stimulus as a predictor of reinforcement did change relative to preceding reversals, the status of the terminal-link stimulus as a predictor of reinforcement became much more important. B. A. Williams contended that the stimuli presented during the delay gained their conditioned value through a backward chaining effect. These results are thus consistent with a conditioned reinforcement explanation for the facilitative role of stimuli presented during a delay-to-reinforcement interval and coincide with those reported elsewhere (e.g., Alsop, Stewart, & Honig, 1994; Dunn, Williams, & Royalty, 1987; B. A. Williams & Dunn, 1991).

Application: Stimulus and response chains. These results are important because they suggest that learning to discriminate between situations resulting in delayed reinforcement or no reinforcement may be enhanced if stimuli that reliably predict reinforcement are presented during the delay. Moreover, the correlation between those stimuli and reinforcement must not be altered. For example, a child might be given the choice of doing a specified amount of work in class and earning the privilege to engage in a desired activity at the end of the day, or doing something other than working and not earning that privilege. In the absence of relevant stimuli indicating when or where the reinforcer will become available, it may be difficult to discriminate between the two options (see Baer, Williams, Osnes, & Stokes, 1984; Fowler & Baer, 1981). The delivery of conditioned reinforcers during the delay before the activity is available could potentially increase the likelihood that the child will choose to do his or her work. The stimuli will serve as most effective conditioned reinforcers if they are regularly presented contingent upon the target response and if they reliably predict reinforcement. Considerable research has identified the effectiveness of tokens, points, gold stars, and the like in achieving this purpose (Kazdin, 1982; Kazdin & Bootzin, 1972), yet the identification of less contrived, more naturally occurring conditioned reinforcers will be useful for ensuring generalization to other settings. Pictures of a desired object, edible item, or activity may serve a similarly useful purpose when presented during a delay, particularly if the pictures are being used concomitantly in a communication teaching program (e.g., Bondy & Frost, 1993). Conditioned reinforcers might also be verbal in nature (Hayes & Hayes, 1993). Praise from a parent, teacher, or caregiver, or reminders of the reinforcement that is to come, may strengthen the effectiveness of delayed reinforcement when presented during the delay.

Useful procedures might be established by requiring individuals to produce conditioned reinforcers during the delay. Verbal stimuli, for example, might be produced by the individual in the form of spoken or written self-instructions or self-prompts (e.g., see Jay, Grote, & Baer, 1999; Stromer, Mackay, Howell, McVay, & Flusser, 1996; Stromer, Mackay, McVay, & Fowler, 1998; Taylor & O'Reilly, 1997). Functional communication training (Carr & Durand, 1985; Durand & Carr, 1991) might serve as a means by which individuals can produce their own conditioned reinforcers as well. Teaching an individual to mand desired items using chosen pictures may allow the pictures to acquire conditioned reinforcing properties when the specified reinforcer is made available at the end of the delay (Bondy & Frost, 1993).

Sequences or chains of two stimuli were presented during the delays in the B. A. Williams (1999) study. In the most effective arrangement, a middle-link stimulus was presented contingent upon the correct choice response; this was followed by the terminal-link stimulus that signaled the availability of reinforcement at the end of the delay. The programming of chains of conditioned reinforcers in practical settings may be useful in facilitating the effectiveness of delayed reinforcement, particularly when the delays are relatively long. As an example of such a chain, consider a teaching situation in which a child (a) accumulates pennies for each correct response, (b) exchanges the pennies for tokens after a specified number of pennies are accumulated, then (c) exchanges the tokens for a desired activity after a specified number of tokens are accumulated. As previously noted, the use of tokens as conditioned reinforcers in applied settings has been elaborated upon a great deal (e.g., Kazdin, 1982; Kazdin & Bootzin, 1972), but how the availability of chains of conditioned reinforcers systematically affects choice making in practical situations is currently unknown. In addition, the identification of more naturally occurring stimulus chains will again be profitable. Finally, basic research has suggested that the greater the number of stimuli presented during a delay, the less likely an individual is to choose the delayed reinforcer (Duncan & Fantino, 1972). What limits there might be on the number of stimuli presented during delays, and on the facilitative effects reported by B. A. Williams, merit examination.
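The penny-to-token-to-activity chain described above can be sketched as a small bookkeeping routine. The exchange ratios below are illustrative placeholders, since the text leaves them unspecified.

```python
class TokenChain:
    """Sketch of a two-link token chain: pennies are exchanged for tokens,
    and tokens for a desired activity. Ratios are hypothetical."""

    def __init__(self, pennies_per_token=5, tokens_per_activity=4):
        self.pennies_per_token = pennies_per_token
        self.tokens_per_activity = tokens_per_activity
        self.pennies = 0
        self.tokens = 0

    def correct_response(self):
        """Deliver one penny; run any exchange that is now due.
        Returns True when the terminal activity becomes available."""
        self.pennies += 1
        if self.pennies == self.pennies_per_token:   # penny -> token exchange
            self.pennies = 0
            self.tokens += 1
        if self.tokens == self.tokens_per_activity:  # token -> activity exchange
            self.tokens = 0
            return True
        return False

chain = TokenChain()
outcomes = [chain.correct_response() for _ in range(20)]
# With these ratios, the 20th correct response completes the chain.
```

Lengthening either ratio lengthens the chain; the open empirical question raised above is how long such chains can grow before the delayed reinforcer loses out to immediate alternatives.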

The completion of the activities specified by picture activity schedules or written lists may be regarded as signaled response chains. Reinforcement is not available until the entire sequence of activities is completed, and the completion of one activity is conditionally reinforced by the opportunity to engage in the next activity (MacDuff, Krantz, & McClannahan, 1993; McClannahan & Krantz, 1999). The opportunity to complete response chains might increase the likelihood with which individuals choose response alternatives that lead to delayed reinforcement. In effect, choosing to work could become more probable than choosing not to work, due to the conditioned reinforcing properties of work itself (Eisenberger & Masterson, 1983; Eisenberger & Shank, 1985). The stimuli correlated with each link of a chain (pictures or textual prompts) might also acquire conditioned reinforcing properties that maintain responding during the delay. Again, the limit to the length of such chains ought to be investigated. Response alternatives that lead to no reinforcement may actually be preferable to response alternatives that lead to delayed reinforcement but have very lengthy chain requirements during the delay.

Teaching individuals to make adaptive choices could involve expanding a self-control repertoire via behavioral chaining and delayed reinforcement. The optimism that rather long behavioral chains are achievable comes from the promising applied research on teaching students to follow photographic schedules of school and home activities (Krantz, MacDuff, & McClannahan, 1993; MacDuff et al., 1993). For example, MacDuff et al. taught children with autism to engage in sequences of play activities depicted in a series of photographs. As a result, the children learned to engage in "lengthy response chains," as well as to "independently change activities, and change activities in different group home settings in the absence of immediate supervision and prompts from others" (p. 89). In another example, Lalli, Casey, Goh, and Merlino (1994) used activity-scheduling procedures with older students who exhibited aggression. These investigators also incorporated reading instruction based on laboratory studies of stimulus class formation. By doing so, they demonstrated how students may learn to follow schedules in which the names of the activities are written rather than represented in photographs.

Specifying the methods for establishing an initial repertoire of schedule following will require research. A reasonable place to start would be to use activities that already function as reinforcers (e.g., putting a puzzle together) and have a clearly defined beginning and ending (McClannahan & Krantz, 1999). Such activities could facilitate schedule following. However, performing other activities (e.g., putting things away) may depend on the delivery of highly desirable supplemental reinforcers (e.g., a favorite snack or free play) upon completion of the scheduled activities. For some students, it will also be necessary to ensure that the photographs are functionally related to the activities they depict. For instance, communication training that uses pictures or photographs (e.g., Bondy & Frost, 1993) would be a natural context for establishing the relations among photos and activities that could subsequently enhance learning to follow an activity schedule.

Due to practical constraints, the delivery of delayed reinforcers may at times be uncertain. A teacher may wish to provide students with the opportunity to engage in a desired activity following their completion of homework problems, but may be unable to provide this reinforcer due to instances of aggression on the part of other students or the unavailability of the specified activity. Thus, delayed reinforcers may, under some circumstances, be available with less than a 100% probability. Indeed, unreliable reinforcement may be more typical in instructional and clinical situations than reliable reinforcement. Research has shown that nonhuman subjects are likely to prefer unreliable reinforcement to reliable reinforcement when conditioned reinforcers are presented during delay intervals preceding the availability of the unreliable reinforcement (Belke & Spetch, 1994; Dunn & Spetch, 1990). Little applied research has been conducted in this area, however. Empirical support for the programming of conditioned reinforcers that predict unreliable or uncertain outcomes is necessary (see Lalli & Mauro, 1995).

Self-control. Jackson and Hackenberg (1996) adapted laboratory methods typically used with adult humans without intellectual impairments (e.g., Flora & Pavlik, 1992; Hyten, Madden, & Field, 1994) to examine self-control choices in pigeons. Sorting out the procedural differences that may account for the differences between people and pigeons may help to pinpoint the behavioral processes involved (see also Grosch & Neuringer, 1981). Attempts to clarify such issues are timely because of their relevance to the broader applied interest in choice (e.g., Fisher & Mazur, 1997) and the ongoing concerns with extending and applying self-control methods to clinical populations (e.g., Dixon et al., 1998; Neef, Mace, & Shade, 1993; Neef, Shade, & Miller, 1994; Vollmer et al., 1999).

The tokens that Jackson and Hackenberg (1996) used with pigeons were light-emitting diodes (LEDs), each exchanged for 2-s access to food. A panel of 34 LEDs accommodated the maximum number of tokens that could be accumulated in a session. There were also green and blue choice response keys and a red token-exchange key. Each session consisted of 12 trials: 2 forced-choice trials followed by 10 free-choice trials. The forced-choice trials ensured that the birds made contact with the contingencies arranged during the session. Conditions central to the study included (a) the immediate delivery of a small token reinforcer (one LED) after pecking one choice key, (b) the delayed (e.g., 6 s) delivery of a large token reinforcer (three LEDs) after pecking the other choice key, and (c) a discrete token-exchange period during which each peck to the exchange key turned off an LED and resulted in 2-s access to food. Ultimately, all of the LEDs were exchanged at the end of each 12-trial session; thus, the delay to token exchange was the same for both small and large choices (e.g., Condition ED10, Experiment 2). In other words, the terminal food reinforcer was neither more immediate nor more plentiful for choosing the small token. Including the two forced-choice trials, if a pigeon always pecked "small" (one LED), a total of 14 LEDs would be exchanged for 28 s of food at the end of the session. In contrast, if a pigeon always pecked "large" (three LEDs), 34 LEDs would be exchanged for 68 s of food.
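The session totals quoted above (14 LEDs for 28 s of food vs. 34 LEDs for 68 s) follow directly from the trial structure, on the assumption that the two forced-choice trials expose the bird to one alternative each. A few lines verify the arithmetic:

```python
TOKEN_FOOD_S = 2      # seconds of food access per LED token
FREE_TRIALS = 10      # free-choice trials per session
SMALL, LARGE = 1, 3   # LEDs earned per small and large choice
FORCED_TOKENS = SMALL + LARGE  # assumes one forced trial to each alternative

def session_totals(choice_tokens):
    """Tokens earned and food seconds if every free choice is the same."""
    tokens = FORCED_TOKENS + FREE_TRIALS * choice_tokens
    return tokens, tokens * TOKEN_FOOD_S

print(session_totals(SMALL))  # (14, 28)
print(session_totals(LARGE))  # (34, 68)
```

Note that exclusive "large" choosing fills the entire 34-LED panel, which is presumably why the panel was sized as it was.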

In Experiments 1 and 2, the pigeons' choices proved to be quite sensitive to the timing of the token-exchange periods. That is, access to food was the important variable, irrespective of the LED delays: If the exchange for the large amount of food occurred later than that for the small amount, the pigeons were more apt to choose the side key related to one LED and the small amount of food. However, if the timing of the exchange for the small and large amounts of food was equal, the birds were more apt to choose the key related to three LEDs and the large amount of food. This finding replicates numerous prior studies with pigeons that involved unequal delays before delivering food reinforcers (e.g., see review by Mazur, 1998), with adult humans exposed to token systems (e.g., Flora & Pavlik, 1992; Hyten et al., 1994), with children (Schweitzer & Sulzer-Azaroff, 1988), and with individuals with mental retardation (Ragotzy, Blakely, & Poling, 1988) who received tangible reinforcers. The results also support the findings of applied studies (Neef et al., 1993; Vollmer et al., 1999).

Jackson and Hackenberg's (1996) study makes another point. Experiments 3 and 4 explored aspects of the procedure that may have controlled the pigeons' preferences for the large delayed reinforcer. Were the colored side keys and their relations to the differential food consequences controlling the birds' choices, rather than the LEDs? Results suggested that the intervening LEDs were important in maintaining selections of the key leading to large delayed reinforcement. For example, when responses to the choice keys led to their respective small and large reinforcers without the intervening LEDs, the pigeons selected the key related to large delayed reinforcement less often. The results are consistent with the positive effects of presenting cues or signals during the delays to reinforcement.

Application: Intervening stimuli. Jackson and Hackenberg (1996) were able to establish the pigeons' preferences for large reinforcers and maintain those preferences as the delay to token exchange was increased. Their study suggests a two-part program to establish self-control: (a) While maintaining preferences for the large reinforcer, gradually increase the delays equally for both small and large reinforcers; then (b) while holding the long delay constant, gradually decrease the delay before the small reinforcer. This approach resembles others used in the laboratory to establish self-control choice by gradually increasing the delay to the large reinforcer in children identified as impulsive (Schweitzer & Sulzer-Azaroff, 1988) and in individuals with mental retardation in both laboratory (Ragotzy et al., 1988) and applied (Dixon et al., 1998) settings. However, few studies of self-control with humans with limited development have used tokens (but see Burns & Osborne, 1975); instead, terminal reinforcers (e.g., stickers, snacks, time socializing) were delivered either immediately or after a delay. An obvious advantage of the systematic use of tokens with such individuals is that longer delays to reinforcement may be tolerated without degrading self-control, in part because of the variety of reinforcers that can be delivered during the token-exchange period.
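The two-part program can be sketched as a schedule generator. This is a minimal sketch under stated assumptions: the step size and terminal delay are illustrative placeholders, and a real program would advance phases on mastery criteria rather than fixed steps.

```python
# Hypothetical sketch of the two-part delay-fading program described above.
# Step sizes and delay values are illustrative assumptions, not parameters
# from Jackson and Hackenberg (1996).

def fading_schedule(target_delay=60, step=5):
    """Yield (small_delay, large_delay) pairs, in seconds, for successive
    training steps."""
    # Part (a): raise both delays together, so the large reinforcer stays
    # preferred while tolerance for delay grows.
    delay = 0
    while delay < target_delay:
        delay = min(delay + step, target_delay)
        yield (delay, delay)
    # Part (b): hold the large-reinforcer delay constant and fade the
    # small-reinforcer delay back toward immediacy, so the learner must
    # forgo an immediate small reinforcer to obtain the large one.
    small = target_delay
    while small > 0:
        small = max(small - step, 0)
        yield (small, target_delay)

schedule = list(fading_schedule())
```

The final step, an immediate small reinforcer pitted against a 60-s-delayed large one, is the standard self-control test; the preceding steps are what make choosing "large" likely when that test arrives.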

If tokens are used to encourage schedule following and mediate the delivery of delayed reinforcers, behavior analysts will find a wealth of programmatic ideas already available in the applied literature (e.g., Kazdin, 1982; Kazdin & Bootzin, 1972), backed up by an informative collection of laboratory articles on relevant topics like conditioned reinforcement (e.g., B. A. Williams, 1994). The basic studies can inform the applied researcher, just as the basic research focused on in this paper reminds us that whether tokens function as conditioned reinforcers will depend on a relevant learning history (B. A. Williams, 1999), and that the successful use of tokens demands an appreciation for the scheduling of token delivery, token exchange, and delivery of terminal reinforcers (Jackson & Hackenberg, 1996). Consideration must also be given to the long-term effectiveness of tokens in maintaining desirable behaviors.

Although the analysis of procedures for increasing self-control in applied settings has begun (Dixon et al., 1998), the role of signaling stimuli (whether marking, bridging, or reinforcing in nature) has not been explored systematically. Jackson and Hackenberg's (1996) data suggest that the scheduling of intervening stimuli might increase preferences for delayed terminal reinforcers. As a step in this direction, Vollmer et al. (1999) showed that participants with developmental disabilities and severe challenging behaviors were more likely to exhibit self-control than impulsive choices when the longer delays were signaled rather than unsignaled. It would be worthwhile to determine whether the use of tokens or other methods of establishing behavioral chains could expand the kind of self-control repertoire examined in Vollmer et al.'s study. The outcome could be the development of methods that teach such individuals to engage in greater and greater amounts of behavior for delayed reinforcers.

Concluding Comments

We examined three laboratory studies concerned with delayed reinforcement and explored their relevance for applied research and practice. A. M. Williams and Lattal's (1999) study with pigeons suggests that unsignaled delayed reinforcement might exercise control over behavior in contexts in which relatively few sources of competing behavioral control exist. B. A. Williams' (1999) study with rats indicates that the effects of delayed reinforcement may be strengthened by presenting exteroceptive signals during the delay before reinforcement delivery, especially if the intervening stimuli function as conditioned reinforcers. Finally, Jackson and Hackenberg's (1996) study with pigeons demonstrates that the use of intervening stimuli might increase the likelihood with which large delayed reinforcers are chosen over small immediate reinforcers. Integrating the findings and adapting the methods of such basic research will have implications for applied analyses of delayed reinforcement, including studies described previously on social behavior (Baer et al., 1984; Fowler & Baer, 1981) and adaptive choice making (Dixon et al., 1998; Vollmer et al., 1999).

A number of applied research questions are suggested by basic research on delayed reinforcement. Under what conditions are signals necessary for the maintenance of appropriate alternative behavior? Under what conditions is it necessary to include a DRO component to avoid establishing a response chain that includes aberrant behavior? Under what conditions is delayed reinforcement effective in establishing a functionally equivalent appropriate alternative response to compete with aberrant behavior?


There may be small or large delays that accompany the scheduling of these events, and teachers are faced with the challenge of encouraging students to learn to behave appropriately nonetheless. Likewise, there may be terminal reinforcers that, as mentioned above, are available only probabilistically. Selection of procedures aimed at producing effective behavior change that is maintained over time and generalizes across settings (Stokes & Baer, 1977; and see Baer et al., 1984; Fowler & Baer, 1981) should be informed by an understanding of the basic behavioral processes.

REFERENCES

Ainslie, G. W. (1974). Impulse control in pigeons. Journal of the Experimental Analysis of Behavior, 21, 485–489.

Alsop, B., Stewart, K. E., & Honig, W. K. (1994). Cued and uncued terminal links in concurrent-chains schedules. Journal of the Experimental Analysis of Behavior, 62, 385–397.

Baer, R. A., Williams, J. A., Osnes, P. G., & Stokes, T. F. (1984). Delayed reinforcement as an indiscriminable contingency in verbal/nonverbal correspondence training. Journal of Applied Behavior Analysis, 17, 429–440.

Belke, T. W., & Spetch, M. L. (1994). Choice between reliable and unreliable reinforcement alternatives revisited: Preference for unreliable reinforcement. Journal of the Experimental Analysis of Behavior, 62, 353–366.

Bondy, A. S., & Frost, L. A. (1993). Mands across the water: A report on the application of the picture-exchange communication system in Peru. The Behavior Analyst, 16, 123–128.

Burns, D. J., & Osborne, J. G. (1975). Choice and self-control in children: A test of Rachlin's model. Bulletin of the Psychonomic Society, 5, 156–158.

Carr, E. G., & Durand, V. M. (1985). Reducing behavior problems through functional communication training. Journal of Applied Behavior Analysis, 18, 111–126.

Chelonis, J. J., King, G., Logue, A. W., & Tobin, H. (1994). The effect of variable delays on self-control. Journal of the Experimental Analysis of Behavior, 62, 33–43.

Cicerone, R. A. (1976). Preference for mixed versus constant delay of reinforcement. Journal of the Experimental Analysis of Behavior, 25, 257–261.

Critchfield, T. S., & Lattal, K. A. (1993). Acquisition of a spatially defined operant with delayed reinforcement. Journal of the Experimental Analysis of Behavior, 59, 373–387.

Crum, J., Brown, W. L., & Bitterman, M. E. (1951). The effect of partial and delayed reinforcement on resistance to extinction. American Journal of Psychology, 64, 228–237.

Dixon, M. R., Hayes, L. J., Binder, L. M., Manthey, S., Sigman, C., & Zdanowski, D. M. (1998). Using a self-control training procedure to increase appropriate behavior. Journal of Applied Behavior Analysis, 31, 203–210.

Duncan, B., & Fantino, E. (1972). The psychological distance to reward. Journal of the Experimental Analysis of Behavior, 18, 23–34.

Dunn, R., & Spetch, M. L. (1990). Choice with uncertain outcomes: Conditioned reinforcement effects. Journal of the Experimental Analysis of Behavior, 53, 201–218.

Dunn, R., Williams, B., & Royalty, P. (1987). Devaluation of stimuli contingent on choice: Evidence for conditioned reinforcement. Journal of the Experimental Analysis of Behavior, 48, 117–131.

Durand, V. M., & Carr, E. G. (1991). Functional communication training to reduce challenging behavior: Maintenance and application to new settings. Journal of Applied Behavior Analysis, 24, 251–264.

Eisenberger, R., & Masterson, F. A. (1983). Required high effort increases subsequent persistence and reduces cheating. Journal of Personality and Social Psychology, 44, 593–599.

Eisenberger, R., & Shank, D. M. (1985). Personal work ethic and effort training affect cheating. Journal of Personality and Social Psychology, 49, 520–528.

Fisher, W. W., & Mazur, J. E. (1997). Basic and applied research on choice responding. Journal of Applied Behavior Analysis, 30, 387–410.

Flora, S. R., & Pavlik, W. B. (1992). Human self-control and the density of reinforcement. Journal of the Experimental Analysis of Behavior, 57, 201–208.

Fowler, S. A., & Baer, D. M. (1981). Do I have to be good all day? The timing of delayed reinforcement as a factor in generalization. Journal of Applied Behavior Analysis, 14, 13–24.

Grosch, J., & Neuringer, A. (1981). Self-control in pigeons under the Mischel paradigm. Journal of the Experimental Analysis of Behavior, 35, 3–21.

Hayes, S. C., & Hayes, L. J. (1993). Applied implications of current JEAB research on derived relations and delayed reinforcement. Journal of Applied Behavior Analysis, 26, 507–511.

Horner, R. H., & Day, H. M. (1991). The effects of response efficiency on functionally equivalent, competing behaviors. Journal of Applied Behavior Analysis, 24, 719–732.

Hyten, C., Madden, G. J., & Field, D. P. (1994). Exchange delays and impulsive choice in adult humans. Journal of the Experimental Analysis of Behavior, 62, 225–233.

Jackson, K., & Hackenberg, T. D. (1996). Token reinforcement, choice, and self-control in pigeons. Journal of the Experimental Analysis of Behavior, 66, 29–49.

Jay, A. S., Grote, I., & Baer, D. M. (1999). Teaching participants with developmental disabilities to comply with self-instructions. American Journal on Mental Retardation, 104, 509–522.

Kazdin, A. E. (1982). The token economy: A decade later. Journal of Applied Behavior Analysis, 15, 431–445.

Kazdin, A. E., & Bootzin, R. R. (1972). The token economy: An evaluative review. Journal of Applied Behavior Analysis, 5, 343–372.

Krantz, P. J., MacDuff, M. T., & McClannahan, L. E. (1993). Programming participation in family activities for children with autism: Parents' use of photographic activity schedules. Journal of Applied Behavior Analysis, 26, 137–138.

Lalli, J. S., Casey, S., Goh, H., & Merlino, J. (1994). Treatment of escape-maintained aberrant behavior with escape extinction and predictable routines. Journal of Applied Behavior Analysis, 27, 705–714.

Lalli, J. S., & Mauro, B. C. (1995). The paradox of preference for unreliable reinforcement: The role of context and conditioned reinforcement. Journal of Applied Behavior Analysis, 28, 389–394.

Lattal, K. A. (1984). Signal functions in delayed reinforcement. Journal of the Experimental Analysis of Behavior, 42, 239–253.

Lattal, K. A., & Gleeson, S. (1990). Response acquisition with delayed reinforcement. Journal of Experimental Psychology: Animal Behavior Processes, 16, 27–39.

Lattal, K. A., & Metzger, B. (1994). Response acquisition by Siamese fighting fish (Betta splendens) with delayed visual reinforcement. Journal of the Experimental Analysis of Behavior, 61, 35–44.

Lattal, K. A., & Williams, A. M. (1997). Body weight and response acquisition with delayed reinforcement. Journal of the Experimental Analysis of Behavior, 67, 131–143.

MacDuff, G. S., Krantz, P. J., & McClannahan, L. E. (1993). Teaching children with autism to use photographic activity schedules: Maintenance and generalization of complex response chains. Journal of Applied Behavior Analysis, 26, 89–97.

Mace, F. C., & Roberts, M. L. (1993). Factors affecting selection of behavioral interventions. In J. Reichle & D. P. Wacker (Eds.), Communicative alternatives to challenging behavior: Integrating functional assessment and intervention strategies (pp. 113–133). Baltimore: Paul H. Brookes.

Mazur, J. E. (1998). Choice and self-control. In K. A. Lattal & M. Perone (Eds.), Handbook of research methods in human operant behavior (pp. 131–161). New York: Plenum.

McClannahan, L. E., & Krantz, P. J. (1999). Activity schedules for children with autism: Teaching independent behavior. Bethesda, MD: Woodbine House.

McComas, J. J., & Mace, F. C. (in press). Functional analysis. In E. S. Shapiro & T. Kratochwill (Eds.), Behavioral assessment in schools: Theory, research, and clinical foundations (2nd ed.). New York: Guilford.

McDowell, J. J. (1988). Matching theory in natural human environments. The Behavior Analyst, 11, 95–109.

Neef, N. A., Mace, F. C., & Shade, D. (1993). Impulsivity in students with serious emotional disturbance: The interactive effects of reinforcer rate, delay, and quality. Journal of Applied Behavior Analysis, 26, 37–52.

Neef, N. A., Shade, D., & Miller, M. S. (1994). Assessing influential dimensions of reinforcers on choice in students with serious emotional disturbance. Journal of Applied Behavior Analysis, 27, 575–583.

Nevin, J. A. (1974). Response strength in multiple schedules. Journal of the Experimental Analysis of Behavior, 21, 389–408.

Peck, S. M., Wacker, D. P., Berg, W. K., Cooper, L. J., Brown, K. A., Richman, D., McComas, J. J., Frischmeyer, P., & Millard, T. (1996). Choice-making treatment of young children's severe behavior problems. Journal of Applied Behavior Analysis, 29, 263–290.

Rachlin, H., & Green, L. (1972). Commitment, choice and self-control. Journal of the Experimental Analysis of Behavior, 17, 15–22.

Ragotzy, S. P., Blakely, E., & Poling, A. (1988). Self-control in mentally retarded adolescents: Choice as a function of amount and delay of reinforcement. Journal of the Experimental Analysis of Behavior, 49, 191–199.

Reeve, L., Reeve, K. F., Brown, A. K., Brown, J. L., & Poulson, C. L. (1992). Effects of delayed reinforcement on infant vocalization rate. Journal of the Experimental Analysis of Behavior, 58, 1–8.

Schweitzer, J. B., & Sulzer-Azaroff, B. (1988). Self-control: Teaching tolerance for delay in impulsive children. Journal of the Experimental Analysis of Behavior, 50, 173–186.

Stokes, T. F., & Baer, D. M. (1977). An implicit technology of generalization. Journal of Applied Behavior Analysis, 10, 349–367.

Stromer, R., Mackay, H. A., Howell, S. R., McVay, A. A., & Flusser, D. (1996). Teaching computer-based spelling to individuals with developmental and hearing disabilities: Transfer of stimulus control to writing tasks. Journal of Applied Behavior Analysis, 29, 25–42.

Stromer, R., Mackay, H. A., McVay, A. A., & Fowler, T. (1998). Written lists as mediating stimuli in the matching-to-sample performances of individuals with mental retardation. Journal of Applied Behavior Analysis, 31, 1–19.

Taylor, I., & O'Reilly, M. F. (1997). Toward a functional analysis of private verbal self-regulation. Journal of Applied Behavior Analysis, 30, 43–58.

Tombaugh, T. N. (1966). Resistance to extinction as a function of the interaction between training and extinction delays. Psychological Reports, 19, 791–798.

Vollmer, T. R., Borrero, J. C., Lalli, J. S., & Daniel, D. (1999). Evaluating self-control and impulsivity in children with severe behavior disorders. Journal of Applied Behavior Analysis, 32, 451–466.

Wacker, D. P., Steege, M. W., Northup, J., Sasso, G., Berg, W., Reimers, T. L., Cooper, L., Cigrand, K., & Donn, L. (1990). A component analysis of functional communication training across three topographies of severe behavior problems. Journal of Applied Behavior Analysis, 23, 417–429.

Wilkenfield, J., Nickel, M., Blakely, E., & Poling, A. (1992). Acquisition of lever-press responding in rats with delayed reinforcement: A comparison of three procedures. Journal of the Experimental Analysis of Behavior, 58, 431–443.

Williams, A. M., & Lattal, K. A. (1999). The role of the response-reinforcer relation in delay-of-reinforcement effects. Journal of the Experimental Analysis of Behavior, 71, 187–194.

Williams, B. A. (1976). The effects of unsignaled delayed reinforcement. Journal of the Experimental Analysis of Behavior, 26, 441–449.

Williams, B. A. (1994). Conditioned reinforcement: Experimental and theoretical issues. The Behavior Analyst, 17, 261–285.

Williams, B. A. (1999). Value transmission in discrimination learning involving stimulus chains. Journal of the Experimental Analysis of Behavior, 72, 177–185.

Williams, B. A., & Dunn, R. (1991). Preference for conditioned reinforcement. Journal of the Experimental Analysis of Behavior, 55, 37–46.

Received May 10, 2000
Final acceptance May 15, 2000
Action Editor, F. Charles Mace