
© Andy Johnson, Ph.D.

Minnesota State University, Mankato

www.OPDT-Johnson.com

Chapter 11

Operant Conditioning: Cats, Mice, and Dancing Chickens

OPERANT CONDITIONING

Classical conditioning is different from operant conditioning. The former involves an organism

(human, rat, wombat, etc.) that is passive, simply responding to stimuli presented to it. Operant conditioning,

however, involves an organism that must first act upon (or operate on) the environment in some way. As the

organism acts, those acts (or behaviors) that are followed by pleasurable outcomes (mouse pellets, praise, or

money) are reinforced and tend to be repeated. Those acts that are followed by punishing outcomes (electric

shock, yelling, imprisonment, or embarrassment) tend not to be repeated. Put another way, humans (and other

organisms) learn certain behaviors as they act and are rewarded or punished. Unlike classical conditioning,

operant conditioning is not concerned with simply pairing a stimulus and response (S-R); rather, it focuses on

A-B-C: The antecedent (the conditions before the behavior), the behavior, and the consequences (what followed

the behavior).

Edward Lee Thorndike

Edward Lee Thorndike’s (1874-1949) theories of learning are sometimes called connectionism. Unlike

Watson, Thorndike acknowledged the existence of thought, which he called mental units. A mental unit was

anything sensed or perceived or the sensing, perceiving bit of consciousness. A physical unit was a stimulus or

response (observable behavior). Learning for Thorndike was a matter of making four kinds of connections: (a)

mental and physical units, (b) physical units with mental, (c) mental units with other mental units, and (d)

physical units with other physical units. His experiments looked for those things that strengthened these

connections.

Thorndike’s Hungry Cats

Thorndike’s learning theories came from his study of cats in a puzzle box (Lattal, 1998). Here a hungry

cat was put in a box. On the outside of the box was a fish that the cat could see and smell. The box had a door

that could be opened by pressing a lever inside the cage (see Figure 11.1). To illustrate the relationship between

the antecedent, behavior, and consequence: the antecedent was the hungry cat, box, lever, and fish. Sensing the

fish, the cat would engage in a variety of behaviors in an attempt to open the door and get the fish. Eventually one

of these behaviors (pressing the lever) would result in the door opening and the cat getting the fish. The

consequence then was the open door and the fish (reward).

Figure 11.1. Thorndike’s puzzle box and cat


Learning for the hungry cat was a matter of making the connection between lever pressing and

door-opening/fish-eating. This learning was incremental, not insightful (see Figure 11.2). This means that the cat was

not able to gain sudden insight or make a logical connection between lever pressing and door-opening/fish-eating.

Instead, the cat made small incremental gains toward the lever-open door connection. Each time the cat

was put in the puzzle box, it took successively fewer tries for it to make this connection. Finally, after

many times in the puzzle box, the cat eventually would go directly to the lever. This is called trial and error

learning or selecting and connecting. A behavior was selected (lever-pressing) and a connection was eventually

made and strengthened with the door-opening consequence.
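
The incremental pattern described above can be illustrated with a small sketch. It is not Thorndike's own model; it simply assumes that each behavior has a numeric "connection strength," that the cat picks a behavior with probability proportional to that strength, and that each rewarded trial adds a fixed amount to the lever-pressing strength. The function names, the starting strengths, and the increment are all hypothetical.

```python
def expected_tries(lever_strength, other_strengths):
    """Expected number of attempts up to and including the lever press,
    assuming each attempt picks a behavior in proportion to its strength."""
    total = lever_strength + sum(other_strengths)
    return total / lever_strength


def run_trials(n_trials, increment=1.0):
    """Strengthen the lever-pressing connection a little after each rewarded trial."""
    lever = 1.0                      # hypothetical starting strength
    others = [1.0, 1.0, 1.0, 1.0]    # e.g., clawing, meowing, pacing, biting
    history = []
    for _ in range(n_trials):
        history.append(expected_tries(lever, others))
        lever += increment           # the reward strengthens the connection
    return history


tries = run_trials(5)
print(tries)
```

Under these assumptions the expected number of tries falls a little on every trial rather than dropping all at once, which is the shape of the incremental curve.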

Figure 11.2. Incremental vs. insightful learning. [Two panels, Incremental Learning and Insightful Learning, each plotting the number of tries to earn a reward against time.]

Laws of Learning

Based on his experiments, Thorndike came up with three laws of learning.

• Law of effect – The strength of a connection is influenced by the consequences of a response. In other

words, an action followed by a pleasurable consequence is more likely to be repeated. Inversely, an action

followed by an annoying or painful consequence is less likely to be repeated. Put simply, actions that are

rewarded tend to be strengthened and repeated, those that are punished tend to be weakened and not repeated.

As a human example, if little Billy tells a joke and is rewarded by laughter and attention (something he

enjoys), he will probably tell a joke at some point in the future. Learning is a function of the consequences of

behaviors rather than contiguity (two behaviors occurring simultaneously). If little Billy were to tell the same joke

and instead of laughing, everybody turned away in disgust, he would be less likely to tell a joke in the future.

By the way, Thorndike found that pleasure was more potent than pain for stamping out a response. If you want a

negative behavior to go away, it is more effective to reward a conflicting positive behavior than to simply

punish the negative behavior. (Reward students for doing the positive things that would make it impossible for

them to do the negative things.)


• Law of exercise – The more a stimulus-induced response is repeated, the longer it will be retained.

Another way of saying this is that connections between a stimulus and response become strengthened with

practice and weakened when practice is discontinued. The more often the cat is put in the puzzle box to make

the connection between lever and gate opening, the longer this behavior will be retained. However, if the cat

were only put in the puzzle box once every other week, the learning it had gained would quickly recede. That is,

the number of tries and the amount of time it took to press the lever would increase.

As a human example, if little Billy were in a situation where he was able to engage in joke-telling behavior

every day and got lots of laughter and positive attention, he would be well on his way toward becoming a

comedian. If he were in a situation where he was only able to tell an occasional joke (a very restrictive or

repressive environment), he might be well on his way toward becoming an accountant.

• Law of readiness – When a human is ready to act, it is reinforcing for it to do so and annoying for it

not to do so. When a human is not ready to act, forcing it to do so is annoying.

When Billy is in a jovial mood where he wants to tell a joke, joke telling is reinforcing in and of itself.

Not being able to tell a joke is painful. In the same vein, if Billy were forced to tell jokes when he did not want

to do so, joke telling would be painful or annoying. The same behavior (joke telling) can be either reinforcing

or annoying depending on the antecedent or condition of little Billy.

The law of readiness has great implications for holistic educators. One of the important ideas that

inform our practice is that learning should be natural. That is, we should, to the greatest extent possible, create

learning experiences that align with students’ natural ways of interacting with the world. When students are

curious and want to learn, to do so is reinforcing. When they are not ready to learn, being made to do so is

painful. What are the ramifications? We should strive to include some open-ended experiences and choice in

our curriculums so that students can both discover and explore topics that are of interest to them (again, to the

greatest extent possible). Also, we should try to create personal connections to the curriculum; teaching content

and skills that have real life applications and implications.

When a child is ready to learn, being able to do so is reinforcing. This also suggests that motivation is

an important aspect of learning (described in Book II) and should be given more consideration than is currently

the case in most school settings. Learning that is based on students’ intrinsic desire to learn and find out about

themselves and the world in which they live is more apt to create powerful learning experiences. What are your

students curious about? What do they want to learn? What concerns do they have in their lives? What would

they like to be able to do? How would they like to learn? Why not ask them? This could be the start of some

real learning. This does not mean, however, that you need to abandon your curriculum or ignore the content

standards that have been assigned to you (usually in the form of top-down mandates). This instead is an

invitation to adopt and adapt the curriculum to meet the needs and interests of your students. This, by the way,

is what makes teaching exciting and interesting and keeps so many excellent teachers coming back every year.

Teaching is a creative, intellectual endeavor when the teacher is allowed to make choices in regard to what and

how to teach. However, simply opening the teachers’ manual and replicating what it tells you is extremely

boring and not nearly as effective.

Burrhus Frederic Skinner

B.F. Skinner (1904-1990) studied how organisms learn and also how behavior could be controlled. His

theories emphasize the effects of a response on the response itself. Skinner thought that most animal and

human behavior is controlled by the events that precede the behavior (antecedents) and also those that follow

(consequences) the behavior (see Figure 11.3). In general, the antecedent tells a person what to do and the

consequence either strengthens or weakens the behavior.


Figure 11.3. Reinforcement used to strengthen behaviors.

ANTECEDENT → BEHAVIOR → CONSEQUENCE [reinforcer] → EFFECT [strengthened or repeated behavior]

ANTECEDENT → BEHAVIOR → CONSEQUENCE [punisher] → EFFECT [weakened or decreased behavior]

Skinner’s Very Smart Mice

Skinner’s early work involved the use of a mechanism that has come to be known as a Skinner box.

This is a small cage that usually has a light, a lever, and a food cup (see Figure 11.4). Some Skinner boxes are

also designed to deliver electric shocks via the grid that makes up the floor. Skinner discovered that mice could

be taught to perform specific behaviors through the use of shaping and reinforcement (described below). He

taught his mice to press the lever (behavior) every time a green light (antecedent) appeared. This lever-pressing

behavior would result in a mouse pellet appearing in the food cup (consequence). The mouse’s behavior was

modified or changed, hence the term, behavior modification. Variations of behavior modification techniques

are currently used in a variety of classrooms, teaching situations, and parenting situations.

Figure 11.4. Skinner box.

Reinforcing and Punishing Behaviors

So how do you teach a mouse to press levers? And how might you use behavior modification to shape


human behaviors? Simply define a desired behavior (or an approximation of that behavior) and reward the

organism every time it appears. Or define an undesired behavior and punish the organism every time it appears.

• Reinforcement. A reinforcer is any consequence that increases the likelihood that a behavior will

occur again. Reinforcement is the process of attaching reinforcers to certain behaviors. There are two types of

reinforcement: The first is positive reinforcement. This is a reward or a pleasurable thing that is attached to a

behavior. For a mouse, positive reinforcement would be the tasty mouse pellet. For a human, positive

reinforcement could be money, attention, candy, recognition, an early recess, a grade in an educational

psychology course, book contracts, or other types of earned rewards.

The second type of reinforcement is negative reinforcement (this term is often confused with

punishment). Negative reinforcement is when the removal of an annoying or painful condition is attached to a

behavior. In the case of the mouse, if a mild electric shock were sent through the floor, negative reinforcement

would be the removal of the shock that would occur by pressing the lever. In the case of a human, negative

reinforcement could be removal of an unpleasant situation, such as being allowed to go outside once homework

was completed or being allowed to join the group once a student stopped making rude remarks.

• Punishment. What about punishment? Punishment is a consequence that decreases or suppresses

behavior. Behaviors followed by a punisher are less likely to be repeated. (We will consider the limitations of

punishment below.) There are two types of punishment. Type I punishment or presentation punishment is

when an aversive stimulus (something annoying or unpleasant) follows a behavior. With mice an example of an

aversive stimulus would be an electric shock. An aversive stimulus such as this would make it very likely that

the behavior that preceded the shock would not be repeated. With humans, examples of aversive stimuli

(not always presented intentionally) would be things such as humiliation, frustration, spanking, hurtful words (none of

these are recommended), or time-outs.

Type II punishment or removal punishment is when a rewarding stimulus is taken away following a

behavior. This would be when a behavior results in the removal of a pleasant or reinforcing stimulus. A mouse

example would be if pressing the lever resulted in its food being removed. A human example would be if a

child was asked to go inside as a result of a negative or unwanted playground behavior. A word of caution

about punishment: its effectiveness is very limited in modifying behavior. With both mice and humans, as soon

as the punishment or threat of punishment disappears the behavior reappears. When it is used as the sole means

of modifying behavior mice, humans, and other organisms simply learn to avoid punishment; they do not learn

the correct behavior (more about this below).

For reinforcement or punishment to be effective, it should occur immediately after the behavior. For

example, to reinforce lever-pressing in a mouse the food pellet needs to appear immediately upon pressing the

lever. To reinforce hand-raising in children (instead of shouting out answers), some sort of reinforcement needs

to be provided immediately after the behavior such as, “I like the way Pat has her hand raised. Nice job, Pat. I

can call on you.” One thing to keep in mind with reinforcement and human beings is that what may be

reinforcing to one may not be reinforcing to another. We are not standardized products. While some children

crave attention and are reinforced by it, others do not. The key in using reinforcement effectively is in knowing

your students and using what they naturally like to do to reinforce and shape the positive behaviors you would

like to see.
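
The four kinds of consequences described in this section differ along two dimensions: whether the stimulus is pleasant or aversive, and whether it is presented or removed after the behavior. The lookup table below is one way to keep them straight. The function names and the string labels are illustrative choices, not standard terminology beyond the four terms used in the text.

```python
def classify_consequence(stimulus, change):
    """Name the consequence type.

    stimulus: 'pleasant' or 'aversive'
    change:   'presented' or 'removed' (what happens after the behavior)
    """
    table = {
        ("pleasant", "presented"): "positive reinforcement",
        ("aversive", "removed"): "negative reinforcement",
        ("aversive", "presented"): "Type I (presentation) punishment",
        ("pleasant", "removed"): "Type II (removal) punishment",
    }
    return table[(stimulus, change)]


def expected_effect(label):
    """Reinforcement strengthens a behavior; punishment weakens it."""
    if "reinforcement" in label:
        return "behavior strengthened"
    return "behavior weakened"


# Mouse examples from the text: a pellet appears, or a shock is switched off.
for stimulus, change in [("pleasant", "presented"), ("aversive", "removed")]:
    label = classify_consequence(stimulus, change)
    print(stimulus, change, "->", label, "->", expected_effect(label))
```

Note that negative reinforcement lands in the "strengthened" column, which is why it should not be confused with punishment.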


A Policy of Getting Tough

In schools some advocate a policy of getting tough with kids as the sole way of dealing with behavior

problems. (“In my day if we acted out, Old Man Nelson would … [insert a pain-inflicting practice here]…”) On

a societal level some also advocate a similarly simplistic notion of getting tough on criminals as the sole way of

making crime disappear. In both cases the goal is to manage (or control) behavior only by making bad things

happen to humans (students and criminals). The thinking behind this (if you can call it thinking) is that if

enough bad things happen to badly behaved humans (students or criminals) they will “learn”. As the old saying

goes, “That will teach them.” Unfortunately it does not. There is very little learning in these situations. A body

of research has shown this to be the case in dealing with both mice and humans.

Punishment should never be used as the sole means of modifying behavior. Why not? Three reasons

are described here: First, some students (and criminals) have had lifetimes filled with bad things happening.

Merely making one more bad thing happen in a lifetime of bad things has little impact. Second, simply making

bad things happen for problem behaviors in a school setting results in a bit of classical conditioning whereby the

school becomes paired with the aversive conditioner (the bad thing). And third, and most important, by relying

solely on punishment, behavior becomes controlled by external stimuli and is never internalized. And while there

may be some initial short term improvement, it does little to teach the correct behavior or to solve the problem.

As stated above, as soon as the punishment or the threat of punishment disappears, the problem behavior

reappears.

This does not mean to imply, however, that there is not an appropriate place for mice, students, and

criminals to experience the logical consequence of their negative behavior. What it does suggest is that

punishment should not be used as the sole means of modifying behavior. If it is used, it should be used in a

very limited fashion and in combination with reinforcement that both teaches and rewards positive behaviors.

Also, when dealing with mice and humans, the antecedent should always be considered as well. That is, what

are the conditions that foster negative behavior? What can be done to mitigate these circumstances? One of the

best ways to manage negative behaviors is to prevent them from occurring in the first place. This is not an

attempt to excuse negative behavior; rather, it seeks to make a case for understanding some of the forces that

may have contributed to it in order to reduce or prevent future behaviors.

Dancing Chickens and Shaping Behaviors

How do you teach a chicken to dance? And what do dancing chickens have to do with human beings in

a classroom? Dancing chickens effectively illustrate the concept of shaping. Shaping is when a teacher,

parent, or chicken trainer rewards small steps toward a desired behavior initially, with following rewards

coming only after the behaviors become closer to the desired behavior. Put another way, if you wanted to

reinforce a behavior that the child (or chicken) does not initially display you would first reward successive

approximations of that desired behavior until the full behavior appears.


To illustrate exactly how shaping would be used to teach a chicken to dance: First the chicken would be

rewarded for simply making a slight turn to the left. When this behavior was learned, the chicken would then be

rewarded for making a half turn to the left. When this behavior was learned, it would be rewarded for turning a full

circle. Next, it would be turning a full circle and bobbing, and on and on until the chicken eventually displays the

desired dancing behaviors (see Figure 11.5).

Figure 11.5. Steps used in shaping the behavior of a dancing chicken.

1. Slight turn.

2. Half turn.

3. Full circle.

4. Full circle and bob.

5. Full circle, bob, and step.

6. Full circle, bob, step, and flap.
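
The shaping procedure in Figure 11.5 can be sketched as a simple loop: reward any attempt that meets the current criterion, and raise the criterion once the current step seems to be learned. The text does not say how to decide that a step has been learned, so the rule used here (three rewarded attempts in a row) is purely an assumption, as are the function and variable names.

```python
CHICKEN_STEPS = [
    "slight turn",
    "half turn",
    "full circle",
    "full circle and bob",
    "full circle, bob, and step",
    "full circle, bob, step, and flap",
]


def shape(observed_levels, n_steps, successes_to_advance=3):
    """Reward successive approximations of a target behavior.

    observed_levels: for each attempt, the index of the highest step
                     the behavior reached (0 = first step).
    Returns (log, criterion), where log holds one (rewarded, criterion)
    pair per attempt and criterion is the step required at the end.
    """
    criterion = 0   # index of the step currently required for a reward
    streak = 0      # consecutive rewarded attempts at this criterion
    log = []
    for level in observed_levels:
        rewarded = level >= criterion
        log.append((rewarded, criterion))
        if rewarded:
            streak += 1
            # Assumed rule: the step counts as "learned" after a short streak.
            if streak >= successes_to_advance and criterion < n_steps - 1:
                criterion += 1
                streak = 0
        else:
            streak = 0
    return log, criterion


attempts = [0, 0, 0, 0, 1, 1, 1, 2]
log, final_criterion = shape(attempts, len(CHICKEN_STEPS))
for (rewarded, criterion), level in zip(log, attempts):
    print(CHICKEN_STEPS[level], "| required:", CHICKEN_STEPS[criterion], "| rewarded:", rewarded)
```

In this run the fourth attempt (a slight turn) goes unrewarded because the criterion has already moved up to a half turn, which is the defining feature of shaping.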

To illustrate how shaping might be used with a human: let us say that 7th grade Sally talks frequently

during her language arts class and often daydreams. The desired behavior is for Sally to be attentive (looking

at the teacher or front of the class) and quiet (raising her hand to speak). To reinforce behaviors, her teacher

would use some sort of token as a reward. At first, Sally would be rewarded for being quiet and looking at the

teacher when the class starts. Then she would be rewarded for being attentive and quiet every ten minutes.

Finally she would be rewarded for being attentive and quiet for the entire class (see Figure 11.6). This shaping

would take place over two to four weeks.

Figure 11.6. Steps used in shaping the behavior of 7th Grade Sally.

1. Attentive and quiet when class starts.

2. Attentive and quiet every 10 minutes.

3. Attentive and quiet every 20 minutes.

4. Attentive and quiet for the entire lecture.

In this same class, the instructor usually allows class time for students to work on their writing

assignments; however, 7th grade Sidney does not use this writing time productively. He usually jokes with his

neighbors, stares off into space, draws pictures on his paper, or engages in a variety of other behaviors to avoid

writing. The desired behavior would be for Sidney to begin working on his writing assignment without delay

and to stay engaged. Using a token system, Sidney would first be rewarded for getting his paper out and putting

his name on it within two minutes. After this behavior occurred on a regular basis, his reward would then come

after completing the first step and then identifying a writing topic. Then he would be rewarded for completing

two draft paragraphs. Next, he would be rewarded for being engaged in productive writing behaviors for half

the writing time. Finally, he would be rewarded for being engaged during the entire writing time (see Figure

11.7).

Figure 11.7. Steps used in shaping the behavior of 7th Grade Sidney.

1. Gets materials out, puts name on paper.

2. Identify a topic.

3. Engage in a pre-writing activity.

4. Complete two draft paragraphs.

5. Engaged in productive writing behaviors for half the work time.


6. Engaged in productive writing behaviors for the entire work time.

Schedules of Reinforcement

Schedules of reinforcement pertain to how, when, and how often reinforcement is given. The schedule

of reinforcement determines how quickly a behavior is learned and how long it lasts once the reinforcement

disappears (see Figure 11.8). In this order: VR, FR, VI, FI, CR

Figure 11.8. Line graph showing schedules of reinforcement.

• Continuous reinforcement (CR). Reinforcement is given after every behavioral

response. Behaviors are learned very quickly here; however, there is little persistence. That is, once the

reinforcement stops, the behavior quickly stops.

Bobbi Jo Example – Bobbi Jo is in 2nd grade. When he comes to school in the morning and after every

recess he throws his coat on the floor and leaves his boots in the middle of the floor. His teacher wants him to

hang up his coat in the closet and put his boots, side-by-side, in the boot rack. Continuous reinforcement

would be to reward him each time he does this. He would learn very quickly; however, as soon as the rewards

disappear he would quickly go back to his old ways. Also, with continuous reinforcement, the effect wears off.

If he got a sticker, token, or prize every morning and after every recess break, he would quickly tire of it.

• Fixed-interval reinforcement (FI). Reinforcement is given after specific time increments or intervals. For example,

a reward would be given every 10 minutes if a behavior were present. Other real life examples would be the

weekly quiz to reinforce reading, or a paycheck every two weeks. With fixed-interval reinforcement the

behavioral response rates increase as the time for the reinforcement nears, but then drop off soon after the

reinforcement. Also, there is little persistence once the reinforcement stops.

Bobbi Jo Example – Instead of reinforcing Bobbi Jo after every coat-hanging incident, he would get a

reward at the end of the day or end of the week. When it is time for Bobbi Jo’s reward we see him much more

attentive to coat-hanging behaviors, but soon after, his attention lapses.

• Variable-interval reinforcement (VI). Reinforcement occurs after the first behavioral response then

it is given after varying lengths of time. Examples of this would be pop quizzes that might be given every

couple of days or every couple of weeks or mouse pellets that appear every couple minutes or every few

seconds. The interval between reinforcement varies. This type of reinforcement results in a slow, steady rate of

learning with no pause after reinforcement. There is more persistence after reinforcement stops. That is, after

the reinforcement stops there is a slow, steady decline of the operant behavior.

Bobbi Jo Example – Bobbi Jo is reinforced after the first couple coat-hanging incidents, then his reward

comes at different intervals; sometimes twice a day, sometimes every two days. Bobbi Jo would be slower to

respond to this type of reinforcement; however, there would be very little decline of coat-hanging behaviors

after he earned his reward and a little greater persistence once the rewards were discontinued.

• Fixed-ratio reinforcement (FR). Reinforcement occurs after a set number of behavioral responses.

Examples: after three bar-pressings the mouse gets a pellet; you get a reward for raising your hand three times in a

row; you get allowance money after washing the dishes three times. This results in a fairly rapid increase in


behaviors; however, there is little persistence. When the expected reward does not occur after the defined

number of responses there is a fairly rapid drop in behavior. Also, there is a slight pause or decrease in

response behavior after the reinforcement has been administered.

Bobbi Jo Example – Bobbi Jo is reinforced after every three coat-hanging episodes. Bobbi Jo would

learn fairly quickly here. There would be a slight lapse in attention after he earned his reward; however, his

coat-hanging behaviors would continue longer after the rewards were discontinued.

• Variable-ratio reinforcement (VR). Reinforcement occurs after a varying number of behavioral

responses. This is the most powerful type of reinforcement schedule for learning and maintaining behaviors. A

slot machine best illustrates this type of reinforcement. Sometimes a payout comes at short intervals,

sometimes longer intervals. It is unpredictable and thus, keeps people coming back. What makes a slot

machine even more reinforcing is that the amount or type of reward also varies. Often it is very small,

occasionally it is a little bigger, and on rare occasions there is a huge payout. Variable-ratio reinforcement

results in very high behavioral response rates initially, there is little pause after reinforcement, and the greatest

persistence of all the reinforcement schedules. That is, once the reinforcement is discontinued, the response

behaviors continue the longest and have the slowest amount of decline with this schedule of reinforcement.
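
The five schedules above differ only in the rule that decides which responses earn a reinforcer. The sketch below expresses that rule in code. It makes two simplifying assumptions that are not in the text: "variable" schedules are represented by a fixed, repeating list of requirements rather than random ones (so the output is reproducible), and interval schedules reinforce the first response made once the interval has elapsed. The function names are hypothetical.

```python
from itertools import cycle


def ratio_schedule(n_responses, requirements):
    """Mark which responses are reinforced under a ratio schedule.

    requirements: responses needed for each successive reinforcer, repeated
    in a cycle. [1] is continuous reinforcement, [3] is fixed-ratio 3, and
    a mixed list such as [2, 1, 4] stands in for a variable-ratio schedule.
    """
    needed = cycle(requirements)
    target = next(needed)
    count = 0
    reinforced = []
    for _ in range(n_responses):
        count += 1
        if count >= target:
            reinforced.append(True)
            count = 0
            target = next(needed)
        else:
            reinforced.append(False)
    return reinforced


def interval_schedule(response_times, intervals):
    """Mark which responses are reinforced under an interval schedule.

    response_times: ascending times at which responses occur.
    intervals: waiting time before each successive reinforcer becomes
    available, repeated in a cycle. [10] is fixed-interval 10; a mixed
    list such as [3, 10] stands in for a variable-interval schedule.
    """
    waits = cycle(intervals)
    available_at = next(waits)
    reinforced = []
    for t in response_times:
        if t >= available_at:
            reinforced.append(True)
            available_at = t + next(waits)
        else:
            reinforced.append(False)
    return reinforced


print("CR:", ratio_schedule(6, [1]))
print("FR:", ratio_schedule(6, [3]))
print("VR:", ratio_schedule(7, [2, 1, 4]))
print("FI:", interval_schedule([1, 5, 10, 12, 19, 21], [10]))
print("VI:", interval_schedule([1, 4, 6, 7, 20], [3, 10]))
```

The sketch shows only when reinforcers are delivered. It does not model how response rates or persistence change under each schedule, which is what Figure 11.8 describes.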

Implications

So what do schedules of reinforcement have to do with teaching in your future classrooms? Answer: If

you wish to see a new behavior appear or an old behavior disappear, it is best to use a continuous reinforcement

schedule initially but soon after move to a variable-ratio reinforcement schedule to maintain the behaviors. As

the behaviors take shape you should continue to use a variable-ratio schedule but the reinforcement should

slowly become more spread out and eventually disappear altogether.
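
One way to picture this advice is as a plan that lists, phase by phase, the average number of responses required per reinforcer. The sketch below is only an illustration: the doubling rule, the number of phases, and the function name are assumptions, since the text does not specify how quickly reinforcement should be thinned.

```python
def thinning_plan(n_phases, growth=2):
    """Average responses per reinforcer for each phase of a program.

    The first phase is continuous reinforcement (1 response per reinforcer),
    the middle phases are variable-ratio schedules with a growing average,
    and the last phase is None, meaning reinforcement has been withdrawn.
    """
    plan = [1]
    for _ in range(n_phases - 2):
        plan.append(plan[-1] * growth)
    plan.append(None)
    return plan


for phase, ratio in enumerate(thinning_plan(5), start=1):
    if ratio is None:
        print("Phase", phase, "- no reinforcement")
    elif ratio == 1:
        print("Phase", phase, "- continuous reinforcement")
    else:
        print("Phase", phase, "- variable ratio, about 1 reinforcer per", ratio, "responses")
```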

Terms and Concepts Related to Operant Conditioning

The following terms and concepts are also related to operant conditioning:

• Trial and error learning – This is the act of trying a number of different responses in problem solving until a solution is found. When first confronted with a problem, an organism will engage in multiple responses until one is found to work. With each successive attempt to solve the same problem, fewer attempts are needed before an answer is found.

• Incremental learning – This is learning that occurs a little bit at a time rather than all at once (see Figure 7.2). With each successive problem, the number of attempts needed to arrive at the solution diminishes. Thorndike believed that all learning was incremental, not insightful.

• Insightful learning – This is learning that occurs all at once. Using logic and human reasoning, we

can put things together and instantaneously make the S-R connections.

• Connectionism – This is the term used to describe Thorndike’s explanation of learning. He assumed

learning involved the strengthening of neural bonds (connections) between stimulating conditions and the

response to them.

• Transfer – This is when learning that occurs in one situation is applied in a different situation. The amount of transfer is determined by the number of common elements in the two situations. As the number of common elements goes up, the amount of transfer between the two also goes up.

• Response by analogy – This is when an organism responds to an unfamiliar situation as it would to a similar, known situation. The response to the unfamiliar situation is determined by the number of common elements in the two situations.

• Law of conditioning – Skinner proposed that a response followed by a reinforcing stimulus is

strengthened and more likely to occur again.

• Law of extinction – Skinner also proposed that a response that is not followed by a reinforcing

stimulus is weakened and less likely to occur again.

• Premack Principle – This is using something students naturally like to do to reinforce something they do not like to do. For example, if a child likes to play a computer game, this could be used to reinforce something the child may not like to do, such as homework. After completing a certain amount of homework, the child would then be allowed to play on the computer.

BEHAVIOR MODIFICATION STRATEGIES

As described in the previous chapter, behavior modification strategies involve rewarding behaviors that you wish to increase and ignoring, or pairing with an aversive condition, behaviors that you wish to decrease. There are a variety of behavior modification strategies of varying levels of complexity that can be used in a classroom. Described here are three fairly simple yet effective ones: contingency contracts, token economies, and analysis of reinforcers (AR).

A Contingency Contract

A contingency contract is a form of behavior modification that is effective in focusing on one or two

specific behaviors that you wish to increase or decrease. Contingency contracts involve an “if/then” or

“when/then” situation where receiving a reward or a privilege of some sort is contingent on a student’s

behavior. The contract provides a visual record and an external reminder for students. It also provides

immediate feedback and holds students accountable for their behaviors.

Ms. Finley, a 3rd grade teacher, used the contingency contract below with one of her students, Patty.

Patty had trouble on the playground with other children, getting into fights and not being considerate. She also

sometimes forgot classroom rules and acted out in class or did not complete her work. At conference time, Ms.

Finley talked with Patty’s parents and introduced the idea of a contingency contract. Ms. Finley identified two

behaviors that she wanted to increase: (a) being considerate of others on the playground, in gym class, and in

the hallway, and (b) following classroom rules. With the approval of Patty’s parents, Ms. Finley started using a

contingency contract (Figure 11.9). Here she gave Patty a rating on each behavior in the morning and in the

afternoon. At the end of the week, Patty would take the contract home along with Ms. Finley’s comments

written on the back. If Patty had a score of 20 or better she would be allowed to watch TV that night. If she had

a score of 25 or better she would be able to select a movie to watch at the video store.

(When using contingency contracts it is best to start with low criteria to ensure early success, then raise the

criteria gradually. This is also a form of shaping.)

Figure 11.9. Contingency contract.

                                    Monday    Tuesday   Wednesday   Thursday   Friday
a.m. being considerate of others    ____      ____      ____        ____       ____
a.m. follows classroom rules        ____      ____      ____        ____       ____
p.m. being considerate of others    ____      ____      ____        ____       ____
p.m. follows classroom rules        ____      ____      ____        ____       ____

Key: 2 = good job; 1 = okay; 0 = let’s try again.

This contract was taped inside of Patty’s desk. Ms. Finley was able to give Patty fairly immediate

feedback on her behavior twice a day. At the end of the week, Patty brought her contract to share with the

principal or guidance counselor. This allowed Patty to get recognition for positive behavior or to explain where

she needed to do better next time. After three weeks, Ms. Finley was able to raise the criteria slightly. After two

weeks she moved to giving feedback just once a day, and eventually the contract was discontinued. The

contingency contract also had the unanticipated benefit of improving Patty’s mathematical abilities as she

counted and added her totals from day to day.
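The weekly arithmetic behind Patty’s contract can be sketched as follows. This is a minimal illustration, not part of Ms. Finley’s actual procedure: the function and variable names are hypothetical, and the sketch assumes that reaching the higher threshold earns the movie in addition to TV, which the description leaves open. With four rated rows, five days, and a top rating of 2, the highest possible weekly score is 40.

```python
DAYS = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
BEHAVIORS = (
    "a.m. being considerate of others",
    "a.m. follows classroom rules",
    "p.m. being considerate of others",
    "p.m. follows classroom rules",
)
VALID_RATINGS = {0, 1, 2}  # 2 = good job; 1 = okay; 0 = let's try again
MAX_SCORE = 2 * len(BEHAVIORS) * len(DAYS)  # 40 points in a full week


def weekly_total(ratings):
    """Add up one week of ratings.

    `ratings` maps each behavior row to a list of five daily ratings.
    """
    total = 0
    for behavior in BEHAVIORS:
        daily = ratings[behavior]
        if len(daily) != len(DAYS):
            raise ValueError(f"expected {len(DAYS)} ratings for {behavior!r}")
        for rating in daily:
            if rating not in VALID_RATINGS:
                raise ValueError(f"invalid rating: {rating!r}")
            total += rating
    return total


def rewards_earned(total, tv_threshold=20, movie_threshold=25):
    """Return the rewards earned for a weekly total.

    Assumption: reaching the higher threshold earns the movie in
    addition to TV; the thresholds default to the values in the example.
    """
    earned = []
    if total >= tv_threshold:
        earned.append("watch TV")
    if total >= movie_threshold:
        earned.append("select a movie")
    return earned


if __name__ == "__main__":
    week = {behavior: [1, 1, 1, 1, 1] for behavior in BEHAVIORS}
    total = weekly_total(week)
    print(total, "of", MAX_SCORE, "->", rewards_earned(total))
```

Keeping the thresholds as parameters mirrors the advice to start with low criteria and raise them gradually: the teacher changes the numbers, not the procedure.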

Token Economy

A token economy is a system of behavior modification whereby students are able to earn a token, such as a chip, a star, a check mark, or some sort of artificial money, for outstanding performance in academics, social skills, or friendship behaviors. Students are then able to use their tokens to buy some sort of reward or privilege such as eating lunch with the teacher, extra reading time, or prizes.
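The earn-and-spend bookkeeping of a token economy can be sketched as a small ledger. This is only one possible arrangement: the class name, the student name, and the token prices below are hypothetical and are not taken from the chapter.

```python
class TokenEconomy:
    """A minimal ledger: students earn tokens and spend them on rewards."""

    def __init__(self, prices):
        # prices maps a reward or privilege to its cost in tokens
        self.prices = dict(prices)
        self.balances = {}

    def award(self, student, tokens=1):
        """Give a student tokens for a target behavior."""
        if tokens <= 0:
            raise ValueError("tokens must be positive")
        self.balances[student] = self.balances.get(student, 0) + tokens

    def redeem(self, student, reward):
        """Spend tokens on a reward; return True if the student could afford it."""
        price = self.prices[reward]
        balance = self.balances.get(student, 0)
        if balance < price:
            return False  # not enough tokens yet, balance is unchanged
        self.balances[student] = balance - price
        return True


if __name__ == "__main__":
    # Hypothetical prices, chosen only for illustration.
    economy = TokenEconomy({"extra reading time": 3, "lunch with the teacher": 5})
    economy.award("Sam", 2)
    print(economy.redeem("Sam", "extra reading time"))  # too few tokens so far
    economy.award("Sam", 2)
    print(economy.redeem("Sam", "extra reading time"))
    print(economy.balances["Sam"])
```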

Analysis of Reinforcers

Analysis of reinforcers (AR) involves analyzing a behavior to determine what is reinforcing it, and then either eliminating that reinforcer or strengthening a positive behavior by reinforcing it. For example: Mary calls out rude remarks in Ms. Hill’s 6th grade social studies class. This behavior appears to be reinforced by the reaction she gets from Ms. Hill and from peers. After an analysis of the reinforcers, Ms. Hill decides she can remove the reinforcement of this behavior by reacting calmly to Mary’s remarks and by removing Mary from the immediate presence of peers, seating her at a table in the back of the room. If the behavior continues, Ms. Hill may have to

pair the behavior with an aversive conditioner. The aversive conditioners that Ms. Hill uses are asking Mary to

talk with her at noon (thus removing her from her social group), or asking her to leave the room and spend time

in the principal’s office. In both situations, Ms. Hill does not have to raise her voice or react in anger. She does

not have to create a negative environment for other students. She is in full control and is focusing on the

behavior, not the student.

You must be careful in the use of aversive conditioners. Avoid using them as the sole method of

modifying behavior, for a variety of ethical reasons. Also, aversive conditioners are not effective in shaping

behavior, as they do not teach or reinforce the correct behavior.

In another example: In Mr. Jorgensen’s 4th grade class, Robert has been struggling with learning certain friendship behaviors. Mr. Jorgensen knows that when students are learning new behaviors he needs to reinforce them

anytime he sees semblances of them. This reinforcer can be fairly quiet and simple. For example, when he sees

Robert remembering to use one of the friendship behaviors, he quietly says, “Robert, I really like how you

remembered to listen and let people talk in your small group today. Nice job.” This reinforcer can also be made

public. For example, after seeing Robert work well in a cooperative group, Mr. Jorgensen said, “Robert, you

really worked well with your cooperative learning group today. I am going to let your group have first choice of

activities during free time today.”

CONCLUDING THOUGHTS

This chapter described some of the major ideas associated with behavioral learning theory. These

theories examine only outward behavior when coming to understand learning. Classical conditioning involves two stimuli being paired together many times so that eventually, when each of them occurs independently, each produces the same response. Operant conditioning involves behaviors that appear and continue because of a reward, or discontinue because of punishment or the absence of reward. These theories can be used to understand behavior

occurring in a school or classroom. They can also be used for classroom management.

Summary of Key Ideas

• Operant conditioning involves an organism acting upon the environment and then being rewarded or punished for that action.

• The law of effect states that an action followed by a pleasurable consequence is more likely to be repeated and one followed by an annoying or painful consequence is less likely to be repeated.

• The law of exercise states that the connection between a stimulus and a response becomes strengthened with practice and weakened when practice is discontinued.

• When a human is ready to act, it is reinforcing to do so and annoying not to do so.

• Skinner thought that most animal and human behavior is controlled by the events that precede the behavior (antecedents) and also those that follow the behavior (consequences).

• A reinforcer is any consequence that increases the likelihood that a behavior will occur again.

• Positive reinforcement is a reward or a pleasurable thing that is attached to a behavior.

• Negative reinforcement is when the removal of an annoying or painful condition is attached to a behavior.

• The schedule of reinforcement determines how quickly a behavior is learned and how long it lasts once the reinforcement disappears.

• Punishment is a consequence that decreases or suppresses behavior.

References Hergenhahn, B. R., & Olson, M. H. (2005). An introduction to theories of learning (7

th ed.). Upper Saddle

River, NJ: Pearson/Prentice Hall.

Lattal, K. (1998). A century of effect: Legacies of E. L. Thorndike’s animal intelligence monograph. Journal

Of The Experimental Analysis of Behavior, 70, 325–336.

Sheppard, L. (2001). The role of classroom assessment in teaching and learning. In V. Richardson’s (Ed.).

Handbook of research on teaching (4th ed.). Washington, DC: American Educational Research Association,

1066-1101.

Watson, J.B. & Rayner, R. (1920). Conditioned emotional reactions. Journal of Experimental Psychology, 3, 1-

14.