Operant Conditioning II
Module 07 Reading
CHAPTER 7
OPERANT CONDITIONING II: BEYOND THE BASICS
A CLOSER LOOK AT REINFORCEMENT
Premack Principle
Response Deprivation Theory
COMPLEX BEHAVIORS
Shaping and Chaining
Instinctive Drift
Avoidance Learning
FACTORS INFLUENCING THE EFFECTIVENESS OF CONSEQUENCES
Satiation
Immediacy
Contingency
Cost/Benefit
LANGUAGE AS VERBAL BEHAVIOR
CHECK YOUR LEARNING: BEYOND THE BASICS
LEARNING IN THE REAL WORLD: SYSTEMATIC DESENSITIZATION
Key Terms and Definitions
References
Tables
Figures
On a recent trip to Sea World, one of your authors had the pleasure of attending a show that included domesticated rescue animals, such as cats and dogs, and other animals (e.g., skunks, raccoons, a pig, and pigeons) participating as “actors” in a vaudeville-like production. The show was quite surprising because the animals executed a series of behaviors on cue. These actions included climbing a rope, crawling through a hole, flying from one area to another (pigeons only, of course), and knocking over various items at just the right time. Throughout the production, one could hear audience members say about the animals, “How do they get them to do that?”
***Think Ahead***
Pause for a moment and consider the various techniques that the Sea World trainers may have used to teach the various routines to the animals. Might they be related to the principles of operant conditioning you learned in Chapter 6? Write down your ideas before reading further.
Most likely you wrote down something related to operant conditioning, reinforcement, and maybe punishment. If you identified shaping as a possible technique, then you were on the right track. Animal trainers have used operant conditioning techniques to great effect to condition animals to do more than sit, stay, and retrieve. Although the Sea World show included animals demonstrating basic behaviors that were within the animal’s normal repertoire of actions, animals and humans can learn complex behaviors using operant conditioning. In fact, behavior modification is a treatment approach used by counselors that is based on the principles of operant conditioning. Using behavior modification techniques, a counselor can help an individual replace undesirable behaviors with more desirable behaviors. In this chapter, we will take a closer look at the role of operant conditioning in learning and changing behavior.
A CLOSER LOOK AT REINFORCEMENT
As we have seen in the earlier chapters of this book, reinforcement is a powerful mechanism for encouraging a target behavior. Upon discovering and documenting the basic effects of reinforcement and punishment on animal and human behavior, theorists became interested in the intricacies of the relationship between the consequences of a behavior and how frequently and reliably that behavior is demonstrated. In this section we will discuss two refinements of basic reinforcement theory: the Premack Principle and response deprivation theory.
The Premack Principle
Recall from Chapter 6 that the basic assumption of operant conditioning rests upon Thorndike’s law of effect. According to the law of effect, behaviors that are followed by favorable consequences are more likely to be repeated. David Premack (1959; 1963) refined this basic assumption by proposing that higher probability behaviors (those that are demonstrated more frequently) will reinforce lower probability behaviors. This idea is known as the Premack Principle. Simply stated, the Premack Principle means that we can use a behavior that we enjoy to reinforce a behavior that we do not enjoy as much. Premack argued that we have a hierarchy of behaviors based on preference, and these preferences differ from one individual to another.
The Premack Principle was based on Premack’s (1959) observations of children’s behaviors when they had free access to a candy dispenser and a pinball machine. Specifically, he was interested in which behaviors were preferred – playing pinball or eating candy. Premack found that some children preferred pinball to candy, whereas others demonstrated the reverse preference.
For the Premack Principle to apply, one has to know which behavior is preferred more strongly. Premack found that a less preferred behavior does not reinforce a more desirable behavior.
If you think back to your grade school years, you may recall a teacher telling you and your classmates that no one would be allowed to go to the playground until everyone’s desk was clean. Your teacher knew that the class preferred the playground to cleaning desks. Thus, he or she reinforced the desk cleaning behavior with the promise of going to the playground.
***Think Ahead***
Stop and think about the “clean desk” scenario. In this scenario, what is the less preferred behavior on the part of the students? What is the more preferred behavior? Why does one behavior reinforce the other?
You most likely identified going to the playground as the more preferred behavior, and therefore as one that would reinforce the behavior of cleaning one’s desk. It seems that few students would prefer to spend an hour cleaning their desks to an hour of recreation on the playground. Thus, the promise of some playground time serves as motivation to achieve clean desks.
If you have ever made a “bargain” with yourself to be able to watch a sports event on television only if you finish a certain amount of studying before game time, you have made use of the Premack Principle. Often without knowing it, parents, teachers, and employers regularly use the Premack Principle to encourage children, students, or workers to complete a less favored activity before giving them the opportunity to engage in a more favored activity.
The important point to note, however, is that what is preferred by one individual may not be preferred by another. If you were promised the opportunity to watch a hockey game once you completed a class assignment, but you dislike hockey, there would be little incentive to finish your assignment before the game starts. Reinforcement is driven, in part, by the preferences of the individual.
In a related study, Brown, Spencer, and Swift (2002) described how the Premack Principle was used to motivate a 7-year-old child with learning difficulties to try more foods. The particular child in the study had a history of food refusal; in fact, his health was beginning to suffer because of his limited diet and his overall low intake of food. The child’s parents followed a three-month program in which they presented foods for the boy to eat that were similar to his preferred foods. After eating the new food, he was allowed to eat his previously preferred foods. For example, the boy was presented with a bread roll (a new food) first. After eating the roll, he was allowed to eat a bread slice (his previously preferred food). The parents took small steps when introducing foods throughout the three-month program. Each time they presented a new food, the boy was told that he could have his preferred food after eating the new food. The parents were careful not to overwhelm their son with a large quantity of new food. Over time, the boy began eating a wider variety of foods and an overall greater quantity of food. Thus, the Premack Principle served as an effective basis for training the child to become healthier by expanding his diet.
A friend of ours followed a similar program to modify her husband’s food preferences. After he was diagnosed with high blood pressure and a heart ailment, it was important for him to limit his intake of saturated fat and sodium. Yet he disliked vegetables and preferred foods like bacon and ham, hamburgers with pickles and ketchup, and hot dogs with baked beans. His wife began introducing a small serving of a vegetable such as broccoli, carrots, or corn with each meal, telling him he could have the meat on the condition that he finished the vegetable first. After a few months on this program, the husband had come to accept vegetables for the first time in his life.
Response Deprivation Theory
Building on Premack’s findings regarding the ability of a more preferred behavior or activity to reinforce a less preferred behavior, Timberlake and Allison (1974) questioned how restricting access to a preferred activity would influence behavior. You may have a particular activity that you enjoy. For one of your authors, that activity is running. Given clear, cool weather, one of your authors, hereafter “the runner,” will head out for a 5-mile morning run four days per week and a 10 to 13-mile run one day per week. In short, the runner enjoys running enough to put in this many miles each week. If weather, illness, or a trip out of town gets in the way of the runner’s ability to run, she will do whatever is necessary to get back to her usual weekly mileage. This includes running extra miles the week after stormy weather and making a trip to a gym to get in a run on a treadmill (something the runner loathes as much as eating cauliflower). The response deprivation theory states that an animal or individual will work to maintain an optimal level of a preferred behavior, particularly when access to that behavior is limited. That is exactly the pattern of behavior the runner demonstrates: when access to running is restricted, she pursues all avenues to resume her usual level of that activity. The response deprivation theory also states that when access to a preferred activity is severely restricted, the activity will serve as an even stronger reinforcer for a less preferred behavior or activity. Thus, it is the discrepancy between the baseline rate of the preferred behavior and the current opportunity to perform the behavior that determines the strength of reinforcement; a greater discrepancy is associated with stronger reinforcement. So, more rainy days make the runner that much more eager to get back to running the usual amount.
Before Reading Further…
***Consider an activity that you enjoy doing. If you were restricted from that activity, what effort would you make just to get back to your usual level of it? Write down your thoughts before reading further.
You may have noticed one primary difference between the Premack Principle and the response deprivation theory. That is, the Premack Principle states that a higher probability behavior can serve as a reinforcer when it is contingent upon the demonstration of some less-preferred behavior. In contrast, the response deprivation theory argues that the contingent behavior will serve as a reinforcer only when the baseline amount of that behavior is restricted (e.g., the person or animal is deprived of that behavior). So, according to the response deprivation theory, even a less probable behavior can serve as a reinforcer if the ability to engage in that behavior is reduced from its normal baseline.
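For readers who think computationally, the core idea of response deprivation can be sketched as a toy calculation. The function below is purely illustrative (the name and the mileage numbers are our own, not part of Timberlake and Allison's formal model): a behavior gains reinforcing power in proportion to how far current access falls below its baseline rate.

```python
# Toy sketch of the response deprivation idea: a behavior becomes a
# reinforcer to the degree that current access falls below baseline.
# (Illustrative only; not Timberlake and Allison's formal model.)

def reinforcement_strength(baseline_rate: float, available_rate: float) -> float:
    """Return the deprivation discrepancy that drives reinforcement.

    A positive value means access is below baseline, so the deprived
    behavior can reinforce other behaviors; zero means no deprivation,
    and hence no reinforcing power from deprivation alone.
    """
    return max(0.0, baseline_rate - available_rate)

# The runner's usual ~30 miles per week, cut to 10 by a week of storms:
weekly_deficit = reinforcement_strength(30.0, 10.0)  # 20.0
# Access at or above baseline produces no deprivation:
no_deficit = reinforcement_strength(30.0, 35.0)      # 0.0
```

Note that a low-probability behavior (a small baseline) can still yield a positive discrepancy if access drops below even that modest baseline, which is exactly the point on which response deprivation theory departs from the Premack Principle.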
What does all of this mean in terms of a real-life application? Think about how the Premack Principle and the response deprivation theory relate to a situation in which a person goes on a diet. Suppose the person going on the diet has never been tempted by desserts, but they are tempted by a big, juicy steak. Now suppose this person is going to go on a high protein diet in which foods that are high in starch and sugar (e.g., desserts) are greatly restricted and foods high in protein (e.g., juicy steaks) may be consumed with relative abandon. According to Premack, the opportunity to splurge and eat a dessert, which is a less preferred activity to eating a steak, would not serve as a reinforcer for staying on the diet. But, the response deprivation theory would predict that eating a dessert could become a reinforcer if access to eating a dessert were reduced below the usual baseline levels. The dieter may begin to miss desserts even though they were only eaten infrequently before the diet.
The example regarding the high protein diet actually is more complicated than simply determining what activity serves as a reinforcer. Other elements, including level of hunger and satiation of the appetite for steaks, are part of the equation. Human and animal behaviors are complex and rarely explained by examining the relationship between a few behaviors and their consequences. Thus, we next consider how research on operant conditioning has attempted to explain complex human and animal behaviors.
COMPLEX BEHAVIORS
In Chapter 6, you learned about the basic model of operant conditioning. Specifically, you learned that a discriminative stimulus (SD) is a preceding event that serves as a cue to an organism that a certain behavior (R) should be demonstrated. The SD also signals a relationship between the behavior (R) and a reinforcer (SR). Our previous discussion of this model centered upon relatively simple behaviors: single behaviors that are signaled by a simple cue and followed by reinforcement.
In the following sections, we will extend our discussion of the operant conditioning model to include more complex behaviors. For example, what happens when the desired behavior is a series of actions that must be executed in a particular order, or when a trained behavior conflicts with an animal’s instincts? After all, few human behaviors are so simple that they are explained by a single cue, a single behavior, and a single reinforcer.
Shaping and Chaining
Recall from Chapter 6 how behaviorists can use shaping to train an animal or person to execute a behavior that they do not yet know, or do not regularly exhibit. In shaping, the experimenter or trainer identifies a desired target behavior, such as training a dog to ring a bell to signal that he wants to go outside. The trainer then reinforces the animal (or human, depending upon the situation) for successive approximations toward that goal behavior (Skinner, 1938). Shaping is a systematic process that requires the trainer to make a decision about when to stop reinforcing one behavior and to require the animal or person to do something new to receive reinforcement. Behavior analysts often monitor the progress of shaping carefully so that they will know when to change the reinforcement contingency.
Many of the complex behaviors that humans demonstrate require us to execute a series of behaviors in a particular sequence. Behaviorists have identified a special technique to train animals and people to execute certain behaviors in a specific order. The technique they use is called chaining, an extension of behavioral shaping. Chaining is the reinforcement of successive elements of a chain of behaviors. Thus, the difference between the two techniques is that shaping focuses on reinforcing successive approximations of a single target behavior, whereas chaining focuses on reinforcing a series of behaviors that must be executed in a specific order.
When you were young, your parents or caregivers taught you how to brush your teeth. Although there is some variation in how each of us accomplishes this task, the routine that you follow likely is one that was guided by the early instruction from the adults taking care of you. Successful brushing requires an individual to hold his or her toothbrush, apply an appropriate amount of toothpaste, brush all surfaces of the teeth, spit, rinse, and wipe the mouth. Spitting out the excess toothpaste before brushing all surfaces of the teeth would compromise the effectiveness of the entire process. Additionally, wiping the mouth before brushing, and not afterwards, would defeat the purpose of this step. If you have ever tried to teach a toddler to brush his or her teeth, you can appreciate the amount of work that goes into learning this series of behaviors. Successful tooth brushing is an example of a chain of behaviors that must be executed in a specific order.
There are two primary chaining techniques for conditioning a series of behaviors: forward chaining and backward chaining.
Forward chaining. Forward chaining is a technique for conditioning a series of behaviors that begins by training the first behavior and proceeding with conditioning in a forward manner. In this approach, reinforcement occurs each time the animal or person demonstrates the first behavior. After mastering the first behavior, the second behavior is conditioned. Reinforcement occurs only when the first and second behavior are demonstrated together and in the correct sequence. This pattern of reinforcement continues until all steps are executed sequentially.
Just as you were once taught to brush your teeth, you likely also were taught to make the bed each morning. Suppose you were to break down bed making into the following steps:
- Remove the pillows.
- Pull up the top sheet.
- Tuck in the edges of the top sheet.
- Pull up the bed cover/blanket/comforter.
- Smooth out any wrinkles.
- Place the pillows neatly at the head of the bed.
One could condition this skill using forward chaining. That is, a parent or caregiver could reinforce a child for removing the pillows consistently each morning. Then, the parent could reinforce the child to remove the pillows and pull up the sheets. In this case, the reinforcement would occur only after both behaviors were executed in the proper sequence. After mastering the first two steps, the parent then reinforces only after the child removes the pillows, pulls up the top sheet, and tucks in the edges of the top sheet. The process of adding behaviors would continue until all steps were learned.
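The forward-chaining reinforcement rule described above can be sketched in a few lines of Python. The step names and function are illustrative only (they are not drawn from any behavior-analysis software): reinforcement is delivered only when the learner performs, in order, the prefix of the chain trained so far.

```python
# A minimal sketch of forward chaining, using the bed-making steps from
# the text. Names are illustrative, not from any real library.

BED_MAKING = [
    "remove pillows",
    "pull up top sheet",
    "tuck in sheet edges",
    "pull up cover",
    "smooth wrinkles",
    "place pillows neatly",
]

def reinforce_forward(performed, trained_steps):
    """Reinforce only if `performed` equals the first `trained_steps`
    elements of the chain, executed in the correct order."""
    return performed == BED_MAKING[:trained_steps]

# Early in training, only the first step earns reinforcement:
reinforce_forward(["remove pillows"], trained_steps=1)         # True
# Later, both steps must occur together and in the right sequence:
reinforce_forward(["pull up top sheet", "remove pillows"], 2)  # False
```

The comparison against the growing prefix captures the key contingency: once a new step is added, reinforcement depends on all previously mastered steps plus the new one, in order.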
Forward chaining, or forward progression, has been criticized for several reasons. When the chain of behaviors to be taught is long, the burden on the learner grows as each step is added; the longer the chain, the more work (mental or physical) is required. Additionally, interference can occur when previously learned steps interfere with learning the next step in the process. One way to avoid this interference is to take a backward approach to learning each step. This approach is known as backward chaining.
Backward chaining. In backward chaining, the learner begins by acquiring the last element in a series. Successful demonstration of that element is followed by reinforcement until the learner demonstrates the element reliably. Then, the next-to-last and last elements are combined and reinforcement occurs only when both are demonstrated correctly and in the correct sequence.
Building on the previous example of making the bed, a parent could condition a child to make the bed by using a backward chaining technique. In backward chaining, the parent or caregiver could complete the first five steps of the bed making process described above and require the child to complete the last step. Upon successful completion of the last step, the caregiver or parent would reinforce the child. This routine would continue until the child reliably completed the last step. Then, the caregiver or parent would alter his or her actions by completing only the first four steps and requiring the child to complete steps five and six in the proper order before receiving reinforcement. The process of adding a step to the beginning of the series of desired behaviors would continue until the child was able to execute all of the steps properly and in the desired sequence.
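Backward chaining can be sketched as the mirror image of the forward rule: the caregiver completes the early steps, and reinforcement is delivered only when the learner performs the final trained portion of the chain in order. As before, the names are illustrative assumptions, not part of any real behavior-analysis library.

```python
# A minimal sketch of backward chaining on the same bed-making chain.
# (Illustrative names; the caregiver performs the untrained early steps.)

BED_MAKING = [
    "remove pillows",
    "pull up top sheet",
    "tuck in sheet edges",
    "pull up cover",
    "smooth wrinkles",
    "place pillows neatly",
]

def reinforce_backward(performed, trained_steps):
    """Reinforce only if `performed` equals the last `trained_steps`
    elements of the chain, executed in the correct order."""
    return performed == BED_MAKING[-trained_steps:]

# The learner starts with only the final step:
reinforce_backward(["place pillows neatly"], trained_steps=1)       # True
# Next, the last two steps must be performed in sequence:
reinforce_backward(["smooth wrinkles", "place pillows neatly"], 2)  # True
```

Comparing the two sketches makes the structural difference plain: forward chaining trains a growing prefix of the chain, while backward chaining trains a growing suffix, so the newest element is always performed first in the learner's sequence.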
At first glance, backward chaining may seem like a very complicated and unintuitive way to teach a series of behaviors. However, advocates of this approach argue that backward chaining places less cognitive load on learners. That is, the newest element to be demonstrated in the series is always demonstrated first, rather than last in a forward progression. Thus, the learner does not have to hold the new behavior in memory until all other behaviors are demonstrated. Rather, he or she begins by demonstrating the new behavior and proceeds through the process by demonstrating the previously learned behaviors.
Sports coaches have embraced the backward chaining approach to condition athletes on the execution of sport-specific behaviors. The golf swing has received some attention in the sport psychology literature as a particular series of behaviors that may be best conditioned through a backward chaining technique. Rushall (1996) provided a comparison of teaching a golf swing using forward and backward chaining approaches. Both begin by teaching the individual a proper grip. From that point, however, the two approaches diverge. Backward chaining begins by teaching the final follow-through position that marks the end of a full swing. This position is reinforced to mastery, and then a new step in the swing is added. The training progresses so that the club position is moved backward through the “ideal” swing. Using the backward approach, the newest element is always executed first in the series, followed by the previously mastered behaviors. This approach alleviates interference effects and cognitive load because the learner thinks about and demonstrates the newest element first; the previously learned behaviors follow.
The effectiveness of backward chaining was demonstrated by Simek and O’Brien (1988). Simek and O’Brien compared an experimental group of little leaguers who learned to hit a baseball using backward chaining and mastery techniques to a standard-practice control group of little leaguers who did not receive the same intervention. After only a few games, the children in the experimental group showed significant gains in hits and in the ability to judge the strike zone. The children in the control group did not show the same magnitude of gain.
Although backward chaining may be particularly effective for conditioning some complex behaviors, others may not lend themselves to such an approach. Rushall (1996) noted, for example, that diving would not be well suited for backward chaining due to the nature of the actions. That is, one cannot practice the entry into the water without first jumping from the diving board. Nevertheless, backward chaining and forward chaining are effective means of conditioning an animal or person to execute a series of behaviors that may not be part of its regular repertoire of actions.
Before Reading Further…
***Reflect on some of the complex behaviors that you have acquired in your lifetime. Did you learn them through a chaining process? If so, which approach (backward or forward chaining) was used by the person who taught you the behavior? Write down your example before reading further.
Instinctive Drift
While some researchers were busy refining methods for conditioning animals to engage in increasingly complex behaviors, others noted that animals do not always behave in the way they have been conditioned. Breland and Breland (1961) famously chronicled the misbehavior of the animals they worked with as part of their animal training business. The Brelands trained animals to engage in a variety of tasks and then they showed the animals’ tricks at county fairs, zoos, tourist attractions, and museums. As former students of Skinner, the Brelands were well versed in the use of operant conditioning techniques. By their count, they successfully conditioned thousands of animals including raccoons, pigs, and even reindeer. Despite these successes, the Brelands noticed some interesting failures in their efforts to train animals.
In one example of challenges when conditioning animals, Breland and Breland (1961) described the misbehavior of a raccoon. The Brelands attempted to condition a raccoon to pick up a coin and drop it into a metal container. The raccoon was reinforced initially for picking up the coin. Then, they introduced the metal container and required the raccoon to drop the coin in the bank before receiving reinforcement. The Brelands noticed that the raccoon had a very hard time letting go of the coin and dropping it in the bank. Instead, the raccoon wanted to rub the coin against the inside of the metal container, dip it in the container, and rub it again. Despite non-reinforcement for the rubbing and dipping behaviors, the raccoon continued.
In a similar example, Breland and Breland (1961) described the behavior of a pig that was conditioned to pick up several large wooden coins and deposit them in a “piggy” bank. The pig was conditioned to pick up the money, carry it to the bank, deposit it, and run back for more money. However, the pig began to behave in unusual ways. Instead of carrying the money to the bank, the pig would drop the coins repeatedly and root them on the ground or flip the coins in the air and root them in the ground. The pig’s completion of the entire series of behaviors slowed down so much that on some days he did not receive enough food reinforcement to meet daily dietary requirements (Breland & Breland, 1961).
You may be thinking that perhaps the Brelands were overly ambitious in their efforts to condition animals. However, they noted that raccoons and pigs generally are highly responsive to instrumental conditioning because they are naturally hungry and also are generally obedient. The question, then, is how to explain the behaviors that were displayed despite the animals’ conditioning. Breland and Breland (1961) noted that one could explain the behaviors as examples of learned superstitions. However, they felt a better explanation, given the nature of the “misbehaviors,” was that the animals reverted to instinctive behaviors related to their natural feeding behaviors. Specifically, the raccoon washed the coins much as he would to remove the exoskeleton of a crawfish. Rooting is likewise an instinctive behavior for pigs. Thus, it appears that while operant conditioning may teach animals and people to engage in new behaviors, it may not be able to override a fixed action pattern of instinctive behaviors. The Brelands referred to this tendency to revert to instinctive behaviors as instinctive drift.
Before Reading Further…
***Consider how knowledge of instinctive drift may influence the counseling profession. What should a therapist consider when working with a client who wants to change a maladaptive behavior for one that will result in more positive effects on the individual’s life?
Avoidance Learning
To this point, we have taken a closer look at how positive reinforcement influences complex behaviors. Negative reinforcement also is associated with complex human behaviors. Recall that negative reinforcement is defined by the strengthening of a target behavior when a stimulus, particularly one that is unpleasant, is removed. It is the removal of an unpleasant stimulus that seems to reinforce our avoidance of aversive stimuli or situations.
Solomon and Wynne (1953) conducted the classic research on avoidance learning in dogs. In this study, dogs were placed in a two-compartment conditioning apparatus. The dogs stood in one of the compartments and awaited the change in a light, which signaled the onset of an electric shock through the floor of the compartment. The dogs learned quickly that they could jump over a barrier into the other compartment to avoid being shocked. The dogs needed only a few trials to learn that the light was a cue for the upcoming shock. So, they jumped and avoided the misery. Hence, negative reinforcement increased the probability that the dogs would jump and avoid the shock in the future.
Before Reading Further…
***Think about your own behaviors. Are there situations, people, or activities that you avoid? What made you start avoiding the situation/person/activity? Why do you continue to avoid it? Write down your thoughts before reading further.
As you consider your responses to the question above, you may notice that it was an initial bad experience with a particular situation, person, or activity that made you dislike it. Now, you find yourself avoiding the situation/person/activity so that you do not have to face the negative consequences. Two theories have been offered to explain avoidance learning: Hull’s (1943) drive reduction theory and Mowrer’s (1947) two-factor theory.
Hull (1943) proposed a general theory of learning that has been applied to avoidance/escape learning. In short, Hull stated that humans have biological needs that serve as motivation to perform certain behaviors. For example, being thirsty motivates one to drink. In relation to his assumptions regarding learning, Hull developed a mathematical equation for predicting the potential of a behavior. The equation included variables related to drive, incentive, habit strength, and inhibition. In the case of avoidance learning, the drive to avoid pain may be very great; accordingly, the incentive for avoiding pain is high. In its most basic application, however, Hull’s theory did not explain the animal data well, as the research on avoidance learning suggested a cognitive element in escape learning.
Mowrer (1947) proposed a separate account of avoidance learning based on behaviorist principles. Rather than pursuing a cognitive explanation, Mowrer proposed a two-factor theory of avoidance learning in which fear is first learned through classical conditioning. That is, the animal or person learns an association between an initially neutral stimulus and the onset of an aversive stimulus. The aversive stimulus, such as the shocks that Solomon and Wynne (1953) delivered to their dogs, is unpleasant enough to condition fear or another negative emotion. According to Mowrer, the animal or person can then learn to avoid the aversive consequences when they perceive the conditioned stimulus. The conditioned fear response prompts the animal or person to demonstrate a behavior that avoids the unpleasant consequences predicted by the stimulus, and escaping those consequences serves as negative reinforcement. The avoidance behavior, then, is maintained through negative reinforcement, so the second factor in avoidance learning is operant conditioning. Hence, Mowrer argued for the dual role of classical and operant conditioning in the learning of avoidance behaviors.
When do we see avoidance learning in our own lives? You likely have heard of post-traumatic stress disorder (PTSD). PTSD is an anxiety disorder that can develop after a frightening or life-threatening experience. Individuals who suffer from PTSD often avoid places, people, or other things that remind them of the frightening experience. For example, suppose a person is viciously attacked by a dog while out for a walk in his or her neighborhood. Thereafter, the sound of a barking dog, the site of the attack, or even some other trigger can elicit the fear response that was experienced at the time of the attack. Previously, neither the site nor the sound of a barking dog would have elicited fear. But, because of the attack, these stimuli now are classically conditioned to elicit fear in this individual. To avoid the feeling of fear, the individual learns to avoid any triggers associated with the dog attack. The person may no longer walk down the street where the attack occurred. Or, the avoidance could be more severe, and the individual may not leave home at all. The avoidance behavior is maintained through negative reinforcement – taking away the aversive feeling of fear.
In this section, you have learned how operant conditioning has been used to explain complex human and animal behavior, from chained sequences and instinctive drift to avoidance behavior. These explanations depend upon how the consequences of behaviors influence an individual or animal. The following section takes a closer look at the various characteristics of consequences and how those characteristics influence operant conditioning.
Factors Influencing the Effectiveness of Consequences
As you now know, reinforcement and punishment are only as powerful as their importance to an animal or human. We know from Chapter 6 that animals and humans can be conditioned to discriminate between stimuli that are only slightly different. There is evidence that we are also sensitive to various properties of the consequences we experience following a behavior. The discussion of schedules of reinforcement in Chapter 6 specifically addressed how the frequency and timing of reinforcement can influence behavior. This section addresses four factors that influence the effectiveness of consequences: satiation, immediacy, contingency, and cost/benefit.
Satiation. You likely have heard the statement, “Too much of a good thing may do you harm.” In operant conditioning, that saying holds considerable merit. Behaviorists would argue that too much exposure to a reinforcer will overwhelm an animal’s or person’s appetite for that reinforcer. In turn, the reinforcer will lose some of its effectiveness.
The classic example of satiation is that food will not be an effective reinforcer for a desired behavior when the person or animal is not hungry. This author loves to eat tortilla chips and chile con queso. I like them so much that I will mow and edge the yard for a chance to eat chips and queso afterwards. There have been times, however, when I have indulged too heavily in these snacks. My stomach ties up in knots and, suddenly, the last thing in the world I want to do is eat more chips and queso. At that point, my desire has been satiated, and the snacks no longer serve as an effective reinforcer for my behavior. In operant conditioning, a state of satiation exists when the animal or person has had its “appetite” for a particular reinforcer fully satisfied.
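The idea of satiation can be sketched as a simple toy model. This is not from the text; the declining-value function and its rate parameter are arbitrary assumptions chosen purely for illustration of how a reinforcer's effectiveness can shrink with each unit already consumed.

```python
# Illustrative sketch (not from the text): modeling satiation as a
# reinforcer whose effectiveness declines with each unit consumed.
# The satiation_rate value is a hypothetical assumption.
def reinforcer_value(baseline, consumed, satiation_rate=0.5):
    """Return the current effectiveness of a reinforcer after
    `consumed` units have already been obtained."""
    return baseline * (1 - satiation_rate) ** consumed

# A bowl of chips and queso starts out highly reinforcing...
print(round(reinforcer_value(10.0, 0), 2))  # 10.0 - fully deprived, maximally effective
print(round(reinforcer_value(10.0, 3), 2))  # 1.25 - partially satiated
print(round(reinforcer_value(10.0, 8), 2))  # 0.04 - satiated; little reinforcing power left
```

The specific curve does not matter; the point is only that effectiveness falls as the appetite for the reinforcer is met.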
Before Reading Further…
***Consider a time when you noticed that a reinforcer was less effective than normal. Was the decrease in effectiveness due to satiation? Why or why not? Write down your response before reading further.
Was your example related to the reinforcing powers of food or water? Or some other biological need? We often notice satiation in relation to reinforcers that relate to primary needs. When those needs are met, the reinforcers become less effective.
Before Reading Further…
***Consider the criminal justice system in the United States. Individuals accused of a crime often wait weeks or months for a trial. If convicted, it often means that a sizable amount of time has passed between the crime and the onset of punishment. Without delving too deeply into the obvious benefits of allowing a defendant time to prepare an appropriate defense, is the delay between a crime and its associated punishment good practice? Consider this question solely from the perspective of operant conditioning. Write down your response before reading further.
Immediacy. Another factor that affects the importance of a reinforcer or a punishment to an animal or human is the timing of the consequence following the behavior. In Chapter 6 you learned about the various schedules of reinforcement. Specifically, you learned how variable interval and fixed interval schedules influenced responding. The immediacy of a consequence, or how quickly the consequence occurs after the targeted behavior, influences the extent to which the consequence serves as reinforcement or punishment. Typically, the shorter the amount of time between the targeted behavior and the consequence (reinforcement or punishment), the stronger the effect of the consequence. It is fairly easy to understand why this would happen. When reinforcement or punishment occurs immediately following a behavior, and when that happens consistently, the person or animal quickly learns the association between the behavior and its consequence.
As a practical example, suppose you burned yourself immediately after touching the handle of a pot on the stove. The pain from the burn is aversive; it is punishing. You would learn quickly that the hot handle is the cause of the burn. There would be no additional time for other events or behaviors to interfere with the association between the pot-touching behavior and the burning sensation. Thus, it would take no time at all for you to learn to grab an oven mitt before touching the pan.
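One common way to picture immediacy is to let the strength of the learned association fall off as the delay between behavior and consequence grows. The sketch below is not from the text; the exponential form and the decay rate are arbitrary assumptions used only to illustrate the direction of the effect.

```python
import math

# Illustrative sketch (not from the text): the strength of a learned
# behavior-consequence association falls off sharply with delay.
# Exponential decay and the decay_rate value are hypothetical choices.
def association_strength(delay_seconds, decay_rate=0.5):
    """Relative strength (0 to 1) of the learned association when the
    consequence follows the behavior after `delay_seconds`."""
    return math.exp(-decay_rate * delay_seconds)

print(round(association_strength(0), 2))   # 1.0  - immediate burn: strong learning
print(round(association_strength(2), 2))   # 0.37 - short delay: weaker learning
print(round(association_strength(10), 2))  # 0.01 - long delay: little learning
```

Under this toy model, a punishment delivered months after an offense, as in the trial example above, would carry only a fraction of its immediate effect.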
Now, consider your response to the prompt above regarding crime and punishment. Is it possible to apply what operant conditioning suggests regarding the immediacy of a consequence? Or, would that be inconsistent with our Constitutional right to a fair trial? This is something to consider.
Before Reading Further…
***Have you ever done something wrong and not gotten caught? If so, did you repeat the behavior in the future because you figured that you would not suffer any negative consequences? Then, did you eventually, to your great surprise, experience some negative consequences? Were you frustrated because you had not realized that negative consequences could follow the behavior? Write down your response before reading further.
Contingency. While the immediacy of the consequence is important, operant conditioning also is strengthened by a reliable contingency between a behavior and its consequence. Recently, intersections in your author’s hometown have been outfitted with cameras to detect vehicles that run red lights. The cameras operate 24 hours a day and take pictures of the license plates of vehicles whose drivers run red lights. The camera programs have been wildly successful in terms of decreasing the incidence of drivers running red lights because getting a ticket (a punishment) is consistently and reliably related to running the red light. Previously, a driver would face a ticket only if a police officer witnessed the red light running behavior and if the police officer decided to issue a ticket. With the new system in place, a driver will be caught by the camera even if there are no other cars around. Thus, the contingency is consistent. With that in mind, consider that it takes several weeks for the ticket to arrive at the driver’s mailbox. The delay between the offense and the receipt of the ticket illustrates the interplay between contingency and immediacy. Learning is most rapid and robust when a behavior is followed immediately by a consequence every single time the target behavior is demonstrated.
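The effect of a reliable versus an unreliable contingency can be sketched with a toy learner that strengthens its behavior-consequence association only on trials where the consequence is actually delivered. This is not from the text; the update rule, probabilities, and learning rate are hypothetical assumptions for illustration.

```python
import random

# Illustrative sketch (not from the text): a toy learner updates its
# association only on trials where the consequence actually follows the
# behavior. A perfectly reliable contingency (the red-light camera)
# produces a stronger association than a spotty one (the occasional
# patrol officer). All parameter values here are hypothetical.
def learn(contingency, trials=50, rate=0.2, seed=42):
    random.seed(seed)  # fixed seed so the comparison is repeatable
    strength = 0.0
    for _ in range(trials):
        if random.random() < contingency:        # consequence delivered this trial?
            strength += rate * (1.0 - strength)  # strengthen the association
    return strength

camera = learn(contingency=1.0)   # a ticket every single time
officer = learn(contingency=0.1)  # a ticket only if an officer happens by
print(camera > officer)  # True - the reliable contingency conditions far more strongly
```

The specific numbers are arbitrary; the point is that a consequence delivered every time the behavior occurs builds the association much faster than one delivered only occasionally.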
Before Reading Further…
***Have you ever thought to yourself that you really should change a specific behavior, but it just is not worth it to you? What was the behavior? What were the potential benefits of changing the behavior? What costs (time, effort, etc.) were associated with changing behavior? Consider your answer to these questions as you read further.
Cost/benefit. The costs and benefits associated with a consequence are important to how much that consequence influences future behavior. Typically, researchers describe the benefit in terms of the size or amount of the consequence. If the benefit of a consequence is great, then it is more likely to influence behavior. If the benefit is small or negligible, then it likely will not serve as an effective consequence in terms of changing behavior.
The classic example of the cost/benefit characteristic relates to buying lottery tickets. A person may not deem a three-million-dollar jackpot worth making a trip to the convenience store to purchase a lottery ticket. But a 300-million-dollar jackpot may inspire the individual to go to the convenience store to purchase not one but ten lottery tickets for the jackpot drawing. In this case, the larger jackpot is enticing enough (the benefit) to offset having to make the trip to the store and purchase the tickets (the costs). Thus, a stronger consequence, negative or positive, is more likely to produce a change in behavior.
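The lottery example boils down to a threshold decision: act only when the perceived benefit outweighs the cost of acting. The sketch below is not from the text; the subjective scaling of the jackpot and the trip cost are entirely hypothetical values chosen so the two jackpots fall on opposite sides of the threshold.

```python
# Illustrative sketch (not from the text): a consequence changes
# behavior only when its perceived benefit outweighs the cost of
# acting. The subjective values below are hypothetical assumptions.
def will_buy_ticket(jackpot_millions, trip_cost=5.0):
    """Decide whether the jackpot is enticing enough to offset the
    subjective cost of a trip to the store. Perceived benefit is
    modeled (arbitrarily) as scaling with the jackpot's size."""
    perceived_benefit = jackpot_millions * 0.1
    return perceived_benefit > trip_cost

print(will_buy_ticket(3))    # False - $3 million is not worth the trip
print(will_buy_ticket(300))  # True  - $300 million offsets the cost
```

Note that this models the subjective weighing of costs and benefits described above, not the actual expected monetary value of a lottery ticket.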
LANGUAGE AS VERBAL BEHAVIOR
CHECK YOUR LEARNING: BEYOND THE BASICS
Researchers have examined the characteristics of individuals and animals, the way behaviors are sequenced during learning, and the properties of consequences in pursuit of a full understanding of how operant conditioning applies to simple and complex learned behaviors.
- The Premack Principle states that a preferred activity can serve as a reinforcer for the demonstration of a less preferred behavior. The Response Deprivation Theory proposes that a behavior, even a less preferred behavior, can serve as a reinforcer if the opportunity to engage in that behavior falls below the usual baseline of responding. The greater the discrepancy between the current opportunity to engage in the behavior and its usual baseline, the more reinforcing the behavior becomes.
- Forward Chaining is a technique for shaping the acquisition of a series of behaviors that must be performed in a specific sequence. In forward chaining, the individual or animal is reinforced for demonstrating the first behavior in the series. Then, the individual or animal is reinforced only after performing the first and second behaviors in proper sequence. The pattern of reinforcement continues until the individual or animal can reliably execute all behaviors in proper order.
- Backward Chaining is a separate technique for conditioning the acquisition of a series of behaviors that must be performed in a specific sequence. Using this approach, individuals or animals are reinforced for demonstrating the last behavior in the sequence. When this behavior is mastered, the individual or animal then is reinforced only when the last two behaviors are demonstrated reliably and in proper sequence. Conditioning continues until all steps are learned. This approach is assumed to reduce barriers to learning associated with having to repeat previously learned steps before demonstrating the newest behavior, as is the case in forward chaining.
- Instinctive Drift is a term applied to Breland and Breland’s (1961) findings that animals that have been operantly conditioned to demonstrate a particular behavior will often lapse back into displaying more instinctual behaviors.
- Fixed Action Pattern is an instinctual series of behaviors that, once triggered by an environmental stimulus, are demonstrated in their entirety. These behaviors are innate.
- Avoidance Learning is learned behavior that allows an individual or animal to avoid unpleasant consequences associated with a person, event, location, etc.
- Mowrer’s (1947) Two Factor Theory is one explanation of avoidance learning. According to Mowrer, individuals or animals first learn to fear a previously neutral stimulus through classical conditioning. Then, the individual or animal escapes the stimulus to avoid the feelings of fear. Thereafter, the individual or animal’s avoidance behaviors are maintained through negative reinforcement.
- Satiation is the point at which an individual or animal’s appetite for a reinforcer is satisfied and the reinforcer no longer holds the same power to change behavior.
- The immediacy with which a consequence follows a behavior will influence the consequence’s effectiveness for changing a behavior. Consequences that immediately follow the targeted behavior will be associated quickly with that behavior and operant conditioning will be strong.
- The contingency between a target behavior and its consequence must be demonstrated consistently for operant conditioning to occur rapidly.
- The benefits associated with a consequence must outweigh the related costs for the consequence to change behavior.
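The backward chaining procedure summarized above can be sketched as a short helper that lists which sub-sequence is reinforced in each training phase. This is not from the text; the step names come from the chapter-opening animal show and are used only as hypothetical examples.

```python
# Illustrative sketch (not from the text): backward chaining trains the
# last step first, then reinforces progressively longer endings of the
# sequence until the whole routine is performed in order.
def backward_chaining_phases(steps):
    """Return the sub-sequence reinforced during each training phase,
    starting from the final step alone."""
    return [steps[-k:] for k in range(1, len(steps) + 1)]

routine = ["climb rope", "crawl through hole", "knock over prop"]
for phase in backward_chaining_phases(routine):
    print(phase)
# Phase 1 reinforces only the final step; the full routine comes last.
```

Forward chaining would simply use the mirror-image slices (`steps[:k]`), reinforcing progressively longer beginnings of the sequence instead.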
______________________________________________________________________________
LEARNING IN THE REAL WORLD: SYSTEMATIC DESENSITIZATION THERAPY
This chapter began with an example of how operant conditioning can be applied to the training of animals for entertainment purposes. However, it is important to realize that operant conditioning extends beyond animal learning and training. There are important applications of operant conditioning that affect people’s lives. Systematic desensitization therapy is one such example.
Earlier in this chapter you learned about avoidance learning and Mowrer’s (1947) explanation of how we learn to avoid unpleasant situations, people, or places. You also learned about PTSD and the way the avoidance of triggers (situations, people, and places) can affect a person’s life. When those avoidance behaviors begin to restrict a person’s activities extensively, he or she may seek help from a therapist to condition a new type of response. Systematic desensitization therapy is one approach to helping individuals cope with the fears and avoidance behaviors associated with PTSD.
Wolpe (1964) developed the systematic desensitization technique to help individuals deal with anxiety associated with their fears. In this approach, a person, with the help of a counselor or therapist, develops a hierarchy of stimuli that produce the fear response. The process of therapy then starts with the lowest stimulus on this hierarchy, the stimulus that produces the least amount of fear or anxiety. The client puts himself or herself in a controlled situation in which he or she experiences the stimulus and practices some previously trained relaxation techniques. After the client can handle the stimulus easily, he or she moves on to the next highest anxiety-producing stimulus in the hierarchy and follows the same process. Thus, the client is systematically desensitized to the fear- or anxiety-invoking stimuli. Consequently, there is no longer a need to avoid the people, places, or events that were previously associated with negative consequences.
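The fear hierarchy can be thought of as an ordered list that the client works through from least to most anxiety-provoking. The sketch below is not from the text; the stimuli, anxiety ratings, and relief-per-session value are hypothetical numbers based on the dog-attack example earlier in the chapter.

```python
# Illustrative sketch (not from the text): a fear hierarchy worked
# through from the least to the most anxiety-provoking stimulus.
# Stimuli, anxiety ratings, and relief_per_session are hypothetical.
def desensitize(hierarchy, relief_per_session=10):
    """Return (stimulus, sessions-needed) pairs in the order mastered,
    least anxiety-provoking stimulus first."""
    mastered = []
    for stimulus, anxiety in sorted(hierarchy, key=lambda item: item[1]):
        sessions = 0
        while anxiety > 0:
            anxiety -= relief_per_session  # relaxation paired with controlled exposure
            sessions += 1
        mastered.append((stimulus, sessions))
    return mastered

hierarchy = [
    ("walking past the site of the attack", 80),
    ("hearing a distant bark", 20),
    ("seeing a dog across the street", 50),
]
for stimulus, sessions in desensitize(hierarchy):
    print(f"{stimulus}: desensitized after {sessions} sessions")
```

The key structural point is the ordering: the client does not advance to a more frightening stimulus until the current one no longer produces anxiety.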
You likely can think of many applications of operant conditioning to real life situations. The next time you find yourself changing a behavior, think about how the consequences of the behavior have influenced that change.
Key Terms and Definitions
Premack Principle
Premack’s idea that a more preferred behavior or activity can serve as a reinforcer for a less preferred behavior or activity.
Response Deprivation Theory
The theory that depriving an individual or animal of the opportunity to engage in a behavior below that behavior’s usual baseline can cause the behavior to become a reinforcer for another targeted behavior.
Forward Chaining
A technique for conditioning the acquisition of a series of behaviors that must be demonstrated in a particular sequence. Behaviors are learned in sequence starting with the first behavior in the series. Reinforcement follows the successful demonstration of the first behavior, then the successful demonstration of the first and second behavior, and so on.
Backward Chaining
A technique for conditioning the acquisition of a series of behaviors that must be demonstrated in a particular sequence. Behaviors are learned in sequence starting with the last behavior in the series. Reinforcement follows the successful demonstration of the last behavior, then the demonstration of the next-to-last and last behaviors, and so on. This technique is believed to make it easier for individuals to learn long sequences of behaviors because the newest behavior is demonstrated first in the sequence.
Instinctive Drift
A term used to refer to animals’ tendencies to engage in instinctual behaviors despite conditioning to learn incompatible behaviors.
Fixed Action Pattern
An instinctual series of behaviors that are demonstrated in their entirety in response to some environmental stimulus.
Avoidance Learning
The demonstration of learned avoidance behaviors in response to people, environments, or situations that induce fear or anxiety.
Mowrer’s Two Factor Theory
An explanation of avoidance learning based on the idea that individuals or animals first learn to fear a previously neutral stimulus through classical conditioning. Then, the individual or animal escapes the stimulus to avoid the feelings of fear. The avoidance behaviors are maintained through negative reinforcement.
Satiation
The idea that the effectiveness of a reinforcer will be reduced if the animal or individual’s appetite for the reinforcer has been met.
Immediacy
The idea that consequences that follow a target behavior immediately are more effective.
Contingency
The idea that effective consequences must reliably follow a target behavior each time the behavior is demonstrated.
Cost/Benefit
The idea that effective consequences present benefits to the individual or learner that outweigh the costs of demonstrating the target behavior.
References
Breland, K., & Breland, M. (1961). The misbehavior of organisms. American Psychologist, 16, 681-684.
Brown, J. F., Spencer, K., & Swift, S. (2002). A parent training programme for chronic food refusal: A case study. British Journal of Learning Disabilities, 30, 118-121.
Hull, C. L. (1943). Principles of behavior. New York: Appleton-Century-Crofts.
Mowrer, O. H. (1947). On the dual nature of learning – A reinterpretation of ‘conditioning’ and ‘problem solving’. Harvard Educational Review, 17, 102-148.
Premack, D. (1959). Toward empirical behavioral laws: I. Positive reinforcement. Psychological Review, 66, 219-233.
Premack, D. (1963). Rate differential reinforcement in monkey manipulation. Journal of Experimental Analysis of Behavior, 6, 81-89.
Rushall, B. S. (1996). Some practical application of psychology in physical activity settings. In K. W. Kim (Ed.), The pursuit of sport excellence, Vol. 2 (pp. 38-656). Seoul, Korea: Korean Alliance for Health, Physical Education, Recreation, and Dance.
Skinner, B. F. (1938). The behavior of organisms: An experimental analysis. New York: Appleton-Century-Crofts.
Simek, T. C., & O’Brien, R. M. (1988). A chaining-mastery, discrimination training program to teach little leaguers to hit a baseball. Human Performance, 1(1), 73-84.
Solomon, R. L., & Wynne, L. C. (1953). Traumatic avoidance learning: Acquisition in normal dogs. Psychological Monographs, 67(4), 1-19.
Timberlake, W. (1984). Behavior regulation and learned performance: Some misapprehensions and disagreements. Journal of the Experimental Analysis of Behavior, 41, 355-375.
Timberlake, W., & Allison, J. (1974). Response deprivation: An empirical approach to instrumental performance. Psychological Review, 81, 146-164.
Wolpe, J. (1964). The conditioning therapies: The challenge in psychotherapy. New York: Holt, Rinehart, and Winston.