What is conditioning?
Learning refers to any change in behavior or mental processes associated with experience. Traditionally, psychologists interested in learning have taken a behavioral approach, which involves studying in detail the relationship between environmental events and resulting behavioral changes. Though the behavioral approach typically involves studying the behavior of nonhuman subjects in controlled laboratory environments, the results of behavioral research have often found wide application in human contexts. Since the early twentieth century, behavioral psychologists have extensively studied two primary forms of learning: classical and operant conditioning.
Classical conditioning is also referred to as associative learning or Pavlovian conditioning, after its primary founder, the Russian physiologist Ivan Petrovich Pavlov. Pavlov’s original studies involved examining digestion in dogs. The first step in digestion is salivation. Pavlov developed an experimental apparatus that allowed him to measure the amount of saliva the dog produced when presented with food. Dogs do not need to learn to salivate when food is given to them—that is an automatic, reflexive response. However, Pavlov noticed that, with experience, the dogs began to salivate before the food was presented, suggesting that new stimuli had acquired the ability to elicit the response. To examine this unexpected finding, Pavlov selected specific stimuli, which he systematically presented to the dog just before food was presented. The classic example is the ringing of a bell, but there was nothing special about the bell per se. Dogs do not salivate in response to a bell ringing under normal circumstances. What made the bell special was its systematic relationship to the delivery of food. Over time, the dogs began to salivate in response to the ringing of the bell even when the food was not presented. In other words, the dog learned to associate the bell with food so that the response (salivation) could be elicited by either stimulus.
In classical conditioning terminology, the food is the unconditioned stimulus (US). It is unconditioned (or unlearned) because the animal naturally responds to it before the experiment is begun. The sound of the bell ringing is referred to as the conditioned stimulus (CS). It is not naturally effective in eliciting salivation—learning is required in order for it to do so. Salivating in response to food presentation is referred to as the unconditioned response (UR) and salivating when the bell is rung is referred to as the conditioned response (CR). Though it would seem that saliva is saliva, it is important to differentiate the conditioned from the unconditioned response, because these responses are not always identical. More important, one is a natural, unlearned response (the UR) while the other requires specific learning experiences to occur (the CR).
Classical conditioning is not limited to dogs and salivation. Modern researchers examine classical conditioning in a variety of ways. What is important is the specific pairing of some novel stimulus (the CS) with a stimulus that already elicits the response (the US). One common experimental procedure examines eye-blink conditioning in rabbits, where a brief puff of air to the eye serves as the US and the measured response (UR) is blinking. A tone, a light, or some other initially ineffective stimulus serves as the CS. After many pairings in which the CS precedes the air puff, the rabbit will begin to blink in response to the CS in the absence of the air puff. Another common behavior that is studied in classical conditioning research is conditioned suppression. Here a CS is paired with an aversive US, such as a mild electric shock. Presentation of the shock disrupts whatever behavior the animal is engaged in at the time, and with appropriate pairing over time the CS comes to do so as well. A final example that many humans can relate to is taste-aversion learning. Here a specific taste (CS) is paired with a drug or procedure that causes the animal to feel ill (US). In the future, the animal will avoid consuming (CR) the taste (CS) associated with illness (US). Taste aversions illustrate the fact that not all forms of conditioning are created equal. To learn a conditioned eye-blink or salivation response requires many CS-US pairings, while taste aversions are often learned with only one pairing of the taste and illness.
Psychologists have long studied the factors that are necessary and sufficient for producing classical conditioning. One important principle is contiguity, which refers to events occurring closely together in space or time. Classical conditioning is most effective when the CS and US are more contiguous, though precisely how closely together they must be presented depends on the type of classical conditioning observed. Taste-aversion conditioning, for example, will occur over much longer CS-US intervals than would be effective with other conditioning arrangements. Nevertheless, the sooner illness (US) follows taste (CS), the stronger the aversion (CR) will be.
Though seemingly necessary for classical conditioning, contiguity is not sufficient. A particularly clear demonstration of this fact is seen when the CS and US are presented at the exact same moment (a procedure called simultaneous conditioning). Though maximally contiguous, simultaneous conditioning is an extremely poor method for producing a CR. Furthermore, the order of presentation matters. If the US is presented before the CS, rather than after it as is usually the case, then inhibitory conditioning will occur. Inhibitory conditioning is seen in experiments in which behavior can change in two directions. For example, with a conditioned suppression procedure, inhibitory conditioning is seen when the animal increases, rather than decreases, its ongoing behavior when the CS is presented.
These findings have led modern researchers to focus on the predictive relationship between the CS and the US in classical conditioning. An especially successful modern theory of classical conditioning, the Rescorla-Wagner model, suggests that the CS acquires associative strength in direct proportion to how much information it provides about the upcoming US. In addition to providing a quantitative description of the way in which a CR is learned, the Rescorla-Wagner model has predicted a number of counterintuitive conditioning phenomena, such as blocking and overshadowing. Taken as a whole, the newer theoretical conceptions of classical conditioning tend to view the learning organism less as a passive recipient of environmental events than as an active analyzer of information.
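The quantitative core of the Rescorla-Wagner model can be sketched in a few lines of Python. This is an illustrative toy, not the original formulation's full treatment of compound stimuli; the parameter values here (alpha, beta, lambda) are arbitrary choices for demonstration:

```python
def rescorla_wagner(trials, alpha=0.3, beta=1.0, lam=1.0):
    """Trial-by-trial Rescorla-Wagner update for a single CS:
    dV = alpha * beta * (lam - V), where V is the CS's associative
    strength, alpha and beta are salience parameters for the CS and US,
    and lam is the maximum strength the US can support."""
    v = 0.0
    history = []
    for _ in range(trials):
        v += alpha * beta * (lam - v)  # the "surprise" term (lam - v) drives learning
        history.append(v)
    return history

curve = rescorla_wagner(10)
```

The update produces a negatively accelerated learning curve: early pairings yield large gains in associative strength, later ones smaller gains, as the US becomes less and less surprising. This is also the intuition behind blocking: a US that is already well predicted by one CS supports little new learning about an added CS.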
Does classical conditioning account for any human behaviors? At first glance, these processes might seem too simplistic to account for human behaviors. However, some common human reactions are quite obviously the result of conditioning. For instance, nearly everyone who has had a cavity filled will cringe at the sound of a dentist’s drill, because the sound of the drill (CS) has been paired in the past with the unpleasant experience of having one’s teeth drilled (US). Cringing at the sound of the drill would be a conditioned response (CR). Psychologists have found evidence implicating classical conditioning in a variety of important human behaviors, from the emotional effects of advertising to the functioning of the immune system to the development of tolerance in drug addiction.
At about the same time that Pavlov was conducting his experiments in Russia, an American psychologist named Edward L. Thorndike was examining a different form of learning that has come to be called instrumental or operant conditioning. Thorndike’s original experiments involved placing cats in an apparatus he designed, which he called a puzzle box. A plate of food was placed outside the puzzle box, but the hungry cat was trapped inside. Thorndike designed the box so that the cat needed to make a particular response, such as moving a lever or pulling a cord, for a trap door to be released, allowing escape and access to the food outside. The amount of time it took the cat to make the appropriate response was measured. With repeated experience, Thorndike found that it took less and less time for the cat to make the appropriate response.
Operant conditioning is much different from Pavlov’s classical conditioning. As was stated before, classical conditioning involves learning “what goes with what” in the environment. Learning the relationship changes behavior, though behavior does not change the environmental events themselves. Through experience, Pavlov’s dogs began to salivate when the bell was rung, because the bell predicted food. However, salivating (the CR) did not cause the food to be delivered. Thorndike’s cats, on the other hand, received no food until the appropriate response was made. Through experience, the cats learned about the effects of their own behavior on environmental events. In other words, they learned the consequences of their own actions.
To describe these changes, Thorndike postulated the law of effect. According to the law of effect, in any given situation an animal may do a variety of things. The cat in the puzzle box could walk around, groom itself, meow, or engage in virtually any type of feline behavior. It could also make the operant response, the response necessary to escape the puzzle box and gain access to the food. Initially, the cat may engage in any of these behaviors and may produce the operant response simply by accident or chance. However, when the operant response occurs, escape from the box and access to the food follows. In operant conditioning terminology, food is the reinforcer (Sr, or reinforcing stimulus), and it serves to strengthen the operant response (R) that immediately preceded it. The next time the animal finds itself in the puzzle box, its tendency to produce the operant response will be a bit stronger as a consequence of the reinforcement. Once the response is made again, the animal gains access to the food again—which strengthens the response further. Over time, the operant response is strengthened, while other behaviors that may occur are not strengthened and thus drop away. So, with repeated experience, the amount of time that it takes for the animal to make the operant response declines.
In addition to changing the strength of responses, operant conditioning can be used to mold entirely new behaviors. This process is referred to as shaping, and it was described by American psychologist B. F. Skinner, who further developed the field of operant conditioning. Suppose that the experiment’s objective was to train an animal, such as a laboratory rat, to press a lever. The rat could be given a piece of food (Sr) each time it pressed the lever (R), but it would probably be some considerable time before it would do so on its own. Lever pressing does not come naturally to rats. To speed up the process, the animal could be “shaped” by reinforcing successive approximations of lever-pressing behavior. The rat could be given a food pellet each time that it was in the vicinity of the lever. The law of effect predicts that the rat would spend more and more of its time near the lever as a consequence of reinforcement. Then the rat may be required to make some physical contact with the lever, but not necessarily press it, to be rewarded. The rat would make more and more contact with the lever as a result. Finally, the rat would be required to make the full response, pressing the lever, to get food. In many ways, shaping resembles the childhood game of selecting some object in the room without saying what it is and guiding guessers by saying “warmer” as they approach the object, and as they move away from it, saying nothing at all. Before long, the guessers will use the feedback to zero in on the selected object. In a similar manner, feedback in the form of reinforcement allows the rat to “zero in” on the operant response.
Skinner also examined situations in which reinforcement was not given for every individual response but was delivered according to various schedules of reinforcement. For example, the rat may be required to press the lever a total of five times (rather than once) to get the food pellet, or the reinforcing stimulus may be delivered only when a response occurs after a specified period of time. These scenarios correspond to ratio and interval schedules, respectively. Interval and ratio schedules can be either fixed, meaning that the exact same rule applies for the delivery of each individual reinforcement, or variable, meaning that the rule changes from reinforcer to reinforcer. For example, in a variable-ratio five schedule, a reward may be given after the first five responses, then after seven responses, then after three. On average, every five responses would produce one reinforcer, but any particular reinforcement may require more or fewer responses.
To understand how large an impact varying the schedule of reinforcement can have on behavior, one might consider responding to a soda machine versus responding to a slot machine. In both cases the operant response is inserting money. However, the soda machine rewards (delivers a can of soda) according to a fixed-ratio schedule of reinforcement. Without reward, one will not persist very long in making the operant response to the soda machine. The slot machine, on the other hand, provides rewards (delivers a winning payout) on a variable-ratio schedule. It is not uncommon for people to empty out their pockets in front of a slot machine without receiving a single reinforcement.
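The soda-machine/slot-machine contrast can be made concrete with a toy simulation. The function names and the uniform draw for the variable ratio are illustrative assumptions, not a standard formulation:

```python
import random

def fixed_ratio_rewards(responses, ratio=5):
    """Fixed-ratio schedule: every `ratio`-th response is reinforced."""
    return responses // ratio

def variable_ratio_rewards(responses, mean_ratio=5, rng=None):
    """Variable-ratio schedule: each reinforcer requires a randomly drawn
    number of responses, here uniform on 1..(2*mean_ratio - 1) so that
    the long-run average requirement is `mean_ratio`."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    rewards, count = 0, 0
    needed = rng.randint(1, 2 * mean_ratio - 1)
    for _ in range(responses):
        count += 1
        if count >= needed:       # requirement met: deliver reinforcer
            rewards += 1
            count = 0
            needed = rng.randint(1, 2 * mean_ratio - 1)  # new requirement
    return rewards
```

Over many responses, both schedules pay off about once per five responses; the behavioral difference lies in predictability. On the fixed schedule, a run of unreinforced responses is a reliable signal that the machine is broken, while on the variable schedule the very next response might always be the one that pays, which is what sustains such persistent responding.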
As with classical conditioning, exactly what associations are learned in operant conditioning has been an important research question. For example, in a classic 1948 experiment, Skinner provided pigeons with food at regular intervals regardless of what they were doing at the time. Six of his eight pigeons developed stereotyped (consistent) patterns of behavior as a result of the experiment, despite the fact that those behaviors had no actual effect on food delivery. According to the law of effect, some behavior would be occurring just prior to food delivery, and this behavior would be strengthened simply by chance pairing with reinforcement. This would increase the strength of the response, making it more likely to occur when the next reward was delivered—strengthening the response still further. Ultimately, one behavior would dominate the pigeons’ behavior in that experimental context. Skinner referred to this phenomenon as superstition. One need only observe the behavior of baseball players approaching the plate or basketball players lining up for a free-throw shot to see examples of superstition in human behavior.
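The feedback loop described above can be caricatured in a short simulation. Everything here (the number of behaviors, the boost size, the delivery interval) is a hypothetical parameter chosen for illustration, not a model of Skinner's actual procedure:

```python
import random

def superstition(steps=2000, n_behaviors=5, interval=10, boost=0.5, rng=None):
    """Food arrives every `interval` steps no matter what the animal does,
    yet whichever behavior happens to be occurring at that moment is
    strengthened anyway (the law of effect via chance pairing)."""
    rng = rng or random.Random(42)  # seeded for reproducibility
    strengths = [1.0] * n_behaviors
    for t in range(steps):
        # the animal emits a behavior in proportion to its current strength
        b = rng.choices(range(n_behaviors), weights=strengths)[0]
        if t % interval == 0:      # noncontingent food delivery
            strengths[b] += boost  # accidental reinforcement
    return strengths
```

Because a strengthened behavior is emitted more often, it is also more often the one accidentally paired with the next food delivery; this rich-get-richer dynamic is how a single stereotyped behavior can come to dominate even though no behavior is required.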
Superstition again raises the issue of contiguity—simply presenting reinforcement soon after the response is made appears to strengthen it. However, later studies, especially a 1971 experiment conducted by J. E. R. Staddon and V. Simmelhag, suggested that it might not be quite that simple. Providing food rewards in superstition experiments changes a variety of responses, including natural behaviors related to the anticipation of food. In operant conditioning, animals are learning more than the simple contiguity of food and behavior; they are learning that their behavior (R) causes the delivery of food (Sr). Contiguity is important, but is not the whole story.
In addition, psychologists have explored the question “What makes a reinforcer reinforcing?” That is to say, is there some set of stimuli that will “work” to increase the behaviors they follow in every single circumstance? The answer is that there is not some set of rewards that will always increase behavior in all circumstances. David Premack was important in outlining the fact that reinforcement is a relative, rather than an absolute, thing. Specifically, Premack suggested that behaviors in which an organism is more likely to engage serve to reinforce behaviors in which it is less likely to engage. In a specific example, he examined children given the option of playing pinball or eating candy. Some children preferred pinball and spent more of their time playing the game than eating the candy. The opposite was true of other children. Those who preferred pinball would increase their candy-eating behavior (R) to gain access to the pinball machine (Sr). Those who preferred eating candy would increase their pinball-playing behavior (R) to gain access to candy (Sr). Behaviors that a child initially preferred were effective in reinforcing behaviors that the child was less likely to choose—but not the other way around.
Positive or rewarding outcomes are not the only consequences that govern behavior. In many cases, people respond to avoid negative outcomes or stop responding when doing so produces unpleasant events. These situations correspond to the operant procedures of avoidance and punishment. Many psychologists have advocated using reinforcement rather than punishment to alter behavior, not because punishment is necessarily less effective in theory but because it is usually less effective in practice. For punishment to be effective, it should be (among other things) strong, immediate, and consistent. This can be difficult to accomplish in practice. In crime, for example, many offenses may have occurred without detection prior to the punished offense, so punishment is not certain. It is also likely that an individual’s court hearing, not to mention his or her actual sentence, will be delayed by weeks or even months, so punishment is not immediate. First offenses are likely to be punished less harshly than repeated offenses, so punishment gradually increases in intensity. In the laboratory, such a situation would produce an animal that would be quite persistent in responding, despite punishment.
In addition, punishment can produce unwanted side effects, such as the suppression of other behaviors, aggression, and the learning of responses to avoid or minimize punishing consequences. Beyond this, punishment requires constant monitoring by an external authority, whereas reinforcement typically does not. For example, parents who want to punish a child for a dirty room must constantly inspect the room to determine its current state. The child certainly is not going to point out a dirty room that will cause punishment. On the other hand, if rewarded, the child will bring the clean room to the parents’ attention. This is not to suggest that punishment should necessarily be abandoned as one tool for controlling behavior. Rather, the effectiveness of punishment, like reinforcement, can be predicted on the basis of laboratory results.
Though the distinction between classical and operant conditioning is very clear in principle, it is not always so clear in practice. This makes sense if one considers real-life learning situations. In many circumstances events in the environment are associated (occur together) in a predictable fashion, and behavior will have consequences. This can be true in the laboratory as well, but carefully designed experiments can be conducted to separate out the impact of classical and operant conditioning on behavior.
In addition, the effectiveness of both classical and operant conditioning is influenced by biological factors. This can be seen both in the speed with which classically conditioned taste aversions (as compared with other CRs) are learned and in the stimulation of natural food-related behaviors in operant superstition experiments. Related findings have demonstrated that the effects of rewarding behavior can be influenced by biology in other ways that may disrupt the conditioning process. In an article published in 1961, Keller and Marian Breland described their difficulties in applying the principles of operant conditioning to their work as animal trainers in the entertainment industry. They found that when trained with food reinforcement, natural behaviors would often interfere with the trained operant response—a phenomenon they called instinctive drift. From a practical point of view, their research suggested that to be successful in animal training, one must select operant responses that do not compete with natural food-related behaviors. From a scientific point of view, their research suggested that biological tendencies must be taken into account in any complete description of conditioning processes.
Beyond being interesting and important in its own right, conditioning research also serves as a valuable tool in the psychological exploration of other issues. In essence, conditioning technology provides a means for asking animals questions—a way to explore interesting cognitive processes such as memory, attention, reasoning, and concept formation under highly controlled laboratory conditions in less complex organisms.
Another area of research is the field of behavioral neuroscience, or psychobiology, a field that combines physiological and behavioral approaches to uncover the neurological mechanisms underlying behavior. For example, the impact of various medications and substances of abuse on behavior can be observed by administering drugs as reinforcing stimuli. It is interesting to note that animals will produce operant responses to receive the same drugs to which humans become addicted. However, in animals, the neurological mechanisms involved in developing addictions can be studied directly, using both behavioral and physiological experimental techniques in a way that would not be possible with human subjects, due to ethical considerations.
In addition, the principles of classical and operant conditioning have been used to solve human problems in a variety of educational and therapeutic settings, a strategy called applied behavior analysis. The principles of operant conditioning have been widely applied in settings in which some degree of control over human behavior is desirable. Token economies are situations in which specified behaviors, such as appropriate classroom behavior, are rewarded according to some schedule of reinforcement. The reinforcers are referred to as tokens because they need not have any rewarding value in and of themselves, but they can be exchanged for reinforcers at some later time. According to the principles of operant conditioning, people should increase the operant response to gain the reinforcers, and if the token economy is developed properly, that is exactly what occurs. If token economies sound rather familiar, it is for good reason. Money is an extremely potent token reinforcer for most people, who perform operant responses (work) to receive token reinforcers (money) that can later be exchanged for primary reinforcers (such as food, clothing, shelter, or entertainment).
Finally, learning principles have been applied in clinical psychology in an effort to change maladaptive behaviors. Some examples include a procedure called systematic desensitization, in which the principles of classical conditioning are applied in an effort to treat phobias (irrational fears), and social skills training, in which operant conditioning is used to enhance communication and other interpersonal behaviors. These are only two examples of useful applications of conditioning technology to treat mental illness. Such applications suggest the need for ongoing research into basic conditioning mechanisms. We must fully understand conditioning principles to appropriately apply them in the effort to understand and improve the human condition.
Domjan, Michael, and Barbara Burkhard. The Principles of Learning and Behavior. 5th ed. Belmont, Calif.: Wadsworth, 2006. Print.
Lavond, David G., and Joseph E. Steinmetz. Handbook of Classical Conditioning. New York: Springer, 2003. Print.
Mazur, James E. Learning and Behavior. 7th ed. New York: Psychology, 2013. eBook Collection (EBSCOhost). Web. 30 Nov. 2015.
Psychology of Learning and Motivation. San Diego: Academic, 2015. eBook Collection (EBSCOhost). Web. 30 Nov. 2015.
Schachtman, Todd R., and Steve S. Reilly. Associative Learning and Conditioning Theory: Human and Non-Human Applications. New York: Oxford UP, 2011. eBook Collection (EBSCOhost). Web. 30 Nov. 2015.
Schwartz, Barry, ed. Psychology of Learning: Readings in Behavior Theory. New York: W. W. Norton, 1984. Print.
Skinner, B. F. Beyond Freedom and Dignity. 1971. Reprint. Indianapolis, Ind.: Hackett, 2002. Print.