Who discovered operant conditioning?
In the 1920s, after Watson’s withdrawal from academic psychology, psychologists, particularly behaviorists, were eager to develop theories of learning that went beyond classical conditioning. The most important of these new theories was operant conditioning, proposed by Burrhus Frederic Skinner, more commonly known as B. F. Skinner.
Skinner founded his theory on the simple reflection that observable behavior was more feasible to study than internal mental processes, which were not observable.
Skinner’s ideas were less extreme than those of John B. Watson: although he acknowledged the existence of the mind, he held that it was more productive to study observable behavior than internal mental events.
Skinner’s work was based on the view that classical conditioning was too simplistic to be a complete explanation of complex human behavior. He believed that the best way to understand behavior was to observe the causes of an action and its consequences. He called this approach operant conditioning.
Among behavioral procedures, operant or instrumental conditioning is probably the one with the most numerous and varied applications. From treating phobias to overcoming addictions such as smoking or alcoholism, the operant scheme makes it possible to conceptualize and modify practically any habit by intervening on a few elements.
Antecedents of operant conditioning
Operant conditioning as we know it was formulated and systematized by Burrhus Frederic Skinner on the basis of ideas previously raised by other authors.
Ivan Pavlov and John B. Watson had described classical conditioning, also known as simple or Pavlovian conditioning.
For his part, Edward Thorndike introduced the law of effect, the clearest antecedent of operant conditioning. The law of effect states that if a behavior has positive consequences for the person who performs it, it will be more likely to be repeated, while if it has negative consequences, this probability will decrease. In the context of Thorndike’s work, operant conditioning is called “instrumental”.
Difference between classical and operant conditioning
The main difference between classical and operant conditioning is that the former refers to the learning of information about a stimulus, while the latter involves learning about the consequences of the response.
Skinner believed that behavior was much easier to modify if its consequences were manipulated than if stimuli were simply associated with it, as is the case in classical conditioning. Classical conditioning is based on the acquisition of reflex responses, so it accounts for less learning and its uses are more limited than those of operant conditioning, which deals with behaviors that the subject can control at will.
Concepts of operant conditioning
Next, we will define the basic concepts of operant conditioning to better understand this procedure and its applications.
Many of these terms are shared by behavioral orientations in general, although they may have specific connotations within the operant paradigm.
Instrumental or operant response
This term designates any behavior that has a specific consequence and is liable to change as a result of it. Its name indicates that it serves to obtain something (instrumental) and that it acts on the environment (operant) instead of being elicited by it, as in the case of classical or respondent conditioning.
In behaviorist theory, the word “response” is basically equivalent to “behavior” and “action”, although “response” more strongly implies the presence of antecedent stimuli.
In behavioral and cognitive-behavioral psychology, a consequence is the result of the response. The consequence can be positive (reinforcement) or negative (punishment) for the subject who carries out the behavior; in the first case the probability of the response occurring will increase, and in the second it will decrease.
It is important to bear in mind that consequences affect the response and that, therefore, what is reinforced or punished in operant conditioning is the behavior, not the person or animal that carries it out. The aim at all times is to influence the way in which stimuli and responses are related, since behaviorist philosophy avoids an essentialist view of people and places more emphasis on what can change than on what always seems to stay the same.
Reinforcement designates the consequences of a behavior that make it more likely to occur again. Reinforcement can be positive, in which case a reward or prize is obtained for performing a response, or negative, which involves the disappearance of aversive stimuli.
Within negative reinforcement, we can distinguish between avoidance and escape responses. Avoidance behaviors prevent the appearance of an aversive stimulus; for example, a person with agoraphobia who does not leave home so as not to feel anxiety is avoiding this emotion. Escape responses, by contrast, make the stimulus disappear once it is already present.
The term “reinforcer” differs from “reinforcement” in that it refers to the event that occurs as a consequence of the behavior rather than to the procedure of rewarding or punishing. “Reinforcer” is therefore closer in meaning to “reward” or “prize” than to “reinforcement.”
Punishment is any consequence of certain behavior that reduces the probability that it will be repeated.
Like reinforcement, punishment can be positive or negative. Positive punishment corresponds to the presentation of an aversive stimulus after the response occurs, while negative punishment is the withdrawal of an appetitive stimulus as a consequence of the behavior.
Positive punishment corresponds to the everyday use of the word “punishment”, while negative punishment is closer to a sanction or fine. If a child does not stop screaming and his mother slaps him to quiet him, he is receiving a positive punishment, whereas if she instead takes away the console he is playing on, he is receiving a negative punishment.
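The four kinds of consequence described above can be summarized as a small lookup table. The following is only an illustrative sketch: the function name and the string labels are assumptions made for this example, not established terminology in code form.

```python
def classify_consequence(stimulus_change: str, stimulus_kind: str) -> str:
    """Name the operant procedure for a consequence that follows a response.

    stimulus_change: "presented" or "withdrawn" (hypothetical labels)
    stimulus_kind:   "appetitive" or "aversive" (hypothetical labels)
    """
    table = {
        # The response becomes MORE likely
        ("presented", "appetitive"): "positive reinforcement",
        ("withdrawn", "aversive"): "negative reinforcement",
        # The response becomes LESS likely
        ("presented", "aversive"): "positive punishment",
        ("withdrawn", "appetitive"): "negative punishment",
    }
    return table[(stimulus_change, stimulus_kind)]


# The two examples from the text: a slap, and taking away the console.
print(classify_consequence("presented", "aversive"))
print(classify_consequence("withdrawn", "appetitive"))
```

Note that “positive” and “negative” describe whether a stimulus is added or removed, not whether the outcome is pleasant.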
Discriminative stimulus and delta stimulus
In psychology, the word “stimulus” is used to designate events that elicit a response from a person or animal. Within the operant paradigm, the discriminative stimulus is one whose presence signals to the learning subject that carrying out a certain behavior will result in a reinforcer or a punishment.
By contrast, the expression “delta stimulus” refers to those signals that, when present, inform that the execution of the response will not entail consequences.
Instrumental or operant conditioning is a learning procedure based on the fact that the probability of a given response occurring depends on its expected consequences. In operant conditioning, behavior is controlled by discriminative stimuli present in the learning situation that convey information about the likely consequences of the response.
For example, an “Open” sign on a door tells us that if we try to turn the knob, the door will most likely open. In this case, the sign is the discriminative stimulus and the opening of the door functions as a positive reinforcer for the instrumental response of turning the knob.
B. F. Skinner’s applied behavior analysis
Skinner developed operant conditioning techniques that are encompassed in what we know as “applied behavior analysis.” This approach has been particularly effective in the education of children, especially children with developmental difficulties.
The basic scheme of applied behavior analysis is as follows. First, a behavioral goal is set, which consists of increasing or reducing certain behaviors. Based on this goal, the behaviors to be developed are reinforced and the existing incentives for carrying out the behaviors to be inhibited are reduced.
In general, the withdrawal of reinforcers is more desirable than positive punishment since it generates less rejection and hostility on the part of the subject. However, punishment can be useful in cases where the problem behavior is very disruptive and requires rapid reduction, for example, if there is violence.
Throughout the process, it is essential to systematically monitor progress in order to be able to check objectively if the desired objectives are being achieved. This is mainly done by recording data.
Operant techniques to develop behaviors
Given the importance and effectiveness of positive reinforcement, operant techniques for increasing behaviors are of proven usefulness. Below we describe the most relevant of these procedures.
1. Techniques of instigation
Instigation techniques, more commonly called prompting, are those that depend on the manipulation of discriminative stimuli to increase the probability of a behavior occurring.
This term includes instructions that increase certain behaviors; physical guidance, which consists of moving or placing parts of the trained person’s body; and modeling, in which a model is observed performing a behavior so that it can be imitated and its consequences learned. These three procedures have in common that they focus on directly teaching the subject how to perform a certain action, either verbally or physically.
2. Shaping
Shaping consists of gradually bringing a certain behavior closer to the target behavior, starting with a relatively similar response that the subject can already make and modifying it little by little. It is carried out in steps (successive approximations), each of which is reinforced.
Shaping is considered especially useful to establish behaviors in subjects who cannot communicate verbally, such as people with profound intellectual disabilities or animals.
3. Fading
Fading refers to the gradual withdrawal of the prompts or aids that had been used to establish a target behavior. The aim is for the subject to consolidate a response and subsequently carry it out without the need for external help.
It is one of the key concepts of operant conditioning, as it allows the progress made in therapy or training to be generalized to many other areas of life.
This procedure basically consists of replacing one discriminative stimulus with a different one.
4. Chaining
In chaining, a behavioral chain, that is, a behavior composed of several simple behaviors, is separated into different steps (links). The subject must then learn to execute the links one by one until they can carry out the complete chain.
Chaining can be carried out forward or backward, and its peculiarity is that each link reinforces the previous one and functions as a discriminative stimulus for the next.
In some respects, many of the skills that are regarded as talents because they display a high degree of proficiency and specialization (such as playing a musical instrument or dancing very well) can be considered the result of some form of chaining, since progress is made from basic skills to much more elaborate ones.
5. Reinforcement schedules
In an operant learning procedure, reinforcement schedules (also called reinforcement programs) are the rules that establish when the behavior will be rewarded and when it will not.
There are two basic types of reinforcement schedule: ratio and interval. In ratio schedules, the reinforcer is obtained after a specific number of responses have been given, while in interval schedules it is obtained for the first response that occurs once a certain time has passed since the last reinforced response.
Both types of schedule can be fixed or variable, which means that the number of responses or the time interval required to obtain the reinforcer can be constant or can oscillate around an average value. They can also be continuous or intermittent, which means that the reward can be given each time the subject carries out the target behavior or only from time to time (although always as a consequence of the desired response).
Continuous reinforcement is more useful for establishing behaviors and intermittent reinforcement for maintaining them. Thus, in theory, a dog will learn to give its paw faster if we give it a treat every time it does so, but once the behavior is learned, it will be harder for the dog to stop doing it if we have been giving the reinforcer only one out of every three or five attempts.
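The rules behind these reinforcement programs (schedules) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a standard implementation: the class names are invented for the example, the variable-ratio requirement is drawn from a uniform distribution chosen only for simplicity, and time is an arbitrary number supplied by the caller.

```python
import random


class FixedRatio:
    """Reinforce every n-th response (n = 1 is continuous reinforcement)."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after a number of responses that oscillates around `mean`."""

    def __init__(self, mean, seed=None):
        self.mean = mean
        self.rng = random.Random(seed)
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform on 1 .. 2*mean - 1, so the average is `mean`.
        return self.rng.randint(1, 2 * self.mean - 1)

    def respond(self):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response made once `interval` time units have
    passed since the last reinforced response (the clock starts at 0)."""

    def __init__(self, interval):
        self.interval = interval
        self.last_reinforced = 0.0

    def respond(self, now):
        if now - self.last_reinforced >= self.interval:
            self.last_reinforced = now
            return True
        return False


# A dog rewarded on every third paw lift:
schedule = FixedRatio(3)
print([schedule.respond() for _ in range(6)])
# [False, False, True, False, False, True]
```

A variable-interval schedule would combine the two ideas, drawing a new waiting time after each reinforcer in the same way `VariableRatio` draws a new response requirement.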
Operant techniques to reduce or eliminate behaviors
When applying operant techniques to reduce behaviors, it should be borne in mind that, since these procedures can be unpleasant for subjects, it is always preferable to use the least aversive ones when possible. Also, these techniques are preferable to positive punishments.
Here is a list of these techniques in order from least to greatest potential to generate aversion.
1. Extinction
In extinction, behavior that had previously been reinforced is no longer rewarded. This decreases the likelihood that the response will occur again. Formally, extinction is the opposite of positive reinforcement.
In the long term, extinction is more effective in eliminating responses than punishment and other operant techniques to reduce behaviors, although it may be slower.
A basic example of extinction is getting a child to stop kicking by simply ignoring the behavior until he realizes that it does not have the desired consequences (e.g. parental anger, which would function as a reinforcer) and gives up.
2. Omission training
In this procedure, the subject’s behavior is followed by the absence of the reward; that is, if the response is given, the reinforcer will not be obtained. An example of omission training might be parents not letting their daughter watch television that night because she spoke disrespectfully to them. Another example would be not buying the toys that children ask for if they misbehave.
In educational settings, it also serves to foster appreciation of the efforts that other people make to please children, efforts that the children, having grown used to such treatment, no longer value.
3. Differential reinforcement programs
They are a special subtype of reinforcement program used to reduce (not eliminate) target behaviors by increasing alternative responses. For example, a child could be rewarded for reading and exercising but not for playing the console if the aim is for the latter behavior to lose reinforcing value.
In differential reinforcement of low rates, the response is reinforced if a certain period of time has elapsed since the last time it occurred. In differential reinforcement of omission, reinforcement is obtained if, after a certain period of time, the response has not occurred. Differential reinforcement of incompatible behaviors consists of reinforcing responses that are incompatible with the problem behavior; this last procedure is applied to tics and onychophagia (nail biting), among other disorders.
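The timing rules of the first two variants can be made concrete with a short sketch. It rests on assumptions that the text does not specify: times are plain numbers measured from the start of a session, the first response under the low-rate rule is never reinforced because there is no earlier response to measure from, and under the omission rule any response restarts the waiting period. The function names are hypothetical.

```python
def low_rate_reinforced(response_times, min_gap):
    """Low-rate rule: a response is reinforced only if at least `min_gap`
    has elapsed since the previous response. Returns the reinforced times."""
    reinforced = []
    previous = None
    for t in response_times:
        if previous is not None and t - previous >= min_gap:
            reinforced.append(t)
        previous = t
    return reinforced


def omission_deliveries(response_times, interval, session_end):
    """Omission rule: a reinforcer is delivered each time `interval` passes
    with no response. Returns the delivery times up to `session_end`."""
    deliveries = []
    pending = sorted(response_times)
    i = 0
    timer_start = 0
    while timer_start + interval <= session_end:
        deadline = timer_start + interval
        if i < len(pending) and pending[i] < deadline:
            # A response occurred before the deadline: restart the timer.
            timer_start = pending[i]
            i += 1
        else:
            # No response during the whole interval: deliver the reinforcer.
            deliveries.append(deadline)
            timer_start = deadline
    return deliveries


print(low_rate_reinforced([0, 2, 10, 11, 30], min_gap=5))        # [10, 30]
print(omission_deliveries([3, 25], interval=10, session_end=40))  # [13, 23, 35]
```

In both cases the problem behavior itself is never punished; what changes is when reinforcement becomes available.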
4. Response cost
Response cost is a variant of negative punishment in which the execution of the problem behavior causes the loss of a reinforcer. The points-based driving license that was introduced in Spain a few years ago is a good example of a response cost program.
5. Time out
Time-out consists of isolating the subject, usually a child, in a non-stimulating environment when the problem behavior occurs. It is also a variant of negative punishment, and it differs from response cost in that what is lost is the opportunity to access reinforcement, not the reinforcer itself.
6. Satiation
In satiation, the reinforcement obtained for carrying out the behavior is so intense or abundant that it loses the value it had for the subject. This can take place through response satiation or massed practice (repeating the behavior until it stops being appetitive) or through stimulus satiation (the reinforcer loses its appeal through excess).
7. Overcorrection
Overcorrection consists of applying a positive punishment related to the problem behavior. For example, it is widely used in cases of enuresis, in which the child is asked to wash the sheets after wetting the bed during the night.
Contingency organization techniques
Contingency organization systems are complex procedures through which some behaviors can be reinforced and others punished.
The token economy is a well-known example of this type of technique. It consists of delivering tokens (or other equivalent generic reinforcers) as a reward for performing the target behaviors; subjects can later exchange their tokens for prizes of varying value. It is used in schools, prisons, and psychiatric hospitals.
Behavioral or contingency contracts are agreements between several people, usually two, through which they agree to carry out (or not carry out) certain behaviors. The contracts detail the consequences that follow when the agreed conditions are met or breached.
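The bookkeeping behind a token economy is simple enough to sketch in code. The example below is purely illustrative: the class, the behaviors, the token values, and the prize are all hypothetical, and a real program would be designed and supervised by a professional.

```python
class TokenEconomy:
    """Track tokens earned for target behaviors and spent on prizes."""

    def __init__(self, token_values, prize_costs):
        self.token_values = dict(token_values)  # behavior -> tokens earned
        self.prize_costs = dict(prize_costs)    # prize -> tokens required
        self.balance = 0

    def record_behavior(self, behavior):
        """Deliver tokens only if the behavior is a target behavior."""
        earned = self.token_values.get(behavior, 0)
        self.balance += earned
        return earned

    def exchange(self, prize):
        """Trade tokens for a prize; refuse if the balance is too low."""
        cost = self.prize_costs[prize]
        if cost > self.balance:
            return False
        self.balance -= cost
        return True


# Hypothetical classroom setup
economy = TokenEconomy(
    token_values={"homework done": 2, "tidied desk": 1},
    prize_costs={"extra playtime": 3},
)
economy.record_behavior("homework done")
economy.record_behavior("tidied desk")
print(economy.exchange("extra playtime"))  # True
```

Because tokens are generic reinforcers, the same balance can be exchanged for prizes of different value simply by extending `prize_costs`.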