Operant

Operant conditioning, also called instrumental conditioning, is a learning process where behaviors are modified through the association of stimuli with reinforcement or punishment. In it, operants (behaviors that affect one's environment) are conditioned to occur or not occur depending on the environmental consequences of the behavior. Operant conditioning originated in the work of Edward Thorndike, whose law of effect theorized that behaviors arise as a result of whether their consequences are satisfying or discomforting. In the 20th century, operant conditioning was studied by behaviorist psychologists, who believed that much, if not all, of mind and behavior can be explained as a result of environmental conditioning. Reinforcements are environmental stimuli that increase behaviors, whereas punishments are stimuli that decrease behaviors. Both kinds of stimuli can be further categorized into positive and negative stimuli, which respectively involve the addition or removal of environmental stimuli. Operant conditioning differs from
classical conditioning, which is a process where stimuli are paired with biologically significant events to produce involuntary and reflexive behaviors. In contrast, operant conditioning is voluntary and depends on the consequences of a behavior. The study of animal learning in the 20th century was dominated by the analysis of these two sorts of learning, and they are still at the core of behavior analysis. They have also been applied to the study of social psychology, helping to clarify certain phenomena such as the false consensus effect.


History


Thorndike's law of effect

Operant conditioning, sometimes called ''instrumental learning'', was first extensively studied by Edward L. Thorndike (1874–1949), who observed the behavior of cats trying to escape from home-made puzzle boxes. A cat could escape from the box by a simple response such as pulling a cord or pushing a pole, but when first constrained, the cats took a long time to get out. With repeated trials ineffective responses occurred less frequently and successful responses occurred more frequently, so the cats escaped more and more quickly. Thorndike generalized this finding in his law of effect, which states that behaviors followed by satisfying consequences tend to be repeated and those that produce unpleasant consequences are less likely to be repeated. In short, some consequences ''strengthen'' behavior and some consequences ''weaken'' behavior. By plotting escape time against trial number Thorndike produced the first known animal
learning curves through this procedure. Humans appear to learn many simple behaviors through the sort of process studied by Thorndike, now called operant conditioning. That is, responses are retained when they lead to a successful outcome and discarded when they do not, or when they produce aversive effects. This usually happens without being planned by any "teacher", but operant conditioning has been used by parents in teaching their children for thousands of years (Miltenberger, R. G., & Crosland, K. A. (2014). Parenting. The Wiley Blackwell Handbook of Operant and Classical Conditioning (pp. 509–531). Wiley-Blackwell).


B. F. Skinner

B. F. Skinner (1904–1990) is referred to as the father of operant conditioning, and his work is frequently cited in connection with this topic. His 1938 book ''The Behavior of Organisms: An Experimental Analysis'' initiated his lifelong study of operant conditioning and its application to human and animal behavior. Following the ideas of Ernst Mach, Skinner rejected Thorndike's reference to unobservable mental states such as satisfaction, building his analysis on observable behavior and its equally observable consequences. Skinner believed that classical conditioning was too simplistic to be used to describe something as complex as human behavior. Operant conditioning, in his opinion, better described human behavior as it examined causes and effects of intentional behavior. To implement his empirical approach, Skinner invented the operant conditioning chamber, or "Skinner box", in which subjects such as pigeons and rats were isolated and could be exposed to carefully controlled stimuli. Unlike Thorndike's puzzle box, this arrangement allowed the subject to make one or two simple, repeatable responses, and the rate of such responses became Skinner's primary behavioral measure. Another invention, the cumulative recorder, produced a graphical record from which these response rates could be estimated. These records were the primary data that Skinner and his colleagues used to explore the effects on response rate of various reinforcement schedules (Ferster, C. B. & Skinner, B. F., ''Schedules of Reinforcement'', 1957, New York: Appleton-Century-Crofts). A reinforcement schedule may be defined as "any procedure that delivers reinforcement to an organism according to some well-defined rule". The effects of schedules became, in turn, the basic findings from which Skinner developed his account of operant conditioning. He also drew on many less formal observations of human and animal behavior.
Many of Skinner's writings are devoted to the application of operant conditioning to human behavior. In 1948 he published ''Walden Two'', a fictional account of a peaceful, happy, productive community organized around his conditioning principles. In 1957, Skinner published ''Verbal Behavior'', which extended the principles of operant conditioning to language, a form of human behavior that had previously been analyzed quite differently by linguists and others. Skinner defined new functional relationships such as "mands" and "tacts" to capture some essentials of language, but he introduced no new principles, treating verbal behavior like any other behavior controlled by its consequences, which included the reactions of the speaker's audience.


Concepts and procedures


Origins of operant behavior: operant variability

Operant behavior is said to be "emitted"; that is, initially it is not elicited by any particular stimulus. Thus one may ask why it happens in the first place. The answer to this question is like Darwin's answer to the question of the origin of a "new" bodily structure, namely, variation and selection. Similarly, the behavior of an individual varies from moment to moment, in such aspects as the specific motions involved, the amount of force applied, or the timing of the response. Variations that lead to reinforcement are strengthened, and if reinforcement is consistent, the behavior tends to remain stable. However, behavioral variability can itself be altered through the manipulation of certain variables.


Modifying operant behavior: reinforcement and punishment

Reinforcement and punishment are the core tools through which operant behavior is modified. These terms are defined by their effect on behavior. Either may be positive or negative.
* Positive reinforcement and negative reinforcement increase the probability of a behavior that they follow, while positive punishment and negative punishment reduce the probability of behavior that they follow.
Another procedure is called "extinction".
* Extinction occurs when a previously reinforced behavior is no longer reinforced with either positive or negative reinforcement. During extinction the behavior becomes less probable. Behavior that was reinforced only occasionally typically takes longer to extinguish than behavior that was reinforced at every opportunity, because the organism has learned that repeated responses may be needed before reinforcement arrives.
There are a total of five consequences.
# Positive reinforcement occurs when a behavior (response) is rewarding or the behavior is followed by another stimulus that is rewarding, increasing the frequency of that behavior. For example, if a rat in a Skinner box gets food when it presses a lever, its rate of pressing will go up. This procedure is usually called simply ''reinforcement''.
# Negative reinforcement (a.k.a. escape) occurs when a behavior (response) is followed by the removal of an aversive stimulus, thereby increasing the original behavior's frequency. In the Skinner box experiment, the aversive stimulus might be a loud noise sounding continuously inside the box; negative reinforcement would happen when the rat presses a lever to turn off the noise.
# Positive punishment (also referred to as "punishment by contingent stimulation") occurs when a behavior (response) is followed by an aversive stimulus. Example: pain from a spanking, which would often result in a decrease in that behavior. ''Positive punishment'' is a confusing term, so the procedure is usually referred to as "punishment".
# Negative punishment (penalty) (also called "punishment by contingent withdrawal") occurs when a behavior (response) is followed by the removal of a stimulus. Example: taking away a child's toy following an undesired behavior, which would result in a decrease in that behavior.
# Extinction occurs when a behavior (response) that had previously been reinforced is no longer effective in producing reinforcement. Example: a rat is first given food many times for pressing a lever, until the experimenter no longer gives out food as a reward. The rat would typically press the lever less often and then stop. The lever pressing would then be said to be "extinguished."
It is important to note that actors (e.g. a rat) are not spoken of as being reinforced, punished, or extinguished; it is the ''actions'' that are reinforced, punished, or extinguished. Reinforcement, punishment, and extinction are not terms whose use is restricted to the laboratory. Naturally occurring consequences can also reinforce, punish, or extinguish behavior and are not always planned or delivered on purpose.
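The five consequences can be read as a small lookup: what happens to the stimulus, combined with what happens to the rate of the behavior. The following Python sketch is only an illustration of that reading; the names Change, Effect, and classify are hypothetical and do not come from the literature.

```python
from enum import Enum


class Change(Enum):
    """What happens to a stimulus after the response (illustrative labels)."""
    ADDED = "stimulus added"
    REMOVED = "stimulus removed"
    WITHHELD = "previous reinforcer no longer delivered"


class Effect(Enum):
    """What happens to the future rate of the response."""
    INCREASES = "behavior becomes more frequent"
    DECREASES = "behavior becomes less frequent"


def classify(change: Change, effect: Effect) -> str:
    """Name the procedure from the stimulus change and its effect on behavior."""
    if change is Change.WITHHELD:
        # Extinction applies to a previously reinforced response that declines.
        if effect is Effect.DECREASES:
            return "extinction"
        raise ValueError("not one of the five consequences described here")
    table = {
        (Change.ADDED, Effect.INCREASES): "positive reinforcement",
        (Change.REMOVED, Effect.INCREASES): "negative reinforcement",
        (Change.ADDED, Effect.DECREASES): "positive punishment",
        (Change.REMOVED, Effect.DECREASES): "negative punishment",
    }
    return table[(change, effect)]
```

For instance, classify(Change.REMOVED, Effect.INCREASES) returns "negative reinforcement", matching the rat that presses a lever to turn off a noise. Note that the function classifies the action and its consequence, not the actor.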


Schedules of reinforcement

Schedules of reinforcement are rules that control the delivery of reinforcement. The rules specify either the time that reinforcement is to be made available, or the number of responses to be made, or both. Many rules are possible, but the following are the most basic and commonly used:
* Fixed interval schedule: Reinforcement occurs following the first response after a fixed time has elapsed after the previous reinforcement. This schedule yields a "break-run" pattern of response; that is, after training on this schedule, the organism typically pauses after reinforcement, and then begins to respond rapidly as the time for the next reinforcement approaches.
* Variable interval schedule: Reinforcement occurs following the first response after a variable time has elapsed from the previous reinforcement. This schedule typically yields a relatively steady rate of response that varies with the average time between reinforcements.
* Fixed ratio schedule: Reinforcement occurs after a fixed number of responses have been emitted since the previous reinforcement. An organism trained on this schedule typically pauses for a while after a reinforcement and then responds at a high rate. If the response requirement is low there may be no pause; if the response requirement is high the organism may quit responding altogether.
* Variable ratio schedule: Reinforcement occurs after a variable number of responses have been emitted since the previous reinforcement. This schedule typically yields a very high, persistent rate of response.
* Continuous reinforcement: Reinforcement occurs after each response. Organisms typically respond as rapidly as they can, given the time taken to obtain and consume reinforcement, until they are satiated.
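Because each schedule is a rule for deciding which responses are reinforced, the rules can be sketched as small functions. The Python below is a minimal illustration under stated assumptions, not a model of behavior: the function names are hypothetical, the session clock starts at 0, and the variable schedules draw their requirement from a uniform distribution, which is just one convenient choice.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response since the last reinforcement.
    fixed_ratio(1) is continuous reinforcement."""
    count = 0

    def rule(t):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False
    return rule


def variable_ratio(mean_n, rng):
    """Reinforce after a varying number of responses averaging mean_n
    (assumption: uniform over 1 .. 2*mean_n - 1)."""
    count = 0
    required = rng.randint(1, 2 * mean_n - 1)

    def rule(t):
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)
            return True
        return False
    return rule


def fixed_interval(seconds):
    """Reinforce the first response made once `seconds` have elapsed
    since the previous reinforcement."""
    last = 0.0

    def rule(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False
    return rule


def variable_interval(mean_seconds, rng):
    """Like fixed_interval, but the required wait varies around mean_seconds
    (assumption: uniform over 0 .. 2*mean_seconds)."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)

    def rule(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False
    return rule


def reinforced_times(rule, response_times):
    """Return the response times (in ascending order) that earn reinforcement."""
    return [t for t in response_times if rule(t)]
```

With one response per second at t = 1 to 12, reinforced_times(fixed_ratio(3), range(1, 13)) gives [3, 6, 9, 12], and reinforced_times(fixed_interval(5), range(1, 13)) gives [5, 10]. The sketch captures only the delivery rule; the characteristic response patterns described above (pauses, steady rates) are properties of the organism and are not simulated here.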


Factors that alter the effectiveness of reinforcement and punishment

The effectiveness of reinforcement and punishment can be changed.
# Satiation/Deprivation: The effectiveness of a positive or "appetitive" stimulus will be reduced if the individual has received enough of that stimulus to satisfy the individual's appetite. The opposite effect will occur if the individual becomes deprived of that stimulus: the effectiveness of a consequence will then increase. A subject with a full stomach would not feel as motivated as a hungry one (Miltenberger, R. G. "Behavioral Modification: Principles and Procedures". Thomson/Wadsworth, 2008. p. 84).
# Immediacy: An immediate consequence is more effective than a delayed one. If a dog is given a treat within five seconds of sitting, it will learn faster than if the treat is given after thirty seconds.
# Contingency: To be most effective, reinforcement should occur consistently after responses and not at other times. Learning may be slower if reinforcement is intermittent, that is, following only some instances of the same response. Responses reinforced intermittently are usually slower to extinguish than are responses that have always been reinforced.
# Size: The size, or amount, of a stimulus often affects its potency as a reinforcer. Humans and animals engage in cost-benefit analysis. If a lever press brings ten food pellets, lever pressing may be learned more rapidly than if a press brings only one pellet. A pile of quarters from a slot machine may keep a gambler pulling the lever longer than a single quarter.
Most of these factors serve biological functions. For example, the process of satiation helps the organism maintain a stable internal environment (homeostasis). When an organism has been deprived of sugar, for example, the taste of sugar is an effective reinforcer. When the organism's blood sugar reaches or exceeds an optimum level, the taste of sugar becomes less effective or even aversive.


Shaping

Shaping is a conditioning method much used in animal training and in teaching nonverbal humans. It depends on operant variability and reinforcement, as described above. The trainer starts by identifying the desired final (or "target") behavior. Next, the trainer chooses a behavior that the animal or person already emits with some probability. The form of this behavior is then gradually changed across successive trials by reinforcing behaviors that approximate the target behavior more and more closely. When the target behavior is finally emitted, it may be strengthened and maintained by the use of a schedule of reinforcement.
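The logic of reinforcing successive approximations can be sketched as a toy simulation. Everything in the Python below is an assumption made for illustration: behavior is reduced to a single number, responses vary normally around a "typical" value, and reinforcement moves the typical value part of the way toward the reinforced response. The function name shape and its parameters are hypothetical.

```python
import random


def shape(target, start=0.0, spread=1.0, learning_rate=0.5,
          max_trials=10_000, seed=0):
    """Toy shaping loop: reinforce responses that beat the current criterion,
    then raise the criterion. Returns (final typical value, trials used)."""
    rng = random.Random(seed)
    typical = start      # the form of the behavior the subject usually emits
    criterion = start    # the approximation currently required for reinforcement
    trials = 0
    while typical < target and trials < max_trials:
        trials += 1
        # Operant variability: each emitted response differs a little.
        response = rng.gauss(typical, spread)
        if response > criterion:
            # Reinforcement strengthens the variant that was closer to the target.
            typical += learning_rate * (response - typical)
            # The trainer now demands a closer approximation.
            criterion = typical
    return typical, trials
```

Calling shape(target=10.0) walks the typical response from 0 up to the target in small reinforced steps. In this toy model, a response as extreme as the target would almost never be emitted from the starting distribution, which is why the criterion starts at behavior the subject already emits and is raised gradually.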


Noncontingent reinforcement

Noncontingent reinforcement is the delivery of reinforcing stimuli regardless of the organism's behavior. Noncontingent reinforcement may be used in an attempt to reduce an undesired target behavior by reinforcing multiple alternative responses while extinguishing the target response. As no measured behavior is identified as being strengthened, there is controversy surrounding the use of the term noncontingent "reinforcement".


Stimulus control of operant behavior

Though initially operant behavior is emitted without an identified reference to a particular stimulus, during operant conditioning operants come under the control of stimuli that are present when behavior is reinforced. Such stimuli are called "discriminative stimuli." A so-called "three-term contingency" is the result. That is, discriminative stimuli set the occasion for responses that produce reward or punishment. Examples: a rat may be trained to press a lever only when a light comes on; a dog rushes to the kitchen when it hears the rattle of its food bag; a child reaches for candy on seeing it on a table.


Discrimination, generalization & context

Most behavior is under stimulus control. Several aspects of this may be distinguished:
* Discrimination typically occurs when a response is reinforced only in the presence of a specific stimulus. For example, a pigeon might be fed for pecking at a red light and not at a green light; in consequence, it pecks at red and stops pecking at green. Many complex combinations of stimuli and other conditions have been studied; for example an organism might be reinforced on an interval schedule in the presence of one stimulus and on a ratio schedule in the presence of another.
* Generalization is the tendency to respond to stimuli that are similar to a previously trained discriminative stimulus. For example, having been trained to peck at "red" a pigeon might also peck at "pink", though usually less strongly.
* Context refers to stimuli that are continuously present in a situation, like the walls, tables, chairs, etc. in a room, or the interior of an operant conditioning chamber. Context stimuli may come to control behavior as do discriminative stimuli, though usually more weakly. Behaviors learned in one context may be absent, or altered, in another. This may cause difficulties for behavioral therapy, because behaviors learned in the therapeutic setting may fail to occur in other situations.


Behavioral sequences: conditioned reinforcement and chaining

Most behavior cannot easily be described in terms of individual responses reinforced one by one. The scope of operant analysis is expanded through the idea of behavioral chains, which are sequences of responses bound together by the three-term contingencies defined above. Chaining is based on the fact, experimentally demonstrated, that a discriminative stimulus not only sets the occasion for subsequent behavior, but it can also reinforce a behavior that precedes it. That is, a discriminative stimulus is also a "conditioned reinforcer". For example, the light that sets the occasion for lever pressing may be used to reinforce "turning around" in the presence of a noise. This results in the sequence "noise – turn-around – light – press lever – food". Much longer chains can be built by adding more stimuli and responses.


Escape and avoidance

In escape learning, a behavior terminates an (aversive) stimulus. For example, shielding one's eyes from sunlight terminates the (aversive) stimulation of bright light in one's eyes. (This is an example of negative reinforcement, defined above.) Behavior that is maintained by preventing a stimulus is called "avoidance," as, for example, putting on sunglasses before going outdoors. Avoidance behavior raises the so-called "avoidance paradox", for, it may be asked, how can the non-occurrence of a stimulus serve as a reinforcer? This question is addressed by several theories of avoidance (see below). Two kinds of experimental settings are commonly used: discriminated and free-operant avoidance learning.


Discriminated avoidance learning

A discriminated avoidance experiment involves a series of trials in which a neutral stimulus such as a light is followed by an aversive stimulus such as a shock. After the neutral stimulus appears, an operant response such as a lever press prevents or terminates the aversive stimulus. In early trials, the subject does not make the response until the aversive stimulus has come on, so these early trials are called "escape" trials. As learning progresses, the subject begins to respond during the neutral stimulus and thus prevents the aversive stimulus from occurring. Such trials are called "avoidance trials." This experiment is said to involve classical conditioning because a neutral CS (conditioned stimulus) is paired with the aversive US (unconditioned stimulus); this idea underlies the two-factor theory of avoidance learning described below.


Free-operant avoidance learning

In free-operant avoidance a subject periodically receives an aversive stimulus (often an electric shock) unless an operant response is made; the response delays the onset of the shock. In this situation, unlike discriminated avoidance, no prior stimulus signals the shock. Two crucial time intervals determine the rate of avoidance learning. The first is the S-S (shock-shock) interval. This is the time between successive shocks in the absence of a response. The second interval is the R-S (response-shock) interval. This specifies the time by which an operant response delays the onset of the next shock. Note that each time the subject performs the operant response, the R-S interval without shock begins anew.
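The interplay of the S-S and R-S intervals can be made concrete with a short Python sketch that computes when shocks would be delivered for a given list of response times. The function name shock_times and its parameters are hypothetical, and the sketch assumes that the first shock is due one S-S interval after the session starts and that a response always resets the timer to the R-S interval.

```python
def shock_times(responses, ss, rs, session_end):
    """Return the times at which shocks occur in a free-operant avoidance session.

    responses   -- times of operant responses
    ss          -- S-S interval: time between shocks when no response is made
    rs          -- R-S interval: time by which a response postpones the next shock
    session_end -- last moment at which a shock may be delivered
    """
    pending = sorted(responses)
    shocks = []
    next_shock = ss  # assumption: first shock is due one S-S interval in
    i = 0
    while next_shock <= session_end:
        if i < len(pending) and pending[i] < next_shock:
            # A response before the scheduled shock restarts the R-S interval.
            next_shock = pending[i] + rs
            i += 1
        else:
            # No response in time: deliver the shock and start the S-S timer.
            shocks.append(next_shock)
            next_shock += ss
    return shocks
```

With ss=5, rs=20 and a 60-second session, a subject that never responds receives a shock every 5 seconds, while one that responds at 2, 12, 22, 32, 42 and 52 seconds receives none, because each response arrives before the previous R-S interval runs out. A single response at 2 seconds in a 40-second session yields shocks at 22, 27, 32 and 37 seconds.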


Two-process theory of avoidance

This theory was originally proposed in order to explain discriminated avoidance learning, in which an organism learns to avoid an aversive stimulus by escaping from a signal for that stimulus. Two processes are involved: classical conditioning of the signal followed by operant conditioning of the escape response:
a) ''Classical conditioning of fear.'' Initially the organism experiences the pairing of a CS with an aversive US. The theory assumes that this pairing creates an association between the CS and the US through classical conditioning and, because of the aversive nature of the US, the CS comes to elicit a conditioned emotional reaction (CER), "fear."
b) ''Reinforcement of the operant response by fear-reduction.'' As a result of the first process, the CS now elicits fear; this unpleasant emotional reaction serves to motivate operant responses, and responses that terminate the CS are reinforced by fear termination.
Note that the theory does not say that the organism "avoids" the US in the sense of anticipating it, but rather that the organism "escapes" an aversive internal state that is caused by the CS. Several experimental findings seem to run counter to two-factor theory. For example, avoidance behavior often extinguishes very slowly even when the initial CS-US pairing never occurs again, so the fear response might be expected to extinguish (see Classical conditioning). Further, animals that have learned to avoid often show little evidence of fear, suggesting that escape from fear is not necessary to maintain avoidance behavior (Pierce & Cheney (2004), Behavior Analysis and Learning).


Operant or "one-factor" theory

Some theorists suggest that avoidance behavior may simply be a special case of operant behavior maintained by its consequences. In this view the idea of "consequences" is expanded to include sensitivity to a pattern of events. Thus, in avoidance, the consequence of a response is a reduction in the rate of aversive stimulation. Indeed, experimental evidence suggests that a "missed shock" is detected as a stimulus, and can act as a reinforcer. Cognitive theories of avoidance take this idea a step farther. For example, a rat comes to "expect" shock if it fails to press a lever and to "expect no shock" if it presses it, and avoidance behavior is strengthened if these expectancies are confirmed.


Operant hoarding

Operant hoarding refers to the observation that rats reinforced in a certain way may allow food pellets to accumulate in a food tray instead of retrieving those pellets. In this procedure, retrieval of the pellets always instituted a one-minute period of
extinction during which no additional food pellets were available but those that had been accumulated earlier could be consumed. This finding appears to contradict the usual finding that rats behave impulsively in situations in which there is a choice between a smaller food object right away and a larger food object after some delay. See schedules of reinforcement.


Neurobiological correlates

The first scientific studies identifying
neurons that responded in ways that suggested they encode for conditioned stimuli came from work by Mahlon DeLong and R. T. Richardson (Richardson RT, DeLong MR (1991): Electrophysiological studies of the function of the nucleus basalis in primates. In Napier TC, Kalivas P, Hamin I (eds), ''The Basal Forebrain: Anatomy to Function'' (''Advances in Experimental Medicine and Biology''), vol. 295. New York, Plenum, pp. 232–252). They showed that nucleus basalis neurons, which release acetylcholine broadly throughout the cerebral cortex, are activated shortly after a conditioned stimulus, or after a primary reward if no conditioned stimulus exists. These neurons are equally active for positive and negative reinforcers, and have been shown to be related to neuroplasticity in many cortical regions. Evidence also exists that dopamine is activated at similar times. There is considerable evidence that dopamine participates in both reinforcement and aversive learning. Dopamine pathways project much more densely onto frontal cortex regions. Cholinergic projections, in contrast, are dense even in the posterior cortical regions like the primary visual cortex. A study of patients with Parkinson's disease, a condition attributed to the insufficient action of dopamine, further illustrates the role of dopamine in positive reinforcement. It showed that while off their medication, patients learned more readily with aversive consequences than with positive reinforcement. Patients who were on their medication showed the opposite: positive reinforcement proved to be the more effective form of learning when dopamine activity was high.
A neurochemical process involving dopamine has been suggested to underlie reinforcement. When an organism experiences a reinforcing stimulus, dopamine pathways in the brain are activated. This network of pathways "releases a short pulse of dopamine onto many dendrites, thus broadcasting a global reinforcement signal to postsynaptic neurons." This allows recently activated synapses to increase their sensitivity to efferent (conducting outward) signals, thus increasing the probability of occurrence for the recent responses that preceded the reinforcement. These responses are, statistically, the most likely to have been the behavior responsible for successfully achieving reinforcement. But when the application of reinforcement is either less immediate or less contingent (less consistent), the ability of dopamine to act upon the appropriate synapses is reduced.


Questions about the law of effect

A number of observations seem to show that operant behavior can be established without reinforcement in the sense defined above. Most cited is the phenomenon of autoshaping (sometimes called "sign tracking"), in which a stimulus is repeatedly followed by reinforcement, and in consequence the animal begins to respond to the stimulus. For example, a response key is lighted and then food is presented. When this is repeated a few times a pigeon subject begins to peck the key even though food comes whether the bird pecks or not. Similarly, rats begin to handle small objects, such as a lever, when food is presented nearby. Strikingly, pigeons and rats persist in this behavior even when pecking the key or pressing the lever leads to less food (omission training). Another apparent operant behavior that appears without reinforcement is
contrafreeloading. These observations and others appear to contradict the law of effect, and they have prompted some researchers to propose new conceptualizations of operant reinforcement. A more general view is that autoshaping is an instance of classical conditioning; the autoshaping procedure has, in fact, become one of the most common ways to measure classical conditioning. In this view, many behaviors can be influenced by both classical contingencies (stimulus-response) and operant contingencies (response-reinforcement), and the experimenter's task is to work out how these interact.


Applications

Reinforcement and punishment are ubiquitous in human social interactions, and a great many applications of operant principles have been suggested and implemented. The following are some examples.


Addiction and dependence

Positive and negative reinforcement play central roles in the development and maintenance of
addiction and drug dependence. An addictive drug is intrinsically rewarding; that is, it functions as a primary positive reinforcer of drug use. The brain's reward system assigns it incentive salience (i.e., it is "wanted" or "desired"), so as an addiction develops, deprivation of the drug leads to craving. In addition, stimuli associated with drug use – e.g., the sight of a syringe, and the location of use – become associated with the intense reinforcement induced by the drug. These previously neutral stimuli acquire several properties: their appearance can induce craving, and they can become conditioned positive reinforcers of continued use. Thus, if an addicted individual encounters one of these drug cues, a craving for the associated drug may reappear. For example, anti-drug agencies previously used posters with images of drug paraphernalia as an attempt to show the dangers of drug use. However, such posters are no longer used because of the effects of incentive salience in causing relapse upon sight of the stimuli illustrated in the posters. In drug-dependent individuals, negative reinforcement occurs when a drug is self-administered in order to alleviate or "escape" the symptoms of physical dependence (e.g., tremors and sweating) and/or psychological dependence (e.g., anhedonia, restlessness, irritability, and anxiety) that arise during the state of drug withdrawal.


Animal training

Animal trainers and pet owners were applying the principles and practices of operant conditioning long before these ideas were named and studied, and animal training still provides one of the clearest and most convincing examples of operant control. Of the concepts and procedures described in this article, a few of the most salient are the following: (a) availability of primary reinforcement (e.g. a bag of dog treats); (b) the use of secondary reinforcement (e.g. sounding a clicker immediately after a desired response, then giving a treat); (c) contingency, assuring that reinforcement (e.g. the clicker) follows the desired behavior and not something else; (d) shaping, as in gradually getting a dog to jump higher and higher; (e) intermittent reinforcement, as in gradually reducing the frequency of reinforcement to induce persistent behavior without satiation; (f) chaining, where a complex behavior is gradually constructed from smaller units. Animal training at SeaWorld is one example of operant conditioning in practice. Animal training makes use of both positive and negative reinforcement, and schedules of reinforcement may play a large role in how training proceeds.


Applied behavior analysis

Applied behavior analysis is the discipline initiated by
B. F. Skinner that applies the principles of conditioning to the modification of socially significant human behavior. It uses the basic concepts of conditioning theory, including conditioned stimulus (S^C), discriminative stimulus (S^d), response (R), and reinforcing stimulus (S^rein or S^r for reinforcers, sometimes S^ave for aversive stimuli). A conditioned stimulus controls behaviors developed through respondent (classical) conditioning, such as emotional reactions. The other three terms combine to form Skinner's "three-term contingency": a discriminative stimulus sets the occasion for responses that lead to reinforcement. Researchers have found the following protocol to be effective when they use the tools of operant conditioning to modify human behavior:

1. State goal. Clarify exactly what changes are to be brought about. For example, "reduce weight by 30 pounds."
2. Monitor behavior. Keep track of behavior so that one can see whether the desired effects are occurring. For example, keep a chart of daily weights.
3. Reinforce desired behavior. For example, congratulate the individual on weight losses. With humans, a record of behavior may serve as a reinforcement. For example, when a participant sees a pattern of weight loss, this may reinforce continuance in a behavioral weight-loss program. However, individuals may perceive reinforcement which is intended to be positive as negative and vice versa. For example, a record of weight loss may act as negative reinforcement if it reminds the individual how heavy they actually are. The token economy is an exchange system in which tokens are given as rewards for desired behaviors. Tokens may later be exchanged for a desired prize or rewards such as power, prestige, goods or services.
4. Reduce incentives to perform undesirable behavior. For example, remove candy and fatty snacks from kitchen shelves.

Practitioners of applied behavior analysis (ABA) bring these procedures, and many variations and developments of them, to bear on a variety of socially significant behaviors and issues. In many cases, practitioners use operant techniques to develop constructive, socially acceptable behaviors to replace aberrant behaviors. The techniques of ABA have been effectively applied to such things as early intensive behavioral interventions for children with an
autism spectrum disorder (ASD), research on the principles influencing criminal behavior, HIV prevention, conservation of natural resources, education, gerontology, health and exercise, industrial safety, language acquisition, littering, medical procedures, parenting, psychotherapy, seatbelt use, severe mental disorders, sports, substance abuse, phobias, pediatric feeding disorders, and zoo management and care of animals. Some of these applications are among those described below.


Child behavior – parent management training

Providing positive reinforcement for appropriate child behaviors is a major focus of parent management training. Typically, parents learn to reward appropriate behavior through social rewards (such as praise, smiles, and hugs) as well as concrete rewards (such as stickers or points towards a larger reward as part of an incentive system created collaboratively with the child) [Kazdin AE (2010). Problem-solving skills training and parent management training for oppositional defiant disorder and conduct disorder. Evidence-based psychotherapies for children and adolescents (2nd ed.), 211–226. New York: Guilford Press]. In addition, parents learn to select simple behaviors as an initial focus and reward each of the small steps that their child achieves towards reaching a larger goal (this concept is called "successive approximations") [Forgatch MS, Patterson GR (2010). Parent management training – Oregon model: An intervention for antisocial behavior in children and adolescents. Evidence-based psychotherapies for children and adolescents (2nd ed.), 159–78. New York: Guilford Press].


Economics

Both psychologists and economists have become interested in applying operant concepts and findings to the behavior of humans in the marketplace. An example is the analysis of consumer demand, as indexed by the amount of a commodity that is purchased. In economics, the degree to which price influences consumption is called "the price elasticity of demand." Certain commodities are more elastic than others; for example, a change in price of certain foods may have a large effect on the amount bought, while gasoline and other everyday consumables may be less affected by price changes. In terms of operant analysis, such effects may be interpreted in terms of motivations of consumers and the relative value of the commodities as reinforcers.
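As a worked illustration of price elasticity of demand, the sketch below computes elasticity as the percentage change in quantity divided by the percentage change in price. It assumes the midpoint (arc) formula, which is one common convention among several, and every price and quantity in it is a hypothetical number chosen for illustration rather than data from the studies discussed here.

```python
def arc_price_elasticity(q_old, q_new, p_old, p_new):
    """Midpoint (arc) price elasticity of demand.

    Percentage changes are measured against the average of the old and new
    values, so the result is the same whichever direction the price moves.
    """
    pct_change_quantity = (q_new - q_old) / ((q_new + q_old) / 2)
    pct_change_price = (p_new - p_old) / ((p_new + p_old) / 2)
    return pct_change_quantity / pct_change_price


# Hypothetical snack food: a 10% price rise cuts purchases from 100 to 80 units.
snack = arc_price_elasticity(q_old=100, q_new=80, p_old=2.00, p_new=2.20)

# Hypothetical gasoline: the same relative price rise cuts purchases from 100 to 98.
gasoline = arc_price_elasticity(q_old=100, q_new=98, p_old=3.00, p_new=3.30)

# A magnitude above 1 is conventionally called elastic, below 1 inelastic.
snack_is_elastic = abs(snack) > 1
gasoline_is_elastic = abs(gasoline) > 1
```

With these made-up figures the snack comes out elastic and the gasoline inelastic, matching the qualitative contrast drawn in the paragraph above.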


Gambling – variable ratio scheduling

As stated earlier in this article, a variable ratio schedule yields reinforcement after the emission of an unpredictable number of responses. This schedule typically generates rapid, persistent responding. Slot machines pay off on a variable ratio schedule, and they produce just this sort of persistent lever-pulling behavior in gamblers. The variable ratio payoff from slot machines and other forms of gambling has often been cited as a factor underlying gambling addiction.
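The unpredictability of a variable ratio schedule can be made concrete with a short simulation. This is a minimal sketch under stated assumptions: the number of responses required for each reinforcer is drawn uniformly from 1 to `2 * mean_ratio - 1` (so that it averages `mean_ratio`), and the function names and parameters are hypothetical. Real slot machines and laboratory schedules may use other distributions.

```python
import random


def variable_ratio_schedule(mean_ratio, n_responses, seed=0):
    """Simulate which of n_responses are reinforced under a VR schedule.

    Returns a list of booleans, one per response (True = reinforced).
    """
    rng = random.Random(seed)
    upper = 2 * mean_ratio - 1
    required = rng.randint(1, upper)  # responses needed for the next reinforcer
    count = 0
    outcomes = []
    for _ in range(n_responses):
        count += 1
        if count >= required:
            outcomes.append(True)
            count = 0
            required = rng.randint(1, upper)  # new, unpredictable requirement
        else:
            outcomes.append(False)
    return outcomes


def responses_per_reinforcer(outcomes):
    """Number of responses emitted for each reinforcer that was delivered."""
    gaps, count = [], 0
    for reinforced in outcomes:
        count += 1
        if reinforced:
            gaps.append(count)
            count = 0
    return gaps


outcomes = variable_ratio_schedule(mean_ratio=5, n_responses=1000, seed=1)
gaps = responses_per_reinforcer(outcomes)
```

Here `gaps` varies from one reinforcer to the next, so the responder cannot tell which response will pay off; only the long-run average is fixed.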


Military psychology

Human beings have an innate resistance to killing and are reluctant to act in a direct, aggressive way towards members of their own species, even to save life. This resistance to killing has caused infantry to be remarkably inefficient throughout the history of military warfare. This phenomenon was not understood until S.L.A. Marshall (Brigadier General and military historian) undertook interview studies of WWII infantry immediately following combat engagement. Marshall's well-known and controversial book, Men Against Fire, revealed that only 15% of soldiers fired their rifles with the purpose of killing in combat. Following acceptance of Marshall's research by the US Army in 1946, the Human Resources Research Office of the US Army began implementing new training protocols which resemble operant conditioning methods. Subsequent applications of such methods increased the percentage of soldiers able to kill to around 50% in Korea and over 90% in Vietnam. Revolutions in training included replacing traditional pop-up firing ranges with three-dimensional, man-shaped, pop-up targets which collapsed when hit. This provided immediate feedback and acted as positive reinforcement for a soldier's behavior. Other improvements to military training methods have included the timed firing course; more realistic training; high repetitions; praise from superiors; marksmanship rewards; and group recognition. Negative reinforcement includes peer accountability or the requirement to retake courses. Modern military training conditions
mid-brain response to combat pressure by closely simulating actual combat, using mainly Pavlovian classical conditioning and Skinnerian operant conditioning (both forms of behaviorism).
Modern marksmanship training is such an excellent example of behaviorism that it has been used for years in the introductory psychology course taught to all cadets at the US Military Academy at West Point as a classic example of operant conditioning. In the 1980s, during a visit to West Point, B.F. Skinner identified modern military marksmanship training as a near-perfect application of operant conditioning.
Lt. Col. Dave Grossman writes of operant conditioning and US military training:
"It is entirely possible that no one intentionally sat down to use operant conditioning or behavior modification techniques to train soldiers in this area…But from the standpoint of a psychologist who is also a historian and a career soldier, it has become increasingly obvious to me that this is exactly what has been achieved."


Nudge theory

Nudge theory (or nudge) is a concept in
behavioural science, political theory and economics which argues that indirect suggestions aimed at achieving non-forced compliance can influence the motives, incentives and decision making of groups and individuals at least as effectively as, if not more effectively than, direct instruction, legislation, or enforcement.


Praise

The concept of praise as a means of behavioral reinforcement is rooted in B.F. Skinner's model of operant conditioning. Through this lens, praise has been viewed as a means of positive reinforcement, wherein an observed behavior is made more likely to occur by contingently praising said behavior. Hundreds of studies have demonstrated the effectiveness of praise in promoting positive behaviors, notably in the study of teacher and parent use of praise with children to promote improved behavior and academic performance, but also in the study of work performance. Praise has also been demonstrated to reinforce positive behaviors in non-praised adjacent individuals (such as a classmate of the praise recipient) through vicarious reinforcement. Praise may be more or less effective in changing behavior depending on its form, content and delivery. In order for praise to effect positive behavior change, it must be contingent on the positive behavior (i.e., only administered after the targeted behavior is enacted), must specify the particulars of the behavior that is to be reinforced, and must be delivered sincerely and credibly. Acknowledging the effect of praise as a positive reinforcement strategy, numerous behavioral and cognitive behavioral interventions have incorporated the use of praise in their protocols. The strategic use of praise is recognized as an evidence-based practice in both classroom management and parenting training interventions, though praise is often subsumed in intervention research into a larger category of positive reinforcement, which includes strategies such as strategic attention and behavioral rewards. Several studies have been done on the effect cognitive-behavioral therapy and operant-behavioral therapy have on different medical conditions. When patients developed cognitive and behavioral techniques that changed their behaviors, attitudes, and emotions, their pain severity decreased.
The results of these studies showed an influence of cognitions on pain perception, and the impact observed helped explain the general efficacy of cognitive-behavioral therapy (CBT) and operant-behavioral therapy (OBT).


Psychological manipulation

Braiker identified the following ways that manipulators
control their victims:
* Positive reinforcement: includes praise, superficial charm, superficial sympathy (crocodile tears), excessive apologizing, money, approval, gifts, attention, facial expressions such as a forced laugh or smile, and public recognition.
* Negative reinforcement: may involve removing one from a negative situation.
* Intermittent or partial reinforcement: Partial or intermittent negative reinforcement can create an effective climate of fear and doubt. Partial or intermittent positive reinforcement can encourage the victim to persist – for example in most forms of gambling, the gambler is likely to win now and again but still lose money overall.
* Punishment: includes nagging, yelling, the silent treatment, intimidation, threats, swearing, emotional blackmail, the guilt trip, sulking, crying, and playing the victim.
* Traumatic one-trial learning: using verbal abuse, explosive anger, or other intimidating behavior to establish dominance or superiority; even one incident of such behavior can condition or train victims to avoid upsetting, confronting or contradicting the manipulator.


Traumatic bonding

Traumatic bonding occurs as the result of ongoing cycles of abuse in which the intermittent reinforcement of reward and punishment creates powerful emotional bonds that are resistant to change [Chrissie Sanderson. Counselling Survivors of Domestic Abuse. Jessica Kingsley Publishers; 15 June 2008. p. 84].
Another source states: "The necessary conditions for traumatic bonding are that one person must dominate the other and that the level of abuse chronically spikes and then subsides. The relationship is characterized by periods of permissive, compassionate, and even affectionate behavior from the dominant person, punctuated by intermittent episodes of intense abuse. To maintain the upper hand, the victimizer manipulates the behavior of the victim and limits the victim's options so as to perpetuate the power imbalance. Any threat to the balance of dominance and submission may be met with an escalating cycle of punishment ranging from seething intimidation to intensely violent outbursts. The victimizer also isolates the victim from other sources of support, which reduces the likelihood of detection and intervention, impairs the victim's ability to receive countervailing self-referent feedback, and strengthens the sense of unilateral dependency... The traumatic effects of these abusive relationships may include the impairment of the victim's capacity for accurate self-appraisal, leading to a sense of personal inadequacy and a subordinate sense of dependence upon the dominating person. Victims also may encounter a variety of unpleasant social and legal consequences of their emotional and behavioral affiliation with someone who perpetrated aggressive acts, even if they themselves were the recipients of the aggression."


Video games

The majority of
video games are designed around a compulsion loop, adding a type of positive reinforcement through a variable rate schedule to keep the player playing. This can lead to the pathology of video game addiction. As part of a trend in the monetization of video games during the 2010s, some games offered loot boxes as rewards or as items purchasable with real-world funds. Boxes contain a random selection of in-game items. The practice has been tied to the same methods by which slot machines and other gambling devices dole out rewards, as it follows a variable rate schedule. While the general perception is that loot boxes are a form of gambling, the practice is only classified as such in a few countries. However, methods of using those items as virtual currency for online gambling or trading them for real-world money have created a skin gambling market that is under legal evaluation.


Workplace culture of fear

Ashforth discussed potentially destructive sides of
leadership and identified what he referred to as petty tyrants: leaders who exercise a tyrannical style of management, resulting in a climate of fear in the workplace [Petty tyranny in organizations, Ashforth, Blake, Human Relations, Vol. 47, No. 7, 755–778 (1994)]. Partial or intermittent negative reinforcement can create an effective climate of fear and doubt. When employees get the sense that bullies are tolerated, a climate of fear may be the result [Helge H, Sheehan MJ, Cooper CL, Einarsen S, "Organisational Effects of Workplace Bullying" in Bullying and Harassment in the Workplace: Developments in Theory, Research, and Practice (2010)]. Individual differences in sensitivity to reward, punishment, and motivation have been studied under the premises of reinforcement sensitivity theory and have also been applied to workplace performance. One of the many reasons proposed for the dramatic costs associated with healthcare is the practice of defensive medicine. Prabhu reviews the article by Cole and discusses how the responses of two groups of neurosurgeons are classic operant behavior. One group practiced in a state with restrictions on medical lawsuits and the other in a state with no restrictions. Both groups of neurosurgeons were queried anonymously on their practice patterns. The physicians in the group that practiced in a state with no restrictions on medical lawsuits changed their practice in response to negative feedback (fear of lawsuits) [Operant Conditioning and the Practice of Defensive Medicine. Vikram C. Prabhu, World Neurosurgery, 2016-07-01, Volume 91, Pages 603–605].


External links


Operant conditioning article in Scholarpedia

Journal of Applied Behavior Analysis

Journal of the Experimental Analysis of Behavior



scienceofbehavior.com