Introduction to Operant Conditioning

Operant conditioning is the use of consistent consequences to mold a desired type of behaviour. While the use of operant conditioning traces back at least as far as the history of animal husbandry, it was originally identified and generalised as a behaviourist training technique by Burrhus Frederic Skinner, who demonstrated its use in rats and pigeons and advocated its use in prison rehabilitation. Other human applications of operant conditioning range from early childhood education to modern marketing techniques.

There are four types of operant conditioning: positive reinforcement, negative reinforcement, punishment, and extinction. The first two reinforce a desired behaviour, while the last two aim to weaken and ideally remove an undesired behaviour.

In positive reinforcement, a living organism is rewarded for having performed a desired action. Negative reinforcement, in contrast, “rewards” an organism for having performed a desired action by withdrawing some undesirable condition. Perhaps the best-known human example of these is the application of grades to academic work.

Punishment is the technique of following an undesired behaviour immediately with an undesired condition. In time, the organism learns not to exhibit the undesired behaviour. One possible outcome of punishment is learned helplessness. In this situation, the organism has learned that nothing it can do will make a difference to its condition, and thus it stops trying, even if its environment changes. The behaviour that has been extinguished here is any attempt to improve the organism’s own conditions. In effect, it has given up. Learned helplessness can be acquired extremely quickly and is particularly difficult to extinguish.

Extinction occurs when a behaviour, previously rewarded, ceases to be rewarded. In time, the behaviour may vanish if it is not being otherwise rewarded. However, behaviours can be extremely persistent, and even very sporadic rewards can suffice to keep the behaviour active.
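The contrast between full extinction and sporadic reward can be illustrated with a small numerical sketch. This is a toy model and not something taken from the text above: it assumes a behaviour has a "strength" between 0 and 1 that moves a fixed fraction of the way toward 1 on a rewarded trial and toward 0 on an unrewarded one. The names update_strength and run_trials, the learning rate of 0.2, and the reward schedules are all hypothetical choices made for illustration.

```python
def update_strength(strength, rewarded, learning_rate=0.2):
    """Move strength toward 1.0 if rewarded, toward 0.0 otherwise (assumed toy rule)."""
    target = 1.0 if rewarded else 0.0
    return strength + learning_rate * (target - strength)


def run_trials(strength, reward_schedule, learning_rate=0.2):
    """Apply the update once per trial and return the strength after each trial."""
    history = []
    for rewarded in reward_schedule:
        strength = update_strength(strength, rewarded, learning_rate)
        history.append(strength)
    return history


# Acquisition: the behaviour is rewarded on every one of 20 trials.
acquisition = run_trials(0.0, [True] * 20)

# Extinction: starting from the acquired strength, the reward never comes again.
extinction = run_trials(acquisition[-1], [False] * 20)

# Sporadic reward: starting from the same strength, only every fifth trial is rewarded.
sporadic = run_trials(acquisition[-1], [i % 5 == 0 for i in range(20)])

print("after acquisition:", round(acquisition[-1], 3))
print("after extinction: ", round(extinction[-1], 3))
print("after sporadic:   ", round(sporadic[-1], 3))
```

Under these assumptions, the strength decays toward zero once rewards stop entirely, while the occasional reward keeps it noticeably above that level. The sketch says nothing about how quickly real organisms extinguish a behaviour; it only shows the direction of the effect described above.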

All forms of operant conditioning follow the same sequence: stimulus, conditioned response, reward. In time, the organism may perform the desired action without being rewarded at all, although removing the reward consistently over an extended period may result in extinction of the behaviour. Even if the reward arrives regardless of whether a particular action is performed, whatever action occurred just before the reward may be reinforced unintentionally: it is in this manner that superstitions are born. The reward may be tangible, such as a food pellet or the removal of an electric shock, or it may symbolise status or material gains.

The strongly behaviourist tenets of operant conditioning do not take into account genetic predisposition, where prior behavioural patterns may conflict with the cognitive learning required by operant conditioning, or latent learning, where new information may be absorbed independent of reward. Interestingly, both genetic predisposition and latent learning can sometimes be permanently erased by the successful introduction of an operant-conditioned, reward-oriented behaviour. For example, children who had been rewarded for drawing pictures and then had that reward withdrawn ended up drawing much less than children who had never been rewarded for drawing pictures. This finding has significant implications for our heavily grade- and payscale-oriented schools and workplaces.
