Operant Conditioning, Reinforcement and Rewards

Session 173

Get ready for a super high-yield topic! Listen and learn how to sort through positive and negative reinforcement and punishment. Phil from Blueprint MCAT (formerly Next Step Test Prep) joins us for another deep dive into the MCAT content outline, specifically AAMC Content Category 7C.

Phil is also the “office hours” guy at Blueprint MCAT (formerly Next Step Test Prep) for four out of the five days of the live office hours that you get access to when you go through their MCAT course. Check out the full review of that course on MCATCourseReview.com.

Listen to this podcast episode with the player above, or keep reading for the highlights and takeaway points.

[02:10] A Brief Overview of Content Category 7C

Operant conditioning is one of the highest-yield areas in this category. It actually shows up in three or four different sections, including language acquisition. This is B.F. Skinner’s operant conditioning: trying to increase and decrease behaviors.

'It's something you're dealing with in your family trying to change behaviors with children.'

Reinforcement is done to increase some behavior. Punishment is done to decrease a behavior.

For instance, if you want your kid to brush their teeth more, you use a reinforcement. If you want them to stop kicking their sibling, you give them a punishment.

[03:20] Positive vs. Negative Reinforcement

Positive means you added something and negative means you remove something.

Positive Reinforcement

If you’re trying to get a kid to brush their teeth more often, you give them a candy bar. You’re trying to increase their behavior of brushing their teeth by giving something.

Negative Reinforcement

Negative reinforcement is when you take something away. For example, if you get straight A’s, you don’t have to do chores anymore. So you’re reinforcing the behavior by taking something away.

'Taking something bad and giving something good is how you reinforce a behavior.'

[04:39] Positive vs. Negative Punishment

Positive Punishment

Punishment can be positive or negative as well. For example, say you yell at your dog whenever it gets on the couch. This is positive punishment: you’re adding something unpleasant to the dog’s life to try to get it to stop doing something. Or if your kid gets a bad grade, you punish the behavior by giving them extra chores.

Negative Punishment

Negative punishment is when you take something away. If your kid gets bad grades, they don’t get to go to the prom. You’re taking away something from them so it’s a negative but it’s also a punishment because you’re trying to decrease that behavior of bad grades.

'Think about whether you're adding or taking something away in order to pull those things apart.'
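The two questions above (was something added or taken away, and is the goal to increase or decrease the behavior) can be sketched as a tiny lookup. This is only an illustrative study aid, not anything from the episode: the function name `classify_consequence` and its parameter names are made up for this sketch.

```python
def classify_consequence(stimulus_added: bool, behavior_should_increase: bool) -> str:
    """Name the operant-conditioning quadrant from two yes/no questions.

    Hypothetical helper for illustration; the parameter names are assumptions.
    """
    # Positive = something is added; negative = something is taken away.
    sign = "positive" if stimulus_added else "negative"
    # Reinforcement = increase the behavior; punishment = decrease it.
    kind = "reinforcement" if behavior_should_increase else "punishment"
    return f"{sign} {kind}"


# Examples taken from the sections above: (description, added?, increase?)
examples = [
    ("candy bar for brushing teeth", True, True),
    ("no more chores after straight A's", False, True),
    ("yelling at the dog on the couch", True, False),
    ("no prom after bad grades", False, False),
]

for description, added, increase in examples:
    print(f"{description}: {classify_consequence(added, increase)}")
```

Answering the two questions separately, one at a time, is the point: neither question depends on the other.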

[06:00] Reinforcement Schedules

The MCAT is going to ratchet up the difficulty by asking about the schedule. You can have a continuous schedule: for instance, every time your kid brushes their teeth, they get a sticker. This is hard to keep up.

As somebody who rewards or punishes, you may want to vary the schedule. You can have a fixed schedule or a variable schedule. Is it a set number of occurrences or a set amount of time? Or is it more variable and fluid?

It can also be ratio-based, which is based on occurrences, or interval-based, which is based on time. This can get very confusing.

'Ratio-based is based on occurrences; interval-based is based on time.'

Interval-Based Schedule vs. Ratio-Based Schedule

For example, you want your kid to play the piano. You give him $5 every time he plays for an hour. It’s positive reinforcement because you’re giving him something to increase a behavior. But you’re doing this based on a timeframe: the reward isn’t for every time he sits down at the piano, it’s for every hour he plays. So there’s an interval going on here.

Using the same example, a ratio-based schedule is when you give the reward every time he sits down at the piano. It’s a bit different from giving it every hour.

Fixed-Interval vs. Variable-Interval

If the reward is always given every hour, it’s fixed-interval. It’s a set amount of time.

A variable-interval schedule would be like this: the kid plays for an hour and you give him $5, then he plays for another 20 minutes and you give him another $5, and then three hours later you give him another $5. It’s unexpected because he doesn’t know when the reward is going to come.

[08:15] Quiz Mode On

Your dog is still going onto the couch. Sometimes you yell at him and sometimes you don’t. How do you categorize this? It’s ratio-based because it’s based on occurrences, and it’s unpredictable, so this is variable-ratio.

If it were variable-interval, your dog is on the couch and you always end up yelling at him, but it might take you two minutes or an hour and a half. The amount of time is what changes in that case.

It gets complex as you start to work through all of these different schedules. There’s fixed-ratio positive punishment, variable-interval negative reinforcement, and so on.

'You have to have a systematic way of working through these and looking at these one step at a time.'
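One way to work through the schedules one step at a time is to ask two more questions: is the consequence tied to occurrences or to time, and is it predictable? The sketch below is a hypothetical helper for practicing that sorting (the name `classify_schedule` and its parameters are assumptions, not from the episode), and it only names the schedule rather than simulating it.

```python
def classify_schedule(based_on: str, is_predictable: bool) -> str:
    """Name a schedule from its basis and its predictability.

    based_on: "occurrences" (ratio) or "time" (interval).
    is_predictable: True for a set amount, False if it changes.
    Hypothetical helper for illustration only.
    """
    if based_on not in ("occurrences", "time"):
        raise ValueError("based_on must be 'occurrences' or 'time'")
    pattern = "fixed" if is_predictable else "variable"
    basis = "ratio" if based_on == "occurrences" else "interval"
    return f"{pattern}-{basis}"


# Examples taken from the sections above.
print(classify_schedule("time", True))          # $5 for every hour of piano
print(classify_schedule("time", False))         # $5 after unpredictable stretches
print(classify_schedule("occurrences", False))  # sometimes yelling at the dog
```

Combining this label with the positive/negative and reinforcement/punishment labels gives the full name, such as variable-ratio positive punishment for the dog example.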

[09:53] Sorting Through the Differences

Gambling at a slot machine is variable-ratio positive reinforcement. You’re paid out, so something is given to you to get you to do it more.

Variable-ratio is actually the most powerful for increasing the behavior. This is something the MCAT might ask.

'Generally, ratio is better than interval, and variable is better than fixed.'

For example, sometimes you give your kid a dollar and sometimes you don’t. You could give him a dollar three times in a row, and then go four times in a row without giving him one. The idea here is he doesn’t know when he’s going to get paid.

So every time he sits down, there’s a chance he’s going to get paid just for sitting down. This causes him to do that behavior more often because at every moment, there’s a chance. With a fixed schedule, by contrast, right after he gets paid there’s no incentive to keep playing.

Fixed-interval tends to be worse than variable-ratio. Variable-ratio is what you see on slot machines, and even on social media. If you know that every time you check your email there are going to be five new messages, that’s not exciting. But if sometimes there are zero, sometimes there’s one, and sometimes there are 40, you keep checking.

This is something to keep in mind if you want to teach your kid to play the piano for example.

But if you’re doing variable-ratio in everything, it feels like you’re putting the kid in a weird place. They’re never going to know when the punishment or rewards are going to come. And they’re constantly on the edge of their seats.

Although it’s best for increasing the behavior, Phil doesn’t think this is best for the psychology of the child.

[12:20] Operant Conditioning – Skinner

All the examples mentioned above are part of operant conditioning. The whole idea here is rewards and punishments. B.F. Skinner was a behaviorist who focused on behavior and how to increase and decrease it.

He had this idea that if you gave him a child, he could turn that child into a pillar of humanity by constantly rewarding good behavior and punishing the child whenever they lie or steal. Or it could be the opposite: reward them every time they lie and punish them every time they tell the truth, and they’d turn into a deceitful person.

Here’s a good way to remember this for your MCAT prep: operant conditioning is Skinner and the slot machine. You will rub the skin off your palms if you play the slot machine for too long.

[13:15] Classical Conditioning

Classical conditioning is Pavlov. The idea here is that every time the dog gets meat, it starts to drool. Pavlov wanted to see if he could create a different relationship. So he’d ring a bell every time he gave the dog some meat. Eventually, he would just ring the bell and the dog would start to drool without any meat involved.

There’s a really specific language for classical conditioning – conditioned, unconditioned, stimuli, and responses.

The unconditioned stimulus and response are the ones that would just happen naturally. Meat and drooling are the unconditioned stimulus and unconditioned response, whereas the bell and drooling are the conditioned stimulus and the conditioned response.

Drooling is both the unconditioned and the conditioned response. The two responses should always be the same behavior.

'The idea here is you're trying to get something else to cause the same effect.'

You can tie some behavior to something that’s already naturally occurring. Apply this to MCAT studying, for example. If you are really happy when you’re listening to a certain song, maybe you should study while listening to that song. The song and the happiness are the unconditioned stimulus and response. Once you relate studying to that song, studying becomes the conditioned stimulus, and studying is going to make you happy.

[15:50] Classical Conditioning vs. Operant Conditioning

Classical conditioning is tying some behavior to something that’s already naturally occurring. Operant conditioning is a lot more specific. You’re really trying to increase the behavior with punishments or reinforcements. Make sure you can tell the difference between these two.

'The conditioned response and the unconditioned response are always the same.'

Say an MCAT question asks, “Based on the passage, doing a backflip was what?” If you see both unconditioned response and conditioned response among your answer choices, remember that those have to be the same behavior. You can’t have two correct answers, so that means they’re both wrong. That’s the way it’s set up, and it will help you cross off answers.

[18:55] Final Thoughts

Be aware of these concepts and try to understand and use them. Make sure you’re able to tell the differences. Operant conditioning is the more complex of the two, and it’s easy to trip up on it.

'Coming up with your own examples is always the best way to work through those.'

Start thinking about something you may have tried to get someone to do or stop doing. Or something your parents did to get you to start or stop doing something. Figure out what category those fit into.