Nearly thirty years ago now, I popularized the use of food lures and rewards in dog training. Way back then, people had to wait until their dogs were six months to a year old to get into obedience class, largely because leash training was considered too challenging for their immature brains. However, when I used science-based lure/reward methods, it was obvious that puppies could learn very quickly and master quite complex tasks. In fact, I once demonstrated a puppy performing the Utility Signal Exercise in week 3 of Puppy Class.
Unfortunately, people have forgotten that the food lure is meant to be phased out in the very first session. After just a dozen or so trials, hand signals (with no food) work just as well to teach the puppy verbal commands. If you continue to use food as a lure and as a reward, it strongly approximates a bribe, which of course the dog will most probably blow off as soon as he develops competing doggy interests during adolescence. Also, quite simply, food rewards should be replaced by life rewards because the latter are much more effective.
This may sound strange coming from the person who championed the use of food for dog training, but nowadays most people simply use too much food and, as a result, their dogs’ behavior becomes unreliable and sloppy. (Of course, I am specifically talking about putting behaviors on cue here; one can never use too much food for Classical Conditioning, such as when encouraging children to hand feed dogs so the dogs learn, “We love children!”)
Rewarding dogs on a continuous basis has several drawbacks. Initially, the dog learns very quickly, but glitches develop in the dog’s behavior over time. When they figure out that they’re likely to get food rewards anyway, some dogs respond only when and where they want. Or the dog may eventually respond to the owner’s commands, but in his own good time. Other dogs will only respond if the owner has food at hand and will otherwise ignore all requests. Luckily, intermittent reinforcement increases reliability.
The easiest way to phase out food rewards is to ask “more for less” — to progressively increase the number of responses required for each food reward. For example, the average Golden Retriever puppy will gladly perform 20 puppy push-ups (down and sits) for the prospect of a single treat. There is another way, too. The study of learning theory has come up with many reinforcement schedules for paring down the number of rewards while increasing their effectiveness in maintaining learned behavior.
We now know that food rewards may be dispensed intermittently according to fixed or variable schedules. However, fixed schedules tend to cause marked variations in the rate and quality of behavior. Fixed Interval schedules (rewarding a dog after a fixed time, for example after every 15 seconds of sit-stay) cause decreases in the quantity and quality of responses immediately following each reward. The dog learns that after each reward another one won’t be coming for some time, so he slows down, loses attention, and becomes less reliable.
Fixed Ratio schedules (rewarding a dog after a fixed number of responses, for example after every 10 puppy pushups) cause an increase in the rate and speed of responding but a decrease in the quality of responses. The dog is motivated to work fast but gets sloppy in the process. And of course, if stretched too thin, Fixed Ratio schedules may cause the dog to stop working altogether. The dog goes on strike because he wants more rewards for fewer responses.
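For readers who like to see a schedule written down, here is a minimal Python sketch of the two fixed schedules described above. The function names and parameters are illustrative assumptions, not part of any training curriculum; the sketch only shows when rewards fall, not how a dog reacts to them.

```python
def fixed_ratio(n_responses, ratio):
    """Return the response numbers (1-indexed) that earn a reward
    when every `ratio`-th response is rewarded."""
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


def fixed_interval(duration_s, interval_s):
    """Return the times, in seconds, at which a reward is given
    during a stay lasting `duration_s` seconds."""
    return list(range(interval_s, duration_s + 1, interval_s))


# Hypothetical sessions: 20 puppy pushups on Fixed Ratio 10,
# and a one-minute sit-stay on Fixed Interval 15 seconds.
pushup_rewards = fixed_ratio(20, 10)    # rewards on pushups 10 and 20
stay_rewards = fixed_interval(60, 15)   # rewards at 15, 30, 45 and 60 seconds
```

Because the rewards arrive at perfectly predictable points, the dog can learn exactly when the next one is not coming, which is the weakness of both schedules.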
Now, no one in their right mind would use fixed schedules to train a puppy — young dogs are much too unreliable. (Ironically though, the entire human world’s work force is “maintained” on fixed schedules — Pay Day (fixed interval) and Piece Rate (fixed ratio). Duh!)
Variable Interval and Variable Ratio schedules are much better at motivating dogs to respond quickly and reliably, while at the same time reducing the number of rewards that are necessary. Think about the difference in commitment and motivation between people working a slot machine (Variable Ratio reinforcement) and those using a food vending machine (Continuous Reinforcement). Using a Variable Ratio 5 schedule, for example, the dog is rewarded on average after five trials; however, he may be rewarded for two trials in succession or he may go a dozen trials without a single reward.
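A Variable Ratio 5 schedule can be sketched in a few lines of Python. The gap list below is a hypothetical example chosen so that the gaps average exactly five; it includes back-to-back rewards (gaps of 1) and one stretch of a dozen trials. Shuffling a prepared list is only one way to build such a schedule, and the names here are assumptions made for illustration.

```python
import random

# Hypothetical gaps: number of responses required for each reward.
# Ten gaps that sum to 50, so the average is five responses per reward.
VR5_GAPS = [1, 1, 2, 3, 4, 5, 6, 7, 9, 12]


def variable_ratio_trials(gaps, rng):
    """Return one boolean per trial: True where the response is rewarded."""
    order = list(gaps)
    rng.shuffle(order)  # the dog cannot predict which gap comes next
    trials = []
    for gap in order:
        # (gap - 1) unrewarded responses, then one rewarded response.
        trials.extend([False] * (gap - 1) + [True])
    return trials


trials = variable_ratio_trials(VR5_GAPS, random.Random(1))
responses_per_reward = len(trials) / sum(trials)  # 50 trials / 10 rewards
```

Whatever the shuffle order, the session contains 50 trials and 10 rewards; only the spacing between rewards changes.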
Variable reinforcement schedules teach the dog to work for longer and longer periods without rewards, so he is less likely to refuse to respond if you don’t have food or a toy at hand. However, there’s a big problem. No person can progressively compute the schedules and train a dog at the same time. Thus, these schedules are theoretically interesting but have limited practical application for training animals or teaching people.
But you know what’s so entirely wonderful about reward-training? Random reinforcement is just as effective for maintaining reliability of responses. This of course is in stark contrast to the use of punishment, wherein effectiveness absolutely depends on consistency: you must punish the dog each and every time that it misbehaves. With reward training, you can actually be as inconsistent as you like, reward your dog at random, and he will still be motivated to do your bidding.
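To show how little bookkeeping random reinforcement needs, here is a minimal Python sketch. It assumes that "random" means each response is rewarded independently with some fixed probability; that reading, the function name, and the 20 percent figure are assumptions made for illustration.

```python
import random


def random_rewards(n_trials, p, rng):
    """Return one boolean per trial: each response is rewarded
    independently with probability p."""
    return [rng.random() < p for _ in range(n_trials)]


# Hypothetical session: 100 responses, each with a 1-in-5 chance of a reward.
session = random_rewards(100, 0.2, random.Random())
reward_count = sum(session)  # varies from session to session
```

There is no schedule to compute or remember here, which is what makes the approach workable while actually handling a dog.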
Although variable and random reinforcement schedules are great at improving a dog’s work ethic and increasing the speed and quantity of responses, none of the above schedules do anything to improve the quality of the dog’s responses.
The quality of behavior never improves because all of the above reinforcement schedules reward the dog for just as many below-average responses as above-average responses. This is really just too silly for words. I wouldn’t use any of these schedules when motivating and training a puppy/dog or a person.
To motivate the dog to maintain high rates of responding and to ensure ongoing qualitative improvement to speed, precision, pizzazz, and panache, Differential Reinforcement is the only way to go. But just doing it is not sufficient. Consequential feedback should reflect the quality of the response. In order to even be considered for a reward, the dog’s response must at least be above average; better responses get better rewards and the best responses get the best rewards. As a rule of thumb, never reward your dog for more than a third of his responses. This ensures that his behavior will improve from day to day.
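The rule of thumb above can be written down as a rough Python sketch. It assumes each response is given a numeric quality score (higher is better) and is compared with the dog's earlier responses; the scoring scale, the tier names, and the 80 and 95 percent cut-offs are hypothetical choices, not prescribed values.

```python
from statistics import mean


def choose_reward(score, history):
    """Pick a reward tier for one response, given earlier quality scores."""
    if not history or score <= mean(history):
        return "none"  # must be above average to even be considered
    n = len(history)
    beaten = sum(s < score for s in history)  # earlier responses this one beats
    if 3 * beaten < 2 * n:
        return "none"    # not in roughly the top third, so no reward
    if 100 * beaten >= 95 * n:
        return "best"    # hypothetical cut-off for the very best responses
    if 100 * beaten >= 80 * n:
        return "better"  # hypothetical cut-off for better responses
    return "good"


# Hypothetical history of ten earlier responses scored 1 through 10.
history = list(range(1, 11))
tiers = [choose_reward(s, history) for s in (5, 6, 8, 9, 11)]
```

With this history, a response scored 6 is above average but still earns nothing, because it beats only half of the earlier responses. Since the bar is set by the dog's own earlier responses, it rises as his performance improves.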
Ian Dunbar is a veterinarian, canine behaviorist, and puppy training pioneer. He is the founder of SIRIUS® Puppy Training; Scientific Director for www.dogstardaily.com; and author of several best-selling books and videos. For more information, visit www.siriuspup.com.