What is the Base Rate Fallacy?
The concept in one sentence:
We tend to focus on specific information and ignore general statistics.
The concept in one quote:
Many situations present the decision maker with two kinds of information: background or base-rate information about how things usually are in such situations and indicator or diagnostic information telling how things appear to be in the particular situation.
Failure to consider background information in situations in which it is actually very relevant is called the base-rate fallacy. Such failure appears to be very widespread and to affect even trained statisticians when they rely on intuition rather than calculation.
The benefit of avoiding the bias:
Make more accurate probability judgments.
Bob is an opera fan who enjoys touring art museums when on holiday. Growing up, he enjoyed playing chess with family members and friends.
Which situation is more likely?
1. Bob plays trumpet for a major symphony orchestra
2. Bob is a farmer
A large proportion of people will choose option 1 in the above problem because Bob’s description matches the stereotype we may hold about classical musicians rather than farmers. In reality, option 2 is far more likely, because farmers make up a much larger proportion of the population than symphony trumpet players do.
Donald Jones is either a librarian or a salesman. His personality can best be described as shy.
What are the odds that he is a librarian?
Prenuptial agreements are marriage contracts that set out how your finances and properties will be resolved if you and your partner get divorced.
Should most couples get a prenuptial agreement before their marriage?
A group of police officers has breathalyzers that falsely indicate drunkenness in 5% of the cases in which the driver is sober. However, the breathalyzers never fail to detect a truly drunk person. One in a thousand drivers is driving drunk. Suppose the police officers then stop a driver at random to administer a breathalyzer test. It indicates that the driver is drunk. Assume you do not know anything else about the driver.
What is the probability that they are drunk?
In a city of 100 inhabitants let there be 1 terrorist and 99 non-terrorists. In an attempt to catch the terrorist, the city installs an alarm system with a surveillance camera and automatic facial recognition software. The software has a failure rate of 5%—if the camera scans a non-terrorist, the alarm will ring 5% of the time, but will not ring 95% of the time.
Suppose that an inhabitant triggers the alarm. What is the chance that the person is a terrorist?
You buy a "Diet Coke 6 Pack" labeled "50% extra free".
How many cans are free?
The acceptance rate at MIT is 6.7%. In other words, of 100 students who apply, only about 7 are admitted. This means the school is very selective. Imagine your child is brilliant (for example, top scores at school).
What is the probability that they would be accepted?
A faith healer states: "Faith healing works, but not all the time, especially when one's faith is not strong enough. Unbiased, empirical tests demonstrate that a small but noticeable percentage of people are cured of 'incurable' diseases such as cancer."
Does that mean that "Faith healing" works?
Bonus Exercise — Facebook Interview Question (Data Scientist)
You're about to get on a plane to Seattle. You want to know if you should bring an umbrella. You call 3 random friends of yours who live there and ask each independently if it's raining. Each of your friends has a 2/3 chance of telling you the truth and a 1/3 chance of messing with you by lying. All 3 friends tell you that "Yes" it is raining.
What is the probability that it's actually raining in Seattle?
Answer to Exercise 1
When we use this little problem in seminars, the typical response goes something like this: “Oh, it’s pretty clear that he’s a librarian. It’s much more likely that a librarian will be shy; salesmen usually have outgoing personalities. The odds that he’s a librarian must be at least 90 percent.” Sounds good, but it’s totally wrong.
The trouble with this logic is that it neglects to consider that there are far more salesmen than male librarians. In fact, in the United States, salesmen outnumber male librarians 100 to 1. Before you even considered the fact that Donald Jones is “shy,” therefore, you should have assigned only a 1 percent chance that Jones is a librarian. That is the base rate.
Now, consider the characteristic “shy.” Suppose half of all male librarians are shy, whereas only 5 percent of salesmen are. That works out to 10 shy salesmen for every shy librarian — making the odds that Jones is a librarian closer to 10 percent than to 90 percent. Ignoring the base rate can lead you wildly astray.
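The arithmetic above can be checked with the odds form of Bayes' rule. The following is a minimal sketch, not part of the original answer; the function name `posterior_from_odds` and its parameter names are made up for illustration, and the inputs are the figures quoted in the text (100 salesmen per male librarian, half of librarians shy, 5 percent of salesmen shy).

```python
def posterior_from_odds(prior_odds, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) using the odds form of Bayes' rule."""
    # How much more likely the evidence is under the hypothesis than under the alternative.
    likelihood_ratio = p_evidence_if_true / p_evidence_if_false
    posterior_odds = prior_odds * likelihood_ratio
    # Convert odds back into a probability.
    return posterior_odds / (1 + posterior_odds)


# Figures from the text (hypothesis: "Jones is a librarian"; evidence: "Jones is shy").
p_librarian = posterior_from_odds(
    prior_odds=1 / 100,        # 1 male librarian per 100 salesmen
    p_evidence_if_true=0.5,    # half of male librarians are shy
    p_evidence_if_false=0.05,  # 5 percent of salesmen are shy
)
print(f"P(librarian | shy) = {p_librarian:.1%}")  # about 9.1%
```

The result is roughly 9 percent, which matches the "closer to 10 percent than to 90 percent" conclusion.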
Answer to Exercise 2
The suggestion of a prenuptial agreement is often viewed as a sign of bad faith. However, in far too many cases, the failure to create prenuptial agreements occurs when individuals approach marriage with the false belief that the high base rate for divorce does not apply to them.
Also, the divorce process causes unnecessary emotional distress when couples have not created a prenuptial agreement that would facilitate the peaceful dissolution of the marriage.
Answer to Exercise 3
Many would answer as high as 95%, but the correct probability is about 2%. An explanation for this is as follows: on average, for every 1,000 drivers tested:
1 driver is drunk, and since the breathalyzer never fails to detect a truly drunk person, there is 1 true positive test result.
999 drivers are not drunk, and among those drivers, there are 5% false-positive test results, so there are 49.95 false-positive test results.
Therefore, there are 50.95 positive test results in total (1 true positive + 49.95 false positives), and the probability that a driver with a positive result really is drunk is 1 / 50.95, which is about 2% (1.96%).
The validity of this result does, however, hinge on the initial assumption that the police officer stopped the driver truly at random, and not because of bad driving. If the driver was stopped for bad driving or for any other non-random reason, then the calculation must also account for the probability that a drunk driver drives competently and the probability that a sober driver drives incompetently.
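The counting argument for the breathalyzer can be reproduced in a few lines. This is a rough sketch under the exercise's own assumptions (random stop, 1 in 1,000 drivers drunk, 5% false positives, no missed drunk drivers); the function `positive_test_breakdown` and its parameter names are hypothetical, chosen only for this illustration.

```python
def positive_test_breakdown(n_drivers, drunk_rate, false_positive_rate, detection_rate=1.0):
    """Count expected true and false positives among n_drivers tested at random."""
    drunk = n_drivers * drunk_rate
    sober = n_drivers - drunk
    true_positives = drunk * detection_rate        # drunk drivers flagged
    false_positives = sober * false_positive_rate  # sober drivers flagged
    p_drunk_given_positive = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, p_drunk_given_positive


# Figures from the exercise: 1,000 drivers, 1 in 1,000 drunk, 5% false positives.
true_pos, false_pos, p_drunk = positive_test_breakdown(1000, 1 / 1000, 0.05)
print(f"true positives:  {true_pos:.2f}")          # 1.00
print(f"false positives: {false_pos:.2f}")         # 49.95
print(f"P(drunk | positive test) = {p_drunk:.2%}")  # 1.96%
```

Raising `drunk_rate` (for example, because drivers are stopped for swerving rather than at random) quickly raises the result, which is why the random-stop assumption matters.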
Answer to Exercise 4
Someone committing the base rate fallacy would infer that there is a 95% chance that the detected person is a terrorist. The inference seems to make sense, but it is wrong: the calculation below shows that the chance is close to 17%, not 95%.
If every inhabitant were scanned by the automatic facial recognition software, we would have:
1 terrorist, who triggers the alarm (assuming the software never misses a terrorist, which the exercise leaves unstated).
99 non-terrorists, 5% of whom would trigger the alarm. 5% of 99 is 4.95, so about 5 non-terrorists.
Therefore, about 6 inhabitants trigger the alarm and only 1 of them is a terrorist, so the probability that a person who triggers the alarm is a terrorist is about 1 / 6, or roughly 17% (16.8% using the unrounded 1 / 5.95).
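For readers who want the unrounded figure, here is a small sketch using exact fractions. It assumes, as the answer implicitly does, that the software always flags the one terrorist; that `hit_rate` default and the function name `alarm_posterior` are assumptions made for this illustration, not something stated in the exercise.

```python
from fractions import Fraction


def alarm_posterior(n_terrorists, n_innocent, false_alarm_rate, hit_rate=Fraction(1)):
    """Exact P(terrorist | alarm) when every inhabitant is scanned once."""
    true_alarms = n_terrorists * hit_rate         # terrorists who trigger the alarm
    false_alarms = n_innocent * false_alarm_rate  # non-terrorists who trigger the alarm
    return true_alarms / (true_alarms + false_alarms)


# Figures from the exercise: 1 terrorist, 99 non-terrorists, 5% false alarms.
exact = alarm_posterior(1, 99, Fraction(5, 100))
print(exact)                    # 20/119
print(f"{float(exact):.1%}")    # 16.8%

# Rounding 4.95 false alarms up to 5 gives the simpler 1 / 6 used in the answer.
print(f"{1 / 6:.1%}")           # 16.7%
```

Either way, the chance is roughly 17%, far below the intuitive 95%.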
Answer to Exercise 5
When you buy six cans of Coke labeled "50% extra free," only two of the cans are free, not three. (It's because the original pack had four cans, and 50% of the original amount is two cans.)
Lots of food companies exploit the Base Rate Fallacy on their packaging. When something says "50% extra free," only a third (33%) of what you're looking at is free.
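The packaging arithmetic generalizes to any "X% extra free" claim: the free share of what you hold is the extra amount divided by the original plus the extra. A minimal sketch follows; `free_share` and `free_cans` are names invented for this example.

```python
def free_share(extra_fraction):
    """Share of the final pack that is free when extra_fraction of the original is added."""
    return extra_fraction / (1 + extra_fraction)


def free_cans(total_cans, extra_fraction):
    """Number of free cans in a pack of total_cans labeled with the given extra fraction."""
    original_cans = total_cans / (1 + extra_fraction)
    return total_cans - original_cans


print(f"{free_share(0.5):.1%}")  # 33.3%
print(free_cans(6, 0.5))         # 2.0
```

The percentage on the label is measured against the original pack, not against the pack in your hand, which is why the two numbers differ.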
Answer to Exercise 6
Statistically speaking, your child may still have a low chance of acceptance. The school is for brilliant kids (and everyone knows this), so the vast majority of kids who apply are brilliant. Of the whole population of brilliant kids who apply, only about 6.7% get accepted. So even if your child is brilliant, they still have a low chance of being accepted (about 6.7%).
Answer to Exercise 7
What is not mentioned above is the number of cases of cancer that go away without any kind of faith healing, in other words, the base rate of cancer remission. It is a statistical certainty that among those with cancer, a percentage will experience spontaneous remission. If that percentage is the same as the percentage cured in the faith-healing group, then the cures are exactly what we would expect by chance, and no magic or divine healing is taking place.
The following is from the American Cancer Society:
Available scientific evidence does not support claims that faith healing can cure cancer or any other disease. Some scientists suggest that the number of people who attribute their cure to faith healing is lower than the number predicted by calculations based on the historical percentage of spontaneous remissions seen among people with cancer. However, faith healing may promote peace of mind, reduce stress, relieve pain and anxiety, and strengthen the will to live.
Answer to the Bonus Exercise (Facebook Interview Question)
The correct answer is to tell the interviewer that it depends on the probability of rain in Seattle (the base rate) and to ask for that figure. Once you have it, you will be able to calculate the probability.
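Once a base rate is supplied, the calculation is a direct application of Bayes' rule. The sketch below assumes the friends answer independently, as the question states; the function name `p_rain_given_all_yes` is hypothetical, and the priors in the loop are illustrative placeholders, not real Seattle rain statistics.

```python
def p_rain_given_all_yes(prior_rain, n_friends=3, p_truth=2 / 3):
    """P(rain | every friend says yes), with independent friends."""
    yes_if_rain = p_truth ** n_friends       # all friends tell the truth
    yes_if_dry = (1 - p_truth) ** n_friends  # all friends lie
    numerator = prior_rain * yes_if_rain
    return numerator / (numerator + (1 - prior_rain) * yes_if_dry)


# Hypothetical base rates, chosen only to show how much the answer depends on them.
for prior in (0.10, 0.25, 0.50):
    print(f"prior {prior:.0%} -> posterior {p_rain_given_all_yes(prior):.1%}")
# prior 10% -> posterior 47.1%
# prior 25% -> posterior 72.7%
# prior 50% -> posterior 88.9%
```

With three unanimous answers the formula simplifies to 8p / (7p + 1) for a prior p, so the same testimony can leave you anywhere from under 50% to nearly 90% confident depending on the base rate.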
Messages and feedback are welcome via email.