You’ve probably come across the base rate fallacy, and it has probably fooled you. Part mathematical paradox and part cognitive bias, this mental misstep has surprisingly strong implications for many real-world situations, from public health policy to mass surveillance programs. Take, for example, the following two riddles. The first comes from the late psychologist Daniel Kahneman’s popular science book Thinking, Fast and Slow:
An individual has been described by a neighbor as follows: “Steve is very shy and withdrawn, invariably helpful but with little interest in people or in the world of reality. A meek and tidy soul, he has a need for order and structure, and a passion for detail.” Is Steve more likely to be a librarian or a farmer?
When asked this question in an experiment, Kahneman wrote, most people say Steve is more likely to be a librarian. They believe that Steve’s personality is more consistent with stereotypes of librarians than with those of farmers. But they ignore a highly relevant statistical detail: farmers outnumber library professionals in the United States by more than 11 to one. A description of a person’s personality shouldn’t outweigh the vast difference in the sizes of the relevant employment populations. With such a preponderance of farmers, we should expect plenty of them to have a passion for detail. The statistical oversight becomes even more obvious when the career options differ more starkly in population size: Steve loves astronomy. Is he more likely to be a banker or an astronaut?
Puzzle number two is more quantitative. Say a doctor decides, at random, to give you a blood test for a certain disease that affects one in 1,000 people. The test is remarkably effective: it never gives a false negative, which means that if you have the disease, the test will detect it. False positives can happen but are rare: if you don’t have the disease, the test will say as much 99 percent of the time. Your test comes back positive. With these figures in mind, what is the probability that you have the disease? After the Steve example, you may be on alert, wary of a trick. Try to inhabit the situation. You have just received a positive result on an exceptionally accurate medical test. How worried do you feel?
With the given parameters, the chance that you actually have the disease is only around 9 percent. Imagine we test 1,000 people. We expect one person in that group to have the disease and receive a true positive result. Of the remaining 999 people, 1 percent will get false positives; that rounds to 10 people. So we expect 11 positive tests: 10 false positives and one true positive. Your positive test is one of these 11. What are the chances that you’re the unlucky one?
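For anyone who wants to check the arithmetic, here is a minimal sketch in Python (not part of the original puzzle) that applies Bayes’ theorem to the numbers above: a one-in-1,000 prevalence, no false negatives and a 1 percent false positive rate.

```python
# Probability of having the disease given a positive test, via Bayes' theorem.
# Parameters come from the puzzle described above.
prevalence = 1 / 1000        # base rate: 1 in 1,000 people have the disease
sensitivity = 1.0            # no false negatives: P(positive | disease) = 1
false_positive_rate = 0.01   # P(positive | no disease) = 1 percent

# Overall chance of a positive test: true positives plus false positives.
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)

# Posterior probability of disease given the positive result.
p_disease_given_positive = sensitivity * prevalence / p_positive
print(f"P(disease | positive test) = {p_disease_given_positive:.1%}")  # about 9.1%
```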

Both riddles illustrate the base rate fallacy; the second is also an example of the false positive paradox. When people assess the likelihood of a scenario, they tend to overweight the specific details in front of them and underweight the general prevalence of that scenario. They fixate on the description of Steve as “a meek and tidy soul” and neglect the prevalence of farmers relative to librarians. They put too much stock in a positive result from a test that is 99 percent accurate and ignore the rarity of the disease.
Of course, you should not reject medical tests; that is not the lesson here. Instead, the false positive paradox shows that interpreting the results of medical tests, and deciding when to administer them, requires statistical savvy. Usually doctors order tests when they have reason to believe you may have a condition. If you’re selected for testing at random, your probability of having the disease before the test is simply its prevalence in the general population. But if you walk into a clinic with a characteristic rash and a high fever, you’ve moved into a different statistical bucket. You’re no longer being compared with the general public but with other people who have those specific symptoms. In that smaller group the disease is far more common, which makes a positive result much more indicative of a true case.
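To see how much the pre-test probability matters, here is a hedged sketch that reuses the same Bayes calculation; the 20 percent prior for symptomatic patients is a hypothetical number chosen for illustration, not a figure from the article.

```python
def posterior(prior, sensitivity=1.0, false_positive_rate=0.01):
    """P(disease | positive test) for a given pre-test probability."""
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

# Randomly screened member of the general population: base rate of 0.1 percent.
print(f"Random screening: {posterior(0.001):.1%}")    # about 9.1%

# Hypothetical symptomatic patient, assuming a 20 percent pre-test probability.
print(f"Symptomatic patient: {posterior(0.20):.1%}")  # about 96.2%
```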
This situation explains why we do not carry out mass screenings for rare diseases. When a disease has a small enough base rate in the population, even highly accurate tests will produce more false positives than true positives. The benefit of catching a few cases is outweighed by the medical, financial and psychological damage caused by a large number of false positives.
Welsh police learned this lesson the hard way at the 2017 Union of European Football Associations (UEFA) Champions League final. They deployed cameras throughout Cardiff, where the match was held, and used automated facial recognition software to analyze the footage. The software scanned the faces of around 170,000 fans, looking for anyone who matched known persons of interest. The system flagged 2,470 people as potential criminals, of whom 2,297 were false positives. The software was not broken. It did what any system with a small chance of error does when applied indiscriminately to a huge crowd. The case made national news and fed an ongoing legal battle in Wales over facial recognition technology.
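A rough back-of-the-envelope reading of those reported figures, assuming each of the roughly 170,000 scans was an independent chance for a false alarm:

```python
# Quick sketch using the figures reported for the Cardiff deployment.
scans = 170_000          # approximate number of faces scanned
flagged = 2_470          # total people the system flagged
false_positives = 2_297  # flags that turned out to be wrong

true_positives = flagged - false_positives
print(f"Correct flags: {true_positives}")                                   # 173
print(f"Implied per-scan false alarm rate: {false_positives / scans:.2%}")  # about 1.35%
print(f"Share of flags that were correct: {true_positives / flagged:.0%}")  # about 7%
```

Even a false alarm rate of just over 1 percent per face, applied to a crowd of 170,000, swamps the handful of genuine matches.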
For similar reasons, any data-mining scheme for catching potential terrorists is bound to fail, as security expert Bruce Schneier has written about extensively. These programs scour phone records, location data and social networks for patterns that might indicate terrorist plots. The problem: terrorist plots don’t always have clearly identifiable warning signs (which means false positives are unavoidable), and almost nobody is a terrorist (which means the base rate in the population is microscopic). Schneier’s back-of-the-envelope calculation suggests that for every real threat uncovered, tens of millions of false alarms could divert the attention of federal agents, with all the associated costs and violations of civil liberties.
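Here is an illustrative calculation in the same spirit; the specific parameters below are assumptions chosen to make the arithmetic concrete, not Schneier’s actual figures.

```python
# Illustrative only: every parameter here is an assumption, not Schneier's data.
data_points_scanned = 1_000_000_000_000  # assumed events examined per year (1 trillion)
plot_related_points = 10                 # assumed data points tied to real plots
true_positive_rate = 1.0                 # assume the system never misses a real one
false_positive_rate = 0.0001             # assume 1 in 10,000 innocent events gets flagged

true_alarms = plot_related_points * true_positive_rate
false_alarms = (data_points_scanned - plot_related_points) * false_positive_rate

print(f"Real plot indicators flagged: {true_alarms:,.0f}")                    # 10
print(f"Innocent events flagged: {false_alarms:,.0f}")                        # about 100,000,000
print(f"False alarms per real indicator: {false_alarms / true_alarms:,.0f}")  # about 10,000,000
```

The exact numbers depend entirely on the assumed rates, but the lopsidedness is the point: a minuscule base rate overwhelms even a very accurate detector.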
None of this means we should stop screening for rare events, but we should understand the trade-offs. Most fire alarms are false alarms, yet they are a minor inconvenience in exchange for saving lives when disaster strikes. The base rate fallacy teaches us to put false alarms in context and to stop conflating the accuracy of a test for an event with the probability of the event itself. It reminds us that when wading through knotty questions of probability, the most salient details may not be the most statistically relevant.