Tuesday, 10 December 2013 at 5:38 pm

by Damian Kao

Bayes' theorem is perhaps the best-known theorem about conditional probabilities. It goes like this:

P(A|B) = P(B|A) * P(A) / P(B)

P(A) means the probability of an outcome named 'A'. P(A|B) means the probability of an outcome named 'A' given that the outcome named 'B' has occurred.

In this post, I'll present a couple of intuitions about this theorem.

## Intuition 1

Imagine a situation where there are 100 possible outcomes. A simple example is a lottery where a ticket is drawn at random from a pool of tickets numbered 1 to 100. There are 100 possible outcomes for each drawing (assuming we replace the drawn ticket after each drawing).

The total probability space is 100 outcomes. We can also classify these outcomes further. Going back to the lottery example, we might classify the tickets into even-numbered outcomes and odd-numbered outcomes.

Let's say 20 / 100 of these outcomes are type 'A' and 30 / 100 of these outcomes are type 'B'.

P(A) = 0.2

P(B) = 0.3

The probability of getting a type A outcome is 20 / 100 = 0.2. The probability of getting a type B outcome is 30 / 100 = 0.3.

And let's also say the intersection between A and B is 10. Meaning there are 10 outcomes that are categorized as both type 'A' and type 'B'. The probability of both A and B is 10 / 100 = 0.1.

P(A, B) = 0.1

Now let's say I want to find the probability of outcome A given that outcome B has occurred.

P(A|B) = ?

We'll focus on the "given outcome B has occurred" part of the sentence. Since we know that outcome B has occurred, instead of looking at the entire population of 100 outcomes, we can just look at the 30 outcomes with type B.

We want to find the probability of outcome A within this new, smaller subset containing only B outcomes. We know that there are 10 outcomes that are both A and B. This means that the probability of A, when we look only at the B outcomes, is 10 / 30 = 1/3.

P(A|B) = P(A, B) / P(B) = 0.1 / 0.3 = 0.333...

The "given outcome X" portion of the sentence intuitively means we are limiting our outcomes to a subset of the possible outcomes. And when we divide the intersection by the probability of this smaller subset, we are really just normalizing the intersection to this new smaller subset.
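The counting argument above can be checked with a few lines of Python. This is only a minimal sketch: the particular ticket numbers assigned to A and B are an arbitrary assumption, chosen so that the counts match the example (20 outcomes in A, 30 in B, 10 in both), and `prob` is a helper name made up for illustration.

```python
from fractions import Fraction

# All 100 equally likely outcomes (tickets 1..100).
outcomes = set(range(1, 101))

# Hypothetical assignment of types, chosen only to match the counts in the text.
A = set(range(1, 21))    # 20 outcomes of type A
B = set(range(11, 41))   # 30 outcomes of type B; tickets 11..20 are also type A


def prob(event, space):
    """Probability of `event` when every outcome in `space` is equally likely."""
    return Fraction(len(event & space), len(space))


p_a = prob(A, outcomes)            # 20/100
p_b = prob(B, outcomes)            # 30/100
p_a_and_b = prob(A & B, outcomes)  # 10/100

# Conditioning on B means shrinking the space of outcomes down to B.
p_a_given_b = prob(A, B)           # 10/30

print(p_a_given_b)                 # 1/3
print(p_a_and_b / p_b)             # 1/3
```

Counting inside the subset B and dividing P(A, B) by P(B) give the same answer, which is the normalization described above.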

## Intuition 2

This one is pretty straightforward. We just rearrange the equation in intuition 1:

P(A|B) = P(A, B) / P(B)

to:

P(A, B) = P(A|B) * P(B)

So the probability of A and B is equal to the probability of A given B, re-normalized to the total population by multiplying by P(B).

## Intuition 3

From the previous intuition, the probability of A and B is the probability of A given B re-normalized to the total population. It is also equal to the probability of B given A re-normalized to the total population.

P(A, B) = P(A|B) * P(B) = P(B|A) * P(A)
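As a quick sanity check, here is a small sketch that plugs the numbers from intuition 1 into both sides of this equation. Exact fractions are used to avoid floating-point noise; the variable names are just illustrative.

```python
from fractions import Fraction

# Numbers from intuition 1.
p_a = Fraction(20, 100)
p_b = Fraction(30, 100)
p_a_and_b = Fraction(10, 100)

# The two conditional probabilities, each normalized to a different subset.
p_a_given_b = p_a_and_b / p_b   # 1/3: A within the B outcomes
p_b_given_a = p_a_and_b / p_a   # 1/2: B within the A outcomes

# Re-normalizing either one to the total population gives the same joint probability.
print(p_a_given_b * p_b)  # 1/10
print(p_b_given_a * p_a)  # 1/10
```

Note that P(A|B) and P(B|A) are different numbers (1/3 and 1/2), yet both products land on the same P(A, B).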

## Intuition 4

This one is a bit more complicated.

P(B) = P(A, B) + P(B|not A) * P(not A)

The probability of B (the purple circle) has to include P(A, B), the intersection between A and B.

But what does the rest of the equation mean? Intuitively, the rest of the equation is calculating the probability of B where the outcome is not A. Remember from intuition 2 that a conditional probability multiplied by the probability of its condition gives the joint probability:

P(B|not A) * P(not A) = P(B, not A)

The sum of these two terms is equal to the probability of all outcomes that are B.
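Using the counts from intuition 1, this decomposition can be verified numerically. The sketch below assumes the same 100 outcomes: 30 are B, 10 of those are also A, so 20 are B but not A, and 80 outcomes are not A at all.

```python
from fractions import Fraction

total = 100
count_a = 20
count_b = 30
count_a_and_b = 10

# Outcomes that are B but not A, and outcomes that are not A at all.
count_b_not_a = count_b - count_a_and_b   # 20
count_not_a = total - count_a             # 80

p_a_and_b = Fraction(count_a_and_b, total)            # 1/10
p_not_a = Fraction(count_not_a, total)                # 4/5
p_b_given_not_a = Fraction(count_b_not_a, count_not_a)  # 1/4

# P(B) = P(A, B) + P(B|not A) * P(not A)
p_b = p_a_and_b + p_b_given_not_a * p_not_a
print(p_b)  # 3/10
```

The result matches P(B) = 0.3 from intuition 1: every B outcome is either also A or not A, so the two pieces together cover all of B.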

## The theorem

Given that:

P(A|B) = P(A, B) / P(B)

P(A, B) = P(A|B) * P(B) = P(B|A) * P(A)

We can now write Bayes' theorem:

P(A|B) = P(B|A) * P(A) / P(B)

## Example

The prevalence of a disease in a given population is 1%. A medical company has come up with a test that can tell you whether you have this disease. The test is advertised to be 99% accurate if you are diseased.

**The probability of diseased is 1%:**

P(diseased) = 0.01

**The probability of not diseased is 99%:**

P(not diseased) = 0.99

**The probability of testing positive given diseased AND the probability of testing negative given not diseased are both 99%:**

P(positive | diseased) = P(negative | not diseased) = 0.99

**The probability of a false negative given diseased AND the probability of a false positive given not diseased are both 1%:**

P(negative | diseased) = P(positive | not diseased) = 0.01

What is the probability of being diseased given a positive test? In other words, what is P(diseased | positive)?

Using Bayes' theorem, we can write:

P(diseased | positive) = P(positive | diseased) * P(diseased) / P(positive)

We can easily get all the terms except for P(positive), for which we will use intuition 4:

P(positive) = P(positive | diseased) * P(diseased) + P(positive | not diseased) * P(not diseased)

= 0.99 * 0.01 + 0.01 * 0.99 = 0.0198

Plugging in all the values:

P(diseased | positive) = 0.99 * 0.01 / 0.0198 = 0.5
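The same calculation can be written as a small Python function. This is a sketch, not a general-purpose tool: the function name `posterior` and its parameter names are made up for illustration, and it assumes the test result depends only on whether the person is diseased.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(diseased | positive) using Bayes' theorem.

    prior               -- P(diseased)
    sensitivity         -- P(positive | diseased)
    false_positive_rate -- P(positive | not diseased)
    """
    # Intuition 4: split P(positive) into diseased and not-diseased parts.
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    # Bayes' theorem.
    return sensitivity * prior / p_positive


p = posterior(prior=0.01, sensitivity=0.99, false_positive_rate=0.01)
print(round(p, 4))  # 0.5
```

Writing it this way makes it clear that the answer depends on three inputs, and that the prevalence (the prior) matters just as much as the accuracy of the test.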

The chance of someone actually having the disease given that she/he tests positive is 50%. This might sound strange at first, as the test is advertised as 99% accurate. But we have to remember that it is 99% accurate given diseased. What about the rest of the population, who are not diseased?

Given a population of 100 people and a prevalence rate of 1%, 1 person is diseased. This single diseased person takes the test and has a 99/100 chance of testing positive, so we expect 0.99 true positives.

The rest of the population, consisting of 99 people, also takes the test, and each person has a 0.01 chance of testing falsely positive. Since there are 99 non-diseased people, we expect 99 * 1/100 = 0.99 false positives.

Now pick a random person who tests positive from this population of 100 people. True positives and false positives are expected in equal numbers (0.99 each), so a positive result is just as likely to come from a non-diseased person as from the diseased one. Hence, given a positive test, there is a 50% chance the person is diseased.
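The counting argument is a little easier to follow with a larger population, where the expected counts come out as whole numbers. The sketch below assumes a hypothetical population of 10,000 people and the same rates as in the example.

```python
from fractions import Fraction

population = 10_000                    # hypothetical population size
prevalence = Fraction(1, 100)          # P(diseased)
sensitivity = Fraction(99, 100)        # P(positive | diseased)
false_positive_rate = Fraction(1, 100) # P(positive | not diseased)

diseased = population * prevalence     # 100 people
healthy = population - diseased        # 9900 people

# Expected number of positive tests in each group.
true_positives = diseased * sensitivity           # 99
false_positives = healthy * false_positive_rate   # 99

# Fraction of positive tests that come from diseased people.
share_diseased = true_positives / (true_positives + false_positives)
print(true_positives, false_positives, share_diseased)  # 99 99 1/2
```

The diseased group is small but almost always tests positive, while the healthy group rarely tests positive but is 99 times larger. The two effects cancel, leaving as many false positives as true positives.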