Bayes’ Rule Demystified: Five Intuitive Perspectives
Example
Consider the following scenario:
- The probability that a person has a disease is 0.01.
- The probability that a person with the disease tests positive is 0.90.
- The probability that a person without the disease tests positive is 0.10.
Question: What is the probability that a person who tests positive actually has the disease?
First Approach: Counting
Assuming 1,000 people are tested:
- \(1000 \times 0.01 = 10\) people have the disease.
- \(1000 \times 0.99 = 990\) people do not have the disease.
- Of the 10 people with the disease, \(10 \times 0.90 = 9\) people test positive.
- Of the 990 people without the disease, \(990 \times 0.10 = 99\) people test positive.
- Therefore, out of \(9 + 99 = 108\) people who tested positive, only 9 people actually have the disease.
Thus, the probability that a person who tests positive actually has the disease is:
\[\frac{9}{108} \approx 0.0833.\]
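If you prefer to see the tally as code, here is a minimal Python sketch of the counting argument (the variable names are mine, purely illustrative):

```python
# Counting approach: imagine testing a population of 1,000 people.
population = 1000
p_disease = 0.01
p_positive_given_disease = 0.90
p_positive_given_healthy = 0.10

diseased = population * p_disease        # 10 people have the disease
healthy = population * (1 - p_disease)   # 990 people do not

true_positives = diseased * p_positive_given_disease   # 9 test positive
false_positives = healthy * p_positive_given_healthy   # 99 test positive

print(true_positives / (true_positives + false_positives))  # ~0.0833
```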
Second Approach: Using Bayes' Rule
Bayes' Rule states:
\[P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}.\]
Where:
- \(A\) is the event that a person has the disease.
- \(B\) is the event that a person tests positive.
- \(P(A) = 0.01\) is the prior probability of having the disease.
- \(P(B|A) = 0.90\) is the probability of testing positive given that the person has the disease.
- \(P(B)\) is the total probability of testing positive.
We calculate \(P(B)\) using the law of total probability:
\[P(B) = P(B|A) \cdot P(A) + P(B|\neg A) \cdot P(\neg A) = (0.90)(0.01) + (0.10)(0.99) = 0.009 + 0.099 = 0.108.\]
Therefore:
\[P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)} = \frac{(0.90)(0.01)}{0.108} \approx 0.0833.\]
This result matches the counting method.
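The formula translates directly into code. Here is a short sketch of the same computation (again with illustrative variable names):

```python
# Direct application of Bayes' Rule to the example.
p_A = 0.01              # prior P(A): has the disease
p_B_given_A = 0.90      # P(B|A): positive test given disease
p_B_given_not_A = 0.10  # P(B|~A): positive test given no disease

# Law of total probability: P(B) = P(B|A)P(A) + P(B|~A)P(~A)
p_B = p_B_given_A * p_A + p_B_given_not_A * (1 - p_A)  # 0.108

print(p_B_given_A * p_A / p_B)  # ~0.0833
```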
Proof of Bayes' Rule
Starting from the definition of conditional probability:
\[P(A|B) = \frac{P(A \cap B)}{P(B)}.\]
Similarly:
\[P(B|A) = \frac{P(A \cap B)}{P(A)} \implies P(A \cap B) = P(B|A) \cdot P(A).\]
Substituting back:
\[P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}.\]
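As a quick numerical check of the identity behind this proof, both orderings of the chain rule recover the same joint probability \(P(A \cap B)\) with the example's numbers:

```python
# Chain-rule check: P(A|B) P(B) = P(B|A) P(A) = P(A and B) = 0.009.
p_A, p_B = 0.01, 0.108
p_B_given_A = 0.90
p_A_given_B = p_B_given_A * p_A / p_B

assert abs(p_A_given_B * p_B - p_B_given_A * p_A) < 1e-12
```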
Third Approach: Odds Form of Bayes' Rule
Bayes' Rule can also be expressed in terms of odds:
\[\frac{P(A|B)}{P(\neg A|B)} = \frac{P(A)}{P(\neg A)} \cdot \frac{P(B|A)}{P(B|\neg A)}.\]
Here, the posterior odds equal the prior odds multiplied by the likelihood ratio.
Using our example:
\[\frac{P(A|B)}{P(\neg A|B)} = \frac{0.01}{0.99} \cdot \frac{0.90}{0.10} = \frac{0.009}{0.099} = \frac{9}{99} = \frac{1}{11}.\]
Thus, the odds of having the disease given a positive test are \(1:11\). To find the probability:
\[P(A|B) = \frac{\text{odds}}{1 + \text{odds}} = \frac{1/11}{1 + 1/11} = \frac{1}{12} \approx 0.0833.\]
Note: Since \(P(A|B) + P(\neg A|B) = 1\), we have:
\[\frac{P(A|B)}{P(\neg A|B)} = \frac{P(A|B)}{1 - P(A|B)}.\]
Solving for \(P(A|B)\):
\[P(A|B) = \frac{\text{odds}}{1 + \text{odds}}.\]
The odds form is convenient when comparing relative probabilities and can simplify calculations by focusing on ratios.
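A sketch of the odds bookkeeping in Python (`odds_to_prob` is a hypothetical helper I wrote for this post, not a standard library function):

```python
def odds_to_prob(odds: float) -> float:
    """Convert odds of a:1 in favor into a probability."""
    return odds / (1 + odds)

prior_odds = 0.01 / 0.99                        # 1:99 in favor of disease
likelihood_ratio = 0.90 / 0.10                  # positive test is 9x likelier if diseased
posterior_odds = prior_odds * likelihood_ratio  # 1:11 in favor of disease

print(odds_to_prob(posterior_odds))             # ~0.0833
```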
More Resources:
- Waterfall diagrams and relative odds
- Introduction to Bayes' Rule: Odds form
- Bayes' Rule: Odds Form
- Bayes' Rule: Vector Form
Fourth Approach: Eliminating Probability Mass
Using the same example, we construct the following table:
|               | Disease | No Disease | Total |
| ------------- | ------- | ---------- | ----- |
| Test Positive | 9       | 99         | 108   |
| Test Negative | 1       | 891        | 892   |
| Total         | 10      | 990        | 1,000 |
Dividing each entry by 1,000 to get probabilities:
|               | Disease | No Disease | Total |
| ------------- | ------- | ---------- | ----- |
| Test Positive | 0.009   | 0.099      | 0.108 |
| Test Negative | 0.001   | 0.891      | 0.892 |
| Total         | 0.01    | 0.99       | 1     |
By focusing on the "Test Positive" row, we eliminate the probability mass associated with "Test Negative":
|               | Disease | No Disease | Total |
| ------------- | ------- | ---------- | ----- |
| Test Positive | 0.009   | 0.099      | 0.108 |
The probability that a person who tests positive actually has the disease is:
\[\frac{0.009}{0.108} \approx 0.0833.\]
This method mirrors the counting approach and offers another perspective on Bayes' Rule.
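The row-elimination view also maps cleanly onto code. Here is a sketch using a plain dictionary as the joint probability table (the key names are illustrative):

```python
# Joint probability table keyed by (test result, disease status).
joint = {
    ("positive", "disease"): 0.009,
    ("positive", "no disease"): 0.099,
    ("negative", "disease"): 0.001,
    ("negative", "no disease"): 0.891,
}

# Condition on a positive test: keep that row, discard the rest, renormalize.
row = {k: v for k, v in joint.items() if k[0] == "positive"}
mass = sum(row.values())                    # 0.108
print(row[("positive", "disease")] / mass)  # ~0.0833
```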
Fifth Approach: Log Odds
The log odds form of Bayes' Rule is:
\[\log \frac{P(A|B)}{P(\neg A|B)} = \log \frac{P(A)}{P(\neg A)} + \log \frac{P(B|A)}{P(B|\neg A)}.\]
This transforms multiplication into addition, which simplifies calculations when combining multiple pieces of evidence: with base-2 logarithms, each additional bit of evidence doubles the odds in favor of the event.
Notes:
- In information theory, the Shannon information content of an outcome \(x\) is defined as:
\[h(x) = \log_2 \left( \frac{1}{P(x)} \right).\]
For example, if \(P(x) = 0.5\), then \(h(x) = 1\) bit.
- The difference in self-information between two outcomes is:
\[I(x_1, x_2) = h(x_1) - h(x_2) = \log_2 \left( \frac{P(x_2)}{P(x_1)} \right).\]
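A sketch of the additive log-odds update, assuming base-2 logarithms so the units are bits (`log2_odds` is my own helper):

```python
import math

def log2_odds(p: float) -> float:
    """Log-odds of an event with probability p, measured in bits."""
    return math.log2(p / (1 - p))

prior = log2_odds(0.01)            # log2(1/99) ~ -6.63 bits
evidence = math.log2(0.90 / 0.10)  # log2(9)    ~ +3.17 bits of evidence
posterior = prior + evidence       # ~ -3.46 bits, i.e. odds of 1:11

odds = 2 ** posterior
print(odds / (1 + odds))           # ~0.0833

# Shannon information content h(x) = log2(1/P(x)): a 50/50 outcome carries 1 bit.
print(math.log2(1 / 0.5))          # 1.0
```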
In future blog posts, I will explore the fascinating implications of Bayes' Rule, including its applications in machine learning, scientific reasoning, and everyday decision making. Stay tuned as we delve deeper into how this fundamental principle shapes our understanding of probability and inference.