A Bayesian Dive
Somik Raha, Vedika Research
These are the slides from a talk given to Vaidya Fellows and others at the Institute of Ayurveda and Integrative Medicine (IAIM). Simple applications of Bayes' Rule show how inference can be done with clarity.
Question
If someone is a haemophiliac, what is your probability that this person is a male?
If someone is a male, what is your probability that this person is a haemophiliac?
Haemophilia A (clotting factor VIII deficiency) is the most common form of the disorder, present in about 1 in 5,000–10,000 male births.
[Figure: audience responses (n=9) as two histograms over the bins 0%, >0% but <=20%, >20% but <=40%, >40% but <=60%, >60% but <=80%, >80% but <100%, 100%. Panel "Male given Haemophiliac": bar labels 44%, 44%, 12%. Panel "Haemophiliac given Male": bar labels 88%, 12%.]
Willing to be shot if you are wrong!
And you are wrong!
Placing a 100% probability on anything implies you are willing to be shot if you are wrong.
What if you thought that P(Haemophiliac|Male) = P(Male|Haemophiliac)?
[Figure: the "Male given Haemophilia" histogram (bar labels 44%, 44%, 12%) shown instead of the "Haemophilia given Male" histogram]
Associative Logic Error
Examples of Associative Logic Error
Naseeruddin Shah in Court Scene of “Khuda Key Liye”
Deen mey daari hai, daari mey deen nahi
The faithful have beards, but the beard does not have any faith
Examples of Associative Logic Error
Mahavira
What is the essence of Jainism?
Cultural Jains: Non-violence and vegetarianism
Essence: Aliveness of the Universe
You cannot die vs. suicide
Question
If someone has lung cancer, what is your probability that this person was a smoker?
If someone is a smoker, what is your probability that this person will get lung cancer?
[Figure: audience responses (n=9) as histograms over the bins 0%, >0% but <=20%, >20% but <=40%, >40% but <=60%, >60% but <=80%, >80% but <100%, 100%. Panel "Smoker given Lung Cancer": bar labels 33%, 22%, 44%. Panel "Lung Cancer given Smoker": bar labels 33%, 22%, 33%. The "Male given Haemophiliac" (44%, 44%, 12%) and "Haemophiliac given Male" (88%, 12%) histograms are shown alongside for comparison.]
What do you notice?
The Math of Probability
1 in 5000 males is haemophiliac
95% of all haemophilia cases are male

Tree 1: Condition (Prior), then Gender (Likelihood)
Condition (Prior)           Gender (Likelihood)   Joint
Haemophiliac       0.01%    Male     95%          0.01% * 95% = 0.0095%
Haemophiliac       0.01%    Female    5%          0.01% * 5% = 0.0005%
Not haemophiliac  99.99%    Male     50%          99.99% * 50% = 49.995%
Not haemophiliac  99.99%    Female   50%          99.99% * 50% = 49.995%

Tree 2 (flipped): Gender (Pre-Posterior), then Condition (Posterior)
Gender (Pre-Posterior)   Condition (Posterior)         Joint
Male     50%             Haemophiliac       0.019%     0.0095%
Male     50%             Not haemophiliac  99.981%     49.995%
Female   50%             Haemophiliac       0.001%     0.0005%
Female   50%             Not haemophiliac  99.999%     49.995%

Pre-posterior = sum of the joints for that gender: 0.0095% + 49.995% for Male and 0.0005% + 49.995% for Female, each about 50%.
Posterior = joint / pre-posterior: 50% * ? = 0.0095% gives ? = 0.019%, which is about 1 in 5000.
In this case, your intuition matched the math!
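The tree flip can also be checked numerically. The following is a minimal Python sketch, not part of the talk: it takes the prior (0.01%) and the likelihoods (95% / 5% and 50% / 50%) from the slides, computes the joints, adds them up into the pre-posterior, and divides to get the posterior. The function name flip_tree and the dictionary layout are illustrative assumptions.

```python
# Prior and likelihood values are the ones shown on the slides.
prior = {"haemophiliac": 0.0001, "not haemophiliac": 0.9999}
likelihood = {
    "haemophiliac": {"male": 0.95, "female": 0.05},
    "not haemophiliac": {"male": 0.50, "female": 0.50},
}


def flip_tree(prior, likelihood):
    """Flip a two-level probability tree (hypothetical helper name)."""
    # Joint: prior of the condition times likelihood of the gender.
    joint = {
        (condition, gender): prior[condition] * p
        for condition in prior
        for gender, p in likelihood[condition].items()
    }
    # Pre-posterior: add up the joints that share the same gender.
    pre_posterior = {}
    for (condition, gender), p in joint.items():
        pre_posterior[gender] = pre_posterior.get(gender, 0.0) + p
    # Posterior: divide each joint by the pre-posterior of its gender.
    posterior = {
        (condition, gender): p / pre_posterior[gender]
        for (condition, gender), p in joint.items()
    }
    return joint, pre_posterior, posterior


joint, pre_posterior, posterior = flip_tree(prior, likelihood)
for (condition, gender), p in posterior.items():
    print(f"P({condition} | {gender}) = {p:.3%}")
```

With these inputs the posterior for haemophiliac given male comes out at roughly 0.019%, while the likelihood of male given haemophiliac stays at 95%. The two conditional probabilities are very different numbers.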
Now let’s work this example
[Figure: audience histogram, Smoker given Lung Cancer (n=9)]
Instructions:
1. Fill in the prior and likelihood
2. Calculate joint probability
3. Flip the tree
4. Place joints correctly
5. Calculate pre-posterior probability (add up joints)
6. Calculate posterior probability (divide joint by pre-posterior)
7. Report probability of lung cancer given smoker
Put the probability you thought of over here
CDC: 19.3% of all Americans are smokers (2010)
National Cancer Institute: 226,000 Americans in 2012 will be diagnosed with lung cancer
US Census Bureau: 313 million people in the US as of Apr 21, 2012
% with lung cancer: 0.07%
Lung Cancer Prognosis: 8.9% of around 25,000 lung cancer patients were never smokers; therefore 91.1% of lung cancer patients were smokers
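One way to work the numbers is sketched below in Python. This is not the author's worked solution. It uses only the figures quoted on the slide and makes two assumptions: the 19.3% smoking rate is used directly as the pre-posterior P(smoker), and everyone who is not a smoker is treated as a never-smoker. Because the 226,000 figure counts diagnoses in a single year, the result is a one-year probability, not a lifetime risk.

```python
# Figures quoted on the slide.
p_smoker = 0.193                 # CDC, 2010
new_cases = 226_000              # National Cancer Institute, 2012 diagnoses
population = 313_000_000         # US Census Bureau, Apr 2012
p_smoker_given_cancer = 0.911    # likelihood: 91.1% of patients were smokers

# Prior: share of the population diagnosed with lung cancer in the year.
p_cancer = new_cases / population

# Joint: lung cancer AND smoker.
joint_cancer_and_smoker = p_cancer * p_smoker_given_cancer

# Posterior: divide the joint by the pre-posterior P(smoker).
p_cancer_given_smoker = joint_cancer_and_smoker / p_smoker

# Same calculation for non-smokers, for contrast (assumption: every
# non-smoker is treated as a never-smoker).
p_cancer_given_nonsmoker = p_cancer * (1 - p_smoker_given_cancer) / (1 - p_smoker)

print(f"P(lung cancer)              = {p_cancer:.2%}")
print(f"P(lung cancer | smoker)     = {p_cancer_given_smoker:.2%}")
print(f"P(lung cancer | non-smoker) = {p_cancer_given_nonsmoker:.3%}")
```

Under these assumptions the probability of lung cancer given smoker is a fraction of one percent per year, far below the 91.1% figure for smoker given lung cancer.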
[Figure: audience histogram, Lung Cancer given Smoker (n=9); bar labels 33%, 22%, 33%]
Bayesian mathematics is how our brain is actually wired.
It is the math of common sense.
So what’s the big deal about all this?
Core of machine learning
Spam filters
Alison Gopnik, TED Talk, “What Do Babies Think?”
Turns out this is how we normally learn
Reflections?
Suggested further reading: “The Theory That Would Not Die”
Ayurvedic probabilistic fun
Arthritis Diagnostic Engine
Osteo, Rheumatoid, Gout
From Vedika’s Research Labs
Relevance Diagrams (this is an exact computational tree)
Nodes: Arthritis, Swelling, Symptoms, Pain, Digestive Problems, Starting Location, Effect of Oiling, Tongue Coating
Swelling: Tight, Inflamed; Inflammation & Redness; General Swelling
Symptoms: Cracking of joints; Loss of appetite; Skin issues
Pain: Fixed; Sharp Shooting
Digestive Problems: Present; Absent
Tongue Coating: Tongue Coating; No Tongue Coating
…
Trace one pathway of logic
Disease branches: Osteo Arthritis, Rheumatoid Arthritis, Gout
Pathway: Osteo Arthritis (pOsteo) -> Cracking of joints (p1) -> Fixed Pain (p2) -> Digestive Problems Present (p3)
Joint = pOsteo * p1 * p2 * p3 = pJoint
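Flip it!
Pathway: Cracking of joints (p1) -> Fixed Pain (p2) -> Digestive Problems Present (p3) -> Osteo Arthritis (pOsteo*), Rheumatoid Arthritis or Gout
p1 * p2 * p3 * pOsteo* = pJoint
pOsteo* = pJoint / (p1 * p2 * p3)
Here p1, p2 and p3 are the branch probabilities of the flipped tree, where the symptoms come first.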
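The flip can be sketched in Python as follows. Every number below is hypothetical and chosen only for illustration; none comes from Vedika's diagnostic engine, and the function name is likewise an assumption. The sketch relies on one fact: the probability of the symptom pathway in the flipped tree equals the sum of the joints over the three diseases, so the posterior for each disease is its joint divided by that sum.

```python
from math import prod

# HYPOTHETICAL priors over the three diseases (illustration only).
priors = {"Osteo": 0.5, "Rheumatoid": 0.3, "Gout": 0.2}

# HYPOTHETICAL branch probabilities along one pathway for each disease:
# (cracking of joints, fixed pain, digestive problems present).
branch_probabilities = {
    "Osteo": (0.7, 0.6, 0.2),
    "Rheumatoid": (0.3, 0.4, 0.6),
    "Gout": (0.2, 0.3, 0.5),
}


def posterior_over_diseases(priors, branch_probabilities):
    """Return P(disease | observed pathway) for each disease."""
    # Joint for each disease: prior * p1 * p2 * p3 along the pathway.
    joints = {
        disease: prior * prod(branch_probabilities[disease])
        for disease, prior in priors.items()
    }
    # Pre-posterior for the pathway: add up the joints.
    pathway_probability = sum(joints.values())
    # Posterior: divide each joint by the pre-posterior.
    return {d: j / pathway_probability for d, j in joints.items()}


posteriors = posterior_over_diseases(priors, branch_probabilities)
for disease, p in posteriors.items():
    print(f"P({disease} | pathway) = {p:.1%}")
```

With real elicited probabilities from a Vaidya in place of the made-up ones, the same two steps (add up joints, divide) would give the posterior over the three forms of arthritis.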
Let’s try it
Need a volunteer Vaidya
Rest of you please follow along and answer the questions as well
Getting into continuous land
PDF and CDF
No new idea, really!
The thing about binomials
Toss n independent coins
p : probability of heads on a single toss
p(k,n) : probability of getting k heads in n tosses
p(k,n) = C(n,k) * p^k * (1-p)^(n-k)
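The formula translates directly into code. Below is a small Python sketch using math.comb from the standard library (Python 3.8 or later); the function name binomial_pmf is an illustrative choice, not something from the talk.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k heads in n independent tosses,
    where p is the probability of heads on a single toss."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Example: probability of each possible number of heads in 4 fair tosses.
for k in range(5):
    print(k, binomial_pmf(k, 4, 0.5))
```

Summing binomial_pmf over k = 0..n gives 1, which is a quick sanity check that the probabilities cover every outcome.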
Appendix
[Figure: response histograms over the bins 0%, >0% but <=20%, >20% but <=40%, >40% but <=60%, >60% but <=80%, >80% but <100%, 100%. Groups: Historical (n=30) and Vaidya Scientists (n=9). Questions: Male given Haemophilia and Haemophilia given Male. Labels shown: 63%, 100%; 66%, 89%.]