Post on 15-Jan-2016
Comparing Distributions II: Bayes' Rule and Acceptance Sampling
By Peter Woolf ([email protected]), University of Michigan
Michigan Chemical Process Dynamics and Controls Open Textbook
version 1.0
Creative commons
In the last lecture, we found that variations in the product yield were significantly related to runny feed.
One solution is to find a way to identify runny feed before it is fed into the process, and avoid it.
Runnyfeedometer™
Image from http://controls.engin.umich.edu/wiki/index.php/PHandViscositySensors
You develop an offline tool to detect runny feed using a cone and plate viscometer. The test is inexpensive, but not always accurate due to inhomogeneous feed.
You have a more accurate way of measuring runny feed, but it is slow and expensive, so maybe you can get away with multiple reads on the Runnyfeedometer™?
Experimental Data: 100 known runny and 100 known normal samples tested in the Runnyfeedometer™
P(+ test | runny) = 98:100 (true positive)
P(- test | runny) = 2:100 (false negative)
P(+ test | normal) = 3:100 (false positive)
P(- test | normal) = 97:100 (true negative)
Question: What are the odds that exactly 9 of 10 tests on a runny sample come back positive?
Relevant data: P(+ test | runny) = 98:100, P(- test | runny) = 2:100
10 combinations:
$$\binom{10}{1} = \frac{10!}{1!\,(10-1)!} = 10$$
Probability of a particular outcome
(0.98)*(0.98)*(0.98)*(0.98)*(0.98)*(0.98)*(0.98)*(0.98)*(0.98)*(0.02)
Overall probability = (probability of a particular outcome) × (# combinations)
$$= 10 \times (0.98)^{9} (0.02)^{1} = 0.1667$$
Possible results:
{+,+,+,+,+,+,+,+,+,-}
{+,+,+,+,+,+,+,+,-,+}
{+,+,+,+,+,+,+,-,+,+}
{+,+,+,+,+,+,-,+,+,+}
{+,+,+,+,+,-,+,+,+,+}
{+,+,+,+,-,+,+,+,+,+}
{+,+,+,-,+,+,+,+,+,+}
{+,+,-,+,+,+,+,+,+,+}
{+,-,+,+,+,+,+,+,+,+}
{-,+,+,+,+,+,+,+,+,+}
Note: the outcomes become hard to list by hand if 2 or more tests fail.
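The counting argument above can be checked by brute force. The following is a minimal sketch, assuming independent tests and the 98:100 positive rate for runny feed; the function name `prob_exactly_k_negative` is hypothetical and not part of the original lecture.

```python
from itertools import product

P_POS_GIVEN_RUNNY = 0.98  # P(+ test | runny), from the experimental data
N_TESTS = 10


def prob_exactly_k_negative(k, n=N_TESTS, p_pos=P_POS_GIVEN_RUNNY):
    """Enumerate every length-n sequence of +/- results and return
    (number of sequences with exactly k negatives, their total probability)."""
    count = 0
    total = 0.0
    for outcome in product("+-", repeat=n):
        if outcome.count("-") != k:
            continue
        count += 1
        prob = 1.0
        for result in outcome:
            # Tests are assumed independent, so probabilities multiply.
            prob *= p_pos if result == "+" else (1.0 - p_pos)
        total += prob
    return count, total


count_1, prob_1 = prob_exactly_k_negative(1)
count_2, prob_2 = prob_exactly_k_negative(2)
print(count_1, round(prob_1, 4))  # 10 0.1667
print(count_2)                    # 45 sequences, already tedious to list by hand
```

Enumeration scales as 2^n, which is why a closed-form expression is preferable for anything beyond small n.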
In our case:
P(+ test | runny) = 98:100 = p
P(- test | runny) = 2:100 = (1-p)
Binomial Distribution
Describes the probability of obtaining k events from N independent samples of a binary outcome with known probability.
Examples:
• Odds of getting 20 heads from 30 coin tosses
• Odds of finding 3 broken bolts in a box of 100
$$p_{\text{binomial}}(k, N, p) = \binom{N}{k}\, p^{k} (1-p)^{N-k} = \frac{N!}{k!\,(N-k)!}\, p^{k} (1-p)^{N-k}$$
In Mathematica
[Figure: two binomial plots for 10 coin tosses. Panel titles: "Probability of exactly 5 heads out of 10 tosses" and "Probability of 0-5 heads out of 10 tosses".]
Probability test: What are the odds of getting 5 heads out of 10 coin tosses?
(a) 25%
(b) 50%
(c) 62%
Answer: it depends on how the question is read. (Note that the plot axes are off by 1.)
(a) 25%: okay, if the question means exactly 5 heads
(b) 50%: no
(c) 62%: okay, if the question means 5 or fewer heads
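The two acceptable answers can be reproduced directly from the binomial formula. This is a small sketch in Python rather than Mathematica; the helper names `binomial_pmf` and `binomial_cdf` are illustrative assumptions, and a fair coin (p = 0.5) is assumed.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k events in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """Probability of k or fewer events in n independent trials."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


exactly_5 = binomial_pmf(5, 10, 0.5)
at_most_5 = binomial_cdf(5, 10, 0.5)
print(f"{exactly_5:.4f}")  # 0.2461
print(f"{at_most_5:.4f}")  # 0.6230
```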
P(+ test | runny) = 98:100
P(- test | runny) = 2:100
P(+ test | normal) = 3:100
P(- test | normal) = 97:100
Given these data, what acceptance sampling criteria would be required to correctly identify a normal sample with 99.99% confidence?
Example acceptance sampling criterion: accept the sample if, from 10 tests, 3 or fewer are positive.
Translation: we want the following: P(normal | 3 or fewer positive results from 10 tests)
(0 in 10 positive: very likely normal; 10 in 10 positive: very likely runny)
[Figure: plot of P(x) versus x]
Using our binomial distribution, we can calculate a related quantity:
$$P(\text{3 or fewer positive results from 10 tests} \mid \text{normal}) = \sum_{i=0}^{3} \frac{10!}{i!\,(10-i)!}\, p^{i} (1-p)^{10-i}$$
where i = # of positive results, and p = probability of a positive result given a normal feed = 0.03
If the feed is normal, we will get ≤3 positive tests with more than 99% probability (about 0.9998)!
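As a quick numerical check of this sum, here is a minimal sketch assuming independent tests and the 3:100 false positive rate; the variable names are illustrative only. Note that it computes P(≤3 positive | normal), which is the likelihood, not the quantity we ultimately want.

```python
from math import comb

P_POS_GIVEN_NORMAL = 0.03  # P(+ test | normal), the false positive rate
N_TESTS = 10
MAX_POSITIVES = 3

# Sum the binomial probabilities for i = 0, 1, 2, 3 positive results.
p_le3_given_normal = sum(
    comb(N_TESTS, i)
    * P_POS_GIVEN_NORMAL**i
    * (1 - P_POS_GIVEN_NORMAL) ** (N_TESTS - i)
    for i in range(MAX_POSITIVES + 1)
)
print(f"{p_le3_given_normal:.4f}")
```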
Not the same! We want the following: P(normal | 3 or fewer positive results from 10 tests)
Three Probability Definitions
1. Joint Probability: $P(A, B)$
2. Conditional Probability: $P(A \mid B)$
3. Marginalization: $P(A) = \sum_{i=1}^{n} P(A \mid B_i)\,P(B_i)$
1. Joint Probability: $P(A, B)$
• What is the probability of drawing an ace first and then a jack from a deck of 52 cards? $\left(\frac{4}{52}\right)\left(\frac{4}{51}\right)$
• What is the probability of a protein being highly expressed and phosphorylated? (# highly expressed and phosphorylated proteins) / (total proteins)
• What is the probability that valves A and B both fail? (# times A & B fail) / (total observations)
2. Conditional Probability: $P(A \mid B)$
• What is the probability of drawing an ace given that you just drew a jack from a deck of 52 cards? $\left(\frac{4}{51}\right)$
• What is the probability of a protein being highly expressed given that it is phosphorylated? (# highly expressed phosphorylated proteins) / (total phosphorylated proteins)
• What is the probability that valve A fails given that B has failed? (# times A & B fail) / (total observations where B fails)
3. Marginalization: $P(A) = \sum_{i=1}^{n} P(A \mid B_i)\,P(B_i)$
• What is the probability of drawing an ace given that you just drew one other card from a deck of 52 cards?
$$P(\text{Ace}) = P(\text{Ace} \mid \text{previous ace})\,P(\text{previous ace}) + P(\text{Ace} \mid \neg\text{previous ace})\,P(\neg\text{previous ace})$$
$$P(\text{Ace}) = \left(\frac{3}{51}\right)\left(\frac{4}{52}\right) + \left(\frac{4}{51}\right)\left(\frac{48}{52}\right)$$
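The card examples can be evaluated exactly with rational arithmetic. The sketch below is illustrative only, with variable names chosen here rather than taken from the lecture; it evaluates the joint probability of an ace followed by a jack and the marginalized probability of an ace on the second draw.

```python
from fractions import Fraction

# Joint probability: ace first, then a jack.
p_ace_then_jack = Fraction(4, 52) * Fraction(4, 51)

# Marginalization over whether the previous (unknown) card was an ace.
p_prev_ace = Fraction(4, 52)
p_prev_not_ace = Fraction(48, 52)
p_ace_given_prev_ace = Fraction(3, 51)
p_ace_given_prev_not_ace = Fraction(4, 51)

p_ace = (
    p_ace_given_prev_ace * p_prev_ace
    + p_ace_given_prev_not_ace * p_prev_not_ace
)

print(p_ace_then_jack)  # 4/663
print(p_ace)            # 1/13
```

The marginal result equals 4/52: without knowing the previous card, the chance of an ace is the same as on a first draw.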
Probability Algebra
In general:
$$P(A, B) = P(A \mid B)\,P(B)$$
If independent:
$$P(A, B) = P(A)\,P(B)$$
Either variable can be conditioned on:
$$P(A, B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$$
Bayes' Rule:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
We want the following: P(normal | 3 or fewer positive results from 10 tests)
Bayes' Rule:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
$$P(\text{normal} \mid \leq 3 \text{ of 10 positive}) = \frac{P(\leq 3 \text{ of 10 positive} \mid \text{normal})\,P(\text{normal})}{P(\leq 3 \text{ of 10 positive})}$$
• P(≤3 of 10 positive | normal): binomial distribution
• P(normal): prior
• P(≤3 of 10 positive): marginalize
P(3 or fewer positive results from 10 tests | normal):
$$\sum_{i=0}^{3} \frac{10!}{i!\,(10-i)!}\, p^{i} (1-p)^{10-i} = 0.9998$$
P(normal): from prior observations, what are the odds of getting a batch of normal feed? Previous data found normal feed in 19 of 25 samples, so a first approximation could be 0.76.
P(3 or fewer positive results from 10 tests): found by marginalizing over runny and normal
$$P(A) = \sum_{i=1}^{n} P(A \mid B_i)\,P(B_i)$$
P(≤3 of 10 positive) = P(≤3 of 10 positive | runny) P(runny) + P(≤3 of 10 positive | normal) P(normal)
P(≤3 of 10 positive | runny): since P(+ test | runny) = 98:100, a runny sample will yield ≤3 positives ~0% of the time.
P(runny) = 1 - P(normal) = 0.24
Putting the pieces together:
P(3 or fewer positive results from 10 tests) = (0)(0.24) + (0.9998)(0.76) = 0.75988
P(normal | 3 or fewer positive results from 10 tests) = (0.9998)(0.76) / 0.75988 ≈ 1
This acceptance sampling criterion will identify runny feeds essentially 100% of the time. It may be too strict!
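The whole Bayes' Rule calculation for the "3 or fewer of 10" criterion can be sketched in a few lines. This is an illustrative sketch, assuming independent tests and the 19-of-25 prior from the slides; the function and variable names are assumptions made here.

```python
from math import comb


def binomial_cdf(k, n, p):
    """Probability of k or fewer positives in n independent tests."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


P_POS_GIVEN_RUNNY = 0.98   # true positive rate
P_POS_GIVEN_NORMAL = 0.03  # false positive rate
P_NORMAL = 19 / 25         # prior: normal feed in 19 of 25 earlier samples
P_RUNNY = 1 - P_NORMAL

# Likelihoods of seeing 3 or fewer positives out of 10 tests.
p_le3_given_normal = binomial_cdf(3, 10, P_POS_GIVEN_NORMAL)
p_le3_given_runny = binomial_cdf(3, 10, P_POS_GIVEN_RUNNY)

# Marginalize over runny and normal to get the denominator.
p_le3 = p_le3_given_runny * P_RUNNY + p_le3_given_normal * P_NORMAL

# Bayes' Rule.
p_normal_given_le3 = p_le3_given_normal * P_NORMAL / p_le3

print(f"P(<=3 | normal) = {p_le3_given_normal:.4f}")
print(f"P(<=3 | runny)  = {p_le3_given_runny:.2e}")
print(f"P(<=3)          = {p_le3:.5f}")
print(f"P(normal | <=3) = {p_normal_given_le3:.10f}")
```

P(≤3 | runny) is not exactly zero, only very small, so the posterior is extremely close to 1 rather than exactly 1.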
Test different acceptance sampling criteria:
Acceptance sampling criteria will identify normal feeds >99.99% of the time
Remember:
• 0 in 10 positive: very likely normal
• 10 in 10 positive: very likely runny
• 0 to 10 positive: no information
--> 0 to 6 positive: likely normal
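One way to test different acceptance criteria is to sweep the maximum number of positives allowed and compute the posterior for each. The sketch below assumes independent tests, the test rates from the experimental data, and the 0.76 prior; `posterior_normal` is a hypothetical helper name.

```python
from math import comb


def binomial_cdf(k, n, p):
    """Probability of k or fewer positives in n independent tests."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


P_POS_GIVEN_RUNNY = 0.98
P_POS_GIVEN_NORMAL = 0.03
P_NORMAL = 0.76
P_RUNNY = 1 - P_NORMAL
N_TESTS = 10
TARGET = 0.9999  # required confidence that an accepted feed is normal


def posterior_normal(max_positives, n=N_TESTS):
    """P(normal | max_positives or fewer positive results from n tests)."""
    like_normal = binomial_cdf(max_positives, n, P_POS_GIVEN_NORMAL)
    like_runny = binomial_cdf(max_positives, n, P_POS_GIVEN_RUNNY)
    evidence = like_runny * P_RUNNY + like_normal * P_NORMAL
    return like_normal * P_NORMAL / evidence


for k in range(N_TESTS + 1):
    print(f"<= {k:2d} of {N_TESTS} positive: P(normal) = {posterior_normal(k):.6f}")

passing = [k for k in range(N_TESTS + 1) if posterior_normal(k) > TARGET]
loosest_criterion = max(passing)
print("Loosest criterion meeting the target:", loosest_criterion)  # 6
```

Accepting on "0 to 10 positive" returns the prior of 0.76, since that criterion carries no information.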
Analysis result:If ≤6 of 10 samples report positive then I am >99.99% sure the feed is normal.
Acceptance criteria:If ≤6 of 10 tests are positive, use feed, otherwise reject feed.
Q: What are the odds of rejecting normal feed?
$$P(\text{normal} \mid \geq 7 \text{ of 10 positive}) = \frac{P(\geq 7 \text{ of 10 positive} \mid \text{normal})\,P(\text{normal})}{P(\geq 7 \text{ of 10 positive})}$$
Very rarely.
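To put a rough number on "very rarely", the sketch below evaluates both the probability that a normal feed is rejected, P(7 or more positive | normal), and the expression above, P(normal | 7 or more positive). It assumes independent tests and the 0.76 prior; the helper `binomial_upper_tail` is a name introduced here for illustration.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Probability of k or more positives in n independent tests.
    Summed directly to avoid the rounding error of 1 - cdf for tiny tails."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


P_POS_GIVEN_RUNNY = 0.98
P_POS_GIVEN_NORMAL = 0.03
P_NORMAL = 0.76
P_RUNNY = 1 - P_NORMAL

# Chance that a normal feed fails the "6 or fewer of 10" criterion.
p_reject_given_normal = binomial_upper_tail(7, 10, P_POS_GIVEN_NORMAL)
p_reject_given_runny = binomial_upper_tail(7, 10, P_POS_GIVEN_RUNNY)

# Marginal probability of a rejection, then Bayes' Rule.
p_reject = p_reject_given_runny * P_RUNNY + p_reject_given_normal * P_NORMAL
p_normal_given_reject = p_reject_given_normal * P_NORMAL / p_reject

print(f"P(>=7 positive | normal) = {p_reject_given_normal:.2e}")
print(f"P(normal | >=7 positive) = {p_normal_given_reject:.2e}")
```

Under these assumptions both quantities are on the order of one in a hundred million or smaller.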
Take Home Messages
• Acceptance sampling provides an easy-to-implement way to eliminate variation.
• Basic probability rules like Bayes' Rule help you rearrange your expressions to get to things you can solve.