Chapter 3: Discrete Random Variables and Probability Distributions
3.1 - Random Variables
3.2 - Probability Distributions for Discrete Random Variables
3.3 - Expected Values
3.4 - The Binomial Probability Distribution
3.5 - Hypergeometric and Negative Binomial Distributions
3.6 - The Poisson Probability Distribution
Recall…

POPULATION
Discrete random variable X
Examples: shoe size, dosage (mg), # cells, …

Pop values x    Probabilities f(x)    Cumul Probs F(x)
x1              f(x1)                 f(x1)
x2              f(x2)                 f(x1) + f(x2)
x3              f(x3)                 f(x1) + f(x2) + f(x3)
⋮               ⋮                     ⋮
                                      1
Total           1

Mean: $\mu = \sum_{\text{all } x} x \, f(x)$
Variance: $\sigma^2 = \sum_{\text{all } x} (x - \mu)^2 \, f(x)$

[Figure: probability histogram of X; Total Area = 1]
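The mean, variance, and cumulative probabilities of a discrete random variable can be computed directly from its probability table. The sketch below is illustrative only: the function names and the fair-die table are assumptions, not part of the slides.

```python
def mean_and_variance(pmf):
    """Return (mu, sigma^2) for a table {x: f(x)} of a discrete random variable."""
    total = sum(pmf.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    mu = sum(x * p for x, p in pmf.items())               # mu = sum of x f(x)
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())  # sigma^2 = sum of (x - mu)^2 f(x)
    return mu, var


def cumulative(pmf):
    """Return {x: F(x)} where F(x) = f(x1) + ... + f(x), in increasing order of x."""
    running = 0.0
    cdf = {}
    for x in sorted(pmf):
        running += pmf[x]
        cdf[x] = running
    return cdf


# Hypothetical example (not from the slides): one roll of a fair six-sided die.
die = {x: 1 / 6 for x in range(1, 7)}
mu, var = mean_and_variance(die)
F = cumulative(die)
print(mu, var, F[6])
```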
~ The Binomial Distribution ~
Used only when dealing with binary outcomes (two categories: “Success” vs. “Failure”), with a fixed probability of Success (π) in the population.
Calculates the probability of obtaining any given number of Successes in a random sample of n independent “Bernoulli trials.”
Has many applications and generalizations, e.g., multiple categories, variable probability of Success, etc.
For any randomly selected individual, define a binary random variable:
  Y = 1 if Male, with prob π = 0.4
  Y = 0 if Female, with prob 1 – π = 0.6
POPULATION: 40% Male, 60% Female
RANDOM SAMPLE: n = 100
Discrete random variable X = # Males in sample (0, 1, 2, 3, …, 99, 100)
How can we calculate the probabilities P(X = 0), P(X = 1), P(X = 2), …, P(X = 99), P(X = 100)?
That is, f(x) = P(X = x), for x = 0, 1, 2, 3, …, 100?
And F(x) = P(X ≤ x), for x = 0, 1, 2, 3, …, 100?
Example: What is f(25) = P(X = 25)?
Solution: Model the sample as a sequence of independent coin tosses, with 1 = Heads (Male), 0 = Tails (Female), where
P(H) = 0.4, P(T) = 0.6
How many possible outcomes of n = 100 tosses exist? $2^{100}$

How many possible outcomes of n = 100 tosses exist with X = 25 Heads?

Slots: 1 2 3 4 5 . . . . . . 97 98 99 100

X = 25 Heads: { H1, H2, H3, …, H25 }
There are 100 possible open slots for H1 to occupy.
For each one of them, there are 99 possible open slots left for H2 to occupy.
For each one of them, there are 98 possible open slots left for H3 to occupy.
…etc…
For each one of them, there are 77 possible open slots left for H24 to occupy.
For each one of them, there are 76 possible open slots left for H25 to occupy.

Hence, there are 100 × 99 × 98 × … × 77 × 76 possible outcomes. This value is the number of permutations of 25 among 100 coins, denoted 100P25.

HOWEVER… This number unnecessarily includes the distinct permutations of the 25 among themselves, all of which have Heads in the same positions. For example, rearranging the labels H1, …, H25 over the same 25 slots: we would not want to count this as a distinct outcome.

How many is that? By the same logic, 25 × 24 × 23 × … × 3 × 2 × 1, i.e., “25 factorial”, denoted 25!.

Dividing out the overcount:
$$\binom{100}{25} = \frac{100 \times 99 \times 98 \times \cdots \times 77 \times 76}{25 \times 24 \times 23 \times \cdots \times 3 \times 2 \times 1} = \frac{100!}{25!\;75!}$$
“100-choose-25”, denoted $\binom{100}{25}$ or 100C25 (= 100 nCr 25 on your calculator).
This value counts the number of combinations of 25 Heads among 100 coins.
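The counting argument can be checked with exact integer arithmetic. This is a minimal sketch using Python's standard `math` module (Python 3.8 or later assumed); the variable names are illustrative.

```python
import math

n, k = 100, 25

total_outcomes = 2 ** n            # every Heads/Tails sequence of length 100
ordered = math.perm(n, k)          # 100 * 99 * 98 * ... * 76  (100P25)
relabelings = math.factorial(k)    # 25 * 24 * ... * 1  (25!)

# Dividing out the relabelings of the 25 Heads leaves the combinations.
combinations = ordered // relabelings

# Same number via the factorial formula 100! / (25! 75!) and via math.comb.
via_factorials = math.factorial(n) // (math.factorial(k) * math.factorial(n - k))
print(combinations == via_factorials == math.comb(n, k))
```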
What is the probability of each such outcome?
Recall that, per toss, P(Heads) = π = 0.4 and P(Tails) = 1 – π = 0.6.

0.4 × 0.6 × 0.6 × 0.4 × 0.6 × … × 0.6 × 0.4 × 0.4 × 0.6

Answer: Via independence in binary outcomes between any two coins, this product equals $(0.4)^{25} (0.6)^{75}$.

Therefore, the probability P(X = 25) is equal to
$$P(X = 25) = \binom{100}{25} (0.4)^{25} (0.6)^{75}$$

Question: What if the coin were “fair” (unbiased), i.e., π = 1 – π = 0.5?
Answer: Each outcome then has probability 0.5 × 0.5 × 0.5 × … × 0.5 = $(0.5)^{100}$, so
$$P(X = 25) = \binom{100}{25} (0.5)^{100} = \binom{100}{25} (1/2)^{100} = \frac{\binom{100}{25}}{2^{100}}$$
This is the “equally likely” scenario!
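Both answers can be evaluated numerically. The sketch below assumes only the values on the slide (n = 100, x = 25, π = 0.4 or 0.5); using `Fraction` for the fair-coin case is an illustrative choice that makes the "favorable outcomes over total outcomes" reading explicit.

```python
import math
from fractions import Fraction

n, x = 100, 25
ways = math.comb(n, x)                      # number of outcomes with exactly 25 Heads

# Biased coin: each such outcome has probability (0.4)^25 (0.6)^75.
p_biased = ways * 0.4 ** x * 0.6 ** (n - x)

# Fair coin: all 2^100 outcomes are equally likely, so count and divide.
p_fair_exact = Fraction(ways, 2 ** n)
p_fair = float(p_fair_exact)

print(p_biased, p_fair)
```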
For any randomly selected individual, define a binary random variable:
  Y = 1 (“Success”) if Male, with prob π = 0.4
  Y = 0 (“Failure”) if Female, with prob 1 – π = 0.6
POPULATION: 40% Male, 60% Female
RANDOM SAMPLE: n = 100
Discrete random variable X = # Males in sample (0, 1, 2, 3, …, 99, 100)

Example: What is the probability P(X = 25)?
Solution: Model the sample as a sequence of n = 100 independent coin tosses, with 1 = Heads (Male), 0 = Tails (Female). Then
$$P(X = 25) = \binom{100}{25} (0.4)^{25} (0.6)^{75}$$
and, for any x = 0, 1, 2, 3, …, 100,
$$f(x) = P(X = x) = \binom{100}{x} (0.4)^{x} (0.6)^{100 - x}, \qquad F(x) = P(X \le x).$$

Replacing 0.4 by π and 0.6 by 1 – π gives $\binom{100}{x} \pi^{x} (1 - \pi)^{100 - x}$; replacing 100 by a general sample size n gives $\binom{n}{x} \pi^{x} (1 - \pi)^{n - x}$.

In general: n Bernoulli trials (“Success” vs. “Failure”), independent, with constant probability per trial: P(“Success”) = π, P(“Failure”) = 1 – π.
Discrete random variable X = # “Successes” in sample (0, 1, 2, 3, …, n)
Then X is said to follow a Binomial distribution, written X ~ Bin(n, π), with “probability function”
$$f(x) = \binom{n}{x} \pi^{x} (1 - \pi)^{n - x}, \qquad x = 0, 1, 2, \ldots, n.$$
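The probability function and its cumulative version translate almost line for line into code. This is a sketch, not a library replacement: the names `binom_pmf` and `binom_cdf` are made up here, and the direct formula is only suitable for moderate n (for very large n the binomial coefficient no longer converts to a float).

```python
import math


def binom_pmf(x, n, pi):
    """f(x) = C(n, x) * pi^x * (1 - pi)^(n - x) for x = 0, 1, ..., n; 0 otherwise."""
    if not 0 <= x <= n:
        return 0.0
    return math.comb(n, x) * pi ** x * (1 - pi) ** (n - x)


def binom_cdf(x, n, pi):
    """F(x) = P(X <= x) = f(0) + f(1) + ... + f(x)."""
    return sum(binom_pmf(k, n, pi) for k in range(0, min(x, n) + 1))


# Values from the slide: n = 100 individuals, pi = P(Male) = 0.4.
n, pi = 100, 0.4
total = sum(binom_pmf(x, n, pi) for x in range(n + 1))
print(total)                                   # should be 1 up to rounding
print(binom_pmf(25, n, pi), binom_cdf(25, n, pi))
```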
Example: Blood Type probabilities, revisited

                  Rh Factor
Blood Type      +       –      Total
O             .384    .077     .461
A             .323    .065     .388
B             .094    .017     .111
AB            .032    .007     .039
Total         .833    .166     .999

Suppose n = 10 individuals are to be selected at random from the population.
Probability table for X = #(Type O)

Binomial model applies? Check:
1. Independent outcomes? Reasonably assume that outcomes “Type O” vs. “Not Type O” between two individuals are independent of each other.
2. Constant probability π? From table, π = P(Type O) = .461 throughout population.
Binomial model applies: X ~ Bin(10, .461), with
$$f(x) = \binom{10}{x} (.461)^{x} (.539)^{10 - x}, \qquad x = 0, 1, 2, \ldots, 10.$$

x     f(x)                                    F(x)
0     10C0  (.461)^0  (.539)^10 = 0.00207     0.00207
1     10C1  (.461)^1  (.539)^9  = 0.01770     0.01977
2     10C2  (.461)^2  (.539)^8  = 0.06813     0.08790
3     10C3  (.461)^3  (.539)^7  = 0.15538     0.24328
4     10C4  (.461)^4  (.539)^6  = 0.23257     0.47585
5     10C5  (.461)^5  (.539)^5  = 0.23870     0.71455
6     10C6  (.461)^6  (.539)^4  = 0.17013     0.88468
7     10C7  (.461)^7  (.539)^3  = 0.08315     0.96783
8     10C8  (.461)^8  (.539)^2  = 0.02667     0.99450
9     10C9  (.461)^9  (.539)^1  = 0.00507     0.99957
10    10C10 (.461)^10 (.539)^0  = 0.00043     1.00000
Total                             1

R: dbinom(0:10, 10, .461)

Also, can show mean μ = Σ x f(x) = nπ = (10)(.461) = 4.61
and variance σ² = Σ (x – μ)² f(x) = nπ(1 – π) = 2.48
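The table, mean, and variance can be reproduced from the formula as a check. This sketch plays the same role as the R call `dbinom(0:10, 10, .461)` but uses only Python's standard library; the variable names are illustrative.

```python
import math

n, pi = 10, 0.461

# f(x) for x = 0, 1, ..., 10
f = [math.comb(n, x) * pi ** x * (1 - pi) ** (n - x) for x in range(n + 1)]

# F(x) as a running total of f
F = []
running = 0.0
for p in f:
    running += p
    F.append(running)

# Mean and variance straight from the definitions, to compare with the shortcuts.
mu = sum(x * p for x, p in enumerate(f))
var = sum((x - mu) ** 2 * p for x, p in enumerate(f))

for x in range(n + 1):
    print(f"{x:2d}  {f[x]:.5f}  {F[x]:.5f}")
print(f"mean = {mu:.2f}  vs  n*pi = {n * pi:.2f}")
print(f"variance = {var:.2f}  vs  n*pi*(1 - pi) = {n * pi * (1 - pi):.2f}")
```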
Example: Blood Type probabilities, revisited
Suppose n = 1500 individuals are to be selected at random from the population.
Probability table for X = #(Type AB–)
From table, π = P(Type AB–) = .007. RARE EVENT!
Binomial model applies: X ~ Bin(1500, .007). Therefore,
$$f(x) = \binom{1500}{x} (.007)^{x} (.993)^{1500 - x}, \qquad x = 0, 1, 2, \ldots, 1500.$$
Also, can show mean μ = Σ x f(x) = nπ = 10.5
and variance σ² = Σ (x – μ)² f(x) = nπ(1 – π) = 10.43
3.6 - The Poisson Probability Distribution
X ~ Bin(1500, .007)
Also, can show mean μ = Σ x f(x) = nπ and variance σ² = Σ (x – μ)² f(x) = nπ(1 – π).
Is there a better alternative?
Poisson distribution:
$$f(x) = \frac{e^{-\mu} \mu^{x}}{x!}, \qquad x = 0, 1, 2, \ldots,$$
where mean and variance are μ = nπ and σ² = nπ.
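That the Poisson mean and variance are both equal to μ can be checked numerically. The sketch below builds the probabilities with the recurrence f(x) = f(x – 1) · μ / x, which follows from the formula; the function name and the cutoff x = 100 are assumptions made for illustration (the tail beyond it is negligible for μ = 10.5).

```python
import math


def poisson_pmf_table(mu, x_max):
    """Return [f(0), ..., f(x_max)] with f(x) = exp(-mu) * mu^x / x!."""
    f = [math.exp(-mu)]                 # f(0) = e^(-mu)
    for x in range(1, x_max + 1):
        f.append(f[-1] * mu / x)        # f(x) = f(x - 1) * mu / x
    return f


mu = 1500 * 0.007                       # n * pi = 10.5
f = poisson_pmf_table(mu, x_max=100)    # truncated table; cutoff is an assumption

total = sum(f)
mean = sum(x * p for x, p in enumerate(f))
var = sum((x - mean) ** 2 * p for x, p in enumerate(f))
print(total, mean, var)
```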
Example: Blood Type probabilities, revisited
Suppose n = 1500 individuals are to be selected at random from the population, with X = #(Type AB–). RARE EVENT!
Binomial model: X ~ Bin(1500, .007), with μ = nπ = 10.5 and σ² = nπ(1 – π) = 10.43
Poisson model: X ~ Poisson(10.5), with μ = 10.5 and σ² = 10.5
Notation: Sometimes the symbol λ (“lambda”) is used instead of μ (“mu”).
Suppose n = 1500 individuals are to be selected at random from the population.
Probability table for X = #(Type AB–)
X ~ Poisson(10.5):
$$f(x) = \frac{e^{-10.5} (10.5)^{x}}{x!}, \qquad x = 0, 1, 2, \ldots$$

Ex: Probability of exactly X = 15 Type (AB–) individuals = ?
Binomial: $\binom{1500}{15} (.007)^{15} (.993)^{1485}$
Poisson: $\dfrac{e^{-10.5} (10.5)^{15}}{15!}$
(both ≈ .0437)
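The two answers can be compared side by side. This sketch works with logarithms (via `math.lgamma`, where lgamma(k + 1) = ln k!) so that the very large binomial coefficient and the very small powers never have to be formed directly; that is an implementation choice made here, not something the slides prescribe.

```python
import math

n, pi, x = 1500, 0.007, 15
mu = n * pi                                     # 10.5

# ln of C(n, x) * pi^x * (1 - pi)^(n - x)
log_binom = (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
             + x * math.log(pi) + (n - x) * math.log(1 - pi))

# ln of exp(-mu) * mu^x / x!
log_pois = -mu + x * math.log(mu) - math.lgamma(x + 1)

p_binom = math.exp(log_binom)
p_pois = math.exp(log_pois)
print(f"Binomial: {p_binom:.4f}   Poisson: {p_pois:.4f}")
```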
Example: Deaths in Wisconsin
Assuming deaths among young adults are relatively rare, we know the following:
• Average 584 deaths per year, so λ = 584.
• Mortality rate (α) seems constant.
Therefore, the Poisson distribution can be used as a good model to make future predictions about the random variable X = “# deaths” per year, for this population (15-24 yrs)… assuming current values will still apply.

Probability of exactly X = 600 deaths next year:
$$P(X = 600) = \frac{e^{-584} (584)^{600}}{600!} \approx 0.0131$$
R: dpois(600, 584)

Probability of exactly X = 1200 deaths in the next two years:
Mean of 584 deaths per yr ⇒ mean of 1168 deaths per two yrs, so let λ = 1168:
$$P(X = 1200) = \frac{e^{-1168} (1168)^{1200}}{1200!} \approx 0.00746$$

Probability of at least one death per day:
λ = (584 deaths / yr) / (365 days / yr) = 1.6 deaths/day
P(X ≥ 1) = P(X = 1) + P(X = 2) + P(X = 3) + … True, but not practical.
$$P(X \ge 1) = 1 - P(X = 0) = 1 - \frac{e^{-1.6} (1.6)^{0}}{0!} = 1 - e^{-1.6} = 0.798$$
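The three Wisconsin calculations can be reproduced with a small Poisson function. Terms such as (584)^600 and 600! overflow ordinary floating point, so this sketch evaluates the formula on the log scale; the function name `poisson_pmf` is an illustrative stand-in for R's `dpois`.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) = exp(-lam) * lam^x / x!, evaluated via logarithms."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))


p_600 = poisson_pmf(600, 584)              # exactly 600 deaths next year
p_1200 = poisson_pmf(1200, 2 * 584)        # exactly 1200 deaths in two years

lam_day = 584 / 365                        # 1.6 deaths per day
p_at_least_one = 1 - poisson_pmf(0, lam_day)   # complement of "no deaths"

print(f"{p_600:.4f}  {p_1200:.5f}  {p_at_least_one:.3f}")
```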
Classical Discrete Probability Distributions
● Binomial ~ X = # Successes in n trials, P(Success) = π
● Poisson ~ As above, but n large, π small, i.e., Success RARE
● Negative Binomial ~ X = # trials for k Successes, P(Success) = π
● Geometric ~ As above, but specialized to k = 1
● Hypergeometric ~ As Binomial, but π changes between trials
● Multinomial ~ As Binomial, but for multiple categories, with π1 + π2 + … + πlast = 1 and x1 + x2 + … + xlast = n
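As one illustration of how the list's definitions turn into formulas, here is a sketch of the Negative Binomial and Geometric cases under the convention stated above (X = # trials needed for k Successes). The formula $\binom{x-1}{k-1} \pi^{k} (1 - \pi)^{x-k}$ is the standard one for that convention but is not given on the slides; the function names and the test values (k = 3, π = 0.4) are assumptions.

```python
import math


def neg_binom_pmf(x, k, pi):
    """P(X = x), X = # trials needed to reach k Successes.

    The x-th trial must be a Success, and the other k - 1 Successes
    can fall anywhere among the first x - 1 trials.
    """
    if x < k:
        return 0.0
    return math.comb(x - 1, k - 1) * pi ** k * (1 - pi) ** (x - k)


def geometric_pmf(x, pi):
    """Special case k = 1: first Success occurs on trial x."""
    return neg_binom_pmf(x, 1, pi)


# Hypothetical values: k = 3 Successes, pi = 0.4; sum over a long truncated range.
total = sum(neg_binom_pmf(x, 3, 0.4) for x in range(3, 501))
print(total, geometric_pmf(3, 0.4))
```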