Chapter 3 Discrete Random Variables and Probability Distributions 3.1 - Random Variables 3.2 - Probability Distributions for Discrete Random Variables 3.3 - Expected Values 3.4 - The Binomial Probability Distribution 3.5 - Hypergeometric and Negative Binomial Distributions

Upload: domenic-carpenter

Post on 18-Jan-2016


Chapter 3
Discrete Random Variables and Probability Distributions

3.1 - Random Variables
3.2 - Probability Distributions for Discrete Random Variables
3.3 - Expected Values
3.4 - The Binomial Probability Distribution
3.5 - Hypergeometric and Negative Binomial Distributions
3.6 - The Poisson Probability Distribution

Recall…

POPULATION

Discrete random variable X. Examples: shoe size, dosage (mg), # cells, …

Pop values x    Probabilities f(x)    Cumul Probs F(x)
x1              f(x1)                 f(x1)
x2              f(x2)                 f(x1) + f(x2)
x3              f(x3)                 f(x1) + f(x2) + f(x3)
⋮               ⋮                     ⋮
                                      1
Total           1

Mean: $\mu = \sum_{\text{all } x} x \, f(x)$

Variance: $\sigma^2 = \sum_{\text{all } x} (x - \mu)^2 \, f(x)$

[Probability histogram of X: Total Area = 1]
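The table and the two summation formulas can be sketched in a few lines of Python. This is only an illustration: the function name `summarize` and the three-value probability table are made up for the example and do not come from the slides.

```python
def summarize(pmf):
    """Return (cdf, mean, variance) for a probability table {x: f(x)}."""
    xs = sorted(pmf)
    cdf, running = {}, 0.0
    for x in xs:
        running += pmf[x]          # F(x) = f(x1) + ... + f(x)
        cdf[x] = running
    mean = sum(x * pmf[x] for x in xs)                     # sum of x f(x)
    variance = sum((x - mean) ** 2 * pmf[x] for x in xs)   # sum of (x - mu)^2 f(x)
    return cdf, mean, variance


# Hypothetical probability table (made-up values, not from the slides).
example_pmf = {0: 0.2, 1: 0.5, 2: 0.3}
cdf, mean, variance = summarize(example_pmf)
print(cdf, mean, variance)
```

The last cumulative value should come out as 1 (up to rounding), matching the "Total 1" row of the table.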


~ The Binomial Distribution ~

Used only when dealing with binary outcomes (two categories: “Success” vs. “Failure”), with a fixed probability of Success (π) in the population.

Calculates the probability of obtaining any given number of Successes in a random sample of n independent “Bernoulli trials.”

Has many applications and generalizations, e.g., multiple categories, variable probability of Success, etc.

POPULATION: 40% Male, 60% Female

For any randomly selected individual, define a binary random variable:

Y = 1 if Male, with prob π = 0.4
Y = 0 if Female, with prob 1 − π = 0.6

RANDOM SAMPLE, n = 100

Discrete random variable X = # Males in sample (0, 1, 2, 3, …, 99, 100)

How can we calculate the probabilities P(X = 0), P(X = 1), P(X = 2), …, P(X = 99), P(X = 100)? That is,

f(x) = P(X = x), for x = 0, 1, 2, 3, …, 100?

F(x) = P(X ≤ x), for x = 0, 1, 2, 3, …, 100?

x       f(x)
x1      f(x1)
x2      f(x2)
x3      f(x3)
⋮       ⋮
Total   1

Example: How can we calculate the probability f(25) = P(X = 25)?

Solution: Model the sample as a sequence of independent coin tosses, with 1 = Heads (Male), 0 = Tails (Female), where

P(H) = 0.4, P(T) = 0.6

… etc. …

How many possible outcomes of n = 100 tosses exist? $2^{100}$

How many possible outcomes of n = 100 tosses exist with X = 25 Heads?

Slots: 1 2 3 4 5 . . . . . . 97 98 99 100

X = 25 Heads: { H1, H2, H3, …, H25 }

There are 100 possible open slots for H1 to occupy.

For each one of them, there are 99 possible open slots left for H2 to occupy.

For each one of them, there are 98 possible open slots left for H3 to occupy.

… etc. …

For each one of them, there are 77 possible open slots left for H24 to occupy.

For each one of them, there are 76 possible open slots left for H25 to occupy.

Hence, there are 100 × 99 × 98 × … × 77 × 76 possible outcomes.

This value is the number of permutations of 25 among 100 coins, denoted 100P25.

HOWEVER…

This number (100 × 99 × 98 × … × 77 × 76) unnecessarily includes the distinct permutations of the 25 Heads among themselves, all of which have Heads in the same positions. For example, swapping H1 and H2 between their two slots leaves Heads in exactly the same positions. We would not want to count this as a distinct outcome.

How many is that? By the same logic, 25 × 24 × 23 × … × 3 × 2 × 1, i.e., “25 factorial”, denoted 25!

Dividing the number of permutations by 25! gives

$\binom{100}{25} = \dfrac{100 \times 99 \times 98 \times \cdots \times 77 \times 76}{25 \times 24 \times 23 \times \cdots \times 3 \times 2 \times 1} = \dfrac{100!}{25! \, 75!}$

“100-choose-25”, denoted $\binom{100}{25}$ or 100C25 (= 100 nCr 25 on your calculator).

This value counts the number of combinations of 25 Heads among 100 coins.
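As a quick check of the counting argument, the three quantities can be computed exactly with Python's standard `math` module (integer arithmetic, so nothing is rounded). The variable names are illustrative only.

```python
import math

n, k = 100, 25

ordered = math.perm(n, k)      # 100 x 99 x ... x 76, i.e. 100P25
repeats = math.factorial(k)    # 25!, orderings of the 25 Heads among themselves
combos = ordered // repeats    # divide out the repeats

# Same value as 100!/(25! 75!) and as the built-in "n choose k".
same_as_factorials = combos == math.factorial(n) // (math.factorial(k) * math.factorial(n - k))
same_as_comb = combos == math.comb(n, k)
print(combos, same_as_factorials, same_as_comb)
```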

Recall that, per toss, P(Heads) = π = 0.4 and P(Tails) = 1 − π = 0.6.

How many possible outcomes of n = 100 tosses exist with X = 25 Heads? $\binom{100}{25}$

What is the probability of each such outcome?

Toss:  1   2   3   4   5   . . . . . .  97  98  99  100
Prob: 0.4 0.6 0.6 0.4 0.6 . . . . . . 0.6 0.4 0.4 0.6

Answer: Via independence in binary outcomes between any two coins,

0.4 × 0.6 × 0.6 × 0.4 × 0.6 × … × 0.6 × 0.4 × 0.4 × 0.6 = $(0.4)^{25} (0.6)^{75}$.

Therefore, the probability P(X = 25) is equal to

$\binom{100}{25} (0.4)^{25} (0.6)^{75}$
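The two ingredients of P(X = 25), the number of outcomes and the probability of each one, can be multiplied out numerically. The sketch below uses only the standard library; `prob_heads` is an illustrative name for the per-toss probability 0.4.

```python
import math

n, x = 100, 25
prob_heads = 0.4                     # per-toss P(Heads), from the example

n_outcomes = math.comb(n, x)         # outcomes with exactly 25 Heads
per_outcome = prob_heads ** x * (1 - prob_heads) ** (n - x)   # (0.4)^25 (0.6)^75
p_25 = n_outcomes * per_outcome      # P(X = 25)

print(p_25)
```

The result is small, as one would expect for 25 Males in a sample whose expected count is 40.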

Question: What if the coin were “fair” (unbiased), i.e., π = 1 − π = 0.5?

Toss:  1   2   3   4   5   . . . . . .  97  98  99  100
Prob: 0.5 0.5 0.5 0.5 0.5 . . . . . . 0.5 0.5 0.5 0.5

0.5 × 0.5 × 0.5 × 0.5 × 0.5 × … × 0.5 × 0.5 × 0.5 × 0.5 = $(0.5)^{100}$

Answer: $P(X = 25) = \binom{100}{25} (0.5)^{100} = \binom{100}{25} (1/2)^{100} = \binom{100}{25} \Big/ 2^{100}$

This is the “equally likely” scenario! The number of outcomes with 25 Heads is divided by the total number $2^{100}$ of possible outcomes.
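A short check that the fair-coin answer really is "favorable outcomes over total outcomes": the sketch compares the exact fraction with the product form. Names are illustrative.

```python
import math
from fractions import Fraction

n, x = 100, 25

# Equally likely view: outcomes with 25 Heads / all 2^100 outcomes (exact).
exact = Fraction(math.comb(n, x), 2 ** n)

# Binomial product form with probability 0.5 on every toss.
via_formula = math.comb(n, x) * 0.5 ** n

agree = math.isclose(float(exact), via_formula, rel_tol=1e-12)
print(float(exact), via_formula, agree)
```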

Example: What is the probability P(X = 25)?

POPULATION: 40% Male, 60% Female. RANDOM SAMPLE, n = 100. Discrete random variable X = # Males in sample (0, 1, 2, 3, …, 99, 100).

Solution: Model the sample as a sequence of n = 100 independent coin tosses, with 1 = Heads (Male), 0 = Tails (Female).

$P(X = 25) = \binom{100}{25} (0.4)^{25} (0.6)^{75}$

and, for any x = 0, 1, 2, 3, …, 100,

$P(X = x) = \binom{100}{x} (0.4)^{x} (0.6)^{100 - x}$

In general:

For any randomly selected individual, define a binary random variable (“Success” vs. “Failure”):

Y = 1 if “Success”, with prob π
Y = 0 if “Failure”, with prob 1 − π

RANDOM SAMPLE of size n: n Bernoulli trials with P(“Success”) = π, P(“Failure”) = 1 − π, independent, with constant probability (π) per trial.

Discrete random variable X = # “Successes” in sample (0, 1, 2, 3, …, n)

Then X is said to follow a Binomial distribution, written X ~ Bin(n, π), with “probability function”

$f(x) = \binom{n}{x} \pi^{x} (1 - \pi)^{n - x}$, x = 0, 1, 2, …, n.
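The general probability function translates directly into a small helper. This is a minimal sketch; the function name `binom_pmf` and its argument names are illustrative choices, not notation from the slides.

```python
import math


def binom_pmf(x, n, prob_success):
    """f(x) = C(n, x) * p^x * (1 - p)^(n - x) for X ~ Bin(n, p)."""
    if not 0 <= x <= n:
        return 0.0                     # outside the support 0, 1, ..., n
    return math.comb(n, x) * prob_success ** x * (1 - prob_success) ** (n - x)


# The Male/Female example: X ~ Bin(100, 0.4).
p_25 = binom_pmf(25, 100, 0.4)
total = sum(binom_pmf(x, 100, 0.4) for x in range(101))   # should be 1
print(p_25, total)
```

Summing f(x) over x = 0, 1, …, n is a useful sanity check, since the probabilities must total 1.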

Example: Blood Type probabilities, revisited

               Rh Factor
Blood Type     +       –      Total
O            .384    .077    .461
A            .323    .065    .388
B            .094    .017    .111
AB           .032    .007    .039
Total        .833    .166    .999

Suppose n = 10 individuals are to be selected at random from the population.

Probability table for X = #(Type O)

Binomial model applies?

Check:

1. Independent outcomes? Reasonably assume that outcomes “Type O” vs. “Not Type O” between two individuals are independent of each other.

2. Constant probability π? From table, π = P(Type O) = .461 throughout population.

Example: Blood Type probabilities, revisited

Suppose n = 10 individuals are to be selected at random from the population.

Binomial model applies: X ~ Bin(10, .461), with

$f(x) = \binom{10}{x} (.461)^{x} (.539)^{10 - x}$

Probability table for X = #(Type O)

x     f(x)                                               F(x)
0     $\binom{10}{0} (.461)^{0} (.539)^{10}$ = 0.00207     0.00207
1     $\binom{10}{1} (.461)^{1} (.539)^{9}$ = 0.01770      0.01977
2     $\binom{10}{2} (.461)^{2} (.539)^{8}$ = 0.06813      0.08790
3     $\binom{10}{3} (.461)^{3} (.539)^{7}$ = 0.15538      0.24328
4     $\binom{10}{4} (.461)^{4} (.539)^{6}$ = 0.23257      0.47585
5     $\binom{10}{5} (.461)^{5} (.539)^{5}$ = 0.23870      0.71455
6     $\binom{10}{6} (.461)^{6} (.539)^{4}$ = 0.17013      0.88468
7     $\binom{10}{7} (.461)^{7} (.539)^{3}$ = 0.08315      0.96783
8     $\binom{10}{8} (.461)^{8} (.539)^{2}$ = 0.02667      0.99450
9     $\binom{10}{9} (.461)^{9} (.539)^{1}$ = 0.00507      0.99957
10    $\binom{10}{10} (.461)^{10} (.539)^{0}$ = 0.00043    1.00000
Total 1

R: dbinom(0:10, 10, .461)

Also, can show

mean $\mu = \sum x \, f(x) = n\pi$ = (10)(.461) = 4.61

and variance $\sigma^2 = \sum (x - \mu)^2 f(x) = n\pi(1 - \pi)$ = (10)(.461)(.539) = 2.48
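For readers without R at hand, the same table can be regenerated with a few lines of standard-library Python. This is a sketch alongside the slide's `dbinom(0:10, 10, .461)` call, not a replacement for it; the variable names are illustrative.

```python
import math

n, prob_type_o = 10, 0.461      # X ~ Bin(10, .461), from the blood type table

pmf = [math.comb(n, x) * prob_type_o ** x * (1 - prob_type_o) ** (n - x)
       for x in range(n + 1)]

cdf, running = [], 0.0
for p in pmf:
    running += p                # F(x) accumulates f(0) + ... + f(x)
    cdf.append(running)

mean = sum(x * p for x, p in enumerate(pmf))
variance = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))

for x in range(n + 1):
    print(f"{x:2d}  {pmf[x]:.5f}  {cdf[x]:.5f}")
print(f"mean = {mean:.2f}, variance = {variance:.2f}")
```

The sums for the mean and variance should agree with the shortcut formulas nπ and nπ(1 − π) quoted on the slide.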


Example: Blood Type probabilities, revisited

Suppose n = 1500 individuals are to be selected at random from the population.

Probability table for X = #(Type AB–)

From the table, π = P(Type AB–) = .007. RARE EVENT!

Binomial model applies: X ~ Bin(1500, .007). Therefore,

$f(x) = \binom{1500}{x} (.007)^{x} (.993)^{1500 - x}$, x = 0, 1, 2, …, 1500.

Also, can show

mean $\mu = \sum x \, f(x) = n\pi$ = 10.5

and variance $\sigma^2 = \sum (x - \mu)^2 f(x) = n\pi(1 - \pi)$ = 10.43

3.6 - The Poisson Probability Distribution

X ~ Bin(1500, .007) for X = #(Type AB–): RARE EVENT, with mean $\mu = n\pi$ = 10.5 and variance $\sigma^2 = n\pi(1 - \pi)$ = 10.43.

Is there a better alternative?

Poisson distribution

$f(x) = \dfrac{e^{-\mu} \mu^{x}}{x!}$, x = 0, 1, 2, …,

where mean and variance are $\mu = n\pi$ and $\sigma^2 = n\pi$.

Here μ = 10.5 and σ² = 10.5, so X ~ Poisson(10.5).

Notation: Sometimes the symbol λ (“lambda”) is used instead of μ (“mu”).

Example: Blood Type probabilities, revisited

Suppose n = 1500 individuals are to be selected at random from the population.

Probability table for X = #(Type AB–). RARE EVENT!

Binomial: X ~ Bin(1500, .007), with

$f(x) = \binom{1500}{x} (.007)^{x} (.993)^{1500 - x}$, x = 0, 1, 2, …, 1500.

Poisson: X ~ Poisson(10.5), with μ = 10.5 and σ² = 10.5, so

$f(x) = \dfrac{e^{-10.5} (10.5)^{x}}{x!}$, x = 0, 1, 2, …

Ex: Probability of exactly X = 15 Type(AB–) individuals = ?

Binomial: $\binom{1500}{15} (.007)^{15} (.993)^{1485}$

Poisson: $\dfrac{e^{-10.5} (10.5)^{15}}{15!}$

(both ≈ .0437)
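The claim that both models give roughly .0437 at x = 15 is easy to check numerically. The sketch below evaluates the two formulas side by side with the standard library; variable names are illustrative.

```python
import math

n, prob_ab_neg, x = 1500, 0.007, 15
mu = n * prob_ab_neg                       # 10.5, mean of the matching Poisson

# Binomial: C(1500, 15) (.007)^15 (.993)^1485
p_binomial = math.comb(n, x) * prob_ab_neg ** x * (1 - prob_ab_neg) ** (n - x)

# Poisson: e^(-10.5) (10.5)^15 / 15!
p_poisson = math.exp(-mu) * mu ** x / math.factorial(x)

print(round(p_binomial, 4), round(p_poisson, 4))
```

The Poisson form needs only the single parameter μ, which is what makes it attractive when n is large and the per-trial probability is small.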

Example: Deaths in Wisconsin

Assuming deaths among young adults are relatively rare, we know the following:

• Average 584 deaths per year, so λ = 584.

• Mortality rate (α) seems constant.

Therefore, the Poisson distribution can be used as a good model to make future predictions about the random variable X = “# deaths” per year, for this population (15-24 yrs)… assuming current values will still apply.

Probability of exactly X = 600 deaths next year:

$P(X = 600) = \dfrac{e^{-584} (584)^{600}}{600!} \approx 0.0131$

R: dpois(600, 584)

Probability of exactly X = 1200 deaths in the next two years:

Mean of 584 deaths per yr ⇒ mean of 1168 deaths per two yrs, so let λ = 1168:

$P(X = 1200) = \dfrac{e^{-1168} (1168)^{1200}}{1200!} \approx 0.00746$

Probability of at least one death per day:

$\lambda = \dfrac{584 \text{ deaths / yr}}{365 \text{ days / yr}}$ = 1.6 deaths/day

P(X ≥ 1) = P(X = 1) + P(X = 2) + P(X = 3) + … True, but not practical.

$P(X \ge 1) = 1 - P(X = 0) = 1 - \dfrac{e^{-1.6} (1.6)^{0}}{0!} = 1 - e^{-1.6} = 0.798$
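Evaluating these Poisson probabilities directly runs into a practical snag: terms like 584^600 and 600! are far too large for ordinary floating point. One common workaround, sketched here as an assumption rather than the slides' method, is to work on the log scale with `math.lgamma`, using ln(x!) = lgamma(x + 1). The function name `poisson_pmf` is illustrative.

```python
import math


def poisson_pmf(x, lam):
    """f(x) = e^(-lam) lam^x / x!, computed on the log scale to avoid overflow."""
    log_f = -lam + x * math.log(lam) - math.lgamma(x + 1)   # ln f(x)
    return math.exp(log_f)


p_600 = poisson_pmf(600, 584)          # exactly 600 deaths next year
p_1200 = poisson_pmf(1200, 1168)       # exactly 1200 deaths in two years

lam_per_day = 584 / 365                # 1.6 deaths/day
p_at_least_one = 1 - poisson_pmf(0, lam_per_day)   # complement of "no deaths"

print(round(p_600, 4), round(p_1200, 5), round(p_at_least_one, 3))
```

The first value corresponds to the slide's `dpois(600, 584)` in R.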

Classical Discrete Probability Distributions

● Binomial ~ X = # Successes in n trials, P(Success) = π

● Poisson ~ As above, but n large, π small, i.e., Success RARE

● Negative Binomial ~ X = # trials for k Successes, P(Success) = π

● Geometric ~ As above, but specialized to k = 1

● Hypergeometric ~ As Binomial, but π changes between trials

● Multinomial ~ As Binomial, but for multiple categories, with π1 + π2 + … + πlast = 1 and x1 + x2 + … + xlast = n