5.1 random variable and probability distribution · 5.1 random variable and probability ... of a...

22

Upload: trankhanh

Post on 03-Jul-2018

283 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table
Page 2: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

5.1 Random Variable and Probability Distribution

Definition 1 A random variable is a variable whose value is determined by the outcome of a

random experiment.

Definition 2 A random variable that assumes countable values is called a Discrete Random

Variable.

Some examples of discrete random variables are

1. The number of cars sold at a dealership during a given month

2. The number of people coming to a theater on a certain day

3. The number of complaints received at the office of an airline on a given day

4. The number of customers visiting a bank during any given hour

Definition 3 A random variable that can assume any value contained in one or more intervals is

called a Continuous Random Variable.

The following are a few examples of continuous random variables.

1. The height of a person

2. The time taken to complete an examination

3. The amount of milk in a gallon (note that we do not expect a gallon to contain exactly one

gallon of milk but either slightly more or slightly less than a gallon)

2

Page 3: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

Example 1 Suppose the following gives the frequency and relative frequency distributions of the

vehicles owned by all 2000 families living in a small town.

Number of Vehicles Owned Frequency Relative Frequency

0 30 30/2000 = 0.015

1 470 470/2000 = 0.235

2 850 850/2000 = 0.425

3 490 490/2000 = 0.245

4 160 160/2000 = 0.080

N = 2000 Sum = 1.000Table 1

Suppose one family is randomly selected from this population. The act of randomly selecting a

family is called a random or chance experiment. Let X denote the number of vehicles owned

by the selected family. Then X can assume any of the five possible values (0, 1, 2, 3, and 4) listed

in the first column of Table 1. The value assumed by X depends on which family is selected. Thus,

this value depends on the outcome of a random experiment.

Definition 4 The probability function or probability distribution P (X) of a discrete

random variable can be represented by a formula, a table or a graph, which provides the proba-

bilities P (x) or P (X = x) or f (x) corresponding to each and every value of x.

Example 2 Recall the frequency and relative frequency distributions of the number of vehicles

owned by families given in Table 1. Let X be the number of vehicles owned by a randomly se-

lected family. Write the probability distribution of X.

Number of Relative

Vehicles Owned Frequency Frequency

0 30 0.015

1 470 0.235

2 850 0.425

3 490 0.245

4 160 0.080

N = 2000 Sum = 1.0Table 2

Number of Vehicles Owned Probability

x P (x) = P (X = x)

0 0.015

1 0.235

2 0.425

3 0.245

4 0.080∑

P (x) = 1.0Table 3

3

Page 4: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

We have learned that the relative frequencies obtained from an experiment or a sample can

be used as approximate probabilities. However, when the relative frequencies are known for the

population, they give the actual (theoretical) probabilities of outcomes.

Using the relative frequencies of Table 2, we can write the probability distribution of the discrete

random variable X in Table 3.

From Table 3, the probability for any value of X can be read. For example, the probability that

a randomly selected family from this town owns two vehicles is 0.425. This probability is written

as

P (X = 2) = 0.425

The probability that the selected family owns more than two vehicles is given by the sum of the

probabilities of three and four vehicles, respectively. This probability is 0.245+0.080 = 0.325. This

probability is written as

P (X > 2) = P (X = 3) + P (X = 4) = 0.245 + 0.080 = 0.325

4

Page 5: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

• Three Characteristics of a Probability Distribution

The probability distribution of a discrete random variable possesses the following there charac-

teristics.

1. 0 ≤ P (x) ≤ 1 for each value of x

2.∑

P (x) = 1

3. P (x) = P (X = x)

Example 3 The following table lists the probability distribution of the number of breakdowns per

week for a machine based on past data.

Breakdowns per week 0 1 2 3

Probability 0.15 0.20 0.35 0.30

Find the probability that the number of breakdowns for this machine during a given week is (a)

exactly two (b) zero to two (c) more than one (d) at most one.

Solution: Let X denote the number of breakdowns for this machine during a given week. We

can calculate the required probabilities as follows.

(a) The probability of exactly two breakdowns is

P (exactly 2 breakdowns) = P (X = 2) = 0.35

(b) The probability of zero to two breakdowns is given by the sum of the probabilities of 0, 1,

and 2 breakdowns.

P (0 to 2 breakdowns) = P (X = 0) + P (X = 1) + P (X = 2) = 0.15 + 0.20 + 0.35 = 0.70

(c) The probability of more than one breakdown is obtained by adding the probabilities of 2

and 3 breakdowns.

P (more than 1 breakdown) = P (X > 1) = P (X = 2) + P (X = 3) = 0.35 + 0.30 = 0.65

(d) The probability of at most one breakdown is given by the sum of the probabilities of 0 and

1 breakdown.

P (at most 1 breakdown) = P (X ≤ 1) = P (X = 0) + P (X = 1) = 0.15 + 0.20 = 0.35

5

Page 6: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

5.2 Mean and Variance of a Random Variable

Definition 5 Let X be a discrete random variable with probability distribution P (X). The mean

or expected value is

µ = E (X) =∑

xP (x)

Example 4 Recall Example 3. Find the mean number of breakdowns per week for this machine.

Solution: To find the mean number of breakdowns per week for this machine, we multiply each

value of x by its probability and add these products. This sum gives the mean of the probability

distribution of X.

x P (x) xP (x)

0 0.15 0(0.15) = 0.00

1 0.20 1(0.20) = 0.20

2 0.35 2(0.35) = 0.70

3 0.30 3(0.30) = 0.90∑

xP (x) = 1.80

The mean is µ = E (X) =∑

xP (x) = 1.80. Thus, on average, this machine is expected to break

down 1.80 times per week over a period of time. In other words, if this machine is used for many

weeks, then for certain weeks we will observe no breakdowns; for some other weeks we will observe

1 breakdown per week; and for still other weeks we will observe 2 or 3 breakdowns per week. The

mean number of breakdowns is expected to be 1.80 per week for the entire period.

Example 5 A fair die is tossed. If 2, 3 or 5 occur the player wins that number of dollars, but if 1,

4, or 6 occurs, the player loses that number of dollars. The possible payoffs for the player and their

respective probabilities follow:

x 2 3 5 −1 −4 −6

P (x) 1

6

1

6

1

6

1

6

1

6

1

6

Solution: The negative numbers −1, −4, −6 refer to the fact that the player loses when 1, 4, or

6 occurs. Then the expected value of the game is as follows:

E (X) =∑

xP (x) = 2 ×

(

1

6

)

+ 3 ×

(

1

6

)

+ 5 ×

(

1

6

)

− 1 ×

(

1

6

)

− 4 ×

(

1

6

)

− 6 ×

(

1

6

)

= −1

6

Thus, the game is unfavorable to the player since the expected value is negative.

6

Page 7: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

Definition 6 Let X be a discrete random variable with probability distribution P (X) and mean µ.

The variance of X is

σ2 = V ar (X) =∑

(x − µ)2 P (x)

• Alternative formula

σ2 = V ar (X) =∑

x2P (x) − µ2

Example 6 The following table gives the probability distribution of a random variable X

x 0 1 2 3

P (x) 0.2 0.1 0.3 0.4

Find the variance of the random variable X.

Solution: By definition,

x P (x) xP (x) (x − µ) (x − µ)2 (x − µ)2 P (x)

0 0.2 0(0.2) = 0 0 − 1.9 = −1.9 3.61 0.722

1 0.1 1(0.1) = 0.1 1 − 1.9 = −0.9 0.81 0.081

2 0.3 2(0.3) = 0.6 2 − 1.9 = 0.1 0.01 0.003

3 0.4 3(0.4) = 1.2 3 − 1.9 = 1.1 1.21 0.484

µ =∑

xP (x) = 1.90 σ2 =∑

(x − µ)2 P (x) = 1.29

OR by alternative formula

x P (x) xP (x) x x2P (x)

0 0.2 0(0.2) = 0 0 0

1 0.1 1(0.1) = 0.1 1 0.1

2 0.3 2(0.3) = 0.6 4 1.2

3 0.4 3(0.4) = 1.2 9 3.6

µ =∑

xP (x) = 1.90∑

x2P (x) = 4.9

The variance is

σ2 =∑

x2P (x) − µ2 = 4.9 − (1.9)2 = 1.29.

5.3 The Binomial Probability Distribution

The binomial probability distribution is one of the most widely used discrete probability distri-

butions. To apply the binomial probability distribution, the random variable x must be a discrete

7

Page 8: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

random variable. In other words, the variable must be a discrete random variable and each repe-

tition of the experiment must result in one of two possible outcomes. The binomial distribution is

applied to experiments that satisfy the four conditions of a binomial experiment. Each repetition

of a binomial experiment is called a trial or a Bernoulli trial (after Jacob Bernoulli).

Definition 7 An experiment that satisfies the following four conditions is called a binomial ex-

periment.

1. There are n identical trials. In other words, the given experiment is repeated n times. All

these repetitions are performed under identical conditions.

2. Each trial has two and only two outcomes. These outcomes are usually called a success and

a failure.

3. The probability of success is denoted by p and that of failure by 1 − p. The probabilities p

and 1 − p remain constant for each trial.

4. The trials are independent. In other words, the outcome of one trial does not affect the

outcome of another trial.

Example 7 Five percent of all VCRs manufactured by a large electronics company are defective.

Three VCRs are randomly selected from the production line of this company. The selected VCRs are

inspected to determine if each of them is defective or good. Is this experiment a binomial experiment?

Solution:

1. This example consists of three identical trials. A trial represents the selection of a VCR.

2. Each trial has two outcomes: A VCR is defective or a VCR is good. Let a defective VCR be

called a success and a good VCR be called a failure.

3. Five percent of all VCRs are defective. So, the probability p that a VCR is defective is 0.05.

As a result, the probability 1− p that a VCR is good is 0.95. These two probabilities add up

to 1.

4. Each trial (VCR) is independent. In other words, if one VCR is defective it does not affect the

outcome of another VCR being defective or good. This is so because the size of the population

is very large as compared to the sample size.

8

Page 9: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

Since all four conditions of a binomial experiment are satisfied, this is an example of a binomial

experiment.

Definition 8 The Binomial Probability Distribution – The random variable X that repre-

sents the number of successes in n trials for a binomial experiment is called a binomial random

variable. The probability distribution of X in such experiments is called the binomial probability

distribution or simply binomial distribution.

Theorem 1 (Binomial Formula) For a binomial experiment, the probability of exactly x suc-

cesses in n trials is given by the binomial formula

P (x) = P (X = x) =

(

n

x

)

px (1 − p)n−x

where x = 0, 1, 2, . . . , n and

n = total number of trials

p = probability of success

x = number of successes in n trials

n − x = number of failures in n trials(

n

x

)

=n!

x! (n − x)!and n! = n × (n − 1) × · · · × 2 × 1 and 0! = 1

• Notation – Usually, we use X ∼ B (n, p) to represent that X follows a binomial distribution

with parameters n and p.

9

Page 10: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

Use scientific calculator to find

(

n

x

)

= n SHIFT

nCr

x =

Method 1:

(

6

4

)

= 6 SHIFT

nCr

4 = = 15

Method 2:

(

6

4

)

= 6 SHIFT

x!

÷ 4 SHIFT

x!

÷ 2 SHIFT

x!

= = 15

Method 3:

(

6

4

)

= 6 x! ÷ 4 x!

÷ 2 x! = = 15

Use scientific calculator to find pn = p SHIFT

xy

n =

Method 1: (0.3)5 = 0 · 3 SHIFT

xy

5 = = 0.00243

Method 2: (0.3)5 = 0 · 3 xy 5 = = 0.00243

Example 8 The answer to a true/false question is either correct or incorrect. Assume that (i) an

examination consists of four true/false questions and (ii) a student has no knowledge of the subject

matter. The chance (probability) that the student will guess the correct answer to the first question

is 0.50. Likewise, the probability of guessing each of the remaining questions correctly is 0.50. What

is the probability of:

(a) Getting exactly none of four correct?

10

Page 11: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

(b) Getting exactly one of four correct?

(c) Getting exactly two of four correct?

Solution: Note that the binomial conditions are met:

(1) There is a fixed number of trials (i.e. n = 4)

(2) There are only two possible outcomes (i.e. correct/incorrect)

(3) There is a constant probability of success (i.e. p = 0.5)

(4) The trials are independent

(a) The probability of guessing none of the four correctly is 0.0625, found by applying Binomial

formula:

P (X = 0) =

(

4

0

)

(0.5)0 (1 − 0.5)4−0

=4!

0! (4 − 0)!× 1 × 0.54

= 0.0625

Q1 Q2 Q3 Q4 Probability

Case 1 × × × × 0.5 × 0.5 × 0.5 × 0.5

(

4

0

)

= 1

(

4

0

)

= 4 SHIFT

nCr

0 = = 1

(0.5)0 = 0.5 SHIFT

xy

0 = = 1

(0.5)4 = 0.5 SHIFT

xy

4 = = 0.0625

(b) The probability of getting exactly one of four correct is 0.2500, found by:

P (X = 1) =

(

4

1

)

(0.5)1 (1 − 0.5)4−1

=4!

1! (4 − 1)!× 0.5 × 0.53

= 0.25

Q1 Q2 Q3 Q4 Probability

Case 1 X × × × 0.5 × 0.5 × 0.5 × 0.5

Case 2 × X × × 0.5 × 0.5 × 0.5 × 0.5

Case 3 × × X × 0.5 × 0.5 × 0.5 × 0.5

Case 4 × × × X 0.5 × 0.5 × 0.5 × 0.5

(

4

1

)

= 4

11

Page 12: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

(

4

1

)

= 4 SHIFT

nCr

1 = = 4

(0.5)1 = 0.5 SHIFT

xy

1 = = 0.5

(0.5)3 = 0.5 SHIFT

xy

3 = = 0.125

(c) The probability of getting exactly two of four correct is 0.3750, found by:

P (X = 2) =

(

4

2

)

(0.5)2 (1 − 0.5)4−2

=4!

2! (4 − 2)!× 0.52

× 0.52

= 0.375

Q1 Q2 Q3 Q4 Probability

Case 1 X X × × 0.5 × 0.5 × 0.5 × 0.5

Case 2 X × X × 0.5 × 0.5 × 0.5 × 0.5

Case 3 X × × X 0.5 × 0.5 × 0.5 × 0.5

Case 4 × X X × 0.5 × 0.5 × 0.5 × 0.5

Case 5 × X × X 0.5 × 0.5 × 0.5 × 0.5

Case 6 × × X X 0.5 × 0.5 × 0.5 × 0.5

(

4

2

)

= 6

(

4

2

)

= 4 SHIFT

nCr

2 = = 6

(0.5)2 = 0.5 SHIFT

xy

2 = = 0.25

(0.5)2 = 0.5 SHIFT

xy

2 = = 0.25

Example 9 At the Express House Delivery Service, providing high-quality to its customers is the

top priority of the management. The company guarantees a refund all charges if a package it is

delivering does not arrive at its destination by the specified time. It is known from past data that

despite all efforts, 2% of the packages mailed through company do not arrive at their destinations

within the specified time. A corporation mailed 10 packages through Express House Delivery Service

on Monday.

(a) Find the probability that exactly two of these 10 packages will not arrive at its destination

within the specified time.

12

Page 13: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

(b) Find the probability that at most one of these 10 packages will not arrive at its destination

within the specified time.

(c) Find the probability that at least two of these 10 packages will not arrive at its destination

within the specified time.

Solution: Let us call it a success if a package does not arrive at its destination within the

specified time and a failure if it does arrive within the specified time. Then,

n = total number of packages mailed = 10,

p = P (success) = 0.02,

1 − p = P (failure) = 1 − 0.02 = 0.98

(a) For this part,

x = number of successes = 2, n − x = number of failures = 10 − 2 = 8

Substituting all values in the binomial formula, we obtain

P (X = 2) =

(

10

2

)

(0.02)2(1 − 0.02)10−2

=10!

2!8!(0.02)2(0.98)8

= (45)(0.0004)(0.8508) = 0.0153

Thus, there is a 0.0153 probability that exactly two of the 10 packages mailed will not arrive at its

destination within the specified time.

(

10

2

)

= 10 SHIFT

nCr

2 = = 45

(0.02)2 = 0.02 SHIFT

xy

2 = = 0.0004

(0.98)8 = 0.98 SHIFT

xy

8 = = 0.8508

13

Page 14: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

(b) The probability that at most one of the 10 packages will not arrive at its destination within

the specified time is given by the sum of the probabilities of X = 0 and X = 1. Thus,

P (X ≤ 1) = P (X = 0) + P (X = 1)

=

(

10

0

)

(0.02)0(0.98)10 +

(

10

1

)

(0.02)1(0.98)9

= (1)(1)(0.8171) + (10)(0.02)(0.8337)

= 0.8171 + 0.1667 = 0.9838

Thus, the probability that at most one of the 10 packages will not arrive at its destination within

the specified time is 0.9838.

(c) The probability that at least two of the 10 packages will not arrive at its destination within

the specified time is given by the sum of the probabilities of X = 2, X = 3, X = 4, . . . , X = 9 and

X = 10. Thus,

P (X ≥ 2) = 1 − P (X = 0) − P (X = 1)

= 1 −

(

10

0

)

(0.02)0(0.98)10−

(

10

1

)

(0.02)1(0.98)9

= 1 − 0.8171 − 0.1667

= 0.0162

Thus, the probability that at least two of the 10 packages will not arrive at its destination within

the specified time is 0.0162.

Theorem 2 The mean and variance of the binomial distribution are

µ = E (X) = np and σ2 = V ar (X) = np (1 − p)

5.4 The Poisson Probability Distribution

The Poisson probability distribution is applied to experiments with random and independent

occurrences. The occurrences are random in the sense that they do not follow any pattern and,

hence, they are unpredictable. Independence of occurrences means that one occurrence (or nonoc-

currence) of an event does not influence the successive occurrences or nonoccurrence of that event.

The occurrences are always considered with respect to an interval. The interval may be a time

interval, a space interval, or a volume interval. The actual number of occurrences within an interval

14

Page 15: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

is random and independent. If the average number of occurrences for a given interval is known, then

by using the Poisson probability distribution we can compute the probability of a certain number

of occurrences X in that interval.

• Conditions to apply Poisson probability distribution

The following three conditions must be satisfied to apply the Poisson probability distribution.

1. The experiment consists of counting the number, X, of times a particular event occurs during

a given unit of time, or a given area or volume (or weight, or distance, or any other unit of

measurement)

2. The probability that an event occurs in a given unit of time, area, or volume is the same for

all the units.

3. The number of events that occur in one unit of time, area, or volume is independent of the

number that occur in other units.

The following are a few examples of discrete random variables for which the occurrences are

random and independent. Hence, these are examples to which the Poisson probability distribution

can be applied.

1. Consider the number of patients arriving at the emergency ward of a hospital during a one-

hour interval. In this example, an occurrence is the arrival of a patient at the emergency

ward, the interval is one hour (an interval of time), and the occurrences are random.

The total number of patients who may arrive at this emergency ward during a one-hour

interval may be 0, 1, 2, 3, 4, .... The independence of occurrences in this example means that

patients arrive individually and the arrival of any two (or more) patients is not related.

2. Consider the number of defective items in the next 10,000 items manufactured on a machine.

In this case, the interval is a volume interval (10,000 items). The occurrences (number of

defective items) are random because there may be 0, 1, 2, 3, ..., 10000 defective items in 10,000

items. We can assume the occurrence of defective items to be independent of one another.

3. Consider the number of defects in a 5-foot-long iron rod. The interval, in this example, is

a space interval (5 feet). The occurrences (defects) are random because there may be any

15

Page 16: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

number of defects in a 5-foot iron rod. We can assume that these defects are independent of

one another.

In the Poisson probability distribution terminology, the average number or occurrences in an

interval is denoted by µ. The actual number of occurrences in that interval is denoted by X.

Then, using the Poisson probability distribution, we find the probability of X occurrences during

an interval given the mean occurrences are µ during that interval.

Theorem 3 (Poisson Probability Distribution Formula) According to the Poisson probabil-

ity distribution, the probability of x occurrences in an interval is

P (x) = P (X = x) =µxe−µ

x!

where µ is the mean number of occurrences in that interval and x = 0, 1, 2, 3, . . ..

Example 10 A washing machine in a laundry breaks down an average of three times per month.

Using the Poisson probability distribution formula, find the probability that during the next month

this machine will have (a) exactly two breakdowns (b) at most one breakdown.

Solution: Let µ be the mean number of breakdowns per month and X be the actual number of

breakdowns observed during the next month for this machine. Then, µ = 3.

(a) The probability that exactly two breakdowns will be observed during the next month is

P (X = 2) =µxe−µ

x!=

(3)2 e−3

2!=

(9)(0.049787)

2= 0.224

(3)2 = 3 SHIFT

xy

2 = = 9

e−3 = 3 +/− SHIFT

ex

= 0.049787

2! = 2 SHIFT

x!

= 2

16

Page 17: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

(b) The probability that at most one breakdown will be observed during the next month is given

by the sum of the probabilities of zero and one breakdown. Thus,

P (at most 1 breakdown) = P (X ≤ 1)

= P (X = 0) + P (X = 1)

=(3)0 e−3

0!+

(3)1 e−3

1!

=(1)(0.049787)

1+

(3)(0.049787)

1

= 0.0498 + 0.1494

= 0.1992

Example 11 A study of the lines at the checkout registers of Wellcome Supermarket revealed that,

during a certain period at the rush hour, the number of customers waiting averaged four. What is

the probability that during that period:

(a) No customers were waiting?

(b) Three customers were waiting?

(c) Three or fewer were waiting?

(d) Two or more were waiting?

Solution: Let X be the number customers who were waiting. Note that µ = 4.

(a)

P (X = 0) =40e−4

0!= 0.0183

(b)

P (X = 3) =43e−4

3!= 0.1954

(c)

P (X ≤ 3) = P (X = 0) + P (X = 1) + P (X = 2) + P (X = 3)

=40e−4

0!+

41e−4

1!+

42e−4

2!+

43e−4

3!

= 0.0183 + 0.0733 + 0.1465 + 0.1954

= 0.4335

17

Page 18: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

(d)

P (X ≥ 2) = 1 − P (X < 2)

= 1 − P (X = 0) − P (X = 1)

= 1 −40e−4

0!−

41e−4

1!

= 1 − 0.0183 − 0.0733

= 0.9084

Theorem 4 The mean and variance of the Poisson distribution are

µ = E (X) = µ and σ2 = V ar (X) = µ

5.5 Poisson Approximation to the Binomial Distribution

The computation of probabilities for a binomial distribution could be tedious if the number of trials

n is large. In the sub-section, we note that the Poisson distribution can be used to approimate the

binomial probabilities when the number of trials n is large and at the same time the probability p

is small (generally such that µ = np ≤ 7).

Proposition 1 Let X be the number of success resulting from n independent trials, each with

probability of success p. The distribution of the number of success X is binomial, with mean np. If

the number of trials n is large and np is of only moderate size (perferably np ≤ 7), this distribution

can be approximated by the Poisson distribution with µ = np. The probability of the approximating

distribution is then:

P (X = x) =e−np(np)x

x!for x = 0, 1, 2, 3, . . . (1)

Example 12 An analyst predicted that 3.5% of small corporations would file for bankruptcy in the

coming year. For a random sample of 100 small corporations, estimate the probability that at least

three will file for bankrupty in the next year, assuming that the analyst’s prediction is correct.

The distribution of the number X of of filinfs for bankrupty is binomial with n = 100 and

p = 0.035, so that the mean of the distribution is µ = np = 3.5. Using the Poisson distribution to

18

Page 19: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

approximate the probability of at least three bankruptcies, we find

P (X ≥ 3) = 1 − P (X ≤ 2)

= 1 − (e−3.5(3.5)0

0!+

e−3.5(3.5)1

1!+

e−3.5(3.5)2

2!)

= 1 − (0.03 + 0.106 + 0.185)

= 0.679.

5.6 Hypergeometric Probability Distribution

One of the applications of the binomial probability distribution is its use as the probability distribu-

tion for the number X of favorable (or unfavorable) responses in a public opinion or market survey.

Ideally, it is applicable when sampling is with replacement—that is, each item drawn from the

population is observed and returned to the population before the next item is drawn. Practically

speaking, the sampling in surveys is rarely conducted with replacement. Nevertheless, the binomial

probability distribution is still appropriate if the number N of elements in the population is large

and the sample size n is small relative to N .

When sampling is without replacement, and the number of elements N in the population is

small (or when the sample size n is large relative to N), the number of “successes” in a random

sample of n items has a hypergeometric probability distribution. The defining characteristics and

probability for a hypergeometric random variable are stated in the boxes.

Characteristics That Define a Hypergeometric Random Variable

1. The experiment consists of randomly drawing n elements without replacement from a set of

N elements, r of which are “successes” and N − r of which are “failures.”

2. The hypergeometric random variable X is the number of r’s in the draw of n elements.

Theorem 5 (Hypergeometric Probability Distribution Formula) For a Hypergeometric Prob-

ability Distribution, the probability of exactly x successes in n trials is given by the Hypergeometric

Probability distribution formula

P (x) = P (X = x) =

(

r

x

)(

N − r

n − x

)

(

N

n

)

19

Page 20: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

where

N = Number of elements in the population

r = Number of “successes” in the population

n = Sample size

x = Number of “successes” in the sample

Assumptions:

1. The sample of n elements is randomly selected from the N elements of the population.

2. The value of X is restricted so that all factorials are nonnegative.

Example 13 Of six lecturers, four have been with the college five or more years. If three lecturers

are chosen randomly from the group of six, find the probability that exactly two will have five or

more years seniority.

Solution: For this problem, the number of elements in the population is N = 6, the number of

successes is r = 4, and the sample size is n = 3. The required probability is

P (X = 2) =

(

4

2

)(

6−4

3−2

)

(

6

3

)

=

(

4

2

)(

2

1

)

(

6

3

)

=(6) (2)

20

=12

20

=3

5

Example 14 Suppose you plan to select four of 10 new stock issues (stock issued by new companies)

and that, unknown to you, three of the 10 will result in substantial profits and seven will result in

losses.

(a) What is the probability that all three of the profitable issues will appear in your selection?

(b) What is the probability that at least two of the three profitable issues will appear in your

selection?

20

Page 21: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

Solution: For this problem, the number of elements in the population is N = 10, the number of

successes is r = 3, and the sample size is n = 4.

(a) We use the formula for the hypergeometric probability distribution to compute the proba-

bility of observing exactly X = 3 successes in the sample of n = 4.

P (X = 3) =

(

3

3

)(

10−3

4−3

)

(

10

4

)

=

(

3

3

)(

7

1

)

(

10

4

)

=1 × 7

210

=1

30

(b) The probability of observing at least two successes in the sample of n = 4 is

P (X ≥ 2) = P (X = 2) + P (X = 3)

=

(

3

2

)(

10−3

4−2

)

(

10

4

) +

(

3

3

)(

10−3

4−3

)

(

10

4

)

=

(

3

2

)(

7

2

)

(

10

4

) +

(

3

3

)(

7

1

)

(

10

4

)

=3 × 21

210+

1 × 7

210

=9

30+

1

30

=10

30

=1

3

Theorem 6 The mean and variance of the Hypergeometric Distribution are

µ = E (X) =nr

Nand σ2 = V ar (X) = n

( r

N

)

(

N − r

N

) (

N − n

N − 1

)

The expression (N − n) / (N − 1) is known as the finite population correction factor.

5.7 Negative Binomial Distribution

Suppose that independent trials, each having probability p, 0 < p < 1, of being a success are

performed until a total of r successes is accumulated. If we let X be the number of trials required,

21

Page 22: 5.1 Random Variable and Probability Distribution · 5.1 Random Variable and Probability ... of a discrete random variable can be represented by a ... Example 3 The following table

then

P (X = k) = Ck−1r−1 pr(1 − p)k−r k = r, r + 1, . . . (2)

The above equation follows because, in order for the r-th success to occur at the n-th trial, there

must be r-1 successes in the first n-1 trials, and the n-th trial must be a success. Any random

variable X whose probability mass function is given by (2) is said to be a negative binomial random

variable with parameters (r, p), written as X ∼ NegBin(r, p).

Theorem 7 The mean and variance of the Negative Binomial distribution with parameters r, p are

equal tor

pand

r(1 − p)

p2respectively.

Example 15 If independent trials, each resulting in a success with probability p, are performed,

what is the probability of r successes occurring before m failures?

The solution will be arrived at by noting that r successes will occur before m failures if and

only if the r-th success occurs no later than the r + m − 1 trial. This follows because if the r-th

success occurs before or at the r + m − 1 trial, then it must have occurred before the m-th failure,

and conversely. Hence, from equation (2), the desired probability is

r+m−1∑

k=r

Ck−1r−1 pr(1 − p)k−r.

Example 16 A company takes out an insurance policy to cover accidents that occur at its manu-

facturing plant. The probability that one or more accidents will occur during any given month is3

5.

The number of accidents that occur in any given month is independent of the number of accidents

that occur in all other months. Calculate the probability that there will be at least four months in

which no accidents occur before the fourth month in which at least one accident occurs.

If a month with one or more accidents is regarded as success and X be the number of failures be-

fore the fourth success, then X follows a negative binomial distribution of the requested probability

is

P (X ≥ 4) = 1 − P (X ≤ 3)

= 1 −

3∑

k=0

C3+kk

(

3

5

)4 (

2

5

)k

= 1 −

(

3

5

)4 (

1 +8

5+

8

5+

32

25

)

= 0.2898.

22