continuous random variables

98
Continuous Random Variables

Upload: blaze

Post on 05-Jan-2016

29 views

Category:

Documents


1 download

DESCRIPTION

Continuous Random Variables. Consider the following table of sales, divided into intervals of 1000 units each,. and the relative frequency of each interval. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Continuous Random Variables

Continuous Random Variables

Page 2: Continuous Random Variables

Consider the following table of sales, divided into intervals of 1000 units each,

interval

(0,1000]

(1000,2000]

(2000,3000]

(3000,4000]

(4000,5000]

(5000,6000]

(6000,7000]

Page 3: Continuous Random Variables

and the relative frequency of each interval.

intervalrelative

freq.

(0,1000] 0

(1000,2000] 0.05

(2000,3000] 0.25

(3000,4000] 0.30

(4000,5000] 0.25

(5000,6000] 0.10

(6000,7000] 0.05

1.00

Page 4: Continuous Random Variables

We’re going to divide the relative frequencies by the width of the cells (which here is 1000).

This will make the graph have an area of 1.

intervalrelative

freq.

(0,1000] 0 0

(1000,2000] 0.05 0.00005

(2000,3000] 0.25 0.00025

(3000,4000] 0.30 0.00030

(4000,5000] 0.25 0.00025

(5000,6000] 0.10 0.00010

(6000,7000] 0.05 0.00005

widthcell

freq. relative f(x)

Page 5: Continuous Random Variables

Graph

interval

(0,1000] 0

(1000,2000] 0.00005

(2000,3000] 0.00025

(3000,4000] 0.00030

(4000,5000] 0.00025

(5000,6000] 0.00010

(6000,7000] 0.00005

widthcell

freq. relative f(x)

0 1000 2000 3000 4000 5000 6000 7000

sales

f(x) = p(x)

0.00030

0.00025

0.00020

0.00015

0.00010

0.00005

0

The area of each bar is the frequency of the category, so the total area is 1.

Page 6: Continuous Random Variables

Graph

interval

(0,1000] 0

(1000,2000] 0.00005

(2000,3000] 0.00025

(3000,4000] 0.00030

(4000,5000] 0.00025

(5000,6000] 0.00010

(6000,7000] 0.00005

widthcell

freq. relative f(x)

0 1000 2000 3000 4000 5000 6000 7000

sales

f(x) = p(x)

0.00030

0.00025

0.00020

0.00015

0.00010

0.00005

0

Here is the frequency polygon.

Page 7: Continuous Random Variables

If we make the intervals 500 units instead of 1000, the graph would probably look something like this:

sales

f(x) = p(x)

The height of the bars increases and decreases more gradually.

Page 8: Continuous Random Variables

If we made the intervals infinitesimally small, the bars and the frequency polygon would become

smooth, looking something like this:

f(x) = p(x)

sales

This what the distribution of a continuous random variable looks like.

This curve is denoted f(x) or p(x) and is called the probability density function.

Page 9: Continuous Random Variables

pmf versus pdf

For a discrete random variable, we had a probability mass function (pmf).

The pmf looked like a bunch of spikes, and probabilities were represented by the heights of the spikes.

For a continuous random variable, we have a probability density function (pdf).

The pdf looks like a curve, and probabilities are represented by areas under the curve.

Page 10: Continuous Random Variables

Pr(a < X < b)

f(x) = p(x)

salesa b

Page 11: Continuous Random Variables

A continuous random variable has an infinite number of possible values & the probability

of any one particular value is zero.

Page 12: Continuous Random Variables

If X is a continuous random variable, which of the following probabilities is largest?

(Hint: This is a trick question.)

1. Pr(a < X < b)

2. Pr(a ≤ X < b)

3. Pr(a < X ≤ b)

4. Pr(a ≤ X ≤ b)

They’re all equal.

They differ only in whether they include the individual values a and b, and any one particular value has zero probability!

Page 13: Continuous Random Variables

Properties of probability density functions (pdfs)

1. f(x) ≥ 0 for values of xThis means that when we draw the pdf

curve, while it may be on the left side of the vertical axis (have negative values of x), it can not go below the horizontal axis, where f would be negative.

Pr( - ∞ < X < ∞) = 1The total area under the pdf curve, which

corresponds to the total probability, is 1.

Page 14: Continuous Random Variables

Example

f(x) = 2 if 1 ≤ x ≤ 1.5 and f(x) = 0 otherwise

0 1.0 1.5 x

f(x)

2.0

This function satisfies both the properties of pdfs.

First, it’s never negative.

Second, the total area under the curve is (1/2) (2) = 1.

Page 15: Continuous Random Variables

Cumulative Distribution Function for a Continuous Random Variable

F(x) = Pr(X ≤ x) = area under the f(x) curve up to where X=x.

Page 16: Continuous Random Variables

Rectangle Example: What is F(1.2)?

F(1.2) = Pr(X ≤ 1.2)= the area under the pdf up to where x is 1.2.

0 1.0 1.5 x

f(x)

2.0

Page 17: Continuous Random Variables

Rectangle Example: What is F(1.2)?

F(1.2) = Pr(X ≤ 1.2)= the area under the pdf up to where x is 1.2.

0 1.0 1.2 1.5 x

f(x)

2.0

= (0.2) (2.0)

= 0.4

Page 18: Continuous Random Variables

Continuous Uniform Distributions.

Distributions like our rectangle example are called uniform distributions.

Notice that there are both discrete uniform distributions that we discussed earlier and continuous uniform distributions.

Page 19: Continuous Random Variables

The continuous uniform distribution has the following form:

1( ) 1b a

b a

1

b a

1( ) if

( ) 0 otherwise

f x a x bb a

f x

Notice that the area of the rectangle will always be 1 because

area = length • width

( )f x

0 a b x

Page 20: Continuous Random Variables

Mean, variance, and standard deviation of the continuous uniform distribution

mean: 2

a b

variance:2( )2

12

b a

standard deviation:2( )

12

b a

Page 21: Continuous Random Variables

The most famous distribution is the Normal or Gaussian distribution.

Its probability density function (pdf) is

-for 2

1)(

2]/)[(2

1

xexfx

is the mean of the distribution, is the standard deviation, . 3.14 and ,718.2 e

It is sometimes denoted N (, 2), which means the normal distribution with a mean of and a variance of 2.

Page 22: Continuous Random Variables

If you have three normal distributions with the same standard deviation (same spread),

but different means (different averages), they would look like this:

1 2 3

Page 23: Continuous Random Variables

If you had the same mean but different standard deviations, it would look like this:

large standard deviation

Page 24: Continuous Random Variables

If you had the same mean but different standard deviations, it would look like this:

medium standard deviation

large standard deviation

Page 25: Continuous Random Variables

If you had the same mean but different standard deviations, it would look like this:

Keep in mind that the areas are all the same, since they all equal 1.

small standard deviation

middle standard deviation

largest standard deviation

Page 26: Continuous Random Variables

Recall that if a random variable has mean and standard deviation , then (X-)/ has mean 0 and standard deviation 1.

If X is normally distributed, then (X-)/ will be standard normal, N( normal with mean 0 and variance 1.

This theorem is extremely useful.It means that we don’t need to use the messy normal formula.We can standardize any normal distribution and look up probabilities in tables for the standard normal distribution.

Page 27: Continuous Random Variables

Using the standard normal table is not difficult, but it takes practice to get accustomed to it.

The table in your textbook gives probabilities that the standard normal (often called Z) is less than a particular number, that is Pr(Z ≤ a).

Some tables are set up differently, so you need to notice how a table is computed when you use it.

For example, a book we used previously gave probabilities that Z is between zero and a positive number, that is, Pr(0 ≤ Z ≤ a).

Given any setup, you can always calculate the probabilities that you need.

Page 28: Continuous Random Variables

Z table: You get the integer part & the 1st decimal from the left column & the second decimal from the top row.

0 2.63 Z

z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09

0.0

0.1

0.2

2.6 .9957

3.0

Example: Pr(Z ≤ 2.63) = 0.9957

0.9957

?

Page 29: Continuous Random Variables

Do not memorize a lot of rules. You just need to remember 2 easy facts.

1. The graph is symmetric about 0.2. The total area under the curve is 1.

0 Z

Page 30: Continuous Random Variables

Example

Pr(Z < 1.85)

0 1.85 Z

Page 31: Continuous Random Variables

Example

Pr(Z < 1.85) = 0.9678

0 1.85 Z

0.9678

Page 32: Continuous Random Variables

Example

Pr(0 < Z < 1.85)

0 1.85 Z

0.9678

Page 33: Continuous Random Variables

Example

Pr(0 < Z < 1.85) = 0.9678 – 0.5 = 0.4678

0 1.85 Z

0.46780.5

0.9678

Page 34: Continuous Random Variables

Example

Pr(Z > 1.85)

0 1.85 Z

?0.9678

Page 35: Continuous Random Variables

Example

Pr(Z > 1.85) = 1 - 0.9678 = 0.0322

0 1.85 Z

0.03220.9678

Page 36: Continuous Random Variables

Example

Pr(Z < -1.85)

There are two ways to do this.

-1.85 0 1.85 Z

0.96780.0322

Pr( Z < -1.85) = Pr(Z > 1.85) = 0.0322

Page 37: Continuous Random Variables

Example

The second way to determine Pr(Z < -1.85)

is directly from the negative part of the Z table.

-1.85 0 Z

Pr( Z < -1.85) = 0.0322

Page 38: Continuous Random Variables

Example

Pr(Z > -1.85)

-1.85 0 1.85 Z

Page 39: Continuous Random Variables

Example

Pr(Z > -1.85) = Pr(Z < 1.85) = 0.9678

-1.85 0 1.85 Z

Page 40: Continuous Random Variables

Example

Pr(-1< Z < 2)

-1.00 0 2.00 Z

Page 41: Continuous Random Variables

Example

Pr(-1< Z < 2) = 0.9772 - 0.1587 = 0.8185

-1.00 0 2.00 Z

0.1587 0.9772

Page 42: Continuous Random Variables

Example

Pr(1< Z < 2)

0 1 2 Z

?

Page 43: Continuous Random Variables

Example

Pr(1< Z < 2) = 0.9772 - 0.8413 = 0.1359

0 1 2 Z

0.8413 0.9772

Page 44: Continuous Random Variables

Example

0 7 Z

Pr( Z < 7)

Page 45: Continuous Random Variables

Example

0 7 Z

Pr( Z < 7) = 1.0000 (to 4 decimal places)

Page 46: Continuous Random Variables

Example

0 7 Z

Pr( Z > 7)

Page 47: Continuous Random Variables

Example

0 7 Z

Pr( Z > 7) = 0.0000 (to 4 decimal places)

Page 48: Continuous Random Variables

Example

0 7 Z

Pr(0 < Z < 7)

Page 49: Continuous Random Variables

Example

0 7 Z

Pr(0 < Z < 7) = 0.5000 (to 4 decimal places)

Page 50: Continuous Random Variables

Example

What is the value of a such that Pr(Z < a) = 0.9207 ?

0 a Z

0.9207

Page 51: Continuous Random Variables

Example

What is the value of a such that Pr(Z < a) = 0.9207 ?

0 1.41 Z

0.9207

Page 52: Continuous Random Variables

Example

What is the value of b such that Pr(Z > b) = 0.0250 ?

0 b Z

0.0250

Page 53: Continuous Random Variables

Example

What is the value of b such that Pr(Z > b) = 0.0250 ?

0 b Z

0.02501 – 0.0250 = 0.9750

Page 54: Continuous Random Variables

Example

What is the value of b such that Pr(Z > b) = 0.0250 ?

0 1.96 Z

0.02501 – 0.0250 = 0.9750

Page 55: Continuous Random Variables

Example

What is the value of k such that Pr(0 < Z < k) = 0.4750 ?

0 k Z

0.4750

0.5 + 0.4750 = 0.9750

Page 56: Continuous Random Variables

Example

What is the value of k such that Pr(0 < Z < k) = 0.4750 ?

0 1.96 Z

0.4750

0.5 + 0.4750 = 0.9750

Page 57: Continuous Random Variables

Example: If X is N(2, 9), determine Pr(X ≤ 5).

5) Pr(X

)1Pr( Z

8413.0

5-X

Pr

3

25-XPr

0 1.00 Z

0.8413

Page 58: Continuous Random Variables

Useful Fact

The distribution of the individual observation is the same as the distribution of the population from which it was drawn.

For example, if the mean height of a population of men is 70 inches, then the expected value or mean of a randomly selected man will be 70 inches.

Also, if 5% of the population of men is over 78 inches, then the probability that a randomly selected man will be over 78 inches tall is 5%.

Page 59: Continuous Random Variables

60) Pr(X

)33.1Pr( Z 0918.0

Example: Suppose that women’s heights are normally distributed with mean 64 inches & standard

deviation 3 inches. What is the probability that a randomly selected woman is under five feet tall?

-1.33 0 Z

3

6460-XPr

60-X

Pr

Page 60: Continuous Random Variables

Sample Distribution of

xthe probability distribution of all possible values of

that could occur when a sample of size n is

taken from some specified population

X

Page 61: Continuous Random Variables

Example: Suppose we have a population of chips, 1/3 of which have a 1 on them, 1/3 have a 2, & 1/3 have a 3.

(a) Show in table form the distribution of the sample mean (with n=2, sampled with replacement).

(b) Graph the distribution of the sample mean.

(c) Graph the distribution of the original population of chips.

(d) What are the mean & variance of the original population?

(e) What are the mean & variance of the sample mean?

Page 62: Continuous Random Variables

Show in table form the distribution of the sample mean (with n=2, sampled with replacement).

sample sample mean1,1 1.01,2 1.52,1 1.51,3 2.03,1 2.02,2 2.02,3 2.53,2 2.53,3 3.0

xsample mean probability

1.0 1/91.5 2/9

2.0 3/9

2.5 2/9

3.0 1/9

x

Page 63: Continuous Random Variables

Graph the distribution of the sample mean.

xsample mean probability

1.0 1/91.5 2/9

2.0 3/9

2.5 2/9

3.0 1/9

0 0.5 1.0 1.5 2.0 2.5 3.0

sample mean

Probability

3/9

2/9

1/9

x

Page 64: Continuous Random Variables

Graph the distribution of the original population of chips.

0 1 2 3 x

Probability

1/3

Since there are three equally likely values (1, 2, and 3), the distribution looks like this.

Page 65: Continuous Random Variables

What are the mean & variance of the original population?

x p(x) xp(x)

1 1/3 1/3

2 1/3 2/3

3 1/3 3/3

=6/3=2

Page 66: Continuous Random Variables

What are the mean & variance of the original population?

x2 x2p(x)

1 1/3

4 4/3

9 9/3

E(X2) =14/3

V(X) = E(X2)- [E(X)]2 = 14/3 – 22 = 2/3

x p(x) xp(x)

1 1/3 1/3

2 1/3 2/3

3 1/3 3/3

=6/3=2

Page 67: Continuous Random Variables

What are the mean & variance of the sample mean ?

1.0 1/9 1/9

1.5 2/9 3/9

2.0 3/9 6/9

2.5 2/9 5/9

3.0 1/9 3/9

X

)xp(x )xp( x

29/18)( XE

Page 68: Continuous Random Variables

What are the mean & variance of the sample mean ?

1.0 1/9 1/9

1.5 2/9 3/9

2.0 3/9 6/9

2.5 2/9 5/9

3.0 1/9 3/9

X

)xp(x )xp( x

29/18)( XE

)xp( 22

xx

1.00 0.111

2.25 0.500

4.00 1.333

6.25 1.389

9.00 1.000333.4)(

2

XE

1/3or 333.0 2333.4

)]([)()(2

222

XEXEXV

Page 69: Continuous Random Variables

In our chip example, we found that

3

2)( XV

2)( XE2)( XE

3

1)( XV

The expected value (or mean) of the original population and the sample mean are the same.

However, the variance of the sample mean is smaller than the variance of the original population.

Page 70: Continuous Random Variables

In general,

2)( XV

)(XE)(XE

nXV

2

)(

The mean of the original population and the mean of the sample are the same.

However, as long as the sample size is more than one, the variance of the sample mean is smaller than the variance of the original population.

Page 71: Continuous Random Variables

Intuition: Suppose that each person in the class randomly sampled 50 men from a population of men whose average height is 70 inches. Each student then calculated his/her sample mean.

If you averaged together all the sample means, you’d expect to get something very close to 70 inches.

The expected value or mean of the original population and the expected value of the sample mean are the same.

Page 72: Continuous Random Variables

Why does the sample mean have a smaller variance than the original population?

Suppose 5% of the population is taller than 78 inches (6’6”).

Then there’s a 0.05 probability of selecting at random an individual with a height over 6’6” inches.

It is much less likely that you will select at random 50 men whose average is over 6’6”.

You get a mixture of tall guys & short guys, so your average tends to be pretty close to the average for the population.

So the distribution of the sample mean clusters more tightly about the average than does the distribution of the original population.

That is, the variance of the sample mean is smaller than the variance of the original population.

Page 73: Continuous Random Variables

Central Limit Theorem

As the sample size n increases, the distribution of

the sample mean of a random sample from a

population (not necessarily normal) with mean

and variance 2 approaches normal with mean

and variance 2/n.

X

Page 74: Continuous Random Variables

Example: Suppose the grades of a large class have a mean of 72 and a standard deviation of 9.

a. What is the probability that the average grade of a random sample of 25 students will be above 77?

b. If the population is normal, what is the probability that an individual student drawn at random will have a grade over 77?

Notice that in part b, we need to assume normality but we didn’t in part a.

This is because the central limit theorem assures us of an approximately normal distribution for the sample mean of a reasonably large sample, but not for the distribution of a single observation.

To be assured of that, we need to know that the distribution of the original population was normal.

Page 75: Continuous Random Variables

What is the probability that the average grade of a random sample of 25 students will be above 77?

78.2Pr Z

)77Pr( X

= 1 - 0.9973

= 0.00270 2.78 Z

259

7277Pr

n

X

0.9973

Page 76: Continuous Random Variables

If the population is normal, what is the probability that an individual student drawn at random

will have a grade over 77?

56.0Pr Z

)77Pr( X

= 1 - 0.7123

= 0.28770 0.56 Z

9

7277Pr

X

0.7123

Page 77: Continuous Random Variables

Notice

The probability of selecting an individual at random who has a grade five points above the class mean is much greater than the probability of randomly selecting a sample of 25 students who have an average that is five points above the class mean. (0.2877 versus 0.0027)

Page 78: Continuous Random Variables

Problem

Suppose we are sampling (without replacement) and we sample the entire population.

Then our sample mean will always be the same as the population mean. No spread. Zero variance.

If we don’t sample the entire population but do sample a large part of it, our variance will not be zero but it will be very small.

Thus, as the sample size n approaches the population size N, the variance of the sample mean approaches zero.

Our formula for the variance of the sample mean was 2/n.2/n approaches zero as n approaches infinity, not as n

approaches N. So we need to make some adjustment when we sample a

large part of our population.

Page 79: Continuous Random Variables

Finite Population Correction Factor

1

N

nN

1

N

nN

n

When we sample a substantial part of our population (more than 5%), we need to multiply the standard deviation of our sample mean by this factor:

So the standard deviation of the sample mean which was: n

becomes:

Page 80: Continuous Random Variables

The standard deviation of the sample mean adjusted for a large sample

1

N

nN

n

Notice that if you sample just one observation (n=1), the square root part becomes 1, and the formula becomes the old formula:

If you sample the entire population (n=N), the formula has a value of 0.

So the adjustment works the way it should.

n

Page 81: Continuous Random Variables

1N

nN

n

X

now is X edstandardiz our So

Page 82: Continuous Random Variables

Example: What is the probability that the mean of a sample of 36 observations, from a population of 300, will be less than 14, if the population mean & population standard

deviation are 15.30 & 4.10 respectively?

If we sample more than 5% of our population, we need to use the finite population correction factor, and here we are sampling 12%.

)00.14Pr( X

1300

36300

36

10.4

30.1500.14

1

Pr

N

nN

n

X

02.2Pr Z

= 0.0217

-2.02 0 Z

Page 83: Continuous Random Variables

Up until now, as long as our sample was not too large, we have used this standardization formula:

nsX

t

n

XZ

However, what if we don’t know what the population standard deviation is?

The next best thing is the sample standard deviation:

But when we use s instead of , the result is not a normally distributed variable.

It has what is called a t or student’s t distribution.

1

)( 2

1

n

XXs

n

ii

Page 84: Continuous Random Variables

Our t distribution looks very similar to the Z distribution.

Both are symmetric, bell-shaped, & have a mean of zero.

But while the Z has a variance of 1, the t has a variance of (n-1)/(n-3), which is greater than one. So the t is wider than the Z.

So, the numbers are different & we need to use a different table.

Page 85: Continuous Random Variables

The t distribution is tabulated based on the number of “degrees of freedom.”

The number of degrees of freedom, is denoted by dof, df, or the Greek letter nu:

In this context, the number of degrees of freedom is n-1, where n is the number of observations.

For each number of degrees of freedom, there is a different t distribution.

The degrees of freedom are often indicated as a subscript on the t.

For example, t15 is a t distribution with 15 degrees of freedom.

Page 86: Continuous Random Variables

When the sample size is large, the Z and t distributions are virtually indistinguishable.

When we have at least 100 observations, we will be able to use values from the Z table for our t.

Some people use the Z to approximate the t when there are 30 or more observations. When we’re working with s, I prefer to use the t until 100 observations.

When you take Intermediate Statistics (EC252), make sure you know what your instructor uses.

Page 87: Continuous Random Variables

The t distribution is set up differently from the Z table.

The table in your text book gives probabilities that the t is greater than a specified positive number, that is, Pr(t ≥ a).

Some tables are set up differently, so you need to notice how a table is computed when you use it.

Page 88: Continuous Random Variables

0.10 0.05 0.025 0.01 0.005

d.f. = 1

2

3

4

5 3.3649

6

100

0 3.3649 t5

Example: For 6 observations or 5 df, Pr(t5 > 3.3649) = 0.01

Page 89: Continuous Random Variables

Example: What is the probability that the sample mean is less than 800, if the population mean is 843.3419 and the sample standard deviation is 105, based on a sample of 25.

)800Pr( X

X 800 843.3419Pr

s 105n 25

24Pr( t 2.0639 )

025.0

-2.0639 0 t24

Page 90: Continuous Random Variables

Using Excel to solve t distribution problems

On an Excel spreadsheet, you can get the t distribution as follows:

Either click the fx (insert function) button ORclick the formulas tab, and then click insert function

select statistical as the category of function, scroll down to the t.dist or t.inv function, and click on it

fill in the information in the dialog box .

Page 91: Continuous Random Variables

Using Excel to solve t distribution problems

We’re going to look at Excel’s two-tailed t functions. There are also some other t functions on Excel that you can explore.

t.dist.2t - you provide the number x on the horizontal axis and the degrees of freedom, and it then gives you the area or probability in the two tails. (Note that the number you provide must be positive.)

t.inv.2t - you provide the probability or area that you want in the two tails and the degrees of freedom, and it then gives you the number on the horizontal axis.

Page 92: Continuous Random Variables

Example: Suppose that you wanted to use Excel to find the probability that a t with 24 degrees of freedom is less than -2.0639.

Note that the area to the left of -2.0639 is the same as the area to the right of +2.0639.

Select t.dist.2t . Specify 2.0639 as x, 24 for degrees of freedom.

Excel will provide you the probability value of 0.05. Since you only want the left tail, the number you want is half of 0.05 or 0.025 .

-2.0639 0 2.0639 t24

Page 93: Continuous Random Variables

Example: Suppose instead you wanted to use Excel to find out for what value of k Pr(t24 < -k or t24 > k) is equal to 0.05.

Select t.inv.2t . Specify 0.05 for probability, and 24 for degrees of freedom.

Excel will provide you the value on the horizontal axis of 2.0639. -k 0 k t24

Page 94: Continuous Random Variables

Normal Approximations

Recall that in the section on discrete distributions, we discussed the binomial distribution, which we used when we were considering the number of successes (x) on n independent trials with a probability of success on any given trial .We said that the binomial distribution is approximately symmetric if n ≥ 5 and n( ≥ 5 .We also said that the mean of the binomial distribution is = n and the variance is 2 = n(), so the standard deviation is . n (1- )

Page 95: Continuous Random Variables

It is sometimes easier to use the normal distribution to approximate a binomial probability than to calculate the precise binomial probability.

The normal distribution provides a reasonable approximation for the binomial if n ≥ 5 and n( ≥ 5.

However, because the binomial is discrete and the normal is continuous, we make an adjustment which we call the “continuity correction”.

Recall that for discrete probability distributions, the probability that the variable takes on a particular value x is the height of a spike at that point. However, for a continuous probability distribution, the probability is the area under the probability density curve.

To adjust for that difference, we include an extra half a unit on the ends of our intervals when we calculate normal probabilities as approximations for binomial probabilities.

Page 96: Continuous Random Variables

Examples

(1) To estimate binomial probability Pr(X = 20), calculate normal probability Pr(19.5 ≤ X ≤ 20.5).

(2) To estimate binomial probability Pr(3 ≤ X ≤ 7), calculate normal probability Pr(2.5 ≤ X ≤ 7.5).

(3) To estimate binomial probability Pr(X ≥ 10), calculate normal probability Pr(X ≥ 9.5).

(4) To estimate binomial probability Pr(X < 15), calculate normal probability Pr(X < 15.5).

Page 97: Continuous Random Variables

Example: Suppose that the probability that any individual in a given population supports a particular proposition is 0.20. Consider a group of 500 people. Determine the probability that at least 75 people support the proposition.

In this problem, n = 500 and So n = 100 ≥ 5 and n( = 400 ≥ 5 , and this should be an excellent approximation.The mean is n = 100 and the standard deviation is

n (1- ) = 500(0.20)(0.80) 80 8.9443 .

For the continuity correction, estimate the binomial probability Pr(X ≥ 75) using the normal probability Pr(X ≥ 74.5).

We need to standardize the normal by subtracting off the mean and dividing by the standard deviation.

Page 98: Continuous Random Variables

So we have this: Pr(X ≥ 74.5) = Pr[(X- )/≥ (74.5 – 100)/8.9443]

= 1- Pr(Z < -2.85)= 1 - 0.0022 = 0.9978So it’s extremely likely that at least 75 people in

a sample of 500 will support the proposition.

= Pr(Z ≥ -2.85)

-2.85 0 Z

0.9978